- Description:
* Baidu AI Cloud's [Qianfan
Platform](https://cloud.baidu.com/doc/WENXINWORKSHOP/index.html) is an
all-in-one platform for large model development and service deployment,
catering to enterprise developers in China. The platform offers a wide
range of resources, including the Wenxin Yiyan model (ERNIE-Bot) and
various third-party open-source models (see the usage sketch below).
- Issue: none
- Dependencies:
* qianfan
- Tag maintainer: @baskaryan
- Twitter handle:
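A minimal usage sketch, assuming the `QianfanLLMEndpoint` class name and
the `QIANFAN_AK`/`QIANFAN_SK` environment variables (adjust to the
actual integration):
```python
import os

from langchain.llms import QianfanLLMEndpoint

os.environ["QIANFAN_AK"] = "your-api-key"     # placeholder credential
os.environ["QIANFAN_SK"] = "your-secret-key"  # placeholder credential

# ERNIE-Bot-turbo is one of the models hosted on the Qianfan Platform.
llm = QianfanLLMEndpoint(model="ERNIE-Bot-turbo")
print(llm("Tell me a joke."))
```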
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
The latest version of HazyResearch/manifest no longer supports accessing
the "client" directly. It manages connection pools instead, and a client
has to be requested from the client pool.
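A minimal sketch of the updated pattern, assuming LangChain's
`ManifestWrapper` and the pool-managing `Manifest` client (argument
names may differ):
```python
from manifest import Manifest

from langchain.llms import ManifestWrapper

# The Manifest object manages a connection pool internally; callers no
# longer reach into a "client" attribute directly.
manifest = Manifest(client_name="openai")  # assumes OPENAI_API_KEY is set
llm = ManifestWrapper(client=manifest, llm_kwargs={"temperature": 0.0})
print(llm("Say hello."))
```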
**Issue:**
No matching issue was found
**Dependencies:**
The manifest.ipynb file in docs/extras/integrations/llms needs to be
updated.
**Twitter handle:**
@hrk_cbe
## Description:
I've integrated CTranslate2 with LangChain. CTranslate2 is an
increasingly popular library for efficient inference with Transformer
models that compares favorably to alternatives such as HF Text
Generation Inference and vLLM in
[benchmarks](https://hamel.dev/notes/llm/inference/03_inference.html).
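A minimal usage sketch, assuming the `CTranslate2` LLM class and a model
already converted to CTranslate2 format (paths and model names are
hypothetical):
```python
from langchain.llms import CTranslate2

# Assumes the model was converted beforehand, e.g. with:
#   ct2-transformers-converter --model meta-llama/Llama-2-7b-hf \
#       --output_dir ./llama-2-7b-ct2 --quantization bfloat16
llm = CTranslate2(
    model_path="./llama-2-7b-ct2",  # hypothetical local path
    tokenizer_name="meta-llama/Llama-2-7b-hf",
)
print(llm("What is quantization?"))
```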
## Description
This PR introduces a minor change to the TitanTakeoff integration.
Instead of specifying a port on localhost, users can now specify a
baseURL. This allows users to use the integration if they have
TitanTakeoff deployed externally (not on localhost), and removes the
hardcoded reference to localhost, "http://localhost:{port}".
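A minimal sketch of the new option, assuming the keyword is named
`base_url` (check the integration for the exact parameter name):
```python
from langchain.llms import TitanTakeoff

# Point the integration at an externally deployed Takeoff server instead
# of the previously hardcoded http://localhost:{port}.
llm = TitanTakeoff(base_url="http://my-takeoff-host:8000")
print(llm("What is the largest rainforest?"))
```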
### Info about Titan Takeoff
Titan Takeoff is an inference server created by
[TitanML](https://www.titanml.co/) that allows you to deploy large
language models locally on your hardware in a single command. Most
generative model architectures are included, such as Falcon, Llama 2,
GPT2, T5 and many more.
Read more about Titan Takeoff here:
-
[Blog](https://medium.com/@TitanML/introducing-titan-takeoff-6c30e55a8e1e)
- [Docs](https://docs.titanml.co/docs/titan-takeoff/getting-started)
### Dependencies
No new dependencies are introduced. However, users will need to install
the titan-iris package in their local environment and start the Titan
Takeoff inferencing server in order to use the Titan Takeoff
integration.
Thanks for your help and please let me know if you have any questions.
cc: @hwchase17 @baskaryan
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Fixed navbar:
- renamed several files, so ToC is sorted correctly
- made ToC items consistent: formatted several Titles
- added several links
- reformatted several docs to a consistent format
- renamed several files (removed `_example` suffix)
- added renamed files to the `docs/docs_skeleton/vercel.json`
This PR follows the **Eden AI (LLM + embeddings) integration** (#8633).
We added an optional parameter to choose different AI models for
providers (like 'text-bison' for provider 'google', 'text-davinci-003'
for provider 'openai', etc.).
Usage:
```python
llm = EdenAI(
    feature="text",
    provider="google",
    params={
        "model": "text-bison",  # new
        "temperature": 0.2,
        "max_tokens": 250,
    },
)
```
You can also change the provider and model after initialization:
```python
llm = EdenAI(
    feature="text",
    provider="google",
    params={
        "temperature": 0.2,
        "max_tokens": 250,
    },
)

prompt = """
hi
"""

llm(prompt, providers="openai", model="text-davinci-003")  # change provider & model
```
The Jupyter notebook has been updated with an example as well.
Ping: @hwchase17, @baskaryan
---------
Co-authored-by: RedhaWassim <rwasssim@gmail.com>
Co-authored-by: sam <melaine.samy@gmail.com>
# Description
This PR adds additional documentation on how to use Azure Active
Directory to authenticate to an OpenAI service within Azure. This method
of authentication allows organizations with more complex security
requirements to use Azure OpenAI.
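A sketch of the pattern the new docs describe, assuming the
`azure-identity` package and the module-level settings of the openai
v0.x SDK (endpoint and API version are placeholders):
```python
import openai
from azure.identity import DefaultAzureCredential

# Fetch an Azure AD token for the Cognitive Services scope and use it in
# place of a static API key.
credential = DefaultAzureCredential()
token = credential.get_token("https://cognitiveservices.azure.com/.default")

openai.api_type = "azure_ad"
openai.api_key = token.token
openai.api_base = "https://my-resource.openai.azure.com"  # placeholder endpoint
openai.api_version = "2023-05-15"  # placeholder API version
```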
# Issue
N/A
# Dependencies
N/A
# Twitter
https://twitter.com/CamAHutchison
Fixed the title of `extras/integrations/llms/llm_caching.ipynb`. The
existing title broke the sorted order of items in the navbar. Also
updated some formatting.
## Description
The following PR enables the [grammar-based
sampling](https://github.com/ggerganov/llama.cpp/tree/master/grammars)
in llama-cpp LLM.
In short, loading a file with a formal grammar definition will constrain
model outputs. For instance, one can force the model to generate valid
JSON or only Python lists.
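A minimal sketch, assuming the `grammar_path` keyword added for this
feature (file paths are hypothetical):
```python
from langchain.llms import LlamaCpp

# json.gbnf is a formal grammar definition like those in the llama.cpp
# grammars directory; it constrains sampling to valid JSON.
llm = LlamaCpp(
    model_path="/path/to/model.bin",    # hypothetical local model path
    grammar_path="/path/to/json.gbnf",  # hypothetical grammar file
)
print(llm("Describe a person in JSON format:"))
```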
In a follow-up PR we will add:
* docs describing why it is useful and how it works
* possibly a code sample for a task such as those in the llama.cpp repo
---------
Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Add PromptGuard integration
-------
There are two approaches to integrating PromptGuard with a LangChain
application:
1. the PromptGuardLLMWrapper, and
2. functions that can be used in a LangChain expression (see the sketch
after this list).
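A minimal sketch of the first approach, assuming the
`PromptGuardLLMWrapper` wraps an existing LangChain LLM (the import path
and the `base_llm` argument name are assumptions):
```python
from langchain.llms import OpenAI
from langchain.llms.promptguard import PromptGuardLLMWrapper  # path assumed

# Wrap an existing LLM so prompts are sanitized before being sent to the
# underlying model (`base_llm` argument name is an assumption).
llm = PromptGuardLLMWrapper(base_llm=OpenAI())
print(llm("Summarize this email from jane@example.com ..."))
```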
-----
- Dependencies
The `promptguard` Python package, which is a runtime requirement if you
try out the demo.
- @baskaryan @hwchase17 Thanks for the ideas and suggestions along the
development process.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description: Improved the Nebula LLM to perform auto-retry and to
support more generation parameters. A conversation is no longer required
to be passed in the LLM object. Examples are updated.
- Issue: N/A
- Dependencies: N/A
- Tag maintainer: @baskaryan
- Twitter handle: symbldotai
---------
Co-authored-by: toshishjawale <toshish@symbl.ai>
Adds [DeepSparse](https://github.com/neuralmagic/deepsparse) as an LLM
backend. DeepSparse supports running various open-source sparsified
models hosted on [SparseZoo](https://sparsezoo.neuralmagic.com/) for
performance gains on CPUs.
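A minimal usage sketch, assuming the `DeepSparse` class takes a
SparseZoo model stub (the stub below is illustrative):
```python
from langchain.llms import DeepSparse

# Model stubs point at sparsified checkpoints hosted on SparseZoo.
llm = DeepSparse(
    model="zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none"
)
print(llm("def fib(n):"))
```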
Twitter handles: @mgoin_ @neuralmagic
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Description
This PR adds the `aembed_query` and `aembed_documents` async methods to
improve embedding generation for large documents. The implementation
uses asyncio tasks and `gather` to achieve concurrency, as there is no
async Bedrock API in boto3.
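A minimal usage sketch of the new async methods (AWS credentials and
region configuration are assumed to be in place):
```python
import asyncio

from langchain.embeddings import BedrockEmbeddings


async def main() -> None:
    embeddings = BedrockEmbeddings()  # assumes AWS credentials are configured

    # aembed_documents schedules one task per document and gathers the
    # results, since boto3 exposes no native async Bedrock API.
    vectors = await embeddings.aembed_documents(["doc one", "doc two"])
    query_vector = await embeddings.aembed_query("a question")
    print(len(vectors), len(query_vector))


asyncio.run(main())
```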
### Maintainers
@agola11
@aarora79
### Open questions
To avoid throttling from the Bedrock API, should there be an option to
limit the concurrency of the calls?
## Description:
This PR adds the Titan Takeoff Server to the available LLMs in
LangChain.
Titan Takeoff is an inference server created by
[TitanML](https://www.titanml.co/) that allows you to deploy large
language models locally on your hardware in a single command. Most
generative model architectures are included, such as Falcon, Llama 2,
GPT2, T5 and many more.
Read more about Titan Takeoff here:
-
[Blog](https://medium.com/@TitanML/introducing-titan-takeoff-6c30e55a8e1e)
- [Docs](https://docs.titanml.co/docs/titan-takeoff/getting-started)
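A minimal usage sketch, assuming the `TitanTakeoff` class defaults to
the local server on port 8000 (see the testing note below):
```python
from langchain.llms import TitanTakeoff

# Assumes a Takeoff server was already started locally via the
# titan-iris package (default port 8000).
llm = TitanTakeoff(port=8000)
print(llm("What is the weather in London in August?"))
```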
#### Testing
As Titan Takeoff runs locally on port 8000 by default, no network access
is needed. Responses are mocked for testing.
- [x] Make Lint
- [x] Make Format
- [x] Make Test
#### Dependencies
No new dependencies are introduced. However, users will need to install
the titan-iris package in their local environment and start the Titan
Takeoff inferencing server in order to use the Titan Takeoff
integration.
Thanks for your help and please let me know if you have any questions.
cc: @hwchase17 @baskaryan
This addresses some issues with the introduction of the Nebula LLM to
LangChain in this PR:
https://github.com/langchain-ai/langchain/pull/8876
It fixes the following:
- Removes `SYMBLAI` from variable names
- Fixes a bug with `Bearer` for the API key
Thanks again in advance for your help!
cc: @hwchase17, @baskaryan
---------
Co-authored-by: dvonthenen <david.vonthenen@gmail.com>
Adds Ollama as an LLM. Ollama can run various open-source models
locally, e.g. Llama 2 and Vicuna, automatically configuring and
GPU-optimizing them.
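A minimal usage sketch, assuming a locally running Ollama server with a
pulled model:
```python
from langchain.llms import Ollama

# Assumes the Ollama server is running and `ollama pull llama2` was run.
llm = Ollama(model="llama2")
print(llm("Why is the sky blue?"))
```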
@rlancemartin @hwchase17
---------
Co-authored-by: Lance Martin <lance@langchain.dev>
## Description
This PR adds Nebula to the available LLMs in LangChain.
Nebula is an LLM focused on conversation understanding and enables users
to extract conversation insights from video, audio, text, and chat-based
conversations. These conversations can occur between any mix of human or
AI participants.
Examples of questions you could ask Nebula about a given conversation:
- What could be the customer’s pain points based on the conversation?
- What sales opportunities can be identified from this conversation?
- What best practices can be derived from this conversation for future
customer interactions?
You can read more about Nebula here:
https://symbl.ai/blog/extract-insights-symbl-ai-generative-ai-recall-ai-meetings/
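A minimal usage sketch, assuming the import path and constructor
arguments shown here (both are assumptions; see the docs for the actual
interface):
```python
from langchain.llms.symblai_nebula import Nebula  # import path assumed

conversation = """Sam: Good morning, team!
Emily: Morning! The customer mentioned delays in delivery again."""

# `nebula_api_key` and `conversation` argument names are assumptions.
llm = Nebula(nebula_api_key="...", conversation=conversation)
print(llm("What could be the customer's pain points based on the conversation?"))
```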
#### Integration Test
An integration test is added, but it requires network access: since
Nebula is fully managed like OpenAI, the test must reach the live API.
#### Linting
- [x] make lint
- [x] make test (TODO: there seems to be a failure in another, unrelated
test; need to check on this.)
- [x] make format
### Dependencies
No new dependencies were introduced.
### Twitter handle
[@symbldotai](https://twitter.com/symbldotai)
[@dvonthenen](https://twitter.com/dvonthenen)
If you have any questions, please let me know.
cc: @hwchase17, @baskaryan
---------
Co-authored-by: dvonthenen <david.vonthenen@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Hello LangChain maintainers,
this PR aims at integrating
[vllm](https://vllm.readthedocs.io/en/latest/#) into LangChain. This PR
closes #8729.
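A minimal usage sketch, assuming the `VLLM` class added here (the model
name is illustrative):
```python
from langchain.llms import VLLM

llm = VLLM(
    model="mosaicml/mpt-7b",  # illustrative Hugging Face model id
    trust_remote_code=True,   # required by some community models
    max_new_tokens=128,
)
print(llm("What is the capital of France?"))
```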
This feature clearly depends on `vllm`, but I've seen other models
supported here depend on packages that are not included in
pyproject.toml (e.g. `gpt4all`, `text-generation`), so I assumed the
same applies here.
@hwchase17, @baskaryan
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Description - Integrates Fireworks within LangChain LLMs so users can
use Fireworks models with LangChain, mainly for summarization (see the
usage sketch below).
Issue - Not applicable
Dependencies - None
Tag maintainer - @rlancemartin
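A minimal usage sketch, assuming the import path, model id, and argument
name shown (all are assumptions):
```python
from langchain.llms import Fireworks  # import path assumed

# Assumes FIREWORKS_API_KEY is set in the environment; the model id and
# `model` argument name are illustrative.
llm = Fireworks(model="accounts/fireworks/models/llama-v2-13b")
print(llm("Summarize: LangChain is a framework for building LLM applications."))
```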
---------
Co-authored-by: Raj Janardhan <rajjanardhan@Rajs-Laptop.attlocal.net>