This PR adds 8 new loaders:
* `AirbyteCDKLoader` This reader can wrap and run all python-based
Airbyte source connectors.
* Separate loaders for the most commonly used APIs:
* `AirbyteGongLoader`
* `AirbyteHubspotLoader`
* `AirbyteSalesforceLoader`
* `AirbyteShopifyLoader`
* `AirbyteStripeLoader`
* `AirbyteTypeformLoader`
* `AirbyteZendeskSupportLoader`
## Documentation and getting started
I added the basic shape of the config to the notebooks. This increases
the maintenance effort a bit, but I think it's worth it to make sure
people can get started quickly with these important connectors. This is
also why I linked the spec and the documentation page in the readme as
these two contain all the information to configure a source correctly
(e.g. it won't suggest using oauth if that's avoidable even if the
connector supports it).
## Document generation
The "documents" produced by these loaders won't have a text part
(instead, all the record fields are put into the metadata). If a text is
required by the use case, the caller needs to do custom transformation
suitable for their use case.
## Incremental sync
All loaders support incremental syncs if the underlying streams support
it. By storing the `last_state` from the reader instance away and
passing it in when loading, it will only load updated records.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This addresses some issues with introducing the Nebula LLM to LangChain
in this PR:
https://github.com/langchain-ai/langchain/pull/8876
This fixes the following:
- Removes `SYMBLAI` from variable names
- Fixes bug with `Bearer` for the API KEY
Thanks again in advance for your help!
cc: @hwchase17, @baskaryan
---------
Co-authored-by: dvonthenen <david.vonthenen@gmail.com>
Minor doc fix to awslambda tool notebook.
Add missing import for initialize_agent to awslambda agent example
Co-authored-by: Josh Hart <josharj@amazon.com>
Description:
Fixed inaccurate import in integrations:providers:bedrock documentation
In the current version of the bedrock documentation, page
https://python.langchain.com/docs/integrations/providers/bedrock it
states that the import is from langchain import Bedrock
This has been changed to from langchain.llms.bedrock import Bedrock as
stated in https://python.langchain.com/docs/integrations/llms/bedrock
Issue:
Not applicable
Dependencies
No dependencies required
Tag maintainer
@baskaryan
Twitter handle:
Not applicable
Adds Ollama as an LLM. Ollama can run various open source models locally
e.g. Llama 2 and Vicuna, automatically configuring and GPU-optimizing
them.
@rlancemartin @hwchase17
---------
Co-authored-by: Lance Martin <lance@langchain.dev>
## Description
I am excited to propose an integration with USearch, a lightweight
vector-search engine available for both Python and JavaScript, among
other languages.
## Dependencies
It introduces a new PyPi dependency - `usearch`. I am unsure if it must
be added to the Poetry file, as this would make the PR too clunky.
Please let me know.
## Profiles
- Maintainers: @ashvardanian @davvard
- Twitter handles: @ashvardanian @unum_cloud
---------
Co-authored-by: Davit Vardanyan <78792753+davvard@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- fix install command
- change example notebook to use Metaphor autoprompt by default
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
- Description: Adds the ChatAnyscale class with llama-2 7b, llama-2 13b,
and llama-2 70b on [Anyscale
Endpoints](https://app.endpoints.anyscale.com/)
- It inherits from ChatOpenAI and requires openai (probably unnecessary
but it made for a quick and easy implementation)
- Inspired by https://github.com/langchain-ai/langchain/pull/8434
(@kylehh and @baskaryan )
## Description
This PR adds Nebula to the available LLMs in LangChain.
Nebula is an LLM focused on conversation understanding and enables users
to extract conversation insights from video, audio, text, and chat-based
conversations. These conversations can occur between any mix of human or
AI participants.
Examples of some questions you could ask Nebula from a given
conversation are:
- What could be the customer’s pain points based on the conversation?
- What sales opportunities can be identified from this conversation?
- What best practices can be derived from this conversation for future
customer interactions?
You can read more about Nebula here:
https://symbl.ai/blog/extract-insights-symbl-ai-generative-ai-recall-ai-meetings/
#### Integration Test
An integration test is added, but it requires network access. Since
Nebula is fully managed like OpenAI, network access is required to
exercise the integration test.
#### Linting
- [x] make lint
- [x] make test (TODO: there seems to be a failure in another
non-related test??? Need to check on this.)
- [x] make format
### Dependencies
No new dependencies were introduced.
### Twitter handle
[@symbldotai](https://twitter.com/symbldotai)
[@dvonthenen](https://twitter.com/dvonthenen)
If you have any questions, please let me know.
cc: @hwchase17, @baskaryan
---------
Co-authored-by: dvonthenen <david.vonthenen@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
This adds support for [Xata](https://xata.io) (data platform based on
Postgres) as a vector store. We have recently added [Xata to
Langchain.js](https://github.com/hwchase17/langchainjs/pull/2125) and
would love to have the equivalent in the Python project as well.
The PR includes integration tests and a Jupyter notebook as docs. Please
let me know if anything else would be needed or helpful.
I have added the xata python SDK as an optional dependency.
## To run the integration tests
You will need to create a DB in xata (see the docs), then run something
like:
```
OPENAI_API_KEY=sk-... XATA_API_KEY=xau_... XATA_DB_URL='https://....xata.sh/db/langchain' poetry run pytest tests/integration_tests/vectorstores/test_xata.py
```
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Philip Krauss <35487337+philkra@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Hello langchain maintainers,
this PR aims at integrating
[vllm](https://vllm.readthedocs.io/en/latest/#) into langchain. This PR
closes#8729.
This feature clearly depends on `vllm`, but I've seen other models
supported here depend on packages that are not included in the
pyproject.toml (e.g. `gpt4all`, `text-generation`) so I thought it was
the case for this as well.
@hwchase17, @baskaryan
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- Updated to use newer better function interaction
- Previous version had only one callback
- @hinthornw @hwchase17 Can you look into this
- Shout out to @MultiON_AI @DivGarg9 on twitter
---------
Co-authored-by: Naman Garg <ngarg3@binghamton.edu>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Description: This PR improves the function of recursive_url_loader, such
as limiting the depth of the access, and customizable extractors(from
the raw webpage to the text of the Document object), so that users can
use other tools to extract the webpage. This PR also includes the
document and test for the new loader.
Old PR closed due to project structure change. #7756
Because socket requests are not allowed, the old unit test was removed.
Issue: N/A
Dependencies: asyncio, aiohttp
Tag maintainer: @rlancemartin
Twitter handle: @ Zend_Nihility
---------
Co-authored-by: Lance Martin <lance@langchain.dev>
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: docstore had two main method: add and search, however,
dealing with docstore sometimes requires deleting an entry from
docstore. So I have added a simple delete method that deletes items from
docstore. Additionally, I have added the delete method to faiss
vectorstore for the very same reason.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
begining -> beginning
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
- Description: 2 links were not working on Question Answering Use Cases
documentation page. Hence, changed them to nearest useful links,
- Issue: NA,
- Dependencies: NA,
- Tag maintainer: @baskaryan,
- Twitter handle: NA
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
Refactor for the extraction use case documentation
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
Description: Add ScaNN vectorstore to langchain.
ScaNN is a Open Source, high performance vector similarity library
optimized for AVX2-enabled CPUs.
https://github.com/google-research/google-research/tree/master/scann
- Dependencies: scann
Python notebook to illustrate the usage:
docs/extras/integrations/vectorstores/scann.ipynb
Integration test:
libs/langchain/tests/integration_tests/vectorstores/test_scann.py
@rlancemartin, @eyurtsev for review.
Thanks!
# What
- This is to add filter option to sklearn vectore store functions
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: Add filter to sklearn vectore store functions.
- Issue: None
- Dependencies: None
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @MlopsJ
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
This is to add save_local and load_local to tfidf_vectorizer and docs in
tfidf_retriever to make the vectorizer reusable.
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: add save_local and load_local to tfidf_vectorizer and
docs in tfidf_retriever
- Issue: None
- Dependencies: None
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @MlopsJ
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Simple retriever that applies an LLM between the user input and the
query pass the to retriever.
It can be used to pre-process the user input in any way.
The default prompt:
```
DEFAULT_QUERY_PROMPT = PromptTemplate(
input_variables=["question"],
template="""You are an assistant tasked with taking a natural languge query from a user
and converting it into a query for a vectorstore. In this process, you strip out
information that is not relevant for the retrieval task. Here is the user query: {question} """
)
```
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Replace this comment with:
- Description: added a document loader for a list of RSS feeds or OPML.
It iterates through the list and uses NewsURLLoader to load each
article.
- Issue: N/A
- Dependencies: feedparser, listparser
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @ruze
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>