langchain

Commit Graph

Author	SHA1	Message	Date
Tomaz Bratanic	09a0ecd000	langchain[minor]: Tests update metadata filtering examples of documents (#19963 ) Removing metadata properties that are dicts as some databases don't support that, and those properties aren't used in tests anyhow..	6 months ago
happy-go-lucky	c6432abdbe	community[patch]: Implement delete method and all async methods in opensearch_vector_search (#17321 ) - Description: In order to use index and aindex in libs/langchain/langchain/indexes/_api.py, I implemented delete method and all async methods in opensearch_vector_search - Dependencies: No changes	6 months ago
Cheng, Penghui	cc407e8a1b	community[minor]: weight only quantization with intel-extension-for-transformers. (#14504 ) Support weight only quantization with intel-extension-for-transformers. [Intel® Extension for Transformers](https://github.com/intel/intel-extension-for-transformers) is an innovative toolkit to accelerate Transformer-based models on Intel platforms, in particular effective on 4th Intel Xeon Scalable processor [Sapphire Rapids](https://www.intel.com/content/www/us/en/products/docs/processors/xeon-accelerated/4th-gen-xeon-scalable-processors.html) (codenamed Sapphire Rapids). The toolkit provides the below key features: * Seamless user experience of model compressions on Transformer-based models by extending [Hugging Face transformers](https://github.com/huggingface/transformers) APIs and leveraging [Intel® Neural Compressor](https://github.com/intel/neural-compressor) * Advanced software optimizations and unique compression-aware runtime. * Optimized Transformer-based model packages. * [NeuralChat](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat), a customizable chatbot framework to create your own chatbot within minutes by leveraging a rich set of plugins and SOTA optimizations. * [Inference](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/llm/runtime/graph) of Large Language Model (LLM) in pure C/C++ with weight-only quantization kernels. This PR is an integration of weight only quantization feature with intel-extension-for-transformers. Unit test is in lib/langchain/tests/integration_tests/llm/test_weight_only_quantization.py The notebook is in docs/docs/integrations/llms/weight_only_quantization.ipynb. The document is in docs/docs/integrations/providers/weight_only_quantization.mdx. --------- Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Erick Friis	f0d5b59962	core[patch]: remove requests (#19891 ) Removes required usage of `requests` from `langchain-core`, all of which has been deprecated. - removes Tracer V1 implementations - removes old `try_load_from_hub` github-based hub implementations Removal done in a way where imports will still succeed, and usage will fail with a `RuntimeError`.	6 months ago
Peter Vandenabeele	e830a4e731	community[patch]: Add remove_comments option (default True): do not extract html comments (#13259 ) - Description: add `remove_comments` option (default: True): do not extract html _comments_, - Issue: None, - Dependencies: None, - Tag maintainer: @nfcampos , - Twitter handle: peter_v I ran `make format`, `make lint` and `make test`. Discussion: I my use case, I prefer to not have the comments in the extracted text: * e.g. from a Google tag that is added in the html as comment * e.g. content that the authors have temporarily hidden to make it non visible to the regular reader Removing the comments makes the extracted text more alike the intended text to be seen by the reader. Choice to make: do we prefer to make the default for this `remove_comments` option to be True or False? I have changed it to True in a second commit, since that is how I would prefer to use it by default. Have the cleaned text (without technical Google tags etc.) and also closer to the actually visible and intended content. I am not sure what is best aligned with the conventions of langchain in general ... INITIAL VERSION (new version above): ~Choice to make: do we prefer to make the default for this `ignore_comments` option to be True or False? I have set it to False now to be backwards compatible. On the other hand, I would use it mostly with True. I am not sure what is best aligned with the conventions of langchain in general ...~ --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Jamsheed Mistri	4f70bc119d	community[minor]: add Layerup Security integration (#19787 ) Description: adds integration with [Layerup Security](https://uselayerup.com). Docs can be found [here](https://docs.uselayerup.com). Integrates directly with our Python SDK. Dependencies: [LayerupSecurity](https://pypi.org/project/LayerupSecurity/) Note: all methods for our product require a paid API key, so I only included 1 test which checks for an invalid API key response. I have tested extensively locally. Twitter handle: [@layerup_](https://twitter.com/layerup_) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Anıl Berk Altuner	4384fa8e49	community[minor]: Add Dria retriever (#17098 ) [Dria](https://dria.co/) is a hub of public RAG models for developers to both contribute and utilize a shared embedding lake. This PR adds a retriever that can retrieve documents from Dria.	6 months ago
Chenhui Zhang	a1f3e9f537	community[minor]: Update ChatZhipuAI to support GLM-4 model (#16695 ) Description: Update `ChatZhipuAI` to support the latest `glm-4` model. Issue: N/A Dependencies: httpx, httpx-sse, PyJWT The previous `ChatZhipuAI` implementation requires the `zhipuai` package, and cannot call the latest GLM model. This is because - The old version `zhipuai==1.` doesn't support the latest model. - `zhipuai==2.` requires `pydantic V2`, which is incompatible with 'langchain-community'. This re-implementation invokes the GLM model by sending HTTP requests to [open.bigmodel.cn](https://open.bigmodel.cn/dev/api) via the `httpx` package, and uses the `httpx-sse` package to handle stream events. --------- Co-authored-by: zR <2448370773@qq.com>	6 months ago
Kenneth Choe	f98d7f7494	langchain[minor], community[minor]: add CrossEncoderReranker with HuggingFaceCrossEncoder and SagemakerEndpointCrossEncoder (#13687 ) - Description: Support reranking based on cross encoder models available from HuggingFace. - Added `CrossEncoder` schema - Implemented `HuggingFaceCrossEncoder` and `SagemakerEndpointCrossEncoder` - Implemented `CrossEncoderReranker` that performs similar functionality to `CohereRerank` - Added `cross-encoder-reranker.ipynb` to demonstrate how to use it. Please let me know if anything else needs to be done to make it visible on the table-of-contents navigation bar on the left, or on the card list on [retrievers documentation page](https://python.langchain.com/docs/integrations/retrievers). - Issue: N/A - Dependencies: None other than the existing ones. --------- Co-authored-by: Kenny Choe <kchoe@amazon.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Kamal Zhang	368e35c3b1	community[patch]: introduce convert_to_secret() to bananadev llm (#14283 ) - Description: Per #12165, this PR add to BananaLLM the function convert_to_secret_str() during environment variable validation. - Issue: #12165 - Tag maintainer: @eyurtsev - Twitter handle: @treewatcha75751 --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
anshaneel	0884e5de7f	community[minor]: Add Alpha Vantage API Tool (#14332 ) ### Description This implementation adds functionality from the AlphaVantage API, renowned for its comprehensive financial data. The class encapsulates various methods, each dedicated to fetching specific types of financial information from the API. ### Implemented Functions - `search_symbols`: - Searches the AlphaVantage API for financial symbols using the provided keywords. - `_get_market_news_sentiment`: - Retrieves market news sentiment for a specified stock symbol from the AlphaVantage API. - `_get_time_series_daily`: - Fetches daily time series data for a specific symbol from the AlphaVantage API. - `_get_quote_endpoint`: - Obtains the latest price and volume information for a given symbol from the AlphaVantage API. - `_get_time_series_weekly`: - Gathers weekly time series data for a particular symbol from the AlphaVantage API. - `_get_top_gainers_losers`: - Provides details on top gainers, losers, and most actively traded tickers in the US market from the AlphaVantage API. ### Issue: - #11994 ### Dependencies: - 'requests' library for HTTP requests. (import requests) - 'pytest' library for testing. (import pytest) --------- Co-authored-by: Adam Badar <94140103+adam-badar@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Alex Sherstinsky	a9bc212bf2	community[minor]: fix failing Predibase integration (#19776 ) - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message: *Delete this entire checklist* and replace with - Description: Langchain-Predibase integration was failing, because it was not current with the Predibase SDK; in addition, Predibase integration tests were instantiating the Langchain Community `Predibase` class with one required argument (`model`) missing. This change updates the Predibase SDK usage and fixes the integration tests. - Twitter handle: `@alexsherstinsky` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
M.Abdulrahman Alnaseer	ba54f1577f	community[minor]: add support for llmsherpa (#19741 ) Thank you for contributing to LangChain! - [x] PR title: "community: added support for llmsherpa library" - [x] Add tests and docs: 1. Integration test: 'docs/docs/integrations/document_loaders/test_llmsherpa.py'. 2. an example notebook: `docs/docs/integrations/document_loaders/llmsherpa.ipynb`. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Hrvoje Milković	b7344e3347	community[minor]: Infobip tool integration (#16805 ) Description: Adding Tool that wraps Infobip API for sending sms or emails and email validation. Dependencies: None, Twitter handle: @hmilkovic Implementation: ``` libs/community/langchain_community/utilities/infobip.py ``` Integration tests: ``` libs/community/tests/integration_tests/utilities/test_infobip.py ``` Example notebook: ``` docs/docs/integrations/tools/infobip.ipynb ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
morgana	074ad5095f	community[patch]: mmr search for Rockset vectorstore integration (#16908 ) - Description: Adding support for mmr search in the Rockset vectorstore integration. - Issue: N/A - Dependencies: N/A - Twitter handle: `@_morgan_adams_` --------- Co-authored-by: Rockset API Bot <admin@rockset.io> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
shahrin014	f51e6a35ba	community[patch]: OllamaEmbeddings - Pass headers to post request (#16880 ) ## Feature - Set additional headers in constructor - Headers will be sent in post request This feature is useful if deploying Ollama on a cloud service such as hugging face, which requires authentication tokens to be passed in the request header. ## Tests - Test if header is passed - Test if header is not passed Similar to https://github.com/langchain-ai/langchain/pull/15881 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Jan Chorowski	b8b42ccbc5	community[minor]: Pathway vectorstore(#14859 ) - Description: Integration with pathway.com data processing pipeline acting as an always updated vectorstore - Issue: not applicable - Dependencies: optional dependency on [`pathway`](https://pypi.org/project/pathway/) - Twitter handle: pathway_com The PR provides and integration with `pathway` to provide an easy to use always updated vector store: ```python import pathway as pw from langchain.embeddings.openai import OpenAIEmbeddings from langchain.text_splitter import CharacterTextSplitter from langchain.vectorstores import PathwayVectorClient, PathwayVectorServer data_sources = [] data_sources.append( pw.io.gdrive.read(object_id="17H4YpBOAKQzEJ93xmC2z170l0bP2npMy", service_user_credentials_file="credentials.json", with_metadata=True)) text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0) embeddings_model = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"]) vector_server = PathwayVectorServer( *data_sources, embedder=embeddings_model, splitter=text_splitter, ) vector_server.run_server(host="127.0.0.1", port="8765", threaded=True, with_cache=False) client = PathwayVectorClient( host="127.0.0.1", port="8765", ) query = "What is Pathway?" docs = client.similarity_search(query) ``` The `PathwayVectorServer` builds a data processing pipeline which continusly scans documents in a given source connector (google drive, s3, ...) and builds a vector store. The `PathwayVectorClient` implements LangChain's `VectorStore` interface and connects to the server to retrieve documents. --------- Co-authored-by: Mateusz Lewandowski <lewymati@users.noreply.github.com> Co-authored-by: mlewandowski <mlewandowski@MacBook-Pro-mlewandowski.local> Co-authored-by: Berke <berkecanrizai1@gmail.com> Co-authored-by: Adrian Kosowski <adrian@pathway.com> Co-authored-by: mlewandowski <mlewandowski@macbook-pro-mlewandowski.home> Co-authored-by: berkecanrizai <63911408+berkecanrizai@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: mlewandowski <mlewandowski@MBPmlewandowski.ht.home> Co-authored-by: Szymon Dudycz <szymond@pathway.com> Co-authored-by: Szymon Dudycz <szymon.dudycz@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
高璟琦	ec7a59c96c	community[minor]: Add solar embedding (#19761 ) Solar is a large language model developed by [Upstage](https://upstage.ai/). It's a powerful and purpose-trained LLM. You can visit the embedding service provided by Solar within this pr. You may get SOLAR_API_KEY from https://console.upstage.ai/services/embedding You can refer to more details about accepted llm integration at https://python.langchain.com/docs/integrations/llms/solar.	6 months ago
Tomaz Bratanic	dec00d3050	community[patch]: Add the ability to pass maps to neo4j retrieval query (#19758 ) Makes it easier to flatten complex values to text, so you don't have to use a lot of Cypher to do it.	6 months ago
Robby	f7e8a382cc	community[minor]: add hugging face text-to-speech inference API (#18880 ) Description: I implemented a tool to use Hugging Face text-to-speech inference API. Issue: n/a Dependencies: n/a Twitter handle: No Twitter, but do have [LinkedIn](https://www.linkedin.com/in/robby-horvath/) lol. --------- Co-authored-by: Robby <h0rv@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	6 months ago
DasDingoCodes	73eb3f8fd9	community[minor]: Implement DirectoryLoader lazy_load function (#19537 ) Thank you for contributing to LangChain! - [x] PR title: "community: Implement DirectoryLoader lazy_load function" - [x] Description: The `lazy_load` function of the `DirectoryLoader` yields each document separately. If the given `loader_cls` of the `DirectoryLoader` also implemented `lazy_load`, it will be used to yield subdocuments of the file. - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access: `libs/community/tests/unit_tests/document_loaders/test_directory_loader.py` 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory: `docs/docs/integrations/document_loaders/directory.ipynb` - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	6 months ago
Jialei	f7c903e24a	community[minor]: add support for Moonshot llm and chat model (#17100 )	6 months ago
Ethan Yang	7164015135	community[minor]: Add Openvino embedding support (#19632 ) This PR is used to support both HF and BGE embeddings with openvino --------- Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>	6 months ago
kYLe	124ab79c23	community[minor]: Add Anyscale embedding support (#17605 ) Description: Add embedding model support for Anyscale Endpoint Dependencies: openai --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Paulo Nascimento	44a3484503	community[patch]: add NotebookLoader unit test (#17721 ) Thank you for contributing to LangChain! - Description: added unit tests for NotebookLoader. Linked PR: https://github.com/langchain-ai/langchain/pull/17614 - Issue: [#17614](https://github.com/langchain-ai/langchain/pull/17614) - Twitter handle: @paulodoestech - [x] Pass lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified to check that you're passing lint and testing. See contribution guidelines for more information on how to write/run tests, lint, etc: https://python.langchain.com/docs/contributing/ - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: lachiewalker <lachiewalker1@hotmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Paulo Nascimento	4c3a67122f	community[patch]: add Integration for OpenAI image gen with v1 sdk (#17771 ) Description: Created a Langchain Tool for OpenAI DALLE Image Generation. Issue: [#15901](https://github.com/langchain-ai/langchain/issues/15901) Dependencies: n/a Twitter handle: @paulodoestech - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Jiaming	3d3cc71287	community[patch]: fix bugs for bilibili Loader (#18036 ) - Description: 1. Fix the BiliBiliLoader that can receive cookie parameters, it requires 3 other parameters to run. The change is backward compatible. 2. Add test; 3. Add example in docs - Issue: [#14213] Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
Sachin Paryani	25c9f3d1d1	community[patch]: Support Streaming in Azure Machine Learning (#18246 ) - [x] PR title: "community: Support streaming in Azure ML and few naming changes" - [x] PR message: - Description: Added support for streaming for azureml_endpoint. Also, renamed and AzureMLEndpointApiType.realtime to AzureMLEndpointApiType.dedicated. Also, added new classes CustomOpenAIChatContentFormatter and CustomOpenAIContentFormatter and updated the classes LlamaChatContentFormatter and LlamaContentFormatter to now show a deprecated warning message when instantiated. --------- Co-authored-by: Sachin Paryani <saparan@microsoft.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Victor Adan	afa2d85405	community[patch]: Added missing from_documents method to KNNRetriever. (#18411 ) - Description: Added missing `from_documents` method to `KNNRetriever`, providing the ability to supply metadata to LangChain `Document`s, and to give it parity to the other retrievers, which do have `from_documents`. - Issue: None - Dependencies: None - Twitter handle: None Co-authored-by: Victor Adan <vadan@netroadshow.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
wulixuan	b7c8bc8268	community[patch]: fix yuan2 errors in LLMs (#19004 ) 1. fix yuan2 errors while invoke Yuan2. 2. update tests.	6 months ago
Davide Menini	f7042321f1	community[patch]: gather token usage info in BedrockChat during generation (#19127 ) This PR allows to calculate token usage for prompts and completion directly in the generation method of BedrockChat. The token usage details are then returned together with the generations, so that other downstream tasks can access them easily. This allows to define a callback for tokens tracking and cost calculation, similarly to what happens with OpenAI (see [OpenAICallbackHandler](https://api.python.langchain.com/en/latest/_modules/langchain_community/callbacks/openai_info.html#OpenAICallbackHandler). I plan on adding a BedrockCallbackHandler later. Right now keeping track of tokens in the callback is already possible, but it requires passing the llm, as done here: https://how.wtf/how-to-count-amazon-bedrock-anthropic-tokens-with-langchain.html. However, I find the approach of this PR cleaner. Thanks for your reviews. FYI @baskaryan, @hwchase17 --------- Co-authored-by: taamedag <Davide.Menini@swisscom.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Christian Galo	1adaa3c662	community[minor]: Update Azure Cognitive Services to Azure AI Services (#19488 ) This is a follow up to #18371. These are the changes: - New Azure AI Services toolkit and tools to replace those of Azure Cognitive Services. - Updated documentation for Microsoft platform. - The image analysis tool has been rewritten to use the new package `azure-ai-vision-imageanalysis`, doing a proper replacement of `azure-ai-vision`. These changes: - Update outdated naming from "Azure Cognitive Services" to "Azure AI Services". - Update documentation to use non-deprecated methods to create and use agents. - Removes need to depend on yanked python package (`azure-ai-vision`) There is one new dependency that is needed as a replacement to `azure-ai-vision`: - `azure-ai-vision-imageanalysis`. This is optional and declared within a function. There is a new `azure_ai_services.ipynb` notebook showing usage; Changes have been linted and formatted. I am leaving the actions of adding deprecation notices and future removal of Azure Cognitive Services up to the LangChain team, as I am not sure what the current practice around this is. --- If this PR makes it, my handle is @galo@mastodon.social --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com>	6 months ago
Shengsheng Huang	ac1dd8ad94	community[minor]: migrate `bigdl-llm` to `ipex-llm` (#19518 ) - Description: `bigdl-llm` library has been renamed to [`ipex-llm`](https://github.com/intel-analytics/ipex-llm). This PR migrates the `bigdl-llm` integration to `ipex-llm` . - Issue: N/A. The original PR of `bigdl-llm` is https://github.com/langchain-ai/langchain/pull/17953 - Dependencies: `ipex-llm` library - Contribution maintainer: @shane-huang Updated doc: docs/docs/integrations/llms/ipex_llm.ipynb Updated test: libs/community/tests/integration_tests/llms/test_ipex_llm.py	6 months ago
Chaunte W. Lacewell	a31f692f4e	community[minor]: Add VDMS vectorstore (#19551 ) - Description: Add support for Intel Lab's [Visual Data Management System (VDMS)](https://github.com/IntelLabs/vdms) as a vector store - Dependencies: `vdms` library which requires protobuf = "4.24.2". There is a conflict with dashvector in `langchain` package but conflict is resolved in `community`. - Contribution maintainer: [@cwlacewe](https://github.com/cwlacewe) - Added tests: libs/community/tests/integration_tests/vectorstores/test_vdms.py - Added docs: docs/docs/integrations/vectorstores/vdms.ipynb - Added cookbook: cookbook/multi_modal_RAG_vdms.ipynb --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
yongheng.liu	7e29b6061f	community[minor]: integrate China Mobile Ecloud vector search (#15298 ) - Description: integrate China Mobile Ecloud vector search, - Dependencies: elasticsearch==7.10.1 Co-authored-by: liuyongheng <liuyongheng@cmss.chinamobile.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
CaroFG	cf96060ab7	community[patch]: update for compatibility with latest Meilisearch version (#18970 ) - Description: Updates Meilisearch vectorstore for compatibility with v1.6 and above. Adds embedders settings and embedder_name which are now required. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
yuwenzho	3a7d2cf443	community[minor]: Add ITREX optimized Embeddings (#18474 ) Introduction [Intel® Extension for Transformers](https://github.com/intel/intel-extension-for-transformers) is an innovative toolkit designed to accelerate GenAI/LLM everywhere with the optimal performance of Transformer-based models on various Intel platforms Description adding ITREX runtime embeddings using intel-extension-for-transformers. added mdx documentation and example notebooks added embedding import testing. --------- Signed-off-by: yuwenzho <yuwen.zhou@intel.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Fabrizio Ruocco	f12cb0bea4	community[patch]: Microsoft Azure Document Intelligence updates (#16932 ) - Description: Update Azure Document Intelligence implementation by Microsoft team and RAG cookbook with Azure AI Search --------- Co-authored-by: Lu Zhang (AI) <luzhan@microsoft.com> Co-authored-by: Yateng Hong <yatengh@microsoft.com> Co-authored-by: teethache <hongyateng2006@126.com> Co-authored-by: Lu Zhang <44625949+luzhang06@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
xsai9101	160a8eb178	community[minor]: add oracle autonomous database doc loader integration (#19536 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: Adding oracle autonomous database document loader integration. This will allow users to connect to oracle autonomous database through connection string or TNS configuration. https://www.oracle.com/autonomous-database/ - Issue: None - Dependencies: oracledb python package https://pypi.org/project/oracledb/ - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. Unit test and doc are added. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Adrian Valente	2763d8cbe5	community: add len() implementation to Chroma (#19419 ) Thank you for contributing to LangChain! - [x] Add len() implementation to Chroma: "package: community" - [x] PR message: - Description: add an implementation of the __len__() method for the Chroma vectostore, for convenience. - Issue: no exposed method to know the size of a Chroma vectorstore - Dependencies: None - Twitter handle: lowrank_adrian - [x] Add tests and docs - [x] Lint and test --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
Yuki Watanabe	cfecbda48b	community[minor]: Allow passing `allow_dangerous_deserialization` when loading LLM chain (#18894 ) ### Issue Recently, the new `allow_dangerous_deserialization` flag was introduced for preventing unsafe model deserialization that relies on pickle without user's notice (#18696). Since then some LLMs like Databricks requires passing in this flag with true to instantiate the model. However, this breaks existing functionality to loading such LLMs within a chain using `load_chain` method, because the underlying loader function [load_llm_from_config](`f96dd57501/libs/langchain/langchain/chains/loading.py (L40)`) (and load_llm) ignores keyword arguments passed in. ### Solution This PR fixes this issue by propagating the `allow_dangerous_deserialization` argument to the class loader iff the LLM class has that field. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
Christophe Bornet	8595c3ab59	community[minor]: Add InMemoryVectorStore to module level imports (#19576 )	6 months ago
Aayush Kataria	03c38005cb	community[patch]: Fixing some caching issues for AzureCosmosDBSemanticCache (#18884 ) Fixing some issues for AzureCosmosDBSemanticCache - Added the entry for "AzureCosmosDBSemanticCache" which was missing in langchain/cache.py - Added application name when creating the MongoClient for the AzureCosmosDBVectorSearch, for tracking purposes. @baskaryan, can you please review this PR, we need this to go in asap. These are just small fixes which we found today in our testing.	6 months ago
Anindyadeep	b2a11ce686	community[minor]: Prem AI langchain integration (#19113 ) ### Prem SDK integration in LangChain This PR adds the integration with [PremAI's](https://www.premai.io/) prem-sdk with langchain. User can now access to deployed models (llms/embeddings) and use it with langchain's ecosystem. This PR adds the following: ### This PR adds the following: - [x] Add chat support - [X] Adding embedding support - [X] writing integration tests - [X] writing tests for chat - [X] writing tests for embedding - [X] writing unit tests - [X] writing tests for chat - [X] writing tests for embedding - [X] Adding documentation - [X] writing documentation for chat - [X] writing documentation for embedding - [X] run `make test` - [X] run `make lint`, `make lint_diff` - [X] Final checks (spell check, lint, format and overall testing) --------- Co-authored-by: Anindyadeep Sannigrahi <anindyadeepsannigrahi@Anindyadeeps-MacBook-Pro.local> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
Dmitry Tyumentsev	08b769d539	community[patch]: YandexGPT Use recent yandexcloud sdk version (#19341 ) Fixed inability to work with [yandexcloud SDK](https://pypi.org/project/yandexcloud/) version higher 0.265.0	6 months ago
Marlene	f1313339ac	community[patch]: Fixing incorrect base URLs for Azure Cognitive Search Retriever (#19352 ) This PR adds code to make sure that the correct base URL is being created for the Azure Cognitive Search retriever. At the moment an incorrect base URL is being generated. I think this is happening because the original code was based on a depreciated API version. No dependencies need to be added. I've also added more context to the test doc strings. I should also note that ACS is now Azure AI Search. I will open a separate PR to make these changes as that would be a breaking change and should potentially be discussed. Twitter: @marlene_zw - No new tests added, however the current ACS retriever tests are now passing when I run them. - Code was linted. Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
Mikelarg	dac2e0165a	community[minor]: Added GigaChat Embeddings support + updated previous GigaChat integration (#19516 ) - Description: Added integration with [GigaChat](https://developers.sber.ru/portal/products/gigachat) embeddings. Also added support for extra fields in GigaChat LLM and fixed docs.	6 months ago
Martin Kolb	e5bdb26f76	community[patch]: More flexible handling for entity names in vector store "HANA Cloud" (#19523 ) - Description: Added support for lower-case and mixed-case names The names for tables and columns previouly had to be UPPER_CASE. With this enhancement, also lower_case and MixedCase are supported, - Issue: N/A - Dependencies: no new dependecies added - Twitter handle: @sapopensource	6 months ago
Igor Muniz Soares	743f888580	community[minor]: Dappier chat model integration (#19370 ) Description: This PR adds [Dappier](https://dappier.com/) for the chat model. It supports generate, async generate, and batch functionalities. We added unit and integration tests as well as a notebook with more details about our chat model. Dependencies: No extra dependencies are needed.	6 months ago
Hugoberry	96dc180883	community[minor]: Add `DuckDB` as a vectorstore (#18916 ) DuckDB has a cosine similarity function along list and array data types, which can be used as a vector store. - Description: The latest version of DuckDB features a cosine similarity function, which can be used with its support for list or array column types. This PR surfaces this functionality to langchain. - Dependencies: duckdb 0.10.0 - Twitter handle: @igocrite --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
Christophe Bornet	00614f332a	community[minor]: Add InMemoryVectorStore (#19326 ) This is a basic VectorStore implementation using an in-memory dict to store the documents. It doesn't need any extra/optional dependency as it uses numpy which is already a dependency of langchain. This is useful for quick testing, demos, examples. Also it allows to write vendor-neutral tutorials, guides, etc...	6 months ago
Nithish Raghunandanan	7ad0a3f2a7	community: add Couchbase Vector Store (#18994 ) - Description: Added support for Couchbase Vector Search to LangChain. - Dependencies: couchbase>=4.1.12 - Twitter handle: @nithishr --------- Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>	6 months ago
Vittorio Rigamonti	9b2f9ee952	community: VectorStore Infinispan, adding autoconfiguration (#18967 ) Description: this PR enable VectorStore autoconfiguration for Infinispan: if metadatas are only of basic types, protobuf config will be automatically generated for the user.	6 months ago
gonvee	b82644078e	community: Add `keep_alive` parameter to control how long the model w… (#19005 ) Add `keep_alive` parameter to control how long the model will stay loaded into memory with Ollama。 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Leonid Ganeline	7de1d9acfd	community: `llms` imports fixes (#18943 ) Classes are missed in __all__ and in different places of __init__.py - BaichuanLLM - ChatDatabricks - ChatMlflow - Llamafile - Mlflow - Together Added classes to __all__. I also sorted __all__ list. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	6 months ago
Pengfei Jiang	514fe80778	community[patch]: add stop parameter support to volcengine maas (#19052 ) - Description: add stop parameter to volcengine maas model - Dependencies: no --------- Co-authored-by: 江鹏飞 <jiangpengfei.jiangpf@bytedance.com>	6 months ago
wulixuan	0e0030f494	community[patch]: fix yuan2 chat model errors while invoke. (#19015 ) 1. fix yuan2 chat model errors while invoke. 2. update related tests. 3. fix some deprecationWarning.	6 months ago
fengjial	c922ea36cb	community[minor]: Add Baidu VectorDB as vector store (#17997 ) Co-authored-by: fengjialin <fengjialin@MacBook-Pro.local>	6 months ago
Eugene Yurtsev	6cdca4355d	community[minor]: Revamp PGVector Filtering (#18992 ) This PR makes the following updates in the pgvector database: 1. Use JSONB field for metadata instead of JSON 2. Update operator syntax to include required `$` prefix before the operators (otherwise there will be name collisions with fields) 3. The change is non-breaking, old functionality is still the default, but it will emit a deprecation warning 4. Previous functionality has bugs associated with comparisons due to casting to text (so lexical ordering is used incorrectly for numeric fields) 5. Adds an a GIN index on the JSONB field for more efficient querying	6 months ago
Leonid Ganeline	9c8523b529	community[patch]: flattening imports 3 (#18939 ) @eyurtsev	6 months ago
Virat Singh	cafffe8a21	community: Add PolygonAggregates tool (#18882 ) Description: In this PR, I am adding a `PolygonAggregates` tool, which can be used to get historical stock price data (called aggregates by Polygon) for a given ticker. Polygon [docs](https://polygon.io/docs/stocks/get_v2_aggs_ticker__stocksticker__range__multiplier___timespan___from___to) for this endpoint. Twitter: [@virattt](https://twitter.com/virattt)	6 months ago
Joshua Carroll	ddaf9de169	community: Fix bug with StreamlitChatMessageHistory (#18834 ) - Description: Fix Streamlit bug which was introduced by https://github.com/langchain-ai/langchain/pull/18250, update integration test - Issue: https://github.com/langchain-ai/langchain/issues/18684 - Dependencies: None	7 months ago
Ishani Vyas	2b0cbd65ba	community[patch]: Add Passio Nutrition AI Food Search Tool to Community Package (#18278 ) ## Add Passio Nutrition AI Food Search Tool to Community Package ### Description We propose adding a new tool to the `community` package, enabling integration with Passio Nutrition AI for food search functionality. This tool will provide a simple interface for retrieving nutrition facts through the Passio Nutrition AI API, simplifying user access to nutrition data based on food search queries. ### Implementation Details - Class Structure: Implement `NutritionAI`, extending `BaseTool`. It includes an `_run` method that accepts a query string and, optionally, a `CallbackManagerForToolRun`. - API Integration: Use `NutritionAIAPI` for the API wrapper, encapsulating all interactions with the Passio Nutrition AI and providing a clean API interface. - Error Handling: Implement comprehensive error handling for API request failures. ### Expected Outcome - User Benefits: Enable easy querying of nutrition facts from Passio Nutrition AI, enhancing the utility of the `langchain_community` package for nutrition-related projects. - Functionality: Provide a straightforward method for integrating nutrition information retrieval into users' applications. ### Dependencies - `langchain_core` for base tooling support - `pydantic` for data validation and settings management - Consider `requests` or another HTTP client library if not covered by `NutritionAIAPI`. ### Tests and Documentation - Unit Tests: Include tests that mock network interactions to ensure tool reliability without external API dependency. - Documentation: Create an example notebook in `docs/docs/integrations/tools/passio_nutrition_ai.ipynb` showing usage, setup, and example queries. ### Contribution Guidelines Compliance - Adhere to the project's linting and formatting standards (`make format`, `make lint`, `make test`). - Ensure compliance with LangChain's contribution guidelines, particularly around dependency management and package modifications. ### Additional Notes - Aim for the tool to be a lightweight, focused addition, not introducing significant new dependencies or complexity. - Potential future enhancements could include caching for common queries to improve performance. ### Twitter Handle - Here is our Passio AI [twitter handle](https://twitter.com/@passio_ai) where we announce our products. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17.	7 months ago
Eugene Yurtsev	1f50274df7	community[patch]: Add pgvector to docker compose and update settings used in integration test (#18815 )	7 months ago
Christophe Bornet	e54a49b697	community[minor]: Add lazy_table_reflection param to SqlDatabase (#18742 ) For some DBs with lots of tables, reflection of all the tables can take very long. So this change will make the tables be reflected lazily when get_table_info() is called and `lazy_table_reflection` is True.	7 months ago
Tomaz Bratanic	4bfe888717	comunity[patch]: Fix neo4j sanitizing values (#18750 ) Fixing sanitization for when deeply nested lists appear	7 months ago
Yunmo Koo	fee6f983ef	community[minor]: Integration for `Friendli` LLM and `ChatFriendli` ChatModel. (#17913 ) ## Description - Add [Friendli](https://friendli.ai/) integration for `Friendli` LLM and `ChatFriendli` chat model. - Unit tests and integration tests corresponding to this change are added. - Documentations corresponding to this change are added. ## Dependencies - Optional dependency [`friendli-client`](https://pypi.org/project/friendli-client/) package is added only for those who use `Frienldi` or `ChatFriendli` model. ## Twitter handle - https://twitter.com/friendliai	7 months ago
Ian	390ef6abe3	community[minor]: Add Initial Support for TiDB Vector Store (#15796 ) This pull request introduces initial support for the TiDB vector store. The current version is basic, laying the foundation for the vector store integration. While this implementation provides the essential features, we plan to expand and improve the TiDB vector store support with additional enhancements in future updates. Upcoming Enhancements: * Support for Vector Index Creation: To enhance the efficiency and performance of the vector store. * Support for max marginal relevance search. * Customized Table Structure Support: Recognizing the need for flexibility, we plan for more tailored and efficient data store solutions. Simple use case exmaple ```python from typing import List, Tuple from langchain.docstore.document import Document from langchain_community.vectorstores import TiDBVectorStore from langchain_openai import OpenAIEmbeddings db = TiDBVectorStore.from_texts( embedding=embeddings, texts=['Andrew like eating oranges', 'Alexandra is from England', 'Ketanji Brown Jackson is a judge'], table_name="tidb_vector_langchain", connection_string=tidb_connection_url, distance_strategy="cosine", ) query = "Can you tell me about Alexandra?" docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query) for doc, score in docs_with_score: print("-" * 80) print("Score: ", score) print(doc.page_content) print("-" * 80) ```	7 months ago
Eugene Yurtsev	e188d4ecb0	Add dangerous parameter to requests tool (#18697 ) The tools are already documented as dangerous. Not clear whether adding an opt-in parameter is necessary or not	7 months ago
Erick Friis	1beb84b061	community[patch]: move pdf text tests to integration (#18746 )	7 months ago
Christophe Bornet	4a7d73b39d	community: If load() has been overridden, use it in default lazy_load() (#18690 )	7 months ago
Sam Khano	1b4dcf22f3	community[minor]: Add DocumentDBVectorSearch VectorStore (#17757 ) Description: - Added Amazon DocumentDB Vector Search integration (HNSW index) - Added integration tests - Updated AWS documentation with DocumentDB Vector Search instructions - Added notebook for DocumentDB integration with example usage --------- Co-authored-by: EC2 Default User <ec2-user@ip-172-31-95-226.ec2.internal>	7 months ago
Vittorio Rigamonti	51f3902bc4	community[minor]: Adding support for Infinispan as VectorStore (#17861 ) Description: This integrates Infinispan as a vectorstore. Infinispan is an open-source key-value data grid, it can work as single node as well as distributed. Vector search is supported since release 15.x For more: [Infinispan Home](https://infinispan.org) Integration tests are provided as well as a demo notebook	7 months ago
Djordje	12b4a4d860	community[patch]: Opensearch delete method added - indexing supported (#18522 ) - Description: Added delete method for OpenSearchVectorSearch, therefore indexing supported - Issue: No - Dependencies: No - Twitter handle: stkbmf	7 months ago
Eugene Yurtsev	4c25b49229	community[major]: breaking change in some APIs to force users to opt-in for pickling (#18696 ) This is a PR that adds a dangerous load parameter to force users to opt in to use pickle. This is a PR that's meant to raise user awareness that the pickling module is involved.	7 months ago
Eugene Yurtsev	0e52961562	community[patch]: Patch tdidf retriever (CVE-2024-2057) (#18695 ) This is a patch for `CVE-2024-2057`: https://www.cve.org/CVERecord?id=CVE-2024-2057 This affects users that: * Use the `TFIDFRetriever` * Attempt to de-serialize it from an untrusted source that contains a malicious payload	7 months ago
Christophe Bornet	15b1770326	Merge pull request #18421 * Implement lazy_load() for AssemblyAIAudioTranscriptLoader	7 months ago
Christophe Bornet	bb284eebe4	Merge pull request #18436 * Implement lazy_load() for ConfluenceLoader	7 months ago
Liang Zhang	81985b31e6	community[patch]: Databricks SerDe uses cloudpickle instead of pickle (#18607 ) - Description: Databricks SerDe uses cloudpickle instead of pickle when serializing a user-defined function transform_input_fn since pickle does not support functions defined in `__main__`, and cloudpickle supports this. - Dependencies: cloudpickle>=2.0.0 Added a unit test.	7 months ago
Dounx	ad48f55357	community[minor]: add Yuque document loader (#17924 ) This pull request support loading documents from Yuque with Langchain. Yuque is a professional cloud-based knowledge base for team collaboration in documentation. Website: https://www.yuque.com OpenAPI: https://www.yuque.com/yuque/developer/openapi	7 months ago
Kazuki Maeda	60c5d964a8	community[minor]: use jq schema for content_key in json_loader (#18003 ) ### Description Changed the value specified for `content_key` in JSONLoader from a single key to a value based on jq schema. I created [similar PR](https://github.com/langchain-ai/langchain/pull/11255) before, but it has several conflicts because of the architectural change associated stable version release, so I re-create this PR to fit new architecture. ### Why For json data like the following, specify `.data[].attributes.message` for page_content and `.data[].attributes.id` or `.data[].attributes.attributes. tags`, etc., the `content_key` must also parse the json structure. <details> <summary>sample json data</summary> ```json { "data": [ { "attributes": { "message": "message1", "tags": [ "tag1" ] }, "id": "1" }, { "attributes": { "message": "message2", "tags": [ "tag2" ] }, "id": "2" } ] } ``` </details> <details> <summary>sample code</summary> ```python def metadata_func(record: dict, metadata: dict) -> dict: metadata["source"] = None metadata["id"] = record.get("id") metadata["tags"] = record["attributes"].get("tags") return metadata sample_file = "sample1.json" loader = JSONLoader( file_path=sample_file, jq_schema=".data[]", content_key=".attributes.message", ## content_key is parsable into jq schema is_content_key_jq_parsable=True, ## this is added parameter metadata_func=metadata_func ) data = loader.load() data ``` </details> ### Dependencies none ### Twitter handle [kzk_maeda](https://twitter.com/kzk_maeda)	7 months ago
Tomaz Bratanic	ea51cdaede	Remove neo4j bloom labels from graph schema (#18564 ) Neo4j tools use particular node labels and relationship types to store metadata, but are irrelevant for text2cypher or graph generation, so we want to ignore them in the schema representation.	7 months ago
Erick Friis	e1924b3e93	core[patch]: deprecate hwchase17/langchain-hub, address path traversal (#18600 ) Deprecates the old langchain-hub repository. Does not deprecate the new https://smith.langchain.com/hub @PinkDraconian has correctly raised that in the event someone is loading unsanitized user input into the `try_load_from_hub` function, they have the ability to load files from other locations in github than the hwchase17/langchain-hub repository. This PR adds some more path checking to that function and deprecates the functionality in favor of the hub built into LangSmith.	7 months ago
Scott Nath	b051bba1a9	community: Add you.com tool, add async to retriever, add async testing, add You tool doc (#18032 ) - Description: finishes adding the you.com functionality including: - add async functions to utility and retriever - add the You.com Tool - add async testing for utility, retriever, and tool - add a tool integration notebook page - Dependencies: any dependencies required for this change - Twitter handle: @scottnath	7 months ago
Aayush Kataria	7c2f3f6f95	community[minor]: Adding Azure Cosmos Mongo vCore Vector DB Cache (#16856 ) Description: This pull request introduces several enhancements for Azure Cosmos Vector DB, primarily focused on improving caching and search capabilities using Azure Cosmos MongoDB vCore Vector DB. Here's a summary of the changes: - AzureCosmosDBSemanticCache: Added a new cache implementation called AzureCosmosDBSemanticCache, which utilizes Azure Cosmos MongoDB vCore Vector DB for efficient caching of semantic data. Added comprehensive test cases for AzureCosmosDBSemanticCache to ensure its correctness and robustness. These tests cover various scenarios and edge cases to validate the cache's behavior. - HNSW Vector Search: Added HNSW vector search functionality in the CosmosDB Vector Search module. This enhancement enables more efficient and accurate vector searches by utilizing the HNSW (Hierarchical Navigable Small World) algorithm. Added corresponding test cases to validate the HNSW vector search functionality in both AzureCosmosDBSemanticCache and AzureCosmosDBVectorSearch. These tests ensure the correctness and performance of the HNSW search algorithm. - LLM Caching Notebook - The notebook now includes a comprehensive example showcasing the usage of the AzureCosmosDBSemanticCache. This example highlights how the cache can be employed to efficiently store and retrieve semantic data. Additionally, the example provides default values for all parameters used within the AzureCosmosDBSemanticCache, ensuring clarity and ease of understanding for users who are new to the cache implementation. @hwchase17,@baskaryan, @eyurtsev,	7 months ago
Kate Silverstein	b7c71e2e07	community[minor]: llamafile embeddings support (#17976 ) * Description: adds `LlamafileEmbeddings` class implementation for generating embeddings using [llamafile](https://github.com/Mozilla-Ocho/llamafile)-based models. Includes related unit tests and notebook showing example usage. * Issue: N/A * Dependencies: N/A	7 months ago
Tomaz Bratanic	f6bfb969ba	community[patch]: Add an option for indexed generic label when import neo4j graph documents (#18122 ) Current implementation doesn't have an indexed property that would optimize the import. I have added a `baseEntityLabel` parameter that allows you to add a secondary node label, which has an indexed id `property`. By default, the behaviour is identical to previous version. Since multi-labeled nodes are terrible for text2cypher, I removed the secondary label from schema representation object and string, which is used in text2cypher.	7 months ago
Arun Sathiya	4adac20d7b	community[patch]: Make cohere_api_key a SecretStr (#12188 ) This PR makes `cohere_api_key` in `llms/cohere` a SecretStr, so that the API Key is not leaked when `Cohere.cohere_api_key` is represented as a string. --------- Signed-off-by: Arun <arun@arun.blog> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	7 months ago
Petteri Johansson	6c1989d292	community[minor], langchain[minor], docs: Gremlin Graph Store and QA Chain (#17683 ) - Description: New feature: Gremlin graph-store and QA chain (including docs). Compatible with Azure CosmosDB. - Dependencies: no changes	7 months ago
Ather Fawaz	a5ccf5d33c	community[minor]: Add support for Perplexity chat model(#17024 ) - Description: This PR adds support for [Perplexity AI APIs](https://blog.perplexity.ai/blog/introducing-pplx-api). - Issues: None - Dependencies: None - Twitter handle: [@atherfawaz](https://twitter.com/AtherFawaz) --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	7 months ago
Rodrigo Nogueira	3438d2cbcc	community[minor]: add maritalk chat (#17675 ) Description: Adds the MariTalk chat that is based on a LLM specially trained for Portuguese. Twitter handle: @MaritacaAI	7 months ago
Yujie Qian	cbb65741a7	community[patch]: Voyage AI updates default model and batch size (#17655 ) - Description: update the default model and batch size in VoyageEmbeddings - Issue: N/A - Dependencies: N/A - Twitter handle: N/A --------- Co-authored-by: fodizoltan <zoltan@conway.expert>	7 months ago
Shengsheng Huang	ae471a7dcb	community[minor]: add BigDL-LLM integrations (#17953 ) - Description: [`bigdl-llm`](https://github.com/intel-analytics/BigDL) is a library for running LLM on Intel XPU (from Laptop to GPU to Cloud) using INT4/FP4/INT8/FP8 with very low latency (for any PyTorch model). This PR adds bigdl-llm integrations to langchain. - Issue: NA - Dependencies: `bigdl-llm` library - Contribution maintainer: @shane-huang Examples added: - docs/docs/integrations/llms/bigdl.ipynb	7 months ago
Ethan Yang	f61cb8d407	community[minor]: Add openvino backend support (#11591 ) - Description: add openvino backend support by HuggingFace Optimum Intel, - Dependencies: “optimum[openvino]”, --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	7 months ago
RadhikaBansal97	8bafd2df5e	community[patch]: Change github endpoint in GithubLoader (#17622 ) Description- - Changed the GitHub endpoint as existing was not working and giving 404 not found error - Also the existing function was failing if file_filter is not passed as the tree api return all paths including directory as well, and when get_file_content was iterating over these path, the function was failing for directory as the api was returning list of files inside the directory, so added a condition to ignore the paths if it a directory - Fixes this issue - https://github.com/langchain-ai/langchain/issues/17453 Co-authored-by: Radhika Bansal <Radhika.Bansal@veritas.com>	7 months ago
Eugene Yurtsev	51b661cfe8	community[patch]: BaseLoader load method should just delegate to lazy_load (#18289 ) load() should just reference lazy_load()	7 months ago
Kai Kugler	df234fb171	community[patch]: Fixing embedchain document mapping (#18255 ) - Description: The current embedchain implementation seems to handle document metadata differently than done in the current implementation of langchain and a KeyError is thrown. I would love for someone else to test this... --------- Co-authored-by: KKUGLER <kai.kugler@mercedes-benz.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Deshraj Yadav <deshraj@gatech.edu>	7 months ago
Erick Friis	040271f33a	community[patch]: remove llmlingua extended tests (#18344 )	7 months ago
Tomaz Bratanic	5999c4a240	Add support for parameters in neo4j retrieval query (#18310 ) Sometimes, you want to use various parameters in the retrieval query of Neo4j Vector to personalize/customize results. Before, when there were only predefined chains, it didn't really make sense. Now that it's all about custom chains and LCEL, it is worth adding since users can inject any params they wish at query time. Isn't prone to SQL injection-type attacks since we use parameters and not concatenating strings.	7 months ago
Virat Singh	cd926ac3dd	community: Add PolygonFinancials Tool (#18324 ) Description: In this PR, I am adding a `PolygonFinancials` tool, which can be used to get financials data for a given ticker. The financials data is the fundamental data that is found in income statements, balance sheets, and cash flow statements of public US companies. Twitter: [@virattt](https://twitter.com/virattt)	7 months ago
Eugene Yurtsev	cd52433ba0	community[minor]: Add `SQLDatabaseLoader` document loader (#18281 ) - Description: A generic document loader adapter for SQLAlchemy on top of LangChain's `SQLDatabaseLoader`. - Needed by: https://github.com/crate-workbench/langchain/pull/1 - Depends on: GH-16655 - Addressed to: @baskaryan, @cbornet, @eyurtsev Hi from CrateDB again, in the same spirit like GH-16243 and GH-16244, this patch breaks out another commit from https://github.com/crate-workbench/langchain/pull/1, in order to reduce the size of this patch before submitting it, and to separate concerns. To accompany the SQLAlchemy adapter implementation, the patch includes integration tests for both SQLite and PostgreSQL. Let me know if corresponding utility resources should be added at different spots. With kind regards, Andreas. ### Software Tests ```console docker compose --file libs/community/tests/integration_tests/document_loaders/docker-compose/postgresql.yml up ``` ```console cd libs/community pip install psycopg2-binary pytest -vvv tests/integration_tests -k sqldatabase ``` ``` 14 passed ``` ![image](https://github.com/langchain-ai/langchain/assets/453543/42be233c-eb37-4c76-a830-474276e01436) --------- Co-authored-by: Andreas Motl <andreas.motl@crate.io>	7 months ago
David Ruan	af35e2525a	community[minor]: add hugging_face_model document loader (#17323 ) - Description: add hugging_face_model document loader, - Issue: NA, - Dependencies: NA, --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	7 months ago
Ayo Ayibiowu	ac1d7d9de8	community[feat]: Adds LLMLingua as a document compressor (#17711 ) Description: This PR adds support for using the [LLMLingua project ](https://github.com/microsoft/LLMLingua) especially the LongLLMLingua (Enhancing Large Language Model Inference via Prompt Compression) as a document compressor / transformer. The LLMLingua project is an interesting project that can greatly improve RAG system by compressing prompts and contexts while keeping their semantic relevance. Issue: https://github.com/microsoft/LLMLingua/issues/31 Dependencies: [llmlingua](https://pypi.org/project/llmlingua/) @baskaryan --------- Co-authored-by: Ayodeji Ayibiowu <ayodeji.ayibiowu@getinge.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	7 months ago
am-kinetica	9b8f6455b1	Langchain vectorstore integration with Kinetica (#18102 ) - Description: New vectorstore integration with the Kinetica database - Issue: - Dependencies: the Kinetica Python API `pip install gpudb==7.2.0.1`, - Tag maintainer: @baskaryan, @hwchase17 - Twitter handle: --------- Co-authored-by: Chad Juliano <cjuliano@kinetica.com>	7 months ago
Dan Stambler	69344a0661	community: Add Laser Embedding Integration (#18111 ) - Description: Added Integration with Meta AI's LASER Language-Agnostic SEntence Representations embedding library, which supports multilingual embedding for any of the languages listed here: https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200, including several low resource languages - Dependencies: laser_encoders	7 months ago
Barun Amalkumar Halder	cc69976860	community[minor] : adds callback handler for Fiddler AI (#17708 ) Description: Callback handler to integrate fiddler with langchain. This PR adds the following - 1. `FiddlerCallbackHandler` implementation into langchain/community 2. Example notebook `fiddler.ipynb` for usage documentation [Internal Tracker : FDL-14305] Issue: NA Dependencies: - Installation of langchain-community is unaffected. - Usage of FiddlerCallbackHandler requires installation of latest fiddler-client (2.5+) Twitter handle: @fiddlerlabs @behalder Co-authored-by: Barun Halder <barun@fiddler.ai>	7 months ago
2jimoo	7fc903464a	community: Add document manager and mongo document manager (#17320 ) - Description: - Add DocumentManager class, which is a nosql record manager. - In order to use index and aindex in libs/langchain/langchain/indexes/_api.py, DocumentManager inherits RecordManager. - Also I added the MongoDB implementation of Document Manager too. - Dependencies: pymongo, motor <!-- Thank you for contributing to LangChain! Please title your PR "<package>: <description>", where <package> is whichever of langchain, community, core, experimental, etc. is being modified. Replace this entire comment with: - Description: Add DocumentManager class, which is a no sql record manager. To use index method and aindex method in indexes._api.py, Document Manager inherits RecordManager.Add the MongoDB implementation of Document Manager. - Dependencies: pymongo, motor Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` from the root of the package you've modified to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://python.langchain.com/docs/contributing/ If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	7 months ago
Chad Juliano	50ba3c68bb	community[minor]: add Kinetica LLM wrapper (#17879 ) Description: Initial pull request for Kinetica LLM wrapper Issue: N/A Dependencies: No new dependencies for unit tests. Integration tests require gpudb, typeguard, and faker Twitter handle: @chad_juliano Note: There is another pull request for Kinetica vectorstore. Ultimately we would like to make a partner package but we are starting with a community contribution.	7 months ago
Brad Erickson	ecd72d26cf	community: Bugfix - correct Ollama API path to avoid HTTP 307 (#17895 ) Sets the correct /api/generate path, without ending /, to reduce HTTP requests. Reference: https://github.com/ollama/ollama/blob/efe040f8/docs/api.md#generate-request-streaming Before: DEBUG: Starting new HTTP connection (1): localhost:11434 DEBUG: http://localhost:11434 "POST /api/generate/ HTTP/1.1" 307 0 DEBUG: http://localhost:11434 "POST /api/generate HTTP/1.1" 200 None After: DEBUG: Starting new HTTP connection (1): localhost:11434 DEBUG: http://localhost:11434 "POST /api/generate HTTP/1.1" 200 None	7 months ago
Ian	3019a594b7	community[minor]: Add tidb loader support (#17788 ) This pull request support loading data from TiDB database with Langchain. A simple usage: ``` from langchain_community.document_loaders import TiDBLoader CONNECTION_STRING = "mysql+pymysql://root@127.0.0.1:4000/test" QUERY = "select id, name, description from items;" loader = TiDBLoader( connection_string=CONNECTION_STRING, query=QUERY, page_content_columns=["name", "description"], metadata_columns=["id"], ) documents = loader.load() print(documents) ```	7 months ago
ehude	9e54c227f1	community[patch]: Bug Neo4j VectorStore when having multiple indexes the sort is not working and the store that returned is random (#17396 ) Bug fix: when having multiple indexes the sort is not working and the store that returned is random. The following small fix resolves the issue.	7 months ago
Michael Feil	242981b8f0	community[minor]: infinity embedding local option (#17671 ) drop-in-replacement for sentence-transformers inference. https://github.com/langchain-ai/langchain/discussions/17670 tldr from the discussion above -> around a 4x-22x speedup over using SentenceTransformers / huggingface embeddings. For more info: https://github.com/michaelfeil/infinity (pure-python dependency) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	7 months ago
Zachary Toliver	29ee0496b6	community[patch]: Allow override of 'fetch_schema_from_transport' in the GraphQL tool (#17649 ) - Description: In order to override the bool value of "fetch_schema_from_transport" in the GraphQLAPIWrapper, a "fetch_schema_from_transport" value needed to be added to the "_EXTRA_OPTIONAL_TOOLS" dictionary in load_tools in the "graphql" key. The parameter "fetch_schema_from_transport" must also be passed in to the GraphQLAPIWrapper to allow reading of the value when creating the client. Passing as an optional parameter is probably best to avoid breaking changes. This change is necessary to support GraphQL instances that do not support fetching schema, such as TigerGraph. More info here: [TigerGraph GraphQL Schema Docs](https://docs.tigergraph.com/graphql/current/schema) - Threads handle: @zacharytoliver --------- Co-authored-by: Zachary Toliver <zt10191991@hotmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	7 months ago
volodymyr-memsql	0a9a519a39	community[patch]: Added add_images method to SingleStoreDB vector store (#17871 ) In this pull request, we introduce the add_images method to the SingleStoreDB vector store class, expanding its capabilities to handle multi-modal embeddings seamlessly. This method facilitates the incorporation of image data into the vector store by associating each image's URI with corresponding document content, metadata, and either pre-generated embeddings or embeddings computed using the embed_image method of the provided embedding object. the change includes integration tests, validating the behavior of the add_images. Additionally, we provide a notebook showcasing the usage of this new method. --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>	7 months ago
Guangdong Liu	47b1b7092d	community[minor]: Add SparkLLM to community (#17702 )	7 months ago
Guangdong Liu	3ba1cb8650	community[minor]: Add SparkLLM Text Embedding Model and SparkLLM introduction (#17573 )	7 months ago
Virat Singh	92e52e89ca	community: Add PolygonTickerNews Tool (#17808 ) Description: In this PR, I am adding a PolygonTickerNews Tool, which can be used to get the latest news for a given ticker / stock. Twitter handle: [@virattt](https://twitter.com/virattt)	7 months ago
CogniJT	919ebcc596	community[minor]: CogniSwitch Agent Toolkit for LangChain (#17312 ) Description: CogniSwitch focusses on making GenAI usage more reliable. It abstracts out the complexity & decision making required for tuning processing, storage & retrieval. Using simple APIs documents / URLs can be processed into a Knowledge Graph that can then be used to answer questions. Dependencies: No dependencies. Just network calls & API key required Tag maintainer: @hwchase17 Twitter handle: https://github.com/CogniSwitch Documentation: Please check `docs/docs/integrations/toolkits/cogniswitch.ipynb` Tests: The usual tool & toolkits tests using `test_imports.py` PR has passed linting and testing before this submission. --------- Co-authored-by: Saicharan Sridhara <145636106+saiCogniswitch@users.noreply.github.com>	7 months ago
Aymeric Roucher	0d294760e7	Community: Fuse HuggingFace Endpoint-related classes into one (#17254 ) ## Description Fuse HuggingFace Endpoint-related classes into one: - [HuggingFaceHub](`5ceaf784f3/libs/community/langchain_community/llms/huggingface_hub.py`) - [HuggingFaceTextGenInference](`5ceaf784f3/libs/community/langchain_community/llms/huggingface_text_gen_inference.py`) - and [HuggingFaceEndpoint](`5ceaf784f3/libs/community/langchain_community/llms/huggingface_endpoint.py`) Are fused into - HuggingFaceEndpoint ## Issue The deduplication of classes was creating a lack of clarity, and additional effort to develop classes leads to issues like [this hack](`5ceaf784f3/libs/community/langchain_community/llms/huggingface_endpoint.py (L159)`). ## Dependancies None, this removes dependancies. ## Twitter handle If you want to post about this: @AymericRoucher --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	7 months ago
Raghav Dixit	6c18f73ca5	community[patch]: LanceDB integration improvements/fixes (#16173 ) Hi, I'm from the LanceDB team. Improves LanceDB integration by making it easier to use - now you aren't required to create tables manually and pass them in the constructor, although that is still backward compatible. Bug fix - pandas was being used even though it's not a dependency for LanceDB or langchain PS - this issue was raised a few months ago but lost traction. It is a feature improvement for our users kindly review this , Thanks !	7 months ago
Christophe Bornet	e92e96193f	community[minor]: Add async methods to the AstraDB BaseStore (#16872 ) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	7 months ago
Guangdong Liu	73edf17b4e	community[minor]: Add Apache Doris as vector store (#17527 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	7 months ago
Nejc Habjan	b4fa847a90	community[minor]: add exclude parameter to DirectoryLoader (#17316 ) - Description: adds an `exclude` parameter to the DirectoryLoader class, based on similar behavior in GenericLoader - Issue: discussed in https://github.com/langchain-ai/langchain/discussions/9059 and I think in some other issues that I cannot find at the moment 🙇 - Dependencies: None - Twitter handle: don't have one sorry! Just https://github/nejch --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	7 months ago
Bagatur	8f14234afb	infra: ignore flakey lua test (#17618 )	7 months ago
morgana	9d7ca7df6e	community[patch]: update copy of metadata in rockset vectorstore integration (#17612 ) - Description: This fixes an issue with working with RecordManager. RecordManager was generating new hashes on documents because `add_texts` was modifying the metadata directly. Additionally moved some tests to unit tests since that was a more appropriate home. - Issue: N/A - Dependencies: N/A - Twitter handle: `@_morgan_adams_`	7 months ago
Moshe Berchansky	20a56fe0a2	community[minor]: Add QuantizedEmbedders (#17391 ) Description: * adding Quantized embedders using optimum-intel and intel-extension-for-pytorch. * added mdx documentation and example notebooks * added embedding import testing. Dependencies: optimum = {extras = ["neural-compressor"], version = "^1.14.0", optional = true} intel_extension_for_pytorch = {version = "^2.2.0", optional = true} Dependencies have been added to pyproject.toml for the community lib. Twitter handle: @peter_izsak --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	7 months ago
Amir Karbasi	bccc9241ea	community[patch]: Resolve KuzuQAChain API Changes (#16885 ) - Description: Updates to the Kuzu API had broken this functionality. These updates resolve those issues and add a new test to demonstrate the updates. - Issue: #11874 - Dependencies: No new dependencies - Twitter handle: @amirk08 Test results: ``` tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_no_params PASSED [ 33%] tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params PASSED [ 66%] tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema PASSED [100%] =================================================== slowest 5 durations =================================================== 0.53s call tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema 0.34s call tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_no_params 0.28s call tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params 0.03s teardown tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema 0.02s teardown tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params ==================================================== 3 passed in 1.27s ==================================================== ```	7 months ago
Christophe Bornet	789cd5198d	community[patch]: Use astrapy built-in pagination prefetch in AstraDBLoader (#17569 )	7 months ago
Christophe Bornet	ff1f985a2a	community: Fix some mypy types in cassandra doc loader (#17570 ) Thank you!	7 months ago
nvpranak	91bcc9c5c9	community[minor]: Nemo embeddings(#16206 ) This PR is adding support for NVIDIA NeMo embeddings issue #16095. --------- Co-authored-by: Praveen Nakshatrala <pnakshatrala@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	7 months ago
wulixuan	c776cfc599	community[minor]: integrate with model Yuan2.0 (#15411 ) 1. integrate with [`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md) 2. update `langchain.llms` 3. add a new doc for [Yuan2.0 integration](docs/docs/integrations/llms/yuan2.ipynb) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	7 months ago
volodymyr-memsql	e36bc379f2	community[patch]: Add vector index support to SingleStoreDB VectorStore (#17308 ) This pull request introduces support for various Approximate Nearest Neighbor (ANN) vector index algorithms in the VectorStore class, starting from version 8.5 of SingleStore DB. Leveraging this enhancement enables users to harness the power of vector indexing, significantly boosting search speed, particularly when handling large sets of vectors. --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	7 months ago
Kate Silverstein	0bc4a9b3fc	community[minor]: Adds Llamafile as an LLM (#17431 ) * Description: Adds a simple LLM implementation for interacting with [llamafile](https://github.com/Mozilla-Ocho/llamafile)-based models. * Dependencies: N/A * Issue: N/A Detail [llamafile](https://github.com/Mozilla-Ocho/llamafile) lets you run LLMs locally from a single file on most computers without installing any dependencies. To use the llamafile LLM implementation, the user needs to: 1. Download a llamafile e.g. https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile?download=true 2. Make the file executable. 3. Run the llamafile in 'server mode'. (All llamafiles come packaged with a lightweight server; by default, the server listens at `http://localhost:8080`.) ```bash wget https://url/of/model.llamafile chmod +x model.llamafile ./model.llamafile --server --nobrowser ``` Now, the user can invoke the LLM via the LangChain client: ```python from langchain_community.llms.llamafile import Llamafile llm = Llamafile() llm.invoke("Tell me a joke.") ```	7 months ago
Qihui Xie	5738143d4b	add mongodb_store (#13801 ) # Add MongoDB storage - Description: Add MongoDB Storage as an option for large doc store. Example usage: ```Python # Instantiate the MongodbStore with a MongoDB connection from langchain.storage import MongodbStore mongo_conn_str = "mongodb://localhost:27017/" mongodb_store = MongodbStore(mongo_conn_str, db_name="test-db", collection_name="test-collection") # Set values for keys doc1 = Document(page_content='test1') doc2 = Document(page_content='test2') mongodb_store.mset([("key1", doc1), ("key2", doc2)]) # Get values for keys values = mongodb_store.mget(["key1", "key2"]) # [doc1, doc2] # Iterate over keys for key in mongodb_store.yield_keys(): print(key) # Delete keys mongodb_store.mdelete(["key1", "key2"]) ``` - Dependencies: Use `mongomock` for integration test. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	7 months ago
wulixuan	5d06797905	community[minor]: integrate chat models with Yuan2.0 (#16575 ) 1. integrate chat models with [`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md) 2. add a new doc for [Yuan2.0 integration](docs/docs/integrations/llms/yuan2.ipynb) Yuan2.0 is a new generation Fundamental Large Language Model developed by IEIT System. We have published all three models, Yuan 2.0-102B, Yuan 2.0-51B, and Yuan 2.0-2B. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	7 months ago
Ian Gregory	e5472b5eb8	Framework for supporting more languages in LanguageParser (#13318 ) ## Description I am submitting this for a school project as part of a team of 5. Other team members are @LeilaChr, @maazh10, @Megabear137, @jelalalamy. This PR also has contributions from community members @Harrolee and @Mario928. Initial context is in the issue we opened (#11229). This pull request adds: - Generic framework for expanding the languages that `LanguageParser` can handle, using the [tree-sitter](https://github.com/tree-sitter/py-tree-sitter#py-tree-sitter) parsing library and existing language-specific parsers written for it - Support for the following additional languages in `LanguageParser`: - C - C++ - C# - Go - Java (contributed by @Mario928 https://github.com/ThatsJustCheesy/langchain/pull/2) - Kotlin - Lua - Perl - Ruby - Rust - Scala - TypeScript (contributed by @Harrolee https://github.com/ThatsJustCheesy/langchain/pull/1) Here is the [design document](https://docs.google.com/document/d/17dB14cKCWAaiTeSeBtxHpoVPGKrsPye8W0o_WClz2kk) if curious, but no need to read it. ## Issues - Closes #11229 - Closes #10996 - Closes #8405 ## Dependencies `tree_sitter` and `tree_sitter_languages` on PyPI. We have tried to add these as optional dependencies. ## Documentation We have updated the list of supported languages, and also added a section to `source_code.ipynb` detailing how to add support for additional languages using our framework. ## Maintainer - @hwchase17 (previously reviewed https://github.com/langchain-ai/langchain/pull/6486) Thanks!! ## Git commits We will gladly squash any/all of our commits (esp merge commits) if necessary. Let us know if this is desirable, or if you will be squash-merging anyway. <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> --------- Co-authored-by: Maaz Hashmi <mhashmi373@gmail.com> Co-authored-by: LeilaChr <87657694+LeilaChr@users.noreply.github.com> Co-authored-by: Jeremy La <jeremylai511@gmail.com> Co-authored-by: Megabear137 <zubair.alnoor27@gmail.com> Co-authored-by: Lee Harrold <lhharrold@sep.com> Co-authored-by: Mario928 <88029051+Mario928@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	7 months ago
Sridhar Ramaswamy	9f1cbbc6ed	community[minor]: Add pebblo safe document loader (#16862 ) - Description: Pebblo opensource project enables developers to safely load data to their Gen AI apps. It identifies semantic topics and entities found in the loaded data and summarizes them in a developer-friendly report. - Dependencies: none - Twitter handle: srics @hwchase17	7 months ago
mhavey	1bbb64d956	community[minor], langchian[minor]: Add Neptune Rdf graph and chain (#16650 ) Description: This PR adds a chain for Amazon Neptune graph database RDF format. It complements the existing Neptune Cypher chain. The PR also includes a Neptune RDF graph class to connect to, introspect, and query a Neptune RDF graph database from the chain. A sample notebook is provided under docs that demonstrates the overall effect: invoking the chain to make natural language queries against Neptune using an LLM. Issue: This is a new feature Dependencies: The RDF graph class depends on the AWS boto3 library if using IAM authentication to connect to the Neptune database. --------- Co-authored-by: Piyush Jain <piyushjain@duck.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	7 months ago
Andreas Motl	1fdd9bd980	community/SQLDatabase: Generalize and trim software tests (#16659 ) - Description: Improve test cases for `SQLDatabase` adapter component, see [suggestion](https://github.com/langchain-ai/langchain/pull/16655#pullrequestreview-1846749474). - Depends on: GH-16655 - Addressed to: @baskaryan, @cbornet, @eyurtsev _Remark: This PR is stacked upon GH-16655, so that one will need to go in first._ Edit: Thank you for bringing in GH-17191, @eyurtsev. This is a little aftermath, improving/streamlining the corresponding test cases.	7 months ago
Chris	f9f5626ca4	community[patch]: Fix github search issues and PRs PaginatedList has no len() error (#16806 ) Description: Bugfix: Langchain_community's GitHub Api wrapper throws a TypeError when searching for issues and/or PRs (the `search_issues_and_prs` method). This is because PyGithub's PageinatedList type does not support the len() method. See https://github.com/PyGithub/PyGithub/issues/1476 ![image](https://github.com/langchain-ai/langchain/assets/8849021/57390b11-ed41-4f48-ba50-f3028610789c) Dependencies: None Twitter handle: @ChrisKeoghNZ I haven't registered an issue as it would take me longer to fill the template out than to make the fix, but I'm happy to if that's deemed essential. I've added a simple integration test to cover this as there were no existing unit tests and it was going to be tricky to set them up. Co-authored-by: Chris Keogh <chris.keogh@xero.com>	7 months ago
morgana	722aae4fd1	community: add delete method to rocksetdb vectorstore to support recordmanager (#17030 ) - Description: This adds a delete method so that rocksetdb can be used with `RecordManager`. - Issue: N/A - Dependencies: N/A - Twitter handle: `@_morgan_adams_` --------- Co-authored-by: Rockset API Bot <admin@rockset.io>	7 months ago
yin1991	37ef6ac113	community[patch]: Add Pagination to GitHubIssuesLoader for Efficient GitHub Issues Retrieval (#16934 ) - Description: Add Pagination to GitHubIssuesLoader for Efficient GitHub Issues Retrieval - Issue: [the issue # it fixes if applicable,](https://github.com/langchain-ai/langchain/issues/16864) --------- Co-authored-by: root <root@ip-172-31-46-160.ap-southeast-1.compute.internal> Co-authored-by: Bagatur <baskaryan@gmail.com>	7 months ago
Spencer Kelly	54fa78c887	community[patch]: fixed vector similarity filtering (#16967 ) Description: changed filtering so that failed filter doesn't add document to results. Currently filtering is entirely broken and all documents are returned whether or not they pass the filter. fixes issue introduced in https://github.com/langchain-ai/langchain/pull/16190	7 months ago
Abhijeeth Padarthi	584b647b96	community[minor]: AWS Athena Document Loader (#15625 ) - Description: Adds the document loader for [AWS Athena](https://aws.amazon.com/athena/), a serverless and interactive analytics service. - Dependencies: Added boto3 as a dependency	7 months ago
david-tempelmann	93da18b667	community[minor]: Add mmr and similarity_score_threshold retrieval to DatabricksVectorSearch (#16829 ) - Description: This PR adds support for `search_types="mmr"` and `search_type="similarity_score_threshold"` to retrievers using `DatabricksVectorSearch`, - Issue: - Dependencies: - Twitter handle: --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	7 months ago
Erick Friis	3a2eb6e12b	infra: add print rule to ruff (#16221 ) Added noqa for existing prints. Can slowly remove / will prevent more being intro'd	8 months ago
Quang Hoa	54c1fb3f25	community[patch]: Make some functions work with Milvus (#10695 ) Description Make some functions work with Milvus: 1. get_ids: Get primary keys by field in the metadata 2. delete: Delete one or more entities by ids 3. upsert: Update/Insert one or more entities Issue None Dependencies None Tag maintainer: @hwchase17 Twitter handle: None --------- Co-authored-by: HoaNQ9 <hoanq.1811@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	8 months ago
kYLe	c9999557bf	community[patch]: Modify LLMs/Anyscale work with OpenAI API v1 (#14206 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> - Description: 1. Modify LLMs/Anyscale to work with OAI v1 2. Get rid of openai_ prefixed variables in Chat_model/ChatAnyscale 3. Modify `anyscale_api_base` to `anyscale_base_url` to follow OAI name convention (reverted) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	8 months ago
Bagatur	65e97c9b53	infra: mv SQLDatabase tests to community (#17276 )	8 months ago
Armin Stepanyan	641efcf41c	community: add runtime kwargs to HuggingFacePipeline (#17005 ) This PR enables changing the behaviour of huggingface pipeline between different calls. For example, before this PR there's no way of changing maximum generation length between different invocations of the chain. This is desirable in cases, such as when we want to scale the maximum output size depending on a dynamic prompt size. Usage example: ```python from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline model_id = "gpt2" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForCausalLM.from_pretrained(model_id) pipe = pipeline("text-generation", model=model, tokenizer=tokenizer) hf = HuggingFacePipeline(pipeline=pipe) hf("Say foo:", pipeline_kwargs={"max_new_tokens": 42}) ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	8 months ago
Scott Nath	a32798abd7	community: Add you.com utility, update you retriever integration docs (#17014 ) <!-- Thank you for contributing to LangChain! Please title your PR "<package>: <description>", where <package> is whichever of langchain, community, core, experimental, etc. is being modified. Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes if applicable, - Dependencies: any dependencies required for this change, - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` from the root of the package you've modified to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://python.langchain.com/docs/contributing/ If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> - Description: changes to you.com files - general cleanup - adds community/utilities/you.py, moving bulk of code from retriever -> utility - removes `snippet` as endpoint - adds `news` as endpoint - adds more tests <s>Description: update community MAKE file - adds `integration_tests` - adds `coverage`</s> - Issue: the issue # it fixes if applicable, - [For New Contributors: Update Integration Documentation](https://github.com/langchain-ai/langchain/issues/15664#issuecomment-1920099868) - Dependencies: n/a - Twitter handle: @scottnath - Mastodon handle: scottnath@mastodon.social --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	8 months ago
Liang Zhang	7306600e2f	community[patch]: Support SerDe transform functions in Databricks LLM (#16752 ) Description: Databricks LLM does not support SerDe the transform_input_fn and transform_output_fn. After saving and loading, the LLM will be broken. This PR serialize these functions into a hex string using pickle, and saving the hex string in the yaml file. Using pickle to serialize a function can be flaky, but this is a simple workaround that unblocks many use cases. If more sophisticated SerDe is needed, we can improve it later. Test: Added a simple unit test. I did manual test on Databricks and it works well. The saved yaml looks like: ``` llm: _type: databricks cluster_driver_port: null cluster_id: null databricks_uri: databricks endpoint_name: databricks-mixtral-8x7b-instruct extra_params: {} host: e2-dogfood.staging.cloud.databricks.com max_tokens: null model_kwargs: null n: 1 stop: null task: null temperature: 0.0 transform_input_fn: 80049520000000000000008c085f5f6d61696e5f5f948c0f7472616e73666f726d5f696e7075749493942e transform_output_fn: null ``` @baskaryan ```python from langchain_community.embeddings import DatabricksEmbeddings from langchain_community.llms import Databricks from langchain.chains import RetrievalQA from langchain.document_loaders import TextLoader from langchain.text_splitter import CharacterTextSplitter from langchain.vectorstores import FAISS import mlflow embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en") def transform_input(**request): request["messages"] = [ { "role": "user", "content": request["prompt"] } ] del request["prompt"] return request llm = Databricks(endpoint_name="databricks-mixtral-8x7b-instruct", transform_input_fn=transform_input) persist_dir = "faiss_databricks_embedding" # Create the vector db, persist the db to a local fs folder loader = TextLoader("state_of_the_union.txt") documents = loader.load() text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0) docs = text_splitter.split_documents(documents) db = FAISS.from_documents(docs, embeddings) db.save_local(persist_dir) def load_retriever(persist_directory): embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en") vectorstore = FAISS.load_local(persist_directory, embeddings) return vectorstore.as_retriever() retriever = load_retriever(persist_dir) retrievalQA = RetrievalQA.from_llm(llm=llm, retriever=retriever) with mlflow.start_run() as run: logged_model = mlflow.langchain.log_model( retrievalQA, artifact_path="retrieval_qa", loader_fn=load_retriever, persist_dir=persist_dir, ) # Load the retrievalQA chain loaded_model = mlflow.pyfunc.load_model(logged_model.model_uri) print(loaded_model.predict([{"query": "What did the president say about Ketanji Brown Jackson"}])) ```	8 months ago
Neli Hateva	9bb5157a3d	langchain[patch], community[patch]: Fixes in the Ontotext GraphDB Graph and QA Chain (#17239 ) - Description: Fixes in the Ontotext GraphDB Graph and QA Chain related to the error handling in case of invalid SPARQL queries, for which `prepareQuery` doesn't throw an exception, but the server returns 400 and the query is indeed invalid - Issue: N/A - Dependencies: N/A - Twitter handle: @OntotextGraphDB	8 months ago
ByeongUk Choi	b88329e9a5	community[patch]: Implement Unique ID Enforcement in FAISS (#17244 ) Description: Implemented unique ID validation in the FAISS component to ensure all document IDs are distinct. This update resolves issues related to non-unique IDs, such as inconsistent behavior during deletion processes.	8 months ago
Luiz Ferreira	34d2daffb3	community[patch]: Fix chat openai unit test (#17124 ) - Description: Actually the test named `test_openai_apredict` isn't testing the apredict method from ChatOpenAI. - Twitter handle: https://twitter.com/OAlmofadas	8 months ago
Frank	ef082c77b1	community[minor]: add github file loader to load any github file content b… (#15305 ) ### Description support load any github file content based on file extension. Why not use [git loader](https://python.langchain.com/docs/integrations/document_loaders/git#load-existing-repository-from-disk) ? git loader clones the whole repo even only interested part of files, that's too heavy. This GithubFileLoader only downloads that you are interested files. ### Twitter handle my twitter: @shufanhaotop --------- Co-authored-by: Hao Fan <h_fan@apple.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	8 months ago
Ryan Kraus	f027696b5f	community: Added new Utility runnables for NVIDIA Riva. (#15966 ) Please tag this issue with `nvidia_genai` - Description: Added new Runnables for integration NVIDIA Riva into LCEL chains for Automatic Speech Recognition (ASR) and Text To Speech (TTS). - Issue: N/A - Dependencies: To use these runnables, the NVIDIA Riva client libraries are required. It they are not installed, an error will be raised instructing how to install them. The Runnables can be safely imported without the riva client libraries. - Twitter handle: N/A All of the Riva Runnables are inside a single folder in the Utilities module. In this folder are four files: - common.py - Contains all code that is common to both TTS and ASR - stream.py - Contains a class representing an audio stream that allows the end user to put data into the stream like a queue. - asr.py - Contains the RivaASR runnable - tts.py - Contains the RivaTTS runnable The following Python function is an example of creating a chain that makes use of both of these Runnables: ```python def create( config: Configuration, audio_encoding: RivaAudioEncoding, sample_rate: int, audio_channels: int = 1, ) -> Runnable[ASRInputType, TTSOutputType]: """Create a new instance of the chain.""" _LOGGER.info("Instantiating the chain.") # create the riva asr client riva_asr = RivaASR( url=str(config.riva_asr.service.url), ssl_cert=config.riva_asr.service.ssl_cert, encoding=audio_encoding, audio_channel_count=audio_channels, sample_rate_hertz=sample_rate, profanity_filter=config.riva_asr.profanity_filter, enable_automatic_punctuation=config.riva_asr.enable_automatic_punctuation, language_code=config.riva_asr.language_code, ) # create the prompt template prompt = PromptTemplate.from_template("{user_input}") # model = ChatOpenAI() model = ChatNVIDIA(model="mixtral_8x7b") # type: ignore # create the riva tts client riva_tts = RivaTTS( url=str(config.riva_asr.service.url), ssl_cert=config.riva_asr.service.ssl_cert, output_directory=config.riva_tts.output_directory, language_code=config.riva_tts.language_code, voice_name=config.riva_tts.voice_name, ) # construct and return the chain return {"user_input": riva_asr} \| prompt \| model \| riva_tts # type: ignore ``` The following code is an example of creating a new audio stream for Riva: ```python input_stream = AudioStream(maxsize=1000) # Send bytes into the stream for chunk in audio_chunks: await input_stream.aput(chunk) input_stream.close() ``` The following code is an example of how to execute the chain with RivaASR and RivaTTS ```python output_stream = asyncio.Queue() while not input_stream.complete: async for chunk in chain.astream(input_stream): output_stream.put(chunk) ``` Everything should be async safe and thread safe. Audio data can be put into the input stream while the chain is running without interruptions. --------- Co-authored-by: Hayden Wolff <hwolff@nvidia.com> Co-authored-by: Hayden Wolff <hwolff@Haydens-Laptop.local> Co-authored-by: Hayden Wolff <haydenwolff99@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	8 months ago
Bagatur	6e2ed9671f	infra: fix breebs test lint (#17075 )	8 months ago
Alex Boury	334b6ebdf3	community[minor]: Breebs docs retriever (#16578 ) - Description: Implementation of breeb retriever with integration tests -> libs/community/tests/integration_tests/retrievers/test_breebs.py and documentation (notebook) -> docs/docs/integrations/retrievers/breebs.ipynb. - Dependencies: None	8 months ago
Harrison Chase	4eda647fdd	infra: add -p to mkdir in lint steps (#17013 ) Previously, if this did not find a mypy cache then it wouldnt run this makes it always run adding mypy ignore comments with existing uncaught issues to unblock other prs --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	8 months ago
Killinsun - Ryota Takeuchi	bcfce146d8	community[patch]: Correct the calling to collection_name in qdrant (#16920 ) ## Description In #16608, the calling `collection_name` was wrong. I made a fix for it. Sorry for the inconvenience! ## Issue https://github.com/langchain-ai/langchain/issues/16962 ## Dependencies N/A <!-- Thank you for contributing to LangChain! Please title your PR "<package>: <description>", where <package> is whichever of langchain, community, core, experimental, etc. is being modified. Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes if applicable, - Dependencies: any dependencies required for this change, - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` from the root of the package you've modified to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://python.langchain.com/docs/contributing/ If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> --------- Co-authored-by: Kumar Shivendu <kshivendu1@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	8 months ago
Erick Friis	b1a847366c	community: revert SQL Stores (#16912 ) This reverts commit `cfc225ecb3`. https://github.com/langchain-ai/langchain/pull/15909#issuecomment-1922418097 These will have existed in langchain-community 0.0.16 and 0.0.17.	8 months ago
Christophe Bornet	af8c5c185b	langchain[minor],community[minor]: Add async methods in BaseLoader (#16634 ) Adds: * methods `aload()` and `alazy_load()` to interface `BaseLoader` * implementation for class `MergedDataLoader ` * support for class `BaseLoader` in async function `aindex()` with unit tests Note: this is compatible with existing `aload()` methods that some loaders already had. Twitter handle: @cbornet_ --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	8 months ago
Christophe Bornet	744070ee85	Add async methods for the AstraDB VectorStore (#16391 ) - Description: fully async versions are available for astrapy 0.7+. For older astrapy versions or if the user provides a sync client without an async one, the async methods will call the sync ones wrapped in `run_in_executor` - Twitter handle: cbornet_	8 months ago
baichuan-assistant	f8f2649f12	community: Add Baichuan LLM to community (#16724 ) Replace this entire comment with: - Description: Add Baichuan LLM to integration/llm, also updated related docs. Co-authored-by: BaiChuanHelper <wintergyc@WinterGYCs-MacBook-Pro.local>	8 months ago
thiswillbeyourgithub	1d082359ee	community: add support for callable filters in FAISS (#16190 ) - Description: Filtering in a FAISS vectorstores is very inflexible and doesn't allow that many use case. I think supporting callable like this enables a lot: regular expressions, condition on multiple keys etc. Note I had to manually alter a test. I don't understand if it was falty to begin with or if there is something funky going on. - Issue: None - Dependencies: None - Twitter handle: None Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithub@users.noreply.github.com>	8 months ago
Volodymyr Machula	32c5be8b73	community[minor]: Connery Tool and Toolkit (#14506 ) ## Summary This PR implements the "Connery Action Tool" and "Connery Toolkit". Using them, you can integrate Connery actions into your LangChain agents and chains. Connery is an open-source plugin infrastructure for AI. With Connery, you can easily create a custom plugin with a set of actions and seamlessly integrate them into your LangChain agents and chains. Connery will handle the rest: runtime, authorization, secret management, access management, audit logs, and other vital features. Additionally, Connery and our community offer a wide range of ready-to-use open-source plugins for your convenience. Learn more about Connery: - GitHub: https://github.com/connery-io/connery-platform - Documentation: https://docs.connery.io - Twitter: https://twitter.com/connery_io ## TODOs - [x] API wrapper - [x] Integration tests - [x] Connery Action Tool - [x] Docs - [x] Example - [x] Integration tests - [x] Connery Toolkit - [x] Docs - [x] Example - [x] Formatting (`make format`) - [x] Linting (`make lint`) - [x] Testing (`make test`)	8 months ago
Harrison Chase	8457c31c04	community[patch]: activeloop ai tql deprecation (#14634 ) Co-authored-by: AdkSarsen <adilkhan@activeloop.ai>	8 months ago
Neli Hateva	c95facc293	langchain[minor], community[minor]: Implement Ontotext GraphDB QA Chain (#16019 ) - Description: Implement Ontotext GraphDB QA Chain - Issue: N/A - Dependencies: N/A - Twitter handle: @OntotextGraphDB	8 months ago
Jael Gu	a1aa3a657c	community[patch]: Milvus supports add & delete texts by ids (#16256 ) # Description To support [langchain indexing](https://python.langchain.com/docs/modules/data_connection/indexing) as requested by users, vectorstore Milvus needs to support: - document addition by id (`add_documents` method with `ids` argument) - delete by id (`delete` method with `ids` argument) Example usage: ```python from langchain.indexes import SQLRecordManager, index from langchain.schema import Document from langchain_community.vectorstores import Milvus from langchain_openai import OpenAIEmbeddings collection_name = "test_index" embedding = OpenAIEmbeddings() vectorstore = Milvus(embedding_function=embedding, collection_name=collection_name) namespace = f"milvus/{collection_name}" record_manager = SQLRecordManager( namespace, db_url="sqlite:///record_manager_cache.sql" ) record_manager.create_schema() doc1 = Document(page_content="kitty", metadata={"source": "kitty.txt"}) doc2 = Document(page_content="doggy", metadata={"source": "doggy.txt"}) index( [doc1, doc1, doc2], record_manager, vectorstore, cleanup="incremental", # None, "incremental", or "full" source_id_key="source", ) ``` # Fix issues Fix https://github.com/milvus-io/milvus/issues/30112 --------- Signed-off-by: Jael Gu <mengjia.gu@zilliz.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	8 months ago
Benito Geordie	f3fdc5c5da	community: Added integrations for ThirdAI's NeuralDB with Retriever and VectorStore frameworks (#15280 ) Description: Adds ThirdAI NeuralDB retriever and vectorstore integration. NeuralDB is a CPU-friendly and fine-tunable text retrieval engine.	8 months ago
Christophe Bornet	2e3af04080	Use Postponed Evaluation of Annotations in Astra and Cassandra doc loaders (#16694 ) Minor/cosmetic change	8 months ago
Christophe Bornet	36e432672a	community[minor]: Add async methods to AstraDBLoader (#16652 )	8 months ago
Christophe Bornet	4915c3cd86	[Fix] Fix Cassandra Document loader default page content mapper (#16273 ) We can't use `json.dumps` by default as many types returned by the cassandra driver are not serializable. It's safer to use `str` and let users define their own custom `page_content_mapper` if needed.	8 months ago
Micah Parker	6543e585a5	community[patch]: Added support for Ollama's num_predict option in ChatOllama (#16633 ) Just a simple default addition to the options payload for a ollama generate call to support a max_new_tokens parameter. Should fix issue: https://github.com/langchain-ai/langchain/issues/14715	8 months ago
baichuan-assistant	70ff54eace	community[minor]: Add Baichuan Text Embedding Model and Baichuan Inc introduction (#16568 ) - Description: Adding Baichuan Text Embedding Model and Baichuan Inc introduction. Baichuan Text Embedding ranks #1 in C-MTEB leaderboard: https://huggingface.co/spaces/mteb/leaderboard Co-authored-by: BaiChuanHelper <wintergyc@WinterGYCs-MacBook-Pro.local>	8 months ago
Ghani	e30c6662df	Langchain-community : EdenAI chat integration. (#16377 ) - Description: This PR adds [EdenAI](https://edenai.co/) for the chat model (already available in LLM & Embeddings). It supports all [ChatModel] functionality: generate, async generate, stream, astream and batch. A detailed notebook was added. - Dependencies: No dependencies are added as we call a rest API. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	8 months ago
Bagatur	5df8ab574e	infra: move indexing documentation test (#16595 )	8 months ago
Brian Burgin	148347e858	community[minor]: Add LiteLLM Router Integration (#15588 ) community: - Description: - Add new ChatLiteLLMRouter class that allows a client to use a LiteLLM Router as a LangChain chat model. - Note: The existing ChatLiteLLM integration did not cover the LiteLLM Router class. - Add tests and Jupyter notebook. - Issue: None - Dependencies: Relies on existing ChatLiteLLM integration - Twitter handle: @bburgin_0 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	8 months ago
Rave Harpaz	c4e9c9ca29	community[minor]: Add OCI Generative AI integration (#16548 ) <!-- Thank you for contributing to LangChain! Please title your PR "<package>: <description>", where <package> is whichever of langchain, community, core, experimental, etc. is being modified. Replace this entire comment with: - Description: Adding Oracle Cloud Infrastructure Generative AI integration. Oracle Cloud Infrastructure (OCI) Generative AI is a fully managed service that provides a set of state-of-the-art, customizable large language models (LLMs) that cover a wide range of use cases, and which is available through a single API. Using the OCI Generative AI service you can access ready-to-use pretrained models, or create and host your own fine-tuned custom models based on your own data on dedicated AI clusters. https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm - Issue: None, - Dependencies: OCI Python SDK, - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` from the root of the package you've modified to check this locally. Passed See contribution guidelines for more information on how to write/run tests, lint, etc: https://python.langchain.com/docs/contributing/ If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. we provide unit tests. However, we cannot provide integration tests due to Oracle policies that prohibit public sharing of api keys. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> --------- Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	8 months ago
Harel Gal	a91181fe6d	community[minor]: add support for Guardrails for Amazon Bedrock (#15099 ) Added support for optionally supplying 'Guardrails for Amazon Bedrock' on both types of model invocations (batch/regular and streaming) and for all models supported by the Amazon Bedrock service. @baskaryan @hwchase17 ```python llm = Bedrock(model_id="<model_id>", client=bedrock, model_kwargs={}, guardrails={"id": " <guardrail_id>", "version": "<guardrail_version>", "trace": True}, callbacks=[BedrockAsyncCallbackHandler()]) class BedrockAsyncCallbackHandler(AsyncCallbackHandler): """Async callback handler that can be used to handle callbacks from langchain.""" async def on_llm_error( self, error: BaseException, **kwargs: Any, ) -> Any: reason = kwargs.get("reason") if reason == "GUARDRAIL_INTERVENED": # kwargs contains additional trace information sent by 'Guardrails for Bedrock' service. print(f"""Guardrails: {kwargs}""") # streaming llm = Bedrock(model_id="<model_id>", client=bedrock, model_kwargs={}, streaming=True, guardrails={"id": "<guardrail_id>", "version": "<guardrail_version>"}) ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	8 months ago
Martin Kolb	04651f0248	community[minor]: VectorStore integration for SAP HANA Cloud Vector Engine (#16514 ) - Description: This PR adds a VectorStore integration for SAP HANA Cloud Vector Engine, which is an upcoming feature in the SAP HANA Cloud database (https://blogs.sap.com/2023/11/02/sap-hana-clouds-vector-engine-announcement/). - Issue: N/A - Dependencies: [SAP HANA Python Client](https://pypi.org/project/hdbcli/) - Twitter handle: @sapopensource Implementation of the integration: `libs/community/langchain_community/vectorstores/hanavector.py` Unit tests: `libs/community/tests/unit_tests/vectorstores/test_hanavector.py` Integration tests: `libs/community/tests/integration_tests/vectorstores/test_hanavector.py` Example notebook: `docs/docs/integrations/vectorstores/hanavector.ipynb` Access credentials for execution of the integration tests can be provided to the maintainers. --------- Co-authored-by: sascha <sascha.stoll@sap.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	8 months ago
Raunak	476bf8b763	community[patch]: Load list of files using UnstructuredFileLoader (#16216 ) - Description: Updated `_get_elements()` function of `UnstructuredFileLoader `class to check if the argument self.file_path is a file or list of files. If it is a list of files then it iterates over the list of file paths, calls the partition function for each one, and appends the results to the elements list. If self.file_path is not a list, it calls the partition function as before. - Issue: Fixed #15607, - Dependencies: NA - Twitter handle: NA Co-authored-by: H161961 <Raunak.Raunak@Honeywell.com>	8 months ago
Xudong Sun	019b6ebe8d	community[minor]: Add iFlyTek Spark LLM chat model support (#13389 ) - Description: This PR enables LangChain to access the iFlyTek's Spark LLM via the chat_models wrapper. - Dependencies: websocket-client ^1.6.1 - Tag maintainer: @baskaryan ### SparkLLM chat model usage Get SparkLLM's app_id, api_key and api_secret from [iFlyTek SparkLLM API Console](https://console.xfyun.cn/services/bm3) (for more info, see [iFlyTek SparkLLM Intro](https://xinghuo.xfyun.cn/sparkapi) ), then set environment variables `IFLYTEK_SPARK_APP_ID`, `IFLYTEK_SPARK_API_KEY` and `IFLYTEK_SPARK_API_SECRET` or pass parameters when using it like the demo below: ```python3 from langchain.chat_models.sparkllm import ChatSparkLLM client = ChatSparkLLM( spark_app_id="<app_id>", spark_api_key="<api_key>", spark_api_secret="<api_secret>" ) ```	8 months ago
bu2kx	ff3163297b	community[minor]: Add KDBAI vector store (#12797 ) Addition of KDBAI vector store (https://kdb.ai). Dependencies: `kdbai_client` v0.1.2 Python package. Sample notebook: `docs/docs/integrations/vectorstores/kdbai.ipynb` Tag maintainer: @bu2kx Twitter handle: @kxsystems	8 months ago
Shivani Modi	4e160540ff	community[minor]: Adding Konko Completion endpoint (#15570 ) This PR introduces update to Konko Integration with LangChain. 1. New Endpoint Addition: Integration of a new endpoint to utilize completion models hosted on Konko. 2. Chat Model Updates for Backward Compatibility: We have updated the chat models to ensure backward compatibility with previous OpenAI versions. 4. Updated Documentation: Comprehensive documentation has been updated to reflect these new changes, providing clear guidance on utilizing the new features and ensuring seamless integration. Thank you to the LangChain team for their exceptional work and for considering this PR. Please let me know if any additional information is needed. --------- Co-authored-by: Shivani Modi <shivanimodi@Shivanis-MacBook-Pro.local> Co-authored-by: Shivani Modi <shivanimodi@Shivanis-MBP.lan>	8 months ago
Facundo Santiago	92e6a641fd	feat: adding paygo api support for Azure ML / Azure AI Studio (#14560 ) - Description: Introducing support for LLMs and Chat models running in Azure AI studio and Azure ML using the new deployment mode pay-as-you-go (model as a service). - Issue: NA - Dependencies: None. - Tag maintainer: @prakharg-msft @gdyre - Twitter handle: @santiagofacundo Examples added: * [docs/docs/integrations/llms/azure_ml.ipynb](https://github.com/santiagxf/langchain/blob/santiagxf/azureml-endpoints-paygo-community/docs/docs/integrations/chat/azureml_endpoint.ipynb) * [docs/docs/integrations/chat/azureml_chat_endpoint.ipynb](https://github.com/santiagxf/langchain/blob/santiagxf/azureml-endpoints-paygo-community/docs/docs/integrations/chat/azureml_chat_endpoint.ipynb) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	8 months ago
baichuan-assistant	20fcd49348	community: Fix Baichuan Chat. (#15207 ) - Description: Baichuan Chat (with both Baichuan-Turbo and Baichuan-Turbo-192K models) has updated their APIs. There are breaking changes. For example, BAICHUAN_SECRET_KEY is removed in the latest API but is still required in Langchain. Baichuan's Langchain integration needs to be updated to the latest version. - Issue: #15206 - Dependencies: None, - Twitter handle: None @hwchase17. Co-authored-by: BaiChuanHelper <wintergyc@WinterGYCs-MacBook-Pro.local>	8 months ago
gcheron	cfc225ecb3	community: SQLStrStore/SQLDocStore provide an easy SQL alternative to `InMemoryStore` to persist data remotely in a SQL storage (#15909 ) Description: - Implement `SQLStrStore` and `SQLDocStore` classes that inherits from `BaseStore` to allow to persist data remotely on a SQL server. - SQL is widely used and sometimes we do not want to install a caching solution like Redis. - Multiple issues/comments complain that there is no easy remote and persistent solution that are not in memory (users want to replace InMemoryStore), e.g., https://github.com/langchain-ai/langchain/issues/14267, https://github.com/langchain-ai/langchain/issues/15633, https://github.com/langchain-ai/langchain/issues/14643, https://stackoverflow.com/questions/77385587/persist-parentdocumentretriever-of-langchain - This is particularly painful when wanting to use `ParentDocumentRetriever ` - This implementation is particularly useful when: * it's expensive to construct an InMemoryDocstore/dict * you want to retrieve documents from remote sources * you just want to reuse existing objects - This implementation integrates well with PGVector, indeed, when using PGVector, you already have a SQL instance running. `SQLDocStore` is a convenient way of using this instance to store documents associated to vectors. An integration example with ParentDocumentRetriever and PGVector is provided in docs/docs/integrations/stores/sql.ipynb or [here](https://github.com/gcheron/langchain/blob/sql-store/docs/docs/integrations/stores/sql.ipynb). - It persists `str` and `Document` objects but can be easily extended. Issue: Provide an easy SQL alternative to `InMemoryStore`. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	8 months ago
Tomaz Bratanic	d0a8082188	Fix neo4j sanitize (#16439 ) Fix the sanitization bug and add an integration test	8 months ago
Florian MOREL	4b7969efc5	community[minor]: New documents loader for visio files (with extension .vsdx) (#16171 ) Description : New documents loader for visio files (with extension .vsdx) A [visio file](https://fr.wikipedia.org/wiki/Microsoft_Visio) (with extension .vsdx) is associated with Microsoft Visio, a diagram creation software. It stores information about the structure, layout, and graphical elements of a diagram. This format facilitates the creation and sharing of visualizations in areas such as business, engineering, and computer science. A Visio file can contain multiple pages. Some of them may serve as the background for others, and this can occur across multiple layers. This loader extracts the textual content from each page and its associated pages, enabling the extraction of all visible text from each page, similar to what an OCR algorithm would do. Dependencies : xmltodict package	8 months ago
DL	b9e7f6f38a	community[minor]: Bedrock async methods (#12477 ) Description: Added support for asynchronous streaming in the Bedrock class and corresponding tests. Primarily: async def aprepare_output_stream async def _aprepare_input_and_invoke_stream async def _astream async def _acall I've ensured that the code adheres to the project's linting and formatting standards by running make format, make lint, and make test. Issue: #12054, #11589 Dependencies: None Tag maintainer: @baskaryan Twitter handle: @dominic_lovric --------- Co-authored-by: Piyush Jain <piyushjain@duck.com>	8 months ago
parkererickson-tg	b26a22f307	community[minor]: add TigerGraph support (#16280 ) Description: Add support for querying TigerGraph databases through the InquiryAI service. Issue: N/A Dependencies: N/A Twitter handle: @TigerGraphDB	8 months ago
Ian	b9f5104e6c	communty[minor]: Store Message History to TiDB Database (#16304 ) This pull request integrates the TiDB database into LangChain for storing message history, marking one of several steps towards a comprehensive integration of TiDB with LangChain. A simple usage ```python from datetime import datetime from langchain_community.chat_message_histories import TiDBChatMessageHistory history = TiDBChatMessageHistory( connection_string="mysql+pymysql://<host>:<PASSWORD>@<host>:4000/<db>?ssl_ca=/etc/ssl/cert.pem&ssl_verify_cert=true&ssl_verify_identity=true", session_id="code_gen", earliest_time=datetime.utcnow(), # Optional to set earliest_time to load messages after this time point. ) history.add_user_message("hi! How's feature going?") history.add_ai_message("It's almot done") ```	8 months ago
Eli Lucherini	6b2a57161a	community[patch]: allow additional kwargs in MlflowEmbeddings for compatibility with Cohere API (#15242 ) - Description: add support for kwargs in`MlflowEmbeddings` `embed_document()` and `embed_query()` so that all the arguments required by Cohere API (and others?) can be passed down to the server. - Issue: #15234 - Dependencies: MLflow with MLflow Deployments (`pip install mlflow[genai]`) Tests Now this code [adapted from the docs](https://python.langchain.com/docs/integrations/providers/mlflow#embeddings-example) for the Cohere API works locally. ```python """ Setup ----- export COHERE_API_KEY=... mlflow deployments start-server --config-path examples/deployments/cohere/config.yaml Run --- python /path/to/this/file.py """ embeddings = MlflowCohereEmbeddings(target_uri="http://127.0.0.1:5000", endpoint="embeddings") print(embeddings.embed_query("hello")[:3]) print(embeddings.embed_documents(["hello", "world"])[0][:3]) ``` Output ``` [0.060455322, 0.028793335, -0.025848389] [0.031707764, 0.021057129, -0.009361267] ```	8 months ago
Guillem Orellana Trullols	aad2aa7188	community[patch]: BedrockChat -> Support Titan express as chat model (#15408 ) Titan Express model was not supported as a chat model because LangChain messages were not "translated" to a text prompt. Co-authored-by: Guillem Orellana Trullols <guillem.orellana_trullols@siemens.com>	8 months ago
Katarina Supe	01c2f27ffa	community[patch]: Update Memgraph support (#16360 ) - Description: I removed two queries to the database and left just one whose results were formatted afterward into other type of schema (avoided two calls to DB) - Issue: / - Dependencies: / - Twitter handle: @supe_katarina	8 months ago
Iskren Ivov Chernev	fc196cab12	community[minor]: DeepInfra support for chat models (#16380 ) Add deepinfra chat models support. This is https://github.com/langchain-ai/langchain/pull/14234 re-opened from my branch (so maintainers can edit).	8 months ago
Max Jakob	de209af533	community[patch]: ElasticsearchStore: add relevance function selector (#16378 ) Implement similarity function selector for ElasticsearchStore. The scores coming back from Elasticsearch are already similarities (not distances) and they are already normalized (see [docs](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-params)). Hence we leave the scores untouched and just forward them. This fixes #11539. However, in hybrid mode (when keyword search and vector search are involved) Elasticsearch currently returns no scores. This PR adds an error message around this fact. We need to think a bit more to come up with a solution for this case. This PR also corrects a small error in the Elasticsearch integration test. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	8 months ago
Bagatur	8779013847	community[patch]: Release 0.0.14 (#16384 )	8 months ago

... 2 3 4 5 6 ...

423 Commits (c417803908aba89a7c2035093c27365db054893c)