langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-10 01:10:59 +00:00

Author	SHA1	Message	Date
JeffKatzy	5ab3f9a995	community[patch]: standardize chat init args (#20844 ) Thank you for contributing to LangChain! community:perplexity[patch]: standardize init args updated pplx_api_key and request_timeout so that aliased to api_key, and timeout respectively. Added test that both continue to set the same underlying attributes. Related to [20085](https://github.com/langchain-ai/langchain/issues/20085) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-04-24 12:26:05 -07:00
Eugene Yurtsev	30e48c9878	core[patch],community[patch]: Move file chat history back to community (#20834 ) Marking as patch since we haven't had releases in between. This just reverting part of a PR from yesterday.	2024-04-24 12:47:25 -04:00
Eugene Yurtsev	645b1e142e	core[minor],langchain[patch],community[patch]: Move InMemory and File implementations of Chat History to core (#20752 ) This PR moves the implementations for chat history to core. So it's easier to determine which dependencies need to be broken / add deprecation warnings	2024-04-23 10:22:11 -04:00
ccurme	c010ec8b71	patch: deprecate (a)get_relevant_documents (#20477 ) - `.get_relevant_documents(query)` -> `.invoke(query)` - `.get_relevant_documents(query=query)` -> `.invoke(query)` - `.get_relevant_documents(query, callbacks=callbacks)` -> `.invoke(query, config={"callbacks": callbacks})` - `.get_relevant_documents(query, kwargs)` -> `.invoke(query, kwargs)` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-04-22 11:14:53 -04:00
shumway743	cb6e5e56c2	community[minor]: add graph store implementation for apache age (#20582 ) Description: implemented GraphStore class for Apache Age graph db Dependencies: depends on psycopg2 Unit and integration tests included. Formatting and linting have been run. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-04-20 14:31:04 -07:00
Christophe Bornet	c909ae0152	community[minor]: Add async methods to CassandraVectorStore (#20602 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-04-20 02:09:58 +00:00
Tomaz Bratanic	8c08cf4619	community: Add support for relationship indexes in neo4j vector (#20657 ) Neo4j has added relationship vector indexes. We can't populate them, but we can use existing indexes for retrieval	2024-04-19 11:22:42 -07:00
Lance Martin	d5c22b80a5	community[patch]: Fix Ollama for LLaMA3 (#20624 ) We see verbose generations w/ LLaMA3 and Ollama - https://smith.langchain.com/public/88c4cd21-3d57-4229-96fe-53443398ca99/r --- Fix here implies that when stop was being set to an empty list, the stream had no conditions under which to stop, which could lead to excessive or unintended output. Test LLaMA2 - https://smith.langchain.com/public/57dfc64a-591b-46fa-a1cd-8783acaefea2/r Test LLaMA3 - https://smith.langchain.com/public/76ff5f47-ac89-4772-a7d2-5caa907d3fd6/r https://smith.langchain.com/public/a31d2fad-9094-4c93-949a-964b27630ccb/r Test Mistral - https://smith.langchain.com/public/a4fe7114-c308-4317-b9fd-6c86d31f1c5b/r --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-04-19 00:20:32 +00:00
Christophe Bornet	d2d01370bc	community[minor]: Add async methods to CassandraLoader (#20609 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-04-18 19:45:20 +00:00
Pengcheng Liu	ecd19a9e58	community[patch]: Add function call support in Tongyi chat model. (#20119 ) - [ ] PR message: - Description: This pr adds function calling support in Tongyi chat model. - Issue: None - Dependencies: None - Twitter handle: None Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-04-17 20:42:23 +00:00
Sevin F. Varoglu	3f156e0ece	community[minor]: add ChatOctoAI (#20059 ) This PR adds ChatOctoAI, a chat model integration for OctoAI.	2024-04-17 03:20:56 -07:00
pjb157	479be3cc91	community[minor]: Unify Titan Takeoff Integrations and Adding Embedding Support (#18775 ) Community: Unify Titan Takeoff Integrations and Adding Embedding Support Description: Titan Takeoff no longer reflects this either of the integrations in the community folder. The two integrations (TitanTakeoffPro and TitanTakeoff) where causing confusion with clients, so have moved code into one place and created an alias for backwards compatibility. Added Takeoff Client python package to do the bulk of the work with the requests, this is because this package is actively updated with new versions of Takeoff. So this integration will be far more robust and will not degrade as badly over time. Issue: Fixes bugs in the old Titan integrations and unified the code with added unit test converge to avoid future problems. Dependencies: Added optional dependency takeoff-client, all imports still work without dependency including the Titan Takeoff classes but just will fail on initialisation if not pip installed takeoff-client Twitter @MeryemArik9 Thanks all :) --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-04-17 01:43:35 +00:00
Guangdong Liu	b78ede2f96	community[patch]: standardize init args (#20166 ) Related to https://github.com/langchain-ai/langchain/issues/20085 @baskaryan	2024-04-16 18:30:26 -07:00
Guangdong Liu	3729bec1a2	community[patch]: standardize init args (#20210 ) Related to https://github.com/langchain-ai/langchain/issues/20085 @baskaryan	2024-04-16 18:29:57 -07:00
sdan	a7c5e41443	community[minor]: Added VLite as VectorStore (#20245 ) Support [VLite](https://github.com/sdan/vlite) as a new VectorStore type. Description: vlite is a simple and blazing fast vector database(vdb) made with numpy. It abstracts a lot of the functionality around using a vdb in the retrieval augmented generation(RAG) pipeline such as embeddings generation, chunking, and file processing while still giving developers the functionality to change how they're made/stored. Before submitting: Added tests [here](`c09c2ebd5c/libs/community/tests/integration_tests/vectorstores/test_vlite.py`) Added ipython notebook [here](`c09c2ebd5c/docs/docs/integrations/vectorstores/vlite.ipynb`) Added simple docs on how to use [here](`c09c2ebd5c/docs/docs/integrations/providers/vlite.mdx`) Profiles Maintainers: @sdan Twitter handles: [@sdand](https://x.com/sdand) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-04-17 01:24:38 +00:00
Sevin F. Varoglu	54d388d898	community[patch]: update OctoAI endpoint to subclass BaseOpenAI (#19757 ) This PR updates OctoAIEndpoint LLM to subclass BaseOpenAI as OctoAI is an OpenAI-compatible service. The documentation and tests have also been updated.	2024-04-16 17:32:20 -07:00
Benito Geordie	57b226532d	community[minor]: Added integrations for ThirdAI's NeuralDB as a Retriever (#17334 ) Description: Adds ThirdAI NeuralDB retriever integration. NeuralDB is a CPU-friendly and fine-tunable text retrieval engine. We previously added a vector store integration but we think that it will be easier for our customers if they can also find us under under langchain-community/retrievers. --------- Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com> Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>	2024-04-16 16:36:55 -07:00
Dhruv Chawla	d6d559d50d	community[minor]: add UpTrainCallbackHandler (#19956 ) - Description: This PR adds a callback handler for UpTrain. It performs evaluations in the RAG pipeline to check the quality of retrieved documents, generated queries and responses. - Dependencies: - The UpTrainCallbackHandler requires the uptrain package --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-04-16 19:32:03 +00:00
Ravindu Somawansa	5acc7ba622	community[minor]: Add glue catalog loader (#20220 ) Add Glue Catalog loader	2024-04-16 11:39:23 -04:00
Martín Gotelli Ferenaz	b48add4353	community[patch]: Fix pgvector deprecated filter clause usage with OR and AND conditions (#20446 ) Description: Support filter by OR and AND for deprecated PGVector version Issue: #20445 Dependencies: N/A Twitter handle: @martinferenaz	2024-04-16 14:08:07 +00:00
Juan Carlos José Camacho	450c458f8f	community[minor]: Add Datahareld tool (#19680 ) Description: Integrate [dataherald](https://www.dataherald.com) tool, It is a natural language-to-SQL tool. Dependencies: Install dataherald sdk to use it, ``` pip install dataherald ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Christophe Bornet <cbornet@hotmail.com>	2024-04-13 23:27:16 +00:00
Egor Krasheninnikov	c8391d4ff1	community[patch]: Fix YandexGPT embeddings (#19720 ) Fix of YandexGPT embeddings. The current version uses a single `model_name` for queries and documents, essentially making the `embed_documents` and `embed_query` methods the same. Yandex has a different endpoint (`model_uri`) for encoding documents, see [this](https://yandex.cloud/en/docs/yandexgpt/concepts/embeddings). The bug may impact retrievers built with `YandexGPTEmbeddings` (for instance FAISS database as retriever) since they use both `embed_documents` and `embed_query`. A simple snippet to test the behaviour: ```python from langchain_community.embeddings.yandex import YandexGPTEmbeddings embeddings = YandexGPTEmbeddings() q_emb = embeddings.embed_query('hello world') doc_emb = embeddings.embed_documents(['hello world', 'hello world']) q_emb == doc_emb[0] ``` The response is `True` with the current version and `False` with the changes I made. Twitter: @egor_krash --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-04-13 16:23:01 -07:00
Guangdong Liu	4be7ca7b4c	community[patch]:sparkllm standardize init args (#20194 ) Related to https://github.com/langchain-ai/langchain/issues/20085 @baskaryan	2024-04-13 16:03:19 -07:00
ccurme	38faa74c23	community[patch]: update use of deprecated llm methods (#20393 ) .predict and .predict_messages for BaseLanguageModel and BaseChatModel	2024-04-12 17:28:23 -04:00
Corey Zumar	3a068b26f3	community[patch]: Databricks - fix scope of dangerous deserialization error in Databricks LLM connector (#20368 ) fix scope of dangerous deserialization error in Databricks LLM connector --------- Signed-off-by: dbczumar <corey.zumar@databricks.com>	2024-04-12 17:27:26 -04:00
Nicolas	ad04585e30	community[minor]: Firecrawl.dev integration (#20364 ) Added the [FireCrawl](https://firecrawl.dev) document loader. Firecrawl crawls and convert any website into LLM-ready data. It crawls all accessible subpages and give you clean markdown for each. - Description: Adds FireCrawl data loader - Dependencies: firecrawl-py - Twitter handle: @mendableai ccing contributors: (@ericciarla @nickscamara) --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-04-12 19:13:48 +00:00
Alex Sherstinsky	fad0962643	community: for Predibase -- enable both Predibase-hosted and HuggingFace-hosted fine-tuned adapter repositories (#20370 )	2024-04-12 08:32:00 -07:00
Leonid Ganeline	4cb5f4c353	community[patch]: import flattening fix (#20110 ) This PR should make it easier for linters to do type checking and for IDEs to jump to definition of code. See #20050 as a template for this PR. - As a byproduct: Added 3 missed `test_imports`. - Added missed `SolarChat` in to __init___.py Added it into test_import ut. - Added `# type: ignore` to fix linting. It is not clear, why linting errors appear after ^ changes. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-04-10 13:01:19 -04:00
jeff kit	ac42e96e4c	community[patch], langchain[minor]: Enhance Tencent Cloud VectorDB, langchain: make Tencent Cloud VectorDB self query retrieve compatible (#19651 ) - make Tencent Cloud VectorDB support metadata filtering. - implement delete function for Tencent Cloud VectorDB. - support both Langchain Embedding model and Tencent Cloud VDB embedding model. - Tencent Cloud VectorDB support filter search keyword, compatible with langchain filtering syntax. - add Tencent Cloud VectorDB TranslationVisitor, now work with self query retriever. - more documentations. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-04-09 16:50:48 +00:00
Guangdong Liu	97d91ec17c	community[patch]: standardize baichuan init args (#20209 ) Related to https://github.com/langchain-ai/langchain/issues/20085 @baskaryan	2024-04-09 11:00:40 -05:00
Piyush Jain	cd7abc495a	community[minor]: add neptune analytics graph (#20047 ) Replacement for PR [#19772](https://github.com/langchain-ai/langchain/pull/19772). --------- Co-authored-by: Dave Bechberger <dbechbe@amazon.com> Co-authored-by: bechbd <bechbd@users.noreply.github.com>	2024-04-09 09:20:59 -05:00
Shuqian	ad9750403b	community[minor]: add bedrock anthropic callback for token usage counting (#19864 ) Description: add bedrock anthropic callback for token usage counting, consulted openai callback. --------- Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>	2024-04-09 09:18:48 -05:00
Prince Canuma	1f9f4d8742	community[minor]: Add support for MLX models (chat & llm) (#18152 ) Description: This PR adds support for MLX models both chat (i.e., instruct) and llm (i.e., pretrained) types/ Dependencies: mlx, mlx_lm, transformers Twitter handle: @Prince_Canuma --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-04-09 14:17:07 +00:00
Leonid Ganeline	2f8dd1a161	community[patch]: `cross_encoders` flatten namespaces (#20183 ) Issue `langchain_community.cross_encoders` didn't have flattening namespace code in the __init__.py file. Changes: - added code to flattening namespaces (used #20050 as a template) - added ut for a change - added missed `test_imports` for `chat_loaders` and `chat_message_histories` modules	2024-04-08 20:50:23 -04:00
Alex Sherstinsky	5f563e040a	community: extend Predibase integration to support fine-tuned LLM adapters (#19979 ) - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message: *Delete this entire checklist* and replace with - Description: Langchain-Predibase integration was failing, because it was not current with the Predibase SDK; in addition, Predibase integration tests were instantiating the Langchain Community `Predibase` class with one required argument (`model`) missing. This change updates the Predibase SDK usage and fixes the integration tests. - Twitter handle: `@alexsherstinsky` - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-04-08 18:54:29 +00:00
david02871	e1a24d09c5	community: Add PHP language parser to document_loaders (#19850 ) Description: Added a PHP language parser to document_loaders Issue: N/A Dependencies: N/A Twitter handle: N/A --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-04-08 11:30:28 -04:00
Marlene	2f03bc397e	Community: Updating Azure Retriever and Docs to be Azure AI Search instead of Azure Cognitive Search (#19925 ) Last year Microsoft [changed the name](https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search) of Azure Cognitive Search to Azure AI Search. This PR updates the Langchain Azure Retriever API and it's associated docs to reflect this change. It may be confusing for users to see the name Cognitive here and AI in the Microsoft documentation which is why this is needed. I've also added a more detailed example to the Azure retriever doc page. There are more places that need a similar update but I'm breaking it up so the PRs are not too big 😄 Fixing my errors from the previous PR. Twitter: @marlene_zw Two new tests added to test backward compatibility in `libs/community/tests/integration_tests/retrievers/test_azure_cognitive_search.py` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-04-08 11:12:41 -04:00
Rahul Triptahi	820b713086	community[minor]: Add support for Pebblo cloud_api_key in PebbloSafeLoader (#19855 ) Description: _PebbloSafeLoader_: Add support for pebblo's cloud api-key in PebbloSafeLoader - This Pull request enables PebbloSafeLoader to accept pebblo's cloud api-key and send the semantic classification data to pebblo cloud. Documentation: Updated Unit test: Added Issue: NA Dependencies: - None Twitter handle: @rahul_tripathi2 Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	2024-04-08 11:10:04 -04:00
Eugene Yurtsev	520ff50adc	community[patch]: Improve import callbacks to make it IDE friendly (#20050 ) * declares __all__ as a list of strings (instead of dynamically computing it) * import type definitions when TYPE_CHECKING is true	2024-04-05 15:17:51 -04:00
Leonid Ganeline	3aacd11846	community[minor]: added missed class to __all__ (#19888 ) Added missed `UnstructuredCHMLoader` class to the document_loader.\_\_init\_\_.py \_\_all\_\_	2024-04-04 16:16:51 -04:00
Tomaz Bratanic	df25829f33	community[minor]: Add metadata filtering support for neo4j vector (#20001 )	2024-04-04 11:37:06 -04:00
Tomaz Bratanic	09a0ecd000	langchain[minor]: Tests update metadata filtering examples of documents (#19963 ) Removing metadata properties that are dicts as some databases don't support that, and those properties aren't used in tests anyhow..	2024-04-03 12:44:14 -04:00
happy-go-lucky	c6432abdbe	community[patch]: Implement delete method and all async methods in opensearch_vector_search (#17321 ) - Description: In order to use index and aindex in libs/langchain/langchain/indexes/_api.py, I implemented delete method and all async methods in opensearch_vector_search - Dependencies: No changes	2024-04-03 09:40:49 -07:00
Cheng, Penghui	cc407e8a1b	community[minor]: weight only quantization with intel-extension-for-transformers. (#14504 ) Support weight only quantization with intel-extension-for-transformers. [Intel® Extension for Transformers](https://github.com/intel/intel-extension-for-transformers) is an innovative toolkit to accelerate Transformer-based models on Intel platforms, in particular effective on 4th Intel Xeon Scalable processor [Sapphire Rapids](https://www.intel.com/content/www/us/en/products/docs/processors/xeon-accelerated/4th-gen-xeon-scalable-processors.html) (codenamed Sapphire Rapids). The toolkit provides the below key features: * Seamless user experience of model compressions on Transformer-based models by extending [Hugging Face transformers](https://github.com/huggingface/transformers) APIs and leveraging [Intel® Neural Compressor](https://github.com/intel/neural-compressor) * Advanced software optimizations and unique compression-aware runtime. * Optimized Transformer-based model packages. * [NeuralChat](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat), a customizable chatbot framework to create your own chatbot within minutes by leveraging a rich set of plugins and SOTA optimizations. * [Inference](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/llm/runtime/graph) of Large Language Model (LLM) in pure C/C++ with weight-only quantization kernels. This PR is an integration of weight only quantization feature with intel-extension-for-transformers. Unit test is in lib/langchain/tests/integration_tests/llm/test_weight_only_quantization.py The notebook is in docs/docs/integrations/llms/weight_only_quantization.ipynb. The document is in docs/docs/integrations/providers/weight_only_quantization.mdx. --------- Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-04-03 16:21:34 +00:00
Erick Friis	f0d5b59962	core[patch]: remove requests (#19891 ) Removes required usage of `requests` from `langchain-core`, all of which has been deprecated. - removes Tracer V1 implementations - removes old `try_load_from_hub` github-based hub implementations Removal done in a way where imports will still succeed, and usage will fail with a `RuntimeError`.	2024-04-02 20:28:10 +00:00
Peter Vandenabeele	e830a4e731	community[patch]: Add remove_comments option (default True): do not extract html comments (#13259 ) - Description: add `remove_comments` option (default: True): do not extract html _comments_, - Issue: None, - Dependencies: None, - Tag maintainer: @nfcampos , - Twitter handle: peter_v I ran `make format`, `make lint` and `make test`. Discussion: I my use case, I prefer to not have the comments in the extracted text: * e.g. from a Google tag that is added in the html as comment * e.g. content that the authors have temporarily hidden to make it non visible to the regular reader Removing the comments makes the extracted text more alike the intended text to be seen by the reader. Choice to make: do we prefer to make the default for this `remove_comments` option to be True or False? I have changed it to True in a second commit, since that is how I would prefer to use it by default. Have the cleaned text (without technical Google tags etc.) and also closer to the actually visible and intended content. I am not sure what is best aligned with the conventions of langchain in general ... INITIAL VERSION (new version above): ~Choice to make: do we prefer to make the default for this `ignore_comments` option to be True or False? I have set it to False now to be backwards compatible. On the other hand, I would use it mostly with True. I am not sure what is best aligned with the conventions of langchain in general ...~ --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-04-02 00:19:12 +00:00
Jamsheed Mistri	4f70bc119d	community[minor]: add Layerup Security integration (#19787 ) Description: adds integration with [Layerup Security](https://uselayerup.com). Docs can be found [here](https://docs.uselayerup.com). Integrates directly with our Python SDK. Dependencies: [LayerupSecurity](https://pypi.org/project/LayerupSecurity/) Note: all methods for our product require a paid API key, so I only included 1 test which checks for an invalid API key response. I have tested extensively locally. Twitter handle: [@layerup_](https://twitter.com/layerup_) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-04-01 23:49:00 +00:00
Anıl Berk Altuner	4384fa8e49	community[minor]: Add Dria retriever (#17098 ) [Dria](https://dria.co/) is a hub of public RAG models for developers to both contribute and utilize a shared embedding lake. This PR adds a retriever that can retrieve documents from Dria.	2024-04-01 12:04:19 -07:00
Chenhui Zhang	a1f3e9f537	community[minor]: Update ChatZhipuAI to support GLM-4 model (#16695 ) Description: Update `ChatZhipuAI` to support the latest `glm-4` model. Issue: N/A Dependencies: httpx, httpx-sse, PyJWT The previous `ChatZhipuAI` implementation requires the `zhipuai` package, and cannot call the latest GLM model. This is because - The old version `zhipuai==1.` doesn't support the latest model. - `zhipuai==2.` requires `pydantic V2`, which is incompatible with 'langchain-community'. This re-implementation invokes the GLM model by sending HTTP requests to [open.bigmodel.cn](https://open.bigmodel.cn/dev/api) via the `httpx` package, and uses the `httpx-sse` package to handle stream events. --------- Co-authored-by: zR <2448370773@qq.com>	2024-04-01 18:11:21 +00:00
Kenneth Choe	f98d7f7494	langchain[minor], community[minor]: add CrossEncoderReranker with HuggingFaceCrossEncoder and SagemakerEndpointCrossEncoder (#13687 ) - Description: Support reranking based on cross encoder models available from HuggingFace. - Added `CrossEncoder` schema - Implemented `HuggingFaceCrossEncoder` and `SagemakerEndpointCrossEncoder` - Implemented `CrossEncoderReranker` that performs similar functionality to `CohereRerank` - Added `cross-encoder-reranker.ipynb` to demonstrate how to use it. Please let me know if anything else needs to be done to make it visible on the table-of-contents navigation bar on the left, or on the card list on [retrievers documentation page](https://python.langchain.com/docs/integrations/retrievers). - Issue: N/A - Dependencies: None other than the existing ones. --------- Co-authored-by: Kenny Choe <kchoe@amazon.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-31 20:51:31 +00:00
Kamal Zhang	368e35c3b1	community[patch]: introduce convert_to_secret() to bananadev llm (#14283 ) - Description: Per #12165, this PR add to BananaLLM the function convert_to_secret_str() during environment variable validation. - Issue: #12165 - Tag maintainer: @eyurtsev - Twitter handle: @treewatcha75751 --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-03-30 00:52:25 +00:00
anshaneel	0884e5de7f	community[minor]: Add Alpha Vantage API Tool (#14332 ) ### Description This implementation adds functionality from the AlphaVantage API, renowned for its comprehensive financial data. The class encapsulates various methods, each dedicated to fetching specific types of financial information from the API. ### Implemented Functions - `search_symbols`: - Searches the AlphaVantage API for financial symbols using the provided keywords. - `_get_market_news_sentiment`: - Retrieves market news sentiment for a specified stock symbol from the AlphaVantage API. - `_get_time_series_daily`: - Fetches daily time series data for a specific symbol from the AlphaVantage API. - `_get_quote_endpoint`: - Obtains the latest price and volume information for a given symbol from the AlphaVantage API. - `_get_time_series_weekly`: - Gathers weekly time series data for a particular symbol from the AlphaVantage API. - `_get_top_gainers_losers`: - Provides details on top gainers, losers, and most actively traded tickers in the US market from the AlphaVantage API. ### Issue: - #11994 ### Dependencies: - 'requests' library for HTTP requests. (import requests) - 'pytest' library for testing. (import pytest) --------- Co-authored-by: Adam Badar <94140103+adam-badar@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-30 00:44:01 +00:00
Alex Sherstinsky	a9bc212bf2	community[minor]: fix failing Predibase integration (#19776 ) - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message: *Delete this entire checklist* and replace with - Description: Langchain-Predibase integration was failing, because it was not current with the Predibase SDK; in addition, Predibase integration tests were instantiating the Langchain Community `Predibase` class with one required argument (`model`) missing. This change updates the Predibase SDK usage and fixes the integration tests. - Twitter handle: `@alexsherstinsky` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-30 00:38:13 +00:00
M.Abdulrahman Alnaseer	ba54f1577f	community[minor]: add support for llmsherpa (#19741 ) Thank you for contributing to LangChain! - [x] PR title: "community: added support for llmsherpa library" - [x] Add tests and docs: 1. Integration test: 'docs/docs/integrations/document_loaders/test_llmsherpa.py'. 2. an example notebook: `docs/docs/integrations/document_loaders/llmsherpa.ipynb`. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-29 16:04:57 -07:00
Hrvoje Milković	b7344e3347	community[minor]: Infobip tool integration (#16805 ) Description: Adding Tool that wraps Infobip API for sending sms or emails and email validation. Dependencies: None, Twitter handle: @hmilkovic Implementation: ``` libs/community/langchain_community/utilities/infobip.py ``` Integration tests: ``` libs/community/tests/integration_tests/utilities/test_infobip.py ``` Example notebook: ``` docs/docs/integrations/tools/infobip.ipynb ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-29 19:01:27 +00:00
morgana	074ad5095f	community[patch]: mmr search for Rockset vectorstore integration (#16908 ) - Description: Adding support for mmr search in the Rockset vectorstore integration. - Issue: N/A - Dependencies: N/A - Twitter handle: `@_morgan_adams_` --------- Co-authored-by: Rockset API Bot <admin@rockset.io> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-03-29 18:45:22 +00:00
shahrin014	f51e6a35ba	community[patch]: OllamaEmbeddings - Pass headers to post request (#16880 ) ## Feature - Set additional headers in constructor - Headers will be sent in post request This feature is useful if deploying Ollama on a cloud service such as hugging face, which requires authentication tokens to be passed in the request header. ## Tests - Test if header is passed - Test if header is not passed Similar to https://github.com/langchain-ai/langchain/pull/15881 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-29 18:44:52 +00:00
Jan Chorowski	b8b42ccbc5	community[minor]: Pathway vectorstore(#14859 ) - Description: Integration with pathway.com data processing pipeline acting as an always updated vectorstore - Issue: not applicable - Dependencies: optional dependency on [`pathway`](https://pypi.org/project/pathway/) - Twitter handle: pathway_com The PR provides and integration with `pathway` to provide an easy to use always updated vector store: ```python import pathway as pw from langchain.embeddings.openai import OpenAIEmbeddings from langchain.text_splitter import CharacterTextSplitter from langchain.vectorstores import PathwayVectorClient, PathwayVectorServer data_sources = [] data_sources.append( pw.io.gdrive.read(object_id="17H4YpBOAKQzEJ93xmC2z170l0bP2npMy", service_user_credentials_file="credentials.json", with_metadata=True)) text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0) embeddings_model = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"]) vector_server = PathwayVectorServer( *data_sources, embedder=embeddings_model, splitter=text_splitter, ) vector_server.run_server(host="127.0.0.1", port="8765", threaded=True, with_cache=False) client = PathwayVectorClient( host="127.0.0.1", port="8765", ) query = "What is Pathway?" docs = client.similarity_search(query) ``` The `PathwayVectorServer` builds a data processing pipeline which continusly scans documents in a given source connector (google drive, s3, ...) and builds a vector store. The `PathwayVectorClient` implements LangChain's `VectorStore` interface and connects to the server to retrieve documents. --------- Co-authored-by: Mateusz Lewandowski <lewymati@users.noreply.github.com> Co-authored-by: mlewandowski <mlewandowski@MacBook-Pro-mlewandowski.local> Co-authored-by: Berke <berkecanrizai1@gmail.com> Co-authored-by: Adrian Kosowski <adrian@pathway.com> Co-authored-by: mlewandowski <mlewandowski@macbook-pro-mlewandowski.home> Co-authored-by: berkecanrizai <63911408+berkecanrizai@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: mlewandowski <mlewandowski@MBPmlewandowski.ht.home> Co-authored-by: Szymon Dudycz <szymond@pathway.com> Co-authored-by: Szymon Dudycz <szymon.dudycz@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-03-29 10:50:39 -07:00
高璟琦	ec7a59c96c	community[minor]: Add solar embedding (#19761 ) Solar is a large language model developed by [Upstage](https://upstage.ai/). It's a powerful and purpose-trained LLM. You can visit the embedding service provided by Solar within this pr. You may get SOLAR_API_KEY from https://console.upstage.ai/services/embedding You can refer to more details about accepted llm integration at https://python.langchain.com/docs/integrations/llms/solar.	2024-03-29 09:36:05 -07:00
Tomaz Bratanic	dec00d3050	community[patch]: Add the ability to pass maps to neo4j retrieval query (#19758 ) Makes it easier to flatten complex values to text, so you don't have to use a lot of Cypher to do it.	2024-03-29 08:33:48 -07:00
Robby	f7e8a382cc	community[minor]: add hugging face text-to-speech inference API (#18880 ) Description: I implemented a tool to use Hugging Face text-to-speech inference API. Issue: n/a Dependencies: n/a Twitter handle: No Twitter, but do have [LinkedIn](https://www.linkedin.com/in/robby-horvath/) lol. --------- Co-authored-by: Robby <h0rv@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-03-29 15:02:29 +00:00
DasDingoCodes	73eb3f8fd9	community[minor]: Implement DirectoryLoader lazy_load function (#19537 ) Thank you for contributing to LangChain! - [x] PR title: "community: Implement DirectoryLoader lazy_load function" - [x] Description: The `lazy_load` function of the `DirectoryLoader` yields each document separately. If the given `loader_cls` of the `DirectoryLoader` also implemented `lazy_load`, it will be used to yield subdocuments of the file. - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access: `libs/community/tests/unit_tests/document_loaders/test_directory_loader.py` 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory: `docs/docs/integrations/document_loaders/directory.ipynb` - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-03-29 14:46:52 +00:00
Jialei	f7c903e24a	community[minor]: add support for Moonshot llm and chat model (#17100 )	2024-03-29 08:54:23 +00:00
Ethan Yang	7164015135	community[minor]: Add Openvino embedding support (#19632 ) This PR is used to support both HF and BGE embeddings with openvino --------- Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>	2024-03-29 01:34:51 -07:00
kYLe	124ab79c23	community[minor]: Add Anyscale embedding support (#17605 ) Description: Add embedding model support for Anyscale Endpoint Dependencies: openai --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-29 00:53:53 +00:00
Paulo Nascimento	44a3484503	community[patch]: add NotebookLoader unit test (#17721 ) Thank you for contributing to LangChain! - Description: added unit tests for NotebookLoader. Linked PR: https://github.com/langchain-ai/langchain/pull/17614 - Issue: [#17614](https://github.com/langchain-ai/langchain/pull/17614) - Twitter handle: @paulodoestech - [x] Pass lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified to check that you're passing lint and testing. See contribution guidelines for more information on how to write/run tests, lint, etc: https://python.langchain.com/docs/contributing/ - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: lachiewalker <lachiewalker1@hotmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-29 00:27:46 +00:00
Paulo Nascimento	4c3a67122f	community[patch]: add Integration for OpenAI image gen with v1 sdk (#17771 ) Description: Created a Langchain Tool for OpenAI DALLE Image Generation. Issue: [#15901](https://github.com/langchain-ai/langchain/issues/15901) Dependencies: n/a Twitter handle: @paulodoestech - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-29 00:23:14 +00:00
Jiaming	3d3cc71287	community[patch]: fix bugs for bilibili Loader (#18036 ) - Description: 1. Fix the BiliBiliLoader that can receive cookie parameters, it requires 3 other parameters to run. The change is backward compatible. 2. Add test; 3. Add example in docs - Issue: [#14213] Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-03-28 16:39:38 -07:00
Sachin Paryani	25c9f3d1d1	community[patch]: Support Streaming in Azure Machine Learning (#18246 ) - [x] PR title: "community: Support streaming in Azure ML and few naming changes" - [x] PR message: - Description: Added support for streaming for azureml_endpoint. Also, renamed and AzureMLEndpointApiType.realtime to AzureMLEndpointApiType.dedicated. Also, added new classes CustomOpenAIChatContentFormatter and CustomOpenAIContentFormatter and updated the classes LlamaChatContentFormatter and LlamaContentFormatter to now show a deprecated warning message when instantiated. --------- Co-authored-by: Sachin Paryani <saparan@microsoft.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-28 23:38:20 +00:00
Victor Adan	afa2d85405	community[patch]: Added missing from_documents method to KNNRetriever. (#18411 ) - Description: Added missing `from_documents` method to `KNNRetriever`, providing the ability to supply metadata to LangChain `Document`s, and to give it parity to the other retrievers, which do have `from_documents`. - Issue: None - Dependencies: None - Twitter handle: None Co-authored-by: Victor Adan <vadan@netroadshow.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-03-28 23:18:50 +00:00
wulixuan	b7c8bc8268	community[patch]: fix yuan2 errors in LLMs (#19004 ) 1. fix yuan2 errors while invoke Yuan2. 2. update tests.	2024-03-28 14:37:44 -07:00
Davide Menini	f7042321f1	community[patch]: gather token usage info in BedrockChat during generation (#19127 ) This PR allows to calculate token usage for prompts and completion directly in the generation method of BedrockChat. The token usage details are then returned together with the generations, so that other downstream tasks can access them easily. This allows to define a callback for tokens tracking and cost calculation, similarly to what happens with OpenAI (see [OpenAICallbackHandler](https://api.python.langchain.com/en/latest/_modules/langchain_community/callbacks/openai_info.html#OpenAICallbackHandler). I plan on adding a BedrockCallbackHandler later. Right now keeping track of tokens in the callback is already possible, but it requires passing the llm, as done here: https://how.wtf/how-to-count-amazon-bedrock-anthropic-tokens-with-langchain.html. However, I find the approach of this PR cleaner. Thanks for your reviews. FYI @baskaryan, @hwchase17 --------- Co-authored-by: taamedag <Davide.Menini@swisscom.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-28 18:58:46 +00:00
Christian Galo	1adaa3c662	community[minor]: Update Azure Cognitive Services to Azure AI Services (#19488 ) This is a follow up to #18371. These are the changes: - New Azure AI Services toolkit and tools to replace those of Azure Cognitive Services. - Updated documentation for Microsoft platform. - The image analysis tool has been rewritten to use the new package `azure-ai-vision-imageanalysis`, doing a proper replacement of `azure-ai-vision`. These changes: - Update outdated naming from "Azure Cognitive Services" to "Azure AI Services". - Update documentation to use non-deprecated methods to create and use agents. - Removes need to depend on yanked python package (`azure-ai-vision`) There is one new dependency that is needed as a replacement to `azure-ai-vision`: - `azure-ai-vision-imageanalysis`. This is optional and declared within a function. There is a new `azure_ai_services.ipynb` notebook showing usage; Changes have been linted and formatted. I am leaving the actions of adding deprecation notices and future removal of Azure Cognitive Services up to the LangChain team, as I am not sure what the current practice around this is. --- If this PR makes it, my handle is @galo@mastodon.social --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-03-28 03:19:02 +00:00
Shengsheng Huang	ac1dd8ad94	community[minor]: migrate `bigdl-llm` to `ipex-llm` (#19518 ) - Description: `bigdl-llm` library has been renamed to [`ipex-llm`](https://github.com/intel-analytics/ipex-llm). This PR migrates the `bigdl-llm` integration to `ipex-llm` . - Issue: N/A. The original PR of `bigdl-llm` is https://github.com/langchain-ai/langchain/pull/17953 - Dependencies: `ipex-llm` library - Contribution maintainer: @shane-huang Updated doc: docs/docs/integrations/llms/ipex_llm.ipynb Updated test: libs/community/tests/integration_tests/llms/test_ipex_llm.py	2024-03-27 20:12:59 -07:00
Chaunte W. Lacewell	a31f692f4e	community[minor]: Add VDMS vectorstore (#19551 ) - Description: Add support for Intel Lab's [Visual Data Management System (VDMS)](https://github.com/IntelLabs/vdms) as a vector store - Dependencies: `vdms` library which requires protobuf = "4.24.2". There is a conflict with dashvector in `langchain` package but conflict is resolved in `community`. - Contribution maintainer: [@cwlacewe](https://github.com/cwlacewe) - Added tests: libs/community/tests/integration_tests/vectorstores/test_vdms.py - Added docs: docs/docs/integrations/vectorstores/vdms.ipynb - Added cookbook: cookbook/multi_modal_RAG_vdms.ipynb --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-28 03:12:11 +00:00
yongheng.liu	7e29b6061f	community[minor]: integrate China Mobile Ecloud vector search (#15298 ) - Description: integrate China Mobile Ecloud vector search, - Dependencies: elasticsearch==7.10.1 Co-authored-by: liuyongheng <liuyongheng@cmss.chinamobile.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-27 23:02:40 +00:00
CaroFG	cf96060ab7	community[patch]: update for compatibility with latest Meilisearch version (#18970 ) - Description: Updates Meilisearch vectorstore for compatibility with v1.6 and above. Adds embedders settings and embedder_name which are now required. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-27 22:08:27 +00:00
yuwenzho	3a7d2cf443	community[minor]: Add ITREX optimized Embeddings (#18474 ) Introduction [Intel® Extension for Transformers](https://github.com/intel/intel-extension-for-transformers) is an innovative toolkit designed to accelerate GenAI/LLM everywhere with the optimal performance of Transformer-based models on various Intel platforms Description adding ITREX runtime embeddings using intel-extension-for-transformers. added mdx documentation and example notebooks added embedding import testing. --------- Signed-off-by: yuwenzho <yuwen.zhou@intel.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-27 07:22:06 +00:00
Fabrizio Ruocco	f12cb0bea4	community[patch]: Microsoft Azure Document Intelligence updates (#16932 ) - Description: Update Azure Document Intelligence implementation by Microsoft team and RAG cookbook with Azure AI Search --------- Co-authored-by: Lu Zhang (AI) <luzhan@microsoft.com> Co-authored-by: Yateng Hong <yatengh@microsoft.com> Co-authored-by: teethache <hongyateng2006@126.com> Co-authored-by: Lu Zhang <44625949+luzhang06@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-03-26 23:36:59 -07:00
xsai9101	160a8eb178	community[minor]: add oracle autonomous database doc loader integration (#19536 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: Adding oracle autonomous database document loader integration. This will allow users to connect to oracle autonomous database through connection string or TNS configuration. https://www.oracle.com/autonomous-database/ - Issue: None - Dependencies: oracledb python package https://pypi.org/project/oracledb/ - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. Unit test and doc are added. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-26 17:02:18 -07:00
Adrian Valente	2763d8cbe5	community: add len() implementation to Chroma (#19419 ) Thank you for contributing to LangChain! - [x] Add len() implementation to Chroma: "package: community" - [x] PR message: - Description: add an implementation of the __len__() method for the Chroma vectostore, for convenience. - Issue: no exposed method to know the size of a Chroma vectorstore - Dependencies: None - Twitter handle: lowrank_adrian - [x] Add tests and docs - [x] Lint and test --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-03-26 12:53:10 -04:00
Yuki Watanabe	cfecbda48b	community[minor]: Allow passing `allow_dangerous_deserialization` when loading LLM chain (#18894 ) ### Issue Recently, the new `allow_dangerous_deserialization` flag was introduced for preventing unsafe model deserialization that relies on pickle without user's notice (#18696). Since then some LLMs like Databricks requires passing in this flag with true to instantiate the model. However, this breaks existing functionality to loading such LLMs within a chain using `load_chain` method, because the underlying loader function [load_llm_from_config](`f96dd57501/libs/langchain/langchain/chains/loading.py (L40)`) (and load_llm) ignores keyword arguments passed in. ### Solution This PR fixes this issue by propagating the `allow_dangerous_deserialization` argument to the class loader iff the LLM class has that field. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-03-26 11:07:55 -04:00
Christophe Bornet	8595c3ab59	community[minor]: Add InMemoryVectorStore to module level imports (#19576 )	2024-03-26 14:07:44 +00:00
Aayush Kataria	03c38005cb	community[patch]: Fixing some caching issues for AzureCosmosDBSemanticCache (#18884 ) Fixing some issues for AzureCosmosDBSemanticCache - Added the entry for "AzureCosmosDBSemanticCache" which was missing in langchain/cache.py - Added application name when creating the MongoClient for the AzureCosmosDBVectorSearch, for tracking purposes. @baskaryan, can you please review this PR, we need this to go in asap. These are just small fixes which we found today in our testing.	2024-03-25 19:06:17 -07:00
Anindyadeep	b2a11ce686	community[minor]: Prem AI langchain integration (#19113 ) ### Prem SDK integration in LangChain This PR adds the integration with [PremAI's](https://www.premai.io/) prem-sdk with langchain. User can now access to deployed models (llms/embeddings) and use it with langchain's ecosystem. This PR adds the following: ### This PR adds the following: - [x] Add chat support - [X] Adding embedding support - [X] writing integration tests - [X] writing tests for chat - [X] writing tests for embedding - [X] writing unit tests - [X] writing tests for chat - [X] writing tests for embedding - [X] Adding documentation - [X] writing documentation for chat - [X] writing documentation for embedding - [X] run `make test` - [X] run `make lint`, `make lint_diff` - [X] Final checks (spell check, lint, format and overall testing) --------- Co-authored-by: Anindyadeep Sannigrahi <anindyadeepsannigrahi@Anindyadeeps-MacBook-Pro.local> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-03-26 01:37:19 +00:00
Dmitry Tyumentsev	08b769d539	community[patch]: YandexGPT Use recent yandexcloud sdk version (#19341 ) Fixed inability to work with [yandexcloud SDK](https://pypi.org/project/yandexcloud/) version higher 0.265.0	2024-03-25 17:05:57 -07:00
Marlene	f1313339ac	community[patch]: Fixing incorrect base URLs for Azure Cognitive Search Retriever (#19352 ) This PR adds code to make sure that the correct base URL is being created for the Azure Cognitive Search retriever. At the moment an incorrect base URL is being generated. I think this is happening because the original code was based on a depreciated API version. No dependencies need to be added. I've also added more context to the test doc strings. I should also note that ACS is now Azure AI Search. I will open a separate PR to make these changes as that would be a breaking change and should potentially be discussed. Twitter: @marlene_zw - No new tests added, however the current ACS retriever tests are now passing when I run them. - Code was linted. Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-03-26 00:04:59 +00:00
Mikelarg	dac2e0165a	community[minor]: Added GigaChat Embeddings support + updated previous GigaChat integration (#19516 ) - Description: Added integration with [GigaChat](https://developers.sber.ru/portal/products/gigachat) embeddings. Also added support for extra fields in GigaChat LLM and fixed docs.	2024-03-25 16:08:37 -07:00
Martin Kolb	e5bdb26f76	community[patch]: More flexible handling for entity names in vector store "HANA Cloud" (#19523 ) - Description: Added support for lower-case and mixed-case names The names for tables and columns previouly had to be UPPER_CASE. With this enhancement, also lower_case and MixedCase are supported, - Issue: N/A - Dependencies: no new dependecies added - Twitter handle: @sapopensource	2024-03-25 15:52:45 -07:00
Igor Muniz Soares	743f888580	community[minor]: Dappier chat model integration (#19370 ) Description: This PR adds [Dappier](https://dappier.com/) for the chat model. It supports generate, async generate, and batch functionalities. We added unit and integration tests as well as a notebook with more details about our chat model. Dependencies: No extra dependencies are needed.	2024-03-25 07:29:05 +00:00
Hugoberry	96dc180883	community[minor]: Add `DuckDB` as a vectorstore (#18916 ) DuckDB has a cosine similarity function along list and array data types, which can be used as a vector store. - Description: The latest version of DuckDB features a cosine similarity function, which can be used with its support for list or array column types. This PR surfaces this functionality to langchain. - Dependencies: duckdb 0.10.0 - Twitter handle: @igocrite --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-03-25 07:02:35 +00:00
Christophe Bornet	00614f332a	community[minor]: Add InMemoryVectorStore (#19326 ) This is a basic VectorStore implementation using an in-memory dict to store the documents. It doesn't need any extra/optional dependency as it uses numpy which is already a dependency of langchain. This is useful for quick testing, demos, examples. Also it allows to write vendor-neutral tutorials, guides, etc...	2024-03-20 10:21:07 -04:00
Nithish Raghunandanan	7ad0a3f2a7	community: add Couchbase Vector Store (#18994 ) - Description: Added support for Couchbase Vector Search to LangChain. - Dependencies: couchbase>=4.1.12 - Twitter handle: @nithishr --------- Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>	2024-03-19 12:39:51 -07:00
Vittorio Rigamonti	9b2f9ee952	community: VectorStore Infinispan, adding autoconfiguration (#18967 ) Description: this PR enable VectorStore autoconfiguration for Infinispan: if metadatas are only of basic types, protobuf config will be automatically generated for the user.	2024-03-18 21:33:45 -07:00
gonvee	b82644078e	community: Add `keep_alive` parameter to control how long the model w… (#19005 ) Add `keep_alive` parameter to control how long the model will stay loaded into memory with Ollama。 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-19 04:29:01 +00:00
Leonid Ganeline	7de1d9acfd	community: `llms` imports fixes (#18943 ) Classes are missed in __all__ and in different places of __init__.py - BaichuanLLM - ChatDatabricks - ChatMlflow - Llamafile - Mlflow - Together Added classes to __all__. I also sorted __all__ list. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-18 20:24:40 +00:00
Pengfei Jiang	514fe80778	community[patch]: add stop parameter support to volcengine maas (#19052 ) - Description: add stop parameter to volcengine maas model - Dependencies: no --------- Co-authored-by: 江鹏飞 <jiangpengfei.jiangpf@bytedance.com>	2024-03-17 01:58:50 +00:00
wulixuan	0e0030f494	community[patch]: fix yuan2 chat model errors while invoke. (#19015 ) 1. fix yuan2 chat model errors while invoke. 2. update related tests. 3. fix some deprecationWarning.	2024-03-15 16:28:36 -07:00
fengjial	c922ea36cb	community[minor]: Add Baidu VectorDB as vector store (#17997 ) Co-authored-by: fengjialin <fengjialin@MacBook-Pro.local>	2024-03-15 19:01:58 +00:00
Eugene Yurtsev	6cdca4355d	community[minor]: Revamp PGVector Filtering (#18992 ) This PR makes the following updates in the pgvector database: 1. Use JSONB field for metadata instead of JSON 2. Update operator syntax to include required `$` prefix before the operators (otherwise there will be name collisions with fields) 3. The change is non-breaking, old functionality is still the default, but it will emit a deprecation warning 4. Previous functionality has bugs associated with comparisons due to casting to text (so lexical ordering is used incorrectly for numeric fields) 5. Adds an a GIN index on the JSONB field for more efficient querying	2024-03-14 16:56:00 -04:00
Leonid Ganeline	9c8523b529	community[patch]: flattening imports 3 (#18939 ) @eyurtsev	2024-03-12 15:18:54 -07:00
Virat Singh	cafffe8a21	community: Add PolygonAggregates tool (#18882 ) Description: In this PR, I am adding a `PolygonAggregates` tool, which can be used to get historical stock price data (called aggregates by Polygon) for a given ticker. Polygon [docs](https://polygon.io/docs/stocks/get_v2_aggs_ticker__stocksticker__range__multiplier___timespan___from___to) for this endpoint. Twitter: [@virattt](https://twitter.com/virattt)	2024-03-11 11:58:10 -07:00
Joshua Carroll	ddaf9de169	community: Fix bug with StreamlitChatMessageHistory (#18834 ) - Description: Fix Streamlit bug which was introduced by https://github.com/langchain-ai/langchain/pull/18250, update integration test - Issue: https://github.com/langchain-ai/langchain/issues/18684 - Dependencies: None	2024-03-09 13:42:22 -08:00
Ishani Vyas	2b0cbd65ba	community[patch]: Add Passio Nutrition AI Food Search Tool to Community Package (#18278 ) ## Add Passio Nutrition AI Food Search Tool to Community Package ### Description We propose adding a new tool to the `community` package, enabling integration with Passio Nutrition AI for food search functionality. This tool will provide a simple interface for retrieving nutrition facts through the Passio Nutrition AI API, simplifying user access to nutrition data based on food search queries. ### Implementation Details - Class Structure: Implement `NutritionAI`, extending `BaseTool`. It includes an `_run` method that accepts a query string and, optionally, a `CallbackManagerForToolRun`. - API Integration: Use `NutritionAIAPI` for the API wrapper, encapsulating all interactions with the Passio Nutrition AI and providing a clean API interface. - Error Handling: Implement comprehensive error handling for API request failures. ### Expected Outcome - User Benefits: Enable easy querying of nutrition facts from Passio Nutrition AI, enhancing the utility of the `langchain_community` package for nutrition-related projects. - Functionality: Provide a straightforward method for integrating nutrition information retrieval into users' applications. ### Dependencies - `langchain_core` for base tooling support - `pydantic` for data validation and settings management - Consider `requests` or another HTTP client library if not covered by `NutritionAIAPI`. ### Tests and Documentation - Unit Tests: Include tests that mock network interactions to ensure tool reliability without external API dependency. - Documentation: Create an example notebook in `docs/docs/integrations/tools/passio_nutrition_ai.ipynb` showing usage, setup, and example queries. ### Contribution Guidelines Compliance - Adhere to the project's linting and formatting standards (`make format`, `make lint`, `make test`). - Ensure compliance with LangChain's contribution guidelines, particularly around dependency management and package modifications. ### Additional Notes - Aim for the tool to be a lightweight, focused addition, not introducing significant new dependencies or complexity. - Potential future enhancements could include caching for common queries to improve performance. ### Twitter Handle - Here is our Passio AI [twitter handle](https://twitter.com/@passio_ai) where we announce our products. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17.	2024-03-08 20:33:22 +00:00
Eugene Yurtsev	1f50274df7	community[patch]: Add pgvector to docker compose and update settings used in integration test (#18815 )	2024-03-08 14:39:28 -05:00
Christophe Bornet	e54a49b697	community[minor]: Add lazy_table_reflection param to SqlDatabase (#18742 ) For some DBs with lots of tables, reflection of all the tables can take very long. So this change will make the tables be reflected lazily when get_table_info() is called and `lazy_table_reflection` is True.	2024-03-08 14:10:23 -05:00
Tomaz Bratanic	4bfe888717	comunity[patch]: Fix neo4j sanitizing values (#18750 ) Fixing sanitization for when deeply nested lists appear	2024-03-07 19:21:52 -08:00
Yunmo Koo	fee6f983ef	community[minor]: Integration for `Friendli` LLM and `ChatFriendli` ChatModel. (#17913 ) ## Description - Add [Friendli](https://friendli.ai/) integration for `Friendli` LLM and `ChatFriendli` chat model. - Unit tests and integration tests corresponding to this change are added. - Documentations corresponding to this change are added. ## Dependencies - Optional dependency [`friendli-client`](https://pypi.org/project/friendli-client/) package is added only for those who use `Frienldi` or `ChatFriendli` model. ## Twitter handle - https://twitter.com/friendliai	2024-03-08 02:20:47 +00:00
Ian	390ef6abe3	community[minor]: Add Initial Support for TiDB Vector Store (#15796 ) This pull request introduces initial support for the TiDB vector store. The current version is basic, laying the foundation for the vector store integration. While this implementation provides the essential features, we plan to expand and improve the TiDB vector store support with additional enhancements in future updates. Upcoming Enhancements: * Support for Vector Index Creation: To enhance the efficiency and performance of the vector store. * Support for max marginal relevance search. * Customized Table Structure Support: Recognizing the need for flexibility, we plan for more tailored and efficient data store solutions. Simple use case exmaple ```python from typing import List, Tuple from langchain.docstore.document import Document from langchain_community.vectorstores import TiDBVectorStore from langchain_openai import OpenAIEmbeddings db = TiDBVectorStore.from_texts( embedding=embeddings, texts=['Andrew like eating oranges', 'Alexandra is from England', 'Ketanji Brown Jackson is a judge'], table_name="tidb_vector_langchain", connection_string=tidb_connection_url, distance_strategy="cosine", ) query = "Can you tell me about Alexandra?" docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query) for doc, score in docs_with_score: print("-" * 80) print("Score: ", score) print(doc.page_content) print("-" * 80) ```	2024-03-07 17:18:20 -08:00
Eugene Yurtsev	e188d4ecb0	Add dangerous parameter to requests tool (#18697 ) The tools are already documented as dangerous. Not clear whether adding an opt-in parameter is necessary or not	2024-03-07 15:10:56 -05:00
Erick Friis	1beb84b061	community[patch]: move pdf text tests to integration (#18746 )	2024-03-07 10:34:22 -08:00
Christophe Bornet	4a7d73b39d	community: If load() has been overridden, use it in default lazy_load() (#18690 )	2024-03-07 11:52:19 -05:00
Sam Khano	1b4dcf22f3	community[minor]: Add DocumentDBVectorSearch VectorStore (#17757 ) Description: - Added Amazon DocumentDB Vector Search integration (HNSW index) - Added integration tests - Updated AWS documentation with DocumentDB Vector Search instructions - Added notebook for DocumentDB integration with example usage --------- Co-authored-by: EC2 Default User <ec2-user@ip-172-31-95-226.ec2.internal>	2024-03-06 15:11:34 -08:00
Vittorio Rigamonti	51f3902bc4	community[minor]: Adding support for Infinispan as VectorStore (#17861 ) Description: This integrates Infinispan as a vectorstore. Infinispan is an open-source key-value data grid, it can work as single node as well as distributed. Vector search is supported since release 15.x For more: [Infinispan Home](https://infinispan.org) Integration tests are provided as well as a demo notebook	2024-03-06 15:11:02 -08:00
Djordje	12b4a4d860	community[patch]: Opensearch delete method added - indexing supported (#18522 ) - Description: Added delete method for OpenSearchVectorSearch, therefore indexing supported - Issue: No - Dependencies: No - Twitter handle: stkbmf	2024-03-06 15:08:47 -08:00
Eugene Yurtsev	4c25b49229	community[major]: breaking change in some APIs to force users to opt-in for pickling (#18696 ) This is a PR that adds a dangerous load parameter to force users to opt in to use pickle. This is a PR that's meant to raise user awareness that the pickling module is involved.	2024-03-06 16:43:01 -05:00
Eugene Yurtsev	0e52961562	community[patch]: Patch tdidf retriever (CVE-2024-2057) (#18695 ) This is a patch for `CVE-2024-2057`: https://www.cve.org/CVERecord?id=CVE-2024-2057 This affects users that: * Use the `TFIDFRetriever` * Attempt to de-serialize it from an untrusted source that contains a malicious payload	2024-03-06 15:49:04 -05:00
Christophe Bornet	15b1770326	Merge pull request #18421 * Implement lazy_load() for AssemblyAIAudioTranscriptLoader	2024-03-06 13:16:05 -05:00
Christophe Bornet	bb284eebe4	Merge pull request #18436 * Implement lazy_load() for ConfluenceLoader	2024-03-06 13:15:24 -05:00
Liang Zhang	81985b31e6	community[patch]: Databricks SerDe uses cloudpickle instead of pickle (#18607 ) - Description: Databricks SerDe uses cloudpickle instead of pickle when serializing a user-defined function transform_input_fn since pickle does not support functions defined in `__main__`, and cloudpickle supports this. - Dependencies: cloudpickle>=2.0.0 Added a unit test.	2024-03-05 18:04:45 -08:00
Dounx	ad48f55357	community[minor]: add Yuque document loader (#17924 ) This pull request support loading documents from Yuque with Langchain. Yuque is a professional cloud-based knowledge base for team collaboration in documentation. Website: https://www.yuque.com OpenAPI: https://www.yuque.com/yuque/developer/openapi	2024-03-05 15:54:07 -08:00
Kazuki Maeda	60c5d964a8	community[minor]: use jq schema for content_key in json_loader (#18003 ) ### Description Changed the value specified for `content_key` in JSONLoader from a single key to a value based on jq schema. I created [similar PR](https://github.com/langchain-ai/langchain/pull/11255) before, but it has several conflicts because of the architectural change associated stable version release, so I re-create this PR to fit new architecture. ### Why For json data like the following, specify `.data[].attributes.message` for page_content and `.data[].attributes.id` or `.data[].attributes.attributes. tags`, etc., the `content_key` must also parse the json structure. <details> <summary>sample json data</summary> ```json { "data": [ { "attributes": { "message": "message1", "tags": [ "tag1" ] }, "id": "1" }, { "attributes": { "message": "message2", "tags": [ "tag2" ] }, "id": "2" } ] } ``` </details> <details> <summary>sample code</summary> ```python def metadata_func(record: dict, metadata: dict) -> dict: metadata["source"] = None metadata["id"] = record.get("id") metadata["tags"] = record["attributes"].get("tags") return metadata sample_file = "sample1.json" loader = JSONLoader( file_path=sample_file, jq_schema=".data[]", content_key=".attributes.message", ## content_key is parsable into jq schema is_content_key_jq_parsable=True, ## this is added parameter metadata_func=metadata_func ) data = loader.load() data ``` </details> ### Dependencies none ### Twitter handle [kzk_maeda](https://twitter.com/kzk_maeda)	2024-03-05 15:51:24 -08:00
Tomaz Bratanic	ea51cdaede	Remove neo4j bloom labels from graph schema (#18564 ) Neo4j tools use particular node labels and relationship types to store metadata, but are irrelevant for text2cypher or graph generation, so we want to ignore them in the schema representation.	2024-03-05 12:54:05 -08:00
Erick Friis	e1924b3e93	core[patch]: deprecate hwchase17/langchain-hub, address path traversal (#18600 ) Deprecates the old langchain-hub repository. Does not deprecate the new https://smith.langchain.com/hub @PinkDraconian has correctly raised that in the event someone is loading unsanitized user input into the `try_load_from_hub` function, they have the ability to load files from other locations in github than the hwchase17/langchain-hub repository. This PR adds some more path checking to that function and deprecates the functionality in favor of the hub built into LangSmith.	2024-03-05 12:49:38 -08:00
Scott Nath	b051bba1a9	community: Add you.com tool, add async to retriever, add async testing, add You tool doc (#18032 ) - Description: finishes adding the you.com functionality including: - add async functions to utility and retriever - add the You.com Tool - add async testing for utility, retriever, and tool - add a tool integration notebook page - Dependencies: any dependencies required for this change - Twitter handle: @scottnath	2024-03-03 14:30:05 -08:00
Aayush Kataria	7c2f3f6f95	community[minor]: Adding Azure Cosmos Mongo vCore Vector DB Cache (#16856 ) Description: This pull request introduces several enhancements for Azure Cosmos Vector DB, primarily focused on improving caching and search capabilities using Azure Cosmos MongoDB vCore Vector DB. Here's a summary of the changes: - AzureCosmosDBSemanticCache: Added a new cache implementation called AzureCosmosDBSemanticCache, which utilizes Azure Cosmos MongoDB vCore Vector DB for efficient caching of semantic data. Added comprehensive test cases for AzureCosmosDBSemanticCache to ensure its correctness and robustness. These tests cover various scenarios and edge cases to validate the cache's behavior. - HNSW Vector Search: Added HNSW vector search functionality in the CosmosDB Vector Search module. This enhancement enables more efficient and accurate vector searches by utilizing the HNSW (Hierarchical Navigable Small World) algorithm. Added corresponding test cases to validate the HNSW vector search functionality in both AzureCosmosDBSemanticCache and AzureCosmosDBVectorSearch. These tests ensure the correctness and performance of the HNSW search algorithm. - LLM Caching Notebook - The notebook now includes a comprehensive example showcasing the usage of the AzureCosmosDBSemanticCache. This example highlights how the cache can be employed to efficiently store and retrieve semantic data. Additionally, the example provides default values for all parameters used within the AzureCosmosDBSemanticCache, ensuring clarity and ease of understanding for users who are new to the cache implementation. @hwchase17,@baskaryan, @eyurtsev,	2024-03-03 14:04:15 -08:00
Kate Silverstein	b7c71e2e07	community[minor]: llamafile embeddings support (#17976 ) * Description: adds `LlamafileEmbeddings` class implementation for generating embeddings using [llamafile](https://github.com/Mozilla-Ocho/llamafile)-based models. Includes related unit tests and notebook showing example usage. * Issue: N/A * Dependencies: N/A	2024-03-01 13:49:18 -08:00
Tomaz Bratanic	f6bfb969ba	community[patch]: Add an option for indexed generic label when import neo4j graph documents (#18122 ) Current implementation doesn't have an indexed property that would optimize the import. I have added a `baseEntityLabel` parameter that allows you to add a secondary node label, which has an indexed id `property`. By default, the behaviour is identical to previous version. Since multi-labeled nodes are terrible for text2cypher, I removed the secondary label from schema representation object and string, which is used in text2cypher.	2024-03-01 12:33:52 -08:00
Arun Sathiya	4adac20d7b	community[patch]: Make cohere_api_key a SecretStr (#12188 ) This PR makes `cohere_api_key` in `llms/cohere` a SecretStr, so that the API Key is not leaked when `Cohere.cohere_api_key` is represented as a string. --------- Signed-off-by: Arun <arun@arun.blog> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-03-01 20:27:53 +00:00
Petteri Johansson	6c1989d292	community[minor], langchain[minor], docs: Gremlin Graph Store and QA Chain (#17683 ) - Description: New feature: Gremlin graph-store and QA chain (including docs). Compatible with Azure CosmosDB. - Dependencies: no changes	2024-03-01 12:21:14 -08:00
Ather Fawaz	a5ccf5d33c	community[minor]: Add support for Perplexity chat model(#17024 ) - Description: This PR adds support for [Perplexity AI APIs](https://blog.perplexity.ai/blog/introducing-pplx-api). - Issues: None - Dependencies: None - Twitter handle: [@atherfawaz](https://twitter.com/AtherFawaz) --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-01 12:19:23 -08:00
Rodrigo Nogueira	3438d2cbcc	community[minor]: add maritalk chat (#17675 ) Description: Adds the MariTalk chat that is based on a LLM specially trained for Portuguese. Twitter handle: @MaritacaAI	2024-03-01 12:18:23 -08:00
Yujie Qian	cbb65741a7	community[patch]: Voyage AI updates default model and batch size (#17655 ) - Description: update the default model and batch size in VoyageEmbeddings - Issue: N/A - Dependencies: N/A - Twitter handle: N/A --------- Co-authored-by: fodizoltan <zoltan@conway.expert>	2024-03-01 10:22:24 -08:00
Shengsheng Huang	ae471a7dcb	community[minor]: add BigDL-LLM integrations (#17953 ) - Description: [`bigdl-llm`](https://github.com/intel-analytics/BigDL) is a library for running LLM on Intel XPU (from Laptop to GPU to Cloud) using INT4/FP4/INT8/FP8 with very low latency (for any PyTorch model). This PR adds bigdl-llm integrations to langchain. - Issue: NA - Dependencies: `bigdl-llm` library - Contribution maintainer: @shane-huang Examples added: - docs/docs/integrations/llms/bigdl.ipynb	2024-03-01 10:04:53 -08:00
Ethan Yang	f61cb8d407	community[minor]: Add openvino backend support (#11591 ) - Description: add openvino backend support by HuggingFace Optimum Intel, - Dependencies: “optimum[openvino]”, --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-01 10:04:24 -08:00
RadhikaBansal97	8bafd2df5e	community[patch]: Change github endpoint in GithubLoader (#17622 ) Description- - Changed the GitHub endpoint as existing was not working and giving 404 not found error - Also the existing function was failing if file_filter is not passed as the tree api return all paths including directory as well, and when get_file_content was iterating over these path, the function was failing for directory as the api was returning list of files inside the directory, so added a condition to ignore the paths if it a directory - Fixes this issue - https://github.com/langchain-ai/langchain/issues/17453 Co-authored-by: Radhika Bansal <Radhika.Bansal@veritas.com>	2024-03-01 09:36:31 -08:00
Eugene Yurtsev	51b661cfe8	community[patch]: BaseLoader load method should just delegate to lazy_load (#18289 ) load() should just reference lazy_load()	2024-02-29 21:45:28 -05:00
Kai Kugler	df234fb171	community[patch]: Fixing embedchain document mapping (#18255 ) - Description: The current embedchain implementation seems to handle document metadata differently than done in the current implementation of langchain and a KeyError is thrown. I would love for someone else to test this... --------- Co-authored-by: KKUGLER <kai.kugler@mercedes-benz.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Deshraj Yadav <deshraj@gatech.edu>	2024-02-29 14:54:37 -08:00
Erick Friis	040271f33a	community[patch]: remove llmlingua extended tests (#18344 )	2024-02-29 13:51:29 -08:00
Tomaz Bratanic	5999c4a240	Add support for parameters in neo4j retrieval query (#18310 ) Sometimes, you want to use various parameters in the retrieval query of Neo4j Vector to personalize/customize results. Before, when there were only predefined chains, it didn't really make sense. Now that it's all about custom chains and LCEL, it is worth adding since users can inject any params they wish at query time. Isn't prone to SQL injection-type attacks since we use parameters and not concatenating strings.	2024-02-29 13:00:54 -08:00
Virat Singh	cd926ac3dd	community: Add PolygonFinancials Tool (#18324 ) Description: In this PR, I am adding a `PolygonFinancials` tool, which can be used to get financials data for a given ticker. The financials data is the fundamental data that is found in income statements, balance sheets, and cash flow statements of public US companies. Twitter: [@virattt](https://twitter.com/virattt)	2024-02-29 10:56:05 -08:00
Eugene Yurtsev	cd52433ba0	community[minor]: Add `SQLDatabaseLoader` document loader (#18281 ) - Description: A generic document loader adapter for SQLAlchemy on top of LangChain's `SQLDatabaseLoader`. - Needed by: https://github.com/crate-workbench/langchain/pull/1 - Depends on: GH-16655 - Addressed to: @baskaryan, @cbornet, @eyurtsev Hi from CrateDB again, in the same spirit like GH-16243 and GH-16244, this patch breaks out another commit from https://github.com/crate-workbench/langchain/pull/1, in order to reduce the size of this patch before submitting it, and to separate concerns. To accompany the SQLAlchemy adapter implementation, the patch includes integration tests for both SQLite and PostgreSQL. Let me know if corresponding utility resources should be added at different spots. With kind regards, Andreas. ### Software Tests ```console docker compose --file libs/community/tests/integration_tests/document_loaders/docker-compose/postgresql.yml up ``` ```console cd libs/community pip install psycopg2-binary pytest -vvv tests/integration_tests -k sqldatabase ``` ``` 14 passed ``` ![image](https://github.com/langchain-ai/langchain/assets/453543/42be233c-eb37-4c76-a830-474276e01436) --------- Co-authored-by: Andreas Motl <andreas.motl@crate.io>	2024-02-28 21:02:28 +00:00
David Ruan	af35e2525a	community[minor]: add hugging_face_model document loader (#17323 ) - Description: add hugging_face_model document loader, - Issue: NA, - Dependencies: NA, --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-28 20:05:35 +00:00
Ayo Ayibiowu	ac1d7d9de8	community[feat]: Adds LLMLingua as a document compressor (#17711 ) Description: This PR adds support for using the [LLMLingua project ](https://github.com/microsoft/LLMLingua) especially the LongLLMLingua (Enhancing Large Language Model Inference via Prompt Compression) as a document compressor / transformer. The LLMLingua project is an interesting project that can greatly improve RAG system by compressing prompts and contexts while keeping their semantic relevance. Issue: https://github.com/microsoft/LLMLingua/issues/31 Dependencies: [llmlingua](https://pypi.org/project/llmlingua/) @baskaryan --------- Co-authored-by: Ayodeji Ayibiowu <ayodeji.ayibiowu@getinge.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-27 19:23:56 -08:00
am-kinetica	9b8f6455b1	Langchain vectorstore integration with Kinetica (#18102 ) - Description: New vectorstore integration with the Kinetica database - Issue: - Dependencies: the Kinetica Python API `pip install gpudb==7.2.0.1`, - Tag maintainer: @baskaryan, @hwchase17 - Twitter handle: --------- Co-authored-by: Chad Juliano <cjuliano@kinetica.com>	2024-02-26 12:46:48 -08:00
Dan Stambler	69344a0661	community: Add Laser Embedding Integration (#18111 ) - Description: Added Integration with Meta AI's LASER Language-Agnostic SEntence Representations embedding library, which supports multilingual embedding for any of the languages listed here: https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200, including several low resource languages - Dependencies: laser_encoders	2024-02-26 12:16:37 -08:00
Barun Amalkumar Halder	cc69976860	community[minor] : adds callback handler for Fiddler AI (#17708 ) Description: Callback handler to integrate fiddler with langchain. This PR adds the following - 1. `FiddlerCallbackHandler` implementation into langchain/community 2. Example notebook `fiddler.ipynb` for usage documentation [Internal Tracker : FDL-14305] Issue: NA Dependencies: - Installation of langchain-community is unaffected. - Usage of FiddlerCallbackHandler requires installation of latest fiddler-client (2.5+) Twitter handle: @fiddlerlabs @behalder Co-authored-by: Barun Halder <barun@fiddler.ai>	2024-02-25 18:17:03 -08:00
2jimoo	7fc903464a	community: Add document manager and mongo document manager (#17320 ) - Description: - Add DocumentManager class, which is a nosql record manager. - In order to use index and aindex in libs/langchain/langchain/indexes/_api.py, DocumentManager inherits RecordManager. - Also I added the MongoDB implementation of Document Manager too. - Dependencies: pymongo, motor <!-- Thank you for contributing to LangChain! Please title your PR "<package>: <description>", where <package> is whichever of langchain, community, core, experimental, etc. is being modified. Replace this entire comment with: - Description: Add DocumentManager class, which is a no sql record manager. To use index method and aindex method in indexes._api.py, Document Manager inherits RecordManager.Add the MongoDB implementation of Document Manager. - Dependencies: pymongo, motor Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` from the root of the package you've modified to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://python.langchain.com/docs/contributing/ If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-23 21:32:52 -05:00
Chad Juliano	50ba3c68bb	community[minor]: add Kinetica LLM wrapper (#17879 ) Description: Initial pull request for Kinetica LLM wrapper Issue: N/A Dependencies: No new dependencies for unit tests. Integration tests require gpudb, typeguard, and faker Twitter handle: @chad_juliano Note: There is another pull request for Kinetica vectorstore. Ultimately we would like to make a partner package but we are starting with a community contribution.	2024-02-22 16:02:00 -08:00
Brad Erickson	ecd72d26cf	community: Bugfix - correct Ollama API path to avoid HTTP 307 (#17895 ) Sets the correct /api/generate path, without ending /, to reduce HTTP requests. Reference: https://github.com/ollama/ollama/blob/efe040f8/docs/api.md#generate-request-streaming Before: DEBUG: Starting new HTTP connection (1): localhost:11434 DEBUG: http://localhost:11434 "POST /api/generate/ HTTP/1.1" 307 0 DEBUG: http://localhost:11434 "POST /api/generate HTTP/1.1" 200 None After: DEBUG: Starting new HTTP connection (1): localhost:11434 DEBUG: http://localhost:11434 "POST /api/generate HTTP/1.1" 200 None	2024-02-22 11:59:55 -05:00

1 2 3 4 5 ...

414 Commits