langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-18 09:25:54 +00:00

Author	SHA1	Message	Date
Ashley Xu	ce7723c1e5	community[minor]: add additional support for `BigQueryVectorSearch` (#15904 ) BigQuery vector search lets you use GoogleSQL to do semantic search, using vector indexes for fast but approximate results, or using brute force for exact results. This PR: 1. Adds `metadata[_job_ib]` to the Document returned by any similarity search 2. Adds `explore_job_stats` so that users can explore job statistics, which improves debuggability 3. Sets the minimum row limit for running create vector index.	2024-01-15 10:45:15 -08:00
Mohammed Naqi	8799b028a6	community[minor]: Adding asynchronous function implementation for Doctran (#15941 ) ## Description In this update, I addressed the missing implementation for `atransform_documents`, which is the asynchronous counterpart of `transform_documents` in Doctran. ### Usage Example:
```py
# Instantiate DoctranPropertyExtractor with specified properties
property_extractor = DoctranPropertyExtractor(properties=properties)

# Asynchronously extract properties from a list of documents
extracted_document = await property_extractor.atransform_documents(
    documents, properties=properties
)

# Display metadata of the first extracted document
print(json.dumps(extracted_document[0].metadata, indent=2))
```
## Issue - Pull request #14525 has caused a break in the aforementioned code. Instead of removing an asynchronous implementation of a function, consider implementing a synchronous version alongside it.	2024-01-15 10:39:25 -08:00
Raunak	c0773ab329	community[patch]: Fixed 'coroutine' object is not subscriptable error (#15986 ) - Description: Added parentheses in the return statement of the aembed_query() function to fix the 'coroutine' object is not subscriptable error. - Dependencies: NA Co-authored-by: H161961 <Raunak.Raunak@Honeywell.com>	2024-01-15 10:34:10 -08:00
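The error fixed above comes from operator precedence: subscripting binds tighter than `await`, so without parentheses the coroutine object itself is indexed. A minimal sketch, assuming a hypothetical `embed_documents` stand-in (the function names below are illustrative, not the actual class methods):

```python
import asyncio


async def embed_documents(texts):
    # Hypothetical stand-in for an async embedding call: one vector per text.
    return [[float(len(t))] for t in texts]


async def embed_query_buggy(text):
    # Subscripting binds tighter than `await`, so this indexes the coroutine
    # object itself and raises TypeError before anything is awaited.
    return await embed_documents([text])[0]


async def embed_query_fixed(text):
    # Parentheses make the await finish first; the resulting list is indexed.
    return (await embed_documents([text]))[0]


def run_buggy():
    # Returns the error message produced by the buggy variant, if any.
    try:
        asyncio.run(embed_query_buggy("hello"))
    except TypeError as exc:
        return str(exc)
    return None
```

Calling `asyncio.run(embed_query_fixed("hello"))` returns the single vector, while `run_buggy()` returns the "not subscriptable" message.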
Karim Lalani	14244bd7e5	community[minor]: Added document loader for SurrealDB (#15995 ) Added a simple document loader to work with SurrealDB.	2024-01-15 10:32:42 -08:00
Karim Lalani	768e5e33bc	community[minor]: Fix to match SurrealDB 0.3.2 SDK (#15996 ) New version of SurrealDB python sdk was causing the integration to break. This fix addresses that change.	2024-01-15 10:31:59 -08:00
shahrin014	86321a949f	community: Ollama - Parameter structure to follow official documentation (#16035 ) ## Feature - Follow parameter structure as per official documentation - top level parameters (e.g. model, system, template) will be passed as top level parameters - other parameters will be sent in options unless options is provided ![image](https://github.com/langchain-ai/langchain/assets/17451563/d14715d9-9701-4ee3-b44b-89fffea62389) ## Tests - Test if top level parameters handled properly - Test if parameters that are not top level parameters are handled as options - Test if options is provided, it will be passed as is	2024-01-15 10:17:58 -08:00
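The parameter rules described above can be sketched roughly as follows. This is an illustration only: `build_payload` and `TOP_LEVEL_KEYS` are hypothetical names, and the set of top-level keys is limited to the three examples named in the commit message.

```python
# Hypothetical helper, not the actual LangChain implementation.
TOP_LEVEL_KEYS = {"model", "system", "template"}


def build_payload(prompt, params, options=None):
    """Split params into top-level request fields and an `options` dict."""
    payload = {"prompt": prompt}
    derived_options = {}
    for key, value in params.items():
        if value is None:
            continue  # unset parameters are not sent at all
        if key in TOP_LEVEL_KEYS:
            payload[key] = value
        else:
            derived_options[key] = value
    # If the caller provides `options`, it is passed through as is.
    payload["options"] = options if options is not None else derived_options
    return payload
```

For example, `build_payload("hi", {"model": "llama2", "temperature": 0.2})` keeps `model` at the top level and places `temperature` under `options`.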
Nir Kopler	0fa06732b7	community: add new gpt-3.5-turbo-1106 finetuned for cost calculation (#16039 ) Description: Added the new gpt-3.5-turbo-1106 to the fine-tuned cost calculation. Issue: no open issue found. According to OpenAI's pricing information, the price is the same as for the older model (0613).	2024-01-15 08:36:54 -08:00
Virat Singh	eb6e385dc5	community: Add PolygonAPIWrapper and get_last_quote endpoint (#15971 ) - Description: Added a `PolygonAPIWrapper` and an initial `get_last_quote` endpoint, which allows us to get the last price quote for a given `ticker`. Once merged, I can add a Polygon tool in `tools/` for agents to use. - Twitter handle: [@virattt](https://twitter.com/virattt) The Polygon.io Stocks API provides REST endpoints that let you query the latest market data from all US stock exchanges.	2024-01-12 17:52:09 -08:00
Varik Matevosyan	efe6cfafe2	community: Added Lantern as VectorStore (#12951 ) Support [Lantern](https://github.com/lanterndata/lantern) as a new VectorStore type. - Added Lantern as VectorStore. It will support 3 distance functions `l2 squared`, `cosine` and `hamming` and will use `HNSW` index. - Added tests - Added example notebook	2024-01-12 12:00:16 -08:00
Edwin Wenink	9fb09c1c30	community: fix the "page" mode in the AzureAIDocumentIntelligenceParser (bug) (#15958 ) Description: the "page" mode in the AzureAIDocumentIntelligenceParser is not accessible due to a wrong membership test. The mode argument can only be a string (also see the assertion in the `__init__`: `assert self.mode in ["single", "page", "object", "markdown"]`), so the check `elif self.mode == ["page"]:` always fails. As a result, the "object" mode is effectively used when the "page" mode is selected, which may lead to errors. The docstring of the `AzureAIDocumentIntelligenceLoader` also omitted the `mode` parameter altogether, so I added it. Issue: I could not find a related issue (this class is only 3 weeks old anyway). Dependencies: this PR does not introduce or affect dependencies. The current demo notebook and examples are not affected because they all use the default markdown mode.	2024-01-12 11:01:28 -08:00
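The failure mode described above can be reproduced in isolation: a string never compares equal to a one-element list, so the branch is unreachable. A minimal sketch, where the function names and the branch order are assumptions made for illustration, not the parser's real code:

```python
ALLOWED_MODES = ["single", "page", "object", "markdown"]


def select_mode_buggy(mode: str) -> str:
    assert mode in ALLOWED_MODES
    if mode == "markdown":
        return "markdown"
    elif mode == "single":
        return "single"
    elif mode == ["page"]:  # a str never equals a list, so this never matches
        return "page"
    else:
        return "object"  # "page" silently falls through to here


def select_mode_fixed(mode: str) -> str:
    assert mode in ALLOWED_MODES
    if mode == "markdown":
        return "markdown"
    elif mode == "single":
        return "single"
    elif mode == "page":  # compare against the string itself
        return "page"
    else:
        return "object"
```

With these definitions, `select_mode_buggy("page")` returns `"object"`, while `select_mode_fixed("page")` returns `"page"`.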
Mahdi Setayesh	eb76f9c9fe	community: Fixing a performance issue with AzureSearch to perform batch embedding (#15594 ) - Description: The Azure Cognitive Search vector DB store embeds slowly because it does not use the batch embedding functionality. This PR provides a fix that improves the performance of the Azure Search class when adding documents to the vector search. - Issue: #11313	2024-01-12 10:58:55 -08:00
ChengZi	d5808f786c	community: Support milvus partition key. (#15740 ) - Description: Milvus's partition key is an important feature. It can support multi-tenancy. We hope to introduce this feature. https://milvus.io/docs/partition_key.md - Issue: No - Dependencies: No - Twitter handle: No --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-12 09:15:03 -08:00
ohbeep	9b3962fc25	community: Add support of "http" URI for Milvus (#12710 ) (#15683 ) - Description: Add support of HTTP URI for Milvus - Issue: #12710 - Dependencies: N/A,	2024-01-11 21:55:35 -08:00
Raunak	e26e1f8b37	community: Added functions to make async calls to HuggingFaceHub's embedding endpoint in HuggingFaceHubEmbeddings class (#15737 ) Description: Added the aembed_documents() and aembed_query() async functions to the HuggingFaceHubEmbeddings class in the langchain_community\embeddings\huggingface_hub.py file. They make async calls to HuggingFaceHub's embedding endpoint and generate embeddings asynchronously. Test Cases: Added the test_huggingfacehub_embedding_async_documents() and test_huggingfacehub_embedding_async_query() functions to the test_huggingface_hub.py file to test the two new async functions. Documentation: Updated huggingfacehub.ipynb with steps to install the huggingface_hub package and use HuggingFaceHubEmbeddings. Dependencies: None, Twitter handle: I do not have a Twitter account --------- Co-authored-by: H161961 <Raunak.Raunak@Honeywell.com>	2024-01-11 21:52:55 -08:00
Christophe Bornet	81d1ba05dc	Add a BaseStore backed by AstraDB (#15812 ) - Description: this change adds a `BaseStore` backed by AstraDB - Twitter handle: cbornet_	2024-01-11 21:41:24 -08:00
manishsahni2000	74d9fc2f9e	PR community:Removing knn beta content in mongodb atlas vectorstore (#15865 )	2024-01-11 21:40:54 -08:00
shahrin014	bdd90ae2ee	community: Ollama - Pass headers to post request (#15881 ) ## Feature - Set additional headers in constructor - Headers will be sent in post request This feature is useful if deploying Ollama on a cloud service such as hugging face, which requires authentication tokens to be passed in the request header. ## Tests - Test if header is passed - Test if header is not passed	2024-01-11 21:40:35 -08:00
Xin Liu	5efec068c9	feat: Implement `stream` interface (#15875 ) Major changes: - Rename `wasm_chat.py` to `llama_edge.py` - Rename the `WasmChatService` class to `ChatService` - Implement the `stream` interface for `ChatService` - Add `test_chat_wasm_service_streaming` in the integration test - Update `llama_edge.ipynb` --------- Signed-off-by: Xin Liu <sam@secondstate.io>	2024-01-11 21:32:48 -08:00
Massimiliano Pronesti	ec4dab0449	feat(community): make Amadeus toolkit LLM-agnostic (#15879 ) - Description: `AmadeusToolkit` and `AmadeusClosestAirport` contained a hardcoded call to `ChatOpenAI`. This PR makes it LLM-independent, while guaranteeing backward compatibility. - Issue: #15847 - Dependencies: None @baskaryan	2024-01-11 21:32:03 -08:00
Yacine	782dd44be9	<langchain_community.vectorstores>:<Fix pinecone.py __init__ docstring instruction> (#15922 ) - Description: The pinecone docstring instructs users to pass the embedding query text, which causes the warning below. The embeddings object should be passed instead. Warning message: UserWarning: Passing in `embedding` as a Callable is deprecated. Please pass in an Embeddings object instead. - Issue: NA - Dependencies: None @baskaryan	2024-01-11 21:26:33 -08:00
Erick Friis	623f87c888	community[patch]: pinecone bug (#15905 )	2024-01-11 11:44:07 -08:00
axiangcoding	d5aa277b94	community: add collection_properties parameter to Milvus (#15788 ) - Description: add collection_properties parameter to Milvus. See [pymilvus set_properties() description](https://milvus.io/api-reference/pymilvus/v2.3.x/Collection/set_properties().md) - Issue: None - Dependencies: None - Twitter handle: None	2024-01-10 20:29:01 -08:00
mogith-pn	9e1ed17bfb	Community : Modified doc strings and example notebook for Clarifai (#15816 ) Community : Modified doc strings and example notebook for Clarifai Description: 1. Modified doc strings inside clarifai vectorstore class and embeddings. 2. Modified notebook examples. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-01-10 19:33:10 -08:00
Erick Friis	38523d7c57	together[minor]: add llm (#15853 )	2024-01-10 17:55:34 -08:00
Erick Friis	ee708739c3	community[patch]: pinecone v3 support (#15849 ) Info in slack --------- Co-authored-by: Roie Schwaber-Cohen <roie.cohen@gmail.com>	2024-01-10 14:54:50 -08:00
Erick Friis	85a4594ed7	community[patch]: more deprecations (#15782 )	2024-01-09 20:36:16 -08:00
NuODaniel	70b6315b23	community[patch]: fix qianfan chat stream calling caused exception (#13800 ) - Description: `QianfanChatEndpoint` extends `BaseChatModel`, whose default stream implementation may concatenate MessageChunks with `__add__`. Calling stream() then raises a ValueError for a duplicated key. - Issues: * #13546 * #13548 * merge the two single test files related to qianfan. - Dependencies: no - Tag maintainer: --------- Co-authored-by: root <liujun45@baidu.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-09 15:29:25 -08:00
Bagatur	ee5bd986de	community[patch]: update oai deprecation message (#15681 ) addresses #15674	2024-01-09 14:36:58 -05:00
Ian	32ec56194b	community: fix myscale delete function bug (#15675 ) The SQL currently used to delete vector docs from myscale is as follows: `DELETE FROM collection WHERE id = '1' AND id = '2' AND id = '3'` But the expected statement is: `DELETE FROM collection WHERE id IN ('1', '2', '3')`	2024-01-08 12:26:29 -08:00
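The difference between the two statements above matters because `id = '1' AND id = '2'` can never be true for a single row, so nothing is deleted. A minimal sketch of building the `IN` form, assuming a hypothetical helper `build_delete_sql` (not the MyScale integration's real code; production code would normally prefer parameterized queries over string building):

```python
def build_delete_sql(table: str, ids: list[str]) -> str:
    """Build a DELETE statement matching any of the given ids."""
    if not ids:
        raise ValueError("at least one id is required")
    # Escape single quotes by doubling them before quoting each id.
    quoted = ", ".join("'" + i.replace("'", "''") + "'" for i in ids)
    return f"DELETE FROM {table} WHERE id IN ({quoted})"


def build_delete_sql_buggy(table: str, ids: list[str]) -> str:
    """The broken form: AND-ed equality checks that no row can satisfy."""
    conditions = " AND ".join(f"id = '{i}'" for i in ids)
    return f"DELETE FROM {table} WHERE {conditions}"
```

`build_delete_sql("collection", ["1", "2", "3"])` produces the expected statement from the commit message.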
Christophe Bornet	a466f79ac9	Fix AstraDB logical operator filtering (#15699 ) This change fixes the AstraDB logical operator filtering (`$and`, `$or`). The `metadata` prefix must not be added if the key is `$and` or `$or`.	2024-01-08 12:23:46 -08:00
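The rule stated above can be sketched as a small recursive function. This is an assumption-laden illustration: `prefix_metadata_filter` is a hypothetical name, and the `metadata.` dotted-key form is assumed from the commit message's mention of a `metadata` prefix.

```python
def prefix_metadata_filter(filter_dict):
    """Prefix plain keys with 'metadata.', leaving $and/$or operators intact."""
    if filter_dict is None:
        return {}
    result = {}
    for key, value in filter_dict.items():
        if key in ("$and", "$or"):
            # Logical operators keep their name; recurse into each sub-filter.
            result[key] = [prefix_metadata_filter(sub) for sub in value]
        else:
            result[f"metadata.{key}"] = value
    return result
```

For instance, `{"$or": [{"a": 1}, {"b": 2}]}` becomes `{"$or": [{"metadata.a": 1}, {"metadata.b": 2}]}` rather than gaining a `metadata.$or` key.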
Christophe Bornet	1f5f6381ec	Add doc for AstraDB document loader (#15703 ) See preview : https://langchain-git-fork-cbornet-astra-loader-doc-langchain.vercel.app/docs/integrations/document_loaders/astradb	2024-01-08 12:21:46 -08:00
Erick Friis	94911ae503	community[patch]: Support different Pinecone initializations depending on the version (#15717 ) Co-authored-by: DosticJelena <jelenadostic2@gmail.com>	2024-01-08 11:33:36 -08:00
Nuno Campos	7ce4cd0709	Do not issue beta or deprecation warnings on internal calls (#15641 )	2024-01-07 20:54:45 -08:00
Earlee	98c6c9603e	community: fix: should flush after inserting data on milvus (#15568 ) The inserted data cannot take effect immediately. We should flush after inserting data on milvus.	2024-01-07 09:33:47 -08:00
chyroc	a17a3638b5	Docs: fix excel document loader typo (#15470 )	2024-01-07 09:33:35 -08:00
chyroc	9ae901c5e6	Feat: add CHM file loader (#15519 ) fix https://github.com/langchain-ai/langchain/issues/15469	2024-01-07 09:28:52 -08:00
Nan LI	0b393315ce	community: Correct Input API Key Name in Notebook and Enhance Readability of Comments for ZhipuAI Chat Model (#15529 ) - Description: This update rectifies an error in the notebook by changing the input variable from `zhipu_api_key` to `api_key`. It also includes revisions to comments to improve program readability. - Issue: The input variable in the notebook example should be `api_key` instead of `zhipu_api_key`. - Dependencies: No additional dependencies are required for this change. To ensure quality and standards, we have performed extensive linting and testing. Commands such as make format, make lint, and make test have been run from the root of the modified package to ensure compliance with LangChain's coding standards.	2024-01-07 09:27:47 -08:00
kursathalat	9ea28ee464	fix: Fix DEFAULT_API_KEY for ArgillaCallbackHandler (#15534 ) - ArgillaCallbackHandler does not properly set the default values while initializing. This PR corrects the line. - Issue: #15531 - Dependencies: Argilla - Also corrected some dead links.	2024-01-07 09:26:51 -08:00
Chad Norvell	d1bfb70bc4	community: Allow deleting by ID and collection in `pgvector` (#15627 ) - Description: The `delete_collection` method deletes an entire collection regardless of custom ID. The `delete` method deletes everything with the provided custom IDs regardless of collection. It can be useful to restrict deletion to both the collection and a set of custom IDs. This change adds support for that by allowing you to optionally specify that `delete` should be restricted to the collection defined on the `PGVector` instance.	2024-01-07 08:33:21 -08:00
Chad Norvell	f6226d464e	community: Include PDF ID in MathPix metadata (#15629 ) - Description: Includes the PDF ID in the MathPix document metadata. This is useful in case you need to re-request a processed PDF from the MathPix API later.	2024-01-07 08:31:53 -08:00
Chad Norvell	d2a686b165	community: Provide more actionable errors in the MathPix PDF loader (#15630 ) - Description: The `error_info['id']` can be cross-referenced with the MathPix API documentation to get very specific information about why an error occurred.	2024-01-07 08:31:09 -08:00
Kai	5d05df4bce	community: Fixed bug of "system message check" in chat_models/tongyi. (#15631 ) - Description: This PR fixes a bug in the "system message check" in langchain_community/chat_models/tongyi.py - Issue: With the current logic, if there is no system message in the chat messages, the error "System message can only be the first message." is wrongly raised. - Dependencies: No. - Twitter handle: I don't have a Twitter account.	2024-01-07 08:30:18 -08:00
Raunak	64f5968a81	community: Replaced hardcoded "metadata" with FIELDS_METADATA variable in semantic_hybrid_search_with_score_and_rerank (#15642 ) - Description: This PR is to fix a bug in semantic_hybrid_search_with_score_and_rerank() function in langchain_community/vectorstores/azuresearch.py. The hardcoded "metadata" name is replaced with FIELDS_METADATA variable with an if block to check if the metadata column exists or not. - Issue: Fixed #15581 - Dependencies: No - Twitter handle: None Co-authored-by: H161961 <Raunak.Raunak@Honeywell.com>	2024-01-06 17:04:59 -08:00
Erick Friis	d136925c49	community[patch]: fix deprecation warnings on openai subclasses (#15621 )	2024-01-05 18:02:17 -08:00
Bagatur	c5226d7a18	docs: update cohere chat integration (#15562 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-05 16:33:29 -08:00
Erick Friis	ebc75c5ca7	openai[minor]: implement langchain-openai package (#15503 ) Todo - [x] copy over integration tests - [x] update docs with new instructions in #15513 - [x] add linear ticket to bump core -> community, community->langchain, and core->openai deps - [ ] (optional): add `pip install langchain-openai` command to each notebook using it - [x] Update docstrings to not need `openai` install - [x] Add serialization - [x] deprecate old models Contributor steps: - [x] Add secret names to manual integrations workflow in .github/workflows/_integration_test.yml - [x] Add secrets to release workflow (for pre-release testing) in .github/workflows/_release.yml Maintainer steps (Contributors should not do these): - [x] set up pypi and test pypi projects - [x] add credential secrets to Github Actions - [ ] add package to conda-forge Functional changes to existing classes: - now relies on openai client v1 (1.6.1) via concrete dep in langchain-openai package Codebase organization - some function calling stuff moved to `langchain_core.utils.function_calling` in order to be used in both community and langchain-openai	2024-01-05 15:03:28 -08:00
Bagatur	a7d023aaf0	core[patch], community[patch]: mark runnable context, lc load as beta (#15603 )	2024-01-05 17:54:26 -05:00
chyroc	f12b5c1222	Feat: support Milvus more params (#15447 ) fix https://github.com/langchain-ai/langchain/issues/15442	2024-01-04 20:07:23 -08:00
Bagatur	b2f15738dd	core[patch], langchain[patch], community[patch]: Revert #15326 (#15546 )	2024-01-04 10:39:37 -05:00
Harutaka Kawamura	73da8f863c	Remove unused `Params` (#14385 ) Removes unused `Params` in `libs/langchain/langchain/llms/mlflow.py`. Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-02 22:45:18 -08:00
chyroc	b65e57971e	Patch: improve type hint (#15451 )	2024-01-02 22:39:27 -08:00
Harutaka Kawamura	8ebf55ebbf	Fix `llms.Mlflow` example (#14386 ) The example code for `llms.Mlflow` is outdated. Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-02 22:35:13 -08:00
Xin Liu	0a7d360ba4	feat: new integration `wasm_chat` (#14787 ) Adds `WasmChat` integration. `WasmChat` runs GGUF models locally or via chat service in lightweight and secure WebAssembly containers. In this PR, `WasmChatService` is introduced as the first step of the integration. `WasmChatService` is driven by [llama-api-server](https://github.com/second-state/llama-utils) and [WasmEdge Runtime](https://wasmedge.org/). --------- Signed-off-by: Xin Liu <sam@secondstate.io>	2024-01-02 22:33:14 -08:00
Anush	58cc7878e9	refactor: Qdrant async improvements (#14492 ) Follow up on https://github.com/langchain-ai/langchain/pull/13048. This PR intends to simplify the Qdrant async implementation by replacing the internal GRPC methods with the `QdrantAsyncClient` methods. This is a backward compatible change with no additional steps required after merge. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-02 20:07:48 -08:00
JuR-0	4dab37741a	Fix Bedrock broad error catching (#14398 ) Fixes #14347 - Description: Added the traceback of the previous error to keep the initial error type, - Issue: #14347 , - Dependencies: None --------- Co-authored-by: Julien Raffy <julien.raffy@emeria.eu> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-02 17:25:48 -08:00
Bob Lin	e93be14c11	Improvement: Allow passing parameters to the underlying es_client. Closes: #14403 (#14435 ) ### Description In https://github.com/langchain-ai/langchain/issues/14403, the user mentioned that they do not want to verify SSL and need to pass more parameters. I found that the `Elasticsearch` class [has very many parameters](`98f2af2134/elasticsearch/_sync/client/__init__.py (L131-L191)` ). To cover more situations, I want to add a kwargs parameter so that users can pass more `Elasticsearch` parameters, as is already possible for [redis](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/vectorstores/redis/base.py#L253), [tair](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/vectorstores/tair.py#L32), [myscale](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/vectorstores/myscale.py#L112) and so on.	2024-01-02 16:48:17 -08:00
codehound42	8aa921d3a4	Support `score_threshold` in SupabaseVectorStore similarity search (#14439 ) Description: Add support for setting the `score_threshold` for similarity search in SupabaseVectorStore. This pull request addresses issue #14438 Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-02 16:47:05 -08:00
YISH	eecfa81918	Add the collection_description parameter to Milvus (#14524 ) Because Milvus' collection_name doesn't support UTF-8 characters in other languages, I want the `collection_description` parameter.	2024-01-02 16:28:01 -08:00
Evgenii Molov	b4ec340fb3	Fix failing serpapi response processing for Google Maps API (#14817 ) Description: Fix for processing for serpapi response for Google Maps API Issue: Due to the fact corresponding [api](https://serpapi.com/google-maps-api) returns 'local_results' as list, and old version requested `res["local_results"].keys()` of the list. As the result we got exception: ```AttributeError: 'list' object has no attribute 'keys'```. Way to reproduce wrong behaviour: ``` params = { "engine": "google_maps", "type": "search", "google_domain": "google.de", "ll": "@51.1917,10.525,14z", "hl": "de", "gl": "de", } search = SerpAPIWrapper(params=params) results = search.run("cafe") ``` --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Ran <rccalman@gmail.com>	2024-01-02 16:17:21 -08:00
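The failure described in the commit above comes from calling `.keys()` on a value that is a list. A minimal sketch of a defensive check follows; `extract_local_results` is a hypothetical helper written for illustration, not the actual patch, and the `"places"` key used for dict-shaped responses is an assumption:

```python
from typing import Any, Dict, List


def extract_local_results(res: Dict[str, Any]) -> List[Any]:
    """Return local results as a list, whatever shape the response uses.

    Hypothetical helper for illustration; not taken from the LangChain patch.
    """
    local = res.get("local_results")
    if local is None:
        return []
    if isinstance(local, list):
        # Google Maps responses: the value is already a list, so calling
        # `.keys()` on it would raise AttributeError.
        return local
    if isinstance(local, dict):
        # Assumption: dict-shaped responses nest the entries under "places".
        return list(local.get("places", []))
    raise TypeError(f"unexpected type for local_results: {type(local).__name__}")
```

Checking the type before touching dict-only methods keeps both response shapes working without a try/except around the lookup.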
YISH	da0f750a0b	Milvus allows to store metadata as json field (#14636 ) Milvus doesn't support nullable fields, but document metadata is very rich, so it makes more sense to store it as JSON. https://github.com/milvus-io/pymilvus/issues/1705#issuecomment-1731112372 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-02 16:12:00 -08:00
Ashley Xu	0ce7858529	feat: add Google BigQueryVectorSearch in vectorstore (#14829 ) BigQuery vector search lets you use GoogleSQL to do semantic search, using vector indexes for fast but approximate results, or using brute force for exact results. This PR integrates LangChain vectorstore with BigQuery Vector Search. --------- Co-authored-by: Vlad Kolesnikov <vladkol@google.com>	2024-01-02 15:57:14 -08:00
JaguarDB	02f59c2035	Use args option in jaguar so it takes more options in similarity search (#15080 ) - Description: replace score_threshold with args - Issue: needs a way to pass more options to similarity search - Dependencies: None - Twitter handle: @workbot --------- Co-authored-by: JY <jyjy@jaguardb>	2024-01-02 15:53:06 -08:00
chyroc	37ad6ec248	Refactor: use SecretStr for tongyi chat-model (#15102 )	2024-01-02 15:45:23 -08:00
Shaurya Rohatgi	e1c2cd7a28	community: Semanticscholar tool to search 200M+ scientific articles (#15151 ) - Description: Tool now supports querying over 200 million scientific articles, vastly expanding its reach beyond the 2 million articles accessible through Arxiv. This update significantly broadens access to the entire scope of scientific literature. - Dependencies: semanticscholar https://github.com/danielnsilva/semanticscholar - Twitter handle: @shauryr --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-02 15:36:03 -08:00
dudub12	7e6b0056b8	SQLDatabase drop the column names in the result. (#15361 ) Fix for the following bug: https://github.com/langchain-ai/langchain/issues/15360 --------- Co-authored-by: dudu butbul <100126964+dudu-upstream@users.noreply.github.com>	2024-01-02 15:29:25 -08:00
chyroc	07d294b5ec	Fix: fix Bing Search empty result exception, fix #15384 (#15387 ) fix https://github.com/langchain-ai/langchain/issues/15384	2024-01-02 15:25:00 -08:00
Bagatur	1678d6ca17	langchain[patch], experimental[patch], docs: update tools imports (#15433 )	2024-01-02 18:23:34 -05:00
Leonid Ganeline	b8c6ebf647	refactor `utils` (#15432 ) The `langchain` [still holds several artifacts](https://api.python.langchain.com/en/latest/langchain_api_reference.html#module-langchain.utils) that belongs to `community`. If they moved then `langchain.utils` namespace would be removed completely. - moved `ernie_functions` artifacts to `community`	2024-01-02 14:56:38 -08:00
Bagatur	fa5d49f2c1	docs, experimental[patch], langchain[patch], community[patch]: update storage imports (#15429 ) ran ```bash g grep -l "langchain.vectorstores" | xargs -L 1 sed -i '' "s/langchain\.vectorstores/langchain_community.vectorstores/g" g grep -l "langchain.document_loaders" | xargs -L 1 sed -i '' "s/langchain\.document_loaders/langchain_community.document_loaders/g" g grep -l "langchain.chat_loaders" | xargs -L 1 sed -i '' "s/langchain\.chat_loaders/langchain_community.chat_loaders/g" g grep -l "langchain.document_transformers" | xargs -L 1 sed -i '' "s/langchain\.document_transformers/langchain_community.document_transformers/g" g grep -l "langchain\.graphs" | xargs -L 1 sed -i '' "s/langchain\.graphs/langchain_community.graphs/g" g grep -l "langchain\.memory\.chat_message_histories" | xargs -L 1 sed -i '' "s/langchain\.memory\.chat_message_histories/langchain_community.chat_message_histories/g" gco master libs/langchain/tests/unit_tests//test_imports.py gco master libs/langchain/tests/unit_tests/*/test_public_api.py ```	2024-01-02 16:47:11 -05:00
Bagatur	480626dc99	docs, community[patch], experimental[patch], langchain[patch], cli[pa… (#15412 ) …tch]: import models from community ran ```bash git grep -l 'from langchain\.chat_models' | xargs -L 1 sed -i '' "s/from\ langchain\.chat_models/from\ langchain_community.chat_models/g" git grep -l 'from langchain\.llms' | xargs -L 1 sed -i '' "s/from\ langchain\.llms/from\ langchain_community.llms/g" git grep -l 'from langchain\.embeddings' | xargs -L 1 sed -i '' "s/from\ langchain\.embeddings/from\ langchain_community.embeddings/g" git checkout master libs/langchain/tests/unit_tests/llms git checkout master libs/langchain/tests/unit_tests/chat_models git checkout master libs/langchain/tests/unit_tests/embeddings/test_imports.py make format cd libs/langchain; make format cd ../experimental; make format cd ../core; make format ```	2024-01-02 15:32:16 -05:00
Mohammad Mohtashim	b6c57d38fa	Langchain_community: Small Fix when loading facebook messages (#15358 ) - Description: SingleFileFacebookMessengerChatLoader did not handle the case for when messages had stickers and/or photos so fixed that. - Issue: #15356 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-01 18:52:23 -08:00
Mateusz Szewczyk	cbfaccc424	WatsonxLLM updates/enhancements (#14598 ) - Description: updates/enhancements to IBM [watsonx.ai](https://www.ibm.com/products/watsonx-ai) LLM provider (prompt tuned models and prompt templates deployments support) - Dependencies: [ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/), - Tag maintainer: : @hwchase17 , @eyurtsev , @baskaryan - Twitter handle: details in comment below. Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. ✅ --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-01 18:50:05 -08:00
Manjunath Janardhan	7a0feba9f7	GITLAB_URL should take default https://gitlab.com instead of error (#14638 ) The fix #14221 broke the default GitLab URL, forcing users to specify GITLAB_URL even for the default one. With this fix, if GITLAB_URL is not set, the default GitLab URL is used. - Description: Add the GitLab URL instead of None - Issue: the issue #14221 has broken the default GitLab URL - Dependencies: None - Tag maintainer: @hwchase17 - Twitter handle: manjunath_shiva	2024-01-01 16:55:52 -08:00
David	dcf047c48f	add api_base to _client_params (community version of #14393 ) (#14644 ) - Description: This PR adds `api_base` to `_client_params` in the `chat_model` of LiteLLM to ensure it's included in API calls. Previously, `api_base` was set on the client but was not included in the parameters passed to the completion function. This change ensures that `api_base` is correctly passed to all API calls. - Issue: #14338 - Tag maintainer: @hwchase17 @agola11 - Twitter handle: @LMS_David_RS	2024-01-01 16:53:16 -08:00
xuxiang	dd1d818a82	Fixing the Issue with DashScopeEmbeddings Handling More than 25 Rows of Data (#14662 ) This change addresses the issue where DashScopeEmbeddingAPI limits requests to 25 lines of data, and DashScopeEmbeddings did not handle cases with more than 25 lines, leading to errors. I have implemented a fix to manage data exceeding this limit efficiently. --------- Co-authored-by: xuxiang <xuxiang@aliyun.com>	2024-01-01 16:50:13 -08:00
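One common way to work within a per-request limit like the 25-row cap described above is to split the input into fixed-size batches. The sketch below illustrates that idea only; `embed_in_batches` and the `embed_batch` callable are hypothetical names, and this is not the code from the commit:

```python
from typing import Callable, List, Sequence

BATCH_SIZE = 25  # per-request limit stated in the commit message


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = BATCH_SIZE,
) -> List[List[float]]:
    """Call `embed_batch` on slices of at most `batch_size` texts.

    `embed_batch` is a hypothetical stand-in for a single API request.
    Results are concatenated in input order.
    """
    embeddings: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        embeddings.extend(embed_batch(texts[start:start + batch_size]))
    return embeddings
```

Because slices are processed in order, the returned list lines up index by index with the input texts.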
Christophe Bornet	e2a8962ba6	Add AstraDB document loader (#14747 ) - Description: this adds the AstraDB document loader and an integration test - Twitter handle: cbornet_	2024-01-01 16:13:28 -08:00
Igor Dvorkin	76923e5743	Restore self message sent before OSX 12 Monterey (#14818 )	2024-01-01 16:04:14 -08:00
savoiepe	d006be60ec	Added more filtering options to pgvector vectorstore (#14852 ) - Description: Using PGVector vector store, it was only possible to filter for values equals, in or not in metadata. Extended this feature to work with the following keywords : IN, NIN, BETWEEN, GT, LT, NE, EQ, LIKE, CONTAINS, OR, AND --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-01-01 16:01:22 -08:00
chyroc	32e96a471c	Refactor: use SecretStr for llm_rails embeddings (#15090 )	2024-01-01 15:24:50 -08:00
chyroc	b440f92d81	Refactor: use SecretStr for embaas embeddings (#15091 )	2024-01-01 15:24:00 -08:00
chyroc	ea6cf0f1b1	Refactor: use SecretStr for edenai embeddings (#15092 )	2024-01-01 15:22:51 -08:00
chyroc	32e6e9de13	Refactor: use SecretStr for palm chat-model (#15100 )	2024-01-01 15:21:41 -08:00
chyroc	b6952d41e5	Refactor: use SecretStr for GPTRouter chat-model (#15101 )	2024-01-01 15:20:26 -08:00
Nan LI	f506b4cfd2	community: Integration of New Chat Model Based on ChatGLM3 via ZhipuAI API (#15105 ) - Description: - This PR introduces a significant enhancement to the LangChain project by integrating a new chat model powered by the third-generation base large model, ChatGLM3, via the zhipuai API. - This advanced model supports functionalities like function calls, code interpretation, and intelligent Agent capabilities. - The additions include the chat model itself, comprehensive documentation in the form of Python notebook docs, and thorough testing with both unit and integrated tests. - Dependencies: This update relies on the ZhipuAI package as a key dependency. - Twitter handle: If this PR receives spotlight attention, we would be honored to receive a mention for our integration of the advanced ChatGLM3 model via the ZhipuAI API. Kindly tag us at @kaiwu. To ensure quality and standards, we have performed extensive linting and testing. Commands such as make format, make lint, and make test have been run from the root of the modified package to ensure compliance with LangChain's coding standards. TO DO: Continue refining and enhancing both the unit tests and integrated tests. --------- Co-authored-by: jing <jingguo92@gmail.com> Co-authored-by: hyy1987 <779003812@qq.com> Co-authored-by: jianchuanqi <qijianchuan@hotmail.com> Co-authored-by: lirq <whuclarence@gmail.com> Co-authored-by: whucalrence <81530213+whucalrence@users.noreply.github.com> Co-authored-by: Jing Guo <48378126+JaneCrystall@users.noreply.github.com>	2024-01-01 15:17:03 -08:00
Hin	2cf1e73d12	Feat add volcano embedding (#14693 ) Description: Volcano Ark is an enterprise-grade large-model service platform for developers, providing a full range of functions and services such as model training, inference, evaluation, fine-tuning. You can visit its homepage at https://www.volcengine.com/docs/82379/1099455 for details. This change could help developers use the platform for embedding. Issue: None Dependencies: volcengine Tag maintainer: @baskaryan Twitter handle: @hinnnnnnnnnnnns --------- Co-authored-by: lujingxuansc <lujingxuansc@bytedance.com>	2024-01-01 14:37:35 -08:00
David Křístek	a010f29013	fix: call correct stream method in ollama (#15104 ) Co-authored-by: David Kristek <david@David--MacBook-Pro.local>	2024-01-01 14:03:53 -08:00
Christian Janiake	be578f32be	community:Lazy load wikipedia dump file (#15111 ) Description: the MWDumpLoader implementation currently does not support the lazy_load method, and the files are usually very large. We are proposing refactoring the load function, extracting two private functions with the functionality of loading the dump file and parsing a single page, to reuse the code in the lazy_load implementation.	2024-01-01 14:02:56 -08:00
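The refactor described above (an eager `load` rebuilt on top of a generator-based `lazy_load` that reuses a per-page parser) follows a general pattern that can be sketched without the loader itself. All names below are hypothetical and the "parsing" is a placeholder; this is not the `MWDumpLoader` code:

```python
from typing import Dict, Iterable, Iterator, List


def parse_page(raw_page: str) -> Dict[str, str]:
    """Turn one raw page into a document-like dict (placeholder parsing)."""
    return {"page_content": raw_page.strip()}


def lazy_load(pages: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield parsed pages one at a time so a large dump never sits in memory."""
    for raw_page in pages:
        yield parse_page(raw_page)


def load(pages: Iterable[str]) -> List[Dict[str, str]]:
    """Eager variant that reuses the lazy implementation."""
    return list(lazy_load(pages))
```

Keeping the parsing in one function means the eager and lazy paths cannot drift apart.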
chyroc	a4ae4bc361	feat: mask api_key for konko (#14010 ) for https://github.com/langchain-ai/langchain/issues/12165	2024-01-01 13:42:49 -08:00
joel-teratis	62d32bd214	fix(minor): added missing kwargs parameter to chroma query function (#14919 ) Description: This PR adds the `kwargs` parameter to six calls in the `chroma.py` package. All functions already were able to receive `kwargs` but they were discarded before. Issue: When passing `kwargs` to functions in the `chroma.py` package they are being ignored. For example: ``` chroma_instance.similarity_search_with_score( query, k=100, include=["metadatas", "documents", "distances", "embeddings"], # this parameter gets ignored ) ``` The `include` parameter does not get passed on to the next function and does not have any effect. Dependencies: None	2024-01-01 13:40:29 -08:00
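The bug pattern described above, where a wrapper accepts `**kwargs` but does not forward them, can be reproduced in a few lines. The functions below are hypothetical stand-ins, not the Chroma integration's actual methods:

```python
from typing import Any, Dict


def _query(query: str, k: int = 4, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical low-level call; echoes the options it received."""
    return {"query": query, "k": k, **kwargs}


def search_dropping_kwargs(query: str, k: int = 4, **kwargs: Any) -> Dict[str, Any]:
    # Accepts extra options but never passes them on, so they have no effect.
    return _query(query, k=k)


def search_forwarding_kwargs(query: str, k: int = 4, **kwargs: Any) -> Dict[str, Any]:
    # Forwards extra options to the underlying call.
    return _query(query, k=k, **kwargs)
```

With `include=["distances"]` passed to both wrappers, only the forwarding variant delivers the option to `_query`.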
NuODaniel	7773943a51	community:qianfan endpoint support init params & remove useless params definition (#15381 ) - Description: - support custom kwargs in object initialization. For instance, QPS differs across objects (chat/completion/embedding with diverse models), so a global env variable is not a good choice for configuration. - Issue: no - Dependencies: no - Twitter handle: no @baskaryan PTAL	2024-01-01 13:12:31 -08:00
Nuno Campos	99000c612e	Propagate context vars in all classes/methods (#15329 ) - Any direct usage of ThreadPoolExecutor or asyncio.run_in_executor needs manual handling of context vars	2023-12-29 15:59:00 -08:00
Ankush Gola	7eec8f2487	Delete V1 tracer and refactor tracer tests to core (#15326 )	2023-12-29 15:55:56 -08:00
chyroc	7ce338201c	Patch: improve check openai version (#15301 )	2023-12-29 13:44:19 -08:00
Nuno Campos	eb5e250188	Propagate context vars in all classes/methods - Any direct usage of ThreadPoolExecutor or asyncio.run_in_executor needs manual handling of context vars	2023-12-29 12:34:03 -08:00
Shuai Liu	4b53440e70	Upgrades the Tongyi LLM and ChatTongyi Model (#14793 ) - Description: fixes and upgrades for the Tongyi LLM and ChatTongyi Model - Fixed typos; it should be `Tongyi`, not `OpenAI`. - Fixed a bug in `stream_generate_with_retry`; it's a real stream generator now. - Fixed a bug in `validate_environment`; the `dashscope_api_key` should be properly handled when set by environment variables or initialization parameters. - Changed the `dashscope` response to incremental output by setting the parameter `incremental_output`, which eliminates the need for the prefix-removal trick. - Removed some unused parameters, like `n`, `prefix_messages`. - Added `_stream` method. - Added async methods support, such as `_astream`, `_agenerate`, `_abatch`. - Dependencies: No new dependencies. - Tag maintainer: @hwchase17 > PS: Some may be confused about the terms `dashscope`, `tongyi`, and `Qwen`: > - `dashscope`: A platform to deploy LLMs and provide APIs to invoke the LLM. > - `tongyi`: A brand name or overall term about Alibaba Cloud's LLM/AI. > - `Qwen`: An LLM that is open-sourced and deployed in `dashscope`. > > We use the `dashscope` SDK to interact with the `tongyi`-`Qwen` LLM. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-12-29 12:06:12 -08:00
Diego Rani Mazine	ec72225265	refactor: enable connection pool usage in PGVector (#11514 ) - Description: `PGVector` refactored to use connection pool. - Issue: #11433, - Tag maintainer: @hwchase17 @eyurtsev, --------- Co-authored-by: Diego Rani Mazine <diego.mazine@mercadolivre.com> Co-authored-by: Nuno Campos <nuno@langchain.dev>	2023-12-28 15:07:16 -08:00
joshy-deshaw	bf5385592e	core, community: propagate context between threads (#15171 ) While using `chain.batch`, the default implementation uses a `ThreadPoolExecutor` and runs the chains in separate threads. An issue with this approach is that [the token counting callback](https://python.langchain.com/docs/modules/callbacks/token_counting) fails to work as a consequence of the context not being propagated between threads. This PR adds context propagation to the new threads and adds some thread synchronization in the OpenAI callback. With this change, the token counting callback works as intended. Having the context propagation change would be highly beneficial for those implementing custom callbacks for similar functionalities as well. --------- Co-authored-by: Nuno Campos <nuno@langchain.dev>	2023-12-28 14:51:22 -08:00
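The standard-library mechanism for carrying context variables into a worker thread is `contextvars.copy_context()` combined with `Context.run`. The sketch below shows only that mechanism, under the assumption that a context variable stands in for the callback state; `request_id` and `run_in_worker` are hypothetical names, and this is not the LangChain implementation:

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical context variable standing in for per-call callback state.
request_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="unset"
)


def read_request_id() -> str:
    """Return the value visible in whichever context the caller runs in."""
    return request_id.get()


def run_in_worker(propagate: bool) -> str:
    """Set the variable in the calling thread, then read it from a worker."""
    request_id.set("abc")
    with ThreadPoolExecutor(max_workers=1) as pool:
        if propagate:
            # Snapshot the caller's context and run the task inside it.
            ctx = contextvars.copy_context()
            return pool.submit(ctx.run, read_request_id).result()
        # Without the snapshot, the worker is not guaranteed to see the
        # caller's value and typically reads the default instead.
        return pool.submit(read_request_id).result()
```

Calling `run_in_worker(True)` returns the value set by the caller, which is the behaviour a context-based callback needs when work is fanned out to threads.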
shroominic	694bbb14cd	community: fix typo in async ollama chat (#15276 ) Made a stupid typo in the last PR which got already merged😅	2023-12-28 09:56:55 -08:00
triThirty	fea4888e72	community: Enhance Github error prompt (#15248 ) - Description: The GitHub error prompt is confusing to somebody not familiar with the GitHub connection method, because it is about JWT encryption. This PR adds a more useful error prompt to help users troubleshoot. - Issue: https://github.com/langchain-ai/langchain/issues/14550#issuecomment-1867445049 - Dependencies: None, - Twitter handle: None	2023-12-28 08:25:19 -08:00
Bob Lin	a464eb4394	community: Make doctran synchronous (#15264 ) ### Description I found that the methods in [the doctran library](https://github.com/psychic-api/doctran) have been restructured into [synchronized versions](`14944a59f7`), And [the example ipynb](https://github.com/psychic-api/doctran/blob/main/examples.ipynb) also shows that the code is synchronized, but the README has not been updated yet. so we need to modify the code and update the documentation. ### Issue https://github.com/langchain-ai/langchain/issues/14645	2023-12-28 08:05:24 -08:00
chyroc	6fb3cc6f27	Fix: Use `Union` instead of `|` to improve compatibility, fix #15244 (#15245 )	2023-12-27 22:06:42 -08:00
chyroc	1abcf441ae	Refactor: use SecretStr for Predibase llms (#15119 )	2023-12-26 13:01:42 -08:00
chyroc	0a9a73a9c9	Refactor: use SecretStr for PipelineAI llms (#15120 )	2023-12-26 13:00:58 -08:00
chyroc	d63ceb65b3	Refactor: use SecretStr for StochasticAI llms (#15118 )	2023-12-26 12:59:51 -08:00
chyroc	674fde87d2	Refactor: use SecretStr for VolcEngineMaas llms (#15117 )	2023-12-26 12:59:08 -08:00
chyroc	3cc1da2b38	Refactor: use SecretStr for Petals llms (#15121 )	2023-12-26 12:57:37 -08:00
shroominic	e6f0cee896	community: Async Ollama + ChatOllama (#15169 ) Description: Adding async methods to both OllamaLLM and ChatOllama to enable async streaming and async .on_llm_new_token callbacks. Issue: ChatOllama is not working in combination with an AsyncCallbackManager because the .on_llm_new_token method is not awaited.	2023-12-26 12:08:04 -08:00
Phill Zarfos	35896faab7	community: correct spelling mistakes of "Suffle" and "reporoducibility" (#15172 ) - Description: Correct spelling mistakes of "Suffle" and "reporoducibility" in `DirectoryLoader` class - Issue: N/A - Dependencies: N/A - Twitter handle: N/A	2023-12-26 11:22:59 -08:00
chyroc	3a3f880e5a	Patch: improve ollama 404 api error message, fix #15147 (#15156 ) Make this issue more clearly exposed to developers	2023-12-26 11:07:39 -08:00
Ivan	59d4b80a92	[community]: Elasticsearch chat history encoding (#15055 ) - Added ensure_ascii property to ElasticsearchChatMessageHistory --------- Co-authored-by: Ivan Chetverikov <ivan.chetverikov@raftds.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-12-22 13:21:34 -08:00
Corey Brown	9e492620d4	Don't reassign chunk_type (#14923 ) Description: The parameter chunk_type was being hard coded to "extractive_answers", so that when "snippet" was being passed, it was being ignored. This change simply doesn't do that.	2023-12-22 13:20:53 -08:00
Takuya Igei	6da2246215	Add support Vertex AI Gemini uses a public image URL (#14949 ) ## What Since `langchain_google_genai.ChatGoogleGenerativeAI` supports a public image URL, this adds support for it in `langchain.chat_models.ChatVertexAI` as well. ### Example ```py from langchain.chat_models.vertexai import ChatVertexAI from langchain_core.messages import HumanMessage llm = ChatVertexAI(model_name="gemini-pro-vision") image_message = { "type": "image_url", "image_url": { "url": "https://python.langchain.com/assets/images/cell-18-output-1-0c7fb8b94ff032d51bfe1880d8370104.png", }, } text_message = { "type": "text", "text": "What is shown in this image?", } message = HumanMessage(content=[text_message, image_message]) output = llm([message]) print(output.content) ``` ## Refs - https://python.langchain.com/docs/integrations/llms/google_vertex_ai_palm - https://python.langchain.com/docs/integrations/chat/google_generative_ai	2023-12-22 13:19:09 -08:00
Archan Ghosh	affa3e755a	Update arxiv.py with get_summaries_as_docs inside of Arxivloader (#14953 ) Added the call function get_summaries_as_docs inside of Arxivloader - Description: Added a function that returns the documents from get_summaries_as_docs, as the call signature is present in the parent file but never used from Arxivloader, this can be used from Arxivloader itself just like .load() as both the signatures are same. - Issue: Reduces time to load papers as no pdf is processed only metadata is pulled from Arxiv allowing users for faster load times on bulk loads. Users can then choose one or more paper and use ID directly with .load() to load pdf thereby loading all the contents of the paper.	2023-12-22 13:14:22 -08:00
ccurme	f2782f4c86	community: add args_schema to GmailSendMessage (#14973 ) - Description: `tools.gmail.send_message` implements a `SendMessageSchema` that is not used anywhere. `GmailSendMessage` also does not have an `args_schema` attribute (this led to issues when invoking the tool with an OpenAI functions agent, at least for me). Here we add the missing attribute and a minimal test for the tool. - Issue: N/A - Dependencies: N/A - Twitter handle: N/A --------- Co-authored-by: Chester Curme <chestercurme@microsoft.com>	2023-12-22 13:07:44 -08:00
Philip Kiely - Baseten	6342da333a	community: refactor Baseten integration with new API endpoints & docs (#15017 ) - Description: In response to user feedback, this PR refactors the Baseten integration with updated model endpoints, as well as updates relevant documentation. This PR has been tested by end users in production and works as expected. - Issue: N/A - Dependencies: This PR actually removes the dependency on the `baseten` package! - Twitter handle: https://twitter.com/basetenco	2023-12-22 12:46:24 -08:00
Blane Honeycutt	3fc1b3553b	Community: Adds ability to pass a Config to the boto3 client used by Bedrock (#15029 ) # Description This PR adds the ability to pass a `botocore.config.Config` instance to the boto3 client instantiated by the Bedrock LLM. Currently, the Bedrock LLM doesn't support a way to pass a Config, which means that some settings (e.g., timeouts and retry configuration) require instantiating a new boto3 client with a Config and then replacing the LLM's client: ```python llm = Bedrock( region_name='us-west-2', model_id="anthropic.claude-v2", model_kwargs={'max_tokens_to_sample': 4096, 'temperature': 0}, ) llm.client = boto_client('bedrock-runtime', region_name='us-west-2', config=Config({'read_timeout': 300})) ``` # Issue N/A # Dependencies N/A	2023-12-22 12:42:56 -08:00
Grzegorz Sajko	dc71fcfabf	corrected outdated link (#15053 )	2023-12-22 12:39:38 -08:00
Ran	c3f8733aef	fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647 ) fix spellings seperate -> separate: found more occurrences, see https://github.com/langchain-ai/langchain/pull/14602 initialise -> initialize: the latter is more common in the repo pre-defined > predefined: adding a hyphen after a prefix is a delicate matter, but this is a generally accepted word also, another word that appears in the repo is "fs" (stands for filesystem), e.g., in `libs/core/langchain_core/prompts/loading.py` ` """Unified method for loading a prompt from LangChainHub or local fs."""` Isn't "filesystem" better?	2023-12-22 11:49:35 -08:00
Harrison Chase	2e159931ac	add defaults for tavily (#15075 )	2023-12-22 11:48:26 -08:00
chyroc	4440ec5ab3	Refactor: use SecretStr for minimax embeddings (#15067 )	2023-12-22 11:43:23 -08:00
chyroc	aa19ca9723	Refactor: use SecretStr for jina embeddings (#15068 )	2023-12-22 11:42:29 -08:00
Michael Goin	501cc8311d	community[patch]: Fix generation_config not setting properly for DeepSparse (#15036 ) - Description: Tiny but important bugfix to use a more stable interface for specifying generation_config parameters for DeepSparse LLM	2023-12-22 01:39:22 -05:00
QIAN Zifei	2460f977c5	community[minor]: Azure DocumentIntelligenceLoader/Parser support update with latest SDK (#14389 ) - Description: Add DocumentIntelligenceLoader & DocumentIntelligenceParser implementation using the latest Azure Document Intelligence SDK with markdown support. The core logic resides in DocumentIntelligenceParser; DocumentIntelligenceLoader is a thin wrapper around the parser. The parser takes api_endpoint and api_key and creates a DocumentIntelligenceClient for the user. Four parsing modes are supported: 1. Markdown (default) 2. Single 3. Page 4. Object. Unit tests and the notebook are also updated accordingly. - Dependencies: Azure Document Intelligence SDK: [azure-ai-documentintelligence](https://github.com/Azure/azure-sdk-for-python/tree/7c42462ac662522a6fd21b17d2a20f4cd40d0356/sdk/documentintelligence/azure-ai-documentintelligence). --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-12-21 16:40:27 -08:00
Jacob Lee	1b01ee0e3c	community[minor]: add hf chat wrapper (#14736 ) Builds on #14040 with community refactor merged and notebook updated. Note that with this refactor, models will be imported from `langchain_community.chat_models.huggingface` rather than the main `langchain` repo. --------- Signed-off-by: harupy <17039389+harupy@users.noreply.github.com> Signed-off-by: ugm2 <unaigaraymaestre@gmail.com> Signed-off-by: Yuchen Liang <yuchenl3@andrew.cmu.edu> Co-authored-by: Andrew Reed <andrew.reed.r@gmail.com> Co-authored-by: Andrew Reed <areed1242@gmail.com> Co-authored-by: A-Roucher <aymeric.roucher@gmail.com> Co-authored-by: Aymeric Roucher <69208727+A-Roucher@users.noreply.github.com>	2023-12-21 12:28:30 -05:00
Leonid Kuligin	b99274c9d8	community[patch]: changed default for VertexAIEmbeddings (#14614 ) - Description: @kurtisvg has raised a point that it's a good idea to have a fixed version for embeddings (since otherwise a user might run a query with one version against a vectorstore where another version was used). In order to avoid breaking changes, I'd suggest giving users a warning and making `model_name` a required argument in 1.5 months.	2023-12-21 12:15:19 -05:00
Karim Lalani	228ddabc3b	community: fix for surrealdb client 0.3.2 update + store and retrieve metadata (#14997 ) Surrealdb client changes from 0.3.1 to 0.3.2 broke the surrealdb vector store integration. This PR updates the code to work with the updated client. The change is backwards compatible with previous versions of surrealdb client. Also expanded the vector store implementation to store and retrieve metadata that's included with the document object.	2023-12-21 12:04:57 -05:00
JaguarDB	ca0a75e1fc	community[patch]: JaguarHttpClient conditional import (#14985 ) - Description: Fixed jaguar.py to import JaguarHttpClient inside a try/except block - Issue: Unable to use the JaguarHttpClient at run time - Dependencies: It requires "pip install -U jaguardb-http-client" - Twitter handle: workbot --------- Co-authored-by: JY <jyjy@jaguardb> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-20 19:11:57 -08:00
Michael Landis	1c934fff0e	community[patch]: support momento vector index filter expressions (#14978 ) Description For the Momento Vector Index (MVI) vector store implementation, pass through `filter_expression` kwarg to the MVI client, if specified. This change will enable the MVI self query implementation in a future PR. Also fixes some integration tests.	2023-12-20 19:11:43 -08:00
Yacine	300c1cbf92	community[patch]: Fix typo in class Docstring (#14982 ) - Description: Fix typo in class Docstring to replace AZURE_OPENAI_API_ENDPOINT by AZURE_OPENAI_ENDPOINT - Issue: #14901 - Dependencies: NA - Twitter handle: Co-authored-by: Yacine Bouakkaz <Yacine.Bouakkaz@evokegroup.com>	2023-12-20 19:03:45 -08:00
MING KANG	ed5e0cfe57	community: add OCI Endpoint (#14250 ) - Description: [OCI Data Science](https://docs.oracle.com/en-us/iaas/data-science/using/home.htm) is a fully managed and serverless platform for data science teams to build, train, and manage machine learning models in the Oracle Cloud Infrastructure. This PR adds an integration for using LangChain with an LLM hosted on an [OCI Data Science Model Deployment](https://docs.oracle.com/en-us/iaas/data-science/using/model-dep-about.htm). To authenticate, [oracle-ads](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/cli/authentication.html) is used to automatically load credentials for invoking the endpoint. - Issue: None - Dependencies: `oracle-ads` - Tag maintainer: @baskaryan - Twitter handle: None --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-12-20 11:52:20 -08:00
Erick Friis	75ba22793f	community: Vectara summarization (#14970 ) Description: Adding Summarization to Vectara, to reflect it provides not only vector-store type functionality but also can return a summary. Also added: MMR capability (in the Vectara platform side) Updated templates Updated documentation and IPYNB examples Tag maintainer: @baskaryan Twitter handle: @ofermend --------- Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>	2023-12-20 11:51:33 -08:00
Liang Zhang	6479aab74f	community[patch]: Add param "task" to Databricks LLM to work around serialization of transform_output_fn (#14933 ) Reproduction code:
```python
from langchain.chains import LLMChain, load_chain
from langchain.llms import Databricks
from langchain.prompts import PromptTemplate


def transform_output(response):
    # Extract the answer from the responses.
    return str(response["candidates"][0]["text"])


def transform_input(request):
    full_prompt = f"""{request["prompt"]} Be Concise. """
    request["prompt"] = full_prompt
    return request


chat_model = Databricks(
    endpoint_name="llama2-13B-chat-Brambles",
    transform_input_fn=transform_input,
    transform_output_fn=transform_output,
    verbose=True,
)
print(f"Test chat model: {chat_model('What is Apache Spark')}")  # This works

llm_chain = LLMChain(llm=chat_model, prompt=PromptTemplate.from_template("{chat_input}"))
llm_chain("colorful socks")  # this works
# transform_input_fn and transform_output_fn are not serialized into the model yaml file
llm_chain.save("databricks_llm_chain.yaml")
# The Databricks LLM is recreated with transform_input_fn=None, transform_output_fn=None.
loaded_chain = load_chain("databricks_llm_chain.yaml")
# Thus this errors. The transform_output_fn is needed to produce the correct output
loaded_chain("colorful socks")
```
Error:
```
File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-6c34afab-3473-421d-877f-1ef18930ef4d/lib/python3.10/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for Generation
text
  str type expected (type=type_error.str)
request payload: {'query': 'What is a databricks notebook?'}'}
```
What does the error mean? When the LLM generates an answer, the answer is represented by a Generation data object. The Generation data object takes a str field called text, e.g. Generation(text="blah"). However, the Databricks LLM tried to put a non-str value into text, e.g. Generation(text={"candidates":[{"text": "blah"}]}), so pydantic raises a validation error. Why does the output format become incorrect after saving and loading the Databricks LLM? The Databricks LLM does not support serializing transform_input_fn and transform_output_fn, so they are not written to the model yaml file. When the Databricks LLM is loaded, it is recreated with transform_input_fn=None, transform_output_fn=None. Without transform_output_fn, the output text is not unwrapped, which causes this error. The missing transform_input_fn causes the additional prompt "Be Concise." to be lost after saving and loading. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-20 12:50:23 -05:00
Anush	60c70effe9	community[minor]: Qdrant sparse vector retriever (#14814 ) ## Description This PR intends to add support for Qdrant's new [sparse vector retrieval](https://qdrant.tech/articles/sparse-vectors/) by introducing a new retriever class, `QdrantSparseVectorRetriever`. Necessary usage docs and integration tests have been added for the retriever. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-20 02:22:19 -05:00
mogith-pn	c53fab63a3	community[patch]: Fixed duplicate input id issue in clarifai vectorstore (#14914 ) - Description: This PR fixes the duplicate input id issue that occurs in the Clarifai vectorstore class when ingesting more documents than the batch size. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-20 02:21:36 -05:00
Sypherd	5642132c0c	community[patch]: Add safe lookup to OpenAI response adapter (#14765 ) ## Description Similar to https://github.com/langchain-ai/langchain/issues/5861, I've experienced `KeyError`s resulting from unsafe lookups in the `convert_dict_to_message` function in [this file](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/adapters/openai.py). While that issue focused on `KeyError 'content'`, I've opened another issue (#14764) about how the problem still exists in the same function but with `KeyError 'role'`. The fix for #5861 only added a safe lookup to the specific line that was giving them trouble. This PR fixes the unsafe lookups in the rest of the function, but the problem still exists across the repo. ## Issues * #14764 * #5861 ## Dependencies * None ## Checklist [x] make format [x] make lint [ ] make test - Results in `make: *** No rule to make target 'test'. Stop.` ## Maintainers * @hinthornw --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-20 01:17:23 -05:00
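The safe-lookup idea in commit 5642132c0c can be illustrated with a minimal sketch. The helper below is hypothetical and is not the actual `convert_dict_to_message` implementation; the empty-string defaults are an assumption made for illustration. It only shows how `dict.get` avoids the `KeyError 'role'` / `KeyError 'content'` failures described above.

```python
from typing import Any, Mapping, Tuple


def read_role_and_content(message: Mapping[str, Any]) -> Tuple[str, str]:
    """Hypothetical helper: read message fields without raising KeyError."""
    # message["role"] raises KeyError when the key is absent;
    # .get() returns None instead, which is normalised to "" here (assumed default).
    role = message.get("role") or ""
    content = message.get("content") or ""
    return role, content
```

A missing key and an explicit `None` value are treated the same way, which matters for responses that send `"content": null`.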
AlpinDale	b0588774f1	community[minor]: Add Aphrodite Engine support (#14759 ) This PR adds support for PygmalionAI's [Aphrodite Engine](https://github.com/PygmalionAI/aphrodite-engine), based on vLLM's attention mechanism. At the moment, this PR does not include support for the API servers, but they will be added in a later PR. The only dependency as of now is `aphrodite-engine==0.4.2`. We pin the version to prevent breakage due to changes in the aphrodite-engine library. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-20 01:16:57 -05:00
Dmitry Tyumentsev	d21f44b484	community[minor]: Add YandexGPT embeddings (#14767 ) - Description: Introducing an ability to work with the [YandexGPT](https://cloud.yandex.com/en/services/yandexgpt) embeddings models. --------- Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>	2023-12-20 01:11:07 -05:00
Nicolas Suzor	529144649e	community[patch]: add png support for vertexai._parse_chat_history_gemini() (#14788 ) - Description: Modify community chat model vertexai to handle png and other image types encoded in base64 - Dependencies: added `import re` but no new dependencies. This addresses a problem where the vertexai method _parse_chat_history_gemini() was only recognizing image uris in jpeg format. I made a simple change to cover other extension types.	2023-12-20 00:58:39 -05:00
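As a rough illustration of the change in commit 529144649e, the sketch below matches base64 image data URIs of any subtype rather than JPEG only. The pattern and the function name are assumptions for illustration; the actual expression used in `_parse_chat_history_gemini()` is not shown in this log.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: a base64 data URI whose media type is any image subtype.
_IMAGE_DATA_URI = re.compile(
    r"^data:image/(?P<subtype>[\w.+-]+);base64,(?P<data>.+)$"
)


def parse_image_data_uri(uri: str) -> Optional[Tuple[str, str]]:
    """Return (subtype, base64 payload), or None if uri is not an image data URI."""
    match = _IMAGE_DATA_URI.match(uri)
    if match is None:
        return None
    return match.group("subtype"), match.group("data")
```

A pattern hard-coded to `image/jpeg` would return `None` for a PNG input, which is the limitation the commit message describes.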
Liu Jun	b0c48dc983	community[patch]: make ak and sk optional in qianfan endpoint (#14835 ) - Description: The Qianfan SDK offers multiple authentication methods, but in the `QianfanEndpoint` of Langchain, it currently only supports authentication through AK and SK. In order to accommodate users who wish to use alternative authentication methods, this pull request makes AK and SK optional. This change should not impact existing users, while allowing users to configure other authentication methods as per the Qianfan SDK documentation. - Issue: / - Dependencies: No - Tag maintainer: No - Twitter handle:	2023-12-20 00:49:33 -05:00
Archan Ghosh	65678b3816	community[patch]: Update arxiv.py with Entry ID as a return value (#14915 ) Added Entry ID as a return value inside get_summaries_as_docs - Description: Added the Entry ID as a return value, so it's easier to track the IDs of the papers that are being returned. With the Entry ID also returned by functions like ArxivRetriever, it will be easier to reference the ID of the paper itself.	2023-12-20 00:30:24 -05:00
Bagatur	345acb26ac	community[patch]: Matching engine, return doc id (#14930 )	2023-12-20 00:03:11 -05:00
Michael Feil	7b96de3d5d	community[patch]: update Gradient embeddings (#14846 ) - Description: Going forward, we have our own API package, installed with `pip install gradientai`. We are therefore gradually removing the self-built packages in llamaindex, haystack and langchain. - Issue: None. - Dependencies: `pip install gradientai` - Tag maintainer: @michaelfeil	2023-12-19 11:46:33 -05:00
Igor Dvorkin	6cc3c2452c	community[patch]: Enhance iMessage chat loader with timestamp parsing and message ownership (#14804 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-19 11:09:01 -05:00
Mohammad Mohtashim	e3abe12243	community[patch]: helpful error message for GitHubAPIWrapper (#14803 ) Very simple change in relation to the issue https://github.com/langchain-ai/langchain/issues/14550 @baskaryan, @eyurtsev, @hwchase17. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-19 11:08:06 -05:00
Dmitry Tyumentsev	50381abc42	community[patch]: Add retry logic to Yandex GPT API Calls (#14907 ) Description: Added logic for re-calling the YandexGPT API in case of an error --------- Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>	2023-12-19 10:51:42 -05:00
Sirjanpreet Singh Banga	425e5e1791	community[minor]: rename ChatGPTRouter to GPTRouter (#14913 ) Description: Rename integration to GPTRouter Tag maintainer: @Gupta-Anubhav12 @samanyougarg @sirjan-ws-ext Twitter handle: [@SamanyouGarg](https://twitter.com/SamanyouGarg)	2023-12-19 10:48:52 -05:00
JaguarDB	992b04e475	community[minor]: added jaguar vector store (#14838 ) Description: A new vector store Jaguar is being added. Class, test scripts, and documentation are added. Issue: None -- This is the first PR contributing to LangChain Dependencies: This depends on "pip install -U jaguardb-http-client" client http package Tag maintainer: @baskaryan, @eyurtsev, @hwchase1 Twitter handle: @workbot --------- Co-authored-by: JY <jyjy@jaguardb> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-19 10:40:18 -05:00
Sirjanpreet Singh Banga	44cb899a93	community[minor]: Integrating GPTRouter (#14900 ) Description: Adding a langchain integration for [GPTRouter](https://gpt-router.writesonic.com/) 🚀 Tag maintainer: @Gupta-Anubhav12 @samanyougarg @sirjan-ws-ext Twitter handle: [@SamanyouGarg](https://twitter.com/SamanyouGarg) Integration tests passing.	2023-12-19 10:08:36 -05:00
Leonid Ganeline	b2fd41331e	docs: docstrings `langchain_community` update (#14889 ) Added missing docstrings. Fixed inconsistencies in docstrings. Note CC @efriis: there were PR errors on `langchain_experimental/prompt_injection_identifier/hugging_face_identifier.py`, but I didn't touch this file in this PR! Could it be a cache problem? I fixed this error.	2023-12-19 08:58:24 -05:00
abhjaw	6fbd068b3f	Update kendra.py to avoid Kendra query ValidationException (#14866 ) Fixing issue - https://github.com/langchain-ai/langchain/issues/14494 to avoid Kendra query ValidationException --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-12-18 17:46:18 -08:00
Erick Friis	5f839beab9	community: replace deprecated davinci models (#14860 ) This is technically a breaking change because it'll switch out default models from `text-davinci-003` to `gpt-3.5-turbo-instruct`, but OpenAI is shutting off those endpoints on 1/4 anyways. Feels less disruptive to switch out the default instead.	2023-12-18 13:49:46 -08:00
Bob Lin	5de1dc72b9	community[patch]: Update Tongyi default model_name (#14844 ) The current model is no longer available.	2023-12-18 11:35:53 -05:00
Vlad Kolesnikov	11fda490ca	community[minor]: New model parameters and dynamic batching for VertexAIEmbeddings (#13999 ) - Description: VertexAIEmbeddings performance improvements - Twitter handle: @vladkol ## Improvements - Dynamic batch size, starting from 250, lowering down to 5. Batch size varies across regions. Some regions support larger batches, and it significantly improves performance. When running large batches of texts in `us-central1`, performance gain can be up to 3.5x. The dynamic batching also makes sure every batch is below 20K token limit. - New model parameter `embeddings_type` that translates to `task_type` parameter of the API. Newer model versions support [different embeddings task types](https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings#api_changes_to_models_released_on_or_after_august_2023).	2023-12-17 22:24:22 -05:00
William FH	2d91d2b978	community: Add logprobs in gen output (#14826 ) Now that it's supported again for OAI chat models. Shame this wouldn't include it in the `.invoke()` output though (it's not included in the message itself). Would need to do a follow-up for that to be the case.	2023-12-17 20:59:27 -05:00
Dmitry Tyumentsev	78ae276df7	community[patch]: fix agenerate return value (#14815 ) Fixed: - `_agenerate` return value in the YandexGPT Chat Model - duplicate line in the documentation Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>	2023-12-17 16:40:59 -05:00
sujeet	f1d3f29bc4	community[patch]: support for Sybase SQL anywhere added. (#14821 ) - Description: support for Sybase SQL anywhere added in sql_database.py file at path langchain\libs\community\langchain_community\utilities - Issue: It will resolve default schema setting for Sybase SQL anywhere - Dependencies: No, - Tag maintainer: @baskaryan, @eyurtsev, @hwchase17, - Twitter handle: NA --------- Co-authored-by: learn360sujeet <121271779+learn360sujeet@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-17 16:39:44 -05:00
Noah Stapp	34e6f3ff72	community[patch]: Implement similarity_score_threshold for MongoDB Vector Store (#14740 ) Adds the option for `similarity_score_threshold` when using `MongoDBAtlasVectorSearch` as a vector store retriever. Example use:
```
vector_search = MongoDBAtlasVectorSearch.from_documents(...)
qa_retriever = vector_search.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={
        "score_threshold": 0.5,
    }
)
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=qa_retriever,
)
docs = qa({"query": "..."})
```
I've tested this feature locally, using a MongoDB Atlas Cluster with a vector search index.	2023-12-15 16:49:21 -08:00
Dmitry Tyumentsev	dcead816df	community[patch]: Update YandexGPT API (#14773 ) Update LLM and Chat model to use new API version --------- Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>	2023-12-15 16:25:09 -08:00
Lance Martin	42421860bc	Add image support for Ollama (#14713 ) Support [LLaVA](https://ollama.ai/library/llava): * Upgrade Ollama * `ollama pull llava` Ensure compatibility with [image prompt template](https://github.com/langchain-ai/langchain/pull/14263) --------- Co-authored-by: jacoblee93 <jacoblee93@gmail.com>	2023-12-15 16:00:55 -08:00
Karim Lalani	a0064330b1	community[minor]: Add SurrealDB vectorstore (#13331 ) Description: Vectorstore implementation around [SurrealDB](https://www.surrealdb.com) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-15 13:34:51 -08:00
William FH	4855964332	Fix OAI Tool Message (#14746 ) See format here: https://platform.openai.com/docs/guides/function-calling/parallel-function-calling It expects a "name" argument, which we aren't providing by default. Alternative is to add the 'name' field directly to the message if people prefer.	2023-12-15 06:45:09 -08:00
Leonid Kuligin	7f42811e14	google-genai[patch], community[patch]: Added support for new Google GenerativeAI models (#14530 ) - Description: added support for new Google GenerativeAI models - Twitter handle: lkuligin --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-12-14 20:56:46 -08:00
Erick Friis	9fb26a2a71	community[patch]: fix pgvector sqlalchemy (#14726 ) Fixes #14699	2023-12-14 13:27:30 -08:00
Funkeke	ea99612caa	community[patch]: fix dashvector endpoint params error (#14484 ) Co-authored-by: fangkeke <3339698829@qq.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-12-13 14:38:27 -08:00
Bob Lin	dce3c74905	community[patch]: Correct type annotation for azure_ad_token_provider Closed: #14402 (#14432 ) Description Fix https://github.com/langchain-ai/langchain/issues/14402, Similar changes: https://github.com/langchain-ai/langchain/pull/14166 Twitter handle [lin_bob57617](https://twitter.com/lin_bob57617)	2023-12-13 14:37:39 -08:00
Fran Cirka	8a4162d15e	community[patch]: Fixed issue with importing Row from sqlalchemy (#14488 ) - Description: Fixed import of Row in cache.py, - Issue: #13464 (https://github.com/langchain-ai/langchain/issues/13464), - Dependencies: None, - Twitter handle: @frankybridman Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-12-13 14:36:08 -08:00
William FH	75b8891399	Update Vertex AI to include Gemini (#14670 ) h/t to @lkuligin - Description: added new models on VertexAI - Twitter handle: @lkuligin --------- Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-12-13 10:45:02 -08:00
Tomaz Bratanic	ea2616ae23	Fix RRF and lucene escape characters for neo4j vector store (#14646 ) * Remove Lucene special characters (fixes https://github.com/langchain-ai/langchain/issues/14232) * Fixes RRF normalization for hybrid search	2023-12-13 09:09:50 -08:00
Chengzu Ou	df95abb7e7	docs: Add Databricks Vector Search example notebook (#14158 ) This PR adds an example notebook for the Databricks Vector Search vector store. It also adds an introduction to the Databricks Vector Search product on the Databricks's provider page. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-12 17:40:29 -08:00
葛尧	e780433f6b	Fix token_usage None issue in ChatOpenAI with local Chatglm2-6B (#14493 ) When using local Chatglm2-6B by changing OPENAI_BASE_URL to localhost, the token_usage in ChatOpenAI becomes None. This leads to an AttributeError when trying to access token_usage.items(). This commit adds a check to ensure token_usage is not None before accessing its items. This change prevents the AttributeError and allows ChatOpenAI to work seamlessly with a local Chatglm2-6B model, aligning with the way it operates with the OpenAI API. Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-12-12 17:30:37 -08:00
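The `None` check described in commit e780433f6b can be sketched as follows. The function name and the shape of the usage dictionaries are assumptions for illustration, not the actual `ChatOpenAI` code; the point is only that `token_usage` is tested against `None` before `.items()` is called.

```python
from typing import Dict, List, Optional


def combine_token_usage(usages: List[Optional[Dict[str, int]]]) -> Dict[str, int]:
    """Hypothetical helper: sum token counts, skipping responses without usage data."""
    total: Dict[str, int] = {}
    for usage in usages:
        # An OpenAI-compatible local server may omit usage; calling
        # None.items() would raise AttributeError, so skip such entries.
        if usage is None:
            continue
        for key, value in usage.items():
            total[key] = total.get(key, 0) + value
    return total
```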
Massimiliano Pronesti	6080c98108	fix(embeddings): huggingface hub embeddings and TEI (#14489 ) Description: This PR fixes `HuggingFaceHubEmbeddings` by making the API token optional (as in the client beneath). Most models don't require one. I also updated the notebook for TEI (text-embeddings-inference) accordingly as requested here #14288. In addition, I fixed a mistake in the POST call parameters. Tag maintainers: @baskaryan	2023-12-12 17:21:52 -08:00
dandanwei	e5bd88383f	fix a bug in RedisNum filter against value 0 (#14587 ) - Description: There is a bug in the RedisNum filter: a filter on the value 0 is parsed as "". This PR fixes it. - Issue: NA - Dependencies: NA - Tag maintainer: NA - Twitter handle: NA	2023-12-12 15:34:45 -08:00
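One common way a bug like the one in commit e5bd88383f arises is a truthiness test on a numeric value, since `0` is falsy in Python. The sketch below is an assumption about the general pattern, not the actual `RedisNum` code; both function names are hypothetical, and the output string follows the RediSearch numeric range syntax `@field:[min max]`.

```python
from typing import Optional


def format_num_filter_buggy(field: str, value: Optional[int]) -> str:
    # Truthiness test: 0 is falsy, so a filter on 0 collapses to "".
    if not value:
        return ""
    return f"@{field}:[{value} {value}]"


def format_num_filter_fixed(field: str, value: Optional[int]) -> str:
    # Explicit None test: only a genuinely missing value yields "".
    if value is None:
        return ""
    return f"@{field}:[{value} {value}]"
```

Testing `value is None` instead of `not value` keeps `0` as a legitimate filter value while still treating an unset value as "no filter".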
Bagatur	d388863a3b	community[patch]: Release 0.0.2 (#14610 )	2023-12-12 09:58:04 -08:00
Erick Friis	5418d8bfd6	infra: import CI fix (#14562 ) TIL `**` globstar doesn't work in make Makefile changes fix that. `__getattr__` changes allow import of all files, but raise error when accessing anything from the module. file deletions were corresponding libs change from #14559	2023-12-11 14:59:10 -08:00
Bagatur	a844b495c4	community[patch]: Fix agenttoolkits imports (#14559 )	2023-12-11 14:19:25 -08:00
Bagatur	ed58eeb9c5	community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463 ) Moved the following modules to new package langchain-community in a backwards compatible fashion:
```
mv langchain/langchain/adapters community/langchain_community
mv langchain/langchain/callbacks community/langchain_community/callbacks
mv langchain/langchain/chat_loaders community/langchain_community
mv langchain/langchain/chat_models community/langchain_community
mv langchain/langchain/document_loaders community/langchain_community
mv langchain/langchain/docstore community/langchain_community
mv langchain/langchain/document_transformers community/langchain_community
mv langchain/langchain/embeddings community/langchain_community
mv langchain/langchain/graphs community/langchain_community
mv langchain/langchain/llms community/langchain_community
mv langchain/langchain/memory/chat_message_histories community/langchain_community
mv langchain/langchain/retrievers community/langchain_community
mv langchain/langchain/storage community/langchain_community
mv langchain/langchain/tools community/langchain_community
mv langchain/langchain/utilities community/langchain_community
mv langchain/langchain/vectorstores community/langchain_community
mv langchain/langchain/agents/agent_toolkits community/langchain_community
mv langchain/langchain/cache.py community/langchain_community
```
Moved the following to core:
```
mv langchain/langchain/utils/json_schema.py core/langchain_core/utils
mv langchain/langchain/utils/html.py core/langchain_core/utils
mv langchain/langchain/utils/strings.py core/langchain_core/utils
cat langchain/langchain/utils/env.py >> core/langchain_core/utils/env.py
rm langchain/langchain/utils/env.py
```
See .scripts/community_split/script_integrations.sh for all changes	2023-12-11 13:53:30 -08:00

726 Commits