langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-02 09:40:22 +00:00

Author	SHA1	Message	Date
Massimiliano Pronesti	8113d612bb	community[patch]: support modin document loader (#18866 ) Langchain community document loaders support `pyspark`, `polars`, and `pandas` dataframes but not `modin`'s. This PR addresses this point.	2024-03-10 18:40:04 -07:00
Pol Ruiz Farre	a7f63d8cb4	community[patch]: Fix BasePDFLoader suffix for s3 presigned urls (#18844 ) BasePDFLoader doesn't parse the suffix of the file correctly when parsing S3 presigned urls. This fix enables the proper detection and parsing of S3 presigned URLs to prevent errors such as `OSError: [Errno 36] File name too long`. No additional dependencies required.	2024-03-11 00:58:51 +00:00
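The suffix problem described in #18844 can be illustrated with a minimal sketch. This is not the loader's actual code: the helper name `suffix_from_url` and the example URL are hypothetical. The idea is to take the extension from the URL path only, so the long query string of a presigned URL never ends up in a temporary file name.

```python
import os
from urllib.parse import urlparse


def suffix_from_url(url: str) -> str:
    """Return the file extension found in the URL path, ignoring the query string."""
    path = urlparse(url).path  # drops "?X-Amz-Signature=..." and any fragment
    return os.path.splitext(path)[1]


# Hypothetical presigned URL, shortened for readability.
presigned = (
    "https://example-bucket.s3.amazonaws.com/reports/q1.pdf"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Signature=abc123"
)

# Splitting the raw URL keeps the whole query string in the "suffix".
naive_suffix = os.path.splitext(presigned)[1]
# Splitting only the path yields the real extension.
clean_suffix = suffix_from_url(presigned)
```

With a real presigned URL the query string runs to hundreds of characters, which is how a naive suffix can produce a file name the operating system rejects.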
Joshua Carroll	ddaf9de169	community: Fix bug with StreamlitChatMessageHistory (#18834 ) - Description: Fix Streamlit bug which was introduced by https://github.com/langchain-ai/langchain/pull/18250, update integration test - Issue: https://github.com/langchain-ai/langchain/issues/18684 - Dependencies: None	2024-03-09 13:42:22 -08:00
Tomaz Bratanic	a28be31a96	Switch to md5 for deduplication in neo4j integrations (#18846 ) Deduplicate documents using MD5 of the page_content. Also allows for custom deduplication with graph ingestion method by providing metadata id attribute --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-03-09 13:28:55 -08:00
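A minimal sketch of the deduplication idea in #18846, assuming documents are represented as plain dicts rather than LangChain `Document` objects. The function name `dedupe_by_content` is hypothetical and the real Neo4j integration may differ: a document's identity is its metadata `id` if one is supplied, otherwise the MD5 hex digest of its `page_content`.

```python
import hashlib


def dedupe_by_content(docs: list[dict]) -> list[dict]:
    """Keep the first document seen for each id (metadata id, else MD5 of content)."""
    seen: dict[str, dict] = {}
    for doc in docs:
        doc_id = doc.get("metadata", {}).get("id")
        if doc_id is None:
            # No custom id: fall back to a content hash.
            doc_id = hashlib.md5(doc["page_content"].encode("utf-8")).hexdigest()
        seen.setdefault(doc_id, doc)
    return list(seen.values())


docs = [
    {"page_content": "Alice knows Bob", "metadata": {}},
    {"page_content": "Alice knows Bob", "metadata": {}},  # exact duplicate
    {"page_content": "Alice knows Bob", "metadata": {"id": "custom-1"}},  # custom id
]
unique = dedupe_by_content(docs)
```

MD5 is used here only as a content fingerprint, not for any security purpose.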
Tomaz Bratanic	246724faab	LLM graph transformer prompt engineering (#18843 ) A bit of prompt engineering to improve results	2024-03-09 11:27:16 -08:00
Erick Friis	b48865bf94	langchain[patch]: attach hub metadata (#18830 )	2024-03-08 18:40:49 -08:00
Ammar	34b31a8cc7	core: add in-code docs for RunnableAssign class (#18826 ) Description: Improves the docstring for `RunnableAssign` by providing a concise description and a self-contained code example. Issue: #18803	2024-03-09 02:04:52 +00:00
Leonid Ganeline	476d6dc596	community[patch]: Use getattr for `toolkits` imports (#18825 ) This will preserve the namespace, without actually loading the underlying packages on init.	2024-03-08 20:54:28 -05:00
Erick Friis	bbb609ac9d	core[patch]: fix arbitrary config keys (#18827 )	2024-03-08 17:35:13 -08:00
Luis Antonio Vieira Junior	67c880af74	community[patch]: adding linearization config to AmazonTextractPDFLoader (#17489 ) - Description: Adding an optional parameter `linearization_config` to the `AmazonTextractPDFLoader` so the caller can define how the output will be linearized, instead of forcing a predefined set of linearization configs. It will still have a default configuration as this will be an optional parameter. - Issue: #17457 - Dependencies: The same ones that already exist for `AmazonTextractPDFLoader` - Twitter handle: [@lvieirajr19](https://twitter.com/lvieirajr19) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-08 17:25:22 -08:00
Anis ZAKARI	37e89ba5b1	community[patch]: Bedrock add support for mistral models (#18756 ) Description*: My previous [PR](https://github.com/langchain-ai/langchain/pull/18521) was mistakenly closed, so I am reopening this one. Context: AWS released two Mistral models on Bedrock last Friday (March 1, 2024). This PR includes some code adjustments to ensure their compatibility with the Bedrock class. --------- Co-authored-by: Anis ZAKARI <anis.zakari@hymaia.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-09 01:20:38 +00:00
Alexander Dicke	66576948e0	experimental[minor]: adds mixtral wrapper (#17423 ) Description: Adds a chat wrapper for Mixtral models using the [prompt template](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1#instruction-format). --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-08 17:14:23 -08:00
Keith Chan	914af69b44	community[patch]: Update azuresearch vectorstore from_texts() method to include fields argument (#17661 ) - Description: Update azuresearch vectorstore from_texts() method to include fields argument, necessary for creating an Azure AI Search index with custom fields. - Issue: Currently index fields are fixed to default fields if Azure Search index is created using from_texts() method - Dependencies: None - Twitter handle: None --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-08 17:05:35 -08:00
al1p	46f0cea2b9	community[patch]: improved the suffix prompt to avoid loop (#17791 ) Small improvement to the OpenAPI prompt. The agent was not finding the server base URL (it looped through all nodes). This small change narrows the search and enables finding the URL faster. No dependency Twitter : @al1pra	2024-03-08 16:53:09 -08:00
Dmitry Kankalovich	f5117e907d	openai[patch]: Proper example for AzureOpenAI usage in error message (#17798 ) # Proper example for AzureOpenAI usage in error message The original error message is wrong in part of a usage example it gives. Corrected to the right one. Co-authored-by: Dzmitry Kankalovich <dzmitry_kankalovich@epam.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-08 16:52:55 -08:00
Théo LEBRUN	cf94091cd0	community[patch]: Skip nested directories when using S3DirectoryLoader (#17829 ) - Description: `S3DirectoryLoader` fails if the prefix is a folder (e.g. `my_folder/`) because `S3FileLoader` tries to load that folder and fails. This PR skips nested directories so the prefix can be set to a folder instead of `my_folder/files_prefix`. - Issue: - #11917 - #6535 - #4326 - Dependencies: none - Twitter handle: @Falydoor	2024-03-08 16:50:58 -08:00
Venkatesan	7a18b63dbf	community[patch]: Mongo index creation (#17748 ) - Title: MongoDB connection performance improvement. - Description: Made index creation on the collection optional; index creation is a one-time process. - Issue: The `MongoDBChatMessageHistory` object attempts to create an index during connection, causing each request to take longer than usual. This should be optional, controlled by a parameter. - Dependencies: N/A - Branch to be checked: origin/mongo_index_creation --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-08 16:43:17 -08:00
wt3639	5b5b37a999	community[patch]: Add embedding instruction to HuggingFaceBgeEmbeddings (#18017 ) - Description: Add embedding instruction to HuggingFaceBgeEmbeddings, so that it can be compatible with nomic and other models that need embedding instruction. --------- Co-authored-by: Tao Wu <tao.wu@rwth-aachen.de> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-08 16:39:29 -08:00
Erick Friis	a8de6d1533	anthropic[patch]: integration test update (#18823 )	2024-03-08 13:47:31 -08:00
wewebber-merlin	d1f5bc4906	anthropic[patch]: add kwargs to format_output base (#18715 ) _generate() and _agenerate() both accept kwargs, then pass them on to _format_output; but _format_output doesn't accept kwargs. Attempting to pass, e.g., timeout=50 to _generate (or invoke()) results in a TypeError. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-08 21:47:21 +00:00
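The failure mode in #18715 can be reproduced in isolation. The sketch below uses hypothetical stand-in functions, not the Anthropic integration's real methods: a caller forwards `**kwargs` to a helper, and the call only succeeds when the helper's signature also accepts them.

```python
def format_output_strict(data: dict) -> str:
    """Helper that does not accept extra keyword arguments."""
    return data["text"]


def format_output_tolerant(data: dict, **kwargs) -> str:
    """Helper that accepts (and here ignores) extra keyword arguments."""
    return data["text"]


def generate(formatter, **kwargs) -> str:
    """Stand-in for a generate method that forwards its kwargs to the formatter."""
    data = {"text": "hello"}
    return formatter(data, **kwargs)


try:
    generate(format_output_strict, timeout=50)
    strict_error = None
except TypeError as exc:
    # Python rejects the unexpected keyword argument 'timeout'.
    strict_error = exc

tolerant_result = generate(format_output_tolerant, timeout=50)
```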
Erick Friis	aa7bce6b13	anthropic[patch]: release 0.1.4 (#18822 )	2024-03-08 21:34:47 +00:00
Erick Friis	a5bcddc738	anthropic[patch]: streaming param (#18819 )	2024-03-08 13:32:57 -08:00
Erick Friis	8c0b215c02	anthropic[patch]: fix format output args (#18816 )	2024-03-08 12:34:11 -08:00
Ishani Vyas	2b0cbd65ba	community[patch]: Add Passio Nutrition AI Food Search Tool to Community Package (#18278 ) ## Add Passio Nutrition AI Food Search Tool to Community Package ### Description We propose adding a new tool to the `community` package, enabling integration with Passio Nutrition AI for food search functionality. This tool will provide a simple interface for retrieving nutrition facts through the Passio Nutrition AI API, simplifying user access to nutrition data based on food search queries. ### Implementation Details - Class Structure: Implement `NutritionAI`, extending `BaseTool`. It includes an `_run` method that accepts a query string and, optionally, a `CallbackManagerForToolRun`. - API Integration: Use `NutritionAIAPI` for the API wrapper, encapsulating all interactions with the Passio Nutrition AI and providing a clean API interface. - Error Handling: Implement comprehensive error handling for API request failures. ### Expected Outcome - User Benefits: Enable easy querying of nutrition facts from Passio Nutrition AI, enhancing the utility of the `langchain_community` package for nutrition-related projects. - Functionality: Provide a straightforward method for integrating nutrition information retrieval into users' applications. ### Dependencies - `langchain_core` for base tooling support - `pydantic` for data validation and settings management - Consider `requests` or another HTTP client library if not covered by `NutritionAIAPI`. ### Tests and Documentation - Unit Tests: Include tests that mock network interactions to ensure tool reliability without external API dependency. - Documentation: Create an example notebook in `docs/docs/integrations/tools/passio_nutrition_ai.ipynb` showing usage, setup, and example queries. ### Contribution Guidelines Compliance - Adhere to the project's linting and formatting standards (`make format`, `make lint`, `make test`). 
- Ensure compliance with LangChain's contribution guidelines, particularly around dependency management and package modifications. ### Additional Notes - Aim for the tool to be a lightweight, focused addition, not introducing significant new dependencies or complexity. - Potential future enhancements could include caching for common queries to improve performance. ### Twitter Handle - Here is our Passio AI [twitter handle](https://twitter.com/@passio_ai) where we announce our products.	2024-03-08 20:33:22 +00:00
Kushagra	b1f22bf76c	community[minor]: added a feature to filter documents in Mongoloader (#18253 ) - Description: added a feature to filter documents in Mongoloader - Feature: the feature #18251 - Dependencies: No - Twitter handle: https://twitter.com/im_Kushagra	2024-03-08 12:06:35 -08:00
Eugene Yurtsev	1f50274df7	community[patch]: Add pgvector to docker compose and update settings used in integration test (#18815 )	2024-03-08 14:39:28 -05:00
Erick Friis	ad29806255	nvidia-trt, nvidia-ai-endpoints: move to repo (#18814 ) NVIDIA maintained in https://github.com/langchain-ai/langchain-nvidia	2024-03-08 19:30:50 +00:00
Christophe Bornet	e54a49b697	community[minor]: Add lazy_table_reflection param to SqlDatabase (#18742 ) For some DBs with lots of tables, reflection of all the tables can take very long. So this change will make the tables be reflected lazily when get_table_info() is called and `lazy_table_reflection` is True.	2024-03-08 14:10:23 -05:00
Christophe Bornet	ead2a74806	community: Implement lazy_load() for JSONLoader (#18643 ) Covered by `tests/unit_tests/document_loaders/test_json_loader.py`	2024-03-08 13:58:17 -05:00
Erick Friis	a88f62ec3c	langchain[patch]: getattr import from langchain.chains (#18160 )	2024-03-08 10:36:14 -08:00
Eugene Yurtsev	cdfb5b4ca1	core[minor]: Chat models fall back from astream to sync stream if available (#18748 ) Allows all chat models that implement _stream but not _astream to still support async streaming. Among other things, this should resolve issues with streaming community model implementations through langserve, since langserve is exclusively async.	2024-03-08 13:27:29 -05:00
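One way such a fallback can work is sketched below. It is an illustration under stated assumptions, not the code from #18748: the names `sync_stream` and `astream_fallback` are hypothetical, and each `next()` call on the synchronous iterator is pushed to a worker thread so the event loop is not blocked.

```python
import asyncio
from typing import AsyncIterator, Iterator


def sync_stream() -> Iterator[str]:
    """Stand-in for a model's synchronous streaming implementation."""
    yield from ["Hel", "lo"]


async def astream_fallback(iterator: Iterator[str]) -> AsyncIterator[str]:
    """Expose a synchronous iterator as an async iterator."""
    done = object()  # sentinel returned by next() when the iterator is exhausted
    while True:
        chunk = await asyncio.to_thread(next, iterator, done)
        if chunk is done:
            break
        yield chunk


async def collect() -> list:
    return [chunk async for chunk in astream_fallback(sync_stream())]


chunks = asyncio.run(collect())
```

`asyncio.to_thread` requires Python 3.9 or later.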
aditya thomas	e00c1ff2b0	infra: ChatOpenAI unit tests for invoke() and ainvoke() (#18792 ) Description: Replacing the deprecated predict() and apredict() methods in the unit tests Issue: Not applicable Dependencies: None Lint and test: `make format`, `make lint` and `make test` have been run	2024-03-08 09:48:38 -08:00
Bagatur	3e29c04213	core[minor]: add BaseMessage.response_metadata (#18699 )	2024-03-08 09:35:56 -08:00
Bagatur	bc6249c889	langchain[patch]: runnable agent streaming param (#18761 ) Usage: ```python agent = RunnableAgent(runnable=runnable, .., stream_runnable=False) ``` or for convenience ```python agent_executor = AgentExecutor(agent=agent, ..., stream_runnable=False) ```	2024-03-07 20:53:53 -08:00
Tomaz Bratanic	c8c592d3f1	experimental[minor]: Add LLM graph transformer (#18733 ) Add a class that constructs knowledge graphs based on text using an LLM.	2024-03-07 20:52:53 -08:00
Phat Vo	3ecb903d49	community[patch] : Tidy up and update Clarifai SDK functions (#18314 ) Description : * Tidy up, add missing docstring and fix unused params * Enable using session token	2024-03-07 19:47:44 -08:00
Max Jakob	61a2eba081	elasticsearch[patch]: add top-level import, remove obsolete dependency (#18644 ) Make `ElasticsearchRetriever` available as top-level import. The `langchain` package depends on `langchain-community` so we do not need to depend on it explicitly.	2024-03-07 19:38:31 -08:00
Tomaz Bratanic	010a234f1e	docs: Fix diffbot graph transformer description (#18736 ) The previous docstring was invalid	2024-03-07 19:25:41 -08:00
Jan Nissen	b8922480ed	core[patch]: improve PydanticOutputParser typing (#18740 ) This PR adds generic typing to `PydanticOutputParser` so we get a typed output from `.parse` instead of `Any`. It should provide a better DX by way of Intellisense and for anyone strictly typing. Pre-change: ![Screenshot 2024-03-07 at 10 22 31 AM](https://github.com/langchain-ai/langchain/assets/22690160/fd22dde0-9fdc-4283-b283-4c98f0bc46e5) Post-change: ![Screenshot 2024-03-07 at 10 26 31 AM](https://github.com/langchain-ai/langchain/assets/22690160/7e23d2b7-8f8c-494f-80b3-187530a173ee) I haven't dug too deep, but I think a similar change could probably be added to `JsonOutputParser` so we don't have to pull up `.parse`. Co-authored-by: Jan Nissen <jan23@gmail.com>	2024-03-07 19:25:24 -08:00
Massimiliano Pronesti	3b975c6ebe	experimental[minor]: add support for modin in pandas agent (#18749 ) Added support for Intel's [modin](https://github.com/modin-project/modin) in `create_pandas_dataframe_agent`.	2024-03-07 19:23:07 -08:00
Tomaz Bratanic	4bfe888717	community[patch]: Fix neo4j sanitizing values (#18750 ) Fixing sanitization for when deeply nested lists appear	2024-03-07 19:21:52 -08:00
Eugene Yurtsev	6caceb5473	core[patch]: Automatic upgrade to AddableDict in transform and atransform (#18743 ) Automatic upgrade to transform and atransform Closes: https://github.com/langchain-ai/langchain/issues/18741 https://github.com/langchain-ai/langgraph/issues/136 https://github.com/langchain-ai/langserve/issues/504	2024-03-07 21:23:12 -05:00
Yunmo Koo	fee6f983ef	community[minor]: Integration for `Friendli` LLM and `ChatFriendli` ChatModel. (#17913 ) ## Description - Add [Friendli](https://friendli.ai/) integration for `Friendli` LLM and `ChatFriendli` chat model. - Unit tests and integration tests corresponding to this change are added. - Documentation corresponding to this change is added. ## Dependencies - Optional dependency [`friendli-client`](https://pypi.org/project/friendli-client/) package is added only for those who use the `Friendli` or `ChatFriendli` model. ## Twitter handle - https://twitter.com/friendliai	2024-03-08 02:20:47 +00:00
Smit Parmar	aed46cd6f2	community[patch]: Added support for filtering AWS Kendra search results by score confidence (#12920 ) Description: Adds support for filtering Kendra search results by score confidence, which makes the results more accurate. For example ``` retriever = AmazonKendraRetriever( index_id=kendra_index_id, top_k=5, region_name=region, score_confidence="HIGH" ) ``` The result will not include records whose score confidence is "LOW" or "MEDIUM". Relevant docs https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/query.html https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/retrieve.html Issue: resolves #11801 twitter: [@SmitCode](https://twitter.com/SmitCode)	2024-03-07 17:28:09 -08:00
Ian	390ef6abe3	community[minor]: Add Initial Support for TiDB Vector Store (#15796 ) This pull request introduces initial support for the TiDB vector store. The current version is basic, laying the foundation for the vector store integration. While this implementation provides the essential features, we plan to expand and improve the TiDB vector store support with additional enhancements in future updates. Upcoming Enhancements: * Support for Vector Index Creation: To enhance the efficiency and performance of the vector store. * Support for max marginal relevance search. * Customized Table Structure Support: Recognizing the need for flexibility, we plan for more tailored and efficient data store solutions. Simple use case example:
```python
from typing import List, Tuple

from langchain.docstore.document import Document
from langchain_community.vectorstores import TiDBVectorStore
from langchain_openai import OpenAIEmbeddings

# Not defined in the original snippet; assumed here so the example is complete.
embeddings = OpenAIEmbeddings()
tidb_connection_url = "..."  # placeholder: your TiDB connection string

db = TiDBVectorStore.from_texts(
    embedding=embeddings,
    texts=[
        "Andrew like eating oranges",
        "Alexandra is from England",
        "Ketanji Brown Jackson is a judge",
    ],
    table_name="tidb_vector_langchain",
    connection_string=tidb_connection_url,
    distance_strategy="cosine",
)

query = "Can you tell me about Alexandra?"
docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query)

# Print each matching document with its similarity score.
for doc, score in docs_with_score:
    print("-" * 80)
    print("Score: ", score)
    print(doc.page_content)
    print("-" * 80)
```
	2024-03-07 17:18:20 -08:00
Bagatur	3b1eb1f828	community[patch]: chat hf typing fix (#18693 )	2024-03-07 17:06:38 -08:00
Jib	d60e93b6ae	langchain-mongodb: Standardize mongodb collection/index names in tests (#18755 ) ## Description: MongoDB integration tests link to a provided Atlas Cluster. We have very stringent permissions set against the cluster provided. In order to make it easier to track and isolate the collections each test gets run against, we've updated the collection names to map the test file name. i.e. `langchain_{filename}` => `langchain_test_vectorstores` Fixes integration test results ![image](https://github.com/langchain-ai/langchain/assets/2887713/41f911b9-55f7-4fe4-9134-5514b82009f9) ## Dependencies: Provided MONGODB_ATLAS_URI - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ cc: @shaneharvey, @blink1073 , @NoahStapp , @caseyclements	2024-03-07 17:16:04 -05:00
Eugene Yurtsev	ca299a8e08	Docs: Add custom parsing documentation and extending langchain (#18331 ) * Added extending langchain.mdx -- we'll need to add links as we add more custom documentation * Added partial documentation about parsers	2024-03-07 16:30:57 -05:00
Eugene Yurtsev	8c71f92cb2	core: upgrade mypy to recent mypy (#18753 ) Testing this works per package on CI	2024-03-07 15:25:19 -05:00
Eugene Yurtsev	e188d4ecb0	Add dangerous parameter to requests tool (#18697 ) The tools are already documented as dangerous. Not clear whether adding an opt-in parameter is necessary or not	2024-03-07 15:10:56 -05:00
Erick Friis	1beb84b061	community[patch]: move pdf text tests to integration (#18746 )	2024-03-07 10:34:22 -08:00
Christophe Bornet	4a7d73b39d	community: If load() has been overridden, use it in default lazy_load() (#18690 )	2024-03-07 11:52:19 -05:00
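The behaviour named in #18690 can be sketched with a simplified base class. This is an assumption-laden illustration that uses strings in place of `Document` objects and is not the actual `BaseLoader` source: `load()` defaults to materialising `lazy_load()`, and `lazy_load()` defaults to reusing `load()` when a subclass has overridden it.

```python
from typing import Iterator, List


class BaseLoader:
    """Simplified stand-in for a document loader base class."""

    def load(self) -> List[str]:
        # Default: eagerly collect everything from lazy_load().
        return list(self.lazy_load())

    def lazy_load(self) -> Iterator[str]:
        # If a subclass overrode load(), reuse it instead of failing.
        if type(self).load is not BaseLoader.load:
            return iter(self.load())
        raise NotImplementedError(
            f"{type(self).__name__} implements neither load() nor lazy_load()"
        )


class LegacyLoader(BaseLoader):
    """Older-style loader that only implements load()."""

    def load(self) -> List[str]:
        return ["a", "b"]


class StreamingLoader(BaseLoader):
    """Newer-style loader that only implements lazy_load()."""

    def lazy_load(self) -> Iterator[str]:
        yield "x"
        yield "y"


legacy_docs = list(LegacyLoader().lazy_load())
streaming_docs = StreamingLoader().load()
```

Either method can then be called on any loader, whichever one the subclass chose to implement.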
Christophe Bornet	6cd7607816	community[patch]: Implement lazy_load() for MHTMLLoader (#18648 ) Covered by `tests/unit_tests/document_loaders/test_mhtml.py`	2024-03-07 11:50:18 -05:00
axiangcoding	9745b5894d	community[patch]: Chroma use uuid4 instead of uuid1 to generate random ids (#18723 ) - Description: Chroma now uses uuid4 instead of uuid1 to generate random ids. Using uuid1 may leak the MAC address; changing to uuid4 has no other effects. - Issue: None - Dependencies: None - Twitter handle: None	2024-03-07 11:48:25 -05:00
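The privacy concern behind #18723 is easy to demonstrate with the standard library. In this minimal sketch the node value `FAKE_MAC` is made up and passed explicitly so the example is deterministic. A version 1 UUID carries its 48-bit node value, which can be the host's hardware address, while a version 4 UUID is built from random bits.

```python
import uuid

FAKE_MAC = 0x001122334455  # made-up 48-bit node value standing in for a MAC address

u1 = uuid.uuid1(node=FAKE_MAC)  # time-based: embeds the node value
u4 = uuid.uuid4()               # random: carries no host information

# Anyone holding the version 1 UUID can read the node value back out.
recovered = u1.node
```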
Guangdong Liu	ced5e7bae7	community[patch]: Fix sparkllm authentication problem. (#18651 ) - Description: Fix sparkllm authentication problem. The current timestamp is in RFC1123 format, and the time deviation must be kept within 300s, so the URL is now re-obtained every time a question is asked. https://www.xfyun.cn/doc/spark/general_url_authentication.html#_1-2-%E9%89%B4%E6%9D%83%E5%8F%82%E6%95%B0	2024-03-06 18:43:16 -08:00
Erick Friis	89d32ffbbd	community[patch]: release 0.0.27 (#18708 )	2024-03-07 01:08:43 +00:00
Erick Friis	c09b520ce4	core[patch]: release 0.1.30 (#18706 )	2024-03-06 16:12:18 -08:00
Piyush Jain	2b234a4d96	Support for claude v3 models. (#18630 ) Fixes #18513. ## Description This PR attempts to fix the support for Anthropic Claude v3 models in BedrockChat LLM. The changes here update the payload to use the `messages` format instead of the formatted text prompt for all models; the `messages` API is backwards compatible with all models in Anthropic, so this should not break the experience for any models. ## Notes The PR in its current form does not support the v3 models for the non-chat Bedrock LLM. This means that, with these changes, users won't be able to use the v3 models with the Bedrock LLM. I can open a separate PR to tackle this use case; the intent here was to get this out quickly so users can start using and testing the chat LLM. The Bedrock LLM classes have also grown complex, with a lot of conditions to support various providers and models, and are ripe for a refactor to make future changes more palatable. This refactor is likely to take longer and requires more thorough testing from the community. Credit to PRs [18579](https://github.com/langchain-ai/langchain/pull/18579) and [18548](https://github.com/langchain-ai/langchain/pull/18548) for some of the code here. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-06 15:46:18 -08:00
Sam Khano	1b4dcf22f3	community[minor]: Add DocumentDBVectorSearch VectorStore (#17757 ) Description: - Added Amazon DocumentDB Vector Search integration (HNSW index) - Added integration tests - Updated AWS documentation with DocumentDB Vector Search instructions - Added notebook for DocumentDB integration with example usage --------- Co-authored-by: EC2 Default User <ec2-user@ip-172-31-95-226.ec2.internal>	2024-03-06 15:11:34 -08:00
Vittorio Rigamonti	51f3902bc4	community[minor]: Adding support for Infinispan as VectorStore (#17861 ) Description: This integrates Infinispan as a vectorstore. Infinispan is an open-source key-value data grid that can run as a single node or distributed. Vector search has been supported since release 15.x. For more: [Infinispan Home](https://infinispan.org). Integration tests are provided as well as a demo notebook.	2024-03-06 15:11:02 -08:00
Max Jakob	cca0167917	elasticsearch[patch], community[patch]: update references, deprecate community classes (#18506 ) Follow up on https://github.com/langchain-ai/langchain/pull/17467. - Update all references to the Elasticsearch classes to use the partners package. - Deprecate community classes. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-06 15:09:12 -08:00
Djordje	12b4a4d860	community[patch]: Opensearch delete method added - indexing supported (#18522 ) - Description: Added delete method for OpenSearchVectorSearch, therefore indexing supported - Issue: No - Dependencies: No - Twitter handle: stkbmf	2024-03-06 15:08:47 -08:00
Erick Friis	687d27567d	openai[patch]: unit test azure init (#18703 )	2024-03-06 14:17:09 -08:00
Christophe Bornet	db8db6faae	community: Implement lazy_load() for PlaywrightURLLoader (#18676 ) Integration tests: `tests/integration_tests/document_loaders/test_url_playwright.py`	2024-03-06 16:52:13 -05:00
Aaron Yi	c092db862e	community[patch]: make metadata and text optional as expected in DocArray (#18678 )
```
ValidationError: 2 validation errors for DocArrayDoc
text
  Field required [type=missing, input_value={'embedding': [-0.0191128...9, 0.01005221541175212]}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.5/v/missing
metadata
  Field required [type=missing, input_value={'embedding': [-0.0191128...9, 0.01005221541175212]}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.5/v/missing
```
In the `_get_doc_cls` method, the `DocArrayDoc` class is defined as follows:
```python
class DocArrayDoc(BaseDoc):
    text: Optional[str]
    embedding: Optional[NdArray] = Field(**embeddings_params)
    metadata: Optional[dict]
```
	2024-03-06 16:51:41 -05:00
Eugene Yurtsev	4c25b49229	community[major]: breaking change in some APIs to force users to opt-in for pickling (#18696 ) This is a PR that adds a dangerous load parameter to force users to opt in to use pickle. This is a PR that's meant to raise user awareness that the pickling module is involved.	2024-03-06 16:43:01 -05:00
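The opt-in pattern described in #18696 can be sketched as follows. The function name `load_pickled` and the parameter name `allow_dangerous_deserialization` are illustrative assumptions, not a statement of the exact API this commit introduced: deserialization is refused unless the caller explicitly acknowledges the risk.

```python
import pickle


def load_pickled(data: bytes, *, allow_dangerous_deserialization: bool = False):
    """Unpickle data only if the caller has explicitly opted in."""
    if not allow_dangerous_deserialization:
        raise ValueError(
            "Loading this object relies on pickle, which can execute arbitrary "
            "code. Pass allow_dangerous_deserialization=True only for data you trust."
        )
    return pickle.loads(data)


payload = pickle.dumps({"k": 1})

try:
    load_pickled(payload)  # default: refused
    refused = False
except ValueError:
    refused = True

loaded = load_pickled(payload, allow_dangerous_deserialization=True)
```

Making the flag keyword-only and defaulting it to `False` means existing call sites fail loudly instead of silently unpickling, which is why the commit is marked as a breaking change.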
Eugene Yurtsev	0e52961562	community[patch]: Patch tfidf retriever (CVE-2024-2057) (#18695 ) This is a patch for `CVE-2024-2057`: https://www.cve.org/CVERecord?id=CVE-2024-2057 This affects users that: * Use the `TFIDFRetriever` * Attempt to de-serialize it from an untrusted source that contains a malicious payload	2024-03-06 15:49:04 -05:00
Erick Friis	2619420df1	mongodb[patch]: release 0.1.1 (#18692 )	2024-03-06 19:44:14 +00:00
Christophe Bornet	ea141511d8	core: Move document loader interfaces to core (#17723 ) This is needed to be able to move document loaders to partner packages. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-03-06 13:59:00 -05:00
Christophe Bornet	5985454269	Merge pull request #18539 * Implement lazy_load() for GitLoader	2024-03-06 13:25:14 -05:00
Christophe Bornet	9a6f7e213b	Merge pull request #18423 * Implement lazy_load() for BSHTMLLoader	2024-03-06 13:25:01 -05:00
Christophe Bornet	b3a0c44838	Merge pull request #18673 * Implement lazy_load() for PDFMinerPDFasHTMLLoader and PyMuPDFLoader	2024-03-06 13:24:36 -05:00
Christophe Bornet	68fc0cf909	Merge pull request #18674 * Implement lazy_load() for TextLoader	2024-03-06 13:23:42 -05:00
Christophe Bornet	5b92f962f1	Merge pull request #18671 * Implement lazy_load() for MastodonTootsLoader	2024-03-06 13:23:14 -05:00
Christophe Bornet	15b1770326	Merge pull request #18421 * Implement lazy_load() for AssemblyAIAudioTranscriptLoader	2024-03-06 13:16:05 -05:00
Christophe Bornet	bb284eebe4	Merge pull request #18436 * Implement lazy_load() for ConfluenceLoader	2024-03-06 13:15:24 -05:00
Christophe Bornet	691480f491	Merge pull request #18647 * Implement lazy_load() for UnstructuredBaseLoader	2024-03-06 13:13:10 -05:00
Christophe Bornet	52ac67c5d8	Merge pull request #18654 * Implement lazy_load() for ObsidianLoader	2024-03-06 13:06:55 -05:00
Christophe Bornet	b9c0cf9025	Merge pull request #18656 * Implement lazy_load() for PsychicLoader	2024-03-06 13:05:04 -05:00
Christophe Bornet	aa7ac57b67	community: Implement lazy_load() for TrelloLoader (#18658 ) Covered by `tests/unit_tests/document_loaders/test_trello.py`	2024-03-06 13:04:36 -05:00
Christophe Bornet	302985fea1	community: Implement lazy_load() for SlackDirectoryLoader (#18675 ) Integration tests: `tests/integration_tests/document_loaders/test_slack.py`	2024-03-06 13:04:13 -05:00
Christophe Bornet	ed36f9f604	community: Implement lazy_load() for WhatsAppChatLoader (#18677 ) Integration test: `tests/integration_tests/document_loaders/test_whatsapp_chat.py`	2024-03-06 13:03:46 -05:00
Christophe Bornet	f414f5cdb9	community[minor]: Implement lazy_load() for WikipediaLoader (#18680 ) Integration test: `tests/integration_tests/document_loaders/test_wikipedia.py`	2024-03-06 13:03:21 -05:00
Bagatur	4cbfeeb1c2	community[patch]: Release 0.0.26 (#18683 )	2024-03-06 09:41:18 -08:00
Christophe Bornet	1100f8de7a	community[minor]: Implement lazy_load() for ArxivLoader (#18664 ) Integration tests: `tests/integration_tests/utilities/test_arxiv.py` and `tests/integration_tests/document_loaders/test_arxiv.py`	2024-03-06 09:16:49 -05:00
Christophe Bornet	2d96803ddd	community[minor]: Implement lazy_load() for OutlookMessageLoader (#18668 ) Integration test: `tests/integration_tests/document_loaders/test_email.py`	2024-03-06 09:15:57 -05:00
Christophe Bornet	ae167fb5b2	community[minor]: Implement lazy_load() for SitemapLoader (#18667 ) Integration tests: `test_sitemap.py` and `test_docusaurus.py`	2024-03-06 09:15:35 -05:00
Christophe Bornet	623dfcc55c	community[minor]: Implement lazy_load() for FacebookChatLoader (#18669 ) Integration test: `tests/integration_tests/document_loaders/test_facebook_chat.py`	2024-03-06 09:15:00 -05:00
Christophe Bornet	20794bb889	community[minor]: Implement lazy_load() for GitbookLoader (#18670 ) Integration test: `tests/integration_tests/document_loaders/test_gitbook.py`	2024-03-06 09:14:36 -05:00
Liang Zhang	81985b31e6	community[patch]: Databricks SerDe uses cloudpickle instead of pickle (#18607 ) - Description: Databricks SerDe uses cloudpickle instead of pickle when serializing a user-defined function transform_input_fn since pickle does not support functions defined in `__main__`, and cloudpickle supports this. - Dependencies: cloudpickle>=2.0.0 Added a unit test.	2024-03-05 18:04:45 -08:00
Christophe Bornet	7d6de96186	community[patch]: Implement lazy_load() for CubeSemanticLoader (#18535 ) Covered by `test_cube_semantic.py`	2024-03-05 17:32:31 -08:00
Christophe Bornet	a6b5d45e31	community[patch]: Implement lazy_load() for EverNoteLoader (#18538 ) Covered by `test_evernote_loader.py`	2024-03-05 17:29:52 -08:00
Max Jakob	ee7a7954b9	elasticsearch: add `ElasticsearchRetriever` (#18587 ) Implement [Retriever](https://python.langchain.com/docs/modules/data_connection/retrievers/) interface for Elasticsearch. I opted to only expose the `body`, which gives you full flexibility, and none the other 68 arguments of the [search method](https://elasticsearch-py.readthedocs.io/en/v8.12.1/api/elasticsearch.html#elasticsearch.Elasticsearch.search). Added a user agent header for usage tracking in Elastic Cloud. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-06 00:42:50 +00:00
Jib	8bc347c5fc	mongodb[patch]: include LLM caches in toplevel library import (#18601 )	2024-03-05 16:35:13 -08:00
Sunchao Wang	dc81dba6cf	community[patch]: Improve amadeus tool and doc (#18509 ) Description: This pull request makes two improvements. Fix for crash in the flight search interface: previously, the code crashed when the flight ticket search failed. The failure is now handled gracefully instead of crashing. Documentation update for the Amadeus Toolkit: the examples in the documentation could not run because of outdated information. The documentation has been updated so that all examples run successfully. Issue: https://github.com/langchain-ai/langchain/issues/17375 Twitter Handle: SingletonYxx	2024-03-05 16:17:22 -08:00
Christophe Bornet	f77f7dc3ec	community[patch]: Fix VectorStoreQATool (#18529 ) Fix #18460	2024-03-05 15:56:58 -08:00
Dounx	ad48f55357	community[minor]: add Yuque document loader (#17924 ) This pull request adds support for loading documents from Yuque with LangChain. Yuque is a professional cloud-based knowledge base for team collaboration on documentation. Website: https://www.yuque.com OpenAPI: https://www.yuque.com/yuque/developer/openapi	2024-03-05 15:54:07 -08:00
Kazuki Maeda	60c5d964a8	community[minor]: use jq schema for content_key in json_loader (#18003 ) ### Description Changed the value specified for `content_key` in JSONLoader from a single key to a value based on jq schema. I created a [similar PR](https://github.com/langchain-ai/langchain/pull/11255) before, but it had several conflicts because of the architectural change associated with the stable version release, so I re-created this PR to fit the new architecture. ### Why For JSON data like the following, where `.data[].attributes.message` should become the page_content and `.data[].id` or `.data[].attributes.tags` should go into the metadata, the `content_key` must also be able to parse the JSON structure. <details> <summary>sample json data</summary>
```json
{
  "data": [
    {
      "attributes": { "message": "message1", "tags": ["tag1"] },
      "id": "1"
    },
    {
      "attributes": { "message": "message2", "tags": ["tag2"] },
      "id": "2"
    }
  ]
}
```
</details> <details> <summary>sample code</summary>
```python
from langchain_community.document_loaders import JSONLoader


def metadata_func(record: dict, metadata: dict) -> dict:
    # `record` is one element of `.data[]` from the sample above.
    metadata["source"] = None
    metadata["id"] = record.get("id")
    metadata["tags"] = record["attributes"].get("tags")
    return metadata


sample_file = "sample1.json"
loader = JSONLoader(
    file_path=sample_file,
    jq_schema=".data[]",
    content_key=".attributes.message",  # content_key is parsed as a jq schema
    is_content_key_jq_parsable=True,  # parameter added by this PR
    metadata_func=metadata_func,
)

data = loader.load()
data
```
</details> ### Dependencies none ### Twitter handle [kzk_maeda](https://twitter.com/kzk_maeda)	2024-03-05 15:51:24 -08:00
Max Jakob	81e9ab6e3a	docs: Update elasticsearch README (#18497 ) Update Elasticsearch README with information on how to start a deployment. Also make some cosmetic changes to the [Elasticsearch docs](https://python.langchain.com/docs/integrations/vectorstores/elasticsearch). Follow up on https://github.com/langchain-ai/langchain/pull/17467	2024-03-05 15:49:16 -08:00
Hech	6a08134661	community[patch], langchain[minor]: Add retriever self_query and score_threshold in DingoDB (#18106 )	2024-03-05 15:47:29 -08:00
Mikhail Khludnev	d039dcb6ba	nvidia-trt[patch]: add TritonTensorRTLLM(verbose_client=False) (#16848 ) - Description: adding verbose flag to TritonTensorRTLLM, - Issue: nope, - Dependencies: not any, - Twitter handle:	2024-03-05 15:44:13 -08:00
Asaf Joseph Gardin	27441555d0	ai21[patch]: AI21 Labs Contextual Answers support (#18270 ) Description: Added support for AI21 Labs model - Contextual Answers Dependencies: ai21, ai21-tokenizer Twitter handle: https://github.com/AI21Labs --------- Co-authored-by: Asaf Gardin <asafg@ai21.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-05 22:42:04 +00:00
Erick Friis	e169ee8863	anthropic[patch]: handle lists in function calling (#18609 )	2024-03-05 14:19:40 -08:00
Erick Friis	1831733c2e	anthropic[patch]: fix argument integration test (#18605 )	2024-03-05 13:05:25 -08:00
Yudhajit Sinha	4570b477b9	community[patch]: Invoke callback prior to yielding token (titan_takeoff) (#18560 ) ## PR title community[patch]: Invoke callback prior to yielding token ## PR message - Description: Invoke callback prior to yielding token in _stream_ method in llms/titan_takeoff. - Issue: #16913 - Dependencies: None	2024-03-05 12:54:26 -08:00
Tomaz Bratanic	ea51cdaede	Remove neo4j bloom labels from graph schema (#18564 ) Neo4j tools use particular node labels and relationship types to store metadata, but these are irrelevant for text2cypher or graph generation, so we want to ignore them in the schema representation.	2024-03-05 12:54:05 -08:00
Erick Friis	e1924b3e93	core[patch]: deprecate hwchase17/langchain-hub, address path traversal (#18600 ) Deprecates the old langchain-hub repository. Does not deprecate the new https://smith.langchain.com/hub @PinkDraconian has correctly raised that in the event someone is loading unsanitized user input into the `try_load_from_hub` function, they have the ability to load files from other locations in github than the hwchase17/langchain-hub repository. This PR adds some more path checking to that function and deprecates the functionality in favor of the hub built into LangSmith.	2024-03-05 12:49:38 -08:00
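The path traversal concern in the commit above comes down to rejecting any user-supplied path that can escape the intended directory. A minimal sketch of such a check, assuming a hypothetical helper `is_safe_hub_path` and an illustrative root directory `prompts`; this is not the check the PR adds, only an example of the general technique:

```python
from pathlib import PurePosixPath


def is_safe_hub_path(path: str, allowed_root: str = "prompts") -> bool:
    """Return True if `path` stays inside `allowed_root` (hypothetical helper)."""
    parts = PurePosixPath(path).parts
    # Reject empty paths, absolute paths, and any parent-directory segment.
    if not parts or path.startswith("/") or ".." in parts:
        return False
    # The first segment must be the directory we intend to serve from.
    return parts[0] == allowed_root


ok = is_safe_hub_path("prompts/hello-world/prompt.yaml")
escaped = is_safe_hub_path("prompts/../../other/repo/file.json")
```

`PurePosixPath` does not collapse `..` segments, so they remain visible in `parts` and can be rejected explicitly rather than silently resolved.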
Jib	fc35262356	langchain-mongodb: add unit tests for MongoDBChatMessageHistory (#18599 ) ## Description Adding in Unit Test variation for `MongoDBChatMessageHistory` package Follow-up to #18590 - [x] Add tests and docs: Unit test is what's being added - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-03-05 11:44:31 -08:00
Erick Friis	48e303ea10	airbyte[patch]: release 0.1.1, python 3.9 compat (#18597 )	2024-03-05 19:22:08 +00:00
Jib	9da1e0cf34	mongodb[patch]: Migrate MongoDBChatMessageHistory (#18590 ) ## Description Migrate the `MongoDBChatMessageHistory` to the managed `langchain-mongodb` partner-package ## Dependencies None ## Twitter handle @mongodb ## tests and docs - [x] Migrate existing integration test - [x ]~ Convert existing integration test to a unit test~ Creation is out of scope for this ticket - [x ] ~Considering delaying work until #17470 merges to leverage the `MockCollection` object. ~ - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-05 18:53:02 +00:00
Jib	f92f7d2e03	mongodb[minor]: Add MongoDB LLM Cache (#17470 ) # Description - Description: Adding MongoDB LLM Caching Layer abstraction - Issue: N/A - Dependencies: None - Twitter handle: @mongodb --------- Co-authored-by: Jib <jib@byblack.us>	2024-03-05 10:38:39 -08:00
Tomaz Bratanic	353248838d	Add precedence for input params over env variables in neo4j integration (#18581 ) input parameters take precedence over env variables	2024-03-05 09:36:56 -08:00
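The precedence rule in the commit above (explicit input parameters win over environment variables) can be sketched as follows. The helper name `resolve_setting` and the `NEO4J_URI` key are illustrative assumptions, not the integration's actual code:

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    explicit: Optional[str],
    env_key: str,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Prefer an explicitly passed value; fall back to the environment."""
    if explicit is not None:
        return explicit
    return env.get(env_key)


# A plain dict stands in for the process environment in this example.
fake_env = {"NEO4J_URI": "bolt://from-env:7687"}
from_param = resolve_setting("bolt://from-param:7687", "NEO4J_URI", fake_env)
from_env = resolve_setting(None, "NEO4J_URI", fake_env)
```

Here `from_param` keeps the value passed by the caller even though the environment also defines one, and `from_env` falls back to the environment only because nothing was passed.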
Christophe Bornet	c8a171a154	community: Implement lazy_load() for GithubFileLoader (#18584 )	2024-03-05 09:35:50 -08:00
Leonid Kuligin	04d134df17	marked MatchingEngine as deprecated (#18585 ) Description: deprecates `vectorstores.MatchingEngine` in community, since this integration has been moved to `langchain_google_vertexai`.	2024-03-05 09:34:53 -08:00
Erick Friis	4ac2cb4adc	anthropic[minor]: add tool calling (#18554 )	2024-03-05 08:30:16 -08:00
Bagatur	5fc67ca2c7	langchain[patch]: Release 0.1.11 (#18558 )	2024-03-04 23:58:34 -08:00
Erick Friis	68c1878380	anthropic[patch]: model type string (#18510 )	2024-03-04 19:25:19 -08:00
Erick Friis	25c7d52140	anthropic[patch]: multimodal (#18517 ) - anthropic[minor]: claude 3 - x - x --------- Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>	2024-03-04 17:50:13 -08:00
Erick Friis	343438e872	community[patch]: deprecate community fireworks (#18544 )	2024-03-05 01:04:26 +00:00
William FH	ca1d42785d	Evals wording (#18542 )	2024-03-04 16:32:33 -08:00
Bagatur	dd07eddf24	core[patch]: Release 0.1.29 (#18530 )	2024-03-04 14:37:08 -08:00
William FH	30ccc009e6	[Evals] Support list examples by dataset version tag (#18534 ) previously only supported by timestamp	2024-03-04 14:23:32 -08:00
aditya thomas	5c387a173f	docs: update to docstrings of ChatAnthropic class (#18493 ) Description: Update docstrings of ChatAnthropic class Issue: Change to ChatAnthropic from ChatAnthropicMessages Dependencies: None Lint and test: `make format`, `make lint` and `make test` passed	2024-03-04 10:44:54 -08:00
Erick Friis	24f9c700f2	anthropic[minor]: claude 3 (#18508 )	2024-03-04 15:03:51 +00:00
William FH	1eec67e8fe	Evaluate on Version (#18471 )	2024-03-03 17:47:35 -08:00
Harrison Chase	73d653324f	[Evals] Session-level feedback (#18463 ) Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>	2024-03-03 17:18:29 -08:00
Scott Nath	b051bba1a9	community: Add you.com tool, add async to retriever, add async testing, add You tool doc (#18032 ) - Description: finishes adding the you.com functionality including: - add async functions to utility and retriever - add the You.com Tool - add async testing for utility, retriever, and tool - add a tool integration notebook page - Dependencies: any dependencies required for this change - Twitter handle: @scottnath	2024-03-03 14:30:05 -08:00
mackong	b89d9fc177	langchain[patch]: add tools renderer for various non-openai agents (#18307 ) - Description: add `tools_renderer` for various non-OpenAI agents, so that tools can be rendered in different ways for your LLM. - Issue: N/A - Dependencies: N/A --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-03-03 14:25:12 -08:00
William De Vena	a63cee04ac	nvidia-trt[patch]: Invoke callback prior to yielding token (#18446 ) ## PR title nvidia-trt[patch]: Invoke callback prior to yielding ## PR message - Description: Invoke on_llm_new_token callback prior to yielding token in _stream method. - Issue: https://github.com/langchain-ai/langchain/issues/16913 - Dependencies: None	2024-03-03 14:15:11 -08:00
William De Vena	275877980e	community[patch]: Invoke callback prior to yielding token (#18447 ) ## PR title community[patch]: Invoke callback prior to yielding token ## PR message Description: Invoke callback prior to yielding token in _stream method in llms/vertexai. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None	2024-03-03 14:14:40 -08:00
William De Vena	67375e96e0	community[patch]: Invoke callback prior to yielding token (#18448 ) ## PR title community[patch]: Invoke callback prior to yielding token ## PR message - Description: Invoke callback prior to yielding token in _stream method in llms/tongyi. - Issue: https://github.com/langchain-ai/langchain/issues/16913 - Dependencies: None	2024-03-03 14:14:22 -08:00
William De Vena	2087cbae64	community[patch]: Invoke callback prior to yielding token (#18449 ) ## PR title community[patch]: Invoke callback prior to yielding token ## PR message - Description: Invoke callback prior to yielding token in _stream method in chat_models/perplexity. - Issue: https://github.com/langchain-ai/langchain/issues/16913 - Dependencies: None	2024-03-03 14:14:00 -08:00
William De Vena	eb04d0d3e2	community[patch]: Invoke callback prior to yielding token (#18452 ) ## PR title community[patch]: Invoke callback prior to yielding token ## PR message - Description: Invoke callback prior to yielding token in _stream and _astream methods in llms/anthropic. - Issue: https://github.com/langchain-ai/langchain/issues/16913 - Dependencies: None	2024-03-03 14:13:41 -08:00
William De Vena	371bec79bc	community[patch]: Invoke callback prior to yielding token (#18454 ) ## PR title community[patch]: Invoke callback prior to yielding token ## PR message - Description: Invoke callback prior to yielding token in _stream and _astream methods in llms/baidu_qianfan_endpoint. - Issue: https://github.com/langchain-ai/langchain/issues/16913 - Dependencies: None	2024-03-03 14:13:22 -08:00
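The "invoke callback prior to yielding token" commits above all reorder two statements inside a streaming generator. A minimal sketch of why the order matters, assuming a hypothetical `stream_tokens` function and a plain callable in place of the `on_llm_new_token` callback:

```python
from typing import Callable, Iterable, Iterator, List


def stream_tokens(
    tokens: Iterable[str],
    on_new_token: Callable[[str], None],
) -> Iterator[str]:
    """Yield tokens, notifying the callback before each yield."""
    for token in tokens:
        # Callback first: if the consumer stops iterating after this
        # token, the callback has still seen it.
        on_new_token(token)
        yield token


seen: List[str] = []
stream = stream_tokens(["Hel", "lo", "!"], seen.append)
first = next(stream)  # the consumer takes one token and stops
stream.close()
```

With the callback placed after the `yield`, `seen` would still be empty at this point, because a generator only resumes past `yield` when the consumer asks for the next item.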
Aayush Kataria	7c2f3f6f95	community[minor]: Adding Azure Cosmos Mongo vCore Vector DB Cache (#16856 ) Description: This pull request introduces several enhancements for Azure Cosmos Vector DB, primarily focused on improving caching and search capabilities using Azure Cosmos MongoDB vCore Vector DB. Here's a summary of the changes: - AzureCosmosDBSemanticCache: Added a new cache implementation called AzureCosmosDBSemanticCache, which utilizes Azure Cosmos MongoDB vCore Vector DB for efficient caching of semantic data. Added comprehensive test cases for AzureCosmosDBSemanticCache to ensure its correctness and robustness. These tests cover various scenarios and edge cases to validate the cache's behavior. - HNSW Vector Search: Added HNSW vector search functionality in the CosmosDB Vector Search module. This enhancement enables more efficient and accurate vector searches by utilizing the HNSW (Hierarchical Navigable Small World) algorithm. Added corresponding test cases to validate the HNSW vector search functionality in both AzureCosmosDBSemanticCache and AzureCosmosDBVectorSearch. These tests ensure the correctness and performance of the HNSW search algorithm. - LLM Caching Notebook - The notebook now includes a comprehensive example showcasing the usage of the AzureCosmosDBSemanticCache. This example highlights how the cache can be employed to efficiently store and retrieve semantic data. Additionally, the example provides default values for all parameters used within the AzureCosmosDBSemanticCache, ensuring clarity and ease of understanding for users who are new to the cache implementation. @hwchase17,@baskaryan, @eyurtsev,	2024-03-03 14:04:15 -08:00
Erick Friis	f96dd57501	langchain[patch]: release 0.1.10 (#18410 )	2024-03-02 01:48:57 +00:00
Erick Friis	1fd1ac8e95	community[patch]: release 0.0.25 (#18408 )	2024-03-02 00:56:04 +00:00
Sourav Pradhan	50abeb7ed9	community[patch]: fix Chroma add_images (#17964 ) ### Description Fixed a small bug in chroma.py `add_images()`: previously, when no metadata was passed, the stored documents contained the base64 encoding of the given URIs, but when metadata was passed, the documents contained the plain string URIs, which should not be the case. ### Issue In the `add_images()` method, the call to `upsert()` must use `b64_texts` instead of the plain string `uris`. ### Twitter handle https://twitter.com/whitepegasus01	2024-03-01 21:55:58 +00:00
Kate Silverstein	b7c71e2e07	community[minor]: llamafile embeddings support (#17976 ) * Description: adds `LlamafileEmbeddings` class implementation for generating embeddings using [llamafile](https://github.com/Mozilla-Ocho/llamafile)-based models. Includes related unit tests and notebook showing example usage. * Issue: N/A * Dependencies: N/A	2024-03-01 13:49:18 -08:00
Mateusz Szewczyk	9298a0b941	langchain_ibm[patch] update docstring, dependencies, tests (#18386 ) - Description: Update docstring, dependencies, tests, README - Dependencies: [ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/), - Tag maintainer: : Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally -> ✅ Please make sure integration_tests passing locally -> ✅ --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-01 21:01:53 +00:00
Jib	c2b1abe91b	mongodb[patch]: Set delete_many only if count_documents is not 0 (#18402 ) - Description: Remove the assert statement on the `count_documents` in setup_class. It should just delete if there are documents present - Issue: Crashes on class setup - Dependencies: None - Twitter handle: @mongodb Co-authored-by: Jib <jib@byblack.us>	2024-03-01 13:01:28 -08:00
Tomaz Bratanic	f6bfb969ba	community[patch]: Add an option for indexed generic label when import neo4j graph documents (#18122 ) Current implementation doesn't have an indexed property that would optimize the import. I have added a `baseEntityLabel` parameter that allows you to add a secondary node label, which has an indexed id `property`. By default, the behaviour is identical to previous version. Since multi-labeled nodes are terrible for text2cypher, I removed the secondary label from schema representation object and string, which is used in text2cypher.	2024-03-01 12:33:52 -08:00
Arun Sathiya	4adac20d7b	community[patch]: Make cohere_api_key a SecretStr (#12188 ) This PR makes `cohere_api_key` in `llms/cohere` a SecretStr, so that the API Key is not leaked when `Cohere.cohere_api_key` is represented as a string. --------- Signed-off-by: Arun <arun@arun.blog> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-03-01 20:27:53 +00:00
Petteri Johansson	6c1989d292	community[minor], langchain[minor], docs: Gremlin Graph Store and QA Chain (#17683 ) - Description: New feature: Gremlin graph-store and QA chain (including docs). Compatible with Azure CosmosDB. - Dependencies: no changes	2024-03-01 12:21:14 -08:00
Ather Fawaz	a5ccf5d33c	community[minor]: Add support for Perplexity chat model(#17024 ) - Description: This PR adds support for [Perplexity AI APIs](https://blog.perplexity.ai/blog/introducing-pplx-api). - Issues: None - Dependencies: None - Twitter handle: [@atherfawaz](https://twitter.com/AtherFawaz) --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-01 12:19:23 -08:00
Rodrigo Nogueira	3438d2cbcc	community[minor]: add maritalk chat (#17675 ) Description: Adds the MariTalk chat that is based on a LLM specially trained for Portuguese. Twitter handle: @MaritacaAI	2024-03-01 12:18:23 -08:00
sarahberenji	08fa38d56d	community[patch]: the syntax error for Redis generated query (#17717 ) To fix the reported error: https://github.com/langchain-ai/langchain/discussions/17397	2024-03-01 12:18:10 -08:00
certified-dodo	43e3244573	community[patch]: Fix MongoDBAtlasVectorSearch max_marginal_relevance_search (#17971 ) Description: * `self._embedding_key` is accessed after deletion, breaking `max_marginal_relevance_search` search * Introduced in: `e135e5257c` * Updated but still persists in: `ce22e10c4b` Issue: https://github.com/langchain-ai/langchain/issues/17963 Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-01 12:17:42 -08:00
Nikita Titov	9f2ab37162	community[patch]: don't try to parse json in case of errored response (#18317 ) Related issue: #13896. When Ollama is behind a proxy, proxy error responses cannot be viewed. You can't even check the response code. For example, if your Ollama uses basic access authentication and the credentials are not passed, `JSONDecodeError` will overwrite the true response error. <details> <summary><b>Log now:</b></summary> ``` { "name": "JSONDecodeError", "message": "Expecting value: line 1 column 1 (char 0)", "stack": "--------------------------------------------------------------------------- JSONDecodeError Traceback (most recent call last) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/requests/models.py:971, in Response.json(self, kwargs) 970 try: --> 971 return complexjson.loads(self.text, kwargs) 972 except JSONDecodeError as e: 973 # Catch JSON-related errors and raise as requests.JSONDecodeError 974 # This aliases json.JSONDecodeError and simplejson.JSONDecodeError File /opt/miniforge3/envs/.gpt/lib/python3.10/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, kw) 343 if (cls is None and object_hook is None and 344 parse_int is None and parse_float is None and 345 parse_constant is None and object_pairs_hook is None and not kw): --> 346 return _default_decoder.decode(s) 347 if cls is None: File /opt/miniforge3/envs/.gpt/lib/python3.10/json/decoder.py:337, in JSONDecoder.decode(self, s, _w) 333 \"\"\"Return the Python representation of ``s`` (a ``str`` instance 334 containing a JSON document). 
335 336 \"\"\" --> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end()) 338 end = _w(s, end).end() File /opt/miniforge3/envs/.gpt/lib/python3.10/json/decoder.py:355, in JSONDecoder.raw_decode(self, s, idx) 354 except StopIteration as err: --> 355 raise JSONDecodeError(\"Expecting value\", s, err.value) from None 356 return obj, end JSONDecodeError: Expecting value: line 1 column 1 (char 0) During handling of the above exception, another exception occurred: JSONDecodeError Traceback (most recent call last) Cell In[3], line 1 ----> 1 print(translate_func().invoke('text')) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/runnables/base.py:2053, in RunnableSequence.invoke(self, input, config) 2051 try: 2052 for i, step in enumerate(self.steps): -> 2053 input = step.invoke( 2054 input, 2055 # mark each step as a child run 2056 patch_config( 2057 config, callbacks=run_manager.get_child(f\"seq:step:{i+1}\") 2058 ), 2059 ) 2060 # finish the root run 2061 except BaseException as e: File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:165, in BaseChatModel.invoke(self, input, config, stop, kwargs) 154 def invoke( 155 self, 156 input: LanguageModelInput, (...) 160 kwargs: Any, 161 ) -> BaseMessage: 162 config = ensure_config(config) 163 return cast( 164 ChatGeneration, --> 165 self.generate_prompt( 166 [self._convert_input(input)], 167 stop=stop, 168 callbacks=config.get(\"callbacks\"), 169 tags=config.get(\"tags\"), 170 metadata=config.get(\"metadata\"), 171 run_name=config.get(\"run_name\"), 172 kwargs, 173 ).generations[0][0], 174 ).message File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:543, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, kwargs) 535 def generate_prompt( 536 self, 537 prompts: List[PromptValue], (...) 
540 kwargs: Any, 541 ) -> LLMResult: 542 prompt_messages = [p.to_messages() for p in prompts] --> 543 return self.generate(prompt_messages, stop=stop, callbacks=callbacks, kwargs) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:407, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, kwargs) 405 if run_managers: 406 run_managers[i].on_llm_error(e, response=LLMResult(generations=[])) --> 407 raise e 408 flattened_outputs = [ 409 LLMResult(generations=[res.generations], llm_output=res.llm_output) 410 for res in results 411 ] 412 llm_output = self._combine_llm_outputs([res.llm_output for res in results]) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:397, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, kwargs) 394 for i, m in enumerate(messages): 395 try: 396 results.append( --> 397 self._generate_with_cache( 398 m, 399 stop=stop, 400 run_manager=run_managers[i] if run_managers else None, 401 kwargs, 402 ) 403 ) 404 except BaseException as e: 405 if run_managers: File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:576, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, kwargs) 572 raise ValueError( 573 \"Asked to cache, but no cache found at `langchain.cache`.\" 574 ) 575 if new_arg_supported: --> 576 return self._generate( 577 messages, stop=stop, run_manager=run_manager, kwargs 578 ) 579 else: 580 return self._generate(messages, stop=stop, kwargs) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:250, in ChatOllama._generate(self, messages, stop, run_manager, kwargs) 226 def _generate( 227 self, 228 messages: List[BaseMessage], (...) 231 kwargs: Any, 232 ) -> ChatResult: 233 \"\"\"Call out to Ollama's generate endpoint. 234 235 Args: (...) 
247 ]) 248 \"\"\" --> 250 final_chunk = self._chat_stream_with_aggregation( 251 messages, 252 stop=stop, 253 run_manager=run_manager, 254 verbose=self.verbose, 255 kwargs, 256 ) 257 chat_generation = ChatGeneration( 258 message=AIMessage(content=final_chunk.text), 259 generation_info=final_chunk.generation_info, 260 ) 261 return ChatResult(generations=[chat_generation]) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:183, in ChatOllama._chat_stream_with_aggregation(self, messages, stop, run_manager, verbose, kwargs) 174 def _chat_stream_with_aggregation( 175 self, 176 messages: List[BaseMessage], (...) 180 kwargs: Any, 181 ) -> ChatGenerationChunk: 182 final_chunk: Optional[ChatGenerationChunk] = None --> 183 for stream_resp in self._create_chat_stream(messages, stop, kwargs): 184 if stream_resp: 185 chunk = _chat_stream_response_to_chat_generation_chunk(stream_resp) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:156, in ChatOllama._create_chat_stream(self, messages, stop, kwargs) 147 def _create_chat_stream( 148 self, 149 messages: List[BaseMessage], 150 stop: Optional[List[str]] = None, 151 kwargs: Any, 152 ) -> Iterator[str]: 153 payload = { 154 \"messages\": self._convert_messages_to_ollama_messages(messages), 155 } --> 156 yield from self._create_stream( 157 payload=payload, stop=stop, api_url=f\"{self.base_url}/api/chat/\", kwargs 158 ) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/llms/ollama.py:234, in _OllamaCommon._create_stream(self, api_url, payload, stop, kwargs) 228 raise OllamaEndpointNotFoundError( 229 \"Ollama call failed with status code 404. 
\" 230 \"Maybe your model is not found \" 231 f\"and you should pull the model with `ollama pull {self.model}`.\" 232 ) 233 else: --> 234 optional_detail = response.json().get(\"error\") 235 raise ValueError( 236 f\"Ollama call failed with status code {response.status_code}.\" 237 f\" Details: {optional_detail}\" 238 ) 239 return response.iter_lines(decode_unicode=True) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/requests/models.py:975, in Response.json(self, kwargs) 971 return complexjson.loads(self.text, kwargs) 972 except JSONDecodeError as e: 973 # Catch JSON-related errors and raise as requests.JSONDecodeError 974 # This aliases json.JSONDecodeError and simplejson.JSONDecodeError --> 975 raise RequestsJSONDecodeError(e.msg, e.doc, e.pos) JSONDecodeError: Expecting value: line 1 column 1 (char 0)" } ``` </details> <details> <summary><b>Log after a fix:</b></summary> ``` { "name": "ValueError", "message": "Ollama call failed with status code 401. Details: <html>\r <head><title>401 Authorization Required</title></head>\r <body>\r <center><h1>401 Authorization Required</h1></center>\r <hr><center>nginx/1.18.0 (Ubuntu)</center>\r </body>\r </html>\r ", "stack": "--------------------------------------------------------------------------- ValueError Traceback (most recent call last) Cell In[2], line 1 ----> 1 print(translate_func().invoke('text')) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/runnables/base.py:2053, in RunnableSequence.invoke(self, input, config) 2051 try: 2052 for i, step in enumerate(self.steps): -> 2053 input = step.invoke( 2054 input, 2055 # mark each step as a child run 2056 patch_config( 2057 config, callbacks=run_manager.get_child(f\"seq:step:{i+1}\") 2058 ), 2059 ) 2060 # finish the root run 2061 except BaseException as e: File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:165, in BaseChatModel.invoke(self, input, config, stop, kwargs) 154 def 
invoke( 155 self, 156 input: LanguageModelInput, (...) 160 kwargs: Any, 161 ) -> BaseMessage: 162 config = ensure_config(config) 163 return cast( 164 ChatGeneration, --> 165 self.generate_prompt( 166 [self._convert_input(input)], 167 stop=stop, 168 callbacks=config.get(\"callbacks\"), 169 tags=config.get(\"tags\"), 170 metadata=config.get(\"metadata\"), 171 run_name=config.get(\"run_name\"), 172 kwargs, 173 ).generations[0][0], 174 ).message File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:543, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, kwargs) 535 def generate_prompt( 536 self, 537 prompts: List[PromptValue], (...) 540 kwargs: Any, 541 ) -> LLMResult: 542 prompt_messages = [p.to_messages() for p in prompts] --> 543 return self.generate(prompt_messages, stop=stop, callbacks=callbacks, kwargs) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:407, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, kwargs) 405 if run_managers: 406 run_managers[i].on_llm_error(e, response=LLMResult(generations=[])) --> 407 raise e 408 flattened_outputs = [ 409 LLMResult(generations=[res.generations], llm_output=res.llm_output) 410 for res in results 411 ] 412 llm_output = self._combine_llm_outputs([res.llm_output for res in results]) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:397, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, kwargs) 394 for i, m in enumerate(messages): 395 try: 396 results.append( --> 397 self._generate_with_cache( 398 m, 399 stop=stop, 400 run_manager=run_managers[i] if run_managers else None, 401 kwargs, 402 ) 403 ) 404 except BaseException as e: 405 if run_managers: File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:576, in 
BaseChatModel._generate_with_cache(self, messages, stop, run_manager, kwargs) 572 raise ValueError( 573 \"Asked to cache, but no cache found at `langchain.cache`.\" 574 ) 575 if new_arg_supported: --> 576 return self._generate( 577 messages, stop=stop, run_manager=run_manager, kwargs 578 ) 579 else: 580 return self._generate(messages, stop=stop, kwargs) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:250, in ChatOllama._generate(self, messages, stop, run_manager, kwargs) 226 def _generate( 227 self, 228 messages: List[BaseMessage], (...) 231 kwargs: Any, 232 ) -> ChatResult: 233 \"\"\"Call out to Ollama's generate endpoint. 234 235 Args: (...) 247 ]) 248 \"\"\" --> 250 final_chunk = self._chat_stream_with_aggregation( 251 messages, 252 stop=stop, 253 run_manager=run_manager, 254 verbose=self.verbose, 255 kwargs, 256 ) 257 chat_generation = ChatGeneration( 258 message=AIMessage(content=final_chunk.text), 259 generation_info=final_chunk.generation_info, 260 ) 261 return ChatResult(generations=[chat_generation]) File /storage/gpt-project/Repos/repo_nikita/gpt_lib/langchain/ollama.py:328, in ChatOllamaCustom._chat_stream_with_aggregation(self, messages, stop, run_manager, verbose, kwargs) 319 def _chat_stream_with_aggregation( 320 self, 321 messages: List[BaseMessage], (...) 
325 kwargs: Any, 326 ) -> ChatGenerationChunk: 327 final_chunk: Optional[ChatGenerationChunk] = None --> 328 for stream_resp in self._create_chat_stream(messages, stop, kwargs): 329 if stream_resp: 330 chunk = _chat_stream_response_to_chat_generation_chunk(stream_resp) File /storage/gpt-project/Repos/repo_nikita/gpt_lib/langchain/ollama.py:301, in ChatOllamaCustom._create_chat_stream(self, messages, stop, kwargs) 292 def _create_chat_stream( 293 self, 294 messages: List[BaseMessage], 295 stop: Optional[List[str]] = None, 296 kwargs: Any, 297 ) -> Iterator[str]: 298 payload = { 299 \"messages\": self._convert_messages_to_ollama_messages(messages), 300 } --> 301 yield from self._create_stream( 302 payload=payload, stop=stop, api_url=f\"{self.base_url}/api/chat\", kwargs 303 ) File /storage/gpt-project/Repos/repo_nikita/gpt_lib/langchain/ollama.py:134, in _OllamaCommonCustom._create_stream(self, api_url, payload, stop, **kwargs) 132 else: 133 optional_detail = response.text --> 134 raise ValueError( 135 f\"Ollama call failed with status code {response.status_code}.\" 136 f\" Details: {optional_detail}\" 137 ) 138 return response.iter_lines(decode_unicode=True) ValueError: Ollama call failed with status code 401. Details: <html>\r <head><title>401 Authorization Required</title></head>\r <body>\r <center><h1>401 Authorization Required</h1></center>\r <hr><center>nginx/1.18.0 (Ubuntu)</center>\r </body>\r </html>\r " } ``` </details> The same is true for timeout errors or when you simply mistyped in `base_url` arg and get response from some other service, for instance. Real Ollama errors are still clearly readable: ``` ValueError: Ollama call failed with status code 400. Details: {"error":"invalid options: unknown_option"} ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-01 12:17:29 -08:00
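The two logs in the Ollama commit above differ in one step: the failing version parses the error body as JSON, while the fixed version reports the raw response text. A minimal sketch of that idea follows, using only the standard library. The helper names `body_is_json` and `format_failure` are hypothetical and are not part of LangChain or Ollama; the message format is copied from the logs above.

```python
import json


def body_is_json(body: str) -> bool:
    """Return True if the body parses as JSON (hypothetical helper)."""
    try:
        json.loads(body)
    except json.JSONDecodeError:
        # An HTML error page from a proxy, or an empty body, ends up here.
        return False
    return True


def format_failure(status_code: int, body: str) -> str:
    """Build the error message from the raw body text, with no JSON parsing."""
    return f"Ollama call failed with status code {status_code}. Details: {body}"


html_401 = "<html><head><title>401 Authorization Required</title></head></html>"
ollama_400 = '{"error":"invalid options: unknown_option"}'

print(body_is_json(html_401))           # the proxy's HTML page is not JSON
print(format_failure(401, html_401))    # still readable, no second exception
print(format_failure(400, ollama_400))  # genuine Ollama errors stay readable
```

Reporting the raw text means a non-JSON body can no longer raise a second exception that hides the status code.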
Yudhajit Sinha	e2b901c35b	community[patch]: chat message history mypy fix (#18250 ) Description: Fixed `type: ignore`s for mypy in chat_message_histories (streamlit). Addresses #17048. Planning to add more based on reviews.	2024-03-01 12:17:18 -08:00
Gabriel Altay	b9416dc96a	docs: update pinecone README to use PineconeVectorStore (#18170 )	2024-03-01 12:12:52 -08:00
Hemslo Wang	58a2abf089	community[patch]: fix RecursiveUrlLoader metadata_extractor return type (#18193 ) Description: Fix `metadata_extractor` type for `RecursiveUrlLoader`, the default `_metadata_extractor` returns `dict` instead of `str`. Issue: N/A Dependencies: N/A Twitter handle: N/A Signed-off-by: Hemslo Wang <hemslo.wang@gmail.com>	2024-03-01 12:08:20 -08:00
Maxime Perrin	98380cff9b	community[patch]: removing "response_mode" parameter in llama_index retriever (#18180 ) - Description: Changing this line ```python response = index.query(query, response_mode="no_text", **self.query_kwargs) ``` to ```python response = index.query(query, **self.query_kwargs) ``` since the llama index query no longer supports response_mode: ``` TypeError: BaseQueryEngine.query() got an unexpected keyword argument 'response_mode' ``` - Twitter handle: @maximeperrin_ --------- Co-authored-by: Maxime Perrin <mperrin@doing.fr>	2024-03-01 12:05:09 -08:00
Christophe Bornet	177f51c7bd	community: Use default load() implementation in doc loaders (#18385 ) Following https://github.com/langchain-ai/langchain/pull/18289	2024-03-01 14:46:52 -05:00
William De Vena	42341bc787	infra: fake model invoke callback prior to yielding token (#18286 ) ## PR title core[patch]: Invoke callback prior to yielding ## PR message Description: Invoke on_llm_new_token callback prior to yielding token in _stream and _astream methods. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None Twitter handle: None	2024-03-01 11:46:18 -08:00
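The ordering described in the commit above (run the callback, then yield the token) can be sketched with a plain generator. This is an illustration only: `stream_tokens` and `on_new_token` are hypothetical stand-ins for the `_stream` methods and the `on_llm_new_token` callback, not LangChain's actual code.

```python
from typing import Callable, Iterable, Iterator, List


def stream_tokens(
    tokens: Iterable[str], on_new_token: Callable[[str], None]
) -> Iterator[str]:
    """Yield tokens, notifying the callback before each one is handed out."""
    for token in tokens:
        on_new_token(token)  # callback runs first
        yield token          # only then does the consumer see the token


events: List[str] = []

for tok in stream_tokens(["a", "b"], lambda t: events.append(f"callback:{t}")):
    events.append(f"consumed:{tok}")

print(events)
```

With this order, a consumer that stops iterating after a token has still triggered the callback for that token. If the callback came after the `yield`, it would be skipped for the last token consumed.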
mwmajewsk	e192f6b6eb	community[patch]: fix, better error message in deeplake vectoriser (#18397 ) If the document loader receives a `pathlib.Path` instead of a str, it reads the file correctly, but the problem begins when the document is added to Deeplake. This problem arises from casting the path to str in the metadata. ```python deeplake = True fname = Path('./lorem_ipsum.txt') loader = TextLoader(fname, encoding="utf-8") docs = loader.load_and_split() text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100) chunks = text_splitter.split_documents(docs) if deeplake: db = DeepLake(dataset_path=ds_path, embedding=embeddings, token=activeloop_token) db.add_documents(chunks) else: db = Chroma.from_documents(docs, embeddings) ``` So using this snippet of code the error message for deeplake looks like this: ``` [part of error message omitted] Traceback (most recent call last): File "/home/mwm/repositories/sources/fixing_langchain/main.py", line 53, in <module> db.add_documents(chunks) File "/home/mwm/repositories/sources/langchain/libs/core/langchain_core/vectorstores.py", line 139, in add_documents return self.add_texts(texts, metadatas, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/deeplake.py", line 258, in add_texts return self.vectorstore.add( ^^^^^^^^^^^^^^^^^^^^^ File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/deeplake_vectorstore.py", line 226, in add return self.dataset_handler.add( ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/dataset_handlers/client_side_dataset_handler.py", line 139, in add dataset_utils.extend_or_ingest_dataset( File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/vector_search/dataset/dataset.py", line 544, in extend_or_ingest_dataset extend( File 
"/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/vector_search/dataset/dataset.py", line 505, in extend dataset.extend(batched_processed_tensors, progressbar=False) File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/dataset/dataset.py", line 3247, in extend raise SampleExtendError(str(e)) from e.__cause__ deeplake.util.exceptions.SampleExtendError: Failed to append a sample to the tensor 'metadata'. See more details in the traceback. If you wish to skip the samples that cause errors, please specify `ignore_errors=True`. ``` which does not explain the error well enough. The same error for Chroma looks like this: ``` During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/home/mwm/repositories/sources/fixing_langchain/main.py", line 56, in <module> db = Chroma.from_documents(docs, embeddings) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 778, in from_documents return cls.from_texts( ^^^^^^^^^^^^^^^ File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 736, in from_texts chroma_collection.add_texts( File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 309, in add_texts raise ValueError(e.args[0] + "\n\n" + msg) ValueError: Expected metadata value to be a str, int, float or bool, got lorem_ipsum.txt which is a <class 'pathlib.PosixPath'> Try filtering complex metadata from the document using langchain_community.vectorstores.utils.filter_complex_metadata. 
``` This is far more user-friendly, so I added information about a possible type mismatch to the error message, the same way it is covered in Chroma: https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/chroma.py#L224	2024-03-01 11:21:21 -08:00
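The type mismatch discussed in the Deep Lake commit above can be caught before documents reach the vector store. The sketch below is a caller-side workaround, not the change made in the commit. The helper names are hypothetical, and the allowed types are borrowed from the Chroma message quoted above; whether Deep Lake accepts exactly this set is an assumption.

```python
from pathlib import Path, PurePath
from typing import Any, Dict

# Assumption: same scalar types the quoted Chroma error message lists.
ALLOWED_TYPES = (str, int, float, bool)


def find_unsupported_metadata(metadata: Dict[str, Any]) -> Dict[str, str]:
    """Map each offending key to the type name of its value."""
    return {
        key: type(value).__name__
        for key, value in metadata.items()
        if not isinstance(value, ALLOWED_TYPES)
    }


def stringify_paths(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the metadata with pathlib paths converted to str."""
    return {
        key: str(value) if isinstance(value, PurePath) else value
        for key, value in metadata.items()
    }


metadata = {"source": Path("./lorem_ipsum.txt"), "page": 1}
print(find_unsupported_metadata(metadata))  # names the key holding a Path
print(stringify_paths(metadata))            # safe to store after conversion
```

Passing `str(fname)` to the loader in the first place avoids the issue without any helper.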
Daniel Chico	7d962278f6	community[patch]: type ignore fixes (#18395 ) Related to #17048	2024-03-01 11:21:02 -08:00
Christophe Bornet	69be82c86d	community[patch]: Implement lazy_load() for CSVLoader (#18391 ) Covered by `test_csv_loader.py`	2024-03-01 11:17:08 -08:00
Bagatur	c54d6eb5da	fireworks[patch]: support "any" tool_choice (#18343 ) per https://readme.fireworks.ai/docs/function-calling	2024-03-01 11:12:28 -08:00
Erick Friis	6afb135baa	astradb: move to langchain-datastax repo (#18354 )	2024-03-01 19:04:43 +00:00
Guangdong Liu	760a16ff32	community[patch]: Fix ChatModel for sparkllm Bug. (#18375 ) - Description: fix sparkllm parameter error - Issue: close #18370 - Dependencies: change `IFLYTEK_SPARK_APP_URL` to `IFLYTEK_SPARK_API_URL` - Twitter handle: No	2024-03-01 10:49:30 -08:00
Yujie Qian	cbb65741a7	community[patch]: Voyage AI updates default model and batch size (#17655 ) - Description: update the default model and batch size in VoyageEmbeddings - Issue: N/A - Dependencies: N/A - Twitter handle: N/A --------- Co-authored-by: fodizoltan <zoltan@conway.expert>	2024-03-01 10:22:24 -08:00
Shengsheng Huang	ae471a7dcb	community[minor]: add BigDL-LLM integrations (#17953 ) - Description: [`bigdl-llm`](https://github.com/intel-analytics/BigDL) is a library for running LLM on Intel XPU (from Laptop to GPU to Cloud) using INT4/FP4/INT8/FP8 with very low latency (for any PyTorch model). This PR adds bigdl-llm integrations to langchain. - Issue: NA - Dependencies: `bigdl-llm` library - Contribution maintainer: @shane-huang Examples added: - docs/docs/integrations/llms/bigdl.ipynb	2024-03-01 10:04:53 -08:00
Ethan Yang	f61cb8d407	community[minor]: Add openvino backend support (#11591 ) - Description: add openvino backend support by HuggingFace Optimum Intel, - Dependencies: “optimum[openvino]”, --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-01 10:04:24 -08:00
Leonid Ganeline	a89f007947	docs: `runnable` module description (#17966 ) Added a module description. Added `batch` description.	2024-03-01 10:01:32 -08:00
RadhikaBansal97	8bafd2df5e	community[patch]: Change github endpoint in GithubLoader (#17622 ) Description: - Changed the GitHub endpoint, as the existing one was not working and returned a 404 Not Found error. - The existing function also failed when file_filter was not passed: the tree API returns all paths, including directories, and when get_file_content iterated over these paths it failed on directories, because for a directory the API returns the list of files inside it. Added a condition to skip paths that are directories. - Fixes this issue: https://github.com/langchain-ai/langchain/issues/17453 Co-authored-by: Radhika Bansal <Radhika.Bansal@veritas.com>	2024-03-01 09:36:31 -08:00
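The directory check described in the GithubLoader commit above can be sketched as a filter over a tree listing. GitHub's Git Trees API marks each entry with a `type` of `blob` (file) or `tree` (directory). The function name and the sample payload below are made up for illustration and are not the loader's actual code.

```python
from typing import Any, Dict, List


def file_paths(tree_response: Dict[str, Any]) -> List[str]:
    """Keep only file entries ("blob"); skip directories ("tree")."""
    return [
        entry["path"]
        for entry in tree_response.get("tree", [])
        if entry.get("type") == "blob"
    ]


# Hypothetical payload in the shape of a recursive tree listing.
sample = {
    "tree": [
        {"path": "docs", "type": "tree"},
        {"path": "docs/index.md", "type": "blob"},
        {"path": "README.md", "type": "blob"},
    ]
}

print(file_paths(sample))
```

Only the paths returned here would then be passed on to fetch file contents.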
Yufei (Benny) Chen	2b93206f02	fireworks[patch]: Fix fireworks async stream (#18372 ) - Description: Fix the async stream issue with Fireworks - Dependencies: fireworks >= 0.13.0 ``` tests/integration_tests/test_chat_models.py .......... [ 45%] tests/integration_tests/test_compile.py . [ 50%] tests/integration_tests/test_embeddings.py .. [ 59%] tests/integration_tests/test_llms.py ......... [100%] ``` ``` tests/unit_tests/test_embeddings.py . [ 16%] tests/unit_tests/test_imports.py . [ 33%] tests/unit_tests/test_llms.py .... [100%] ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-01 09:20:26 -08:00
William FH	1deb8cadd5	Add dataset version info (#18299 )	2024-02-29 22:00:44 -08:00
Anush	9d663f31fa	community[patch]: FastEmbed to latest (#18040 ) ## Description Updates the `langchain_community.embeddings.fastembed` provider as per the recent updates to [`FastEmbed`](https://github.com/qdrant/fastembed) library.	2024-02-29 21:15:51 -08:00
Erick Friis	3c8a115e21	fireworks[patch]: remove custom async and stream implementations (#18363 )	2024-03-01 03:20:02 +00:00
Bagatur	f220af3dce	docs: text splitters readme (#18359 )	2024-03-01 03:00:42 +00:00
Bagatur	0d7fb5f60a	langchain[patch]: langchain-text-splitters dep (#18357 )	2024-02-29 18:48:55 -08:00
Eugene Yurtsev	51b661cfe8	community[patch]: BaseLoader load method should just delegate to lazy_load (#18289 ) load() should just reference lazy_load()	2024-02-29 21:45:28 -05:00
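The delegation described in the commit above ("load() should just reference lazy_load()") can be sketched with a small base class. `SketchLoader` and `LinesLoader` are hypothetical stand-ins that yield plain strings; they are not LangChain's `BaseLoader` or its `Document` type.

```python
from typing import Iterator, List


class SketchLoader:
    """Hypothetical base class: subclasses implement lazy_load() only."""

    def lazy_load(self) -> Iterator[str]:
        raise NotImplementedError

    def load(self) -> List[str]:
        # Eager loading is just the lazy iterator, fully consumed.
        return list(self.lazy_load())


class LinesLoader(SketchLoader):
    """Yields one item per line of the given text."""

    def __init__(self, text: str) -> None:
        self.text = text

    def lazy_load(self) -> Iterator[str]:
        for line in self.text.splitlines():
            yield line


loader = LinesLoader("first\nsecond")
print(loader.load())
```

A subclass that implements only `lazy_load()` gets a working `load()` for free, which is presumably why the later doc-loader commits can drop their own `load()` implementations.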
Bagatur	5efb5c099f	text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346 )	2024-02-29 18:33:21 -08:00
Nuno Campos	7891934173	Fix missing labels (#18356 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17.	2024-02-29 18:11:18 -08:00
William FH	fdab931fd3	[Core] Patch: rm dumpd of outputs from runnables/base (#18295 ) It obstructs evaluations when you return a pydantic object.	2024-02-29 18:04:53 -08:00
William FH	f481cbb32d	fireworks[patch]: Fix fireworks bind tools (#18352 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-01 01:18:15 +00:00
Erick Friis	eefb49680f	multiple[patch]: fix deprecation versions (#18349 )	2024-02-29 16:58:33 -08:00
Erick Friis	11cb42c2c1	core[patch]: deprecation docstring with lib (#18350 )	2024-03-01 00:44:13 +00:00
Erick Friis	7bbff98dc7	mongodb[patch]: core 0.1.5 dep (#18348 )	2024-02-29 15:39:04 -08:00
Jib	72bfc1d3db	mongodb[minor]: MongoDB Partner Package -- Porting MongoDBAtlasVectorSearch (#17652 ) This PR migrates the existing MongoDBAtlasVectorSearch abstraction from the `langchain_community` section to the partners package section of the codebase. - [x] Run the partner package script as advised in the partner-packages documentation. - [x] Add Unit Tests - [x] Migrate Integration Tests - [x] Refactor `MongoDBAtlasVectorStore` (autogenerated) to `MongoDBAtlasVectorSearch` - [x] ~Remove~ deprecate the old `langchain_community` VectorStore references. ## Additional Callouts - Implemented the `delete` method - Included any missing async function implementations - `amax_marginal_relevance_search_by_vector` - `adelete` - Added new Unit Tests that test for functionality of `MongoDBVectorSearch` methods - Removed [`del res[self._embedding_key]`](`e0c81e1cb0/libs/community/langchain_community/vectorstores/mongodb_atlas.py (L218)`) in `_similarity_search_with_score` function as it would make the `maximal_marginal_relevance` function fail otherwise. The `Document` needs to store the embedding key in metadata to work. Checklist: - [x] PR title: Please title your PR "package: description", where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message - [x] Pass lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified to check that you're passing lint and testing. See contribution guidelines for more information on how to write/run tests, lint, etc: https://python.langchain.com/docs/contributing/ - [x] Add tests and docs: If you're adding a new integration, please include 1. Existing tests supplied in docs/docs do not change. Updated docstrings for new functions like `delete` 2. an example notebook showing its use. 
It lives in `docs/docs/integrations` directory. (This already exists) If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Steven Silvester <steven.silvester@ieee.org> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-29 23:09:48 +00:00
William De Vena	412148773c	Updated partners/fireworks README (#18267 ) ## PR title partners: changed the README file for the Fireworks integration in the libs/partners/fireworks folder ## PR message Description: Changed the README file of partners/fireworks following the docs on https://python.langchain.com/docs/integrations/llms/Fireworks The README includes: - Brief description - Installation - Setting-up instructions (API key, model id, ...) - Basic usage Issue: https://github.com/langchain-ai/langchain/issues/17545 Dependencies: None Twitter handle: None	2024-02-29 14:55:03 -08:00
Kai Kugler	df234fb171	community[patch]: Fixing embedchain document mapping (#18255 ) - Description: The current embedchain implementation seems to handle document metadata differently than done in the current implementation of langchain and a KeyError is thrown. I would love for someone else to test this... --------- Co-authored-by: KKUGLER <kai.kugler@mercedes-benz.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Deshraj Yadav <deshraj@gatech.edu>	2024-02-29 14:54:37 -08:00
Erick Friis	040271f33a	community[patch]: remove llmlingua extended tests (#18344 )	2024-02-29 13:51:29 -08:00
William De Vena	87dca8e477	Updated partners/ibm README (#18268 ) ## PR title partners: changed the README file for the IBM Watson AI integration in the libs/partners/ibm folder. ## PR message Description: Changed the README file of partners/ibm following the docs on https://python.langchain.com/docs/integrations/llms/ibm_watsonx The README includes: - Brief description - Installation - Setting-up instructions (API key, project id, ...) - Basic usage: - Loading the model - Direct inference - Chain invoking - Streaming the model output Issue: https://github.com/langchain-ai/langchain/issues/17545 Dependencies: None Twitter handle: None --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>	2024-02-29 13:29:28 -08:00
Bagatur	9e46535ebc	core[patch]: Release 0.1.28 (#18341 )	2024-02-29 13:03:13 -08:00
Tomaz Bratanic	5999c4a240	Add support for parameters in neo4j retrieval query (#18310 ) Sometimes, you want to use various parameters in the retrieval query of Neo4j Vector to personalize/customize results. Before, when there were only predefined chains, it didn't really make sense. Now that it's all about custom chains and LCEL, it is worth adding since users can inject any params they wish at query time. This isn't prone to SQL-injection-type attacks, since we use parameters rather than concatenating strings.	2024-02-29 13:00:54 -08:00
Hasan	15d1b73a00	Add optional output_parser param in create_react_agent (#18320 ) Description: Add facility to pass the optional output parser to customize the parsing logic --------- Co-authored-by: hasan <hasan@m2sys.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-29 12:35:43 -08:00
Bagatur	a6f0506aaf	docs: query analysis use case (#17766 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-29 12:33:49 -08:00
kkdamowang	6782dac420	docs: remove duplicate quote in AzureOpenAIEmbeddings doc (#18315 ) - Description: Remove duplicate quote in AzureOpenAIEmbeddings doc, remove trailing spaces. - Issue: No - Dependencies: No	2024-02-29 11:25:50 -08:00
Virat Singh	cd926ac3dd	community: Add PolygonFinancials Tool (#18324 ) Description: In this PR, I am adding a `PolygonFinancials` tool, which can be used to get financials data for a given ticker. The financials data is the fundamental data that is found in income statements, balance sheets, and cash flow statements of public US companies. Twitter: [@virattt](https://twitter.com/virattt)	2024-02-29 10:56:05 -08:00
Bagatur	68ad3414a2	experimental[patch]: Release 0.0.53 (#18330 )	2024-02-29 09:13:21 -08:00
William FH	8af4425abd	[Evaluation] Config Fix (#18231 )	2024-02-29 00:06:46 -08:00
William De Vena	0486404a74	langchain_openai[patch]: Invoke callback prior to yielding token (#18269 ) ## PR title langchain_openai[patch]: Invoke callback prior to yielding token ## PR message Description: Invoke callback prior to yielding token in _stream and _astream methods for langchain_openai. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None Twitter handle: None	2024-02-29 00:00:08 +00:00
William De Vena	5ee76fccd5	langchain_groq[patch]: Invoke callback prior to yielding token (#18272 ) ## PR title langchain_groq[patch]: Invoke callback prior to yielding ## PR message Description:Invoke callback prior to yielding token in _stream and _astream methods for groq. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None Twitter handle: None	2024-02-28 23:43:16 +00:00
Christophe Bornet	8a81fcd5d3	community: Fix deprecation version of AstraDB VectorStore (#17991 )	2024-02-28 17:15:09 -05:00
Stefano Lottini	6d863bed51	partner[minor]: Astra DB clients identify themselves as coming through LangChain package (#18131 ) Description This PR sets the "caller identity" of the Astra DB clients used by the integration plugins (`AstraDBChatMessageHistory`, `AstraDBStore`, `AstraDBByteStore` and, pending #17767 , `AstraDBVectorStore`). In this way, the requests to the Astra DB Data API coming from within LangChain are identified as such (the purpose is anonymous usage stats to best improve the Astra DB service).	2024-02-28 17:13:22 -05:00
mackong	2c42f3a955	ollama[patch]: delete suffix slash to avoid redirect (#18260 ) - Description: see [ollama](https://github.com/ollama/ollama/blob/main/server/routes.go#L949)'s route definitions - Issue: N/A - Dependencies: N/A	2024-02-28 16:44:48 -05:00
William De Vena	6b58943917	community[patch]: Invoke callback prior to yielding token (#18288 ) ## PR title community[patch]: Invoke callback prior to yielding PR message Description: Invoke on_llm_new_token callback prior to yielding token in _stream and _astream methods. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None Twitter handle: None	2024-02-28 21:40:53 +00:00
William De Vena	23722e3653	langchain[patch]: Invoke callback prior to yielding token (#18282 ) ## PR title langchain[patch]: Invoke callback prior to yielding ## PR message Description: Invoke on_llm_new_token callback prior to yielding token in _stream and _astream methods in langchain/tests/fake_chat_model. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None Twitter handle: None	2024-02-28 16:15:02 -05:00
Eugene Yurtsev	cd52433ba0	community[minor]: Add `SQLDatabaseLoader` document loader (#18281 ) - Description: A generic document loader adapter for SQLAlchemy on top of LangChain's `SQLDatabaseLoader`. - Needed by: https://github.com/crate-workbench/langchain/pull/1 - Depends on: GH-16655 - Addressed to: @baskaryan, @cbornet, @eyurtsev Hi from CrateDB again, in the same spirit like GH-16243 and GH-16244, this patch breaks out another commit from https://github.com/crate-workbench/langchain/pull/1, in order to reduce the size of this patch before submitting it, and to separate concerns. To accompany the SQLAlchemy adapter implementation, the patch includes integration tests for both SQLite and PostgreSQL. Let me know if corresponding utility resources should be added at different spots. With kind regards, Andreas. ### Software Tests ```console docker compose --file libs/community/tests/integration_tests/document_loaders/docker-compose/postgresql.yml up ``` ```console cd libs/community pip install psycopg2-binary pytest -vvv tests/integration_tests -k sqldatabase ``` ``` 14 passed ``` ![image](https://github.com/langchain-ai/langchain/assets/453543/42be233c-eb37-4c76-a830-474276e01436) --------- Co-authored-by: Andreas Motl <andreas.motl@crate.io>	2024-02-28 21:02:28 +00:00
William De Vena	a37dc83a9e	langchain_anthropic[patch]: Invoke callback prior to yielding token (#18274 ) ## PR title langchain_anthropic[patch]: Invoke callback prior to yielding ## PR message - Description: Invoke callback prior to yielding token in _stream and _astream methods for anthropic. - Issue: https://github.com/langchain-ai/langchain/issues/16913 - Dependencies: None - Twitter handle: None	2024-02-28 20:19:22 +00:00
David Ruan	af35e2525a	community[minor]: add hugging_face_model document loader (#17323 ) - Description: add hugging_face_model document loader, - Issue: NA, - Dependencies: NA, --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-28 20:05:35 +00:00
Sanjaypranav V M	b9a495e56e	community[patch]: added latin-1 decoder to gmail search tool (#18116 ) Some emails from Flipkart and Amazon use a different plain-text encoding, so to handle the resulting UnicodeDecodeError, added an exception handler and a latin-1 decoder. @hwchase17	2024-02-28 19:28:29 +00:00
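The fallback described in the Gmail commit above can be sketched as a try/except around the decode step. `decode_body` is a hypothetical helper, and trying UTF-8 first is an assumption about the original code path.

```python
def decode_body(raw: bytes) -> str:
    """Decode as UTF-8, falling back to latin-1 when the bytes are not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte value to a character, so this cannot fail,
        # although the result may be wrong if the real encoding was something else.
        return raw.decode("latin-1")


print(decode_body("café".encode("utf-8")))
print(decode_body("café".encode("latin-1")))
```

The fallback trades possible mis-decoding for never raising, which suits a search tool that only needs readable text.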
Nuno Campos	6da08d0f22	Add PNG drawer for Runnable.get_graph() (#18239 )	2024-02-28 11:25:19 -08:00
Nuno Campos	d9fd1194f5	Remove check preventing passing non-declared config keys (#18276 )	2024-02-28 18:28:53 +00:00
William De Vena	7ac74f291e	langchain_nvidia_ai_endpoints[patch]: Invoke callback prior to yielding token (#18271 ) ## PR title langchain_nvidia_ai_endpoints[patch]: Invoke callback prior to yielding ## PR message Description: Invoke callback prior to yielding token in _stream and _astream methods for nvidia_ai_endpoints. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None	2024-02-28 18:10:57 +00:00
Ashley Xu	e3211c2b3d	community[patch]: BigQueryVectorSearch JSON type unsupported for metadatas (#18234 )	2024-02-28 08:19:53 -08:00
Mateusz Szewczyk	db643f6283	ibm[patch]: release 0.1.0 Add possibility to pass ModelInference or Model object to WatsonxLLM class (#18189 ) - Description: Add possibility to pass ModelInference or Model object to WatsonxLLM class - Dependencies: [ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/), - Tag maintainer: : Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. ✅	2024-02-28 07:03:15 -08:00
Erick Friis	d7a77054ed	airbyte[patch]: core version 0.1.5 (#18244 )	2024-02-27 19:54:43 -08:00
Erick Friis	be8d2ff5f7	airbyte[patch]: init pkg (#18236 )	2024-02-27 19:37:53 -08:00
Ayo Ayibiowu	ac1d7d9de8	community[feat]: Adds LLMLingua as a document compressor (#17711 ) Description: This PR adds support for using the [LLMLingua project ](https://github.com/microsoft/LLMLingua) especially the LongLLMLingua (Enhancing Large Language Model Inference via Prompt Compression) as a document compressor / transformer. The LLMLingua project is an interesting project that can greatly improve RAG system by compressing prompts and contexts while keeping their semantic relevance. Issue: https://github.com/microsoft/LLMLingua/issues/31 Dependencies: [llmlingua](https://pypi.org/project/llmlingua/) @baskaryan --------- Co-authored-by: Ayodeji Ayibiowu <ayodeji.ayibiowu@getinge.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-27 19:23:56 -08:00
Nuno Campos	a99eb3abf4	openai[patch]: Assign message id in ChatOpenAI (#17837 )	2024-02-27 17:32:54 -08:00
Isaac Francisco	733367b795	docs: deprecation of OpenAI functions agent, astream_events docstring (#18164 ) Co-authored-by: Hershenson, Isaac (Extern) <isaac.hershenson.extern@bayer04.de> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-27 09:14:53 -08:00
Bagatur	242af4b5a4	openai[patch], mistral[patch], fireworks[patch]: releases 0.0.8, 0.0.5, 0.0.2 (#18186 )	2024-02-27 04:22:24 -08:00
Bagatur	7e66d964c6	core[patch]: Release 0.1.27 (#18159 )	2024-02-26 17:27:38 -08:00
Harrison Chase	d7c607ca00	core[minor]: move document compressor base (#17910 )	2024-02-26 17:20:50 -08:00
Bagatur	b3f4de38ae	mistral[minor]: Function calling and with_structured_output (#18150 ) ![Screenshot 2024-02-26 at 2 07 06 PM](https://github.com/langchain-ai/langchain/assets/22008038/20cacb47-3b24-45b5-871b-dd169f1acd37)	2024-02-26 16:22:30 -08:00
Bagatur	c53aa5cd37	core[patch]: support JS message serial namespaces (#18151 )	2024-02-26 16:19:46 -08:00
Max Jakob	5ab69f907f	partners: add Elasticsearch package (#17467 ) ### Description This PR moves the Elasticsearch classes to a partners package. Note that we will not move (and later remove) `ElasticKnnSearch`. It was previously deprecated. `ElasticVectorSearch` is going to stay in the community package since it is still used quite a lot. Also note that I left the `ElasticsearchTranslator` for self query untouched because it resides in the main `langchain` package. ### Dependencies There will be another PR that updates the notebooks (potentially pulling them into the partners package) and templates and removes the classes from the community package, see https://github.com/langchain-ai/langchain/pull/17468 #### Open question How to make the transition smooth for users? Do we move the import aliases and require people to install `langchain-elasticsearch`? Or do we remove the import aliases from the `langchain` package altogether? What has worked well for other partner packages? --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-26 23:19:47 +00:00
matt haigh	a4896da2a0	Experimental: Add other threshold types to SemanticChunker (#16807 ) Description Adding different threshold types to the semantic chunker. I’ve had much better and more predictable performance when using standard deviations instead of percentiles. ![image](https://github.com/langchain-ai/langchain/assets/44395485/066e84a8-460e-4da5-9fa1-4ff79a1941c5) For all the documents I’ve tried, the distribution of distances looks similar to the above: a positively skewed normal distribution. All skews I’ve seen are less than 1, which explains why standard deviations perform well, but I’ve included IQR if anyone wants something more robust. Also, using the percentile method backwards, you can declare the number of clusters and use semantic chunking to get an ‘optimal’ splitting. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-26 13:50:48 -08:00
Jaskirat Singh	ce682f5a09	community: vectorstores.kdbai - Added support for when no docs are present (#18103 ) - Description: By default it expects a list, but that is not the case in corner scenarios when no document has been ingested (use case: bootstrapping an application). Hence added a check: if the instance is a pandas DataFrame instead of a list, it returns immediately. - Issue: NA - Dependencies: NA - Twitter handle: jaskiratsingh1 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-26 12:47:06 -08:00
am-kinetica	9b8f6455b1	Langchain vectorstore integration with Kinetica (#18102 ) - Description: New vectorstore integration with the Kinetica database - Issue: - Dependencies: the Kinetica Python API `pip install gpudb==7.2.0.1`, - Tag maintainer: @baskaryan, @hwchase17 - Twitter handle: --------- Co-authored-by: Chad Juliano <cjuliano@kinetica.com>	2024-02-26 12:46:48 -08:00
Bagatur	1e8ab83d7b	langchain[patch], core[patch], openai[patch], fireworks[minor]: ChatFireworks.with_structured_output (#18078 ) <img width="1192" alt="Screenshot 2024-02-24 at 3 39 39 PM" src="https://github.com/langchain-ai/langchain/assets/22008038/1cf74774-a23f-4b06-9b9b-85dfa2f75b63">	2024-02-26 12:46:39 -08:00
GoodBai	3589a135ef	community: make `SET allow_experimental_[engine]_index` configurable in vectorstores.clickhouse (#18107 ) ## Description & Issue While following the official doc to use clickhouse as a vectorstore, I found only the default `annoy` index is properly supported. But I want to try another engine, `usearch`, because `annoy` is not properly supported on ARM platforms. Here are the settings I prefer: ``` python settings = ClickhouseSettings( table="wiki_Ethereum", index_type="usearch", # annoy by default index_param=[], ) ``` The above settings do not work because the command `set allow_experimental_annoy_index=1` is hard-coded. This PR makes sure the experimental feature follows the `index_type`, which is also consistent with Clickhouse's naming conventions.	2024-02-26 12:39:17 -08:00
Dan Stambler	69344a0661	community: Add Laser Embedding Integration (#18111 ) - Description: Added Integration with Meta AI's LASER Language-Agnostic SEntence Representations embedding library, which supports multilingual embedding for any of the languages listed here: https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200, including several low resource languages - Dependencies: laser_encoders	2024-02-26 12:16:37 -08:00
Luan Fernandes	e867557936	[docs] Update doc-string for buffer_as_messages method in ConversationBufferWindowMemory (#18136 ) minor fix stated in #18080	2024-02-26 11:46:43 -08:00
Bagatur	767523f364	core[patch], langchain[patch], templates: move openai functions parsers to core (#18060 ) ![Screenshot 2024-02-23 at 7 48 03 PM](https://github.com/langchain-ai/langchain/assets/22008038/e5540c4d-0020-4ece-869f-ae19db2a1f3f)	2024-02-26 11:12:53 -08:00
Nuno Campos	cd3ab3703b	Improve runnable generator error messages (#18142 ) h/t @hinthornw	2024-02-26 18:54:25 +00:00
Nuno Campos	62a30efb12	Fix bug with using configurable_fields after configurable_alternatives (#18139 ) Closes #17915	2024-02-26 10:27:07 -08:00
Nuno Campos	b1d9ce541d	Add BaseMessage.id (#17835 )	2024-02-26 09:27:47 -08:00
Harrison Chase	935aefa8db	add run name for query constructor (#18101 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-26 08:17:05 -08:00
Mohammad Mohtashim	719a1cde75	langchain[patch]: Update doc-string for a method in ConversationBufferWindowMemory (#18090 ) A minor doc fix stated in #18080	2024-02-26 10:15:02 -05:00
Simon Schmidt	2716d58603	langchain: Import from langchain_core in langchain.smith to avoid deprecation warning (#18129 ) Avoids deprecation warning that triggered at import time, e.g. with `python -c 'import langchain.smith'` /opt/venv/lib/python3.12/site-packages/langchain/callbacks/__init__.py:37: LangChainDeprecationWarning: Importing this callback from langchain is deprecated. Importing it from langchain will no longer be supported as of langchain==0.2.0. Please import from langchain-community instead: `from langchain_community.callbacks import base`. To install langchain-community run `pip install -U langchain-community`.	2024-02-26 10:14:10 -05:00
Erick Friis	248c5b84ee	google-genai, google-vertexai: move to langchain-google (#17899 ) These packages have moved to https://github.com/langchain-ai/langchain-google Left tombstone readmes in case anyone ends up at the "Source Code" link from old pypi releases. Can keep these around for a few months.	2024-02-25 21:58:05 -08:00
Erick Friis	3b5bdbfee8	anthropic[minor]: package move (#17974 )	2024-02-25 21:57:26 -08:00
Christophe Bornet	a2d5fa7649	community[patch]: Fix GenericRequestsWrapper _aget_resp_content must be async (#18065 ) There are existing tests in `libs/community/tests/unit_tests/tools/requests/test_tool.py`	2024-02-25 19:07:07 -08:00
Neli Hateva	a01e8473f8	community[patch]: Fix GraphSparqlQAChain so that it works with Ontotext GraphDB (#15009 ) - Description: Introduce a new parameter `graph_kwargs` to `RdfGraph` - parameters used to initialize the `rdflib.Graph` if `query_endpoint` is set. Also, do not set `rdflib.graph.DATASET_DEFAULT_GRAPH_ID` as default value for the `rdflib.Graph` `identifier` if `query_endpoint` is set. - Issue: N/A - Dependencies: N/A - Twitter handle: N/A	2024-02-25 19:05:21 -08:00
Christophe Bornet	4d6cd5b46a	astradb[patch]: Use astrapy's upsert_one method in AstraDBStore (#18063 ) As `upsert` is deprecated	2024-02-25 19:04:18 -08:00
Danny McAteer	e42110f720	docs: Additional examples for partners/exa README (#18081 ) Description: Add additional examples for other modules to partners/exa README Issue: #17545 Dependencies: None Twitter handle: @DannyMcAteer8 --------- Co-authored-by: Daniel McAteer <danielmcateer@Daniels-MBP.attlocal.net> Co-authored-by: Daniel McAteer <danielmcateer@Daniels-MacBook-Pro.local>	2024-02-25 18:53:47 -08:00
dokato	5afb242161	langchain[patch]: Make BooleanOutputParser more robust to non-binary responses (#17810 ) - Description: I encountered this error when I tried to use LLMChainFilter. Even if the message slightly differs, like `Not relevant (NO)` this results in an error. It has been reported already here: https://github.com/langchain-ai/langchain/issues/. This change hopefully makes it more robust. - Issue: #11408 - Dependencies: No - Twitter handle: dokatox	2024-02-25 18:48:33 -08:00
kYLe	17ecf6e119	community[patch]: Remove model limitation on Anyscale LLM (#17662 ) Description: Llama Guard is deprecated from the Anyscale public endpoint. Issue: Change the default model and remove the limitation of only using Llama Guard with Anyscale LLMs. Anyscale LLM can also work with all other chat models hosted on Anyscale. Also added `async_client` for Anyscale LLM	2024-02-25 18:21:19 -08:00
Barun Amalkumar Halder	cc69976860	community[minor] : adds callback handler for Fiddler AI (#17708 ) Description: Callback handler to integrate fiddler with langchain. This PR adds the following - 1. `FiddlerCallbackHandler` implementation into langchain/community 2. Example notebook `fiddler.ipynb` for usage documentation [Internal Tracker : FDL-14305] Issue: NA Dependencies: - Installation of langchain-community is unaffected. - Usage of FiddlerCallbackHandler requires installation of latest fiddler-client (2.5+) Twitter handle: @fiddlerlabs @behalder Co-authored-by: Barun Halder <barun@fiddler.ai>	2024-02-25 18:17:03 -08:00
Christophe Bornet	b8b5ce0c8c	astradb: Add AstraDBChatMessageHistory to langchain-astradb package (#17732 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-25 18:14:49 -08:00
Maxime Perrin	c06a8732aa	community[patch]: fix llama index imports and fields access (#17870 ) - Description: Fixing outdated imports after v0.10 llama index update and updating metadata and source text access - Issue: #17860 - Twitter handle: @maximeperrin_ --------- Co-authored-by: Maxime Perrin <mperrin@doing.fr>	2024-02-25 18:14:23 -08:00
2jimoo	7fc903464a	community: Add document manager and mongo document manager (#17320 ) - Description: - Add DocumentManager class, which is a nosql record manager. - In order to use index and aindex in libs/langchain/langchain/indexes/_api.py, DocumentManager inherits RecordManager. - Also I added the MongoDB implementation of Document Manager too. - Dependencies: pymongo, motor --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-23 21:32:52 -05:00
Leonid Ganeline	3f6bf852ea	experimental: docstrings update (#18048 ) Added missing docstrings. Formatted docstrings to a consistent format.	2024-02-23 21:24:16 -05:00
kYLe	56b955fc31	community[minor]: Add async_client for Anyscale Chat model (#18050 ) Add `async_client` for Anyscale Chat_model	2024-02-23 21:22:54 -05:00
Eugene Yurtsev	68527b809d	core[patch]: Runnable with message history to use add_messages (#17958 ) This PR updates RunnableWithMessageHistory to use add_messages which will save on round-trips for any chat history abstractions that implement the optimization. If the optimization isn't implemented, add_messages automatically invokes add_message serially.	2024-02-23 21:19:38 -05:00
Bagatur	1c1bb1152e	openai[patch]: refactor with_structured_output (#18052 ) - make schema Optional with default val None, since in json_mode you don't need it if not parsing to pydantic - change return_type -> include_raw - expand docstring examples	2024-02-23 17:02:11 -08:00
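
The SemanticChunker commit above (a4896da2a0) names three ways to pick a breakpoint threshold over the distances between consecutive sentences: percentile, standard deviation, and IQR. The following is a minimal stdlib-only sketch of that idea, not LangChain's implementation. The function name `breakpoint_indices`, the `kind` strings, and the default `amount` values are all assumptions chosen for illustration.

```python
import statistics


def breakpoint_indices(distances, kind="percentile", amount=None):
    """Return indices whose distance exceeds a threshold (hypothetical helper).

    `distances` holds the distance between each pair of consecutive sentences.
    The `kind` names and default amounts are illustrative assumptions.
    """
    if kind == "percentile":
        amount = 95 if amount is None else amount
        # quantiles(n=100) returns 99 cut points; cut point p sits at index p - 1.
        cuts = statistics.quantiles(distances, n=100, method="inclusive")
        threshold = cuts[int(amount) - 1]
    elif kind == "standard_deviation":
        amount = 3 if amount is None else amount
        # Mean plus `amount` population standard deviations.
        threshold = statistics.mean(distances) + amount * statistics.pstdev(distances)
    elif kind == "interquartile":
        amount = 1.5 if amount is None else amount
        # Mean plus `amount` times the interquartile range.
        q1, _, q3 = statistics.quantiles(distances, n=4, method="inclusive")
        threshold = statistics.mean(distances) + amount * (q3 - q1)
    else:
        raise ValueError(f"unknown threshold kind: {kind!r}")
    return [i for i, d in enumerate(distances) if d > threshold]


# One large jump at position 3 stands out under each rule.
distances = [0.1, 0.2, 0.1, 0.9, 0.15, 0.2]
print(breakpoint_indices(distances, "standard_deviation", amount=1))
print(breakpoint_indices(distances, "interquartile"))
print(breakpoint_indices(distances, "percentile", amount=95))
```

Each returned index marks a gap where a chunk boundary would be placed; a lower `amount` yields more, smaller chunks.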

3467 Commits