langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-10 01:10:59 +00:00

Author	SHA1	Message	Date
Piyush Jain	2b234a4d96	Support for claude v3 models. (#18630 ) Fixes #18513. ## Description This PR attempts to fix the support for Anthropic Claude v3 models in BedrockChat LLM. The changes here update the payload to use the `messages` format instead of the formatted text prompt for all models; the `messages` API is backwards compatible with all models in Anthropic, so this should not break the experience for any models. ## Notes The PR in its current form does not support the v3 models for the non-chat Bedrock LLM. This means that, with these changes, users won't be able to use the v3 models with the Bedrock LLM. I can open a separate PR to tackle this use-case; the intent here was to get this out quickly, so users can start using and testing the chat LLM. The Bedrock LLM classes have also grown complex with a lot of conditions to support various providers and models, and are ripe for a refactor to make future changes more palatable. This refactor is likely to take longer, and requires more thorough testing from the community. Credit to PRs [18579](https://github.com/langchain-ai/langchain/pull/18579) and [18548](https://github.com/langchain-ai/langchain/pull/18548) for some of the code here. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-06 15:46:18 -08:00
Sam Khano	1b4dcf22f3	community[minor]: Add DocumentDBVectorSearch VectorStore (#17757 ) Description: - Added Amazon DocumentDB Vector Search integration (HNSW index) - Added integration tests - Updated AWS documentation with DocumentDB Vector Search instructions - Added notebook for DocumentDB integration with example usage --------- Co-authored-by: EC2 Default User <ec2-user@ip-172-31-95-226.ec2.internal>	2024-03-06 15:11:34 -08:00
Vittorio Rigamonti	51f3902bc4	community[minor]: Adding support for Infinispan as VectorStore (#17861 ) Description: This integrates Infinispan as a vectorstore. Infinispan is an open-source key-value data grid; it can work as a single node as well as distributed. Vector search is supported since release 15.x. For more: [Infinispan Home](https://infinispan.org) Integration tests are provided as well as a demo notebook.	2024-03-06 15:11:02 -08:00
Max Jakob	cca0167917	elasticsearch[patch], community[patch]: update references, deprecate community classes (#18506 ) Follow up on https://github.com/langchain-ai/langchain/pull/17467. - Update all references to the Elasticsearch classes to use the partners package. - Deprecate community classes. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-06 15:09:12 -08:00
Djordje	12b4a4d860	community[patch]: Opensearch delete method added - indexing supported (#18522 ) - Description: Added a delete method for OpenSearchVectorSearch, so indexing is now supported - Issue: No - Dependencies: No - Twitter handle: stkbmf	2024-03-06 15:08:47 -08:00
Erick Friis	687d27567d	openai[patch]: unit test azure init (#18703 )	2024-03-06 14:17:09 -08:00
Christophe Bornet	db8db6faae	community: Implement lazy_load() for PlaywrightURLLoader (#18676 ) Integration tests: `tests/integration_tests/document_loaders/test_url_playwright.py`	2024-03-06 16:52:13 -05:00
Aaron Yi	c092db862e	community[patch]: make metadata and text optional as expected in DocArray (#18678 ) ``` ValidationError: 2 validation errors for DocArrayDoc text Field required [type=missing, input_value={'embedding': [-0.0191128...9, 0.01005221541175212]}, input_type=dict] For further information visit https://errors.pydantic.dev/2.5/v/missing metadata Field required [type=missing, input_value={'embedding': [-0.0191128...9, 0.01005221541175212]}, input_type=dict] For further information visit https://errors.pydantic.dev/2.5/v/missing ``` In the `_get_doc_cls` method, the `DocArrayDoc` class is defined as follows: ```python class DocArrayDoc(BaseDoc): text: Optional[str] embedding: Optional[NdArray] = Field(**embeddings_params) metadata: Optional[dict] ```	2024-03-06 16:51:41 -05:00
Eugene Yurtsev	4c25b49229	community[major]: breaking change in some APIs to force users to opt-in for pickling (#18696 ) This PR adds a dangerous load parameter that forces users to opt in to using pickle. It is meant to raise user awareness that the pickle module is involved.	2024-03-06 16:43:01 -05:00
Eugene Yurtsev	0e52961562	community[patch]: Patch tdidf retriever (CVE-2024-2057) (#18695 ) This is a patch for `CVE-2024-2057`: https://www.cve.org/CVERecord?id=CVE-2024-2057 This affects users that: * Use the `TFIDFRetriever` * Attempt to de-serialize it from an untrusted source that contains a malicious payload	2024-03-06 15:49:04 -05:00
Erick Friis	2619420df1	mongodb[patch]: release 0.1.1 (#18692 )	2024-03-06 19:44:14 +00:00
Christophe Bornet	ea141511d8	core: Move document loader interfaces to core (#17723 ) This is needed to be able to move document loaders to partner packages. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-03-06 13:59:00 -05:00
Christophe Bornet	5985454269	Merge pull request #18539 * Implement lazy_load() for GitLoader	2024-03-06 13:25:14 -05:00
Christophe Bornet	9a6f7e213b	Merge pull request #18423 * Implement lazy_load() for BSHTMLLoader	2024-03-06 13:25:01 -05:00
Christophe Bornet	b3a0c44838	Merge pull request #18673 * Implement lazy_load() for PDFMinerPDFasHTMLLoader and PyMuPDFLoader	2024-03-06 13:24:36 -05:00
Christophe Bornet	68fc0cf909	Merge pull request #18674 * Implement lazy_load() for TextLoader	2024-03-06 13:23:42 -05:00
Christophe Bornet	5b92f962f1	Merge pull request #18671 * Implement lazy_load() for MastodonTootsLoader	2024-03-06 13:23:14 -05:00
Christophe Bornet	15b1770326	Merge pull request #18421 * Implement lazy_load() for AssemblyAIAudioTranscriptLoader	2024-03-06 13:16:05 -05:00
Christophe Bornet	bb284eebe4	Merge pull request #18436 * Implement lazy_load() for ConfluenceLoader	2024-03-06 13:15:24 -05:00
Christophe Bornet	691480f491	Merge pull request #18647 * Implement lazy_load() for UnstructuredBaseLoader	2024-03-06 13:13:10 -05:00
Christophe Bornet	52ac67c5d8	Merge pull request #18654 * Implement lazy_load() for ObsidianLoader	2024-03-06 13:06:55 -05:00
Christophe Bornet	b9c0cf9025	Merge pull request #18656 * Implement lazy_load() for PsychicLoader	2024-03-06 13:05:04 -05:00
Christophe Bornet	aa7ac57b67	community: Implement lazy_load() for TrelloLoader (#18658 ) Covered by `tests/unit_tests/document_loaders/test_trello.py`	2024-03-06 13:04:36 -05:00
Christophe Bornet	302985fea1	community: Implement lazy_load() for SlackDirectoryLoader (#18675 ) Integration tests: `tests/integration_tests/document_loaders/test_slack.py`	2024-03-06 13:04:13 -05:00
Christophe Bornet	ed36f9f604	community: Implement lazy_load() for WhatsAppChatLoader (#18677 ) Integration test: `tests/integration_tests/document_loaders/test_whatsapp_chat.py`	2024-03-06 13:03:46 -05:00
Christophe Bornet	f414f5cdb9	community[minor]: Implement lazy_load() for WikipediaLoader (#18680 ) Integration test: `tests/integration_tests/document_loaders/test_wikipedia.py`	2024-03-06 13:03:21 -05:00
Bagatur	4cbfeeb1c2	community[patch]: Release 0.0.26 (#18683 )	2024-03-06 09:41:18 -08:00
Christophe Bornet	1100f8de7a	community[minor]: Implement lazy_load() for ArxivLoader (#18664 ) Integration tests: `tests/integration_tests/utilities/test_arxiv.py` and `tests/integration_tests/document_loaders/test_arxiv.py`	2024-03-06 09:16:49 -05:00
Christophe Bornet	2d96803ddd	community[minor]: Implement lazy_load() for OutlookMessageLoader (#18668 ) Integration test: `tests/integration_tests/document_loaders/test_email.py`	2024-03-06 09:15:57 -05:00
Christophe Bornet	ae167fb5b2	community[minor]: Implement lazy_load() for SitemapLoader (#18667 ) Integration tests: `test_sitemap.py` and `test_docusaurus.py`	2024-03-06 09:15:35 -05:00
Christophe Bornet	623dfcc55c	community[minor]: Implement lazy_load() for FacebookChatLoader (#18669 ) Integration test: `tests/integration_tests/document_loaders/test_facebook_chat.py`	2024-03-06 09:15:00 -05:00
Christophe Bornet	20794bb889	community[minor]: Implement lazy_load() for GitbookLoader (#18670 ) Integration test: `tests/integration_tests/document_loaders/test_gitbook.py`	2024-03-06 09:14:36 -05:00
Liang Zhang	81985b31e6	community[patch]: Databricks SerDe uses cloudpickle instead of pickle (#18607 ) - Description: Databricks SerDe uses cloudpickle instead of pickle when serializing a user-defined function transform_input_fn since pickle does not support functions defined in `__main__`, and cloudpickle supports this. - Dependencies: cloudpickle>=2.0.0 Added a unit test.	2024-03-05 18:04:45 -08:00
Christophe Bornet	7d6de96186	community[patch]: Implement lazy_load() for CubeSemanticLoader (#18535 ) Covered by `test_cube_semantic.py`	2024-03-05 17:32:31 -08:00
Christophe Bornet	a6b5d45e31	community[patch]: Implement lazy_load() for EverNoteLoader (#18538 ) Covered by `test_evernote_loader.py`	2024-03-05 17:29:52 -08:00
Max Jakob	ee7a7954b9	elasticsearch: add `ElasticsearchRetriever` (#18587 ) Implement [Retriever](https://python.langchain.com/docs/modules/data_connection/retrievers/) interface for Elasticsearch. I opted to only expose the `body`, which gives you full flexibility, and none of the other 68 arguments of the [search method](https://elasticsearch-py.readthedocs.io/en/v8.12.1/api/elasticsearch.html#elasticsearch.Elasticsearch.search). Added a user agent header for usage tracking in Elastic Cloud. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-06 00:42:50 +00:00
Jib	8bc347c5fc	mongodb[patch]: include LLM caches in toplevel library import (#18601 )	2024-03-05 16:35:13 -08:00
Sunchao Wang	dc81dba6cf	community[patch]: Improve amadeus tool and doc (#18509 ) Description: This pull request addresses two key improvements to the langchain repository: Fix for Crash in Flight Search Interface: Previously, the code would crash when encountering a failure scenario in the flight ticket search interface. This PR resolves this issue by implementing a fix to handle such scenarios gracefully. Now, the code handles failures in the flight search interface without crashing, ensuring smoother operation. Documentation Update for Amadeus Toolkit: Prior to this update, examples provided in the documentation for the Amadeus Toolkit were unable to run correctly due to outdated information. This PR includes an update to the documentation, ensuring that all examples can now be executed successfully. With this update, users can effectively utilize the Amadeus Toolkit with accurate and functioning examples. These changes aim to enhance the reliability and usability of the langchain repository by addressing issues related to error handling and ensuring that documentation remains up-to-date and actionable. Issue: https://github.com/langchain-ai/langchain/issues/17375 Twitter Handle: SingletonYxx	2024-03-05 16:17:22 -08:00
Christophe Bornet	f77f7dc3ec	community[patch]: Fix VectorStoreQATool (#18529 ) Fix #18460	2024-03-05 15:56:58 -08:00
Dounx	ad48f55357	community[minor]: add Yuque document loader (#17924 ) This pull request adds support for loading documents from Yuque with LangChain. Yuque is a professional cloud-based knowledge base for team collaboration in documentation. Website: https://www.yuque.com OpenAPI: https://www.yuque.com/yuque/developer/openapi	2024-03-05 15:54:07 -08:00
Kazuki Maeda	60c5d964a8	community[minor]: use jq schema for content_key in json_loader (#18003 ) ### Description Changed the value specified for `content_key` in JSONLoader from a single key to a value based on jq schema. I created a [similar PR](https://github.com/langchain-ai/langchain/pull/11255) before, but it had several conflicts because of the architectural change associated with the stable version release, so I re-created this PR to fit the new architecture. ### Why For JSON data like the following, to specify `.data[].attributes.message` for page_content and `.data[].attributes.id` or `.data[].attributes.attributes.tags`, etc., the `content_key` must also be able to parse the JSON structure. <details> <summary>sample json data</summary> ```json { "data": [ { "attributes": { "message": "message1", "tags": [ "tag1" ] }, "id": "1" }, { "attributes": { "message": "message2", "tags": [ "tag2" ] }, "id": "2" } ] } ``` </details> <details> <summary>sample code</summary> ```python def metadata_func(record: dict, metadata: dict) -> dict: metadata["source"] = None metadata["id"] = record.get("id") metadata["tags"] = record["attributes"].get("tags") return metadata sample_file = "sample1.json" loader = JSONLoader( file_path=sample_file, jq_schema=".data[]", content_key=".attributes.message", ## content_key is parsable into jq schema is_content_key_jq_parsable=True, ## this is added parameter metadata_func=metadata_func ) data = loader.load() data ``` </details> ### Dependencies none ### Twitter handle [kzk_maeda](https://twitter.com/kzk_maeda)	2024-03-05 15:51:24 -08:00
Max Jakob	81e9ab6e3a	docs: Update elasticsearch README (#18497 ) Update Elasticsearch README with information on how to start a deployment. Also make some cosmetic changes to the [Elasticsearch docs](https://python.langchain.com/docs/integrations/vectorstores/elasticsearch). Follow up on https://github.com/langchain-ai/langchain/pull/17467	2024-03-05 15:49:16 -08:00
Hech	6a08134661	community[patch], langchain[minor]: Add retriever self_query and score_threshold in DingoDB (#18106 )	2024-03-05 15:47:29 -08:00
Mikhail Khludnev	d039dcb6ba	nvidia-trt[patch]: add TritonTensorRTLLM(verbose_client=False) (#16848 ) - Description: adding verbose flag to TritonTensorRTLLM, - Issue: nope, - Dependencies: not any, - Twitter handle:	2024-03-05 15:44:13 -08:00
Asaf Joseph Gardin	27441555d0	ai21[patch]: AI21 Labs Contextual Answers support (#18270 ) Description: Added support for AI21 Labs model - Contextual Answers Dependencies: ai21, ai21-tokenizer Twitter handle: https://github.com/AI21Labs --------- Co-authored-by: Asaf Gardin <asafg@ai21.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-05 22:42:04 +00:00
Erick Friis	e169ee8863	anthropic[patch]: handle lists in function calling (#18609 )	2024-03-05 14:19:40 -08:00
Erick Friis	1831733c2e	anthropic[patch]: fix argument integration test (#18605 )	2024-03-05 13:05:25 -08:00
Yudhajit Sinha	4570b477b9	community[patch]: Invoke callback prior to yielding token (titan_takeoff) (#18560 ) ## PR title community[patch]: Invoke callback prior to yielding token ## PR message - Description: Invoke callback prior to yielding token in _stream_ method in llms/titan_takeoff. - Issue: #16913 - Dependencies: None	2024-03-05 12:54:26 -08:00
Tomaz Bratanic	ea51cdaede	Remove neo4j bloom labels from graph schema (#18564 ) Neo4j tools use particular node labels and relationship types to store metadata, but these are irrelevant for text2cypher or graph generation, so we want to ignore them in the schema representation.	2024-03-05 12:54:05 -08:00
Erick Friis	e1924b3e93	core[patch]: deprecate hwchase17/langchain-hub, address path traversal (#18600 ) Deprecates the old langchain-hub repository. Does not deprecate the new https://smith.langchain.com/hub @PinkDraconian has correctly raised that in the event someone is loading unsanitized user input into the `try_load_from_hub` function, they have the ability to load files from locations on GitHub other than the hwchase17/langchain-hub repository. This PR adds some more path checking to that function and deprecates the functionality in favor of the hub built into LangSmith.	2024-03-05 12:49:38 -08:00

3210 Commits