langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-11 19:11:02 +00:00

Author	SHA1	Message	Date
Rahul Triptahi	9ef93ecd7c	community[minor]: Added classification_location parameter in PebbloSafeLoader. (#22565 ) Description: Add classifier_location feature flag. This flag enables Pebblo to decide the classifier location, local or pebblo-cloud. Unit Tests: N/A Documentation: N/A --------- Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	2024-06-24 17:30:38 -04:00
wenngong	af620db9c7	partners: add lint docstrings for azure-dynamic-sessions/together modules (#23303 ) Description: add lint docstrings for azure-dynamic-sessions/together modules Issue: #23188 @baskaryan test: ruff check passed. <img width="782" alt="image" src="https://github.com/langchain-ai/langchain/assets/76683249/bf11783d-65b3-4e56-a563-255eae89a3e4"> --------- Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-06-24 16:26:54 -04:00
yuncliu	398b2b9c51	community[minor]: Add Ascend NPU optimized Embeddings (#20260 ) - Description: Add NPU support for embeddings --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-24 20:15:11 +00:00
Luis Rueda	168e9ed3a5	partners: add custom options to MongoDBChatMessageHistory (#22944 ) Description: Adds options for configuring MongoDBChatMessageHistory (no breaking changes): - session_id_key: name of the field that stores the session id - history_key: name of the field that stores the chat history - create_index: whether to create an index on the session id field - index_kwargs: additional keyword arguments to pass to the index creation Discussion: https://github.com/langchain-ai/langchain/discussions/22918 Twitter handle: @userlerueda --------- Co-authored-by: Jib <Jibzade@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-06-24 19:42:56 +00:00
Eugene Yurtsev	1e750f12f6	standard-tests[minor]: Add standard read write test suite for vectorstores (#23355 ) Add standard read write test suite for vectorstores	2024-06-24 19:40:56 +00:00
Eugene Yurtsev	3b3ed72d35	standard-tests[minor]: Add standard tests for BaseStore (#23360 ) Add standard tests to base store abstraction. These only work on [str, str] right now. We'll need to check if it's possible to add encoder/decoders to generalize	2024-06-24 19:38:50 +00:00
ccurme	e1190c8f3c	mongodb[patch]: fix CI for python 3.12 (#23369 )	2024-06-24 19:31:20 +00:00
RUO	2b87e330b0	community: fix issue with nested field extraction in MongodbLoader (#22801 ) Description: This PR addresses an issue in the `MongodbLoader` where nested fields were not being correctly extracted. The loader now correctly handles nested fields specified in the `field_names` parameter. Issue: Fixes an issue where attempting to extract nested fields from MongoDB documents resulted in `KeyError`. Dependencies: No new dependencies are required for this change. Twitter handle: (Optional, your Twitter handle if you'd like a mention when the PR is announced) ### Changes 1. Field Name Parsing: - Added logic to parse nested field names and safely extract their values from the MongoDB documents. 2. Projection Construction: - Updated the projection dictionary to include nested fields correctly. 3. Field Extraction: - Updated the `aload` method to handle nested field extraction using a recursive approach to traverse the nested dictionaries. ### Example Usage Updated usage example to demonstrate how to specify nested fields in the `field_names` parameter: ```python loader = MongodbLoader( connection_string=MONGO_URI, db_name=MONGO_DB, collection_name=MONGO_COLLECTION, filter_criteria={"data.job.company.industry_name": "IT", "data.job.detail": { "$exists": True }}, field_names=[ "data.job.detail.id", "data.job.detail.position", "data.job.detail.intro", "data.job.detail.main_tasks", "data.job.detail.requirements", "data.job.detail.preferred_points", "data.job.detail.benefits", ], ) docs = loader.load() print(len(docs)) for doc in docs: print(doc.page_content) ``` ### Testing Tested with a MongoDB collection containing nested documents to ensure that the nested fields are correctly extracted and concatenated into a single page_content string. ### Note This change ensures backward compatibility for non-nested fields and improves functionality for nested field extraction. 
### Output Sample ```python print(docs[:3]) ``` ```shell # output sample: [ Document( # Here in this example, page_content is the combined text from the fields below # "position", "intro", "main_tasks", "requirements", "preferred_points", "benefits" page_content='all combined contents from the requested fields in the document', metadata={'database': 'Your Database name', 'collection': 'Your Collection name'} ), ... ] ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-24 19:29:11 +00:00
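The nested-field handling described in this commit can be pictured with a small standalone sketch. It does not use `MongodbLoader` itself; the helper name `extract_nested` and the sample document are hypothetical, and the real loader may traverse documents differently.

```python
from typing import Any, Optional


def extract_nested(document: dict, dotted_path: str) -> Optional[Any]:
    """Walk a dotted path such as "data.job.detail.id" through nested dicts.

    Hypothetical helper: returns None instead of raising KeyError when a
    segment of the path is missing.
    """
    current: Any = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Sample document standing in for a MongoDB record (illustrative data only).
doc = {"data": {"job": {"detail": {"id": 7, "position": "Engineer"}}}}
field_names = [
    "data.job.detail.id",
    "data.job.detail.position",
    "data.job.detail.missing",  # absent on purpose
]

# Collect the values that exist and join them into one string,
# mirroring how the commit describes building page_content.
values = [extract_nested(doc, name) for name in field_names]
page_content = " ".join(str(v) for v in values if v is not None)
print(page_content)
```

Returning `None` for a missing segment is one possible design choice; it keeps non-nested field names working because a path without dots is a single lookup.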
Tomaz Bratanic	aeeda370aa	Sanitize backticks from neo4j labels and types for import (#23367 )	2024-06-24 19:05:31 +00:00
Rave Harpaz	f5ff7f178b	Add OCI Generative AI new model support (#22880 ) - [x] PR title: community: Add OCI Generative AI new model support - [x] PR message: - Description: adding support for new models offered by OCI Generative AI services. This is a moderate update of our initial integration PR 16548 and includes a new integration for our chat models under /langchain_community/chat_models/oci_generative_ai.py - Issue: NA - Dependencies: No new Dependencies, just latest version of our OCI sdk - Twitter handle: NA - [x] Add tests and docs: 1. we have updated our unit tests 2. we have updated our documentation including a new ipynb for our new chat integration - [x] Lint and test: `make format`, `make lint`, and `make test` run successfully --------- Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com> Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>	2024-06-24 14:48:23 -04:00
Baur	aa358f2be4	community: Add ZenGuard tool (#22959 ) Description This is the community integration of ZenGuard AI - the fastest guardrails for GenAI applications. ZenGuard AI protects against: - Prompts Attacks - Veering of the pre-defined topics - PII, sensitive info, and keywords leakage. - Toxicity - Etc. Twitter Handle : @zenguardai - [x] Add tests and docs: If you're adding a new integration, please include 1. Added an integration test 2. Added colab - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. --------- Co-authored-by: Nuradil <nuradil.maksut@icloud.com> Co-authored-by: Nuradil <133880216+yaksh0nti@users.noreply.github.com>	2024-06-24 17:40:56 +00:00
Mathis Joffre	60103fc4a5	community: Fix OVHcloud 401 Unauthorized on embedding. (#23260 ) They are now rejecting with code 401 calls from users with expired or invalid tokens (while before they were being considered anonymous). Thus, the authorization header has to be removed when there is no token. Related to: #23178 --------- Signed-off-by: Joffref <mariusjoffre@gmail.com>	2024-06-24 12:58:32 -04:00
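A minimal sketch of the header handling this commit describes, assuming a hypothetical `build_headers` helper and a bearer-token scheme; the actual OVHcloud embeddings code may be organized differently.

```python
from typing import Dict, Optional


def build_headers(access_token: Optional[str]) -> Dict[str, str]:
    """Build request headers, omitting Authorization when no token is set.

    Hypothetical helper: sending an empty or invalid Authorization header
    would now be rejected with 401, so the header is only added when a
    non-empty token is available.
    """
    headers = {"content-type": "application/json"}
    if access_token:
        headers["Authorization"] = f"Bearer {access_token}"
    return headers


print(build_headers(None))
print(build_headers("example-token"))
```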
Eugene Yurtsev	d90379210a	standard-tests[minor]: Add standard tests for cache (#23357 ) Add standard tests for cache abstraction	2024-06-24 15:15:03 +00:00
Leonid Ganeline	987099cfcd	community: `toolkits` docstrings (#23286 ) Added missed docstrings. Formatted docstrings to the consistent form. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-22 14:37:52 +00:00
Rahul Triptahi	0cd3f93361	Enhance metadata of sharepointLoader. (#22248 ) Description: 2 feature flags added to SharePointLoader in this PR: 1. load_auth: if set to True, adds authorised identities to metadata 2. load_extended_metadata, adds source, owner and full_path to metadata Unit tests:N/A Documentation: To be done. --------- Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	2024-06-21 17:03:38 -07:00
Bagatur	bcac6c3aff	openai[patch]: temp fix ignore lint (#23290 )	2024-06-21 16:52:52 -07:00
William FH	efb4c12abe	[Core] Add support for inferring Annotated types (#23284 ) in bind_tools() / convert_to_openai_function	2024-06-21 15:16:30 -07:00
Vadym Barda	9ac302cb97	core[minor]: update draw_mermaid node label processing (#23285 ) This fixes processing issue for nodes with numbers in their labels (e.g. `"node_1"`, which would previously be relabeled as `"node__"`, and now are correctly processed as `"node_1"`)	2024-06-21 21:35:32 +00:00
Rajendra Kadam	7ee2822ec2	community: Fix TypeError in PebbloRetrievalQA (#23170 ) Description: Fix "`TypeError: 'NoneType' object is not iterable`" when the auth_context is absent in PebbloRetrievalQA. The auth_context is optional; hence, PebbloRetrievalQA should work without it, but it throws an error at the moment. This PR fixes that issue. Issue: NA Dependencies: None Unit tests: NA --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-06-21 17:04:00 -04:00
Iurii Umnov	3b7b933aa2	community[minor]: OpenAPI agent. Add support for PUT, DELETE and PATCH (#22962 ) Description: Add PUT, DELETE and PATCH tools to tool list for OpenAPI agent if dangerous requests are allowed. Issue: https://github.com/langchain-ai/langchain/issues/20469	2024-06-21 20:44:23 +00:00
Guangdong Liu	3c42bf8d97	community(patch): Fix PineconeHybridSearchRetriever not having search_kwargs (#21577 ) - close #21521	2024-06-21 16:27:52 -04:00
Rahul Triptahi	4bb3d5c488	[community][quick-fix]: changed from blob.path to blob.path.name in O365BaseLoader. (#22287 ) Description: file_metadata_ was not getting propagated to returned documents. Changed the metadata_dict lookup key from blob.path to blob.path.name, the name of the blob's path. Documentation: N/A Unit tests: N/A Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-21 15:51:03 -04:00
Bagatur	f824f6d925	docs: fix merge message runs docstring (#23279 )	2024-06-21 19:50:50 +00:00
wenngong	f9aea3db07	partners: add lint docstrings for chroma module (#23249 ) Description: add lint docstrings for chroma module Issue: the issue #23188 @baskaryan test: ruff check passed. ![image](https://github.com/langchain-ai/langchain/assets/76683249/5e168a0c-32d0-464f-8ddb-110233918019) --------- Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-06-21 19:49:24 +00:00
Bagatur	9eda8f2fe8	docs: fix trim_messages code blocks (#23271 )	2024-06-21 17:15:31 +00:00
Bagatur	4c97a9ee53	docs: fix message transformer docstrings (#23264 )	2024-06-21 16:10:03 +00:00
Vwake04	0deb98ac0c	pinecone: Fix multiprocessing issue in PineconeVectorStore (#22571 ) Description: Currently, the `langchain_pinecone` library forces the `async_req` (asynchronous request) argument to Pinecone to `True`. This design choice causes problems when deploying to environments that do not support multiprocessing, such as AWS Lambda. In such environments, this restriction can prevent users from successfully using `langchain_pinecone`. This PR introduces a change that allows users to specify whether they want to use asynchronous requests by passing the `async_req` parameter through `kwargs`. By doing so, users can set `async_req=False` to utilize synchronous processing, making the library compatible with AWS Lambda and other environments that do not support multiprocessing. Issue: This PR does not address a specific issue number but aims to resolve compatibility issues with AWS Lambda by allowing synchronous processing. Dependencies: None, that I'm aware of. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-21 15:46:01 +00:00
ccurme	75c7c3a1a7	openai: release 0.1.9 (#23263 )	2024-06-21 11:15:29 -04:00
Brace Sproul	abe7566d7d	core[minor]: BaseChatModel with_structured_output implementation (#22859 )	2024-06-21 08:14:03 -07:00
mackong	360a70c8a8	core[patch]: fix no current event loop for sql history in async mode (#22933 ) - Description: When using RunnableWithMessageHistory/SQLChatMessageHistory in async mode, the following error occurs: ``` Error in RootListenersTracer.on_chain_end callback: RuntimeError("There is no current event loop in thread 'asyncio_3'.") ``` It is thrown by `ddfbca38df/libs/community/langchain_community/chat_message_histories/sql.py (L259)`, and no message history is added to the database. This patch adds a new `_aexit_history` function that is called in async mode and in turn calls `aadd_messages`. It uses the `afunc` attribute of a Runnable to check whether the end listener should be run in async mode. - Issue: #22021, #22022 - Dependencies: N/A	2024-06-21 10:39:47 -04:00
Philippe PRADOS	1c2b9cc9ab	core[minor]: Update pgvector translator for langchain_postgres (#23217 ) The SelfQuery PGVectorTranslator is not correct. The operator is "eq" and not "$eq". This patch uses a new version of PGVectorTranslator from langchain_postgres. It's necessary to release a new version of langchain_postgres (see [here](https://github.com/langchain-ai/langchain-postgres/pull/75)) before accepting this PR in langchain.	2024-06-21 10:37:09 -04:00
Mu Yang	401d469a92	langchain: fix syntax warning in create_json_chat_agent (#23253 ) fix syntax warning in `create_json_chat_agent` ``` .../langchain/agents/json_chat/base.py:22: SyntaxWarning: invalid escape sequence '\ ' """Create an agent that uses JSON to format its logic, build for Chat Models. ```	2024-06-21 10:05:38 -04:00
mackong	b108b4d010	core[patch]: set schema format for AsyncRootListenersTracer (#23214 ) - Description: AsyncRootListenersTracer supports on_chat_model_start, so its schema_format should be "original+chat". - Issue: N/A - Dependencies:	2024-06-21 09:30:27 -04:00
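The error quoted in this commit can be reproduced in isolation with the standard library. The sketch below is not the patch: it only shows that `asyncio.get_event_loop()` fails in a worker thread that has no loop, and that a coroutine can still be run there with `asyncio.run()`, which is one generic way around the problem. The `add_message` coroutine is a hypothetical stand-in for an async history write.

```python
import asyncio
import threading

errors = []
results = []


def call_get_event_loop() -> None:
    # A fresh worker thread has no event loop set, so this raises RuntimeError.
    try:
        asyncio.get_event_loop()
    except RuntimeError as exc:
        errors.append(str(exc))


async def add_message(text: str) -> str:
    # Hypothetical stand-in for an asynchronous write to a message history.
    await asyncio.sleep(0)
    return f"stored: {text}"


def run_with_own_loop() -> None:
    # asyncio.run creates a new loop for this call and closes it afterwards.
    results.append(asyncio.run(add_message("hello")))


for target in (call_get_event_loop, run_with_own_loop):
    worker = threading.Thread(target=target, name="asyncio_3")
    worker.start()
    worker.join()

print(errors)
print(results)
```

The patch itself takes a different route, as its description says: it adds an async exit path so that `aadd_messages` is awaited instead of driving a loop from a synchronous callback.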
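The warning in this commit comes from a backslash followed by a space inside a normal (non-raw) string literal. The sketch below compiles two tiny source snippets and counts the warnings each produces; the snippets and the helper name `count_escape_warnings` are illustrative and not taken from the repository. Depending on the Python version, the category is `SyntaxWarning` or `DeprecationWarning`, so both are counted.

```python
import warnings

# Source text containing a backslash followed by a space in a normal string.
bad_source = 'text = "agent\\ logic"'
# The same text as a raw string, where the backslash is taken literally.
good_source = 'text = r"agent\\ logic"'


def count_escape_warnings(source: str) -> int:
    """Compile the source and count invalid-escape warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return sum(
        issubclass(w.category, (SyntaxWarning, DeprecationWarning))
        for w in caught
    )


print(count_escape_warnings(bad_source))
print(count_escape_warnings(good_source))
```

Using a raw string or doubling the backslash are the two usual fixes; which one the commit applied is not stated in its message.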
Bagatur	976b456619	docs: BaseChatModel key methods table (#23238 ) If we're moving documenting inherited params think these kinds of tables become more important ![Screenshot 2024-06-20 at 3 59 12 PM](https://github.com/langchain-ai/langchain/assets/22008038/722266eb-2353-4e85-8fae-76b19bd333e0)	2024-06-20 21:00:22 -07:00
ccurme	a7b4175091	standard tests: add test for tool calling (#23234 ) Including streaming	2024-06-20 17:20:11 -04:00
Bagatur	12e0c28a6e	docs: fix chat model methods table (#23233 ) rst table not md ![Screenshot 2024-06-20 at 12 37 46 PM](https://github.com/langchain-ai/langchain/assets/22008038/7a03b869-c1f4-45d0-8d27-3e16f4c6eb19)	2024-06-20 19:51:10 +00:00
Zheng Robert Jia	a349fce880	docs[minor],community[patch]: Minor tutorial docs improvement, minor import error quick fix. (#22725 ) minor changes to module import error handling and minor issues in tutorial documents. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-06-20 15:36:49 -04:00
Eugene Yurtsev	7545b1d29b	core[patch]: Fix doc-strings for code blocks (#23232 ) Code blocks need extra space around them to be rendered properly by sphinx	2024-06-20 19:34:52 +00:00
Luis Moros	d5be160af0	community[patch]: Fix sql_database.from_databricks issue when run from a Job (#23224 ) Description: When ``sql_database.from_databricks`` is executed from a Workflow Job, the ``context`` object does not have a "browserHostName" property, resulting in an error. This change handles the error so that the "DATABRICKS_HOST" env variable value is used instead of stopping the flow. Co-authored-by: lmorosdb <lmorosdb>	2024-06-20 19:34:15 +00:00
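A rough sketch of the fallback described here, under the assumption that the context can be modeled as a plain mapping. The function name `resolve_host` and the example host values are hypothetical; the real `from_databricks` code reads a Databricks context object rather than a dict.

```python
from typing import Mapping


def resolve_host(context: Mapping[str, str], env: Mapping[str, str]) -> str:
    """Prefer the host from the context, else fall back to DATABRICKS_HOST.

    Hypothetical helper: a missing "browserHostName" no longer aborts the
    flow as long as the environment variable is set.
    """
    try:
        return context["browserHostName"]
    except KeyError:
        host = env.get("DATABRICKS_HOST")
        if host is None:
            raise ValueError(
                "No host found in context or DATABRICKS_HOST"
            ) from None
        return host


# Notebook-like context: the property is present.
print(resolve_host({"browserHostName": "notebook.example"}, {}))
# Job-like context: the property is absent, so the env value is used.
print(resolve_host({}, {"DATABRICKS_HOST": "job.example"}))
```

In real use the second argument would be `os.environ`; passing it explicitly keeps the sketch easy to test.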
Cory Waddingham	cd6812342e	pinecone[patch]: Update Poetry requirements for pinecone-client >=3.2.2 (#22094 ) This change updates the requirements in `libs/partners/pinecone/pyproject.toml` to allow all versions of `pinecone-client` greater than or equal to 3.2.2. This change resolves issue [21955](https://github.com/langchain-ai/langchain/issues/21955). --------- Co-authored-by: Erick Friis <erickfriis@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-20 18:59:36 +00:00
Eugene Yurtsev	59d7adff8f	core[patch]: Add clarification about streaming to RunnableLambda (#23227 ) Add streaming clarification to runnable lambda docstring.	2024-06-20 16:47:16 +00:00
maang-h	bc4cd9c5cc	community[patch]: Update root_validators ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat, ChatSparkLLM, ChatZhipuAI (#22853 ) This PR updates root validators for: - ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat, ChatSparkLLM, ChatZhipuAI Issues #22819 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-20 16:36:41 +00:00
ChrisDEV	cb6cf4b631	Fix return value type of dumpd (#20123 ) The return type of `json.loads` is `Any`. In fact, the return type of `dumpd` must be based on `json.loads`, so the correction here is understandable. Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-20 16:31:41 +00:00
Guangdong Liu	0bce28cd30	core(patch): Fix encoding problem of load_prompt method (#21559 ) - description: Add encoding parameters. - @baskaryan, @efriis, @eyurtsev, @hwchase17. ![54d25ac7b1d5c2e47741a56fe8ed8ba](https://github.com/langchain-ai/langchain/assets/48236177/ffea9596-2001-4e19-b245-f8a6e231b9f9)	2024-06-20 09:25:54 -07:00
Philippe PRADOS	8711c61298	core[minor]: Adds an in-memory implementation of RecordManager (#13200 ) Description: langchain offers three technologies to save data: - [vectorstore](https://python.langchain.com/docs/modules/data_connection/vectorstores/) - [docstore](https://js.langchain.com/docs/api/schema/classes/Docstore) - [record manager](https://python.langchain.com/docs/modules/data_connection/indexing) If you want to combine these technologies in a sample persistence strategy, you need a common implementation for each. `DocStore` provides `InMemoryDocstore`. We propose the class `MemoryRecordManager` to complete the system. This is the prelude to another pull request, which needs a consistent combination of persistence components. Tag maintainer: @baskaryan Twitter handle: @pprados --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-20 12:19:10 -04:00
Leonid Ganeline	51e75cf59d	community: docstrings (#23202 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	2024-06-20 11:08:13 -04:00
Julian Weng	6a1a0d977a	partners[minor]: Fix value error message for with_structured_output (#22877 ) Currently, calling `with_structured_output()` with an invalid method argument raises `Unrecognized method argument. Expected one of 'function_calling' or 'json_format'`, but the JSON mode option [is now referred to](https://python.langchain.com/v0.2/docs/how_to/structured_output/#the-with_structured_output-method) by `'json_mode'`. This fixes that. Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-20 15:03:21 +00:00
Leonid Ganeline	41f7620989	huggingface: docstrings (#23148 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference) Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-20 13:22:40 +00:00
ccurme	066a5a209f	huggingface[patch]: fix CI for python 3.12 (#23197 )	2024-06-20 09:17:26 -04:00
xyd	9b3a025f9c	fix https://github.com/langchain-ai/langchain/issues/23215 (#23216 ) fix bug The ZhipuAIEmbeddings class is not working. Co-authored-by: xu yandong <shaonian@acsx1.onexmail.com>	2024-06-20 13:04:50 +00:00
Bagatur	ad7f2ec67d	standard-tests[patch]: test stop not stop_sequences (#23200 )	2024-06-19 18:07:33 -07:00
David DeCaprio	a4bcb45f65	core:Add optional max_messages to MessagePlaceholder (#16098 ) - Description: Add optional max_messages to MessagePlaceholder - Issue: [16096](https://github.com/langchain-ai/langchain/issues/16096) - Dependencies: None - Twitter handle: @davedecaprio Sometimes it's better to limit the history in the prompt itself rather than the memory. This is needed if you want different prompts in the chain to have different history lengths. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-19 23:39:51 +00:00
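The idea of capping history in the prompt rather than in memory can be sketched without LangChain. The helper `limit_history` below is hypothetical and works on plain tuples; it assumes the cap keeps the most recent messages, which the commit message implies but does not spell out.

```python
from typing import List, Optional, Sequence, Tuple

Message = Tuple[str, str]  # (role, text); a simplified stand-in for messages


def limit_history(
    messages: Sequence[Message], max_messages: Optional[int] = None
) -> List[Message]:
    """Return at most max_messages of the most recent messages.

    Hypothetical helper: None means no limit; the limit must be positive.
    """
    if max_messages is None:
        return list(messages)
    if max_messages <= 0:
        raise ValueError("max_messages must be a positive integer")
    return list(messages[-max_messages:])


history = [
    ("human", "hi"),
    ("ai", "hello"),
    ("human", "what is 2 + 2?"),
    ("ai", "4"),
]

# Two prompts in the same chain can see different history lengths.
print(limit_history(history, max_messages=2))
print(limit_history(history))
```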
shaunakgodbole	7193634ae6	fireworks[patch]: fix api_key alias in Fireworks LLM (#23118 ) Thank you for contributing to LangChain! Description The current code snippet for `Fireworks` had incorrect parameters. This PR fixes those parameters. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-19 21:14:42 +00:00
Eugene Yurtsev	1fcf875fe3	core[patch]: Document agent schema (#23194 ) * Document agent schema * Refer folks to langgraph for more information on how to create agents.	2024-06-19 20:16:57 +00:00
Eugene Yurtsev	c2d43544cc	core[patch]: Document messages namespace (#23154 ) - Moved doc-strings below attributes in TypedDicts -- seems to render better on APIReference pages. * Provided more description and some simple code examples	2024-06-19 15:00:00 -04:00
Eugene Yurtsev	3c917204dc	core[patch]: Add doc-strings to outputs, fix @root_validator (#23190 ) - Document outputs namespace - Update a vanilla @root_validator that was missed	2024-06-19 14:59:06 -04:00
Bagatur	8698cb9b28	infra: add more formatter rules to openai (#23189 ) Turns on https://docs.astral.sh/ruff/settings/#format_docstring-code-format and https://docs.astral.sh/ruff/settings/#format_skip-magic-trailing-comma ```toml [tool.ruff.format] docstring-code-format = true skip-magic-trailing-comma = true ```	2024-06-19 11:39:58 -07:00
Michał Krassowski	710197e18c	community[patch]: restore compatibility with SQLAlchemy 1.x (#22546 ) - Description: Restores compatibility with SQLAlchemy 1.4.x that was broken since #18992 and adds a test run for this version on CI (only for Python 3.11) - Issue: fixes #19681 - Dependencies: None - Twitter handle: `@krassowski_m` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-19 17:58:57 +00:00
Erick Friis	48d6ea427f	upstage: move to external repo (#22506 )	2024-06-19 17:56:07 +00:00
Bagatur	0a4ee864e9	openai[patch]: image token counting (#23147 ) Resolves #23000 --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-19 10:41:47 -07:00
Jorge Piedrahita Ortiz	b3e53ffca0	community[patch]: sambanova llm integration improvement (#23137 ) - Description: sambanova sambaverse integration improvement: removed input parsing that changed the raw user input and made it mandatory to set the process prompt parameter to true	2024-06-19 10:30:14 -07:00
Jorge Piedrahita Ortiz	e162893d7f	community[patch]: update sambastudio embeddings (#23133 ) Description: update sambastudio embeddings integration, now compatible with generic endpoints and CoE endpoints	2024-06-19 10:26:56 -07:00
Philippe PRADOS	db6f46c1a6	langchain[small]: Change type to BasePromptTemplate (#23083 ) ```python Change from_llm( prompt: PromptTemplate ... ) ``` to ```python Change from_llm( prompt: BasePromptTemplate ... ) ```	2024-06-19 13:19:36 -04:00
Sergey Kozlov	94452a94b1	core[patch]: add exceptions propagation test for astream_events v2 (#23159 ) Description: `astream_events(version="v2")` didn't propagate exceptions in `langchain-core<=0.2.6`; this was fixed in #22916. This PR adds a unit test to check that exceptions are propagated upwards. Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>	2024-06-19 13:00:25 -04:00
Leonid Ganeline	50484be330	prompty: docstring (#23152 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference) --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-19 12:50:58 -04:00
chenxi	505a2e8743	fix: MoonshotChat fails when setting the moonshot_api_key through the OS environment. (#23176 ) Close #23174 Co-authored-by: tianming <tianming@bytenew.com>	2024-06-19 16:28:24 +00:00
Bagatur	677408bfc9	core[patch]: fix chat history circular import (#23182 )	2024-06-19 09:08:36 -07:00
Eugene Yurtsev	883e90d06e	core[patch]: Add an example to the Document schema doc-string (#23131 ) Add an example to the document schema	2024-06-19 11:35:30 -04:00
ccurme	2b08e9e265	core[patch]: update test to catch circular imports (#23172 ) This raises ImportError due to a circular import: ```python from langchain_core import chat_history ``` This does not: ```python from langchain_core import runnables from langchain_core import chat_history ``` Here we update `test_imports` to run each import in a separate subprocess. Open to other ways of doing this!	2024-06-19 15:24:38 +00:00
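Running each import in its own interpreter, as this commit describes, can be sketched with `subprocess`. The helper name `imports_cleanly` and the module names used here are illustrative; the repository's `test_imports` may be structured differently.

```python
import subprocess
import sys


def imports_cleanly(module_name: str) -> bool:
    """Import a module in a fresh interpreter and report whether it worked.

    A separate process starts with an empty module cache, so an import that
    only succeeds because something else was imported first is caught.
    """
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


print(imports_cleanly("json"))
print(imports_cleanly("module_that_does_not_exist_xyz"))
```

The cost is one interpreter start per import, which is slower than importing in-process but isolates each check from the others.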
Eugene Yurtsev	ae4c0ed25a	core[patch]: Add documentation to load namespace (#23143 ) Document some of the modules within the load namespace	2024-06-19 15:21:41 +00:00
Eugene Yurtsev	a34e650f8b	core[patch]: Add doc-string to document compressor (#23085 )	2024-06-19 11:03:49 -04:00
Eugene Yurtsev	1007a715a5	community[patch]: Prevent unit tests from making network requests (#23180 ) * Prevent unit tests from making network requests	2024-06-19 14:56:30 +00:00
ccurme	ca798bc6ea	community: move test to integration tests (#23178 ) Tests failing on master with > FAILED tests/unit_tests/embeddings/test_ovhcloud.py::test_ovhcloud_embed_documents - ValueError: Request failed with status code: 401, {"message":"Bad token; invalid JSON"}	2024-06-19 14:39:48 +00:00
Eugene Yurtsev	4fe8403bfb	core[patch]: Expand documentation in the indexing namespace (#23134 )	2024-06-19 10:11:44 -04:00
Eugene Yurtsev	fe4f10047b	core[patch]: Document embeddings namespace (#23132 ) Document embeddings namespace	2024-06-19 10:11:16 -04:00
Eugene Yurtsev	a3bae56a48	core[patch]: Update documentation in LLM namespace (#23138 ) Update documentation in LLM namespace.	2024-06-19 10:10:50 -04:00
Leonid Ganeline	a70b7a688e	ai21: docstrings (#23142 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	2024-06-19 08:51:15 -04:00
bilk0h	3d54784e6d	text-splitters: Fix/recursive json splitter data persistence issue (#21529 ) Thank you for contributing to LangChain! Description: Calling `RecursiveJsonSplitter().split_json()` multiple times gave unexpected results. The cause is the `chunks` list in the `_json_split` method: if `chunks` is not provided to `_json_split` (which is the case when `split_json` calls it), the same list is reused for subsequent calls to `_json_split`. You can see this in the test case I also added to this commit. Output should be: ``` [{'a': 1, 'b': 2}] [{'c': 3, 'd': 4}] ``` Instead you get: ``` [{'a': 1, 'b': 2}] [{'a': 1, 'b': 2, 'c': 3, 'd': 4}] ``` --------- Co-authored-by: Nuno Campos <nuno@langchain.dev> Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-06-18 20:21:55 -07:00
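The behavior reported here matches the usual Python pitfall of a mutable default argument. The sketch below reproduces it with two hypothetical functions, `split_buggy` and `split_fixed`; they are simplified stand-ins and not the actual `_json_split` code.

```python
from typing import Dict, List, Optional


def split_buggy(data: Dict, chunks: List[Dict] = [{}]) -> List[Dict]:
    # The default list is created once, at definition time, and then shared
    # by every call that does not pass its own list.
    chunks[-1].update(data)
    return chunks


def split_fixed(data: Dict, chunks: Optional[List[Dict]] = None) -> List[Dict]:
    # Using None as the default gives each call a fresh list.
    if chunks is None:
        chunks = [{}]
    chunks[-1].update(data)
    return chunks


print(split_buggy({"a": 1, "b": 2}))  # [{'a': 1, 'b': 2}]
print(split_buggy({"c": 3, "d": 4}))  # [{'a': 1, 'b': 2, 'c': 3, 'd': 4}]

print(split_fixed({"a": 1, "b": 2}))  # [{'a': 1, 'b': 2}]
print(split_fixed({"c": 3, "d": 4}))  # [{'c': 3, 'd': 4}]
```

The `None` sentinel is the conventional fix for this pattern; whether the commit used exactly this form is not stated in its message.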
鹿鹿鹿鲨	6b46b5e9ce	community: add request_kwargs and expect TimeError AsyncHtmlLoader (#23068 ) - Description: add `request_kwargs` and expect `TimeError` in `_fetch` function for AsyncHtmlLoader. This allows you to fill in the kwargs parameter when using the `load()` method of the `AsyncHtmlLoader` class. Co-authored-by: Yucolu <yucolu@tencent.com>	2024-06-18 20:02:46 -07:00
Leonid Ganeline	109a70fc64	ibm: docstrings (#23149 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	2024-06-18 20:00:27 -07:00
Ryan Elston	86ee4f0daa	text-splitters: Introduce Experimental Markdown Syntax Splitter (#22257 ) #### Description This MR defines a `ExperimentalMarkdownSyntaxTextSplitter` class. The main goal is to replicate the functionality of the original `MarkdownHeaderTextSplitter` which extracts the header stack as metadata but with one critical difference: it keeps the whitespace of the original text intact. This draft reimplements the `MarkdownHeaderTextSplitter` with a very different algorithmic approach. Instead of marking up each line of the text individually and aggregating them back together into chunks, this method builds each chunk sequentially and applies the metadata to each chunk. This makes the implementation simpler. However, since it's designed to keep white space intact its not a full drop in replacement for the original. Since it is a radical implementation change to the original code and I would like to get feedback to see if this is a worthwhile replacement, should be it's own class, or is not a good idea at all. Note: I implemented the `return_each_line` parameter but I don't think it's a necessary feature. I'd prefer to remove it. This implementation also adds the following additional features: - Splits out code blocks and includes the language in the `"Code"` metadata key - Splits text on the horizontal rule `---` as well - The `headers_to_split_on` parameter is now optional - with sensible defaults that can be overridden. #### Issue Keeping the whitespace keeps the paragraphs structure and the formatting of the code blocks intact which allows the caller much more flexibility in how they want to further split the individuals sections of the resulting documents. 
This addresses the issues brought up by the community in the following issues: - https://github.com/langchain-ai/langchain/issues/20823 - https://github.com/langchain-ai/langchain/issues/19436 - https://github.com/langchain-ai/langchain/issues/22256 #### Dependencies N/A #### Twitter handle @RyanElston --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-06-18 19:44:00 -07:00
Bagatur	93d0ad97fe	anthropic[patch]: test image input (#23155 )	2024-06-19 02:32:15 +00:00
Leonid Ganeline	3dfd055411	anthropic: docstrings (#23145 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	2024-06-18 22:26:45 -04:00
Bagatur	90559fde70	openai[patch], standard-tests[patch]: don't pass in falsey stop vals (#23153 ) adds an image input test to standard-tests as well	2024-06-18 18:13:13 -07:00
Bagatur	e8a8286012	core[patch]: runnablewithchathistory from core.runnables (#23136 )	2024-06-19 00:15:18 +00:00
Vadym Barda	b483bf5095	core[minor]: handle boolean data in draw_mermaid (#23135 ) This change should address graph rendering issues for edges with boolean data Example from langgraph: ```python from typing import Annotated, TypedDict from langchain_core.messages import AnyMessage from langgraph.graph import END, START, StateGraph from langgraph.graph.message import add_messages class State(TypedDict): messages: Annotated[list[AnyMessage], add_messages] def branch(state: State) -> bool: return 1 + 1 == 3 graph_builder = StateGraph(State) graph_builder.add_node("foo", lambda state: {"messages": [("ai", "foo")]}) graph_builder.add_node("bar", lambda state: {"messages": [("ai", "bar")]}) graph_builder.add_conditional_edges( START, branch, path_map={True: "foo", False: "bar"}, then=END, ) app = graph_builder.compile() print(app.get_graph().draw_mermaid()) ``` Previous behavior: ```python AttributeError: 'bool' object has no attribute 'split' ``` Current behavior: ```python %%{init: {'flowchart': {'curve': 'linear'}}}%% graph TD; __start__[__start__]:::startclass; __end__[__end__]:::endclass; foo([foo]):::otherclass; bar([bar]):::otherclass; __start__ -. ('a',) .-> foo; foo --> __end__; __start__ -. ('b',) .-> bar; bar --> __end__; classDef startclass fill:#ffdfba; classDef endclass fill:#baffc9; classDef otherclass fill:#fad7de; ```	2024-06-18 20:15:42 +00:00
Bagatur	093ae04d58	core[patch]: Pin pydantic in py3.12.4 (#23130 )	2024-06-18 12:00:02 -07:00
hmasdev	ff0c06b1e5	langchain[patch]: fix `OutputType` of OutputParsers and fix legacy API in OutputParsers (#19792 ) # Description This pull request aims to address specific issues related to the ambiguity and error-proneness of the output types of certain output parsers, as well as the absence of unit tests for some parsers. These issues could potentially lead to runtime errors or unexpected behaviors due to type mismatches when used, causing confusion for developers and users. Through clarifying output types, this PR seeks to improve the stability and reliability. Therefore, this pull request - fixes the `OutputType` of OutputParsers to be the expected type; - e.g. `OutputType` property of `EnumOutputParser` raises `TypeError`. This PR introduce a logic to extract `OutputType` from its attribute. - and fixes the legacy API in OutputParsers like `LLMChain.run` to the modern API like `LLMChain.invoke`; - Note: For `OutputFixingParser`, `RetryOutputParser` and `RetryWithErrorOutputParser`, this PR introduces `legacy` attribute with False as default value in order to keep the backward compatibility - and adds the tests for the `OutputFixingParser` and `RetryOutputParser`. The following table shows my expected output and the actual output of the `OutputType` of OutputParsers. I have used this table to fix `OutputType` of OutputParsers. 
\| Class Name of OutputParser \| My Expected `OutputType` (after this PR)\| Actual `OutputType` [evidence](#evidence) (before this PR)\| Fix Required \| \|---------\|--------------\|---------\|--------\| \| BooleanOutputParser \| `<class 'bool'>` \| `<class 'bool'>` \| NO \| \| CombiningOutputParser \| `typing.Dict[str, Any]` \| `TypeError` is raised \| YES \| \| DatetimeOutputParser \| `<class 'datetime.datetime'>` \| `<class 'datetime.datetime'>` \| NO \| \| EnumOutputParser(enum=MyEnum) \| `MyEnum` \| `TypeError` is raised \| YES \| \| OutputFixingParser \| The same type as `self.parser.OutputType` \| `~T` \| YES \| \| CommaSeparatedListOutputParser \| `typing.List[str]` \| `typing.List[str]` \| NO \| \| MarkdownListOutputParser \| `typing.List[str]` \| `typing.List[str]` \| NO \| \| NumberedListOutputParser \| `typing.List[str]` \| `typing.List[str]` \| NO \| \| JsonOutputKeyToolsParser \| `typing.Any` \| `typing.Any` \| NO \| \| JsonOutputToolsParser \| `typing.Any` \| `typing.Any` \| NO \| \| PydanticToolsParser \| `typing.Any` \| `typing.Any` \| NO \| \| PandasDataFrameOutputParser \| `typing.Dict[str, Any]` \| `TypeError` is raised \| YES \| \| PydanticOutputParser(pydantic_object=MyModel) \| `<class '__main__.MyModel'>` \| `<class '__main__.MyModel'>` \| NO \| \| RegexParser \| `typing.Dict[str, str]` \| `TypeError` is raised \| YES \| \| RegexDictParser \| `typing.Dict[str, str]` \| `TypeError` is raised \| YES \| \| RetryOutputParser \| The same type as `self.parser.OutputType` \| `~T` \| YES \| \| RetryWithErrorOutputParser \| The same type as `self.parser.OutputType` \| `~T` \| YES \| \| StructuredOutputParser \| `typing.Dict[str, Any]` \| `TypeError` is raised \| YES \| \| YamlOutputParser(pydantic_object=MyModel) \| `MyModel` \| `~T` \| YES \| NOTE: In "Fix Required", "YES" means that it is required to fix in this PR while "NO" means that it is not required. # Issue No issues for this PR. 
# Twitter handle - [hmdev3](https://twitter.com/hmdev3) # Questions: 1. Is it required to create tests for legacy APIs `LLMChain.run` in the following scripts? - libs/langchain/tests/unit_tests/output_parsers/test_fix.py; - libs/langchain/tests/unit_tests/output_parsers/test_retry.py. 2. Is there a more appropriate expected output type than I expect in the above table? - e.g. the `OutputType` of `CombiningOutputParser` should be SOMETHING... # Actual outputs (before this PR) <div id='evidence'></div> <details><summary>Actual outputs</summary> ## Requirements - Python==3.9.13 - langchain==0.1.13 ```python Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import langchain >>> langchain.__version__ '0.1.13' >>> from langchain import output_parsers ``` ### `BooleanOutputParser` ```python >>> output_parsers.BooleanOutputParser().OutputType <class 'bool'> ``` ### `CombiningOutputParser` ```python >>> output_parsers.CombiningOutputParser(parsers=[output_parsers.DatetimeOutputParser(), output_parsers.CommaSeparatedListOutputParser()]).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable CombiningOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. ``` ### `DatetimeOutputParser` ```python >>> output_parsers.DatetimeOutputParser().OutputType <class 'datetime.datetime'> ``` ### `EnumOutputParser` ```python >>> from enum import Enum >>> class MyEnum(Enum): ... a = 'a' ... b = 'b' ... 
>>> output_parsers.EnumOutputParser(enum=MyEnum).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable EnumOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. ``` ### `OutputFixingParser` ```python >>> output_parsers.OutputFixingParser(parser=output_parsers.DatetimeOutputParser()).OutputType ~T ``` ### `CommaSeparatedListOutputParser` ```python >>> output_parsers.CommaSeparatedListOutputParser().OutputType typing.List[str] ``` ### `MarkdownListOutputParser` ```python >>> output_parsers.MarkdownListOutputParser().OutputType typing.List[str] ``` ### `NumberedListOutputParser` ```python >>> output_parsers.NumberedListOutputParser().OutputType typing.List[str] ``` ### `JsonOutputKeyToolsParser` ```python >>> output_parsers.JsonOutputKeyToolsParser(key_name='tool').OutputType typing.Any ``` ### `JsonOutputToolsParser` ```python >>> output_parsers.JsonOutputToolsParser().OutputType typing.Any ``` ### `PydanticToolsParser` ```python >>> from langchain.pydantic_v1 import BaseModel >>> class MyModel(BaseModel): ... a: int ... >>> output_parsers.PydanticToolsParser(tools=[MyModel, MyModel]).OutputType typing.Any ``` ### `PandasDataFrameOutputParser` ```python >>> output_parsers.PandasDataFrameOutputParser().OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable PandasDataFrameOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. 
``` ### `PydanticOutputParser` ```python >>> output_parsers.PydanticOutputParser(pydantic_object=MyModel).OutputType <class '__main__.MyModel'> ``` ### `RegexParser` ```python >>> output_parsers.RegexParser(regex='$', output_keys=['a']).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable RegexParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. ``` ### `RegexDictParser` ```python >>> output_parsers.RegexDictParser(output_key_to_format={'a':'a'}).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable RegexDictParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. ``` ### `RetryOutputParser` ```python >>> output_parsers.RetryOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType ~T ``` ### `RetryWithErrorOutputParser` ```python >>> output_parsers.RetryWithErrorOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType ~T ``` ### `StructuredOutputParser` ```python >>> from langchain.output_parsers.structured import ResponseSchema >>> response_schemas = [ResponseSchema(name="foo",description="a list of strings",type="List[string]"),ResponseSchema(name="bar",description="a string",type="string"), ] >>> output_parsers.StructuredOutputParser.from_response_schemas(response_schemas).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable StructuredOutputParser doesn't have an inferable OutputType. 
Override the OutputType property to specify the output type. ``` ### `YamlOutputParser` ```python >>> output_parsers.YamlOutputParser(pydantic_object=MyModel).OutputType ~T ``` <div> --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-18 18:59:42 +00:00
Artem Mukhin	e271f75bee	docs: Fix URL formatting in deprecation warnings (#23075 ) Description Updated the URLs in deprecation warning messages. The URLs were previously written as raw strings and are now formatted to be clickable HTML links. Example of a broken link in the current API Reference: https://api.python.langchain.com/en/latest/chains/langchain.chains.openai_functions.extraction.create_extraction_chain_pydantic.html <img width="942" alt="Screenshot 2024-06-18 at 13 21 07" src="https://github.com/langchain-ai/langchain/assets/4854600/a1b1863c-cd03-4af2-a9bc-70375407fb00">	2024-06-18 14:49:58 -04:00
Gabriel Petracca	c6660df58e	community[minor]: Implement Doctran async execution (#22372 ) Description The DoctranTextTranslator has an async transform function that was not implemented because [the doctran library](https://github.com/psychic-api/doctran) uses a sync version of the `execute` method. - I implemented the `DoctranTextTranslator.atransform_documents()` method using `asyncio.to_thread` to run the function in a separate thread. - I updated the example in the Notebook with the new async version. - The performance improvements can be appreciated when a big document is divided into multiple chunks. Relates to: - Issue #14645: https://github.com/langchain-ai/langchain/issues/14645 - Issue #14437: https://github.com/langchain-ai/langchain/issues/14437 - https://github.com/langchain-ai/langchain/pull/15264 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-18 18:17:37 +00:00
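The `asyncio.to_thread` approach described in the Doctran commit above can be sketched roughly as follows. This is a minimal illustration rather than the code from the PR: `slow_translate` and `atranslate_all` are hypothetical names, with `slow_translate` standing in for a library's blocking call. It assumes Python 3.9 or later, where `asyncio.to_thread` is available.

```python
import asyncio
import time
from typing import List


def slow_translate(chunk: str) -> str:
    """Hypothetical blocking call standing in for a synchronous library method."""
    time.sleep(0.05)  # simulate blocking work
    return chunk.upper()


async def atranslate_all(chunks: List[str]) -> List[str]:
    """Run the blocking function in worker threads, one task per chunk."""
    tasks = [asyncio.to_thread(slow_translate, chunk) for chunk in chunks]
    # gather returns results in the same order as the input tasks
    return await asyncio.gather(*tasks)


translated = asyncio.run(atranslate_all(["hello", "world"]))
print(translated)
```

Because each chunk runs in its own thread, the waits overlap, which is where the benefit for documents split into many chunks would come from.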
Eugene Yurtsev	aa6415aa7d	core[minor]: Support multiple keys in get_from_dict_or_env (#23086 ) Support passing multiple keys for get_from_dict_or_env	2024-06-18 14:13:28 -04:00
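A multi-key lookup of the kind this commit describes might behave like the sketch below. It is a standalone illustration under stated assumptions, not the `langchain_core` implementation: `first_from_dict_or_env` is a hypothetical helper, and the lookup order (dictionary keys first, then the environment variable, then the default) is assumed.

```python
import os
from typing import Any, Dict, List, Optional, Union


def first_from_dict_or_env(
    data: Dict[str, Any],
    key: Union[str, List[str]],
    env_key: str,
    default: Optional[str] = None,
) -> str:
    """Return the first truthy value found under any of the given keys,
    falling back to an environment variable and then to a default."""
    keys = [key] if isinstance(key, str) else key
    for k in keys:
        value = data.get(k)
        if value:
            return value
    env_value = os.environ.get(env_key)
    if env_value:
        return env_value
    if default is not None:
        return default
    raise ValueError(
        f"Did not find any of {keys}; pass one of them or set {env_key}."
    )


params = {"api_key": "from-dict"}
found = first_from_dict_or_env(
    params, ["openai_api_key", "api_key"], "EXAMPLE_API_KEY_ENV"
)
print(found)
```

Accepting a list of keys lets a class support both an old and a new parameter name during a rename.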
nold	226802f0c4	community: add args_schema to SearxSearch (#22954 ) This change adds args_schema (pydantic BaseModel) to SearxSearchRun for correct schema formatting on LLM function calls Issue: currently using SearxSearchRun with OpenAI function calling returns the following error "TypeError: SearxSearchRun._run() got an unexpected keyword argument '__arg1' ". This happens because the schema sent to the LLM is "input: '{"__arg1":"foobar"}'" while the method should be called with the "query" parameter. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-18 17:27:39 +00:00
Bagatur	01783d67fc	core[patch]: Release 0.2.9 (#23091 )	2024-06-18 17:15:04 +00:00
Finlay Macklon	616d06d7fe	community: glob multiple patterns when using DirectoryLoader (#22852 ) - Description: Updated community.langchain_community.document_loaders.directory.py to enable the use of multiple glob patterns in the `DirectoryLoader` class. Now, the glob parameter is of type `list[str] \| str` and still defaults to the same value as before. I updated the docstring of the class to reflect this, and added a unit test to community.tests.unit_tests.document_loaders.test_directory.py named `test_directory_loader_glob_multiple`. This test also shows an example of how to use the new functionality. - ~~Issue:~~Discussion Thread: https://github.com/langchain-ai/langchain/discussions/18559 - Dependencies: None - Twitter handle: N/a - [x] Add tests and docs - Added test (described above) - Updated class docstring - [x] Lint and test --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-06-18 09:24:50 -07:00
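The effect of accepting `list[str] | str` for a glob parameter can be illustrated with the standard library alone. This is a sketch, not the `DirectoryLoader` code: `glob_many` is a hypothetical helper, and de-duplicating files matched by more than one pattern is an assumption about sensible behavior.

```python
import tempfile
from pathlib import Path
from typing import List, Union


def glob_many(root: Path, patterns: Union[str, List[str]]) -> List[Path]:
    """Collect files under root matching any pattern, without duplicates."""
    if isinstance(patterns, str):
        patterns = [patterns]  # keep the single-string form working
    seen = set()
    matches: List[Path] = []
    for pattern in patterns:
        for path in sorted(root.glob(pattern)):
            if path.is_file() and path not in seen:
                seen.add(path)
                matches.append(path)
    return matches


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("a.txt", "b.json", "c.md"):
        (root / name).write_text("x")
    names = [p.name for p in glob_many(root, ["*.txt", "*.json"])]

print(names)
```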
Eugene Yurtsev	5564d9e404	core[patch]: Document BaseStore (#23082 ) Add doc-string to BaseStore	2024-06-18 11:47:47 -04:00
Takuya Igei	9f791b6ad5	core[patch],community[patch],langchain[patch]: `tenacity` dependency to version `>=8.1.0,<8.4.0` (#22973 ) Fix https://github.com/langchain-ai/langchain/issues/22972.	2024-06-18 10:34:28 -04:00
Raviraj	858ce264ef	SemanticChunker : Feature Addition ("Semantic Splitting with gradient") (#22895 ) `SemanticChunker` currently provides three methods to split texts semantically: - percentile - standard_deviation - interquartile I propose a new method, `gradient`. In this method, the gradient of the distance is used to split chunks, together with the percentile method. This method is useful when chunks are highly correlated with each other or specific to a domain, e.g. legal or medical. The idea is to apply anomaly detection on the gradient array so that the distribution becomes wider and boundaries are easier to identify in highly semantic data. I have tested this merge on a set of 10 domain-specific documents (mostly legal). Details : - Issue: Improvement - Dependencies: NA - Twitter handle: [x.com/prajapat_ravi](https://x.com/prajapat_ravi) @hwchase17 --------- Co-authored-by: Raviraj Prajapat <raviraj.prajapat@sirionlabs.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-06-17 21:01:08 -07:00
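The gradient idea in the commit above can be sketched in plain Python. This is a simplified illustration, not the `SemanticChunker` implementation: the helper names, the finite-difference scheme, and the 90th-percentile threshold are all assumptions made for the example.

```python
from typing import List


def gradient(values: List[float]) -> List[float]:
    """Finite-difference gradient with unit spacing (one-sided at the ends)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    grads = [values[1] - values[0]]
    for i in range(1, n - 1):
        grads.append((values[i + 1] - values[i - 1]) / 2)
    grads.append(values[-1] - values[-2])
    return grads


def percentile(values: List[float], pct: float) -> float:
    """Linearly interpolated percentile of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def gradient_breakpoints(distances: List[float], pct: float = 90.0) -> List[int]:
    """Indices where the distance gradient exceeds the chosen percentile."""
    grads = gradient(distances)
    threshold = percentile(grads, pct)
    return [i for i, g in enumerate(grads) if g > threshold]


# Hypothetical distances between consecutive sentence embeddings.
distances = [0.10, 0.12, 0.11, 0.50, 0.52, 0.51]
breakpoints = gradient_breakpoints(distances)
print(breakpoints)
```

Thresholding the gradient rather than the raw distance picks out the position where the distance changes most sharply, which is the behavior the commit message motivates for highly correlated text.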
Raghav Dixit	55705c0f5e	LanceDB integration update (#22869 ) Added : - [x] relevance search (w/wo scores) - [x] maximal marginal search - [x] image ingestion - [x] filtering support - [x] hybrid search w reranking make test, lint_diff and format checked.	2024-06-17 20:54:26 -07:00
Chang Liu	62c8a67f56	community: add KafkaChatMessageHistory (#22216 ) Add chat history store based on Kafka. Files added: `libs/community/langchain_community/chat_message_histories/kafka.py` `docs/docs/integrations/memory/kafka_chat_message_history.ipynb` New issue to be created for future improvement: 1. Async method implementation. 2. Message retrieval based on timestamp. 3. Support for other configs when connecting to cloud hosted Kafka (e.g. add `api_key` field) 4. Improve unit testing & integration testing.	2024-06-17 20:34:01 -07:00
shimajiroxyz	3e835a1aa1	langchain: add id_key option to EnsembleRetriever for metadata-based document merging (#22950 ) Description:
- What I changed
  - By specifying the `id_key` during the initialization of `EnsembleRetriever`, it is now possible to determine which documents to merge scores for based on the value corresponding to the `id_key` element in the metadata, instead of `page_content`. Below is an example of how to use the modified `EnsembleRetriever`:
```python
retriever = EnsembleRetriever(retrievers=[ret1, ret2], id_key="id")
# The Document returned by each retriever must keep the "id" key in its metadata.
```
  - Additionally, I added a script to easily test the behavior of the `invoke` method of the modified `EnsembleRetriever`.
- Why I changed
  - There are cases where you may want to calculate scores by treating Documents with different `page_content` as the same when using `EnsembleRetriever`. For example, when you want to ensemble the search results of the same document described in two different languages.
  - The previous `EnsembleRetriever` used `page_content` as the basis for score aggregation, making the above usage difficult. Therefore, the score is now calculated based on the specified key value in the Document's metadata.
Twitter handle: @shimajiroxyz	2024-06-18 03:29:17 +00:00
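To see why the merge key matters, the sketch below fuses two ranked result lists using a weighted reciprocal-rank score. It is a simplified stand-in, not the `EnsembleRetriever` code: documents are plain dictionaries, `fuse` is a hypothetical helper, and the scoring formula `weight / (rank + c)` with `c=60` is an assumption about how the rank fusion works.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

Doc = Dict[str, Any]


def fuse(
    result_lists: List[List[Doc]],
    weights: List[float],
    id_key: Optional[str] = None,
    c: int = 60,
) -> List[Doc]:
    """Merge ranked lists, identifying documents by metadata[id_key]
    when given, otherwise by page_content."""
    scores: Dict[Any, float] = defaultdict(float)
    first_seen: Dict[Any, Doc] = {}
    for docs, weight in zip(result_lists, weights):
        for rank, doc in enumerate(docs, start=1):
            ident = doc["metadata"][id_key] if id_key else doc["page_content"]
            scores[ident] += weight / (rank + c)
            first_seen.setdefault(ident, doc)
    ordered = sorted(scores, key=scores.get, reverse=True)
    return [first_seen[ident] for ident in ordered]


english = [
    {"page_content": "cats", "metadata": {"id": 1}},
    {"page_content": "dogs", "metadata": {"id": 2}},
]
french = [
    {"page_content": "chiens", "metadata": {"id": 2}},  # same document as "dogs"
    {"page_content": "oiseaux", "metadata": {"id": 3}},
]

by_id = fuse([english, french], [0.5, 0.5], id_key="id")
by_content = fuse([english, french], [0.5, 0.5])
print([d["metadata"]["id"] for d in by_id])
print(len(by_content))
```

Keyed by `id`, the two translations of document 2 pool their scores and rank first; keyed by `page_content`, they are treated as unrelated documents.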
mackong	39f6c4169d	langchain[patch]: add tool messages formatter for tool calling agent (#22849 ) - Description: add tool_messages_formatter for tool calling agent, so that tool messages can be formatted in different ways for your LLM. - Issue: N/A - Dependencies: N/A	2024-06-17 20:29:00 -07:00
Lucas Tucker	e25a5966b5	docs: Standardize DocumentLoader docstrings (#22932 ) Standardizing DocumentLoader docstrings (of which there are many) This PR addresses issue #22866 and adds docstrings according to the issue's specified format (in the appendix) for files csv_loader.py and json_loader.py in langchain_community.document_loaders. In particular, the following sections have been added to both CSVLoader and JSONLoader: Setup, Instantiate, Load, Async load, and Lazy load. It may be worth adding a 'Metadata' section to the JSONLoader docstring to clarify how we want to extract the JSON metadata (using the `metadata_func` argument). The files I used to walkthrough the various sections were `example_2.json` from [HERE](https://support.oneskyapp.com/hc/en-us/articles/208047697-JSON-sample-files) and `hw_200.csv` from [HERE](https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html). --------- Co-authored-by: lucast2021 <lucast2021@headroyce.org> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-06-18 03:26:36 +00:00
Mohammad Mohtashim	60ba02f5db	[Community]: Fixed DDG DuckDuckGoSearchResults Docstring (#22968 ) - Description: A very small fix in the Docstring of `DuckDuckGoSearchResults` identified in the following issue. - Issue: #22961 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-18 03:16:24 +00:00
Eun Hye Kim	70761af8cf	community: Fix #22975 (Add SSL Verification Option to Requests Class in langchain_community) (#22977 )
- Description:
  - Added an optional verify parameter to the Requests class with a default value of True.
  - Modified the get, post, patch, put, and delete methods to include the verify parameter.
  - Updated the _arequest async context manager to include the verify parameter.
  - Added the verify parameter to the GenericRequestsWrapper class and passed it to the Requests class.
- Issue: This PR fixes issue #22975.
- Dependencies: No additional dependencies are required for this change.
- Twitter handle: @lunara_x
You can check this change with the code below.
```python
import yaml
from langchain_openai.chat_models import ChatOpenAI
from langchain.requests import RequestsWrapper
from langchain_community.agent_toolkits.openapi import planner
from langchain_community.agent_toolkits.openapi.spec import reduce_openapi_spec

# Load the OpenAPI spec and reduce it for the agent.
with open("swagger.yaml") as f:
    data = yaml.load(f, Loader=yaml.FullLoader)
swagger_api_spec = reduce_openapi_spec(data)

llm = ChatOpenAI(model='gpt-4o')
swagger_requests_wrapper = RequestsWrapper(verify=False)  # modified point
superset_agent = planner.create_openapi_agent(
    swagger_api_spec,
    swagger_requests_wrapper,
    llm,
    allow_dangerous_requests=True,
    handle_parsing_errors=True,
)
superset_agent.run(
    "Tell me the number and types of charts and dashboards available."
)
```
--------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-18 03:12:40 +00:00
Mohammad Mohtashim	bf839676c7	[Community]: Fixed the DocumentDBVectorSearch `_similarity_search_without_score` (#22970 ) - Description: PR #22777 introduced a bug in `_similarity_search_without_score` that raised an `OperationFailure` error. The cause was a syntax error in the MongoDB pipeline, which has now been corrected. - Issue: #22770	2024-06-17 20:08:42 -07:00
Nuno Campos	f01f12ce1e	Include "no escape" and "inverted section" mustache vars in Prompt.input_variables and Prompt.input_schema (#22981 )	2024-06-17 19:24:13 -07:00
Bagatur	c2b2e3266c	core[minor]: message transformer utils (#22752 )	2024-06-17 15:30:07 -07:00
Anders Swanson	aacc6198b9	community: OCI GenAI embedding batch size (#22986 ) Issue: #22985 --------- Signed-off-by: Anders Swanson <anders.swanson@oracle.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-06-17 22:06:45 +00:00
Bagatur	8235bae48e	core[patch]: Release 0.2.8 (#23012 )	2024-06-17 20:55:39 +00:00
Nuno Campos	bd4b68cd54	core: run_in_executor: Wrap StopIteration in RuntimeError (#22997 ) - StopIteration can't be set on an asyncio.Future: doing so raises a TypeError and leaves the Future pending forever, so we need to convert it to a RuntimeError	2024-06-17 20:40:01 +00:00
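The conversion described in this commit can be sketched with the standard library. This is an illustration of the technique, not the `langchain_core` code: `run_in_executor_safe` and `exhausted` are hypothetical names, and the error message is made up for the example.

```python
import asyncio
from typing import Any, Callable


async def run_in_executor_safe(func: Callable[..., Any], *args: Any) -> Any:
    """Run func in the default executor, converting StopIteration
    into RuntimeError before it reaches the asyncio Future."""

    def wrapper() -> Any:
        try:
            return func(*args)
        except StopIteration as exc:
            # StopIteration cannot be stored on a Future, so re-raise it
            # as a RuntimeError that keeps the original as its cause.
            raise RuntimeError("StopIteration raised in executor") from exc

    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, wrapper)


def exhausted() -> Any:
    """Hypothetical function that leaks a StopIteration."""
    return next(iter([]))


async def main() -> str:
    try:
        await run_in_executor_safe(exhausted)
    except RuntimeError as err:
        return type(err.__cause__).__name__
    return "no error"


result = asyncio.run(main())
print(result)
```

Chaining with `from exc` keeps the original exception reachable through `__cause__`, so the caller can still tell what went wrong.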
Bagatur	d96f67b06f	standard-tests[patch]: Update chat model standard tests (#22378 ) - Refactor standard test classes to make them easier to configure - Update openai to support stop_sequences init param - Update groq to support stop_sequences init param - Update fireworks to support max_retries init param - Update ChatModel.bind_tools to type tool_choice - Update groq to handle tool_choice="any". this may be controversial --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-06-17 13:37:41 -07:00
Oguz Vuruskaner	dd25d08c06	community[minor]: add tool calling for DeepInfraChat (#22745 ) DeepInfra now supports tool calling for supported models. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-17 15:21:49 -04:00
maang-h	c6b7db6587	community: Add Baichuan Embeddings batch size (#22942 ) - Support batch size: Baichuan updated its documentation, indicating that up to 16 documents can be imported at a time - Standardized model init arg names - baichuan_api_key -> api_key - model_name -> model	2024-06-17 14:11:04 -04:00
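Respecting a per-request limit like the 16-document cap mentioned above usually comes down to slicing the input into batches. The sketch below shows only that step; `batched` is a hypothetical helper, and no embedding API is called.

```python
from typing import Iterator, List


def batched(texts: List[str], batch_size: int = 16) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(texts), batch_size):
        yield texts[start : start + batch_size]


documents = [f"doc {i}" for i in range(40)]
sizes = [len(batch) for batch in batched(documents)]
print(sizes)
```

Each batch would then be sent as one request, with the per-batch results concatenated in order.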
ccurme	722c8f50ea	openai[patch]: add stream_usage parameter (#22854 ) Here we add `stream_usage` to ChatOpenAI as: 1. a boolean attribute 2. a kwarg to _stream and _astream. Question: should the `stream_usage` attribute be `bool`, or `bool \| None`? Currently I've kept it `bool` and defaulted to False. It was implemented on [ChatAnthropic](`e832bbb486/libs/partners/anthropic/langchain_anthropic/chat_models.py (L535)`) as a bool. However, to maintain support for users who access the behavior via OpenAI's `stream_options` param, this ends up being possible: ```python llm = ChatOpenAI(model_kwargs={"stream_options": {"include_usage": True}}) assert not llm.stream_usage ``` (and this model will stream token usage). Some options for this: - it's ok - make the `stream_usage` attribute bool or None - make an \_\_init\_\_ for ChatOpenAI, set a `._stream_usage` attribute and read `.stream_usage` from a property Open to other ideas as well.	2024-06-17 13:35:18 -04:00
Shubham Pandey	56ac94e014	community[minor]: add `ChatSnowflakeCortex` chat model (#21490 ) Description: This PR adds a chat model integration for [Snowflake Cortex](https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions), which gives instant access to industry-leading large language models (LLMs) trained by researchers at companies like Mistral, Reka, Meta, and Google, including [Snowflake Arctic](https://www.snowflake.com/en/data-cloud/arctic/), an open enterprise-grade model developed by Snowflake. Dependencies: Snowflake's [snowpark](https://pypi.org/project/snowflake-snowpark-python/) library is required for using this integration. Twitter handle: [@gethouseware](https://twitter.com/gethouseware) - [x] Add tests and docs: 1. integration tests: `libs/community/tests/integration_tests/chat_models/test_snowflake.py` 2. unit tests: `libs/community/tests/unit_tests/chat_models/test_snowflake.py` 3. example notebook: `docs/docs/integrations/chat/snowflake.ipynb`	2024-06-17 09:47:05 -07:00
Bagatur	e2304ebcdb	standard-tests[patch]: Release 0.1.1 (#22984 )	2024-06-17 15:31:34 +00:00
Hakan Özdemir	c437b1aab7	[Partner]: Add metadata to stream response (#22716 ) Adds `response_metadata` to stream responses from OpenAI. This is returned with `invoke` normally, but wasn't implemented for `stream`. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-06-17 09:46:50 -04:00
Bagatur	9ff249a38d	standard-tests[patch]: don't require str chunk contents (#22965 )	2024-06-17 08:52:24 -04:00
Christopher Tee	ada03dd273	community(you): Better support for You.com News API (#22622 ) ## Description While `YouRetriever` supports both You.com's Search and News APIs, news is supported as an afterthought. More specifically, not all of the News API parameters are exposed to the user, only those that happen to overlap with the Search API. This PR: - improves support for both APIs, exposing the remaining News API parameters while retaining backward compatibility - refactors some REST parameter generation logic - updates the docstring of `YouSearchAPIWrapper` - adds input validation and warnings to ensure parameters are properly set by the user - 🚨 Breaking: Limits the news results to `k` items	2024-06-15 20:05:19 +00:00
Tomaz Bratanic	1c661fd849	Improve llm graph transformer docstring (#22939 )	2024-06-15 15:33:26 -04:00
maang-h	7a0af56177	docs: update ZhipuAI ChatModel docstring (#22934 ) - Description: Update ZhipuAI ChatModel rich docstring - Issue: the issue #22296	2024-06-15 09:12:21 -04:00
Bitmonkey	570d45b2a1	Update ollama.py with optional raw setting. (#21486 ) Ollama has a raw option now. https://github.com/ollama/ollama/blob/main/docs/api.md --------- Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-06-14 17:19:26 -07:00
caiyueliang	9944ad7f5f	community: Solve the issue where the _search function in ElasticsearchStore supports passing a query_vector parameter, but the parameter does not take effect. (#21532 ) Issue: When using the similarity_search_with_score function in ElasticsearchStore, I expected to be able to pass in a query_vector that I had already obtained. The _search function does accept the query_vector parameter, but it has no effect. This PR attempts to resolve the issue. Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-06-14 17:13:11 -07:00
Erick Friis	c374c98389	experimental: release 0.0.61 (#22924 )	2024-06-14 15:55:07 -07:00
BuxianChen	af65cac609	cli[minor]: remove redefined DEFAULT_GIT_REF (#21471 ) remove redefined DEFAULT_GIT_REF Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-06-14 15:49:15 -07:00
Erick Friis	79a64207f5	community: release 0.2.5 (#22923 )	2024-06-14 15:45:07 -07:00
Jiejun Tan	c8c67dde6f	text-splitters[patch]: Fix HTMLSectionSplitter (#22812 ) Update to a former pull request: https://github.com/langchain-ai/langchain/pull/22654. Modified `langchain_text_splitters.HTMLSectionSplitter`: in the latest version, the function `split_html_by_headers` uses a `dict` to store sections from an HTML document, with the header/section element names serving as dict keys. This is a problem when duplicate header/section element names are present in a single HTML document, because later ones replace earlier ones with the same name. As a result, some content can be lost after HTML text splitting. Using a list to store sections should solve the problem. A unit test covering duplicate header names has been added. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-14 22:40:39 +00:00
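The data-structure problem described in this commit is easy to reproduce in isolation. The snippet below is a standalone illustration, not the splitter code; the section names and contents are made up.

```python
# Sections as (header, content) pairs, with a repeated header name.
parsed = [
    ("Introduction", "first intro"),
    ("Details", "body text"),
    ("Introduction", "second intro"),
]

# Keyed by header name: the later "Introduction" overwrites the earlier one.
as_dict = {}
for header, content in parsed:
    as_dict[header] = content

# Stored as a list: every section survives, in document order.
as_list = [{"header": header, "content": content} for header, content in parsed]

print(len(as_dict), len(as_list))
```

With the dictionary, "first intro" is silently dropped, which is the content loss the fix addresses.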
Erick Friis	fbeeb6da75	langchain: release 0.2.5 (#22922 )	2024-06-14 15:37:54 -07:00
Baskar Gopinath	c4f2bc9540	docs: Fix wrongly referenced class name in confluence.py (#22879 ) Fixes #22542 Changed ConfluenceReader to ConfluenceLoader	2024-06-14 14:00:48 -07:00
Erick Friis	9ef15691d6	core: release 0.2.7 (#22917 )	2024-06-14 20:03:58 +00:00
Nuno Campos	338180f383	core: in astream_events v2 always await task even if already finished (#22916 ) - this ensures exceptions propagate to the caller	2024-06-14 19:54:20 +00:00
Istvan/Nebulinq	513e491ce9	experimental: LLMGraphTransformer - added relationship properties. (#21856 ) - Description: The generated relationships in the graph had no properties, but the Relationship class was properly defined with properties. This made it very difficult to transform conditional sentences into a graph. Adding properties to relationships can solve this issue elegantly. The changes expand on the existing LLMGraphTransformer implementation but add the possibility to define allowed relationship properties like this: LLMGraphTransformer(llm=llm, relationship_properties=["Condition", "Time"],) - Issue: no issue found - Dependencies: n/a - Twitter handle: @IstvanSpace -Quick Test ================================================================= from dotenv import load_dotenv import os from langchain_community.graphs import Neo4jGraph from langchain_experimental.graph_transformers import LLMGraphTransformer from langchain_openai import ChatOpenAI from langchain_core.prompts import ChatPromptTemplate from langchain_core.documents import Document load_dotenv() os.environ["NEO4J_URI"] = os.getenv("NEO4J_URI") os.environ["NEO4J_USERNAME"] = os.getenv("NEO4J_USERNAME") os.environ["NEO4J_PASSWORD"] = os.getenv("NEO4J_PASSWORD") graph = Neo4jGraph() llm = ChatOpenAI(temperature=0, model_name="gpt-4o") llm_transformer = LLMGraphTransformer(llm=llm) #text = "Harry potter likes pies, but only if it rains outside" text = "Jack has a dog named Max. Jack only walks Max if it is sunny outside." 
documents = [Document(page_content=text)] llm_transformer_props = LLMGraphTransformer( llm=llm, relationship_properties=["Condition"], ) graph_documents_props = llm_transformer_props.convert_to_graph_documents(documents) print(f"Nodes:{graph_documents_props[0].nodes}") print(f"Relationships:{graph_documents_props[0].relationships}") graph.add_graph_documents(graph_documents_props) --------- Co-authored-by: Istvan Lorincz <istvan.lorincz@pm.me> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-14 14:41:04 -04:00
kiarina	8171efd07a	core[patch]: Fix FunctionCallbackHandler._on_tool_end (#22908 ) If the global `debug` flag is enabled, the agent will get the following error in `FunctionCallbackHandler._on_tool_end` at runtime. ``` Error in ConsoleCallbackHandler.on_tool_end callback: AttributeError("'list' object has no attribute 'strip'") ``` By calling str() before strip(), the error was avoided. This error can be seen at [debugging.ipynb](https://github.com/langchain-ai/langchain/blob/master/docs/docs/how_to/debugging.ipynb). - Issue: NA - Dependencies: NA - Twitter handle: https://x.com/kiarina37	2024-06-14 17:59:29 +00:00
Philippe PRADOS	b61de9728e	community[minor]: Fix long_context_reorder.py async (#22839 ) Implement `async def atransform_documents( self, documents: Sequence[Document], **kwargs: Any ) -> Sequence[Document]` for `LongContextReorder`	2024-06-14 13:55:18 -04:00
Eugene Yurtsev	c72bcda4f2	community[major], experimental[patch]: Remove Python REPL from community (#22904 ) Remove the REPL from community, and suggest an alternative import from langchain_experimental. Fix for this issue: https://github.com/langchain-ai/langchain/issues/14345 This is not a bug in the code or an actual security risk. The python REPL itself is behaving as expected. The PR is done to appease blanket security policies that are just looking for the presence of exec in the code. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-14 17:53:29 +00:00
Eugene Yurtsev	9a877c7adb	community[patch]: SitemapLoader restrict depth of parsing sitemap (CVE-2024-2965) (#22903 ) This PR restricts the depth to which the sitemap can be parsed. Fix for: CVE-2024-2965	2024-06-14 13:04:40 -04:00
Eugene Yurtsev	4a77a3ab19	core[patch]: fix validation of @deprecated decorator (#22513 ) This PR moves the validation of the decorator to a better place to avoid creating bugs while deprecating code. This prevents issues like https://github.com/langchain-ai/langchain/issues/22510 from arising. At some point we should replace it with a linter that just does static analysis.	2024-06-14 16:52:30 +00:00
Jacob Lee	181a61982f	anthropic[minor]: Adds streaming tool call support for Anthropic (#22687 ) Preserves string content chunks for non tool call requests for convenience. One thing - Anthropic events look like this: ``` RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start') RawContentBlockDeltaEvent(delta=TextDelta(text='<thinking>\nThe', type='text_delta'), index=0, type='content_block_delta') RawContentBlockDeltaEvent(delta=TextDelta(text=' provide', type='text_delta'), index=0, type='content_block_delta') ... RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01GJ6x2ddcMG3psDNNe4eDqb', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start') RawContentBlockDeltaEvent(delta=InputJsonDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta') ``` Note that `delta` has a `type` field. With this implementation, I'm dropping it because `merge_list` behavior will concatenate strings. We currently have `index` as a special field when merging lists, would it be worth adding `type` too? If so, what do we set as a context block chunk? `text` vs. `text_delta`/`tool_use` vs `input_json_delta`? CC @ccurme @efriis @baskaryan	2024-06-14 09:14:43 -07:00
ccurme	f40b2c6f9d	fireworks[patch]: add usage_metadata to (a)invoke and (a)stream (#22906 )	2024-06-14 12:07:19 -04:00
Mohammad Mohtashim	d1b7a934aa	[Community]: HuggingFaceCrossEncoder `score` accounting for <not-relevant score,relevant score> pairs. (#22578 ) - Description: Some Cross-Encoder models return scores in pairs, i.e., <not-relevant score (higher means the document is less relevant to the query), relevant score (higher means the document is more relevant to the query)>. However, the `HuggingFaceCrossEncoder` `score` method does not currently account for this paired output. This PR addresses the issue by modifying the method to consider only the relevant score when scores are provided in pairs. The reason for focusing on the relevant score is that the compressors select the top-n documents based on relevance. - Issue: #22556 - Please also refer to this [comment](https://github.com/UKPLab/sentence-transformers/issues/568#issuecomment-729153075)	2024-06-14 08:28:24 -07:00
Thanh Nguyen	b5e2ba3a47	community[minor]: add chat model llamacpp (#22589 ) - PR title: [community] add chat model llamacpp - PR message: - Description: This PR introduces a new chat model integration with llamacpp_python, designed to work similarly to the existing ChatOpenAI model. + Work well with instructed chat, chain and function/tool calling. + Work with LangGraph (persistent memory, tool calling), will update soon - Dependencies: This change requires the llamacpp_python library to be installed. @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-14 14:51:43 +00:00
ccurme	73c76b9628	anthropic[patch]: always add tool_result type to ToolMessage content (#22721 ) Anthropic tool results can contain image data, which are typically represented with content blocks having `"type": "image"`. Currently, these content blocks are passed as-is as human/user messages to Anthropic, which raises BadRequestError as it expects a tool_result block to follow a tool_use. Here we update ChatAnthropic to nest the content blocks inside a tool_result content block. Example: ```python import base64 import httpx from langchain_anthropic import ChatAnthropic from langchain_core.messages import AIMessage, HumanMessage, ToolMessage from langchain_core.pydantic_v1 import BaseModel, Field # Fetch image image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg" image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8") class FetchImage(BaseModel): should_fetch: bool = Field(..., description="Whether an image is requested.") llm = ChatAnthropic(model="claude-3-sonnet-20240229").bind_tools([FetchImage]) messages = [ HumanMessage(content="Could you summon a beautiful image please?"), AIMessage( content=[ { "type": "tool_use", "id": "toolu_01Rn6Qvj5m7955x9m9Pfxbcx", "name": "FetchImage", "input": {"should_fetch": True}, }, ], tool_calls=[ { "name": "FetchImage", "args": {"should_fetch": True}, "id": "toolu_01Rn6Qvj5m7955x9m9Pfxbcx", }, ], ), ToolMessage( name="FetchImage", content=[ { "type": "image", "source": { "type": "base64", "media_type": "image/jpeg", "data": image_data, }, }, ], tool_call_id="toolu_01Rn6Qvj5m7955x9m9Pfxbcx", ), ] llm.invoke(messages) ``` Trace: https://smith.langchain.com/public/d27e4fc1-a96d-41e1-9f52-54f5004122db/r	2024-06-13 20:14:23 -07:00
Lucas Tucker	7114aed78f	docs: Standardize ChatGroq (#22751 ) Updated the ChatGroq docstring as per issue https://github.com/langchain-ai/langchain/issues/22296: "langchain_groq: updated docstring for ChatGroq in langchain_groq to match the description (in the appendix) provided in issue https://github.com/langchain-ai/langchain/issues/22296." Issue: This PR is in response to issue https://github.com/langchain-ai/langchain/issues/22296, and more specifically the ChatGroq model. In particular, this PR updates the docstring for langchain/libs/partners/groq/langchain_groq/chat_model.py by adding the following sections: Instantiate, Invoke, Stream, Async, Tool calling, Structured Output, and Response metadata. I used the template from the Anthropic implementation and referenced the Appendix of the original issue post. I also noted that: `usage_metadata` returns None for all ChatGroq models I tested; there is no mention of image input in the ChatGroq documentation; unlike that of ChatHuggingFace, `.stream(messages)` for ChatGroq returned blocks of output. --------- Co-authored-by: lucast2021 <lucast2021@headroyce.org> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-14 03:08:36 +00:00
Anush	e002c855bd	qdrant[patch]: Use collection_exists API instead of exceptions (#22764 ) ## Description Currently, the Qdrant integration relies on exceptions raised by [`get_collection` ](https://qdrant.tech/documentation/concepts/collections/#collection-info) to check if a collection exists. Using [`collection_exists`](https://qdrant.tech/documentation/concepts/collections/#check-collection-existence) is recommended to avoid missing any unhandled exceptions. This PR addresses this. ## Testing All integration and unit tests pass. No user-facing changes.	2024-06-13 20:01:32 -07:00
Anindyadeep	c417803908	community[minor]: Prem Templates (#22783 ) This PR adds the Prem Templates feature to ChatPremAI. Additionally, it fixes a minor bug that caused an API auth error when the API key was passed through arguments.	2024-06-13 19:59:28 -07:00
maang-h	1055b9a309	community[minor]: Implement ZhipuAIEmbeddings interface (#22821 ) - Description: Implement ZhipuAIEmbeddings interface, include: - The `embed_query` method - The `embed_documents` method refer to [ZhipuAI Embedding-2](https://open.bigmodel.cn/dev/api#text_embedding) --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-06-13 19:45:11 -07:00
Leonid Ganeline	46c9784127	docs: `ReAct` reference (#22830 ) The `ReAct` is used all across LangChain but it is not referenced properly. Added references to the original paper.	2024-06-13 19:39:28 -07:00
Bagatur	8bd368d07e	cli[patch]: Release 0.0.25 (#22876 )	2024-06-14 02:31:04 +00:00
Isaac Francisco	75e966a2fa	docs, cli[patch]: document loaders doc template (#22862 ) From: https://github.com/langchain-ai/langchain/pull/22290 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-13 19:28:57 -07:00
Kagura Chen	57783c5e55	Fix: lint errors and update Field alias in models.py and AutoSelectionScorer initialization (#22846 ) This PR addresses several lint errors in the core package of LangChain. Specifically, the following issues were fixed: 1. Unexpected keyword argument "required" for "Field" [call-arg] 2. tests/integration_tests/chains/test_cpal.py:263: error: Unexpected keyword argument "narrative_input" for "QueryModel" [call-arg]	2024-06-13 18:18:00 -07:00

4765 Commits