langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-10 01:10:59 +00:00

Author	SHA1	Message	Date
Isaac Francisco	1318d534af	[docs]: minor react change (#24509 )	2024-07-22 10:25:01 -07:00
Jorge Piedrahita Ortiz	10e3982b59	community: sambanova integration minor changes (#24503 ) - Minor changes in sambanova LLM integration - default api - docstrings - minor changes in docs	2024-07-22 17:06:35 +00:00
maang-h	721f709dec	community: Improve QianfanChatEndpoint tool result to model (#24466 ) - Description: `QianfanChatEndpoint` When using tool result to answer questions, the content of the tool is required to be in Dict format. Of course, this can require users to return Dict format when calling the tool, but in order to be consistent with other Chat Models, I think such modifications are necessary.	2024-07-22 11:29:00 -04:00
Chaunte W. Lacewell	02f0a29293	Cookbook: Add Visual RAG example using VDMS (#24353 ) - Description: Adding notebook to demonstrate visual RAG which uses both video scene description generated by open source vision models (ex. video-llama, video-llava etc.) as text embeddings and frames as image embeddings to perform vector similarity search using VDMS. - Issue: N/A - Dependencies: N/A	2024-07-22 11:16:06 -04:00
ccurme	dcba7df2fe	community[patch]: deprecate langchain_community Chroma in favor of langchain_chroma (#24474 )	2024-07-22 11:00:13 -04:00
ccurme	0f7569ddbc	core[patch]: enable RunnableWithMessageHistory without config (#23775 ) Feedback that `RunnableWithMessageHistory` is unwieldy compared to ConversationChain and similar legacy abstractions is common. Legacy chains using memory typically had no explicit notion of threads or separate sessions. To use `RunnableWithMessageHistory`, users are forced to introduce this concept into their code. This possibly felt like unnecessary boilerplate. Here we enable `RunnableWithMessageHistory` to run without a config if the `get_session_history` callable has no arguments. This enables minimal implementations like the following: ```python from langchain_core.chat_history import InMemoryChatMessageHistory from langchain_core.runnables.history import RunnableWithMessageHistory from langchain_openai import ChatOpenAI llm = ChatOpenAI(model="gpt-3.5-turbo-0125") memory = InMemoryChatMessageHistory() chain = RunnableWithMessageHistory(llm, lambda: memory) chain.invoke("Hi I'm Bob") # Hello Bob! chain.invoke("What is my name?") # Your name is Bob. ```	2024-07-22 10:36:53 -04:00
Mohammad Mohtashim	5ade0187d0	[Community]: Prompts Fixed for ZERO_SHOT_REACT React Agent Type in `create_sql_agent` function (#23693 ) - Description: The correct prompts for ZERO_SHOT_REACT were not being used in the `create_sql_agent` function. The specific `SQL_PREFIX` and `SQL_SUFFIX` prompts were not used when the client did not provide any prompts. This is fixed. - Issue: #23585 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-22 14:04:20 +00:00
ZhangShenao	0f6737cbfe	[Vector Store] Fix function `add_texts` in `TencentVectorDB` (#24469 ) Regardless of whether `embedding_func` is set or not, the 'text' attribute of document should be assigned, otherwise the `page_content` in the document of the final search result will be lost	2024-07-22 09:50:22 -04:00
남광우	7ab82eb8cc	langchain: Copy libs/standard-tests folder when building devcontainer (#24470 ) ### Description * Fix `libs/langchain/dev.Dockerfile` file. copy the `libs/standard-tests` folder when building the devcontainer. * `poetry install --no-interaction --no-ansi --with dev,test,docs` command requires this folder, but it was not copied. ### Reference #### Error message when building the devcontainer from the master branch ``` ... [2024-07-20T14:27:34.779Z] ------ > [langchain langchain-dev-dependencies 7/7] RUN poetry install --no-interaction --no-ansi --with dev,test,docs: 0.409 0.409 Directory ../standard-tests does not exist ------ ... ``` #### After the fix Build success at vscode: <img width="866" alt="image" src="https://github.com/user-attachments/assets/10db1b50-6fcf-4dfe-83e1-d93c96aa2317">	2024-07-22 13:46:38 +00:00
rbrugaro	37b89fb7fc	fix RAG with quantized embeddings notebook (#24422 ) 1. Fix HuggingfacePipeline import error to newer partner package 2. Switch to IPEXModelForCausalLM for performance There are no dependency changes since optimum intel is also needed for QuantizedBiEncoderEmbeddings --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-22 13:44:03 +00:00
Thomas Meike	40c02cedaf	langchain[patch]: add async methods to ConversationSummaryBufferMemory (#20956 ) Added asynchronously callable methods according to the ConversationSummaryBufferMemory API documentation. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-22 09:21:43 -04:00
Steve Sharp	cecd875cdc	docs: Update streaming.ipynb (typo fix) (#24483 ) Description: Fixes typo `Le'ts` -> `Let's`. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-07-22 11:09:13 +00:00
Sheng Han Lim	0c6a3fdd6b	langchain: Update ContextualCompressionRetriever base_retriever type to RetrieverLike (#24192 ) Description: When initializing retrievers with `configurable_fields` as base retriever, `ContextualCompressionRetriever` validation fails with the following error: ``` ValidationError: 1 validation error for ContextualCompressionRetriever base_retriever Can't instantiate abstract class BaseRetriever with abstract method _get_relevant_documents (type=type_error) ``` Example code: ```python esearch_retriever = VertexAISearchRetriever( project_id=GCP_PROJECT_ID, location_id="global", data_store_id=SEARCH_ENGINE_ID, ).configurable_fields( filter=ConfigurableField(id="vertex_search_filter", name="Vertex Search Filter") ) # rerank documents with Vertex AI Rank API reranker = VertexAIRank( project_id=GCP_PROJECT_ID, location_id=GCP_REGION, ranking_config="default_ranking_config", ) retriever_with_reranker = ContextualCompressionRetriever( base_compressor=reranker, base_retriever=esearch_retriever ) ``` It seems like the issue stems from ContextualCompressionRetriever insisting that base retrievers must be strictly `BaseRetriever` inherited, and doesn't take into account cases where retrievers need to be chained and can have configurable fields defined. `0a1e475a30/libs/langchain/langchain/retrievers/contextual_compression.py (L15-L22)` This PR proposes that the base_retriever type be set to `RetrieverLike`, similar to how `EnsembleRetriever` validates its list of retrievers: `0a1e475a30/libs/langchain/langchain/retrievers/ensemble.py (L58-L75)`	2024-07-21 14:23:19 -04:00
clement.l	d98b830e4b	community: add flag to toggle progress bar (#24463 ) - Description: Add a flag to determine whether to show progress bar - Issue: n/a - Dependencies: n/a - Twitter handle: n/a --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-20 13:18:02 +00:00
chuanbei888	6b08a33fa4	community: fix QianfanChatEndpoint default model (#24464 ) the baidu_qianfan_endpoint has been changed from ERNIE-Bot-turbo to ERNIE-Lite-8K	2024-07-20 13:00:29 +00:00
Nuno Campos	947628311b	core[patch]: Accept configurable keys top-level (#23806 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-07-20 03:49:00 +00:00
Jesus Martinez	c1d1fc13c2	langchain[patch]: Remove multiagent return_direct validation (#24419 ) Description: When you use Agents with multi-input tools and some of these tools have `return_direct=True`, langchain throws an error related to one validator. This change is implemented in the [JS community](https://github.com/langchain-ai/langchainjs/pull/4643) as well. Issue: This MR resolves #19843 Dependencies: None Co-authored-by: Jesus Martinez <jesusabraham.martinez@tyson.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-07-20 03:27:43 +00:00
Will Badart	74e3d796f1	core[patch]: ensure `iterator_` in scope for `_atransform_stream_with_config` except (#24454 ) Before, if an exception was raised in the outer `try` block in `Runnable._atransform_stream_with_config` before `iterator_` is assigned, the corresponding `finally` block would blow up with an `UnboundLocalError`: ```txt UnboundLocalError: cannot access local variable 'iterator_' where it is not associated with a value ``` By assigning an initial value to `iterator_` before entering the `try` block, this commit ensures that the `finally` can run, and not bury the "true" exception under a "During handling of the above exception [...]" traceback. Thanks for your consideration!	2024-07-20 03:24:04 +00:00
maang-h	7b28359719	docs: Add ChatSparkLLM docstrings (#24449 ) - Description: - Add `ChatSparkLLM` docstrings, the issue #22296 - To support `stream` method	2024-07-19 20:19:14 -07:00
Eugene Yurtsev	5e48f35fba	core[minor]: Relax constraints on type checking for tools and parsers (#24459 ) This will allow tools and parsers to accept pydantic models from any of the following namespaces: * pydantic.BaseModel with pydantic 1 * pydantic.BaseModel with pydantic 2 * pydantic.v1.BaseModel with pydantic 2	2024-07-19 21:47:34 -04:00
Isaac Francisco	838464de25	ollama: init package (#23615 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-20 00:43:29 +00:00
Erick Friis	f4ee3c8a22	infra: add min version testing to pr test flow (#24358 ) xfailing some sql tests that do not currently work on sqlalchemy v1 #22207 was very much not sqlalchemy v1 compatible. Moving forward, implementations should be compatible with both to pass CI	2024-07-19 22:03:19 +00:00
Erick Friis	50cb0a03bc	docs: advanced feature note (#24456 ) fixes #24430	2024-07-19 20:05:59 +00:00
Bagatur	842065a9cc	community[patch]: Release 0.2.9 (#24453 )	2024-07-19 12:50:22 -07:00
Bagatur	27ad6a4bb3	langchain[patch]: Release 0.2.10 (#24452 )	2024-07-19 12:50:13 -07:00
Bagatur	dda9438e87	community[patch]: gpt-4o-mini costs (#24421 )	2024-07-19 19:02:44 +00:00
Eugene Yurtsev	604dfe2d99	community[patch]: Force opt-in for WebResearchRetriever (CVE-2024-3095) (#24451 ) This PR addresses the issue raised by (CVE-2024-3095) https://huntr.com/bounties/e62d4895-2901-405b-9559-38276b6a5273 Unfortunately, we didn't do a good job writing the initial report. It's pointing at both the wrong package and the wrong code. The affected code is the Web Retriever not the AsyncHTMLLoader, and the WebRetriever lives in langchain-community The vulnerable code lives here: `0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L233-L233)` This PR adds a forced opt-in for users to make sure they are aware of the risk and can mitigate by configuring a proxy: `0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L84-L84)`	2024-07-19 18:51:35 +00:00
Bagatur	f101c759ed	docs: how to pass runtime secrets (#24450 )	2024-07-19 18:36:28 +00:00
Asi Greenholts	372c27f2e5	community[minor]: [GoogleApiYoutubeLoader] Replace API used in _get_document_for_channel from search to playlistItem (#24034 ) - Description: Search has a limit of 500 results, playlistItems doesn't. Added a class in except clause to catch another common error. - Issue: None - Dependencies: None - Twitter handle: @TupleType --------- Co-authored-by: asi-cider <88270351+asi-cider@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 14:04:34 -04:00
Rafael Pereira	6a45bf9554	community[minor]: GraphCypherQAChain to accept additional inputs as provided by the user for cypher generation (#24300 ) Description: This PR introduces a change to the `cypher_generation_chain` to dynamically concatenate inputs. This improvement aims to streamline the input handling process and make the method more flexible. The change involves updating the arguments dictionary with all elements from the `inputs` dictionary, ensuring that all necessary inputs are dynamically appended. This will ensure that any cypher generation template will not require a new `_call` method patch. Issue: This PR fixes issue #24260.	2024-07-19 14:03:14 -04:00
Philippe PRADOS	f5856680fe	community[minor]: add mongodb byte store (#23876 ) The `MongoDBStore` can manage only documents. It's not possible to use MongoDB for an `CacheBackedEmbeddings`. With this new implementation, it's possible to use: ```python CacheBackedEmbeddings.from_bytes_store( underlying_embeddings=embeddings, document_embedding_cache=MongoDBByteStore( connection_string=db_uri, db_name=db_name, collection_name=collection_name, ), ) ``` and use MongoDB to cache the embeddings !	2024-07-19 13:54:12 -04:00
yabooung	07715f815b	community[minor]: Add ability to specify file encoding and json encoding for FileChatMessageHistory (#24258 ) Description: Add UTF-8 encoding support Issue: Inability to properly handle characters from certain languages (e.g., Korean) Fix: Implement UTF-8 encoding in FileChatMessageHistory --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 13:53:21 -04:00
Dristy Srivastava	020cc1cf3e	Community[minor]: Added checksum while sending data to pebblo-cloud (#23968 ) - Description: - Updated checksum in doc metadata - Sending checksum and removing actual content, while sending data to `pebblo-cloud` if `classifier-location` is `pebblo-cloud` in `/loader/doc` API - Adding `pb_id` i.e. pebblo id to doc metadata - Refactoring as needed. - Sending `content-checksum` and removing actual content, while sending data to `pebblo-cloud` if `classifier-location` is `pebblo-cloud` in `prompt` API - Issue: NA - Dependencies: NA - Tests: Updated - Docs NA --------- Co-authored-by: dristy.cd <dristy@clouddefense.io>	2024-07-19 13:52:54 -04:00
Eun Hye Kim	9aae8ef416	core[patch]: Fix utils.json_schema.dereference_refs (#24335 KeyError: 400 in JSON schema processing) (#24337 ) Description: This PR fixes a KeyError: 400 that occurs in the JSON schema processing within the reduce_openapi_spec function. The _retrieve_ref function in json_schema.py was modified to handle missing components gracefully by continuing to the next component if the current one is not found. This ensures that the OpenAPI specification is fully interpreted and the agent executes without errors. Issue: Fixes issue #24335 Dependencies: No additional dependencies are required for this change. Twitter handle: @lunara_x	2024-07-19 13:31:00 -04:00
keval dekivadiya	06f47678ae	community[minor]: Add TextEmbed Embedding Integration (#22946 ) Description: TextEmbed is a high-performance embedding inference server designed to provide a high-throughput, low-latency solution for serving embeddings. It supports various sentence-transformer models and includes the ability to deploy image and text embedding models. TextEmbed offers flexibility and scalability for diverse applications. - PyPI Package: [TextEmbed on PyPI](https://pypi.org/project/textembed/) - Docker Image: [TextEmbed on Docker Hub](https://hub.docker.com/r/kevaldekivadiya/textembed) - GitHub Repository: [TextEmbed on GitHub](https://github.com/kevaldekivadiya2415/textembed) PR Description This PR adds functionality for embedding documents and queries using the `TextEmbedEmbeddings` class. The implementation allows for both synchronous and asynchronous embedding requests to a TextEmbed API endpoint. The class handles batching and permuting of input texts to optimize the embedding process. Example Usage: ```python from langchain_community.embeddings import TextEmbedEmbeddings # Initialise the embeddings class embeddings = TextEmbedEmbeddings(model="your-model-id", api_key="your-api-key", api_url="your_api_url") # Define a list of documents documents = [ "Data science involves extracting insights from data.", "Artificial intelligence is transforming various industries.", "Cloud computing provides scalable computing resources over the internet.", "Big data analytics helps in understanding large datasets.", "India has a diverse cultural heritage." ] # Define a query query = "What is the cultural heritage of India?" # Embed all documents document_embeddings = embeddings.embed_documents(documents) # Embed the query query_embedding = embeddings.embed_query(query) # Print embeddings for each document for i, embedding in enumerate(document_embeddings): print(f"Document {i+1} Embedding:", embedding) # Print the query embedding print("Query Embedding:", query_embedding) ``` --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-19 17:30:25 +00:00
Shikanime Deva	9c3da11910	Fix MultiQueryRetriever breaking Embeddings with empty lines (#21093 ) Fix MultiQueryRetriever breaking Embeddings with empty lines ``` [chain/end] [1:chain:ConversationalRetrievalChain > 2:retriever:Retriever > 3:retriever:Retriever > 4:chain:LLMChain] [2.03s] Exiting Chain run with output: [outputs] > /workspaces/Sfeir/sncf/metabot-backend/.venv/lib/python3.11/site-packages/langchain/retrievers/multi_query.py(116)_aget_relevant_documents() -> if self.include_original: (Pdb) queries ['## Alternative questions for "Hello, tell me about phones?":', '', '1. What are the latest trends in smartphone technology? (Focuses on recent advancements)', '2. How has the mobile phone industry evolved over the years? (Historical perspective)', '3. What are the different types of phones available in the market, and which one is best for me? (Categorization and recommendation)'] ``` Example of failure on VertexAIEmbeddings ``` grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.INVALID_ARGUMENT details = "The text content is empty." debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.184.234:443 {created_time:"2024-04-30T09:57:45.625698408+00:00", grpc_status:3, grpc_message:"The text content is empty."}" ``` Fixes: #15959 --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-19 17:13:12 +00:00
John Kelly	5affbada61	langchain: Add `aadd_documents` to `ParentDocumentRetriever` (#23969 ) - Description: Add an async version of `add_documents` to `ParentDocumentRetriever` - Twitter handle: @johnkdev --------- Co-authored-by: John Kelly <j.kelly@mwam.com> Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 13:12:39 -04:00
Andrew Benton	f9d64d22e5	community[minor]: Add Riza Python/JS code execution tool (#23995 ) - Description: Add Riza Python/JS code execution tool - Issue: N/A - Dependencies: an optional dependency on the `rizaio` pypi package - Twitter handle: [@rizaio](https://x.com/rizaio) [Riza](https://riza.io) is a safe code execution environment for agent-generated Python and JavaScript that's easy to integrate into langchain apps. This PR adds two new tool classes to the community package.	2024-07-19 17:03:22 +00:00
Ben Chambers	3691701d58	community[minor]: Add keybert-based link extractor (#24311 ) - Description: Add a `KeybertLinkExtractor` for graph vectorstores. This allows extracting links from keywords in a Document and linking nodes that have common keywords. - Issue: None - Dependencies: None. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-19 12:25:07 -04:00
Erick Friis	ef049769f0	core[patch]: Release 0.2.22 (#24423 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-07-19 09:09:24 -07:00
Bagatur	cd19ba9a07	core[patch]: core lint fix (#24447 )	2024-07-19 09:01:22 -07:00
Ben Chambers	83f3d95ffa	community[minor]: GLiNER link extraction (#24314 ) - Description: This allows extracting links between documents with common named entities using [GLiNER](https://github.com/urchade/GLiNER). - Issue: None - Dependencies: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 15:34:54 +00:00
Anas Khan	b5acb91080	Mask API keys for various LLM/ChatModel Modules (#13885 ) Description: - Added masking of the API Keys for the modules: - `langchain/chat_models/openai.py` - `langchain/llms/openai.py` - `langchain/llms/google_palm.py` - `langchain/chat_models/google_palm.py` - `langchain/llms/edenai.py` - Updated the modules to utilize `SecretStr` from pydantic to securely manage API key. - Added unit/integration tests - `langchain/chat_models/azure_openai.py` used the `openai_api_key` that is derived from the `ChatOpenAI` Class and it was assuming `openai_api_key` is a str so we changed it to expect `SecretStr` instead. Issue: https://github.com/langchain-ai/langchain/issues/12165 , Dependencies: none, Tag maintainer: @eyurtsev --------- Co-authored-by: HassanA01 <anikeboss@gmail.com> Co-authored-by: Aneeq Hassan <aneeq.hassan@utoronto.ca> Co-authored-by: kristinspenc <kristinspenc2003@gmail.com> Co-authored-by: faisalt14 <faisalt14@gmail.com> Co-authored-by: Harshil-Patel28 <76663814+Harshil-Patel28@users.noreply.github.com> Co-authored-by: kristinspenc <146893228+kristinspenc@users.noreply.github.com> Co-authored-by: faisalt14 <90787271+faisalt14@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-19 15:23:34 +00:00
ccurme	f99369a54c	community[patch]: fix formatting (#24443 ) Somehow this got through CI: https://github.com/langchain-ai/langchain/pull/24363	2024-07-19 14:38:53 +00:00
Ben Chambers	242b085be7	Merge pull request #24315 * community: Add Hierarchy link extractor * add example * lint	2024-07-19 09:42:26 -04:00
Rhuan Barros	c3308f31bc	Merge pull request #24363 * important email fields	2024-07-19 09:41:20 -04:00
Piotr Romanowski	c50dd79512	docs: Update langchain-openai package version in chat_token_usage_tracking (#24436 ) This PR updates docs to mention correct version of the `langchain-openai` package required to use the `stream_usage` parameter. As it can be noticed in the details of this [merge commit](`722c8f50ea`), that functionality is available only in `langchain-openai >= 0.1.9` while docs state it's available in `langchain-openai >= 0.1.8`.	2024-07-19 13:07:37 +00:00
Han Sol Park	aade9bfde5	Mask API key for ChatOpenAI based chat_models (#14293 ) - Description: Mask API key for ChatOpenAi based chat_models (openai, azureopenai, anyscale, everlyai). Made changes to all chat_models that are based on ChatOpenAI since all of them assumes that openai_api_key is str rather than SecretStr. - Issue:: #12165 - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: N/A --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-19 02:25:38 +00:00
William FH	0ee6ed76ca	[Evaluation] Pass in seed directly (#24403 ) adding test rn	2024-07-18 19:12:28 -07:00
Nuno Campos	62b6965d2a	core: In ensure_config don't copy dunder configurable keys to metadata (#24420 )	2024-07-18 22:28:52 +00:00

10594 Commits