langchain

Commit Graph

Author	SHA1	Message	Date
Eugene Yurtsev	3c917204dc	core[patch]: Add doc-strings to outputs, fix @root_validator (#23190 ) - Document outputs namespace - Update a vanilla @root_validator that was missed	3 months ago
Bagatur	8698cb9b28	infra: add more formatter rules to openai (#23189 ) Turns on https://docs.astral.sh/ruff/settings/#format_docstring-code-format and https://docs.astral.sh/ruff/settings/#format_skip-magic-trailing-comma ```toml [tool.ruff.format] docstring-code-format = true skip-magic-trailing-comma = true ```	3 months ago
Michał Krassowski	710197e18c	community[patch]: restore compatibility with SQLAlchemy 1.x (#22546 ) - Description: Restores compatibility with SQLAlchemy 1.4.x that was broken since #18992 and adds a test run for this version on CI (only for Python 3.11) - Issue: fixes #19681 - Dependencies: None - Twitter handle: `@krassowski_m` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	3 months ago
Erick Friis	48d6ea427f	upstage: move to external repo (#22506 )	3 months ago
Bagatur	0a4ee864e9	openai[patch]: image token counting (#23147 ) Resolves #23000 --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: ccurme <chester.curme@gmail.com>	3 months ago
Jorge Piedrahita Ortiz	b3e53ffca0	community[patch]: sambanova llm integration improvement (#23137 ) - Description: sambanova sambaverse integration improvement: removed input parsing that altered raw user input and made it mandatory to set the process prompt parameter to true	3 months ago
Jorge Piedrahita Ortiz	e162893d7f	community[patch]: update sambastudio embeddings (#23133 ) Description: update sambastudio embeddings integration, now compatible with generic endpoints and CoE endpoints	3 months ago
Philippe PRADOS	db6f46c1a6	langchain[small]: Change type to BasePromptTemplate (#23083 ) Change ```python from_llm( prompt: PromptTemplate ... ) ``` to ```python from_llm( prompt: BasePromptTemplate ... ) ```	3 months ago
Sergey Kozlov	94452a94b1	core[patch]: add exceptions propagation test for astream_events v2 (#23159 ) Description: `astream_events(version="v2")` didn't propagate exceptions in `langchain-core<=0.2.6`; this was fixed in #22916. This PR adds a unit test to check that exceptions are propagated upwards. Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>	3 months ago
Leonid Ganeline	50484be330	prompty: docstring (#23152 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference) --------- Co-authored-by: ccurme <chester.curme@gmail.com>	3 months ago
chenxi	505a2e8743	fix: MoonshotChat fails when setting the moonshot_api_key through the OS environment. (#23176 ) Close #23174 Co-authored-by: tianming <tianming@bytenew.com>	3 months ago
Bagatur	677408bfc9	core[patch]: fix chat history circular import (#23182 )	3 months ago
Eugene Yurtsev	883e90d06e	core[patch]: Add an example to the Document schema doc-string (#23131 ) Add an example to the document schema	3 months ago
ccurme	2b08e9e265	core[patch]: update test to catch circular imports (#23172 ) This raises ImportError due to a circular import: ```python from langchain_core import chat_history ``` This does not: ```python from langchain_core import runnables from langchain_core import chat_history ``` Here we update `test_imports` to run each import in a separate subprocess. Open to other ways of doing this!	3 months ago
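The subprocess approach described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual `test_imports` code; the helper name `imports_cleanly` is hypothetical. Running each import in a fresh interpreter means an earlier import cannot mask a circular import in a later one.

```python
import subprocess
import sys


def imports_cleanly(module_name: str) -> bool:
    """Return True if `module_name` imports without error in a fresh interpreter.

    Hypothetical helper: each call starts a new Python process, so modules
    imported by earlier checks cannot hide a circular import.
    """
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

A test could then loop over the public modules of a package and assert `imports_cleanly(name)` for each one.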
Eugene Yurtsev	ae4c0ed25a	core[patch]: Add documentation to load namespace (#23143 ) Document some of the modules within the load namespace	3 months ago
Eugene Yurtsev	a34e650f8b	core[patch]: Add doc-string to document compressor (#23085 )	3 months ago
Eugene Yurtsev	1007a715a5	community[patch]: Prevent unit tests from making network requests (#23180 ) * Prevent unit tests from making network requests	3 months ago
ccurme	ca798bc6ea	community: move test to integration tests (#23178 ) Tests failing on master with > FAILED tests/unit_tests/embeddings/test_ovhcloud.py::test_ovhcloud_embed_documents - ValueError: Request failed with status code: 401, {"message":"Bad token; invalid JSON"}	3 months ago
Eugene Yurtsev	4fe8403bfb	core[patch]: Expand documentation in the indexing namespace (#23134 )	3 months ago
Eugene Yurtsev	fe4f10047b	core[patch]: Document embeddings namespace (#23132 ) Document embeddings namespace	3 months ago
Eugene Yurtsev	a3bae56a48	core[patch]: Update documentation in LLM namespace (#23138 ) Update documentation in LLM namespace.	3 months ago
Leonid Ganeline	a70b7a688e	ai21: docstrings (#23142 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	3 months ago
bilk0h	3d54784e6d	text-splitters: Fix/recursive json splitter data persistence issue (#21529 ) Description: I noticed that calling `RecursiveJsonSplitter().split_json()` multiple times produced unexpected results. The cause is the `chunks` list in the `_json_split` method: if `chunks` is not provided when `_json_split` is called (which is the case when `split_json` calls `_json_split`), the same list is reused for subsequent calls to `_json_split`. You can see this in the test case I also added to this commit. Output should be: ``` [{'a': 1, 'b': 2}] [{'c': 3, 'd': 4}] ``` Instead you get: ``` [{'a': 1, 'b': 2}] [{'a': 1, 'b': 2, 'c': 3, 'd': 4}] ``` --------- Co-authored-by: Nuno Campos <nuno@langchain.dev> Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	3 months ago
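The behaviour described in the commit above matches the general Python pitfall of a mutable default argument. The sketch below illustrates that pitfall in isolation; `split_buggy` and `split_fixed` are hypothetical stand-ins and not the real `_json_split` implementation.

```python
from typing import Any, Dict, List, Optional


def split_buggy(
    data: Dict[str, Any], chunks: List[Dict[str, Any]] = [{}]
) -> List[Dict[str, Any]]:
    # BUG (for illustration): the default list is created once, when the
    # function is defined, so every call without `chunks` shares it.
    chunks[-1].update(data)
    return chunks


def split_fixed(
    data: Dict[str, Any], chunks: Optional[List[Dict[str, Any]]] = None
) -> List[Dict[str, Any]]:
    # Use None as a sentinel and build a fresh list on each call.
    if chunks is None:
        chunks = [{}]
    chunks[-1].update(data)
    return chunks
```

With `split_fixed`, two consecutive calls return independent lists, whereas `split_buggy` keeps accumulating keys from earlier calls.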
鹿鹿鹿鲨	6b46b5e9ce	community: add request_kwargs and expect TimeError AsyncHtmlLoader (#23068 ) - Description: add `request_kwargs` and expect `TimeError` in `_fetch` function for AsyncHtmlLoader. This allows you to fill in the kwargs parameter when using the `load()` method of the `AsyncHtmlLoader` class. Co-authored-by: Yucolu <yucolu@tencent.com>	3 months ago
Leonid Ganeline	109a70fc64	ibm: docstrings (#23149 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	3 months ago
Ryan Elston	86ee4f0daa	text-splitters: Introduce Experimental Markdown Syntax Splitter (#22257 ) #### Description This MR defines an `ExperimentalMarkdownSyntaxTextSplitter` class. The main goal is to replicate the functionality of the original `MarkdownHeaderTextSplitter`, which extracts the header stack as metadata, but with one critical difference: it keeps the whitespace of the original text intact. This draft reimplements the `MarkdownHeaderTextSplitter` with a very different algorithmic approach. Instead of marking up each line of the text individually and aggregating them back together into chunks, this method builds each chunk sequentially and applies the metadata to each chunk. This makes the implementation simpler. However, since it's designed to keep whitespace intact, it's not a full drop-in replacement for the original. Since it is a radical implementation change to the original code, I would like to get feedback on whether this is a worthwhile replacement, should be its own class, or is not a good idea at all. Note: I implemented the `return_each_line` parameter but I don't think it's a necessary feature. I'd prefer to remove it. This implementation also adds the following additional features: - Splits out code blocks and includes the language in the `"Code"` metadata key - Splits text on the horizontal rule `---` as well - The `headers_to_split_on` parameter is now optional - with sensible defaults that can be overridden. #### Issue Keeping the whitespace keeps the paragraph structure and the formatting of the code blocks intact, which allows the caller much more flexibility in how they want to further split the individual sections of the resulting documents.
This addresses the issues brought up by the community in the following issues: - https://github.com/langchain-ai/langchain/issues/20823 - https://github.com/langchain-ai/langchain/issues/19436 - https://github.com/langchain-ai/langchain/issues/22256 #### Dependencies N/A #### Twitter handle @RyanElston --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	3 months ago
Bagatur	93d0ad97fe	anthropic[patch]: test image input (#23155 )	3 months ago
Leonid Ganeline	3dfd055411	anthropic: docstrings (#23145 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	3 months ago
Bagatur	90559fde70	openai[patch], standard-tests[patch]: don't pass in falsey stop vals (#23153 ) adds an image input test to standard-tests as well	3 months ago
Bagatur	e8a8286012	core[patch]: runnablewithchathistory from core.runnables (#23136 )	3 months ago
Vadym Barda	b483bf5095	core[minor]: handle boolean data in draw_mermaid (#23135 ) This change should address graph rendering issues for edges with boolean data Example from langgraph: ```python from typing import Annotated, TypedDict from langchain_core.messages import AnyMessage from langgraph.graph import END, START, StateGraph from langgraph.graph.message import add_messages class State(TypedDict): messages: Annotated[list[AnyMessage], add_messages] def branch(state: State) -> bool: return 1 + 1 == 3 graph_builder = StateGraph(State) graph_builder.add_node("foo", lambda state: {"messages": [("ai", "foo")]}) graph_builder.add_node("bar", lambda state: {"messages": [("ai", "bar")]}) graph_builder.add_conditional_edges( START, branch, path_map={True: "foo", False: "bar"}, then=END, ) app = graph_builder.compile() print(app.get_graph().draw_mermaid()) ``` Previous behavior: ```python AttributeError: 'bool' object has no attribute 'split' ``` Current behavior: ```python %%{init: {'flowchart': {'curve': 'linear'}}}%% graph TD; __start__[__start__]:::startclass; __end__[__end__]:::endclass; foo([foo]):::otherclass; bar([bar]):::otherclass; __start__ -. ('a',) .-> foo; foo --> __end__; __start__ -. ('b',) .-> bar; bar --> __end__; classDef startclass fill:#ffdfba; classDef endclass fill:#baffc9; classDef otherclass fill:#fad7de; ```	3 months ago
Bagatur	093ae04d58	core[patch]: Pin pydantic in py3.12.4 (#23130 )	3 months ago
hmasdev	ff0c06b1e5	langchain[patch]: fix `OutputType` of OutputParsers and fix legacy API in OutputParsers (#19792 ) # Description This pull request aims to address specific issues related to the ambiguity and error-proneness of the output types of certain output parsers, as well as the absence of unit tests for some parsers. These issues could potentially lead to runtime errors or unexpected behaviors due to type mismatches when used, causing confusion for developers and users. Through clarifying output types, this PR seeks to improve the stability and reliability. Therefore, this pull request - fixes the `OutputType` of OutputParsers to be the expected type; - e.g. `OutputType` property of `EnumOutputParser` raises `TypeError`. This PR introduce a logic to extract `OutputType` from its attribute. - and fixes the legacy API in OutputParsers like `LLMChain.run` to the modern API like `LLMChain.invoke`; - Note: For `OutputFixingParser`, `RetryOutputParser` and `RetryWithErrorOutputParser`, this PR introduces `legacy` attribute with False as default value in order to keep the backward compatibility - and adds the tests for the `OutputFixingParser` and `RetryOutputParser`. The following table shows my expected output and the actual output of the `OutputType` of OutputParsers. I have used this table to fix `OutputType` of OutputParsers. 
\| Class Name of OutputParser \| My Expected `OutputType` (after this PR)\| Actual `OutputType` [evidence](#evidence) (before this PR)\| Fix Required \| \|---------\|--------------\|---------\|--------\| \| BooleanOutputParser \| `<class 'bool'>` \| `<class 'bool'>` \| NO \| \| CombiningOutputParser \| `typing.Dict[str, Any]` \| `TypeError` is raised \| YES \| \| DatetimeOutputParser \| `<class 'datetime.datetime'>` \| `<class 'datetime.datetime'>` \| NO \| \| EnumOutputParser(enum=MyEnum) \| `MyEnum` \| `TypeError` is raised \| YES \| \| OutputFixingParser \| The same type as `self.parser.OutputType` \| `~T` \| YES \| \| CommaSeparatedListOutputParser \| `typing.List[str]` \| `typing.List[str]` \| NO \| \| MarkdownListOutputParser \| `typing.List[str]` \| `typing.List[str]` \| NO \| \| NumberedListOutputParser \| `typing.List[str]` \| `typing.List[str]` \| NO \| \| JsonOutputKeyToolsParser \| `typing.Any` \| `typing.Any` \| NO \| \| JsonOutputToolsParser \| `typing.Any` \| `typing.Any` \| NO \| \| PydanticToolsParser \| `typing.Any` \| `typing.Any` \| NO \| \| PandasDataFrameOutputParser \| `typing.Dict[str, Any]` \| `TypeError` is raised \| YES \| \| PydanticOutputParser(pydantic_object=MyModel) \| `<class '__main__.MyModel'>` \| `<class '__main__.MyModel'>` \| NO \| \| RegexParser \| `typing.Dict[str, str]` \| `TypeError` is raised \| YES \| \| RegexDictParser \| `typing.Dict[str, str]` \| `TypeError` is raised \| YES \| \| RetryOutputParser \| The same type as `self.parser.OutputType` \| `~T` \| YES \| \| RetryWithErrorOutputParser \| The same type as `self.parser.OutputType` \| `~T` \| YES \| \| StructuredOutputParser \| `typing.Dict[str, Any]` \| `TypeError` is raised \| YES \| \| YamlOutputParser(pydantic_object=MyModel) \| `MyModel` \| `~T` \| YES \| NOTE: In "Fix Required", "YES" means that it is required to fix in this PR while "NO" means that it is not required. # Issue No issues for this PR. 
# Twitter handle - [hmdev3](https://twitter.com/hmdev3) # Questions: 1. Is it required to create tests for legacy APIs `LLMChain.run` in the following scripts? - libs/langchain/tests/unit_tests/output_parsers/test_fix.py; - libs/langchain/tests/unit_tests/output_parsers/test_retry.py. 2. Is there a more appropriate expected output type than I expect in the above table? - e.g. the `OutputType` of `CombiningOutputParser` should be SOMETHING... # Actual outputs (before this PR) <div id='evidence'></div> <details><summary>Actual outputs</summary> ## Requirements - Python==3.9.13 - langchain==0.1.13 ```python Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import langchain >>> langchain.__version__ '0.1.13' >>> from langchain import output_parsers ``` ### `BooleanOutputParser` ```python >>> output_parsers.BooleanOutputParser().OutputType <class 'bool'> ``` ### `CombiningOutputParser` ```python >>> output_parsers.CombiningOutputParser(parsers=[output_parsers.DatetimeOutputParser(), output_parsers.CommaSeparatedListOutputParser()]).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable CombiningOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. ``` ### `DatetimeOutputParser` ```python >>> output_parsers.DatetimeOutputParser().OutputType <class 'datetime.datetime'> ``` ### `EnumOutputParser` ```python >>> from enum import Enum >>> class MyEnum(Enum): ... a = 'a' ... b = 'b' ... 
>>> output_parsers.EnumOutputParser(enum=MyEnum).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable EnumOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. ``` ### `OutputFixingParser` ```python >>> output_parsers.OutputFixingParser(parser=output_parsers.DatetimeOutputParser()).OutputType ~T ``` ### `CommaSeparatedListOutputParser` ```python >>> output_parsers.CommaSeparatedListOutputParser().OutputType typing.List[str] ``` ### `MarkdownListOutputParser` ```python >>> output_parsers.MarkdownListOutputParser().OutputType typing.List[str] ``` ### `NumberedListOutputParser` ```python >>> output_parsers.NumberedListOutputParser().OutputType typing.List[str] ``` ### `JsonOutputKeyToolsParser` ```python >>> output_parsers.JsonOutputKeyToolsParser(key_name='tool').OutputType typing.Any ``` ### `JsonOutputToolsParser` ```python >>> output_parsers.JsonOutputToolsParser().OutputType typing.Any ``` ### `PydanticToolsParser` ```python >>> from langchain.pydantic_v1 import BaseModel >>> class MyModel(BaseModel): ... a: int ... >>> output_parsers.PydanticToolsParser(tools=[MyModel, MyModel]).OutputType typing.Any ``` ### `PandasDataFrameOutputParser` ```python >>> output_parsers.PandasDataFrameOutputParser().OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable PandasDataFrameOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. 
``` ### `PydanticOutputParser` ```python >>> output_parsers.PydanticOutputParser(pydantic_object=MyModel).OutputType <class '__main__.MyModel'> ``` ### `RegexParser` ```python >>> output_parsers.RegexParser(regex='$', output_keys=['a']).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable RegexParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. ``` ### `RegexDictParser` ```python >>> output_parsers.RegexDictParser(output_key_to_format={'a':'a'}).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable RegexDictParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type. ``` ### `RetryOutputParser` ```python >>> output_parsers.RetryOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType ~T ``` ### `RetryWithErrorOutputParser` ```python >>> output_parsers.RetryWithErrorOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType ~T ``` ### `StructuredOutputParser` ```python >>> from langchain.output_parsers.structured import ResponseSchema >>> response_schemas = [ResponseSchema(name="foo",description="a list of strings",type="List[string]"),ResponseSchema(name="bar",description="a string",type="string"), ] >>> output_parsers.StructuredOutputParser.from_response_schemas(response_schemas).OutputType Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType raise TypeError( TypeError: Runnable StructuredOutputParser doesn't have an inferable OutputType. 
Override the OutputType property to specify the output type. ``` ### `YamlOutputParser` ```python >>> output_parsers.YamlOutputParser(pydantic_object=MyModel).OutputType ~T ``` <div> --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	3 months ago
Artem Mukhin	e271f75bee	docs: Fix URL formatting in deprecation warnings (#23075 ) Description: Updated the URLs in deprecation warning messages. The URLs were previously written as raw strings and are now formatted to be clickable HTML links. Example of a broken link in the current API Reference: https://api.python.langchain.com/en/latest/chains/langchain.chains.openai_functions.extraction.create_extraction_chain_pydantic.html	3 months ago
Gabriel Petracca	c6660df58e	community[minor]: Implement Doctran async execution (#22372 ) Description The DoctranTextTranslator has an async transform function that was not implemented because [the doctran library](https://github.com/psychic-api/doctran) uses a sync version of the `execute` method. - I implemented the `DoctranTextTranslator.atransform_documents()` method using `asyncio.to_thread` to run the function in a separate thread. - I updated the example in the Notebook with the new async version. - The performance improvements can be appreciated when a big document is divided into multiple chunks. Relates to: - Issue #14645: https://github.com/langchain-ai/langchain/issues/14645 - Issue #14437: https://github.com/langchain-ai/langchain/issues/14437 - https://github.com/langchain-ai/langchain/pull/15264 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	3 months ago
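The `asyncio.to_thread` technique mentioned in the commit above can be illustrated with a small standalone sketch. The function `slow_translate` is a hypothetical placeholder for a blocking call and does not use the doctran library; the real `atransform_documents` may be structured differently.

```python
import asyncio
import time
from typing import List


def slow_translate(text: str) -> str:
    # Hypothetical blocking call standing in for a synchronous library method.
    time.sleep(0.1)
    return text.upper()


async def atranslate_all(chunks: List[str]) -> List[str]:
    # Each blocking call runs in a worker thread, so the event loop stays
    # free and the chunks are processed concurrently.
    tasks = [asyncio.to_thread(slow_translate, chunk) for chunk in chunks]
    return list(await asyncio.gather(*tasks))
```

`asyncio.gather` preserves input order, so the results line up with the original chunks. `asyncio.to_thread` requires Python 3.9 or later.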
Eugene Yurtsev	aa6415aa7d	core[minor]: Support multiple keys in get_from_dict_or_env (#23086 ) Support passing multiple keys for get_from_dict_or_env	3 months ago
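A lookup that accepts several candidate keys might look like the sketch below. It is an assumption-laden illustration rather than the library's code: the name `first_from_dict_or_env`, its parameters, and its fallback order (dict keys first, then the environment variable, then a default) are all hypothetical.

```python
import os
from typing import Any, Dict, List, Optional, Union


def first_from_dict_or_env(
    data: Dict[str, Any],
    keys: Union[str, List[str]],
    env_key: str,
    default: Optional[str] = None,
) -> str:
    """Hypothetical helper: try each key in `data`, then `env_key`, then `default`."""
    if isinstance(keys, str):
        keys = [keys]
    for key in keys:
        # Skip missing or empty values and move on to the next candidate key.
        if data.get(key):
            return data[key]
    if os.environ.get(env_key):
        return os.environ[env_key]
    if default is not None:
        return default
    raise ValueError(
        f"Did not find any of {keys}; pass one of them or set `{env_key}`."
    )
```

Accepting a list of keys lets an integration support both a new argument name and its older alias with a single call.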
nold	226802f0c4	community: add args_schema to SearxSearch (#22954 ) This change adds args_schema (pydantic BaseModel) to SearxSearchRun for correct schema formatting on LLM function calls Issue: currently using SearxSearchRun with OpenAI function calling returns the following error "TypeError: SearxSearchRun._run() got an unexpected keyword argument '__arg1' ". This happens because the schema sent to the LLM is "input: '{"__arg1":"foobar"}'" while the method should be called with the "query" parameter. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	3 months ago
Bagatur	01783d67fc	core[patch]: Release 0.2.9 (#23091 )	3 months ago
Finlay Macklon	616d06d7fe	community: glob multiple patterns when using DirectoryLoader (#22852 ) - Description: Updated community.langchain_community.document_loaders.directory.py to enable the use of multiple glob patterns in the `DirectoryLoader` class. Now, the glob parameter is of type `list[str] \| str` and still defaults to the same value as before. I updated the docstring of the class to reflect this, and added a unit test to community.tests.unit_tests.document_loaders.test_directory.py named `test_directory_loader_glob_multiple`. This test also shows an example of how to use the new functionality. - ~~Issue:~~Discussion Thread: https://github.com/langchain-ai/langchain/discussions/18559 - Dependencies: None - Twitter handle: N/a - [x] Add tests and docs - Added test (described above) - Updated class docstring - [x] Lint and test --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	3 months ago
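Accepting either one glob pattern or a list of patterns, as the commit above describes for the `glob` parameter, can be sketched with `pathlib` alone. The helper `glob_many` is hypothetical and is not the `DirectoryLoader` implementation.

```python
from pathlib import Path
from typing import Iterable, List, Union


def glob_many(root: Union[str, Path], patterns: Union[str, Iterable[str]]) -> List[Path]:
    """Hypothetical helper: collect files under `root` matching any pattern."""
    if isinstance(patterns, str):
        # Normalise the single-pattern case so both input types share one code path.
        patterns = [patterns]
    found = set()
    for pattern in patterns:
        found.update(p for p in Path(root).glob(pattern) if p.is_file())
    # A set removes files matched by more than one pattern; sort for stable order.
    return sorted(found)
```

For example, `glob_many("docs", ["**/*.md", "**/*.txt"])` would return each matching file once, even if patterns overlap.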
Eugene Yurtsev	5564d9e404	core[patch]: Document BaseStore (#23082 ) Add doc-string to BaseStore	3 months ago
Takuya Igei	9f791b6ad5	core[patch],community[patch],langchain[patch]: `tenacity` dependency to version `>=8.1.0,<8.4.0` (#22973 ) Fix https://github.com/langchain-ai/langchain/issues/22972.	3 months ago
Raviraj	858ce264ef	SemanticChunker : Feature Addition ("Semantic Splitting with gradient") (#22895 ) ```SemanticChunker``` currently provides three methods to split text semantically: - percentile - standard_deviation - interquartile I propose a new method, ```gradient```. In this method, the gradient of the distance is used to split chunks, together with the percentile method (technically). This method is useful when chunks are highly correlated with each other or specific to a domain, e.g. legal or medical. The idea is to apply anomaly detection to the gradient array so that the distribution becomes wider and boundaries are easier to identify in highly semantic data. I have tested this merge on a set of 10 domain specific documents (mostly legal). Details : - Issue: Improvement - Dependencies: NA - Twitter handle: [x.com/prajapat_ravi](https://x.com/prajapat_ravi) @hwchase17 --------- Co-authored-by: Raviraj Prajapat <raviraj.prajapat@sirionlabs.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	3 months ago
Raghav Dixit	55705c0f5e	LanceDB integration update (#22869 ) Added : - [x] relevance search (w/wo scores) - [x] maximal marginal search - [x] image ingestion - [x] filtering support - [x] hybrid search w reranking make test, lint_diff and format checked.	3 months ago
Chang Liu	62c8a67f56	community: add KafkaChatMessageHistory (#22216 ) Add chat history store based on Kafka. Files added: `libs/community/langchain_community/chat_message_histories/kafka.py` `docs/docs/integrations/memory/kafka_chat_message_history.ipynb` New issue to be created for future improvement: 1. Async method implementation. 2. Message retrieval based on timestamp. 3. Support for other configs when connecting to cloud hosted Kafka (e.g. add `api_key` field) 4. Improve unit testing & integration testing.	3 months ago
shimajiroxyz	3e835a1aa1	langchain: add id_key option to EnsembleRetriever for metadata-based document merging (#22950 ) Description: - What I changed - By specifying the `id_key` during the initialization of `EnsembleRetriever`, it is now possible to determine which documents to merge scores for based on the value corresponding to the `id_key` element in the metadata, instead of `page_content`. Below is an example of how to use the modified `EnsembleRetriever`: ```python retriever = EnsembleRetriever(retrievers=[ret1, ret2], id_key="id") # The Document returned by each retriever must keep the "id" key in its metadata. ``` - Additionally, I added a script to easily test the behavior of the `invoke` method of the modified `EnsembleRetriever`. - Why I changed - There are cases where you may want to calculate scores by treating Documents with different `page_content` as the same when using `EnsembleRetriever`. For example, when you want to ensemble the search results of the same document described in two different languages. - The previous `EnsembleRetriever` used `page_content` as the basis for score aggregation, making the above usage difficult. Therefore, the score is now calculated based on the specified key value in the Document's metadata. Twitter handle: @shimajiroxyz	3 months ago
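The effect of merging by a metadata key instead of `page_content` can be shown with a standalone sketch. It assumes a reciprocal-rank-style score of `weight / (rank + c)` and represents documents as plain dictionaries; the function `fuse` and the constant `c=60` are illustrative assumptions, not the `EnsembleRetriever` code.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

Doc = Dict[str, Any]  # {"page_content": str, "metadata": dict}


def fuse(
    ranked_lists: List[List[Doc]],
    weights: List[float],
    id_key: Optional[str] = None,
    c: int = 60,
) -> List[Doc]:
    """Hypothetical rank fusion: merge scores by metadata[id_key] when given."""
    scores: Dict[Any, float] = defaultdict(float)
    first_seen: Dict[Any, Doc] = {}
    for docs, weight in zip(ranked_lists, weights):
        for rank, doc in enumerate(docs, start=1):
            # Documents with the same key share one score entry.
            key = doc["metadata"][id_key] if id_key else doc["page_content"]
            scores[key] += weight / (rank + c)
            first_seen.setdefault(key, doc)
    ordered = sorted(scores, key=lambda k: scores[k], reverse=True)
    return [first_seen[k] for k in ordered]
```

With `id_key="id"`, an English and a Japanese version of the same document (same `id`, different `page_content`) contribute to a single score; without it they are ranked as two separate documents.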
mackong	39f6c4169d	langchain[patch]: add tool messages formatter for tool calling agent (#22849 ) - Description: add tool_messages_formatter for the tool calling agent, so that tool messages can be formatted in different ways for your LLM. - Issue: N/A - Dependencies: N/A	3 months ago
Lucas Tucker	e25a5966b5	docs: Standardize DocumentLoader docstrings (#22932 ) Standardizing DocumentLoader docstrings (of which there are many) This PR addresses issue #22866 and adds docstrings according to the issue's specified format (in the appendix) for files csv_loader.py and json_loader.py in langchain_community.document_loaders. In particular, the following sections have been added to both CSVLoader and JSONLoader: Setup, Instantiate, Load, Async load, and Lazy load. It may be worth adding a 'Metadata' section to the JSONLoader docstring to clarify how we want to extract the JSON metadata (using the `metadata_func` argument). The files I used to walkthrough the various sections were `example_2.json` from [HERE](https://support.oneskyapp.com/hc/en-us/articles/208047697-JSON-sample-files) and `hw_200.csv` from [HERE](https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html). --------- Co-authored-by: lucast2021 <lucast2021@headroyce.org> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	3 months ago
Mohammad Mohtashim	60ba02f5db	[Community]: Fixed DDG DuckDuckGoSearchResults Docstring (#22968 ) - Description: A very small fix in the Docstring of `DuckDuckGoSearchResults` identified in the following issue. - Issue: #22961 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	3 months ago
Eun Hye Kim	70761af8cf	community: Fix #22975 (Add SSL Verification Option to Requests Class in langchain_community) (#22977 ) - PR title: "community: Fix #22975 (Add SSL Verification Option to Requests Class in langchain_community)" - PR message: - Description: - Added an optional verify parameter to the Requests class with a default value of True. - Modified the get, post, patch, put, and delete methods to include the verify parameter. - Updated the _arequest async context manager to include the verify parameter. - Added the verify parameter to the GenericRequestsWrapper class and passed it to the Requests class. - Issue: This PR fixes issue #22975. - Dependencies: No additional dependencies are required for this change. - Twitter handle: @lunara_x You can check this change with the code below. ```python import yaml from langchain_openai.chat_models import ChatOpenAI from langchain.requests import RequestsWrapper from langchain_community.agent_toolkits.openapi import planner from langchain_community.agent_toolkits.openapi.spec import reduce_openapi_spec with open("swagger.yaml") as f: data = yaml.load(f, Loader=yaml.FullLoader) swagger_api_spec = reduce_openapi_spec(data) llm = ChatOpenAI(model='gpt-4o') swagger_requests_wrapper = RequestsWrapper(verify=False) # modified point superset_agent = planner.create_openapi_agent(swagger_api_spec, swagger_requests_wrapper, llm, allow_dangerous_requests=True, handle_parsing_errors=True) superset_agent.run( "Tell me the number and types of charts and dashboards available." ) ``` --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	3 months ago
Mohammad Mohtashim	bf839676c7	[Community]: Fixed the DocumentDBVectorSearch `_similarity_search_without_score` (#22970 ) - Description: PR #22777 introduced a bug in `_similarity_search_without_score` that raised an `OperationFailure` error. The cause was a syntax error in the MongoDB pipeline, which has now been corrected. - Issue: #22770	3 months ago
Nuno Campos	f01f12ce1e	Include "no escape" and "inverted section" mustache vars in Prompt.input_variables and Prompt.input_schema (#22981 )	3 months ago
Bagatur	c2b2e3266c	core[minor]: message transformer utils (#22752 )	3 months ago
Anders Swanson	aacc6198b9	community: OCI GenAI embedding batch size (#22986 ) - Issue: #22985 --------- Signed-off-by: Anders Swanson <anders.swanson@oracle.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	3 months ago
Bagatur	8235bae48e	core[patch]: Release 0.2.8 (#23012 )	3 months ago
Nuno Campos	bd4b68cd54	core: run_in_executor: Wrap StopIteration in RuntimeError (#22997 ) - StopIteration can't be set on an asyncio.Future it raises a TypeError and leaves the Future pending forever so we need to convert it to a RuntimeError	3 months ago
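The `StopIteration` problem described in bd4b68cd54 can be illustrated with a standard-library sketch. The helper name `run_in_executor_safe` is hypothetical and this is not LangChain's actual implementation; it only shows the idea of converting `StopIteration` into `RuntimeError` inside the worker thread, before the exception is ever set on a future.

```python
import asyncio
from typing import Any, Callable, TypeVar

T = TypeVar("T")


async def run_in_executor_safe(func: Callable[..., T], *args: Any) -> T:
    """Run func in the default thread pool (hypothetical helper)."""

    def wrapper() -> T:
        try:
            return func(*args)
        except StopIteration as exc:
            # Convert before the exception reaches the future machinery.
            raise RuntimeError("StopIteration raised in executor") from exc

    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, wrapper)


def next_item() -> Any:
    # next() on an exhausted iterator raises StopIteration.
    return next(iter([]))


async def main() -> str:
    try:
        await run_in_executor_safe(next_item)
    except RuntimeError as err:
        return type(err.__cause__).__name__
    return "no error"


result = asyncio.run(main())
```

The original exception stays reachable through `__cause__`, so the caller can still tell what happened.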
Bagatur	d96f67b06f	standard-tests[patch]: Update chat model standard tests (#22378 ) - Refactor standard test classes to make them easier to configure - Update openai to support stop_sequences init param - Update groq to support stop_sequences init param - Update fireworks to support max_retries init param - Update ChatModel.bind_tools to type tool_choice - Update groq to handle tool_choice="any". this may be controversial --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	3 months ago
Oguz Vuruskaner	dd25d08c06	community[minor]: add tool calling for DeepInfraChat (#22745 ) DeepInfra now supports tool calling for supported models. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	3 months ago
maang-h	c6b7db6587	community: Add Baichuan Embeddings batch size (#22942 ) - Support batch size: Baichuan updated its documentation, indicating that up to 16 documents can be imported at a time - Standardized model init arg names - baichuan_api_key -> api_key - model_name -> model	3 months ago
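A batch-size limit like the one mentioned in c6b7db6587 is usually applied by slicing the input list before each request. The sketch below is a generic illustration, not the Baichuan integration itself; the function name `split_into_batches` is hypothetical and the limit of 16 is taken from the commit message.

```python
from typing import List


def split_into_batches(texts: List[str], batch_size: int = 16) -> List[List[str]]:
    """Split texts into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [texts[i : i + batch_size] for i in range(0, len(texts), batch_size)]


documents = [f"doc {i}" for i in range(40)]
batches = split_into_batches(documents)
batch_lengths = [len(batch) for batch in batches]
```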
ccurme	722c8f50ea	openai[patch]: add stream_usage parameter (#22854 ) Here we add `stream_usage` to ChatOpenAI as: 1. a boolean attribute 2. a kwarg to _stream and _astream. Question: should the `stream_usage` attribute be `bool`, or `bool | None`? Currently I've kept it `bool` and defaulted to False. It was implemented on [ChatAnthropic](`e832bbb486/libs/partners/anthropic/langchain_anthropic/chat_models.py (L535)`) as a bool. However, to maintain support for users who access the behavior via OpenAI's `stream_options` param, this ends up being possible:
```python
llm = ChatOpenAI(model_kwargs={"stream_options": {"include_usage": True}})
assert not llm.stream_usage
```
(and this model will stream token usage). Some options for this: - it's ok - make the `stream_usage` attribute bool or None - make an `__init__` for ChatOpenAI, set a `._stream_usage` attribute and read `.stream_usage` from a property Open to other ideas as well.	3 months ago
Shubham Pandey	56ac94e014	community[minor]: add `ChatSnowflakeCortex` chat model (#21490 ) Description: This PR adds a chat model integration for [Snowflake Cortex](https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions), which gives instant access to industry-leading large language models (LLMs) trained by researchers at companies like Mistral, Reka, Meta, and Google, including [Snowflake Arctic](https://www.snowflake.com/en/data-cloud/arctic/), an open enterprise-grade model developed by Snowflake. Dependencies: Snowflake's [snowpark](https://pypi.org/project/snowflake-snowpark-python/) library is required for using this integration. Twitter handle: [@gethouseware](https://twitter.com/gethouseware) Tests and docs: 1. integration tests: `libs/community/tests/integration_tests/chat_models/test_snowflake.py` 2. unit tests: `libs/community/tests/unit_tests/chat_models/test_snowflake.py` 3. example notebook: `docs/docs/integrations/chat/snowflake.ipynb`	3 months ago
Bagatur	e2304ebcdb	standard-tests[patch]: Release 0.1.1 (#22984 )	3 months ago
Hakan Özdemir	c437b1aab7	[Partner]: Add metadata to stream response (#22716 ) Adds `response_metadata` to stream responses from OpenAI. This is returned with `invoke` normally, but wasn't implemented for `stream`. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	3 months ago
Bagatur	9ff249a38d	standard-tests[patch]: don't require str chunk contents (#22965 )	3 months ago
Christopher Tee	ada03dd273	community(you): Better support for You.com News API (#22622 ) ## Description While `YouRetriever` supports both You.com's Search and News APIs, news is supported as an afterthought. More specifically, not all of the News API parameters are exposed to the user, only those that happen to overlap with the Search API. This PR: - improves support for both APIs, exposing the remaining News API parameters while retaining backward compatibility - refactors some REST parameter generation logic - updates the docstring of `YouSearchAPIWrapper` - adds input validation and warnings to ensure parameters are properly set by the user - 🚨 Breaking: Limit the news results to `k` items	3 months ago
Tomaz Bratanic	1c661fd849	Improve llm graph transformer docstring (#22939 )	3 months ago
maang-h	7a0af56177	docs: update ZhipuAI ChatModel docstring (#22934 ) - Description: Update ZhipuAI ChatModel rich docstring - Issue: the issue #22296	3 months ago
Bitmonkey	570d45b2a1	Update ollama.py with optional raw setting. (#21486 ) Ollama has a raw option now. https://github.com/ollama/ollama/blob/main/docs/api.md --------- Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	3 months ago
caiyueliang	9944ad7f5f	community: 'Solve the issue where the _search function in ElasticsearchStore supports passing a query_vector parameter, but the parameter does not take effect. (#21532 ) Issue: When using the similarity_search_with_score function in ElasticsearchStore, I expected to pass in the query_vector that I have already obtained. I noticed that the _search function does support the query_vector parameter, but it seems to be ineffective. I am attempting to resolve this issue. Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	3 months ago
Erick Friis	c374c98389	experimental: release 0.0.61 (#22924 )	3 months ago
BuxianChen	af65cac609	cli[minor]: remove redefined DEFAULT_GIT_REF (#21471 ) remove redefined DEFAULT_GIT_REF Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	3 months ago
Erick Friis	79a64207f5	community: release 0.2.5 (#22923 )	3 months ago
Jiejun Tan	c8c67dde6f	text-splitters[patch]: Fix HTMLSectionSplitter (#22812 ) Update of a former pull request: https://github.com/langchain-ai/langchain/pull/22654. Modified `langchain_text_splitters.HTMLSectionSplitter`. In the latest version, the function `split_html_by_headers` uses a `dict` to store sections from an HTML document, with the header/section element names serving as dict keys. This is a problem when duplicate header/section element names are present in a single HTML document: later ones replace earlier ones with the same name, so some content can be missing after HTML text splitting. Using a list to store sections should solve the problem. A unit test covering duplicate header names has been added. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	3 months ago
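The duplicate-header problem described in c8c67dde6f comes from a general property of Python dicts: assigning to an existing key overwrites the earlier value. The sketch below is a simplified illustration with made-up section data and hypothetical function names; it is not the `HTMLSectionSplitter` code.

```python
from typing import Dict, List, Tuple

# Made-up (header, content) pairs with a repeated header name.
headers_and_text: List[Tuple[str, str]] = [
    ("Overview", "First overview."),
    ("Details", "Some details."),
    ("Overview", "Second overview."),
]


def collect_as_dict(pairs: List[Tuple[str, str]]) -> Dict[str, str]:
    """Header names as keys: a repeated header overwrites the earlier one."""
    sections: Dict[str, str] = {}
    for header, content in pairs:
        sections[header] = content
    return sections


def collect_as_list(pairs: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """One entry per section: repeated headers are all kept."""
    return [{"header": header, "content": content} for header, content in pairs]


as_dict = collect_as_dict(headers_and_text)
as_list = collect_as_list(headers_and_text)
```

With the dict, the first "Overview" section is silently lost; the list keeps all three sections in document order.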
Erick Friis	fbeeb6da75	langchain: release 0.2.5 (#22922 )	3 months ago
Baskar Gopinath	c4f2bc9540	docs: Fix wrongly referenced class name in confluence.py (#22879 ) Fixes #22542 Changed ConfluenceReader to ConfluenceLoader	3 months ago
Erick Friis	9ef15691d6	core: release 0.2.7 (#22917 )	3 months ago
Nuno Campos	338180f383	core: in astream_events v2 always await task even if already finished (#22916 ) - this ensures exceptions propagate to the caller	3 months ago
Istvan/Nebulinq	513e491ce9	experimental: LLMGraphTransformer - added relationship properties. (#21856 ) - Description: The generated relationships in the graph had no properties, but the Relationship class was properly defined with properties. This made it very difficult to transform conditional sentences into a graph. Adding properties to relationships can solve this issue elegantly. The changes expand on the existing LLMGraphTransformer implementation but add the possibility to define allowed relationship properties like this: LLMGraphTransformer(llm=llm, relationship_properties=["Condition", "Time"],) - Issue: no issue found - Dependencies: n/a - Twitter handle: @IstvanSpace -Quick Test ================================================================= from dotenv import load_dotenv import os from langchain_community.graphs import Neo4jGraph from langchain_experimental.graph_transformers import LLMGraphTransformer from langchain_openai import ChatOpenAI from langchain_core.prompts import ChatPromptTemplate from langchain_core.documents import Document load_dotenv() os.environ["NEO4J_URI"] = os.getenv("NEO4J_URI") os.environ["NEO4J_USERNAME"] = os.getenv("NEO4J_USERNAME") os.environ["NEO4J_PASSWORD"] = os.getenv("NEO4J_PASSWORD") graph = Neo4jGraph() llm = ChatOpenAI(temperature=0, model_name="gpt-4o") llm_transformer = LLMGraphTransformer(llm=llm) #text = "Harry potter likes pies, but only if it rains outside" text = "Jack has a dog named Max. Jack only walks Max if it is sunny outside." 
documents = [Document(page_content=text)] llm_transformer_props = LLMGraphTransformer( llm=llm, relationship_properties=["Condition"], ) graph_documents_props = llm_transformer_props.convert_to_graph_documents(documents) print(f"Nodes:{graph_documents_props[0].nodes}") print(f"Relationships:{graph_documents_props[0].relationships}") graph.add_graph_documents(graph_documents_props) --------- Co-authored-by: Istvan Lorincz <istvan.lorincz@pm.me> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	3 months ago
kiarina	8171efd07a	core[patch]: Fix FunctionCallbackHandler._on_tool_end (#22908 ) If the global `debug` flag is enabled, the agent will get the following error in `FunctionCallbackHandler._on_tool_end` at runtime. ``` Error in ConsoleCallbackHandler.on_tool_end callback: AttributeError("'list' object has no attribute 'strip'") ``` By calling str() before strip(), the error was avoided. This error can be seen at [debugging.ipynb](https://github.com/langchain-ai/langchain/blob/master/docs/docs/how_to/debugging.ipynb). - Issue: NA - Dependencies: NA - Twitter handle: https://x.com/kiarina37	3 months ago
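The fix in 8171efd07a relies on the fact that `str.strip` only exists on strings, while a tool may return any object. A minimal sketch of the idea, using a hypothetical helper name rather than the handler's real code:

```python
from typing import Any


def format_tool_output(output: Any) -> str:
    """Convert any tool output to a string before stripping whitespace."""
    return str(output).strip()


# A list has no .strip() method, so calling it directly would raise AttributeError.
formatted_list = format_tool_output(["result one", "result two"])
formatted_text = format_tool_output("  plain text \n")
```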
Philippe PRADOS	b61de9728e	community[minor]: Fix long_context_reorder.py async (#22839 ) Implement `async def atransform_documents( self, documents: Sequence[Document], **kwargs: Any ) -> Sequence[Document]` for `LongContextReorder`	3 months ago
Eugene Yurtsev	c72bcda4f2	community[major], experimental[patch]: Remove Python REPL from community (#22904 ) Remove the REPL from community, and suggest an alternative import from langchain_experimental. Fix for this issue: https://github.com/langchain-ai/langchain/issues/14345 This is not a bug in the code or an actual security risk. The python REPL itself is behaving as expected. The PR is done to appease blanket security policies that are just looking for the presence of exec in the code. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	3 months ago
Eugene Yurtsev	9a877c7adb	community[patch]: SitemapLoader restrict depth of parsing sitemap (CVE-2024-2965) (#22903 ) This PR restricts the depth to which the sitemap can be parsed. Fix for: CVE-2024-2965	3 months ago
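Restricting parse depth, as 9a877c7adb does for sitemaps, guards against sitemaps that reference each other without end. The sketch below shows the general technique on an in-memory mapping; the `SITEMAPS` data, the `collect_pages` function, and the `max_depth` parameter name are all assumptions for illustration and are not taken from `SitemapLoader`.

```python
from typing import Dict, List

# Hypothetical sitemaps: each maps to nested sitemaps (*.xml) or page URLs.
SITEMAPS: Dict[str, List[str]] = {
    "root.xml": ["a.xml", "https://example.com/page-1"],
    "a.xml": ["b.xml", "https://example.com/page-2"],
    "b.xml": ["root.xml", "https://example.com/page-3"],  # cycles back to root
}


def collect_pages(sitemap: str, max_depth: int, depth: int = 0) -> List[str]:
    """Collect page URLs, refusing to descend past max_depth levels."""
    if depth >= max_depth:
        return []
    pages: List[str] = []
    for entry in SITEMAPS.get(sitemap, []):
        if entry.endswith(".xml"):
            pages.extend(collect_pages(entry, max_depth, depth + 1))
        else:
            pages.append(entry)
    return pages


pages = collect_pages("root.xml", max_depth=2)
```

Even though `b.xml` points back to `root.xml`, the recursion stops once the depth limit is reached.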
Eugene Yurtsev	4a77a3ab19	core[patch]: fix validation of @deprecated decorator (#22513 ) This PR moves the validation of the decorator to a better place to avoid creating bugs while deprecating code. It prevents issues like this from arising: https://github.com/langchain-ai/langchain/issues/22510 We should replace this at some point with a linter that just does static analysis.	3 months ago
Jacob Lee	181a61982f	anthropic[minor]: Adds streaming tool call support for Anthropic (#22687 ) Preserves string content chunks for non tool call requests for convenience. One thing - Anthropic events look like this:
```
RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
RawContentBlockDeltaEvent(delta=TextDelta(text='<thinking>\nThe', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text=' provide', type='text_delta'), index=0, type='content_block_delta')
...
RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01GJ6x2ddcMG3psDNNe4eDqb', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')
RawContentBlockDeltaEvent(delta=InputJsonDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
```
Note that `delta` has a `type` field. With this implementation, I'm dropping it because `merge_list` behavior will concatenate strings. We currently have `index` as a special field when merging lists, would it be worth adding `type` too? If so, what do we set as a context block chunk? `text` vs. `text_delta`/`tool_use` vs `input_json_delta`? CC @ccurme @efriis @baskaryan	3 months ago
ccurme	f40b2c6f9d	fireworks[patch]: add usage_metadata to (a)invoke and (a)stream (#22906 )	3 months ago
Mohammad Mohtashim	d1b7a934aa	[Community]: HuggingFaceCrossEncoder `score` accounting for <not-relevant score,relevant score> pairs. (#22578 ) - Description: Some of the Cross-Encoder models provide scores in pairs, i.e., <not-relevant score (higher means the document is less relevant to the query), relevant score (higher means the document is more relevant to the query)>. However, the `HuggingFaceCrossEncoder` `score` method does not currently take into account the pair situation. This PR addresses this issue by modifying the method to consider only the relevant score if score is being provided in pair. The reason for focusing on the relevant score is that the compressors select the top-n documents based on relevance. - Issue: #22556 - Please also refer to this [comment](https://github.com/UKPLab/sentence-transformers/issues/568#issuecomment-729153075)	3 months ago
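The pair handling described in d1b7a934aa can be sketched in plain Python. This is an illustration under stated assumptions, not the `HuggingFaceCrossEncoder` code: the function name `relevant_scores` is hypothetical, and each pair is assumed to be ordered as (not-relevant score, relevant score), as in the commit message.

```python
from typing import List, Sequence, Union

Score = Union[float, Sequence[float]]


def relevant_scores(raw_scores: Sequence[Score]) -> List[float]:
    """Reduce each model output to a single relevance score."""
    result: List[float] = []
    for score in raw_scores:
        if isinstance(score, (list, tuple)):
            # Pair output: keep only the second element, the relevant score.
            result.append(float(score[1]))
        else:
            # Single-score output: use it unchanged.
            result.append(float(score))
    return result


paired = relevant_scores([(0.9, 0.1), (0.2, 0.8)])
single = relevant_scores([0.3, 0.7])
```

Keeping only the relevant score means a top-n selection sorted in descending order still ranks the most relevant documents first.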
Thanh Nguyen	b5e2ba3a47	community[minor]: add chat model llamacpp (#22589 ) - PR title: [community] add chat model llamacpp - PR message: - Description: This PR introduces a new chat model integration with llamacpp_python, designed to work similarly to the existing ChatOpenAI model. + Work well with instructed chat, chain and function/tool calling. + Work with LangGraph (persistent memory, tool calling), will update soon - Dependencies: This change requires the llamacpp_python library to be installed. @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	3 months ago
ccurme	73c76b9628	anthropic[patch]: always add tool_result type to ToolMessage content (#22721 ) Anthropic tool results can contain image data, which are typically represented with content blocks having `"type": "image"`. Currently, these content blocks are passed as-is as human/user messages to Anthropic, which raises BadRequestError as it expects a tool_result block to follow a tool_use. Here we update ChatAnthropic to nest the content blocks inside a tool_result content block. Example:
```python
import base64

import httpx
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import AIMessage, HumanMessage, ToolMessage
from langchain_core.pydantic_v1 import BaseModel, Field

# Fetch image
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")


class FetchImage(BaseModel):
    should_fetch: bool = Field(..., description="Whether an image is requested.")


llm = ChatAnthropic(model="claude-3-sonnet-20240229").bind_tools([FetchImage])

messages = [
    HumanMessage(content="Could you summon a beautiful image please?"),
    AIMessage(
        content=[
            {
                "type": "tool_use",
                "id": "toolu_01Rn6Qvj5m7955x9m9Pfxbcx",
                "name": "FetchImage",
                "input": {"should_fetch": True},
            },
        ],
        tool_calls=[
            {
                "name": "FetchImage",
                "args": {"should_fetch": True},
                "id": "toolu_01Rn6Qvj5m7955x9m9Pfxbcx",
            },
        ],
    ),
    ToolMessage(
        name="FetchImage",
        content=[
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/jpeg",
                    "data": image_data,
                },
            },
        ],
        tool_call_id="toolu_01Rn6Qvj5m7955x9m9Pfxbcx",
    ),
]

llm.invoke(messages)
```
Trace: https://smith.langchain.com/public/d27e4fc1-a96d-41e1-9f52-54f5004122db/r	3 months ago
Lucas Tucker	7114aed78f	docs: Standardize ChatGroq (#22751 ) Updated the ChatGroq docstring in langchain_groq to match the description (in the appendix) provided in issue https://github.com/langchain-ai/langchain/issues/22296. Issue: This PR is in response to issue https://github.com/langchain-ai/langchain/issues/22296, and more specifically the ChatGroq model. In particular, this PR updates the docstring for langchain/libs/partners/groq/langchain_groq/chat_model.py by adding the following sections: Instantiate, Invoke, Stream, Async, Tool calling, Structured Output, and Response metadata. I used the template from the Anthropic implementation and referenced the Appendix of the original issue post. I also noted that: `usage_metadata` returns None for all ChatGroq models I tested; there is no mention of image input in the ChatGroq documentation; unlike that of ChatHuggingFace, `.stream(messages)` for ChatGroq returned blocks of output. --------- Co-authored-by: lucast2021 <lucast2021@headroyce.org> Co-authored-by: Bagatur <baskaryan@gmail.com>	3 months ago
Anush	e002c855bd	qdrant[patch]: Use collection_exists API instead of exceptions (#22764 ) ## Description Currently, the Qdrant integration relies on exceptions raised by [`get_collection` ](https://qdrant.tech/documentation/concepts/collections/#collection-info) to check if a collection exists. Using [`collection_exists`](https://qdrant.tech/documentation/concepts/collections/#check-collection-existence) is recommended to avoid missing any unhandled exceptions. This PR addresses this. ## Testing All integration and unit tests pass. No user-facing changes.	3 months ago
Anindyadeep	c417803908	community[minor]: Prem Templates (#22783 ) This PR adds the Prem Templates feature to ChatPremAI. Additionally, it fixes a minor bug that caused an API auth error when the API key was passed through arguments.	3 months ago
maang-h	1055b9a309	community[minor]: Implement ZhipuAIEmbeddings interface (#22821 ) - Description: Implement ZhipuAIEmbeddings interface, include: - The `embed_query` method - The `embed_documents` method refer to [ZhipuAI Embedding-2](https://open.bigmodel.cn/dev/api#text_embedding) --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	3 months ago
Leonid Ganeline	46c9784127	docs: `ReAct` reference (#22830 ) The `ReAct` is used all across LangChain but it is not referenced properly. Added references to the original paper.	3 months ago
Bagatur	8bd368d07e	cli[patch]: Release 0.0.25 (#22876 )	3 months ago
Isaac Francisco	75e966a2fa	docs, cli[patch]: document loaders doc template (#22862 ) From: https://github.com/langchain-ai/langchain/pull/22290 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	3 months ago
Kagura Chen	57783c5e55	Fix: lint errors and update Field alias in models.py and AutoSelectionScorer initialization (#22846 ) This PR addresses several lint errors in the core package of LangChain. Specifically, the following issues were fixed: 1. Unexpected keyword argument "required" for "Field" [call-arg] 2. tests/integration_tests/chains/test_cpal.py:263: error: Unexpected keyword argument "narrative_input" for "QueryModel" [call-arg]	3 months ago
Erick Friis	5bc774827b	langchain: release 0.2.4 (#22872 )	3 months ago
Erick Friis	7234fd0f51	core: release 0.2.6 (#22868 )	3 months ago
Jacob Lee	bcbb43480c	core[patch]: Treat type as a special field when merging lists (#22750 ) Should we even log a warning? At least for Anthropic, it's expected to get e.g. `text_block` followed by `text_delta`. @ccurme @baskaryan @efriis	3 months ago
Nuno Campos	bae82e966a	core: In astream_events v2 propagate cancel/break to the inner astream call (#22865 ) - previous behavior was for the inner astream to continue running with no interruption - also propagate break in core runnable methods	3 months ago
Eugene Yurtsev	a766815a99	experimental[patch]/docs[patch]: Update links to security docs (#22864 ) Minor update to newest version of security docs (content should be identical).	3 months ago
Eugene Yurtsev	8f7cc73817	ci: Add script to check for pickle usage in community (#22863 ) Add script to check for pickle usage in community.	3 months ago
Eugene Yurtsev	77209f315e	community[patch]: FAISS VectorStore deserializer should be opt-in (#22861 ) FAISS deserializer uses pickle module. Users have to opt-in to de-serialize.	3 months ago
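The opt-in pattern described in 77209f315e can be sketched with the standard library alone. The function below is hypothetical and is not the FAISS vector store code; the flag name `allow_dangerous_deserialization` is an assumption chosen for illustration. The point is only that unpickling is refused unless the caller explicitly asks for it, because `pickle.loads` can execute arbitrary code from untrusted data.

```python
import pickle
from typing import Any


def load_pickled_index(data: bytes, allow_dangerous_deserialization: bool = False) -> Any:
    """Unpickle data only when the caller has explicitly opted in."""
    if not allow_dangerous_deserialization:
        raise ValueError(
            "Deserialization relies on pickle, which can execute arbitrary code. "
            "Pass allow_dangerous_deserialization=True only for trusted data."
        )
    return pickle.loads(data)


payload = pickle.dumps({"ids": [1, 2, 3]})

try:
    load_pickled_index(payload)
    refused = False
except ValueError:
    refused = True

loaded = load_pickled_index(payload, allow_dangerous_deserialization=True)
```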
Eugene Yurtsev	ce0b0f22a1	experimental[major]: Force users to opt-in into code that relies on the python repl (#22860 ) This should make it obvious that a few of the agents in langchain experimental rely on the python REPL as a tool under the hood, and will force users to opt-in.	3 months ago
Isaac Francisco	869523ad72	[docs]: added info for TavilySearchResults (#22765 )	3 months ago
ccurme	42257b120f	partners: fix numpy dep (#22858 ) Following https://github.com/langchain-ai/langchain/pull/22813, which added python 3.12 to CI, here we update numpy accordingly in partner packages.	3 months ago
Isaac Francisco	345fd3a556	minor functionality change: adding API functionality to tavilysearch (#22761 )	3 months ago
Isaac Francisco	034257e9bf	docs: improved recursive url loader docs (#22648 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	3 months ago
ccurme	b626c3ca23	groq[patch]: add usage_metadata to (a)invoke and (a)stream (#22834 )	3 months ago
James Braza	45b394268c	core[patch]: allowing latest `packaging` versions (#22792 ) Allowing version 24 of https://github.com/pypa/packaging --------- Co-authored-by: Erick Friis <erick@langchain.dev>	3 months ago
Karim Lalani	276be6cdd4	[experimental][llms][OllamaFunctions] tool calling related fixes (#22339 ) Fixes issues with tool calling to handle tool objects correctly. Added support to handle ToolMessage correctly. Added additional checks for error conditions. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	3 months ago
Christophe Bornet	d04e899b56	ci: add testing with Python 3.12 (#22813 ) We need to use a different version of numpy for py3.8 and py3.12 in pyproject. And so do projects that use that Python version range and import langchain. - Twitter handle: _cbornet	3 months ago
HyoJin Kang	b6bf2bb234	community[patch]: fix database uri type in SQLDatabase (#22661 ) Description: SQLAlchemy uses the `sqlalchemy.engine.URL` type for the db URI argument. Added the `URL` type for compatibility. Issue: None Dependencies: None --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	3 months ago
Eugene Yurtsev	5dbbdcbf8e	core[patch]: Update remaining root_validators (#22829 ) This PR updates the remaining root_validators in core to either be explicit pre-init or post-init validators.	3 months ago
Eugene Yurtsev	265e650e64	community[patch]: Update root_validators embeddings: llamacpp, jina, dashscope, mosaicml, huggingface_hub, Toolkits: Connery, ChatModels: PAI_EAS, (#22828 ) This PR updates root validators for: * Embeddings: llamacpp, jina, dashscope, mosaicml, huggingface_hub * Toolkits: Connery * ChatModels: PAI_EAS Following this issue: https://github.com/langchain-ai/langchain/issues/22819	3 months ago
JonZeolla	32ba8cfab0	community[minor]: implement huggingface show_progress consistently (#22682 ) - Description: This implements `show_progress` more consistently (i.e. it is also added to the `HuggingFaceBgeEmbeddings` object). - Issue: This implements `show_progress` more consistently in the embeddings huggingface classes. Previously this could have been set via `encode_kwargs`. - Dependencies: None - Twitter handle: @jonzeolla	3 months ago
Eugene Yurtsev	74e705250f	core[patch]: update some root_validators (#22787 ) Update some of the @root_validators to be explicit pre=True or pre=False, skip_on_failure=True for pydantic 2 compatibility.	3 months ago
mrhbj	a1268d9e9a	community[patch]: fix hunyuan message include chinese signature error (#22795 ) (#22796 ) … (#22795)	3 months ago
Mr. Lance E Sloan «UMich»	08c466c603	community[patch]: bugfix for `YoutubeLoader`'s `LINES` format (#22815 ) - Description: A change I submitted recently introduced a bug in `YoutubeLoader`'s `LINES` output format. In those conditions, curly braces ("`{}`") create a set, not a dictionary. This bugfix explicitly specifies that a dictionary is created. - Issue: N/A - Dependencies: N/A - Twitter: lsloan_umich - Mastodon: [lsloan@mastodon.social](https://mastodon.social/@lsloan)	3 months ago
Philippe PRADOS	23c22fcbc9	langchain[minor]: Make EmbeddingsFilters async (#22737 ) Add native async implementation for EmbeddingsFilter	3 months ago
ccurme	936aedd10c	mistral[patch]: add usage_metadata to (a)invoke and (a)stream (#22781 )	3 months ago
mrhbj	9212c9fcb8	community[patch]: fix hunyuan client json analysis (#22452 ) (#22767 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	3 months ago
Rohan Aggarwal	86e8224cf1	community[patch]: Support for old clients (Thin and Thick) Oracle Vector Store (#22766 ) Adds support for old clients (Thin and Thick) to the Oracle Vector Store. Tests and docs: the authors have their own local tests. --------- Co-authored-by: rohan.aggarwal@oracle.com <rohaagga@phoenix95642.dev3sub2phx.databasede3phx.oraclevcn.com>	3 months ago
Mr. Lance E Sloan «UMich»	84dc2dd059	community[patch]: Load YouTube transcripts (captions) as fixed-duration chunks with start times (#21710 ) - Description: Add a new format, `CHUNKS`, to `langchain_community.document_loaders.youtube.YoutubeLoader` which creates multiple `Document` objects from YouTube video transcripts (captions), each of a fixed duration. The metadata of each chunk `Document` includes the start time of each one and a URL to that time in the video on the YouTube website. I had implemented this for UMich (@umich-its-ai) in a local module, but it makes sense to contribute this to LangChain community for all to benefit and to simplify maintenance. - Issue: N/A - Dependencies: N/A - Twitter: lsloan_umich - Mastodon: [lsloan@mastodon.social](https://mastodon.social/@lsloan) With regards to tests and documentation, most existing features of the `YoutubeLoader` class are not tested. Only the `YoutubeLoader.extract_video_id()` static method had a test. However, while I was waiting for this PR to be reviewed and merged, I had time to add a test for the chunking feature I've proposed in this PR. I have added an example of using chunking to the `docs/docs/integrations/document_loaders/youtube_transcript.ipynb` notebook. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	3 months ago
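The fixed-duration chunking described in 84dc2dd059 can be sketched as grouping caption pieces by their start time. This is an illustration only, not the `YoutubeLoader` implementation: the function name, the dictionary keys, and the `VIDEO_ID` placeholder are assumptions, and the URL shape simply follows the usual `watch?v=...&t=...s` form of YouTube links.

```python
from typing import Any, Dict, List


def chunk_transcript(
    pieces: List[Dict[str, Any]], chunk_seconds: int, video_id: str
) -> List[Dict[str, Any]]:
    """Group caption pieces into chunks covering chunk_seconds each."""
    grouped: Dict[int, List[str]] = {}
    for piece in pieces:
        # Pieces starting in the same time window share a chunk index.
        index = int(piece["start"] // chunk_seconds)
        grouped.setdefault(index, []).append(piece["text"])
    chunks = []
    for index, texts in sorted(grouped.items()):
        start = index * chunk_seconds
        chunks.append(
            {
                "text": " ".join(texts),
                "start_seconds": start,
                "source": f"https://www.youtube.com/watch?v={video_id}&t={start}s",
            }
        )
    return chunks


captions = [
    {"text": "Hello", "start": 0.0},
    {"text": "world", "start": 45.5},
    {"text": "again", "start": 130.0},
]
chunks = chunk_transcript(captions, chunk_seconds=120, video_id="VIDEO_ID")
```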
Aayush Kataria	71811e0547	community[minor]: Adds a vector store for Azure Cosmos DB for NoSQL (#21676 ) This PR adds support for the Azure Cosmos DB for NoSQL vector store. Summary: Description: added vector store integration for Azure Cosmos DB for NoSQL Vector Store, Dependencies: azure-cosmos dependency, Tag maintainer: @hwchase17, @baskaryan @efriis @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	3 months ago
Mohammad Mohtashim	36cad5d25c	[Community]: Added Metadata filter support for DocumentDB Vector Store (#22777 ) - Description: As pointed out in this issue #22770, DocumentDB `similarity_search` does not support filtering through metadata which this PR adds by passing in the parameter `filter`. Also this PR fixes a minor Documentation error. - Issue: #22770 --------- Co-authored-by: Erick Friis <erickfriis@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	3 months ago
Dmitry Stepanov	912751e268	Ollama vision support (#22734 ) Description: Ollama vision with messages in OpenAI-style support `{ "image_url": { "url": ... } }` Issue: #22460 Added a flexible solution for ChatOllama to support chat messages with images. Works when you provide `image_url` either as a string or as a dict with "url" inside (like OpenAI does). This makes it possible to use tuples with `ChatPromptTemplate.from_messages()` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	3 months ago
Philippe PRADOS	0908b01cb2	langchain[minor]: Add native async implementation to LLMFilter, add concurrency to both sync and async paths (#22739 ) - PR title: "langchain: Fix chain_filter.py to be compatible with async" - Description: chain_filter is not compatible with async. - Twitter handle: pprados --------- Signed-off-by: zhangwangda <zhangwangda94@163.com> Co-authored-by: Prakul <discover.prakul@gmail.com> Co-authored-by: Lei Zhang <zhanglei@apache.org> Co-authored-by: Gin <ictgtvt@gmail.com> Co-authored-by: wangda <38549158+daziz@users.noreply.github.com> Co-authored-by: Max Mulatz <klappradla@posteo.net>	3 months ago
Jaeyeon Kim(김재연)	ce4e29ae42	community[minor]: fix redis store docstring and streamline initialization code (#22730 ) ### Description Fix the example in the docstring of redis store. Change the initialization logic, remove a redundant check, and improve the error message. ### Issue The example in docstring of how to use redis store was wrong. ![image](https://github.com/langchain-ai/langchain/assets/37469330/78c5d9ce-ee66-45b3-8dfe-ea29f125e6e9) ### Dependencies Nothing --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	3 months ago
am-kinetica	ad101adec8	community[patch]: Kinetica Integrations handled error in querying; quotes in table names; updated gpudb API (#22724 ) - [ ] Miscellaneous updates and fixes: - Description: Handled error in querying; quotes in table names; updated gpudb API - Issue: Threw an error with an error message difficult to understand if a query failed or returned no records - Dependencies: Updated GPUDB API version to `7.2.0.9` @baskaryan @hwchase17	3 months ago
Mathis Joffre	ea43f40daf	community[minor]: Add support for OVHcloud AI Endpoints Embedding (#22667 ) Description: Add support for [OVHcloud AI Endpoints](https://endpoints.ai.cloud.ovh.net/) Embedding models. Inspired by: https://gist.github.com/gmasse/e1f99339e161f4830df6be5d0095349a Signed-off-by: Joffref <mariusjoffre@gmail.com>	4 months ago
Erick Friis	2aaf86ddae	core: fix mustache falsy cases (#22747 )	4 months ago
Eugene Yurtsev	5a7eac191a	core[patch]: Add missing type annotations (#22756 ) Add missing type annotations. The missing type annotations will raise exceptions with pydantic 2.	4 months ago
Eugene Yurtsev	05d31a2f00	community[patch]: Add missing type annotations (#22758 ) Add missing type annotations to objects in community. These missing type annotations will raise type errors in pydantic 2.	4 months ago
Naka Masato	3237909221	langchain[patch]: allow to use partial variables in create_sql_query_chain (#22688 ) - Description: allow to use partial variables to pass `top_k` and `table_info` - Issue: no - Dependencies: no - Twitter handle: @gymnstcs --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Bharat Ramanathan	2b5631a6be	community[patch]: fix `WandbTracer` to work with new "RunV2" API (#22673 ) - Description: This PR updates the `WandbTracer` to work with the new RunV2 API so that wandb Traces logging works correctly for new LangChain versions. Here's an example [run](https://wandb.ai/parambharat/langchain-tracing/runs/wpm99ftq) from the existing tests - Issue: https://github.com/wandb/wandb/issues/7762 - Twitter handle: @ParamBharat _If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17._	4 months ago
Oguz Vuruskaner	f0f4532579	community[patch]: fix deepinfra inference (#22680 ) This PR includes: 1. Update of default model to LLama3. 2. Handle some 400x errors with more user friendly error messages. 3. Handle user errors.	4 months ago
Lucas Tucker	cb79e80b0b	docs: standardize ChatHuggingFace (#22693 ) Updated ChatHuggingFace doc string as per issue #22296: "langchain_huggingface: updated docstring for ChatHuggingFace in langchain_huggingface to match that of the description (in the appendix) provided in issue #22296. " Issue: This PR is in response to issue #22296, and more specifically ChatHuggingFace model. In particular, this PR updates the docstring for langchain/libs/partners/hugging_face/langchain_huggingface/chat_models/huggingface.py by adding the following sections: Instantiate, Invoke, Stream, Async, Tool calling, and Response metadata. I used the template from the Anthropic implementation and referenced the Appendix of the original issue post. I also noted that: langchain_community hugging face llms do not work with langchain_huggingface's ChatHuggingFace model (at least for me); the .stream(messages) functionality of ChatHuggingFace only returned a block of response. --------- Co-authored-by: lucast2021 <lucast2021@headroyce.org> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Tomaz Bratanic	76a193decc	community[patch]: Add function response to graph cypher qa chain (#22690 ) LLMs struggle with Graph RAG, because it's different from vector RAG in a way that you don't provide the whole context, only the answer and the LLM has to believe. However, that doesn't really work a lot of the time. However, if you wrap the context as function response the accuracy is much better. btw... `union[LLMChain, Runnable]` is linting fun, that's why so many ignores	4 months ago
X-HAN	34edfe4a16	community[minor]: add Volcengine Rerank (#22700 ) Description: this PR adds Volcengine Rerank capability to Langchain, you can find Volcengine Rerank API from [here](https://www.volcengine.com/docs/84313/1254474) & [here](https://www.volcengine.com/docs/84313/1254605). [Volcengine](https://www.volcengine.com/) is a cloud service platform developed by ByteDance, the parent company of TikTok. You can obtain Volcengine API AK/SK from [here](https://www.volcengine.com/docs/84313/1254553). Dependencies: VolcengineRerank depends on `volcengine` python package. Twitter handle: my twitter/x account is https://x.com/LastMonopoly and I'd like a mention, thank you! Tests and docs 1. integration test: `test_volcengine_rerank.py` 2. example notebook: `volcengine_rerank.ipynb` Lint and test: I have run `make format`, `make lint` and `make test` from the root of the package I've modified.	4 months ago
Mohammad Mohtashim	c3cce98d86	community[patch]: Small Fix in OutlookMessageLoader (Close the Message once Open) (#22744 ) - Description: A very small fix where we close the message when it opened - Issue: #22729	4 months ago
ccurme	f9fdca6cc2	openai: add `parallel_tool_calls` to api ref (#22746 ) ![Screenshot 2024-06-10 at 1 41 24 PM](https://github.com/langchain-ai/langchain/assets/26529506/2626bf9c-41c6-4431-b2e1-f59de1e4e468)	4 months ago
Max Mulatz	058a64c563	Community[minor]: Add language parser for Elixir (#22742 ) Hi 👋 First off, thanks a ton for your work on this 💚 Really appreciate what you're providing here for the community. ## Description This PR adds a basic language parser for the [Elixir](https://elixir-lang.org/) programming language. The parser code is based upon the approach outlined in https://github.com/langchain-ai/langchain/pull/13318: it's using `tree-sitter` under the hood and aligns with all the other `tree-sitter` based parsers added in that PR. The `CHUNK_QUERY` I'm using here is probably not the most sophisticated one, but it worked for my application. It's a starting point to provide "core" parsing support for Elixir in LangChain. It enables people to use the language parser out in real world applications which may then lead to further tweaking of the queries. I consider this PR just the ground work. - Dependencies: requires `tree-sitter` and `tree-sitter-languages` from the extended dependencies - Twitter handle:`@bitcrowd` ## Checklist - [x] PR title: "package: description" - [x] Add tests and docs - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified.	4 months ago
Philippe PRADOS	2d4689d721	langchain[minor]: Add pgvector to list of supported vectorstores in self query retriever (#22678 ) The fact that we outsourced pgvector to another project has an unintended effect. The mapping dictionary found by `_get_builtin_translator()` cannot recognize the new version of pgvector because it comes from another package. `SelfQueryRetriever` no longer knows `PGVector`. I propose to fix this by creating a global dictionary that can be populated by various database implementations. Thus, importing `langchain_postgres` will allow the registration of the `PGvector` mapping. But for the moment I'm just adding a lazy import Furthermore, the implementation of _get_builtin_translator() reconstructs the BUILTIN_TRANSLATORS variable with each invocation, which is not very efficient. A global map would be an optimization. - Twitter handle: pprados @eyurtsev, can you review this PR? And unlock the PR [Add async mode for pgvector](https://github.com/langchain-ai/langchain-postgres/pull/32) and PR [community[minor]: Add SQL storage implementation](https://github.com/langchain-ai/langchain/pull/22207)? Are you in favour of a global dictionary-based implementation of Translator?	4 months ago
Enzo Poggio	8f019e91d7	community[patch]: Use Custom Logger Instead of Root Logger in get_user_agent Function (#22691 ) ## Description This PR addresses a logging inconsistency in the `get_user_agent` function. Previously, the function was using the root logger to log a warning message when the "USER_AGENT" environment variable was not set. This bypassed the custom logger `log` that was created at the start of the module, leading to potential inconsistencies in logging behavior. Changes: - Replaced `logging.warning` with `log.warning` in the `get_user_agent` function to ensure that the custom logger is used. This change ensures that all logging in the `get_user_agent` function respects the configurations of the custom logger, leading to more consistent and predictable logging behavior. ## Dependencies None ## Issue None ## Tests and docs ☝🏻 see description ## `make format`, `make lint` & `cd libs/community; make test` ```shell > make format poetry run ruff format docs templates cookbook 1417 files left unchanged poetry run ruff check --select I --fix docs templates cookbook All checks passed! ``` ```shell > make lint poetry run ruff check docs templates cookbook All checks passed! poetry run ruff format docs templates cookbook --diff 1417 files already formatted poetry run ruff check --select I docs templates cookbook All checks passed! git grep 'from langchain import' docs/docs templates cookbook | grep -vE 'from langchain import (hub)' && exit 1 || exit 0 ``` ~cd libs/community; make test~ too many dependencies for integration ... ```shell > poetry run pytest tests/unit_tests .... ==== 884 passed, 466 skipped, 4447 warnings in 15.93s ==== ``` I choose you randomly : @ccurme	4 months ago
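The difference between the root logger and a module-level logger, as described in the commit above, can be sketched as follows. This is a minimal illustration rather than the actual `langchain_community` code: the default value `DEFAULT_USER_AGENT` and the warning text are assumptions made for the example; only the names `log`, `get_user_agent`, and the `USER_AGENT` environment variable come from the commit message.

```python
import logging
import os

# Module-level logger: handlers, levels and filters configured for this
# module's logger name apply to every message emitted through it.
log = logging.getLogger(__name__)

DEFAULT_USER_AGENT = "DefaultUserAgent"  # hypothetical fallback value


def get_user_agent() -> str:
    """Return the USER_AGENT environment variable, or a default with a warning."""
    env_user_agent = os.environ.get("USER_AGENT")
    if not env_user_agent:
        # log.warning (not logging.warning) keeps the message on the module's
        # logger instead of sending it straight to the root logger.
        log.warning("USER_AGENT environment variable not set; using a default.")
        return DEFAULT_USER_AGENT
    return env_user_agent
```

Calling `logging.warning(...)` instead would bypass any configuration attached to `log`, which is the inconsistency the commit describes.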
Philippe PRADOS	9aabb446c5	community[minor]: Add SQL storage implementation (#22207 ) Hello @eyurtsev - package: langchain-community - Description: Add SQL implementation for docstore. A new implementation, in line with my other PR ([async PGVector](https://github.com/langchain-ai/langchain-postgres/pull/32), [SQLChatMessageMemory](https://github.com/langchain-ai/langchain/pull/22065)) - Twitter handle: pprados --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Piotr Mardziel <piotrm@gmail.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	4 months ago
Nithish Raghunandanan	f2f0e0e13d	couchbase: Add the initial version of Couchbase partner package (#22087 ) Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	4 months ago
Cahid Arda Öz	6c07eb0c12	community[minor]: Add UpstashRatelimitHandler (#21885 ) Adding `UpstashRatelimitHandler` callback for rate limiting based on number of chain invocations or LLM token usage. For more details, see [upstash/ratelimit-py repository](https://github.com/upstash/ratelimit-py) or the notebook guide included in this PR. Twitter handle: @cahidarda --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	4 months ago
Erick Friis	9e03864d64	core: add error message for non-structured llm to StructuredPrompt (#22684 ) previously was the blank `NotImplementedError` from `BaseLanguageModel.with_structured_output`	4 months ago
ccurme	f32d57f6f0	anthropic: refactor streaming to use events api; add streaming usage metadata (#22628 ) - Refactor streaming to use raw events; - Add `stream_usage` class attribute and kwarg to stream methods that, if True, will include separate chunks in the stream containing usage metadata. There are two ways to implement streaming with anthropic's python sdk. They have slight differences in how they surface usage metadata. 1. [Use helper functions](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-helpers). This is what we are doing now. ```python count = 1 with client.messages.stream(params) as stream: for text in stream.text_stream: snapshot = stream.current_message_snapshot print(f"{count}: {snapshot.usage} -- {text}") count = count + 1 final_snapshot = stream.get_final_message() print(f"{count}: {final_snapshot.usage}") ``` ``` 1: Usage(input_tokens=8, output_tokens=1) -- Hello 2: Usage(input_tokens=8, output_tokens=1) -- ! 3: Usage(input_tokens=8, output_tokens=1) -- How 4: Usage(input_tokens=8, output_tokens=1) -- can 5: Usage(input_tokens=8, output_tokens=1) -- I 6: Usage(input_tokens=8, output_tokens=1) -- assist 7: Usage(input_tokens=8, output_tokens=1) -- you 8: Usage(input_tokens=8, output_tokens=1) -- today 9: Usage(input_tokens=8, output_tokens=1) -- ? 10: Usage(input_tokens=8, output_tokens=12) ``` To do this correctly, we need to emit a new chunk at the end of the stream containing the usage metadata. 2. 
[Handle raw events](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-responses) ```python stream = client.messages.create(params, stream=True) count = 1 for event in stream: print(f"{count}: {event}") count = count + 1 ``` ``` 1: RawMessageStartEvent(message=Message(id='msg_01Vdyov2kADZTXqSKkfNJXcS', content=[], model='claude-3-haiku-20240307', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=8, output_tokens=1)), type='message_start') 2: RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start') 3: RawContentBlockDeltaEvent(delta=TextDelta(text='Hello', type='text_delta'), index=0, type='content_block_delta') 4: RawContentBlockDeltaEvent(delta=TextDelta(text='!', type='text_delta'), index=0, type='content_block_delta') 5: RawContentBlockDeltaEvent(delta=TextDelta(text=' How', type='text_delta'), index=0, type='content_block_delta') 6: RawContentBlockDeltaEvent(delta=TextDelta(text=' can', type='text_delta'), index=0, type='content_block_delta') 7: RawContentBlockDeltaEvent(delta=TextDelta(text=' I', type='text_delta'), index=0, type='content_block_delta') 8: RawContentBlockDeltaEvent(delta=TextDelta(text=' assist', type='text_delta'), index=0, type='content_block_delta') 9: RawContentBlockDeltaEvent(delta=TextDelta(text=' you', type='text_delta'), index=0, type='content_block_delta') 10: RawContentBlockDeltaEvent(delta=TextDelta(text=' today', type='text_delta'), index=0, type='content_block_delta') 11: RawContentBlockDeltaEvent(delta=TextDelta(text='?', type='text_delta'), index=0, type='content_block_delta') 12: RawContentBlockStopEvent(index=0, type='content_block_stop') 13: RawMessageDeltaEvent(delta=Delta(stop_reason='end_turn', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=12)) 14: RawMessageStopEvent(type='message_stop') ``` Here we implement the second option, in part because it should make 
things easier when implementing streaming tool calls in the near future. This would add two new chunks to the stream-- one at the beginning and one at the end-- with blank content and containing usage metadata. We add kwargs to the stream methods and a class attribute allowing for this behavior to be toggled. I enabled it by default. If we merge this we can add the same kwargs / attribute to OpenAI. Usage: ```python from langchain_anthropic import ChatAnthropic model = ChatAnthropic( model="claude-3-haiku-20240307", temperature=0 ) full = None for chunk in model.stream("hi"): full = chunk if full is None else full + chunk print(chunk) print(f"\nFull: {full}") ``` ``` content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 0, 'total_tokens': 8} content='Hello' id='run-8a20843f-25c7-4025-ad72-9add395899e3' content='!' id='run-8a20843f-25c7-4025-ad72-9add395899e3' content=' How' id='run-8a20843f-25c7-4025-ad72-9add395899e3' content=' can' id='run-8a20843f-25c7-4025-ad72-9add395899e3' content=' I' id='run-8a20843f-25c7-4025-ad72-9add395899e3' content=' assist' id='run-8a20843f-25c7-4025-ad72-9add395899e3' content=' you' id='run-8a20843f-25c7-4025-ad72-9add395899e3' content=' today' id='run-8a20843f-25c7-4025-ad72-9add395899e3' content='?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 0, 'output_tokens': 12, 'total_tokens': 12} Full: content='Hello! How can I assist you today?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20} ```	4 months ago
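How the first and last usage chunks combine into the `Full:` result shown in the commit above can be sketched without any SDK. This is a simplified stand-in, not the `langchain_core` message chunk classes: the `Chunk` dataclass and its `__add__` are hypothetical, and the token counts are copied from the example output in the commit message.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Chunk:
    """Simplified stand-in for a streamed message chunk."""

    content: str
    usage_metadata: Optional[Dict[str, int]] = None

    def __add__(self, other: "Chunk") -> "Chunk":
        # Concatenate text and sum usage counters key by key.
        if self.usage_metadata is None and other.usage_metadata is None:
            usage = None
        else:
            left = self.usage_metadata or {}
            right = other.usage_metadata or {}
            usage = {
                key: left.get(key, 0) + right.get(key, 0)
                for key in sorted(set(left) | set(right))
            }
        return Chunk(self.content + other.content, usage)


stream = [
    # First chunk: blank content, input token usage.
    Chunk("", {"input_tokens": 8, "output_tokens": 0, "total_tokens": 8}),
    Chunk("Hello"), Chunk("!"), Chunk(" How"), Chunk(" can"), Chunk(" I"),
    Chunk(" assist"), Chunk(" you"), Chunk(" today"), Chunk("?"),
    # Last chunk: blank content, output token usage.
    Chunk("", {"input_tokens": 0, "output_tokens": 12, "total_tokens": 12}),
]

full = None
for chunk in stream:
    full = chunk if full is None else full + chunk
```

Because each usage chunk reports only its own share, summing them gives the totals for the whole response without double counting.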
Bagatur	235d91940d	community[patch]: Release 0.2.4 (#22643 )	4 months ago
William FH	be79ce9336	[Core] Unified Enable/Disable Tracing (#22576 )	4 months ago
Bagatur	fe2e5a3b74	langchain[patch]: Release 0.2.3 (#22644 )	4 months ago
Erick Friis	a24a9c6427	multiple: get rid of pyproject extras (#22581 ) They cause `poetry lock` to take a ton of time, and `uv pip install` can resolve the constraints from these toml files in trivial time (addressing problem with #19153) This allows us to properly upgrade lockfile dependencies moving forward, which revealed some issues that were either fixed or type-ignored (see file comments)	4 months ago
Bagatur	4367e89c9a	core[patch]: Release 0.2.5 (#22642 )	4 months ago
Eugene Yurtsev	28f744c1f5	core[patch]: Correctly order parent ids in astream events (from root to immediate parent), add defensive check for cycles (#22637 ) This PR makes two changes: 1. Fixes the order of parent IDs to be from root to immediate parent 2. Adds a simple defensive check for cycles	4 months ago
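The two changes in the commit above (root-to-immediate-parent ordering and a defensive cycle check) can be sketched in plain Python. This is an illustration under assumptions, not the `langchain_core` implementation: the function name `get_parent_ids` and the `parent_of` mapping from a run id to its parent id are hypothetical.

```python
from typing import Dict, List, Optional


def get_parent_ids(run_id: str, parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Return the ancestors of run_id, ordered from root to immediate parent."""
    ancestors: List[str] = []
    seen = {run_id}
    current = parent_of.get(run_id)
    while current is not None:
        # Defensive check: a repeated id means the parent chain loops.
        if current in seen:
            raise ValueError(f"Cycle detected in parent chain at run id {current!r}")
        seen.add(current)
        ancestors.append(current)
        current = parent_of.get(current)
    # The walk collects immediate parent first, so reverse to start at the root.
    ancestors.reverse()
    return ancestors


parent_of = {"root": None, "chain": "root", "llm": "chain"}
parent_ids = get_parent_ids("llm", parent_of)
```

Without the `seen` set, a malformed mapping such as `{"a": "b", "b": "a"}` would loop forever instead of failing fast.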
Eugene Yurtsev	035a9c9609	core[minor]: Add parent_ids to astream_events API (#22563 ) Include a list of parent ids for each event in astream events.	4 months ago
Nicolas Nkiere	51005e2776	core[minor]: Add an async root listener and with_alisteners method (#22151 ) - [x] Adding AsyncRootListener: "langchain_core: Adding AsyncRootListener" - Description: Adding an AsyncBaseTracer, AsyncRootListener and `with_alistener` function. This is to enable binding async root listener to runnables. This is currently only supported for sync listeners. - Issue: None - Dependencies: None - [x] Add tests and docs: Added unit tests and example snippet code within the function description of `with_alistener` - [x] Lint and test: Run make format_diff, make lint_diff and make test	4 months ago
seyf97	2904c50cd5	openai[patch]: correct grammar in exception message in embeddings/base.py (#22629 ) Correct the grammar error for missing transformers package ValueError	4 months ago
Anush	80560419b0	qdrant[patch]: Make path optional in from_existing_collection() (#21875 ) ## Description The `path` param is used to specify the local persistence directory, which isn't required if using Qdrant server. This is a breaking but necessary change.	4 months ago
ccurme	b57aa89f34	multiple: implement ls_params (#22621 ) implement ls_params for ai21, fireworks, groq.	4 months ago
Xiangrui Meng	f26ab93df8	community: support Databricks Unity Catalog functions as LangChain tools (#22555 ) This PR adds support for using Databricks Unity Catalog functions as LangChain tools, which runs inside a Databricks SQL warehouse. * An example notebook is provided.	4 months ago
ccurme	c1ef731503	anthropic: update attribute name and alias (#22625 ) update name to `stop_sequences` and alias to `stop` (instead of the other way around), since `stop_sequences` is the name used by anthropic.	4 months ago
lucasiscovici	05bf98b2f9	community[patch]: pgvector replace nin_ by not_in (#22619 ) - [ ] community: "pgvector: replace nin_ by not_in" - [ ] PR message: nin_ does not exist in the SQLAlchemy ORM; the correct name is not_in	4 months ago
ccurme	3999761201	multiple: add `stop` attribute (#22573 )	4 months ago
ccurme	e08879147b	Revert "anthropic: stream token usage" (#22624 ) Reverts langchain-ai/langchain#20180	4 months ago
Bagatur	0d495f3f63	anthropic: stream token usage (#20180 ) open to other ideas <img width="1181" alt="Screenshot 2024-04-08 at 5 34 08 PM" src="https://github.com/langchain-ai/langchain/assets/22008038/03eb11c4-5eb5-43e3-9109-a13f76098fa4"> --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	4 months ago
Satyam Kumar	17b486a37b	openai, azure: update model_name in ChatResult to use name from API response (#22569 ) The response.get("model", self.model_name) checks if the model key exists in the response dictionary. If it does, it uses that value; otherwise, it uses self.model_name. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
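The fallback behaviour of `response.get("model", self.model_name)` described above is ordinary `dict.get` semantics. A minimal sketch, assuming the response is a plain dictionary; the helper `resolve_model_name` and the example model strings are hypothetical and exist only for illustration:

```python
from typing import Any, Dict


def resolve_model_name(response: Dict[str, Any], configured_name: str) -> str:
    """Prefer the model name reported by the API; fall back to the configured one."""
    return response.get("model", configured_name)


# The API reported a concrete model name: that value is used.
from_api = resolve_model_name({"model": "example-model-2024-01-01"}, "example-model")
# The response has no "model" key: the configured name is used instead.
fallback = resolve_model_name({"choices": []}, "example-model")
```

Note that `dict.get` only falls back when the key is absent; a response containing `"model": None` would yield `None` in this sketch.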
Christophe Bornet	12ddb4fc6f	core[patch]: Use explicit classes for InMemoryByteStore and InMemoryStore (#22608 ) The current implementation doesn't work well with type checking. Instead replace with class definition that correctly works with type checking.	4 months ago
andyjessen	cfed68e06f	docs: Fix description (#22611 ) This commit fixes the description of the hair_color field.	4 months ago
ccurme	1925bde32e	together: bump langchain-core (#22616 ) langchain-together depends on langchain-openai ^0.1.8 langchain-openai 0.1.8 has langchain-core >= 0.2.2 Here we bump langchain-core to 0.2.2, just to pass minimum dependency version tests.	4 months ago
ccurme	35f4aa927b	together[patch]: Release 0.1.3 (#22615 )	4 months ago
andyjessen	8b40428f58	docs: Fix typo (#22603 ) This commit changes minor typo in the field description.	4 months ago
Isaac Francisco	ba3e219d83	community[patch]: recursive url loader fix and unit tests (#22521 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Jeffrey Mak	5fc5ed463c	community[patch]:Support filter for AzureAISearchRetriever (#22303 ) Description: The AzureAISearchRetriever does not support the "$filter" argument offered in the AISearch API: https://learn.microsoft.com/en-us/rest/api/searchservice/documents/search-get?view=rest-searchservice-2023-11-01&tabs=HTTP The $filter allows filtering of indexes based on values in metadata. Issue: https://github.com/langchain-ai/langchain/issues/19885 Dependencies: No Twitter handle: @Jeffreym9M - [ ] Add tests and docs: Not relevant - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	4 months ago
Isaac Francisco	148088a588	docs: duckduckgosearch options listed (#22568 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
X-HAN	62f13f95e4	community[minor]: add DashScope Rerank (#22403 ) Description: this PR adds DashScope Rerank capability to Langchain, you can find DashScope Rerank API from [here](https://help.aliyun.com/document_detail/2780058.html?spm=a2c4g.2780059.0.0.6d995024FlrJ12) & [here](https://help.aliyun.com/document_detail/2780059.html?spm=a2c4g.2780058.0.0.63f75024cr11N9). [DashScope](https://dashscope.aliyun.com/) is the generative AI service from Alibaba Cloud (Aliyun). You can create DashScope API key from [here](https://bailian.console.aliyun.com/?apiKey=1#/api-key). Dependencies: DashScopeRerank depends on `dashscope` python package. Twitter handle: my twitter/x account is https://x.com/LastMonopoly and I'd like a mention, thank you! Tests and docs 1. integration test: `test_dashscope_rerank.py` 2. example notebook: `dashscope_rerank.ipynb` Lint and test: I have run `make format`, `make lint` and `make test` from the root of the package I've modified. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	4 months ago
Ethan Yang	29064848f9	[Community]add option to delete the prompt from HF output (#22225 ) This will help to solve pattern mismatching issue when parsing the output in Agent. https://github.com/langchain-ai/langchain/issues/21912	4 months ago
Bagatur	584a1e30ac	community[patch]: AzureSearch async functions (#22075 )	4 months ago
Bagatur	1a911018bc	langchain[minor]: add universal init_model (#22039 ) decisions to discuss - only chat models - model_provider isn't based on any existing values like llm-type, package names, class names - implemented as function not as a wrapper ChatModel - function name (init_model) - in langchain as opposed to community or core - marked beta	4 months ago
ccurme	af129974a3	community: update how OpenAIAssistantV2Runnable creates threads with tool_resources (#22549 ) https://github.com/langchain-ai/langchain/issues/22503	4 months ago
Bagatur	51a0d4574e	community[patch]: Release 0.2.3 (#22562 )	4 months ago
Bagatur	b2daba37c7	nomic[patch]: Release 0.1.2 (#22561 )	4 months ago
Zach Nussbaum	14f3014cce	embeddings: nomic embed vision (#22482 ) Description: Adds Langchain support for Nomic Embed Vision Twitter handle: nomic_ai,zach_nussbaum --------- Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
leila-messallem	3280a5b49b	community[patch]: improve test setup to accurately test filtering of labels in neo4j (#22531 ) Description: This PR addresses an issue with an existing test that was not effectively testing the intended functionality. The previous test setup did not adequately validate the filtering of the labels in neo4j, because the nodes and relationship in the test data did not have any properties set. Without properties these labels would not have been returned, regardless of the filtering. --------- Co-authored-by: Oskar Hane <oh@oskarhane.com>	4 months ago
Mohammad Mohtashim	7fcef2556c	[Experimental]: Async agenerate method ollama functions (#21682 ) - Description: Added an async generate method for OllamaFunctions, which was missing and was raising errors for users. - Issue: #21422	4 months ago
Stefano Lottini	328d0c99f2	community[minor]: Add support for metadata indexing policy in Cassandra vector store (#22548 ) This PR adds a constructor `metadata_indexing` parameter to the Cassandra vector store to allow optional fine-tuning of which fields of the metadata are to be indexed. This is a feature supported by the underlying CassIO library. Indexing mode of "all", "none" or deny- and allow-list based choices are available. The rationale is, in some cases it's advisable to programmatically exclude some portions of the metadata from the index if one knows in advance they won't ever be used at search-time. This keeps the index more lightweight and performant and avoids limitations on the length of _indexed_ strings. I added an integration test of the feature. I also added the possibility of running the integration test with Cassandra on an arbitrary IP address (e.g. Dockerized), via `CASSANDRA_CONTACT_POINTS=10.1.1.5,10.1.1.6 poetry run pytest [...]` or similar. While I was at it, I added a line to the `.gitignore` since the mypy _test_ cache was not ignored yet. My X (Twitter) handle: @rsprrs.	4 months ago
Emilien Chauvet	c3d4126eb1	community[minor]: add user agent for web scraping loaders (#22480 ) Description: This PR adds a `USER_AGENT` env variable that is to be used for web scraping. It creates a util to get that user agent and uses it in the classes used for scraping in [this piece of doc](https://python.langchain.com/v0.1/docs/use_cases/web_scraping/). Identifying your scraper is considered a good politeness practice, this PR aims at easing it. Issue: `None` Dependencies: `None` Twitter handle: `None`	4 months ago
Philippe PRADOS	8250c177de	community[minor]: Add native async support to SQLChatMessageHistory (#22065 ) # package community: Fix SQLChatMessageHistory ## Description Here is a rewrite of `SQLChatMessageHistory` to properly implement the asynchronous approach. The code circumvents [issue 22021](https://github.com/langchain-ai/langchain/issues/22021) by accepting a synchronous call to `def add_messages()` in an asynchronous scenario. This bypasses the bug. For the same reasons as in [PR 22](https://github.com/langchain-ai/langchain-postgres/pull/32) of `langchain-postgres`, we use a lazy strategy for table creation. Indeed, the promise of the constructor cannot be fulfilled without this. It is not possible to invoke a synchronous call in a constructor. We compensate for this by waiting for the next asynchronous method call to create the table. The goal of the `PostgresChatMessageHistory` class (in `langchain-postgres`) is, among other things, to be able to recycle database connections. The implementation of the class is problematic, as we have demonstrated in [issue 22021](https://github.com/langchain-ai/langchain/issues/22021). Our new implementation of `SQLChatMessageHistory` achieves this by using a singleton of type (`Async`)`Engine` for the database connection. The connection pool is managed by this singleton, and the code is then reentrant. We also accept the type `str` (optionally complemented by `async_mode`. I know you don't like this much, but it's the only way to allow an asynchronous connection string). In order to unify the different classes handling database connections, we have renamed `connection_string` to `connection`, and `Session` to `session_maker`. Now, a single transaction is used to add a list of messages. Thus, a crash during this write operation will not leave the database in an unstable state with a partially added message list. This makes the code resilient. 
We believe that the `PostgresChatMessageHistory` class is no longer necessary and can be replaced by: ``` PostgresChatMessageHistory = SQLChatMessageHistory ``` This also fixes the bug. ## Issue - [issue 22021](https://github.com/langchain-ai/langchain/issues/22021) - Bug in _exit_history() - Bugs in PostgresChatMessageHistory and sync usage - Bugs in PostgresChatMessageHistory and async usage - [issue 36](https://github.com/langchain-ai/langchain-postgres/issues/36) ## Twitter handle: pprados ## Tests - libs/community/tests/unit_tests/chat_message_histories/test_sql.py (add async test) @baskaryan, @eyurtsev or @hwchase17 can you check this PR ? And, I've been waiting a long time for validation from other PRs. Can you take a look? - [PR 32](https://github.com/langchain-ai/langchain-postgres/pull/32) - [PR 15575](https://github.com/langchain-ai/langchain/pull/15575) - [PR 13200](https://github.com/langchain-ai/langchain/pull/13200) --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	4 months ago
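A minimal sketch of the single-transaction write described in #22065 above. It uses the standard-library `sqlite3` module rather than SQLAlchemy, and the `message_store` schema and the `add_messages` helper are assumptions made for illustration, not the actual `SQLChatMessageHistory` code.

```python
import sqlite3
from typing import List, Optional


def add_messages(conn: sqlite3.Connection, session_id: str, messages: List[Optional[str]]) -> None:
    """Insert all messages in one transaction (hypothetical helper)."""
    # Using the connection as a context manager commits on success and
    # rolls back every row of the batch if any insert raises.
    with conn:
        conn.executemany(
            "INSERT INTO message_store (session_id, message) VALUES (?, ?)",
            [(session_id, message) for message in messages],
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message_store ("
    "id INTEGER PRIMARY KEY, session_id TEXT NOT NULL, message TEXT NOT NULL)"
)
conn.commit()

add_messages(conn, "s1", ["hi", "hello"])

try:
    # The second row violates NOT NULL, so the whole batch is rolled back.
    add_messages(conn, "s1", ["ok", None])
except sqlite3.IntegrityError:
    pass

count = conn.execute("SELECT COUNT(*) FROM message_store").fetchone()[0]
print(count)
```

Only the two rows from the first call remain, so a failure partway through a batch does not leave a partially written message list.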
Vincent Min	59bef31997	community[minor]: Improve InMemoryVectorStore with ability to persist to disk and filter on metadata. (#22186 ) - Description: The InMemoryVectorStore is a nice and simple vector store implementation for quick development and debugging. The current implementation is quite limited in its functionalities. This PR extends the functionalities by adding utility function to persist the vector store to a json file and to load it from a json file. We choose the json file format because it allows inspection of the database contents in a text editor, which is great for debugging. Furthermore, it adds a `filter` keyword that can be used to filter out documents on their `page_content` or `metadata`. - Issue: - - Dependencies: - - Twitter handle: @Vincent_Min	4 months ago
Christophe Bornet	c34ad8c163	core[patch]: Improve VectorStore API doc (#22547 )	4 months ago
maang-h	89128b7a49	community[patch]: add detailed paragraph and example for BaichuanTextEmbeddings (#22031 ) - Description: add detailed paragraph and example for BaichuanTextEmbeddings - Issue: the issue #21983	4 months ago
Anthony Bernabeu	4e676a63b8	community[minor]: Added filter search for LanceDB (#22461 ) This PR adds filtering capabilities to the LanceDB vector store. - Description: In LanceDB, a filter can be applied when searching the vector store. Filters are written in SQL, as described in the LanceDB documentation. - Issue: #18235 - Dependencies: No	4 months ago
Erick Friis	4050d6ea2b	huggingface: remove text-generation dep (#22543 )	4 months ago
Erick Friis	a6fc74f379	ai21: fix core version (#22544 )	4 months ago
Asaf Joseph Gardin	75cba742e5	ai21: fix ai21 unittests (#22526 ) Co-authored-by: Asaf Gardin <asafg@ai21.com> Co-authored-by: Erick Friis <erick@langchain.dev>	4 months ago
Erick Friis	58192d617f	community: fix huggingface deprecations (#22522 )	4 months ago
Christophe Bornet	8ba868d3b0	core[patch]: Add similarity_score_threshold to VectorStore search types (#22477 )	4 months ago
Eugene Yurtsev	9120cf5df2	core[patch]: Deduplicate of callback handlers in merge_configs (#22478 ) This PR adds deduplication of callback handlers in merge_configs. Fix for this issue: https://github.com/langchain-ai/langchain/issues/22227 The issue appears when the code is: 1) running python >=3.11 2) invokes a runnable from within a runnable 3) binds the callbacks to the child runnable from the parent runnable using with_config In this case, the same callbacks end up appearing twice: (1) the first time from with_config, (2) the second time with langchain automatically propagating them on behalf of the user. Prior to this PR this will emit duplicate events: ```python @tool async def get_items(question: str, callbacks: Callbacks): # <--- Accept callbacks """Ask question""" template = ChatPromptTemplate.from_messages( [ ( "human", "'{question}" ) ] ) chain = template | chat_model.with_config( { "callbacks": callbacks, # <-- Propagate callbacks } ) return await chain.ainvoke({"question": question}) ``` Prior to this PR this will work correctly (no duplicate events): ```python @tool async def get_items(question: str, callbacks: Callbacks): # <--- Accept callbacks """Ask question""" template = ChatPromptTemplate.from_messages( [ ( "human", "'{question}" ) ] ) chain = template | chat_model return await chain.ainvoke({"question": question}, {"callbacks": callbacks}) ``` This will also work (as long as the user is using python >= 3.11) -- as langchain will automatically propagate callbacks ```python @tool async def get_items(question: str,): """Ask question""" template = ChatPromptTemplate.from_messages( [ ( "human", "'{question}" ) ] ) chain = template | chat_model return await chain.ainvoke({"question": question}) ```	4 months ago
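A minimal sketch of order-preserving deduplication of callback handlers, the idea described in #22478 above. The `merge_handlers` function and the `PrintHandler` class are hypothetical names for illustration; the actual `merge_configs` logic in langchain-core may differ.

```python
from typing import Any, List, Optional, Sequence


class PrintHandler:
    """Stand-in for a callback handler (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name


def merge_handlers(*handler_lists: Optional[Sequence[Any]]) -> List[Any]:
    """Concatenate handler lists, keeping only the first occurrence of each object."""
    merged: List[Any] = []
    for handlers in handler_lists:
        for handler in handlers or []:
            # Compare by identity: the same handler object bound via with_config
            # and propagated automatically should only be registered once.
            if not any(handler is existing for existing in merged):
                merged.append(handler)
    return merged


tracer = PrintHandler("tracer")
logger = PrintHandler("logger")

# The parent passes `tracer` explicitly and it is also propagated automatically.
merged = merge_handlers([tracer], [tracer, logger], None)
print([h.name for h in merged])
```

Deduplicating by identity rather than equality keeps two distinct handlers of the same class, which a caller may have registered on purpose.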
Ofer Mendelevitch	ad502e8d50	community[minor]: Vectara Integration Update - Streaming, FCS, Chat, updates to documentation and example notebooks (#21334 ) Thank you for contributing to LangChain! Description: update to the Vectara / Langchain integration to integrate new Vectara capabilities: - Full RAG implemented as a Runnable with as_rag() - Vectara chat supported with as_chat() - Both support streaming response - Updated documentation and example notebook to reflect all the changes - Updated Vectara templates Twitter handle: ofermend Add tests and docs: no new tests or docs, but updated both existing tests and existing docs	4 months ago
Bagatur	cb183a9bf1	docs: update anthropic chat model (#22483 ) Related to #22296 And update anthropic to accept base_url	4 months ago
Erick Friis	d700ce8545	robocorp: typo (#22509 )	4 months ago
Erick Friis	39fd44579a	robocorp: release 0.0.9.post1 (#22507 )	4 months ago
Erick Friis	339e3b7f55	ai21: release 0.1.6 (#22508 )	4 months ago
ccurme	3c53cea760	together, upstage: bump minimum langchain-openai version (#22505 )	4 months ago
Bagatur	efcb04f84b	mongodb[patch]: Release 0.1.6 (#22501 )	4 months ago
Bagatur	222b1ba112	groq[patch]: Release 0.1.5 (#22500 )	4 months ago
Bagatur	f021be510e	milvus[patch]: Release 0.1.1 (#22499 )	4 months ago
Bagatur	64d68c17cd	upstage[patch]: Release 0.1.6 (#22498 )	4 months ago
Bagatur	48fba40fce	experimental[patch]: Release 0.0.60 (#22497 )	4 months ago
Bagatur	e60f88ccdd	community[patch]: Release 0.2.2 (#22496 )	4 months ago
Bagatur	85aa218564	langchain[patch]: Release 0.2.2 (#22495 )	4 months ago
Bagatur	8e86080def	mistralai[patch]: Release 0.1.8 (#22494 )	4 months ago
Bagatur	e850de2422	huggingface[patch]: release 0.0.2 (#22493 )	4 months ago
Bagatur	99a3cad258	text-splitters[patch]: Release 0.2.1 (#22490 )	4 months ago
Bagatur	161b02a8be	core[patch]: Release 0.2.4 (#22489 )	4 months ago
Joydeep Banik Roy	3796672c67	community, milvus, pinecone, qdrant, mongo: Broadcast operation failure while using simsimd beyond v3.7.7 (#22271 ) - [ ] Packages affected: - community: fix `cosine_similarity` to support simsimd beyond 3.7.7 - partners/milvus: fix `cosine_similarity` to support simsimd beyond 3.7.7 - partners/mongodb: fix `cosine_similarity` to support simsimd beyond 3.7.7 - partners/pinecone: fix `cosine_similarity` to support simsimd beyond 3.7.7 - partners/qdrant: fix `cosine_similarity` to support simsimd beyond 3.7.7 - [ ] Broadcast operation failure while using simsimd beyond v3.7.7: - Description: I was using simsimd 4.3.1 and the unsupported operand type issue popped up. When I checked out the repo and ran the tests, they failed as well (have attached a screenshot for that). Looks like it is a variant of https://github.com/langchain-ai/langchain/issues/18022 . Prior to 3.7.7, simd.cdist returned an ndarray but now it returns simsimd.DistancesTensor which is ineligible for a broadcast operation with numpy. With this change, it also remove the need to explicitly cast `Z` to numpy array - Issue: #19905 - Dependencies: No - Twitter handle: https://x.com/GetzJoydeep <img width="1622" alt="Screenshot 2024-05-29 at 2 50 00 PM" src="https://github.com/langchain-ai/langchain/assets/31132555/fb27b383-a9ae-4a6f-b355-6d503b72db56"> - [ ] Considerations: 1. I started with community but since similar changes were there in Milvus, MongoDB, Pinecone, and QDrant so I modified their files as well. If touching multiple packages in one PR is not the norm, then I can remove them from this PR and raise separate ones 2. I have run and verified that the tests work. Since, only MongoDB had tests, I ran theirs and verified it works as well. 
Screenshots attached : <img width="1573" alt="Screenshot 2024-05-29 at 2 52 13 PM" src="https://github.com/langchain-ai/langchain/assets/31132555/ce87d1ea-19b6-4900-9384-61fbc1a30de9"> <img width="1614" alt="Screenshot 2024-05-29 at 3 33 51 PM" src="https://github.com/langchain-ai/langchain/assets/31132555/6ce1d679-db4c-4291-8453-01028ab2dca5"> I have added a test for simsimd. I feel it may not go well with the CI/CD setup as installing simsimd is not a dependency requirement. I have just imported simsimd to ensure simsimd cosine similarity is invoked. However, its not a good approach. Suggestions are welcome and I can make the required changes on the PR. Please provide guidance on the same as I am new to the community. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	4 months ago
KyrianC	03178ee74f	community[minor]: Add tools calls to `ChatEdenAI` (#22320 ) ### Description Add tools implementation to `ChatEdenAI`: - `bind_tools()` - `with_structured_output()` ### Documentation Updated `docs/docs/integrations/chat/edenai.ipynb` ### Notes We don't support streaming with tools yet. If stream is called with tools we directly yield the whole message from `generate` (implemented the same way as Anthropic did).	4 months ago
pranavvuppala	9d4350e69a	docs : Update docstrings for OpenAI base.py (#22221 ) - [x] PR title: Update docstrings for OpenAI base.py -Description: Updated the docstring of few OpenAI functions for a better understanding of the function. - Issue: #21983 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Anindyadeep	7a197539aa	community[patch]: Native RAG Support in Prem AI langchain (#22238 ) This PR adds native RAG support in langchain premai package. The same has been added in the docs too.	4 months ago
Rahul Triptahi	77ad857934	community[minor]: Enable retrieval api calls in PebbloRetrievalQA (#21958 ) Description: Enable app discovery and Prompt/Response apis in PebbloSafeRetrieval Documentation: NA Unit test: N/A --------- Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	4 months ago
liugz18	8fd231086e	experimental[patch]: Fix graph_transformers llms #21482 (#22417 ) Fix AttributeError on calling LLMGraphTransformer.convert_to_graph_documents #21482 since raw_schema is always a str @baskaryan	4 months ago
ccurme	6db25b4e31	core[patch]: bump langsmith (#22476 ) Noticing errors logged in some situations when tracing with Langsmith: ```python from langchain_core.pydantic_v1 import BaseModel from langchain_anthropic import ChatAnthropic class AnswerWithJustification(BaseModel): """An answer to the user question along with justification for the answer.""" answer: str justification: str llm = ChatAnthropic(model="claude-3-haiku-20240307") structured_llm = llm.with_structured_output(AnswerWithJustification) list(structured_llm.stream("What weighs more a pound of bricks or a pound of feathers")) ``` ``` Error in LangChainTracer.on_chain_end callback: AttributeError("'NoneType' object has no attribute 'append'") [AnswerWithJustification(answer='A pound of bricks and a pound of feathers weigh the same amount.', justification='This is because a pound is a unit of mass, not volume. By definition, a pound of any material, whether bricks or feathers, will weigh the same - one pound. The physical size or volume of the materials does not matter when measuring by mass. So a pound of bricks and a pound of feathers both weigh exactly one pound.')] ```	4 months ago
Bagatur	17c127531a	community[patch]: deprecate all HF classes (#22444 )	4 months ago
Nuno Campos	58b118544e	Use immutable sequence type for batch/batch_as_completed types (#22433 )	4 months ago
Christophe Bornet	9a8fe58ebe	community[minor]: Improve Cassandra VectorStore as_retriever (#22465 ) The Vectorstore's API `as_retriever` doesn't expose explicitly the parameters `search_type` and `search_kwargs` and so these are not well documented. This PR improves `as_retriever` for the Cassandra VectorStore by making these parameters explicit. NB: An alternative would have been to modify `as_retriever` in `Vectorstore`. But there's probably a good reason these were not exposed in the first place ? Is it because implementations may decide to not support them and have fixed values when creating the VectorStoreRetriever ?	4 months ago
Christophe Bornet	23bba18f92	core[patch]: Fix VectorStore's as_retriever mutating tags param (#22470 ) The current VectorStore `as_retriever` implementation mutates the `tags` param when it's passed in kwargs. This fix ensures that a copy is done.	4 months ago
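A minimal sketch of the copy-before-extend fix described in #22470 above. The `build_retriever_kwargs` function and the tag values are hypothetical; this is plain Python, not the actual `VectorStore.as_retriever` code.

```python
from typing import Any, Dict, List


def build_retriever_kwargs(default_tags: List[str], **kwargs: Any) -> Dict[str, Any]:
    """Return kwargs with default tags appended, without touching the caller's list."""
    # list(...) makes a copy, so extending it cannot mutate the list the caller passed in.
    tags = list(kwargs.pop("tags", None) or [])
    tags.extend(default_tags)
    return {**kwargs, "tags": tags}


caller_tags = ["my-tag"]
result = build_retriever_kwargs(["Cassandra", "FakeEmbeddings"], tags=caller_tags)

print(result["tags"])
print(caller_tags)
```

Without the copy, calling the function twice with the same `caller_tags` list would keep appending the default tags to it.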
Michal Gregor	98b2e7b195	huggingface[patch]: Support for HuggingFacePipeline in ChatHuggingFace. (#22194 ) - Description: Added support for using HuggingFacePipeline in ChatHuggingFace (previously it was only usable with API endpoints, probably by oversight). - Issue: #19997 - Dependencies: none - Twitter handle: none --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	4 months ago
Fahreddin Özcan	0061ded002	community[patch]: Upstash Vector Store Namespace Support (#22251 ) This PR introduces namespace support for Upstash Vector Store, which would allow users to partition their data in the vector index. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	4 months ago
Guangdong Liu	bc7e32f315	core(patch):fix partial_variables not working with SystemMessagePromptTemplate (#20711 ) - Issue: close #17560 - @baskaryan, @eyurtsev	4 months ago
Dristy Srivastava	ef3df45d9d	community[minor]: Updating payload for pebblo discover API (#22309 ) Description: Updating response for pebblo discover API. Also updating field name case type Documentation: N/A Unit tests: N/A	4 months ago
Miroslav	cbd5720011	huggingface[patch]: Skip Login to HuggingFaceHub when token is not set (#22365 )	4 months ago
bhardwaj-vipul	f397a84a59	langchain[patch]: Fix MongoDBAtlasVectorSearch reference in self query retriever (#22401 ) Description: SelfQuery Retriever with MongoDBAtlasVectorSearch (from langchain_mongodb import MongoDBAtlasVectorSearch) and Chroma (from langchain_chroma import Chroma) is not supported. The imports in the [builtin translators](`8cbce684d4/libs/langchain/langchain/retrievers/self_query/base.py (L73)`) points to the [deprecated](`acaf214a45/libs/community/langchain_community/vectorstores/mongodb_atlas.py (L36)`) vectorstore. Issue: #22272 --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	4 months ago
ccurme	afe89a1411	community: add standard chat model params to Ollama (#22446 )	4 months ago
Ethan Yang	52da6a160d	community[patch]: Update OpenVINO embedding and reranker to support static input shape (#22171 ) It can help to deploy embedding models on NPU device	4 months ago
Tom Clelford	c599732e1a	text-splitters[patch]: fix HTMLSectionSplitter parsing of xslt paths (#22176 ) ## Description This PR allows passing the HTMLSectionSplitter paths to xslt files. It does so by fixing two trivial bugs with how passed paths were being handled. It also changes the default value of the param `xslt_path` to `None` so the special case where the file was part of the langchain package could be handled. ## Issue #22175	4 months ago
maang-h	01352bb55f	community[minor]: Implement MiniMaxChat interface (#22391 ) - Description: Implement MiniMaxChat interface, include: - No longer inherits the LLM class (like other chat model) - Update request parameters (v1 -> v2) - update `base url` - update message role (system, user, assistant) - add `stream` function - no longer use `group id` - Implement the `_stream`, `_agenerate`, and `_astream` interfaces [minimax v2 api document](https://platform.minimaxi.com/document/guides/chat-model/V2?id=65e0736ab2845de20908e2dd)	4 months ago
Brandon Sharp	56e5aa4dd9	community[patch]: Airtable to allow for addtl params (#22092 ) - Description: Adds kwargs to AirtableLoader to allow passing optional params to Airtable `table.all()`: https://pyairtable.readthedocs.io/en/latest/api.html#pyairtable.Table.all - Issue: N/A - Dependencies: N/A - Twitter handle: parakoopa88 --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	4 months ago
Harichandan Roy	1f751343e2	community[patch]: update embeddings/oracleai.py (#22240 ) Adding oracle VECTOR_ARRAY_T support. Tests are not impacted.	4 months ago
maang-h	13140dc4ff	community[patch]: Update the default api_url and request_body of sparkllm embedding (#22136 ) - Description: When I was running the SparkLLMTextEmbeddings, app_id, api_key and api_secret are all correct, but it cannot run normally using the current URL. ```python # example from langchain_community.embeddings import SparkLLMTextEmbeddings embedding = SparkLLMTextEmbeddings( spark_app_id="my-app-id", spark_api_key="my-api-key", spark_api_secret="my-api-secret" ) text1 = "hello" print(embedding.embed_query(text1)) ``` ![sparkembedding](https://github.com/langchain-ai/langchain/assets/55082429/11daa853-4f67-45b2-aae2-c95caa14e38c) So I updated the url and request body parameters according to [Embedding_api](https://www.xfyun.cn/doc/spark/Embedding_api.html), now it is runnable.	4 months ago
Yuwen Hu	ba0dca46d7	community[minor]: Add IPEX-LLM BGE embedding support on both Intel CPU and GPU (#22226 ) Description: [IPEX-LLM](https://github.com/intel-analytics/ipex-llm) is a PyTorch library for running LLM on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low latency. This PR adds ipex-llm integrations to langchain for BGE embedding support on both Intel CPU and GPU. Dependencies: `ipex-llm`, `sentence-transformers` Contribution maintainer: @Oscilloscope98 tests and docs: - langchain/docs/docs/integrations/text_embedding/ipex_llm.ipynb - langchain/docs/docs/integrations/text_embedding/ipex_llm_gpu.ipynb - langchain/libs/community/tests/integration_tests/embeddings/test_ipex_llm.py --------- Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>	4 months ago
Jacob Lee	c01467b1f4	core[patch]: RFC: Allow concatenation of messages with multi part content (#22002 ) Anthropic's streaming treats tool calls as different content parts (streamed back with a different index) from normal content in the `content`. This means that we need to update our chunk-merging logic to handle chunks with multi-part content. The alternative is coercing Anthropic's responses into a string, but we generally like to preserve model provider responses faithfully when we can. This will also likely be useful for multimodal outputs in the future. This current PR does unfortunately make `index` a magic field within content parts, but Anthropic and OpenAI both use it at the moment to determine order anyway. To avoid cases where we have content arrays with holes and to simplify the logic, I've also restricted merging to chunks in order. TODO: tests CC @baskaryan @ccurme @efriis	4 months ago
Dan	86509161b0	community: fix AzureSearch delete documents (#22315 ) Description Fix AzureSearch delete documents method by using FIELDS_ID variable instead of the hard coded "id" value Issue: This is linked to this issue: https://github.com/langchain-ai/langchain/issues/22314 Co-authored-by: dseban <dan.seban@neoxia.com>	4 months ago
Harrison Chase	8fad2e209a	fix error message (#22437 ) Was confusing when language is in Enum but not implemented	4 months ago
Bagatur	678a19a5f7	infra: bump anthropic mypy 1 (#22373 )	4 months ago
Nuno Campos	ceb73ad06f	core: In BaseRetriever make get_relevant_docs delegate to invoke (#22434 ) - This fixes all the tracing issues with people still using get_relevant_docs, and a change we need for 0.3 anyway	4 months ago
Charles John	2d81a72884	community: fix missing `apify_api_token` field in ApifyWrapper (#22421 ) - Description: The `ApifyWrapper` class expects `apify_api_token` to be passed as a named parameter or set as an environment variable. But the corresponding field was missing in the class definition causing the argument to be ignored when passed as a named param. This patch fixes that.	4 months ago
Joan Fontanals	a7ae16f912	add `embed_image` API to JinaEmbedding (#22416 ) - Description: Add `embed_image` to JinaEmbedding to embed images - Twitter handle: https://x.com/JinaAI_	4 months ago
Nuno Campos	ed8e9c437a	core: In RunnableSequence pass kwargs to the first step (#22393 ) - This is a pattern that shows up occasionally in langgraph questions, people chain a graph to something else after, and want to pass the graph some kwargs (eg. stream_mode)	4 months ago
Bagatur	a8098f5ddb	anthropic[patch]: Release 0.1.15, fix sdk tools break (#22369 )	4 months ago
Erick Friis	6ffa0acf32	ai21: fix text-splitters version (#22366 )	4 months ago
Bagatur	2b9f1469d8	core[patch]: Release 0.2.3 (#22329 )	4 months ago
Harrison Chase	ee32369265	core[patch]: fix runnable history and add docs (#22283 )	4 months ago
William FH	dcec133b85	[Core] Update Tracing Interops (#22318 ) LangSmith and LangChain context var handling evolved in parallel since originally we didn't expect people to want to interweave the decorator and langchain code. Once we get a new langsmith release, this PR will let you seamlessly hand off between @traceable context and runnable config context so you can arbitrarily nest code. It's expected that this fails right now until we get another release of the SDK	4 months ago
ccurme	f34337447f	openai: update ChatOpenAI api ref (#22324 ) Update to reflect that token usage is no longer default in streaming mode. Add detail for streaming context under Token Usage section.	4 months ago
ChengZi	2443e85533	docs: fix milvus import and update template (#22306 ) docs: fix milvus import problem update milvus-rag template with milvus-lite Signed-off-by: ChengZi <chen.zhang@zilliz.com>	4 months ago
WU LIFU	86698b02a9	doc: fix wrong documentation on FAISS load_local function (#22310 ) ### Issue: #22299 ### descriptions The documentation appears to be wrong. When the user actually sets this parameter "asynchronous" to be True, it fails because the __init__ function of FAISS class doesn't allow this parameter. In fact, most of the class/instance functions of this class have both the sync/async version, so it looks like what we need is just to remove this parameter from the doc. Co-authored-by: Lifu Wu <lifu@nextbillion.ai>	4 months ago
maang-h	596c062cba	community[patch]: Standardize qianfan model init args name (#22322 ) - Description: - Standardize qianfan chat model initialization argument names - qianfan_ak (qianfan api key) -> api_key - qianfan_sk (qianfan secret key) -> secret_key - Delete unused variable - Issue: #20085	4 months ago
Dobiichi-Origami	10b12e1c08	community: adding tool_call_id for every ToolCall (#22323 ) - Description: This PR fixes a bug that caused multi-turn conversations to malfunction in QianfanChatEndpoint, and adapts the endpoint for ToolCall and ToolMessage	4 months ago
ccurme	f39e1a2288	community, docs: update token usage tracking callback + how-to guides (#22145 )	4 months ago
Bagatur	2bc50fb895	docs, cli[patch]: chat model template nit (#22294 )	4 months ago
Bagatur	aa6c31df53	cli[patch]: Release 0.0.24 (#22293 )	4 months ago
Bagatur	627a337887	docs, cli[patch]: chat model doc template (#22290 ) Update ChatModel integration doc template, integration docstring, and adds langchain-cli command to easily create just doc (for updating existing integrations): ```bash langchain-cli integration create-doc --name "foo-bar" ```	4 months ago
ccurme	6e1df72a88	openai[patch]: Release 0.1.8 (#22291 )	4 months ago
ccurme	e71b0b5827	core[patch]: Release 0.2.2 (#22289 )	4 months ago
Bagatur	6dd0f095c3	docs: revamp ChatOpenAI (#22253 ) Can build API ref docs by running ```bash make api_docs_clean; make api_docs_quick_preview API_PKG=openai ``` only builds openai ref, takes ~20 sec	4 months ago
Erick Friis	00c70d98c2	robocorp: release 0.0.9 (#22282 )	4 months ago
Mikko Korpela	fc5909ad6f	langchain-robocorp: Fix parsing of Union types (such as Optional). (#22277 )	4 months ago
ccurme	af1f723ada	openai: don't override stream_options default (#22242 ) ChatOpenAI supports a kwarg `stream_options` which can take values `{"include_usage": True}` and `{"include_usage": False}`. Setting include_usage to True adds a message chunk to the end of the stream with usage_metadata populated. In this case the final chunk no longer includes `"finish_reason"` in the `response_metadata`. This is the current default and is not yet released. Because this could be disruptive to workflows, here we remove this default. The default will now be consistent with OpenAI's API (see parameter [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options)). Examples: ```python from langchain_openai import ChatOpenAI llm = ChatOpenAI() for chunk in llm.stream("hi"): print(chunk) ``` ``` content='' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92' content='Hello' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92' content='!' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92' content='' response_metadata={'finish_reason': 'stop'} id='run-8cff4721-2acd-4551-9bf7-1911dae46b92' ``` ```python for chunk in llm.stream("hi", stream_options={"include_usage": True}): print(chunk) ``` ``` content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27' content='Hello' id='run-39ab349b-f954-464d-af6e-72a0927daa27' content='!' id='run-39ab349b-f954-464d-af6e-72a0927daa27' content='' response_metadata={'finish_reason': 'stop'} id='run-39ab349b-f954-464d-af6e-72a0927daa27' content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17} ``` ```python llm = ChatOpenAI().bind(stream_options={"include_usage": True}) for chunk in llm.stream("hi"): print(chunk) ``` ``` content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' content='Hello' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' content='!' 
id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' content='' response_metadata={'finish_reason': 'stop'} id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17} ```	4 months ago
Karim Lalani	a1899439fc	[experimental][llms][ollama_functions] Update OllamaFunctions to send `tool_calls` attribute (#21625 ) Update OllamaFunctions to return `tool_calls` for AIMessages when used for tool calling.	4 months ago
Bagatur	d61bdeba25	core[patch]: allow access RunnableWithFallbacks.runnable attrs (#22139 ) RFC, candidate fix for #13095 #22134	4 months ago
SteveLiao	7496fe2b16	Update parent_document_retriever.py about kwargs (#22219 ) Add kwargs in add_documents function langchain: Add kwargs in parent_document_retriever - Add kwargs for `add_document` in `parent_document_retriever.py`	4 months ago
Erick Friis	93240fac68	milvus: fix core dep (#22239 )	4 months ago
ChengZi	404d92ded0	milvus: New langchain_milvus package and new milvus features (#21077 ) New features: - New langchain_milvus package in partner - Milvus collection hybrid search retriever - Zilliz cloud pipeline retriever - Milvus Local guide - Rag-milvus template --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Signed-off-by: Jael Gu <mengjia.gu@zilliz.com> Co-authored-by: Jael Gu <mengjia.gu@zilliz.com> Co-authored-by: Jackson <jacksonxie612@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Erick Friis <erickfriis@gmail.com>	4 months ago
Leonid Ganeline	d6995e814b	ai21[patch]: added `license` (#22153 ) The `pyproject.toml` missed the `license` parameter. I've added it as `MIT`	4 months ago
Maddy Adams	8332a36f69	infra: update langchainhub and add integration test (#22154 ) Description: Update langchainhub integration test dependency and add an integration test for pulling private prompt Dependencies: langchainhub 0.1.16	4 months ago
Will Higgins	83d10df78d	community[patch]: Update firecrawl api key name (#22183 ) Change 'FIREWALL' to 'FIRECRAWL' as I believe this may have been in error. Other docs refer to 'FIRECRAWL_API_KEY'. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
hmasdev	bbd7015b5d	core[patch]: Add `TypeError` handler into `get_graph` of `Runnable` (#19856 ) # Description ## Problem `Runnable.get_graph` fails when the `InputType` or `OutputType` property raises `TypeError`. - `003c98e5b4/libs/core/langchain_core/runnables/base.py (L250-L274)` - `003c98e5b4/libs/core/langchain_core/runnables/base.py (L394-L396)` This problem prevents getting a graph of `Runnable` objects whose `InputType` or `OutputType` property raises `TypeError` but whose `invoke` works well, such as `langchain.output_parsers.RegexParser`; I already pointed out in #19792 that a `TypeError` occurs there. ## Solution - Add `try-except` handling for `TypeError` to the code that gets `input_node` and `output_node`. # Issue - #19801 # Twitter Handle - [hmdev3](https://twitter.com/hmdev3) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
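The pattern described in the commit above can be pictured with a small stand-alone sketch. This is not the code from `langchain_core`; the names `BrokenTypes`, `WellTyped`, and `safe_input_type_name` are hypothetical and exist only to show a `TypeError` from a type property being caught and replaced by a fallback label.

```python
class BrokenTypes:
    """Hypothetical object whose InputType property raises, yet invoke works."""

    @property
    def InputType(self):
        raise TypeError("cannot infer input type")

    def invoke(self, value: str) -> str:
        return value.upper()


class WellTyped:
    """Hypothetical object with a resolvable InputType."""

    InputType = str


def safe_input_type_name(obj, fallback: str = "Input") -> str:
    """Return the name of obj.InputType, or a fallback label on TypeError."""
    try:
        return obj.InputType.__name__
    except TypeError:
        # The property could not be resolved; use a placeholder instead of failing.
        return fallback
```

With this guard, an object that can still be invoked no longer prevents the surrounding code from building a description of it.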
Mohammad Mohtashim	577ed68b59	mistralai[patch]: Added Json Mode for ChatMistralAI (#22213 ) - Description: Powered [ChatMistralAI.with_structured_output](`fbfed65fb1/libs/partners/mistralai/langchain_mistralai/chat_models.py (L609)`) via json mode - Issue: #22081 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Pavlo Paliychuk	342df7cf83	community[minor]: Add Zep Cloud components + docs + examples (#21671 ) We have recently released our new zep-cloud SDKs that are compatible with Zep Cloud (not Zep Open Source). We have also maintained our Cloud version of the langchain components (ChatMessageHistory, VectorStore) as part of our SDKs. This PR's goal is to port these components to the langchain community repo and close the gap with the existing Zep Open Source components already present in the community repo (added ZepCloudMemory, ZepCloudVectorStore, ZepCloudRetriever). Also added a ZepCloudChatMessageHistory component together with an expression language example ported from our repo. We have left the original open source components intact on purpose so as not to introduce any breaking changes. - Dependencies: Added optional dependency on our new cloud SDK `zep-cloud` - Twitter handle: @paulpaliychuk51	4 months ago
Jan Soubusta	cccc8fbe2f	community[patch]: DuckDB VS - expose similarity, improve performance of from_texts (#20971 ) 3 fixes of DuckDB vector store: - unify defaults in constructor and from_texts (users no longer have to specify `vector_key`). - include search similarity into output metadata (fixes #20969) - significantly improve performance of `from_documents` Dependencies: added Pandas to speed up `from_documents`. I was thinking about CSV and JSON options, but I expect trouble loading JSON values this way and also CSV and JSON options require storing data to disk. Anyway, the poetry file for langchain-community already contains a dependency on Pandas. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com>	4 months ago
Erick Friis	42ffcb2ff1	anthropic: release 0.1.14rc2, test release note gen (#22147 )	4 months ago
Ameya Shenoy	8ba492ed6a	community[minor]: clickhouse -- ability to use secure connection (#22108 ) - Description: this PR gives clickhouse client the ability to use a secure connection to the clickhouse server - Issue: fixes #22082 - Dependencies: - - Twitter handle: `_codingcoffee_` Signed-off-by: Ameya Shenoy <shenoy.ameya@gmail.com> Co-authored-by: Shresth Rana <shresth@grapevine.in>	4 months ago
ccurme	9a010fb761	openai: read stream_options (#21548 ) OpenAI recently added a `stream_options` parameter to its chat completions API (see [release notes](https://platform.openai.com/docs/changelog/added-chat-completions-stream-usage)). When this parameter is set to `{"include_usage": True}`, an extra "empty" message is added to the end of a stream containing token usage. Here we propagate token usage to `AIMessage.usage_metadata`. We enable this feature by default. Streams would now include an extra chunk at the end, after the chunk with `response_metadata={'finish_reason': 'stop'}`. New behavior: ``` [AIMessageChunk(content='', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'), AIMessageChunk(content='Hello', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'), AIMessageChunk(content='!', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'), AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'), AIMessageChunk(content='', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde', usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17})] ``` Old behavior (accessible by passing `stream_options={"include_usage": False}` into (a)stream): ``` [AIMessageChunk(content='', id='run-1312b971-c5ea-4d92-9015-e6604535f339'), AIMessageChunk(content='Hello', id='run-1312b971-c5ea-4d92-9015-e6604535f339'), AIMessageChunk(content='!', id='run-1312b971-c5ea-4d92-9015-e6604535f339'), AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-1312b971-c5ea-4d92-9015-e6604535f339')] ``` From what I can tell this is not yet implemented in Azure, so we enable only for ChatOpenAI.	4 months ago
Rahul Triptahi	1a485f59b9	community[patch]: Put authorized identities behind a feature flag in SharepointLoader (#22125 ) Description: Put authorised identities behind a feature flag, load_auth. Documentation: N/A Unit tests: N/A --------- Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	4 months ago
sasha	1c9ceff503	community: add metadata to chain logging; (#22122 ) Hey, I'm Sasha, the SDK engineer from [Comet](https://comet.com). This PR updates the CometTracer class. Added metadata to CometTracer. From now on, both chains and spans will send it.	4 months ago
Jirka Lhotka	7c0459faf2	community: Update costs of openai finetuned models (#22124 ) - Description: Update costs of finetuned models and add gpt-3.5-turbo-0125. Source: https://openai.com/api/pricing/ - Issue: N/A - Dependencies: None	4 months ago
Eugene Yurtsev	d3db83abe3	community[major]: lint for usage of xml library (#22132 ) * Lint for usage of standard xml library * Add forced opt-in for quip client * Actual security issue is with underlying QuipClient not LangChain integration (since the client is doing the parsing), but adding enforcement at the LangChain level.	4 months ago
Bagatur	baa3c975cb	anthropic[patch]: allow tool call mutation (#22130 ) If tool_use blocks and tool_calls with overlapping IDs are present, prefer the values of the tool_calls. Allows for mutating AIMessages just via tool_calls.	4 months ago
Christophe Bornet	c838de5027	doc: Add doc for CassandraByteStore (#22126 ) Preview: https://langchain-git-fork-cbornet-doc-cassandrabytestore-langchain.vercel.app/v0.2/docs/integrations/stores/cassandra/	4 months ago
ccurme	0ea1e89b2c	groq: read tool calls from .tool_calls attribute (#22096 )	4 months ago
Eugene Yurtsev	2d693c484e	docs: fix some spelling mistakes caught by newest version of code spell (#22090 ) Going to merge this even though it doesn't pass all tests, and open a separate PR for the remaining spelling mistakes.	4 months ago
Pavel Zloi	fe26f937e4	community[minor]: ManticoreSearch engine added to vectorstore (#19117 ) Description: ManticoreSearch engine added to vectorstores Issue: no issue, just a new feature Dependencies: https://pypi.org/project/manticoresearch-dev/ Twitter handle: @EvilFreelancer - Example notebook with test integration: https://github.com/EvilFreelancer/langchain/blob/manticore-search-vectorstore/docs/docs/integrations/vectorstores/manticore_search.ipynb --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Erick Friis	95c3e5f85f	cli: model name substitution fix, release 0.0.23 (#22089 )	4 months ago
ccurme	152c8cac33	anthropic, openai: cut pre-releases (#22083 )	4 months ago
ccurme	cd07521170	core: bump to 0.2.1rc (#22080 )	4 months ago
ccurme	fbfed65fb1	core, partners: add token usage attribute to AIMessage (#21944 ) ```python class UsageMetadata(TypedDict): """Usage metadata for a message, such as token counts. Attributes: input_tokens: (int) count of input (or prompt) tokens output_tokens: (int) count of output (or completion) tokens total_tokens: (int) total token count """ input_tokens: int output_tokens: int total_tokens: int ``` ```python class AIMessage(BaseMessage): ... usage_metadata: Optional[UsageMetadata] = None """If provided, token usage information associated with the message.""" ... ```	4 months ago
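As a rough illustration of how the `UsageMetadata` shape above might be consumed, the following sketch totals token counts over several messages. The `UsageMetadata` class is re-declared locally so the block is self-contained, and `sum_usage` is a hypothetical helper, not part of `langchain_core`.

```python
from typing import Iterable, Optional, TypedDict


class UsageMetadata(TypedDict):
    """Local copy of the shape shown in the commit message above."""

    input_tokens: int
    output_tokens: int
    total_tokens: int


def sum_usage(usages: Iterable[Optional[UsageMetadata]]) -> UsageMetadata:
    """Add up token counts, skipping messages that carry no usage metadata."""
    total: UsageMetadata = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for usage in usages:
        if usage is None:
            # usage_metadata is Optional on AIMessage, so None must be tolerated.
            continue
        total["input_tokens"] += usage["input_tokens"]
        total["output_tokens"] += usage["output_tokens"]
        total["total_tokens"] += usage["total_tokens"]
    return total
```

Skipping `None` matters because the field is optional: messages from providers that do not report usage would otherwise break the sum.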
Bagatur	3d26807b92	community[patch]: Release. 0.2.1 (#22073 )	4 months ago
Bagatur	2d968213d7	langchain[patch]: Release 0.2.1 (#22074 )	4 months ago
maang-h	9aba9e3e33	community[patch]: Update the default “API URL” and “MODEL” of sparkllm (#22070 ) - Description: When I was running the sparkllm, I found that the default parameters currently used could no longer run correctly. - original parameters & values: - spark_api_url: "wss://spark-api.xf-yun.com/v3.1/chat" - spark_llm_domain: "generalv3" ```python # example from langchain_community.chat_models import ChatSparkLLM spark = ChatSparkLLM(spark_app_id="my_app_id", spark_api_key="my_api_key", spark_api_secret="my_api_secret") spark.invoke("hello") ``` ![sparkllm](https://github.com/langchain-ai/langchain/assets/55082429/5369bfdf-4305-496a-bcf5-2d3f59d39414) So I updated them to 3.5 (same as sparkllm official website). After the update, they can be used normally. - new parameters & values: - spark_api_url: "wss://spark-api.xf-yun.com/v3.5/chat" - spark_llm_domain: "generalv3.5"	4 months ago
junkeon	4fda7bf4f2	upstage[patch] : fix error handling in Layout Analysis parser (#22054 ) This pull request addresses and fixes exception handling in the UpstageLayoutAnalysisParser and enhances the test coverage by adding error exception tests for the document loader. These improvements ensure robust error handling and increase the reliability of the system when dealing with external API calls and JSON responses. ### Changes Made 1. Fix Request Exception Handling: - Issue: The existing implementation of UpstageLayoutAnalysisParser did not properly handle exceptions thrown by the requests library, which could lead to unhandled exceptions and potential crashes. - Solution: Added comprehensive exception handling for requests.RequestException to catch any request-related errors. This includes logging the error details and raising a ValueError with a meaningful error message. 2. Add Error Exception Tests for Document Loader: - New Tests: Introduced new test cases to verify the robustness of the UpstageLayoutAnalysisLoader against various error scenarios. The tests ensure that the loader gracefully handles: - RequestException: Simulates network issues or invalid API requests to ensure appropriate error handling and user feedback. - JSONDecodeError: Simulates scenarios where the API response is not a valid JSON, ensuring the system does not crash and provides clear error messaging.	4 months ago
JuHyung Son	d9eff44400	partner-upstage[patch]: embeddings empty list bug (#22057 ) Fixed an error in `embed_documents` when the input was given as an empty list. And I have revised the document.	4 months ago
Martin Triska	2df8ac402a	community[minor]: Added propagation of document metadata from O365BaseLoader (#20663 ) Description: - Added propagation of document metadata from O365BaseLoader to FileSystemBlobLoader (O365BaseLoader uses FileSystemBlobLoader under the hood). - This is done by passing dictionary `metadata_dict`: key=filename and value=dictionary containing document's metadata - Modified `FileSystemBlobLoader` to accept the `metadata_dict`, use `mimetype` from it (if available) and pass metadata further into blob loader. Issue: - `O365BaseLoader` under the hood downloads documents to temp folder and then uses `FileSystemBlobLoader` on it. - However metadata about the document in question is lost in this process. In particular: - `mime_type`: `FileSystemBlobLoader` guesses `mime_type` from the file extension, but that does not work 100% of the time. - `web_url`: this is useful to keep around since in RAG LLM we might want to provide link to the source document. In order to work well with document parsers, we pass the `web_url` as `source` (`web_url` is ignored by parsers, `source` is preserved) Dependencies: None Twitter handle: @martintriska1 Please review @baskaryan --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	4 months ago
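The lookup order described in the commit above (prefer `mimetype` from `metadata_dict`, otherwise guess from the file extension) might be sketched as follows. The function name `resolve_mime_type` is hypothetical and this is not the `FileSystemBlobLoader` code; only the standard-library `mimetypes` module is used.

```python
import mimetypes
from typing import Dict, Optional


def resolve_mime_type(
    filename: str, metadata_dict: Dict[str, Dict[str, str]]
) -> Optional[str]:
    """Prefer the mimetype recorded for this file; else guess from the extension."""
    file_metadata = metadata_dict.get(filename, {})
    if "mimetype" in file_metadata:
        return file_metadata["mimetype"]
    # Fallback: extension-based guessing, which can return None for unknown types.
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed
```

Here `metadata_dict` follows the layout stated in the commit message: the key is the filename and the value is a dictionary with that document's metadata.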
Eugene Yurtsev	e5541d1da7	community[patch]: Update doc-string in CloudBlobLoader (#22069 ) Update doc-string	4 months ago
Philippe PRADOS	6dd621d636	community[minor]: Add CloudBlobLoader that supports loading data from cloud buckets (#21957 ) - Description: Langchain provides several approaches to read different file formats: specific loaders (`CSVLoader`) or blob-compatible loaders (`FileSystemBlobLoader`). The only implementation proposed for BlobLoader is `FileSystemBlobLoader`. Many projects retrieve files from cloud storage. We propose a new implementation of `BlobLoader` to read files from the three cloud storage systems. The interface is strictly identical to `FileSystemBlobLoader`. The only difference is the constructor, which takes a cloud "url" object such as `s3://my-bucket`, `az://my-bucket`, or `gs://my-bucket`. By streamlining the process, this implementation eliminates the requirement to pre-download files from cloud storage to local temporary files (which are seldom removed). The code relies on the [CloudPathLib](https://cloudpathlib.drivendata.org/stable/) library to interpret cloud URLs. This has been added as an optional dependency. ```Python loader = CloudBlobLoader("s3://mybucket/id") for blob in loader.yield_blobs(): print(blob) ``` - Dependencies: CloudPathLib - Twitter handle: pprados - Add tests and docs: Add unit test, but it's easy to convert to integration test, with some files in a cloud storage (see `test_cloud_blob_loader.py`) Hello from Paris @hwchase17. Can you review this PR? --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	4 months ago
Christophe Bornet	74947ec894	community[minor]: Add Cassandra ByteStore (#22064 )	4 months ago
Christophe Bornet	fea6b99b16	community[minor]: Add async methods to CassandraChatMessageHistory (#21975 )	4 months ago
Sky	12d65f17ff	community[patch]: surrealdb provide functions for MMR (Maximal Marginal Relevance) (#21185 ) This PR contains 4 added functions: - max_marginal_relevance_search_by_vector - amax_marginal_relevance_search_by_vector - max_marginal_relevance_search - amax_marginal_relevance_search I'm no langchain expert, but tried to inspect other vectorstore sources like chroma to build these functions for SurrealDB. If someone has some changes for me, please let me know. Otherwise I would be happy if these changes are added to the repository, so that I can use the original repo and not my local monkey patched version. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Bruno Alvisio	5eabe90494	community[patch]: Adding HEADER to the list of supported locations (#21946 ) Description: adds headers to the list of supported locations when generating the openai function schema	4 months ago
Bagatur	50186da0a1	infra: rm unused # noqa violations (#22049 ) Updating #21137	4 months ago
acho98	45ed5f3f51	community[minor]: Add Clova Embeddings for LangChain Community (#21890 ) - [ ] PR title: "Add Naver ClovaX embedding to LangChain community" - HyperClovaX is a large language model developed by [Naver](https://clova-x.naver.com/welcome). It's a powerful and purpose-trained LLM. - You can visit the embedding service provided by [ClovaX](https://www.ncloud.com/product/aiService/clovaStudio) - You may get CLOVA_EMB_API_KEY, CLOVA_EMB_APIGW_API_KEY, CLOVA_EMB_APP_ID From https://www.ncloud.com/product/aiService/clovaStudio --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
arpitkumar980	444c2a3d9f	community[patch]: sharepoint loader identity enabled (#21176 ) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	4 months ago
HuiyuanYan	bf3aefce93	community[patch]: Update tongyi.py to support MultimodalConversation in dashscope. (#21249 ) Add support for multimodal conversation in dashscope. We can now use the multimodal language models "qwen-vl-v1", "qwen-vl-chat-v1", and "qwen-audio-turbo" to process pictures and audio. :) - Description: add multimodal conversation support in dashscope - Dependencies: dashscope≥1.18.0 - Twitter handle: none :) - How to use it: ```python Tongyi_chat = ChatTongyi( top_p=0.5, dashscope_api_key=api_key, model="qwen-vl-v1" ) response= Tongyi_chat.invoke( input = [ { "role": "user", "content": [ {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"}, {"text": "这是什么?"} ] } ] ) ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
mochi	63284ffebf	experimental[patch], docs: refine notebook for MyScale `SelfQueryRetriever` (#22016 ) - Description: upgrade model to `gpt-4o`	4 months ago
MSubik	d948783a4c	community[patch]: standardize init args, update for javelin sdk release. (#21980 ) Related to [20085](https://github.com/langchain-ai/langchain/issues/20085) Updated the Javelin chat model to standardize the initialization argument. Also fixed an existing bug, where code was initialized with incorrect call to the JavelinClient defined in the javelin_sdk, resulting in an initialization error. See related [Javelin Documentation](https://docs.getjavelin.io/docs/javelin-python/quickstart).	4 months ago
Mohammad Mohtashim	16617dd239	community[patch]: AzureSearchVectorStoreRetriever Fixed to account for search_kwargs (#21572 ) - Description: Fixed `AzureSearchVectorStoreRetriever` to account for search_kwargs. More explanation is in the mentioned issue. - Issue: #21492 --------- Co-authored-by: MAC <mac@MACs-MacBook-Pro.local> Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Klaudia Lemiec	45351d1bc6	docs: Chroma docstrings update (#22001 ) - Description: Added and updated Chroma docstrings - Issue: https://github.com/langchain-ai/langchain/issues/21983	4 months ago
Jerron Lim	28456c2c33	community[patch]: add args_schema to WikipediaQueryRun (#22019 ) Description: This change adds args_schema (pydantic BaseModel) to WikipediaQueryRun for correct schema formatting on LLM function calls Issue: currently using WikipediaQueryRun with OpenAI function calling returns the following error "TypeError: WikipediaQueryRun._run() got an unexpected keyword argument '__arg1' ". This happens because the schema sent to the LLM is "input: '{"__arg1":"Hunter x Hunter"}'" while the method should be called with the "query" parameter. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Mazen Ramadan	3c1d77dd64	community[minor]: Add Scrapfly Loader community integration (#22036 ) Added [Scrapfly](https://scrapfly.io/) Web Loader integration. Scrapfly is a web scraping API that allows extracting web page data into accessible markdown or text datasets. - __Description__: Added Scrapfly web loader for retrieving web page data as markdown or text. - Dependencies: scrapfly-sdk - Twitter: @thealchemi1st --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
ccurme	b51a1eba4d	langchain, community: move OpenAIAssistantV2Runnable to community (#22044 )	4 months ago
CaroFG	6b98140b38	community[patch]: update for compatibility with Meilisearch v1.8 (#21979 ) - Description: Updates Meilisearch vectorstore for compatibility with v1.8. Adds [`"showRankingScore": true`](https://www.meilisearch.com/docs/reference/api/search#ranking-score) to the search parameters and replaces the `_semanticScore` field with `_rankingScore`	4 months ago
Oleksii Pokotylo	98c0b093bb	community[patch]: Extend AzureSearch with `maximal_marginal_relevance`, `from_embeddings` (#21065 ) Description: - Extend AzureSearch with `maximal_marginal_relevance` (for vector and hybrid search) - Add construction `from_embeddings` - if the user has already embedded the texts - Add `add_embeddings` - Refactor common parts (`_simple_search`, `_results_to_documents`, `_reorder_results_with_maximal_marginal_relevance`) - Add `vector_search_dimensions` as a parameter to the constructor to avoid extra calls to `embed_query` (most of the time the user applies the same model and knows the dimension) Issue: none Dependencies: none - [x] Add tests and docs: The docstrings have been added to the new functions, and unified for the existing ones. The example notebook is great in illustrating the main usage of AzureSearch, adding the new methods would only dilute the main content. - [x] Lint and test --------- Co-authored-by: Oleksii Pokotylo <oleksii.pokotylo@pwc.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
SaschaStoll	709664a079	community[patch]: Performant filter columns option for Hanavector (#21971 ) Description: Backwards compatible extension of the initialisation interface of HanaDB to allow the user to specify specific_metadata_columns that are used for metadata storage of selected keys which yields increased filter performance. Any not-mentioned metadata remains in the general metadata column as part of a JSON string. Furthermore switched to executemany for batch inserts into HanaDB. Issue: N/A Dependencies: no new dependencies added Twitter handle: @sapopensource --------- Co-authored-by: Martin Kolb <martin.kolb@sap.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Bagatur	16b55b0704	langchain[patch]: remove dataclasses-json dep (#22042 ) vestigial dep afaict	4 months ago
Christos Boulmpasakos	c3bcfad66d	text-splitters[patch]: Extend TextSplitter:keep_separator functionality (#21130 ) Description: Added extra functionality to `CharacterTextSplitter`, `TextSplitter` classes. The user can select whether to append the separator to the previous chunk with `keep_separator='end' ` or else prepend to the next chunk. Previous functionality prepended by default to next chunk. Issue: Fixes #20908 --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	4 months ago
Eric Zhang	e7e41eaabe	langchain: add RankLLM Reranker (#21171 ) Integrate RankLLM reranker (https://github.com/castorini/rank_llm) into LangChain An example notebook is given in `docs/docs/integrations/retrievers/rankllm-reranker.ipynb` --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	4 months ago
maang-h	fc93bed8c4	community: Fix CSVLoader columns is None (#20701 ) - Bug code: In langchain_community/document_loaders/csv_loader.py:100 - Description: Currently, when `CSVLoader` reads a column as None in the CSV file, it raises an error because it does not verify that the column is of type str and does not handle the corresponding `row_data` when the column is None. This PR provides a solution. - Issue: Fix #20699 - Thinking: 1. Refer to the handling at 'langchain_community/document_loaders/csv_loader.py:100' for when 'v' equals None, and apply the same method to 'k'. (Per `csv.DictReader`, 'k' is None only when `len(columns) < len(number_row_data)`.) 2. 'k' equals None only for the last column, and its corresponding 'v' is a list. Therefore, following the data format in 'Document', the elements of the list are joined with ','. (I'm not sure whether this form is acceptable; if you have other ideas, let me know.) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	4 months ago
Nithin James Padayatti	403142eaba	langchain: added revision_example prompt template (#20916 ) Description: Added revision_example prompt template to include the revision request and revision examples in the revision chain. Issue: Not Applicable Dependencies: Not Applicable Twitter handle: @nithinjp09	4 months ago
Sihan Chen	1f81277b9b	community[minor]: allow enabling proxy in aiohttp session in AsyncHTML (#19499 ) Allow enabling proxy in aiohttp session async html	4 months ago
Eugene Yurtsev	36813d2f00	community[patch]: Fix remaining __inits__ in community (#22037 ) Fixes the __init__ files in community to use __all__ which is statically defined.	4 months ago
Eugene Yurtsev	58360a1e53	community[patch]: Add unit test to verify that init is correctly defined (#22030 ) Fix some __init__ files and add a unit test	4 months ago
Erick Friis	ef53ccf54b	robocorp: release 0.0.8 (#22034 )	4 months ago
Matthew Hoffman	4f2e3bd7fd	community[patch]: fix public interface for embeddings module (#21650 ) ## Description The existing public interface for `langchain_community.embeddings` is broken. In this file, `__all__` is statically defined, but is subsequently overwritten with a dynamic expression, which type checkers like pyright do not support. pyright actually gives the following diagnostic on the line I am requesting we remove: [reportUnsupportedDunderAll](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportUnsupportedDunderAll): ``` Operation on "__all__" is not supported, so exported symbol list may be incorrect ``` Currently, I get the following errors when attempting to use publicly exported classes in `langchain_community.embeddings`: ```python import langchain_community.embeddings langchain_community.embeddings.HuggingFaceEmbeddings(...) # error: "HuggingFaceEmbeddings" is not exported from module "langchain_community.embeddings" (reportPrivateImportUsage) ``` This is solved easily by removing the dynamic expression.	4 months ago
Eugene Yurtsev	8d82160a8a	community[patch]: Clean up logic in import checking unit test (#22026 ) Clean up unit test	4 months ago
Tomaz Bratanic	d8a1f1114d	community[patch]: Handle exceptions where node props aren't consistent in neo4j schema (#22027 )	4 months ago
WeichenXu	b0ef5e778a	community[patch]: Fix ChatDatabricsk in case that streaming response doesn't have role field in delta chunk (#21897 ) --------- Signed-off-by: Weichen Xu <weichen.xu@databricks.com>	4 months ago
Eugene Yurtsev	aed64daabb	community[patch]: Add unit test to catch bad __all__ definitions (#21996 ) This will catch all dynamic __all__ definitions.	4 months ago
Bagatur	3b0437c05b	core[patch]: Release 0.2.1 (#22003 )	4 months ago
Kefan You	24b5c27bb1	community[patch]: raise_for_status logic missing in async _fetch of WebBaseLoader (#21948 ) ## 'raise_for_status' parameter of WebBaseLoader works in sync load but not in async load. In webBaseLoader: Sync load is calling `_scrape` and has `raise_for_status` properly handled. ``` def _scrape( self, url: str, parser: Union[str, None] = None, bs_kwargs: Optional[dict] = None, ) -> Any: from bs4 import BeautifulSoup if parser is None: if url.endswith(".xml"): parser = "xml" else: parser = self.default_parser self._check_parser(parser) html_doc = self.session.get(url, self.requests_kwargs) if self.raise_for_status: html_doc.raise_for_status() if self.encoding is not None: html_doc.encoding = self.encoding elif self.autoset_encoding: html_doc.encoding = html_doc.apparent_encoding return BeautifulSoup(html_doc.text, parser, (bs_kwargs or {})) ``` Async load is calling `_fetch` but missing `raise_for_status` logic. ``` async def _fetch( self, url: str, retries: int = 3, cooldown: int = 2, backoff: float = 1.5 ) -> str: async with aiohttp.ClientSession() as session: for i in range(retries): try: async with session.get( url, headers=self.session.headers, ssl=None if self.session.verify else False, cookies=self.session.cookies.get_dict(), ) as response: return await response.text() ``` Co-authored-by: kefan.you <darkfss@sina.com>	4 months ago
Surya Rath	eb096675a8	OpenAI Assistants v2 api support for OpenAIAssistantRunnable (#21484 ) Title: "langchain: OpenAI Assistants v2 api support" *Descriptions* - [x] "attachments" support added along with backward compatibility of "file_ids" - [x] "tool_resources" support added while creating new assistant - [ ] "tool_choice" parameter support - [ ] Streaming support - Dependencies: OpenAI v2 API (openai>=1.23.0) - Twitter handle: @skanta_rath --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	4 months ago
Eugene Yurtsev	7a5d042bd2	langchain[patch]: Add unit test to detect changes to community imports (#21998 ) Add unit tests for community imports	4 months ago
Eugene Yurtsev	90f4d8842f	langchain[patch]: Turn on all deprecations for 0.2 (#21999 ) - Turn on all 0.2 import deprecations. - Update error message with URL to upgrade instructions.	4 months ago
Asaf Joseph Gardin	a042e804b4	ai21: AI21 Jamba docs (#21978 ) - Updated docs to have an example to use Jamba instead of J2 --------- Co-authored-by: Asaf Gardin <asafg@ai21.com> Co-authored-by: Erick Friis <erick@langchain.dev>	4 months ago
Pengcheng Liu	4cf523949a	community[patch]: Update model client to support vision model in Tong… (#21474 ) - Description: Tongyi uses different client for chat model and vision model. This PR chooses proper client based on model name to support both chat model and vision model. Reference [tongyi document](https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-plus-api?spm=a2c4g.11186623.0.0.27404c9a7upm11) for details. ``` from langchain_core.messages import HumanMessage from langchain_community.chat_models import ChatTongyi llm = ChatTongyi(model_name='qwen-vl-max') image_message = { "image": "https://lilianweng.github.io/posts/2023-06-23-agent/agent-overview.png" } text_message = { "text": "summarize this picture", } message = HumanMessage(content=[text_message, image_message]) llm.invoke([message]) ``` - Issue: None - Dependencies: None - Twitter handle: None	4 months ago
Sevin F. Varoglu	1bc0ea5496	community[patch]: update OctoAIEmbeddings to subclass OpenAIEmbeddings (#21805 )	4 months ago
Eugene Yurtsev	ded53297e0	core[patch]: Add unit test for RunnableGenerator for eventstream v2 (#21990 ) No unit tests with runnable generator	4 months ago
Nuno Campos	fb6108c8f5	core[patch]: In astream_events(version=v2) tap output of root run (#21977 ) - if tap_output_iter/aiter is called multiple times for the same run issue events only once - if chat model run is tapped don't issue duplicate on_llm_new_token events - if first chunk arrives after run has ended do not emit it as a stream event --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	4 months ago
Bagatur	72d4a8eeed	community[patch]: AzureSearch dont overwrite default async (#21989 )	4 months ago
ccurme	a983465694	docs: set default anthropic model (#21988 ) `ChatAnthropic()` raises ValidationError.	4 months ago
ccurme	4be5537837	Revert "anthropic: set default model" (#21987 ) Reverts langchain-ai/langchain#21986	4 months ago
ccurme	35439cf3bd	anthropic: set default model (#21986 ) Various docs reference `ChatAnthropic()`, but this currently raises ValidationError.	4 months ago
ccurme	0923136851	langchain: default to Runnable in MultiQueryRetriever (#21770 ) - `llm_chain` becomes `Union[LLMChain, Runnable]` - `.from_llm` creates a runnable tested by verifying that docs/how_to/MultiQueryRetriever.ipynb runs unchanged with sync/async invoke (and that it runs if we specifically instantiate with LLMChain).	4 months ago
Yulong Wang	8e1aeb8ad5	community[patch]: Fix typo in arxiv tool's doc (#21970 ) Fix typo in arxiv tool's doc	4 months ago
Robert Caulk	54adcd9e82	community[minor]: add AskNews retriever and AskNews tool (#21581 ) We add a tool and retriever for the [AskNews](https://asknews.app) platform with example notebooks. The retriever can be invoked with: ```py from langchain_community.retrievers import AskNewsRetriever retriever = AskNewsRetriever(k=3) retriever.invoke("impact of fed policy on the tech sector") ``` This retrieves 3 documents in the news related to fed policy impacts on the tech sector. The included notebook also covers controlling filters such as category and time in more detail, as well as including the retriever in a chain. The tool is quite interesting, as it allows the agent to decide how to obtain the news by forming a query and deciding how far back in time to look for the news: ```py from langchain_community.tools.asknews import AskNewsSearch from langchain import hub from langchain.agents import AgentExecutor, create_openai_functions_agent from langchain_openai import ChatOpenAI tool = AskNewsSearch() instructions = """You are an assistant.""" base_prompt = hub.pull("langchain-ai/openai-functions-template") prompt = base_prompt.partial(instructions=instructions) llm = ChatOpenAI(temperature=0) asknews_tool = AskNewsSearch() tools = [asknews_tool] agent = create_openai_functions_agent(llm, tools, prompt) agent_executor = AgentExecutor( agent=agent, tools=tools, verbose=True, ) agent_executor.invoke({"input": "How is the tech sector being affected by fed policy?"}) ``` --------- Co-authored-by: Emre <e@emre.pm>	4 months ago
Jesse S	fc79b372cb	community[minor]: add aerospike vectorstore integration (#21735 ) Please let me know if you see any possible areas of improvement. I would very much appreciate your constructive criticism if time allows. Description: - Added an Aerospike vector store integration that utilizes the [Aerospike-Vector-Search](https://aerospike.com/products/vector-database-search-llm/) add-on. - Added both unit tests and integration tests - Added a docker compose file for spinning up a test environment - Added a notebook Dependencies: - aerospike-vector-search Twitter handle: - No twitter, you can use my GitHub handle or LinkedIn if you'd like Thanks! --------- Co-authored-by: Jesse Schumacher <jschumacher@aerospike.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Prince Canuma	3587c60396	community[patch]: Fix MLX LLM Stream (#20575 ) Closes #20561 This PR fixes MLX LLM stream `AttributeError`. Recently, `mlx-lm` changed the token decoding logic, which affected the LC+MLX integration. Additionally, I made minor fixes such as fixing a broken link in the docs example and enforcing pipeline arguments (max_tokens, temp, etc.) for invoke. - Issue: #20561 - Twitter handle: @Prince_Canuma	4 months ago
Rahul Triptahi	96bd0b0844	community[patch]: Remove redundant pebblo cloud api call (#21589 ) Description: removed redundant pebblo cloud api call. Changed classified `doc` key to `ai_apps_data`. Documentation: N/A Unit tests: N/A	4 months ago
Param Singh	d07885f8b7	community[patch]: standardized sparkllm init args (#21633 ) Related to #20085 @baskaryan community:sparkllm[patch]: standardized init args. Updated `spark_api_key` so that it is aliased to `api_key`. Added an integration test for `sparkllm` to check that it continues to set the same underlying attribute. Updated temperature to use a Pydantic Field and added it to the integration test. Ran `make format`, `make test`, `make lint`, `make spell_check`	4 months ago
Dhruv Chawla	d4359d3de6	community[patch]: Update UpTrain Callback Handler to support the new UpTrain evaluation schema (#21656 ) UpTrain has a new dashboard now that makes it easier to view projects and evaluations. Using this requires specifying both project_name and evaluation_name when performing evaluations. I have updated the code to support it.	4 months ago
Alex Riina	c0e3c3a350	openai[patch], community[patch]: add pricing and max context window for GPT-4o (#21673 ) # Add pricing and max context window for GPT-4o - community: add cost per 1k tokens and max context window - partners: add max context window Description: adds static information about GPT-4o based on https://openai.com/api/pricing/ and https://platform.openai.com/docs/models/gpt-4o so that GPT-4o reporting is accurate. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
缨缨	bd39b2ccdf	community: enable SupabaseVectorStore to support extended table fields (#21762 ) - Added extension fields to the function `_add_vectors` so that users can add other custom fields when inserting a record into the database, e.g.: ![image](https://github.com/langchain-ai/langchain/assets/10885578/e1d5ca20-936e-4cab-ba69-8fdd23b8ce8f) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Leonid Ganeline	e98a4fd19a	ai21[patch]: configuration fix (#21790 ) added "repository" and "Source Code" parameters (these parameters are missed only in this partner package configuration).	4 months ago
Trayan Azarov	f54cbf8ff5	chroma[patch]: Chroma - remove reference to collection upon delete_collection (#21817 ) Description: - Reference to `Collection` object is set to `None` when deleting a collection `delete_collection()` - Added utility method `reset_collection()` to allow recreating the collection - Moved collection creation out of `__init__` into `__ensure_collection()` to be reused by object init and `reset_collection()` - `_collection` is now a property to avoid breaking changes Issues: - chroma-core/chroma#2213 Twitter: @t_azarov	4 months ago
Jens	b0b302ec6b	community[patch]: fixed aleph alpha default emedding request (#21826 ) - Description: In the Aleph Alpha client the parameter `normalize` is not optional. Setting it to `None` gives an error. - Dependencies: None Co-authored-by: Jens Lücke <jens.luecke@tngtech.com> Co-authored-by: Jens <jens.luecke@hu-berlin.de> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	4 months ago
Jorge Piedrahita Ortiz	e6207ad4f3	community[patch]: Sambanova integration api update (#21848 ) - Description: SambaStudio generic endpoint compatibility added, improved error description and handling, streaming examples added	4 months ago
Michael Reed	7a5e1bcf99	core[patch]: Fix NPE in function_calling._get_python_function_required_args (#21863 ) Example error message: line 206, in _get_python_function_required_args if is_function_type and required[0] == "self": ~~~~~~~~^^^ IndexError: list index out of range --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Liuww	332ffed393	community[patch]: Adopting the lighter-weight xinference_client (#21900 ) While integrating the xinference_embedding, we observed that the downloaded dependency package is quite substantial in size. With a focus on resource optimization and efficiency, if the project requirements are limited to its vector processing capabilities, we recommend migrating to the xinference_client package. This package is more streamlined, significantly reducing the storage space requirements of the project and maintaining a feature focus, making it particularly suitable for scenarios that demand lightweight integration. Such an approach not only boosts deployment efficiency but also enhances the application's maintainability, rendering it an optimal choice for our current context. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Tomaz Bratanic	a43515ca65	experimental[patch]: Pass enum only to openai in llm graph transformer (#21860 ) Some models like Groq return bad request if you pass in `enum` parameter in tool definition	4 months ago
Jiří Spilka	6499897c87	community[patch]: update apify integration to attribute API activity to langchain (#21909 ) Description: Add `Origin/langchain` to Apify's client's user-agent to attribute API activity to LangChain (at Apify, we aim to monitor our integrations to evaluate whether we should invest more in the LangChain integration regarding functionality and content) Issue: None Dependencies: None Twitter handle: None	4 months ago
Jared Van Bortel	25d1c1c9bb	nomic: implement local embeddings with the inference_mode parameter (#21934 ) ## Description This PR implements local and dynamic mode in the Nomic Embed integration using the inference_mode and device parameters. They work as documented [here](https://docs.nomic.ai/reference/python-api/embeddings#local-inference). --------- Co-authored-by: Erick Friis <erickfriis@gmail.com>	4 months ago
ccurme	0e72ed39a0	infra: fix CI on text-splitters (#21935 )	4 months ago
ccurme	4470d3b4a0	partners: bump core in packages implementing ls_params (#21868 ) These packages all import `LangSmithParams` which was released in langchain-core==0.2.0. N.B. we will need to release `openai` and then bump `langchain-openai` in `together` and `upstage`.	4 months ago
ccurme	9c76739425	mistral: implement ls_params (#21867 )	4 months ago
Tomaz Bratanic	d85e46321a	community[patch]: Better error message for neo4j vector when text is null (#21861 )	4 months ago
Stefano Lottini	f2e75f9500	cli[minor]: fix import path for two Astra DB classes in the migration json data (#21926 ) This PR fixes two mistakes in the import paths from community for the json data aiding the cli migration to 0.2. It is intended as a quick follow-up to https://github.com/langchain-ai/langchain/pull/21913 . @nicoloboschi FYI	4 months ago
WilliamEspegren	30bca57aae	doc list not empty (#21208 ) Make sure the doc list is not empty, and set Metadata: true in param, to enable the user to disable metadata for slightly faster crawls.	4 months ago
David Charles	8da35fba7f	langchain[minor]: add libs/partners to dev.Dockerfile (#21902 ) Resolves #21886 by adding "COPY libs/partners ../partners/" to libs/dev.Dockerfile Twitter: @kabakongo	4 months ago
TJ	8cd6ed3e1e	community[patch]: Update documentation string in databricks chat model (#21915 ) Update typos in documentation string in databricks chat model	4 months ago
Nicolò Boschi	dd00aac7ad	cli[minor]: add astradb in the cli migration to 0.2 (#21913 ) astradb has a new partner package but the automatic migration cli tool doesn't take care of migrating astradb integrations	4 months ago
Coozywana	b6c8b6f944	Fix base.py typo (#21862 ) ChatOpenaAI --> ChatOpenAI	4 months ago
fzowl	d3624eaba1	partners: Remove unnecessary print from voyageai embeddings (#21865 )	4 months ago
Bagatur	8b3c5f93f5	docs: lcel how to and cheatsheet (#21851 )	4 months ago
Nuno Campos	b1e7b40b6a	core: Tap output of sync iterators for astream_events (#21842 )	4 months ago
Eugene Yurtsev	67b6f6c82a	core[patch]: Check if event loop is closed in memory stream (#21841 ) Check if the event loop is closed in the memory stream. Using try/except here to avoid a race condition, but this may incur a small overhead in versions prior to 3.11	4 months ago
Erick Friis	2d3f4e1a16	experimental: release 0.0.59 (#21835 )	4 months ago
Erick Friis	169f525cfb	community: release 0.2.0 (#21834 )	4 months ago
Erick Friis	e5046cbd72	langchain: release 0.2.0, fix min deps (#21833 )	4 months ago
Erick Friis	1b555021f7	text-splitters: release 0.2.0 (#21832 )	4 months ago
Erick Friis	0ad8de5eb7	langchain: release 0.2.0 (#21831 )	4 months ago
Erick Friis	23310626b3	core: release 0.2.0 (#21829 )	4 months ago
Eugene Yurtsev	e3f30b4cde	docs: clean up link to bing search (#21825 ) Documentation should be inlined, not linking to medium article.	4 months ago
ccurme	181dfef118	core, standard tests, partner packages: add test for model params (#21677 ) 1. Adds `.get_ls_params` to BaseChatModel which returns ```python class LangSmithParams(TypedDict, total=False): ls_provider: str ls_model_name: str ls_model_type: Literal["chat"] ls_temperature: Optional[float] ls_max_tokens: Optional[int] ls_stop: Optional[List[str]] ``` by default it will only return ```python {ls_model_type="chat", ls_stop=stop} ``` 2. Add these params to inheritable metadata in `CallbackManager.configure` 3. Implement `.get_ls_params` and populate all params for Anthropic + all subclasses of BaseChatOpenAI Sample trace: https://smith.langchain.com/public/d2962673-4c83-47c7-b51e-61d07aaffb1b/r OpenAI: <img width="984" alt="Screenshot 2024-05-17 at 10 03 35 AM" src="https://github.com/langchain-ai/langchain/assets/26529506/2ef41f74-a9df-4e0e-905d-da74fa82a910"> Anthropic: <img width="978" alt="Screenshot 2024-05-17 at 10 06 07 AM" src="https://github.com/langchain-ai/langchain/assets/26529506/39701c9f-7da5-4f1a-ab14-84e9169d63e7"> Mistral (and all others for which params are not yet populated): <img width="977" alt="Screenshot 2024-05-17 at 10 08 43 AM" src="https://github.com/langchain-ai/langchain/assets/26529506/37d7d894-fec2-4300-986f-49a5f0191b03">	4 months ago
Sen Lin	eb7f07ae36	community[patch]: fix typo in ValueError message in load_local function (#21818 ) Description: Corrected an error in the `allow_dangerous_deserialization` message within the `load_local` functions	4 months ago
Jorge Piedrahita Ortiz	700b1c7212	community: sambaverse api update (#21816 ) - Description: fix sambaverse integration to make it compatible with sambaverse API update / minor changes in docs	4 months ago
maang-h	9f8d18c028	community[patch]: Fix unintended newline in print statement in exception for BaichuanTextEmbeddings (#21820 ) - Code: langchain_community/embeddings/baichuan.py:82 - Description: When I make an error using 'baichuan embeddings', the printed error message is wrapped (there is actually no need to wrap) ```python # example from langchain_community.embeddings import BaichuanTextEmbeddings # error key BAICHUAN_API_KEY = "sk-xxxxxxxxxxxxx" embeddings = BaichuanTextEmbeddings(baichuan_api_key=BAICHUAN_API_KEY) text_1 = "今天天气不错" query_result = embeddings.embed_query(text_1) ``` ![unintended newline](https://github.com/langchain-ai/langchain/assets/55082429/e1178ce8-62bb-405d-a4af-e3b28eabc158)	4 months ago
Bakar Tavadze	3b5ac44e03	langchain-robocorp[minor]: Enable passing additional headers to the action server. (#21809 ) Actions can optionally receive secrets via request headers. This PR enables this functionality.	4 months ago
Asaf Joseph Gardin	f3289b898c	partners: Revert AI21 Labs docs scan feature (#21699 ) Description: Reverted commit #21614 --------- Co-authored-by: Asaf Gardin <asafg@ai21.com> Co-authored-by: Erick Friis <erick@langchain.dev>	4 months ago
Eugene Yurtsev	8607735b80	langchain[patch],community[patch]: Move unit tests that depend on community to community (#21685 )	4 months ago
Marco Lamina	d0fae6cd54	community: Add token cost for GPT-4o model (#21771 ) Adding [token cost for the new GPT-4o model](https://openai.com/api/pricing/): * Input cost US$5.00 / 1M tokens * Output cost US$15.00 / 1M tokens	4 months ago
Massimiliano Pronesti	0c0db7c5db	feat(community): support semantic hybrid score threshold in Azure AI Search (#21527 ) Support semantic hybrid search with a score threshold -- similar to what we do for similarity search and for hybrid search (#20907).	4 months ago
Bagatur	6416d16d39	anthropic[patch]: Release 0.1.13, tool_choice support (#21773 )	4 months ago
Stefano Lottini	040597e832	community: init signature revision for Cassandra LLM cache classes + small maintenance (#17765 ) This PR improves on the `CassandraCache` and `CassandraSemanticCache` classes, mainly in the constructor signature, and also introduces several minor improvements around these classes. ### Init signature A (sigh) breaking change is tentatively introduced to the constructor. To me, the advantages outweigh the possible discomfort: the new syntax places the DB-connection objects `session` and `keyspace` later in the param list, so that they can be given a default value. This is what enables the pattern of _not_ specifying them, provided one has previously initialized the Cassandra connection through the versatile utility method `cassio.init(...)`. In this way, a much less unwieldy instantiation can be done, such as `CassandraCache()` and `CassandraSemanticCache(embedding=xyz)`, everything else falling back to defaults. A downside is that, compared to the earlier signature, this might turn out to be breaking for those doing positional instantiation. As a way to mitigate this problem, this PR typechecks its first argument trying to detect the legacy usage. (And to make this point less tricky in the future, most arguments are left to be keyword-only). If this is considered too harsh, I'd like guidance on how to further smoothen this transition. Our plan is to make the pattern of optional session/keyspace a standard across all Cassandra classes, so that a repeatable strategy would be ideal. A possibility would be to keep positional arguments for legacy reasons but issue a deprecation warning if any of them is actually used, to later remove them with 0.2 - please advise on this point. ### Other changes - class docstrings: enriched, completely moved to class level, added note on `cassio.init(...)` pattern, added tiny sample usage code. - semantic cache: revised terminology to never mention "distance" (it is in fact a similarity!). 
Kept the legacy constructor param with a deprecation warning if used. - `llm_caching` notebook: uniform flow with the Cassandra and Astra DB separate cases; better and Cassandra-first description; all imports made explicit and from community where appropriate. - cache integration tests moved to community (incl. the imported tools), env var bugfix for `CASSANDRA_CONTACT_POINTS`. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	4 months ago
Kyle Cassidy	eca8c4bcc6	Standardized openai init params (#21739 ) ## Patch Summary community:openai[patch]: standardize init args ## Details I made changes to the OpenAI Chat API wrapper test in the Langchain open-source repository - File: `libs/community/tests/unit_tests/chat_models/test_openai.py` - Changes: - Updated `max_retries` with Pydantic Field - Updated the corresponding unit test - Related Issues: #20085 - Updated max_retries with Pydantic Field, updated the unit test. --------- Co-authored-by: JuHyung Son <sonju0427@gmail.com>	4 months ago
Ethan Yang	e44b448ec3	community: update openvino doc with streaming support (#21519 ) Co-authored-by: Chester Curme <chester.curme@gmail.com>	4 months ago
ccurme	19e6bf814b	community: fix CI (#21766 )	4 months ago
Mish Ushakov	d77e60a7f4	community: updated Browserbase loader (#21757 ) Thank you for contributing to LangChain! - [x] PR title: "community: updated Browserbase loader" - [x] PR message: Updates the Browserbase loader with more options and improved docs. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	4 months ago
Eugene Yurtsev	6ed0aa3239	core[major]: only use function description (#21622 ) Do not prefix function signature --- * Reason for this is that information is already present with tool calling models. * This will save on tokens for those models, and makes it more obvious what the description is! * The @tool can get more parameters to allow a user to re-introduce the signature if we want	4 months ago
William FH	8498b41cda	Finish agent migration doc (#21731 )	4 months ago
Cheese	0ead09f84d	community: Implement `bind_tools` for ChatTongyi (#20725 ) ## Description Implement `bind_tools` in ChatTongyi. Usage example: ```py from langchain_core.tools import tool from langchain_community.chat_models.tongyi import ChatTongyi @tool def multiply(first_int: int, second_int: int) -> int: """Multiply two integers together.""" return first_int * second_int llm = ChatTongyi(model="qwen-turbo") llm_with_tools = llm.bind_tools([multiply]) msg = llm_with_tools.invoke("What's 5 times forty two") print(msg) ``` Streaming is also supported. ## Dependencies No Dependency is required for this change. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	4 months ago
Bagatur	867adbf27b	docs: add aca-ds (#21746 )	4 months ago
Erick Friis	06110e20b9	pinecone: bump min core version (#21742 )	4 months ago
Erick Friis	bd3e7d50f3	fireworks: bump min core version (#21741 )	4 months ago
Erick Friis	f5c31078d7	airbyte[patch]: airbyte-cdk compatible pydantic versions (#21738 )	4 months ago
Erick Friis	3d33b89fa4	ibm[patch]: release 0.1.7 (#21737 )	4 months ago
Erick Friis	e41d801369	openai[patch]: fix embedding float precision issue (#21736 ) also clean up + comment some of the embedding batching code	4 months ago
JuHyung Son	38c297a025	upstage: Support batch input in embedding request. (#21730 ) Description: upstage embedding now supports batch input.	4 months ago
Harrison Chase	15be439719	Harrison/move flashrank rerank (#21448 ) third party integration, should be in community	4 months ago
Erick Friis	aca98fd150	multiple: releases with relaxed core dep (#21724 )	4 months ago
Bagatur	af284518bc	openai[patch]: Release 0.1.7, bump tiktoken 0.7.0 (#21723 )	4 months ago
William FH	ca768c8353	[Core] Check is async callable (#21714 ) To permit proper coercion of objects like the following: ```python class MyAsyncCallable: async def __call__(self, foo): return await ... class MyAsyncGenerator: async def __call__(self, foo): await ... yield ```	4 months ago
Eugene Yurtsev	5c2cfabec6	core[minor]: Add v2 implementation of astream events (#21638 ) This PR introduces a v2 implementation of astream events that removes intermediate abstractions and fixes some issues with v1 implementation. The v2 implementation significantly reduces relevant code that's associated with the astream events implementation together with overhead. After this PR, the astream events implementation: - Uses an async callback handler - No longer relies on BaseTracer - No longer relies on json patch As a result of this re-write, a number of issues were discovered with the existing implementation. ## Changes in V2 vs. V1 ### on_chat_model_end `output` The outputs associated with `on_chat_model_end` changed depending on whether it was within a chain or not. As a root level runnable the output was: ```python "data": {"output": AIMessageChunk(content="hello world!", id='some id')} ``` As part of a chain the output was: ``` "data": { "output": { "generations": [ [ { "generation_info": None, "message": AIMessageChunk( content="hello world!", id=AnyStr() ), "text": "hello world!", "type": "ChatGenerationChunk", } ] ], "llm_output": None, } }, ``` After this PR, we will always use the simpler representation: ```python "data": {"output": AIMessageChunk(content="hello world!", id='some id')} ``` NOTE Non chat models (i.e., regular LLMs) are still associated with the more verbose format. ### Remove some `_stream` events `on_retriever_stream` and `on_tool_stream` events were removed -- these were not real events, but created as an artifact of implementing on top of astream_log. The same information is already available in the `x_on_end` events. 
### Propagating Names Names of runnables have been updated to be more consistent ```python model = GenericFakeChatModel(messages=infinite_cycle).configurable_fields( messages=ConfigurableField( id="messages", name="Messages", description="Messages return by the LLM", ) ) ``` Before: ```python "name": "RunnableConfigurableFields", ``` After: ```python "name": "GenericFakeChatModel", ``` ### on_retriever_end on_retriever_end will always return `output` which is a list of documents (rather than a dict containing a key called "documents") ### Retry events Removed the `on_retry` callback handler. It was incorrectly showing that the failed function being retried has invoked `on_chain_end` https://github.com/langchain-ai/langchain/pull/21638/files#diff-e512e3f84daf23029ebcceb11460f1c82056314653673e450a5831147d8cb84dL1394	4 months ago
Rajendra Kadam	54e003268e	langchain[minor]: Add PebbloRetrievalQA chain with Identity & Semantic Enforcement support (#20641 ) - Description: PebbloRetrievalQA chain introduces identity enforcement using vector-db metadata filtering - Dependencies: None - Issue: None - Documentation: Adding documentation for PebbloRetrievalQA chain in a separate PR(https://github.com/langchain-ai/langchain/pull/20746) - Unit tests: New unit-tests added --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	4 months ago
Bagatur	241a6e43a5	docs: update structured how to (#21679 )	4 months ago
Jib	f369495fa0	mongodb: [performance] Increase DEFAULT_INSERT_BATCH_SIZE to 100,000 and introduce sizing constraints (#19608 )	4 months ago
Eugene Yurtsev	e69a9bedf8	core[patch]: Update mypy config (#21684 ) Update mypy config to ignore checking deps from numpy and pytest (which are optional in langsmith sdk)	4 months ago
Erick Friis	9973547aef	mongodb: release 0.1.4 (#21678 )	4 months ago
Jib	a97473c846	mongodb[patch]: Make ObjectId JSON-serializable on generation (#21394 )	4 months ago
Eugene Yurtsev	5c64c004cc	core[patch]: Add unit tests with some streaming scenarios (#21668 ) Add unit tests that show differences between sync / async versions when streaming. The inner on_chain_chunk event is missing if mixing sync and async functionality. Likely due to missing tap_output_iter implementation on the sync variant of `_transform_stream_with_config`	4 months ago
Eugene Yurtsev	2ac4d2960c	core[patch]: Add unit test to catch ordering (#21669 ) Add unit test to catch ordering issues	4 months ago
Zhao Blake	972d2071c6	core[patch]: Fix typo in VectorStoreExampleSelector doc-string (#21574 )	4 months ago
Erick Friis	2a984e8e3f	docs: huggingface package (#21645 )	4 months ago
Erick Friis	c77d2f2b06	multiple: core 0.2 nonbreaking dep, check_diff community->langchain dep (#21646 ) 0.2 is not a breaking release for core (but it is for langchain and community) To keep the core+langchain+community packages in sync at 0.2, we will relax deps throughout the ecosystem to tolerate `langchain-core` 0.2	4 months ago
Anush	edd68e4ad4	qdrant: init package (#21146 ) ## Description This PR introduces the new `langchain-qdrant` partner package, intending to deprecate the community package. ## Changes - Moved the Qdrant vector store implementation `/libs/partners/qdrant` with integration tests. - The conditional imports of the client library are now regular with minor implementation improvements. - Added a deprecation warning to `langchain_community.vectorstores.qdrant.Qdrant`. - Replaced references/imports from `langchain_community` with either `langchain_core` or by moving the definitions to the `langchain_qdrant` package itself. - Updated the Qdrant vector store documentation to reflect the changes. ## Testing - `QDRANT_URL` and [`QDRANT_API_KEY`](`583e36bf6b`) env values need to be set to [run integration tests](`d608c93d1f`) in the [cloud](https://cloud.qdrant.tech). - If a Qdrant instance is running at `http://localhost:6333`, the integration tests will use it too. - By default, tests use an [`in-memory`](https://github.com/qdrant/qdrant-client?tab=readme-ov-file#local-mode) instance(Not comprehensive). --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Erick Friis <erickfriis@gmail.com>	4 months ago
Prashanth Rao	63c3a0e56c	[community][graph]: Update KuzuQAChain and docs (#21218 ) This PR makes some small updates for `KuzuQAChain` for graph QA. - Updated Cypher generation prompt (we now support `WHERE EXISTS`) and generalize it more - Support different LLMs for Cypher generation and QA - Update docs and examples	4 months ago
Tomaz Bratanic	89ff6a3d3b	Add sentiment and confidence levels to diffbotgraphtransformer (#21590 ) Co-authored-by: Erick Friis <erickfriis@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	4 months ago
Erick Friis	9b51ca08bc	huggingface: fix community dep checking (#21628 )	4 months ago
Erick Friis	91a2ea5cd6	chroma, mongodb: fix docstrings (#21629 )	4 months ago
Jofthomas	afd85b60fc	huggingface: init package (#21097 ) First PR for the langchain_huggingface partner package - Moved some of the Hugging Face related classes from `community` to the new `partner package` Still needed: - Documentation - Tests - Support for the new apply_chat_template in `ChatHuggingFace` - Confirm choice of class to support for embeddings with the sentence-transformer team. cc: @efriis --------- Co-authored-by: Cyril Kondratenko <kkn1993@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	4 months ago
Tomaz Bratanic	9fce03e7db	community[patch]: Fix neo4j enhanced schema (#21582 )	4 months ago
Christophe Bornet	66a4da8ad0	community[patch]: Improve Cassandra VectorStore docstrings (#21620 )	4 months ago
adreo00	40aff1eacc	core[major]: AsyncCallbackManagerForToolRun no longer casts return object to string (#20374 ) - Description: Stops `AsyncCallbackManagerForToolRun` from converting the output to str - Issue: #20372 - Dependencies: None	4 months ago
Eugene Yurtsev	25fbe356b4	community[patch]: upgrade to recent version of mypy (#21616 ) This PR upgrades community to a recent version of mypy. It inserts type: ignore on all existing failures.	4 months ago
Eugene Yurtsev	b923951062	langchain[patch]: CI add lint rule for community imports (#21618 ) Add a rule to check for imports from community in global scope	4 months ago
Jorge Piedrahita Ortiz	4378fbbef0	community[patch]: Fix typos in Sambanova integration doc-strings (#21617 ) - Description: Sambanova integration docstrings updated; they were badly formatted --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	4 months ago
Christophe Bornet	bcf53f93e1	[community]: Add missing docstring param to CassandraLoader (#21611 )	4 months ago
Christophe Bornet	e6fa4547b1	community[minor]: Add alazy_load to AsyncHtmlLoader (#21536 ) Also fixes a bug that `_scrape` was called and was doing a second HTTP request synchronously. Twitter handle: cbornet_	4 months ago
Wang Guan	b53548dcda	langchain[minor]: allow CacheBackedEmbeddings to cache queries (#20073 ) Add optional caching of queries to cache backed embeddings	4 months ago
Guangdong Liu	a156aace2b	core[patch]:Fix Incorrect listeners parameters for Runnable.with_listeners() and .map() (#20661 ) - Issue: fix #20509 - @baskaryan, @eyurtsev ![image](https://github.com/langchain-ai/langchain/assets/48236177/f799a976-b983-4d8b-b373-64392e1fd6c6)	4 months ago
junkeon	480c02bf55	upstage[minor]: add merge_and_split function for document loader (#21603 ) - Introduce the `merge_and_split` function in the `UpstageLayoutAnalysisLoader`. - The `merge_and_split` function takes a list of documents and a splitter as inputs. - This function merges all documents and then divides them using the `split_documents` method, which is a proprietary function of the splitter. - If the provided splitter is `None` (which is the default setting), the function will simply merge the documents without splitting them.	4 months ago
Leonid Ganeline	500569da48	community[patch]: `vectorstores` import update (#21169 ) Issue: we have several helper functions to import third-party libraries like lancedb.import_lancedb in [community.vectorstores](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.lancedb.import_lancedb.html#langchain_community.vectorstores.lancedb.import_lancedb). And we have core.utils.utils.guard_import that works exactly for this purpose. The import_<package> functions work inconsistently and rather be private functions. Change: replaced these functions with the guard_import function. Related to #21133	4 months ago
ccurme	3003363605	langchain, community: remove cap on sqlalchemy and bump duckdb (#21509 )	4 months ago
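
Commit d0fae6cd54 above quotes GPT-4o pricing as US$5.00 per 1M input tokens and US$15.00 per 1M output tokens. As a rough illustration of how those two rates combine into a per-call cost, here is a minimal sketch. The helper name `estimate_gpt4o_cost` and the constants are hypothetical and are not taken from the LangChain codebase; only the two prices come from the commit message.

```python
# Prices per 1M tokens in USD, as quoted in commit d0fae6cd54.
GPT_4O_INPUT_USD_PER_1M = 5.00
GPT_4O_OUTPUT_USD_PER_1M = 15.00


def estimate_gpt4o_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Hypothetical helper: estimate the USD cost of one GPT-4o call.

    This is an illustrative calculation only, not LangChain's own
    cost-tracking implementation.
    """
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        prompt_tokens * GPT_4O_INPUT_USD_PER_1M
        + completion_tokens * GPT_4O_OUTPUT_USD_PER_1M
    )
    # Rates are per one million tokens, so scale down at the end.
    return total / 1_000_000


if __name__ == "__main__":
    # 1,000 prompt tokens and 500 completion tokens.
    print(f"{estimate_gpt4o_cost(1_000, 500):.4f} USD")
```

Under these assumptions a call with 1,000 prompt tokens and 500 completion tokens would cost about 1.25 US cents; output tokens dominate because they are priced at three times the input rate.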


5010 Commits (199e64d37288330a80f9aeb19e00dff23af71d4c)