langchain

Commit Graph

Author	SHA1	Message	Date
Naveen Tatikonda	8bbdb4f6a0	community[patch]: Add OpenSearch as semantic cache (#20254 ) ### Description Use OpenSearch vector store as Semantic Cache. ### Twitter Handle @OpenSearchProj --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com> Co-authored-by: Harish Tatikonda <harishtatikonda@Harishs-MacBook-Air.local> Co-authored-by: EC2 Default User <ec2-user@ip-172-31-31-155.ec2.internal> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2 months ago
Mayank Solanki	8c085fc697	community[patch]: Added a function `from_existing_collection` in `Qdrant` vector database. (#20779 ) Issue: #20514 The current implementation of `construct_instance` expects a `texts: List[str]` that will call the embedding function. This might not be needed when we already have a client with collection and `path, you don't want to add any text. This PR adds a class method that returns a qdrant instance with an existing client. Here everytime `cb6e5e56c2/libs/community/langchain_community/vectorstores/qdrant.py (L1592)` `construct_instance` is called, this line sends some text for embedding generation. --------- Co-authored-by: Anush <anushshetty90@gmail.com>	2 months ago
Leonid Kuligin	893a924b90	core[minor], community[patch], langchain[patch]: move BaseChatLoader to core (#19607 ) Thank you for contributing to LangChain! - [ ] PR title: "core: move BaseChatLoader and BaseToolkit from community" - [ ] PR message: move BaseChatLoader and BaseToolkit --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2 months ago
Lei Zhang	9281841cfe	community[patch]: fix integrated test case test_recursive_url_loader.py assertions (issue-20919) (#20920 ) Description: Fix integrated test case test_recursive_url_loader.py Local testing successful ```shell (venv) lei@LeideMacBook-Pro community % poetry run pytest tests/integration_tests/document_loaders/test_recursive_url_loader.py ================================================================================ test session starts ================================================================================ platform darwin -- Python 3.11.4, pytest-7.4.4, pluggy-1.4.0 -- /Users/zhanglei/Work/github/langchain/venv/bin/python cachedir: .pytest_cache rootdir: /Users/zhanglei/Work/github/langchain/libs/community configfile: pyproject.toml plugins: syrupy-4.6.1, asyncio-0.20.3, cov-4.1.0, vcr-1.0.2, mock-3.12.0, anyio-3.7.1, dotenv-0.5.2, requests-mock-1.11.0, socket-0.6.0 asyncio: mode=Mode.AUTO collected 6 items tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader PASSED [ 16%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic PASSED [ 33%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader FAILED [ 50%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent PASSED [ 66%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_loading_invalid_url PASSED [ 83%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties PASSED [100%] ===================================================================================== FAILURES ====================================================================================== __________________________________________________________________________ test_sync_recursive_url_loader ___________________________________________________________________________ def test_sync_recursive_url_loader() -> None: url = "https://docs.python.org/3.9/" loader = RecursiveUrlLoader( url, extractor=lambda _: "placeholder", use_async=False, max_depth=2 ) docs = loader.load() > assert len(docs) == 23 E AssertionError: assert 24 == 23 E + where 24 = len([Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/', 'content_type': 'text/html', 'title': '3.9.18 Documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/py-modindex.html', 'content_type': 'text/html', 'title': 'Python Module Index — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/download.html', 'content_type': 'text/html', 'title': 'Download — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/howto/index.html', 'content_type': 'text/html', 'title': 'Python HOWTOs — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/whatsnew/index.html', 'content_type': 'text/html', 'title': 'Whatâ\x80\x99s New in Python — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/c-api/index.html', 'content_type': 'text/html', 'title': 'Python/C API Reference Manual — Python 3.9.18 documentation', 'language': None}), ...]) tests/integration_tests/document_loaders/test_recursive_url_loader.py:38: AssertionError ================================================================================= warnings summary ================================================================================== tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties /Users/zhanglei/.pyenv/versions/3.11.4/lib/python3.11/html/parser.py:170: XMLParsedAsHTMLWarning: It looks like you're parsing an XML document using an HTML parser. If this really is an HTML document (maybe it's XHTML?), you can ignore or filter this warning. If it's XML, you should know that using an XML parser will be more reliable. To parse this document as XML, make sure you have the lxml package installed, and pass the keyword argument `features="xml"` into the BeautifulSoup constructor. k = self.parse_starttag(i) -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ================================================================================ slowest 5 durations ================================================================================ 56.75s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic 38.99s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader 31.20s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties 30.37s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent 15.44s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader ============================================================================== short test summary info ============================================================================== FAILED tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader - AssertionError: assert 24 == 23 ================================================================ 1 failed, 5 passed, 5 warnings in 172.97s (0:02:52) ================================================================ (venv) zhanglei@LeideMacBook-Pro community % poetry run pytest tests/integration_tests/document_loaders/test_recursive_url_loader.py ================================================================================ test session starts ================================================================================ platform darwin -- Python 3.11.4, pytest-7.4.4, pluggy-1.4.0 -- /Users/zhanglei/Work/github/langchain/venv/bin/python cachedir: .pytest_cache rootdir: /Users/zhanglei/Work/github/langchain/libs/community configfile: pyproject.toml plugins: syrupy-4.6.1, asyncio-0.20.3, cov-4.1.0, vcr-1.0.2, mock-3.12.0, anyio-3.7.1, dotenv-0.5.2, requests-mock-1.11.0, socket-0.6.0 asyncio: mode=Mode.AUTO collected 6 items tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader PASSED [ 16%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic PASSED [ 33%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader PASSED [ 50%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent PASSED [ 66%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_loading_invalid_url PASSED [ 83%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties PASSED [100%] ================================================================================= warnings summary ================================================================================== tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties /Users/zhanglei/.pyenv/versions/3.11.4/lib/python3.11/html/parser.py:170: XMLParsedAsHTMLWarning: It looks like you're parsing an XML document using an HTML parser. If this really is an HTML document (maybe it's XHTML?), you can ignore or filter this warning. If it's XML, you should know that using an XML parser will be more reliable. To parse this document as XML, make sure you have the lxml package installed, and pass the keyword argument `features="xml"` into the BeautifulSoup constructor. k = self.parse_starttag(i) -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ================================================================================ slowest 5 durations ================================================================================ 46.99s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic 32.43s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader 31.23s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent 30.75s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties 15.89s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader ===================================================================== 6 passed, 5 warnings in 157.42s (0:02:37) ===================================================================== (venv) lei@LeideMacBook-Pro community % ``` Issue: https://github.com/langchain-ai/langchain/issues/20919 Twitter handle: @coolbeevip	2 months ago
Matt	28df4750ef	community[patch]: Add initial tests for AzureSearch vector store (#17663 ) Description: AzureSearch vector store has no tests. This PR adds initial tests to validate the code can be imported and used. Issue: N/A Dependencies: azure-search-documents and azure-identity are added as optional dependencies for testing --------- Co-authored-by: Matt Gotteiner <[email protected]> Co-authored-by: Bagatur <baskaryan@gmail.com>	2 months ago
Dristy Srivastava	5f1d1666e3	community[patch]: Add support for pebblo server and client version (#20269 ) Description: _PebbloSafeLoader_: Add support for pebblo server and client version Documentation: NA Unit test: NA Issue: NA Dependencies: None --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2 months ago
am-kinetica	b54b19ba1c	community[minor]: Implemented Kinetica Document Loader and added notebooks (#20002 ) - [ ] Kinetica Document Loader: "community: a class to load Documents from Kinetica" - [ ] Kinetica Document Loader: - Description: implemented KineticaLoader in `kinetica_loader.py` - Dependencies: install the Kinetica API using `pip install gpudb==7.2.0.1 `	2 months ago
Shengsheng Huang	fd1061e7bf	community[patch]: add more data types support to ipex-llm llm integration (#20833 ) - Description: - add support for more data types: by default `IpexLLM` will load the model in int4 format. This PR adds more data types support such as `sym_in5`, `sym_int8`, etc. Data formats like NF3, NF4, FP4 and FP8 are only supported on GPU and will be added in future PR. - Fix a small issue in saving/loading, update api docs - Dependencies: `ipex-llm` library - Document: In `docs/docs/integrations/llms/ipex_llm.ipynb`, added instructions for saving/loading low-bit model. - Tests: added new test cases to `libs/community/tests/integration_tests/llms/test_ipex_llm.py`, added config params. - Contribution maintainer: @shane-huang	2 months ago
Rahul Triptahi	dc921f0823	community[patch]: Add semantic info to metadata, classified by pebblo-server. (#20468 ) Description: Add support for Semantic topics and entities. Classification done by pebblo-server is not used to enhance metadata of Documents loaded by document loaders. Dependencies: None Documentation: Updated. Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	2 months ago
Jingpan Xiong	1202017c56	community[minor]: Add relyt vector database (#20316 ) Co-authored-by: kaka <kaka@zbyte-inc.cloud> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: jingsi <jingsi@leadincloud.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2 months ago
davidefantiniIntel	f386f71bb3	community: fix tqdm import (#20263 ) Description: Fix tqdm import in QuantizedBiEncoderEmbeddings	2 months ago
Andres Algaba	05ae8ca7d4	community[patch]: deprecate persist method in Chroma (#20855 ) Thank you for contributing to LangChain! - [x] PR title - [x] PR message: - Description: Deprecate persist method in Chroma no longer exists in Chroma 0.4.x - Issue: #20851 - Dependencies: None - Twitter handle: AndresAlgaba1 - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2 months ago
ccurme	b8db73233c	core, community: deprecate tool.__call__ (#20900 ) Does not update docs.	2 months ago
Tomaz Bratanic	520972fd0f	community[patch]: Support passing graph object to Neo4j integrations (#20876 ) For driver connection reusage, we introduce passing the graph object to neo4j integrations	2 months ago
Lei Zhang	748a6ae609	community[patch]: add HTTP response headers Content-Type to metadata of RecursiveUrlLoader document (#20875 ) Description: The RecursiveUrlLoader loader offers a link_regex parameter that can filter out URLs. However, this filtering capability is limited, and if the internal links of the website change, unexpected resources may be loaded. These resources, such as font files, can cause problems in subsequent embedding processing. > https://blog.langchain.dev/assets/fonts/source-sans-pro-v21-latin-ext_latin-regular.woff2?v=0312715cbf We can add the Content-Type in the HTTP response headers to the document metadata so developers can choose which resources to use. This allows developers to make their own choices. For example, the following may be a good choice for text knowledge. - text/plain - simple text file - text/html - HTML web page - text/xml - XML format file - text/json - JSON format data - application/pdf - PDF file - application/msword - Word document and ignore the following - text/css - CSS stylesheet - text/javascript - JavaScript script - application/octet-stream - binary data - image/jpeg - JPEG image - image/png - PNG image - image/gif - GIF image - image/svg+xml - SVG image - audio/mpeg - MPEG audio files - video/mp4 - MP4 video file - application/font-woff - WOFF font file - application/font-ttf - TTF font file - application/zip - ZIP compressed file - application/octet-stream - binary data Twitter handle: @coolbeevip --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2 months ago
Joan Fontanals	baefbfb14e	community[mionr]: add Jina Reranker in retrievers module (#19406 ) - Description: Adapt JinaEmbeddings to run with the new Jina AI Rerank API - Twitter handle: https://twitter.com/JinaAI_ - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2 months ago
Jason_Chen	53bb7dbd29	community[patch]: add BeautifulSoupTransformer remove_unwanted_classnames method (#20467 ) Add the remove_unwanted_classnames method to the BeautifulSoupTransformer class, which can filter more effectively. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2 months ago
Bagatur	5b83130855	core[minor], langchain[patch], community[patch]: mv StructuredQuery (#20849 ) mv StructuredQuery to core	2 months ago
Mish Ushakov	6ccecf2363	community[minor]: added Browserbase loader (#20478 )	2 months ago
ccurme	481d3855dc	patch: remove usage of llm, chat model __call__ (#20788 ) - `llm(prompt)` -> `llm.invoke(prompt)` - `llm(prompt=prompt` -> `llm.invoke(prompt)` (same with `messages=`) - `llm(prompt, callbacks=callbacks)` -> `llm.invoke(prompt, config={"callbacks": callbacks})` - `llm(prompt, kwargs)` -> `llm.invoke(prompt, kwargs)`	2 months ago
Raghav Dixit	9b7fb381a4	community[patch]: LanceDB integration patch update (#20686 ) Description : - added functionalities - delete, index creation, using existing connection object etc. - updated usage - Added LaceDB cloud OSS support make lint_diff , make test checks done	2 months ago
volodymyr-memsql	493afe4d8d	community[patch]: add hybrid search to singlestoredb vectorstore (#20793 ) Implemented the ability to enable full-text search within the SingleStore vector store, offering users a versatile range of search strategies. This enhancement allows users to seamlessly combine full-text search with vector search, enabling the following search strategies: * Search solely by vector similarity. * Conduct searches exclusively based on text similarity, utilizing Lucene internally. * Filter search results by text similarity score, with the option to specify a threshold, followed by a search based on vector similarity. * Filter results by vector similarity score before conducting a search based on text similarity. * Perform searches using a weighted sum of vector and text similarity scores. Additionally, integration tests have been added to comprehensively cover all scenarios. Updated notebook with examples. CC: @baskaryan, @hwchase17 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2 months ago
Tomaz Bratanic	9efab3ed66	community[patch]: Add driver config param for neo4j graph (#20772 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2 months ago
Leonid Ganeline	13751c3297	community: `tigergraph` fixes (#20034 ) - added guard on the `pyTigerGraph` import - added a missed example page in the `docs/integrations/graphs/` - formatted the `docs/integrations/providers/` page to the consistent format. Added links.	2 months ago
Martin Kolb	0186e4e633	community[patch]: Advanced filtering for HANA Cloud Vector Engine (#20821 ) - Description: This PR adds support for advanced filtering to the integration of HANA Vector Engine. The newly supported filtering operators are: $eq, $ne, $gt, $gte, $lt, $lte, $between, $in, $nin, $like, $and, $or - Issue: N/A - Dependencies: no new dependencies added Added integration tests to: `libs/community/tests/integration_tests/vectorstores/test_hanavector.py` Description of the new capabilities in notebook: `docs/docs/integrations/vectorstores/hanavector.ipynb`	2 months ago
Alex Sherstinsky	12e5ec6de3	community: Support both Predibase SDK-v1 and SDK-v2 in Predibase-LangChain integration (#20859 )	2 months ago
JeffKatzy	5ab3f9a995	community[patch]: standardize chat init args (#20844 ) Thank you for contributing to LangChain! community:perplexity[patch]: standardize init args updated pplx_api_key and request_timeout so that aliased to api_key, and timeout respectively. Added test that both continue to set the same underlying attributes. Related to [20085](https://github.com/langchain-ai/langchain/issues/20085) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2 months ago
Massimiliano Pronesti	8d1167b32f	community[patch]: add support for similarity_score_threshold search in… (#20852 ) See https://github.com/langchain-ai/langchain/issues/20600#issuecomment-2075569338 for details. @chrislrobert	2 months ago
Eugene Yurtsev	30e48c9878	core[patch],community[patch]: Move file chat history back to community (#20834 ) Marking as patch since we haven't had releases in between. This just reverting part of a PR from yesterday.	2 months ago
Nestor Qin	9111d3a636	community[patch]: Fix message formatting for Anthropic models on Amazon Bedrock (#20801 ) Description: This PR fixes an issue in message formatting function for Anthropic models on Amazon Bedrock. Currently, LangChain BedrockChat model will crash if it uses Anthropic models and the model return a message in the following type: - `AIMessageChunk` Moreover, when use BedrockChat with for building Agent, the following message types will trigger the same issue too: - `HumanMessageChunk` - `FunctionMessage` Issue: https://github.com/langchain-ai/langchain/issues/18831 Dependencies: No. Testing: Manually tested. The following code was failing before the patch and works after. ``` @tool def square_root(x: str): "Useful when you need to calculate the square root of a number" return math.sqrt(int(x)) llm = ChatBedrock( model_id="anthropic.claude-3-sonnet-20240229-v1:0", model_kwargs={ "temperature": 0.0 }, ) prompt = ChatPromptTemplate.from_messages( [ ("system", FUNCTION_CALL_PROMPT), ("human", "Question: {user_input}"), MessagesPlaceholder(variable_name="agent_scratchpad"), ] ) tools = [square_root] tools_string = format_tool_to_anthropic_function(square_root) agent = ( RunnablePassthrough.assign( user_input=lambda x: x['user_input'], agent_scratchpad=lambda x: format_to_openai_function_messages( x["intermediate_steps"] ) ) \| prompt \| llm \| AnthropicFunctionsAgentOutputParser() ) agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, return_intermediate_steps=True) output = agent_executor.invoke({ "user_input": "What is the square root of 2?", "tools_string": tools_string, }) ``` List of messages returned from Bedrock: ``` <SystemMessage> content='You are a helpful assistant.' <HumanMessage> content='Question: What is the square root of 2?' <AIMessageChunk> content="Okay, let's calculate the square root of 2.<scratchpad>\nTo calculate the square root of a number, I can use the square_root tool:\n\n<function_calls>\n <invoke>\n <tool_name>square_root</tool_name>\n <parameters>\n <__arg1>2</__arg1>\n </parameters>\n </invoke>\n</function_calls>\n</scratchpad>\n\n<function_results>\n<search_result>\nThe square root of 2 is approximately 1.414213562373095\n</search_result>\n</function_results>\n\n<answer>\nThe square root of 2 is approximately 1.414213562373095\n</answer>" id='run-92363df7-eff6-4849-bbba-fa16a1b2988c'" <FunctionMessage> content='1.4142135623730951' name='square_root' ```	2 months ago
Aliaksandr Kuzmik	5560cc448c	community[patch]: fix CometTracer bug (#20796 ) Hi! My name is Alex, I'm an SDK engineer from [Comet](https://www.comet.com/site/) This PR updates the `CometTracer` class. Fixed an issue when `CometTracer` failed while logging the data to Comet because this data is not JSON-encodable. The problem was in some of the `Run` attributes that could contain non-default types inside, now these attributes are taken not from the run instance, but from the `run.dict()` return value.	2 months ago
Eugene Yurtsev	645b1e142e	core[minor],langchain[patch],community[patch]: Move InMemory and File implementations of Chat History to core (#20752 ) This PR moves the implementations for chat history to core. So it's easier to determine which dependencies need to be broken / add deprecation warnings	2 months ago
ccurme	7a922f3e48	core, openai: support custom token encoders (#20762 )	2 months ago
Christophe Bornet	0ae5027d98	community[patch]: Remove usage of deprecated StoredBlobHistory in CassandraChatMessageHistory (#20666 )	2 months ago
Eugene Yurtsev	38adbfdf34	community[patch],core[minor]: Move BaseToolKit to core.tools (#20669 )	2 months ago
Mark Needham	ce23f8293a	Community patch clickhouse make it possible to not specify index (#20460 ) Vector indexes in ClickHouse are experimental at the moment and can sometimes break/change behaviour. So this PR makes it possible to say that you don't want to specify an index type. Any queries against the embedding column will be brute force/linear scan, but that gives reasonable performance for small-medium dataset sizes. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2 months ago
ccurme	c010ec8b71	patch: deprecate (a)get_relevant_documents (#20477 ) - `.get_relevant_documents(query)` -> `.invoke(query)` - `.get_relevant_documents(query=query)` -> `.invoke(query)` - `.get_relevant_documents(query, callbacks=callbacks)` -> `.invoke(query, config={"callbacks": callbacks})` - `.get_relevant_documents(query, kwargs)` -> `.invoke(query, kwargs)` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2 months ago
Matheus Henrique Raymundo	bb69819267	community: Fix the stop sequence key name for Mistral in Bedrock (#20709 ) Fixing the wrong stop sequence key name that causes an error on AWS Bedrock. You can check the MistralAI bedrock parameters [here](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-mistral.html) This change fixes this [issue](https://github.com/langchain-ai/langchain/issues/20095)	2 months ago
Bagatur	1c7b3c75a7	community[patch], experimental[patch]: support tool-calling sql and p… (#20639 ) d agents	2 months ago
shumway743	cb6e5e56c2	community[minor]: add graph store implementation for apache age (#20582 ) Description: implemented GraphStore class for Apache Age graph db Dependencies: depends on psycopg2 Unit and integration tests included. Formatting and linting have been run. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2 months ago
Christophe Bornet	c909ae0152	community[minor]: Add async methods to CassandraVectorStore (#20602 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2 months ago
Dmitry Tyumentsev	f111efeb6e	community[patch]: YandexGPT API add ability to disable request logging (#20670 ) Closes (#20622) Added the ability to [disable logging of requests to YandexGPT](https://yandex.cloud/en/docs/foundation-models/operations/yandexgpt/disable-logging).	2 months ago
Erick Friis	73809817ff	community: release 0.0.34 (#20672 )	2 months ago
Tomaz Bratanic	8c08cf4619	community: Add support for relationship indexes in neo4j vector (#20657 ) Neo4j has added relationship vector indexes. We can't populate them, but we can use existing indexes for retrieval	2 months ago
Charlie Holtz	1cbab0ebda	community: update Replicate to work with official models (#20633 ) Description: you don't need to pass a version for Replicate official models. That was broken on LangChain until now! You can now run: ``` llm = Replicate( model="meta/meta-llama-3-8b-instruct", model_kwargs={"temperature": 0.75, "max_length": 500, "top_p": 1}, ) prompt = """ User: Answer the following yes/no question by reasoning step by step. Can a dog drive a car? Assistant: """ llm(prompt) ``` I've updated the replicate.ipynb to reflect that. twitter: @charliebholtz --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2 months ago
Congyu	dd5139e304	community[patch]: truncate zhipuai `temperature` and `top_p` parameters to [0.01, 0.99] (#20261 ) ZhipuAI API only accepts `temperature` parameter between `(0, 1)` open interval, and if `0` is passed, it responds with status code `400`. However, 0 and 1 is often accepted by other APIs, for example, OpenAI allows `[0, 2]` for temperature closed range. This PR truncates temperature parameter passed to `[0.01, 0.99]` to improve the compatibility between langchain's ecosystem's and ZhipuAI (e.g., ragas `evaluate` often generates temperature 0, which results in a lot of 400 invalid responses). The PR also truncates `top_p` parameter since it has the same restriction. Reference: [glm-4 doc](https://open.bigmodel.cn/dev/api#glm-4) (which unfortunately is in Chinese though). --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2 months ago
Lance Martin	d5c22b80a5	community[patch]: Fix Ollama for LLaMA3 (#20624 ) We see verbose generations w/ LLaMA3 and Ollama - https://smith.langchain.com/public/88c4cd21-3d57-4229-96fe-53443398ca99/r --- Fix here implies that when stop was being set to an empty list, the stream had no conditions under which to stop, which could lead to excessive or unintended output. Test LLaMA2 - https://smith.langchain.com/public/57dfc64a-591b-46fa-a1cd-8783acaefea2/r Test LLaMA3 - https://smith.langchain.com/public/76ff5f47-ac89-4772-a7d2-5caa907d3fd6/r https://smith.langchain.com/public/a31d2fad-9094-4c93-949a-964b27630ccb/r Test Mistral - https://smith.langchain.com/public/a4fe7114-c308-4317-b9fd-6c86d31f1c5b/r --------- Co-authored-by: Erick Friis <erick@langchain.dev>	3 months ago
hulitaitai	7d0a008744	community[minor]: Add audio-parser "faster-whisper" in audio.py (#20012 ) faster-whisper is a reimplementation of OpenAI's Whisper model using CTranslate2, which is up to 4 times faster than enai/whisper for the same accuracy while using less memory. The efficiency can be further improved with 8-bit quantization on both CPU and GPU. It can automatically detect the following 14 languages and transcribe the text into their respective languages: en, zh, fr, de, ja, ko, ru, es, th, it, pt, vi, ar, tr. The gitbub repository for faster-whisper is : https://github.com/SYSTRAN/faster-whisper --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	3 months ago
Guangdong Liu	e3c2431c5b	comminuty[patch]:Fix Error in apache doris insert (#19989 ) - Issue: #19886	3 months ago
Tomaz Bratanic	27370b679e	community[patch]: Ignore null and invalid embedding values for neo4j metadata filtering (#20558 )	3 months ago
Massimiliano Pronesti	2542a09abc	community[patch]: AzureSearch incorrectly converted to retriever (#20601 ) Closes #20600. Please see the issue for more details.	3 months ago
Christophe Bornet	8f0b5687a3	community[minor]: Add hybrid search to Cassandra VectorStore (#20286 ) Only supported by Astra DB at the moment. Twitter handle: cbornet_	3 months ago
Christophe Bornet	d2d01370bc	community[minor]: Add async methods to CassandraLoader (#20609 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	3 months ago
balloonio	e786da7774	community[patch]: Invoke callback prior to yielding token fix [HuggingFaceTextGenInference] (#20426 ) …gFaceTextGenInference) - [x] PR title: community[patch]: Invoke callback prior to yielding token fix for [HuggingFaceTextGenInference] - [x] PR message: - Description: Invoke callback prior to yielding token in stream method in [HuggingFaceTextGenInference] - Issue: https://github.com/langchain-ai/langchain/issues/16913 - Dependencies: None - Twitter handle: @bolun_zhang If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	3 months ago
Ethan Yang	2d6d796040	community: Add save_model function for openvino reranker and embedding (#19896 )	3 months ago
zR	9c1d7f2405	update zhipuai notebook (#20595 ) fix timeout issue fix zhipuai usecase notebookbook Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17.	3 months ago
ccurme	c897264b9b	community: (milvus) check for num_shards (#20603 ) @rgupta2508 I believe this change is necessary following https://github.com/langchain-ai/langchain/pull/20318 because of how Milvus handles defaults: `59bf5e811a/pymilvus/client/prepare.py (L82-L85)` ```python num_shards = kwargs[next(iter(same_key))] if not isinstance(num_shards, int): msg = f"invalid num_shards type, got {type(num_shards)}, expected int" raise ParamError(message=msg) req.shards_num = num_shards ``` this way lets Milvus control the default value (instead of maintaining a separate default in Langchain). Let me know if I've got this wrong or you feel it's unnecessary. Thanks.	3 months ago
Rohit Gupta	25c4c24e89	Support to create shards_num in milvus vectorstores (#20318 ) To support number of the shards for the collection to create in milvus vvectorstores. Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17.	3 months ago
Erick Friis	e395115807	docs: aws docs updates (#20571 )	3 months ago
Erick Friis	f09bd0b75b	upstage: init package (#20574 ) Co-authored-by: Sean Cho <sean@upstage.ai> Co-authored-by: JuHyung-Son <sonju0427@gmail.com>	3 months ago
Marco Perini	11c9ed3362	community[patch]: exposing headless flag parameter to AsyncChromiumLoader class (#20424 ) - Description: added the headless parameter as optional argument to the langchain_community.document_loaders AsyncChromiumLoader class - Dependencies: None - Twitter handle: @perinim_98 If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	3 months ago
Christophe Bornet	a22da4315b	community[patch]: Replace function in CassandraVectorStore with simpler lambda (#20323 )	3 months ago
Christophe Bornet	75733c5cc1	community[minor]: Improve CassandraVectorStore from_texts (#20284 )	3 months ago
Tomer Cagan	463160c3f6	community: fix `DirectoryLoader` progress bar (#19821 ) Description: currently, the `DirectoryLoader` progress-bar maximum value is based on an incorrect number of files to process In langchain_community/document_loaders/directory.py:127: ```python paths = p.rglob(self.glob) if self.recursive else p.glob(self.glob) items = [ path for path in paths if not (self.exclude and any(path.match(glob) for glob in self.exclude)) ] ``` `paths` returns both files and directories. `items` is later used to determine the maximum value of the progress-bar which gives an incorrect progress indication.	3 months ago
Pengcheng Liu	ecd19a9e58	community[patch]: Add function call support in Tongyi chat model. (#20119 ) - [ ] PR message: - Description: This pr adds function calling support in Tongyi chat model. - Issue: None - Dependencies: None - Twitter handle: None Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	3 months ago
kaijietti	80679ab906	zep[patch]: implement add_messages and aadd_messages (#20099 ) This PR implement `add_messages` and `aadd_messages` to avoid unnecessary round-trips.	3 months ago
Sevin F. Varoglu	3f156e0ece	community[minor]: add ChatOctoAI (#20059 ) This PR adds ChatOctoAI, a chat model integration for OctoAI.	3 months ago
Eun Hye Kim	b34f1086fe	community[patch]: Add streaming logic in ChatHuggingFace (#18784 ) - Add functions (_stream, _astream) - Connect to _generate and _agenerate Thank you for contributing to LangChain! - [x] PR title: "community: Add streaming logic in ChatHuggingFace" - [x] PR message: *Delete this entire checklist* and replace with - Description: Addition functions (_stream, _astream) and connection to _generate and _agenerate - Issue: #18782 - Dependencies: none - Twitter handle: @lunara_x	3 months ago
pjb157	479be3cc91	community[minor]: Unify Titan Takeoff Integrations and Adding Embedding Support (#18775 ) Community: Unify Titan Takeoff Integrations and Adding Embedding Support Description: Titan Takeoff no longer reflects this either of the integrations in the community folder. The two integrations (TitanTakeoffPro and TitanTakeoff) where causing confusion with clients, so have moved code into one place and created an alias for backwards compatibility. Added Takeoff Client python package to do the bulk of the work with the requests, this is because this package is actively updated with new versions of Takeoff. So this integration will be far more robust and will not degrade as badly over time. Issue: Fixes bugs in the old Titan integrations and unified the code with added unit test converge to avoid future problems. Dependencies: Added optional dependency takeoff-client, all imports still work without dependency including the Titan Takeoff classes but just will fail on initialisation if not pip installed takeoff-client Twitter @MeryemArik9 Thanks all :) --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	3 months ago
Rahul Triptahi	2cbfc94bcb	community[patch]: Add support for authorized identities in PebbloSafeLoader. (#20055 ) Description: Add support for authorized identities in PebbloSafeLoader. Now with this change, PebbloSafeLoader will extract authorized_identities from metadata and send it to pebblo server Dependencies: None Documentation: None Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	3 months ago
Guangdong Liu	b78ede2f96	community[patch]: standardize init args (#20166 ) Related to https://github.com/langchain-ai/langchain/issues/20085 @baskaryan	3 months ago
Guangdong Liu	3729bec1a2	community[patch]: standardize init args (#20210 ) Related to https://github.com/langchain-ai/langchain/issues/20085 @baskaryan	3 months ago
sdan	a7c5e41443	community[minor]: Added VLite as VectorStore (#20245 ) Support [VLite](https://github.com/sdan/vlite) as a new VectorStore type. Description: vlite is a simple and blazing fast vector database(vdb) made with numpy. It abstracts a lot of the functionality around using a vdb in the retrieval augmented generation(RAG) pipeline such as embeddings generation, chunking, and file processing while still giving developers the functionality to change how they're made/stored. Before submitting: Added tests [here](`c09c2ebd5c/libs/community/tests/integration_tests/vectorstores/test_vlite.py`) Added ipython notebook [here](`c09c2ebd5c/docs/docs/integrations/vectorstores/vlite.ipynb`) Added simple docs on how to use [here](`c09c2ebd5c/docs/docs/integrations/providers/vlite.mdx`) Profiles Maintainers: @sdan Twitter handles: [@sdand](https://x.com/sdand) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	3 months ago
Hyeongchan Kim	7824291252	community[patch]: Fix not to cast to str type when `file_path` is None (#20057 ) From `langchain_community 0.0.30`, there's a bug that cannot send a file-like object via `file` parameter instead of `file path` due to casting the `file_path` to str type even if `file_path` is None. which means that when I call the `partition_via_api()`, exactly one of `filename` and `file` must be specified by the following error message. however, from `langchain_community 0.0.30`, `file_path` is casted into `str` type even `file_path` is None in `get_elements_from_api()` and got an error at `exactly_one(filename=filename, file=file)`. here's an error message ``` ---> 51 exactly_one(filename=filename, file=file) 53 if metadata_filename and file_filename: 54 raise ValueError( 55 "Only one of metadata_filename and file_filename is specified. " 56 "metadata_filename is preferred. file_filename is marked for deprecation.", 57 ) File /opt/homebrew/lib/python3.11/site-packages/unstructured/partition/common.py:441, in exactly_one(**kwargs) 439 else: 440 message = f"{names[0]} must be specified." --> 441 raise ValueError(message) ValueError: Exactly one of filename and file must be specified. ``` So, I simply made a change that casting to str type when `file_path` is not None. I use `UnstructuredAPIFileLoader` like below. ``` from langchain_community.document_loaders.unstructured import UnstructuredAPIFileLoader documents: list = UnstructuredAPIFileLoader( file_path=None, file=file, # file-like object, io.BytesIO type mode='elements', url='http://127.0.0.1:8000/general/v0/general', content_type='application/pdf', metadata_filename='asdf.pdf', ).load_and_split() ```	3 months ago
MacanPN	bce69ae43d	community[patch]: Changes to base_o365 and sharepoint document loaders (#20373 ) ## Description: The PR introduces 3 changes: 1. added `recursive` property to `O365BaseLoader`. (To keep the behavior unchanged, by default is set to `False`). When `recursive=True`, `_load_from_folder()` also recursively loads all nested folders. 2. added `folder_id` to SharePointLoader.(similar to (this PR)[https://github.com/langchain-ai/langchain/pull/10780] ) This provides an alternative to `folder_path` that doesn't seem to reliably work. 3. when none of `document_ids`, `folder_id`, `folder_path` is provided, the loader fetches documets from root folder. Combined with `recursive=True` this provides an easy way of loading all compatible documents from SharePoint. The PR contains the same logic as [this stale PR](https://github.com/langchain-ai/langchain/pull/10780) by @WaleedAlfaris. I'd like to ask his blessing for moving forward with this one. ## Issue: - As described in https://github.com/langchain-ai/langchain/issues/19938 and https://github.com/langchain-ai/langchain/pull/10780 the sharepoint loader often does not seem to work with folder_path. - Recursive loading of subfolders is a missing functionality ## Dependecies: None Twitter handle: @martintriska1 @WRhetoric This is my first PR here, please be gentle :-) Please review @baskaryan --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	3 months ago
Sevin F. Varoglu	54d388d898	community[patch]: update OctoAI endpoint to subclass BaseOpenAI (#19757 ) This PR updates OctoAIEndpoint LLM to subclass BaseOpenAI as OctoAI is an OpenAI-compatible service. The documentation and tests have also been updated.	3 months ago
Benito Geordie	57b226532d	community[minor]: Added integrations for ThirdAI's NeuralDB as a Retriever (#17334 ) Description: Adds ThirdAI NeuralDB retriever integration. NeuralDB is a CPU-friendly and fine-tunable text retrieval engine. We previously added a vector store integration but we think that it will be easier for our customers if they can also find us under under langchain-community/retrievers. --------- Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com> Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>	3 months ago
WeichenXu	e9fc87aab1	community[patch]: Make ChatDatabricks model supports streaming response (#19912 ) Description: Make ChatDatabricks model supports stream Issue: N/A Dependencies: MLflow nightly build version (we will release next MLflow version soon) Twitter handle: N/A Manually test: (Before testing, please install `pip install git+https://github.com/mlflow/mlflow.git`) ```python # Test Databricks Foundation LLM model from langchain.chat_models import ChatDatabricks chat_model = ChatDatabricks( endpoint="databricks-llama-2-70b-chat", max_tokens=500 ) from langchain_core.messages import AIMessageChunk for chunk in chat_model.stream("What is mlflow?"): print(chunk.content, end="\|") ``` - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Signed-off-by: Weichen Xu <weichen.xu@databricks.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	3 months ago
Dhruv Chawla	d6d559d50d	community[minor]: add UpTrainCallbackHandler (#19956 ) - Description: This PR adds a callback handler for UpTrain. It performs evaluations in the RAG pipeline to check the quality of retrieved documents, generated queries and responses. - Dependencies: - The UpTrainCallbackHandler requires the uptrain package --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	3 months ago
Ravindu Somawansa	5acc7ba622	community[minor]: Add glue catalog loader (#20220 ) Add Glue Catalog loader	3 months ago
Martín Gotelli Ferenaz	b48add4353	community[patch]: Fix pgvector deprecated filter clause usage with OR and AND conditions (#20446 ) Description: Support filter by OR and AND for deprecated PGVector version Issue: #20445 Dependencies: N/A Twitter handle: @martinferenaz	3 months ago
Eugene Yurtsev	c50099161b	community[patch]: Use uuid4 not uuid1 (#20487 ) Using UUID1 is incorrect since it's time dependent, which makes it easy to generate the exact same uuid	3 months ago
Erick Friis	86cf1d3ee1	community: release 0.0.33 (#20490 )	3 months ago
Leonid Kuligin	676c68d318	community[patch]: deprecating remaining google_community integrations (#20471 ) Deprecating remaining google community integrations	3 months ago
balloonio	b66a4f48fa	community[patch]: Invoke callback prior to yielding token fix [DeepInfra] (#20427 ) - [x] PR title: community[patch]: Invoke callback prior to yielding token fix for [DeepInfra] - [x] PR message: - Description: Invoke callback prior to yielding token in stream method in [DeepInfra] - Issue: https://github.com/langchain-ai/langchain/issues/16913 - Dependencies: None - Twitter handle: @bolun_zhang If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17.	3 months ago
Juan Carlos José Camacho	450c458f8f	community[minor]: Add Datahareld tool (#19680 ) Description: Integrate [dataherald](https://www.dataherald.com) tool, It is a natural language-to-SQL tool. Dependencies: Install dataherald sdk to use it, ``` pip install dataherald ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Christophe Bornet <cbornet@hotmail.com>	3 months ago
Egor Krasheninnikov	c8391d4ff1	community[patch]: Fix YandexGPT embeddings (#19720 ) Fix of YandexGPT embeddings. The current version uses a single `model_name` for queries and documents, essentially making the `embed_documents` and `embed_query` methods the same. Yandex has a different endpoint (`model_uri`) for encoding documents, see [this](https://yandex.cloud/en/docs/yandexgpt/concepts/embeddings). The bug may impact retrievers built with `YandexGPTEmbeddings` (for instance FAISS database as retriever) since they use both `embed_documents` and `embed_query`. A simple snippet to test the behaviour: ```python from langchain_community.embeddings.yandex import YandexGPTEmbeddings embeddings = YandexGPTEmbeddings() q_emb = embeddings.embed_query('hello world') doc_emb = embeddings.embed_documents(['hello world', 'hello world']) q_emb == doc_emb[0] ``` The response is `True` with the current version and `False` with the changes I made. Twitter: @egor_krash --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	3 months ago
Guangdong Liu	4be7ca7b4c	community[patch]:sparkllm standardize init args (#20194 ) Related to https://github.com/langchain-ai/langchain/issues/20085 @baskaryan	3 months ago
Yuki Oshima	0758da8940	community[patch]: Set default value for _ListSQLDatabaseToolInput tool_input (#20409 ) Description: `_ListSQLDatabaseToolInput` raise error if model returns `{}`. For example, gpt-4-turbo returns `{}` with SQL Agent initialized by `create_sql_agent`. So, I set default value `""` for `_ListSQLDatabaseToolInput` tool_input. This is actually a gpt-4-turbo issue, not a LangChain issue, but I thought it would be helpful to set a default value `""`. This problem is discussed in detail in the following Issue. Issue: https://github.com/langchain-ai/langchain/issues/20405 Dependencies: none Sorry, I did not add or change the test code, as tests for this components was not exist . However, I have tested the following code based on the [SQL Agent Document](https://python.langchain.com/docs/use_cases/sql/agents/), to make sure it works. ``` from langchain_community.agent_toolkits.sql.base import create_sql_agent from langchain_community.utilities.sql_database import SQLDatabase from langchain_openai import ChatOpenAI db = SQLDatabase.from_uri("sqlite:///Chinook.db") llm = ChatOpenAI(model="gpt-4-turbo", temperature=0) agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True) result = agent_executor.invoke("List the total sales per country. Which country's customers spent the most?") print(result["output"]) ```	3 months ago
ccurme	38faa74c23	community[patch]: update use of deprecated llm methods (#20393 ) .predict and .predict_messages for BaseLanguageModel and BaseChatModel	3 months ago
Corey Zumar	3a068b26f3	community[patch]: Databricks - fix scope of dangerous deserialization error in Databricks LLM connector (#20368 ) fix scope of dangerous deserialization error in Databricks LLM connector --------- Signed-off-by: dbczumar <corey.zumar@databricks.com>	3 months ago
balloonio	e7b1a44c5b	community[patch]: Invoke callback prior to yielding token fix for Llamafile (#20365 ) - [x] PR title: community[patch]: Invoke callback prior to yielding token fix for Llamafile - [x] PR message: - Description: Invoke callback prior to yielding token in stream method in community llamafile.py - Issue: https://github.com/langchain-ai/langchain/issues/16913 - Dependencies: None - Twitter handle: @bolun_zhang If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17.	3 months ago
balloonio	93caa568f9	community[patch]: Invoke callback prior to yielding token fix for HuggingFaceEndpoint (#20366 ) - [x] PR title: community[patch]: Invoke callback prior to yielding token fix for HuggingFaceEndpoint - [x] PR message: - Description: Invoke callback prior to yielding token in stream method in community HuggingFaceEndpoint - Issue: https://github.com/langchain-ai/langchain/issues/16913 - Dependencies: None - Twitter handle: @bolun_zhang If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	3 months ago
Nicolas	ad04585e30	community[minor]: Firecrawl.dev integration (#20364 ) Added the [FireCrawl](https://firecrawl.dev) document loader. Firecrawl crawls and convert any website into LLM-ready data. It crawls all accessible subpages and give you clean markdown for each. - Description: Adds FireCrawl data loader - Dependencies: firecrawl-py - Twitter handle: @mendableai ccing contributors: (@ericciarla @nickscamara) --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	3 months ago
P. Taylor Goetz	9317df7f16	community[patch]: Add "model" attribute to the payload sent to Ollama in `ChatOllama` (#20354 ) Example Ollama API calls: Request without "model": ``` curl --location 'http://localhost:11434/api/chat' \ --header 'Content-Type: application/json' \ --data '{ "messages": [ { "role": "user", "content": "What is the capitol of PA?" } ], "stream": false }' ``` Response: ``` {"error":"model is required"} ``` Request with "model": ``` curl --location 'http://localhost:11434/api/chat' \ --header 'Content-Type: application/json' \ --data '{ "model": "openchat", "messages": [ { "role": "user", "content": "What is the capitol of PA?" } ], "stream": false }' ``` Response: ``` { "eval_duration" : 733248000, "created_at" : "2024-04-11T23:04:08.735766843Z", "model" : "openchat", "message" : { "content" : " The capital city of Pennsylvania is Harrisburg.", "role" : "assistant" }, "total_duration" : 3138731168, "prompt_eval_count" : 25, "load_duration" : 466562959, "done" : true, "prompt_eval_duration" : 1938495000, "eval_count" : 10 } ```	3 months ago
Alex Sherstinsky	fad0962643	community: for Predibase -- enable both Predibase-hosted and HuggingFace-hosted fine-tuned adapter repositories (#20370 )	3 months ago
Isak Nyberg	bac9fb9a7c	community: add gpt-4 pricing in callback (#20292 ) Added the pricing for `gpt-4-turbo` and `gpt-4-turbo-2024-04-09` in the callback method. related to issue #17173 https://openai.com/pricing#language-models	3 months ago
Leonid Ganeline	7cf2d2759d	community[patch]: docstrings update (#20301 ) Added missed docstrings. Format docstings to the consistent form.	3 months ago
Eugene Yurtsev	22fd844e8a	community[patch]: Add deprecation warnings to postgres implementation (#20222 ) Add deprecation warnings to postgres implementation that are in langchain-postgres.	3 months ago
Leonid Ganeline	4cb5f4c353	community[patch]: import flattening fix (#20110 ) This PR should make it easier for linters to do type checking and for IDEs to jump to definition of code. See #20050 as a template for this PR. - As a byproduct: Added 3 missed `test_imports`. - Added missed `SolarChat` in to __init___.py Added it into test_import ut. - Added `# type: ignore` to fix linting. It is not clear, why linting errors appear after ^ changes. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	3 months ago

1 2 3 4 5 ...

944 Commits (c9e9470c5a3a12e2fe7fd14164dd5a26aec37fb0)