langchain

Commit Graph

Author	SHA1	Message	Date
Ankush Gola	a9246333fd	fix anthropic chat model mutating input list (#6457 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes: ChatAnthropic was mutating the input message list during formatting which isn't ideal bc you could be changing the behavior for other chat models when using the same input #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested:	1 year ago
Zander Chase	00f276d23f	Run eval in eval mode (#6447 ) For the `run_on_dataset` sessions	1 year ago
David Duong	be9371ca8f	Include placeholder value for all secrets, not just kwargs (#6421 ) Mirror PR for https://github.com/hwchase17/langchainjs/pull/1696 Secrets passed via environment variables should be present in the serialised chain	1 year ago
Fei Wang	50556f3b35	support memory for functions (#6165 ) #### Before submitting Add memory support for `OpenAIFunctionsAgent` like `StructuredChatAgent`. #### Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
hp0404	6aa7b04f79	Fix integration tests for Faiss vector store (#6281 ) Fixes #5807 (issue) #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	1 year ago
Tomaz Bratanic	b3bccabc66	Add option to save/load graph cypher QA (#6219 ) Similar as https://github.com/hwchase17/langchain/pull/5818 Added the functionality to save/load Graph Cypher QA Chain due to a user reporting the following error > raise NotImplementedError("Saving not supported for this chain type.")\nNotImplementedError: Saving not supported for this chain type.\n'	1 year ago
Leonid Ganeline	c7ca350cd3	Fix class promotion (#6187 ) In LangChain, all module classes are enumerated in the `__init__.py` file of the correspondent module. But some classes were missed and were not included in the module `__init__.py` This PR: - added the missed classes to the module `__init__.py` files - `__init__.py:__all_` variable value (a list of the class names) was sorted - `langchain.tools.sql_database.tool.QueryCheckerTool` was renamed into the `QuerySQLCheckerTool` because it conflicted with `langchain.tools.spark_sql.tool.QueryCheckerTool` - changes to `pyproject.toml`: - added `pgvector` to `pyproject.toml:extended_testing` - added `pandas` to `pyproject.toml:[tool.poetry.group.test.dependencies]` - commented out the `streamlit` from `collbacks/__init__.py`, It is because now the `streamlit` requires Python >=3.7, !=3.9.7 - fixed duplicate names in `tools` - fixed correspondent ut-s #### Who can review? @hwchase17 @dev2049	1 year ago
Harrison Chase	9bf5b0defa	Harrison/myscale self query (#6376 ) Co-authored-by: Fangrui Liu <fangruil@moqi.ai> Co-authored-by: 刘方瑞 <fangrui.liu@outlook.com> Co-authored-by: Fangrui.Liu <fangrui.liu@ubc.ca>	1 year ago
Harrison Chase	42a28ac1ba	Harrison/error zero tools (#6340 ) Co-authored-by: Juhee Kim <46583939+juppytt@users.noreply.github.com>	1 year ago
Slawomir Gonet	eef62bf4e9	qdrant: search by vector (#6043 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Added support to `search_by_vector` to Qdrant Vector store. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ### Who can review VectorStores / Retrievers / Memory - @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 -->	1 year ago
Richy Wang	444ca3f669	Improve AnalyticDB Vector Store implementation without affecting user (#6086 ) Hi there: As I implement the AnalyticDB VectorStore use two table to store the document before. It seems just use one table is a better way. So this commit is try to improve AnalyticDB VectorStore implementation without affecting user behavior: 1. Streamline the `post_init `behavior by creating a single table with vector indexing. 2. Update the `add_texts` API for document insertion. 3. Optimize `similarity_search_with_score_by_vector` to retrieve results directly from the table. 4. Implement `_similarity_search_with_relevance_scores`. 5. Add `embedding_dimension` parameter to support different dimension embedding functions. Users can continue using the API as before. Test cases added before is enough to meet this commit.	1 year ago
Saba Sturua	427551eabf	DocArray as a Retriever (#6031 ) ## DocArray as a Retriever [DocArray](https://github.com/docarray/docarray) is an open-source tool for managing your multi-modal data. It offers flexibility to store and search through your data using various document index backends. This PR introduces `DocArrayRetriever` - which works with any available backend and serves as a retriever for Langchain apps. Also, I added 2 notebooks: DocArray Backends - intro to all 5 currently supported backends, how to initialize, index, and use them as a retriever DocArray Usage - showcasing what additional search parameters you can pass to create versatile retrievers Example: ```python from docarray.index import InMemoryExactNNIndex from docarray import BaseDoc, DocList from docarray.typing import NdArray from langchain.embeddings.openai import OpenAIEmbeddings from langchain.retrievers import DocArrayRetriever # define document schema class MyDoc(BaseDoc): description: str description_embedding: NdArray[1536] embeddings = OpenAIEmbeddings() # create documents descriptions = ["description 1", "description 2"] desc_embeddings = embeddings.embed_documents(texts=descriptions) docs = DocList[MyDoc]( [ MyDoc(description=desc, description_embedding=embedding) for desc, embedding in zip(descriptions, desc_embeddings) ] ) # initialize document index with data db = InMemoryExactNNIndex[MyDoc](docs) # create a retriever retriever = DocArrayRetriever( index=db, embeddings=embeddings, search_field="description_embedding", content_field="description", ) # find the relevant document doc = retriever.get_relevant_documents("action movies") print(doc) ``` #### Who can review? @dev2049 --------- Signed-off-by: jupyterjazz <saba.sturua@jina.ai>	1 year ago
Harrison Chase	af18413d97	Harrison/deeplake new features (#6263 ) Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Davis Chase	87e502c6bc	Doc refactor (#6300 ) Co-authored-by: jacoblee93 <jacoblee93@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
hp0404	b01cf0dd54	ArxivAPIWrapper - doc_content_chars_max (#6063 ) This PR refactors the ArxivAPIWrapper class making `doc_content_chars_max` parameter optional. Additionally, tests have been added to ensure the functionality of the doc_content_chars_max parameter. Fixes #6027 (issue)	1 year ago
Zander Chase	ae76e473e1	Add Tags for LLMs (#6229 ) - [x] Add tracing tags to LLMs + Chat Models (both inheritable and local) - [x] Add tags for the run_on_dataset helper function(s)	1 year ago
0xJordan	c5a46e7435	feat: Add support for the Solidity language (#6054 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> ## Add Solidity programming language support for code splitter. Twitter: @0xjord4n_ <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	1 year ago
Zander Chase	b3b155d488	Return session name in runner response (#6112 ) Makes it easier to then run evals w/o thinking about specifying a session	1 year ago
Nuno Campos	11ab0be11a	Add support for tags (#5898 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	1 year ago
Harrison Chase	970b2f9d38	convert tools to openai (#6100 )	1 year ago
Lance Martin	ee3d0513ad	Add tests and update notebook for MarkdownHeaderTextSplitter (#6069 ) Add test and update notebook for `MarkdownHeaderTextSplitter`.	1 year ago
Zander Chase	0c52275bdb	Use Run object from SDK (#6067 ) Update the Run object in the tracer to extend that in the SDK to include the parameters necessary for tracking/tracing	1 year ago
Harrison Chase	ec1a2adf9c	improve tools (#6062 )	1 year ago
Julius Lipp	5b6bbf4ab2	Add embaas document extraction api endpoints (#6048 ) # Introduces embaas document extraction api endpoints In this PR, we add support for embaas document extraction endpoints to Text Embedding Models (with LLMs, in different PRs coming). We currently offer the MTEB leaderboard top performers, will continue to add top embedding models and soon add support for customers to deploy thier own models. Additional Documentation + Infomation can be found [here](https://embaas.io). While developing this integration, I closely followed the patterns established by other langchain integrations. Nonetheless, if there are any aspects that require adjustments or if there's a better way to present a new integration, let me know! :) Additionally, I fixed some docs in the embeddings integration. Related PR: #5976 #### Who can review? DataLoaders - @eyurtsev	1 year ago
Jens Madsen	2c91f0d750	chore: spedd up integration test by using smaller model (#6044 ) Adds a new parameter `relative_chunk_overlap` for the `SentenceTransformersTokenTextSplitter` constructor. The parameter sets the chunk overlap using a relative factor, e.g. for a model where the token limit is 100, a `relative_chunk_overlap=0.5` implies that `chunk_overlap=50` Tag maintainers/contributors who might be interested: @hwchase17, @dev2049	1 year ago
Harrison Chase	d1561b74eb	Harrison/cognitive search (#6011 ) Co-authored-by: Fabrizio Ruocco <ruoccofabrizio@gmail.com>	1 year ago
wenmeng zhou	bb7ac9edb5	add dashscope text embedding (#5929 ) #### What I do Adding embedding api for [DashScope](https://help.aliyun.com/product/610100.html), which is the DAMO Academy's multilingual text unified vector model based on the LLM base. It caters to multiple mainstream languages worldwide and offers high-quality vector services, helping developers quickly transform text data into high-quality vector data. Currently supported languages include Chinese, English, Spanish, French, Portuguese, Indonesian, and more. #### Who can review? Models - @hwchase17 - @agola11 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Harrison Chase	e05997c25e	Harrison/hologres (#6012 ) Co-authored-by: Changgeng Zhao <changgeng@nyu.edu> Co-authored-by: Changgeng Zhao <zhaochanggeng.zcg@alibaba-inc.com>	1 year ago
Nuno Campos	18af149e91	nc/load (#5733 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Harrison Chase	a7227ee01b	Harrison/embaas (#6010 ) Co-authored-by: Julius Lipp <43986145+juliuslipp@users.noreply.github.com>	1 year ago
Akhil Vempali	d7d629911b	feat: ✨ Added filtering option to FAISS vectorstore (#5966 ) Inspired by the filtering capability available in ChromaDB, added the same functionality to the FAISS vectorestore as well. Since FAISS does not have an inbuilt method of filtering used the approach suggested in this [thread](https://github.com/facebookresearch/faiss/issues/1079) Langchain Issue inspiration: https://github.com/hwchase17/langchain/issues/4572 - [x] Added filtering capability to semantic similarly and MMR - [x] Added test cases for filtering in `tests/integration_tests/vectorstores/test_faiss.py` #### Who can review? Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049 - @hwchase17	1 year ago
Harrison Chase	704d56e241	support kwargs (#5990 )	1 year ago
Thomas B	ac3e6e3944	Fix IndexError in RecursiveCharacterTextSplitter (#5902 ) Fixes (not reported) an error that may occur in some cases in the RecursiveCharacterTextSplitter. An empty `new_separators` array ([]) would end up in the else path of the condition below and used in a function where it is expected to be non empty. ```python if new_separators is None: ... else: # _split_text() expects this array to be non-empty! other_info = self._split_text(s, new_separators) ``` resulting in an `IndexError` ```python def _split_text(self, text: str, separators: List[str]) -> List[str]: """Split incoming text and return chunks.""" final_chunks = [] # Get appropriate separator to use > separator = separators[-1] E IndexError: list index out of range langchain/text_splitter.py:425: IndexError ``` #### Who can review? @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Ofer Mendelevitch	f8cf09a230	Update to Vectara integration (#5950 ) This PR updates the Vectara integration (@hwchase17 ): * Adds reuse of requests.session to imrpove efficiency and speed. * Utilizes Vectara's low-level API (instead of standard API) to better match user's specific chunking with LangChain * Now add_texts puts all the texts into a single Vectara document so indexing is much faster. * updated variables names from alpha to lambda_val (to be consistent with Vectara docs) and added n_context_sentence so it's available to use if needed. * Updates to documentation and tests --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
qued	e4224a396b	feat: Add `UnstructuredXMLLoader` for `.xml` files (#5955 ) # Unstructured XML Loader Adds an `UnstructuredXMLLoader` class for .xml files. Works with unstructured>=0.6.7. A plain text representation of the text with the XML tags will be available under the `page_content` attribute in the doc. ### Testing ```python from langchain.document_loaders import UnstructuredXMLLoader loader = UnstructuredXMLLoader( "example_data/factbook.xml", ) docs = loader.load() ``` ## Who can review? @hwchase17 @eyurtsev	1 year ago
Harrison Chase	9218684759	Add a new vector store - AwaDB (#5971 ) (#5992 ) Added AwaDB vector store, which is a wrapper over the AwaDB, that can be used as a vector storage and has an efficient similarity search. Added integration tests for the vector store Added jupyter notebook with the example Delete a unneeded empty file and resolve the conflict(https://github.com/hwchase17/langchain/pull/5886) Please check, Thanks! @dev2049 @hwchase17 --------- <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: ljeagle <vincent_jieli@yeah.net> Co-authored-by: vincent <awadb.vincent@gmail.com>	1 year ago
Tomaz Bratanic	d5819a7ca7	Add additional parameters to Graph Cypher Chain (#5979 ) Based on the inspiration from the SQL chain, the following three parameters are added to Graph Cypher Chain. - top_k: Limited the number of results from the database to be used as context - return_direct: Return database results without transforming them to natural language - return_intermediate_steps: Return intermediate steps	1 year ago
Harrison Chase	ca1afa7213	add test for structured tools (#5989 )	1 year ago
German Martin	736a1819aa	LOTR: Lord of the Retrievers. A retriever that merge several retrievers together applying document_formatters to them. (#5798 ) "One Retriever to merge them all, One Retriever to expose them, One Retriever to bring them all and in and process them with Document formatters." Hi @dev2049! Here bothering people again! I'm using this simple idea to deal with merging the output of several retrievers into one. I'm aware of DocumentCompressorPipeline and ContextualCompressionRetriever but I don't think they allow us to do something like this. Also I was getting in trouble to get the pipeline working too. Please correct me if i'm wrong. This allow to do some sort of "retrieval" preprocessing and then using the retrieval with the curated results anywhere you could use a retriever. My use case is to generate diff indexes with diff embeddings and sources for a more colorful results then filtering them with one or many document formatters. I saw some people looking for something like this, here: https://github.com/hwchase17/langchain/issues/3991 and something similar here: https://github.com/hwchase17/langchain/issues/5555 This is just a proposal I know I'm missing tests , etc. If you think this is a worth it idea I can work on tests and anything you want to change. Let me know! --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
felpigeon	2791a753bf	Add start index to metadata in TextSplitter (#5912 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> #### Add start index to metadata in TextSplitter - Modified method `create_documents` to track start position of each chunk - The `start_index` is included in the metadata if the `add_start_index` parameter in the class constructor is set to `True` This enables referencing back to the original document, particularly useful when a specific chunk is retrieved. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @eyurtsev @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	1 year ago
Philip Kiely - Baseten	a09a0e3511	Baseten integration (#5862 ) This PR adds a Baseten integration. I've done my best to follow the contributor's guidelines and add docs, an example notebook, and an integration test modeled after similar integrations' test. Please let me know if there is anything I can do to improve the PR. When it is merged, please tag https://twitter.com/basetenco and https://twitter.com/philip_kiely as contributors (the note on the PR template said to include Twitter accounts)	1 year ago
Harrison Chase	35cfd25db3	Harrison/nebula graph (#5865 ) Co-authored-by: Wey Gu <weyl.gu@gmail.com> Co-authored-by: chenweisomebody <chenweisomebody@gmail.com>	1 year ago
Harrison Chase	658f8bdee7	Harrison/fauna loader (#5864 ) Co-authored-by: Shadid12 <Shadid12@users.noreply.github.com>	1 year ago
Liang Zhang	5518f24ec3	Implement saving and loading of RetrievalQA chain (#5818 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes #3983 Mimicing what we do for saving and loading VectorDBQA chain, I added the logic for RetrievalQA chain. Also added a unit test. I did not find how we test other chains for their saving and loading functionality, so I just added a file with one test case. Let me know if there are recommended ways to test it. #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
volodymyr-memsql	a1549901ce	Added SingleStoreDB Vector Store (#5619 ) - Added `SingleStoreDB` vector store, which is a wrapper over the SingleStore DB database, that can be used as a vector storage and has an efficient similarity search. - Added integration tests for the vector store - Added jupyter notebook with the example @dev2049 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
bnassivet	9355e3f5f5	qdrant vector store - search with relevancy scores (#5781 ) Implementation of similarity_search_with_relevance_scores for quadrant vector store. As implemented the method is also compatible with other capacities such as filtering. Integration tests updated. #### Who can review? Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049	1 year ago
Matt Robinson	11fec7d4d1	feat: Add `UnstructuredCSVLoader` for CSV files (#5844 ) ### Summary Adds an `UnstructuredCSVLoader` for loading CSVs. One advantage of using `UnstructuredCSVLoader` relative to the standard `CSVLoader` is that if you use `UnstructuredCSVLoader` in `"elements"` mode, an HTML representation of the table will be available in the metadata. #### Who can review? @hwchase17 @eyurtsev	1 year ago
Yessen Kanapin	c66755b661	Add DeepInfra embeddings integration with tests and examples, better exception handling for Deep Infra LLM (#5854 ) #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 - project lead - @agola11 --------- Co-authored-by: Yessen Kanapin <yessen@deepinfra.com>	1 year ago
bnassivet	062c3c00a2	fixed faiss integ tests (#5808 ) Fixes # 5807 Realigned tests with implementation. Also reinforced folder unicity for the test_faiss_local_save_load test using date-time suffix #### Before submitting - Integration test updated - formatting and linting ok (locally) #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 - project lead VectorStores / Retrievers / Memory -@dev2049	1 year ago
Lance Martin	4092fd21dc	YoutubeAudioLoader and updates to OpenAIWhisperParser (#5772 ) This introduces the `YoutubeAudioLoader`, which will load blobs from a YouTube url and write them. Blobs are then parsed by `OpenAIWhisperParser()`, as show in this [PR](https://github.com/hwchase17/langchain/pull/5580), but we extend the parser to split audio such that each chuck meets the 25MB OpenAI size limit. As shown in the notebook, this enables a very simple UX: ``` # Transcribe the video to text loader = GenericLoader(YoutubeAudioLoader([url],save_dir),OpenAIWhisperParser()) docs = loader.load() ``` Tested on full set of Karpathy lecture videos: ``` # Karpathy lecture videos urls = ["https://youtu.be/VMj-3S1tku0" "https://youtu.be/PaCmpygFfXo", "https://youtu.be/TCH_1BHY58I", "https://youtu.be/P6sfmUTpUmc", "https://youtu.be/q8SA3rM6ckI", "https://youtu.be/t3YJ5hKiMQ0", "https://youtu.be/kCc8FmEb1nY"] # Directory to save audio files save_dir = "~/Downloads/YouTube" # Transcribe the videos to text loader = GenericLoader(YoutubeAudioLoader(urls,save_dir),OpenAIWhisperParser()) docs = loader.load() ```	1 year ago
Ankush Gola	b177a29d3f	support returning run info for llms, chat models and chains (#5666 ) returning the run id is important for accessing the run later on	1 year ago
Yoann Poupart	65111eb2b3	Attribute support for html tags (#5782 ) # What does this PR do? Change the HTML tags so that a tag with attributes can be found. ## Before submitting - [x] Tests added - [x] CI/CD validated ### Who can review? Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.	1 year ago
Zander Chase	204a73c1d9	Use client from LCP-SDK (#5695 ) - Remove the client implementation (this breaks backwards compatibility for existing testers. I could keep the stub in that file if we want, but not many people are using it yet - Add SDK as dependency - Update the 'run_on_dataset' method to be a function that optionally accepts a client as an argument - Remove the langchain plus server implementation (you get it for free with the SDK now) We could make the SDK optional for now, but the plan is to use w/in the tracer so it would likely become a hard dependency at some point.	1 year ago
Ankush Gola	84a46753ab	Tracing Group (#5326 ) Add context manager to group all runs under a virtual parent --------- Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>	1 year ago
Ilya	d5b1608216	fix markdown text splitter horizontal lines (#5625 ) Fixes #5614 #### Issue The `**` combination produces an exception when used as a seperator in `re.split`. Instead `\\\` should be used for regex exprations. #### Who can review? @eyurtsev	1 year ago
M Waleed Kadous	5124c1e0d9	Add aviary support (#5661 ) Aviary is an open source toolkit for evaluating and deploying open source LLMs. You can find out more about it on [http://github.com/ray-project/aviary). You can try it out at [http://aviary.anyscale.com](aviary.anyscale.com). This code adds support for Aviary in LangChain. To minimize dependencies, it connects directly to the HTTP endpoint. The current implementation is not accelerated and uses the default implementation of `predict` and `generate`. It includes a test and a simple example. @hwchase17 and @agola11 could you have a look at this? --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Lance Martin	aea090045b	Create OpenAIWhisperParser for generating Documents from audio files (#5580 ) # OpenAIWhisperParser This PR creates a new parser, `OpenAIWhisperParser`, that uses the [OpenAI Whisper model](https://platform.openai.com/docs/guides/speech-to-text/quickstart) to perform transcription of audio files to text (`Documents`). Please see the notebook for usage.	1 year ago
Hao Chen	a4c9053d40	Integrate Clickhouse as Vector Store (#5650 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> #### Description This PR is mainly to integrate open source version of ClickHouse as Vector Store as it is easy for both local development and adoption of LangChain for enterprises who already have large scale clickhouse deployment. ClickHouse is a open source real-time OLAP database with full SQL support and a wide range of functions to assist users in writing analytical queries. Some of these functions and data structures perform distance operations between vectors, [enabling ClickHouse to be used as a vector database](https://clickhouse.com/blog/vector-search-clickhouse-p1). Recently added ClickHouse capabilities like [Approximate Nearest Neighbour (ANN) indices](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/annindexes) support faster approximate matching of vectors and provide a promising development aimed to further enhance the vector matching capabilities of ClickHouse. In LangChain, some ClickHouse based commercial variant vector stores like [Chroma](https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/chroma.py) and [MyScale](https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/myscale.py), etc are already integrated, but for some enterprises with large scale Clickhouse clusters deployment, it will be more straightforward to upgrade existing clickhouse infra instead of moving to another similar vector store solution, so we believe it's a valid requirement to integrate open source version of ClickHouse as vector store. As `clickhouse-connect` is already included by other integrations, this PR won't include any new dependencies. #### Before submitting <!-- If you're adding a new integration, please include: 1. Added a test for the integration: https://github.com/haoch/langchain/blob/clickhouse/tests/integration_tests/vectorstores/test_clickhouse.py 2. Added an example notebook and document showing its use: * Notebook: https://github.com/haoch/langchain/blob/clickhouse/docs/modules/indexes/vectorstores/examples/clickhouse.ipynb * Doc: https://github.com/haoch/langchain/blob/clickhouse/docs/integrations/clickhouse.md See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> 1. Added a test for the integration: https://github.com/haoch/langchain/blob/clickhouse/tests/integration_tests/vectorstores/test_clickhouse.py 2. Added an example notebook and document showing its use: * Notebook: https://github.com/haoch/langchain/blob/clickhouse/docs/modules/indexes/vectorstores/examples/clickhouse.ipynb * Doc: https://github.com/haoch/langchain/blob/clickhouse/docs/integrations/clickhouse.md #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> @hwchase17 @dev2049 Could you please help review? --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Daniel Chalef	0551bc90a5	Zep Hybrid Search (#5742 ) Zep now supports persisting custom metadata with messages and hybrid search across both message embeddings and structured metadata. This PR implements custom metadata and enhancements to the `ZepChatMessageHistory` and `ZepRetriever` classes to implement this support. Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049 --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	1 year ago
Tomaz Bratanic	a0ea6f6b6b	Cypher search: Check if generated Cypher is provided in backticks (#5541 ) # Check if generated Cypher code is wrapped in backticks Some LLMs like the VertexAI like to explain how they generated the Cypher statement and wrap the actual code in three backticks: ![Screenshot from 2023-06-01 08-08-23](https://github.com/hwchase17/langchain/assets/19948365/1d8eecb3-d26c-4882-8f5b-6a9bc7e93690) I have observed a similar pattern with OpenAI chat models in a conversational settings, where multiple user and assistant message are provided to the LLM to generate Cypher statements, where then the LLM wants to maybe apologize for previous steps or explain its thoughts. Interestingly, both OpenAI and VertexAI wrap the code in three backticks if they are doing any explaining or apologizing. Checking if the generated cypher is wrapped in backticks seems like a low-hanging fruit to expand the cypher search to other LLMs and conversational settings.	1 year ago
mheguy-stingray	b64c39dfe7	top_k and top_p transposed in vertexai (#5673 ) Fix transposed properties in vertexai model Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Jens Madsen	8d9e9e013c	refactor: extract token text splitter function (#5179 ) # Token text splitter for sentence transformers The current TokenTextSplitter only works with OpenAi models via the `tiktoken` package. This is not clear from the name `TokenTextSplitter`. In this (first PR) a token based text splitter for sentence transformer models is added. In the future I think we should work towards injecting a tokenizer into the TokenTextSplitter to make ti more flexible. Could perhaps be reviewed by @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Nathan Azrak	26ec845921	Raise an exception in MKRL and Chat Output Parsers if parsing text which contains both an action and a final answer (#5609 ) Raises exception if OutputParsers receive a response with both a valid action and a final answer Currently, if an OutputParser receives a response which includes both an action and a final answer, they return a FinalAnswer object. This allows the parser to accept responses which propose an action and hallucinate an answer without the action being parsed or taken by the agent. This PR changes the logic to: 1. store a variable checking whether a response contains the `FINAL_ANSWER_ACTION` (this is the easier condition to check). 2. store a variable checking whether the response contains a valid action 3. if both are present, raise a new exception stating that both are present 4. if an action is present, return an AgentAction 5. if an answer is present, return an AgentAnswer 6. if neither is present, raise the relevant exception based around the action format (these have been kept consistent with the prior exception messages) Disclaimer: * Existing mock data included strings which did include an action and an answer. This might indicate that prioritising returning AgentAnswer was always correct, and I am patching out desired behaviour? @hwchase17 to advice. Curious if there are allowed cases where this is not hallucinating, and we do want the LLM to output an action which isn't taken. * I have not passed `send_to_llm` through this new exception Fixes #5601 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 - project lead @vowelparrot	1 year ago
Harrison Chase	b9040669a0	Harrison/pipeline prompt (#5540 ) idea is to make prompts more composable	1 year ago
Paul-Emile Brotons	92f218207b	removing client+namespace in favor of collection (#5610 ) removing client+namespace in favor of collection for an easier instantiation and to be similar to the typescript library @dev2049	1 year ago
Harrison Chase	ad09367a92	Harrison/pubmed integration (#5664 ) Co-authored-by: younis basher <71520361+younis-ba@users.noreply.github.com> Co-authored-by: Younis Bashir <younis@omicmd.com>	1 year ago
Davis Chase	ae3611730a	handle single arg to and/or (#5637 ) @ryderwishart @eyurtsev thoughts on handling this in the parser itself? related to #5570	1 year ago
Matt Robinson	a97e4252e3	feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617 ) # Unstructured Excel Loader Adds an `UnstructuredExcelLoader` class for `.xlsx` and `.xls` files. Works with `unstructured>=0.6.7`. A plain text representation of the Excel file will be available under the `page_content` attribute in the doc. If you use the loader in `"elements"` mode, an HTML representation of the Excel file will be available under the `text_as_html` metadata key. Each sheet in the Excel document is its own document. ### Testing ```python from langchain.document_loaders import UnstructuredExcelLoader loader = UnstructuredExcelLoader( "example_data/stanley-cups.xlsx", mode="elements" ) docs = loader.load() ``` ## Who can review? @hwchase17 @eyurtsev	1 year ago
Zander Chase	20ec1173f4	Update Tracer Auth / Reduce Num Calls (#5517 ) Update the session creation and calls --------- Co-authored-by: Ankush Gola <ankush.gola@gmail.com>	1 year ago
Caleb Ellington	c5a7a85a4e	fix chroma update_document to embed entire documents, fixes a characer-wise embedding bug (#5584 ) # Chroma update_document full document embeddings bugfix Chroma update_document takes a single document, but treats the page_content sting of that document as a list when getting the new document embedding. This is a two-fold problem, where the resulting embedding for the updated document is incorrect (it's only an embedding of the first character in the new page_content) and it calls the embedding function for every character in the new page_content string, using many tokens in the process. Fixes #5582 Co-authored-by: Caleb Ellington <calebellington@Calebs-MBP.hsd1.ca.comcast.net>	1 year ago
Kacper Łukawski	71a7c16ee0	Fix: Qdrant ids (#5515 ) # Fix Qdrant ids creation There has been a bug in how the ids were created in the Qdrant vector store. They were previously calculated based on the texts. However, there are some scenarios in which two documents may have the same piece of text but different metadata, and that's a valid case. Deduplication should be done outside of insertion. It has been fixed and covered with the integration tests. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Waldecir Santos	db45970a66	Fix SQLAlchemy truncating text when it is too big (#5206 ) # Fixes SQLAlchemy truncating the result if you have a big/text column with many chars. SQLAlchemy truncates columns if you try to convert a Row or Sequence to a string directly For comparison: - Before: ```[('Harrison', 'That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ... (2 characters truncated) ... hat is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ')]``` - After: ```[('Harrison', 'That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ')]``` ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: I'm not sure who to tag for chains, maybe @vowelparrot ?	1 year ago
Harrison Chase	342b671d05	add brave search util (#5538 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Davis Chase	983a213bdc	add maxcompute (#5533 ) cc @pengwork (fresh branch, no creds)	1 year ago
Bharat Ramanathan	22603d19e0	feat(integrations): Add WandbTracer (#4521 ) # WandbTracer This PR adds the `WandbTracer` and deprecates the existing `WandbCallbackHandler`. Added an example notebook under the docs section alongside the `LangchainTracer` Here's an example [colab](https://colab.research.google.com/drive/1pY13ym8ENEZ8Fh7nA99ILk2GcdUQu0jR?usp=sharing) with the same notebook and the [trace](https://wandb.ai/parambharat/langchain-tracing/runs/8i45cst6) generated from the colab run Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Sheng Han Lim	3bae595182	Add texts with embeddings to PGVector wrapper (#5500 ) Similar to #1813 for faiss, this PR is to extend functionality to pass text and its vector pair to initialize and add embeddings to the PGVector wrapper. Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @dev2049	1 year ago
Zander Chase	ea09c0846f	Add Feedback Methods + Evaluation examples (#5166 ) Add CRUD methods to interact with feedback endpoints + added eval examples to the notebook	1 year ago
Harrison Chase	5ce74b5958	code splitter docs (#5480 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Kacper Łukawski	8bcaca435a	Feature: Qdrant filters supports (#5446 ) # Support Qdrant filters Qdrant has an [extensive filtering system](https://qdrant.tech/documentation/concepts/filtering/) with rich type support. This PR makes it possible to use the filters in Langchain by passing an additional param to both the `similarity_search_with_score` and `similarity_search` methods. ## Who can review? @dev2049 @hwchase17 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Ankush Gola	1671c2afb2	py tracer fixes (#5377 )	1 year ago
Blithe	e31705b5ab	convert the parameter 'text' to uppercase in the function 'parse' of the class BooleanOutputParser (#5397 ) when the LLMs output 'yes\|no'，BooleanOutputParser can parse it to 'True\|False', fix the ValueError in parse(). <!-- when use the BooleanOutputParser in the chain_filter.py, the LLMs output 'yes\|no'，the function 'parse' will throw ValueError。 --> Fixes # (issue) #5396 https://github.com/hwchase17/langchain/issues/5396 --------- Co-authored-by: gaofeng27692 <gaofeng27692@hundsun.com>	1 year ago
Harrison Chase	eab4b4ccd7	add simple test for imports (#5461 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Janos Tolgyesi	1111f18eb4	Add maximal relevance search to SKLearnVectorStore (#5430 ) # Add maximal relevance search to SKLearnVectorStore This PR implements the maximum relevance search in SKLearnVectorStore. Twitter handle: jtolgyesi (I submitted also the original implementation of SKLearnVectorStore) ## Before submitting Unit tests are included. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Kacper Łukawski	f93d256190	Feat: Add batching to Qdrant (#5443 ) # Add batching to Qdrant Several people requested a batching mechanism while uploading data to Qdrant. It is important, as there are some limits for the maximum size of the request payload, and without batching implemented in Langchain, users need to implement it on their own. This PR exposes a new optional `batch_size` parameter, so all the documents/texts are loaded in batches of the expected size (64, by default). The integration tests of Qdrant are extended to cover two cases: 1. Documents are sent in separate batches. 2. All the documents are sent in a single request.	1 year ago
Matt Wells	1d861dc37a	MRKL output parser no longer breaks well formed queries (#5432 ) # Handles the edge scenario in which the action input is a well formed SQL query which ends with a quoted column There may be a cleaner option here (or indeed other edge scenarios) but this seems to robustly determine if the action input is likely to be a well formed SQL query in which we don't want to arbitrarily trim off `"` characters Fixes #5423 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Agents / Tools / Toolkits - @vowelparrot	1 year ago
Yoann Poupart	c1807d8408	`encoding_kwargs` for InstructEmbeddings (#5450 ) # What does this PR do? Bring support of `encode_kwargs` for ` HuggingFaceInstructEmbeddings`, change the docstring example and add a test to illustrate with `normalize_embeddings`. Fixes #3605 (Similar to #3914) Use case: ```python from langchain.embeddings import HuggingFaceInstructEmbeddings model_name = "hkunlp/instructor-large" model_kwargs = {'device': 'cpu'} encode_kwargs = {'normalize_embeddings': True} hf = HuggingFaceInstructEmbeddings( model_name=model_name, model_kwargs=model_kwargs, encode_kwargs=encode_kwargs ) ```	1 year ago
ByronHsu	9d658aaa5a	Add more code splitters (go, rst, js, java, cpp, scala, ruby, php, swift, rust) (#5171 ) As the title says, I added more code splitters. The implementation is trivial, so i don't add separate tests for each splitter. Let me know if any concerns. Fixes # (issue) https://github.com/hwchase17/langchain/issues/5170 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev @hwchase17 --------- Signed-off-by: byhsu <byhsu@linkedin.com> Co-authored-by: byhsu <byhsu@linkedin.com>	1 year ago
Paul-Emile Brotons	a61b7f7e7c	adding MongoDBAtlasVectorSearch (#5338 ) # Add MongoDBAtlasVectorSearch for the python library Fixes #5337 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Harrison Chase	760632b292	Harrison/spark reader (#5405 ) Co-authored-by: Rithwik Ediga Lakhamsani <rithwik.ediga@databricks.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
UmerHA	8259f9b7fa	DocumentLoader for GitHub (#5408 ) # Creates GitHubLoader (#5257) GitHubLoader is a DocumentLoader that loads issues and PRs from GitHub. Fixes #5257 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
German Martin	0b3e0dd1d2	New Trello document loader (#4767 ) # Added New Trello loader class and documentation Simple Loader on top of py-trello wrapper. With a board name you can pull cards and to do some field parameter tweaks on load operation. I included documentation and examples. Included unit test cases using patch and a fixture for py-trello client class. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Harrison Chase	72f99ff953	Harrison/text splitter (#5417 ) adds support for keeping separators around when using recursive text splitter	1 year ago
小铭	cf5803e44c	Add ToolException that a tool can throw. (#5050 ) # Add ToolException that a tool can throw This is an optional exception that tool throws when execution error occurs. When this exception is thrown, the agent will not stop working,but will handle the exception according to the handle_tool_error variable of the tool,and the processing result will be returned to the agent as observation,and printed in pink on the console.It can be used like this: ```python from langchain.schema import ToolException from langchain import LLMMathChain, SerpAPIWrapper, OpenAI from langchain.agents import AgentType, initialize_agent from langchain.chat_models import ChatOpenAI from langchain.tools import BaseTool, StructuredTool, Tool, tool from langchain.chat_models import ChatOpenAI llm = ChatOpenAI(temperature=0) llm_math_chain = LLMMathChain(llm=llm, verbose=True) class Error_tool: def run(self, s: str): raise ToolException('The current search tool is not available.') def handle_tool_error(error) -> str: return "The following errors occurred during tool execution:"+str(error) search_tool1 = Error_tool() search_tool2 = SerpAPIWrapper() tools = [ Tool.from_function( func=search_tool1.run, name="Search_tool1", description="useful for when you need to answer questions about current events.You should give priority to using it.", handle_tool_error=handle_tool_error, ), Tool.from_function( func=search_tool2.run, name="Search_tool2", description="useful for when you need to answer questions about current events", return_direct=True, ) ] agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, handle_tool_errors=handle_tool_error) agent.run("Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?") ``` ![image](https://github.com/hwchase17/langchain/assets/32786500/51930410-b26e-4f85-a1e1-e6a6fb450ada) ## Who can review? - @vowelparrot --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Harrison Chase	2da8c48be1	Harrison/datetime parser (#4693 ) Co-authored-by: Jacob Valdez <jacobfv@msn.com> Co-authored-by: Jacob Valdez <jacob.valdez@limboid.ai> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	1 year ago
Eduard van Valkenburg	ccb6238de1	Implemented appending arbitrary messages (#5293 ) # Implemented appending arbitrary messages to the base chat message history, the in-memory and cosmos ones. <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> As discussed this is the alternative way instead of #4480, with a add_message method added that takes a BaseMessage as input, so that the user can control what is in the base message like kwargs. <!-- Remove if not applicable --> Fixes # (issue) ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Harrison Chase	d6fb25c439	Harrison/prediction guard update (#5404 ) Co-authored-by: Daniel Whitenack <whitenack.daniel@gmail.com>	1 year ago
Justin Flick	c09f8e4ddc	Add pagination for Vertex AI embeddings (#5325 ) Fixes #5316 --------- Co-authored-by: Justin Flick <jflick@homesite.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Martin Holecek	44b48d9518	Fix update_document function, add test and documentation. (#5359 ) # Fix for `update_document` Function in Chroma ## Summary This pull request addresses an issue with the `update_document` function in the Chroma class, as described in [#5031](https://github.com/hwchase17/langchain/issues/5031#issuecomment-1562577947). The issue was identified as an `AttributeError` raised when calling `update_document` due to a missing corresponding method in the `Collection` object. This fix refactors the `update_document` method in `Chroma` to correctly interact with the `Collection` object. ## Changes 1. Fixed the `update_document` method in the `Chroma` class to correctly call methods on the `Collection` object. 2. Added the corresponding test `test_chroma_update_document` in `tests/integration_tests/vectorstores/test_chroma.py` to reflect the updated method call. 3. Added an example and explanation of how to use the `update_document` function in the Jupyter notebook tutorial for Chroma. ## Test Plan All existing tests pass after this change. In addition, the `test_chroma_update_document` test case now correctly checks the functionality of `update_document`, ensuring that the function works as expected and updates the content of documents correctly. ## Reviewers @dev2049 This fix will ensure that users are able to use the `update_document` function as expected, without encountering the previous `AttributeError`. This will enhance the usability and reliability of the Chroma class for all users. Thank you for considering this pull request. I look forward to your feedback and suggestions.	1 year ago
Gael Grosch	8b7721ebbb	fix: Blob.from_data mimetype is lost (#5395 ) # Fix lost mimetype when using Blob.from_data method The mimetype is lost due to a typo in the class attribue name Fixes # - (no issue opened but I can open one if needed) ## Changes * Fixed typo in name * Added unit-tests to validate the output Blob ## Review @eyurtsev	1 year ago
Harrison Chase	6df90ad9fd	handle json parsing errors (#5371 ) adds tests cases, consolidates a lot of PRs	1 year ago

1 2 3 4 5 ...

646 Commits (64039b9f112653d686bbcdd980fa90ad2f3eb9fc)