langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-04 06:00:26 +00:00

Author	SHA1	Message	Date
Bagatur	f31047a394	bump 250 (#8632 )	2023-08-02 07:47:36 -07:00
Comendeiro	5c516945d0	Add local support for audio models (PR #7329 ) (#7591 ) - Description: run the poetry dependencies - Issue: #7329 - Tag maintainer: @rlancemartin --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-02 01:24:53 -07:00
Naveen Tatikonda	d2adec3818	[Opensearch] : Fix the service validation in http_auth (#8609 ) ### Description OpenSearch supports validation using both Master Credentials (Username and password) and IAM. For Master Credentials users will not pass the argument `service` in `http_auth` and the existing code will break. To fix this, I have updated the condition to check if service attribute is present in http_auth before accessing it. ### Maintainers @baskaryan @navneet1v Signed-off-by: Naveen Tatikonda <navtat@amazon.com>	2023-08-02 01:16:38 -07:00
Harrison Chase	7c5c0557cb	cast to string when measuring token length (#8617 )	2023-08-02 00:12:59 -07:00
rjanardhan3	68113348cc	Fireworks integration (#8322 ) Description - Integrates Fireworks within Langchain LLMs to allow users to use Fireworks models with Langchain, mainly for summarization. Issue - Not applicable Dependencies - None Tag maintainer - @rlancemartin --------- Co-authored-by: Raj Janardhan <rajjanardhan@Rajs-Laptop.attlocal.net>	2023-08-01 21:17:26 -07:00
Bagatur	b574507c51	normalized openai embeddings embed_query (#8604 ) we weren't normalizing when embedding queries	2023-08-01 17:12:10 -07:00
Neil Murphy	31820a31e4	Add firestore_client param to FirestoreChatMessageHistory if caller already has one; also lets them specify GCP project, etc. (#8601 ) Existing implementation requires that you install `firebase-admin` package, and prevents you from using an existing Firestore client instance if available. This adds optional `firestore_client` param to `FirestoreChatMessageHistory`, so users can just use their existing client/settings. If not passed, existing logic executes to initialize a `firestore_client`. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-01 15:42:13 -07:00
Naveen Tatikonda	13ccf202de	[OpenSearch] : Fix AOSS Initialization (#8600 ) ### Description This PR fixes the AOSS Initialization in Opensearch. ### Maintainers @rlancemartin, @eyurtsev, @navneet1v Signed-off-by: Naveen Tatikonda <navtat@amazon.com>	2023-08-01 15:33:51 -07:00
Joshua Carroll	6705928b9d	Add StreamlitChatMessageHistory (#8497 ) Add a StreamlitChatMessageHistory class that stores chat messages in [Streamlit's Session State](https://docs.streamlit.io/library/api-reference/session-state). Note: The integration test uses a currently-experimental Streamlit testing framework to simulate the execution of a Streamlit app. Marking this PR as draft until I confirm with the Streamlit team that we're comfortable supporting it. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-01 14:28:15 -07:00
Matt Robinson	8961c720b8	docs: update `unstructured` install instructions (#8596 ) ### Summary Updates the `unstructured` install instructions. For `unstructured>=0.9.0`, dependencies are broken out by document type and the base `unstructured` package includes fewer dependencies. `pip install "unstructured[local-inference]"` has been replaced by `pip install "unstructured[all-docs]"`, though the `local-inference` extra is still supported for the time being. ### Reviewers - @rlancemartin - @eyurtsev - @hwchase17	2023-08-01 14:17:49 -07:00
Bagatur	73072d3db8	mv (#8595 )	2023-08-01 14:17:04 -07:00
brettdbrewer	2de028834f	updated to use new llm_util query (#8591 ) - Description: added memgraph_graph.py which defines the MemgraphGraph class, subclassing off the existing Neo4jGraph class. This lets you query the Memgraph graph database using natural language. It leverages the Neo4j drivers and the bolt protocol. - Dependencies: since it is a subclass off of Neo4jGraph, it is dependent on it and the GraphCypherQA Chain implementations. It is dependent on the Neo4j drivers being present. It is dependent on having a running Memgraph instance to connect to. - Tag maintainer: @baskaryan - Twitter handle: @villageideate - example usage can be seen in this repo https://github.com/brettdbrewer/MemgraphGraph/ --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-01 14:16:15 -07:00
Tesfagabir Meharizghi	a7000ee89e	Callback handler for Amazon SageMaker Experiments (#8587 ) ## Description This PR implements a callback handler for SageMaker Experiments which is similar to the one for MLflow. * When creating the callback handler, it takes the experiment's run object as an argument. All the callback outputs are then logged to the run object. * The output of each callback action (e.g., `on_llm_start`) is saved to an S3 bucket as a JSON file. * Optionally, you can also log additional information such as the LLM hyper-parameters to the same run object. * Once the callback object is no longer needed, call the `flush_tracker()` method. This makes sure that any intermediate files are deleted. * A separate notebook example is provided to show how the callback is used. @3coins @agola11 --------- Co-authored-by: Tesfagabir Meharizghi <mehariz@amazon.com>	2023-08-01 13:47:08 -07:00
Harrison Chase	9c2b29a1cb	Harrison/loader bug (#8559 ) Co-authored-by: ddroghini <d.droghini@mflgroup.com> Co-authored-by: Buckler89 <Droghini.diego@gmail.com>	2023-08-01 13:31:49 -07:00
Kristelle Widjaja	f190bc3e83	Bug fix: feature/issue-7804-chroma-client_settings-bug (#8267 ) Description: Made Chroma constructor more robust when client_settings is provided. Otherwise, existing embeddings will not be loaded correctly from Chroma. Issue: #7804 Dependencies: None Tag maintainer: @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-01 13:31:35 -07:00
mpb159753	7df2dfc4c2	Add Support for Loading Documents from Huawei OBS (#8573 ) Description: This PR adds support for loading documents from Huawei OBS (Object Storage Service) in Langchain. OBS is a cloud-based object storage service provided by Huawei Cloud. With this enhancement, Langchain users can now easily access and load documents stored in Huawei OBS directly into the system. Key Changes: - Added a new document loader module specifically for Huawei OBS integration. - Implemented the necessary logic to authenticate and connect to Huawei OBS using access credentials. - Enabled the loading of individual documents from a specified bucket and object key in Huawei OBS. - Provided the option to specify custom authentication information or obtain security tokens from Huawei Cloud ECS for easy access. How to Test: 1. Ensure the required package "esdk-obs-python" is installed. 2. Configure the endpoint, access key, secret key, and bucket details for Huawei OBS in the Langchain settings. 3. Load documents from Huawei OBS using the updated document loader module. 4. Verify that documents are successfully retrieved and loaded into Langchain for further processing. Please review this PR and let us know if any further improvements are needed. Your feedback is highly appreciated! @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-01 09:30:30 -07:00
Leonid Ganeline	ed9a0f8185	Docstrings: Module descriptions (#8262 ) Added/changed the module descriptions (the first-line docstrings in the `__init__` files). Added class hierarchy info. @baskaryan	2023-08-01 09:12:32 -07:00
shibuiwilliam	465faab935	fix apparent spelling inconsistencies (#8574 ) Use ImportErrors where appropriate	2023-08-01 09:09:09 -07:00
Nuno Campos	0ec020698f	Add new run types for Runnables (#8488 ) - allow overriding run_type in on_chain_start	2023-08-01 12:56:40 +01:00
Bagatur	bd2e298468	bump 249 (#8571 )	2023-08-01 01:20:16 -07:00
Harrison Chase	66226d1d4d	add example for memory (#8552 )	2023-08-01 01:10:19 -07:00
William FH	e83250cc5f	Rm RunTypeEnum (#8553 ) We already support raw strings in the SDK but would like to deprecate client-side validation of run types. This removes its usage	2023-08-01 07:32:07 +01:00
Jacob Lee	2a26cc6d2b	Fix combining runnable sequences (#8557 ) Combining runnable sequences was dropping a step in the middle. @nfcampos @baskaryan	2023-07-31 18:17:46 -07:00
Mohamad Zamini	3fbb737bb3	Update combined.py (#7541 ) from my understanding, the `check_repeated_memory_variable` validator will raise an error if any of the variables in the `memories` list are repeated. However, the `load_memory_variables` method does not check for repeated variables. This means that it is possible for the `CombinedMemory` instance to return a dictionary of memory variables that contains duplicate values. This code will check for repeated variables in the `data` dictionary returned by the `load_memory_variables` method of each sub-memory. If a repeated variable is found, an error will be raised. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-31 18:15:00 -07:00
Shantanu Nair	53f3793504	Fast load conversationsummarymemory from existing summary (#7533 ) - Description: Adds an optional buffer arg to the memory's from_messages() method. If provided the existing memory will be loaded instead of regenerating a summary from the loaded messages. Why? If we have past messages to load from, it is likely we also have an existing summary. This is particularly helpful in cases where the chat is ephemeral and/or is backed by serverless where the chat history is not stored but where the updated chat history is passed back and forth between a backend/frontend. Eg: Take a stateless qa backend implementation that loads messages on every request and generates a response — without this addition, each time the messages are loaded via from_messages, the summaries are recomputed even though they may have just been computed during the previous response. With this, the previously computed summary can be passed in and avoid: 1) spending extra $$$ on tokens, and 2) increased response time by avoiding regenerating previously generated summary. Tag maintainer: @hwchase17 Twitter handle: https://twitter.com/ShantanuNair --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-31 18:14:11 -07:00
DJ Atha	ec40ead980	Fixed bug7445 where a duplicate result_id is added to the vectorstore. (#7573 ) - Description: updated BabyAGI examples to append the iteration to the result id to fix error storing data to vectorstore. - Issue: 7445 - Dependencies: no - Tag maintainer: @eyurtsev This fix worked for me locally. Happy to take some feedback and iterate on a better solution. I was considering appending a uuid instead but didn't want to overcomplicate the example.	2023-07-31 18:00:01 -07:00
yangdihang	ff5024634e	fix: openapi controller prompt, when bot is unable to resolve an api … (#7525 ) …call, it needs retry Co-authored-by: yangdihang <yangdihang@bytedance.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-31 17:56:43 -07:00
Kenny	1e8fca5518	Add ConcurrentLoader (#7512 ) Works just like the GenericLoader but concurrently for those who choose to optimize their workflow. @rlancemartin @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-31 17:56:31 -07:00
Kevin Buckley	8061994c61	AzureSearch Vector Store: Moving the usage of additional_fields into context of its definition (bug fix from python error) (#8551 ) Description: Using Azure Cognitive Search as a VectorStore. Calling the `add_texts` method throws an error if there is no metadata property specified. The `additional_fields` field is set in an `if` statement and then is used later outside the if statement. This PR just moves the declaration of `additional_fields` below and puts the usage of it in context. Issue: https://github.com/langchain-ai/langchain/issues/8544 Tagging @rlancemartin, @eyurtsev as this is related to Vector stores. `make format`, `make lint`, `make spellcheck`, and `make test` have been run	2023-07-31 17:25:57 -07:00
Danny Davenport	8d2344db43	updates some spelling mistakes (#8537 ) Just updating some spelling / grammar issues in the documentation. No code changes. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-31 17:15:29 -07:00
Leonid Kuligin	b4a126ae71	Updated docs on Vertex AI going GA (#8531 ) #8074 Co-authored-by: Leonid Kuligin <kuligin@google.com>	2023-07-31 17:15:04 -07:00
Pranay Chandekar	7e70cd2a28	Bug Fix - #8415 (#8417 ) - Issue: #8415 Signed-off-by: Pranay Chandekar <pranayc6@gmail.com>	2023-07-31 17:08:46 -07:00
shibuiwilliam	de61ebd9e0	add tests to redis vectorstore (#8116 ) # What - Add function to get similarity with score with threshold in Redis vector store. - Add tests to Redis vector store.	2023-07-31 17:07:09 -07:00
Bharat Raghunathan	c19a0b9c10	doc(prompts): Follow up on broken Prompt Sublink pages (#8530 ) - Description: Follow up of #8478 - Issue: #8477 - Dependencies: None - Tag maintainer: @baskaryan - Twitter handle: [@BharatR123](twitter.com/BharatR123) The links were still broken after #8478 and sadly the issue was not caught by either the Vercel app build or `make docs_linkcheck`	2023-07-31 16:46:13 -07:00
Bruno Bornsztein	5a490a79f4	fix issue #8357 by making json backtick regex greedy (#8528 ) - Description: Markdown code blocks in json response should not break the parser - Issue: #8357 @baskaryan @hinthornw	2023-07-31 16:36:57 -07:00
Gordon Clark	64d0a0fcc0	Updating docstrings in utilities (#8411 ) Updating docstrings on utility packages @baskaryan	2023-07-31 16:34:53 -07:00
Harrison Chase	bca0749a11	conversational retrieval chain in lcel (#8532 )	2023-07-31 16:33:07 -07:00
Jeff Huber	07d6d1ca38	fix error in chroma docker instructions (#8533 ) This makes the Chroma instructions for Docker work! https://python.langchain.com/docs/integrations/vectorstores/chroma#basic-example-using-the-docker-container	2023-07-31 16:32:53 -07:00
Mohammad Mohtashim	144b4c0c78	SQL Query Prompt update + added _execute method for SQLDatabase (#8100 ) - Description: This pull request (PR) includes two minor changes: 1. Updated the default prompt for SQL Query Checker: The current prompt does not clearly specify the final response that the LLM (Language Model) should provide when checking for the query if `use_query_checker` is enabled in SQLDatabase Chain. As a result, the LLM adds extra words like "Here is your updated query" to the response. However, this causes a syntax error when executing the SQL command in SQLDatabaseChain, as these additional words are also included in the SQL query. 2. Moved the query's execution part into a separate method for SQLDatabase: The purpose of this change is to provide users with more flexibility when obtaining the result of an SQL query in the original form returned by sqlalchemy. In the previous implementation, the run method returned the results as a string. By creating a distinct method for execution, users can now receive the results in original format, which proves helpful in various scenarios. For example, during the development of a tool, I found it advantageous to obtain results in original format rather than a string, as currently done by the run method. - Tag maintainer: @hinthornw --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-31 16:28:08 -07:00
Matthew DeGuzman	844eca98d5	Add LLaMa Formatter and AzureML Chat Endpoint (#8382 ) ## Description Microsoft and Meta recently [announced their collaboration](https://blogs.microsoft.com/blog/2023/07/18/microsoft-and-meta-expand-their-ai-partnership-with-llama-2-on-azure-and-windows/) on LLaMa2. This PR extends the current LLM wrapper and introduces a new Chat Model wrapper for AzureML to support LLaMa2. ## Dependencies No dependencies added :) ## Twitter Handles [@matthew_d13](https://twitter.com/matthew_d13) [@prakhar_in](https://twitter.com/prakhar_in) maintainers - @hwchase17, @baskaryan	2023-07-31 16:26:25 -07:00
Anthony Mahanna	1ab773c742	docs: Update ArangoDB Colab URL (#8547 ) 1-commit PR to update the Google Colab URL of the ArangoDB Graph QA Chain notebook	2023-07-31 16:11:21 -07:00
Harrison Chase	15de57b848	fix web loader (#8538 )	2023-07-31 12:47:33 -07:00
Nuno Campos	4780156955	Rely less on positional arg order in subclasses of vector store when calling async methods (#8534 )	2023-07-31 20:13:11 +01:00
Harrison Chase	5e3b968078	router runnable (#8496 ) Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-07-31 11:07:10 -07:00
Anubhav Bindlish	913a156cff	Minor improvements to rockset vectorstore (#8416 ) This PR makes minor improvements to our python notebook, and adds support for `Rockset` workspaces in our vectorstore client. @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-31 09:54:59 -07:00
Harrison Chase	893f3014af	add xml agent notebook	2023-07-31 07:33:22 -07:00
Bagatur	a8be207ea3	bump 248 (#8518 )	2023-07-31 07:14:45 -07:00
Harrison Chase	6556a8fcfd	add initial anthropic agent (#8468 ) Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-07-30 21:30:49 -07:00
os1ma	a795c3d860	Fix GitLoader to handle repeated load calls (#8412 ) Description: In this pull request, GitLoader has been updated to handle multiple load calls, provided the same repository is being cloned. Previously, calling `load` multiple times would raise an error if a clone URL was provided. Additionally, a check has been added to raise a ValueError when attempting to clone a different repository into an existing path. New tests have also been introduced to verify the correct behavior of the GitLoader class when `load` is called multiple times. Lastly, the GitPython package, a dependency for the GitLoader class, has been added to the project dependencies (pyproject.toml and poetry.lock). Issue: None Dependencies: GitPython Tag maintainer: @rlancemartin, @eyurtsev	2023-07-30 21:27:20 -07:00
Muhammed Al-Dulaimi	9975ba4124	Fix ChromaDB integration -> docker container instructions (#8447 ) ## Description This PR handles modifying the Chroma DB integration's documentation. It modifies the Docker container example to fix the instructions mentioned in the documentation. In the current documentation, the below `client.reset()` line causes a runtime error: ```py ... client = chromadb.HttpClient(settings=Settings(allow_reset=True)) client.reset() # resets the database collection = client.create_collection("my_collection") ... ``` `Exception: {"error":"ValueError('Resetting is not allowed by this configuration')"}` This is due to the Chroma DB server needing to have the `allow_reset` flag set to `true` there as well. This is fixed by adding the `ALLOW_RESET=TRUE` to the `docker-compose` file environment variable to the docker container before spinning it ## Issue This fixes the runtime error that occurs when running the docker container example code ## Tag Maintainer @rlancemartin, @eyurtsev	2023-07-30 21:11:56 -07:00
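
Commit 5a490a79f4 (#8528) above fixes a parser bug by making the JSON backtick regex greedy. The sketch below illustrates why that can matter when a JSON value itself contains a Markdown code block. The two patterns, the `extract_json` helper, and the sample response are assumptions made for illustration; they are not the library's actual regex or parser.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built indirectly so this example stays readable

# Hypothetical patterns for illustration only; not the library's actual regex.
LAZY = re.compile(FENCE + r"(?:json)?(.*?)" + FENCE, re.DOTALL)
GREEDY = re.compile(FENCE + r"(?:json)?(.*)" + FENCE, re.DOTALL)


def extract_json(text: str, pattern: re.Pattern) -> dict:
    """Parse the JSON captured by `pattern`, or the whole text if no fence matches."""
    match = pattern.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload.strip())


def lazy_fails(text: str) -> bool:
    """Return True if the lazy pattern yields a payload that is not valid JSON."""
    try:
        extract_json(text, LAZY)
    except json.JSONDecodeError:
        return True
    return False


# A model response whose JSON string value embeds its own fenced code block.
inner = "Run:\\n" + FENCE + "py\\nprint(1)\\n" + FENCE
response = (
    FENCE
    + 'json\n{"action": "Final Answer", "action_input": "'
    + inner
    + '"}\n'
    + FENCE
)

# The lazy pattern stops at the first inner fence and truncates the JSON;
# the greedy pattern runs to the last fence and captures the whole object.
parsed = extract_json(response, GREEDY)
```

The trade-off of the greedy form is that a response containing two separate fenced blocks would be captured as a single span, so this approach assumes one JSON block per response.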

3503 Commits