langchain

Author	SHA1	Message	Date
Andreas Liebschner	44dc959584	Improve pinecone hybrid search retriever adding metadata support (#5098 ) # Improve pinecone hybrid search retriever adding metadata support I simply remove the hardwiring of metadata to the existing implementation allowing one to pass `metadatas` attribute to the constructors and in `get_relevant_documents`. I also add one missing pip install to the accompanying notebook (I am not adding dependencies, they were pre-existing). First contribution, just hoping to help, feel free to critique :) my twitter username is `@andreliebschner` While looking at hybrid search I noticed #3043 and #1743. I think the former can be closed as following the example right now (even prior to my improvements) works just fine, the latter I think can be also closed safely, maybe pointing out the relevant classes and example. Should I reply those issues mentioning someone? @dev2049, @hwchase17 --------- Co-authored-by: Andreas Liebschner <a.liebschner@shopfully.com>	2023-05-22 11:42:54 -07:00
Deepak S V	5cd12102be	Improving Resilience of MRKL Agent (#5014 ) This is a highly optimized update to the pull request https://github.com/hwchase17/langchain/pull/3269 Summary: 1) Added ability to MRKL agent to self solve the ValueError(f"Could not parse LLM output: `{llm_output}`") error, whenever llm (especially gpt-3.5-turbo) does not follow the format of MRKL Agent, while returning "Action:" & "Action Input:". 2) The way I am solving this error is by responding back to the llm with the messages "Invalid Format: Missing 'Action:' after 'Thought:'" & "Invalid Format: Missing 'Action Input:' after 'Action:'" whenever Action: and Action Input: are not present in the llm output respectively. For a detailed explanation, look at the previous pull request. New Updates: 1) Since @hwchase17 , requested in the previous PR to communicate the self correction (error) message, using the OutputParserException, I have added new ability to the OutputParserException class to store the observation & previous llm_output in order to communicate it to the next Agent's prompt. This is done, without breaking/modifying any of the functionality OutputParserException previously performs (i.e. OutputParserException can be used in the same way as before, without passing any observation & previous llm_output too). --------- Co-authored-by: Deepak S V <svdeepak99@users.noreply.github.com>	2023-05-22 11:08:08 -07:00
Michael Landis	6eacd88ae7	fix: revert docarray explicit transitive dependencies and use extras instead (#5015 ) tldr: The docarray [integration PR](https://github.com/hwchase17/langchain/pull/4483) introduced a pinned dependency to protobuf. This is a docarray dependency, not a langchain dependency. Since this is handled by the docarray dependencies, it is unnecessary here. Further, as a pinned dependency, this quickly leads to incompatibilities with application code that consumes the library. Much less with a heavily used library like protobuf. Detail: as we see in the [docarray integration](https://github.com/hwchase17/langchain/pull/4483/files#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711R81-R83), the transitive dependencies of docarray were also listed as langchain dependencies. This is unnecessary as the docarray project has an appropriate [extras](`a01a05542d/pyproject.toml (L70)`). The docarray project also does not require this _pinned_ version of protobuf, rather [a minimum version](`a01a05542d/pyproject.toml (L41)`). So this pinned version was likely in error. To fix this, this PR reverts the explicit hnswlib and protobuf dependencies and adds the hnswlib extras install for docarray (which installs hnswlib and protobuf, as originally intended). Because version `0.32.0` of the docarray hnswlib extras added protobuf, we bump the docarray dependency from `^0.31.0` to `^0.32.0`. # revert docarray explicit transitive dependencies and use extras instead ## Who can review? @dev2049 -- reviewed the original PR @eyurtsev -- bumped the pinned protobuf dependency a few days ago --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 12:48:09 -04:00
Davis Chase	fcd88bccb3	Bump 177 (#5095 )	2023-05-22 08:19:06 -07:00
Harrison Chase	10ba201d05	Harrison/neo4j (#5078 ) Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-22 07:31:48 -07:00
Deepak S V	49ca02711e	Improved query, print & exception handling in REPL Tool (#4997 ) Update to pull request https://github.com/hwchase17/langchain/pull/3215 Summary: 1) Improved the sanitization of query (using regex), by removing python command (since gpt-3.5-turbo sometimes assumes python console as a terminal, and runs python command first which causes error). Also sometimes 1 line python codes contain single backticks. 2) Added 7 new test cases. For more details, view the previous pull request. --------- Co-authored-by: Deepak S V <svdeepak99@users.noreply.github.com>	2023-05-22 13:43:44 +00:00
Zander Chase	785502edb3	Add 'get_token_ids' method (#4784 ) Let user inspect the token ids in addition to getting the number of tokens --------- Co-authored-by: Zach Schillaci <40636930+zachschillaci27@users.noreply.github.com>	2023-05-22 13:17:26 +00:00
Zander Chase	ef7d015be5	Separate Runner Functions from Client (#5079 ) Extract the methods specific to running an LLM or Chain on a dataset to separate utility functions. This simplifies the client a bit and lets us separate concerns of LCP details from running examples (e.g., for evals)	2023-05-22 05:28:47 +00:00
Leonid Ganeline	443ebe22f4	docs: `Deployments` page moved into `Ecosystem/` (#4949 ) # docs: `deployments` page moved into `ecosystem/` The `Deployments` page moved into the `Ecosystem/` group Small fixes: - `index` page: fixed order of items in the `Modules` list, in the `Use Cases` list - item `References/Installation` was lost in the `index` page (not on the Navbar!). Restored it. - added `\|` marker in several places. NOTE: I also thought about moving the `Additional Resources/Gallery` page into the `Ecosystem` group but decided to leave it unchanged. Please, advise on this. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049	2023-05-21 21:18:22 -07:00
Hans van Dam	a395ff7c90	preserve language in conversation retrieval (#4969 ) Without the addition of 'in its original language', the condensing response, more often than not, outputs the rephrased question in English, even when the conversation is in another language. This question in English then transfers to the question in the retrieval prompt and the chatbot is stuck in English. I'm sometimes surprised that this does not happen more often, but apparently the GPT models are smart enough to understand that when the template contains Question: .... Answer: then the answer should be in the language of the question.	2023-05-21 21:16:03 -07:00
Matt Robinson	bf3f554357	feat: batch multiple files in a single Unstructured API request (#4525 ) ### Submit Multiple Files to the Unstructured API Enables batching multiple files into a single Unstructured API requests. Support for requests with multiple files was added to both `UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader`. Note that if you submit multiple files in "single" mode, the result will be concatenated into a single document. We recommend using this feature in "elements" mode. ### Testing The following should load both documents, using two of the example docs from the integration tests folder. ```python from langchain.document_loaders import UnstructuredAPIFileLoader file_paths = ["examples/layout-parser-paper.pdf", "examples/whatsapp_chat.txt"] loader = UnstructuredAPIFileLoader( file_paths=file_paths, api_key="FAKE_API_KEY", strategy="fast", mode="elements", ) docs = loader.load() ```	2023-05-21 20:48:20 -07:00
Harrison Chase	0c3de0a0b3	Merge branch 'master' of github.com:hwchase17/langchain	2023-05-21 09:22:43 -07:00
Harrison Chase	224f73e978	move docs	2023-05-21 09:22:35 -07:00
Harrison Chase	6c25f860fd	bump to 176 (#5064 )	2023-05-21 09:19:25 -07:00
Harrison Chase	b0431c672b	Harrison/psychic (#5063 ) Co-authored-by: Ayan Bandyopadhyay <ayanb9440@gmail.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-21 09:13:20 -07:00
Harrison Chase	8c661baefb	change to type checking (#5062 )	2023-05-21 09:09:49 -07:00
Jeffrey Zheng	424a573266	DOC: Misspelling in agents.rst documentation (#5038 ) # Corrected Misspelling in agents.rst Documentation In the [documentation](https://python.langchain.com/en/latest/modules/agents.html) it says "in fact, it is often best to have an Action Agent be in change of the execution for the Plan and Execute agent." Suggested Change: I propose correcting change to charge. Fix for issue: #5039	2023-05-20 22:24:08 -07:00
Gengliang Wang	f9f08c4b69	Add documentation for Databricks integration (#5013 ) # Add documentation for Databricks integration This is a follow-up of https://github.com/hwchase17/langchain/pull/4702 It documents the details of how to integrate Databricks using langchain. It also provides examples in a notebook. ## Who can review? @dev2049 @hwchase17 since you are aware of the context. We will promote the integration after this doc is ready. Thanks in advance!	2023-05-20 22:06:24 -07:00
tornikeo	a6ef20d7fe	Fix annoying typo in docs (#5029 ) # Fixes an annoying typo in docs Fixes Annoying typo in docs - "Therefor" -> "Therefore". It's so annoying to read that I just had to make this PR.	2023-05-20 22:02:21 -07:00
Davis Chase	9d1280d451	bump v175 (#5041 )	2023-05-20 09:24:17 -07:00
UmerHA	7388248b3e	Streaming only final output of agent (#2483 ) (#4630 ) # Streaming only final output of agent (#2483) As requested in issue #2483, this Callback allows to stream only the final output of an agent (ie not the intermediate steps). Fixes #2483 Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-20 09:20:17 -07:00
Davis Chase	3bc0bf0079	fix prompt saving (#4987 ) will add unit tests	2023-05-20 08:21:52 -07:00
Zander Chase	27e63b977a	Add logs command (#5007 ) to the plus server	2023-05-20 00:06:17 +00:00
Marcus Winter	2aa3754024	Check for single prompt in __call__ method of the BaseLLM class (#4892 ) # Ensuring that users pass a single prompt when calling a LLM - This PR adds a check to the `__call__` method of the `BaseLLM` class to ensure that it is called with a single prompt - Raises a `ValueError` if users try to call a LLM with a list of prompts and instructs them to use the `generate` method instead ## Why this could be useful I stumbled across this by accident. I accidentally called the OpenAI LLM with a list of prompts instead of a single string and still got a result: ``` >>> from langchain.llms import OpenAI >>> llm = OpenAI() >>> llm(["Tell a joke"]*2) "\n\nQ: Why don't scientists trust atoms?\nA: Because they make up everything!" ``` It might be better to catch such a scenario preventing unnecessary costs and irritation for the user. ## Proposed behaviour ``` >>> from langchain.llms import OpenAI >>> llm = OpenAI() >>> llm(["Tell a joke"]*2) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/Users/marcus/Projects/langchain/langchain/llms/base.py", line 291, in __call__ raise ValueError( ValueError: Argument `prompt` is expected to be a single string, not a list. If you want to run the LLM on multiple prompts, use `generate` instead. ```	2023-05-19 16:54:26 -07:00
domchan	6c60251f52	Add self query translator for weaviate vectorstore (#4804 ) # Add self query translator for weaviate vectorstore Adds support for the EQ comparator and the AND/OR operators. Co-authored-by: Dominic Chan <dchan@cppib.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-19 16:41:12 -07:00
Davis Chase	9928fb2193	Revert "API update: Engines -> Models (#4915 )" (#5008 ) This reverts commit `8c28ad6dac`. Seems to be causing #5001	2023-05-19 16:38:08 -07:00
SimFG	f07b9fde74	Update the GPTCache example (#4985 ) # Update the GPTCache example Fixes #4757	2023-05-19 16:35:36 -07:00
Leonid Ganeline	ddc2d4c21e	added instruction about pip install google-gerativeai (#5004 ) # added instruction about pip install google-gerativeai added instruction about pip install google-gerativeai	2023-05-19 15:32:24 -07:00
Nicolas	02632d52b3	docs: Big Mendable Improvements (#4964 ) - Higher accuracy on the responses - New redesigned UI - Pretty Sources: display the sources by title / sub-section instead of long URL. - Fixed Reset Button bugs and some other UI issues - Other tweaks	2023-05-19 15:31:48 -07:00
Leonid Ganeline	2ab0e1d526	changed ValueError to ImportError (#5006 ) # changed ValueError to ImportError in except Several places with this bug. ValueError does not catch ImportError.	2023-05-19 15:28:08 -07:00
Davis Chase	080eb1b3fc	Fix graphql tool (#4984 ) Fix construction and add unit test.	2023-05-19 15:27:50 -07:00
Mike McGarry	ddd595fe81	feature/4493 Improve Evernote Document Loader (#4577 ) # Improve Evernote Document Loader When exporting from Evernote you may export more than one note. Currently the Evernote loader concatenates the content of all notes in the export into a single document and only attaches the name of the export file as metadata on the document. This change ensures that each note is loaded as an independent document and all available metadata on the note e.g. author, title, created, updated are added as metadata on each document. It also uses an existing optional dependency of `html2text` instead of `pypandoc` to remove the need to download the pandoc application via `download_pandoc()` to be able to use the `pypandoc` python bindings. Fixes #4493 Co-authored-by: Mike McGarry <mike.mcgarry@finbourne.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-19 14:28:17 -07:00
Juanma Tristancho	729e935ea4	PGVector logger message level (#4920 ) # Change the logger message level The library is logging at `error` level a situation that is not an error. We noticed this error in our logs, but from our point of view it's an expected behavior and the log level should be `warning`.	2023-05-19 14:01:26 -07:00
Peng Wang	62d0a01a0f	Update python.py (#4971 ) # Delete a useless "print"	2023-05-19 13:57:16 -07:00
Eugene Yurtsev	0ff59569dc	Adds 'IN' metadata filter for pgvector for checking set presence (#4982 ) # Adds "IN" metadata filter for pgvector to allow checking for set presence PGVector currently supports metadata filters of the form: ``` {"filter": {"key": "value"}} ``` which will return documents where the "key" metadata field is equal to "value". This PR adds support for metadata filters of the form: ``` {"filter": {"key": { "IN" : ["list", "of", "values"]}}} ``` Other vector stores support this via an "$in" syntax. I chose to use "IN" to match postgres' syntax, though happy to switch. Tested locally with PGVector and ChatVectorDBChain. @dev2049 --------- Co-authored-by: jade@spanninglabs.com <jade@spanninglabs.com>	2023-05-19 13:53:23 -07:00
Davis Chase	56cb77a828	Make test gha workflow manually runnable (#4998 ) if https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#workflow_dispatch is to be believed this should make it possible to manually kick off test workflow, but i don't know much about these things	2023-05-19 13:46:33 -07:00
Jiaping(JP) Zhang	22d844dc07	Add async search with relevance score (#4558 ) Add the async version for the search with relevance score Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-19 13:05:24 -07:00
Adheeban Manoharan	616e9a93e0	Bug fixes and error handling in Redis - Vectorstore (#4932 ) # Bug fixes in Redis - Vectorstore (Added the version of redis to the error message and removed the cls argument from a classmethod) Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>	2023-05-19 13:02:03 -07:00
Gengliang Wang	a87a2524c7	Remove autoreload in examples (#4994 ) # Remove autoreload in examples Remove the `autoreload` in examples since it is not necessary for most users: ``` %load_ext autoreload, %autoreload 2 ```	2023-05-19 17:35:58 +00:00
Davis Chase	2abf6b9f17	bump v0.0.174 (#4988 )	2023-05-19 09:34:28 -07:00
Eugene Yurtsev	06e524416c	power bi api wrapper integration tests & bug fix (#4983 ) # Powerbi API wrapper bug fix + integration tests - Bug fix by removing `TYPE_CHECKING` in utilities/powerbi.py - Added integration test for power bi api in utilities/test_powerbi_api.py - Added integration test for power bi agent in agent/test_powerbi_agent.py - Edited .env.examples to help set up power bi related environment variables - Updated demo notebook with working code in docs../examples/powerbi.ipynb - AzureOpenAI -> ChatOpenAI Notes: Chat models (gpt3.5, gpt4) are much more capable than davinci at writing DAX queries, so that is important to getting the agent to work properly. Interestingly, gpt3.5-turbo needed the examples=DEFAULT_FEWSHOT_EXAMPLES to write consistent DAX queries, so gpt4 seems necessary as the smart llm. Fixes #4325 ## Before submitting Azure-core and Azure-identity are necessary dependencies check integration tests with the following: `pytest tests/integration_tests/utilities/test_powerbi_api.py` `pytest tests/integration_tests/agent/test_powerbi_agent.py` You will need a power bi account with a dataset id + table name in order to test. See .env.examples for details. ## Who can review? @hwchase17 @vowelparrot --------- Co-authored-by: aditya-pethe <adityapethe1@gmail.com>	2023-05-19 11:25:52 -04:00
Viswanadh Rayavarapu	e68dfa7062	Update planner_prompt.py (#4967 ) Typos in the OpenAPI agent Prompt.	2023-05-19 11:17:10 -04:00
Edrick Da Corte Henriquez	e80585bab0	Update tutorials.md (#4960 ) # Added a YouTube Tutorial Added a LangChain tutorial playlist aimed at onboarding newcomers to LangChain and its use cases. I've shared the video in the #tutorials channel and it seemed to be well received. I think this could be useful to the greater community. ## Who can review? @dev2049	2023-05-19 10:40:14 -04:00
Rahul Rao	13c376345e	Fixed assumptions misspelling (#4961 ) Fixed assumptions misspelling in the link mentioned below:- https://python.langchain.com/en/latest/modules/chains/examples/llm_summarization_checker.html ![image](https://github.com/hwchase17/langchain/assets/16189966/94cf2be0-b3d0-495b-98ad-e1f44331727e) Fix for Issue:- #4959 @hwchase17	2023-05-19 10:40:04 -04:00
Gengliang Wang	bf5a3c6dec	Support Databricks in SQLDatabase (#4702 ) This PR adds support for Databricks runtime and Databricks SQL by using [Databricks SQL Connector for Python](https://docs.databricks.com/dev-tools/python-sql-connector.html). As a cloud data platform, accessing Databricks requires a URL as follows `databricks://token:{api_token}@{hostname}?http_path={http_path}&catalog={catalog}&schema={schema}`. The URL is complicated and it may take users a while to figure it out. Since the fields `api_token`/`hostname`/`http_path` fields are known in the Databricks notebook, I am proposing a new method `from_databricks` to simplify the connection to Databricks. ## In Databricks Notebook After changes, Databricks users only need to specify the `catalog` and `schema` field when using langchain. <img width="881" alt="image" src="https://github.com/hwchase17/langchain/assets/1097932/984b4c57-4c2d-489d-b060-5f4918ef2f37"> ## In Jupyter Notebook The method can be used on the local setup as well: <img width="678" alt="image" src="https://github.com/hwchase17/langchain/assets/1097932/142e8805-a6ef-4919-b28e-9796ca31ef19">	2023-05-19 00:42:06 -07:00
Harrison Chase	88a3a56c1a	Add Spark SQL support (#4602 ) (#4956 ) # Add Spark SQL support * Add Spark SQL support. It can connect to Spark via building a local/remote SparkSession. * Include a notebook example I tried some complicated queries (window function, table joins), and the tool works well. Compared to the [Spark Dataframe agent](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/spark.html), this tool is able to generate queries across multiple tables. --------- Co-authored-by: Gengliang Wang <gengliang@apache.org> Co-authored-by: Mike W <62768671+skcoirz@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: UmerHA <40663591+UmerHA@users.noreply.github.com> Co-authored-by: 张城铭 <z@hyperf.io> Co-authored-by: assert <zhangchengming@kkguan.com> Co-authored-by: blob42 <spike@w530> Co-authored-by: Yuekai Zhang <zhangyuekai@foxmail.com> Co-authored-by: Richard He <he.yucheng@outlook.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com> Co-authored-by: Alexey Nominas <60900649+Chae4ek@users.noreply.github.com> Co-authored-by: elBarkey <elbarkey@gmail.com> Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com> Co-authored-by: Jeffrey D <1289344+verygoodsoftwarenotvirus@users.noreply.github.com> Co-authored-by: so2liu <yangliu35@outlook.com> Co-authored-by: Viswanadh Rayavarapu <44315599+vishwa-rn@users.noreply.github.com> Co-authored-by: Chakib Ben Ziane <contact@blob42.xyz> Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com> Co-authored-by: Daniel Chalef <daniel.chalef@private.org> Co-authored-by: Jari Bakken <jari.bakken@gmail.com> Co-authored-by: escafati <scafatieugenio@gmail.com>	2023-05-18 20:53:08 -07:00
Harrison Chase	5feb60f426	Harrison/spell executor (#4914 ) Co-authored-by: Jan Minar <rdancer@rdancer.org>	2023-05-18 20:43:33 -07:00
Aidan Boland	c06973261a	Fix for syntax when setting search_path for Snowflake database (#4747 ) # Fixes syntax for setting Snowflake database search_path An error occurs when using a Snowflake database and providing a schema argument. I have updated the syntax to run a Snowflake specific query when the database dialect is 'snowflake'.	2023-05-18 20:30:38 -07:00
Mike Wang	db6f7ed0ba	[nit] Simplify Spark Creation Validation Check A Little Bit (#4761 ) - simplify the validation check a little bit. - re-tested in jupyter notebook. Reviewer: @hwchase17	2023-05-18 18:57:54 -07:00
escafati	e027a38f33	NIT: Instead of hardcoding k in each definition, define it as a param above. (#2675 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com>	2023-05-18 17:35:31 -07:00


2107 Commits