langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-31 15:20:26 +00:00

Author	SHA1	Message	Date
Harrison Chase	2f6833d433	hotfix (#1742 )	2023-03-17 09:05:08 -07:00
Harrison Chase	dd90fd02d5	Harrison/move docs (#1741 )	2023-03-17 08:49:10 -07:00
Harrison Chase	07766a69f3	move docs (#1740 )	2023-03-17 08:42:28 -07:00
Harrison Chase	aa854988bf	bump version to 114 (#1739 )	2023-03-17 08:26:06 -07:00
Harrison Chase	96ebe98dc2	Harrison/latex splitter (#1738 ) Co-authored-by: Aidan Holland <thehappydinoa@gmail.com> Co-authored-by: Jan de Boer <44832123+Janldeboer@users.noreply.github.com>	2023-03-17 08:10:27 -07:00
Harrison Chase	45f05fc939	Harrison/blackboard loader (#1737 ) Co-authored-by: Aidan Holland <thehappydinoa@gmail.com>	2023-03-17 08:02:44 -07:00
Vincent Liao	cf9c3f54f7	docs: add docs link to agent toolkits (#1735 ) New to Langchain, was a bit confused where I should find the toolkits section when I'm at `agent/key_concepts` docs. I added a short link that points to the how to section.	2023-03-17 07:59:49 -07:00
Merbin J Anselm	fbc0c85b90	fix: agent json parser fails with text in suffix (#1734 ) While testing out `VectorDBQA` as a `Tool` for one of the conversation, I happened to get a response from LLM (OpenAI) like this <code> Could not parse LLM output: Here's a response using the Product Search tool: ```json { "action": "Product Search", "action_input": "pots for plants" } ``` This will allow you to search for pots for your plants and find a variety of options that are available for purchase. You can use this information to choose the pots that best fit your needs and preferences. </code> i.e. The response had a text before & after the expected JSON, leading to `JSONDecodeError`. It's fixed now, by removing text after '```' to remove unwanted text. The error I encountered in this Jupyter Notebook - [link](https://github.com/anselm94/chatbot-llm-ecommerce/blob/main/chatcommerce.ipynb) <details> <summary>Error encountered</summary> <code> --------------------------------------------------------------------------- JSONDecodeError Traceback (most recent call last) File ~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/conversational_chat/base.py:104, in ConversationalChatAgent._extract_tool_and_input(self, llm_output) 103 try: --> 104 response = self.output_parser.parse(llm_output) 105 return response["action"], response["action_input"] File ~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/conversational_chat/base.py:49, in AgentOutputParser.parse(self, text) 48 cleaned_output = cleaned_output.strip() ---> 49 response = json.loads(cleaned_output) 50 return {"action": response["action"], "action_input": response["action_input"]} File /opt/homebrew/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, *kw) 343 if (cls is None and object_hook is None and 344 parse_int is None 
and parse_float is None and 345 parse_constant is None and object_pairs_hook is None and not kw): --> 346 return _default_decoder.decode(s) 347 if cls is None: File /opt/homebrew/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py:340, in JSONDecoder.decode(self, s, _w) 339 if end != len(s): --> 340 raise JSONDecodeError("Extra data", s, end) 341 return obj JSONDecodeError: Extra data: line 5 column 1 (char 74) During handling of the above exception, another exception occurred: ValueError Traceback (most recent call last) Cell In[22], line 1 ----> 1 ask_ai.run("Yes. I need pots for my plants") File ~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/chains/base.py:213, in Chain.run(self, args, kwargs) 211 if len(args) != 1: 212 raise ValueError("`run` supports only one positional argument.") --> 213 return self(args[0])[self.output_keys[0]] 215 if kwargs and not args: 216 return self(kwargs)[self.output_keys[0]] File ~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/chains/base.py:116, in Chain.__call__(self, inputs, return_only_outputs) 114 except (KeyboardInterrupt, Exception) as e: 115 self.callback_manager.on_chain_error(e, verbose=self.verbose) --> 116 raise e 117 self.callback_manager.on_chain_end(outputs, verbose=self.verbose) 118 return self.prep_outputs(inputs, outputs, return_only_outputs) File ~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/chains/base.py:113, in Chain.__call__(self, inputs, return_only_outputs) 107 self.callback_manager.on_chain_start( 108 {"name": self.__class__.__name__}, 109 inputs, 110 verbose=self.verbose, 111 ) 112 try: --> 113 outputs = self._call(inputs) 114 except (KeyboardInterrupt, Exception) as e: 115 self.callback_manager.on_chain_error(e, verbose=self.verbose) File ~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/agent.py:499, in AgentExecutor._call(self, inputs) 497 # We now 
enter the agent loop (until it returns something). 498 while self._should_continue(iterations): --> 499 next_step_output = self._take_next_step( 500 name_to_tool_map, color_mapping, inputs, intermediate_steps 501 ) 502 if isinstance(next_step_output, AgentFinish): 503 return self._return(next_step_output, intermediate_steps) File ~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/agent.py:409, in AgentExecutor._take_next_step(self, name_to_tool_map, color_mapping, inputs, intermediate_steps) 404 """Take a single step in the thought-action-observation loop. 405 406 Override this to take control of how the agent makes and acts on choices. 407 """ 408 # Call the LLM to see what to do. --> 409 output = self.agent.plan(intermediate_steps, inputs) 410 # If the tool chosen is the finishing tool, then we end and return. 411 if isinstance(output, AgentFinish): File ~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/agent.py:105, in Agent.plan(self, intermediate_steps, kwargs) 94 """Given input, decided what to do. 95 96 Args: (...) 102 Action specifying what tool to use. 
103 """ 104 full_inputs = self.get_full_inputs(intermediate_steps, kwargs) --> 105 action = self._get_next_action(full_inputs) 106 if action.tool == self.finish_tool_name: 107 return AgentFinish({"output": action.tool_input}, action.log) File ~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/agent.py:67, in Agent._get_next_action(self, full_inputs) 65 def _get_next_action(self, full_inputs: Dict[str, str]) -> AgentAction: 66 full_output = self.llm_chain.predict(**full_inputs) ---> 67 parsed_output = self._extract_tool_and_input(full_output) 68 while parsed_output is None: 69 full_output = self._fix_text(full_output) File ~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/conversational_chat/base.py:107, in ConversationalChatAgent._extract_tool_and_input(self, llm_output) 105 return response["action"], response["action_input"] 106 except Exception: --> 107 raise ValueError(f"Could not parse LLM output: {llm_output}") ValueError: Could not parse LLM output: Here's a response using the Product Search tool: ```json { "action": "Product Search", "action_input": "pots for plants" } ``` This will allow you to search for pots for your plants and find a variety of options that are available for purchase. You can use this information to choose the pots that best fit your needs and preferences. </details>	2023-03-17 07:59:39 -07:00
Harrison Chase	276940fd9b	Harrison/official method (#1728 ) Co-authored-by: Aratako <127325395+Aratako@users.noreply.github.com>	2023-03-16 23:20:08 -07:00
Piyush Jain	cdff6c8181	Sagemaker Endpoint LLM (#1686 ) Updates #965 --------- Co-authored-by: Nimisha Mehta <116048415+nimimeht@users.noreply.github.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>	2023-03-16 21:58:06 -07:00
alekhyablue	cd45adbea2	adding new agent types in comments (#1711 )	2023-03-16 21:56:08 -07:00
Mario Kostelac	aff44d0a98	(OpenAI) Add model_name to LLMResult.llm_output (#1713 ) Given that different models have very different latencies and pricing, it's beneficial to pass along information about the model that generated the response. Such information allows implementing custom callback managers and tracking usage and price per model. Addresses https://github.com/hwchase17/langchain/issues/1557.	2023-03-16 21:55:55 -07:00
libra	8a95fdaee1	Fix all the bug in init Tool in docs (#1725 ) Fix all the example in the docs when init `Tool` Test by render with jupyter	2023-03-16 21:55:44 -07:00
Alexandros Mavrogiannis	5d8dc83ede	Bump duckdb-engine to 0.7.0 (#1726 ) Resolves https://github.com/hwchase17/langchain/issues/1272 Resolves https://github.com/hwchase17/langchain/issues/1578	2023-03-16 21:55:35 -07:00
Daniel Chalef	b157e0c1c3	Add HTML document_loader that includes page title metadata (#1720 ) This `BSHTMLLoader` document_loader loads an HTML document, extracts text and adds the page title to the returned Document's metadata. The loader uses the already installed bs4 package to extract both text content and the page title. Included in this PR is an example HTML file and an integration test that tests against this file. --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	2023-03-16 21:47:17 -07:00
Harrison Chase	40e9488055	fix async in agent (#1723 )	2023-03-16 21:43:22 -07:00
jerwelborn	55efbb8a7e	pydantic/json parsing (#1722 ) ``` class Joke(BaseModel): setup: str = Field(description="question to set up a joke") punchline: str = Field(description="answer to resolve the joke") joke_query = "Tell me a joke." # Or, an example with compound type fields. #class FloatArray(BaseModel): # values: List[float] = Field(description="list of floats") # #float_array_query = "Write out a few terms of fiboacci." model = OpenAI(model_name='text-davinci-003', temperature=0.0) parser = PydanticOutputParser(pydantic_object=Joke) prompt = PromptTemplate( template="Answer the user query.\n{format_instructions}\n{query}\n", input_variables=["query"], partial_variables={"format_instructions": parser.get_format_instructions()} ) _input = prompt.format_prompt(query=joke_query) print("Prompt:\n", _input.to_string()) output = model(_input.to_string()) print("Completion:\n", output) parsed_output = parser.parse(output) print("Parsed completion:\n", parsed_output) ``` ``` Prompt: Answer the user query. The output should be formatted as a JSON instance that conforms to the JSON schema below. For example, the object {"foo": ["bar", "baz"]} conforms to the schema {"foo": {"description": "a list of strings field", "type": "string"}}. Here is the output schema: --- {"setup": {"description": "question to set up a joke", "type": "string"}, "punchline": {"description": "answer to resolve the joke", "type": "string"}} --- Tell me a joke. Completion: {"setup": "Why don't scientists trust atoms?", "punchline": "Because they make up everything!"} Parsed completion: setup="Why don't scientists trust atoms?" punchline='Because they make up everything!' ``` Ofc, works only with LMs of sufficient capacity. DaVinci is reliable but not always. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-03-16 21:43:11 -07:00
Alex Strick van Linschoten	d6bbf395af	Loosen PyYAML dependency (#1698 ) Hitting some dependency issues relating to this strict pinning. Unsure of the knock-on effects, but wanted to propose this loosening down a couple of versions.	2023-03-16 17:05:36 -07:00
Jonathan Pedoeem	606605925d	Adding ability to `return_pl_id` to all PromptLayer Models in LangChain (#1699 ) PromptLayer now has support for [several different tracking features.](https://magniv.notion.site/Track-4deee1b1f7a34c1680d085f82567dab9) In order to use any of these features you need to have a request id associated with the request. In this PR we add a boolean argument called `return_pl_id` which will add `pl_request_id` to the `generation_info` dictionary associated with a generation. We also updated the relevant documentation.	2023-03-16 17:05:23 -07:00
Jeff Huber	f93c011456	fallback to {} for None metadata from Chroma (#1714 ) The basic vector store example started breaking because `Document` required `not None` for metadata, but Chroma stores metadata as `None` if none is provided. This creates a fallback which fixes the basic tutorial https://langchain.readthedocs.io/en/latest/modules/indexes/examples/vectorstores.html Here is the error that was generated ``` Running Chroma using direct local API. Using DuckDB in-memory for database. Data will be transient. Traceback (most recent call last): File "/Users/jeff/src/temp/langchainchroma/test.py", line 17, in <module> docs = docsearch.similarity_search(query) File "/Users/jeff/src/langchain/langchain/vectorstores/chroma.py", line 133, in similarity_search docs_and_scores = self.similarity_search_with_score(query, k) File "/Users/jeff/src/langchain/langchain/vectorstores/chroma.py", line 182, in similarity_search_with_score return _results_to_docs_and_scores(results) File "/Users/jeff/src/langchain/langchain/vectorstores/chroma.py", line 24, in _results_to_docs_and_scores return [ File "/Users/jeff/src/langchain/langchain/vectorstores/chroma.py", line 27, in <listcomp> (Document(page_content=result[0], metadata=result[1]), result[2]) File "pydantic/main.py", line 331, in pydantic.main.BaseModel.__init__ pydantic.error_wrappers.ValidationError: 1 validation error for Document metadata none is not an allowed value (type=type_error.none.not_allowed) Exiting: Cleaning up .chroma directory ```	2023-03-16 12:06:47 -07:00
Harrison Chase	3c24684522	harrison/bump-version-00113 (#1701 )	2023-03-15 14:49:47 -07:00
Harrison Chase	b84d190fd0	Harrison/gr int (#1700 ) Co-authored-by: Shreya Rajpal <ShreyaR@users.noreply.github.com>	2023-03-15 13:22:20 -07:00
Harrison Chase	aad4bff098	Harrison/headers (#1696 ) Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>	2023-03-15 13:13:21 -07:00
Harrison Chase	3ea6d9c4d2	add docs for save/load messages (#1697 )	2023-03-15 13:13:08 -07:00
Pandazki	ced412e1c1	fix: correct a small mistake in SimpleChatModel. (#1685 )	2023-03-15 08:00:26 -07:00
Piyush Jain	1279c8de39	Fixed typo, clarified language (#1682 )	2023-03-15 08:00:11 -07:00
at-b612	c7779c800a	Added Mynd URL to gallery (#1684 )	2023-03-15 07:59:59 -07:00
Jithin James	6f4f771897	docs: add path to state_of_the_union.txt in indexes/getting_started page (#1691 ) Add the state_of_the_union.txt file so that it's easier to follow along with the example. --------- Co-authored-by: Jithin James <jjmachan@pop-os.localdomain>	2023-03-15 07:59:47 -07:00
Kacper Łukawski	4a327dd1d6	Implement basic metadata filtering in Qdrant (#1689 ) This PR implements a basic metadata filtering mechanism similar to the ones in Chroma and Pinecone. It still cannot express complex conditions, as there are no operators, but some users requested to have that feature available.	2023-03-15 07:31:39 -07:00
Ankush Gola	d4edd3c312	Zapier Integration (#1654 ) * Zapier Wrapper and Tools (implemented by Zapier Team) * Zapier Toolkit, examples with mrkl agent --------- Co-authored-by: Mike Knoop <mikeknoop@gmail.com> Co-authored-by: Robert Lewis <robert.lewis@zapier.com>	2023-03-14 23:06:17 -07:00
Harrison Chase	e72074f78a	Harrison/ifixit (#1680 ) Co-authored-by: David Rans <david@ifixit.com>	2023-03-14 21:17:50 -07:00
Harrison Chase	0b29e68c17	Harrison/pgvector (#1679 ) Co-authored-by: Aman Kumar <krsingh.aman@gmail.com>	2023-03-14 21:13:58 -07:00
Harrison Chase	4d7fdb8957	Harrison/gml save (#1676 ) Co-authored-by: Satoru Sakamoto <51464932+satoru814@users.noreply.github.com>	2023-03-14 20:00:22 -07:00
Harrison Chase	656efe6ef3	Harrison/fix nb (#1678 )	2023-03-14 19:34:23 -07:00
Harrison Chase	362586fe8b	save messages (#1653 ) @yakigac this is my alternative to https://github.com/hwchase17/langchain/pull/1648 - thoughts?	2023-03-14 18:15:55 -07:00
Matt Robinson	63aa28e2a6	feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667 ) ### Summary Allows users to pass in `**unstructured_kwargs` to Unstructured document loaders. Implemented with the `strategy` kwargs in mind, but will pass in other kwargs like `include_page_breaks` as well. The two currently supported strategies are `"hi_res"`, which is more accurate but takes longer, and `"fast"`, which processes faster but with lower accuracy. The `"hi_res"` strategy is the default. For PDFs, if `detectron2` is not available and the user selects `"hi_res"`, the loader will fallback to using the `"fast"` strategy. ### Testing #### Make sure the `strategy` kwarg works Run the following in iPython to verify that the `"fast"` strategy is indeed faster. ```python from langchain.document_loaders import UnstructuredFileLoader loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", strategy="fast", mode="elements") %timeit loader.load() loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", mode="elements") %timeit loader.load() ``` On my system I get: ```python In [3]: from langchain.document_loaders import UnstructuredFileLoader In [4]: loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", strategy="fast", mode="elements") In [5]: %timeit loader.load() 247 ms ± 369 µs per loop (mean ± std. dev. of 7 runs, 1 loop each) In [6]: loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", mode="elements") In [7]: %timeit loader.load() 2.45 s ± 31 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) ``` #### Make sure older versions of `unstructured` still work Run `pip install unstructured==0.5.3` and then verify the following runs without error: ```python from langchain.document_loaders import UnstructuredFileLoader loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", mode="elements") loader.load() ```	2023-03-14 18:15:28 -07:00
Matthias Kern	c3dfbdf0da	Remove outdated code from Chat VectorDB QA example (#1670 )	2023-03-14 18:13:51 -07:00
Bilel MEDIMEGH	a2280f321f	Docs: Fix typo in memory/key_concepts.md (#1671 ) dialouge -> dialogue	2023-03-14 18:12:01 -07:00
Xin Qiu	4e13cef05a	feat: add redisearch vectorstore (#1307 ) # Description Add `RediSearch` vectorstore for LangChain RediSearch: [RediSearch quick start](https://redis.io/docs/stack/search/quick_start/) # How to use ``` from langchain.vectorstores.redisearch import RediSearch rds = RediSearch.from_documents(docs, embeddings,redisearch_url="redis://localhost:6379") ```	2023-03-14 18:06:03 -07:00
Harrison Chase	e5c1659864	bump ver (#1668 )	2023-03-14 13:05:17 -07:00
Harrison Chase	2d098e8869	Harrison/agent eval (#1620 ) Co-authored-by: jerwelborn <jeremy.welborn@gmail.com>	2023-03-14 12:37:48 -07:00
Harrison Chase	8965a2f0af	bump and hotfix (#1665 )	2023-03-14 11:12:53 -07:00
Harrison Chase	e222ea4ee8	update rtd config (#1664 )	2023-03-14 10:40:06 -07:00
Harrison Chase	e326939759	bump version 110 (#1662 )	2023-03-14 10:21:35 -07:00
Harrison Chase	7cf46b3fee	Harrison/convo agent (#1642 )	2023-03-14 09:42:24 -07:00
Abhinav Upadhyay	84cd825a0e	Add a batch_size param to the add_texts API of pinecone wrapper (#1658 ) The pinecone python client requires a safe default value of batch_size; otherwise, if the user of add_texts passes too many documents in a single call, they get a 400 error from pinecone.	2023-03-14 09:40:22 -07:00
Jon Luo	0a1b1806e9	sql: do not hard code the LIMIT clause in the table_info section (#1563 ) Seeing a lot of issues in Discord in which the LLM is not using the correct LIMIT clause for different SQL dialects, i.e., it's using `LIMIT` for mssql instead of `TOP`, or instead of `ROWNUM` for Oracle, etc. I think this could be due to us specifying the LIMIT statement in the example rows portion of `table_info`, so the LLM is seeing the `LIMIT` statement used in the prompt. Since we can't specify each dialect's method here, I think it's fine to just replace the `SELECT... LIMIT 3;` statement with `3 rows from table_name table:`, and wrap everything in a block comment directly following the `CREATE` statement. The Rajkumar et al paper wrapped the example rows and `SELECT` statement in a block comment as well anyway. Thoughts @fpingham?	2023-03-13 23:08:27 -07:00
Brian Thorne	9ee2713272	Bugfix - allow custom input variables in chat zero shot agent's prompt (#1624 ) I was trying out the `chat-zero-shot-react-description` agent for [qabot](`dbbd31bb27/qabot/agents/data_query_chain.py (L35-L52)`), but langchain 0.0.108 doesn't correctly use custom `input_variables` in the prompt template.	2023-03-13 23:07:35 -07:00
Tim Asp	b3234bf3b0	cleanup: unify 3 different pdf loaders, rename PagedPDFSplitter (#1615 ) `OnlinePDFLoader` and `PagedPDFSplitter` lived separately from the rest of the pdf loaders. Because they're all similar, I propose moving them all to `pdf.py` and the same docs/examples page. Additionally, the `PagedPDFSplitter` name doesn't match the pattern the rest of the loaders follow, so I renamed it to `PyPDFLoader` and had it inherit from `BasePDFLoader` so it can now load from remote file sources.	2023-03-13 23:06:50 -07:00
Luis	562d9891ea	Add regex dict: (#1616 ) This class enables us to send a dictionary containing an output key and the expected format, which in turn allows us to retrieve the result of the matching formats and extract specific information from it. To exclude irrelevant information from our return dictionary, we can prompt the LLM to use a specific command that notifies us when it doesn't know the answer. We refer to this variable as the "no_update_value". Regarding the updated regular expression pattern (r"{}:\s?([^.'\n']).?"), it enables us to retrieve a format as 'Output Key':'value'. We have improved the regex by adding an optional space between ':' and 'value' with "s?", and by excluding points and line jumps from the matches using "[^.'\n']".	2023-03-13 23:05:39 -07:00
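
Note on 562d9891ea (#1616): the regex-dict idea described in that message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit. The function name `parse_regex_dict`, its parameters, and the sample text are hypothetical, and the pattern used here (with a `*` quantifier and an escaped `\.`) is an assumption, since the pattern printed in the commit message may have lost characters in rendering.

```python
import re


def parse_regex_dict(text, output_key_to_label, no_update_value=None):
    """Extract 'Label: value' pairs from text, keyed by output key (hypothetical helper)."""
    result = {}
    for output_key, label in output_key_to_label.items():
        # Assumed pattern: the label, a colon, an optional whitespace character,
        # then a run of characters excluding periods, single quotes and newlines.
        pattern = r"{}:\s?([^.'\n]*)\.?".format(re.escape(label))
        matches = re.findall(pattern, text)
        if len(matches) != 1:
            raise ValueError(
                f"Expected exactly one match for {label!r}, found {len(matches)}"
            )
        value = matches[0].strip()
        # Skip entries the LLM flagged as "nothing to report".
        if no_update_value is not None and value == no_update_value:
            continue
        result[output_key] = value
    return result


sample = "Action: Search\nAction Input: pots for plants\nMood: N/A"
parsed = parse_regex_dict(
    sample,
    {"action": "Action", "action_input": "Action Input", "mood": "Mood"},
    no_update_value="N/A",
)
print(parsed)
```

Dropping entries equal to `no_update_value` keeps the returned dictionary limited to fields the model actually filled in, which is the behaviour the commit message describes.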


2305 Commits
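
Note on fbc0c85b90 (#1734): that message describes an LLM reply in which the expected JSON was wrapped in a fenced block with prose before and after it, so `json.loads` failed with "Extra data". A minimal sketch of the idea of discarding everything outside the fence is shown below. The function name `parse_agent_json` is hypothetical and this is not the parser code from the commit; only `json.loads` and `str.partition` from the standard library are used.

```python
import json


def parse_agent_json(llm_output):
    """Pull the action JSON out of an LLM reply that may wrap it in a fenced block."""
    text = llm_output
    # Drop everything up to and including the opening fence, if one is present.
    for opener in ("```json", "```"):
        _, found, after = text.partition(opener)
        if found:
            text = after
            break
    # Drop the closing fence and any trailing prose after it.
    text = text.partition("```")[0]
    response = json.loads(text.strip())
    return {"action": response["action"], "action_input": response["action_input"]}


reply = (
    "Here's a response using the Product Search tool: "
    '```json { "action": "Product Search", "action_input": "pots for plants" } ``` '
    "This will allow you to search for pots for your plants."
)
parsed_reply = parse_agent_json(reply)
print(parsed_reply)
```

A reply that contains only the bare JSON object passes through unchanged, because neither fence marker is found.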