langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Xin Qiu	4e13cef05a	feat: add redisearch vectorstore (#1307 ) # Description Add `RediSearch` vectorstore for LangChain RediSearch: [RediSearch quick start](https://redis.io/docs/stack/search/quick_start/) # How to use ``` from langchain.vectorstores.redisearch import RediSearch rds = RediSearch.from_documents(docs, embeddings, redisearch_url="redis://localhost:6379") ```	2023-03-14 18:06:03 -07:00
Harrison Chase	e5c1659864	bump ver (#1668 )	2023-03-14 13:05:17 -07:00
Harrison Chase	2d098e8869	Harrison/agent eval (#1620 ) Co-authored-by: jerwelborn <jeremy.welborn@gmail.com>	2023-03-14 12:37:48 -07:00
Harrison Chase	8965a2f0af	bump and hotfix (#1665 )	2023-03-14 11:12:53 -07:00
Harrison Chase	e222ea4ee8	update rtd config (#1664 )	2023-03-14 10:40:06 -07:00
Harrison Chase	e326939759	bump version 110 (#1662 )	2023-03-14 10:21:35 -07:00
Harrison Chase	7cf46b3fee	Harrison/convo agent (#1642 )	2023-03-14 09:42:24 -07:00
Abhinav Upadhyay	84cd825a0e	Add a batch_size param to the add_texts API of pinecone wrapper (#1658 ) The pinecone python client requires a safe default value of batch_size; otherwise, if the caller of add_texts passes too many documents in a single call, they get a 400 error from pinecone.	2023-03-14 09:40:22 -07:00
Jon Luo	0a1b1806e9	sql: do not hard code the LIMIT clause in the table_info section (#1563 ) Seeing a lot of issues in Discord in which the LLM is not using the correct LIMIT clause for different SQL dialects, i.e., it's using `LIMIT` for mssql instead of `TOP`, or instead of `ROWNUM` for Oracle, etc. I think this could be due to us specifying the LIMIT statement in the example rows portion of `table_info`, so the LLM is seeing the `LIMIT` statement used in the prompt. Since we can't specify each dialect's method here, I think it's fine to just replace the `SELECT... LIMIT 3;` statement with `3 rows from table_name table:`, and wrap everything in a block comment directly following the `CREATE` statement. The Rajkumar et al. paper wrapped the example rows and `SELECT` statement in a block comment as well anyway. Thoughts @fpingham?	2023-03-13 23:08:27 -07:00
Brian Thorne	9ee2713272	Bugfix - allow custom input variables in chat zero shot agent's prompt (#1624 ) I was trying out the `chat-zero-shot-react-description` agent for [qabot](`dbbd31bb27/qabot/agents/data_query_chain.py (L35-L52)`) but langchain 0.0.108 doesn't correctly use custom `input_variables` in the prompt template.	2023-03-13 23:07:35 -07:00
Tim Asp	b3234bf3b0	cleanup: unify 3 different pdf loaders, rename PagedPDFSplitter (#1615 ) `OnlinePDFLoader` and `PagedPDFSplitter` lived separately from the rest of the pdf loaders. Because they're all similar, I propose moving them all to `pdf.py` and the same docs/examples page. Additionally, the `PagedPDFSplitter` name doesn't match the pattern the rest of the loaders follow, so I renamed it to `PyPDFLoader` and had it inherit from `BasePDFLoader` so it can now load from remote file sources.	2023-03-13 23:06:50 -07:00
Luis	562d9891ea	Add regex dict: (#1616 ) This class enables us to send a dictionary containing an output key and the expected format, which in turn allows us to retrieve the result of the matching formats and extract specific information from it. To exclude irrelevant information from our return dictionary, we can prompt the LLM to use a specific command that notifies us when it doesn't know the answer. We refer to this variable as the "no_update_value". The updated regular expression pattern (r"{}:\s?([^.'\n']).?") enables us to retrieve values in the format 'Output Key':'value'. We have improved the regex by adding an optional space between ':' and 'value' with "\s?", and by excluding periods and line breaks from the matches using "[^.'\n']".	2023-03-13 23:05:39 -07:00
Harrison Chase	56aff797c0	docs req (#1647 )	2023-03-13 16:03:32 -07:00
Harrison Chase	d53ff270e0	bump version to 109 (#1646 )	2023-03-13 15:52:35 -07:00
Harrison Chase	df6c33d4b3	Harrison/new output parser (#1617 )	2023-03-13 15:08:39 -07:00
Dennis Aumiller	039d05c808	Update types in cohere.py (#1635 ) Adjust argument type and clarification on parameter limits for attributes `frequency_penalty` and `presence_penalty`.	2023-03-13 09:08:32 -07:00
Harrison Chase	aed9f9febe	Harrison/return intermediate (#1633 ) Co-authored-by: Mario Kostelac <mario@intercom.io>	2023-03-13 07:54:29 -07:00
Harrison Chase	72b461e257	improve chat error (#1632 )	2023-03-13 07:43:44 -07:00
Peng Qu	cb646082ba	remove an extra whitespace (#1625 )	2023-03-13 07:27:21 -07:00
Eugene Yurtsev	bd4a2a670b	Add copy button to sphinx notebooks (#1622 ) This adds a copy button at the top right corner of all notebook cells in sphinx notebooks.	2023-03-12 21:15:07 -07:00
Ikko Eltociear Ashimine	6e98ab01e1	Fix typo in vectorstore.ipynb (#1614 ) Initalize -> Initialize	2023-03-12 14:12:47 -07:00
Harrison Chase	c0ad5d13b8	bump to version 108 (#1613 )	2023-03-12 09:50:45 -07:00
yakigac	acd86d33bc	Add read only shared memory (#1491 ) Provide shared memory capability for the Agent. Inspired by #1293 . ## Problem If both Agent and Tools (i.e., LLMChain) use the same memory, both of them will save the context. It can be annoying in some cases. ## Solution Create a memory wrapper that ignores the save and clear, thereby preventing updates from Agent or Tools.	2023-03-12 09:34:36 -07:00
Abhinav Upadhyay	9707eda83c	Fix docstring of FAISS constructor (#1611 )	2023-03-12 09:31:40 -07:00
Kayvane Shakerifar	7e550df6d4	feat: add lookup index to csv loader to make retrieving the original … (#1612 ) feat: add lookup index to csv loader to make retrieving the original csv information easier using the Document properties	2023-03-12 09:29:27 -07:00
Harrison Chase	c9b5a30b37	move output parsing (#1605 )	2023-03-11 16:41:03 -08:00
Harrison Chase	cb04ba0136	Add support for intermediate steps to SQLDatabaseSequentialChain (#1583 ) (#1601 ) for https://github.com/hwchase17/langchain/issues/1582 I simply added the `return_intermediate_steps` and changed the `output_keys` function. I added 2 simple tests, 1 for SQLDatabaseSequentialChain without the intermediate steps and 1 with them. Co-authored-by: brad-nemetski <115185478+brad-nemetski@users.noreply.github.com>	2023-03-11 15:44:41 -08:00
Harrison Chase	5903a93f3d	add convenience method to call chat model as an llm (#1604 )	2023-03-11 15:04:57 -08:00
Harrison Chase	15de3e8137	Harrison/docs footer (#1600 ) Co-authored-by: Albert Avetisian <albert.avetisian@gmail.com>	2023-03-11 09:18:35 -08:00
Harrison Chase	f95d551f7a	Harrison/shallow metadata (#1599 ) Co-authored-by: Jesse Zhang <jessetanzhang@gmail.com>	2023-03-11 09:18:25 -08:00
Harrison Chase	c6bfa00178	bump version to 107 (#1590 )	2023-03-10 15:39:30 -08:00
Tim Asp	01a57198b8	[bugfix] Fix persisted chromadb vectorstore (#1444 ) If a `persist_directory` param was set, chromadb would throw a warning that "No embedding_function provided, using default embedding function: SentenceTransformerEmbeddingFunction" and would error with an `Illegal instruction: 4` error. This is on a MBP M1 13.2.1, python 3.9. I'm not entirely sure why that error happened, but when using `get_or_create_collection` instead of `list_collections` on our end, the error and warning go away and chroma works as expected. As an added bonus, this is cleaner and likely more efficient. `list_collections` builds a new `Collection` instance for each collection, then `Chroma` would just use the `name` field to tell if the collection existed.	2023-03-10 15:14:35 -08:00
Harrison Chase	8dba30f31e	Harrison/kwargs loaders (#1588 ) Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>	2023-03-10 15:05:06 -08:00
Harrison Chase	9f78717b3c	Harrison/callbacks (#1587 )	2023-03-10 12:53:09 -08:00
Harrison Chase	90846dcc28	fix chat agent (#1586 )	2023-03-10 12:40:37 -08:00
Claus Thomasen	6ed16e13b1	Readded similarity_search_by_vector (#1568 ) I am redoing this PR, as I made a mistake by merging the latest changes into my fork's branch, sorry. This added a bunch of commits to my previous PR. This fixes #1451.	2023-03-10 12:40:14 -08:00
Harrison Chase	c1dc784a3d	buffer memory old version (#1581 ) bring back an older version of memory since people seem to be using it more widely	2023-03-10 11:27:15 -08:00
fabi.s	5b0e747f9a	Fix description of UnstructuredURLLoader & UnstructuredHTMLLoader (#1570 )	2023-03-10 07:08:58 -08:00
Zach Schillaci	624c72c266	Add wikipedia tool doc (#1579 )	2023-03-10 07:07:27 -08:00
Ryan Dao	a950287206	Strip trailing whitespaces in agent's stop sequences (#1566 ) Fixes #1489	2023-03-09 16:36:15 -08:00
Tim Asp	30383abb12	Add CSVLoader document loader (#1573 ) Simple CSV document loader which wraps the `csv` reader and preps the file with a single `Document` per row. The column header is prepended to each value for context, which is useful for embedding and semantic search.	2023-03-09 16:35:18 -08:00
Zach Schillaci	cdb97f3dfb	Add Wikipedia search utility and tool (#1561 ) The Python `wikipedia` package gives easy access for searching and fetching pages from Wikipedia, see https://pypi.org/project/wikipedia/. It can serve as an additional search and retrieval tool, like the existing Google and SerpAPI helpers, for both chains and agents.	2023-03-09 16:34:39 -08:00
Felix Altenberger	b44c8bd969	Add optional `base_url` arg to `GitbookLoader` (#1552 ) First of all, big kudos on what you guys are doing, langchain is enabling some really amazing use cases and I'm having lots of fun playing around with it. It's really cool how many data sources it supports out of the box. However, I noticed some limitations of the current `GitbookLoader` which this PR addresses: The main change is that I added an optional `base_url` arg to `GitbookLoader`. This enables use cases where one wants to crawl docs from a start page other than the index page, e.g., the following call would scrape all pages that are reachable via nav bar links from "https://docs.zenml.io/v/0.35.0": ```python GitbookLoader( web_page="https://docs.zenml.io/v/0.35.0", load_all_paths=True, base_url="https://docs.zenml.io", ) ``` Previously, this would fail because relative links would be of the form `/v/0.35.0/...` and the full link URLs would become `docs.zenml.io/v/0.35.0/v/0.35.0/...`. I also fixed another issue of the `GitbookLoader` where the link URLs were constructed incorrectly as `website//relative_url` if the provided `web_page` had a trailing slash.	2023-03-09 16:32:40 -08:00
Andriy Mulyar	c9189d354a	AtlasDB vector store documentation updates. (#1572 ) - Updated errors in the AtlasDB vector store documentation - Removed extraneous output logs in example notebook.	2023-03-09 16:31:14 -08:00
blob42	622578a022	docs: fix typo in searx tool (#1569 ) Co-authored-by: blob42 <spike@w530>	2023-03-09 15:58:33 -08:00
Matt Robinson	7018806a92	feat: document loader for markdown files (#1558 ) ### Summary Adds a document loader for handling markdown files. This document loader requires `unstructured>=0.4.16`. ### Testing ```python from langchain.document_loaders import UnstructuredMarkdownLoader loader = UnstructuredMarkdownLoader("README.md") loader.load() ```	2023-03-09 10:55:07 -08:00
Harrison Chase	bd335ffd64	bump version to 106 (#1562 )	2023-03-09 10:20:54 -08:00
Harrison Chase	a094c49153	add chat agent (#1509 )	2023-03-09 09:12:08 -08:00
Brenton Wheeler	99fe023496	docs: fix typo in modules/indexes/chain_examples/question_answering (#1551 ) ![image](https://user-images.githubusercontent.com/11394076/224007874-3a52adf6-ff7a-4f22-9dbf-18c83d08167f.png)	2023-03-09 09:11:43 -08:00
Harrison Chase	3ee32a01ea	Harrison/prompt layer (#1547 ) Co-authored-by: Jonathan Pedoeem <jonathanped@gmail.com> Co-authored-by: AbuBakar <abubakarsohail123@gmail.com>	2023-03-08 21:24:27 -08:00
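
The URL problem described in b44c8bd969 (#1552) can be reproduced without langchain. The sketch below is illustrative only and is not the `GitbookLoader` implementation: the helper names `naive_join` and `join_with_base` and the page name `getting-started` are hypothetical, and plain string concatenation is an assumption about how the faulty URLs in the message arose.

```python
def naive_join(web_page: str, relative_url: str) -> str:
    """Concatenate the start page and a nav link (assumed old behaviour)."""
    return web_page + relative_url


def join_with_base(base_url: str, relative_url: str) -> str:
    """Resolve a nav link against a separate base URL.

    Slashes are normalised on both sides so a trailing slash on the
    base cannot produce a double slash in the result.
    """
    return base_url.rstrip("/") + "/" + relative_url.lstrip("/")


web_page = "https://docs.zenml.io/v/0.35.0"
relative = "/v/0.35.0/getting-started"  # hypothetical nav link

# The version prefix is duplicated when the start page is used as the base.
print(naive_join(web_page, relative))

# A trailing slash on web_page additionally yields a double slash.
print(naive_join(web_page + "/", relative))

# With an explicit base URL the link resolves as intended.
print(join_with_base("https://docs.zenml.io", relative))
```

Passing the site root separately from the start page is what lets a crawl begin at a versioned page while links are still resolved against the root.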

967 Commits