langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Tim Asp	b3234bf3b0	cleanup: unify 3 different pdf loaders, rename PagedPDFSplitter (#1615 ) `OnlinePDFLoader` and `PagedPDFSplitter` lived separate from the rest of the pdf loaders. Because they're all similar, I propose moving all to `pdy.py` and the same docs/examples page. Additionally, `PagedPDFSplitter` naming doesn't match the pattern the rest of the loaders follow, so I renamed to `PyPDFLoader` and had it inherit from `BasePDFLoader` so it can now load from remote file sources.	2023-03-13 23:06:50 -07:00
Luis	562d9891ea	Add regex dict: (#1616 ) This class enables us to send a dictionary containing an output key and the expected format, which in turn allows us to retrieve the result of the matching formats and extract specific information from it. To exclude irrelevant information from our return dictionary, we can prompt the LLM to use a specific command that notifies us when it doesn't know the answer. We refer to this variable as the "no_update_value". Regarding the updated regular expression pattern (r"{}:\s?([^.'\n']).?"), it enables us to retrieve a format as 'Output Key':'value'. We have improved the regex by adding an optional space between ':' and 'value' with "s?", and by excluding points and line jumps from the matches using "[^.'\n']".	2023-03-13 23:05:39 -07:00
Harrison Chase	56aff797c0	docs req (#1647 )	2023-03-13 16:03:32 -07:00
Harrison Chase	d53ff270e0	bump version to 109 (#1646 )	2023-03-13 15:52:35 -07:00
Harrison Chase	df6c33d4b3	Harrison/new output parser (#1617 )	2023-03-13 15:08:39 -07:00
Dennis Aumiller	039d05c808	Update types in cohere.py (#1635 ) Adjust argument type and clarification on parameter limits for attributes `frequency_penalty` and `presence_penalty`.	2023-03-13 09:08:32 -07:00
Harrison Chase	aed9f9febe	Harrison/return intermediate (#1633 ) Co-authored-by: Mario Kostelac <mario@intercom.io>	2023-03-13 07:54:29 -07:00
Harrison Chase	72b461e257	improve chat error (#1632 )	2023-03-13 07:43:44 -07:00
Peng Qu	cb646082ba	remove an extra whitespace (#1625 )	2023-03-13 07:27:21 -07:00
Eugene Yurtsev	bd4a2a670b	Add copy button to sphinx notebooks (#1622 ) This adds a copy button at the top right corner of all notebook cells in sphinx notebooks.	2023-03-12 21:15:07 -07:00
Ikko Eltociear Ashimine	6e98ab01e1	Fix typo in vectorstore.ipynb (#1614 ) Initalize -> Initialize	2023-03-12 14:12:47 -07:00
Harrison Chase	c0ad5d13b8	bump to version 108 (#1613 )	2023-03-12 09:50:45 -07:00
yakigac	acd86d33bc	Add read only shared memory (#1491 ) Provide shared memory capability for the Agent. Inspired by #1293 . ## Problem If both Agent and Tools (i.e., LLMChain) use the same memory, both of them will save the context. It can be annoying in some cases. ## Solution Create a memory wrapper that ignores the save and clear, thereby preventing updates from Agent or Tools.	2023-03-12 09:34:36 -07:00
Abhinav Upadhyay	9707eda83c	Fix docstring of FAISS constructor (#1611 )	2023-03-12 09:31:40 -07:00
Kayvane Shakerifar	7e550df6d4	feat: add lookup index to csv loader to make retrieving the original … (#1612 ) feat: add lookup index to csv loader to make retrieving the original csv information easier using theDocument properties	2023-03-12 09:29:27 -07:00
Harrison Chase	c9b5a30b37	move output parsing (#1605 )	2023-03-11 16:41:03 -08:00
Harrison Chase	cb04ba0136	Add support for intermediate steps to SQLDatabaseSequentialChain (#1583 ) (#1601 ) for https://github.com/hwchase17/langchain/issues/1582 I simply added the `return_intermediate_steps` and changed the `output_keys` function. I added 2 simple tests, 1 for SQLDatabaseSequentialChain without the intermediate steps and 1 with Co-authored-by: brad-nemetski <115185478+brad-nemetski@users.noreply.github.com>	2023-03-11 15:44:41 -08:00
Harrison Chase	5903a93f3d	add convinence method to call chat model as an llm (#1604 )	2023-03-11 15:04:57 -08:00
Harrison Chase	15de3e8137	Harrison/docs footer (#1600 ) Co-authored-by: Albert Avetisian <albert.avetisian@gmail.com>	2023-03-11 09:18:35 -08:00
Harrison Chase	f95d551f7a	Harrison/shallow metadata (#1599 ) Co-authored-by: Jesse Zhang <jessetanzhang@gmail.com>	2023-03-11 09:18:25 -08:00
Harrison Chase	c6bfa00178	bump version to 107 (#1590 )	2023-03-10 15:39:30 -08:00
Tim Asp	01a57198b8	[bugfix] Fix persisted chromadb vectorstore (#1444 ) If a `persist_directory` param was set, chromadb would throw a warning that ""No embedding_function provided, using default embedding function: SentenceTransformerEmbeddingFunction". and would error with a `Illegal instruction: 4` error. This is on a MBP M1 13.2.1, python 3.9. I'm not entirely sure why that error happened, but when using `get_or_create_collection` instead of `list_collection` on our end, the error and warning goes away and chroma works as expected. Added bonus this is cleaner and likely more efficient. `list_collections` builds a new `Collection` instance for each collect, then `Chroma` would just use the `name` field to tell if the collection existed.	2023-03-10 15:14:35 -08:00
Harrison Chase	8dba30f31e	Harrison/kwargs loaders (#1588 ) Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>	2023-03-10 15:05:06 -08:00
Harrison Chase	9f78717b3c	Harrison/callbacks (#1587 )	2023-03-10 12:53:09 -08:00
Harrison Chase	90846dcc28	fix chat agent (#1586 )	2023-03-10 12:40:37 -08:00
Claus Thomasen	6ed16e13b1	Readded similarity_search_by_vector (#1568 ) I am redoing this PR, as I made a mistake by merging the latest changes into my fork's branch, sorry. This added a bunch of commits to my previous PR. This fixes #1451.	2023-03-10 12:40:14 -08:00
Harrison Chase	c1dc784a3d	buffer memory old version (#1581 ) bring back an older version of memory since people seem to be using it more widely	2023-03-10 11:27:15 -08:00
fabi.s	5b0e747f9a	Fix description of UnstructuredURLLoader & UnstructuredHTMLLoader (#1570 )	2023-03-10 07:08:58 -08:00
Zach Schillaci	624c72c266	Add wikipedia tool doc (#1579 )	2023-03-10 07:07:27 -08:00
Ryan Dao	a950287206	Strip trailing whitespaces in agent's stop sequences (#1566 ) Fixes #1489	2023-03-09 16:36:15 -08:00
Tim Asp	30383abb12	Add CSVLoader document loader (#1573 ) Simple CSV document loader which wraps `csv` reader, and preps the file with a single `Document` per row. The column header is prepended to each value for context which is useful for context with embedding and semantic search	2023-03-09 16:35:18 -08:00
Zach Schillaci	cdb97f3dfb	Add Wikipedia search utility and tool (#1561 ) The Python `wikipedia` package gives easy access for searching and fetching pages from Wikipedia, see https://pypi.org/project/wikipedia/. It can serve as an additional search and retrieval tool, like the existing Google and SerpAPI helpers, for both chains and agents.	2023-03-09 16:34:39 -08:00
Felix Altenberger	b44c8bd969	Add optional `base_url` arg to `GitbookLoader` (#1552 ) First of all, big kudos on what you guys are doing, langchain is enabling some really amazing usecases and I'm having lot's of fun playing around with it. It's really cool how many data sources it supports out of the box. However, I noticed some limitations of the current `GitbookLoader` which this PR adresses: The main change is that I added an optional `base_url` arg to `GitbookLoader`. This enables use cases where one wants to crawl docs from a start page other than the index page, e.g., the following call would scrape all pages that are reachable via nav bar links from "https://docs.zenml.io/v/0.35.0": ```python GitbookLoader( web_page="https://docs.zenml.io/v/0.35.0", load_all_paths=True, base_url="https://docs.zenml.io", ) ``` Previously, this would fail because relative links would be of the form `/v/0.35.0/...` and the full link URLs would become `docs.zenml.io/v/0.35.0/v/0.35.0/...`. I also fixed another issue of the `GitbookLoader` where the link URLs were constructed incorrectly as `website//relative_url` if the provided `web_page` had a trailing slash.	2023-03-09 16:32:40 -08:00
Andriy Mulyar	c9189d354a	AtlasDB vector store documentation updates. (#1572 ) - Updated errors in the AtlasDB vector store documentation - Removed extraneous output logs in example notebook.	2023-03-09 16:31:14 -08:00
blob42	622578a022	docs: fix typo in searx tool (#1569 ) Co-authored-by: blob42 <spike@w530>	2023-03-09 15:58:33 -08:00
Matt Robinson	7018806a92	feat: document loader for markdown files (#1558 ) ### Summary Adds a document loader for handling markdown files. This document loader requires `unstructured>=0.4.16`. ### Testing ```python from langchain.document_loaders import UnstructuredMarkdownLoader loader = UnstructuredMarkdownLoader("README.md") loader.load() ```	2023-03-09 10:55:07 -08:00
Harrison Chase	bd335ffd64	bump version to 106 (#1562 )	2023-03-09 10:20:54 -08:00
Harrison Chase	a094c49153	add chat agent (#1509 )	2023-03-09 09:12:08 -08:00
Brenton Wheeler	99fe023496	docs: fix typo in modules/indexes/chain_examples/question_answering (#1551 ) docs: fix typo in modules/indexes/chain_examples/question_answering ![image](https://user-images.githubusercontent.com/11394076/224007874-3a52adf6-ff7a-4f22-9dbf-18c83d08167f.png)	2023-03-09 09:11:43 -08:00
Harrison Chase	3ee32a01ea	Harrison/prompt layer (#1547 ) Co-authored-by: Jonathan Pedoeem <jonathanped@gmail.com> Co-authored-by: AbuBakar <abubakarsohail123@gmail.com>	2023-03-08 21:24:27 -08:00
Harrison Chase	c844d1fd46	Harrison/chunk size (#1549 ) Co-authored-by: Florian Leuerer <31259070+floleuerer@users.noreply.github.com>	2023-03-08 21:24:18 -08:00
Harrison Chase	9405af6919	Harrison/hf inf error (#1543 ) Co-authored-by: Konstantin Hebenstreit <57603012+KonstantinHebenstreit@users.noreply.github.com>	2023-03-08 20:53:46 -08:00
Harrison Chase	357d808484	Harrison/remote paths pdf (#1544 ) Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>	2023-03-08 20:53:37 -08:00
Harrison Chase	cc423f40f1	Harrison/youtube loader (#1545 ) Co-authored-by: Julian Wustl <57504258+Julianwustl@users.noreply.github.com>	2023-03-08 20:53:27 -08:00
Harrison Chase	b053f831cd	Harrison/contributing (#1542 ) Co-authored-by: Saurav Maheshkar <sauravvmaheshkar@gmail.com>	2023-03-08 20:53:16 -08:00
Harrison Chase	523ad8d2e2	Harrison/chat history formatter1 (#1538 ) Co-authored-by: Youssef A. Abukwaik <yousseb@users.noreply.github.com>	2023-03-08 20:46:37 -08:00
Graham Neubig	31303d0b11	Added other evaluation metrics for data-augmented QA (#1521 ) This PR adds additional evaluation metrics for data-augmented QA, resulting in a report like this at the end of the notebook: ![Screen Shot 2023-03-08 at 8 53 23 AM](https://user-images.githubusercontent.com/398875/223731199-8eb8e77f-5ff3-40a2-a23e-f3bede623344.png) The score calculation is based on the [Critique](https://docs.inspiredco.ai/critique/) toolkit, an API-based toolkit (like OpenAI) that has minimal dependencies, so it should be easy for people to run if they choose. The code could further be simplified by actually adding a chain that calls Critique directly, but that probably should be saved for another PR if necessary. Any comments or change requests are welcome!	2023-03-08 20:41:03 -08:00
gidler	494c9d341a	[DOCS] Assorted wording, punctuation, and consistency revisions (#1443 ) Contributing some small fixes I noticed while reading through the documentation. Thank you for a creating and maintaining this project!	2023-03-08 20:16:09 -08:00
Harrison Chase	519f0187b6	Harrison/gdrive pdf (#1433 ) Co-authored-by: LM <93918064+LuisMalhadas@users.noreply.github.com> Co-authored-by: Luis Malhadas <luis@sia.so>	2023-03-08 20:15:36 -08:00
Florian Leuerer	64c6435545	Added client_settings support for chromadb vecstore (#1528 ) # Problem The ChromaDB vecstore only supported local connection. There was no way to use a chromadb server. # Fix Added `client_settings` as Chroma attribute. # Usage ``` from chromadb.config import Settings from langchain.vectorstores import Chroma chroma_settings = Settings(chroma_api_impl="rest", chroma_server_host="localhost", chroma_server_http_port="80") docsearch = Chroma.from_documents(chunks, embeddings, metadatas=metadatas, client_settings=chroma_settings, collection_name=COLLECTION_NAME) ```	2023-03-08 17:42:09 -08:00

... 7 8 9 10 11 ...

1207 Commits