langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Deepankar Mahapatro	5bea731fb4	docs(deployment): add langchain-serve (#2006 ) Adds documentation to deploy Langchain Chains & Agents using Jina. Repo: https://github.com/jina-ai/langchain-serve	2023-03-27 23:32:04 -07:00
Harrison Chase	0e3b0c827e	Harrison/ai plugin (#2084 ) Co-authored-by: Xupeng (Tony) Tong <tongxupeng.cpu@gmail.com>	2023-03-27 23:31:53 -07:00
Ace Eldeib	4be2f9d75a	fix: numerous broken documentation links (#2070 ) seems linkchecker isn't catching them because it runs on generated html. at that point the links are already missing. the generation process seems to strip invalid references when they can't be re-written from md to html. I used https://github.com/tcort/markdown-link-check to check the doc source directly. There are a few false positives on localhost for development.	2023-03-27 23:07:03 -07:00
Harrison Chase	f74a1bebf5	Harrison/duckdb (#2064 ) Co-authored-by: Trent Hauck <trent@trenthauck.com>	2023-03-27 19:51:34 -07:00
Harrison Chase	76ecca4d53	redis retriever (#2060 )	2023-03-27 19:51:23 -07:00
Ankush Gola	b7ebb8fe30	enable streaming in anthropic llm wrapper (#2065 )	2023-03-27 20:25:00 -04:00
Harrison Chase	30e3b31b04	Harrison/document cleanup (#2062 ) Co-authored-by: Delip Rao <delip@users.noreply.github.com>	2023-03-27 16:32:55 -07:00
Harrison Chase	a0cd6672aa	Harrison/site map (#2061 ) Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>	2023-03-27 16:28:08 -07:00
Krulknul	5e91928607	Added `.as_retriever()` to `from_llm()` calls (#2051 )	2023-03-27 15:04:03 -07:00
Jason Holtkamp	3d3e523520	Update getting_started with better example (#1910 ) I noticed that the "getting started" guide section on agents included an example test where the agent was getting the question wrong 😅 I guess Olivia Wilde's dating life is too tough to keep track of for this simple agent example. Let's change it to something a little easier, so users who are running their agent for the first time are less likely to be confused by a result that doesn't match that which is on the docs.	2023-03-27 08:19:13 -07:00
Eduard van Valkenburg	c1a9d83b34	Added Azure Blob Storage File and Container Loader (#1890 ) Added support for document loaders for Azure Blob Storage using a connection string. Fixes #1805 --------- Co-authored-by: Mick Vleeshouwer <mick@imick.nl>	2023-03-27 08:17:14 -07:00
Harrison Chase	b26fa1935d	fix headers (#2039 )	2023-03-27 07:55:57 -07:00
Harrison Chase	bc2ed93b77	fix doc tags (#2019 )	2023-03-26 21:43:51 -07:00
Ankush Gola	c71f2a7b26	small nit on index page (#2018 )	2023-03-27 00:15:24 -04:00
Harrison Chase	51681f653f	fix docs (#2017 )	2023-03-26 20:50:36 -07:00
Harrison Chase	705431aecc	big docs refactor (#1978 ) Co-authored-by: Ankush Gola <ankush.gola@gmail.com>	2023-03-26 19:49:46 -07:00
Harrison Chase	b83e826510	plugin tool (#1974 )	2023-03-24 12:30:08 -07:00
Harrison Chase	6ec5780547	add docs for openai retriever ingest (#1969 )	2023-03-24 08:24:33 -07:00
Harrison Chase	47d37db2d2	WIP: Harrison/base retriever (#1765 )	2023-03-24 07:46:49 -07:00
Enwei Jiao	4f364db9a9	Add milvus for ecosystem (#1951 )	2023-03-23 22:01:28 -07:00
Tim Asp	030ce9f506	fix import error of bs4 (#1952 ) Ran into a broken build if bs4 wasn't installed in the project. Minor tweak to follow the other doc loaders optional package-loading conventions. Also updated html docs to include reference to this new html loader. side note: Should there be 2 different html-to-text document loaders? This new one only handles local files, while the existing unstructured html loader handles HTML from local and remote. So it seems like the improvement was adding the title to the metadata, which is useful but could also be added to `html.py`	2023-03-23 21:56:13 -07:00
Harrison Chase	8990122d5d	retrievers interface (#1948 )	2023-03-23 19:00:38 -07:00
Harrison Chase	52d6bf04d0	tracing improvements to docs (#1947 )	2023-03-23 19:00:18 -07:00
Harrison Chase	b5667bed9e	human input default (#1911 )	2023-03-22 20:30:45 -07:00
Eric Zhu	b3be83c750	Add human as a tool (#1879 ) Human can help AI. #1871	2023-03-22 20:14:52 -07:00
Harrison Chase	50626a10ee	Hx23840 feat/add redisearch vectorstore (#1909 ) Co-authored-by: Peter <peter.shi@alephf.com> Co-authored-by: Peter Shi <42536066+hx23840@users.noreply.github.com>	2023-03-22 19:57:56 -07:00
Harrison Chase	6e1b5b8f7e	Harrison/figma doc loader (#1908 ) Co-authored-by: Ismail Pelaseyed <homanp@gmail.com>	2023-03-22 19:57:46 -07:00
Klein Tahiraj	d3d4503ce2	Remove redundant .docx loader (closes #1716 ) + update how_to_guides.rst (#1891 ) In https://github.com/hwchase17/langchain/issues/1716 , it was identified that there were two .py files performing similar tasks. As a resolution, one of the files has been removed, as its purpose had already been fulfilled by the other file. Additionally, the init has been updated accordingly. Furthermore, the how_to_guides.rst file has been updated to include links to documentation that was previously missing. This was deemed necessary as the existing list on https://langchain.readthedocs.io/en/latest/modules/document_loaders/how_to_guides.html was incomplete, causing confusion for users who rely on the full list of documentation on the left sidebar of the website.	2023-03-22 15:19:42 -07:00
Harrison Chase	1f93c5cf69	extraction docs (#1898 )	2023-03-22 15:00:44 -07:00
Sean Zheng	15b5a08f4b	Update how_to_guides.rst (#1893 ) Adding OpenSearch examples	2023-03-22 14:30:43 -07:00
Harrison Chase	ce5d97bcb3	Harrison/guarded output parser (#1804 ) Co-authored-by: jerwelborn <jeremy.welborn@gmail.com>	2023-03-21 22:07:23 -07:00
DeadBranch	8fa1764c60	docs: update gpt index references to LlamaIndex (#1856 ) The GPT Index project is transitioning to the new project name, LlamaIndex. I've updated a few files referencing the old project name and repository URL to the current ones. From the [LlamaIndex repo](https://github.com/jerryjliu/llama_index): > NOTE: We are rebranding GPT Index as LlamaIndex! We will carry out this transition gradually. > > 2/25/2023: By default, our docs/notebooks/instructions now reference "LlamaIndex" instead of "GPT Index". > > 2/19/2023: By default, our docs/notebooks/instructions now use the llama-index package. However the gpt-index package still exists as a duplicate! > > 2/16/2023: We have a duplicate llama-index pip package. Simply replace all imports of gpt_index with llama_index if you choose to pip install llama-index. I'm not associated with LlamaIndex in any way. I just noticed the discrepancy when studying the langchain documentation.	2023-03-21 22:01:05 -07:00
Harrison Chase	f299bd1416	clean up sagemaker nb (#1875 )	2023-03-21 22:00:08 -07:00
Philipp Schmid	064be93edf	[Embeddings] Add SageMaker Endpoint Embedding class (#1859 ) # What does this PR do? Similar to `llms`, this PR adds a SageMaker-powered `embeddings` class. This is helpful if you want to leverage Hugging Face models on SageMaker for creating your indexes. I added an example to the [docs/modules/indexes/examples/embeddings.ipynb](https://github.com/hwchase17/langchain/compare/master...philschmid:add-sm-embeddings?expand=1#diff-e82629e2894974ec87856aedd769d4bdfe400314b03734f32bee5990bc7e8062) document. The example currently includes some `_### TEMPORARY: Showing how to deploy a SageMaker Endpoint from a Hugging Face model ###_ ` code showing how you can deploy a sentence-transformers model to SageMaker and then run the methods of the embeddings class. @hwchase17 please let me know if/when I should remove the `_### TEMPORARY: Showing how to deploy a SageMaker Endpoint from a Hugging Face model ###_` code. In the description I linked to a detailed blog on how to deploy a Sentence Transformers model, so I think we don't need to include those steps here. I also reused the `ContentHandlerBase` from `langchain.llms.sagemaker_endpoint` and changed the output type to `any` since it depends on the implementation.	2023-03-21 21:51:48 -07:00
anupam-tiwari	86822d1cc2	Fixes the import typo in the vector db text generator notebook (#1874 ) Fixes the import typo in the vector db text generator notebook for the chroma library Co-authored-by: Anupam <anupam@10-16-252-145.dynapool.wireless.nyu.edu>	2023-03-21 21:48:26 -07:00
Harrison Chase	a581bce379	remove key (#1863 )	2023-03-21 12:43:41 -07:00
Harrison Chase	2ffc643086	add listen api docs (#1855 )	2023-03-21 09:29:34 -07:00
Tomoko Uchida	b706966ebc	Add setup instruction in Getting Started for Indexing (#1847 ) `VectorstoreIndexCreator` [uses Chroma as the vectorstore by default](`1c22657256/langchain/indexes/vectorstore.py (L49)`). It may be helpful to add a short note for the setup. You can see how the notebook looks here. https://github.com/mocobeta/langchain/blob/feat/add-setup-instruction-to-index-getting-started/docs/modules/indexes/getting_started.ipynb	2023-03-21 09:06:35 -07:00
Harrison Chase	1c22657256	Harrison/faiss merge (#1843 ) Co-authored-by: Ting Su <ting.su.1995@outlook.com>	2023-03-20 22:54:08 -07:00
Simon Zhou	3674074eb0	Add Qdrant to ecosystem page (#1830 ) Add [Qdrant](https://qdrant.tech/) to [LangChain ecosystem](https://langchain.readthedocs.io/en/latest/ecosystem.html) page.	2023-03-20 22:06:40 -07:00
Wenbin Fang	a7e09d46c5	Add podcast api tool to use NLP to search all podcasts or episodes. (#1833 ) Use the following code to test: ```python import os from langchain.llms import OpenAI from langchain.chains.api import podcast_docs from langchain.chains import APIChain # Get api key here: https://openai.com/pricing os.environ["OPENAI_API_KEY"] = "sk-xxxxx" # Get api key here: https://www.listennotes.com/api/pricing/ listen_api_key = 'xxx' llm = OpenAI(temperature=0) headers = {"X-ListenAPI-Key": listen_api_key} chain = APIChain.from_llm_and_api_docs(llm, podcast_docs.PODCAST_DOCS, headers=headers, verbose=True) chain.run("Search for 'silicon valley bank' podcast episodes, audio length is more than 30 minutes, return only 1 results") ``` Known issues: the api response data might be too big, and we'll get such error: `openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 6733 tokens (6477 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.`	2023-03-20 22:04:17 -07:00
Ikko Eltociear Ashimine	9555bbd5bb	Fix typo in sqlite.ipynb (#1828 ) overriden -> overridden	2023-03-20 16:47:19 -07:00
Harrison Chase	d5b4393bb2	Harrison/llm math (#1808 ) Co-authored-by: Vadym Barda <vadim.barda@gmail.com>	2023-03-20 07:53:26 -07:00
Bryan Helmig	7b6ff7fe00	Follow up to #1803 to remove dynamic docs route. (#1818 ) The base docs are going to be more stable and familiar for folks. Dynamic route is currently in flux.	2023-03-20 07:52:41 -07:00
Harrison Chase	76c7b1f677	Harrison/wandb (#1764 ) Co-authored-by: Anish Shah <93145909+ash0ts@users.noreply.github.com>	2023-03-20 07:52:27 -07:00
Harrison Chase	d5d50c39e6	Harrison/azure embeddings (#1787 ) Co-authored-by: Hemant <4627288+ghaccount@users.noreply.github.com>	2023-03-19 10:42:33 -07:00
Harrison Chase	1f18698b2a	Harrison/token buffer memory (#1786 ) Co-authored-by: Aratako <127325395+Aratako@users.noreply.github.com>	2023-03-19 10:42:24 -07:00
Harrison Chase	ef4945af6b	Harrison/chat token usage (#1785 )	2023-03-19 10:32:31 -07:00
Harrison Chase	7de2ada3ea	Harrison/add source column (#1784 ) Co-authored-by: Brian Graham <46691715+briangrahamww@users.noreply.github.com> Co-authored-by: briangrahamww <brian.graham@ww.com>	2023-03-19 10:32:13 -07:00
hitoshi44	3cf493b089	Fix Document & Expose StringPromptTemplate as a custom-prompt-template. (#1753 ) Regarding [this issue](https://github.com/hwchase17/langchain/issues/1754), the code in the document [Creating a custom prompt template](https://langchain.readthedocs.io/en/latest/modules/prompts/examples/custom_prompt_template.html) is no longer functional and outdated. To address this, I have made the following changes: 1. Updated the guide in the document to use `StringPromptTemplate` instead of `BasePromptTemplate`. 2. Exposed `StringPromptTemplate` in `prompts/__init__.py` for easier importing.	2023-03-19 09:47:56 -07:00
hung_ng__	3d6fcb85dc	Add load json prompt example (#1776 ) Hi, I just want to add a PR on the prompt serialization examples of loading from JSON so that it can contain the same as loading from YAML.	2023-03-19 09:28:56 -07:00
Piyush Jain	1a8790d808	Corrects copyright year (#1762 ) Corrected copyright year.	2023-03-18 19:55:05 -07:00
Harrison Chase	8685d53adc	querying tabular data (#1758 )	2023-03-18 11:12:18 -07:00
Harrison Chase	dd90fd02d5	Harrison/move docs (#1741 )	2023-03-17 08:49:10 -07:00
Harrison Chase	07766a69f3	move docs (#1740 )	2023-03-17 08:42:28 -07:00
Harrison Chase	96ebe98dc2	Harrison/latex splitter (#1738 ) Co-authored-by: Aidan Holland <thehappydinoa@gmail.com> Co-authored-by: Jan de Boer <44832123+Janldeboer@users.noreply.github.com>	2023-03-17 08:10:27 -07:00
Harrison Chase	45f05fc939	Harrison/blackboard loader (#1737 ) Co-authored-by: Aidan Holland <thehappydinoa@gmail.com>	2023-03-17 08:02:44 -07:00
Vincent Liao	cf9c3f54f7	docs: add docs link to agent toolkits (#1735 ) New to Langchain, was a bit confused where I should find the toolkits section when I'm at `agent/key_concepts` docs. I added a short link that points to the how to section.	2023-03-17 07:59:49 -07:00
Piyush Jain	cdff6c8181	Sagemaker Endpoint LLM (#1686 ) Updates #965 --------- Co-authored-by: Nimisha Mehta <116048415+nimimeht@users.noreply.github.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>	2023-03-16 21:58:06 -07:00
libra	8a95fdaee1	Fix all the bugs in Tool initialization in docs (#1725 ) Fix all the examples in the docs that initialize `Tool`. Tested by rendering with Jupyter.	2023-03-16 21:55:44 -07:00
jerwelborn	55efbb8a7e	pydantic/json parsing (#1722 ) ``` class Joke(BaseModel): setup: str = Field(description="question to set up a joke") punchline: str = Field(description="answer to resolve the joke") joke_query = "Tell me a joke." # Or, an example with compound type fields. #class FloatArray(BaseModel): # values: List[float] = Field(description="list of floats") # #float_array_query = "Write out a few terms of fibonacci." model = OpenAI(model_name='text-davinci-003', temperature=0.0) parser = PydanticOutputParser(pydantic_object=Joke) prompt = PromptTemplate( template="Answer the user query.\n{format_instructions}\n{query}\n", input_variables=["query"], partial_variables={"format_instructions": parser.get_format_instructions()} ) _input = prompt.format_prompt(query=joke_query) print("Prompt:\n", _input.to_string()) output = model(_input.to_string()) print("Completion:\n", output) parsed_output = parser.parse(output) print("Parsed completion:\n", parsed_output) ``` ``` Prompt: Answer the user query. The output should be formatted as a JSON instance that conforms to the JSON schema below. For example, the object {"foo": ["bar", "baz"]} conforms to the schema {"foo": {"description": "a list of strings field", "type": "string"}}. Here is the output schema: --- {"setup": {"description": "question to set up a joke", "type": "string"}, "punchline": {"description": "answer to resolve the joke", "type": "string"}} --- Tell me a joke. Completion: {"setup": "Why don't scientists trust atoms?", "punchline": "Because they make up everything!"} Parsed completion: setup="Why don't scientists trust atoms?" punchline='Because they make up everything!' ``` Of course, this works only with LMs of sufficient capacity. DaVinci is reliable, but not always. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-03-16 21:43:11 -07:00
Jonathan Pedoeem	606605925d	Adding ability to `return_pl_id` to all PromptLayer Models in LangChain (#1699 ) PromptLayer now has support for [several different tracking features.](https://magniv.notion.site/Track-4deee1b1f7a34c1680d085f82567dab9) In order to use any of these features you need to have a request id associated with the request. In this PR we add a boolean argument called `return_pl_id` which will add `pl_request_id` to the `generation_info` dictionary associated with a generation. We also updated the relevant documentation.	2023-03-16 17:05:23 -07:00
Harrison Chase	3ea6d9c4d2	add docs for save/load messages (#1697 )	2023-03-15 13:13:08 -07:00
Piyush Jain	1279c8de39	Fixed typo, clarified language (#1682 )	2023-03-15 08:00:11 -07:00
at-b612	c7779c800a	Added Mynd URL to gallery (#1684 )	2023-03-15 07:59:59 -07:00
Jithin James	6f4f771897	docs: add path to state_of_the_union.txt in indexes/getting_started page (#1691 ) Add the state_of_the_union.txt file so that it's easier to follow through with the example. --------- Co-authored-by: Jithin James <jjmachan@pop-os.localdomain>	2023-03-15 07:59:47 -07:00
Ankush Gola	d4edd3c312	Zapier Integration (#1654 ) * Zapier Wrapper and Tools (implemented by Zapier Team) * Zapier Toolkit, examples with mrkl agent --------- Co-authored-by: Mike Knoop <mikeknoop@gmail.com> Co-authored-by: Robert Lewis <robert.lewis@zapier.com>	2023-03-14 23:06:17 -07:00
Harrison Chase	0b29e68c17	Harrison/pgvector (#1679 ) Co-authored-by: Aman Kumar <krsingh.aman@gmail.com>	2023-03-14 21:13:58 -07:00
Harrison Chase	4d7fdb8957	Harrison/gml save (#1676 ) Co-authored-by: Satoru Sakamoto <51464932+satoru814@users.noreply.github.com>	2023-03-14 20:00:22 -07:00
Harrison Chase	656efe6ef3	Harrison/fix nb (#1678 )	2023-03-14 19:34:23 -07:00
Matt Robinson	63aa28e2a6	feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667 ) ### Summary Allows users to pass in `**unstructured_kwargs` to Unstructured document loaders. Implemented with the `strategy` kwarg in mind, but will pass in other kwargs like `include_page_breaks` as well. The two currently supported strategies are `"hi_res"`, which is more accurate but takes longer, and `"fast"`, which processes faster but with lower accuracy. The `"hi_res"` strategy is the default. For PDFs, if `detectron2` is not available and the user selects `"hi_res"`, the loader will fall back to using the `"fast"` strategy. ### Testing #### Make sure the `strategy` kwarg works Run the following in IPython to verify that the `"fast"` strategy is indeed faster. ```python from langchain.document_loaders import UnstructuredFileLoader loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", strategy="fast", mode="elements") %timeit loader.load() loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", mode="elements") %timeit loader.load() ``` On my system I get: ```python In [3]: from langchain.document_loaders import UnstructuredFileLoader In [4]: loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", strategy="fast", mode="elements") In [5]: %timeit loader.load() 247 ms ± 369 µs per loop (mean ± std. dev. of 7 runs, 1 loop each) In [6]: loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", mode="elements") In [7]: %timeit loader.load() 2.45 s ± 31 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) ``` #### Make sure older versions of `unstructured` still work Run `pip install unstructured==0.5.3` and then verify the following runs without error: ```python from langchain.document_loaders import UnstructuredFileLoader loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", mode="elements") loader.load() ```	2023-03-14 18:15:28 -07:00
Matthias Kern	c3dfbdf0da	Remove outdated code from Chat VectorDB QA example (#1670 )	2023-03-14 18:13:51 -07:00
Bilel MEDIMEGH	a2280f321f	Docs: Fix typo in memory/key_concepts.md (#1671 ) dialouge -> dialogue	2023-03-14 18:12:01 -07:00
Xin Qiu	4e13cef05a	feat: add redisearch vectorstore (#1307 ) # Description Add `RediSearch` vectorstore for LangChain RediSearch: [RediSearch quick start](https://redis.io/docs/stack/search/quick_start/) # How to use ``` from langchain.vectorstores.redisearch import RediSearch rds = RediSearch.from_documents(docs, embeddings,redisearch_url="redis://localhost:6379") ```	2023-03-14 18:06:03 -07:00
Harrison Chase	2d098e8869	Harrison/agent eval (#1620 ) Co-authored-by: jerwelborn <jeremy.welborn@gmail.com>	2023-03-14 12:37:48 -07:00
Harrison Chase	7cf46b3fee	Harrison/convo agent (#1642 )	2023-03-14 09:42:24 -07:00
Jon Luo	0a1b1806e9	sql: do not hard code the LIMIT clause in the table_info section (#1563 ) Seeing a lot of issues in Discord in which the LLM is not using the correct LIMIT clause for different SQL dialects. ie, it's using `LIMIT` for mssql instead of `TOP`, or instead of `ROWNUM` for Oracle, etc. I think this could be due to us specifying the LIMIT statement in the example rows portion of `table_info`. So the LLM is seeing the `LIMIT` statement used in the prompt. Since we can't specify each dialect's method here, I think it's fine to just replace the `SELECT... LIMIT 3;` statement with `3 rows from table_name table:`, and wrap everything in a block comment directly following the `CREATE` statement. The Rajkumar et al paper wrapped the example rows and `SELECT` statement in a block comment as well anyway. Thoughts @fpingham?	2023-03-13 23:08:27 -07:00
Tim Asp	b3234bf3b0	cleanup: unify 3 different pdf loaders, rename PagedPDFSplitter (#1615 ) `OnlinePDFLoader` and `PagedPDFSplitter` lived separately from the rest of the pdf loaders. Because they're all similar, I propose moving all to `pdf.py` and the same docs/examples page. Additionally, the `PagedPDFSplitter` name doesn't match the pattern the rest of the loaders follow, so I renamed it to `PyPDFLoader` and had it inherit from `BasePDFLoader` so it can now load from remote file sources.	2023-03-13 23:06:50 -07:00
Harrison Chase	56aff797c0	docs req (#1647 )	2023-03-13 16:03:32 -07:00
Harrison Chase	d53ff270e0	bump version to 109 (#1646 )	2023-03-13 15:52:35 -07:00
Harrison Chase	df6c33d4b3	Harrison/new output parser (#1617 )	2023-03-13 15:08:39 -07:00
Eugene Yurtsev	bd4a2a670b	Add copy button to sphinx notebooks (#1622 ) This adds a copy button at the top right corner of all notebook cells in sphinx notebooks.	2023-03-12 21:15:07 -07:00
Ikko Eltociear Ashimine	6e98ab01e1	Fix typo in vectorstore.ipynb (#1614 ) Initalize -> Initialize	2023-03-12 14:12:47 -07:00
yakigac	acd86d33bc	Add read only shared memory (#1491 ) Provide shared memory capability for the Agent. Inspired by #1293 . ## Problem If both Agent and Tools (i.e., LLMChain) use the same memory, both of them will save the context. It can be annoying in some cases. ## Solution Create a memory wrapper that ignores the save and clear, thereby preventing updates from Agent or Tools.	2023-03-12 09:34:36 -07:00
Harrison Chase	c9b5a30b37	move output parsing (#1605 )	2023-03-11 16:41:03 -08:00
Harrison Chase	15de3e8137	Harrison/docs footer (#1600 ) Co-authored-by: Albert Avetisian <albert.avetisian@gmail.com>	2023-03-11 09:18:35 -08:00
Harrison Chase	9f78717b3c	Harrison/callbacks (#1587 )	2023-03-10 12:53:09 -08:00
Harrison Chase	90846dcc28	fix chat agent (#1586 )	2023-03-10 12:40:37 -08:00
Zach Schillaci	624c72c266	Add wikipedia tool doc (#1579 )	2023-03-10 07:07:27 -08:00
Tim Asp	30383abb12	Add CSVLoader document loader (#1573 ) Simple CSV document loader which wraps the `csv` reader and preps the file with a single `Document` per row. The column header is prepended to each value, which provides useful context for embedding and semantic search.	2023-03-09 16:35:18 -08:00
Andriy Mulyar	c9189d354a	AtlasDB vector store documentation updates. (#1572 ) - Updated errors in the AtlasDB vector store documentation - Removed extraneous output logs in example notebook.	2023-03-09 16:31:14 -08:00
Matt Robinson	7018806a92	feat: document loader for markdown files (#1558 ) ### Summary Adds a document loader for handling markdown files. This document loader requires `unstructured>=0.4.16`. ### Testing ```python from langchain.document_loaders import UnstructuredMarkdownLoader loader = UnstructuredMarkdownLoader("README.md") loader.load() ```	2023-03-09 10:55:07 -08:00
Harrison Chase	bd335ffd64	bump version to 106 (#1562 )	2023-03-09 10:20:54 -08:00
Harrison Chase	a094c49153	add chat agent (#1509 )	2023-03-09 09:12:08 -08:00
Brenton Wheeler	99fe023496	docs: fix typo in modules/indexes/chain_examples/question_answering (#1551 ) docs: fix typo in modules/indexes/chain_examples/question_answering ![image](https://user-images.githubusercontent.com/11394076/224007874-3a52adf6-ff7a-4f22-9dbf-18c83d08167f.png)	2023-03-09 09:11:43 -08:00
Harrison Chase	3ee32a01ea	Harrison/prompt layer (#1547 ) Co-authored-by: Jonathan Pedoeem <jonathanped@gmail.com> Co-authored-by: AbuBakar <abubakarsohail123@gmail.com>	2023-03-08 21:24:27 -08:00
Harrison Chase	cc423f40f1	Harrison/youtube loader (#1545 ) Co-authored-by: Julian Wustl <57504258+Julianwustl@users.noreply.github.com>	2023-03-08 20:53:27 -08:00
Harrison Chase	523ad8d2e2	Harrison/chat history formatter1 (#1538 ) Co-authored-by: Youssef A. Abukwaik <yousseb@users.noreply.github.com>	2023-03-08 20:46:37 -08:00
Graham Neubig	31303d0b11	Added other evaluation metrics for data-augmented QA (#1521 ) This PR adds additional evaluation metrics for data-augmented QA, resulting in a report like this at the end of the notebook: ![Screen Shot 2023-03-08 at 8 53 23 AM](https://user-images.githubusercontent.com/398875/223731199-8eb8e77f-5ff3-40a2-a23e-f3bede623344.png) The score calculation is based on the [Critique](https://docs.inspiredco.ai/critique/) toolkit, an API-based toolkit (like OpenAI) that has minimal dependencies, so it should be easy for people to run if they choose. The code could further be simplified by actually adding a chain that calls Critique directly, but that probably should be saved for another PR if necessary. Any comments or change requests are welcome!	2023-03-08 20:41:03 -08:00
gidler	494c9d341a	[DOCS] Assorted wording, punctuation, and consistency revisions (#1443 ) Contributing some small fixes I noticed while reading through the documentation. Thank you for creating and maintaining this project!	2023-03-08 20:16:09 -08:00
Harrison Chase	c4a557bdd4	add concept of prompt collection (#1507 )	2023-03-08 08:31:29 -08:00
Ivan	97e3666e0d	changed requests.run to requests.get (#1485 ) This pull request proposes an update to the Lightweight wrapper library's documentation. The current documentation provides an example of how to use the library's requests.run method, as follows: requests.run("https://www.google.com"). However, this example does not work for the 0.0.102 version of the library. Testing: The changes have been tested locally to ensure they are working as intended. Thank you for considering this pull request.	2023-03-07 21:10:23 -08:00
Tom Dyson	e3354404ad	Fix link to Pinecone notebook (#1492 )	2023-03-07 15:24:03 -08:00
Harrison Chase	3610ef2830	add fake embeddings class (#1503 )	2023-03-07 15:23:46 -08:00
Harrison Chase	4f41e20f09	memory docs (#1501 )	2023-03-07 11:02:46 -08:00
Harrison Chase	f276bfad8e	Harrison/chat memory (#1495 )	2023-03-07 09:02:40 -08:00
Harrison Chase	7bec461782	Harrison/memory refactor (#1478 ) moves memory to own module, factors out common stuff	2023-03-07 07:59:37 -08:00
Harrison Chase	0e21463f07	(rfc) chat models (#1424 ) Co-authored-by: Ankush Gola <ankush.gola@gmail.com>	2023-03-06 08:34:24 -08:00
Harrison Chase	63a5614d23	Harrison/simple memory (#1435 ) Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>	2023-03-04 08:15:52 -08:00
Harrison Chase	a1b9dfc099	Harrison/similarity search chroma (#1434 ) Co-authored-by: shibuiwilliam <shibuiyusuke@gmail.com>	2023-03-04 08:10:15 -08:00
Ikko Eltociear Ashimine	b8a7828d1f	Update huggingface_datasets.ipynb (#1417 ) HuggingFace -> Hugging Face	2023-03-04 00:22:31 -08:00
Tim Asp	23231d65a9	Add PyMuPDF PDF loader (#1426 ) Different PDF libraries have different strengths and weaknesses. PyMuPDF does a good job of extracting the most content from the doc, regardless of the source quality, and is extremely fast (especially compared to Unstructured). https://pymupdf.readthedocs.io/en/latest/index.html	2023-03-03 20:59:28 -08:00
blob42	3d54b05863	searx: add install instructions, update doc and notebooks (#1420 ) - Added instructions on setting up self hosted searx - Add notebook example with agent - Use `localhost:8888` as example url to stay consistent since public instances are not really usable. Co-authored-by: blob42 <spike@w530>	2023-03-03 20:57:50 -08:00
Tim Asp	bca0935d90	[docs] fix minor import error (#1425 )	2023-03-03 16:10:07 -08:00
JonLuca De Caro	443992c4d5	[Docs] Add missing word from prompt docs (#1406 ) The prompt in the first example of the quickstart guide was missing `for `	2023-03-02 16:02:54 -08:00
Jason Gill	1989e7d4c2	Update examples to prevent confusing missing _type warning (#1391 ) The YAML and JSON examples of prompt serialization now give a strange `No '_type' key found, defaulting to 'prompt'` message when you try to run them yourself or copy the format of the files. The reason for this harmless warning is that the _type key was not in the config files, which means they are parsed as a standard prompt. This could be confusing to new users (like it was confusing to me after upgrading from 0.0.85 to 0.0.86+ for my few_shot prompts that needed a _type added to the example_prompt config), so this update includes the _type key just for clarity. Obviously this is not critical as the warning is harmless, but it could be confusing to track down or be interpreted as an error by a new user, so this update should resolve that.	2023-03-02 07:39:57 -08:00
Harrison Chase	dda5259f68	bump version to 0.0.99 (#1390 )	2023-03-02 07:25:59 -08:00
Kacper Łukawski	9ac442624c	Add Qdrant named arguments (#1386 ) This PR: - Increases `qdrant-client` version to 1.0.4 - Introduces custom content and metadata keys (as requested in #1087) - Moves all the `QdrantClient` parameters into the method parameters to simplify code completion	2023-03-02 07:05:14 -08:00
Ankush Gola	fe30be6fba	add async and streaming support to `OpenAIChat` (#1378 ) title says it all	2023-03-01 21:55:43 -08:00
Lakshya Agarwal	cfed0497ac	Minor grammatical fixes (#1325 ) Fixed typos and links in a few places across documents	2023-03-01 21:18:09 -08:00
Harrison Chase	1cd8996074	Harrison/summarizer chain (#1356 ) Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>	2023-03-01 20:59:07 -08:00
Harrison Chase	4b5e850361	chatgpt wrapper (#1367 )	2023-03-01 11:47:01 -08:00
Harrison Chase	4d4b43cf5a	fix doc names (#1354 )	2023-03-01 09:40:31 -08:00
Harrison Chase	fe7dbecfe6	pandas and csv agents (#1353 )	2023-02-28 22:19:11 -08:00
Harrison Chase	02ec72df87	improve docs (#1351 )	2023-02-28 21:37:18 -08:00
Jon Luo	92ab27e4b8	sql doc formatting (#1350 ) My bad, missed a few tabs between the two PRs	2023-02-28 19:54:46 -08:00
Ankush Gola	82baecc892	Add a SQL agent for interacting with SQL Databases and JSON Agent for interacting with large JSON blobs (#1150 ) This PR adds * `ZeroShotAgent.as_sql_agent`, which returns an agent for interacting with a sql database. This builds off of `SQLDatabaseChain`. The main advantages are 1) answering general questions about the db, 2) access to a tool for double checking queries, and 3) recovering from errors * `ZeroShotAgent.as_json_agent` which returns an agent for interacting with json blobs. * Several examples in notebooks --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-02-28 19:44:39 -08:00
Jon Luo	35f1e8f569	separate columns by tabs instead of single space in sql sample rows (#1348 ) Use tabs to separate columns instead of a single space - confusing when there are spaces in a cell	2023-02-28 18:59:53 -08:00
James Brotchie	3574418a40	Fix link in summarization.md (#1344 ) "Utilities for working with Documents" was linking to a non-useful page. Re-linked to the utils page that includes info about working with docs.	2023-02-28 18:58:12 -08:00
Jon Luo	5bf8772f26	add option to use user-defined SQL table info (#1347 ) Currently, table information is gathered through SQLAlchemy as complete table DDL and a user-selected number of sample rows from each table. This PR adds the option to use user-defined table information instead of automatically collecting it. This will use the provided table information and fall back to the automatic gathering for tables that the user didn't provide information for. Off the top of my head, there are a few cases where this can be quite useful: - The first n rows of a table are uninformative, or very similar to one another. In this case, hand-crafting example rows for a table such that they provide the good, diverse information can be very helpful. Another approach we can think about later is getting a random sample of n rows instead of the first n rows, but there are some performance considerations that need to be taken there. Even so, hand-crafting the sample rows is useful and can guarantee the model sees informative data. - The user doesn't want every column to be available to the model. This is not an elegant way to fulfill this specific need since the user would have to provide the table definition instead of a simple list of columns to include or ignore, but it does work for this purpose. - For the developers, this makes it a lot easier to compare/benchmark the performance of different prompting structures for providing table information in the prompt. These are cases I've run into myself (particularly cases 1 and 3) and I've found these changes useful. Personally, I keep custom table info for a few tables in a yaml file for versioning and easy loading. Definitely open to other opinions/approaches though!	2023-02-28 18:58:04 -08:00
Harrison Chase	786852e9e6	partial variables (#1308 )	2023-02-28 08:40:35 -08:00
Tim Asp	72ef69d1ba	Add new iFixit document loader (#1333 ) iFixit is a wikipedia-like site that has a huge amount of open content on how to fix things, questions/answers for common troubleshooting and "things" related content that is more technical in nature. All content is licensed under CC-BY-SA-NC 3.0. Adding docs from iFixit as context for user questions like "I dropped my phone in water, what do I do?" or "My macbook pro is making a whining noise, what's wrong with it?" can yield significantly better responses than context-free responses from LLMs.	2023-02-27 20:40:20 -08:00
Matt Robinson	1aa41b5741	feat: document loader for image files (#1330 ) ### Summary Adds a document loader for image files such as `.jpg` and `.png` files. ### Testing Run the following using the example document from the [`unstructured` repo](https://github.com/Unstructured-IO/unstructured/tree/main/example-docs). ```python from langchain.document_loaders.image import UnstructuredImageLoader loader = UnstructuredImageLoader("layout-parser-paper-fast.jpg") loader.load() ```	2023-02-27 14:43:32 -08:00
Eugene Yurtsev	c14cff60d0	Documentation: Minor typo fixes (#1327 ) Fixing a few minor typos in the documentation (and likely introducing other ones in the process).	2023-02-27 14:40:43 -08:00
Harrison Chase	f61858163d	bump version to 0.0.95 (#1324 )	2023-02-27 07:45:54 -08:00
Harrison Chase	0824d65a5c	Harrison/indexing pipeline (#1317 )	2023-02-27 00:31:36 -08:00
Akshay	a0bf856c70	Update agent_vectorstore.ipynb (#1318 ) nitpicking but just thought i'd add this typo which I found when going through the How-to 😄 (unless it was intentional) also, it's amazing that you added ReAct to LangChain!	2023-02-26 23:22:35 -08:00
Harrison Chase	166cda2cc6	Harrison/deeplake (#1316 ) Co-authored-by: Davit Buniatyan <d@activeloop.ai>	2023-02-26 22:35:04 -08:00
Harrison Chase	aaad6cc954	Harrison/atlas db (#1315 ) Co-authored-by: Brandon Duderstadt <brandonduderstadt@gmail.com>	2023-02-26 22:11:38 -08:00
Marc Puig	3989c793fd	Making it possible to use "certainty" as a parameter for the weaviate similarity_search (#1218 ) Checking if weaviate similarity_search kwargs contains "certainty" and use it accordingly. The minimal level of certainty must be a float, and it is computed by normalized distance.	2023-02-26 17:55:28 -08:00
Harrison Chase	81abcae91a	Harrison/banana fix (#1311 ) Co-authored-by: Erik Dunteman <44653944+erik-dunteman@users.noreply.github.com>	2023-02-26 17:53:57 -08:00
Casey A. Fitzpatrick	648b3b3909	Fix use case sentence for bash util doc (#1295 ) Thanks for all your hard work! I noticed a small typo in the bash util doc so here's a quick update. Additionally, my formatter caught some spacing in the `.md` as well. Happy to revert that if it's an issue. The main change is just ``` - A common use case this is for letting it interact with your local file system. + A common use case for this is letting the LLM interact with your local file system. ``` ## Testing `make docs_build` succeeds locally and the changes show as expected ✌️	2023-02-26 17:41:03 -08:00
Ingo Kleiber	fd9975dad7	add CoNLL-U document loader (#1297 ) I've added a simple [CoNLL-U](https://universaldependencies.org/format.html) document loader. CoNLL-U is a common format for NLP tasks and is used, for example, in the Universal Dependencies treebank corpora. The loader reads a single file in standard CoNLL-U format and returns a document.	2023-02-26 17:27:00 -08:00
Harrison Chase	d29f74114e	copy paste loader (#1302 )	2023-02-26 17:26:37 -08:00
Harrison Chase	ce441edd9c	improve docs (#1309 )	2023-02-26 11:25:16 -08:00
Harrison Chase	6f30d68581	add example of using agent with vectorstores (#1285 )	2023-02-25 13:27:24 -08:00
Matt Robinson	2f15c11b87	feat: document loader for MS Word documents (#1282 ) ### Summary Adds a document loader for MS Word Documents. Works with both `.docx` and `.doc` files as long as the user has installed `unstructured>=0.4.11`. ### Testing The following workflow tests the loader for both `.doc` and `.docx` files using example docs from the `unstructured` repo. #### `.docx` ```python from langchain.document_loaders import UnstructuredWordDocumentLoader filename = "../unstructured/example-docs/fake.docx" loader = UnstructuredWordDocumentLoader(filename) loader.load() ``` #### `.doc` ```python from langchain.document_loaders import UnstructuredWordDocumentLoader filename = "../unstructured/example-docs/fake.doc" loader = UnstructuredWordDocumentLoader(filename) loader.load() ```	2023-02-24 08:26:19 -08:00
Harrison Chase	96db6ed073	cleanup (#1274 )	2023-02-24 07:38:24 -08:00
Harrison Chase	42167a1e24	Harrison/fb loader (#1277 ) Co-authored-by: Vairo Di Pasquale <vairo.dp@gmail.com>	2023-02-24 07:22:48 -08:00
Klein Tahiraj	8a0751dadd	adding .ipynb loader and documentation Fixes #1248 (#1252 ) `NotebookLoader.load()` loads the `.ipynb` notebook file into a `Document` object. Parameters: * `include_outputs` (bool): whether to include cell outputs in the resulting document (default is False). * `max_output_length` (int): the maximum number of characters to include from each cell output (default is 10). * `remove_newline` (bool): whether to remove newline characters from the cell sources and outputs (default is False). * `traceback` (bool): whether to include full traceback (default is False).	2023-02-24 07:10:35 -08:00
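
The parameters listed for `NotebookLoader.load()` in the last commit above (8a0751dadd) can be illustrated with a small standalone sketch. This is not the LangChain implementation: the function name `notebook_to_text`, the output layout, and the `SAMPLE` notebook are hypothetical, and only the four parameter names and their defaults come from the commit message. It assumes the standard `.ipynb` JSON layout (`cells`, `source`, `outputs`).

```python
import json


def notebook_to_text(notebook_json, include_outputs=False, max_output_length=10,
                     remove_newline=False, traceback=False):
    """Flatten a notebook (given as a JSON string) into one text blob.

    Hypothetical helper; parameter names mirror the commit message only.
    """
    notebook = json.loads(notebook_json)
    parts = []
    for cell in notebook.get("cells", []):
        source = "".join(cell.get("source", []))
        text = f"'{cell.get('cell_type')}' cell: {source}"
        if include_outputs:
            for output in cell.get("outputs", []):
                if output.get("output_type") == "error":
                    # Errors: always keep name and value, full traceback only on request.
                    body = f"{output.get('ename')}: {output.get('evalue')}"
                    if traceback:
                        body += "\n" + "\n".join(output.get("traceback", []))
                else:
                    # Other outputs: keep at most max_output_length characters.
                    body = "".join(output.get("text", []))[:max_output_length]
                text += "\noutput: " + body
        if remove_newline:
            # Applied per cell; cells are still separated by a blank line below.
            text = text.replace("\n", " ")
        parts.append(text)
    return "\n\n".join(parts)


# Made-up two-cell notebook used only for illustration.
SAMPLE = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["print('hello world')"],
         "outputs": [{"output_type": "stream", "text": ["hello world\n"]}]},
    ]
})

if __name__ == "__main__":
    print(notebook_to_text(SAMPLE, include_outputs=True, max_output_length=5))
```

With the defaults, cell outputs are dropped entirely, so raising `max_output_length` has no effect unless `include_outputs=True` is also passed.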


544 Commits