langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-31 15:20:26 +00:00

Author	SHA1	Message	Date
Harrison Chase	5bdb8dd6fe	Harrison/unstructured io (#1200 )	2023-02-20 22:54:49 -08:00
Harrison Chase	d90a287d8f	Harrison/updating docs (#1196 )	2023-02-20 22:54:26 -08:00
Harrison Chase	b7708bbec6	rfc: callback changes (#1165 ) conceptually, no reason a tool should know what an "agent action" is unless any objections, can change in all callback handlers	2023-02-20 22:54:15 -08:00
Harrison Chase	fb83cd4ff4	catch networkx error (#1201 )	2023-02-20 21:43:02 -08:00
Harrison Chase	44c8d8a9ac	move serpapi wrapper (#1199 ) Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>	2023-02-20 21:15:45 -08:00
Konstantin Hebenstreit	af94f1dd97	HuggingFaceEndpoint: Correct Example for ImportError (#1176 ) When I try to import the Class HuggingFaceEndpoint I get an Import Error: cannot import name 'HuggingFaceEndpoint' from 'langchain'. (langchain version 0.0.88) These two imports work fine: from langchain import HuggingFacePipeline and from langchain import HuggingFaceHub. So I corrected the import statement in the example. There is probably a better solution to this, but this fixes the Error for me.	2023-02-20 21:09:39 -08:00
Harrison Chase	0c84ce1082	Harrison/add documents (#1197 ) Co-authored-by: OmriNach <32659330+OmriNach@users.noreply.github.com>	2023-02-20 21:02:28 -08:00
Francisco Ingham	0b6a650cb4	added ability to override default verbose and memory when load chain … (#1153 ) It is useful to be able to specify `verbose` or `memory` while still keeping the chain's overall structure. --------- Co-authored-by: Francisco Ingham <>	2023-02-20 21:00:32 -08:00
Anton Troynikov	d2ef5d6167	Default Chroma collection name (#1198 ) For persistence, it's convenient to have a default collection name which gets used everywhere.	2023-02-20 20:59:34 -08:00
Dennis Antela Martinez	23243ae69c	add gitbook document loader (#1180 ) Added a GitBook document loader. It lets you both, (1) fetch text from any single GitBook page, or (2) fetch all relative paths and return their respective content in Documents. I've modified the `scrape` method in the `WebBaseLoader` to accept custom web paths if given, but happy to remove it and move that logic into the `GitbookLoader` itself.	2023-02-20 20:05:04 -08:00
William FH	13ba0177d0	Add a StdIn "Interaction" Tool (#1193 ) Lets a chain prompt the user for more input as a part of its execution.	2023-02-20 18:40:02 -08:00
Naveen Tatikonda	0118706fd6	Add Support for OpenSearch Vector database (#1191 ) ### Description This PR adds a wrapper which adds support for the OpenSearch vector database. Using opensearch-py client we are ingesting the embeddings of given text into opensearch cluster using Bulk API. We can perform the `similarity_search` on the index using the 3 popular searching methods of OpenSearch k-NN plugin: - `Approximate k-NN Search` use approximate nearest neighbor (ANN) algorithms from the [nmslib](https://github.com/nmslib/nmslib), [faiss](https://github.com/facebookresearch/faiss), and [Lucene](https://lucene.apache.org/) libraries to power k-NN search. - `Script Scoring` extends OpenSearch’s script scoring functionality to execute a brute force, exact k-NN search. - `Painless Scripting` adds the distance functions as painless extensions that can be used in more complex combinations. Also, supports brute force, exact k-NN search like Script Scoring. ### Issues Resolved https://github.com/hwchase17/langchain/issues/1054 --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com>	2023-02-20 18:39:34 -08:00
Andrew White	c5015d77e2	Allow k to be higher than doc size in max_marginal_relevance_search (#1187 ) Fixes issue #1186. For some reason, #1117 didn't seem to fix it.	2023-02-20 16:39:13 -08:00
Zach Schillaci	159c560c95	Refactor some loops into list comprehensions (#1185 )	2023-02-20 16:38:43 -08:00
Harrison Chase	926c121b98	Harrison/text splitter docs (#1188 )	2023-02-20 15:14:03 -08:00
Harrison Chase	91446a5e9b	clean up text splitting docs (#1184 )	2023-02-20 11:24:31 -08:00
Harrison Chase	a5a14405ad	bump version to 0091 (#1181 )	2023-02-20 08:53:45 -08:00
Harrison Chase	5a954efdd7	update gallery with slack bot (#1177 )	2023-02-20 08:21:00 -08:00
Harrison Chase	4766b20223	clean up loaders (#1178 )	2023-02-20 08:20:48 -08:00
blob42	9962bda70b	searx_search: docs updates (#1175 ) - fix notebook formatting, remove empty cells and add scrolling for long text --------- Co-authored-by: blob42 <spike@w530>	2023-02-20 06:46:44 -08:00
Harrison Chase	4f3fbd7267	improve docs for indexes (#1146 )	2023-02-19 23:14:50 -08:00
Harrison Chase	28781a6213	Harrison/markdown splitter (#1169 ) Co-authored-by: Michael Chen <flamingdescent@gmail.com> Co-authored-by: Michael Chen <michaelchen@stripe.com>	2023-02-19 21:31:58 -08:00
Harrison Chase	37dd34bea5	fix path (#1168 )	2023-02-19 21:28:49 -08:00
Nan Wang	e8f224fd3a	docs: add missing links to toc (#1163 ) add missing links to toc --------- Signed-off-by: Nan Wang <nan.wang@jina.ai>	2023-02-19 21:15:11 -08:00
Nick	afe884fb96	AI21 documentation incorrectly titled Cohere (#1167 )	2023-02-19 21:14:59 -08:00
Ji	ed37fbaeff	for ChatVectorDBChain, add top_k_docs_for_context to allow control how many chunks of context will be retrieved (#1155 ) given that we allow user define chunk size, think it would be useful for user to define how many chunks of context will be retrieved.	2023-02-19 20:48:23 -08:00
Harrison Chase	955c89fccb	pass in prompts to vectordbqa (#1158 )	2023-02-19 20:47:17 -08:00
Harrison Chase	65cc81c479	directory loader improvements (#1162 )	2023-02-19 20:47:08 -08:00
Harrison Chase	05a05bcb04	bump version to 0.0.90 (#1157 )	2023-02-19 12:53:55 -08:00
Harrison Chase	9d6d8f85da	Harrison/self hosted runhouse (#1154 ) Co-authored-by: Donny Greenberg <dongreenberg2@gmail.com> Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net> Co-authored-by: Andrew White <white.d.andrew@gmail.com> Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com> Co-authored-by: Matt Robinson <mthw.wm.robinson@gmail.com> Co-authored-by: jeff <tangj1122@gmail.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local> Co-authored-by: zanderchase <zander@unfold.ag> Co-authored-by: Charles Frye <cfrye59@gmail.com> Co-authored-by: zanderchase <zanderchase@gmail.com> Co-authored-by: Shahriar Tajbakhsh <sh.tajbakhsh@gmail.com> Co-authored-by: Stefan Keselj <skeselj@princeton.edu> Co-authored-by: Francisco Ingham <fpingham@gmail.com> Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com> Co-authored-by: cragwolfe <cragcw@gmail.com> Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com> Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com> Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com> Co-authored-by: blob42 <contact@blob42.xyz> Co-authored-by: blob42 <spike@w530> Co-authored-by: Enrico Shippole <henryshippole@gmail.com> Co-authored-by: Ibis Prevedello <ibiscp@gmail.com> Co-authored-by: jped <jonathanped@gmail.com> Co-authored-by: Justin Torre <justintorre75@gmail.com> Co-authored-by: Ivan Vendrov <ivan@anthropic.com> Co-authored-by: Sasmitha Manathunga <70096033+mmz-001@users.noreply.github.com> Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com> Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io> Co-authored-by: Jeff Huber <jeffchuber@gmail.com> Co-authored-by: Akshay <64036106+akshayvkt@users.noreply.github.com> Co-authored-by: Andrew Huang <jhuang16888@gmail.com> Co-authored-by: rogerserper <124558887+rogerserper@users.noreply.github.com> Co-authored-by: seanaedmiston <seane999@gmail.com> Co-authored-by: Hasegawa Yuya <52068175+Hase-U@users.noreply.github.com> Co-authored-by: Ivan Vendrov <ivendrov@gmail.com> Co-authored-by: Chen Wu (吴尘) <henrychenwu@cmu.edu> Co-authored-by: Dennis Antela Martinez <dennis.antela@gmail.com> Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr> Co-authored-by: Rishabh Raizada <110235735+rishabh-ti@users.noreply.github.com>	2023-02-19 09:53:45 -08:00
CG80499	af8f5c1a49	Added constitutional chain. (#1147 ) - Added self-critique constitutional chain based on this [paper](https://www.anthropic.com/constitutional.pdf).	2023-02-18 19:31:51 -08:00
Harrison Chase	a83ba44efa	Harrison/ver0089 (#1144 )	2023-02-18 14:25:37 -08:00
Ankush Gola	7b5e160d28	Make Tools own model, add ToolKit Concept (#1095 ) Follow-up of @hinthornw's PR: - Migrate the Tool abstraction to a separate file (`BaseTool`). - `Tool` implementation of `BaseTool` takes in function and coroutine to more easily maintain backwards compatibility - Add a Toolkit abstraction that can own the generation of tools around a shared concept or state --------- Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Francisco Ingham <fpingham@gmail.com> Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com> Co-authored-by: cragwolfe <cragcw@gmail.com> Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com> Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com> Co-authored-by: William Fu-Hinthorn <whinthorn@Williams-MBP-3.attlocal.net> Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>	2023-02-18 13:40:43 -08:00
Harrison Chase	45b5640fe5	fix sql (#1141 )	2023-02-18 11:49:08 -08:00
Sam Hogan	85c1449a96	Fix typo in HyDE docs (#1142 )	2023-02-18 11:48:46 -08:00
kekayan	9111f4ca8a	fix chatvectordbchain to use pinecone namespace (#1139 ) In the similarity search, the pinecone namespace is not used, which makes the bot return _I don't know_ where the embeddings are stored in the pinecone namespace. Now we can query by passing the namespace optionally. ```result = qa({"question": query, "chat_history": chat_history, "namespace":"01gshyhjcfgkq1q5wxjtm17gjh"})```	2023-02-18 10:58:48 -08:00
Harrison Chase	fb3c73d194	add srt loader (#1140 )	2023-02-18 10:58:39 -08:00
Francisco Ingham	3f29742adc	Sql alchemy commands used in table info (#1135 ) This approach has several advantages: * it improves the readability of the code * removes incompatibilities between SQL dialects * fixes a bug with `datetime` values in rows and `ast.literal_eval` Huge thanks and credits to @jzluo for finding the weaknesses in the current approach and for the thoughtful discussion on the best way to implement this. --------- Co-authored-by: Francisco Ingham <> Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>	2023-02-18 10:58:29 -08:00
Harrison Chase	483821ea3b	fix docs (#1133 )	2023-02-18 08:13:54 -08:00
Harrison Chase	ee3590cb61	instruct embeddings docs (#1131 )	2023-02-17 16:14:49 -08:00
Noah Gundotra	8c5fbab72d	[Integration Tests] Cast fake embeddings to ALL float values (#1102 ) Pydantic validation breaks tests for example (`test_qdrant.py`) because fake embeddings contain an integer. This PR casts the embeddings array to all floats. Now the `qdrant` test passes, `poetry run pytest tests/integration_tests/vectorstores/test_qdrant.py`	2023-02-17 15:18:09 -08:00
Harrison Chase	d5f3dfa1e1	Harrison/hn loader (#1130 ) Co-authored-by: William X <william.y.xuan@gmail.com>	2023-02-17 15:15:02 -08:00
Tom Bocklisch	47c3221fda	Max marginal relevance search fails if there are not enough docs (#1117 ) Implementation fails if there are not enough documents. Added the same check as used for similarity search. Current implementation raises ``` File ".venv/lib/python3.9/site-packages/langchain/vectorstores/faiss.py", line 160, in max_marginal_relevance_search _id = self.index_to_docstore_id[i] KeyError: -1 ```	2023-02-17 15:12:31 -08:00
Harrison Chase	511d41114f	return source documents for chat vector db chain (#1128 )	2023-02-17 13:40:52 -08:00
Jon Luo	c39ef70aa4	fix for database compatibility when getting table DDL (#1129 ) #1081 introduced a method to get DDL (table definitions) in a manner specific to sqlite3, thus breaking compatibility with other non-sqlite3 databases. This uses the sqlite3 command if the detected dialect is sqlite, and otherwise uses the standard SQL `SHOW CREATE TABLE`. This should fix #1103.	2023-02-17 13:39:44 -08:00
yakigac	1ed708391e	Fix a bug that shows "KeyError 'items'" (#1118 ) Fix KeyError 'items' when no result found. ## Problem When no result found for a query, google search crashed with `KeyError 'items'`. ## Solution I added a check for an empty response before accessing the 'items' key. It will handle the case correctly. ## Other my twitter: yakigac (I don't mind even if you don't mention me for this PR. But just because last time my real name was shout out :) )	2023-02-17 13:04:02 -08:00
Matt Robinson	2bee8d4941	feat: add support for `.ppt` files in `UnstructuredPowerPointLoader` (#1124 ) ### Summary Adds support for older `.ppt` file in the PowerPoint loader. ### Testing The following should work on `unstructured==0.4.11` using the example docs from the `unstructured` repo. ```python from langchain.document_loaders import UnstructuredPowerPointLoader filename = "../unstructured/example-docs/fake-power-point.pptx" loader = UnstructuredPowerPointLoader(filename) loader.load() filename = "../unstructured/example-docs/fake-power-point.ppt" loader = UnstructuredPowerPointLoader(filename) loader.load() ``` Now downgrade `unstructured` to version `0.4.10`. The following should work: ```python from langchain.document_loaders import UnstructuredPowerPointLoader filename = "../unstructured/example-docs/fake-power-point.pptx" loader = UnstructuredPowerPointLoader(filename) loader.load() ``` and the following should give you a `ValueError` and invite you to upgrade `unstructured`. ```python from langchain.document_loaders import UnstructuredPowerPointLoader filename = "../unstructured/example-docs/fake-power-point.ppt" loader = UnstructuredPowerPointLoader(filename) loader.load() ```	2023-02-17 13:03:25 -08:00
Matt Robinson	b956070f08	docs: add an unstructured section to the ecosystem page (#1125 ) ### Summary Adds an Unstructured section to the ecosystem page.	2023-02-17 13:02:23 -08:00
Hasegawa Yuya	383c67c1b2	Fix Issue #1100 (#1101 ) https://github.com/hwchase17/langchain/issues/1100 When faiss data and doc.index are created in past versions, error occurs that say there was no attribute. So I put hasattr in the check as a simple solution. However, increasing the number of such checks is not good for conservatism, so I think there is a better solution. Also, the code for the batch process was left out, so I put it back in.	2023-02-17 00:53:16 -08:00
Harrison Chase	3f50feb280	fix telegram imports (#1110 )	2023-02-17 00:53:01 -08:00

3246 Commits