langchain

Commit Graph

Author	SHA1	Message	Date
Naveen Tatikonda	0118706fd6	Add Support for OpenSearch Vector database (#1191 ) ### Description This PR adds a wrapper which adds support for the OpenSearch vector database. Using opensearch-py client we are ingesting the embeddings of given text into opensearch cluster using Bulk API. We can perform the `similarity_search` on the index using the 3 popular searching methods of OpenSearch k-NN plugin: - `Approximate k-NN Search` use approximate nearest neighbor (ANN) algorithms from the [nmslib](https://github.com/nmslib/nmslib), [faiss](https://github.com/facebookresearch/faiss), and [Lucene](https://lucene.apache.org/) libraries to power k-NN search. - `Script Scoring` extends OpenSearch’s script scoring functionality to execute a brute force, exact k-NN search. - `Painless Scripting` adds the distance functions as painless extensions that can be used in more complex combinations. Also, supports brute force, exact k-NN search like Script Scoring. ### Issues Resolved https://github.com/hwchase17/langchain/issues/1054 --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com>	2 years ago
Andrew White	c5015d77e2	Allow k to be higher than doc size in max_marginal_relevance_search (#1187 ) Fixes issue #1186. For some reason, #1117 didn't seem to fix it.	2 years ago
Zach Schillaci	159c560c95	Refactor some loops into list comprehensions (#1185 )	2 years ago
Harrison Chase	926c121b98	Harrison/text splitter docs (#1188 )	2 years ago
Harrison Chase	91446a5e9b	clean up text splitting docs (#1184 )	2 years ago
Harrison Chase	a5a14405ad	bump version to 0091 (#1181 )	2 years ago
Harrison Chase	5a954efdd7	update gallery with slack bot (#1177 )	2 years ago
Harrison Chase	4766b20223	clean up loaders (#1178 )	2 years ago
blob42	9962bda70b	searx_search: docs updates (#1175 ) - fix notebook formatting, remove empty cells and add scrolling for long text --------- Co-authored-by: blob42 <spike@w530>	2 years ago
Harrison Chase	4f3fbd7267	improve docs for indexes (#1146 )	2 years ago
Harrison Chase	28781a6213	Harrison/markdown splitter (#1169 ) Co-authored-by: Michael Chen <flamingdescent@gmail.com> Co-authored-by: Michael Chen <michaelchen@stripe.com>	2 years ago
Harrison Chase	37dd34bea5	fix path (#1168 )	2 years ago
Nan Wang	e8f224fd3a	docs: add missing links to toc (#1163 ) add missing links to toc --------- Signed-off-by: Nan Wang <nan.wang@jina.ai>	2 years ago
Nick	afe884fb96	AI21 documentation incorrectly titled Cohere (#1167 )	2 years ago
Ji	ed37fbaeff	for ChatVectorDBChain, add top_k_docs_for_context to allow control how many chunks of context will be retrieved (#1155 ) given that we allow user define chunk size, think it would be useful for user to define how many chunks of context will be retrieved.	2 years ago
Harrison Chase	955c89fccb	pass in prompts to vectordbqa (#1158 )	2 years ago
Harrison Chase	65cc81c479	directory loader improvements (#1162 )	2 years ago
Harrison Chase	05a05bcb04	bump version to 0.0.90 (#1157 )	2 years ago
Harrison Chase	9d6d8f85da	Harrison/self hosted runhouse (#1154 ) Co-authored-by: Donny Greenberg <dongreenberg2@gmail.com> Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net> Co-authored-by: Andrew White <white.d.andrew@gmail.com> Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com> Co-authored-by: Matt Robinson <mthw.wm.robinson@gmail.com> Co-authored-by: jeff <tangj1122@gmail.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local> Co-authored-by: zanderchase <zander@unfold.ag> Co-authored-by: Charles Frye <cfrye59@gmail.com> Co-authored-by: zanderchase <zanderchase@gmail.com> Co-authored-by: Shahriar Tajbakhsh <sh.tajbakhsh@gmail.com> Co-authored-by: Stefan Keselj <skeselj@princeton.edu> Co-authored-by: Francisco Ingham <fpingham@gmail.com> Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com> Co-authored-by: cragwolfe <cragcw@gmail.com> Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com> Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com> Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com> Co-authored-by: blob42 <contact@blob42.xyz> Co-authored-by: blob42 <spike@w530> Co-authored-by: Enrico Shippole <henryshippole@gmail.com> Co-authored-by: Ibis Prevedello <ibiscp@gmail.com> Co-authored-by: jped <jonathanped@gmail.com> Co-authored-by: Justin Torre <justintorre75@gmail.com> Co-authored-by: Ivan Vendrov <ivan@anthropic.com> Co-authored-by: Sasmitha Manathunga <70096033+mmz-001@users.noreply.github.com> Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com> Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io> Co-authored-by: Jeff Huber <jeffchuber@gmail.com> Co-authored-by: Akshay <64036106+akshayvkt@users.noreply.github.com> Co-authored-by: Andrew Huang <jhuang16888@gmail.com> Co-authored-by: rogerserper <124558887+rogerserper@users.noreply.github.com> Co-authored-by: seanaedmiston <seane999@gmail.com> Co-authored-by: Hasegawa Yuya <52068175+Hase-U@users.noreply.github.com> Co-authored-by: Ivan Vendrov <ivendrov@gmail.com> Co-authored-by: Chen Wu (吴尘) <henrychenwu@cmu.edu> Co-authored-by: Dennis Antela Martinez <dennis.antela@gmail.com> Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr> Co-authored-by: Rishabh Raizada <110235735+rishabh-ti@users.noreply.github.com>	2 years ago
CG80499	af8f5c1a49	Added constitutional chain. (#1147 ) - Added self-critique constitutional chain based on this [paper](https://www.anthropic.com/constitutional.pdf).	2 years ago
Harrison Chase	a83ba44efa	Harrison/ver0089 (#1144 )	2 years ago
Ankush Gola	7b5e160d28	Make Tools own model, add ToolKit Concept (#1095 ) Follow-up of @hinthornw's PR: - Migrate the Tool abstraction to a separate file (`BaseTool`). - `Tool` implementation of `BaseTool` takes in function and coroutine to more easily maintain backwards compatibility - Add a Toolkit abstraction that can own the generation of tools around a shared concept or state --------- Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Francisco Ingham <fpingham@gmail.com> Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com> Co-authored-by: cragwolfe <cragcw@gmail.com> Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com> Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com> Co-authored-by: William Fu-Hinthorn <whinthorn@Williams-MBP-3.attlocal.net> Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>	2 years ago
Harrison Chase	45b5640fe5	fix sql (#1141 )	2 years ago
Sam Hogan	85c1449a96	Fix typo in HyDE docs (#1142 )	2 years ago
kekayan	9111f4ca8a	fix chatvectordbchain to use pinecone namespace (#1139 ) In the similarity search, the pinecone namespace is not used, which makes the bot return _I don't know_ where the embeddings are stored in the pinecone namespace. Now we can query by passing the namespace optionally. ```result = qa({"question": query, "chat_history": chat_history, "namespace":"01gshyhjcfgkq1q5wxjtm17gjh"})```	2 years ago
Harrison Chase	fb3c73d194	add srt loader (#1140 )	2 years ago
Francisco Ingham	3f29742adc	Sql alchemy commands used in table info (#1135 ) This approach has several advantages: * it improves the readability of the code * removes incompatibilities between SQL dialects * fixes a bug with `datetime` values in rows and `ast.literal_eval` Huge thanks and credits to @jzluo for finding the weaknesses in the current approach and for the thoughtful discussion on the best way to implement this. --------- Co-authored-by: Francisco Ingham <> Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>	2 years ago
Harrison Chase	483821ea3b	fix docs (#1133 )	2 years ago
Harrison Chase	ee3590cb61	instruct embeddings docs (#1131 )	2 years ago
Noah Gundotra	8c5fbab72d	[Integration Tests] Cast fake embeddings to ALL float values (#1102 ) Pydantic validation breaks tests for example (`test_qdrant.py`) because fake embeddings contain an integer. This PR casts the embeddings array to all floats. Now the `qdrant` test passes, `poetry run pytest tests/integration_tests/vectorstores/test_qdrant.py`	2 years ago
Harrison Chase	d5f3dfa1e1	Harrison/hn loader (#1130 ) Co-authored-by: William X <william.y.xuan@gmail.com>	2 years ago
Tom Bocklisch	47c3221fda	Max marginal relecance search fails if there are not enough docs (#1117 ) Implementation fails if there are not enough documents. Added the same check as used for similarity search. Current implementation raises ``` File ".venv/lib/python3.9/site-packages/langchain/vectorstores/faiss.py", line 160, in max_marginal_relevance_search _id = self.index_to_docstore_id[i] KeyError: -1 ```	2 years ago
Harrison Chase	511d41114f	return source documents for chat vector db chain (#1128 )	2 years ago
Jon Luo	c39ef70aa4	fix for database compatibility when getting table DDL (#1129 ) #1081 introduced a method to get DDL (table definitions) in a manner specific to sqlite3, thus breaking compatibility with other non-sqlite3 databases. This uses the sqlite3 command if the detected dialect is sqlite, and otherwise uses the standard SQL `SHOW CREATE TABLE`. This should fix #1103.	2 years ago
yakigac	1ed708391e	Fix a bug that shows "KeyError 'items'" (#1118 ) Fix KeyError 'items' when no result found. ## Problem When no result found for a query, google search crashed with `KeyError 'items'`. ## Solution I added a check for an empty response before accessing the 'items' key. It will handle the case correctly. ## Other my twitter: yakigac (I don't mind even if you don't mention me for this PR. But just because last time my real name was shout out :) )	2 years ago
Matt Robinson	2bee8d4941	feat: add support for `.ppt` files in `UnstructuredPowerPointLoader` (#1124 ) ### Summary Adds support for older `.ppt` file in the PowerPoint loader. ### Testing The following should work on `unstructured==0.4.11` using the example docs from the `unstructured` repo. ```python from langchain.document_loaders import UnstructuredPowerPointLoader filename = "../unstructured/example-docs/fake-power-point.pptx" loader = UnstructuredPowerPointLoader(filename) loader.load() filename = "../unstructured/example-docs/fake-power-point.ppt" loader = UnstructuredPowerPointLoader(filename) loader.load() ``` Now downgrade `unstructured` to version `0.4.10`. The following should work: ```python from langchain.document_loaders import UnstructuredPowerPointLoader filename = "../unstructured/example-docs/fake-power-point.pptx" loader = UnstructuredPowerPointLoader(filename) loader.load() ``` and the following should give you a `ValueError` and invite you to upgrade `unstructured`. ```python from langchain.document_loaders import UnstructuredPowerPointLoader filename = "../unstructured/example-docs/fake-power-point.ppt" loader = UnstructuredPowerPointLoader(filename) loader.load() ```	2 years ago
Matt Robinson	b956070f08	docs: add an unstructured section to the ecosystem page (#1125 ) ### Summary Adds an Unstructured section to the ecosystem page.	2 years ago
Hasegawa Yuya	383c67c1b2	Fix Issue #1100 (#1101 ) https://github.com/hwchase17/langchain/issues/1100 When faiss data and doc.index are created in past versions, error occurs that say there was no attribute. So I put hasattr in the check as a simple solution. However, increasing the number of such checks is not good for conservatism, so I think there is a better solution. Also, the code for the batch process was left out, so I put it back in.	2 years ago
Harrison Chase	3f50feb280	fix telegram imports (#1110 )	2 years ago
trigaten	6fafcd0a70	Strange behavior with LLM import requirements (#1104 ) This import works fine: ```python from langchain import Anthropic ``` This import does not: ```python from langchain import AI21 ``` ``` Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: cannot import name 'AI21' from 'langchain' (/opt/anaconda3/envs/fed_nlp/lib/python3.9/site-packages/langchain/__init__.py) ``` I think there is a slight documentation inconsistency here: https://langchain.readthedocs.io/en/latest/reference/modules/llms.html This PR starts to solve that. Should all the import examples be `from langchain.llms import X` instead of `from langchain import X`?	2 years ago
Kacper Łukawski	ab1a3cccac	Hotfix: Qdrant content retrieval (revert: #1088 ) (#1093 ) The #1088 introduced a bug in Qdrant integration. That PR reverts those changes and provides class attributes to ensure consistent payload keys. In addition to that, an exception will be thrown if any of texts is None (that could have been an issue reported in #1087)	2 years ago
Harrison Chase	6322b6f657	bump version 0.0.88 (#1090 )	2 years ago
Francisco Ingham	3462130e2d	Modify number of types of chains (#1089 ) Changed number of types of chains to make it consistent with the rest of the docs	2 years ago
Rishabh Raizada	5d11e5da40	Update qdrant.py (#1088 ) Fixes #1087	2 years ago
Harrison Chase	7745505482	chat qa with sources (#1084 )	2 years ago
Harrison Chase	badeeb37b0	fix stuff count (#1083 )	2 years ago
Harrison Chase	971458c5de	docs for batch size (#1082 )	2 years ago
Harrison Chase	5e10e19bfe	Harrison/align table (#1081 ) Co-authored-by: Francisco Ingham <fpingham@gmail.com>	2 years ago
Harrison Chase	c60954d0f8	Harrison/telegram loader (#1080 ) Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr>	2 years ago
Dennis Antela Martinez	a1c296bc3c	docs: increase width (#1049 ) This addresses #948. I set the documentation max width to 2560px, but can be adjusted - see screenshot below. <img width="1741" alt="Screenshot 2023-02-14 at 13 05 57" src="https://user-images.githubusercontent.com/23406704/218749076-ea51e90a-a220-4558-b4fe-5a95b39ebf15.png">	2 years ago
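
The traceback quoted in 47c3221fda (#1117) shows `self.index_to_docstore_id[i]` failing with `KeyError: -1`. FAISS pads its result with `-1` when it finds fewer than `k` neighbors, so a lookup needs to skip those slots. The sketch below is illustrative only, assuming a plain dict for the id mapping; `resolve_hits` is a hypothetical helper and not the code from the commit.

```python
from typing import Dict, List


def resolve_hits(indices: List[int], index_to_docstore_id: Dict[int, str]) -> List[str]:
    """Map raw index positions to docstore ids, skipping -1 padding."""
    ids = []
    for i in indices:
        if i == -1:
            # -1 marks "no neighbor found" (k was larger than the number
            # of stored documents), so there is nothing to look up.
            continue
        ids.append(index_to_docstore_id[i])
    return ids


# Hypothetical data: two stored documents, four neighbors requested.
index_to_docstore_id = {0: "doc-a", 1: "doc-b"}
hits = resolve_hits([1, 0, -1, -1], index_to_docstore_id)
print(hits)
```

Filtering the padding before the lookup keeps the result shorter than `k` rather than raising, which matches the behavior the later commit title c5015d77e2 (#1187) asks for.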


3935 Commits (2b663089b5f6f16890c134d14981db7a0eb446ba)
