langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Harrison Chase	75149d6d38	bump version 120 (#1918 )	2023-03-22 23:21:56 -07:00
Harrison Chase	fab7994b74	Harrison/retrieval code (#1916 )	2023-03-22 23:15:04 -07:00
Harrison Chase	eb80d6e0e4	Harrison/from methods (#1912 ) Co-authored-by: shibuiwilliam <shibuiyusuke@gmail.com>	2023-03-22 21:10:09 -07:00
Harrison Chase	b5667bed9e	human input default (#1911 )	2023-03-22 20:30:45 -07:00
Eric Zhu	b3be83c750	Add human as a tool (#1879 ) Human can help AI. #1871	2023-03-22 20:14:52 -07:00
Harrison Chase	50626a10ee	Hx23840 feat/add redisearch vectorstore (#1909 ) Co-authored-by: Peter <peter.shi@alephf.com> Co-authored-by: Peter Shi <42536066+hx23840@users.noreply.github.com>	2023-03-22 19:57:56 -07:00
Harrison Chase	6e1b5b8f7e	Harrison/figma doc loader (#1908 ) Co-authored-by: Ismail Pelaseyed <homanp@gmail.com>	2023-03-22 19:57:46 -07:00
Harrison Chase	eec9b1b306	Harrison/opensearch vectorstore (#1907 ) Co-authored-by: Mehmet Öner Yalçın <oneryalcin@gmail.com>	2023-03-22 19:57:38 -07:00
Xin Qiu	ea142f6a32	feat: add drop index in redis and fix prefix generate logic (#1857 ) # Description Add `drop_index` for redis RediSearch: [RediSearch quick start](https://redis.io/docs/stack/search/quick_start/) # How to use ``` from langchain.vectorstores.redis import Redis Redis.drop_index(index_name="doc",delete_documents=False) ```	2023-03-22 19:44:42 -07:00
Eli	12f868b292	Propagate "filter" arg in Chroma similarity_search (#1869 ) Technically a duplicate fix to #1619 but with unit tests and a small documentation update - Propagate `filter` arg in Chroma `similarity_search` to delegated call to `similarity_search_with_score` - Add `filter` arg to `similarity_search_by_vector` - Clarify doc strings on FakeEmbeddings	2023-03-22 19:40:10 -07:00
Memento Mori	31f9ecfc19	Fix tiktoken version (#1882 ) Fix https://github.com/hwchase17/langchain/issues/1881 This issue occurs when using `'gpt-3.5-turbo'` with `VectorDBQAWithSourcesChain`	2023-03-22 19:39:57 -07:00
Eric Zhu	273e9bf296	Simplify AzureChatOpenAI implementation. (#1902 ) Change AzureChatOpenAI class implementation as Azure just added support for chat completion API. See: https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/chatgpt?pivots=programming-language-chat-completions. This should make the code much simpler.	2023-03-22 19:36:51 -07:00
Maurício Maia	f155d9d3ec	Add metadata filter to PGVector search (#1872 ) Add ability to filter pgvector documents by metadata.	2023-03-22 15:21:40 -07:00
Klein Tahiraj	d3d4503ce2	Remove redundant .docx loader (closes #1716 ) + update how_to_guides.rst (#1891 ) In https://github.com/hwchase17/langchain/issues/1716 , it was identified that there were two .py files performing similar tasks. As a resolution, one of the files has been removed, as its purpose had already been fulfilled by the other file. Additionally, the init has been updated accordingly. Furthermore, the how_to_guides.rst file has been updated to include links to documentation that was previously missing. This was deemed necessary as the existing list on https://langchain.readthedocs.io/en/latest/modules/document_loaders/how_to_guides.html was incomplete, causing confusion for users who rely on the full list of documentation on the left sidebar of the website.	2023-03-22 15:19:42 -07:00
Harrison Chase	1f93c5cf69	extraction docs (#1898 )	2023-03-22 15:00:44 -07:00
Sean Zheng	15b5a08f4b	Update how_to_guides.rst (#1893 ) Adding OpenSearch examples	2023-03-22 14:30:43 -07:00
Kushal Chordiya	ff4a25b841	Fix minor bug in opensearch vector store add_texts function (#1878 ) In the langchain.vectorstores.opensearch_vector_search.py, in the add_texts function, around line 247, we have the following code: ```python embeddings = [ self.embedding_function.embed_documents(list(text))[0] for text in texts ] ``` The goal of the `list(text)` part, I believe, is to pass a list to `embed_documents` instead of a str. However, `list(text)` is a subtle bug: it converts the string into a list in which each element is a single character of the string. The correct fix is to change the code to ```python embeddings = [ self.embedding_function.embed_documents([text])[0] for text in texts ] ``` which wraps the string inside a list.	2023-03-22 11:27:32 -07:00
Maurício Maia	2212520a6c	Add PGVector collection metadata (#1887 ) The `CollectionStore` for `PGVector` has a `cmetadata` field but it's never used. This PR add the ability to save metadata information to the collection.	2023-03-22 11:27:07 -07:00
Harrison Chase	d08f940336	principles list (#1888 )	2023-03-22 10:48:38 -07:00
Harrison Chase	2280a2cb2f	bump version to 119 (#1886 )	2023-03-22 08:36:09 -07:00
Harrison Chase	ce5d97bcb3	Harrison/guarded output parser (#1804 ) Co-authored-by: jerwelborn <jeremy.welborn@gmail.com>	2023-03-21 22:07:23 -07:00
DeadBranch	8fa1764c60	docs: update gpt index references to LlamaIndex (#1856 ) The GPT Index project is transitioning to the new project name, LlamaIndex. I've updated a few files referencing the old project name and repository URL to the current ones. From the [LlamaIndex repo](https://github.com/jerryjliu/llama_index): > NOTE: We are rebranding GPT Index as LlamaIndex! We will carry out this transition gradually. > > 2/25/2023: By default, our docs/notebooks/instructions now reference "LlamaIndex" instead of "GPT Index". > > 2/19/2023: By default, our docs/notebooks/instructions now use the llama-index package. However the gpt-index package still exists as a duplicate! > > 2/16/2023: We have a duplicate llama-index pip package. Simply replace all imports of gpt_index with llama_index if you choose to pip install llama-index. I'm not associated with LlamaIndex in any way. I just noticed the discrepancy when studying the langchain documentation.	2023-03-21 22:01:05 -07:00
Harrison Chase	f299bd1416	clean up sagemaker nb (#1875 )	2023-03-21 22:00:08 -07:00
Philipp Schmid	064be93edf	[Embeddings] Add SageMaker Endpoint Embedding class (#1859 ) # What does this PR do? This PR adds similar to `llms` a SageMaker-powered `embeddings` class. This is helpful if you want to leverage Hugging Face models on SageMaker for creating your indexes. I added a example into the [docs/modules/indexes/examples/embeddings.ipynb](https://github.com/hwchase17/langchain/compare/master...philschmid:add-sm-embeddings?expand=1#diff-e82629e2894974ec87856aedd769d4bdfe400314b03734f32bee5990bc7e8062) document. The example currently includes some `_### TEMPORARY: Showing how to deploy a SageMaker Endpoint from a Hugging Face model ###_ ` code showing how you can deploy a sentence-transformers to SageMaker and then run the methods of the embeddings class. @hwchase17 please let me know if/when i should remove the `_### TEMPORARY: Showing how to deploy a SageMaker Endpoint from a Hugging Face model ###_` in the description i linked to a detail blog on how to deploy a Sentence Transformers so i think we don't need to include those steps here. I also reused the `ContentHandlerBase` from `langchain.llms.sagemaker_endpoint` and changed the output type to `any` since it is depending on the implementation.	2023-03-21 21:51:48 -07:00
anupam-tiwari	86822d1cc2	Fixes the import typo in the vector db text generator notebook (#1874 ) Fixes the import typo in the vector db text generator notebook for the chroma library Co-authored-by: Anupam <anupam@10-16-252-145.dynapool.wireless.nyu.edu>	2023-03-21 21:48:26 -07:00
Harrison Chase	a581bce379	remove key (#1863 )	2023-03-21 12:43:41 -07:00
Harrison Chase	2ffc643086	add listen api docs (#1855 )	2023-03-21 09:29:34 -07:00
Harrison Chase	2136dc94bb	bump version to 118 (#1854 )	2023-03-21 09:15:52 -07:00
Matt Tucker	a92344f476	Use regex match for bash process error output test assertion. (#1837 ) I was getting the same issue reported in #1339 by [MacYang555](https://github.com/MacYang555) when running the test suite on my Mac. I implemented the fix they suggested to use a regex match in the output assertion for the scenario under test. Resolves #1339	2023-03-21 09:06:52 -07:00
Tomoko Uchida	b706966ebc	Add setup instruction in Getting Started for Indexing (#1847 ) `VectorstoreIndexCreator` [uses Chroma as the vectorstore by default](`1c22657256/langchain/indexes/vectorstore.py (L49)`). It may be helpful to add a short note for the setup. You can see how the notebook looks here. https://github.com/mocobeta/langchain/blob/feat/add-setup-instruction-to-index-getting-started/docs/modules/indexes/getting_started.ipynb	2023-03-21 09:06:35 -07:00
Harrison Chase	1c22657256	Harrison/faiss merge (#1843 ) Co-authored-by: Ting Su <ting.su.1995@outlook.com>	2023-03-20 22:54:08 -07:00
Harrison Chase	6f02286805	Harrison/subtitles (#1842 ) Co-authored-by: David Ruan <ruanwz@gmail.com> Co-authored-by: David Ruan <david.ruan@analyticservice.net>	2023-03-20 22:53:52 -07:00
Simon Zhou	3674074eb0	Add Qdrant to ecosystem page (#1830 ) Add [Qdrant](https://qdrant.tech/) to [LangChain ecosystem](https://langchain.readthedocs.io/en/latest/ecosystem.html) page.	2023-03-20 22:06:40 -07:00
Wenbin Fang	a7e09d46c5	Add podcast api tool to use NLP to search all podcasts or episodes. (#1833 ) Use the following code to test: ```python import os from langchain.llms import OpenAI from langchain.chains.api import podcast_docs from langchain.chains import APIChain # Get api key here: https://openai.com/pricing os.environ["OPENAI_API_KEY"] = "sk-xxxxx" # Get api key here: https://www.listennotes.com/api/pricing/ listen_api_key = 'xxx' llm = OpenAI(temperature=0) headers = {"X-ListenAPI-Key": listen_api_key} chain = APIChain.from_llm_and_api_docs(llm, podcast_docs.PODCAST_DOCS, headers=headers, verbose=True) chain.run("Search for 'silicon valley bank' podcast episodes, audio length is more than 30 minutes, return only 1 results") ``` Known issues: the api response data might be too big, and we'll get such error: `openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 6733 tokens (6477 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.`	2023-03-20 22:04:17 -07:00
Matt Tucker	fa2e546b76	Add workaround for debugpy install issue to contrib docs. (#1835 ) When following the Quick Start instructions in the contributing docs, I was getting a "WheelFileValidationError" on installation of debugpy which was blocking the installation of a number of other deps. Google turned up this [GitHub issue](https://github.com/microsoft/debugpy/issues/1246) indicating a regression in Poetry 1.4.1 and workarounds. This PR updates the contrib docs noting the issue and the workarounds.	2023-03-20 22:03:19 -07:00
Daniel Dror (Dubovski)	c592b12043	Allow passing in encoding to csv_loader (#1836 )	2023-03-20 22:03:00 -07:00
Ikko Eltociear Ashimine	9555bbd5bb	Fix typo in sqlite.ipynb (#1828 ) overriden -> overridden	2023-03-20 16:47:19 -07:00
Harrison Chase	0ca1641b14	release 0.0.117 (#1819 )	2023-03-20 08:04:04 -07:00
Harrison Chase	d5b4393bb2	Harrison/llm math (#1808 ) Co-authored-by: Vadym Barda <vadim.barda@gmail.com>	2023-03-20 07:53:26 -07:00
Bryan Helmig	7b6ff7fe00	Follow up to #1803 to remove dynamic docs route. (#1818 ) The base docs are going to be more stable and familiar for folks. Dynamic route is currently in flux.	2023-03-20 07:52:41 -07:00
Harrison Chase	76c7b1f677	Harrison/wandb (#1764 ) Co-authored-by: Anish Shah <93145909+ash0ts@users.noreply.github.com>	2023-03-20 07:52:27 -07:00
Paul	5aa8ece211	Corrected small typo in error message. (#1791 )	2023-03-20 07:51:35 -07:00
Harrison Chase	f6d24d5740	fix bug with openai token count (#1806 )	2023-03-20 07:51:18 -07:00
Harrison Chase	b1c4480d7c	fix typing (#1807 )	2023-03-20 07:50:49 -07:00
Daniel Chalef	b6ba989f2f	Add request timeout to ChatOpenAI (#1798 ) Add request_timeout field to ChatOpenAI. Defaults to 60s. --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	2023-03-19 20:19:42 -07:00
Ankush Gola	04acda55ec	Don't use dynamic api endpoint for Zapier NLA (#1803 ) From Robert "Right now the dynamic/ route for specifically the above endpoints is acting on all providers a user has set up, not just the provider for the supplied API key."	2023-03-19 20:12:33 -07:00
Harrison Chase	8e5c4ac867	bump version to 0.0.116 (#1788 )	2023-03-19 11:01:16 -07:00
Aratako	df8702fead	Small fix: Remove unused variable `summary_message_role` (#1789 ) After the changes in #1783, `summary_message_role` is no longer used in `ConversationSummaryBufferMemory`, so this PR removes it.	2023-03-19 11:01:03 -07:00
Harrison Chase	d5d50c39e6	Harrison/azure embeddings (#1787 ) Co-authored-by: Hemant <4627288+ghaccount@users.noreply.github.com>	2023-03-19 10:42:33 -07:00
Harrison Chase	1f18698b2a	Harrison/token buffer memory (#1786 ) Co-authored-by: Aratako <127325395+Aratako@users.noreply.github.com>	2023-03-19 10:42:24 -07:00
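
The `list(text)` bug described in commit ff4a25b841 above can be reproduced without OpenSearch or any embedding model. The sketch below is only an illustration: `embed_documents` here is a hypothetical stand-in (not the LangChain method) that returns one vector per input item, so that it is easy to see what actually gets embedded in each case.

```python
def embed_documents(texts):
    """Hypothetical stand-in for an embedding call: one vector per input item.

    The "vector" is just the item's length, which makes the bug visible.
    """
    return [[float(len(t))] for t in texts]


text = "hello"

# list(text) splits the string into single characters,
# so the first "document" that gets embedded is just "h".
buggy_input = list(text)

# [text] wraps the whole string in a one-element list,
# so the first "document" is the full string.
fixed_input = [text]

buggy = embed_documents(buggy_input)[0]
fixed = embed_documents(fixed_input)[0]

print(buggy_input)  # the characters of the string
print(fixed_input)  # the whole string as one item
```

With the buggy form, only the first character of each text would contribute to its embedding, which is why the commit replaces `list(text)` with `[text]`.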

1423 Commits