langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Harrison Chase	880a6a3db5	Harrison/redis id key (#2057 ) Co-authored-by: Fabrizio Ruocco <ruoccofabrizio@gmail.com>	2023-03-27 15:03:51 -07:00
cragwolfe	71e8eaff2b	UnstructuredURLLoader: allow url failures, keep processing (#1954 ) By default, UnstructuredURLLoader now continues processing remaining `urls` if encountering an error for a particular url. If failure of the entire loader is desired as was previously the case, use `continue_on_failure=False`. E.g., this fails splendidly, courtesy of the 2nd url: ``` from langchain.document_loaders import UnstructuredURLLoader urls = [ "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-8-2023", "https://doesnotexistithinkprobablynotverynotlikely.io", "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-9-2023", ] loader = UnstructuredURLLoader(urls=urls, continue_on_failure=False) data = loader.load() ``` Issue: https://github.com/hwchase17/langchain/issues/1939	2023-03-27 14:34:14 -07:00
Daniel Chalef	6598beacdb	PydanticOutputParser unit test (#2047 ) Unit test for PydanticOutputParser --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	2023-03-27 14:32:56 -07:00
William FH	e4f15e4eac	Add support for YAML Spec Plugins (#2054 ) It's common to use `yaml` for an OpenAPI spec used in the GPT plugins. For example: https://www.joinmilo.com/openapi.yaml or https://api.slack.com/specs/openapi/ai-plugin.yaml (from [Wong2's ChatGPT Plugins List](https://github.com/wong2/chatgpt-plugins))	2023-03-27 14:27:48 -07:00
weiyang	e50c1ea7fb	Fix the parameter error of 'Qdrant.maximal_marginal_relevance' (#1921 ) Hi, first and foremost, I would like to express my gratitude for your outstanding work; it's truly remarkable! https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/qdrant.py#L134 It appears that there might be a minor issue with the `limit` parameter being passed incorrectly in the `Qdrant.maximal_marginal_relevance` function. This seems to be a typographical error. Signed-off-by: weiyang <weiyang.ones@gmail.com>	2023-03-27 08:29:07 -07:00
goka	62e08f80de	feat #1915 support for google custom search site restricted api (#1920 ) #1915 https://developers.google.com/custom-search/v1/site_restricted_api It is possible to search unrestricted to specific sites.	2023-03-27 08:28:55 -07:00
david qiu	c50fafb35d	fix Poetry 1.4.0+ installation (#1935 ) Temporary fix for #1801 until upstream issues with `pydata-sphinx-theme` wheel are resolved.	2023-03-27 08:27:54 -07:00
Jason Holtkamp	3d3e523520	Update getting_started with better example (#1910 ) I noticed that the "getting started" guide section on agents included an example test where the agent was getting the question wrong 😅 I guess Olivia Wilde's dating life is too tough to keep track of for this simple agent example. Let's change it to something a little easier, so users who are running their agent for the first time are less likely to be confused by a result that doesn't match what is in the docs.	2023-03-27 08:19:13 -07:00
Eduard van Valkenburg	c1a9d83b34	Added Azure Blob Storage File and Container Loader (#1890 ) Added support for document loaders for Azure Blob Storage using a connection string. Fixes #1805 --------- Co-authored-by: Mick Vleeshouwer <mick@imick.nl>	2023-03-27 08:17:14 -07:00
Harrison Chase	42d725223e	Harrison/num token calculation (#2041 ) Co-authored-by: Aratako <127325395+Aratako@users.noreply.github.com>	2023-03-27 08:16:32 -07:00
Harrison Chase	0bbcc7815b	Harrison/open search kwargs (#2040 ) Signed-off-by: Marcel Coetzee <marcelcoetzee@tutanota.com> Co-authored-by: Marcel <34739235+Pipboyguy@users.noreply.github.com>	2023-03-27 07:56:09 -07:00
Harrison Chase	b26fa1935d	fix headers (#2039 )	2023-03-27 07:55:57 -07:00
Harrison Chase	bc2ed93b77	fix doc tags (#2019 )	2023-03-26 21:43:51 -07:00
Ankush Gola	c71f2a7b26	small nit on index page (#2018 )	2023-03-27 00:15:24 -04:00
Harrison Chase	51681f653f	fix docs (#2017 )	2023-03-26 20:50:36 -07:00
Harrison Chase	705431aecc	big docs refactor (#1978 ) Co-authored-by: Ankush Gola <ankush.gola@gmail.com>	2023-03-26 19:49:46 -07:00
Harrison Chase	b83e826510	plugin tool (#1974 )	2023-03-24 12:30:08 -07:00
Mario Kostelac	e7d6de6b1c	(ChatOpenAI) Add model_name to LLMResult.llm_output (#1960 ) This makes sure OpenAI and ChatOpenAI have the same llm_output, and allow tracking usage per model. Same work for OpenAI was done in https://github.com/hwchase17/langchain/pull/1713.	2023-03-24 08:51:16 -07:00
Harrison Chase	6e0d3880df	bump version to 122 (#1970 )	2023-03-24 08:24:44 -07:00
Harrison Chase	6ec5780547	add docs for openai retriever ingest (#1969 )	2023-03-24 08:24:33 -07:00
Harrison Chase	47d37db2d2	WIP: Harrison/base retriever (#1765 )	2023-03-24 07:46:49 -07:00
Enwei Jiao	4f364db9a9	Add milvus for ecosystem (#1951 )	2023-03-23 22:01:28 -07:00
Tim Asp	030ce9f506	fix import error of bs4 (#1952 ) Ran into a broken build if bs4 wasn't installed in the project. Minor tweak to follow the other doc loaders' optional package-loading conventions. Also updated html docs to include reference to this new html loader. Side note: Should there be 2 different html-to-text document loaders? This new one only handles local files, while the existing unstructured html loader handles HTML from local and remote. So it seems like the improvement was adding the title to the metadata, which is useful but could also be added to `html.py`	2023-03-23 21:56:13 -07:00
Harrison Chase	8990122d5d	retrievers interface (#1948 )	2023-03-23 19:00:38 -07:00
Harrison Chase	52d6bf04d0	tracing improvements to docs (#1947 )	2023-03-23 19:00:18 -07:00
Harrison Chase	910da8518f	hotfix (#1928 )	2023-03-23 07:11:15 -07:00
Naoki Ainoya	2f27ef92fe	Fix typo in VectorStoreIndexWrapper method (#1922 ) Fixed a typo in the argument of the query method within the VectorStoreIndexWrapper class. Specifically, the argument `retriver` has been changed to `retriever`. With this correction, the correct argument name is used, and potential bugs are avoided.	2023-03-23 07:08:04 -07:00
Harrison Chase	75149d6d38	bump version 120 (#1918 )	2023-03-22 23:21:56 -07:00
Harrison Chase	fab7994b74	Harrison/retrieval code (#1916 )	2023-03-22 23:15:04 -07:00
Harrison Chase	eb80d6e0e4	Harrison/from methods (#1912 ) Co-authored-by: shibuiwilliam <shibuiyusuke@gmail.com>	2023-03-22 21:10:09 -07:00
Harrison Chase	b5667bed9e	human input default (#1911 )	2023-03-22 20:30:45 -07:00
Eric Zhu	b3be83c750	Add human as a tool (#1879 ) Human can help AI. #1871	2023-03-22 20:14:52 -07:00
Harrison Chase	50626a10ee	Hx23840 feat/add redisearch vectorstore (#1909 ) Co-authored-by: Peter <peter.shi@alephf.com> Co-authored-by: Peter Shi <42536066+hx23840@users.noreply.github.com>	2023-03-22 19:57:56 -07:00
Harrison Chase	6e1b5b8f7e	Harrison/figma doc loader (#1908 ) Co-authored-by: Ismail Pelaseyed <homanp@gmail.com>	2023-03-22 19:57:46 -07:00
Harrison Chase	eec9b1b306	Harrison/opensearch vectorstore (#1907 ) Co-authored-by: Mehmet Öner Yalçın <oneryalcin@gmail.com>	2023-03-22 19:57:38 -07:00
Xin Qiu	ea142f6a32	feat: add drop index in redis and fix prefix generate logic (#1857 ) # Description Add `drop_index` for redis RediSearch: [RediSearch quick start](https://redis.io/docs/stack/search/quick_start/) # How to use ``` from langchain.vectorstores.redis import Redis Redis.drop_index(index_name="doc",delete_documents=False) ```	2023-03-22 19:44:42 -07:00
Eli	12f868b292	Propagate "filter" arg in Chroma similarity_search (#1869 ) Technically a duplicate fix to #1619 but with unit tests and a small documentation update - Propagate `filter` arg in Chroma `similarity_search` to delegated call to `similarity_search_with_score` - Add `filter` arg to `similarity_search_by_vector` - Clarify doc strings on FakeEmbeddings	2023-03-22 19:40:10 -07:00
Memento Mori	31f9ecfc19	Fix tiktoken version (#1882 ) Fix https://github.com/hwchase17/langchain/issues/1881 This issue occurs when using `'gpt-3.5-turbo'` with `VectorDBQAWithSourcesChain`	2023-03-22 19:39:57 -07:00
Eric Zhu	273e9bf296	Simplify AzureChatOpenAI implementation. (#1902 ) Change AzureChatOpenAI class implementation as Azure just added support for chat completion API. See: https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/chatgpt?pivots=programming-language-chat-completions. This should make the code much simpler.	2023-03-22 19:36:51 -07:00
Maurício Maia	f155d9d3ec	Add metadata filter to PGVector search (#1872 ) Add ability to filter pgvector documents by metadata.	2023-03-22 15:21:40 -07:00
Klein Tahiraj	d3d4503ce2	Remove redundant .docx loader (closes #1716 ) + update how_to_guides.rst (#1891 ) In https://github.com/hwchase17/langchain/issues/1716 , it was identified that there were two .py files performing similar tasks. As a resolution, one of the files has been removed, as its purpose had already been fulfilled by the other file. Additionally, the init has been updated accordingly. Furthermore, the how_to_guides.rst file has been updated to include links to documentation that was previously missing. This was deemed necessary as the existing list on https://langchain.readthedocs.io/en/latest/modules/document_loaders/how_to_guides.html was incomplete, causing confusion for users who rely on the full list of documentation on the left sidebar of the website.	2023-03-22 15:19:42 -07:00
Harrison Chase	1f93c5cf69	extraction docs (#1898 )	2023-03-22 15:00:44 -07:00
Sean Zheng	15b5a08f4b	Update how_to_guides.rst (#1893 ) Adding OpenSearch examples	2023-03-22 14:30:43 -07:00
Kushal Chordiya	ff4a25b841	Fix minor bug in opensearch vector store add_texts function (#1878 ) In langchain.vectorstores.opensearch_vector_search.py, in the add_texts function, around line 247, we have the following code: ```python embeddings = [ self.embedding_function.embed_documents(list(text))[0] for text in texts ] ``` I believe the goal of the `list(text)` part is to pass a list to embed_documents instead of a str. However, `list(text)` is a subtle bug: it converts the string into a list in which each element is a single character of the string. The correct way is to change the code to ```python embeddings = [ self.embedding_function.embed_documents([text])[0] for text in texts ] ``` which wraps the string inside a list.	2023-03-22 11:27:32 -07:00
Maurício Maia	2212520a6c	Add PGVector collection metadata (#1887 ) The `CollectionStore` for `PGVector` has a `cmetadata` field but it's never used. This PR adds the ability to save metadata information to the collection.	2023-03-22 11:27:07 -07:00
Harrison Chase	d08f940336	principles list (#1888 )	2023-03-22 10:48:38 -07:00
Harrison Chase	2280a2cb2f	bump version to 119 (#1886 )	2023-03-22 08:36:09 -07:00
Harrison Chase	ce5d97bcb3	Harrison/guarded output parser (#1804 ) Co-authored-by: jerwelborn <jeremy.welborn@gmail.com>	2023-03-21 22:07:23 -07:00
DeadBranch	8fa1764c60	docs: update gpt index references to LlamaIndex (#1856 ) The GPT Index project is transitioning to the new project name, LlamaIndex. I've updated a few files referencing the old project name and repository URL to the current ones. From the [LlamaIndex repo](https://github.com/jerryjliu/llama_index): > NOTE: We are rebranding GPT Index as LlamaIndex! We will carry out this transition gradually. > > 2/25/2023: By default, our docs/notebooks/instructions now reference "LlamaIndex" instead of "GPT Index". > > 2/19/2023: By default, our docs/notebooks/instructions now use the llama-index package. However the gpt-index package still exists as a duplicate! > > 2/16/2023: We have a duplicate llama-index pip package. Simply replace all imports of gpt_index with llama_index if you choose to pip install llama-index. I'm not associated with LlamaIndex in any way. I just noticed the discrepancy when studying the LangChain documentation.	2023-03-21 22:01:05 -07:00
Harrison Chase	f299bd1416	clean up sagemaker nb (#1875 )	2023-03-21 22:00:08 -07:00
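
The `list(text)` versus `[text]` bug described in commit ff4a25b841 can be reproduced without any vector store. The sketch below is only an illustration: `FakeEmbedder` is a hypothetical stand-in (not a LangChain class) whose "embedding" is just the length of each input item, so the difference between the two calls is easy to see.

```python
class FakeEmbedder:
    """Hypothetical stand-in for an embedding function (assumption, not LangChain)."""

    def embed_documents(self, texts):
        # Return one "vector" per input item; here the vector is just the item's length.
        return [[float(len(t))] for t in texts]


embedder = FakeEmbedder()
texts = ["hello", "hi"]

# Buggy form: list(text) splits the string into single characters,
# so only the first character of each text gets embedded.
buggy = [embedder.embed_documents(list(text))[0] for text in texts]

# Fixed form: [text] wraps the whole string in a one-element list.
fixed = [embedder.embed_documents([text])[0] for text in texts]

print(buggy)
print(fixed)
```

With the buggy form every text collapses to the embedding of its first character, which is why the fix wraps each string in a list instead.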

950 Commits