langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-08 07:10:35 +00:00

Author	SHA1	Message	Date
Virat Singh	a9dddd8a32	Virat/add param to optionally not refresh ES indices (#2233 ) Context Noticed a TODO in `langchain/vectorstores/elastic_vector_search.py` for adding the option to NOT refresh ES indices Change Added a param to `add_texts()` called `refresh_indices` to not refresh ES indices. The default value is `True` so that existing behavior does not break.	2023-04-01 12:53:02 -07:00
leo-gan	579ad85785	skip unit tests that fail in Windows (#2238 ) Issue #2174 Several unit tests fail in Windows. Added pytest attribute to skip these tests automatically.	2023-04-01 12:52:21 -07:00
Harrison Chase	609b14a570	Harrison/sql alchemy (#2216 ) Co-authored-by: Jason B. Hart <jasonbhart@users.noreply.github.com>	2023-04-01 12:52:08 -07:00
Sam Cordner-Matthews	1ddd6dbf0b	Add ability to pass kwargs to loader classes in `DirectoryLoader`, add ability to modify encoding and BeautifulSoup behaviour in `BSHTMLLoader` (#2275 ) Solves #2247. Noted that the only test I added checks for the BeautifulSoup behaviour change. Happy to add a test for `DirectoryLoader` if deemed necessary.	2023-04-01 12:48:27 -07:00
James Olds	2d0ff1a06d	Update apis.md (#2278 )	2023-04-01 12:48:16 -07:00
sergerdn	09f9464254	feat: add Dockerfile to run unit tests in a Docker container (#2188 ) This makes it easy to run the tests locally. Some tests may not be able to run in `Windows` environments, hence the need for a `Dockerfile`.   The new `Dockerfile` sets up a multi-stage build to install Poetry and dependencies, and then copies the project code to a final image for tests.   The `Makefile` has been updated to include a new 'docker_tests' target that builds the Docker image and runs the `unit tests` inside a container. It would be beneficial to offer a local testing environment for developers by enabling them to run a Docker image on their local machines with the required dependencies, particularly for integration tests. While this is not included in the current PR, it would be straightforward to add in the future. This pull request lacks documentation of the changes made at this moment.	2023-04-01 09:00:09 -07:00
Harrison Chase	582950291c	remote retriever (#2232 )	2023-04-01 08:59:04 -07:00
JC Touzalin	5a0844bae1	Open a Deeplake dataset in read only mode (#2240 ) I'm using Deeplake as a vector store for a Q&A application. When several questions are being processed at the same time for the same dataset, the 2nd one triggers the following error: > LockedException: This dataset cannot be open for writing as it is locked by another machine. Try loading the dataset with `read_only=True`. Answering questions doesn't require writing new embeddings so it's ok to open the dataset in read only mode at that time. This pull request thus adds the `read_only` option to the Deeplake constructor and to its subsequent `deeplake.load()` call. The related Deeplake documentation is [here](https://docs.deeplake.ai/en/latest/deeplake.html#deeplake.load). I've tested this update on my local dev environment. I don't know if an integration test and/or additional documentation are expected, however. Let me know if they are, ideally with some guidance as I'm not particularly experienced in Python.	2023-04-01 08:58:53 -07:00
Travis Hammond	e49284acde	Add encoding parameter to TextLoader (#2250 ) This merge request proposes changes to the TextLoader class to make it more flexible and robust when handling text files with different encodings. The current implementation of TextLoader does not provide a way to specify the encoding of the text file being read. As a result, it might lead to incorrect handling of files with non-default encodings, causing issues with loading the content. Benefits: - The proposed changes will make the TextLoader class more flexible, allowing it to handle text files with different encodings. - The changes maintain backward compatibility, as the encoding parameter is optional.	2023-04-01 08:57:17 -07:00
akmhmgc	67dde7d893	Add wikipedia api example (#2267 ) # description Thanks for the awesome repository!! I added an example for the wikipedia api wrapper.	2023-04-01 08:57:04 -07:00
Abdulla Al Blooshi	90e388b9f8	Update simple typo in llm_bash md (#2269 )	2023-04-01 08:56:54 -07:00
Patrick Storm	64f44c6483	Add titles to metadatas in gdrive loader (#2260 ) I noticed the Googledrive loader does not have the "title" metadata for google docs and PDFs. This just adds that info to match the sheets.	2023-04-01 08:43:34 -07:00
Francis Felici	4b59bb55c7	update vectorstore.ipynb (#2239 ) Hello! Maybe there's a mistake in the .ipynb, where `create_vectorstore_agent` should be `create_vectorstore_router_agent` Cheers!	2023-03-31 17:49:23 -07:00
Tim Asp	7a8f1d2854	Add total_cost estimates based on token count for openai (#2243 ) We have completion and prompt tokens, model names, so if we can, let's keep a running total of the cost.	2023-03-31 17:46:37 -07:00
LaloLalo1999	632c2b49da	Fixed the link to promptlayer dashboard (#2246 ) Fixed a simple error where in the PromptLayer LLM documentation, the "PromptLayer dashboard" hyperlink linked to "https://ww.promptlayer.com" instead of "https://www.promptlayer.com". Solved issue #2245	2023-03-31 16:16:23 -07:00
Harrison Chase	e57b045402	bump version to 128 (#2236 )	2023-03-31 11:16:21 -07:00
Philipp Schmid	0ce4767076	Add `__version__` (#2221 ) # What does this PR do? This PR adds the `__version__` variable in the main `__init__.py` to easily retrieve the version, e.g., for debugging purposes or when a user wants to open an issue and provide information. Usage ```python >>> import langchain >>> langchain.__version__ '0.0.127' ``` ![Bildschirmfoto 2023-03-31 um 10 30 18](https://user-images.githubusercontent.com/32632186/229068621-53d068b5-32f4-4154-ad2c-a3e1cc7e1ef3.png)	2023-03-31 09:49:12 -07:00
Kevin Kermani Nejad	6c66f51fb8	add error message to the google drive document loader (#2186 ) When downloading a google doc, if the document is not a google doc type, for example if you uploaded a .DOCX file to your google drive, the error you get is not informative at all. I added an error handler which prints the exact error that occurred while downloading the document from google docs.	2023-03-30 20:58:27 -07:00
Harrison Chase	2eeaccf01c	Harrison/apify (#2215 ) Co-authored-by: Jiří Moravčík <jiri.moravcik@gmail.com>	2023-03-30 20:58:14 -07:00
Alex Stachowiak	e6a9ee64b3	Update vectorstore-retriever.ipynb (#2210 )	2023-03-30 20:51:46 -07:00
Arttii	4e9ee566ef	Add MMR methods to chroma (#2148 ) Hi, I added MMR similar to faiss and milvus to chroma. Please let me know what you think.	2023-03-30 20:51:16 -07:00
Harrison Chase	fc009f61c8	sitemap more flexible (#2214 )	2023-03-30 20:46:36 -07:00
Matt Robinson	3dfe1cf60e	feat: document loader for epublications (#2202 ) ### Summary Adds a new document loader for processing e-publications. Works with `unstructured>=0.5.4`. You need to have [`pandoc`](https://pandoc.org/installing.html) installed for this loader to work. ### Testing ```python from langchain.document_loaders import UnstructuredEPubLoader loader = UnstructuredEPubLoader("winter-sports.epub", mode="elements") data = loader.load() data[0] ```	2023-03-30 20:45:31 -07:00
Ikko Eltociear Ashimine	a4a1ee6b5d	Update huggingface_length_function.ipynb (#2203 ) HuggingFace -> Hugging Face	2023-03-30 20:43:58 -07:00
Harrison Chase	2d3918c152	make requests more general (#2209 )	2023-03-30 20:41:56 -07:00
Harrison Chase	1c03205cc2	embedding docs (#2200 )	2023-03-30 08:34:14 -07:00
Harrison Chase	feec4c61f4	Harrison/docs reqs (#2199 )	2023-03-30 08:20:30 -07:00
Harrison Chase	097684e5f2	bump version to 127 (#2197 )	2023-03-30 08:11:04 -07:00
Ben Heckmann	fd1fcb5a7d	fix typing for LLMMathChain (#2183 ) Fix typing in LLMMathChain to allow chat models (#1834). Might have been forgotten in related PR #1807.	2023-03-30 07:52:58 -07:00
Cory Zue	3207a74829	fix typo in chat_prompt_template docs (#2193 )	2023-03-30 07:52:40 -07:00
Alan deLevie	597378d1f6	Small typo in custom_agent.ipynb (#2194 ) determin -> determine	2023-03-30 07:52:29 -07:00
Jeru2023	64b9843b5b	Update text.py (#2195 ) Add encoding parameter when open txt file to support unicode files.	2023-03-30 07:52:17 -07:00
Rui Ferreira	5d86a6acf1	Fix wikipedia summaries (#2187 ) This upstream wikipedia page loading seems to still have issues. Finding a compromise solution where it does an exact match search and not a search for the completion. See previous PR: https://github.com/hwchase17/langchain/pull/2169	2023-03-30 07:34:13 -07:00
Kei Kamikawa	35a3218e84	supported async retriever (#2149 )	2023-03-30 10:14:05 -04:00
Harrison Chase	65c0c73597	Harrison/arize (#2180 ) Co-authored-by: Hakan Tekgul <tekgul2@illinois.edu>	2023-03-29 22:55:21 -07:00
Harrison Chase	33a001933a	Harrison/clear ml (#2179 ) Co-authored-by: Victor Sonck <victor.sonck@gmail.com>	2023-03-29 22:45:34 -07:00
Harrison Chase	fe804d2a01	Harrison/aim integration (#2178 ) Co-authored-by: Hovhannes Tamoyan <hovhannes.tamoyan@gmail.com> Co-authored-by: Gor Arakelyan <arakelyangor10@gmail.com>	2023-03-29 22:37:56 -07:00
Gene Ruebsamen	68f039704c	missing word 'not' in constitutional prompts (#2176 ) arson should not be condoned. not was missing in the critique	2023-03-29 22:29:48 -07:00
Harrison Chase	bcfd071784	Harrison/engine args (#2177 ) Co-authored-by: Alvaro Sevilla <alvarosevilla95@gmail.com>	2023-03-29 22:29:38 -07:00
Tim Asp	7d90691adb	Add kwargs to from_* in PromptTemplate (#2161 ) This will let us use output parsers, etc, while using the `from_*` helper functions	2023-03-29 22:13:27 -07:00
Rui Ferreira	f83c36d8fd	Fix incorrect wikipage summaries (#2169 ) Creating a page using the title causes a wikipedia search with autocomplete set to true. This frequently causes the summaries to be unrelated to the actual page found. See: `1554943e8a/wikipedia/wikipedia.py (L254-L280)`	2023-03-29 22:13:03 -07:00
Tim Asp	6be67279fb	Add apredict_and_parse to LLM (#2164 ) `predict_and_parse` exists, and it's a nice abstraction to allow for applying output parsers to LLM generations. And async is very useful. As an aside, the difference between `call/acall`, `predict/apredict` and `generate/agenerate` isn't entirely clear to me other than they all call into the LLM in slightly different ways. Is there some documentation or a good way to think about these differences? One thought: output parsers should just work magically for all those LLM calls. If the `output_parser` arg is set on the prompt, the LLM has access, so it seems like extra work on the user's end to have to call `output_parser.parse` If this sounds reasonable, happy to throw something together. @hwchase17	2023-03-29 22:12:50 -07:00
Max Caldwell	3dc49a04a3	[Documents] Updated Figma docs and added example (#2172 ) - Current docs are pointing to the wrong module, fixed - Added some explanation on how to find the necessary parameters - Added chat-based codegen example w/ retrievers Picture of the new page: ![Screenshot 2023-03-29 at 20-11-29 Figma — 🦜🔗 LangChain 0 0 126](https://user-images.githubusercontent.com/2172753/228719338-c7ec5b11-01c2-4378-952e-38bc809f217b.png) Please let me know if you'd like any tweaks! I wasn't sure if the example was too heavy for the page or not but decided "hey, I probably would want to see it" and so included it. Co-authored-by: maxtheman <max@maxs-mbp.lan>	2023-03-29 22:11:45 -07:00
Harrison Chase	5c907d9998	Harrison/base agent without docs (#2166 )	2023-03-29 22:11:25 -07:00
Zoltan Fedor	1b7cfd7222	Bugfix: Redis `lrange()` retrieves records in opposite order of inserting (#2167 ) The new functionality of Redis backend for chat message history ([see](https://github.com/hwchase17/langchain/pull/2122)) uses the Redis list object to store messages and then uses the `lrange()` to retrieve the list of messages ([see](https://github.com/hwchase17/langchain/blob/master/langchain/memory/chat_message_histories/redis.py#L50)). Unfortunately this retrieves the messages as a list sorted in the opposite order of how they were inserted - meaning the last inserted message will be first in the retrieved list - which is not what we want. This PR fixes that as it changes the order to match the order of insertion.	2023-03-29 22:09:01 -07:00
blob42	7859245fc5	doc: more details on BaseOutputParser docstrings (#2171 ) Co-authored-by: blob42 <spike@w530>	2023-03-29 22:07:05 -07:00
Ankush Gola	529a1f39b9	make tool verbosity override agent verbosity (#2173 ) Currently, if a tool is set to verbose, an agent can override it by passing in its own verbose flag. This is not ideal if we want to stream back responses from agents, as we want the llm and tools to be sending back events but nothing else. This also makes the behavior consistent with ts.	2023-03-29 22:05:58 -07:00
Harrison Chase	f5a4bf0ce4	remove prep (#2136 ) agents should be stateless or async stuff may not work	2023-03-29 14:38:21 -07:00
sergerdn	a0453ebcf5	docs: update docstrings in ElasticVectorSearch class (#2141 ) This merge includes updated comments in the ElasticVectorSearch class to provide information on how to connect to `Elasticsearch` instances that require login credentials, including Elastic Cloud, without any functional changes. The `ElasticVectorSearch` class now inherits from the `ABC` abstract base class, which does not break or change any functionality. This allows for easy subclassing and creation of custom implementations in the future or for any users, especially for me 😄 I confirm that before pushing these changes, I ran: ```bash make format && make lint ``` To ensure that the new documentation is rendered correctly I ran ```bash make docs_build ``` To ensure that the new documentation has no broken links, I ran a check ```bash make docs_linkcheck ``` ![Capture](https://user-images.githubusercontent.com/64213648/228541688-38f17c7b-b012-4678-86b9-4dd607469062.JPG) Also take a look at https://github.com/hwchase17/langchain/issues/1865 P.S. Sorry for spamming you with force-pushes. In the future, I will be smarter.	2023-03-29 16:20:29 -04:00
Ankush Gola	ffb7de34ca	Fix docstring (#2147 ) (#2160 ) Somehow docstring was doubled. A minor fix for this --------- Co-authored-by: Piotr Mazurek <piotr635@gmail.com>	2023-03-29 16:17:54 -04:00
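
The ordering issue described in commit 1b7cfd7222 can be reproduced without a Redis server. The sketch below is an illustration only, not the code from that PR: `FakeRedisList` is a hypothetical in-memory stand-in, and it assumes messages are stored with a head insert (as Redis `LPUSH` does) and read back head to tail (as `LRANGE key 0 -1` does).

```python
from collections import deque


class FakeRedisList:
    """Hypothetical in-memory stand-in for one Redis list (not a Redis client)."""

    def __init__(self):
        self._items = deque()

    def lpush(self, value):
        # Like Redis LPUSH: insert at the head of the list.
        self._items.appendleft(value)

    def lrange_all(self):
        # Like Redis LRANGE key 0 -1: return every element, head to tail.
        return list(self._items)


history = FakeRedisList()
for message in ["first", "second", "third"]:
    history.lpush(message)

# Raw read: the last inserted message comes back first.
raw = history.lrange_all()

# Reversing the raw read restores insertion order.
in_insert_order = raw[::-1]

print(raw)
print(in_insert_order)
```

Reversing on read is only one way to restore insertion order; appending at the tail instead of the head would be another. Which approach the PR took is not stated in the commit message.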

1058 Commits