langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Bagatur	c7a5bb6031	bump 270 (#9549 )	2023-08-21 10:18:46 -07:00
Nuno Campos	28e1ee4891	Nc/small fixes 21aug (#9542 )	2023-08-21 18:01:20 +01:00
Bagatur	d11841d760	bump 269 (#9487 )	2023-08-21 08:34:16 -07:00
axiangcoding	05aa02005b	feat(llms): support ERNIE Embedding-V1 (#9370 ) - Description: support [ERNIE Embedding-V1](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/alj562vvu), which is part of ERNIE ecology - Issue: None - Dependencies: None - Tag maintainer: @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 07:52:25 -07:00
José Ferraz Neto	f116e10d53	Add SharePoint Loader (#4284 ) - Added a loader (`SharePointLoader`) that can pull documents (`pdf`, `docx`, `doc`) from the [SharePoint Document Library](https://support.microsoft.com/en-us/office/what-is-a-document-library-3b5976dd-65cf-4c9e-bf5a-713c10ca2872). - Added a Base Loader (`O365BaseLoader`) to be used for all Loaders that use [O365](https://github.com/O365/python-o365) Package - Code refactoring on `OneDriveLoader` to use the new `O365BaseLoader`. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 07:49:07 -07:00
Utku Ege Tuluk	bb4f7936f9	feat(llms): add streaming support to textgen (#9295 ) - Description: Added streaming support to the textgen component in the llms module. - Dependencies: websocket-client = "^1.6.1"	2023-08-21 07:39:14 -07:00
Eugene Yurtsev	02c5c13a6e	Fast linters go first (#9501 ) Proposal to reverse the order of linters based on the principle of running the fast ones first.	2023-08-21 00:20:54 -07:00
Ofer Mendelevitch	a758496236	Fixed issue with metadata in query (#9500 ) - Description: Changed metadata retrieval so that it combines Vectara doc level and part level metadata - Tag maintainer: @rlancemartin - Twitter handle: @ofermend	2023-08-20 16:00:14 -07:00
Eugene Yurtsev	e51bccdb28	Add strict flag to the JSON parser (#9471 ) This updates the default configuration since I think it's almost always what we want to happen. But we should evaluate whether there are any issues.	2023-08-19 22:02:12 -04:00
Predrag Gruevski	be9bc62f8b	Fix bash test regex for Linux under WSL2. (#9475 ) It fails with `Permission denied` and not `not found`. Both seem reasonable.	2023-08-19 09:27:14 -04:00
Lorenzo	5b3dbf12a5	Uniform valid suffixes and clarify exceptions (#9463 ) Description: - Uniformed the current valid suffixes (file formats) for loading agents from hubs and files (to better handle future additions); - Clarified exception messages (also in unit test).	2023-08-18 21:35:53 -07:00
Brendan Collins	9f545825b7	Added Geometry Validation, Geometry Metadata, and WKT instead of Python str() to GeoDataFrame Loader (#9466 ) @rlancemartin The current implementation within `Geopandas.GeoDataFrame` loader uses the python builtin `str()` function on the input geometries. While this looks very close to WKT (Well known text), Python's str function doesn't guarantee that. In the interest of interop., I've changed to use the `wkt` property on the Shapely geometries for generating the text representation of the geometries. Also, included here: - validation of the input `page_content_column` as being a GeoSeries. - geometry `crs` (Coordinate Reference System) / bounds (xmin/ymin/xmax/ymax) added to Document metadata. Having the CRS is critical... having the bounds is just helpful! I think there is a larger question of "Should the geometry live in the `page_content`, or should the record be better summarized and tuck the geom into metadata?" ...something for another day and another PR.	2023-08-18 21:35:39 -07:00
Kacper Łukawski	616e728ef9	Enhance qdrant vs using async embed documents (#9462 ) This is an extension of #8104. I updated some of the signatures so all the tests pass. @danhnn I couldn't commit to your PR, so I created a new one. Thanks for your contribution! @baskaryan Could you please merge it? --------- Co-authored-by: Danh Nguyen <dnncntt@gmail.com>	2023-08-18 18:59:48 -07:00
Matt Robinson	83d2a871eb	fix: apply unstructured preprocess functions (#9473 ) ### Summary Fixes a bug from #7850 where post processing functions in Unstructured loaders were not applied. Adds an assertion to the test to verify the post processing function was applied and also updates the explanation in the example notebook.	2023-08-18 18:54:28 -07:00
William FH	292ae8468e	Let you specify run id in trace as chain group (#9484 ) I think we'll deprecate this soon anyway but still nice to be able to fetch the run id	2023-08-18 17:21:53 -07:00
Predrag Gruevski	df8e35fd81	Remove incorrect ABC from two Elasticsearch classes. (#9470 ) Neither is an ABC because their own example code instantiates them directly.	2023-08-18 15:01:02 -04:00
Predrag Gruevski	82f28ca9ef	`ChatPromptTemplate` is not an `ABC`, it's instantiated directly. (#9468 ) Its own `__add__` method constructs `ChatPromptTemplate` objects directly, it cannot be abstract. Found while debugging something else with @nfcampos.	2023-08-18 14:37:10 -04:00
vamseeyarla	82fb56b79c	Issue 9401 - SequentialChain runs the same callbacks over and over in async mode (#9452 ) Issue: https://github.com/langchain-ai/langchain/issues/9401 In the Async mode, SequentialChain implementation seems to run the same callbacks over and over since it is re-using the same callbacks object. Langchain version: 0.0.264, master The implementation of this async route differs from the sync route and sync approach follows the right pattern of generating a new callbacks object instead of re-using the old one and thus avoiding the cascading run of callbacks at each step. Async mode: ``` _run_manager = run_manager or AsyncCallbackManagerForChainRun.get_noop_manager() callbacks = _run_manager.get_child() ... for i, chain in enumerate(self.chains): _input = await chain.arun(_input, callbacks=callbacks) ... ``` Regular mode: ``` _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager() for i, chain in enumerate(self.chains): _input = chain.run(_input, callbacks=_run_manager.get_child(f"step_{i+1}")) ... ``` Notice how we are reusing the callbacks object in the Async code which will have a cascading effect as we run through the chain. It runs the same callbacks over and over resulting in issues. Solution: Define the async function in the same pattern as the regular one and added tests. --------- Co-authored-by: vamsee_yarlagadda <vamsee.y@airbnb.com>	2023-08-18 11:26:12 -07:00
William FH	c29fbede59	Wfh/rm num repetitions (#9425 ) Makes it hard to do test run comparison views and we'd probably want to just run multiple runs right now	2023-08-18 10:08:39 -07:00
Predrag Gruevski	eee0d1d0dd	Update repository links in the package metadata. (#9454 )	2023-08-18 12:55:43 -04:00
Bagatur	50b8f4dcc7	bump 268 (#9455 )	2023-08-18 08:46:39 -07:00
Nuno Campos	d5eb228874	Add kwargs to all other optional runnable methods (#9439 )	2023-08-18 15:04:26 +01:00
Leonid Ganeline	a3dd4dcadf	📖 docstrings `retrievers` consistency (#9422 ) 📜 - updated the top-level descriptions to a consistent format; - changed the format of several 100% internal functions from "name" to "_name". So, these functions are not shown in the Top-level API Reference page (with lists of classes/functions)	2023-08-18 09:20:39 -04:00
Nuno Campos	9417961b17	Add lock on tee peer cleanup (#9446 )	2023-08-18 14:20:09 +01:00
Jacob Lee	0689628489	Adds streaming for runnable maps (#9283 ) @nfcampos @baskaryan --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-08-18 07:46:23 +01:00
Aashish Saini	ce78877a87	Replaced instances of raising ValueError with raising ImportError. (#9388 ) Refactored code to ensure consistent handling of ImportError. Replaced instances of raising ValueError with raising ImportError. The choice of raising a ValueError here is somewhat unconventional and might lead to confusion for anyone reading the code. Typically, when dealing with import-related errors, the recommended approach is to raise an ImportError with a descriptive message explaining the issue. This provides a clearer indication that the problem is related to importing the required module. @hwchase17 , @baskaryan , @eyurtsev Thanks Aashish --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-17 12:24:08 -07:00
Bagatur	8c986221e4	make openapi_schema_pydantic opt (#9408 )	2023-08-17 11:49:23 -07:00
Eugene Yurtsev	77b359edf5	More missing type annotations (#9406 ) This PR fills in more missing type annotations on pydantic models. It's OK if it missed some annotations, we just don't want it to get annotations wrong at this stage. I'll do a few more passes over the same files!	2023-08-17 12:19:50 -04:00
Bagatur	a69d1b84f4	bump 267 (#9403 )	2023-08-17 08:47:13 -07:00
Nuno Campos	c0d67420e5	Use a submodule for pydantic v1 compat (#9371 )	2023-08-17 16:35:49 +01:00
Bagatur	995ef8a7fc	unpin pydantic (#9356 )	2023-08-17 01:55:46 -07:00
Tong Gao	3c8e9a9641	Fix typos in eval_chain.py (#9365 ) Fixed two minor typos.	2023-08-17 01:53:46 -07:00
Eugene Yurtsev	2673b3a314	Create pydantic v1 namespace in langchain (#9254 ) Create pydantic v1 namespace in langchain experimental	2023-08-16 21:19:31 -07:00
Eugene Yurtsev	4c2de2a7f2	Adding missing types in some pydantic models (#9355 ) * Adding missing types in some pydantic models -- this change is required for making the code work with pydantic v2.	2023-08-16 20:10:34 -07:00
Harrison Chase	1c089cadd7	fix import v2 (#9346 )	2023-08-16 17:33:01 -07:00
qqjettkgjzhxmwj	84a97d55e1	Fix typo in llm_router.py (#9322 ) Fix typo	2023-08-16 15:56:44 -07:00
Joe Reuter	09aa1eac03	Airbyte loaders: Fix last_state getter (#9314 ) This PR fixes the Airbyte loaders when doing incremental syncs. The notebooks are calling out to access `loader.last_state` to get the current state of incremental syncs, but this didn't work due to a refactoring of how the loaders are structured internally in the original PR. This PR fixes the issue by adding a `last_state` property that forwards the state correctly from the CDK adapter. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-16 15:56:33 -07:00
Jakub Kuciński	8bebc9206f	Add improved sources splitting in BaseQAWithSourcesChain (#8716 ) ## Type: Improvement --- ## Description: Running QAWithSourcesChain sometimes raises ValueError as mentioned in issue #7184: ``` ValueError: too many values to unpack (expected 2) Traceback: response = qa({"question": pregunta}, return_only_outputs=True) File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\base.py", line 166, in __call__ raise e File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\base.py", line 160, in __call__ self._call(inputs, run_manager=run_manager) File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\qa_with_sources\base.py", line 132, in _call answer, sources = re.split(r"SOURCES:\s", answer) ``` This is due to the LLM generating a subsequent question, answer and sources, that is, a completion in a form similar to the one below: ``` <final_answer> SOURCES: <sources> QUESTION: <new_or_repeated_question> FINAL ANSWER: <new_or_repeated_final_answer> SOURCES: <new_or_repeated_sources> ``` This causes the following line ``` re.split(r"SOURCES:\s", answer) ``` to return more than 2 elements and results in a ValueError. The simple fix is to also split on "QUESTION:\s" and take the first two elements: ``` answer, sources = re.split(r"SOURCES:\s|QUESTION:\s", answer)[:2] ``` Sometimes the LLM might also generate other text, such as alternative answers in the form: ``` <final_answer_1> SOURCES: <sources> <final_answer_2> SOURCES: <sources> <final_answer_3> SOURCES: <sources> ``` In such cases it is best to split the previously obtained sources on a newline: ``` sources = re.split(r"\n", sources.lstrip())[0] ``` --- ## Issue: Resolves #7184 --- ## Maintainer: @baskaryan	2023-08-16 13:30:15 -07:00
Bagatur	a3c79b1909	Add tiktoken integration dep (#9332 )	2023-08-16 12:09:22 -07:00
Bagatur	ba5fbaba70	bump 266 (#9296 )	2023-08-16 01:13:19 -07:00
axiangcoding	63601551b1	fix(llms): improve the ernie chat model (#9289 ) - Description: improve the ernie chat model. - fix missing kwargs to payload - new test cases - add some debug level log - improve description - Issue: None - Dependencies: None - Tag maintainer: @baskaryan	2023-08-16 00:48:42 -07:00
Daniel Chalef	1d55141c50	zep/new ZepVectorStore (#9159 ) - new ZepVectorStore class - ZepVectorStore unit tests - ZepVectorStore demo notebook - update zep-python to ~1.0.2 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-16 00:23:07 -07:00
William FH	2519580994	Add Schema Evals (#9228 ) Simple eval checks for whether a generation is valid json and whether it matches an expected dict	2023-08-15 17:17:32 -07:00
Kenny	74a64cfbab	expose output key to create_openai_fn_chain (#9155 ) A quick change to allow the output key of create_openai_fn_chain to optionally be changed. @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 17:01:32 -07:00
Bagatur	afba2be3dc	update openai functions docs (#9278 )	2023-08-15 17:00:56 -07:00
Bagatur	9abf60acb6	Bagatur/vectara regression (#9276 ) Co-authored-by: Ofer Mendelevitch <ofer@vectara.com> Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>	2023-08-15 16:19:46 -07:00
Xiaoyu Xee	b30f449dae	Add dashvector vectorstore (#9163 ) ## Description Add `Dashvector` vectorstore for langchain - [dashvector quick start](https://help.aliyun.com/document_detail/2510223.html) - [dashvector package description](https://pypi.org/project/dashvector/) ## How to use ```python from langchain.vectorstores.dashvector import DashVector dashvector = DashVector.from_documents(docs, embeddings) ``` --------- Co-authored-by: smallrain.xuxy <smallrain.xuxy@alibaba-inc.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 16:19:30 -07:00
Bagatur	bfbb97b74c	Bagatur/deeplake docs fixes (#9275 ) Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz>	2023-08-15 15:56:36 -07:00
Kunj-2206	1b3942ba74	Added BittensorLLM (#9250 ) Description: Adding NIBittensorLLM via Validator Endpoint to langchain llms Tag maintainer: @Kunj-2206 Maintainer responsibilities: Models / Prompts: @hwchase17, @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 15:40:52 -07:00
Toshish Jawale	852722ea45	Improvements in Nebula LLM (#9226 ) - Description: Added improvements in Nebula LLM to perform auto-retry; more generation parameters supported. Conversation is no longer required to be passed in the LLM object. Examples are updated. - Issue: N/A - Dependencies: N/A - Tag maintainer: @baskaryan - Twitter handle: symbldotai --------- Co-authored-by: toshishjawale <toshish@symbl.ai>	2023-08-15 15:33:07 -07:00
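
The source-splitting fix described in commit 8bebc9206f (#8716) can be sketched as below. This is a minimal illustration of the two `re.split` calls quoted in that commit message, not the code merged in the PR: the helper name `split_answer_and_sources`, the fallback to an empty sources string, and the sample completion are assumptions made for the example.

```python
import re
from typing import Tuple


def split_answer_and_sources(answer: str) -> Tuple[str, str]:
    """Hypothetical helper: separate an LLM completion into answer and sources."""
    if re.search(r"SOURCES:\s", answer):
        # Split on either marker and keep only the first two pieces, so a
        # trailing "QUESTION: ... SOURCES: ..." continuation is discarded.
        answer, sources = re.split(r"SOURCES:\s|QUESTION:\s", answer)[:2]
        # Keep only the first line of the sources, dropping any alternative
        # answers the model appended afterwards.
        sources = re.split(r"\n", sources.lstrip())[0]
    else:
        # Assumption for this sketch: no marker means no sources.
        sources = ""
    return answer, sources


# Sample completion (made up) in which the model keeps generating a new Q/A pair.
completion = (
    "Paris.\n"
    "SOURCES: a.txt, b.txt\n"
    "QUESTION: What?\n"
    "FINAL ANSWER: x\n"
    "SOURCES: c.txt"
)
final_answer, final_sources = split_answer_and_sources(completion)
```

With the original single-marker split, the same completion yields three pieces, which is what triggers the `too many values to unpack (expected 2)` error quoted in the commit message.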

451 Commits
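
Commit ce78877a87 (#9388) replaces `ValueError` with `ImportError` where an optional dependency is missing. A minimal sketch of that pattern follows; the helper name `import_optional` and the wording of the error message are hypothetical and are not taken from the LangChain code base.

```python
import importlib
from types import ModuleType


def import_optional(package: str) -> ModuleType:
    """Hypothetical helper: import an optional dependency or fail clearly."""
    try:
        return importlib.import_module(package)
    except ImportError as exc:
        # Re-raise as ImportError (not ValueError) so callers can tell that
        # the problem is a missing package rather than a bad argument value.
        raise ImportError(
            f"Could not import the `{package}` package. "
            f"Please install it, for example with `pip install {package}`."
        ) from exc


# Usage: a standard-library module imports normally.
json_module = import_optional("json")
```

Callers that guard optional integrations with `except ImportError` will then catch the failure, which a `ValueError` would have bypassed.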