langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Nuno Campos	3458489936	Lint	2023-08-23 19:39:46 +01:00
Nuno Campos	e420bf22b6	Lint	2023-08-23 19:39:46 +01:00
Nuno Campos	cc83f54694	Lint	2023-08-23 19:39:46 +01:00
Nuno Campos	d414d47c78	Use a shared executor for all parallel calls	2023-08-23 19:39:46 +01:00
Bagatur	a40c12bb88	Update the nlpcloud connector after some changes on the NLP Cloud API (#9586 ) - Description: remove some text generation deprecated parameters and update the embeddings doc, - Tag maintainer: @rlancemartin	2023-08-23 11:35:08 -07:00
Bagatur	e2e582f1f6	Fixed source key name for docugami loader (#8598 ) The Docugami loader was not returning the source metadata key. This was triggering this exception when used with retrievers, per https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/schema/prompt_template.py#L193C1-L195C41 The fix is simple and just updates the metadata key name for the document each chunk is sourced from, from "name" to "source" as expected. I tested by running the python notebook that has an end to end scenario in it. Tagging DataLoader maintainers @rlancemartin @eyurtsev	2023-08-23 11:24:55 -07:00
karynzv	5508baf1eb	Add CrateDB prompt (#9657 ) Adds a prompt template for the CrateDB SQL dialect.	2023-08-23 13:33:37 -04:00
Bagatur	a8e8a31b41	Merge branch 'master' into bagatur/locals_in_config	2023-08-23 10:26:11 -07:00
Bagatur	ef2500584c	fmt	2023-08-23 10:15:45 -07:00
Zizhong Zhang	8a03836160	docs: fix PromptGuard docs (#9659 ) Fix PromptGuard docs. Noticed several trivial issues on the docs when integrating the new class. cc @baskaryan	2023-08-23 10:04:53 -07:00
Guy Korland	39a5d02225	Cleanup of ruff warnings use isinstance() instead of type() (#9655 ) Minor cosmetic PR just cleanup of `ruff` warnings use `isinstance()` instead of `type()`	2023-08-23 07:14:31 -07:00
Joseph McElroy	2a06e7b216	ElasticsearchStore: improve error logging for adding documents (#9648 ) Not obvious what the error is when you cannot index. This pr adds the ability to log the first errors reason, to help the user diagnose the issue. Also added some more documentation for when you want to use the vectorstore with an embedding model deployed in elasticsearch. Credit: @elastic and @phoey1	2023-08-23 07:04:09 -07:00
Julien Salinas	f1072cc31f	Merge branch 'master' into master	2023-08-23 14:42:40 +02:00
Jun Liu	b379c5f9c8	Fixed the error on ConfluenceLoader when content_format=VIEW and `keep_markdown_format`=True (#9633 ) - Description: when I set `content_format=ContentFormat.VIEW` and `keep_markdown_format=True` on ConfluenceLoader, it shows the following error: ``` langchain/document_loaders/confluence.py", line 459, in process_page page["body"]["storage"]["value"], heading_style="ATX" KeyError: 'storage' ``` The reason is that the content format was set to `view` but it was still trying to get the content from `page["body"]["storage"]["value"]`. Also added the other content formats which are supported by Atlassian API https://stackoverflow.com/questions/34353955/confluence-rest-api-expanding-page-body-when-retrieving-page-by-title/34363386#34363386 - Issue: Not applicable. - Dependencies: Added optional dependency `markdownify` if anyone wants to extract in markdown format. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-22 21:00:15 -07:00
Gabriel Fu	b2d9970fc1	Allow specifying dtype in `langchain.llms.VLLM` (#9635 ) - Description: add `dtype` argument for VLLM - Issue: #9593 - Dependencies: none - Tag maintainer: @hwchase17, @baskaryan	2023-08-22 20:21:56 -07:00
anifort	900c1f3e8d	Add support for structured data sources with google enterprise search (#9037 ) - Description: Added the capability to handle structured data from google enterprise search, - Issue: Retriever failed when the underlying search engine was integrated with structured data, - Dependencies: google-api-core - Tag maintainer: @jarokaz - Twitter handle: anifort --------- Co-authored-by: Christos Aniftos <aniftos@google.com> Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-08-22 23:18:10 -04:00
Harrison Chase	02545a54b3	python repl improvement for csv agent (#9618 )	2023-08-22 17:06:18 -07:00
Erick Friis	fc64e6349e	Hub stub updates (#9577 ) Updates the hub stubs to not fail when no api key is found. For supporting singleton tenants and default values from sdk 0.1.6. Also adds the ability to define is_public and description for backup repo creation on push.	2023-08-22 16:05:41 -07:00
Kim Minjong	ca8232a3c1	Update BaseChatModel.astream to respect generation_info (#9430 ) Currently, generation_info is not respected by only reflecting messages in chunks. Change it to add generations so that generation chunks are merged properly. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-08-22 15:18:24 -07:00
Bagatur	81163e3c0c	parent retriever nit (#9570 ) if ids are nullable seems like they should have default val None. mirrors VectorStore interface as well. cc @mcantillon21 @jacoblee93	2023-08-22 14:58:16 -04:00
Myeongseop Kim	f1e602996a	import tqdm.auto instead of tqdm tqdm for OpenAIEmbeddings (#9584 ) - Description: current code does not work very well on jupyter notebook, so I changed the code so that it imports `tqdm.auto` instead. - Issue: #9582 - Dependencies: N/A - Tag maintainer: @hwchase17, @baskaryan - Twitter handle: N/A Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-08-22 14:54:07 -04:00
Predrag Gruevski	d564ec944c	`poetry lock` the experimental package. (#9478 )	2023-08-22 14:09:35 -04:00
Predrag Gruevski	65e893b9cd	`poetry lock` on langchain. (#9476 )	2023-08-22 14:09:23 -04:00
Predrag Gruevski	3c7cc4d440	Test experimental package with `langchain` on `master` branch. (#9621 ) It's possible that langchain-experimental works fine with the latest published langchain, but is broken with the langchain on `master`. Unfortunately, you can see this is currently the case — this is why this PR also includes a minor fix for the `langchain` package itself. We want to catch situations like that before releasing a new langchain, hence this test.	2023-08-22 13:35:21 -04:00
Eugene Yurtsev	3408810748	Add batch util (#9620 ) Add `batch` utility to langchain	2023-08-22 12:31:18 -04:00
Bagatur	2b663089b5	bump 271 (#9615 )	2023-08-22 08:10:22 -07:00
klae01	b868ef23bc	Add AINetwork blockchain toolkit integration (#9527 ) # Description This PR introduces a new toolkit for interacting with the AINetwork blockchain. The toolkit provides a set of tools for performing various operations on the AINetwork blockchain, such as transferring AIN, reading and writing values to the blockchain database, managing apps, setting rules and owners. # Dependencies [ain-py](https://github.com/ainblockchain/ain-py) >= 1.0.2 # Misc The example notebook (langchain/docs/extras/integrations/toolkits/ainetwork.ipynb) is in the PR --------- Co-authored-by: kriii <kriii@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-22 08:03:33 -07:00
Bagatur	e99ef12cb1	Bagatur/litellm model name (#9613 ) Co-authored-by: ishaan-jaff <ishaanjaffer0324@gmail.com>	2023-08-22 07:44:00 -07:00
Harrison Chase	1720e99397	add variables for field names (#9563 )	2023-08-22 07:43:21 -07:00
Anthony Mahanna	dfb9ff1079	bugfix: ArangoDB Empty Schema Case (#9574 ) - Introduces a conditional in `ArangoGraph.generate_schema()` to exclude empty ArangoDB Collections from the schema - Add empty collection test case Issue: N/A Dependencies: None	2023-08-22 07:41:06 -07:00
Philippe PRADOS	d4c49b16e4	Fix ChatMessageHistory (#9594 ) The initialization of the array of ChatMessageHistory is buggy. The list is shared with all instances.	2023-08-22 07:36:36 -07:00
toddkim95	fba29f203a	Add to support polars (#9610 ) ### Description Polars is a DataFrame interface on top of an OLAP Query Engine implemented in Rust. Polars is faster to read than pandas, so I'm looking forward to seeing it added to the document loader. ### Dependencies polars (https://pola-rs.github.io/polars-book/user-guide/) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-22 07:36:24 -07:00
Aashish Saini	3c4f32c8b8	Replacing Exception type from ValueError to ImportError (#9588 ) I have restructured the code to ensure uniform handling of ImportError. In place of previously used ValueError, I've adopted the standard practice of raising ImportError with explanatory messages. This modification enhances code readability and clarifies that any problems stem from module importation. @eyurtsev , @baskaryan Thanks	2023-08-22 07:34:05 -07:00
Julien Salinas	033b874701	Remove some deprecated text generation parameters.	2023-08-22 09:26:37 +02:00
Bagatur	4e7e6bfe0a	revert	2023-08-21 18:01:49 -07:00
Bagatur	a9bf409a09	param	2023-08-21 17:37:07 -07:00
Bagatur	fa478638a9	Merge branch 'master' into bagatur/locals_in_config	2023-08-21 17:31:39 -07:00
Bagatur	182b059bf4	param	2023-08-21 17:31:38 -07:00
Bagatur	04f2d69b83	improve confluence doc loader param validation (#9568 )	2023-08-21 15:02:36 -07:00
Zizhong Zhang	00eff8c4a7	feat: Add PromptGuard integration (#9481 ) Add PromptGuard integration ------- There are two approaches to integrate PromptGuard with a LangChain application. 1. PromptGuardLLMWrapper 2. functions that can be used in LangChain expression. ----- - Dependencies `promptguard` python package, which is a runtime requirement if you'd try out the demo. - @baskaryan @hwchase17 Thanks for the ideas and suggestions along the development process. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 14:59:36 -07:00
Sathindu	652c542b2f	fix: Imports for the ConfluenceLoader:process_page (#9432 ) ### Description When we're loading documents using `ConfluenceLoader`:`load` function and, if both `include_comments=True` and `keep_markdown_format=True`, we're getting an error saying `NameError: free variable 'BeautifulSoup' referenced before assignment in enclosing scope`. loader = ConfluenceLoader(url="URI", token="TOKEN") documents = loader.load( space_key="SPACE", include_comments=True, keep_markdown_format=True, ) This happens because previous imports only consider the `keep_markdown_format` parameter, however to include the comments, it's using `BeautifulSoup` Now it's fixed to handle all four scenarios considering both `include_comments` and `keep_markdown_format`. ### Twitter `@SathinduGA` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 13:44:52 -07:00
Mike Salvatore	7c0b1b8171	Add session to ConfluenceLoader.__init__() (#9437 ) - Description: Allows the user of `ConfluenceLoader` to pass a `requests.Session` object in lieu of an authentication mechanism - Issue: None - Dependencies: None - Tag maintainer: @hwchase17	2023-08-21 13:18:35 -07:00
Kim Minjong	3d1095218c	Update ChatOpenAI._astream to respect finish_reason (#9431 ) Currently, ChatOpenAI._astream does not reflect finish_reason to generation_info. Change it to reflect that.	2023-08-21 12:56:42 -07:00
Matthew Zeiler	949b2cf177	Improvements to the Clarifai integration (#9290 ) - Improved docs - Improved performance in multiple ways through batching, threading, etc. - fixed error message - Added support for metadata filtering during similarity search. @baskaryan PTAL	2023-08-21 12:53:36 -07:00
ricki-epsilla	66a47d9a61	add Epsilla vectorstore (#9239 ) [Epsilla](https://github.com/epsilla-cloud/vectordb) vectordb is an open-source vector database that leverages the advanced academic parallel graph traversal techniques for vector indexing. This PR adds basic integration with [pyepsilla](https://github.com/epsilla-cloud/epsilla-python-client)(Epsilla vectordb python client) as a vectorstore. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 12:51:15 -07:00
Bagatur	dda5b1e370	Bagatur/doc loader confluence (#9524 ) Co-authored-by: chanjetsdp <chanjetsdp@chanjet.com>	2023-08-21 12:40:44 -07:00
Predrag Gruevski	de1f63505b	Add `py.typed` file to `langchain-experimental`. (#9557 ) The package is linted with mypy, so its type hints are correct and should be exposed publicly. Without this file, the type hints remain private and cannot be used by downstream users of the package.	2023-08-21 15:37:16 -04:00
Raynor Chavez	973866c894	fix: Updated marqo integration for marqo version 1.0.0+ (#9521 ) - Description: Updated marqo integration to use tensor_fields instead of non_tensor_fields. Upgraded marqo version to 1.2.4 - Dependencies: marqo 1.2.4 --------- Co-authored-by: Raynor Kirkson E. Chavez <raynor.chavez@192.168.254.171> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 10:43:15 -07:00
Bagatur	c7a5bb6031	bump 270 (#9549 )	2023-08-21 10:18:46 -07:00
Nuno Campos	28e1ee4891	Nc/small fixes 21aug (#9542 )	2023-08-21 18:01:20 +01:00
Bagatur	d11841d760	bump 269 (#9487 )	2023-08-21 08:34:16 -07:00
axiangcoding	05aa02005b	feat(llms): support ERNIE Embedding-V1 (#9370 ) - Description: support [ERNIE Embedding-V1](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/alj562vvu), which is part of ERNIE ecology - Issue: None - Dependencies: None - Tag maintainer: @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 07:52:25 -07:00
José Ferraz Neto	f116e10d53	Add SharePoint Loader (#4284 ) - Added a loader (`SharePointLoader`) that can pull documents (`pdf`, `docx`, `doc`) from the [SharePoint Document Library](https://support.microsoft.com/en-us/office/what-is-a-document-library-3b5976dd-65cf-4c9e-bf5a-713c10ca2872). - Added a Base Loader (`O365BaseLoader`) to be used for all Loaders that use [O365](https://github.com/O365/python-o365) Package - Code refactoring on `OneDriveLoader` to use the new `O365BaseLoader`. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 07:49:07 -07:00
Utku Ege Tuluk	bb4f7936f9	feat(llms): add streaming support to textgen (#9295 ) - Description: Added streaming support to the textgen component in the llms module. - Dependencies: websocket-client = "^1.6.1"	2023-08-21 07:39:14 -07:00
Eugene Yurtsev	02c5c13a6e	Fast linters go first (#9501 ) Proposal to reverse the order of linters based on the principle of running the fast ones first.	2023-08-21 00:20:54 -07:00
Ofer Mendelevitch	a758496236	Fixed issue with metadata in query (#9500 ) - Description: Changed metadata retrieval so that it combines Vectara doc level and part level metadata - Tag maintainer: @rlancemartin - Twitter handle: @ofermend	2023-08-20 16:00:14 -07:00
Eugene Yurtsev	e51bccdb28	Add strict flag to the JSON parser (#9471 ) This updates the default configuration since I think it's almost always what we want to happen. But we should evaluate whether there are any issues.	2023-08-19 22:02:12 -04:00
Taqi Jaffri	5cd244e9b7	CR feedback	2023-08-19 13:48:15 -07:00
Predrag Gruevski	be9bc62f8b	Fix bash test regex for Linux under WSL2. (#9475 ) It fails with `Permission denied` and not `not found`. Both seem reasonable.	2023-08-19 09:27:14 -04:00
Lorenzo	5b3dbf12a5	Uniform valid suffixes and clarify exceptions (#9463 ) Description: - Uniformed the current valid suffixes (file formats) for loading agents from hubs and files (to better handle future additions); - Clarified exception messages (also in unit test).	2023-08-18 21:35:53 -07:00
Brendan Collins	9f545825b7	Added Geometry Validation, Geometry Metadata, and WKT instead of Python str() to GeoDataFrame Loader (#9466 ) @rlancemartin The current implementation within `Geopandas.GeoDataFrame` loader uses the python builtin `str()` function on the input geometries. While this looks very close to WKT (Well known text), Python's str function doesn't guarantee that. In the interest of interop., I've changed to the use of the `wkt` property on the Shapely geometries for generating the text representation of the geometries. Also, included here: - validation of the input `page_content_column` as being a GeoSeries. - geometry `crs` (Coordinate Reference System) / bounds (xmin/ymin/xmax/ymax) added to Document metadata. Having the CRS is critical... having the bounds is just helpful! I think there is a larger question of "Should the geometry live in the `page_content`, or should the record be better summarized and tuck the geom into metadata?" ...something for another day and another PR.	2023-08-18 21:35:39 -07:00
Kacper Łukawski	616e728ef9	Enhance qdrant vs using async embed documents (#9462 ) This is an extension of #8104. I updated some of the signatures so all the tests pass. @danhnn I couldn't commit to your PR, so I created a new one. Thanks for your contribution! @baskaryan Could you please merge it? --------- Co-authored-by: Danh Nguyen <dnncntt@gmail.com>	2023-08-18 18:59:48 -07:00
Matt Robinson	83d2a871eb	fix: apply unstructured preprocess functions (#9473 ) ### Summary Fixes a bug from #7850 where post processing functions in Unstructured loaders were not applied. Adds an assertion to the test to verify the post processing function was applied and also updates the explanation in the example notebook.	2023-08-18 18:54:28 -07:00
William FH	292ae8468e	Let you specify run id in trace as chain group (#9484 ) I think we'll deprecate this soon anyway but still nice to be able to fetch the run id	2023-08-18 17:21:53 -07:00
Predrag Gruevski	df8e35fd81	Remove incorrect ABC from two Elasticsearch classes. (#9470 ) Neither is an ABC because their own example code instantiates them directly.	2023-08-18 15:01:02 -04:00
Predrag Gruevski	82f28ca9ef	`ChatPromptTemplate` is not an `ABC`, it's instantiated directly. (#9468 ) Its own `__add__` method constructs `ChatPromptTemplate` objects directly, it cannot be abstract. Found while debugging something else with @nfcampos.	2023-08-18 14:37:10 -04:00
vamseeyarla	82fb56b79c	Issue 9401 - SequentialChain runs the same callbacks over and over in async mode (#9452 ) Issue: https://github.com/langchain-ai/langchain/issues/9401 In the Async mode, SequentialChain implementation seems to run the same callbacks over and over since it is re-using the same callbacks object. Langchain version: 0.0.264, master The implementation of this async route differs from the sync route and sync approach follows the right pattern of generating a new callbacks object instead of re-using the old one and thus avoiding the cascading run of callbacks at each step. Async mode: ``` _run_manager = run_manager or AsyncCallbackManagerForChainRun.get_noop_manager() callbacks = _run_manager.get_child() ... for i, chain in enumerate(self.chains): _input = await chain.arun(_input, callbacks=callbacks) ... ``` Regular mode: ``` _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager() for i, chain in enumerate(self.chains): _input = chain.run(_input, callbacks=_run_manager.get_child(f"step_{i+1}")) ... ``` Notice how we are reusing the callbacks object in the Async code which will have a cascading effect as we run through the chain. It runs the same callbacks over and over resulting in issues. Solution: Define the async function in the same pattern as the regular one and added tests. --------- Co-authored-by: vamsee_yarlagadda <vamsee.y@airbnb.com>	2023-08-18 11:26:12 -07:00
William FH	c29fbede59	Wfh/rm num repetitions (#9425 ) Makes it hard to do test run comparison views and we'd probably want to just run multiple runs right now	2023-08-18 10:08:39 -07:00
Predrag Gruevski	eee0d1d0dd	Update repository links in the package metadata. (#9454 )	2023-08-18 12:55:43 -04:00
Bagatur	50b8f4dcc7	bump 268 (#9455 )	2023-08-18 08:46:39 -07:00
Nuno Campos	354c42afd2	Lint	2023-08-18 15:30:30 +01:00
Nuno Campos	4452314aab	Merge branch 'master' into bagatur/locals_in_config	2023-08-18 15:23:05 +01:00
Nuno Campos	d5eb228874	Add kwargs to all other optional runnable methods (#9439 )	2023-08-18 15:04:26 +01:00
Leonid Ganeline	a3dd4dcadf	📖 docstrings `retrievers` consistency (#9422 ) 📜 - updated the top-level descriptions to a consistent format; - changed the format of several 100% internal functions from "name" to "_name". So, these functions are not shown in the Top-level API Reference page (with lists of classes/functions)	2023-08-18 09:20:39 -04:00
Nuno Campos	9417961b17	Add lock on tee peer cleanup (#9446 )	2023-08-18 14:20:09 +01:00
Nuno Campos	d3f10d2f4f	Update test	2023-08-18 11:36:16 +01:00
Nuno Campos	6ae58da668	Assign defaults in batch calls	2023-08-18 10:53:10 +01:00
Nuno Campos	ddcb4ff5fb	Lint	2023-08-18 10:30:42 +01:00
Nuno Campos	1baedc4e18	Move patch_config	2023-08-18 10:28:39 +01:00
Nuno Campos	46f3850794	Lint	2023-08-18 10:25:41 +01:00
Nuno Campos	24a197f96a	Merge branch 'master' into bagatur/locals_in_config	2023-08-18 10:12:10 +01:00
Nuno Campos	8ddaaf3d41	Move config helpers	2023-08-18 10:10:35 +01:00
Nuno Campos	a5e7dcec61	Lint	2023-08-18 10:03:28 +01:00
Nuno Campos	c1b1666ec8	Ensure config defaults apply even when a config is passed in	2023-08-18 10:02:29 +01:00
Nuno Campos	7fe474d198	Update snapshots	2023-08-18 10:02:11 +01:00
Jacob Lee	0689628489	Adds streaming for runnable maps (#9283 ) @nfcampos @baskaryan --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-08-18 07:46:23 +01:00
Bagatur	ab21af71be	wip	2023-08-17 17:28:02 -07:00
Bagatur	6f69b19ff5	wip tests	2023-08-17 16:45:52 -07:00
Bagatur	9e906c39ba	nit	2023-08-17 16:22:22 -07:00
Bagatur	6b0a849f59	fix	2023-08-17 16:22:12 -07:00
Bagatur	c447e9a854	cr	2023-08-17 15:29:00 -07:00
Bagatur	bd80cad6db	add	2023-08-17 13:52:19 -07:00
Bagatur	8c1a528c71	cr	2023-08-17 13:52:09 -07:00
Bagatur	25cbcd9374	merge	2023-08-17 13:03:28 -07:00
Aashish Saini	ce78877a87	Replaced instances of raising ValueError with raising ImportError. (#9388 ) Refactored code to ensure consistent handling of ImportError. Replaced instances of raising ValueError with raising ImportError. The choice of raising a ValueError here is somewhat unconventional and might lead to confusion for anyone reading the code. Typically, when dealing with import-related errors, the recommended approach is to raise an ImportError with a descriptive message explaining the issue. This provides a clearer indication that the problem is related to importing the required module. @hwchase17 , @baskaryan , @eyurtsev Thanks Aashish --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-17 12:24:08 -07:00
Bagatur	8c986221e4	make openapi_schema_pydantic opt (#9408 )	2023-08-17 11:49:23 -07:00
Eugene Yurtsev	77b359edf5	More missing type annotations (#9406 ) This PR fills in more missing type annotations on pydantic models. It's OK if it missed some annotations, we just don't want it to get annotations wrong at this stage. I'll do a few more passes over the same files!	2023-08-17 12:19:50 -04:00
Bagatur	a69d1b84f4	bump 267 (#9403 )	2023-08-17 08:47:13 -07:00
Nuno Campos	c0d67420e5	Use a submodule for pydantic v1 compat (#9371 )	2023-08-17 16:35:49 +01:00
Bagatur	995ef8a7fc	unpin pydantic (#9356 )	2023-08-17 01:55:46 -07:00
Tong Gao	3c8e9a9641	Fix typos in eval_chain.py (#9365 ) Fixed two minor typos.	2023-08-17 01:53:46 -07:00
Eugene Yurtsev	2673b3a314	Create pydantic v1 namespace in langchain (#9254 ) Create pydantic v1 namespace in langchain experimental	2023-08-16 21:19:31 -07:00
Eugene Yurtsev	4c2de2a7f2	Adding missing types in some pydantic models (#9355 ) * Adding missing types in some pydantic models -- this change is required for making the code work with pydantic v2.	2023-08-16 20:10:34 -07:00
Harrison Chase	1c089cadd7	fix import v2 (#9346 )	2023-08-16 17:33:01 -07:00
qqjettkgjzhxmwj	84a97d55e1	Fix typo in llm_router.py (#9322 ) Fix typo	2023-08-16 15:56:44 -07:00
Joe Reuter	09aa1eac03	Airbyte loaders: Fix last_state getter (#9314 ) This PR fixes the Airbyte loaders when doing incremental syncs. The notebooks are calling out to access `loader.last_state` to get the current state of incremental syncs, but this didn't work due to a refactoring of how the loaders are structured internally in the original PR. This PR fixes the issue by adding a `last_state` property that forwards the state correctly from the CDK adapter. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-16 15:56:33 -07:00
Jakub Kuciński	8bebc9206f	Add improved sources splitting in BaseQAWithSourcesChain (#8716 ) ## Type: Improvement --- ## Description: Running QAWithSourcesChain sometimes raises ValueError as mentioned in issue #7184: ``` ValueError: too many values to unpack (expected 2) Traceback: response = qa({"question": pregunta}, return_only_outputs=True) File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\base.py", line 166, in __call__ raise e File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\base.py", line 160, in __call__ self._call(inputs, run_manager=run_manager) File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\qa_with_sources\base.py", line 132, in _call answer, sources = re.split(r"SOURCES:\s", answer) ``` This is due to LLM model generating subsequent question, answer and sources, that is complement in a similar form as below: ``` <final_answer> SOURCES: <sources> QUESTION: <new_or_repeated_question> FINAL ANSWER: <new_or_repeated_final_answer> SOURCES: <new_or_repeated_sources> ``` It leads the following line ``` re.split(r"SOURCES:\s", answer) ``` to return more than 2 elements and result in ValueError. The simple fix is to split also with "QUESTION:\s" and take the first two elements: ``` answer, sources = re.split(r"SOURCES:\s|QUESTION:\s", answer)[:2] ``` Sometimes LLM might also generate some other texts, like alternative answers in a form: ``` <final_answer_1> SOURCES: <sources> <final_answer_2> SOURCES: <sources> <final_answer_3> SOURCES: <sources> ``` In such cases it is the best to split previously obtained sources with new line: ``` sources = re.split(r"\n", sources.lstrip())[0] ``` --- ## Issue: Resolves #7184 --- ## Maintainer: @baskaryan	2023-08-16 13:30:15 -07:00
Bagatur	a3c79b1909	Add tiktoken integration dep (#9332 )	2023-08-16 12:09:22 -07:00
Bagatur	ba5fbaba70	bump 266 (#9296 )	2023-08-16 01:13:19 -07:00
axiangcoding	63601551b1	fix(llms): improve the ernie chat model (#9289 ) - Description: improve the ernie chat model. - fix missing kwargs to payload - new test cases - add some debug level log - improve description - Issue: None - Dependencies: None - Tag maintainer: @baskaryan	2023-08-16 00:48:42 -07:00
Daniel Chalef	1d55141c50	zep/new ZepVectorStore (#9159 ) - new ZepVectorStore class - ZepVectorStore unit tests - ZepVectorStore demo notebook - update zep-python to ~1.0.2 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-16 00:23:07 -07:00
William FH	2519580994	Add Schema Evals (#9228 ) Simple eval checks for whether a generation is valid json and whether it matches an expected dict	2023-08-15 17:17:32 -07:00
Kenny	74a64cfbab	expose output key to create_openai_fn_chain (#9155 ) A quick change to allow the output key of create_openai_fn_chain to optionally be changed. @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 17:01:32 -07:00
Bagatur	afba2be3dc	update openai functions docs (#9278 )	2023-08-15 17:00:56 -07:00
Bagatur	9abf60acb6	Bagatur/vectara regression (#9276 ) Co-authored-by: Ofer Mendelevitch <ofer@vectara.com> Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>	2023-08-15 16:19:46 -07:00
Xiaoyu Xee	b30f449dae	Add dashvector vectorstore (#9163 ) ## Description Add `Dashvector` vectorstore for langchain - [dashvector quick start](https://help.aliyun.com/document_detail/2510223.html) - [dashvector package description](https://pypi.org/project/dashvector/) ## How to use ```python from langchain.vectorstores.dashvector import DashVector dashvector = DashVector.from_documents(docs, embeddings) ``` --------- Co-authored-by: smallrain.xuxy <smallrain.xuxy@alibaba-inc.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 16:19:30 -07:00
Bagatur	bfbb97b74c	Bagatur/deeplake docs fixes (#9275 ) Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz>	2023-08-15 15:56:36 -07:00
Kunj-2206	1b3942ba74	Added BittensorLLM (#9250 ) Description: Adding NIBittensorLLM via Validator Endpoint to langchain llms Tag maintainer: @Kunj-2206 Maintainer responsibilities: Models / Prompts: @hwchase17, @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 15:40:52 -07:00
Toshish Jawale	852722ea45	Improvements in Nebula LLM (#9226 ) - Description: Added improvements in Nebula LLM to perform auto-retry; more generation parameters supported. Conversation is no longer required to be passed in the LLM object. Examples are updated. - Issue: N/A - Dependencies: N/A - Tag maintainer: @baskaryan - Twitter handle: symbldotai --------- Co-authored-by: toshishjawale <toshish@symbl.ai>	2023-08-15 15:33:07 -07:00
Bagatur	358562769a	Bagatur/refac faiss (#9076 ) Code cleanup and bug fix in deletion	2023-08-15 15:19:00 -07:00
Bagatur	3eccd72382	pin pydantic (#9274 ) don't want default to be v2 yet	2023-08-15 15:02:28 -07:00
Erick Friis	76d09b4ed0	hub push/pull (#9225 ) Description: Adds push/pull functions to interact with the hub Issue: n/a Dependencies: `langchainhub` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 14:11:43 -07:00
Alex Gamble	cf17c58b47	Update documentation for the Context integration with new URL and features (#9259 ) Update documentation and URLs for the Langchain Context integration. We've moved from getcontext.ai to context.ai \o/ Thanks in advance for the review!	2023-08-15 11:38:34 -07:00
Eugene Yurtsev	a091b4bf4c	Update testing workflow to test with both pydantic versions (#9206 ) * PR updates test.yml to test with both pydantic versions * Code should be refactored to make it easier to do testing in matrix format w/ packages * Added steps to assert that pydantic version in the environment is as expected	2023-08-15 13:21:11 -04:00
Bagatur	e0162baa3b	add oai sched tests (#9257 )	2023-08-15 09:40:33 -07:00
Joseph McElroy	5e9687a196	Elasticsearch self-query retriever (#9248 ) Now that the ElasticsearchStore VectorStore is merged, I've added support for the self-query retriever. I've also added a notebook to demonstrate the capability, plus unit tests. Credit @elastic and @phoey1 on twitter.	2023-08-15 10:53:43 -04:00
Eugene Yurtsev	0470198fb5	Remove packages for pydantic compatibility (#9217 ) # Poetry updates This PR updates LangChain's poetry file to remove any dependencies that aren't pydantic v2 compatible yet. All packages remain usable under pydantic v1, and can be installed separately. ## Bumping the following packages: * langsmith ## Removing the following packages not used in extended unit-tests: * zep-python, anthropic, jina, spacy, steamship, betabageldb not used at all: * octoai-sdk Cleaning up extras for removed packages. ## Snapshots updated Some snapshots had to be updated due to a change in the data model in langsmith. RunType used to be a Union of Enum and string and was changed to be string only.	2023-08-15 10:41:25 -04:00
Bagatur	e986afa13a	bump 265 (#9253 )	2023-08-15 07:21:32 -07:00
Hech	4b505060bd	fix: max_marginal_relevance_search and docs in Dingo (#9244 )	2023-08-15 01:06:06 -07:00
axiangcoding	664ff28cba	feat(llms): support ernie chat (#9114 ) Description: support ernie (文心一言) chat model Related issue: #7990 Dependencies: None Tag maintainer: @baskaryan	2023-08-15 01:05:46 -07:00
Bharat Ramanathan	08a8363fc6	feat(integration): Add support to serialize protobufs in WandbTracer (#8914 ) This PR adds serialization support for protocol buffers in `WandbTracer`. This allows code generation chains to be visualized. Additionally, it also fixes a minor bug where the settings are not honored when a run is initialized before using the `WandbTracer` @agola11 --------- Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 01:05:12 -07:00
Joshua Sundance Bailey	ef0664728e	ArcGISLoader update (#9240 ) Small bug fixes and added metadata based on user feedback. This PR is from the author of https://github.com/langchain-ai/langchain/pull/8873 .	2023-08-14 23:44:29 -07:00
Joseph McElroy	eac4ddb4bb	Elasticsearch Store Improvements (#8636 ) Todo: - [x] Connection options (cloud, localhost url, es_connection) support - [x] Logging support - [x] Customisable field support - [x] Distance Similarity support - [x] Metadata support - [x] Metadata Filter support - [x] Retrieval Strategies - [x] Approx - [x] Approx with Hybrid - [x] Exact - [x] Custom - [x] ELSER (excluding hybrid as we are working on RRF support) - [x] integration tests - [x] Documentation 👋 this is a contribution to improve Elasticsearch integration with Langchain. It's based loosely on the changes that are in master but with some notable changes: ## Package name & design improvements The import name is now `ElasticsearchStore`, to aid discoverability of the VectorStore. ```py ## Before from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch, ElasticKnnSearch ## Now from langchain.vectorstores.elasticsearch import ElasticsearchStore ``` ## Retrieval Strategy support Before we had a number of classes, depending on the strategy you wanted. `ElasticKnnSearch` for approx, `ElasticVectorSearch` for exact / brute force. With `ElasticsearchStore` we have retrieval strategies: ### Approx Example The default strategy, since the vast majority of developers who use Elasticsearch will be inferring the embeddings from outside of Elasticsearch. Uses KNN functionality of _search. ```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index" ) output = docsearch.similarity_search("foo", k=1) ``` ### Approx, with hybrid For developers who want to search using both the embedding and the text BM25 match. It's simple to enable.
```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index", strategy=ElasticsearchStore.ApproxRetrievalStrategy(hybrid=True) ) output = docsearch.similarity_search("foo", k=1) ``` ### Approx, with `query_model_id` For developers who want to infer within Elasticsearch, using the model loaded in the ml node. This relies on the developer to set up the pipeline and index if they wish to embed the text in Elasticsearch. Example of this in the test. ```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index", strategy=ElasticsearchStore.ApproxRetrievalStrategy( query_model_id="sentence-transformers__all-minilm-l6-v2" ), ) output = docsearch.similarity_search("foo", k=1) ``` ### I want to provide my own custom Elasticsearch Query You might want to have more control over the query, to perform multi-phase retrieval such as LTR, or linear boosting on document parameters like recently updated or geo-distance. You can do this with `custom_query_fn` ```py def my_custom_query(query_body: dict, query: str) -> dict: return {"query": {"match": {"text": {"query": "bar"}}}} texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), **elasticsearch_connection, index_name=index_name ) docsearch.similarity_search("foo", k=1, custom_query=my_custom_query) ``` ### Exact Example For developers who have a small dataset in Elasticsearch and don't want the cost of indexing the dims, at the tradeoff of cost at query time. Uses script_score.
```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index", strategy=ElasticsearchStore.ExactRetrievalStrategy(), ) output = docsearch.similarity_search("foo", k=1) ``` ### ELSER Example Elastic provides its own sparse vector model called ELSER. With these changes, it's really easy to use. The vector store creates a pipeline and index that are set up for ELSER. All the developer needs to do is configure, ingest and query via langchain tooling. ```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index", strategy=ElasticsearchStore.SparseVectorStrategy(), ) output = docsearch.similarity_search("foo", k=1) ``` ## Architecture In the future, we can introduce new strategies without breaking backwards compatibility as we evolve the index / query strategy. ## Credit On release, could you credit @elastic and @phoey1 please? Thank you! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-14 23:42:35 -07:00
Divyansh Garg	9529483c2a	Improve MultiOn client toolkit prompts (#9222 ) - Updated prompts for the MultiOn toolkit for better functionality - Non-blocking but good to have it merged to improve the overall performance for the toolkit @hinthornw @hwchase17 --------- Co-authored-by: Naman Garg <ngarg3@binghamton.edu>	2023-08-14 17:39:51 -07:00
William FH	c478fc208e	Default On Retry (#9230 ) Base callbacks don't have a default on retry event Fix #8542 --------- Co-authored-by: landonsilla <landon.silla@stepstone.com>	2023-08-14 16:45:17 -07:00
Leonid Ganeline	93dd499997	docstrings: `document_loaders` consistency 3 (#9216 ) Updated docstrings into the consistent format (probably the last update for the `document_loaders`).	2023-08-14 16:28:39 -07:00
Kshitij Wadhwa	a69cb95850	track langchain usage for Rockset (#9229 ) Add ability to track langchain usage for Rockset. Rockset's new python client allows setting this. To prevent old clients from failing, any exception thrown while setting it is ignored (we can't track old versions). Tested locally with old and new Rockset python clients. cc @baskaryan	2023-08-14 16:27:34 -07:00
Leonid Ganeline	7810ea5812	docstrings: `chat_models` consistency (#9227 ) Updated docstrings into the consistent format.	2023-08-14 16:15:56 -07:00
William FH	b0896210c7	Return feedback with failed response if there's an error (#9223 ) In Evals	2023-08-14 15:59:16 -07:00
William FH	7124f2ebfa	Parent Doc Retriever (#9214 ) 2 things: - Implement the private method rather than the public one so callbacks are handled properly - Add search_kwargs (open to not adding this if we are trying to deprecate this UX, but as a user I'd assume similar args to the vector store retriever. In fact, some may assume this implements the same interface, but I'm not dealing with that here)	2023-08-14 15:41:53 -07:00
Harrison Chase	3f601b5809	add async method in (#9204 )	2023-08-14 11:04:31 -07:00
Clark	03ea0762a1	fix(jinachat): related to #9197 (#9200 ) related to: https://github.com/langchain-ai/langchain/issues/9197 --------- Co-authored-by: qianjun.wqj <qianjun.wqj@alibaba-inc.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-14 11:04:20 -07:00
Eugene Yurtsev	4f1feaca83	Wrap OpenAPI features in conditionals for pydantic v2 compatibility (#9205 ) Wrap OpenAPI in conditionals for pydantic v2 compatibility.	2023-08-14 13:40:58 -04:00
Glauco Custódio	89be10f6b4	add ttl to RedisCache (#9068 ) Add `ttl` (time to live) to `RedisCache`	2023-08-14 12:59:18 -04:00
Eugene Yurtsev	04bc5f3b18	Conditionally add pydantic v1 to namespace (#9202 ) Conditionally add pydantic_v1 to namespace.	2023-08-14 11:26:45 -04:00
shibuiwilliam	feec422bf7	fix logging to logger (#9192 ) # What - fix logging to logger	2023-08-14 08:21:09 -07:00
Bagatur	5935767056	bump lc 246, lce 9 (#9207 )	2023-08-14 08:14:37 -07:00
Bagatur	b5a57acf6c	lite llm lint (#9208 )	2023-08-14 11:03:06 -04:00
Krish Dholakia	49f1d8477c	Adding ChatLiteLLM model (#9020 ) Description: Adding a langchain integration for the LiteLLM library Tag maintainer: @hwchase17, @baskaryan Twitter handle: @krrish_dh / @Berri_AI --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-14 07:43:40 -07:00
Eugene Yurtsev	72f9150a50	Update 2 more pydantic imports (#9203 ) Update two more pydantic imports to use v1 explicitly	2023-08-14 10:11:30 -04:00
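
The sources-splitting fix described in commit 8bebc9206f above can be sketched roughly as follows. This is a minimal illustration, not the code from that commit: the helper name `split_answer_and_sources` and the sample strings are hypothetical, and only the two `re.split` expressions are taken from the commit message.

```python
import re
from typing import Tuple


def split_answer_and_sources(text: str) -> Tuple[str, str]:
    """Hypothetical helper: separate the answer from its sources.

    Assumes the LLM output has the form "<answer> SOURCES: <sources>",
    possibly followed by extra generated questions or answers.
    """
    # No "SOURCES:" marker at all: treat the whole text as the answer.
    if not re.search(r"SOURCES:\s", text):
        return text, ""
    # Split on either marker and keep only the first two pieces, so a
    # trailing "QUESTION: ... SOURCES: ..." block cannot cause an
    # unpacking error.
    answer, sources = re.split(r"SOURCES:\s|QUESTION:\s", text)[:2]
    # Keep only the first line of the sources, dropping any alternative
    # answers the model appended afterwards.
    sources = re.split(r"\n", sources.lstrip())[0]
    return answer, sources


# Illustrative input with a repeated question/answer block.
raw = (
    "Paris is the capital.\n"
    "SOURCES: doc1, doc2\n"
    "QUESTION: What is the capital of Spain?\n"
    "FINAL ANSWER: Madrid.\n"
    "SOURCES: doc3"
)
answer, sources = split_answer_and_sources(raw)
```

Taking `[:2]` before unpacking is what avoids the "too many values to unpack" error; the first-line split then handles the case where alternative answers follow the sources.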

629 Commits