langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Bagatur	ef2500584c	fmt	2023-08-23 10:15:45 -07:00
Zizhong Zhang	8a03836160	docs: fix PromptGuard docs (#9659 ) Fix PromptGuard docs. Noticed several trivial issues on the docs when integrating the new class. cc @baskaryan	2023-08-23 10:04:53 -07:00
Guy Korland	39a5d02225	Cleanup of ruff warnings use isinstance() instead of type() (#9655 ) Minor cosmetic PR just cleanup of `ruff` warnings use `isinstance()` instead of `type()`	2023-08-23 07:14:31 -07:00
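The `isinstance()` cleanup in 39a5d02225 is more than cosmetic in general: an exact `type()` comparison rejects subclasses, while `isinstance()` accepts them. A minimal sketch, using hypothetical `Document` and `PDFDocument` classes that are not taken from the LangChain codebase:

```python
class Document:
    """Hypothetical base class, for illustration only."""


class PDFDocument(Document):
    """Hypothetical subclass, for illustration only."""


def is_document_strict(obj: object) -> bool:
    # Exact type comparison: instances of subclasses are rejected.
    return type(obj) == Document


def is_document(obj: object) -> bool:
    # isinstance() also accepts instances of subclasses.
    return isinstance(obj, Document)
```

With these definitions, a `PDFDocument()` passes `is_document` but fails `is_document_strict`, which is the behavioural difference the `ruff` warning points at.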
Joseph McElroy	2a06e7b216	ElasticsearchStore: improve error logging for adding documents (#9648 ) It is not obvious what the error is when you cannot index. This PR adds the ability to log the first error's reason, to help the user diagnose the issue. Also added some more documentation for when you want to use the vectorstore with an embedding model deployed in Elasticsearch. Credit: @elastic and @phoey1	2023-08-23 07:04:09 -07:00
Julien Salinas	f1072cc31f	Merge branch 'master' into master	2023-08-23 14:42:40 +02:00
Jun Liu	b379c5f9c8	Fixed the error on ConfluenceLoader when content_format=VIEW and `keep_markdown_format`=True (#9633 ) - Description: When I set `content_format=ContentFormat.VIEW` and `keep_markdown_format=True` on ConfluenceLoader, it shows the following error: ``` langchain/document_loaders/confluence.py", line 459, in process_page page["body"]["storage"]["value"], heading_style="ATX" KeyError: 'storage' ``` This happened because the content format was set to `view` but the loader was still trying to get the content from `page["body"]["storage"]["value"]`. Also added the other content formats which are supported by the Atlassian API https://stackoverflow.com/questions/34353955/confluence-rest-api-expanding-page-body-when-retrieving-page-by-title/34363386#34363386 - Issue: Not applicable. - Dependencies: Added optional dependency `markdownify` for anyone who wants to extract in markdown format. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-22 21:00:15 -07:00
Gabriel Fu	b2d9970fc1	Allow specifying dtype in `langchain.llms.VLLM` (#9635 ) - Description: add `dtype` argument for VLLM - Issue: #9593 - Dependencies: none - Tag maintainer: @hwchase17, @baskaryan	2023-08-22 20:21:56 -07:00
anifort	900c1f3e8d	Add support for structured data sources with google enterprise search (#9037 ) - Description: Added the capability to handle structured data from Google Enterprise Search - Issue: Retriever failed when the underlying search engine was integrated with structured data - Dependencies: google-api-core - Tag maintainer: @jarokaz - Twitter handle: anifort --------- Co-authored-by: Christos Aniftos <aniftos@google.com> Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-08-22 23:18:10 -04:00
Harrison Chase	02545a54b3	python repl improvement for csv agent (#9618 )	2023-08-22 17:06:18 -07:00
Erick Friis	fc64e6349e	Hub stub updates (#9577 ) Updates the hub stubs to not fail when no api key is found. For supporting singleton tenants and default values from sdk 0.1.6. Also adds the ability to define is_public and description for backup repo creation on push.	2023-08-22 16:05:41 -07:00
Kim Minjong	ca8232a3c1	Update BaseChatModel.astream to respect generation_info (#9430 ) Currently, generation_info is not respected by only reflecting messages in chunks. Change it to add generations so that generation chunks are merged properly. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-08-22 15:18:24 -07:00
Bagatur	81163e3c0c	parent retriever nit (#9570 ) if ids are nullable seems like they should have default val None. mirrors VectorStore interface as well. cc @mcantillon21 @jacoblee93	2023-08-22 14:58:16 -04:00
Myeongseop Kim	f1e602996a	import tqdm.auto instead of tqdm tqdm for OpenAIEmbeddings (#9584 ) - Description: current code does not work very well on jupyter notebook, so I changed the code so that it imports `tqdm.auto` instead. - Issue: #9582 - Dependencies: N/A - Tag maintainer: @hwchase17, @baskaryan - Twitter handle: N/A Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-08-22 14:54:07 -04:00
Predrag Gruevski	d564ec944c	`poetry lock` the experimental package. (#9478 )	2023-08-22 14:09:35 -04:00
Predrag Gruevski	65e893b9cd	`poetry lock` on langchain. (#9476 )	2023-08-22 14:09:23 -04:00
Predrag Gruevski	3c7cc4d440	Test experimental package with `langchain` on `master` branch. (#9621 ) It's possible that langchain-experimental works fine with the latest published langchain, but is broken with the langchain on `master`. Unfortunately, you can see this is currently the case — this is why this PR also includes a minor fix for the `langchain` package itself. We want to catch situations like that before releasing a new langchain, hence this test.	2023-08-22 13:35:21 -04:00
Eugene Yurtsev	3408810748	Add batch util (#9620 ) Add `batch` utility to langchain	2023-08-22 12:31:18 -04:00
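The message for 3408810748 does not show the helper's signature. As a rough sketch of what a batching utility of this kind typically does, assuming a hypothetical `batch_items(size, iterable)` function (the name and argument order are illustrative, not taken from the commit):

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batch_items(size: int, iterable: Iterable[T]) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        # islice consumes up to `size` items from the shared iterator.
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

Because it is a generator, it works on any iterable without materialising the whole input, and the last batch may be shorter than `size`.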
Bagatur	2b663089b5	bump 271 (#9615 )	2023-08-22 08:10:22 -07:00
klae01	b868ef23bc	Add AINetwork blockchain toolkit integration (#9527 ) # Description This PR introduces a new toolkit for interacting with the AINetwork blockchain. The toolkit provides a set of tools for performing various operations on the AINetwork blockchain, such as transferring AIN, reading and writing values to the blockchain database, managing apps, setting rules and owners. # Dependencies [ain-py](https://github.com/ainblockchain/ain-py) >= 1.0.2 # Misc The example notebook (langchain/docs/extras/integrations/toolkits/ainetwork.ipynb) is in the PR --------- Co-authored-by: kriii <kriii@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-22 08:03:33 -07:00
Bagatur	e99ef12cb1	Bagatur/litellm model name (#9613 ) Co-authored-by: ishaan-jaff <ishaanjaffer0324@gmail.com>	2023-08-22 07:44:00 -07:00
Harrison Chase	1720e99397	add variables for field names (#9563 )	2023-08-22 07:43:21 -07:00
Anthony Mahanna	dfb9ff1079	bugfix: ArangoDB Empty Schema Case (#9574 ) - Introduces a conditional in `ArangoGraph.generate_schema()` to exclude empty ArangoDB Collections from the schema - Add empty collection test case Issue: N/A Dependencies: None	2023-08-22 07:41:06 -07:00
Philippe PRADOS	d4c49b16e4	Fix ChatMessageHistory (#9594 ) The initialization of the array of ChatMessageHistory is buggy. The list is shared with all instances.	2023-08-22 07:36:36 -07:00
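The bug described in d4c49b16e4 is the familiar shared-mutable-default pitfall. A minimal sketch in plain Python, using hypothetical `SharedHistory` and `IsolatedHistory` classes rather than the actual `ChatMessageHistory` code:

```python
from dataclasses import dataclass, field
from typing import List


class SharedHistory:
    """Hypothetical buggy variant: the list lives on the class."""

    messages: List[str] = []  # one list object shared by every instance

    def add(self, message: str) -> None:
        self.messages.append(message)


@dataclass
class IsolatedHistory:
    """Hypothetical fixed variant: each instance gets its own list."""

    messages: List[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        self.messages.append(message)
```

Appending through one `SharedHistory` instance is visible from every other instance, whereas `IsolatedHistory` instances stay independent because `default_factory` builds a fresh list per object.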
toddkim95	fba29f203a	Add to support polars (#9610 ) ### Description Polars is a DataFrame interface on top of an OLAP Query Engine implemented in Rust. Polars is faster to read than pandas, so I'm looking forward to seeing it added to the document loader. ### Dependencies polars (https://pola-rs.github.io/polars-book/user-guide/) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-22 07:36:24 -07:00
Aashish Saini	3c4f32c8b8	Replacing Exception type from ValueError to ImportError (#9588 ) I have restructured the code to ensure uniform handling of ImportError. In place of previously used ValueError, I've adopted the standard practice of raising ImportError with explanatory messages. This modification enhances code readability and clarifies that any problems stem from module importation. @eyurtsev , @baskaryan Thanks	2023-08-22 07:34:05 -07:00
Julien Salinas	033b874701	Remove some deprecated text generation parameters.	2023-08-22 09:26:37 +02:00
Bagatur	4e7e6bfe0a	revert	2023-08-21 18:01:49 -07:00
Bagatur	a9bf409a09	param	2023-08-21 17:37:07 -07:00
Bagatur	fa478638a9	Merge branch 'master' into bagatur/locals_in_config	2023-08-21 17:31:39 -07:00
Bagatur	182b059bf4	param	2023-08-21 17:31:38 -07:00
Bagatur	04f2d69b83	improve confluence doc loader param validation (#9568 )	2023-08-21 15:02:36 -07:00
Zizhong Zhang	00eff8c4a7	feat: Add PromptGuard integration (#9481 ) Add PromptGuard integration ------- There are two approaches to integrate PromptGuard with a LangChain application. 1. PromptGuardLLMWrapper 2. functions that can be used in LangChain expression. ----- - Dependencies `promptguard` python package, which is a runtime requirement if you'd try out the demo. - @baskaryan @hwchase17 Thanks for the ideas and suggestions along the development process. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 14:59:36 -07:00
Sathindu	652c542b2f	fix: Imports for the ConfluenceLoader:process_page (#9432 ) ### Description When we're loading documents using `ConfluenceLoader`:`load` function and, if both `include_comments=True` and `keep_markdown_format=True`, we're getting an error saying `NameError: free variable 'BeautifulSoup' referenced before assignment in enclosing scope`. loader = ConfluenceLoader(url="URI", token="TOKEN") documents = loader.load( space_key="SPACE", include_comments=True, keep_markdown_format=True, ) This happens because previous imports only consider the `keep_markdown_format` parameter, however to include the comments, it's using `BeautifulSoup` Now it's fixed to handle all four scenarios considering both `include_comments` and `keep_markdown_format`. ### Twitter `@SathinduGA` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 13:44:52 -07:00
Mike Salvatore	7c0b1b8171	Add session to ConfluenceLoader.__init__() (#9437 ) - Description: Allows the user of `ConfluenceLoader` to pass a `requests.Session` object in lieu of an authentication mechanism - Issue: None - Dependencies: None - Tag maintainer: @hwchase17	2023-08-21 13:18:35 -07:00
Kim Minjong	3d1095218c	Update ChatOpenAI._astream to respect finish_reason (#9431 ) Currently, ChatOpenAI._astream does not reflect finish_reason to generation_info. Change it to reflect that.	2023-08-21 12:56:42 -07:00
Matthew Zeiler	949b2cf177	Improvements to the Clarifai integration (#9290 ) - Improved docs - Improved performance in multiple ways through batching, threading, etc. - fixed error message - Added support for metadata filtering during similarity search. @baskaryan PTAL	2023-08-21 12:53:36 -07:00
ricki-epsilla	66a47d9a61	add Epsilla vectorstore (#9239 ) [Epsilla](https://github.com/epsilla-cloud/vectordb) vectordb is an open-source vector database that leverages the advanced academic parallel graph traversal techniques for vector indexing. This PR adds basic integration with [pyepsilla](https://github.com/epsilla-cloud/epsilla-python-client)(Epsilla vectordb python client) as a vectorstore. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 12:51:15 -07:00
Bagatur	dda5b1e370	Bagatur/doc loader confluence (#9524 ) Co-authored-by: chanjetsdp <chanjetsdp@chanjet.com>	2023-08-21 12:40:44 -07:00
Predrag Gruevski	de1f63505b	Add `py.typed` file to `langchain-experimental`. (#9557 ) The package is linted with mypy, so its type hints are correct and should be exposed publicly. Without this file, the type hints remain private and cannot be used by downstream users of the package.	2023-08-21 15:37:16 -04:00
Raynor Chavez	973866c894	fix: Updated marqo integration for marqo version 1.0.0+ (#9521 ) - Description: Updated marqo integration to use tensor_fields instead of non_tensor_fields. Upgraded marqo version to 1.2.4 - Dependencies: marqo 1.2.4 --------- Co-authored-by: Raynor Kirkson E. Chavez <raynor.chavez@192.168.254.171> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 10:43:15 -07:00
Bagatur	c7a5bb6031	bump 270 (#9549 )	2023-08-21 10:18:46 -07:00
Nuno Campos	28e1ee4891	Nc/small fixes 21aug (#9542 )	2023-08-21 18:01:20 +01:00
Bagatur	d11841d760	bump 269 (#9487 )	2023-08-21 08:34:16 -07:00
axiangcoding	05aa02005b	feat(llms): support ERNIE Embedding-V1 (#9370 ) - Description: support [ERNIE Embedding-V1](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/alj562vvu), which is part of ERNIE ecology - Issue: None - Dependencies: None - Tag maintainer: @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 07:52:25 -07:00
José Ferraz Neto	f116e10d53	Add SharePoint Loader (#4284 ) - Added a loader (`SharePointLoader`) that can pull documents (`pdf`, `docx`, `doc`) from the [SharePoint Document Library](https://support.microsoft.com/en-us/office/what-is-a-document-library-3b5976dd-65cf-4c9e-bf5a-713c10ca2872). - Added a Base Loader (`O365BaseLoader`) to be used for all Loaders that use [O365](https://github.com/O365/python-o365) Package - Code refactoring on `OneDriveLoader` to use the new `O365BaseLoader`. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-21 07:49:07 -07:00
Utku Ege Tuluk	bb4f7936f9	feat(llms): add streaming support to textgen (#9295 ) - Description: Added streaming support to the textgen component in the llms module. - Dependencies: websocket-client = "^1.6.1"	2023-08-21 07:39:14 -07:00
Eugene Yurtsev	02c5c13a6e	Fast linters go first (#9501 ) Proposal to reverse the order of linters based on the principle of running the fast ones first.	2023-08-21 00:20:54 -07:00
Ofer Mendelevitch	a758496236	Fixed issue with metadata in query (#9500 ) - Description: Changed metadata retrieval so that it combines Vectara doc level and part level metadata - Tag maintainer: @rlancemartin - Twitter handle: @ofermend	2023-08-20 16:00:14 -07:00
Eugene Yurtsev	e51bccdb28	Add strict flag to the JSON parser (#9471 ) This updates the default configuration since I think it's almost always what we want to happen. But we should evaluate whether there are any issues.	2023-08-19 22:02:12 -04:00
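For context on e51bccdb28: Python's standard `json.loads` takes a `strict` argument, and with `strict=False` it accepts control characters such as raw newlines inside strings. Whether the LangChain parser forwards exactly this flag is an assumption here; the sketch only shows the standard-library behaviour, with a hypothetical `parse_or_none` helper:

```python
import json
from typing import Any, Optional

# The "\n" below becomes a real newline character inside the JSON string value.
RAW = '{"answer": "line one\nline two"}'


def parse_or_none(text: str, strict: bool) -> Optional[Any]:
    """Return the parsed JSON value, or None if decoding fails (hypothetical helper)."""
    try:
        return json.loads(text, strict=strict)
    except json.JSONDecodeError:
        return None
```

Model output often contains unescaped newlines inside string values, which is the kind of input where the two settings differ.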
Taqi Jaffri	5cd244e9b7	CR feedback	2023-08-19 13:48:15 -07:00
Predrag Gruevski	be9bc62f8b	Fix bash test regex for Linux under WSL2. (#9475 ) It fails with `Permission denied` and not `not found`. Both seem reasonable.	2023-08-19 09:27:14 -04:00
Lorenzo	5b3dbf12a5	Uniform valid suffixes and clarify exceptions (#9463 ) Description: - Uniformed the current valid suffixes (file formats) for loading agents from hubs and files (to better handle future additions); - Clarified exception messages (also in unit test).	2023-08-18 21:35:53 -07:00
Brendan Collins	9f545825b7	Added Geometry Validation, Geometry Metadata, and WKT instead of Python str() to GeoDataFrame Loader (#9466 ) @rlancemartin The current implementation within the `Geopandas.GeoDataFrame` loader uses the Python builtin `str()` function on the input geometries. While this looks very close to WKT (Well-Known Text), Python's str function doesn't guarantee that. In the interest of interop, I've changed to using the `wkt` property on the Shapely geometries for generating the text representation of the geometries. Also included here: - validation of the input `page_content_column` as being a GeoSeries. - geometry `crs` (Coordinate Reference System) / bounds (xmin/ymin/xmax/ymax) added to Document metadata. Having the CRS is critical... having the bounds is just helpful! I think there is a larger question of "Should the geometry live in the `page_content`, or should the record be better summarized and tuck the geom into metadata?" ...something for another day and another PR.	2023-08-18 21:35:39 -07:00
Kacper Łukawski	616e728ef9	Enhance qdrant vs using async embed documents (#9462 ) This is an extension of #8104. I updated some of the signatures so all the tests pass. @danhnn I couldn't commit to your PR, so I created a new one. Thanks for your contribution! @baskaryan Could you please merge it? --------- Co-authored-by: Danh Nguyen <dnncntt@gmail.com>	2023-08-18 18:59:48 -07:00
Matt Robinson	83d2a871eb	fix: apply unstructured preprocess functions (#9473 ) ### Summary Fixes a bug from #7850 where post processing functions in Unstructured loaders were not applied. Adds an assertion to the test to verify the post processing function was applied and also updates the explanation in the example notebook.	2023-08-18 18:54:28 -07:00
William FH	292ae8468e	Let you specify run id in trace as chain group (#9484 ) I think we'll deprecate this soon anyway but still nice to be able to fetch the run id	2023-08-18 17:21:53 -07:00
Predrag Gruevski	df8e35fd81	Remove incorrect ABC from two Elasticsearch classes. (#9470 ) Neither is an ABC because their own example code instantiates them directly.	2023-08-18 15:01:02 -04:00
Predrag Gruevski	82f28ca9ef	`ChatPromptTemplate` is not an `ABC`, it's instantiated directly. (#9468 ) Its own `__add__` method constructs `ChatPromptTemplate` objects directly, it cannot be abstract. Found while debugging something else with @nfcampos.	2023-08-18 14:37:10 -04:00
vamseeyarla	82fb56b79c	Issue 9401 - SequentialChain runs the same callbacks over and over in async mode (#9452 ) Issue: https://github.com/langchain-ai/langchain/issues/9401 In async mode, the SequentialChain implementation seems to run the same callbacks over and over since it is re-using the same callbacks object. Langchain version: 0.0.264, master The implementation of the async route differs from the sync route. The sync approach follows the right pattern of generating a new callbacks object instead of re-using the old one, thus avoiding the cascading run of callbacks at each step. Async mode: ``` _run_manager = run_manager or AsyncCallbackManagerForChainRun.get_noop_manager() callbacks = _run_manager.get_child() ... for i, chain in enumerate(self.chains): _input = await chain.arun(_input, callbacks=callbacks) ... ``` Regular mode: ``` _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager() for i, chain in enumerate(self.chains): _input = chain.run(_input, callbacks=_run_manager.get_child(f"step_{i+1}")) ... ``` Notice how we are reusing the callbacks object in the async code, which has a cascading effect as we run through the chain. It runs the same callbacks over and over, resulting in issues. Solution: Define the async function in the same pattern as the regular one and add tests. --------- Co-authored-by: vamsee_yarlagadda <vamsee.y@airbnb.com>	2023-08-18 11:26:12 -07:00
William FH	c29fbede59	Wfh/rm num repetitions (#9425 ) Makes it hard to do test run comparison views and we'd probably want to just run multiple runs right now	2023-08-18 10:08:39 -07:00
Predrag Gruevski	eee0d1d0dd	Update repository links in the package metadata. (#9454 )	2023-08-18 12:55:43 -04:00
Bagatur	50b8f4dcc7	bump 268 (#9455 )	2023-08-18 08:46:39 -07:00
Nuno Campos	354c42afd2	Lint	2023-08-18 15:30:30 +01:00
Nuno Campos	4452314aab	Merge branch 'master' into bagatur/locals_in_config	2023-08-18 15:23:05 +01:00
Nuno Campos	d5eb228874	Add kwargs to all other optional runnable methods (#9439 )	2023-08-18 15:04:26 +01:00
Leonid Ganeline	a3dd4dcadf	📖 docstrings `retrievers` consistency (#9422 ) 📜 - updated the top-level descriptions to a consistent format; - changed the format of several 100% internal functions from "name" to "_name". So, these functions are not shown in the Top-level API Reference page (with lists of classes/functions)	2023-08-18 09:20:39 -04:00
Nuno Campos	9417961b17	Add lock on tee peer cleanup (#9446 )	2023-08-18 14:20:09 +01:00
Nuno Campos	d3f10d2f4f	Update test	2023-08-18 11:36:16 +01:00
Nuno Campos	6ae58da668	Assign defaults in batch calls	2023-08-18 10:53:10 +01:00
Nuno Campos	ddcb4ff5fb	Li t	2023-08-18 10:30:42 +01:00
Nuno Campos	1baedc4e18	Move patch_config	2023-08-18 10:28:39 +01:00
Nuno Campos	46f3850794	Lint	2023-08-18 10:25:41 +01:00
Nuno Campos	24a197f96a	Merge branch 'master' into bagatur/locals_in_config	2023-08-18 10:12:10 +01:00
Nuno Campos	8ddaaf3d41	Move config helpers	2023-08-18 10:10:35 +01:00
Nuno Campos	a5e7dcec61	Lint	2023-08-18 10:03:28 +01:00
Nuno Campos	c1b1666ec8	Ensure config defaults apply even when a config is passed in	2023-08-18 10:02:29 +01:00
Nuno Campos	7fe474d198	Update snapshots	2023-08-18 10:02:11 +01:00
Jacob Lee	0689628489	Adds streaming for runnable maps (#9283 ) @nfcampos @baskaryan --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-08-18 07:46:23 +01:00
Bagatur	ab21af71be	wip	2023-08-17 17:28:02 -07:00
Bagatur	6f69b19ff5	wip tests	2023-08-17 16:45:52 -07:00
Bagatur	9e906c39ba	nit	2023-08-17 16:22:22 -07:00
Bagatur	6b0a849f59	fix	2023-08-17 16:22:12 -07:00
Bagatur	c447e9a854	cr	2023-08-17 15:29:00 -07:00
Bagatur	bd80cad6db	add	2023-08-17 13:52:19 -07:00
Bagatur	8c1a528c71	cr	2023-08-17 13:52:09 -07:00
Bagatur	25cbcd9374	merge	2023-08-17 13:03:28 -07:00
Aashish Saini	ce78877a87	Replaced instances of raising ValueError with raising ImportError. (#9388 ) Refactored code to ensure consistent handling of ImportError. Replaced instances of raising ValueError with raising ImportError. The choice of raising a ValueError here is somewhat unconventional and might lead to confusion for anyone reading the code. Typically, when dealing with import-related errors, the recommended approach is to raise an ImportError with a descriptive message explaining the issue. This provides a clearer indication that the problem is related to importing the required module. @hwchase17 , @baskaryan , @eyurtsev Thanks Aashish --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-17 12:24:08 -07:00
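The pattern that ce78877a87 (and the later 3c4f32c8b8) standardises on can be sketched as follows. The `import_optional` helper and its parameters are hypothetical and only illustrate re-raising a failed optional import as `ImportError` with an actionable message:

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, pip_name: str) -> ModuleType:
    """Import an optional dependency or raise a descriptive ImportError.

    Hypothetical helper for illustration; not part of LangChain.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Raise ImportError (not ValueError) so callers can tell that the
        # problem is a missing package rather than a bad argument.
        raise ImportError(
            f"Could not import {module_name}. "
            f"Please install it with `pip install {pip_name}`."
        ) from exc
```

Callers that already wrap optional imports in `except ImportError` keep working, which would not be the case if the helper raised `ValueError`.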
Bagatur	8c986221e4	make openapi_schema_pydantic opt (#9408 )	2023-08-17 11:49:23 -07:00
Eugene Yurtsev	77b359edf5	More missing type annotations (#9406 ) This PR fills in more missing type annotations on pydantic models. It's OK if it missed some annotations, we just don't want it to get annotations wrong at this stage. I'll do a few more passes over the same files!	2023-08-17 12:19:50 -04:00
Bagatur	a69d1b84f4	bump 267 (#9403 )	2023-08-17 08:47:13 -07:00
Nuno Campos	c0d67420e5	Use a submodule for pydantic v1 compat (#9371 )	2023-08-17 16:35:49 +01:00
Bagatur	995ef8a7fc	unpin pydantic (#9356 )	2023-08-17 01:55:46 -07:00
Tong Gao	3c8e9a9641	Fix typos in eval_chain.py (#9365 ) Fixed two minor typos.	2023-08-17 01:53:46 -07:00
Eugene Yurtsev	2673b3a314	Create pydantic v1 namespace in langchain (#9254 ) Create pydantic v1 namespace in langchain experimental	2023-08-16 21:19:31 -07:00
Eugene Yurtsev	4c2de2a7f2	Adding missing types in some pydantic models (#9355 ) * Adding missing types in some pydantic models -- this change is required for making the code work with pydantic v2.	2023-08-16 20:10:34 -07:00
Harrison Chase	1c089cadd7	fix import v2 (#9346 )	2023-08-16 17:33:01 -07:00
qqjettkgjzhxmwj	84a97d55e1	Fix typo in llm_router.py (#9322 ) Fix typo	2023-08-16 15:56:44 -07:00
Joe Reuter	09aa1eac03	Airbyte loaders: Fix last_state getter (#9314 ) This PR fixes the Airbyte loaders when doing incremental syncs. The notebooks are calling out to access `loader.last_state` to get the current state of incremental syncs, but this didn't work due to a refactoring of how the loaders are structured internally in the original PR. This PR fixes the issue by adding a `last_state` property that forwards the state correctly from the CDK adapter. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-16 15:56:33 -07:00
Jakub Kuciński	8bebc9206f	Add improved sources splitting in BaseQAWithSourcesChain (#8716 ) ## Type: Improvement --- ## Description: Running QAWithSourcesChain sometimes raises ValueError as mentioned in issue #7184: ``` ValueError: too many values to unpack (expected 2) Traceback: response = qa({"question": pregunta}, return_only_outputs=True) File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\base.py", line 166, in __call__ raise e File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\base.py", line 160, in __call__ self._call(inputs, run_manager=run_manager) File "C:\Anaconda3\envs\iagen_3_10\lib\site-packages\langchain\chains\qa_with_sources\base.py", line 132, in _call answer, sources = re.split(r"SOURCES:\s", answer) ``` This is due to the LLM generating a subsequent question, answer and sources, that is, a completion in a form similar to the one below: ``` <final_answer> SOURCES: <sources> QUESTION: <new_or_repeated_question> FINAL ANSWER: <new_or_repeated_final_answer> SOURCES: <new_or_repeated_sources> ``` This causes the following line ``` re.split(r"SOURCES:\s", answer) ``` to return more than 2 elements and results in a ValueError. The simple fix is to also split on "QUESTION:\s" and take the first two elements: ``` answer, sources = re.split(r"SOURCES:\s|QUESTION:\s", answer)[:2] ``` Sometimes the LLM might also generate other text, like alternative answers in the form: ``` <final_answer_1> SOURCES: <sources> <final_answer_2> SOURCES: <sources> <final_answer_3> SOURCES: <sources> ``` In such cases it is best to split the previously obtained sources on a new line: ``` sources = re.split(r"\n", sources.lstrip())[0] ``` --- ## Issue: Resolves #7184 --- ## Maintainer: @baskaryan	2023-08-16 13:30:15 -07:00
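The two-step split described in 8bebc9206f can be tried in isolation. The sketch below wraps the two `re.split` calls from the commit message in a hypothetical `split_answer_and_sources` function; the function name and the fallback for answers without a `SOURCES:` marker are assumptions, not the chain's actual code:

```python
import re
from typing import Tuple


def split_answer_and_sources(text: str) -> Tuple[str, str]:
    """Separate the answer from its sources (hypothetical standalone helper)."""
    if not re.search(r"SOURCES:\s", text):
        # Assumed fallback: no marker means no sources.
        return text, ""
    # Splitting on either marker and keeping two parts ignores any
    # follow-up QUESTION / FINAL ANSWER / SOURCES blocks the model adds.
    answer, sources = re.split(r"SOURCES:\s|QUESTION:\s", text)[:2]
    # Keep only the first line of the sources, dropping alternative answers.
    sources = re.split(r"\n", sources.lstrip())[0]
    return answer, sources


completion = (
    "Paris is the capital.\n"
    "SOURCES: doc1, doc2\n"
    "QUESTION: What about Rome?\n"
    "FINAL ANSWER: Rome.\n"
    "SOURCES: doc3"
)
answer, sources = split_answer_and_sources(completion)
```

Here `sources` ends up as `"doc1, doc2"`, while the original single `re.split(r"SOURCES:\s", ...)` would have produced three parts for the same input and failed to unpack.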
Bagatur	a3c79b1909	Add tiktoken integration dep (#9332 )	2023-08-16 12:09:22 -07:00
Bagatur	ba5fbaba70	bump 266 (#9296 )	2023-08-16 01:13:19 -07:00
axiangcoding	63601551b1	fix(llms): improve the ernie chat model (#9289 ) - Description: improve the ernie chat model. - fix missing kwargs to payload - new test cases - add some debug level log - improve description - Issue: None - Dependencies: None - Tag maintainer: @baskaryan	2023-08-16 00:48:42 -07:00
Daniel Chalef	1d55141c50	zep/new ZepVectorStore (#9159 ) - new ZepVectorStore class - ZepVectorStore unit tests - ZepVectorStore demo notebook - update zep-python to ~1.0.2 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-16 00:23:07 -07:00
William FH	2519580994	Add Schema Evals (#9228 ) Simple eval checks for whether a generation is valid json and whether it matches an expected dict	2023-08-15 17:17:32 -07:00
Kenny	74a64cfbab	expose output key to create_openai_fn_chain (#9155 ) A quick change to allow the output key of create_openai_fn_chain to optionally be changed. @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 17:01:32 -07:00
Bagatur	afba2be3dc	update openai functions docs (#9278 )	2023-08-15 17:00:56 -07:00
Bagatur	9abf60acb6	Bagatur/vectara regression (#9276 ) Co-authored-by: Ofer Mendelevitch <ofer@vectara.com> Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>	2023-08-15 16:19:46 -07:00
Xiaoyu Xee	b30f449dae	Add dashvector vectorstore (#9163 ) ## Description Add `Dashvector` vectorstore for langchain - [dashvector quick start](https://help.aliyun.com/document_detail/2510223.html) - [dashvector package description](https://pypi.org/project/dashvector/) ## How to use ```python from langchain.vectorstores.dashvector import DashVector dashvector = DashVector.from_documents(docs, embeddings) ``` --------- Co-authored-by: smallrain.xuxy <smallrain.xuxy@alibaba-inc.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 16:19:30 -07:00
Bagatur	bfbb97b74c	Bagatur/deeplake docs fixes (#9275 ) Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz>	2023-08-15 15:56:36 -07:00
Kunj-2206	1b3942ba74	Added BittensorLLM (#9250 ) Description: Adding NIBittensorLLM via Validator Endpoint to langchain llms Tag maintainer: @Kunj-2206 Maintainer responsibilities: Models / Prompts: @hwchase17, @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 15:40:52 -07:00
Toshish Jawale	852722ea45	Improvements in Nebula LLM (#9226 ) - Description: Added improvements in Nebula LLM to perform auto-retry; more generation parameters supported. Conversation is no longer required to be passed in the LLM object. Examples are updated. - Issue: N/A - Dependencies: N/A - Tag maintainer: @baskaryan - Twitter handle: symbldotai --------- Co-authored-by: toshishjawale <toshish@symbl.ai>	2023-08-15 15:33:07 -07:00
Bagatur	358562769a	Bagatur/refac faiss (#9076 ) Code cleanup and bug fix in deletion	2023-08-15 15:19:00 -07:00
Bagatur	3eccd72382	pin pydantic (#9274 ) don't want default to be v2 yet	2023-08-15 15:02:28 -07:00
Erick Friis	76d09b4ed0	hub push/pull (#9225 ) Description: Adds push/pull functions to interact with the hub Issue: n/a Dependencies: `langchainhub` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 14:11:43 -07:00
Alex Gamble	cf17c58b47	Update documentation for the Context integration with new URL and features (#9259 ) Update documentation and URLs for the Langchain Context integration. We've moved from getcontext.ai to context.ai \o/ Thanks in advance for the review!	2023-08-15 11:38:34 -07:00
Eugene Yurtsev	a091b4bf4c	Update testing workflow to test with both pydantic versions (#9206 ) * PR updates test.yml to test with both pydantic versions * Code should be refactored to make it easier to do testing in matrix format w/ packages * Added steps to assert that pydantic version in the environment is as expected	2023-08-15 13:21:11 -04:00
Bagatur	e0162baa3b	add oai sched tests (#9257 )	2023-08-15 09:40:33 -07:00
Joseph McElroy	5e9687a196	Elasticsearch self-query retriever (#9248 ) Now with ElasticsearchStore VectorStore merged, i've added support for the self-query retriever. I've added a notebook also to demonstrate capability. I've also added unit tests. Credit @elastic and @phoey1 on twitter.	2023-08-15 10:53:43 -04:00
Eugene Yurtsev	0470198fb5	Remove packages for pydantic compatibility (#9217 ) # Poetry updates This PR updates LangChains poetry file to remove any dependencies that aren't pydantic v2 compatible yet. All packages remain usable under pydantic v1, and can be installed separately. ## Bumping the following packages: * langsmith ## Removing the following packages not used in extended unit-tests: * zep-python, anthropic, jina, spacy, steamship, betabageldb not used at all: * octoai-sdk Cleaning up extras w/ for removed packages. ## Snapshots updated Some snapshots had to be updated due to a change in the data model in langsmith. RunType used to be Union of Enum and string and was changed to be string only.	2023-08-15 10:41:25 -04:00
Bagatur	e986afa13a	bump 265 (#9253 )	2023-08-15 07:21:32 -07:00
Hech	4b505060bd	fix: max_marginal_relevance_search and docs in Dingo (#9244 )	2023-08-15 01:06:06 -07:00
axiangcoding	664ff28cba	feat(llms): support ernie chat (#9114 ) Description: support ernie (文心一言) chat model Related issue: #7990 Dependencies: None Tag maintainer: @baskaryan	2023-08-15 01:05:46 -07:00
Bharat Ramanathan	08a8363fc6	feat(integration): Add support to serialize protobufs in WandbTracer (#8914 ) This PR adds serialization support for protocol buffers in `WandbTracer`. This allows code generation chains to be visualized. Additionally, it fixes a minor bug where the settings are not honored when a run is initialized before using the `WandbTracer` @agola11 --------- Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-15 01:05:12 -07:00
Joshua Sundance Bailey	ef0664728e	ArcGISLoader update (#9240 ) Small bug fixes and added metadata based on user feedback. This PR is from the author of https://github.com/langchain-ai/langchain/pull/8873 .	2023-08-14 23:44:29 -07:00
Joseph McElroy	eac4ddb4bb	Elasticsearch Store Improvements (#8636 ) Todo: - [x] Connection options (cloud, localhost url, es_connection) support - [x] Logging support - [x] Customisable field support - [x] Distance Similarity support - [x] Metadata support - [x] Metadata Filter support - [x] Retrieval Strategies - [x] Approx - [x] Approx with Hybrid - [x] Exact - [x] Custom - [x] ELSER (excluding hybrid as we are working on RRF support) - [x] integration tests - [x] Documentation 👋 this is a contribution to improve Elasticsearch integration with Langchain. Its based loosely on the changes that are in master but with some notable changes: ## Package name & design improvements The import name is now `ElasticsearchStore`, to aid discoverability of the VectorStore. ```py ## Before from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch, ElasticKnnSearch ## Now from langchain.vectorstores.elasticsearch import ElasticsearchStore ``` ## Retrieval Strategy support Before we had a number of classes, depending on the strategy you wanted. `ElasticKnnSearch` for approx, `ElasticVectorSearch` for exact / brute force. With `ElasticsearchStore` we have retrieval strategies: ### Approx Example Default strategy for the vast majority of developers who use Elasticsearch will be inferring the embeddings from outside of Elasticsearch. Uses KNN functionality of _search. ```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index" ) output = docsearch.similarity_search("foo", k=1) ``` ### Approx, with hybrid Developers who want to search, using both the embedding and the text bm25 match. Its simple to enable. 
```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index", strategy=ElasticsearchStore.ApproxRetrievalStrategy(hybrid=True) ) output = docsearch.similarity_search("foo", k=1) ``` ### Approx, with `query_model_id` Developers who want to infer within Elasticsearch, using the model loaded in the ml node. This relies on the developer to setup the pipeline and index if they wish to embed the text in Elasticsearch. Example of this in the test. ```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index", strategy=ElasticsearchStore.ApproxRetrievalStrategy( query_model_id="sentence-transformers__all-minilm-l6-v2" ), ) output = docsearch.similarity_search("foo", k=1) ``` ### I want to provide my own custom Elasticsearch Query You might want to have more control over the query, to perform multi-phase retrieval such as LTR, linearly boosting on document parameters like recently updated or geo-distance. You can do this with `custom_query_fn` ```py def my_custom_query(query_body: dict, query: str) -> dict: return {"query": {"match": {"text": {"query": "bar"}}}} texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), **elasticsearch_connection, index_name=index_name ) docsearch.similarity_search("foo", k=1, custom_query=my_custom_query) ``` ### Exact Example Developers who have a small dataset in Elasticsearch, dont want the cost of indexing the dims vs tradeoff on cost at query time. Uses script_score. 
```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index", strategy=ElasticsearchStore.ExactRetrievalStrategy(), ) output = docsearch.similarity_search("foo", k=1) ``` ### ELSER Example Elastic provides its own sparse vector model called ELSER. With these changes, its really easy to use. The vector store creates a pipeline and index thats setup for ELSER. All the developer needs to do is configure, ingest and query via langchain tooling. ```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index", strategy=ElasticsearchStore.SparseVectorStrategy(), ) output = docsearch.similarity_search("foo", k=1) ``` ## Architecture In future, we can introduce new strategies and allow us to not break bwc as we evolve the index / query strategy. ## Credit On release, could you credit @elastic and @phoey1 please? Thank you! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-14 23:42:35 -07:00
Divyansh Garg	9529483c2a	Improve MultiOn client toolkit prompts (#9222 ) - Updated prompts for the MultiOn toolkit for better functionality - Non-blocking but good to have it merged to improve the overall performance for the toolkit @hinthornw @hwchase17 --------- Co-authored-by: Naman Garg <ngarg3@binghamton.edu>	2023-08-14 17:39:51 -07:00
William FH	c478fc208e	Default On Retry (#9230 ) Base callbacks don't have a default on retry event Fix #8542 --------- Co-authored-by: landonsilla <landon.silla@stepstone.com>	2023-08-14 16:45:17 -07:00
Leonid Ganeline	93dd499997	docstrings: `document_loaders` consistency 3 (#9216 ) Updated docstrings into the consistent format (probably, the last update for the `document_loaders`.	2023-08-14 16:28:39 -07:00
Kshitij Wadhwa	a69cb95850	track langchain usage for Rockset (#9229 ) Add ability to track langchain usage for Rockset. Rockset's new python client allows setting this. To prevent old clients from failing, any exception thrown while setting it is ignored (we can't track old versions). Tested locally with old and new Rockset python client cc @baskaryan	2023-08-14 16:27:34 -07:00
Leonid Ganeline	7810ea5812	docstrings: `chat_models` consistency (#9227 ) Updated docstrings into the consistent format.	2023-08-14 16:15:56 -07:00
William FH	b0896210c7	Return feedback with failed response if there's an error (#9223 ) In Evals	2023-08-14 15:59:16 -07:00
William FH	7124f2ebfa	Parent Doc Retriever (#9214 ) 2 things: - Implement the private method rather than the public one so callbacks are handled properly - Add search_kwargs (Open to not adding this if we are trying to deprecate this UX but seems like as a user i'd assume similar args to the vector store retriever. In fact some may assume this implements the same interface but I'm not dealing with that here) -	2023-08-14 15:41:53 -07:00
Harrison Chase	3f601b5809	add async method in (#9204 )	2023-08-14 11:04:31 -07:00
Clark	03ea0762a1	fix(jinachat): related to #9197 (#9200 ) related to: https://github.com/langchain-ai/langchain/issues/9197 --------- Co-authored-by: qianjun.wqj <qianjun.wqj@alibaba-inc.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-14 11:04:20 -07:00
Eugene Yurtsev	4f1feaca83	Wrap OpenAPI features in conditionals for pydantic v2 compatibility (#9205 ) Wrap OpenAPI in conditionals for pydantic v2 compatibility.	2023-08-14 13:40:58 -04:00
Glauco Custódio	89be10f6b4	add ttl to RedisCache (#9068 ) Add `ttl` (time to live) to `RedisCache`	2023-08-14 12:59:18 -04:00
Eugene Yurtsev	04bc5f3b18	Conditionally add pydantic v1 to namespace (#9202 ) Conditionally add pydantic_v1 to namespace.	2023-08-14 11:26:45 -04:00
shibuiwilliam	feec422bf7	fix logging to logger (#9192 ) # What - fix logging to logger	2023-08-14 08:21:09 -07:00
Bagatur	5935767056	bump lc 246, lce 9 (#9207 )	2023-08-14 08:14:37 -07:00
Bagatur	b5a57acf6c	lite llm lint (#9208 )	2023-08-14 11:03:06 -04:00
Krish Dholakia	49f1d8477c	Adding ChatLiteLLM model (#9020 ) Description: Adding a langchain integration for the LiteLLM library Tag maintainer: @hwchase17, @baskaryan Twitter handle: @krrish_dh / @Berri_AI --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-14 07:43:40 -07:00
Eugene Yurtsev	72f9150a50	Update 2 more pydantic imports (#9203 ) Update two more pydantic imports to use v1 explicitly	2023-08-14 10:11:30 -04:00
Eugene Yurtsev	c172f972ea	Create pydantic v1 namespace, add partial compatibility for pydantic v2 (#9123 ) First of a few PRs to add full compatibility to both pydantic v1 and v2. This PR creates pydantic v1 namespace and adds it to sys.modules. Upcoming changes: 1. Handle `openapi-schema-pydantic = "^1.2"` and dependent chains/tools 2. bump dependencies to versions that are cross compatible for pydantic or remove them (see below) 3. Add tests to github workflows to test with pydantic v1 and v2 Dependencies From a quick look (could be wrong since was done manually) dependencies pinning pydantic below 2 (some of these can be bumped to newer versions are provide cross-compatible code) anthropic bentoml confection fastapi langsmith octoai-sdk openapi-schema-pydantic qdrant-client spacy steamship thinc zep-python Unpinned marqo () nomic () xinference(*)	2023-08-14 09:37:32 -04:00
Evan Schultz	8189dea0d8	Fixes typing issues in BaseOpenAI (#9183 ) ## Description: Sets default values for `client` and `model` attributes in the BaseOpenAI class to fix Pylance Typing issue. - Issue: #9182. - Twitter handle: @evanmschultz	2023-08-13 23:03:28 -07:00
Massimiliano Pronesti	d95eeaedbe	feat(llms): support vLLM's OpenAI-compatible server (#9179 ) This PR aims at supporting [vLLM's OpenAI-compatible server feature](https://vllm.readthedocs.io/en/latest/getting_started/quickstart.html#openai-compatible-server), i.e. making it possible to call vLLM's LLMs as if they were OpenAI's. I've also updated the related notebook providing an example usage. At the moment, vLLM only supports the `Completion` API.	2023-08-13 23:03:05 -07:00
Michael Goin	621da3c164	Adds DeepSparse as an LLM (#9184 ) Adds [DeepSparse](https://github.com/neuralmagic/deepsparse) as an LLM backend. DeepSparse supports running various open-source sparsified models hosted on [SparseZoo](https://sparsezoo.neuralmagic.com/) for performance gains on CPUs. Twitter handles: @mgoin_ @neuralmagic --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-13 22:35:58 -07:00
Bagatur	0fa69d8988	Bagatur/zep python 1.0 (#9186 ) Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2023-08-13 21:52:53 -07:00
Eugene Yurtsev	9b24f0b067	Enhance deprecation decorator to modify docs with sphinx directives (#9069 ) Enhance deprecation decorator	2023-08-13 15:35:01 -04:00
Bagatur	cdfe2c96c5	bump 263 (#9156 )	2023-08-12 12:36:44 -07:00
Leonid Ganeline	19f504790e	docstrings: document_loaders consistency 2 (#9148 ) This is Part 2. See #9139 (Part 1).	2023-08-11 16:25:40 -07:00
Harrison Chase	1b58460fe3	update keys for chain (#5164 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-11 16:25:13 -07:00
胡亮	7edf4ca396	Support multi gpu inference for HuggingFaceEmbeddings (#4732 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-11 15:55:44 -07:00
UmerHA	8aab39e3ce	Added SmartGPT workflow (issue #4463 ) (#4816 ) # Added SmartGPT workflow by providing SmartLLM wrapper around LLMs Edit: As @hwchase17 suggested, this should be a chain, not an LLM. I have adapted the PR. It is used like this: ``` from langchain.prompts import PromptTemplate from langchain.chains import SmartLLMChain from langchain.chat_models import ChatOpenAI hard_question = "I have a 12 liter jug and a 6 liter jug. I want to measure 6 liters. How do I do it?" hard_question_prompt = PromptTemplate.from_template(hard_question) llm = ChatOpenAI(model_name="gpt-4") prompt = PromptTemplate.from_template(hard_question) chain = SmartLLMChain(llm=llm, prompt=prompt, verbose=True) chain.run({}) ``` Original text: Added SmartLLM wrapper around LLMs to allow for SmartGPT workflow (as in https://youtu.be/wVzuvf9D9BU). SmartLLM can be used wherever LLM can be used. E.g: ``` smart_llm = SmartLLM(llm=OpenAI()) smart_llm("What would be a good company name for a company that makes colorful socks?") ``` or ``` smart_llm = SmartLLM(llm=OpenAI()) prompt = PromptTemplate( input_variables=["product"], template="What is a good name for a company that makes {product}?", ) chain = LLMChain(llm=smart_llm, prompt=prompt) chain.run("colorful socks") ``` SmartGPT consists of 3 steps: 1. Ideate - generate n possible solutions ("ideas") to user prompt 2. Critique - find flaws in every idea & select best one 3. Resolve - improve upon best idea & return it Fixes #4463 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @hwchase17 - @agola11 Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) | Discord: RicChilligerDude#7589 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-11 15:44:27 -07:00
Lucas Pickup	1d3735a84c	Ensure deployment_id is set to provided deployment, required for Azure OpenAI. (#5002 ) # Ensure deployment_id is set to provided deployment, required for Azure OpenAI. --------- Co-authored-by: Lucas Pickup <lupickup@microsoft.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-11 15:43:01 -07:00
Bagatur	45741bcc1b	Bagatur/vectara nit (#9140 ) Co-authored-by: Ofer Mendelevitch <ofer@vectara.com>	2023-08-11 15:32:03 -07:00
Dominick DEV	9b64932e55	Add LangChain utility for real-time crypto exchange prices (#4501 ) This commit adds the LangChain utility which allows for the real-time retrieval of cryptocurrency exchange prices. With LangChain, users can easily access up-to-date pricing information by running the command ".run(from_currency, to_currency)". This new feature provides a convenient way to stay informed on the latest exchange rates and make informed decisions when trading crypto. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-11 14:45:06 -07:00
Joshua Sundance Bailey	eaa505fb09	Create ArcGISLoader & example notebook (#8873 ) - Description: Adds the ArcGISLoader class to `langchain.document_loaders` - Allows users to load data from ArcGIS Online, Portal, and similar - Users can authenticate with `arcgis.gis.GIS` or retrieve public data anonymously - Uses the `arcgis.features.FeatureLayer` class to retrieve the data - Defines the most relevant keywords arguments and accepts `**kwargs` - Dependencies: Using this class requires `arcgis` and, optionally, `bs4.BeautifulSoup`. Tagging maintainers: - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-11 14:33:40 -07:00
Bagatur	e21152358a	fix (#9145 )	2023-08-11 13:58:23 -07:00
Leonid Ganeline	edb585228d	docstrings: document_loaders consistency (#9139 ) Formatted docstrings from different formats to a consistent format, like: >Loads processed docs from Docugami. "Load from `Docugami`." >Loader that uses Unstructured to load HTML files. "Load `HTML` files using `Unstructured`." >Load documents from a directory. "Load from a directory." - `Load` - no `Loads` - DocumentLoader always loads Documents, so no more "documents/docs/texts/ etc" - integrated systems and APIs enclosed in backticks,	2023-08-11 13:09:31 -07:00
Markus Schiffer	00bf472265	Fix for SVM retriever discarding document metadata (#9141 ) As stated in the title the SVM retriever discarded the metadata of passed in docs. This code fixes that. I also added one unit test that should test that. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-11 13:08:17 -07:00
Bagatur	bace17e0aa	rm integration deps (#9142 )	2023-08-11 12:43:08 -07:00
Eugene Yurtsev	44bc89b7bf	Support a few list like operations on ChatPromptTemplate (#9077 ) Make it easier to work with chat prompt template	2023-08-11 14:49:51 -04:00
Hai The Dude	e4418d1b7e	Added new use case docs for Web Scraping, Chromium loader, BS4 transformer (#8732 ) - Description: Added a new use case category called "Web Scraping", and a tutorial to scrape websites using OpenAI Functions Extraction chain to the docs. - Tag maintainer:@baskaryan @hwchase17 , - Twitter handle: https://www.linkedin.com/in/haiphunghiem/ (I'm on LinkedIn mostly) --------- Co-authored-by: Lance Martin <lance@langchain.dev>	2023-08-11 11:46:59 -07:00
sseide	6cb763507c	add basic support for redis cluster server (#9128 ) This change updates the central utility class to recognize a Redis cluster server after connection and returns a new cluster-aware Redis client. The "normal" Redis client would not be able to talk to a cluster node because keys might be stored on other shards of the Redis cluster and therefore not be readable or writable. With this patch clients do not need to know what kind of Redis server it is; they just connect through the same API calls for standalone and cluster servers. There are no dependencies added due to this MR. Remark - with the current redis-py client library (4.6.0) a cluster cannot be used as VectorStore. It can be used for other use-cases. There is a bug / missing feature(?) in the Redis client breaking the VectorStore implementation. I opened an issue at the client library too (redis/redis-py#2888) to fix this. As soon as this is fixed in the `redis-py` library it should be usable there too. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-11 11:37:44 -07:00
David Duong	6d03f8b5d8	Add serialisable support for Replicate (#8525 )	2023-08-11 11:35:21 -07:00
niklub	16af5f8690	Add LabelStudio integration (#8880 ) This PR introduces [Label Studio](https://labelstud.io/) integration with LangChain via `LabelStudioCallbackHandler`: - sending data to the Label Studio instance - labeling dataset for supervised LLM finetuning - rating model responses - tracking and displaying chat history - support for custom data labeling workflow ### Example ``` chat_llm = ChatOpenAI(callbacks=[LabelStudioCallbackHandler(mode="chat")]) chat_llm([ SystemMessage(content="Always use emojis in your responses."), HumanMessage(content="Hey AI, how's your day going?"), AIMessage(content="🤖 I don't have feelings, but I'm running smoothly! How can I help you today?"), HumanMessage(content="I'm feeling a bit down. Any advice?"), AIMessage(content="🤗 I'm sorry to hear that. Remember, it's okay to seek help or talk to someone if you need to. 💬"), HumanMessage(content="Can you tell me a joke to lighten the mood?"), AIMessage(content="Of course! 🎭 Why did the scarecrow win an award? Because he was outstanding in his field! 🌾"), HumanMessage(content="Haha, that was a good one! Thanks for cheering me up."), AIMessage(content="Always here to help! 😊 If you need anything else, just let me know."), HumanMessage(content="Will do! By the way, can you recommend a good movie?"), ]) ``` <img width="906" alt="image" src="https://github.com/langchain-ai/langchain/assets/6087484/0a1cf559-0bd3-4250-ad96-6e71dbb1d2f3"> ### Dependencies - [label-studio](https://pypi.org/project/label-studio/) - [label-studio-sdk](https://pypi.org/project/label-studio-sdk/) https://twitter.com/labelstudiohq --------- Co-authored-by: nik <nik@heartex.net>	2023-08-11 11:24:10 -07:00
Bagatur	8cb2594562	Bagatur/dingo (#9079 ) Co-authored-by: gary <1625721671@qq.com>	2023-08-11 10:54:45 -07:00
Jacques Arnoux	926c64da60	Fix web research retriever for unknown links in results (#9115 ) Fixes an issue with web research retriever for unknown links in results. This is currently making the retrieve crash sometimes. @rlancemartin	2023-08-11 10:50:37 -07:00
Alvaro Bartolome	f7ae183f40	`ArgillaCallbackHandler` to properly use default values for `api_url` and `api_key` (#9113 ) As of the recent PR at #9043, after some testing we've realised that the default values were not being used for `api_key` and `api_url`. Besides that, the default for `api_key` was set to `argilla.apikey`, but since the default values are intended for people using the Argilla Quickstart (easy to run and setup), the defaults should be instead `owner.apikey` if using Argilla 1.11.0 or higher, or `admin.apikey` if using a lower version of Argilla. Additionally, we've removed the f-string replacements from the docstrings. --------- Co-authored-by: Gabriel Martin <gabriel@argilla.io>	2023-08-11 09:37:06 -07:00
Bagatur	01ef786e7e	bump 262 (#9108 )	2023-08-11 01:29:07 -07:00
Bagatur	3b754b5461	Bagatur/filter metadata (#9015 ) Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>	2023-08-11 01:10:00 -07:00
Kim Minjong	7f0e847c13	Update pydantic format instruction prompt (#9095 ) - remove unopened bracket	2023-08-11 00:22:13 -07:00
Ashutosh Sanzgiri	991b448dfc	minor edits (#9093 ) Description: Minor edit to PR#845 Thanks!	2023-08-10 23:40:36 -07:00
Bagatur	3ab4e21579	fix json tool (#9096 )	2023-08-10 23:39:25 -07:00
Sam Groenjes	2184e3a400	Fix IndexError when input_list is Empty in prep_prompts (#5769 ) This MR corrects the IndexError arising in prep_prompts method when no documents are returned from a similarity search. Fixes #1733 Co-authored-by: Sam Groenjes <sam.groenjes@darkwolfsolutions.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-10 22:50:39 -07:00
Chenyu Zhao	c0acbdca1b	Update Fireworks model names (#9085 )	2023-08-10 19:23:42 -07:00
Bagatur	b80e3825a6	Bagatur/pinecone by vector (#9087 ) Co-authored-by: joseph <joe@outverse.com>	2023-08-10 18:28:55 -07:00
Nikhil Kumar	6abb2c2c08	Buffer method of ConversationTokenBufferMemory should be able to return messages as string (#7057 ) ### Description: `ConversationBufferTokenMemory` should have a simple way of returning the conversation messages as a string. Previously to complete this, you would only have the option to return memory as an array through the buffer method and call `get_buffer_string` by importing it from `langchain.schema`, or use the `load_memory_variables` method and key into `self.memory_key`. ### Maintainer @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-10 18:17:22 -07:00
William FH	57dd4daa9a	Add string example mapper (#9086 ) Now that we accept any runnable or arbitrary function to evaluate, we don't always look up the input keys. If an evaluator requires references, we should try to infer if there's one key present. We only have delayed validation here but it's better than nothing	2023-08-10 17:07:02 -07:00
Bidhan Roy	02430e25b6	BagelDB (bageldb.ai), VectorStore integration. (#8971 ) - Description: [BagelDB](bageldb.ai) a collaborative vector database. Integrated the bageldb PyPi package with langchain with related tests and code. - Issue: Not applicable. - Dependencies: `betabageldb` PyPi package. - Tag maintainer: @rlancemartin, @eyurtsev, @baskaryan - Twitter handle: bageldb_ai (https://twitter.com/BagelDB_ai) We ran `make format`, `make lint` and `make test` locally. Followed the contribution guideline thoroughly https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --------- Co-authored-by: Towhid1 <nurulaktertowhid@gmail.com>	2023-08-10 16:48:36 -07:00
DJ Atha	ee52482db8	Fix issue 7445 (#7635 ) Description: updated BabyAGI examples and experimental to append the iteration to the result id to fix error storing data to vectorstore. Issue: 7445 Dependencies: no Tag maintainer: @eyurtsev This fix worked for me locally. Happy to take some feedback and iterate on a better solution. I was considering appending a uuid instead but didn't want to over complicate the example. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-10 16:29:31 -07:00
Harrison Chase	bb6fbf4c71	openai adapters (#8988 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-08-10 16:08:50 -07:00
Harrison Chase	45f0f9460a	add async for python repl (#9080 )	2023-08-10 16:07:06 -07:00
Neil Murphy	105c787e5a	Add convenience methods to ConversationBufferMemory and ConversationB… (#8981 ) Add convenience methods to `ConversationBufferMemory` and `ConversationBufferWindowMemory` to get buffer either as messages or as string. Helps when `return_messages` is set to `True` but you want access to the messages as a string, and vice versa. @hwchase17 One use case: Using a `MultiPromptRouter` where `default_chain` is `ConversationChain`, but destination chains are `LLMChains`. Injecting chat memory into prompts for destination chains prints a stringified `List[Messages]` in the prompt, which creates a lot of noise. These convenience methods allow caller to choose either as needed. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-10 15:45:30 -07:00
Zend	6221eb5974	Recursive url loader w/ test (#8813 ) Description: Due to some issue on the test, this is a separate PR with the test for #8502 Tag maintainer: @rlancemartin --------- Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-10 14:50:31 -07:00
Junlin Zhou	cb5fb751e9	Enhance regex of structured_chat agents' output parser (#8965 ) Current regex only extracts agent's action between '` ``` ``` `', this commit will extract action between both '` ```json ``` `' and '` ``` ``` `' This is very similar to #7511 Co-authored-by: zjl <junlinzhou@yzbigdata.com>	2023-08-10 14:26:07 -07:00
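A pattern that accepts both fenced forms mentioned in the commit above might look like the following. This is a sketch of the idea only; the exact regex used by the structured_chat output parser is not quoted in the commit message, and the names `FENCE`, `ACTION_PATTERN` and `extract_action` are hypothetical. The fence is built from single backticks so the example can sit inside a fenced block itself.

```python
import re
from typing import Optional

# Three backticks, assembled at runtime.
FENCE = "`" * 3

# Optional "json" language tag after the opening fence, then a lazy
# capture up to the next closing fence. DOTALL lets "." span newlines.
ACTION_PATTERN = re.compile(
    FENCE + r"(?:json)?\s*\n(.*?)" + FENCE, re.DOTALL
)


def extract_action(text: str) -> Optional[str]:
    """Return the body of the first fenced block, or None if absent."""
    match = ACTION_PATTERN.search(text)
    return match.group(1).strip() if match else None


tagged = "Action:\n" + FENCE + 'json\n{"action": "search"}\n' + FENCE
untagged = "Action:\n" + FENCE + '\n{"action": "search"}\n' + FENCE
```

Both `tagged` and `untagged` yield the same JSON string, which can then be passed to `json.loads`.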
Bagatur	16bd328aab	Use Embeddings in pinecone (#8982 ) cc @eyurtsev @olivier-lacroix @jamescalam redo of #2741	2023-08-10 14:22:41 -07:00
Piyush Jain	8eea46ed0e	Bedrock embeddings async methods (#9024 ) ## Description This PR adds the `aembed_query` and `aembed_documents` async methods for improving the embeddings generation for large documents. The implementation uses asyncio tasks and gather to achieve concurrency as there is no bedrock async API in boto3. ### Maintainers @agola11 @aarora79 ### Open questions To avoid throttling from the Bedrock API, should there be an option to limit the concurrency of the calls?	2023-08-10 14:21:03 -07:00
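The approach described in the commit above, concurrency through asyncio tasks and `gather` over a synchronous client, can be sketched roughly as below. This is not the PR's implementation: running the blocking call via `run_in_executor` is an assumption, `embed_one` stands in for the synchronous Bedrock call, and the `max_concurrency` semaphore is a hypothetical answer to the open question about throttling.

```python
import asyncio
from typing import Callable, List


async def aembed_documents(
    texts: List[str],
    embed_one: Callable[[str], List[float]],  # hypothetical blocking call
    max_concurrency: int = 4,                 # hypothetical throttle knob
) -> List[List[float]]:
    """Embed texts concurrently while capping in-flight requests."""
    loop = asyncio.get_running_loop()
    semaphore = asyncio.Semaphore(max_concurrency)

    async def embed(text: str) -> List[float]:
        async with semaphore:
            # Run the blocking client call in the default thread pool.
            return await loop.run_in_executor(None, embed_one, text)

    # gather preserves the order of its inputs in the result list.
    return await asyncio.gather(*(embed(t) for t in texts))


def fake_embed(text: str) -> List[float]:
    """Stand-in embedding used only for this example."""
    return [float(len(text))]


vectors = asyncio.run(aembed_documents(["a", "bb", "ccc"], fake_embed, 2))
```

Lowering `max_concurrency` trades throughput for fewer simultaneous calls against the remote API.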
Eugene Yurtsev	67ca187560	Fix incorrect code blocks in documentation (#9060 ) Fixes incorrect code block syntax in doc strings.	2023-08-10 14:13:42 -07:00
Eugene Yurtsev	46f3428cb3	Fix more incorrect code blocks in doc strings (#9073 ) Fix 2 more incorrect code blocks in strings	2023-08-10 13:49:15 -07:00
Eugene Yurtsev	a5a4c53280	RedisStore: Update init and Documentation updates (#9044 ) * Update Redis Store to support init from parameters * Update notebook to show how to use redis store, and some fixes in documentation	2023-08-10 15:30:29 -04:00
Leonid Ganeline	fcbbddedae	ArxivLoader fix for issue 9046 (#9061 ) Fixed #9046 Added ut-s for this fix. @eyurtsev	2023-08-10 14:59:39 -04:00
Mike Lambert	e94a5d753f	Move from test to supported claude-instant-1 model (#9066 ) Moves from "test" model to "claude-instant-1" model which is supported and has actual capacity	2023-08-10 11:57:28 -07:00
Eugene Yurtsev	b7bc8ec87f	Add excludes to FileSystemBlobLoader (#9064 ) Add option to specify exclude patterns. https://github.com/langchain-ai/langchain/discussions/9059	2023-08-10 14:56:58 -04:00
Eugene Yurtsev	6c70f491ba	ChatPromptTemplate pending deprecation proposal (#9004 ) Pending deprecations for ChatPromptTemplate proposals	2023-08-10 14:40:55 -04:00
TRY-ER	2431eca700	Agent vector store tool doc (#9029 ) I was initially confused whether to use create_vectorstore_agent or create_vectorstore_router_agent due to the lack of documentation, so I wrote simple documentation for each function describing its use case. - Description: Added docstrings to create_vectorstore_agent and create_vectorstore_router_agent to point out the difference in their use cases - Tag maintainer: @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-10 11:13:12 -07:00
Alvaro Bartolome	08a0741d82	Update `ArgillaCallbackHandler` as of latest `argilla` release (#9043 ) Hi @agola11, or whoever is reviewing this PR 😄 ## What's in this PR? As of the latest Argilla release, we'll change and refactor some things to make some workflows easier, one of those is how everything's pushed to Argilla, so that now there's no need to call `push_to_argilla` over a `FeedbackDataset` when either `push_to_argilla` is called for the first time, or `from_argilla` is called; among others. We also add some class variables to make sure those are easy to update in case we update those internally in the future, also to make the `warnings.warn` message lighter from the code view. P.S. Regarding the Twitter/X mention feel free to do so at either https://twitter.com/argilla_io or https://twitter.com/alvarobartt, or both if applicable, otherwise, just the first Twitter/X handle.	2023-08-10 10:59:46 -07:00
Blake (Yung Cher Ho)	8d351bfc20	Takeoff integration (#9045 ) ## Description: This PR adds the Titan Takeoff Server to the available LLMs in LangChain. Titan Takeoff is an inference server created by [TitanML](https://www.titanml.co/) that allows you to deploy large language models locally on your hardware in a single command. Most generative model architectures are included, such as Falcon, Llama 2, GPT2, T5 and many more. Read more about Titan Takeoff here: - [Blog](https://medium.com/@TitanML/introducing-titan-takeoff-6c30e55a8e1e) - [Docs](https://docs.titanml.co/docs/titan-takeoff/getting-started) #### Testing As Titan Takeoff runs locally on port 8000 by default, no network access is needed. Responses are mocked for testing. - [x] Make Lint - [x] Make Format - [x] Make Test #### Dependencies No new dependencies are introduced. However, users will need to install the titan-iris package in their local environment and start the Titan Takeoff inferencing server in order to use the Titan Takeoff integration. Thanks for your help and please let me know if you have any questions. cc: @hwchase17 @baskaryan	2023-08-10 10:56:06 -07:00
Nuno Campos	3bdc273ab3	Implement .transform() in RunnablePassthrough() (#9032 ) - This ensures passthrough doesnt break streaming --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-10 10:41:19 -07:00
Bagatur	206f809366	fix sched ci (more) (#9056 )	2023-08-10 10:39:29 -07:00

677 Commits