langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-08 07:10:35 +00:00

Author	SHA1	Message	Date
Dídac Sabatés	e0cb3ea90c	Fix sql_database.ipynb link (#6525 ) Looks like the [SQLDatabaseChain](https://langchain.readthedocs.io/en/latest/modules/chains/examples/sqlite.html) link in the SQL Database Agent page was broken. I've changed it to the SQL Chain page	2023-07-06 13:07:37 -04:00
Leonid Ganeline	4450791edd	docs: tutorials update (#7230 ) updated `tutorials.mdx`: - added a link to new `Deeplearning AI` course on LangChain - added links to other tutorial videos - fixed format @baskaryan, @hwchase17	2023-07-06 12:44:23 -04:00
Diego Machado	a7ae35fe4e	Fix duplicated sentence in documentation's introduction (#6351 ) Fix duplicated sentence in documentation's introduction	2023-07-06 12:12:18 -04:00
Bagatur	681f2678a3	add elasticknn to init (#7284 )	2023-07-06 11:58:24 -04:00
hayao-k	c23e16c459	docs: Fixed typos in Amazon Kendra Retriever documentation (#7261 ) ## Description Fixed to the official service name Amazon Kendra. ## Tag maintainer @baskaryan	2023-07-06 11:56:52 -04:00
zhujiangwei	8c371e12eb	refactor BedrockEmbeddings class (#7266 ) #### Description refactor BedrockEmbeddings class to clean code as below: 1. inline content type and accept 2. rewrite input_body as a dictionary literal 3. no need to declare embeddings variable, so remove it	2023-07-06 11:56:30 -04:00
Chui	c7cf11b8ab	Remove whitespace in filename (#7264 )	2023-07-06 11:55:42 -04:00
Jan Kubica	fed64ae060	Chroma: add vector search with scores (#6864 ) - Description: Adding to Chroma integration the option to run a similarity search by a vector with relevance scores. Fixing two minor typos. - Issue: The "lambda_mult" typo is related to #4861 - Maintainer: @rlancemartin, @eyurtsev	2023-07-06 10:01:55 -04:00
William FH	576880abc5	Re-use Trajectory Evaluator (#7248 ) Use the trajectory eval chain in the run evaluation implementation and update the prepare inputs method to apply to both async and sync	2023-07-06 07:00:24 -07:00
zhaoshengbo	e8f24164f0	Improve the alibaba cloud opensearch vector store documentation (#6964 ) Based on user feedback, we have improved the Alibaba Cloud OpenSearch vector store documentation. Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>	2023-07-06 09:47:49 -04:00
Eduard van Valkenburg	ae5aa496ee	PowerBI updates (#7143 ) Several updates for the PowerBI tools: - Handle 0 records returned by requesting redo with different filtering - Handle too large results by optionally tokenizing the result and comparing against a max (change in signature, non-breaking) - Implemented LLMChain with Chat for chat models for the tools. - Updates to the main prompt including tables - Update to Tool prompt with TOPN function - Split the tool prompt to allow the LLMChain with ChatPromptTemplate Smaller fixes for stability. For visibility: @hinthornw	2023-07-06 09:39:23 -04:00
emarco177	b9d6d4cd4c	added template repo for CI/CD deployment on Google Cloud Run (#7218 ) - Description: added documentation for a template repo that helps dockerizing and deploying a LangChain using a Cloud Build CI/CD pipeline to Google Cloud build serverless - Issue: None, - Dependencies: None, - Tag maintainer: @baskaryan, - Twitter handle: EdenEmarco177	2023-07-06 09:38:38 -04:00
Leonid Kuligin	8b19f6a0da	Added retries for Vertex LLM (#7219 ) #7217 --------- Co-authored-by: Leonid Kuligin <kuligin@google.com>	2023-07-06 09:38:01 -04:00
William FH	ec66d5188c	Add Better Errors for Comparison Chain (#7033 ) + change to ABC - this lets us add things like the evaluation name for loading	2023-07-06 06:37:04 -07:00
Stefano Lottini	e61cfb6e99	FLARE Example notebook: switch to named arg to pass pydantic validation (#7267 ) Adding the name of the parameter to comply with latest requirements by Pydantic usage for BaseModels.	2023-07-06 09:32:00 -04:00
Sasmitha Manathunga	0c7a5cb206	Fix inconsistent behavior of `CharacterTextSplitter` when changing `keep_separator` (#7263 ) - Description: - When `keep_separator` is `True` the `_split_text_with_regex()` method in `text_splitter` uses regex to split, but when `keep_separator` is `False` it uses `str.split()`. This causes problems when the separator is a special regex character like `.` or `*`. This PR fixes that by using `re.split()` in both cases. - Issue: #7262 - Tag maintainer: @baskaryan	2023-07-06 09:30:03 -04:00
os1ma	b151d4257a	docs: Update documentation for Wikipedia tool to use WikipediaQueryRun (#7258 ) Description In the following page, "Wikipedia" tool is explained. https://python.langchain.com/docs/modules/agents/tools/integrations/wikipedia However, the WikipediaAPIWrapper being used is not a tool. This PR updated the documentation to use a tool WikipediaQueryRun. Issue None Tag maintainer Agents / Tools / Toolkits: @hinthornw	2023-07-06 09:29:38 -04:00
Jeroen Van Goey	887bb12287	Use correct Language for html_splitter (#7274 ) `html_splitter` was using `Language.MARKDOWN`.	2023-07-06 09:24:25 -04:00
Shantanu Nair	f773c21723	Update supabase match_docs ddl and notebook to use expected id type (#7257 ) - Description: Switch supabase match function DDL to use expected uuid type instead of bigint - Issue: https://github.com/hwchase17/langchain/issues/6743, https://github.com/hwchase17/langchain/issues/7179 - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: https://twitter.com/ShantanuNair	2023-07-06 09:22:41 -04:00
Myeongseop Kim	0e878ccc2d	Add HumanInputChatModel (#7256 ) - Description: This is a chat model equivalent of HumanInputLLM. An example notebook is also added. - Tag maintainer: @hwchase17, @baskaryan - Twitter handle: N/A	2023-07-06 09:21:03 -04:00
Myeongseop Kim	57d8a3d1e8	Make tqdm for OpenAIEmbeddings optional (#7247 ) - Description: I have added a `show_progress_bar` parameter (defaults to `False`) to the `OpenAIEmbeddings`. If the user sets `show_progress_bar` to `True`, a progress bar will be displayed. - Issue: #7246 - Dependencies: N/A - Tag maintainer: @hwchase17, @baskaryan - Twitter handle: N/A	2023-07-05 23:36:01 -04:00
Harrison Chase	c36f852846	fix conversational retrieval docs (#7245 )	2023-07-05 21:51:33 -04:00
Harrison Chase	035ad33a5b	bump ver to 225 (#7244 )	2023-07-05 21:22:18 -04:00
Shantanu Nair	cabd358c3a	Add missing token_max in reduce.py acombine_docs (#7241 ) - Description: reduce.py reduce chain implementation's acombine_docs call does not propagate token_max. Without this, the async call will end up using 3000 tokens, the default, for the collapse chain. - Tag maintainer: @hwchase17 @agola11 @baskaryan - Twitter handle: https://twitter.com/ShantanuNair Related PR: https://github.com/hwchase17/langchain/pull/7201 and https://github.com/hwchase17/langchain/pull/7204 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 21:02:45 -04:00
Harrison Chase	52b016920c	Harrison/update anthropic (#7237 ) Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>	2023-07-05 21:02:35 -04:00
Harrison Chase	695e7027e6	Harrison/parameter (#7081 ) add parameter to use original question or not --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-05 20:51:25 -04:00
Yevgnen	930e319ca7	Add concurrency to GitbookLoader (#7069 ) - Description: Fetch all pages concurrently. - Dependencies: `scrape_all` -> `fetch_all` -> `_fetch_with_rate_limit` -> `_fetch` (might be broken currently: https://github.com/hwchase17/langchain/pull/6519) - Tag maintainer: @rlancemartin, @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 20:51:10 -04:00
Hashem Alsaket	6aa66fd2b0	Update Hugging Face Hub notebook (#7236 ) Description: `flan-t5-xl` hangs, updated to `flan-t5-xxl`. Tested all stabilityai LLMs- all hang so removed from tutorial. Temperature > 0 to prevent unintended determinism. Issue: #3275 Tag maintainer: @baskaryan	2023-07-05 20:45:02 -04:00
Mykola Zomchak	8afc8e6f5d	Fix web_base.py (#6519 ) Fix for bug in SitemapLoader `aiohttp` `get` does not accept `verify` argument, and currently throws error, so SitemapLoader is not working This PR fixes it by removing `verify` param for `get` function call Fixes #6107 #### Who can review? Tag maintainers/contributors who might be interested: @eyurtsev --------- Co-authored-by: techcenary <127699216+techcenary@users.noreply.github.com>	2023-07-05 16:53:57 -07:00
William FH	f891f7d69f	Skip evaluation of unfinished runs (#7235 ) Cut down on errors logged Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>	2023-07-05 16:35:20 -07:00
William FH	83cf01683e	Add 'eval' tag (#7209 ) Add an "eval" tag to traced evaluation runs Most of this PR is actually https://github.com/hwchase17/langchain/pull/7207 but I can't diff off two separate PRs --------- Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>	2023-07-05 16:28:34 -07:00
William FH	607708a411	Add tags support for langchaintracer (#7207 )	2023-07-05 16:19:04 -07:00
William FH	75aa408f10	Send evaluator logs to new session (#7206 ) Also stop specifying "eval" mode since explicit project modes are deprecated	2023-07-05 16:15:29 -07:00
Harrison Chase	0dc700eebf	Harrison/scene xplain (#7228 ) Co-authored-by: Kevin Pham <37129444+deoxykev@users.noreply.github.com>	2023-07-05 18:34:50 -04:00
Harrison Chase	d6541da161	remove arize nb (#7238 ) was causing some issues with docs build	2023-07-05 18:34:20 -04:00
Mike Nitsenko	d669b9ece9	Document loader for Cube Semantic Layer (#6882 ) ### Description This pull request introduces the "Cube Semantic Layer" document loader, which demonstrates the retrieval of Cube's data model metadata in a format suitable for passing to LLMs as embeddings. This enhancement aims to provide contextual information and improve the understanding of data. Twitter handle: @the_cube_dev --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-07-05 15:18:12 -07:00
Tom	e533da8bf2	Adding Marqo to vectorstore ecosystem (#7068 ) This PR brings in a vectorstore interface for [Marqo](https://www.marqo.ai/). The Marqo vectorstore exposes some of Marqo's functionality in addition to the VectorStore base class. The Marqo vectorstore also makes the embedding parameter optional because inference for embeddings is an inherent part of Marqo. Docs, notebook examples and integration tests included. Related PR: https://github.com/hwchase17/langchain/pull/2807 --------- Co-authored-by: Tom Hamer <tom@marqo.ai> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 14:44:12 -07:00
Filip Haltmayer	836d2009cb	Update milvus and zilliz docstring (#7216 ) Description: Updating the docstrings for Milvus and Zilliz so that they appear correctly on https://integrations.langchain.com/vectorstores. No changes done to code. Maintainer: @baskaryan Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>	2023-07-05 17:03:51 -04:00
Matt Robinson	d65b1951bd	docs: update docs strings for base unstructured loaders (#7222 ) ### Summary Updates the docstrings for the unstructured base loaders so more useful information appears on the integrations page. If these look good, will add similar docstrings to the other loaders. ### Reviewers - @rlancemartin - @eyurtsev - @hwchase17	2023-07-05 17:02:26 -04:00
Mike Salvatore	265f05b10e	Enable InMemoryDocstore to be constructed without providing a dict (#6976 ) - Description: Allow `InMemoryDocstore` to be created without passing a dict to the constructor; the constructor can create a dict at runtime if one isn't provided. - Tag maintainer: @dev2049	2023-07-05 16:56:31 -04:00
Harrison Chase	47e7d09dff	fix arize nb (#7227 )	2023-07-05 16:55:48 -04:00
Feras Almannaa	79b59a8e06	optimize pgvector `add_texts` (#7185 ) - Description: At the moment, inserting new embeddings to pgvector is querying all embeddings every time as the defined `embeddings` relationship is using the default params, which sets `lazy="select"`. This change drastically improves the performance and adds a few additional cleanups: * remove `collection.embeddings.append` as it was querying all embeddings on insert, replace with `collection_id` param * centralize storing logic in add_embeddings function to reduce duplication * remove boilerplate - Issue: No issue was opened. - Dependencies: None. - Tag maintainer: this is a vectorstore update, so I think @rlancemartin, @eyurtsev - Twitter handle: @falmannaa	2023-07-05 13:19:42 -07:00
Harrison Chase	6711854e30	Harrison/dataforseo (#7214 ) Co-authored-by: Alexander <sune357@gmail.com>	2023-07-05 16:02:02 -04:00
Richy Wang	cab7d86f23	Implement delete interface of vector store on AnalyticDB (#7170 ) Hi there. This pull request contains two commits: 1. Implement delete interface with optional ids parameter on AnalyticDB. 2. Allow customization of database connection behavior by exposing engine_args parameter in interfaces. - This commit adds the `engine_args` parameter to the interfaces, allowing users to customize the behavior of the database connection. The `engine_args` parameter accepts a dictionary of additional arguments that will be passed to the create_engine function. Users can now modify various aspects of the database connection, such as connection pool size and recycle time. This enhancement provides more flexibility and control to users when interacting with the database through the exposed interfaces. This commit is related to VectorStores @rlancemartin @eyurtsev Thank you for your attention and consideration.	2023-07-05 13:01:00 -07:00
Mike Salvatore	3ae11b7582	Handle kwargs in FAISS.load_local() (#6987 ) - Description: This allows parameters such as `relevance_score_fn` to be passed to the `FAISS` constructor via the `load_local()` class method. - Tag maintainer: @rlancemartin @eyurtsev	2023-07-05 15:56:40 -04:00
Jamal	a2f191a322	Replace JIRA Arbitrary Code Execution vulnerability with finer grain API wrapper (#6992 ) This fixes #4833 and the critical vulnerability https://nvd.nist.gov/vuln/detail/CVE-2023-34540 Previously, the JIRA API Wrapper had a mode that simply pipelined user input into an `exec()` function. [The intended use of the 'other' mode is to cover any of Atlassian's API that don't have an existing interface](`cc33bde74f/langchain/tools/jira/prompt.py (L24)`) Fortunately all of the [Atlassian JIRA API methods are subfunctions of their `Jira` class](https://atlassian-python-api.readthedocs.io/jira.html), so this implementation calls these subfunctions directly. As well as passing a string representation of the function to call, the implementation flexibly allows for optionally passing args and/or keyword-args. These are given as part of the dictionary input. Example: ``` { "function": "update_issue_field", #function to execute "args": [ #list of ordered args similar to other examples in this JiraAPIWrapper "key", {"summary": "New summary"} ], "kwargs": {} #dict of key value keyword-args pairs } ``` the above is equivalent to `self.jira.update_issue_field("key", {"summary": "New summary"})` Alternate query schema designs are welcome to make querying easier without passing and evaluating arbitrary python code. I considered parsing (without evaluating) input python code and extracting the function, args, and kwargs from there and then pipelining them into the callable function via `f(*args, **kwargs)` - but this seemed more direct. @vowelparrot @dev2049 --------- Co-authored-by: Jamal Rahman <jamal.rahman@builder.ai>	2023-07-05 15:56:01 -04:00
Hakan Tekgul	61938a02a1	Create arize_llm_observability.ipynb (#7000 ) Adding documentation and notebook for Arize callback handler. - @dev2049 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11	2023-07-05 15:55:47 -04:00
Leonid Ganeline	ecee4d6e92	docs: update `youtube` videos and tutorials (#6515 ) added tutorials.mdx; updated youtube.mdx Rationale: the Tutorials section in the documentation is top-priority. (for example, https://pytorch.org/docs/stable/index.html) Not every project has resources to make tutorials. We have such a privilege. Community experts created several tutorials on YouTube. But the tutorial links are now hidden on the YouTube page and not easily discovered by first-time visitors. - Added new videos and tutorials that were created since the last update. - Made some reprioritization between videos on the base of the view numbers. #### Who can review? - @hwchase17 - @dev2049	2023-07-05 12:50:31 -07:00
Santiago Delgado	fa55c5a16b	Fixed Office365 tool __init__.py files, tests, and get_tools() function (#7046 ) ## Description Added Office365 tool modules to `__init__.py` files ## Issue As described in Issue https://github.com/hwchase17/langchain/issues/6936, the Office365 toolkit can't be loaded easily because it is not included in the `__init__.py` files. ## Reviewer @dev2049	2023-07-05 15:46:21 -04:00
wewebber-merlin	8a7c95e555	Retryable exception for empty OpenAI embedding. (#7070 ) Description: The OpenAI "embeddings" API intermittently falls into a failure state where an embedding is returned as [ Nan ], rather than the expected 1536 floats. This patch checks for that state (specifically, for an embedding of length 1) and if it occurs, throws an ApiError, which will cause the chunk to be retried. Issue: I have been unable to find an official langchain issue for this problem, but it is discussed (by another user) at https://stackoverflow.com/questions/76469415/getting-embeddings-of-length-1-from-langchain-openaiembeddings Maintainer: @dev2049 Testing: Since this is an intermittent OpenAI issue, I have not provided a unit or integration test. The provided code has, though, been run successfully over several million tokens. --------- Co-authored-by: William Webber <william@williamwebber.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 15:23:45 -04:00
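
Note on commit a2f191a322 (JIRA wrapper): the dictionary-based dispatch described in that message can be sketched roughly as below. This is a minimal illustration, not the code from the PR. `FakeJira` and `run_other` are hypothetical names standing in for the Atlassian `Jira` client and the wrapper's 'other' mode, and the check that rejects names starting with an underscore is an added assumption that the commit message does not mention.

```python
from typing import Any, Dict


class FakeJira:
    """Hypothetical stand-in for the Atlassian `Jira` client (not the real library)."""

    def update_issue_field(self, key: str, fields: Dict[str, Any]) -> str:
        return f"updated {key} with {fields}"


def run_other(client: Any, query: Dict[str, Any]) -> Any:
    """Look up the named method on the client and call it; no input is exec()'d."""
    name = query["function"]
    # Assumption for this sketch: refuse private/dunder attributes.
    if name.startswith("_"):
        raise ValueError(f"refusing to call private attribute: {name}")
    method = getattr(client, name, None)
    if not callable(method):
        raise ValueError(f"unknown function: {name}")
    # Positional and keyword arguments come from the dictionary input.
    return method(*query.get("args", []), **query.get("kwargs", {}))


# Same shape as the example in the commit message.
query = {
    "function": "update_issue_field",
    "args": ["key", {"summary": "New summary"}],
    "kwargs": {},
}
result = run_other(FakeJira(), query)
```

Here `run_other(FakeJira(), query)` makes the same call as `FakeJira().update_issue_field("key", {"summary": "New summary"})`, mirroring the equivalence stated in the commit message.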


2990 Commits