langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-08 07:10:35 +00:00

Author	SHA1	Message	Date
Eugene Yurtsev	d8fa94e6fa	RunnablePassthrough: In code documentation (#11552 ) Add in code documentation for a runnable passthrough	2023-10-09 16:02:16 -04:00
Eugene Yurtsev	b42f218cfc	RunnableLambda: Add in code docs (#11521 ) Add in code docs for Runnable Lambda	2023-10-09 14:37:46 -04:00
maks-operlejn-ds	f64522fbaf	Reset deanonymizer mapping (#11559 ) @hwchase17 @baskaryan	2023-10-09 11:11:05 -07:00
maks-operlejn-ds	b14b65d62a	Support all presidio entities (#11558 ) https://microsoft.github.io/presidio/supported_entities/ @baskaryan @hwchase17	2023-10-09 11:10:46 -07:00
maks-operlejn-ds	4d62def9ff	Better deanonymizer matching strategy (#11557 ) @baskaryan, @hwchase17	2023-10-09 11:10:29 -07:00
Ash Vardanian	a992b9670d	Fix: Missing DuckDuckGo package version (#11535 ) [The `duckduckgo-search` v3.9.2 was removed from PyPi](https://pypi.org/project/duckduckgo-search/#history). That breaks the build. - Description: refreshes the Poetry dependency to v3.9.3 - Tag maintainer: @baskaryan - Twitter handle: @ashvardanian	2023-10-09 10:55:46 -07:00
Bagatur	8932ed3f07	bump 311 (#11555 )	2023-10-09 08:17:07 -07:00
Bagatur	e7a0def1bc	QoL improvements to query constructor (#11504 ) updating query constructor and self query retriever to - make it easier to pass in examples - validate attributes used in query - remove invalid parts of query - make it easier to get + edit prompt - make query constructor a runnable - make self query retriever use as runnable	2023-10-09 08:10:52 -07:00
Taikono-Himazin	eec53fa294	Added autodetect_encoding option to csvLoader (#11327 )	2023-10-09 08:06:43 -07:00
Holt Skinner	09c66fe04f	feat: Update Google Document AI Parser (#11413 ) - Description: Code Refactoring, Documentation Improvements for Google Document AI PDF Parser - Adds Online (synchronous) processing option. - Adds default field mask to limit payload size. - Skips Human review by default. - Issue: Fixes #10589 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-09 08:04:25 -07:00
Nuno Campos	628cc4cce8	Rename RunnableMap to RunnableParallel (#11487 ) - keep alias for RunnableMap - update docs to use RunnableParallel and RunnablePassthrough.assign	2023-10-09 11:22:03 +01:00
Eugene Yurtsev	6a10e8ef31	Add documentation to Runnable (#11516 )	2023-10-08 08:09:04 +01:00
William FH	eb572f41a6	Add LangSmith Run Chat Loader (#11458 )	2023-10-06 17:02:18 -07:00
David Duong	484947c492	Fetch up-to-date attributes for env-pulled kwargs during serialisation of OpenAI classes (#11499 )	2023-10-06 22:43:29 +01:00
Bagatur	5470e730d2	raise openapi import error (#11495 )	2023-10-06 12:57:24 -07:00
Erick Friis	29f5f70415	Rename some last hwchase17/langchain links (#11494 )	2023-10-06 12:34:30 -07:00
Fabrice Pont	872836c541	feat: add markdown list parser (#11411 ) Description: add `MarkdownListOutputParser` as a new `ListOutputParser` Issue: #11410	2023-10-06 12:25:45 -07:00
Erick Friis	8f50b616c5	Remove optional from vectara source (#11493 ) fyi @ofermend --------- Co-authored-by: Ofer Mendelevitch <ofer@vectara.com> Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>	2023-10-06 12:12:44 -07:00
Bagatur	53887242a1	bump 310 (#11486 )	2023-10-06 09:49:10 -07:00
Jesús Vélez Santiago	a1c7532298	Add async sql record manager and async indexing API (#10726 ) - Description: Add support for a SQLRecordManager in async environments. It includes the creation of `RecorManagerAsync` abstract class. - Issue: None - Dependencies: Optional `aiosqlite`. - Tag maintainer: @nfcampos - Twitter handle: @jvelezmagic --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-06 09:38:44 -04:00
Qihui Xie	57ade13b2b	fix llm_inputs duplication problem in intermediate_steps in SQLDatabaseChain (#10279 ) Use `.copy()` to fix the bug that the first `llm_inputs` element is overwritten by the second `llm_inputs` element in `intermediate_steps`. *Problem description:* In [line 127]( `c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L127C17-L127C17)`), the `llm_inputs` of the sql generation step is appended as the first element of `intermediate_steps`: ``` intermediate_steps.append(llm_inputs) # input: sql generation ``` However, `llm_inputs` is a mutable dict, it is updated in [line 179](https://github.com/langchain-ai/langchain/blob/master/libs/experimental/langchain_experimental/sql/base.py#L179) for the final answer step: ``` llm_inputs["input"] = input_text ``` Then, the updated `llm_inputs` is appended as another element of `intermediate_steps` in [line 180](`c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L180)`): ``` intermediate_steps.append(llm_inputs) # input: final answer ``` As a result, the final `intermediate_steps` returned in [line 189](`c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L189C43-L189C43)`) actually contains two same `llm_inputs` elements, i.e., the `llm_inputs` for the sql generation step overwritten by the one for final answer step by mistake. Users are not able to get the actual `llm_inputs` for the sql generation step from `intermediate_steps` Simply calling `.copy()` when appending `llm_inputs` to `intermediate_steps` can solve this problem.	2023-10-05 21:32:08 -07:00
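The aliasing problem described in the commit above can be reproduced without LangChain. A minimal sketch, assuming a plain dict stands in for `llm_inputs` and a list for `intermediate_steps`; the function name `collect_steps` and the prompt strings are made up for illustration and are not taken from `SQLDatabaseChain`:

```python
def collect_steps(use_copy: bool) -> list:
    """Mimic appending the same mutable dict to a list of steps twice."""
    intermediate_steps = []
    llm_inputs = {"input": "How many users are there?\nSQLQuery:"}

    # Step 1: record the inputs used for SQL generation.
    intermediate_steps.append(llm_inputs.copy() if use_copy else llm_inputs)

    # Step 2: the same dict is mutated for the final-answer step.
    llm_inputs["input"] = (
        "How many users are there?\nSQLQuery: SELECT COUNT(*) FROM users\nAnswer:"
    )
    intermediate_steps.append(llm_inputs.copy() if use_copy else llm_inputs)
    return intermediate_steps


buggy = collect_steps(use_copy=False)
fixed = collect_steps(use_copy=True)

# Without .copy(), both entries are the same object, so the first is overwritten.
print(buggy[0] is buggy[1])
# With .copy(), the SQL-generation inputs are preserved as a separate snapshot.
print(fixed[0]["input"] != fixed[1]["input"])
```

A shallow `.copy()` is enough here because only a top-level key is reassigned; nested mutable values would need `copy.deepcopy`.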
Florian	d78f418c0d	Extract abstracts from Pubmed articles, even if they have no extra label (#10245 ) ### Description This pull request involves modifications to the extraction method for abstracts/summaries within the PubMed utility. A condition has been added to verify the presence of unlabeled abstracts. Now an abstract will be extracted even if it does not have a subtitle. In addition, the extraction of the abstract was extended to books. ### Issue The PubMed utility occasionally returns an empty result when extracting abstracts from articles, despite the presence of an abstract for the paper on PubMed. This issue arises due to the varying structure of articles; some articles follow a "subtitle/label: text" format, while others do not include subtitles in their abstracts. An example of the latter case can be found at: [https://pubmed.ncbi.nlm.nih.gov/37666905/](url) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:56:46 -07:00
Viktor Zhemchuzhnikov	fd9da60aea	Add async support to SelfQueryRetriever (#10175 ) ### Description SelfQueryRetriever is missing async support, so I am adding it. I also removed deprecated predict_and_parse method usage here, and added some tests. ### Issue N/A ### Tag maintainer Not yet ### Twitter handle N/A	2023-10-05 18:54:21 -07:00
Theron Tau	35297ca0d3	Add feature for extracting images from pdf and recognizing text from images. (#10653 ) Description: This addresses #10423: it is useful to be able to extract images from a PDF and recognize the text in them. I have implemented it with `PyPDFLoader`, `PyPDFium2Loader`, `PyPDFDirectoryLoader`, `PyMuPDFLoader`, `PDFMinerLoader`, and `PDFPlumberLoader`. [RapidOCR](https://github.com/RapidAI/RapidOCR.git) is used to recognize text in the extracted images. OCR is time-consuming, so a boolean parameter `extract_images` controls whether images are extracted and recognized. I measured the time taken by each parser with a unit test on my own laptop (ThinkBook 14+ with AMD R7-6800H); the results are: \| extract_images \| PyPDFParser \| PDFMinerParser \| PyMuPDFParser \| PyPDFium2Parser \| PDFPlumberParser \| \| ------------- \| ------------- \| ------------- \| ------------- \| ------------- \| ------------- \| \| False \| 0.27s \| 0.39s \| 0.06s \| 0.08s \| 1.01s \| \| True \| 17.01s \| 20.67s \| 20.32s \| 19.75s \| 20.55s \| Issue: #10423 Dependencies: rapidocr_onnxruntime in [RapidOCR](https://github.com/RapidAI/RapidOCR/tree/main) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:51:59 -07:00
Bagatur	8e3fbc97ca	Add vowpal_wabbit RL chain (#11462 )	2023-10-05 18:39:45 -07:00
Haris Wang	f1269830a0	Fix bug in MarkdownHeaderTextSplitter for codeblock (#10262 ) - Description: The previous version of the MarkdownHeaderTextSplitter did not take into account the possibility of '#' appearing within code blocks, which caused segmentation anomalies in these situations. This PR has fixed this issue. - Issue: - Dependencies: No - Tag maintainer: - Twitter handle: cc @baskaryan @eyurtsev @rlancemartin --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:34:42 -07:00
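The failure mode fixed in the commit above (a `#` inside a fenced code block being treated as a header) can be sketched in a few lines. This is an illustrative stand-alone splitter, not the actual `MarkdownHeaderTextSplitter` implementation; the function name `split_on_headers` and the sample document are assumptions:

```python
import re
from typing import List, Tuple

FENCE = "`" * 3  # built programmatically to avoid a literal fence in this example


def split_on_headers(markdown: str) -> List[Tuple[str, str]]:
    """Split markdown into (header, body) pairs, ignoring '#' inside fenced code."""
    sections: List[Tuple[str, List[str]]] = [("", [])]
    in_code_block = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            # Toggle on every opening or closing fence line.
            in_code_block = not in_code_block
        is_header = (not in_code_block) and re.match(r"#{1,6}\s+\S", stripped)
        if is_header:
            sections.append((stripped.lstrip("#").strip(), []))
        else:
            sections[-1][1].append(line)
    return [
        (header, "\n".join(body).strip())
        for header, body in sections
        if header or any(body)
    ]


doc = "\n".join(
    [
        "# Intro",
        "Some text.",
        FENCE + "python",
        "# a comment, not a header",
        "print('hi')",
        FENCE,
        "## Usage",
        "More text.",
    ]
)
sections = split_on_headers(doc)
print([header for header, _ in sections])
```

Tracking a single `in_code_block` flag is the simplest approach; it does not handle nested or tilde-style fences.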
Eddie Cohen	656d2303f7	add in, nin for pinecone (#10303 ) Description: Adds the in and nin comparators for pinecone seen [here](https://docs.pinecone.io/docs/metadata-filtering#metadata-query-language) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:31:09 -07:00
Bagatur	a3a2ce623e	Revise vowpal_wabbit notebook	2023-10-05 18:18:19 -07:00
Bagatur	8fafa1af91	merge	2023-10-05 18:09:35 -07:00
olgavrou	3b07c0cf3d	RL Chain with VowpalWabbit (#10242 ) - Description: This PR adds a new chain `rl_chain.PickBest` for learned prompt variable injection, detailed description and usage can be found in the example notebook added. It essentially adds a [VowpalWabbit](https://github.com/VowpalWabbit/vowpal_wabbit) layer before the llm call in order to learn or personalize prompt variable selections. Most of the code is to make the API simple and provide lots of defaults and data wrangling that is needed to use Vowpal Wabbit, so that the user of the chain doesn't have to worry about it. - Dependencies: [vowpal-wabbit-next](https://pypi.org/project/vowpal-wabbit-next/), - sentence-transformers (already a dep) - numpy (already a dep) - tagging @ataymano who contributed to this chain - Tag maintainer: @baskaryan - Twitter handle: @olgavrou Added example notebook and unit tests	2023-10-05 18:07:22 -07:00
Manikanta5112	56048b909f	added ContentFormatter escape special characters for message content (#10319 ) --------- Co-authored-by: Manikanta5112 <42089393+mani5112@users.noreply.github.com>	2023-10-05 18:02:29 -07:00
Leonid Ganeline	d17416ec79	docstrings `callbacks` (#11456 ) Added missed docstrings to the `callbacks/` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-05 17:13:14 -07:00
Ofer Mendelevitch	3c7653bf0f	"source" argument in constructor of Vectara (#11454 ) - Description: minor update to constructor to allow for specification of "source" - Tag maintainer: @baskaryan - Twitter handle: @ofermend	2023-10-05 17:04:14 -07:00
Eugene Yurtsev	d9018ae5f1	Improve CLI ux (#11452 ) Improve UX for cli	2023-10-05 19:40:00 -04:00
Jaikanth J	9f85f7c543	fix(cache): use dumps for RedisCache (#10408 ) # Description Attempts to fix RedisCache for ChatGenerations using `loads` and `dumps` used in SQLAlchemy cache by @hwchase17 . this is better than pickle dump, because this won't execute any arbitrary code during de-serialisation. # Issues #7722 & #8666 # Dependencies None, but removes the warning introduced in #8041 by @baskaryan Handle: @jaikanthjay46	2023-10-05 16:34:07 -07:00
rodrigo-clickup	5944c1851b	Add ClickUp Toolkit (#10662 ) - Description: Adds a toolkit to interact with the [ClickUp](https://clickup.com/) [Public API](https://clickup.com/api/) - Dependencies: None - Tag maintainer: @rodrigo-georgian, @rodrigo-clickup, @aiswaryasankarwork - Twitter handle: - Aiswarya (https://twitter.com/Aiswarya_Sankar, https://www.linkedin.com/in/sankaraiswarya/) - Rodrigo (https://www.linkedin.com/in/rodrigo-ceballos-lentini/) --------- Co-authored-by: Aiswarya Sankar <aiswaryasankar@Aiswaryas-MacBook-Pro.local> Co-authored-by: aiswaryasankarwork <143119412+aiswaryasankarwork@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 16:33:05 -07:00
John Reynolds	68901e1e40	Update output_parser.py (#10430 ) - Description: Updated output parser for mrkl to remove any hallucination actions after the final answer; this was encountered when using Anthropic claude v2 for planning; reopening PR with updated unit tests - Issue: #10278 - Dependencies: N/A - Twitter handle: @johnreynolds	2023-10-05 15:47:24 -07:00
Joshua Sundance Bailey	790010703b	ArcGISLoader: Limit number of results in query (#10615 ) Description: this PR changes the `ArcGISLoader` to set `return_all_records` to `False` when `result_record_count` is provided as a keyword argument. Previously, `return_all_records` was `True` by default and this made the API ignore `result_record_count`. Issue: `ArcGISLoader` would ignore `result_record_count` unless user also passed `return_all_records=False`.	2023-10-05 15:46:02 -07:00
mrbean	9903a70379	Add youdotcom retriever (#11304 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 13:48:11 -07:00
ashish-dahal	1655ff2ded	Fix PyMuPDFLoader kwargs (#11434 ) - Description: Fix the `PyMuPDFLoader` to accept `loader_kwargs` from the document loader's `loader_kwargs` option. This provides more flexibility in formatting the output from documents. - Issue: The `loader_kwargs` is not passed into the `load` method from the document loader, which limits configuration options. - Dependencies: None --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 13:25:19 -07:00
Leonid Kuligin	e4a46747dc	integration test for DocAI parser (#11424 ) - Description: added an integration test - Issue: #11407 @baskaryan	2023-10-05 12:38:29 -07:00
Aashish Saini	2abbdc6ecb	Update bageldb.py (#11421 ) I have restructured the code to ensure uniform handling of ImportError. In place of previously used ValueError, I've adopted the standard practice of raising ImportError with explanatory messages. This modification enhances code readability and clarifies that any problems stem from module importation.	2023-10-05 12:37:56 -07:00
maks-operlejn-ds	2aae1102b0	Instance anonymization (#10501 ) ### Description Add instance anonymization - if `John Doe` will appear twice in the text, it will be treated as the same entity. The difference between `PresidioAnonymizer` and `PresidioReversibleAnonymizer` is that only the second one has a built-in memory, so it will remember anonymization mapping for multiple texts: ``` >>> anonymizer = PresidioAnonymizer() >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Noah Rhodes. Hi Noah Rhodes!' >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Brett Russell. Hi Brett Russell!' ``` ``` >>> anonymizer = PresidioReversibleAnonymizer() >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Noah Rhodes. Hi Noah Rhodes!' >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Noah Rhodes. Hi Noah Rhodes!' ``` ### Twitter handle @deepsense_ai / @MaksOpp ### Tag maintainer @baskaryan @hwchase17 @hinthornw --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:23:02 -07:00
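The difference between the two behaviours described in the commit above (a fresh mapping per call versus a remembered mapping) can be illustrated without Presidio. A rough sketch, assuming a hypothetical `SketchAnonymizer` class that replaces a fixed list of known names with numbered placeholders; real entity detection and fake-name generation are out of scope here:

```python
import itertools
from typing import Dict, List


class SketchAnonymizer:
    """Toy anonymizer: the same entity always maps to the same placeholder within a call."""

    def __init__(self, entities: List[str], remember: bool) -> None:
        self._entities = entities
        self._remember = remember  # True mimics the "reversible" variant's memory
        self._mapping: Dict[str, str] = {}
        self._counter = itertools.count(1)

    def anonymize(self, text: str) -> str:
        # Without memory, start from an empty mapping on every call.
        mapping = self._mapping if self._remember else {}
        for entity in self._entities:
            if entity in text:
                if entity not in mapping:
                    mapping[entity] = f"PERSON_{next(self._counter)}"
                # str.replace substitutes every occurrence, so repeated
                # mentions are treated as the same instance.
                text = text.replace(entity, mapping[entity])
        return text


text = "My name is John Doe. Hi John Doe!"

forgetful = SketchAnonymizer(["John Doe"], remember=False)
first, second = forgetful.anonymize(text), forgetful.anonymize(text)

stateful = SketchAnonymizer(["John Doe"], remember=True)
third, fourth = stateful.anonymize(text), stateful.anonymize(text)

print(first, second, third, fourth, sep="\n")
```

Keeping the mapping on the instance is what makes de-anonymization possible later, since the placeholder-to-original association survives across calls.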
Kyle Pancamo	203258b4d6	Update pdf.py comment for PyPDFLoader (#10495 ) PyPDF does not chunk at the character level to my understanding. Description: PyPDF does not chunk at the character level, but instead breaks up content by page. Fixup comment --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:22:40 -07:00
Juan Daza	4236ae3851	Added Streaming Capability to SageMaker LLMs (#10535 ) This PR adds the ability to declare a Streaming response in the SageMaker LLM by leveraging the `invoke_endpoint_with_response_stream` capability in `boto3`. It is heavily based on the AWS Blog Post announcement linked [here](https://aws.amazon.com/blogs/machine-learning/elevating-the-generative-ai-experience-introducing-streaming-support-in-amazon-sagemaker-hosting/). It does not add any additional dependencies since it uses the existing `boto3` version. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:08:43 -07:00
Laurentiu Piciu	d9670a5945	openai_functions_multi_agent: solved the case when the "arguments" is valid JSON but it does not contain `actions` key (#10543 ) Description: There are cases when the output from the LLM comes fine (i.e. function_call["arguments"] is a valid JSON object), but it does not contain the key "actions". So I split the validation in 2 steps: loading arguments as JSON and then checking for "actions" in it. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:08:09 -07:00
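The two-step validation described in the commit above can be sketched as follows. The function name `parse_actions` and the `OutputParsingError` type are hypothetical stand-ins, not the names used in the agent's output parser:

```python
import json


class OutputParsingError(ValueError):
    """Hypothetical error type used only in this sketch."""


def parse_actions(function_call: dict) -> list:
    # Step 1: the arguments string must be valid JSON.
    try:
        arguments = json.loads(function_call["arguments"])
    except json.JSONDecodeError as exc:
        raise OutputParsingError(f"arguments is not valid JSON: {exc}") from exc

    # Step 2: valid JSON may still lack the "actions" key
    # (or not be an object at all, which raises TypeError on lookup).
    try:
        return arguments["actions"]
    except (KeyError, TypeError) as exc:
        raise OutputParsingError('arguments has no "actions" key') from exc


ok = parse_actions({"arguments": '{"actions": [{"tool": "search"}]}'})
print(ok)
```

Separating the two checks lets each failure produce its own error message instead of a single generic parsing error.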
Eugene Yurtsev	fcccde406d	Add SymbolicMathChain to experiment in preparation for deprecation (#11129 ) Move symbolic math chain to experimental	2023-10-05 13:54:43 -04:00
Holt Skinner	9f73fec057	fix: Update Google Cloud Enterprise Search to Vertex AI Search (#10513 ) - Description: Google Cloud Enterprise Search was renamed to Vertex AI Search - https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-search-and-conversation-is-now-generally-available - This PR updates the documentation and Retriever class to use the new terminology. - Changed retriever class from `GoogleCloudEnterpriseSearchRetriever` to `GoogleVertexAISearchRetriever` - Updated documentation to specify that `extractive_segments` requires the new [Enterprise edition](https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features#enterprise-features) to be enabled. - Fixed spelling errors in documentation. - Change parameter for Retriever from `search_engine_id` to `data_store_id` - When this retriever was originally implemented, there was no distinction between a data store and search engine, but now these have been split. - Fixed an issue blocking some users where the api_endpoint can't be set	2023-10-05 10:47:47 -07:00
Patrick Randell	1d678f805f	Additional Weaviate Filter Comparators (#10522 ) ### Description When using Weaviate Self-Retrievers, certain common filter comparators generated by user queries were unimplemented, resulting in errors. This PR implements some of them. All linting and format commands have been run and tests passed. ### Issue #10474 ### Dependencies timestamp module --------- Co-authored-by: Patrick Randell <prandell@deloitte.com.au>	2023-10-05 10:40:04 -07:00
Nuno Campos	79011f835f	Remove str() from RunnableConfigurableAlternatives (#11446 )	2023-10-05 18:40:00 +01:00
Harrison Chase	31d5bd84d7	make vectorstores optional (#11393 )	2023-10-05 10:14:05 -07:00
Eugene Yurtsev	8aa545901a	Update agent type docs (#11137 ) In code docs for agent types	2023-10-05 12:51:14 -04:00
Eugene Yurtsev	3e31d6e35f	Start deprecation of LLMBashChain (#11300 ) In preparation for migrating LLMBashChain and related tools, add a deprecation warning to the code.	2023-10-05 12:48:22 -04:00
Bagatur	8b6b8bf68c	bump 309 (#11443 )	2023-10-05 09:29:14 -07:00
billytrend-cohere	2ff91a46c0	Add cohere /chat integration (#11389 ) Add cohere /chat integration and an iPython notebook to demonstrate the addition. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 09:20:47 -07:00
adrienohana	ca346011b7	added interactive login for azure cognitive search vector store (#11360 ) Description: Previously, if access to Azure Cognitive Search was not done via an API key, the default credential was used, which does not allow interactive login. I simply added the option to use "INTERACTIVE" as a key name, which launches a login window upon initialization of the AzureSearch object.	2023-10-05 09:20:18 -07:00
Eugene Yurtsev	5a1f614175	Add docker compose to CLI (#11406 ) Add docker compose to cli	2023-10-05 15:58:56 +01:00
Predrag Gruevski	e2d6c41177	Upgrade langchain dependencies. (#11420 ) I was hoping this would pick up numpy 1.26, which is required to support the new Python 3.12 release, but it didn't. It seems that some transitive dependency requirement on numpy is preventing that, and the highest we can currently go is 1.24.x. But to find this out required a 15min `poetry lock`, so I figured we might as well upgrade the dependencies we can and hopefully make the next dependency upgrade a bit smaller.	2023-10-05 15:57:20 +01:00
Jacob Lee	71fd6428c5	Remove overridden async not implemented method on embeddings filters and add default async implementation for document compressors (#11415 ) @nfcampos @eyurtsev @baskaryan --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-10-05 15:56:03 +01:00
Nuno Campos	2f490be09b	Fix .dict() for agent/chain (#11436 )	2023-10-05 15:51:21 +01:00
Nuno Campos	1e59c44d36	Nc/5oct/runnable release (#11428 )	2023-10-05 14:27:50 +01:00
Bagatur	58b7a3ba16	Rm bedrock anthropic error (#11403 )	2023-10-04 23:31:51 -04:00
Predrag Gruevski	c9986bc3a9	Tweak type hints to match dependency's behavior. (#11355 ) Needs #11353 to merge first, and a new `langchain` to be published with those changes.	2023-10-04 22:36:58 -04:00
William FH	940b9ae30a	Normalize Option in Scoring Chain (#11412 )	2023-10-04 15:59:28 -07:00
Eugene Yurtsev	70be04a816	CLI: Readme update (#11404 ) Consolidating to a single README for now; it will be easier to maintain, and we can differentiate between poetry and pip later. Does not seem critical. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-04 16:25:37 -04:00
Nuno Campos	fde19c8667	Add CLI command to create a new project (#7837 ) First version of CLI command to create a new langchain project template Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-04 15:43:41 -04:00
mhwang-stripe	9cea796671	Make langchain compatible with SQLAlchemy<1.4.0 (#11390 ) ## Description Currently SQLAlchemy >=1.4.0 is a hard requirement. We are unable to run `from langchain.vectorstores import FAISS` with SQLAlchemy <1.4.0 due to top-level imports, even if we aren't even using parts of the library that use SQLAlchemy. See Testing section for repro. Let's make it so that langchain is still compatible with SQLAlchemy <1.4.0, especially if we aren't using parts of langchain that require it. The main conflict is that SQLAlchemy removed `declarative_base` from `sqlalchemy.ext.declarative` in 1.4.0 and moved it to `sqlalchemy.orm`. We can fix this by try-catching the import. This is the same fix as applied in https://github.com/langchain-ai/langchain/pull/883. (I see that there seems to be some refactoring going on about isolating dependencies, e.g. 
`c87e9fb2ce`, so if this issue will be eventually fixed by isolating imports in langchain.vectorstores that also works). ## Issue I can't find a matching issue. ## Dependencies No additional dependencies ## Maintainer @hwchase17 since you reviewed https://github.com/langchain-ai/langchain/pull/883 ## Testing I didn't add a test, but I manually tested this. 1. Current failure: ``` langchain==0.0.305 sqlalchemy==1.3.24 ``` ``` python python -i >>> from langchain.vectorstores import FAISS Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/pay/src/zoolander/vendor3/lib/python3.8/site-packages/langchain/vectorstores/__init__.py", line 58, in <module> from langchain.vectorstores.pgembedding import PGEmbedding File "/pay/src/zoolander/vendor3/lib/python3.8/site-packages/langchain/vectorstores/pgembedding.py", line 10, in <module> from sqlalchemy.orm import Session, declarative_base, relationship ImportError: cannot import name 'declarative_base' from 'sqlalchemy.orm' (/pay/src/zoolander/vendor3/lib/python3.8/site-packages/sqlalchemy/orm/__init__.py) ``` 2. This fix: ``` langchain==<this PR> sqlalchemy==1.3.24 ``` ``` python python -i >>> from langchain.vectorstores import FAISS <succeeds> ```	2023-10-04 15:41:20 -04:00
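The "try-catching the import" approach mentioned in the commit above is a general fallback pattern. A minimal sketch, assuming a hypothetical helper `import_first_available` (not part of LangChain or SQLAlchemy) and using standard-library modules as stand-ins so that it runs without SQLAlchemy installed:

```python
import importlib
from typing import Any, Sequence, Tuple


def import_first_available(candidates: Sequence[Tuple[str, str]]) -> Any:
    """Return the first attribute importable from a list of (module, name) pairs."""
    errors = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{module_name}.{attr_name}: {exc}")
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))


# Stand-in demonstration: the first location does not exist, the second does.
Mapping = import_first_available(
    [("collections", "NoSuchName"), ("collections.abc", "Mapping")]
)
print(Mapping)

# For the case in the commit, the candidates would be (requires SQLAlchemy):
#   [("sqlalchemy.orm", "declarative_base"),
#    ("sqlalchemy.ext.declarative", "declarative_base")]
```

In application code the same effect is usually written inline as a `try: ... except ImportError: ...` around the two `from ... import` statements; the helper only makes the order of preference explicit.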
Nuno Campos	4d66756d93	Improve output of Runnable.astream_log() (#11391 ) - Make logs a dictionary keyed by run name (and counter for repeats) - Ensure no output shows up in lc_serializable format - Fix up repr for RunLog and RunLogPatch	2023-10-04 20:16:37 +01:00
Lester Solbakken	a30f98f534	Add Vespa vector store (#11329 ) Addition of Vespa vector store integration including notebook showing its use. Maintainer: @lesters Twitter handle: LesterSolbakken	2023-10-04 14:59:11 -04:00
Nuno Campos	58a88f3911	Add optional input_types to prompt template (#11385 ) - default MessagesPlaceholder one to list of messages	2023-10-04 18:54:53 +01:00
Tomaz Bratanic	71290315cf	Add optional Cypher validation tool (#11078 ) LLMs have trouble consistently getting the relationship direction right. That's why I organized a competition to find the best and simplest way to fix it, based on the existing schema, as a post-processing step. https://github.com/tomasonjo/cypher-direction-competition I am adding the winner's code in this PR: https://github.com/sakusaku-rich/cypher-direction-competition	2023-10-04 12:54:37 -04:00
Bagatur	dd514c2781	bump 308 (#11383 )	2023-10-04 12:10:09 -04:00
Leonid Kuligin	4f4e0f38fc	a better error description when GCP project is not set (#11377 ) - Description: a little bit better error description - Issue: #10879	2023-10-04 11:57:47 -04:00
Nuno Campos	0d80226c64	Add _type to json functions output parser (#11381 )	2023-10-04 16:56:45 +01:00
Bagatur	106608bc89	add default async (#11141 )	2023-10-04 11:40:35 -04:00
Nuno Campos	b0893c7c6a	Use an enum for configurable_alternatives to make the generated json schema nicer (#11350 )	2023-10-04 11:32:41 -04:00
Bagatur	b499de2926	Anthropic system message fix (#11301 ) Removes human prompt prefix before system message for anthropic models Bedrock anthropic api enforces that Human and Assistant messages must be interleaved (cannot have same type twice in a row). We currently treat System Messages as human messages when converting messages -> string prompt. Our validation when using Bedrock/BedrockChat raises an error when this happens. For ChatAnthropic we don't validate this so no error is raised, but perhaps the behavior is still suboptimal	2023-10-04 11:32:24 -04:00
Massimiliano Angelino	2f83350eac	Feat bedrock cohere support (#11230 ) Description: Added support for Cohere command model via Bedrock. With this change it is now possible to use the `cohere.command-text-v14` model via Bedrock API. About Streaming: Cohere model outputs 2 additional chunks at the end of the text being generated via streaming: a chunk containing the text `<EOS_TOKEN>`, and a chunk indicating the end of the stream. In this implementation I chose to ignore both chunks. An alternative solution could be to replace `<EOS_TOKEN>` with `\n` Tests: manually tested that the new model works with both `llm.generate()` and `llm.stream()`. Tested with `temperature`, `p` and `stop` parameters. Issue: #11181 Dependencies: No new dependencies Tag maintainer: @baskaryan Twitter handle: mangelino --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-04 11:12:19 -04:00
Daniel Butler	939bceccb0	GitHubIssuesLoader Custom API URL Support (#11378 ) - Description: Adds support for custom API URL in the GitHubIssuesLoader. This allows it to be used with Github enterprise instances.	2023-10-04 10:17:46 -04:00
Bagatur	16a80779b9	bump 307 (#11380 )	2023-10-04 10:03:17 -04:00
mziru	9e3c1d4463	add HTMLHeaderTextSplitter (#11039 ) Description: Similar in concept to the `MarkdownHeaderTextSplitter`, the `HTMLHeaderTextSplitter` is a "structure-aware" chunker that splits text at the element level and adds metadata for each header "relevant" to any given chunk. It can return chunks element by element or combine elements with the same metadata, with the objectives of (a) keeping related text grouped (more or less) semantically and (b) preserving context-rich information encoded in document structures. It can be used with other text splitters as part of a chunking pipeline. Dependency: lxml python package Maintainer: @hwchase17 Twitter handle: @MartinZirulnik --------- Co-authored-by: PresidioVantage <github@presidiovantage.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-04 09:24:25 -04:00
Predrag Gruevski	289de601c8	Use parameterized queries to select SQL schemas. (#11356 )	2023-10-04 05:43:30 +01:00
Nuno Campos	b0097f8908	In ProgressBarCallback update the progress counter also when runs fin… (#11332 )	2023-10-04 05:04:59 +01:00
William FH	06f39be1c2	Wfh/eval max concurrency (#11368 )	2023-10-03 20:18:14 -07:00
Aashish Saini	4adb2b399d	Fixed exception type in py files (#11322 ) I've refactored the code to ensure that ImportError is consistently handled. Instead of using ValueError as before, I've now followed the standard practice of raising ImportError along with clear and informative error messages. This change enhances the code's clarity and explicitly signifies that any problems are associated with module imports.	2023-10-03 21:46:26 -04:00
니콜라스	c6d7124675	Add 'device' to GPT4All (#11216 ) Add device to GPT4All - Description: GPT4All now supports GPU. This commit adds the option to enable it. - Issue: It closes https://github.com/langchain-ai/langchain/issues/10486 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-03 17:37:30 -07:00
Harrison Chase	6e848b879a	add default for async (#11367 )	2023-10-03 17:28:14 -07:00
Fynn Flügge	0a4baca291	chore: add kotlin code splitter (#11364 ) - Description: Adds Kotlin language to `TextSplitter` --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-03 18:35:36 -04:00
Ofer Mendelevitch	b93a08079e	Updates to Vectara Implementation (#11366 ) - Description: updates to documentation and API headers - Tag maintainer: @baskarya - Twitter handle: @ofermend	2023-10-03 18:34:39 -04:00
Erick Friis	745e3e29da	add getattr case for llms.type_to_cls_dict (#11362 ) For external libraries that depend on `type_to_cls_dict`, adds a workaround to continue using the old format. Recommend people use `get_type_to_cls_dict()` instead and only resolve the imports when they're used.	2023-10-03 14:34:30 -07:00
Vicente Reyes	f3e13e7e5a	Use term keyword according to the official python doc glossary (#11338 ) - Description: use term keyword according to the official python doc glossary, see https://docs.python.org/3/glossary.html - Issue: not applicable - Dependencies: not applicable - Tag maintainer: @hwchase17 - Twitter handle: vreyespue	2023-10-03 12:56:08 -07:00
Predrag Gruevski	5d6b83d9cf	Make a copy of external data instead of mutating another object's attributes. (#11349 ) Fix for a bug surfaced as part of #11339. `mypy` caught this since the types didn't match up.	2023-10-03 15:27:51 -04:00
Predrag Gruevski	42d979efdd	Improve type hints and interface for SQL execution functionality. (#11353 ) The previous API of the `_execute()` function had a few rough edges that this PR addresses: - The `fetch` argument was type-hinted as being able to take any string, but any string other than `"all"` or `"one"` would `raise ValueError`. The new type hints explicitly declare that only those values are supported. - The return type was type-hinted as `Sequence` but using `fetch = "one"` would actually return a single result item. This was incorrectly suppressed using `# type: ignore`. We now always return a list. - Using `fetch = "one"` would return a single item if data was found, or an empty list if no data was found. This was confusing, and we now always return a list to simplify. - The return type was `Sequence[Any]` which was a bit difficult to use since it wasn't clear what one could do with the returned rows. I'm making the new type `Dict[str, Any]` that corresponds to the column names and their values in the query. I've updated the use of this method elsewhere in the file to match the new behavior.	2023-10-03 15:19:08 -04:00
Mohammad Mohtashim	3bddd708f7	Add memory to sql chain (#8597 ) continuation of PR #8550 @hwchase17 please see and merge. And also close the PR #8550. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-03 12:04:39 -07:00
Harrison Chase	feabf2e0d5	make llm imports optional (#11237 )	2023-10-03 09:14:15 -07:00
Harrison Chase	88bad37ec2	fix get_tool_return (#11346 )	2023-10-03 09:01:05 -07:00
Harrison Chase	bdf865d8e8	better error message on parsing errors (#11342 )	2023-10-03 09:00:17 -07:00
Eugene Yurtsev	2343302fc6	Remove langserve from langchain repo (#11288 ) LangServe has been moved to a separate repo	2023-10-03 10:48:35 -04:00
William FH	6950b44bfc	Consolidate run collector. Add link helper (#11269 ) Instead of: ``` client = Client() with collect_runs() as cb: chain.invoke() run = cb.traced_runs[0] client.get_run_url(run) ``` it's ``` with tracing_v2_enabled() as cb: chain.invoke() cb.get_run_url() ```	2023-10-03 06:20:58 -07:00
Nuno Campos	0aedbcf7b2	Pass kwargs in runnable retry (#11324 )	2023-10-03 09:55:02 +01:00


1360 Commits
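
Note on 42d979efdd (#11353): the message describes an execute helper whose `fetch` argument accepts only `"all"` or `"one"` and which always returns a list of column-name-to-value dicts, including an empty list when nothing matches. A minimal sketch of that contract follows. It is not the LangChain code: the function name `execute`, the use of the standard-library `sqlite3` module, and the `users` table are all assumptions made for illustration.

```python
import sqlite3
from typing import Any, Dict, List, Literal


def execute(
    conn: sqlite3.Connection,
    command: str,
    fetch: Literal["all", "one"] = "all",
) -> List[Dict[str, Any]]:
    """Hypothetical helper: run a query and always return a list of row dicts."""
    cursor = conn.execute(command)
    # cursor.description is None for statements that produce no result set.
    if cursor.description is None:
        return []
    columns = [col[0] for col in cursor.description]
    if fetch == "all":
        rows = cursor.fetchall()
    elif fetch == "one":
        first = cursor.fetchone()
        # Wrap the single row in a list so the return shape never changes.
        rows = [] if first is None else [first]
    else:
        raise ValueError("fetch must be 'all' or 'one'")
    return [dict(zip(columns, row)) for row in rows]


# Illustrative in-memory data (assumed, not from the commit).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ada'), (2, 'bob')")

all_rows = execute(conn, "SELECT id, name FROM users ORDER BY id")
one_row = execute(conn, "SELECT id, name FROM users ORDER BY id", fetch="one")
no_rows = execute(conn, "SELECT id, name FROM users WHERE id = 99", fetch="one")
```

Because every call returns a list, callers can iterate or index the result the same way whether `fetch` is `"all"` or `"one"`, which is the simplification the commit message argues for.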