langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-06 03:20:49 +00:00

Author	SHA1	Message	Date
Matvey Arye	11505f95d3	Improve handling of empty queries for timescale vector (#12393 ) Description: Improve handling of empty queries in timescale-vector. For timescale-vector it is more efficient to get a None embedding when the embedding has no semantic meaning. It allows timescale-vector to perform more optimizations. Thus, when the query is empty, use a None embedding. Also pass down constructor arguments to the timescale vector client. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-27 13:55:16 -07:00
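The empty-query handling described in #12393 can be sketched roughly as follows. This is a minimal illustration, not the actual timescale-vector integration code: `embed_query_or_none` and `fake_embed` are hypothetical names, and treating whitespace-only queries as empty is an assumption.

```python
from typing import Callable, List, Optional


def embed_query_or_none(
    query: str, embed: Callable[[str], List[float]]
) -> Optional[List[float]]:
    """Return None for an empty query, otherwise the query's embedding.

    Hypothetical helper: a None embedding signals to the vector store that
    the query carries no semantic meaning, so it can skip similarity ranking.
    """
    # Assumption: whitespace-only queries are treated as empty too.
    if not query or not query.strip():
        return None
    return embed(query)


def fake_embed(text: str) -> List[float]:
    """Stand-in embedding function used only for this example."""
    return [float(len(text))]


if __name__ == "__main__":
    print(embed_query_or_none("", fake_embed))
    print(embed_query_or_none("abc", fake_embed))
```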
Erick Friis	38cee5fae0	cli updates 2 (#12447 ) - extras group - readme - another readme --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-27 13:37:03 -07:00
William FH	5d40e36c75	Trace if run tree set (#12444 ) This code path is hit in the following case: - Start in langchain code and manually provide a tracer - Handoff to the traceable - Hand back to langchain code. Which happens for evaluating `@traceable` functions unfortunately	2023-10-27 12:29:18 -07:00
Bagatur	c2a0a6b6df	make doc utils public (#12394 )	2023-10-27 12:08:08 -07:00
Henter	d6888a90d0	Fix the missing temperature parameter for Baichuan-AI chat_model (#12420 ) Description: the missing `temperature` parameter for Baichuan-AI chat_model Baichuan-AI api doc: https://platform.baichuan-ai.com/docs/api	2023-10-27 12:07:21 -07:00
Erick Friis	6908634428	cli updates oct27 (#12436 )	2023-10-27 12:06:46 -07:00
HwangJohn	d38c8369b3	added rrf argument in ApproxRetrievalStrategy class __init__() (#11987 ) - Description: To handle hybrid search with RRF (Reciprocal Rank Fusion) in Elasticsearch, an rrf argument was added for adjusting 'rank_constant' and 'window_size' when combining multiple result sets with different relevance indicators into a single result set. (ref: https://www.elastic.co/kr/blog/whats-new-elastic-enterprise-search-8-9-0), - Dependencies: No dependencies changed, - Tag maintainer: @baskaryan, Nice to meet you, I'm new to contributing and this is my first PR. I only changed the langchain/vectorstores/elasticsearch.py file. After running format and lint, I got this message: ```shell make lint_diff ./scripts/check_pydantic.sh . ./scripts/check_imports.sh poetry run ruff . [ "langchain/vectorstores/elasticsearch.py" = "" ] || poetry run black langchain/vectorstores/elasticsearch.py --check All done! ✨ 🍰 ✨ 1 file would be left unchanged. [ "langchain/vectorstores/elasticsearch.py" = "" ] || poetry run mypy langchain/vectorstores/elasticsearch.py langchain/__init__.py: error: Source file found twice under different module names: "mvp.nlp.langchain.libs.langchain.langchain" and "langchain" Found 1 error in 1 file (errors prevented further checking) make: * [lint_diff] Error 2 ``` Thank you --------- Co-authored-by: 황중원 <jwhwang@amorepacific.com>	2023-10-27 11:53:19 -07:00
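For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows how the two parameters named in #11987 usually enter the computation: each document scores `1 / (rank_constant + rank)` in every result list where it appears within the first `window_size` hits, and the scores are summed. This is a standalone illustration, not Elasticsearch's or LangChain's code. The function name, the default values, and the alphabetical tie-break are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    result_lists: Sequence[Sequence[str]],
    rank_constant: int = 60,  # assumed default, for illustration only
    window_size: int = 100,   # assumed default, for illustration only
) -> List[str]:
    """Fuse several ranked lists of document ids into one ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for results in result_lists:
        # Only the top `window_size` hits of each list contribute.
        for rank, doc_id in enumerate(results[:window_size], start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first; ties broken alphabetically (an assumption).
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    keyword_hits = ["a", "b", "c"]
    vector_hits = ["b", "c", "d"]
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

A larger `rank_constant` flattens the difference between adjacent ranks, so documents that appear in several lists gain relative to a document that is first in only one.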
Roman Vasilyev	2c58dca5f0	optional reusable connection (#12051 ) My Postgres instance ran out of connections after continuous PGVector usage because PGVector constantly creates new connections. Adding an optional, reusable, pre-established connection appears to solve the issue. --------- Co-authored-by: Roman Vasilyev <rvasilyev@mozilla.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-27 11:52:42 -07:00
Ennio Pastore	48fde2004f	Update long_context_reorder.py (#12422 ) The function comment was confusing and inaccurate	2023-10-27 11:52:28 -07:00
Bagatur	a8c68d4ffa	Type LLMChain.llm as runnable (#12385 )	2023-10-27 11:52:01 -07:00
Bagatur	d12b88557a	Bagatur/bump 325 (#12440 )	2023-10-27 11:49:09 -07:00
Eugene Yurtsev	cadfce295f	Deprecate PythonRepl tools and Pandas/Xorbits/Spark DataFrame/Python/CSV agents (#12427 ) See discussion here: https://github.com/langchain-ai/langchain/discussions/11680 The code is available for usage from langchain_experimental. The reason for the deprecation is that the agents are relying on a Python REPL. The code can only be run safely with appropriate sandboxing. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-27 14:16:42 -04:00
Harrison Chase	0ca539eb85	Clean up deprecated agents and update __init__ in experimental (#12231 ) Update init paths in experimental	2023-10-27 13:52:50 -04:00
Holt Skinner	134f085824	feat: Add Google Speech to Text API Document Loader (#12298 ) - Add Document Loader for Google Speech to Text - Similar Structure to [Assembly AI Document Loader][1] [1]: https://python.langchain.com/docs/integrations/document_loaders/assemblyai	2023-10-27 09:34:26 -07:00
David Duong	52c194ec3a	Fix templates typos (#12428 )	2023-10-27 09:32:57 -07:00
Massimiliano Pronesti	c8195769f2	fix(openai-callback): completion count logic (#12383 ) The changes introduced in #12267 and #12190 broke the cost computation of the `completion` tokens for fine-tuned models because of the early return. This PR aims at fixing this. @baskaryan.	2023-10-27 09:08:54 -07:00
Stefan Langenbach	b22da81af8	Mask API key for Aleph Alpha LLM (#12377 ) - Description: Add masking of API Key for Aleph Alpha LLM when printed. - Issue: #12165 - Dependencies: None - Tag maintainer: @eyurtsev --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-27 11:32:43 -04:00
William FH	4254028c52	Str Evaluator Mapper (#12401 )	2023-10-26 21:38:47 -07:00
William FH	fcad1d2965	Add space (#12395 )	2023-10-26 20:32:23 -07:00
William FH	922d7910ef	Wfh/json schema evaluation (#12389 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2023-10-26 20:32:05 -07:00
Christian Kasim Loan	a35445c65f	johnsnowlabs embeddings support (#11271 ) - Description: Introducing the [JohnSnowLabsEmbeddings](https://www.johnsnowlabs.com/) - Dependencies: johnsnowlabs - Tag maintainer: @C-K-Loan - Twitter handle: https://twitter.com/JohnSnowLabs https://twitter.com/ChristianKasimL --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-26 20:22:50 -07:00
SteveLiao	c08b622b2d	Add HTML Title and Page Language into metadata for AsyncHtmlLoader (#11326 ) Description: Revise `libs/langchain/langchain/document_loaders/async_html.py` to store the HTML Title and Page Language in the `metadata` of `AsyncHtmlLoader`.	2023-10-26 20:22:31 -07:00
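The idea behind #11326, recording a page's title and language alongside its source, can be illustrated with the standard library alone. This sketch does not reproduce `AsyncHtmlLoader`: the `build_metadata` helper and the metadata key names (`source`, `title`, `language`) are assumptions made for the example.

```python
from html.parser import HTMLParser
from typing import Dict


class TitleAndLangParser(HTMLParser):
    """Collect the <title> text and the lang attribute of <html>."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.language = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "html":
            # attrs is a list of (name, value) pairs; value may be None.
            self.language = dict(attrs).get("lang") or ""
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def build_metadata(url: str, html: str) -> Dict[str, str]:
    """Hypothetical helper: build a metadata dict for one fetched page."""
    parser = TitleAndLangParser()
    parser.feed(html)
    return {
        "source": url,
        "title": parser.title.strip(),
        "language": parser.language,
    }


if __name__ == "__main__":
    page = '<html lang="en"><head><title> Hello </title></head><body>x</body></html>'
    print(build_metadata("https://example.com", page))
```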
Shorthills AI	25c98dbba9	Fixed some grammatical and Exception types issues (#12015 ) Fixed some grammatical issues and Exception types. @baskaryan , @eyurtsev --------- Co-authored-by: Sanskar Tanwar <142409040+SanskarTanwarShorthillsAI@users.noreply.github.com> Co-authored-by: UpneetShorthillsAI <144228282+UpneetShorthillsAI@users.noreply.github.com> Co-authored-by: HarshGuptaShorthillsAI <144897987+HarshGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: AdityaKalraShorthillsAI <143726711+AdityaKalraShorthillsAI@users.noreply.github.com> Co-authored-by: SakshiShorthillsAI <144228183+SakshiShorthillsAI@users.noreply.github.com>	2023-10-26 21:12:38 -04:00
William FH	923696b664	Wfh/json edit dist (#12361 ) Compare predicted json to reference. First canonicalize (sort keys, rm whitespace separators), then return normalized string edit distance. Not a silver bullet but maybe an easy way to capture structure differences in a less flakey way --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2023-10-26 18:10:28 -07:00
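The two steps named in #12361 (canonicalize, then take a normalized string edit distance) can be sketched as below. This is an independent illustration rather than the evaluator added by the PR. In particular, dividing by the length of the longer canonical string is an assumed normalization, and the function names are hypothetical.

```python
import json
from typing import Any


def canonicalize(value: Any) -> str:
    """Serialize with sorted keys and no whitespace in the separators."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def json_edit_distance(prediction: str, reference: str) -> float:
    """Return a distance in [0, 1] between two JSON strings."""
    pred = canonicalize(json.loads(prediction))
    ref = canonicalize(json.loads(reference))
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 0.0
    # Assumed normalization: divide by the longer canonical string.
    return levenshtein(pred, ref) / longest


if __name__ == "__main__":
    # Key order and spacing differ, but the canonical forms are identical.
    print(json_edit_distance('{"b": 1, "a": 2}', '{"a":2,"b":1}'))
    print(json_edit_distance('{"a": 1}', '{"a": 2}'))
```

Canonicalizing first means that differences in key order or spacing do not count against the prediction; only differences in structure or values remain.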
Erick Friis	4db8d82c55	CLI CI 2 (#12387 ) Will run all CI because of _test change, but future PRs against CLI will only trigger the new CLI one Has a bunch of file changes related to formatting/linting. No mypy yet - coming soon	2023-10-26 17:01:31 -07:00
Tyler Hutcherson	231d553824	Update broken redis tests (#12371 ) Update broken redis tests -- tiny PR :) - Description: Fixes Redis tests on master (look like it was broken by https://github.com/langchain-ai/langchain/pull/11257) - Issue: None, - Dependencies: No - Tag maintainer: @baskaryan @Spartee - Twitter handle: N/A Co-authored-by: Sam Partee <sam.partee@redis.com>	2023-10-26 16:13:14 -07:00
Erick Friis	03e79e62c2	cli fix (#12380 )	2023-10-26 15:29:49 -07:00
Bagatur	76230d2c08	fireworks scheduled integration tests (#12373 )	2023-10-26 14:24:42 -07:00
Josh Phillips	01c5cd365b	Fix SupabaseVectorStore write operation timeout (#12318 ) Description: This small change makes chunk_size a configurable parameter for loading documents into a Supabase database. Issue: https://github.com/langchain-ai/langchain/issues/11422 Dependencies: No changes Twitter: @ j1philli --------- Co-authored-by: Greg Richardson <greg.nmr@gmail.com>	2023-10-26 14:19:17 -07:00
Bagatur	b10cefb160	lint fix: rm init (#12374 )	2023-10-26 14:16:25 -07:00
Harrison Chase	b43996e553	Harrison/improve cli (#12368 )	2023-10-26 13:53:59 -07:00
Harrison Chase	9ce38726a2	fix some stuff (#12292 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-26 13:30:36 -07:00
Cynthia Yang	6ce276e099	Support Fireworks batching (#8 ) (#12052 ) Description * Add _generate and _agenerate to support Fireworks batching. * Add stop words test cases * Opt out retry mechanism Issue - Not applicable Dependencies - None Tag maintainer - @baskaryan	2023-10-26 16:01:08 -04:00
Tyler Hutcherson	2f0c9d8269	Fix redis vectorfield schema defaults (#12223 ) - Description: refactors the redis vector field schema to properly handle default values, includes a new unit test suite. - Issue: N/A - Dependencies: nothing new. - Tag maintainer: @baskaryan @Spartee - Twitter handle: this is a tiny fix/improvement :) This issue was causing problems for some clients/customers when building a vector index on Redis on smaller db instances (due to faulty default values in the index configuration). It would raise an error like: ```redis.exceptions.ResponseError: Vector index initial capacity 20000 exceeded server limit (852 with the given parameters)``` This PR will address this moving forward.	2023-10-26 12:17:58 -07:00
Jakub Novák	9544d64ad8	E2B tool - Improve description with uploaded files info (#12355 )	2023-10-26 11:44:24 -07:00
Bagatur	c6a733802b	bump 324 and 35 (#12352 )	2023-10-26 10:10:26 -07:00
Nuno Campos	683e97766d	Fix json key output parser in partial (streaming) mode (#12332 )	2023-10-26 17:45:04 +01:00
Nikhil Jha	dff24285ea	Comprehend Moderation 0.2 (#11730 ) This PR replaces the previous `Intent` check with the new `Prompt Safety` check. The logic and steps to enable chain moderation via the Amazon Comprehend service, allowing you to detect and redact PII, Toxic, and Prompt Safety information in the LLM prompt or answer remains unchanged. This implementation updates the code and configuration types with respect to `Prompt Safety`. ### Usage sample ```python from langchain_experimental.comprehend_moderation import (BaseModerationConfig, ModerationPromptSafetyConfig, ModerationPiiConfig, ModerationToxicityConfig ) pii_config = ModerationPiiConfig( labels=["SSN"], redact=True, mask_character="X" ) toxicity_config = ModerationToxicityConfig( threshold=0.5 ) prompt_safety_config = ModerationPromptSafetyConfig( threshold=0.5 ) moderation_config = BaseModerationConfig( filters=[pii_config, toxicity_config, prompt_safety_config] ) comp_moderation_with_config = AmazonComprehendModerationChain( moderation_config=moderation_config, #specify the configuration client=comprehend_client, #optionally pass the Boto3 Client verbose=True ) template = """Question: {question} Answer:""" prompt = PromptTemplate(template=template, input_variables=["question"]) responses = [ "Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.", "Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here." ] llm = FakeListLLM(responses=responses) llm_chain = LLMChain(prompt=prompt, llm=llm) chain = ( prompt \| comp_moderation_with_config \| {llm_chain.input_keys[0]: lambda x: x['output'] } \| llm_chain \| { "input": lambda x: x['text'] } \| comp_moderation_with_config ) try: response = chain.invoke({"question": "A sample SSN number looks like this 123-456-7890. 
Can you give me some more samples?"}) except Exception as e: print(str(e)) else: print(response['output']) ``` ### Output ```python > Entering new AmazonComprehendModerationChain chain... Running AmazonComprehendModerationChain... Running pii Validation... Running toxicity Validation... Running prompt safety Validation... > Finished chain. > Entering new AmazonComprehendModerationChain chain... Running AmazonComprehendModerationChain... Running pii Validation... Running toxicity Validation... Running prompt safety Validation... > Finished chain. Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like XXXXXXXXXXXX John Doe's phone number is (999)253-9876. ``` --------- Co-authored-by: Jha <nikjha@amazon.com> Co-authored-by: Anjan Biswas <anjanavb@amazon.com> Co-authored-by: Anjan Biswas <84933469+anjanvb@users.noreply.github.com>	2023-10-26 09:42:18 -07:00
Blake (Yung Cher Ho)	b9410f2b6f	Takeoff pro support (#12070 ) Description: This PR adds support for the [Pro version of Titan Takeoff Server](https://docs.titanml.co/docs/category/pro-features). Users of the Pro version will have to import the TitanTakeoffPro model, which is different from TitanTakeoff. Issue: Also minor fixes to docs for Titan Takeoff (Community version) Dependencies: No additional dependencies Twitter handle: @becoming_blake @baskaryan @hwchase17	2023-10-26 09:39:32 -07:00
Leonid Kuligin	4e47fe1dce	fixed error message and a check for processor name (#12200 ) - Description: a small fix to the error description and a check for the processor name - Issue: #11407	2023-10-26 09:38:25 -07:00
Nir Kopler	9298aff783	Finetuned openai azure models cost calculation (#12267 ) Description: Add cost calculation for fine tuned Azure with relevant unit tests. see https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/fine-tuning?tabs=turbo&pivots=programming-language-studio for more information. this PR is the result of this PR: https://github.com/langchain-ai/langchain/pull/12190 Twitter handle: @nirkopler	2023-10-26 09:38:10 -07:00
gnakw	20fe515f20	Fix the exception from langchain.utilities import ArceeWrapper (#12342 ) - Description: Fix the exception from langchain.utilities import ArceeWrapper	2023-10-26 09:19:43 -07:00
Qihui Xie	6720458c7d	add allowed_operators property in QdrantTranslator (#12328 ) - Description: This PR adds the `allowed_operators` property to `QdrantTranslator` to fix the `TypeError: can only join an iterable` bug. This property is required by `get_query_constructor_prompt` in `query_constructor\base.py`: ``` allowed_operators=" | ".join(allowed_operators), ``` - Issue: #12061 --------- Co-authored-by: XIE Qihui <qihui.xie@bopufund.com>	2023-10-26 09:18:29 -07:00
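The `TypeError` quoted in #12328 is what `str.join` raises when it receives `None` instead of an iterable, which is presumably what happened while the translator had no `allowed_operators` value. A minimal reproduction, using a hypothetical translator class and placeholder operator names rather than the real `QdrantTranslator`:

```python
from typing import Optional, Sequence


def format_operators(allowed_operators: Optional[Sequence[str]]) -> str:
    """Mirror the join expression quoted in the commit message."""
    return " | ".join(allowed_operators)


class HypotheticalTranslator:
    """Stand-in for a translator that declares its operators."""

    # Placeholder names; the real translator's operators may differ.
    allowed_operators = ["and", "or", "not"]


if __name__ == "__main__":
    # With the property defined, the join works.
    print(format_operators(HypotheticalTranslator.allowed_operators))
    # Without it (None), str.join raises TypeError.
    try:
        format_operators(None)
    except TypeError as exc:
        print(f"TypeError: {exc}")
```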
Bagatur	f5a57fc1ef	fix self query constructor (#12349 )	2023-10-26 09:18:15 -07:00
Vasek Mlejnsky	cdd75b687e	e2b tool - fix initialization and improve tool description (#12345 )	2023-10-26 08:47:50 -07:00
Harrison Chase	8ec7aade9f	add docs for templates (#12346 )	2023-10-26 08:28:01 -07:00
Erick Friis	ebf998acb6	Templates (#12294 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Jacob Lee <jacoblee93@gmail.com>	2023-10-25 18:47:42 -07:00
Erick Friis	43257a295c	CLI Git Improvements (#12311 ) - delete repo sources like pip - git dep fixes - error messaging	2023-10-25 18:30:02 -07:00
William FH	1d568e1add	Better wrap traceable (#12303 ) If user function is wrapped as a traceable function, this will help hand off the trace between the two. Also update handling fields to reflect optional values	2023-10-25 16:34:23 -07:00
Eugene Yurtsev	5a71b81609	Relax type annotation for custom input/output types (#12300 ) This is needed to be able to do stuff like: ```python runnable.with_types(input_type=List[str]) ```	2023-10-25 19:00:22 -04:00
William FH	988f6d9912	Rm langchain server (#12305 )	2023-10-25 15:26:46 -07:00
wemysschen	3f16acc538	Add baidu cloud vector search in vectorstore and fix some unit test in vectorstores (#11605 ) Description: Add baidu cloud vector search in vectorstore --------- Co-authored-by: root <root@icoding-cwx.bcc-szzj.baidu.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-25 13:44:19 -07:00
mrbean	b7e559c7e1	use snippet search optionally (#12236 ) Add an additional flag which allows for hitting our new endpoint.	2023-10-25 13:37:28 -07:00
felixocker	cce132d146	fix sparql queries for relations in schema description (#9136 ) - Description: Fix for the SPARQL QA chain: fixed SPARQL queries for retrieving information about relations in the graph to create a textual description of the schema for the language model. This should resolve #8907 - Issue: #8907 - Dependencies: None - Tag maintainer: @baskaryan, @hwchase17	2023-10-25 13:36:57 -07:00
Donato Azevedo	d9f1bcf366	Strips leading/trailing whitespace before parsing xml (#12297 ) Description: When LLMs output leading or trailing whitespace around XML (when using XMLOutputParser), the parser would raise a `ValueError: Could not parse output: ...`. However, leading and trailing whitespace is "ignorable" in the sense of the XML standard. Issue: I did not find a related issue. Dependencies: None Twitter handle: donatoaz Updated the unit test and ran `make docker_test`.	2023-10-25 13:34:58 -07:00
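The fix in #12297 amounts to stripping the model output before handing it to the XML parser. The sketch below shows the same idea with the standard library's `xml.etree.ElementTree`. It is not the `XMLOutputParser` implementation; `parse_xml_output` is a hypothetical helper that only borrows the error message wording from the commit.

```python
import xml.etree.ElementTree as ET


def parse_xml_output(text: str) -> ET.Element:
    """Parse LLM output as XML, ignoring surrounding whitespace."""
    try:
        # Strip first so stray newlines or spaces around the document
        # cannot interfere with parsing.
        return ET.fromstring(text.strip())
    except ET.ParseError as exc:
        raise ValueError(f"Could not parse output: {text}") from exc


if __name__ == "__main__":
    root = parse_xml_output("\n  <foo><bar>baz</bar></foo>  \n")
    print(root.tag, root.find("bar").text)
```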
Erick Friis	47070b8314	CLI (#12284 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-25 11:06:58 -07:00
Shwu Ku	07c2649753	response parser for ArceeRetriever (#12270 ) - Description: Response parser for arcee retriever, - Issue: follow-up pr on #11578 and [discussion](https://github.com/arcee-ai/arcee-python/issues/15#issuecomment-1759874053), - Dependencies: NA This PR implements a parser that converts the response from ArceeRetriever to langchain `Document` objects. This closes the loop of generation and retrieval for Arcee DALMs in langchain. The reference for the response parser is [api-docs:retrieve](https://api.arcee.ai/docs#/v2/retrieve_model) Attaching screenshot of working implementation: <img width="1984" alt="Screenshot 2023-10-25 at 7 42 34 PM" src="https://github.com/langchain-ai/langchain/assets/65639964/026987b9-34b2-4e4b-b87d-69fcd0c6641a"> (API key deleted) --- Successful tests, lints, etc. ```shell Re-run pytest with --snapshot-update to delete unused snapshots. ==================================================================================================================== slowest 5 durations ===================================================================================================================== 1.56s call tests/unit_tests/schema/runnable/test_runnable.py::test_retrying 0.63s call tests/unit_tests/schema/runnable/test_runnable.py::test_map_astream 0.33s call tests/unit_tests/schema/runnable/test_runnable.py::test_map_stream_iterator_input 0.30s call tests/unit_tests/schema/runnable/test_runnable.py::test_map_astream_iterator_input 0.20s call tests/unit_tests/indexes/test_indexing.py::test_cleanup_with_different_batchsize ======================================================================================================= 1265 passed, 270 skipped, 32 warnings in 6.55s ======================================================================================================= [ "." = "" ] || poetry run black . All done! ✨ 🍰 ✨ 1871 files left unchanged. [ "." = "" ] || poetry run ruff --select I --fix . ./scripts/check_pydantic.sh . 
./scripts/check_imports.sh poetry run ruff . [ "." = "" ] \|\| poetry run black . --check All done! ✨ 🍰 ✨ 1871 files would be left unchanged. [ "." = "" ] \|\| poetry run mypy . Success: no issues found in 1868 source files poetry run codespell --toml pyproject.toml poetry run codespell --toml pyproject.toml -w ``` Co-authored-by: Shubham Kushwaha <shwu@Shubhams-MacBook-Pro.local>	2023-10-25 10:55:13 -07:00
Johanna Appel	c26ec7789f	CohereEmbeddings: Add max_retries and request_timeout (#12275 ) Add max_retries and request_timeout to CohereEmbeddings, akin to how it works in OpenAIEmbeddings. Since the Cohere client already implements these parameters, we can simply pass them down. Uses parameters from these two cohere client objects: https://github.com/cohere-ai/cohere-python/blob/main/cohere/client.py https://github.com/cohere-ai/cohere-python/blob/main/cohere/client_async.py	2023-10-25 10:37:25 -07:00
Nuno Campos	7108084947	Remove CLI (#12283 )	2023-10-25 10:33:52 -07:00
Nuno Campos	b5b2d07681	Pop max concurrency when recursing (#12281 )	2023-10-25 18:03:58 +01:00
Bagatur	69f4e402e4	bump 323 (#12278 )	2023-10-25 09:06:12 -07:00
David Duong	c25b174db5	Add serialisation props to Fireworks and ChatFireworks (#12255 )	2023-10-25 11:41:33 +01:00
Richard Adams	fd5f549a9e	demonstrate use of RetrievalQAWithSourcesChain.from_chain (#12235 ) Description: Documents further usage of RetrievalQAWithSourcesChain in an existing test. I'd not found much documented usage of RetrievalQAWithSourcesChain and how to get the sources out. This additional code will hopefully be useful to other potential users of this retriever. Issue: No raised issue Dependencies: No new dependencies needed to run the test (it already needs `open-ai`, `faiss-cpu` and `unstructured`). Note - `make lint` showed 8 linting errors in unrelated files --------- Co-authored-by: richarda23 <richard.c.adams@infinityworks.com>	2023-10-24 21:33:34 -07:00
James Braza	53f35c5f5c	Adding `STRUCTURED_FORMAT_SIMPLE_INSTRUCTIONS` missing backticks (#12238 ) This PR fixes the fact that `STRUCTURED_FORMAT_SIMPLE_INSTRUCTIONS` was missing backticks at the end	2023-10-24 21:30:25 -07:00
William FH	276c6ba115	Check for ls project in run tree context (#12242 ) If I go traceable -> runnable when the project is manually specified, the runnable wont be logged. This makes sure the session/project is threaded through appropriately.	2023-10-24 17:18:59 -07:00
Vasek Mlejnsky	1f8094938f	Integrate E2B's data analysis/code interpreter (#12011 ) This PR adds [E2B's](https://e2b.dev/) data analysis/code interpreter sandbox as a tool --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Jakub Novak <jakub@e2b.dev>	2023-10-24 16:04:02 -07:00
Bagatur	286a29a49e	bump 322 and 34 (#12228 )	2023-10-24 13:52:17 -07:00
Eugene Yurtsev	583dc49477	Add type to Generation and sub-classes, handle root validator (#12220 ) * Add a type literal for the generation and sub-classes for serialization purposes. * Fix the root validator of ChatGeneration to raise ValueError instead of KeyError or AttributeError if initialized improperly. * This change is done for langserve to make sure that LLM-related callbacks can be serialized/deserialized properly.	2023-10-24 16:21:00 -04:00
Eugene Yurtsev	81052ee18e	Fix code block in runnable doc (#12221 ) Fix code block syntax in runnable doc-string	2023-10-24 16:11:58 -04:00
Mikelarg	46e28b9613	Added GigaChat chat model support (#12201 ) - Description: Added integration with [GigaChat](https://developers.sber.ru/portal/products/gigachat) language model. - Twitter handle: @dvoshansky	2023-10-24 12:53:51 -07:00
Anurag Wagh	d5c2ce7c2e	[fix] create redis vector index before adding docs, add prefix to doc… (#11257 ) Fix Description: For the Redis Vector integration, two issues in the add_texts method led to this bug. 1. The vector index was not being created, leading to a no such_index error 2. The `doc:index` prefix was also missing for Redis keys. resolves #11197 Maintainer: @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-24 10:51:25 -07:00
Eugene Yurtsev	079d1f3b8e	Expose handle_event and ahandle_events as public API (#12181 ) Expose functionality to handle generic events.	2023-10-24 13:42:28 -04:00
William FH	67c4fd0ad0	Update deprecation (#12178 ) in runner_utils	2023-10-24 10:37:28 -07:00
Nir Kopler	d3744175bf	Finetuned OpenAI models cost calculation #11715 (#12190 ) Description: Add cost calculation for fine tuned models (new and legacy), this is required after OpenAI added new models for fine tuning and separated the costs of I/O for fine tuned models. Also I updated the relevant unit tests see https://platform.openai.com/docs/guides/fine-tuning for more information. issue: https://github.com/langchain-ai/langchain/issues/11715 - Issue: 11715 - Twitter handle: @nirkopler	2023-10-24 10:22:05 -07:00
Spyros	a2840a2b42	fix vertexai codey models (#12173 ) Description: This PR fixes issue #12156 by checking for Codey models appropriately before result parsing. Maintainer: @hwchase17 , @agola11	2023-10-24 10:20:05 -07:00
Hech	d76f026d72	Fix flexible dimension and doc for DingoDB (#12187 )	2023-10-24 10:16:19 -07:00
Erick Friis	95ae40ff90	Fix Anthropic Functions ainvoke (#12215 ) Removes custom `NotImplementedError` in experimental anthropic functions, allowing it to fallback on default `ainvoke` implementation.	2023-10-24 10:07:01 -07:00
Iskren Ivov Chernev	d5d7ba582a	Improvements to llm/deepinfra (#10846 ) - replace `requests` package with `langchain.requests` - add `_acall` support - add `_stream` and `_astream` - freshen up the documentation a bit - update vendor doc	2023-10-24 09:54:23 -07:00
sudranga	f09f82541b	Expose configuration options in GraphCypherQAChain (#12159 ) Allows passing arguments into the LLM chains used by the GraphCypherQAChain. This addresses a user request to include memory in the Cypher-generating chain. The prompt variables are kept as-is for backward compatibility, but it would be a good idea to deprecate them and use the **kwargs variables. Added a test case. In general, I think it would be good for any chain to automatically pass a read-only memory (of its input) to its subchains while allowing for an override. But this would be a different change.	2023-10-24 09:52:55 -07:00
Leonid Ganeline	11f13aed53	docstrings update (#12093 ) Added missed docstrings. Added missed Args:, Returns: Raises:	2023-10-24 09:34:10 -07:00
Johnny Oshika	ba20c14e28	Fix typo in stuff_prompt's system_template (#12063 ) - Description: Add missing apostrophe in `user's` in stuff_prompt's system_template. The first sentence in the system template went from: > Use the following pieces of context to answer the users question. to > Use the following pieces of context to answer the user's question. - Issue: - Dependencies: none - Tag maintainer: @baskaryan - Twitter handle: ojohnnyo	2023-10-24 09:21:28 -07:00
Holt Skinner	69d9eae5cd	feat: Add Client Info to available Google Cloud Clients (#12168 ) - This is used internally to gather aggregate usage metrics for the LangChain integrations - Note: This cannot be added to some of the Vertex AI integrations at this time because the SDK doesn't allow overriding the [`ClientInfo`](https://googleapis.dev/python/google-api-core/latest/client_info.html#module-google.api_core.client_info) - Added to: - BigQuery - Google Cloud Storage - Document AI - Vertex AI Model Garden - Document AI Warehouse - Vertex AI Search - Vertex AI Matching Engine (Cloud Storage Client) @baskaryan, @eyurtsev, @hwchase17 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-24 08:49:11 -07:00
Lukas Wolf	69f5f82804	Update extraction.py (#12207 ) Description: Pass tags as argument to create_extraction_chain Issue: create_extraction_chain does not pass tags to chain yet @baskaryan	2023-10-24 08:25:14 -07:00
Nuno Campos	34ffb94770	Remove GetLocal, PutLocal (#12133 ) Do you agree?	2023-10-24 10:16:46 +01:00
Eric Hartford	8c150ad7f6	Add COBOL parser and splitter (#11674 ) - Description: Add COBOL parser and splitter - Issue: n/a - Dependencies: n/a - Tag maintainer: @baskaryan - Twitter handle: erhartford --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-23 15:44:31 -04:00
John Mai	ebf749c40c	Baichuan & Hunyuan set default api_base (#12059 ) ### Description Baichuan & Hunyuan set default api_base env	2023-10-23 15:33:35 -04:00
Shilong Dai	99afc1b4f8	Fixed hardcoded "vector" and replaced with vector_query_field variable (#12126 ) - Description: In the max_marginal_relevance_search function of the ElasticsearchStore vector store, the name of the field corresponding to the vector embedding of the document is hard coded in the delete statement that drops the field from the document metadata. This results in an exception if the vector embedding field is customized. This PR changes the hard-coded "vector" into the vector_query_field variable. - Issue: None - Dependencies: None - Tag maintainer: @hwchase17 Co-authored-by: Shilong Dai <sdai@viperfish.net>	2023-10-23 15:08:55 -04:00
Vikram Shitole	0d44746430	10634: Added the capability to inject boto3 client in SagemakerEndpointEmbeddings (#12146 ) Description: Allow injecting a boto3 client into SagemakerEndpointEmbeddings for cross-account access scenarios, and update the documentation in the sample notebook accordingly. Issue: SagemakerEndpointEmbeddings cross account capability #10634 #10184 Dependencies: None Tag maintainer: Twitter handle: lethargicoder Co-authored-by: Vikram(VS) <vssht@amazon.com>	2023-10-23 15:08:26 -04:00
aubin_mzt	66f8cb015d	Add connection args for pgvector vector store (#11930 ) - Description: sqlalchemy create_engine() does not take into account connect_args which are mandatory for managed PGSQL instances on cloud providers (ssl_context for example). Also re-enabled create_vector_extension at post_init for using pgvector class seamlessly - Tag maintainer: @baskaryan, @eyurtsev, @hwchase17. --------- Co-authored-by: Sami Bargaoui <bargaoui.sam@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-23 14:43:44 -04:00
NuODaniel	4d6243fa87	fix: doc string of default params in chat_models, llm qianfan (#12153 ) - Description: a fix of the doc string in Qianfan - Issue: no - Dependencies: no - Tag maintainer: @baskaryan - Twitter handle: no	2023-10-23 14:03:18 -04:00
Predrag Gruevski	f82bdf4613	Update deprecated `langchain` imports with suggested new paths. (#12164 ) Let's help our users find the proper import to use instead of the deprecated top-level ones.	2023-10-23 13:52:08 -04:00
Bagatur	963ff93476	bump 321 (#12161 )	2023-10-23 12:49:38 -04:00
Nuno Campos	d0505c0d47	Update default recursion_limit, update docs (#12134 )	2023-10-23 16:29:17 +01:00
William FH	4f23aa677a	Fix Pickle Error (#12141 ) If non-pickleable objects (like locks) get passed to the tracing callback, they'll fail in the deepcopy. Fall back to a shallow copy in these instances.	2023-10-23 08:22:47 -07:00
Predrag Gruevski	95a1b598fe	Update to `actions/checkout@v4`. (#11951 ) We don't use any of the new functionality at the moment. Just making sure we don't fall back on versions and fail to benefit from new patches. This is an easy upgrade and it's always harder to upgrade across multiple major versions at once.	2023-10-23 10:01:33 -04:00
William FH	7c4f340cc0	Include Parent Run ID (#12139 ) If you set local callbacks	2023-10-22 17:19:11 -07:00
omahs	f3cc9bba5b	Fix typos (#12128 ) Fix typos	2023-10-22 17:16:03 -07:00
Nuno Campos	1afdb40b48	Add optional config arg to RunnablePassthrough func arg (#12131 )	2023-10-22 19:57:16 +01:00
Nuno Campos	325fdde8b4	Fix bug where types were lost when calling with_config or bind (#12137 )	2023-10-22 19:26:13 +01:00
Nuno Campos	02dce74b97	Fix type hint for older py versions (#12132 )	2023-10-22 18:01:09 +01:00
Nuno Campos	d0ce374731	Allow specifying custom input/output schemas for runnables with .with_types() (#12083 )	2023-10-22 17:26:48 +01:00
Harrison Chase	ee69116761	move csv agent to langchain experimental (#12113 )	2023-10-21 10:26:02 -07:00
Harrison Chase	03bf6ef473	add missing init files (#12114 )	2023-10-21 10:25:50 -07:00
Bagatur	ef8b180d6d	bump 320 (#12108 )	2023-10-21 11:52:52 -04:00
Rotem Weiss	78d186fb44	Add Tavily Search API as a Tool (#12103 ) Adding Tavily Search API as a tool. I will be the maintainer and assaf_elovic is the Twitter handle. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-21 11:23:21 -04:00
Bagatur	85302a9ec1	Add CI check that integration tests compile (#12090 )	2023-10-21 10:52:18 -04:00
verlocks	5dbe456aae	Bug fix tongyi.py to be compatible with DashScope API (#11956 ) The current ChatTongyi is not compatible with the DashScope API, which causes an error when passing the API key to the chat model directly. - Description: Update tongyi.py to be compatible with DashScope API. Specifically, update parameter name "dashscope_api_key" to "api_key". - Issue: None. - Dependencies: Nothing new, Tongyi would require DashScope as before.	2023-10-20 18:46:41 -04:00
Tomaz Bratanic	82f4c0589c	Add neo4j graph environment variables (#12080 )	2023-10-20 14:43:01 -07:00
Mohammad Mohtashim	d5400f6502	Google Scholar Search Tool using serpapi (#11513 ) - Description: Implementing the Google Scholar Tool as requested in PR #11505. The tool will be using the [serpapi python package](https://serpapi.com/integrations/python#search-google-scholar). The main idea of the tool will be to return the results from a Google Scholar search given a query as an input to the tool. - Tag maintainer: @baskaryan, @eyurtsev, @hwchase17	2023-10-20 17:35:55 -04:00
Holt Skinner	f5be2d525a	fix: Add `_serving_config` property to `GoogleVertexAISearchRetriever` (#12084 ) - Fixes error: ``` ValueError: "GoogleVertexAISearchRetriever" object has no field "_serving_config" ``` Introduced in #11736 @baskaryan, @eyurtsev, @hwchase17 if you could review and merge quickly, that would be appreciated :)	2023-10-20 15:16:42 -04:00
Nuno Campos	5fee61a207	Support runnable factories in .configurable_alts() (#12065 )	2023-10-20 15:22:09 +01:00
Zhitao Xu	a4c3a44712	Fix documentation typo in Clickhouse Class (#12047 ) - Description: The return info in the documentation for similarity_search_by_vector and similarity_search_with_relevance_scores is wrong	2023-10-19 17:00:22 -04:00
William FH	25418b9b4d	Always add run ID (#12046 ) in eval callback handler. Useful if you're using a custom run evaluator and don't want to thread things through.	2023-10-19 12:38:07 -07:00
Eugene Yurtsev	44d7763580	Add zapier deprecation warning (#12045 ) Add zapier deprecation	2023-10-19 15:27:56 -04:00
John Mai	4188f046ec	Add Tencent Hunyuan chat model (#12022 ) ### Description: The Tencent Hunyuan model, developed by Tencent, is a large language model with robust Chinese text generation capabilities, adeptness in logical reasoning within complex contexts, and reliable task execution proficiency. For more information, see [https://cloud.tencent.com/document/product/1729](https://cloud.tencent.com/document/product/1729)	2023-10-19 15:10:12 -04:00
Eugene Yurtsev	68599d98c2	More security notes (#12040 ) Add more security notes	2023-10-19 14:49:09 -04:00
Bagatur	0006075b08	bump 319 (#12041 )	2023-10-19 11:45:27 -07:00
John Mai	8eb40b5fe2	`baichuan_secret_key` use pydantic.types.SecretStr & Add Baichuan tests (#12031 ) ### Description - `baichuan_secret_key` use pydantic.types.SecretStr - Add Baichuan tests	2023-10-19 14:37:41 -04:00
Nuno Campos	85bac75729	nc/runnable-dynamic-schemas-from-config (#12038 )	2023-10-19 19:34:35 +01:00
Nuno Campos	85eaa4ccee	Revert "nc/runnable-dynamic-schemas-from-config" (#12037 ) This reverts commit `a46eef64a7`.	2023-10-19 19:27:02 +01:00
Nuno Campos	a46eef64a7	nc/runnable-dynamic-schemas-from-config	2023-10-19 19:17:48 +01:00
Nuno Campos	d392e030be	Add default value (#12032 )	2023-10-19 18:30:05 +01:00
Kenneth Choe	62efe1ffb9	support add_embeddings for elasticsearch (#11002 ) - Description: Provide a way to use different text for embedding. - For example, if you are ingesting stack-overflow Q&As for RAG, you would want to embed the questions and return the answer(s) for the hits. With this change, the consumer of langchain can implement that easily. - I noticed that a similar function was added to faiss.py in #1912 for performance reasons, but the same function can be used to achieve this. So instead of changing the Document class to have embedding_content, I mimicked the implementation of faiss.py. - The test should provide some guidance on how to use it. It would be more intuitive to pass texts and embedding_texts as separate arguments, but I chose to use a `zip`-ed object for consistency with the faiss.py implementation. - I plan to make a similar pull request for OpenSearch. - Issue: N/A - Dependencies: None other than the existing ones. Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-19 09:43:51 -07:00
Bagatur	76d3afaef0	bump 318 (#12030 )	2023-10-19 09:33:39 -07:00
Dmitry Tyumentsev	5dd2161c4b	add _acall method to YandexGPT (#12029 ) - Description: Add async support for YandexGPT LLM model Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>	2023-10-19 09:15:26 -07:00
Peter Krenesky	8425f33363	Pydantic v2 support for OpenAPI Specs (#11936 ) - Description: Adding Pydantic v2 support for OpenAPI Specs - Issue: - OpenAPI spec support was disabled because `openapi-schema-pydantic` doesn't support Pydantic v2: #9205 - Caused errors in `get_openapi_chain` - This may be the cause of #9520. - Tag maintainer: @eyurtsev - Twitter handle: kreneskyp The root cause was that `openapi-schema-pydantic` hasn't been updated in some time but [openapi-pydantic](https://github.com/mike-oakley/openapi-pydantic) forked and updated the project.	2023-10-19 11:06:11 -04:00
Joe McElroy	c9f1768cb9	Elasticsearch Query Retriever: Use match + fuzziness for LIKE (#12023 ) Updated the elasticsearch self query retriever to use the match clause for LIKE operator instead of the non-analyzed fuzzy search clause. Other small updates include: - fixing the stack inference integration test where the index's default pipeline didn't use the inference pipeline created - adding a user-agent to the old implementation to track usage - improved the documentation for ElasticsearchStore filters	2023-10-19 09:47:21 -04:00
Nuno Campos	7db6aabf65	Update chat model output type (#11833 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-19 00:55:15 -07:00
Simon Dai	ed62984cb2	update Weaviate to support multi tenancy (#11842 ) - Description: update Weaviate to support multi tenancy - Issue: 9956 - Dependencies: - Tag maintainer: hwchase17 - Twitter handle: dsx1986_	2023-10-19 00:49:30 -07:00
hiigao	f818ec49b8	Encapsulate alicloud pai-eas access method for chatmodels and llms (#11852 ) ### Description: This pull request provides access methods for the EAS LLM service by implementing the `PaiEasEndpoint` and `PaiEasChatEndpoint` classes in the `langchain.llms` and `langchain.chat_models` modules. Based on this PR, langchain users can build a chain to call a remote EAS LLM service and get the LLM inference results. ### About EAS Service EAS is an Alicloud product on the Alibaba Cloud Machine Learning Platform for AI (AliCloud PAI for short). EAS provides model inference deployment services for users. We built an LLM inference service on EAS with a general LLM Docker image. Therefore, end users can quickly set up their remote LLM instances to load the majority of Hugging Face LLM models, and serve as a backend for most LLM apps. ### Dependencies This PR doesn't involve any new dependencies. --------- Co-authored-by: 子洪 <gaoyihong.gyh@alibaba-inc.com>	2023-10-19 00:20:18 -07:00
John Mai	a6b483dcbc	Supported RetryOutputParser & RetryWithErrorOutputParser max_retries (#11903 ) Description: Supported RetryOutputParser & RetryWithErrorOutputParser max_retries - max_retries: Maximum number of retries to parse. Issue: None Dependencies: None Tag maintainer: @baskaryan Twitter handle:	2023-10-18 23:57:16 -07:00
Hugues Chocart	008c7df80d	[LLMonitorCallbackHandler] Refactor + add llmonitor-py dependency (#11948 ) We now require users to have the pip package `llmonitor` installed. It allows us to have cleaner code and avoid duplicates between our library and our code in Langchain.	2023-10-18 23:54:10 -07:00
Sian Cao	77fc2f7644	fix: impl missing embeddings method (#10823 ) FAISS does not implement the embeddings method and uses embed_query to embed texts, which is wrong for some embedding models. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-18 23:51:28 -07:00
Holt Skinner	2661dc94f3	feat: Google Vertex AI Search Retriever - Add support for Website Data Stores (#11736 ) - Only works for Data stores with Advanced Website Indexing - https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features - Minor restructuring - Follow up to #10513 - Remove outdated docs (readded in https://github.com/langchain-ai/langchain/pull/11620) - Move legacy class into new py file to clean up the directory - Shouldn't cause backwards compatibility issues as the import works the same way for users	2023-10-18 23:41:48 -07:00
Shorthills AI	4b6fdd7bf0	Update modal.py (#11588 ) feat: Raise KeyError when 'prompt' key is missing in JSON response This commit updates the error handling in the code to raise a KeyError when the 'prompt' key is not found in the JSON response. This change makes the code more explicit about the nature of the error, helping to improve clarity and debugging. @baskaryan, @eyurtsev.	2023-10-18 23:40:37 -07:00
William FH	dfb4baa3f9	Fix Fireworks Callbacks (#12003 ) I may be missing something but it seems like we inappropriately overrode the 'stream()' method, losing callbacks in the process. I don't think (?) it gave us anything in this case to customize it here? See new trace: https://smith.langchain.com/public/fbb82825-3a16-446b-8207-35622358db3b/r and confirmed it streams. Also fixes the stopwords issues from #12000	2023-10-18 23:33:09 -07:00
Wang Wei	e26559f512	Add ERNIE-Bot-4 model support for ErnieBotChat. (#11969 ) - Description: According to the document https://cloud.baidu.com/doc/WENXINWORKSHOP/s/clntwmv7t, add ERNIE-Bot-4 model support for ErnieBotChat. - Dependencies: Before using the ERNIE-Bot-4, you should have the model's access authority.	2023-10-18 14:55:29 -07:00
Eugene Yurtsev	f4bec9686d	Add more security notes (#11990 ) Add more security notes	2023-10-18 15:00:56 -04:00
Eugene Yurtsev	3d81c76160	Add security notes to agent toolkits (#11989 ) Add more security notes to agent toolkits.	2023-10-18 14:36:29 -04:00
Leonid Ganeline	b81a4c1d94	docstrings added (#11988 ) Added docstrings. Some docstring formatting.	2023-10-18 13:05:49 -04:00
Bagatur	35c7c1f050	bump 317 (#11986 )	2023-10-18 09:25:18 -07:00
Bagatur	122af2effe	fix chroma from_texts bug (#11984 )	2023-10-18 09:24:04 -07:00
Erick Friis	c149954cc5	Hub Runnable (#11946 ) Adds `langchain.runnables.hub.HubRunnable` for pulling configurable objects from the hub	2023-10-18 09:21:45 -07:00
Owen	9e24626e87	chore: remove duplicated export variables (#11962 ) - Description: remove duplicated `__all__` variables	2023-10-18 12:08:50 -04:00
Nuno Campos	6bd9c1d2b3	Make prompt validation opt-in (#11973 ) By default replace input_variables with the correct value	2023-10-18 16:28:47 +01:00
Nuno Campos	9bc7e1851a	Ensure dict() does not raise not implemented error, which should instead be raised in our custom method save() (#11970 ) .dict() is a Pydantic method that cannot raise exceptions, as it is used e.g. in `__eq__`	2023-10-18 16:28:33 +01:00
Nuno Campos	653cf56e0e	Lint	2023-10-18 16:02:00 +01:00
Predrag Gruevski	debcf053eb	Fix `invalid escape sequence` warnings by using raw strings for regexes. (#11943 ) This code also generates warnings when our users' apps hit it, which is annoying and doesn't look great. Let's fix it.	2023-10-18 10:55:17 -04:00
Nuno Campos	e4ae690244	Sort order	2023-10-18 15:42:13 +01:00
Nuno Campos	b753bf3323	Make prompt validation opt-in By default replace input_variables with the correct value	2023-10-18 10:46:22 +01:00
Nuno Campos	202acce0c9	Ensure dict() does not raise not implemented error, which should instead be raised in our custom method save()	2023-10-18 09:44:41 +01:00
Predrag Gruevski	392df7b2e3	Type hints on varargs and kwargs that take anything should be `Any`. (#11950 ) Type hinting `*args` as `List[Any]` means that each positional argument should be a list. Type hinting `**kwargs` as `Dict[str, Any]` means that each keyword argument should be a dict of strings. This is almost never what we actually wanted, and doesn't seem to be what we want in any of the cases I'm replacing here.	2023-10-17 21:31:44 -04:00
Eugene Yurtsev	908c7bf33e	Add documentation to tools (#11938 ) Add security notes to tools --------- Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>	2023-10-17 21:27:59 -04:00
Eugene Yurtsev	43dc669332	Update playwright documentation (#11949 ) Add security note to playwright tool	2023-10-17 21:22:26 -04:00
Daniel Chalef	2beb767ae5	zep: Memory Retriever MMR Support & Docs Updates (#11954 ) - Update Zep Memory and Retriever docstrings - Zep Memory Retriever: Add support for native MMR - Add MMR example to existing ZepRetriever Notebook @baskaryan	2023-10-17 16:35:11 -07:00
William FH	a27fa9bf10	Use traceable context (#11896 ) Example ``` from langchain.schema.runnable import RunnableLambda from langsmith import traceable chain = RunnableLambda(lambda x: x) @traceable(run_type = "chain") def my_traceable(a): chain.invoke(a) my_traceable(5) ``` Would have a nested result. This would NOT work for interleaving chains and traceables. E.g., things like this would still not work well ``` from langchain.schema.runnable import RunnableLambda from langsmith import traceable @traceable() def other_traceable(a): return a def foo(x): return other_traceable(x) chain = RunnableLambda(foo) @traceable(run_type = "chain") def my_traceable(a): chain.invoke(a) my_traceable(5) ```	2023-10-17 15:10:20 -07:00
Predrag Gruevski	dcd0392423	Upgrade to newer black (23.10) and ruff (first 0.1.x!) versions. (#11944 ) Minor lint dependency version upgrade to pick up latest functionality. Ruff's new v0.1 version comes with lots of nice features, like fix-safety guarantees and a preview mode for not-yet-stable features: https://astral.sh/blog/ruff-v0.1.0	2023-10-17 17:24:51 -04:00
Trayan Azarov	1fd21ed21c	Chroma batching (#11203 ) - Description: Chroma >= 0.4.10 added support for batch sizes validation of add/upsert. This batch size is dependent on the SQLite limits of the target system and varies. In this change, for Chroma>=0.4.10 batch splitting was added as the aforementioned validation is starting to surface in the Chroma community (users using LC) - Issue: N/A - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: t_azarov	2023-10-17 13:59:42 -07:00
Guy Korland	9373b9c004	Add Graph interface (#11012 ) - Description: Add a Graph interface - Tag maintainer: @baskaryan @hwchase17 - Twitter handle: @g_korland	2023-10-17 13:54:05 -07:00
DanielZzz	b647505280	feat: support ChatModels Qianfan `QianfanChatEndpoint` function_call (#11107 ) - Description: * feature for `QianfanChatEndpoint` function_call ability, add integration_test for it * add `model`, `endpoint` supported in calling params * add raw response in ChatModel Message - Issue: * #10867 * #11105 * #10215 - Dependencies: no - Tag maintainer: @baskaryan - Twitter handle: no	2023-10-17 13:33:55 -07:00
M Bharat lal	67300567d3	GCSFileLoader retrieve blob custom metadata and append to document metadata (#11066 ) - Description: GCSFileLoader retrieve blob's custom metadata and append to document's metadata - Issue: #9975, - Tag maintainer: @baskaryan please review Co-authored-by: b0l00ib <bharat.lal@walmart.com>	2023-10-17 12:17:59 -07:00
billytrend-cohere	f4742dce50	Add Cohere retrieval augmented generation to retrievers (#11483 ) Add Cohere retrieval augmented generation to retrievers --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-17 11:51:04 -07:00
刘方瑞	0a24ac7388	Revised notebook and add delete to MyScale vector store (#11848 ) - Description: - Add `.delete` to myscale vector store. - Revised vector store notebooks - Tag maintainer: @baskaryan - Twitter handle: @myscaledb @mpsk_liu	2023-10-17 11:42:21 -07:00
John Mai	3fb5e4d185	Add Baichuan chat model (#11923 ) Description: A large language model developed by Baichuan Intelligent Technology, https://www.baichuan-ai.com/home Issue: None Dependencies: None Tag maintainer: Twitter handle:	2023-10-17 11:30:57 -07:00
Eugene Yurtsev	9ecb7240a4	Add security note to recursive url loader (#11934 ) Add security note to recursive loader	2023-10-17 13:41:43 -04:00
maks-operlejn-ds	42dcc502c7	Anonymizer small fixes (#11915 )	2023-10-17 10:27:29 -07:00
Eugene Yurtsev	90e9ec6962	Sitemap specify default filter url (#11925 ) Specify default filter URL in sitemap loader and add a security note --------- Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>	2023-10-17 13:19:27 -04:00
Bagatur	ba0d729961	bump 316 (#11928 )	2023-10-17 09:47:57 -07:00
Eugene Yurtsev	12d7eaa0c2	Add security notices to toolkits (#11900 ) This adds security notices to toolkits init, and to several toolkits. We'll need to continue documenting the rest of the toolkits. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-17 11:45:09 -04:00
Eugene Yurtsev	5f4a697ce3	Add deprecation warnings (#11899 ) Add deprecation warnings Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-17 10:59:38 -04:00
Nuno Campos	8b79cf9566	Add lock for using global config enum weak map (#11920 )	2023-10-17 15:50:35 +01:00
Nuno Campos	2a8ded6c8c	Export merge_configs function (#11916 )	2023-10-17 15:36:11 +01:00
Nuno Campos	778e7c526e	Add comment	2023-10-17 15:29:39 +01:00
Nuno Campos	19319e1746	Allow configs with None values	2023-10-17 15:23:58 +01:00
Nuno Campos	b0d5882fe1	Export merge_configs function	2023-10-17 13:22:07 +01:00
Nuno Campos	12596b9a9b	Add validation for configurable keys passed to .with_config() - Fix some typing issues found while doing that	2023-10-17 08:50:31 +01:00
Nuno Campos	754aca794f	remove print	2023-10-17 08:46:07 +01:00
Nuno Campos	cf448a6314	Ensure that configurable fields with enums support deduplication	2023-10-17 08:25:21 +01:00
Leonid Ganeline	31f264169d	evaluation criteria (#11681 ) The updated value was: `Criteria.MISOGYNY: "Is the submission misogynistic? If so, respond Y."` The " If so, respond Y." should not be there: this substring does not appear in any other criterion. I also added a synonym for "misogynistic", as is done in many other criteria.	2023-10-16 21:05:08 -07:00
Dmitry Tyumentsev	e8c1850369	Add YandexGPT LLM and Chat model (#11703 ) Description: Introducing an ability to work with the [YandexGPT](https://cloud.yandex.com/en/services/yandexgpt) language model.	2023-10-16 20:30:07 -07:00
Bagatur	c15701eebf	Revert "Add baichuan model" (#11901 ) cc @cloudscool, apologies your PR wasn't actually passing CI	2023-10-16 20:01:12 -07:00
cloudscool	c1d811c4bc	Add baichuan model	2023-10-16 19:27:35 -07:00
John Mai	0169d45ba8	Supported OutputFixingParser max_retries (#11754 ) Description: Supported OutputFixingParser max_retries - max_retries: Maximum number of parsing retries. Issue: None Dependencies: None Tag maintainer: @baskaryan Twitter handle: @JohnMai95 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-16 19:25:47 -07:00
volodymyr-memsql	ff8e6981ff	SingleStoreDBChatMessageHistory: Add singlestoredb support for ChatMessageHistory (#11705 ) Description - Added the `SingleStoreDBChatMessageHistory` class, which inherits from `BaseChatMessageHistory` and allows using a SingleStoreDB database as storage for chat message history. - Added an integration test to check that everything works (requires `singlestoredb` to be installed) - Added a notebook with a usage example - Removed the custom retriever for the SingleStoreDB vector store (as it is useless) --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>	2023-10-16 21:59:45 -04:00
Mohammad Mohtashim	634ccb8ccd	test_stream_log_retriever Unit Test + Tool names fix (#11808 ) ## Description \| Tool \| Original Tool Name \| \|-----------------------------\|---------------------------\| \| open-meteo-api \| Open Meteo API \| \| news-api \| News API \| \| tmdb-api \| TMDB API \| \| podcast-api \| Podcast API \| \| golden_query \| Golden Query \| \| dall-e-image-generator \| Dall-E Image Generator \| \| twilio \| Text Message \| \| searx_search_results \| Searx Search Results \| \| dataforseo \| DataForSeo Results JSON \| When using these tools through `load_tools`, I encountered the following validation error: ```console openai.error.InvalidRequestError: 'TMDB API' does not match '^[a-zA-Z0-9_-]{1,64}$' - 'functions.0.name' ``` To avoid this error, I replaced spaces with hyphens in the tool names: \| Tool \| Corrected Tool Name \| \|-----------------------------\|---------------------------\| \| open-meteo-api \| Open-Meteo-API \| \| news-api \| News-API \| \| tmdb-api \| TMDB-API \| \| podcast-api \| Podcast-API \| \| golden_query \| Golden-Query \| \| dall-e-image-generator \| Dall-E-Image-Generator \| \| twilio \| Text-Message \| \| searx_search_results \| Searx-Search-Results \| \| dataforseo \| DataForSeo-Results-JSON \| This correction resolved the validation error. Additionally, a unit test, `tests/unit_tests/schema/runnable/test_runnable.py::test_stream_log_retriever`, was failing at random. Upon further investigation, I confirmed that the failure was not related to the above-mentioned changes. The `stream_log` variable was producing its log entries in one of two orders at random. The reason for this behavior is unclear, so the assertion now accepts both possible orders to account for this variability.	2023-10-16 18:46:19 -07:00
Predrag Gruevski	7c0f1bf23f	Upgrade experimental package dependencies and use Poetry 1.6.1. (#11339 ) Part of upgrading our CI to use Poetry 1.6.1.	2023-10-16 21:13:31 -04:00
Eugene Yurtsev	c2c0814a94	Add security notice to file management tool (#11878 ) Add security notice to file management tool --------- Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>	2023-10-16 21:12:13 -04:00
zhaoshengbo	cb7e12f6ba	Adapt to the latest version of Alibaba Cloud OpenSearch vector store API (#11849 ) Hello folks, Alibaba Cloud OpenSearch has released a new version of its vector storage engine, which significantly improves performance over the previous version. The SDK has changed as well, so the Alibaba OpenSearch vector store code needs adjusting to match. This PR includes: Adapt to the latest version of the Alibaba Cloud OpenSearch API. More comprehensive unit testing. Improved documentation. I have read your contributing guidelines, and I have passed the tests below - [x] make format - [x] make lint - [x] make coverage - [x] make test --------- Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>	2023-10-16 18:07:24 -07:00
Lee	e669f9d731	Fix: Sitemap Document Loader Tests and Documentation (#11866 ) Description: While working on the Docusaurus site loader #9138, I noticed some outdated docs and tests for the Sitemap Loader. Issue: This is tangentially related to #6691 in reference to doc links. I plan on digging into a few of these issues when I next find time.	2023-10-16 17:42:10 -07:00
Jean-Louis Queguiner	8b697ff0ee	feat(llm): add together.xyz as an LLM provider (#11892 ) - Description: added together.xyz as an LLM provider, - Issues: fix some linting issues - twitter handle @jilijeanlouis --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-16 17:08:04 -07:00
Leonid Kuligin	d269dd2e2f	added a multiturn search based on Vertex AI Search (#11885 ) - Description: Added a retriever based on multi-turn Vertex AI Search - Twitter handle: lkuligin	2023-10-16 17:05:12 -07:00
Leonid Kuligin	38ed55245f	added Vertex examples as attributes (#11890 ) - Description: added examples to Vertex chat models as optional class attributes, so that a model with examples can be used inside a chain - Twitter handle: lkuligin	2023-10-16 16:55:45 -07:00
eryk-dsai	5019f59724	fix: more robust check whether the HF model is quantized (#11891 ) Removes the check of `model.is_quantized` and adds more robust way of checking for 4bit and 8bit quantization in the `huggingface_pipeline.py` script. I had to make the original change on the outdated version of `transformers`, because the models had this property before. Seems redundant now. Fixes: https://github.com/langchain-ai/langchain/issues/11809 and https://github.com/langchain-ai/langchain/issues/11759	2023-10-16 16:54:20 -07:00
Eugene Yurtsev	210a48cfb5	Add security considerations (#11869 ) Add security considerations to existing graph tools.	2023-10-16 12:23:48 -04:00
Bagatur	25b1d65305	bump 315 (#11850 )	2023-10-16 00:50:54 -07:00
Nuno Campos	4321d192ea	Use a less specific return type for \| on Runnables (#11762 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-15 21:15:06 +01:00
Harrison Chase	a506302772	bearly tool (#11812 )	2023-10-14 16:03:58 -07:00
Harrison Chase	4a2f0c51a1	use get_llm_cache and set_llm_cache (#11741 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-14 09:29:30 -07:00
Harrison Chase	f3ad22e64a	pipe default key (#11788 )	2023-10-14 08:39:23 +01:00
Eugene Yurtsev	0d37b4c27d	Add python,pandas,xorbits,spark agents to experimental (#11774 ) For context, see https://github.com/langchain-ai/langchain/discussions/11680	2023-10-13 17:36:44 -04:00
Michael Feil	233a904f2e	GradientLLM Docs update and model_id renaming. (#10963 ) Related to #10800 - Fixing errors in the docstring of GradientLLM / Gradient.ai LLM - Renaming `model_id` to `model` and adapting all tests accordingly. The reason is to stay in sync with `GradientEmbeddings` and other LLMs. - Improving tests so they check the headers in the sent request. - Making the aiosession a private attribute in the docs, as in the future `pip install gradientai` will replace aiosession. - Adding an example of how to fine-tune on the Prompt Template, as suggested in #10800	2023-10-13 13:57:58 -07:00
Bagatur	1559ba4bfc	fix upstash test import (#11781 )	2023-10-13 13:31:36 -07:00
Leonid Kuligin	9f0a718198	added candidate_count for Vertex models (#11729 ) - Description: added support for `candidate_count` parameter on Vertex	2023-10-13 13:31:20 -07:00
David	9d200e6cbe	Create ChatEverlyAI (#11357 ) - Description: Adds the ChatEverlyAI class with llama-2 7b on [EverlyAI Hosted Endpoints](https://everlyai.xyz/) - It inherits from ChatOpenAI and requires openai (probably unnecessary but it made for a quick and easy implementation) --------- Co-authored-by: everly-studio <127131037+everly-studio@users.noreply.github.com>	2023-10-13 12:25:11 -07:00
Hristo G	7fb25b4154	Add graceful fallback for ES vectorstore when content field is missing (#11726 ) - Description: - If the Elasticsearch field used for Langchain > Document.page_content is missing because the specific document is somehow malformed, fail gracefully. - Tag maintainer: - @joemcelroy	2023-10-13 12:03:32 -07:00
Bagatur	f06fcde0d7	rm duplicate zilliz import (#11777 )	2023-10-13 12:01:22 -07:00
Bagatur	a3330c4258	bump 314 (#11773 )	2023-10-13 11:09:54 -07:00
Erick Friis	1861cc7100	General anthropic functions, steps towards experimental integration tests (#11727 ) To match change in js here https://github.com/langchain-ai/langchainjs/pull/2892 Some integration tests need a bit more work in experimental: ![Screenshot 2023-10-12 at 12 02 49 PM](https://github.com/langchain-ai/langchain/assets/9557659/262d7d22-c405-40e9-afef-669e8d585307) Pretty sure the sqldatabase ones are an actual regression or change in interface because it's returning a placeholder. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-13 09:48:24 -07:00
Nuno Campos	17c69678ab	Revert "New add Baichuan Model" (#11761 ) Reverts langchain-ai/langchain#11714 This has linting and formatting issues, plus it's added to chat models folder but doesn't subclass Chat Model base class	2023-10-13 08:23:15 -07:00
cloudscool	56653c53aa	New add Baichuan Model (#11714 ) Motivation and Context: The Baichuan large language model is currently popular and performs efficiently. Given its wide market recognition, this PR adds the model to broaden the set of large language models LangChain can access, making it easier for interested users to use it in applications. System Info: langchain: 0.0.295, python: 3.8.3, IDE: VS Code. Description: Make the following changes: 1. Add baichuan_baichuaninc_endpoint.py in libs/langchain/langchain/chat_models. 2. Modify the __init__.py file located at libs/langchain/langchain/chat_models/__init__.py: a. Add "from langchain.chat_models.baichuan_baichuaninc_endpoint import BaichuanChatEndpoint". b. Add "BaichuanChatEndpoint" to the file's `__all__` list. Your contribution: I am willing to help implement this feature and submit a PR, but I would appreciate guidance from the maintainers or community to ensure the changes are made correctly and in line with the project's standards and practices.	2023-10-12 23:04:28 -07:00
Yang, Bo	9e1e0f54d2	Add `TrainableLLM` (#11721 ) - Description: Add `TrainableLLM` for LLMs that support fine-tuning - Tag maintainer: @hwchase17 This PR adds training methods to `GradientLLM` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 17:38:33 -07:00
Burak Yılmaz	63e516c2b0	Upstash redis integration (#10871 ) - Description: Introduced Upstash provider with following wrappers: UpstashRedisCache, UpstashRedisEntityStore, UpstashRedisChatMessageHistory, UpstashRedisStore - Issue: -, - Dependencies: upstash-redis python package is needed, - Tag maintainer: @baskaryan - Twitter handle: @BurakY744 --------- Co-authored-by: Burak Yılmaz <burakyilmaz@Buraks-MacBook-Pro.local> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 17:36:51 -07:00
Bagatur	a9db2b0b92	fix tongyi import (#11745 )	2023-10-12 17:24:06 -07:00
Aaron Pham	6c61315067	fix(openllm): update with newer remote client implementation (#11740 ) cc @baskaryan --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-10-12 17:01:18 -07:00
Richy Wang	11cdfe44af	Implement Alibaba Tongyi chat model apis. (#10922 ) Hi there. This PR aims to implement a chat model for the Alibaba Tongyi LLM. It contains the following work: 1. Implement the ChatTongyi chat model in langchain.chat_models.tongyi. Note that this is different from the Tongyi LLM model in another PR, https://github.com/langchain-ai/langchain/pull/10878. Specifically, it implements the _generate() and _stream() functions in ChatTongyi. 2. Add some examples in chat/tongyi.ipynb. 3. Add an integration test in chat_models/test_tongyi.py. Note that async completion for the Text API is not yet supported. Dependencies: dashscope. It must be installed manually because it is not needed by everyone.	2023-10-12 16:59:37 -07:00
Adam Demjen	008348ce71	Add ElasticsearchChatMessageHistory (#10932 ) Description This PR adds the `ElasticsearchChatMessageHistory` implementation that stores chat message history in the configured [Elasticsearch](https://www.elastic.co/elasticsearch/) deployment. ```python from langchain.memory.chat_message_histories import ElasticsearchChatMessageHistory history = ElasticsearchChatMessageHistory( es_url="https://my-elasticsearch-deployment-url:9200", index="chat-history-index", session_id="123" ) history.add_ai_message("This is me, the AI") history.add_user_message("This is me, the human") ``` Dependencies - [elasticsearch client](https://elasticsearch-py.readthedocs.io/) required Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 16:51:38 -07:00
Jonathan Soma	48cf978391	Allow placeholders in OpenAPI endpoints #2938 (#2940 ) Use regex matches when checking endpoints instead of exact matches. `{varname}` becomes `.*` Fixes #2938 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 16:20:32 -07:00
Predrag Gruevski	9e32120cbb	Deprecate direct access to globals like `debug` and `verbose`. (#11311 ) Instead of accessing `langchain.debug`, `langchain.verbose`, or `langchain.llm_cache`, please use the new getter/setter functions in `langchain.globals`: - `langchain.globals.set_debug()` and `langchain.globals.get_debug()` - `langchain.globals.set_verbose()` and `langchain.globals.get_verbose()` - `langchain.globals.set_llm_cache()` and `langchain.globals.get_llm_cache()` Using the old globals directly will now raise a warning. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-12 15:48:04 -07:00
Richard Adams	35965df20d	Rspace doc loader (#11511 ) Description: Add a document loader for the RSpace Electronic Lab Notebook (www.researchspace.com), so that scientific documents and research notes can be easily pulled into Langchain pipelines. Issue: This is a new contribution rather than an issue fix. Dependencies: There are no new required dependencies. To use the loader, clients will need to install the rspace_client SDK using `pip install rspace_client` --------- Co-authored-by: richarda23 <richard.c.adams@infinityworks.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 15:05:38 -07:00
Ryan Zotti	9d1867c77f	Update docs to specify Indexing-API-compatible vectorstores (#11581 ) Description: Update Indexing API docs to specify vectorstores that are compatible with the Indexing API. I add a unit test to remind developers to update the documentation whenever they add or change a vectorstore in a way that affects compatibility. For the unit test I repurposed existing code from [here](https://github.com/langchain-ai/langchain/blob/v0.0.311/libs/langchain/langchain/indexes/_api.py#L245-L257). This is my first PR to an open source project. This is a trivially simple PR whose main purpose is to make me more comfortable submitting Langchain PRs. If this PR goes through I plan to submit PRs with more substantive changes in the near future. Issue: Resolves [10482](https://github.com/langchain-ai/langchain/discussions/10482). Dependencies: No new dependencies. Twitter handle: None.	2023-10-12 15:17:44 -04:00
Richard Wang	6402c33299	Let Notion document loader support utf-8 and make it default. (#10613 ) Use utf-8 encoding by default	2023-10-12 15:13:41 -04:00
Bagatur	bd74eba152	add azure openai sched tests (#11723 )	2023-10-12 10:48:45 -07:00
Bagatur	9c0584be74	bump 313 (#11718 )	2023-10-12 09:48:54 -07:00
sudranga	361f8e1bc6	Add MMR functionality to elasticsearch retriever (#11633 ) Allows MMR functionality only for the case where we have access to the embedding function. Also allows for users to request for fields from elasticsearch store. These are added to the document metadata. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 08:42:32 -07:00
Dmitry Tyumentsev	ead9d5b55c	Add yandex stt parser (#11435 ) Description: Introducing an ability to load a transcription document of audio file using [Yandex SpeechKit](https://cloud.yandex.com/en-ru/services/speechkit) Issue: None Dependencies: yandex-speechkit Tag maintainer: @rlancemartin, @eyurtsev	2023-10-12 08:42:03 -07:00
Janos Tolgyesi	15687a28d5	Use correct tokenizer for Bedrock/Anthropic LLMs (#11561 ) Description This PR implements the usage of the correct tokenizer in Bedrock LLMs, if using anthropic models. Issue: #11560 Dependencies: optional dependency on `anthropic` python library. Twitter handle: jtolgyesi --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 08:41:52 -07:00
kYLe	467b082c34	Modify Anyscale integration to work with Anyscale Endpoint (#11569 ) Description: Modify Anyscale integration to work with [Anyscale Endpoint](https://docs.endpoints.anyscale.com/) and it supports invoke, async invoke, stream and async invoke features --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 08:41:25 -07:00
plpycoin	51193309ea	Update readthedocs.py (#11110 ) Only parse .html files .svg .png favicon.ico will crash processing phase --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-12 11:32:06 -04:00
Nuno Campos	ca9de26f2b	Add callback function to RunnablePassthrough (#11564 )	2023-10-12 15:10:16 +01:00
Nuno Campos	7f4734c0dd	Add deploy command to repos generated by cli template (#11711 )	2023-10-12 15:09:21 +01:00
Nuno Campos	1c0857b53e	Fix default impl of aparse_result (#11702 ) Should delegate to parse_result, not to aparse, as parse_result is a method that some output parsers override	2023-10-12 14:13:59 +01:00
nuric	44da27c07b	Add SemaDB VST wrapper (#11484 ) - Description: Adding vectorstore wrapper for [SemaDB](https://rapidapi.com/semafind-semadb/api/semadb). - Issue: None - Dependencies: None - Twitter handle: semafind Checks performed: - [x] `make format` - [x] `make lint` - [x] `make test` - [x] `make spell_check` - [x] `make docs_build` Documentation added: - SemaDB vectorstore wrapper tutorial	2023-10-11 19:09:38 -07:00
hsuyuming	0b743f005b	Feature/enhance huggingfacepipeline to handle different return type (#11394 ) Description: Prevent huggingfacepipeline from truncating the response when the user sets return_full_text to False in the huggingface pipeline. Dependencies: None Tag maintainer: Maybe @sam-h-bean ? --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 19:09:03 -07:00
Leonid Kuligin	2aba9ab47e	Retriever based on GCP DocAI Warehouse (#11400 ) - Description: implements a retriever on top of DocAI Warehouse (to interact with existing enterprise documents) https://cloud.google.com/document-ai-warehouse?hl=en - Issue: new functionality @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 19:08:53 -07:00
Erick Friis	a477ddda45	Langsmith in readme update (#11497 )	2023-10-11 18:43:52 -07:00
Leonid Kuligin	9e81ab47be	Added a better error description if processor name is wrong. (#11488 ) - Description: added a better error description for this error - Issue: #11407 @baskaryan	2023-10-11 18:43:40 -07:00
Robert Yi	e75766b759	fix: incorrect arguments in clickhouse docstring (#11693 ) fix docstring for clickhouse	2023-10-11 21:41:21 -04:00
Eugene Yurtsev	17b5090c18	Add `type` to Agent actions (#11682 ) Add `type` to agent actions.	2023-10-11 21:33:24 -04:00
April	c14a8df2ee	wrap confluence attachment processing with a try-except block (#11503 ) Prevents document loading from erroring out when an attachment is not found at the url. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 18:13:42 -07:00
eajechiloae	4ba2c8ba75	Fix ClearML callback (#11472 ) Handle different field names in dicts/dataframes, fixing the ClearML callback. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 17:09:02 -07:00
Lawrence Wu	93bb19f69a	Fix chains/loading.py error messages (#11688 ) - Description: make the error messages consistent in chains/loading.py - Dependencies: None	2023-10-11 17:05:42 -07:00
Harrison Chase	18ebce2032	fix tool async (#11689 )	2023-10-11 16:40:23 -07:00
sudranga	9beb03e771	11474 (#11519 ) No relevant documents may be found for a given question. In some use cases, we could directly respond with a fixed message instead of doing an LLM call with an empty context. This PR exposes this as an option: response_if_no_docs_found. --------- Co-authored-by: Sudharsan Rangarajan <sudranga@nile-global.com>	2023-10-11 16:30:15 -07:00
Joaquin Menendez	ef99b06362	feature: add metadata information into the embedding file before uplo… (#11553 ) - Description: In this modified version of the function, if the metadatas parameter is not None, the function includes the corresponding metadata in the JSON object for each text. This allows the metadata to be stored alongside the text's embedding in the vector store. - Issue: #10924 - Dependencies: None - Tag maintainer: @hwchase17 @agola11 - Twitter handle: @MelliJoaco --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 16:05:13 -07:00
Marcin Wątroba	51a3a86022	#11655 Add SQLAlchemyMd5Cache implementation (#11660 ) - Description: Add SQLAlchemyMd5Cache implementation, - Issue: #11655, - Dependencies: no deps, - Tag maintainer: @markowanga --------- Co-authored-by: Marcin Wątroba <marcin.watroba@pwr.edu.pl> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 15:28:09 -07:00
Suresh Kumar Ponnusamy	70f7558db2	langchain-experimental: Add allow_list support in experimental/data_anonymizer (#11597 ) - Description: Add allow_list support in langchain experimental data-anonymizer package - Issue: no - Dependencies: no - Tag maintainer: @hwchase17 - Twitter handle:	2023-10-11 14:50:41 -07:00
wemysschen	2363c02cf3	Bos loader (#11525 ) Description: Add BaiduCloud BOS document loader. --------- Co-authored-by: chenweixu01 <chenweixu01@baidu.com> Co-authored-by: root <root@icoding-cwx.bcc-szzj.baidu.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 14:43:48 -07:00
Kwanghoon Choi	fbb82608cd	Fixed a bug in reporting Python code validation (#11522 ) - Description: fixed a bug in pal-chain when it reports Python code validation errors. When node.func does not have any ids, the original code tried to print node.func.id in raising ValueError. - Issue: n/a, - Dependencies: no dependencies, - Tag maintainer: @hazzel-cn, @eyurtsev - Twitter handle: @lazyswamp --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 14:34:28 -07:00
Harrison Chase	9f39c23a13	add input type for convo retrieval chain (#11679 )	2023-10-11 17:13:48 -04:00
zhaozhiming	d5e762d328	fix: Change the docs of JSONAgentOutputParser (#11594 ) I am merely making some minor adjustments to the function documentation. I hope to provide a small assistance to LangChain. - Description: Change the docs of JSONAgentOutputParser. It will be `JSON` better, - Issue: no, - Dependencies: no, - Tag maintainer: @hwchase17, - Twitter handle: Not worth mentioning.	2023-10-11 14:05:53 -07:00
Vinay Kakade	dd0cd98861	Add support for ChatOpenAI models in Infino callback handler (#11608 ) Description: This PR adds support for ChatOpenAI models in the Infino callback handler. In particular, this PR implements `on_chat_model_start` callback, so that ChatOpenAI models are supported. With this change, Infino callback handler can be used to track latency, errors, and prompt tokens for ChatOpenAI models too (in addition to the support for OpenAI and other non-chat models it has today). The existing example notebook is updated to show how to use this integration as well. cc/ @naman-modi @savannahar68 Issue: https://github.com/langchain-ai/langchain/issues/11607 Dependencies: None Tag maintainer: @hwchase17 Twitter handle: [@vkakade](https://twitter.com/vkakade)	2023-10-11 14:00:54 -07:00
Israel Ekpo	d0603c86b6	Add Support for Azure Cosmos DB MongoDB vCore Vector Store #11627 (#11632 ) This PR adds support for the Azure Cosmos DB MongoDB vCore Vector Store https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/ https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search Summary: - Description: added vector store integration for Azure Cosmos DB MongoDB vCore Vector Store, - Issue: the issue # it fixes #11627, - Dependencies: pymongo dependency, - Tag maintainer: @hwchase17, - Twitter handle: @izzyacademy --------- Co-authored-by: Israel Ekpo <israel.ekpo@gmail.com> Co-authored-by: Israel Ekpo <44282278+izzyacademy@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 13:56:46 -07:00
Erick Friis	28ee6a7c12	Track ChatFireworks time to first_token (#11672 )	2023-10-11 13:37:03 -07:00
Eugene Yurtsev	539941281d	Fix output types for BaseChatModel (#11670 ) * Should use non chunked messages for Invoke/Batch * After this PR, stream output type is not represented, do we want to use the union? --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-11 16:02:03 -04:00
Eugene Yurtsev	99adcdb1c9	Add dedicated `type` attribute to be used solely for serialization purposes (#11585 ) Adds standard `type` field for all messages that will be serialized/validated by pydantic. * The presence of `type` makes it easier for developers consuming schemas to write client code to serialize/deserialize. * In LangServe `type` will be used for both validation and will appear in the generated openapi specs	2023-10-11 15:06:42 -04:00
eryk-dsai	06d5971be9	Fix issue #10985 - Skip model.to(device) if it is instantiated with bitsandbytes config (#11009 ) Preventing error caused by attempting to move the model that was already loaded on the GPU using the Accelerate module to the same or another device. It is not possible to load model with Accelerate/PEFT to CPU for now Addresses: [#10985](https://github.com/langchain-ai/langchain/issues/10985)	2023-10-11 09:28:27 -07:00
Nuno Campos	64969bc8ae	Add patch_config(configurable=) arg, make with_config(configurable=) merge it with existing (#11662 )	2023-10-11 14:45:31 +01:00
Harrison Chase	ce0019b646	make utils conditional (#11646 )	2023-10-11 06:11:32 +01:00
Harrison Chase	8f06085b24	make tools conditional (#11647 )	2023-10-11 06:11:05 +01:00
Bassem Yacoube	5451b724fc	Adds support for llama2 and fixes MPT-7b url (#11465 ) - Description: This is an update to OctoAI LLM provider that adds support for llama2 endpoints hosted on OctoAI and updates MPT-7b url with the current one. @baskaryan Thanks! --------- Co-authored-by: ML Wiz <bassemgeorgi@gmail.com>	2023-10-10 20:34:35 -07:00
Todd Kerpelman	0bff399af1	Make metadata from the url_selenium loader match that of the web_base loader (#11617 ) Description: I noticed the metadata returned by the url_selenium loader was missing several values included by the web_base loader. (The former returned `{source: ...}`, the latter returned `{source: ..., title: ..., description: ..., language: ...}`.) This change fixes it so both loaders return all 4 key value pairs. Files have been properly formatted and all tests are passing. Note, however, that I am not much of a python expert, so that whole "Adding the imports inside the code so that tests pass" thing seems weird to me. Please LMK if I did anything wrong.	2023-10-10 20:32:45 -07:00
Tarun Thotakura	c9d4d53545	Fixed the assignment of custom_llm_provider argument (#11628 ) - Description: Assign `custom_llm_provider` in the default params function so that it is passed to litellm. - Issue: Although the `custom_llm_provider` argument is defined, it is never assigned anywhere in the code and so is never passed to litellm; as a result, any litellm call that requires `custom_llm_provider` fails. litellm mainly uses this parameter when running inference via a custom API server. https://docs.litellm.ai/docs/providers/custom_openai_proxy - Dependencies: No dependencies are required @krrishdholakia , @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-10 20:29:24 -07:00
Leonid Ganeline	db67ccb0bb	docstrings cleanup (#11640 ) Added missed docstrings. Some reformatting.	2023-10-10 19:56:47 -07:00
Yang, Bo	3a82bd7bdb	Use raise from statement so that users can find detailed error message (#11461 ) - Description: Use `raise from` statement so that users can find detailed error message - Tag maintainer: @baskaryan, @eyurtsev, @hwchase17	2023-10-10 17:25:23 -07:00
Nuno Campos	9a0ed75a95	Add configurable fields with options (#11601 )	2023-10-10 22:17:22 +01:00
Bagatur	7232e082de	bump 312 (#11621 )	2023-10-10 12:34:49 -07:00
Eugene Yurtsev	58220cda72	Remove LLM Bash and related bash utilities (#11619 ) Deprecate LLMBash and related bash utilities	2023-10-10 14:54:09 -04:00
Shubham Kushwaha	49de862076	Arcee.ai LLM & Retriever integration (#11579 ) - Description: This PR introduces a new LLM and Retriever API to https://arcee.ai for the python client - Issue: implements the integrations as requested in #11578 , - Dependencies: no dependencies are required, - Tag maintainer: @hwchase17 - Twitter handle: shwooobham ✅ `make format`, `make lint` and `make test` runs locally. ```shell =========== 1245 passed, 277 skipped, 20 warnings in 16.26s =========== ./scripts/check_pydantic.sh . ./scripts/check_imports.sh poetry run ruff . [ "." = "" ] \|\| poetry run black . --check All done! ✨ 🍰 ✨ 1818 files would be left unchanged. [ "." = "" ] \|\| poetry run mypy . Success: no issues found in 1815 source files [ "." = "" ] \|\| poetry run black . All done! ✨ 🍰 ✨ 1818 files left unchanged. [ "." = "" ] \|\| poetry run ruff --select I --fix . poetry run codespell --toml pyproject.toml poetry run codespell --toml pyproject.toml -w ``` Contributions 1. Arcee (langchain/llms), ArceeRetriever (langchain/retrievers), ArceeWrapper (langchain/utilities) 2. docs for Arcee (llms/arcee.py) and ArceeRetriever(retrievers/arcee.py) 3. cc: @jacobsolawetz @ben-epstein --------- Co-authored-by: Shubham <shubham@sORo.local>	2023-10-10 10:20:45 -07:00
Eugene Yurtsev	b56ca0c2a4	Deprecate LLMSymbolicMath from langchain core (#11615 ) Deprecate LLMSymbolicMath from langchain core package.	2023-10-10 12:33:51 -04:00
Eugene Yurtsev	c9bce5bbfb	Add version to langchain_experimental (#11613 ) Add version to langchain experimental	2023-10-10 11:17:41 -04:00
Predrag Gruevski	22abeb9f6c	Disable loading jinja2 `PromptTemplate` from file. (#10252 ) jinja2 templates are not sandboxed and are at risk for arbitrary code execution. To mitigate this risk: - We no longer support loading jinja2-formatted prompt template files. - `PromptTemplate` with jinja2 may still be constructed manually, but the class carries a security warning reminding the user to not pass untrusted input into it. Resolves #4394.	2023-10-10 11:15:42 -04:00
Nuno Campos	c7c03d4709	Fix mutation bugs in callback manager configure (#11603 )	2023-10-10 14:50:18 +01:00
cccs-eric	e2a9072b80	Fix CohereRerank configuration (#11583 ) Description: CohereRerank is missing `cohere_api_key` as a field and since extras are forbidden, it is not possible to pass-in the key. The only way is to use an env variable named `COHERE_API_KEY`. For example, if trying to create a compressor like this: ```python cohere_api_key = "......Cohere api key......" compressor = CohereRerank(cohere_api_key=cohere_api_key) ``` you will get the following error: ``` File "/langchain/.venv/lib/python3.10/site-packages/pydantic/v1/main.py", line 341, in __init__ raise validation_error pydantic.v1.error_wrappers.ValidationError: 1 validation error for CohereRerank cohere_api_key extra fields not permitted (type=value_error.extra) ```	2023-10-09 23:26:34 -07:00
Anar	55fef4b64b	implemented add files method in LLMRails (#11518 ) This PR provides add files method with LLMRails. Implemented here are: docs/extras/integrations/vectorstores/llm-rails.ipynb --------- Co-authored-by: Anar Aliyev <aaliyev@mgmt.cloudnet.services>	2023-10-09 16:29:43 -07:00
Stephen Hankinson	316dddc7cd	fix wording of query_sql_database_tool_description (#11530 ) - Description: Fixes minor typo for the query_sql_database_tool_description in the db toolkit - Issue: N/A - Dependencies: N/A - Tag maintainer: @nfcampos - Twitter handle: N/A	2023-10-09 15:32:45 -07:00
Ash Vardanian	1acfe86353	Accelerating Math Utils with SimSIMD (#11566 ) LangChain relies on NumPy to compute cosine distances, which becomes a bottleneck with the growing dimensionality and number of embeddings. To avoid this bottleneck, in our libraries at [Unum](https://github.com/unum-cloud), we have created a specialized package - [SimSIMD](https://github.com/ashvardanian/simsimd), that knows how to use newer hardware capabilities. Compared to SciPy and NumPy, it reaches 3x-200x performance for various data types. Since publication, several LangChain users have asked me if I can integrate it into LangChain to accelerate their workflows, so here I am 🤗 ## Benchmarking To conduct benchmarks locally, run this in your Jupyter: ```py import numpy as np import scipy as sp import simsimd as simd import timeit as tt def cosine_similarity_np(X: np.ndarray, Y: np.ndarray) -> np.ndarray: X_norm = np.linalg.norm(X, axis=1) Y_norm = np.linalg.norm(Y, axis=1) with np.errstate(divide="ignore", invalid="ignore"): similarity = np.dot(X, Y.T) / np.outer(X_norm, Y_norm) similarity[np.isnan(similarity) \| np.isinf(similarity)] = 0.0 return similarity def cosine_similarity_sp(X: np.ndarray, Y: np.ndarray) -> np.ndarray: return 1 - sp.spatial.distance.cdist(X, Y, metric='cosine') def cosine_similarity_simd(X: np.ndarray, Y: np.ndarray) -> np.ndarray: return 1 - simd.cdist(X, Y, metric='cosine') X = np.random.randn(1, 1536).astype(np.float32) Y = np.random.randn(1, 1536).astype(np.float32) repeat = 1000 print("NumPy: {:,.0f} ops/s, SciPy: {:,.0f} ops/s, SimSIMD: {:,.0f} ops/s".format( repeat / tt.timeit(lambda: cosine_similarity_np(X, Y), number=repeat), repeat / tt.timeit(lambda: cosine_similarity_sp(X, Y), number=repeat), repeat / tt.timeit(lambda: cosine_similarity_simd(X, Y), number=repeat), )) ``` ## Results I ran this on an M2 Pro Macbook for various data types and different number of rows in `X` and reformatted the results as a table for readability: \| Data Type \| NumPy \| 
SciPy \| SimSIMD \| \| :--- \| ---: \| ---: \| ---: \| \| `f32, 1` \| 59,114 ops/s \| 80,330 ops/s \| 475,351 ops/s \| \| `f16, 1` \| 32,880 ops/s \| 82,420 ops/s \| 650,177 ops/s \| \| `i8, 1` \| 47,916 ops/s \| 115,084 ops/s \| 866,958 ops/s \| \| `f32, 10` \| 40,135 ops/s \| 24,305 ops/s \| 185,373 ops/s \| \| `f16, 10` \| 7,041 ops/s \| 17,596 ops/s \| 192,058 ops/s \| \| `i8, 10` \| 21,989 ops/s \| 25,064 ops/s \| 619,131 ops/s \| \| `f32, 100` \| 3,536 ops/s \| 3,094 ops/s \| 24,206 ops/s \| \| `f16, 100` \| 900 ops/s \| 2,014 ops/s \| 23,364 ops/s \| \| `i8, 100` \| 5,510 ops/s \| 3,214 ops/s \| 143,922 ops/s \| It's important to note that SimSIMD will underperform if both matrices are huge. That, however, seems to be an uncommon usage pattern for LangChain users. You can find a much more detailed performance report for different hardware models here: - [Apple M2 Pro](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-1-performance-on-apple-m2-pro). - [4th Gen Intel Xeon Platinum](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-2-performance-on-4th-gen-intel-xeon-platinum-8480). - [AWS Graviton 3](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-3-performance-on-aws-graviton-3). ## Additional Notes 1. Previous version used `X = np.array(X)`, to repackage lists of lists. It's an anti-pattern, as it will use double-precision floating-point numbers, which are slow on both CPUs and GPUs. I have replaced it with `X = np.array(X, dtype=np.float32)`, but a more selective approach should be discussed. 2. In numerical computations, it's recommended to explicitly define tolerance levels, which were previously avoided in `np.allclose(expected, actual)` calls. For now, I've set absolute tolerance to distance computation errors as 0.01: `np.allclose(expected, actual, atol=1e-2)`. 
--- - Dependencies: adds `simsimd` dependency - Tag maintainer: @hwchase17 - Twitter handle: @ashvardanian --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-09 14:56:55 -07:00
benchello	5de64e6d60	Add option to specify metadata columns in CSV loader (#11576 ) #### Description This PR adds the option to specify additional metadata columns in the CSVLoader beyond just `Source`. The current CSV loader includes all columns in `page_content` and if we want to have columns specified for `page_content` and `metadata` we have to do something like the below.: ``` csv = pd.read_csv( "path_to_csv" ).to_dict("records") documents = [ Document( page_content=doc["content"], metadata={ "last_modified_by": doc["last_modified_by"], "point_of_contact": doc["point_of_contact"], } ) for doc in csv ] ``` #### Usage Example Usage: ``` csv_test = CSVLoader( file_path="path_to_csv", metadata_columns=["last_modified_by", "point_of_contact"] ) ``` Example CSV: ``` content, last_modified_by, point_of_contact "hello world", "Person A", "Person B" ``` Example Result: ``` Document { page_content: "hello world" metadata: { row: '0', source: 'path_to_csv', last_modified_by: 'Person A', point_of_contact: 'Person B', } ``` --------- Co-authored-by: Ben Chello <bchello@dropbox.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-09 14:56:45 -07:00
Stephen Hankinson	447a523662	fix comments in output format (#11536 ) - Description: Fixes the comments in the ConvoOutputParser. Because the \\\\ is escaping a single \\, they render something like: `"action_input": string \ The input to the action` in the prompt. Changing this to \\\\\\\\ lets it escape two slashes so that it renders a proper comment: `"action_input": string \\ The input to the action` - Issue: N/A - Dependencies: - Tag maintainer: @hwchase17 - Twitter handle:	2023-10-09 14:55:44 -07:00
Michael Landis	8e45f720a8	feat: add momento vector index as a vector store provider (#11567 ) Description: - Added Momento Vector Index (MVI) as a vector store provider. This includes an implementation with docstrings, integration tests, a notebook, and documentation on the docs pages. - Updated the Momento dependency in pyproject.toml and the lock file to enable access to MVI. - Refactored the Momento cache and chat history session store to prefer using "MOMENTO_API_KEY" over "MOMENTO_AUTH_TOKEN" for consistency with MVI. This change is backwards compatible with the previous "auth_token" variable usage. Updated the code and tests accordingly. Dependencies: - Updated Momento dependency in pyproject.toml. Testing: - Run the integration tests with a Momento API key. Get one at the [Momento Console](https://console.gomomento.com) for free. MVI is available in AWS us-west-2 with a superuser key. - `MOMENTO_API_KEY=<your key> poetry run pytest tests/integration_tests/vectorstores/test_momento_vector_index.py` Tag maintainer: @eyurtsev Twitter handle: Please mention @momentohq for this addition to langchain. With the integration of Momento Vector Index, Momento caching, and session store, Momento provides serverless support for the core langchain data needs. Also mention @mlonml for the integration.	2023-10-09 14:02:59 -07:00
Eugene Yurtsev	ca2eed36b7	LangChain cli fix a few bugs (#11573 ) Code was assuming that `git` and `poetry` exist. In addition, it was not ignoring pycache files that get generated during run time	2023-10-09 13:30:16 -07:00
Hugues Chocart	258ae1ba5f	[LLMonitor Callback Handler]: Add error handling (#11563 ) Wraps every callback handler method in error handlers to avoid breaking users' programs when an error occurs inside the handler. Thanks @valdo99 for the suggestion 🙂	2023-10-09 13:26:35 -07:00
Eugene Yurtsev	2aabfafe1e	Module documentation for langchain runnables (#11550 ) Add in code documentation for langchain runnables module.	2023-10-09 16:02:29 -04:00
Eugene Yurtsev	d8fa94e6fa	RunnablePassthrough: In code documentation (#11552 ) Add in code documentation for a runnable passthrough	2023-10-09 16:02:16 -04:00
Eugene Yurtsev	b42f218cfc	RunnableLambda: Add in code docs (#11521 ) Add in code docs for Runnable Lambda	2023-10-09 14:37:46 -04:00
maks-operlejn-ds	f64522fbaf	Reset deanonymizer mapping (#11559 ) @hwchase17 @baskaryan	2023-10-09 11:11:05 -07:00
maks-operlejn-ds	b14b65d62a	Support all presidio entities (#11558 ) https://microsoft.github.io/presidio/supported_entities/ @baskaryan @hwchase17	2023-10-09 11:10:46 -07:00
maks-operlejn-ds	4d62def9ff	Better deanonymizer matching strategy (#11557 ) @baskaryan, @hwchase17	2023-10-09 11:10:29 -07:00
Ash Vardanian	a992b9670d	Fix: Missing DuckDuckGo package version (#11535 ) [The `duckduckgo-search` v3.9.2 was removed from PyPi](https://pypi.org/project/duckduckgo-search/#history). That breaks the build. - Description: refreshes the Poetry dependency to v3.9.3 - Tag maintainer: @baskaryan - Twitter handle: @ashvardanian	2023-10-09 10:55:46 -07:00
Bagatur	8932ed3f07	bump 311 (#11555 )	2023-10-09 08:17:07 -07:00
Bagatur	e7a0def1bc	QoL improvements to query constructor (#11504 ) updating query constructor and self query retriever to - make it easier to pass in examples - validate attributes used in query - remove invalid parts of query - make it easier to get + edit prompt - make query constructor a runnable - make self query retriever use as runnable	2023-10-09 08:10:52 -07:00
Taikono-Himazin	eec53fa294	Added autodetect_encoding option to csvLoader (#11327 )	2023-10-09 08:06:43 -07:00
Holt Skinner	09c66fe04f	feat: Update Google Document AI Parser (#11413 ) - Description: Code Refactoring, Documentation Improvements for Google Document AI PDF Parser - Adds Online (synchronous) processing option. - Adds default field mask to limit payload size. - Skips Human review by default. - Issue: Fixes #10589 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-09 08:04:25 -07:00
Nuno Campos	628cc4cce8	Rename RunnableMap to RunnableParallel (#11487 ) - keep alias for RunnableMap - update docs to use RunnableParallel and RunnablePassthrough.assign	2023-10-09 11:22:03 +01:00
Eugene Yurtsev	6a10e8ef31	Add documentation to Runnable (#11516 )	2023-10-08 08:09:04 +01:00
William FH	eb572f41a6	Add LangSmith Run Chat Loader (#11458 )	2023-10-06 17:02:18 -07:00
David Duong	484947c492	Fetch up-to-date attributes for env-pulled kwargs during serialisation of OpenAI classes (#11499 )	2023-10-06 22:43:29 +01:00
Bagatur	5470e730d2	raise openapi import error (#11495 )	2023-10-06 12:57:24 -07:00
Erick Friis	29f5f70415	Rename some last hwchase17/langchain links (#11494 )	2023-10-06 12:34:30 -07:00
Fabrice Pont	872836c541	feat: add markdown list parser (#11411 ) Description: add `MarkdownListOutputParser` as a new `ListOutputParser` Issue: #11410	2023-10-06 12:25:45 -07:00
Erick Friis	8f50b616c5	Remove optional from vectara source (#11493 ) fyi @ofermend --------- Co-authored-by: Ofer Mendelevitch <ofer@vectara.com> Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>	2023-10-06 12:12:44 -07:00
Bagatur	53887242a1	bump 310 (#11486 )	2023-10-06 09:49:10 -07:00
Jesús Vélez Santiago	a1c7532298	Add async sql record manager and async indexing API (#10726 ) - Description: Add support for a SQLRecordManager in async environments. It includes the creation of `RecorManagerAsync` abstract class. - Issue: None - Dependencies: Optional `aiosqlite`. - Tag maintainer: @nfcampos - Twitter handle: @jvelezmagic --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-06 09:38:44 -04:00
Qihui Xie	57ade13b2b	fix llm_inputs duplication problem in intermediate_steps in SQLDatabaseChain (#10279 ) Use `.copy()` to fix the bug that the first `llm_inputs` element is overwritten by the second `llm_inputs` element in `intermediate_steps`. *Problem description:* In [line 127]( `c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L127C17-L127C17)`), the `llm_inputs` of the sql generation step is appended as the first element of `intermediate_steps`: ``` intermediate_steps.append(llm_inputs) # input: sql generation ``` However, `llm_inputs` is a mutable dict, it is updated in [line 179](https://github.com/langchain-ai/langchain/blob/master/libs/experimental/langchain_experimental/sql/base.py#L179) for the final answer step: ``` llm_inputs["input"] = input_text ``` Then, the updated `llm_inputs` is appended as another element of `intermediate_steps` in [line 180](`c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L180)`): ``` intermediate_steps.append(llm_inputs) # input: final answer ``` As a result, the final `intermediate_steps` returned in [line 189](`c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L189C43-L189C43)`) actually contains two same `llm_inputs` elements, i.e., the `llm_inputs` for the sql generation step overwritten by the one for final answer step by mistake. Users are not able to get the actual `llm_inputs` for the sql generation step from `intermediate_steps` Simply calling `.copy()` when appending `llm_inputs` to `intermediate_steps` can solve this problem.	2023-10-05 21:32:08 -07:00
Florian	d78f418c0d	Extract abstracts from Pubmed articles, even if they have no extra label (#10245 ) ### Description This pull request involves modifications to the extraction method for abstracts/summaries within the PubMed utility. A condition has been added to verify the presence of unlabeled abstracts. Now an abstract will be extracted even if it does not have a subtitle. In addition, the extraction of the abstract was extended to books. ### Issue The PubMed utility occasionally returns an empty result when extracting abstracts from articles, despite the presence of an abstract for the paper on PubMed. This issue arises due to the varying structure of articles; some articles follow a "subtitle/label: text" format, while others do not include subtitles in their abstracts. An example of the latter case can be found at: https://pubmed.ncbi.nlm.nih.gov/37666905/ --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:56:46 -07:00
Viktor Zhemchuzhnikov	fd9da60aea	Add async support to SelfQueryRetriever (#10175 ) ### Description SelfQueryRetriever is missing async support, so I am adding it. I also removed deprecated predict_and_parse method usage here, and added some tests. ### Issue N/A ### Tag maintainer Not yet ### Twitter handle N/A	2023-10-05 18:54:21 -07:00
Theron Tau	35297ca0d3	Add feature for extracting images from pdf and recognizing text from images. (#10653 ) Description Addresses #10423: it is useful to be able to extract images from a PDF and recognize the text in them. I have implemented this for `PyPDFLoader`, `PyPDFium2Loader`, `PyPDFDirectoryLoader`, `PyMuPDFLoader`, `PDFMinerLoader`, and `PDFPlumberLoader`. [RapidOCR](https://github.com/RapidAI/RapidOCR.git) is used to recognize text in the extracted images. Because OCR is time-consuming, a boolean parameter `extract_images` controls whether images are extracted and recognized. I measured the time taken by each parser with a unit test on my own laptop (ThinkBook 14+ with AMD R7-6800H); the results are: \| extract_images \| PyPDFParser \| PDFMinerParser \| PyMuPDFParser \| PyPDFium2Parser \| PDFPlumberParser \| \| ------------- \| ------------- \| ------------- \| ------------- \| ------------- \| ------------- \| \| False \| 0.27s \| 0.39s \| 0.06s \| 0.08s \| 1.01s \| \| True \| 17.01s \| 20.67s \| 20.32s \| 19.75s \| 20.55s \| Issue #10423 Dependencies rapidocr_onnxruntime in [RapidOCR](https://github.com/RapidAI/RapidOCR/tree/main) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:51:59 -07:00
Bagatur	8e3fbc97ca	Add vowpal_wabbit RL chain (#11462 )	2023-10-05 18:39:45 -07:00
Haris Wang	f1269830a0	Fix bug in MarkdownHeaderTextSplitter for codeblock (#10262 ) - Description: The previous version of the MarkdownHeaderTextSplitter did not take into account the possibility of '#' appearing within code blocks, which caused segmentation anomalies in these situations. This PR has fixed this issue. - Issue: - Dependencies: No - Tag maintainer: - Twitter handle: cc @baskaryan @eyurtsev @rlancemartin --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:34:42 -07:00
Eddie Cohen	656d2303f7	add in, nin for pinecone (#10303 ) Description: Adds the in and nin comparators for pinecone seen [here](https://docs.pinecone.io/docs/metadata-filtering#metadata-query-language) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:31:09 -07:00
Bagatur	a3a2ce623e	Revise vowpal_wabbit notebook	2023-10-05 18:18:19 -07:00
Bagatur	8fafa1af91	merge	2023-10-05 18:09:35 -07:00
olgavrou	3b07c0cf3d	RL Chain with VowpalWabbit (#10242 ) - Description: This PR adds a new chain `rl_chain.PickBest` for learned prompt variable injection, detailed description and usage can be found in the example notebook added. It essentially adds a [VowpalWabbit](https://github.com/VowpalWabbit/vowpal_wabbit) layer before the llm call in order to learn or personalize prompt variable selections. Most of the code is to make the API simple and provide lots of defaults and data wrangling that is needed to use Vowpal Wabbit, so that the user of the chain doesn't have to worry about it. - Dependencies: [vowpal-wabbit-next](https://pypi.org/project/vowpal-wabbit-next/), - sentence-transformers (already a dep) - numpy (already a dep) - tagging @ataymano who contributed to this chain - Tag maintainer: @baskaryan - Twitter handle: @olgavrou Added example notebook and unit tests	2023-10-05 18:07:22 -07:00
Manikanta5112	56048b909f	added ContentFormatter escape special characters for message content (#10319 ) --------- Co-authored-by: Manikanta5112 <42089393+mani5112@users.noreply.github.com>	2023-10-05 18:02:29 -07:00
Leonid Ganeline	d17416ec79	docstrings `callbacks` (#11456 ) Added missed docstrings to the `callbacks/` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-05 17:13:14 -07:00
Ofer Mendelevitch	3c7653bf0f	"source" argument in constructor of Vectara (#11454 ) - Description: minor update to constructor to allow for specification of "source" - Tag maintainer: @baskaryan - Twitter handle: @ofermend	2023-10-05 17:04:14 -07:00
Eugene Yurtsev	d9018ae5f1	Improve CLI ux (#11452 ) Improve UX for cli	2023-10-05 19:40:00 -04:00
Jaikanth J	9f85f7c543	fix(cache): use dumps for RedisCache (#10408 ) # Description Attempts to fix RedisCache for ChatGenerations using `loads` and `dumps` used in SQLAlchemy cache by @hwchase17 . this is better than pickle dump, because this won't execute any arbitrary code during de-serialisation. # Issues #7722 & #8666 # Dependencies None, but removes the warning introduced in #8041 by @baskaryan Handle: @jaikanthjay46	2023-10-05 16:34:07 -07:00
rodrigo-clickup	5944c1851b	Add ClickUp Toolkit (#10662 ) - Description: Adds a toolkit to interact with the [ClickUp](https://clickup.com/) [Public API](https://clickup.com/api/) - Dependencies: None - Tag maintainer: @rodrigo-georgian, @rodrigo-clickup, @aiswaryasankarwork - Twitter handle: - Aiswarya (https://twitter.com/Aiswarya_Sankar, https://www.linkedin.com/in/sankaraiswarya/) - Rodrigo (https://www.linkedin.com/in/rodrigo-ceballos-lentini/) --------- Co-authored-by: Aiswarya Sankar <aiswaryasankar@Aiswaryas-MacBook-Pro.local> Co-authored-by: aiswaryasankarwork <143119412+aiswaryasankarwork@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 16:33:05 -07:00
John Reynolds	68901e1e40	Update output_parser.py (#10430 ) - Description: Updated output parser for mrkl to remove any hallucination actions after the final answer; this was encountered when using Anthropic claude v2 for planning; reopening PR with updated unit tests - Issue: #10278 - Dependencies: N/A - Twitter handle: @johnreynolds	2023-10-05 15:47:24 -07:00
Joshua Sundance Bailey	790010703b	ArcGISLoader: Limit number of results in query (#10615 ) Description: this PR changes the `ArcGISLoader` to set `return_all_records` to `False` when `result_record_count` is provided as a keyword argument. Previously, `return_all_records` was `True` by default and this made the API ignore `result_record_count`. Issue: `ArcGISLoader` would ignore `result_record_count` unless user also passed `return_all_records=False`.	2023-10-05 15:46:02 -07:00
mrbean	9903a70379	Add youdotcom retriever (#11304 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 13:48:11 -07:00
ashish-dahal	1655ff2ded	Fix PyMuPDFLoader kwargs (#11434 ) - Description: Fix the `PyMuPDFLoader` to accept `loader_kwargs` from the document loader's `loader_kwargs` option. This provides more flexibility in formatting the output from documents. - Issue: The `loader_kwargs` is not passed into the `load` method from the document loader, which limits configuration options. - Dependencies: None --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 13:25:19 -07:00
Leonid Kuligin	e4a46747dc	integration test for DocAI parser (#11424 ) - Description: added an integration test - Issue: #11407 @baskaryan	2023-10-05 12:38:29 -07:00
Aashish Saini	2abbdc6ecb	Update bageldb.py (#11421 ) I have restructured the code to ensure uniform handling of ImportError. In place of previously used ValueError, I've adopted the standard practice of raising ImportError with explanatory messages. This modification enhances code readability and clarifies that any problems stem from module importation.	2023-10-05 12:37:56 -07:00
maks-operlejn-ds	2aae1102b0	Instance anonymization (#10501 ) ### Description Add instance anonymization - if `John Doe` appears twice in the text, it is treated as the same entity. The difference between `PresidioAnonymizer` and `PresidioReversibleAnonymizer` is that only the second one has a built-in memory, so it remembers the anonymization mapping across multiple texts: ``` >>> anonymizer = PresidioAnonymizer() >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Noah Rhodes. Hi Noah Rhodes!' >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Brett Russell. Hi Brett Russell!' ``` ``` >>> anonymizer = PresidioReversibleAnonymizer() >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Noah Rhodes. Hi Noah Rhodes!' >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Noah Rhodes. Hi Noah Rhodes!' ``` ### Twitter handle @deepsense_ai / @MaksOpp ### Tag maintainer @baskaryan @hwchase17 @hinthornw --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:23:02 -07:00
Kyle Pancamo	203258b4d6	Update pdf.py comment for PyPDFLoader (#10495 ) PyPDF does not chunk at the character level to my understanding. Description: PyPDF does not chunk at the character level, but instead breaks up content by page. Fixup comment --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:22:40 -07:00
Juan Daza	4236ae3851	Added Streaming Capability to SageMaker LLMs (#10535 ) This PR adds the ability to declare a Streaming response in the SageMaker LLM by leveraging the `invoke_endpoint_with_response_stream` capability in `boto3`. It is heavily based on the AWS Blog Post announcement linked [here](https://aws.amazon.com/blogs/machine-learning/elevating-the-generative-ai-experience-introducing-streaming-support-in-amazon-sagemaker-hosting/). It does not add any additional dependencies since it uses the existing `boto3` version. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:08:43 -07:00
Laurentiu Piciu	d9670a5945	openai_functions_multi_agent: solved the case when the "arguments" is valid JSON but it does not contain `actions` key (#10543 ) Description: There are cases when the output from the LLM comes fine (i.e. function_call["arguments"] is a valid JSON object), but it does not contain the key "actions". So I split the validation in 2 steps: loading arguments as JSON and then checking for "actions" in it. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:08:09 -07:00
Eugene Yurtsev	fcccde406d	Add SymbolicMathChain to experiment in preparation for deprecation (#11129 ) Move symbolic math chain to experimental	2023-10-05 13:54:43 -04:00
Holt Skinner	9f73fec057	fix: Update Google Cloud Enterprise Search to Vertex AI Search (#10513 ) - Description: Google Cloud Enterprise Search was renamed to Vertex AI Search - https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-search-and-conversation-is-now-generally-available - This PR updates the documentation and Retriever class to use the new terminology. - Changed retriever class from `GoogleCloudEnterpriseSearchRetriever` to `GoogleVertexAISearchRetriever` - Updated documentation to specify that `extractive_segments` requires the new [Enterprise edition](https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features#enterprise-features) to be enabled. - Fixed spelling errors in documentation. - Change parameter for Retriever from `search_engine_id` to `data_store_id` - When this retriever was originally implemented, there was no distinction between a data store and search engine, but now these have been split. - Fixed an issue blocking some users where the api_endpoint can't be set	2023-10-05 10:47:47 -07:00
Patrick Randell	1d678f805f	Additional Weaviate Filter Comparators (#10522 ) ### Description When using Weaviate Self-Retrievers, certain common filter comparators generated by user queries were unimplemented, resulting in errors. This PR implements some of them. All linting and format commands have been run and tests passed. ### Issue #10474 ### Dependencies timestamp module --------- Co-authored-by: Patrick Randell <prandell@deloitte.com.au>	2023-10-05 10:40:04 -07:00
Nuno Campos	79011f835f	Remove str() from RunnableConfigurableAlternatives (#11446 )	2023-10-05 18:40:00 +01:00
Harrison Chase	31d5bd84d7	make vectorstores optional (#11393 )	2023-10-05 10:14:05 -07:00
Eugene Yurtsev	8aa545901a	Update agent type docs (#11137 ) In code docs for agent types	2023-10-05 12:51:14 -04:00
Eugene Yurtsev	3e31d6e35f	Start deprecation of LLMBashChain (#11300 ) In preparation for migrating LLMBashChain and related tools, add a deprecation warning to the code.	2023-10-05 12:48:22 -04:00
Bagatur	8b6b8bf68c	bump 309 (#11443 )	2023-10-05 09:29:14 -07:00
billytrend-cohere	2ff91a46c0	Add cohere /chat integration (#11389 ) Add cohere /chat integration and an iPython notebook to demonstrate the addition. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 09:20:47 -07:00
adrienohana	ca346011b7	added interactive login for azure cognitive search vector store (#11360 ) Description: Previously, if access to Azure Cognitive Search was not done via an API key, the default credential was used, which does not allow an interactive login. I simply added the option to use "INTERACTIVE" as a key name, and this will launch a login window upon initialization of the AzureSearch object.	2023-10-05 09:20:18 -07:00
Eugene Yurtsev	5a1f614175	Add docker compose to CLI (#11406 ) Add docker compose to cli	2023-10-05 15:58:56 +01:00
Predrag Gruevski	e2d6c41177	Upgrade langchain dependencies. (#11420 ) I was hoping this would pick up numpy 1.26, which is required to support the new Python 3.12 release, but it didn't. It seems that some transitive dependency requirement on numpy is preventing that, and the highest we can currently go is 1.24.x. But to find this out required a 15min `poetry lock`, so I figured we might as well upgrade the dependencies we can and hopefully make the next dependency upgrade a bit smaller.	2023-10-05 15:57:20 +01:00
Jacob Lee	71fd6428c5	Remove overridden async not implemented method on embeddings filters and add default async implementation for document compressors (#11415 ) @nfcampos @eyurtsev @baskaryan --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-10-05 15:56:03 +01:00
Nuno Campos	2f490be09b	Fix .dict() for agent/chain (#11436 )	2023-10-05 15:51:21 +01:00
Nuno Campos	1e59c44d36	Nc/5oct/runnable release (#11428 )	2023-10-05 14:27:50 +01:00
Bagatur	58b7a3ba16	Rm bedrock anthropic error (#11403 )	2023-10-04 23:31:51 -04:00
Predrag Gruevski	c9986bc3a9	Tweak type hints to match dependency's behavior. (#11355 ) Needs #11353 to merge first, and a new `langchain` to be published with those changes.	2023-10-04 22:36:58 -04:00
William FH	940b9ae30a	Normalize Option in Scoring Chain (#11412 )	2023-10-04 15:59:28 -07:00
Eugene Yurtsev	70be04a816	CLI: Readme update (#11404 ) Consolidating to a single README for now; it will be easier to maintain, and we can differentiate between poetry and pip later. Does not seem critical. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-04 16:25:37 -04:00
Nuno Campos	fde19c8667	Add CLI command to create a new project (#7837 ) First version of CLI command to create a new langchain project template Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-04 15:43:41 -04:00
mhwang-stripe	9cea796671	Make langchain compatible with SQLAlchemy<1.4.0 (#11390 ) ## Description Currently SQLAlchemy >=1.4.0 is a hard requirement. We are unable to run `from langchain.vectorstores import FAISS` with SQLAlchemy <1.4.0 due to top-level imports, even if we aren't using parts of the library that use SQLAlchemy. See Testing section for repro. Let's make it so that langchain is still compatible with SQLAlchemy <1.4.0, especially if we aren't using parts of langchain that require it. The main conflict is that SQLAlchemy 1.4.0 moved `declarative_base` from `sqlalchemy.ext.declarative` to `sqlalchemy.orm`. We can fix this by wrapping the import in a try/except. This is the same fix as applied in https://github.com/langchain-ai/langchain/pull/883. (I see that there seems to be some refactoring going on about isolating dependencies, e.g. `c87e9fb2ce`, so if this issue will be eventually fixed by isolating imports in langchain.vectorstores that also works). ## Issue I can't find a matching issue. ## Dependencies No additional dependencies ## Maintainer @hwchase17 since you reviewed https://github.com/langchain-ai/langchain/pull/883 ## Testing I didn't add a test, but I manually tested this. 1. Current failure: ``` langchain==0.0.305 sqlalchemy==1.3.24 ``` ``` python python -i >>> from langchain.vectorstores import FAISS Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/pay/src/zoolander/vendor3/lib/python3.8/site-packages/langchain/vectorstores/__init__.py", line 58, in <module> from langchain.vectorstores.pgembedding import PGEmbedding File "/pay/src/zoolander/vendor3/lib/python3.8/site-packages/langchain/vectorstores/pgembedding.py", line 10, in <module> from sqlalchemy.orm import Session, declarative_base, relationship ImportError: cannot import name 'declarative_base' from 'sqlalchemy.orm' (/pay/src/zoolander/vendor3/lib/python3.8/site-packages/sqlalchemy/orm/__init__.py) ``` 2. This fix: ``` langchain==<this PR> sqlalchemy==1.3.24 ``` ``` python python -i >>> from langchain.vectorstores import FAISS <succeeds> ```	2023-10-04 15:41:20 -04:00
Nuno Campos	4d66756d93	Improve output of Runnable.astream_log() (#11391 ) - Make logs a dictionary keyed by run name (and counter for repeats) - Ensure no output shows up in lc_serializable format - Fix up repr for RunLog and RunLogPatch	2023-10-04 20:16:37 +01:00


1892 Commits