langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-18 09:25:54 +00:00

Author	SHA1	Message	Date
Hech	8fe6bcc662	Fix return metadata when searching for DingoDB (#12937 )	2023-11-06 07:35:36 -08:00
Jakub Novák	ada3d2cbd1	Add possibility to pass on_artifacts for a specific conversation (#12687 ) Adds the possibility to pass on_artifacts to a single conversation. This can be done as follows: ```python result = agent.run( input=message.text, metadata={ "on_artifact": CALLBACK_FUNCTION }, ) ```	2023-11-06 07:29:47 -08:00
Bagatur	53f453f01a	bump 331 (#12932 )	2023-11-06 05:58:12 -08:00
Erick Friis	5000c7308e	cli template gitignores (#12914 ) - ap gitignore - package	2023-11-05 22:34:45 -08:00
Harrison Chase	aba407f774	use keys not items (#12918 )	2023-11-05 22:08:29 -08:00
wemysschen	e14aa37d59	fix bes vector store search (#12828 ) Issue: fix search body in baidu cloud vectorsearch --------- Co-authored-by: wemysschen <root@icoding-cwx.bcc-szzj.baidu.com>	2023-11-03 15:39:19 -07:00
Lance Martin	ea1ab391d4	Open Clip multimodal embeddings (#12754 )	2023-11-03 13:33:36 -07:00
Bagatur	ebee616822	bump 330 (#12853 )	2023-11-03 13:26:41 -07:00
Erick Friis	6c237716c4	Update readmes with new cli install (#12847 ) Old command still works. Just simplifying. Merge after releasing CLI 0.0.15	2023-11-03 12:10:32 -07:00
Erick Friis	7db49d3842	Confirm sys.path includes current dir for app serve (#12851 ) - Make sure sys.path is set properly for langchain app serve - bump	2023-11-03 11:37:20 -07:00
Erick Friis	1bc35f61cb	CLI 0.0.14, Uvicorn update and no more [serve] (#12845 ) Calls uvicorn directly from cli: Reload works if you define app by import string instead of object. (was doing subprocess in order to get reloading) Version bump to 0.0.14 Remove the need for [serve] for simplicity. Readmes are updated in #12847 to avoid cluttering this PR	2023-11-03 11:05:52 -07:00
William FH	18005c6384	Disable trace_on_chain_group auto-tracing (#12807 ) Previously we treated trace_on_chain_group as a command to always start tracing. This is unintuitive (makes the function do 2 things), and makes it harder to toggle tracing	2023-11-03 10:05:09 -07:00
Erick Friis	0da75b9ebd	Autopopulate module name in cli init (#12814 )	2023-11-02 23:45:38 -07:00
William FH	98aff29fbd	Add Dataset Page to printout (#12816 )	2023-11-02 20:36:56 -07:00
Manuel Rech	2e2b9c76d9	Keep also original query - multi_query.py (#12696 ) When you use a MultiQuery it might be useful to use the original query as well as the newly generated ones to maximise the chances of retrieving the correct document. I haven't created an issue, it seems a very small and easy thing. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 18:15:02 -07:00
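As a rough illustration of the idea in the commit above (not the actual `multi_query.py` code), the sketch below merges the original query with the generated variants. The function name `build_queries` and its `include_original` parameter are hypothetical and exist only for this example.

```python
from typing import List


def build_queries(original: str, generated: List[str], include_original: bool = True) -> List[str]:
    """Combine generated query variants with the original query (hypothetical helper)."""
    queries = list(generated)
    if include_original:
        # Keep the user's own wording as one more retrieval attempt.
        queries.append(original)
    # Drop exact duplicates while preserving order.
    return list(dict.fromkeys(queries))


queries = build_queries(
    "What is TileDB?",
    ["Explain TileDB", "What is TileDB?", "TileDB overview"],
)
```

Each resulting query would then be sent to the retriever and the results merged; that step is left out here.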
Bagatur	658a3a8607	FEAT: Merge TileDB vecstore (#12811 )	2023-11-02 17:40:32 -07:00
Akio Nishimura	c04647bb4e	Correct number of elements in config list in `batch()` and `abatch()` of `BaseLLM` (#12713 ) - Description: Correct number of elements in config list in `batch()` and `abatch()` of `BaseLLM` in case `max_concurrency` is not None. - Issue: #12643 - Twitter handle: @akionux --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 17:28:48 -07:00
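The commit above does not show the patch itself, so the following is only a sketch of the invariant it describes: when inputs are split into sub-batches of at most `max_concurrency` items, each sub-batch should receive exactly as many configs as it has inputs. The helper `split_batches` is hypothetical and is not part of `BaseLLM`.

```python
from typing import Any, Dict, List, Optional, Tuple


def split_batches(
    inputs: List[str],
    configs: List[Dict[str, Any]],
    max_concurrency: Optional[int],
) -> List[Tuple[List[str], List[Dict[str, Any]]]]:
    """Pair each slice of inputs with the matching slice of configs (hypothetical helper)."""
    if len(inputs) != len(configs):
        raise ValueError("inputs and configs must have the same length")
    if max_concurrency is None:
        # No limit: a single batch holding everything.
        return [(inputs, configs)]
    return [
        (inputs[i : i + max_concurrency], configs[i : i + max_concurrency])
        for i in range(0, len(inputs), max_concurrency)
    ]


batches = split_batches(["a", "b", "c"], [{"i": 0}, {"i": 1}, {"i": 2}], max_concurrency=2)
```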
James Braza	88b506b321	Adds missing `urllib.parse` for IDE warning of `PubMedAPIWrapper` (#12808 ) Resolves an IDE (PyCharm 2023.2.3 PE) warning around `urllib.parse.quote`, also enabling CTRL-click	2023-11-02 17:27:25 -07:00
Bagatur	a2bb0dd445	TileDB update import unit tests	2023-11-02 17:24:22 -07:00
Nikos Papailiou	2fdaa1e5fd	Add TileDB vectorstore implementation (#12624 ) - Description: Add [TileDB](https://tiledb.com) vectorstore implementation. TileDB offers ANN search capabilities using the [TileDB-Vector-Search](https://github.com/TileDB-Inc/TileDB-Vector-Search) module. It provides serverless execution of ANN queries and storage of vector indexes both on local disk and cloud object stores (i.e. AWS S3). More details in: - [Why TileDB as a Vector Database](https://tiledb.com/blog/why-tiledb-as-a-vector-database) - [TileDB 101: Vector Search](https://tiledb.com/blog/tiledb-101-vector-search) - Twitter handle: @tiledb	2023-11-02 17:21:03 -07:00
盐粒 Yanli	1b233798a0	feat: Support pgvecto.rs as a VectorStore (#12718 ) Support [pgvecto.rs](https://github.com/tensorchord/pgvecto.rs) as a new VectorStore type. This introduces a new dependency [pgvecto_rs](https://pypi.org/project/pgvecto_rs/) and upgrades SQLAlchemy to ^2. Related to https://github.com/tensorchord/pgvecto.rs/issues/11 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 17:16:04 -07:00
Daniel Chalef	0cbdba6a9b	zep: VectorStore: Use Native MMR (#12690 ) - refactor to use Zep's native MMR; update example - @baskaryan @eyurtsev	2023-11-02 16:45:42 -07:00
Daniel Chalef	cc3d3920e3	Zep: Summary Search and Example (#12686 ) Zep now has the ability to search over chat history summaries. This PR adds support for doing so. More here: https://blog.getzep.com/zep-v0-17/ @baskaryan @eyurtsev	2023-11-02 16:31:11 -07:00
Bagatur	526313002c	add import tests to all modules (#12806 )	2023-11-02 15:32:55 -07:00
Harrison Chase	6609a6033f	fix vectorstore imports (#12804 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 15:32:31 -07:00
Nuno Campos	f66a9d2adf	Automatically add configurable key to config_schema if config_specs i… (#12798 ) …s present	2023-11-02 21:46:15 +00:00
Praveen Venkateswaran	21eeba075c	enable the device_map parameter in huggingface pipeline (#12731 ) ### Enabling `device_map` in HuggingFacePipeline For multi-gpu settings with large models, the [accelerate](https://huggingface.co/docs/accelerate/usage_guides/big_modeling#using--accelerate) library provides the `device_map` parameter to automatically distribute the model across GPUs / disk. The [Transformers pipeline](`3520e37e86/src/transformers/pipelines/__init__.py (L543)`) enables users to specify `device` (or) `device_map`, and handles cases (with warnings) when both are specified. However, Langchain's HuggingFacePipeline only supports specifying `device` when calling transformers which limits large models and multi-gpu use-cases. Additionally, the [default value](`8bd3ce59cd/libs/langchain/langchain/llms/huggingface_pipeline.py (L72)`) of `device` is initialized to `-1` , which is incompatible with the transformers pipeline when `device_map` is specified. This PR addresses the addition of `device_map` as a parameter , and solves the incompatibility of `device = -1` when `device_map` is also specified. An additional test has been added for this feature. Additionally, some existing tests no longer work since 1. `max_new_tokens` has to be specified under `pipeline_kwargs` and not `model_kwargs` 2. The GPT2 tokenizer raises a `ValueError: Pipeline with tokenizer without pad_token cannot do batching`, since the `tokenizer.pad_token` is `None` ([related issue](https://github.com/huggingface/transformers/issues/19853) on the transformers repo). This PR handles fixing these tests as well. Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>	2023-11-02 14:29:06 -07:00
Mark Bell	3276aa3e17	__getattr__ should raise AttributeError not ImportError on missing attributes (#12801 ) [The python spec](https://docs.python.org/3/reference/datamodel.html#object.__getattr__) requires that `__getattr__` throw `AttributeError` for missing attributes but there are several places throwing `ImportError` in the current code base. This causes a specific problem with `hasattr` since it calls `__getattr__` then looks only for `AttributeError` exceptions. At present, calling `hasattr` on any of these modules will raise an unexpected exception that most code will not handle as `hasattr` throwing exceptions is not expected. In our case this is triggered by an exception tracker (Airbrake) that attempts to collect the version of all installed modules with code that looks like: `if hasattr(mod, "__version__"):`. With `HEAD` this is causing our exception tracker to fail on all exceptions. I only changed instances of unknown attributes raising `ImportError` and left instances of known attributes raising `ImportError`. It feels a little weird but doesn't seem to break anything.	2023-11-02 17:08:54 -04:00
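The `hasattr` behaviour described in the commit above can be reproduced with the standard library alone. This is a minimal sketch, not code from the repository: `make_module` is a hypothetical helper that builds a throwaway module whose module-level `__getattr__` raises whichever exception type is passed in.

```python
import types


def make_module(name: str, error_type: type) -> types.ModuleType:
    """Build a module whose __getattr__ raises error_type for unknown names (hypothetical helper)."""
    module = types.ModuleType(name)

    def __getattr__(attr: str):
        raise error_type(f"module {name!r} has no attribute {attr!r}")

    # Module-level __getattr__ (PEP 562) is looked up in the module's namespace.
    module.__getattr__ = __getattr__
    return module


good = make_module("demo_good", AttributeError)
bad = make_module("demo_bad", ImportError)

# hasattr() swallows AttributeError only, so this is simply False.
good_has_version = hasattr(good, "__version__")

# With ImportError the exception escapes hasattr() and reaches the caller.
try:
    hasattr(bad, "__version__")
    bad_raised = False
except ImportError:
    bad_raised = True
```

A version probe such as `if hasattr(mod, "__version__"):` therefore works against `good` but fails against `bad`, which matches the failure mode the commit reports.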
Illia	71d1a48b66	Use data from all Google search results in SerpApi.com wrapper (#12770 ) - Description: Use all Google search results data in SerpApi.com wrapper instead of the first one only - Tag maintainer: @hwchase17 _P.S. `libs/langchain/tests/integration_tests/utilities/test_serpapi.py` are not executed during the `make test`._	2023-11-02 13:31:27 -07:00
Nuno Campos	c4fdf78d03	Fix AddableDict raising exception when used with non-addable values (#12785 )	2023-11-02 18:56:29 +00:00
Erick Friis	49e283a0cd	CLI 0.0.13, Configurable Template Demo (#12796 )	2023-11-02 11:42:57 -07:00
Nuno Campos	d1c6ad7769	Fix on_llm_new_token(chunk=) for some chat models (#12784 ) It was passing in message instead of generation	2023-11-02 16:33:44 +00:00
Erick Friis	070823f294	CLI 0.0.12 (#12787 )	2023-11-02 08:29:27 -07:00
Bagatur	979501c0ca	bump 329 (#12778 )	2023-11-02 06:02:43 -07:00
Erick Friis	da821320d3	Fixes 'Nonetype' not iterable for ObsidianLoader (#12751 ) Implements #12726 from @Di3mex	2023-11-01 16:07:09 -07:00
Eugene Yurtsev	b1caae62fd	APIChain add restrictions to domains (CVE-2023-32786) (#12747 ) * Restrict the chain to specific domains by default * This is a breaking change, but it will fail loudly upon object instantiation -- so there should be no silent errors for users * Resolves CVE-2023-32786	2023-11-01 18:50:34 -04:00
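One way to picture the domain restriction described in the commit above is an allowlist check on the scheme and host of each outgoing URL. The sketch below uses only `urllib.parse`; the function `is_allowed_url` is hypothetical and does not claim to reproduce how `APIChain` implements the check.

```python
from typing import Sequence
from urllib.parse import urlparse


def is_allowed_url(url: str, allowed_domains: Sequence[str]) -> bool:
    """Return True only if url's scheme and host match an allowed entry (hypothetical helper)."""
    target = urlparse(url)
    for domain in allowed_domains:
        allowed = urlparse(domain)
        if (target.scheme, target.netloc) == (allowed.scheme, allowed.netloc):
            return True
    # An empty allowlist rejects everything, i.e. restrictive by default.
    return False


allowed = ["https://api.example.com"]
ok = is_allowed_url("https://api.example.com/v1/items?id=3", allowed)
blocked = is_allowed_url("https://evil.example.org/steal", allowed)
```

Comparing the parsed host rather than doing a substring match avoids accepting URLs such as `https://api.example.com.evil.example.org/`.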
Erick Friis	4421ba46d7	Demo Server, Fix Timescale (#12746 ) - improve demo server - missing deps	2023-11-01 15:29:34 -07:00
Eugene Yurtsev	0e1aedb9f4	Use jinja2 sandboxing by default (#12733 ) * This is an opt-in feature, so users should be aware of risks if using jinja2. * Regardless we'll add sandboxing by default to jinja2 templates -- this sandboxing is a best effort basis. * Best strategy is still to make sure that jinja2 templates are only loaded from trusted sources.	2023-11-01 14:54:01 -07:00
Erick Friis	14340ee7cd	use http.client instead of urllib3 (#12660 ) dep problems with requests cloudflare debugging not worth it with urllib	2023-11-01 11:15:05 -07:00
Bagatur	eee5181b7a	bump 328, exp 37 (#12722 )	2023-11-01 10:27:39 -07:00
Erick Friis	3405dbbc64	dash not underscore (#12716 ) template names are auto-populating with the wrong convention (with underscores)	2023-11-01 09:48:37 -07:00
123-fake-st	8bd3ce59cd	PyPDFLoader use url in metadata source if file is a web path (#12092 ) Description: Update `langchain.document_loaders.pdf.PyPDFLoader` to store url in metadata (instead of a temporary file path) if user provides a web path to a pdf - Issue: Related to #7034; the reporter on that issue submitted a PR updating `PyMuPDFParser` for this behavior, but it has unresolved merge issues as of 20 Oct 2023 #7077 - In addition to `PyPDFLoader` and `PyMuPDFParser`, these other classes in `langchain.document_loaders.pdf` exhibit similar behavior and could benefit from an update: `PyPDFium2Loader`, `PDFMinerLoader`, `PDFMinerPDFasHTMLLoader`, `PDFPlumberLoader` (I'm happy to contribute to some/all of that, including assisting with `PyMuPDFParser`, if my work is agreeable) - The root cause is that the underlying pdf parser classes, e.g. `langchain.document_loaders.parsers.pdf.PyPDFParser`, never receive information about the url; the parsers receive a `langchain.document_loaders.blob_loaders.blob`, which contains the pdf contents and local file path, but not the url - This update passes the web path directly to the parser since it's minimally invasive and doesn't require further changes to maintain existing behavior for local files... 
bigger picture, I'd consider extending `blob` so that extra information like this can be communicated, but that has much bigger implications on the codebase which I think warrants maintainer input - Dependencies: None ```python # old behavior >>> from langchain.document_loaders import PyPDFLoader >>> loader = PyPDFLoader('https://arxiv.org/pdf/1706.03762.pdf') >>> docs = loader.load() >>> docs[0].metadata {'source': '/var/folders/w2/zx77z1cs01s1thx5dhshkd58h3jtrv/T/tmpfgrorsi5/tmp.pdf', 'page': 0} # new behavior >>> from langchain.document_loaders import PyPDFLoader >>> loader = PyPDFLoader('https://arxiv.org/pdf/1706.03762.pdf') >>> docs = loader.load() >>> docs[0].metadata {'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0} ```	2023-11-01 11:27:00 -04:00
Dave Kwon	b1954aab13	feat: Add page metadata on PDFMinerLoader (#12277 ) - Description: #12273 's suggestion PR Like other PDFLoader, loading pdf per each page and giving page metadata. - Issue: #12273 - Twitter handle: @blue0_0hope --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-01 11:25:37 -04:00
Duda Nogueira	7148f3e1fe	Weaviate - Fix schema existence check (#12711 ) This will allow you to create the schema beforehand. The check was failing and preventing importing into existing classes.	2023-11-01 08:22:15 -07:00
Aidos Kanapyanov	ae63c186af	Mask API key for Anyscale LLM (#12406 ) Description: Add masking of API Key for Anyscale LLM when printed. Issue: #12165 Dependencies: None Tag maintainer: @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-01 10:22:26 -04:00
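The masking behaviour described in the commit above can be sketched with a small wrapper class. This is a standard-library illustration only: `MaskedSecret` is a hypothetical stand-in and is not the type used in the Anyscale integration.

```python
class MaskedSecret:
    """Hold a secret so that printing it never reveals the value (hypothetical stand-in)."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        # Explicit accessor: the only way to read the real value.
        return self._value

    def __str__(self) -> str:
        return "**********"

    def __repr__(self) -> str:
        return "MaskedSecret('**********')"


api_key = MaskedSecret("sk-not-a-real-key")
printed = f"api_key={api_key}"
```

Code that needs the real key calls `get_secret_value()` explicitly, so accidental logging of the object shows only the mask.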
Predrag Gruevski	5ae51a8a85	Fix typo highlighted by `ruff` autoformatter. (#12691 ) H/t @MichaReiser for spotting it: https://github.com/langchain-ai/langchain/pull/12585/files#r1378253045	2023-10-31 22:16:06 -04:00
Erick Friis	44c8b159b9	properly increment version in cli (#12685 ) Went from 0.0.9 -> 0.0.11 without releasing. Back to 10, then release.	2023-10-31 17:27:43 -07:00
Leonid Ganeline	ddcec005bc	fix for `YahooFinanceNewsTool` (#12665 ) Added YahooFinanceNewsTool to the __init__.py It was missed here.	2023-10-31 14:58:09 -07:00
Predrag Gruevski	01a3c9b94e	Use an in-project virtualenv in the CLI package. (#12678 ) Keeping it in sync with how our other packages are configured.	2023-10-31 14:51:24 -07:00
Jacob Lee	bd668fcea1	Adds version CLI command (#12619 ) Will be automatically bumped with `poetry version patch`. @efriis @hwchase17 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-31 14:50:04 -07:00
Frank	bf5805bb32	Add quip loader (#12259 ) - Description: implement [quip](https://quip.com) loader - Issue: https://github.com/langchain-ai/langchain/issues/10352 - Dependencies: No - pass make format, make lint, make test --------- Co-authored-by: Hao Fan <h_fan@apple.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-31 14:11:24 -07:00
Roman Vasilyev	c9a6940d58	PGVector fix (#12592 ) latest release broken, this fixes it --------- Co-authored-by: Roman Vasilyev <rvasilyev@mozilla.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-31 17:01:15 -04:00
Predrag Gruevski	e8b99364b3	Use `ruff` for both linting and formatting in `langchain-cli`. (#12672 ) Prior to this PR, `ruff` was used only for linting and not for formatting, despite the names of the commands. This PR makes it be used for both linting code and autoformatting it.	2023-10-31 13:52:25 -07:00
Margaret Qian	acfc485808	Update MosaicML Embedding Input Key (#12657 ) This input key was missed in the last update PR: https://github.com/langchain-ai/langchain/pull/7391 The input/output formats are intended to be like this: ``` {"inputs": [<prompt>]} {"outputs": [<output_text>]} ```	2023-10-31 14:43:30 -04:00
Predrag Gruevski	c871cc5055	Remove `print()` statements which seemed leftover from debugging. (#12648 ) Added in #12159 presumably during debugging. Right now they cause a bit of visual noise.	2023-10-31 13:45:48 -04:00
Noam Gat	14e8c74736	LM Format Enforcer Integration + Sample Notebook (#12625 ) ## Description This PR adds support for [lm-format-enforcer](https://github.com/noamgat/lm-format-enforcer) to LangChain. ![image](https://raw.githubusercontent.com/noamgat/lm-format-enforcer/main/docs/Intro.webp) The library is similar to jsonformer / RELLM which are supported in Langchain, but has several advantages such as - Batching and Beam search support - More complete JSON Schema support - LLM has control over whitespace, improving quality - Better runtime performance due to only calling the LLM's generate() function once per generate() call. The integration is loosely based on the jsonformer integration in terms of project structure. ## Dependencies No compile-time dependency was added, but if `lm-format-enforcer` is not installed, a runtime error will occur if it is trying to be used. ## Tests Due to the integration modifying the internal parameters of the underlying huggingface transformer LLM, it is not possible to test without building a real LM, which requires internet access. So, similar to the jsonformer and RELLM integrations, the testing is via the notebook. ## Twitter Handle [@noamgat](https://twitter.com/noamgat) Looking forward to hearing feedback! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-31 09:49:01 -07:00
Erick Friis	7f6e751a3d	template updates (#12646 )	2023-10-31 09:13:58 -07:00
Predrag Gruevski	f94e24dfd7	Install and use `ruff format` instead of black for code formatting. (#12585 ) Best to review one commit at a time, since two of the commits are 100% autogenerated changes from running `ruff format`: - Install and use `ruff format` instead of black for code formatting. - Output of `ruff format .` in the `langchain` package. - Use `ruff format` in experimental package. - Format changes in experimental package by `ruff format`. - Manual formatting fixes to make `ruff .` pass.	2023-10-31 10:53:12 -04:00
William FH	bfd719f9d8	bind_functions convenience method (#12518 ) I always take 20-30 seconds to re-discover where the `convert_to_openai_function` wrapper lives in our codebase. Chat langchain [has no clue](https://smith.langchain.com/public/3989d687-18c7-4108-958e-96e88803da86/r) what to do either. There's the older `create_openai_fn_chain` , but we haven't been recommending it in LCEL. The example we show in the [cookbook](https://python.langchain.com/docs/expression_language/how_to/binding#attaching-openai-functions) is really verbose. General function calling should be as simple as possible to do, so this seems a bit more ergonomic to me (feel free to disagree). Another option would be to directly coerce directly in the class's init (or when calling invoke), if provided. I'm not 100% set against that. That approach may be too easy but not simple. This PR feels like a decent compromise between simple and easy. ``` from enum import Enum from typing import Optional from pydantic import BaseModel, Field class Category(str, Enum): """The category of the issue.""" bug = "bug" nit = "nit" improvement = "improvement" other = "other" class IssueClassification(BaseModel): """Classify an issue.""" category: Category other_description: Optional[str] = Field( description="If classified as 'other', the suggested other category" ) from langchain.chat_models import ChatOpenAI llm = ChatOpenAI().bind_functions([IssueClassification]) llm.invoke("This PR adds a convenience wrapper to the bind argument") # AIMessage(content='', additional_kwargs={'function_call': {'name': 'IssueClassification', 'arguments': '{\n "category": "improvement"\n}'}}) ```	2023-10-31 07:15:37 -07:00
Nuno Campos	3143324984	Improve Runnable type inference for input_schemas (#12630 ) - Prefer lambda type annotations over inferred dict schema - For sequences that start with RunnableAssign infer seq input type as "input type of 2nd item in sequence - output type of runnable assign"	2023-10-31 13:22:54 +00:00
Nuno Campos	2f563cee20	Add Runnable.with_listeners() (#12549 ) - This binds start/end/error listeners to a runnable, which will be called with the Run object	2023-10-31 11:04:51 +00:00
Bagatur	bcc62d63be	bump 327 (#12623 )	2023-10-31 02:18:08 -07:00
Erick Friis	a1fae1fddd	Readme rewrite (#12615 ) Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-31 00:06:02 -07:00
Yujie Qian	1dbb77d7db	VoyageEmbeddings (#12608 ) - Description: Integrate VoyageEmbeddings into LangChain, with tests and docs - Issue: N/A - Dependencies: N/A - Tag maintainer: N/A - Twitter handle: @Voyage_AI_ --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 18:37:43 -07:00
chocolate4	92bf40a921	Add a new vector store hippo for langchain #11763 (#12412 ) #11763 --------- Co-authored-by: TranswarpHippo <hippo.0.assistant@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 18:35:23 -07:00
Karthik Raja A	342d6c7ab6	Multi on client toolkit (#12392 ) - Add MultiOn close function, update key value, and add async functionality - Solved the key value TabId not found (updated to use latest key value) @hwchase17	2023-10-30 18:34:56 -07:00
Prabin Nepal	b109cb031b	SecretStr for fireworks api (#12475 ) - Description: This pull request removes secrets present in raw format, - Issue: Fireworks api key was exposed when printing out the langchain object [#12165](https://github.com/langchain-ai/langchain/issues/12165) - Maintainer: @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 18:17:53 -07:00
Harrison Chase	a32c236c64	bump cli to 009 (#12611 )	2023-10-30 18:12:08 -07:00
Martin Schade	0c7f1d8b21	Textract linearizer (#12446 ) Description: Textract PDF Loader generating linearized output, meaning it will replicate the structure of the source document as close as possible based on the features passed into the call (e. g. LAYOUT, FORMS, TABLES). With LAYOUT reading order for multi-column documents or identification of lists and figures is supported and with TABLES it will generate the table structure as well. FORMS will indicate "key: value" with columms. - Issue: the issue fixes #12068 - Dependencies: amazon-textract-textractor is added, which provides the linearization - Tag maintainer: @3coins --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 18:02:10 -07:00
Erick Friis	f39246bd7e	cli should pull instead of delete+clone (#12607 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-30 16:44:09 -07:00
Harrison Chase	8b5e879171	add a template for the package readme (#12499 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-30 16:39:39 -07:00
Bagatur	9bedda50f2	Bagatur/lakefs loader2 (#12524 ) Co-authored-by: Jonathan Rosenberg <96974219+Jonathan-Rosenberg@users.noreply.github.com>	2023-10-30 16:30:27 -07:00
Ackermann Yuriy	99b69fe607	Fixed missing optional tags. Added default key value for Ollama (#12599 ) Added missing Optional typings. Added default values for Ollama optional keys. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 16:30:10 -07:00
Bagatur	016813d189	factor out to_secret (#12593 )	2023-10-30 15:10:25 -07:00
hsuyuming	630ae24b28	implement get_num_tokens to use google's count_tokens function (#10565 ) can get the correct token count instead of using gpt-2 model Description: Implement get_num_tokens within VertexLLM to use google's count_tokens function. (https://cloud.google.com/vertex-ai/docs/generative-ai/get-token-count). So we don't need to download gpt-2 model from huggingface, also when we do the mapreduce chain we can get correct token count. Tag maintainer: @lkuligin Twitter handle: My twitter: @abehsu1992626 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 15:10:05 -07:00
Pham Vu Thai Minh	33e77a1007	Async support for FAISS (#11333 ) Following this tutorial about using OpenAI Embeddings with FAISS https://python.langchain.com/docs/integrations/vectorstores/faiss ```python from langchain.embeddings.openai import OpenAIEmbeddings from langchain.text_splitter import CharacterTextSplitter from langchain.vectorstores import FAISS from langchain.document_loaders import TextLoader loader = TextLoader("../../../extras/modules/state_of_the_union.txt") documents = loader.load() text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0) docs = text_splitter.split_documents(documents) embeddings = OpenAIEmbeddings() ``` This works fine ```python db = FAISS.from_documents(docs, embeddings) query = "What did the president say about Ketanji Brown Jackson" docs = db.similarity_search(query) ``` But the async version does not ```python db = await FAISS.afrom_documents(docs, embeddings) # NotImplementedError query = "What did the president say about Ketanji Brown Jackson" docs = await db.asimilarity_search(query) # this will use await asyncio.get_event_loop().run_in_executor under the hood and will not call OpenAIEmbeddings.aembed_query but call OpenAIEmbeddings.embed_query ``` So this PR adds async/await support for FAISS --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-30 15:08:53 -07:00
Jeff Zhuo	13b89815a3	Issue: fix the issue #11648 init minimax llm (#12554 ) https://github.com/langchain-ai/langchain/issues/11648 Minimax llm failed to initialize. The idea of this fix is from https://github.com/langchain-ai/langchain/issues/10917#issuecomment-1765606725: do not use underscore in python model class --------- Co-authored-by: zhuojianming@cmcm.com <zhuojianming@cmcm.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 14:30:17 -07:00
Florian Valeye	bfb27324cb	[Matching Engine] Update the Matching Engine to include the distance and filters (#12555 ) Hello 👋, This Pull Request adds more capability to the [MatchingEngine](https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.matching_engine.MatchingEngine.html) vectorstore of GCP. It includes the `similarity_search_by_vector_with_relevance_scores` function and also [filters](https://cloud.google.com/vertex-ai/docs/vector-search/filtering) to `filter` the namespaces when retrieving the results. - Description: Add [filter](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.MatchingEngineIndexEndpoint#google_cloud_aiplatform_MatchingEngineIndexEndpoint_find_neighbors) in `similarity_search` and add `similarity_search_by_vector_with_relevance_scores` method - Dependencies: None - Tag maintainer: Unknown Thank you! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 14:12:59 -07:00
Harrison Chase	1d51363e49	change project template (#12493 )	2023-10-30 14:06:30 -07:00
Holt Skinner	e53b9ccd70	feat: Add Google Cloud Text-to-Speech Tool (#12572 ) - Add Tool for [Google Cloud Text-to-Speech](https://cloud.google.com/text-to-speech) - Follows similar structure to [Eleven Labs Text2Speech](https://python.langchain.com/docs/integrations/tools/eleven_labs_tts) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 14:05:39 -07:00
Adilkhan Sarsen	6e702b9c36	Deep memory support in LangChain (#12268 ) - Description: adding support to Activeloop's DeepMemory feature that boosts recall up to 25%. Added Jupyter notebook showcasing the feature and also made index params explicit. - Twitter handle: will really appreciate if we could announce this on twitter. --------- Co-authored-by: adolkhan <adilkhan.sarsen@alumni.nu.edu.kz>	2023-10-30 12:16:14 -07:00
billytrend-cohere	b1e3843931	Add client_name="langchain" to Cohere usage (#11328 ) Hey, we're looking to invest more in adding cohere integrations to langchain so would love to get more of an idea for how it's used. Hopefully this pr is acceptable. This week I'm also going to be looking into adding our new [retrieval augmented generation product](https://txt.cohere.com/chat-with-rag/) to langchain. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 11:20:55 -07:00
Bagatur	37aec1e050	bump 326 (#12569 )	2023-10-30 10:11:17 -07:00
Eugene Yurtsev	1b1a2d5740	Image Caption accepts bytes for images (#12561 ) Accept bytes for images in image caption --------- Co-authored-by: webcoderz <19884161+webcoderz@users.noreply.github.com>	2023-10-30 12:29:54 -04:00
Nuno Campos	7897483819	Allow astream_log to be used inside atrace_as_chain_group (#12558 )	2023-10-30 15:55:16 +00:00
Holt Skinner	e05bb938de	Merge pull request #12433 * feat: Add Google Cloud Translation document transformer * Merge branch 'langchain-ai:master' into google-translate * Add documentation for Google Translate Document Transformer * Fix line length error * Merge branch 'master' into google-translate * Merge branch 'google-translate' of https://github.com/holtskinner/lan… * Addressed code review comments * Merge branch 'master' into google-translate * Merge branch 'google-translate' of https://github.com/holtskinner/lan… * Removed extra variable * Merge branch 'google-translate' of https://github.com/holtskinner/lan… * Merge branch 'master' into google-translate * Merge branch 'google-translate' of https://github.com/holtskinner/lan… * Removed extra import	2023-10-29 21:22:36 -04:00
Samad Koita	d1fdcd4fcb	Masking of API Key for GooseAI LLM (#12496 ) Description: Add masking of API Key for GooseAI LLM when printed. Issue: https://github.com/langchain-ai/langchain/issues/12165 Dependencies: None Tag maintainer: @eyurtsev --------- Co-authored-by: Samad Koita <>	2023-10-29 21:21:33 -04:00
Andrew Zhou	64c4a698a8	More comprehensive readthedocs document loader (#12382 ) ## Description: When building our own readthedocs.io scraper, we noticed a couple of interesting things: 1. Text lines with a lot of nested <span> tags would give unclean text with a bunch of newlines. For example, for [Langchain's documentation](https://api.python.langchain.com/en/latest/document_loaders/langchain.document_loaders.readthedocs.ReadTheDocsLoader.html#langchain.document_loaders.readthedocs.ReadTheDocsLoader), a single line is represented in a complicated nested HTML structure, and the naive `soup.get_text()` call currently being made will create a newline for each nested HTML element. Therefore, the document loader would give a messy, newline-separated blob of text. This would be true in a lot of cases. Additionally, content from iframes, code from scripts, CSS from styles, etc. will be picked up if it's a child of the selector (which happens more often than you'd think). For example, [this page](https://pydeck.gl/gallery/contour_layer.html#) will scrape 1.5 million characters of such content. Therefore, I wrote a recursive `_get_clean_text(soup)` class function that (a) skips all irrelevant elements, and (b) only adds newlines when necessary. 2. Index pages (like [this one](https://api.python.langchain.com/en/latest/api_reference.html)) would be loaded, chunked, and eventually embedded. 
This is really bad not just because the user will be embedding irrelevant information, but because index pages are very likely to show up in retrieved content, making retrieval less effective (in our tests). Therefore, I added a bool parameter `exclude_index_pages` defaulted to False (which is the current behavior, although I'd petition to default this to True) that will skip all pages where links take up 50%+ of the page. Through manual testing, this seems to be the best threshold. ## Other Information: - Issue: n/a - Dependencies: n/a - Tag maintainer: n/a - Twitter handle: @andrewthezhou --------- Co-authored-by: Andrew Zhou <andrew@heykona.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-29 16:26:53 -07:00
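The index-page rule described in #12382 (skip pages where links take up 50% or more of the page) can be sketched roughly as below. This is only an illustration, not the loader's actual code: it assumes "take up" means the share of visible text characters that sit inside `<a>` tags, it uses the standard-library `html.parser` instead of BeautifulSoup, and the names `_LinkDensityParser` and `looks_like_index_page` are hypothetical.

```python
from html.parser import HTMLParser


class _LinkDensityParser(HTMLParser):
    """Hypothetical helper: counts text characters inside and outside <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._anchor_depth = 0  # > 0 while we are inside at least one <a>
        self.link_chars = 0
        self.total_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._anchor_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._anchor_depth > 0:
            self._anchor_depth -= 1

    def handle_data(self, data):
        # Ignore surrounding whitespace so indentation does not skew the ratio.
        n = len(data.strip())
        self.total_chars += n
        if self._anchor_depth > 0:
            self.link_chars += n


def looks_like_index_page(html: str, threshold: float = 0.5) -> bool:
    """Return True when link text makes up at least `threshold` of all text."""
    parser = _LinkDensityParser()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    if parser.total_chars == 0:
        return False
    return parser.link_chars / parser.total_chars >= threshold
```

A loader could call such a check per page and skip the page when it returns True and the caller opted in (the role `exclude_index_pages` plays in the PR description).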
Peter Vandenabeele	3468c038ba	Add unit tests for document_transformers/beautiful_soup_transformer.py (#12520 ) - Description: * Add unit tests for document_transformers/beautiful_soup_transformer.py * Basic functionality is tested (extract tags, remove tags, drop lines) * add a FIXME comment about the order of tags that is not preserved (and a passing test, but with the expected tags now out-of-order) - Issue: None - Dependencies: None - Tag maintainer: @rlancemartin - Twitter handle: `peter_v` Please make sure your PR is passing linting and testing before submitting. => OK: I ran `make format`, `make test` (passing after install of beautifulsoup4) and `make lint`.	2023-10-29 16:24:47 -07:00
Anirudh Gautam	b257e6a4e8	Mask API key for AI21 LLM (#12418 ) - Description: Added masking of the API Key for AI21 LLM when printed and improved the docstring for AI21 LLM. - Updated the AI21 LLM to utilize SecretStr from pydantic to securely manage API key. - Made improvements in the docstring of AI21 LLM. It now mentions that the API key can also be passed as a named parameter to the constructor. - Added unit tests. - Issue: #12165 - Tag maintainer: @eyurtsev --------- Co-authored-by: Anirudh Gautam <anirudh@Anirudhs-Mac-mini.local>	2023-10-29 14:53:41 -07:00
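The API-key masking in #12418 (and the related #12165 PRs) relies on pydantic's `SecretStr`. To show the idea without any dependency, here is a minimal standard-library sketch; `MaskedSecret` and `FakeLLM` are hypothetical stand-ins, not LangChain or pydantic classes.

```python
class MaskedSecret:
    """Hypothetical stand-in for a secret-string type such as pydantic's SecretStr."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        # The only way to read the real value is to ask for it explicitly.
        return self._value

    def __str__(self) -> str:
        return "**********" if self._value else ""

    def __repr__(self) -> str:
        return f"MaskedSecret('{self}')"


class FakeLLM:
    """Hypothetical model wrapper that stores its key as a masked secret."""

    def __init__(self, api_key: str) -> None:
        self.api_key = MaskedSecret(api_key)

    def __repr__(self) -> str:
        return f"FakeLLM(api_key={self.api_key!r})"
```

Printing or logging the wrapper then shows only the mask, while code that needs to call the provider reads the key through `get_secret_value()`.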
silvhua	9dead1034c	`_dalle_image_url` returns list of urls if n>1 (#11800 ) - Description: Updated the `_dalle_image_url` method to return a list of URLs if self.n>1, - Issue: #10691, - Dependencies: unsure, - Tag maintainer: @eyurtsev, - Twitter handle: @silvhua --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-29 14:23:23 -07:00
Bagatur	1815ea2fdb	OpenAI runnable constructor (#12455 )	2023-10-29 13:40:30 -07:00
William FH	a830b809f3	Patch forward ref bug (#12508 ) Currently this gives a bug: ``` from langchain.schema.runnable import RunnableLambda bound = RunnableLambda(lambda x: x).with_config({"callbacks": []}) # ConfigError: field "callbacks" not yet prepared so type is still a ForwardRef, you might need to call RunnableConfig.update_forward_refs(). ``` Rather than deal with cyclic imports and extra load time, etc., I think it makes sense to just have a separate Callbacks definition here that is a relaxed typehint.	2023-10-29 00:53:01 -07:00
William FH	36204c2baf	Evaluation Callback Multi Response (#12505 ) 1. Allow run evaluators to return {"results": [list of evaluation results]} in the evaluator callback. 2. Allow run evaluators to pick the target run ID to provide feedback to. (1) means you could do something like a function call that populates a full rubric in one go (not sure how reliable that is in general though) rather than splitting off into separate LLM calls, which is cheaper and less code to write. (2) means you can provide feedback to runs on subsequent calls. The immediate use case is adding an evaluator to a chat bot and assigning feedback to previous conversation turns. Have a corresponding one in the SDK.	2023-10-28 23:18:29 -07:00
Harrison Chase	9e0ae56287	various templates improvements (#12500 )	2023-10-28 22:13:22 -07:00
0xC9	79cf01366e	Update tool.py (#12472 ) In the GoogleSerperResults class, the name field is defined as 'google_serrper_results_json'. This looks like a typo, and perhaps should be 'google_serper_results_json'.	2023-10-28 21:49:01 -07:00
Harrison Chase	eb903e211c	bump to 36 (#12487 )	2023-10-28 08:51:23 -07:00
Tyler Hutcherson	4209457bdc	Redis langserve template (#12443 ) Add Redis langserve template! Eventually will add semantic caching to this too. But I was struggling to get that to work for some reason with the LCEL implementation here. - Description: Introduces the Redis LangServe template. A simple RAG-based app built on top of Redis that allows you to chat with a company's public financial data (Edgar 10k filings) - Issue: None - Dependencies: The template contains the poetry project requirements to run this template - Tag maintainer: @baskaryan @Spartee - Twitter handle: @tchutch94 Note: this requires the commit here that deletes the `_aget_relevant_documents()` method from the Redis retriever class that wasn't implemented. That was breaking the langserve app. --------- Co-authored-by: Sam Partee <sam.partee@redis.com>	2023-10-28 08:31:12 -07:00
Erick Friis	9adaa78c65	cli improvements (#12465 ) Features - add multiple repos by their branch/repo - generate `pip install` commands and `add_route()` code ![Screenshot 2023-10-27 at 4 49 52 PM](https://github.com/langchain-ai/langchain/assets/9557659/3aec4cbb-3f67-4f04-8370-5b54ea983b2a) Optimizations: - group installs by repo/branch to avoid duplicate cloning	2023-10-28 08:25:31 -07:00
Adam Law	df4960a6d8	add reranking to azuresearch (#12454 ) - Description: Adds returning the reranking score when using semantic search - Issue: #12317 --------- Co-authored-by: Adam Law <adamlaw@microsoft.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-27 14:14:09 -07:00
Eugene Yurtsev	60d009f75a	Add security note to API chain (#12452 ) Add security note	2023-10-27 17:09:42 -04:00
Matvey Arye	11505f95d3	Improve handling of empty queries for timescale vector (#12393 ) Description: Improve handling of empty queries in timescale-vector. For timescale-vector it is more efficient to get a None embedding when the embedding has no semantic meaning. It allows timescale-vector to perform more optimizations. Thus, when the query is empty, use a None embedding. Also pass down constructor arguments to the timescale vector client. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-27 13:55:16 -07:00
Erick Friis	38cee5fae0	cli updates 2 (#12447 ) - extras group - readme - another readme --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-27 13:37:03 -07:00
William FH	5d40e36c75	Trace if run tree set (#12444 ) This code path is hit in the following case: - Start in langchain code and manually provide a tracer - Handoff to the traceable - Hand back to langchain code. Which happens for evaluating `@traceable` functions unfortunately	2023-10-27 12:29:18 -07:00
Bagatur	c2a0a6b6df	make doc utils public (#12394 )	2023-10-27 12:08:08 -07:00
Henter	d6888a90d0	Fix the missing temperature parameter for Baichuan-AI chat_model (#12420 ) Description: Adds the missing `temperature` parameter for the Baichuan-AI chat_model. Baichuan-AI API doc: https://platform.baichuan-ai.com/docs/api	2023-10-27 12:07:21 -07:00
Erick Friis	6908634428	cli updates oct27 (#12436 )	2023-10-27 12:06:46 -07:00
HwangJohn	d38c8369b3	added rrf argument in ApproxRetrievalStrategy class __init__() (#11987 ) - Description: To support hybrid search with RRF (Reciprocal Rank Fusion) in Elasticsearch, an rrf argument was added for adjusting 'rank_constant' and 'window_size', which control how multiple result sets with different relevance indicators are combined into a single result set. (ref: https://www.elastic.co/kr/blog/whats-new-elastic-enterprise-search-8-9-0) - Dependencies: No dependencies changed - Tag maintainer: @baskaryan Nice to meet you, I'm new to contributing and this is my first PR. I only changed the langchain/vectorstores/elasticsearch.py file. After running make format and lint, I got this message: ```shell make lint_diff ./scripts/check_pydantic.sh . ./scripts/check_imports.sh poetry run ruff . [ "langchain/vectorstores/elasticsearch.py" = "" ] || poetry run black langchain/vectorstores/elasticsearch.py --check All done! ✨ 🍰 ✨ 1 file would be left unchanged. [ "langchain/vectorstores/elasticsearch.py" = "" ] || poetry run mypy langchain/vectorstores/elasticsearch.py langchain/__init__.py: error: Source file found twice under different module names: "mvp.nlp.langchain.libs.langchain.langchain" and "langchain" Found 1 error in 1 file (errors prevented further checking) make: *** [lint_diff] Error 2 ``` Thank you --------- Co-authored-by: 황중원 <jwhwang@amorepacific.com>	2023-10-27 11:53:19 -07:00
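For readers unfamiliar with Reciprocal Rank Fusion, the combination step mentioned in #11987 can be sketched as follows. This is a generic illustration of the technique, not Elasticsearch's or LangChain's implementation: the function name `reciprocal_rank_fusion` is hypothetical, the parameters are named after the 'rank_constant' and 'window_size' settings in the commit message, and the default values here are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]],
    rank_constant: int = 60,
    window_size: int = 100,
) -> List[str]:
    """Fuse several ranked lists of document ids into one ranked list.

    Each document scores sum(1 / (rank_constant + rank)) over the lists it
    appears in, where rank starts at 1. Only the top `window_size` entries of
    each list are considered.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking[:window_size], start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest score first; ties are broken by id to keep the output stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Example: one list from keyword search, one from vector search.
fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "c", "a"]])
```

A larger `rank_constant` flattens the difference between top and lower ranks, so documents that appear in several lists gain relative to documents that rank first in only one.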
Roman Vasilyev	2c58dca5f0	optional reusable connection (#12051 ) My Postgres ran out of connections after continuous PGVector usage because it constantly creates new connections, so adding a reusable pre-established connection seems to solve the issue --------- Co-authored-by: Roman Vasilyev <rvasilyev@mozilla.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-27 11:52:42 -07:00
Ennio Pastore	48fde2004f	Update long_context_reorder.py (#12422 ) The function comment was confusing and inaccurate	2023-10-27 11:52:28 -07:00
Bagatur	a8c68d4ffa	Type LLMChain.llm as runnable (#12385 )	2023-10-27 11:52:01 -07:00
Bagatur	d12b88557a	Bagatur/bump 325 (#12440 )	2023-10-27 11:49:09 -07:00
Eugene Yurtsev	cadfce295f	Deprecate PythonRepl tools and Pandas/Xorbits/Spark DataFrame/Python/CSV agents (#12427 ) See discussion here: https://github.com/langchain-ai/langchain/discussions/11680 The code is available for usage from langchain_experimental. The reason for the deprecation is that the agents are relying on a Python REPL. The code can only be run safely with appropriate sandboxing. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-27 14:16:42 -04:00
Harrison Chase	0ca539eb85	Clean up deprecated agents and update __init__ in experimental (#12231 ) Update init paths in experimental	2023-10-27 13:52:50 -04:00
Holt Skinner	134f085824	feat: Add Google Speech to Text API Document Loader (#12298 ) - Add Document Loader for Google Speech to Text - Similar Structure to [Assembly AI Document Loader][1] [1]: https://python.langchain.com/docs/integrations/document_loaders/assemblyai	2023-10-27 09:34:26 -07:00
David Duong	52c194ec3a	Fix templates typos (#12428 )	2023-10-27 09:32:57 -07:00
Massimiliano Pronesti	c8195769f2	fix(openai-callback): completion count logic (#12383 ) The changes introduced in #12267 and #12190 broke the cost computation of the `completion` tokens for fine-tuned models because of the early return. This PR aims at fixing this. @baskaryan.	2023-10-27 09:08:54 -07:00
Stefan Langenbach	b22da81af8	Mask API key for Aleph Alpha LLM (#12377 ) - Description: Add masking of API Key for Aleph Alpha LLM when printed. - Issue: #12165 - Dependencies: None - Tag maintainer: @eyurtsev --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-27 11:32:43 -04:00
William FH	4254028c52	Str Evaluator Mapper (#12401 )	2023-10-26 21:38:47 -07:00
William FH	fcad1d2965	Add space (#12395 )	2023-10-26 20:32:23 -07:00
William FH	922d7910ef	Wfh/json schema evaluation (#12389 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2023-10-26 20:32:05 -07:00
Christian Kasim Loan	a35445c65f	johnsnowlabs embeddings support (#11271 ) - Description: Introducing the [JohnSnowLabsEmbeddings](https://www.johnsnowlabs.com/) - Dependencies: johnsnowlabs - Tag maintainer: @C-K-Loan - Twitter handle: https://twitter.com/JohnSnowLabs https://twitter.com/ChristianKasimL --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-26 20:22:50 -07:00
SteveLiao	c08b622b2d	Add HTML Title and Page Language into metadata for AsyncHtmlLoader (#11326 ) Description: Revise `libs/langchain/langchain/document_loaders/async_html.py` to store the HTML Title and Page Language in the `metadata` of `AsyncHtmlLoader`.	2023-10-26 20:22:31 -07:00
Shorthills AI	25c98dbba9	Fixed some grammatical and Exception types issues (#12015 ) Fixed some grammatical issues and Exception types. @baskaryan , @eyurtsev --------- Co-authored-by: Sanskar Tanwar <142409040+SanskarTanwarShorthillsAI@users.noreply.github.com> Co-authored-by: UpneetShorthillsAI <144228282+UpneetShorthillsAI@users.noreply.github.com> Co-authored-by: HarshGuptaShorthillsAI <144897987+HarshGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: AdityaKalraShorthillsAI <143726711+AdityaKalraShorthillsAI@users.noreply.github.com> Co-authored-by: SakshiShorthillsAI <144228183+SakshiShorthillsAI@users.noreply.github.com>	2023-10-26 21:12:38 -04:00
William FH	923696b664	Wfh/json edit dist (#12361 ) Compare predicted json to reference. First canonicalize (sort keys, rm whitespace separators), then return normalized string edit distance. Not a silver bullet but maybe an easy way to capture structure differences in a less flakey way --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2023-10-26 18:10:28 -07:00
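The two steps named in #12361 (canonicalize, then compute a normalized string edit distance) can be sketched as below. This is a minimal illustration under assumptions: it normalizes by the length of the longer canonical string and uses a plain Levenshtein distance, which may differ from what the evaluator actually does; all function names are hypothetical.

```python
import json


def canonicalize(value) -> str:
    """Serialize with sorted keys and no whitespace around separators."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def json_edit_distance(prediction: str, reference: str) -> float:
    """Return a distance in [0, 1]; 0.0 means the canonical forms are identical."""
    pred = canonicalize(json.loads(prediction))
    ref = canonicalize(json.loads(reference))
    return levenshtein(pred, ref) / max(len(pred), len(ref))
```

Because keys are sorted and whitespace is dropped first, `{"b": 2, "a": 1}` and `{"a":1,"b":2}` compare as identical, so the score reflects structural and value differences rather than formatting.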
Erick Friis	4db8d82c55	CLI CI 2 (#12387 ) Will run all CI because of _test change, but future PRs against CLI will only trigger the new CLI one Has a bunch of file changes related to formatting/linting. No mypy yet - coming soon	2023-10-26 17:01:31 -07:00
Tyler Hutcherson	231d553824	Update broken redis tests (#12371 ) Update broken redis tests -- tiny PR :) - Description: Fixes Redis tests on master (look like it was broken by https://github.com/langchain-ai/langchain/pull/11257) - Issue: None, - Dependencies: No - Tag maintainer: @baskaryan @Spartee - Twitter handle: N/A Co-authored-by: Sam Partee <sam.partee@redis.com>	2023-10-26 16:13:14 -07:00
Erick Friis	03e79e62c2	cli fix (#12380 )	2023-10-26 15:29:49 -07:00
Bagatur	76230d2c08	fireworks scheduled integration tests (#12373 )	2023-10-26 14:24:42 -07:00
Josh Phillips	01c5cd365b	Fix SupabaseVectorStore write operation timeout (#12318 ) Description: This small change makes chunk_size a configurable parameter for loading documents into a Supabase database. Issue: https://github.com/langchain-ai/langchain/issues/11422 Dependencies: No changes Twitter: @j1philli --------- Co-authored-by: Greg Richardson <greg.nmr@gmail.com>	2023-10-26 14:19:17 -07:00
Bagatur	b10cefb160	lint fix: rm init (#12374 )	2023-10-26 14:16:25 -07:00
Harrison Chase	b43996e553	Harrison/improve cli (#12368 )	2023-10-26 13:53:59 -07:00
Harrison Chase	9ce38726a2	fix some stuff (#12292 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-26 13:30:36 -07:00
Cynthia Yang	6ce276e099	Support Fireworks batching (#8 ) (#12052 ) Description * Add _generate and _agenerate to support Fireworks batching. * Add stop words test cases * Opt out retry mechanism Issue - Not applicable Dependencies - None Tag maintainer - @baskaryan	2023-10-26 16:01:08 -04:00
Tyler Hutcherson	2f0c9d8269	Fix redis vectorfield schema defaults (#12223 ) - Description: refactors the redis vector field schema to properly handle default values, includes a new unit test suite. - Issue: N/A - Dependencies: nothing new. - Tag maintainer: @baskaryan @Spartee - Twitter handle: this is a tiny fix/improvement :) This issue was causing problems for some clients/customers when building a vector index on Redis on smaller db instances (due to faulty default values in the index configuration). It would raise an error like: ```redis.exceptions.ResponseError: Vector index initial capacity 20000 exceeded server limit (852 with the given parameters)``` This PR will address this moving forward.	2023-10-26 12:17:58 -07:00
Jakub Novák	9544d64ad8	E2B tool - Improve description with uploaded files info (#12355 )	2023-10-26 11:44:24 -07:00
Bagatur	c6a733802b	bump 324 and 35 (#12352 )	2023-10-26 10:10:26 -07:00
Nuno Campos	683e97766d	Fix json key output parser in partial (streaming) mode (#12332 )	2023-10-26 17:45:04 +01:00
Nikhil Jha	dff24285ea	Comprehend Moderation 0.2 (#11730 ) This PR replaces the previous `Intent` check with the new `Prompt Safety` check. The logic and steps to enable chain moderation via the Amazon Comprehend service, allowing you to detect and redact PII, Toxic, and Prompt Safety information in the LLM prompt or answer remains unchanged. This implementation updates the code and configuration types with respect to `Prompt Safety`. ### Usage sample ```python from langchain_experimental.comprehend_moderation import (BaseModerationConfig, ModerationPromptSafetyConfig, ModerationPiiConfig, ModerationToxicityConfig ) pii_config = ModerationPiiConfig( labels=["SSN"], redact=True, mask_character="X" ) toxicity_config = ModerationToxicityConfig( threshold=0.5 ) prompt_safety_config = ModerationPromptSafetyConfig( threshold=0.5 ) moderation_config = BaseModerationConfig( filters=[pii_config, toxicity_config, prompt_safety_config] ) comp_moderation_with_config = AmazonComprehendModerationChain( moderation_config=moderation_config, #specify the configuration client=comprehend_client, #optionally pass the Boto3 Client verbose=True ) template = """Question: {question} Answer:""" prompt = PromptTemplate(template=template, input_variables=["question"]) responses = [ "Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.", "Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here." ] llm = FakeListLLM(responses=responses) llm_chain = LLMChain(prompt=prompt, llm=llm) chain = ( prompt \| comp_moderation_with_config \| {llm_chain.input_keys[0]: lambda x: x['output'] } \| llm_chain \| { "input": lambda x: x['text'] } \| comp_moderation_with_config ) try: response = chain.invoke({"question": "A sample SSN number looks like this 123-456-7890. 
Can you give me some more samples?"}) except Exception as e: print(str(e)) else: print(response['output']) ``` ### Output ```python > Entering new AmazonComprehendModerationChain chain... Running AmazonComprehendModerationChain... Running pii Validation... Running toxicity Validation... Running prompt safety Validation... > Finished chain. > Entering new AmazonComprehendModerationChain chain... Running AmazonComprehendModerationChain... Running pii Validation... Running toxicity Validation... Running prompt safety Validation... > Finished chain. Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like XXXXXXXXXXXX John Doe's phone number is (999)253-9876. ``` --------- Co-authored-by: Jha <nikjha@amazon.com> Co-authored-by: Anjan Biswas <anjanavb@amazon.com> Co-authored-by: Anjan Biswas <84933469+anjanvb@users.noreply.github.com>	2023-10-26 09:42:18 -07:00
Blake (Yung Cher Ho)	b9410f2b6f	Takeoff pro support (#12070 ) Description: This PR adds support for the [Pro version of Titan Takeoff Server](https://docs.titanml.co/docs/category/pro-features). Users of the Pro version will have to import the TitanTakeoffPro model, which is different from TitanTakeoff. Issue: Also minor fixes to docs for Titan Takeoff (Community version) Dependencies: No additional dependencies Twitter handle: @becoming_blake @baskaryan @hwchase17	2023-10-26 09:39:32 -07:00
Leonid Kuligin	4e47fe1dce	fixed error message and a check for processor name (#12200 ) - Description: a small fix to the error description and a check for the processor name - Issue: #11407	2023-10-26 09:38:25 -07:00
Nir Kopler	9298aff783	Finetuned openai azure models cost calculation (#12267 ) Description: Add cost calculation for fine-tuned Azure models, with relevant unit tests. See https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/fine-tuning?tabs=turbo&pivots=programming-language-studio for more information. This PR is the result of https://github.com/langchain-ai/langchain/pull/12190 Twitter handle: @nirkopler	2023-10-26 09:38:10 -07:00
gnakw	20fe515f20	Fix the exception from langchain.utilities import ArceeWrapper (#12342 ) - Description: Fix the exception from langchain.utilities import ArceeWrapper	2023-10-26 09:19:43 -07:00
Qihui Xie	6720458c7d	add allowed_operators property in QdrantTranslator (#12328 ) - Description: This PR adds the `allowed_operators` property to `QdrantTranslator` to fix the `TypeError: can only join an iterable` bug. This property is required in `get_query_constructor_prompt` in `query_constructor\base.py`: ``` allowed_operators=" | ".join(allowed_operators), ``` - Issue: #12061 --------- Co-authored-by: XIE Qihui <qihui.xie@bopufund.com>	2023-10-26 09:18:29 -07:00
Bagatur	f5a57fc1ef	fix self query constructor (#12349 )	2023-10-26 09:18:15 -07:00
Vasek Mlejnsky	cdd75b687e	e2b tool - fix initialization and improve tool description (#12345 )	2023-10-26 08:47:50 -07:00
Harrison Chase	8ec7aade9f	add docs for templates (#12346 )	2023-10-26 08:28:01 -07:00
Erick Friis	ebf998acb6	Templates (#12294 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Jacob Lee <jacoblee93@gmail.com>	2023-10-25 18:47:42 -07:00
Erick Friis	43257a295c	CLI Git Improvements (#12311 ) - delete repo sources like pip - git dep fixes - error messaging	2023-10-25 18:30:02 -07:00
William FH	1d568e1add	Better wrap traceable (#12303 ) If user function is wrapped as a traceable function, this will help hand off the trace between the two. Also update handling fields to reflect optional values	2023-10-25 16:34:23 -07:00
Eugene Yurtsev	5a71b81609	Relax type annotation for custom input/output types (#12300 ) This is needed to be able to do stuff like: ```python runnable.with_types(input_type=List[str]) ```	2023-10-25 19:00:22 -04:00
William FH	988f6d9912	Rm langchain server (#12305 )	2023-10-25 15:26:46 -07:00
wemysschen	3f16acc538	Add baidu cloud vector search in vectorstore and fix some unit test in vectorstores (#11605 ) Description: Add baidu cloud vector search in vectorstore --------- Co-authored-by: root <root@icoding-cwx.bcc-szzj.baidu.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-25 13:44:19 -07:00
mrbean	b7e559c7e1	use snippet search optionally (#12236 ) Add an additional flag which allows for hitting our new endpoint.	2023-10-25 13:37:28 -07:00
felixocker	cce132d146	fix sparql queries for relations in schema description (#9136 ) - Description: Fix for the SPARQL QA chain: fixed SPARQL queries for retrieving information about relations in the graph to create a textual description of the schema for the language model. This should resolve #8907 - Issue: #8907 - Dependencies: None - Tag maintainer: @baskaryan, @hwchase17	2023-10-25 13:36:57 -07:00
Donato Azevedo	d9f1bcf366	Strips leading/trailing whitespace before parsing xml (#12297 ) Description: When LLMs output leading or trailing whitespace around XML (when using XMLOutputParser), the parser would raise a `ValueError: Could not parse output: ...`. However, leading and trailing whitespace is "ignorable" in the sense of the XML standard. Issue: I did not find a related issue. Dependencies: None Twitter handle: donatoaz Updated the unit test and ran `make docker_test`.	2023-10-25 13:34:58 -07:00
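The fix in #12297 amounts to trimming the model output before handing it to an XML parser. A minimal sketch with the standard-library `xml.etree.ElementTree` is below; `parse_llm_xml` is a hypothetical helper, not the XMLOutputParser code. The case that usually fails without trimming is an XML declaration preceded by whitespace, since the declaration has to be the very first thing in the document.

```python
import xml.etree.ElementTree as ET


def parse_llm_xml(text: str) -> ET.Element:
    """Parse XML emitted by a model, ignoring surrounding whitespace."""
    # Models often wrap their output in blank lines or indentation.
    return ET.fromstring(text.strip())


# Hypothetical model output with a leading newline and trailing spaces.
raw_output = "\n  <?xml version='1.0'?><foo><bar>baz</bar></foo>\n  "
root = parse_llm_xml(raw_output)
```

After parsing, `root.tag` is the root element name and child values can be read with `root.find(...)`.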
Erick Friis	47070b8314	CLI (#12284 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-25 11:06:58 -07:00
Shwu Ku	07c2649753	response parser for ArceeRetriever (#12270 ) - Description: Response parser for arcee retriever, - Issue: follow-up pr on #11578 and [discussion](https://github.com/arcee-ai/arcee-python/issues/15#issuecomment-1759874053), - Dependencies: NA This pr implements a parser for the response from ArceeRetreiver to convert to langchain `Document`. This closes the loop of generation and retrieval for Arcee DALMs in langchain. The reference for the response parser is [api-docs:retrieve](https://api.arcee.ai/docs#/v2/retrieve_model) Attaching screenshot of working implementation: <img width="1984" alt="Screenshot 2023-10-25 at 7 42 34 PM" src="https://github.com/langchain-ai/langchain/assets/65639964/026987b9-34b2-4e4b-b87d-69fcd0c6641a"> \*api key deleted --- Successful tests, lints, etc. ```shell Re-run pytest with --snapshot-update to delete unused snapshots. ==================================================================================================================== slowest 5 durations ===================================================================================================================== 1.56s call tests/unit_tests/schema/runnable/test_runnable.py::test_retrying 0.63s call tests/unit_tests/schema/runnable/test_runnable.py::test_map_astream 0.33s call tests/unit_tests/schema/runnable/test_runnable.py::test_map_stream_iterator_input 0.30s call tests/unit_tests/schema/runnable/test_runnable.py::test_map_astream_iterator_input 0.20s call tests/unit_tests/indexes/test_indexing.py::test_cleanup_with_different_batchsize ======================================================================================================= 1265 passed, 270 skipped, 32 warnings in 6.55s ======================================================================================================= [ "." = "" ] \|\| poetry run black . All done! ✨ 🍰 ✨ 1871 files left unchanged. [ "." = "" ] \|\| poetry run ruff --select I --fix . ./scripts/check_pydantic.sh . 
./scripts/check_imports.sh poetry run ruff . [ "." = "" ] || poetry run black . --check All done! ✨ 🍰 ✨ 1871 files would be left unchanged. [ "." = "" ] || poetry run mypy . Success: no issues found in 1868 source files poetry run codespell --toml pyproject.toml poetry run codespell --toml pyproject.toml -w ``` Co-authored-by: Shubham Kushwaha <shwu@Shubhams-MacBook-Pro.local>	2023-10-25 10:55:13 -07:00
Johanna Appel	c26ec7789f	CohereEmbeddings: Add max_retries and request_timeout (#12275 ) Add max_retries and request_timeout to CohereEmbeddings, akin to how it works in OpenAIEmbeddings. Since the Cohere client already implements these parameters, we can simply pass them down. Uses parameters from these two cohere client objects: https://github.com/cohere-ai/cohere-python/blob/main/cohere/client.py https://github.com/cohere-ai/cohere-python/blob/main/cohere/client_async.py	2023-10-25 10:37:25 -07:00
Nuno Campos	7108084947	Remove CLI (#12283 )	2023-10-25 10:33:52 -07:00
Nuno Campos	b5b2d07681	Pop max concurrency when recursing (#12281 )	2023-10-25 18:03:58 +01:00
Bagatur	69f4e402e4	bump 323 (#12278 )	2023-10-25 09:06:12 -07:00
David Duong	c25b174db5	Add serialisation props to Fireworks and ChatFireworks (#12255 )	2023-10-25 11:41:33 +01:00
Richard Adams	fd5f549a9e	demonstrate use of RetrievalQAWithSourcesChain.from_chain (#12235 ) Description: Documents further usage of RetrievalQAWithSourcesChain in an existing test. I'd not found much documented usage of RetrievalQAWithSourcesChain and how to get the sources out. This additional code will hopefully be useful to other potential users of this retriever. Issue: No raised issue Dependencies: No new dependencies needed to run the test (it already needs `open-ai`, `faiss-cpu` and `unstructured`). Note - `make lint` showed 8 linting errors in unrelated files --------- Co-authored-by: richarda23 <richard.c.adams@infinityworks.com>	2023-10-24 21:33:34 -07:00
James Braza	53f35c5f5c	Adding `STRUCTURED_FORMAT_SIMPLE_INSTRUCTIONS` missing backticks (#12238 ) This PR fixes the fact that `STRUCTURED_FORMAT_SIMPLE_INSTRUCTIONS` was missing backticks at the end	2023-10-24 21:30:25 -07:00
William FH	276c6ba115	Check for ls project in run tree context (#12242 ) If I go traceable -> runnable when the project is manually specified, the runnable wont be logged. This makes sure the session/project is threaded through appropriately.	2023-10-24 17:18:59 -07:00
Vasek Mlejnsky	1f8094938f	Integrate E2B's data analysis/code interpreter (#12011 ) This PR adds [E2B's](https://e2b.dev/) data analysis/code interpreter sandbox as a tool --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Jakub Novak <jakub@e2b.dev>	2023-10-24 16:04:02 -07:00
Bagatur	286a29a49e	bump 322 and 34 (#12228 )	2023-10-24 13:52:17 -07:00
Eugene Yurtsev	583dc49477	Add type to Generation and sub-classes, handle root validator (#12220 ) * Add a type literal for the generation and sub-classes for serialization purposes. * Fix the root validator of ChatGeneration to raise ValueError instead of KeyError or AttributeError if initialized improperly. * This change is done for langserve to make sure that llm related callbacks can be serialized/deserialized properly.	2023-10-24 16:21:00 -04:00
Eugene Yurtsev	81052ee18e	Fix code block in runnable doc (#12221 ) Fix code block syntax in runnable doc-string	2023-10-24 16:11:58 -04:00
Mikelarg	46e28b9613	Added GigaChat chat model support (#12201 ) - Description: Added integration with [GigaChat](https://developers.sber.ru/portal/products/gigachat) language model. - Twitter handle: @dvoshansky	2023-10-24 12:53:51 -07:00
Anurag Wagh	d5c2ce7c2e	[fix] create redis vector index before adding docs, add prefix to doc… (#11257 ) Fix Description: For Redis Vector integration in add_texts method, there were two issues that lead to this bug. 1. Vector index is not being created leading to no such_index error 2. `doc:index` prefix was also missing for Redis Keys. resolves #11197 Maintainer: @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-24 10:51:25 -07:00
Eugene Yurtsev	079d1f3b8e	Expose handle_event and ahandle_events as public API (#12181 ) Expose functionality to handle generic events.	2023-10-24 13:42:28 -04:00
William FH	67c4fd0ad0	Update deprecation (#12178 ) in runner_utils	2023-10-24 10:37:28 -07:00
Nir Kopler	d3744175bf	Finetuned OpenAI models cost calculation #11715 (#12190 ) Description: Add cost calculation for fine tuned models (new and legacy), this is required after OpenAI added new models for fine tuning and separated the costs of I/O for fine tuned models. Also I updated the relevant unit tests see https://platform.openai.com/docs/guides/fine-tuning for more information. issue: https://github.com/langchain-ai/langchain/issues/11715 - Issue: 11715 - Twitter handle: @nirkopler	2023-10-24 10:22:05 -07:00
Spyros	a2840a2b42	fix vertexai codey models (#12173 ) Description: This PR fixes issue #12156 by checking for Codey models appropriately before result parsing. Maintainer: @hwchase17 , @agola11	2023-10-24 10:20:05 -07:00
Hech	d76f026d72	Fix flexible dimension and doc for DingoDB (#12187 )	2023-10-24 10:16:19 -07:00
Erick Friis	95ae40ff90	Fix Anthropic Functions ainvoke (#12215 ) Removes custom `NotImplementedError` in experimental anthropic functions, allowing it to fallback on default `ainvoke` implementation.	2023-10-24 10:07:01 -07:00
Iskren Ivov Chernev	d5d7ba582a	Improvements to llm/deepinfra (#10846 ) - replace `requests` package with `langchain.requests` - add `_acall` support - add `_stream` and `_astream` - freshen up the documentation a bit - update vendor doc	2023-10-24 09:54:23 -07:00
sudranga	f09f82541b	Expose configuration options in GraphCypherQAChain (#12159 ) Allows for passing arguments into the LLM chains used by the GraphCypherQAChain. This is to address a request by a user to include memory in the Cypher creating chain. Will keep the prompt variables as-is to be backward compatible. But, would be a good idea to deprecate them and use the **kwargs variables. Added a test case. In general, I think it would be good for any chain to automatically pass in a read-only memory (of its input) to its subchains whilst allowing for an override. But, this would be a different change.	2023-10-24 09:52:55 -07:00
Leonid Ganeline	11f13aed53	docstrings update (#12093 ) Added missed docstrings. Added missed Args:, Returns: Raises:	2023-10-24 09:34:10 -07:00
Johnny Oshika	ba20c14e28	Fix typo in stuff_prompt's system_template (#12063 ) - Description: Add missing apostrophe in `user's` in stuff_prompt's system_template. The first sentence in the system template went from: > Use the following pieces of context to answer the users question. to > Use the following pieces of context to answer the user's question. - Issue: - Dependencies: none - Tag maintainer: @baskaryan - Twitter handle: ojohnnyo	2023-10-24 09:21:28 -07:00
Holt Skinner	69d9eae5cd	feat: Add Client Info to available Google Cloud Clients (#12168 ) - This is used internally to gather aggregate usage metrics for the LangChain integrations - Note: This cannot be added to some of the Vertex AI integrations at this time because the SDK doesn't allow overriding the [`ClientInfo`](https://googleapis.dev/python/google-api-core/latest/client_info.html#module-google.api_core.client_info) - Added to: - BigQuery - Google Cloud Storage - Document AI - Vertex AI Model Garden - Document AI Warehouse - Vertex AI Search - Vertex AI Matching Engine (Cloud Storage Client) @baskaryan, @eyurtsev, @hwchase17 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-24 08:49:11 -07:00
Lukas Wolf	69f5f82804	Update extraction.py (#12207 ) Description: Pass tags as argument to create_extraction_chain Issue: create_extraction_chain does not pass tags to chain yet @baskaryan	2023-10-24 08:25:14 -07:00
Nuno Campos	34ffb94770	Remove GetLocal, PutLocal (#12133 ) Do you agree?	2023-10-24 10:16:46 +01:00
Eric Hartford	8c150ad7f6	Add COBOL parser and splitter (#11674 ) - Description: Add COBOL parser and splitter - Issue: n/a - Dependencies: n/a - Tag maintainer: @baskaryan - Twitter handle: erhartford --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-23 15:44:31 -04:00
John Mai	ebf749c40c	Baichuan & Hunyuan set default api_base (#12059 ) ### Description Baichuan & Hunyuan set default api_base env	2023-10-23 15:33:35 -04:00
Shilong Dai	99afc1b4f8	Fixed hardcoded "vector" and replaced with vector_query_field variable (#12126 ) - Description: In the max_marginal_relevance_search function of the ElasticsearchStore vector store, the name of the field corresponding to the vector embedding of the document is hard coded in the delete statement that drops the field from the document metadata. This results in an exception if the vector embedding field is customized. This PR changes the hard-coded "vector" into the vector_query_field variable. - Issue: None - Dependencies: None - Tag maintainer: @hwchase17 Co-authored-by: Shilong Dai <sdai@viperfish.net>	2023-10-23 15:08:55 -04:00
Vikram Shitole	0d44746430	10634: Added the capability to inject boto3 client in SagemakerEndpointEmbeddings (#12146 ) Description: Allow to inject boto3 client for Cross account access type of scenarios in using SagemakerEndpointEmbeddings and also updated the documentation for same in the sample notebook Issue:SagemakerEndpointEmbeddings cross account capability #10634 #10184 Dependencies: None Tag maintainer: Twitter handle:lethargicoder Co-authored-by: Vikram(VS) <vssht@amazon.com>	2023-10-23 15:08:26 -04:00
aubin_mzt	66f8cb015d	Add connection args for pgvector vector store (#11930 ) - Description: sqlalchemy create_engine() does not take into account connect_args which are mandatory for managed PGSQL instances on cloud providers (ssl_context for example). Also re-enabled create_vector_extension at post_init for using pgvector class seamlessly - Tag maintainer: @baskaryan, @eyurtsev, @hwchase17. --------- Co-authored-by: Sami Bargaoui <bargaoui.sam@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-23 14:43:44 -04:00
NuODaniel	4d6243fa87	fix: doc string of default params in chat_models, llm qianfan (#12153 ) - Description: a fix of the doc string in Qianfan - Issue: no - Dependencies: no - Tag maintainer: @baskaryan - Twitter handle: no	2023-10-23 14:03:18 -04:00
Predrag Gruevski	f82bdf4613	Update deprecated `langchain` imports with suggested new paths. (#12164 ) Let's help our users find the proper import to use instead of the deprecated top-level ones.	2023-10-23 13:52:08 -04:00
Bagatur	963ff93476	bump 321 (#12161 )	2023-10-23 12:49:38 -04:00
Nuno Campos	d0505c0d47	Update default recursion_limit, update docs (#12134 )	2023-10-23 16:29:17 +01:00
William FH	4f23aa677a	Fix Pickle Error (#12141 ) If non-pickleable objects (like locks) get passed to the tracing callback, they'll fail in the deepcopy. Fallback to a shallow copy in these instances.	2023-10-23 08:22:47 -07:00
Predrag Gruevski	95a1b598fe	Update to `actions/checkout@v4`. (#11951 ) We don't use any of the new functionality at the moment. Just making sure we don't fall back on versions and fail to benefit from new patches. This is an easy upgrade and it's always harder to upgrade across multiple major versions at once.	2023-10-23 10:01:33 -04:00
William FH	7c4f340cc0	Include Parent Run ID (#12139 ) If you set local callbacks	2023-10-22 17:19:11 -07:00
omahs	f3cc9bba5b	Fix typos (#12128 ) Fix typos	2023-10-22 17:16:03 -07:00
Nuno Campos	1afdb40b48	Add optional config arg to RunnablePassthrough func arg (#12131 )	2023-10-22 19:57:16 +01:00
Nuno Campos	325fdde8b4	Fix bug where types were lost when calling with_cconfig or bind (#12137 )	2023-10-22 19:26:13 +01:00
Nuno Campos	02dce74b97	Fix type hint for older py versions (#12132 )	2023-10-22 18:01:09 +01:00
Nuno Campos	d0ce374731	Allow specifying custom input/output schemas for runnables with .with_types() (#12083 )	2023-10-22 17:26:48 +01:00
Harrison Chase	ee69116761	move csv agent to langchain experimental (#12113 )	2023-10-21 10:26:02 -07:00
Harrison Chase	03bf6ef473	add missing init files (#12114 )	2023-10-21 10:25:50 -07:00
Bagatur	ef8b180d6d	bump 320 (#12108 )	2023-10-21 11:52:52 -04:00
Rotem Weiss	78d186fb44	Add Tavily Search API as a Tool (#12103 ) Adding Tavily Search API as a tool. I will be the maintainer and assaf_elovic is the twitter handler. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-21 11:23:21 -04:00
Bagatur	85302a9ec1	Add CI check that integration tests compile (#12090 )	2023-10-21 10:52:18 -04:00
verlocks	5dbe456aae	Bug fix tongyi.py to be compatible with DashScope API (#11956 ) Current ChatTongyi is not compatible with DashScope API, which will cause error when passing api key to chat model directly. - Description: Update tongyi.py to be compatible with DashScope API. Specifically, update parameter name "dashscope_api_key" to "api_key". - Issue: None. - Dependencies: Nothing new, Tongyi would require DashScope as before.	2023-10-20 18:46:41 -04:00
Tomaz Bratanic	82f4c0589c	Add neo4j graph environment variables (#12080 )	2023-10-20 14:43:01 -07:00
Mohammad Mohtashim	d5400f6502	Google Scholar Search Tool using serpapi (#11513 ) - Description: Implementing the Google Scholar Tool as requested in PR #11505. The tool will be using the [serpapi python package](https://serpapi.com/integrations/python#search-google-scholar). The main idea of the tool will be to return the results from a Google Scholar search given a query as an input to the tool. - Tag maintainer: @baskaryan, @eyurtsev, @hwchase17	2023-10-20 17:35:55 -04:00
Holt Skinner	f5be2d525a	fix: Add `_serving_config` property to `GoogleVertexAISearchRetriever` (#12084 ) - Fixes error: ``` ValueError: "GoogleVertexAISearchRetriever" object has no field "_serving_config" ``` Introduced in #11736 @baskaryan, @eyurtsev, @hwchase17 if you could review and merge quickly, that would be appreciated :)	2023-10-20 15:16:42 -04:00
Nuno Campos	5fee61a207	Support runnable factories in .configurable_alts() (#12065 )	2023-10-20 15:22:09 +01:00
Zhitao Xu	a4c3a44712	Fix documentation typo in Clickhouse Class (#12047 ) - Description: The return info in the documentation for similarity_search_by_vector and similarity_search_with_relevance_scores is wrong	2023-10-19 17:00:22 -04:00
William FH	25418b9b4d	Always add run ID (#12046 ) in eval callback handler. Useful if you're using a custom run evaluator and don't want to thread things through.	2023-10-19 12:38:07 -07:00
Eugene Yurtsev	44d7763580	Add zapier deprecation warning (#12045 ) Add zapier deprecation	2023-10-19 15:27:56 -04:00
John Mai	4188f046ec	Add Tencent Hunyuan chat model (#12022 ) ### Description: The Tencent Hunyuan model, developed by Tencent, is a large language model with robust Chinese text generation capabilities, adeptness in logical reasoning within complex contexts, and reliable task execution proficiency. For more information, see [https://cloud.tencent.com/document/product/1729](https://cloud.tencent.com/document/product/1729)	2023-10-19 15:10:12 -04:00
Eugene Yurtsev	68599d98c2	More security notes (#12040 ) Add more security notes	2023-10-19 14:49:09 -04:00
Bagatur	0006075b08	bump 319 (#12041 )	2023-10-19 11:45:27 -07:00
John Mai	8eb40b5fe2	`baichuan_secret_key` use pydantic.types.SecretStr & Add Baichuan tests (#12031 ) ### Description - `baichuan_secret_key` use pydantic.types.SecretStr - Add Baichuan tests	2023-10-19 14:37:41 -04:00
Nuno Campos	85bac75729	nc/runnable-dynamic-schemas-from-config (#12038 )	2023-10-19 19:34:35 +01:00
Nuno Campos	85eaa4ccee	Revert "nc/runnable-dynamic-schemas-from-config" (#12037 ) This reverts commit `a46eef64a7`.	2023-10-19 19:27:02 +01:00
Nuno Campos	a46eef64a7	nc/runnable-dynamic-schemas-from-config	2023-10-19 19:17:48 +01:00
Nuno Campos	d392e030be	Add default value (#12032 )	2023-10-19 18:30:05 +01:00
Kenneth Choe	62efe1ffb9	support add_embeddings for elasticsearch (#11002 ) - Description: Provide a way to use different text for embedding. - For example, if you are ingesting stack-overflow Q&As for RAG, you would want to embed the questions and return the answer(s) for the hits. With this change, the consumer of langchain can implement that easily. - I noticed the similar function is added on faiss.py with #1912 which was for performance reason, but I see the same function can be used to achieve what I thought. So instead of changing Document class to have embedding_content, I mimicked the implementation of faiss.py. - The test should provide some guidance on how to use it. It would be more intuitive if I just pass texts and embedding_texts as separate arguments, but I chose to use `zip`-ed object for the consistency with faiss.py implementation. - I plan to make similar pull request for OpenSearch. - Issue: N/A - Dependencies: None other than the existing ones. Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-19 09:43:51 -07:00
Bagatur	76d3afaef0	bump 318 (#12030 )	2023-10-19 09:33:39 -07:00
Dmitry Tyumentsev	5dd2161c4b	add _acall method to YandexGPT (#12029 ) - Description: Add async support for YandexGPT LLM model Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>	2023-10-19 09:15:26 -07:00
Peter Krenesky	8425f33363	Pydantic v2 support for OpenAPI Specs (#11936 ) - Description: Adding Pydantic v2 support for OpenAPI Specs - Issue: - OpenAPI spec support was disabled because `openapi-schema-pydantic` doesn't support Pydantic v2: #9205 - Caused errors in `get_openapi_chain` - This may be the cause of #9520. - Tag maintainer: @eyurtsev - Twitter handle: kreneskyp The root cause was that `openapi-schema-pydantic` hasn't been updated in some time but [openapi-pydantic](https://github.com/mike-oakley/openapi-pydantic) forked and updated the project.	2023-10-19 11:06:11 -04:00
Joe McElroy	c9f1768cb9	Elasticsearch Query Retriever: Use match + fuzziness for LIKE (#12023 ) Updated the elasticsearch self query retriever to use the match clause for LIKE operator instead of the non-analyzed fuzzy search clause. Other small updates include: - fixing the stack inference integration test where the index's default pipeline didn't use the inference pipeline created - adding a user-agent to the old implementation to track usage - improved the documentation for ElasticsearchStore filters	2023-10-19 09:47:21 -04:00
Nuno Campos	7db6aabf65	Update chat model output type (#11833 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-19 00:55:15 -07:00
Simon Dai	ed62984cb2	update Weaviate to support multi tenancy (#11842 ) - Description: update Weaviate to support multi tenancy - Issue: 9956 - Dependencies: - Tag maintainer: hwchase17 - Twitter handle: dsx1986_	2023-10-19 00:49:30 -07:00
hiigao	f818ec49b8	Encapsulate alicloud pai-eas access method for chatmodels and llms (#11852 ) ### Description: This pull request provides access methods for the EAS LLM service by implementing the `PaiEasEndpoint` and `PaiEasChatEndpoint` classes in the `langchain.llms` and `langchain.chat_models` modules. Based on this PR, langchain users can build a chain to call a remote EAS LLM service and get the LLM inference results. ### About EAS Service EAS is an Alicloud product on Alibaba Cloud Machine Learning Platform for AI, AliCloud PAI for short. EAS provides model inference deployment services for users. We built an LLM inference service on EAS with a general LLM docker image. Therefore, end users can quickly set up their remote LLM instances to load the majority of the Hugging Face LLM models, and serve as a backend for most LLM apps. ### Dependencies This PR doesn't involve any new dependencies. --------- Co-authored-by: 子洪 <gaoyihong.gyh@alibaba-inc.com>	2023-10-19 00:20:18 -07:00
John Mai	a6b483dcbc	Supported RetryOutputParser & RetryWithErrorOutputParser max_retries (#11903 ) Description: Supported RetryOutputParser & RetryWithErrorOutputParser max_retries - max_retries: Maximum number of retries to parser. Issue: None Dependencies: None Tag maintainer: @baskaryan Twitter handle:	2023-10-18 23:57:16 -07:00
Hugues Chocart	008c7df80d	[LLMonitorCallbackHandler] Refactor + add llmonitor-py dependency (#11948 ) We now require uses to have the pip package `llmonitor` installed. It allows us to have cleaner code and avoid duplicates between our library and our code in Langchain.	2023-10-18 23:54:10 -07:00
Sian Cao	77fc2f7644	fix: impl missing embeddings method (#10823 ) FAISS does not implement embeddings method and use embed_query to embedding texts which is wrong for some embedding models. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-18 23:51:28 -07:00
Holt Skinner	2661dc94f3	feat: Google Vertex AI Search Retriever - Add support for Website Data Stores (#11736 ) - Only works for Data stores with Advanced Website Indexing - https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features - Minor restructuring - Follow up to #10513 - Remove outdated docs (readded in https://github.com/langchain-ai/langchain/pull/11620) - Move legacy class into new py file to clean up the directory - Shouldn't cause backwards compatibility issues as the import works the same way for users	2023-10-18 23:41:48 -07:00
Shorthills AI	4b6fdd7bf0	Update modal.py (#11588 ) feat: Raise KeyError when 'prompt' key is missing in JSON response This commit updates the error handling in the code to raise a KeyError when the 'prompt' key is not found in the JSON response. This change makes the code more explicit about the nature of the error, helping to improve clarity and debugging. @baskaryan, @eyurtsev.	2023-10-18 23:40:37 -07:00
William FH	dfb4baa3f9	Fix Fireworks Callbacks (#12003 ) I may be missing something but it seems like we inappropriately overrode the 'stream()' method, losing callbacks in the process. I don't think (?) it gave us anything in this case to customize it here? See new trace: https://smith.langchain.com/public/fbb82825-3a16-446b-8207-35622358db3b/r and confirmed it streams. Also fixes the stopwords issues from #12000	2023-10-18 23:33:09 -07:00
Wang Wei	e26559f512	Add ERNIE-Bot-4 model support for ErnieBotChat. (#11969 ) - Description: According to the document https://cloud.baidu.com/doc/WENXINWORKSHOP/s/clntwmv7t, add ERNIE-Bot-4 model support for ErnieBotChat. - Dependencies: Before using the ERNIE-Bot-4, you should have the model's access authority.	2023-10-18 14:55:29 -07:00
Eugene Yurtsev	f4bec9686d	Add more security notes (#11990 ) Add more security notes	2023-10-18 15:00:56 -04:00
Eugene Yurtsev	3d81c76160	Add security notes to agent toolkits (#11989 ) Add more security notes to agent toolkits.	2023-10-18 14:36:29 -04:00
Leonid Ganeline	b81a4c1d94	docstrings added (#11988 ) Added docstrings. Some docstring formatting.	2023-10-18 13:05:49 -04:00
Bagatur	35c7c1f050	bump 317 (#11986 )	2023-10-18 09:25:18 -07:00
Bagatur	122af2effe	fix chroma from_texts bug (#11984 )	2023-10-18 09:24:04 -07:00
Erick Friis	c149954cc5	Hub Runnable (#11946 ) Adds `langchain.runnables.hub.HubRunnable` for pulling configurable objects from the hub	2023-10-18 09:21:45 -07:00
Owen	9e24626e87	chore: remove duplicated export variables (#11962 ) - Description: remove duplicated `__all__` variables	2023-10-18 12:08:50 -04:00
Nuno Campos	6bd9c1d2b3	Make prompt validation opt-in (#11973 ) By default replace input_variables with the correct value	2023-10-18 16:28:47 +01:00
Nuno Campos	9bc7e1851a	Ensure dict() does not raise not implemented error, which should instead be raised in our custom method save() (#11970 ) .dict() is a Pydantic method that cannot raise exceptions, as it is used e.g. in `__eq__`	2023-10-18 16:28:33 +01:00
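The reasoning in #11970 can be illustrated with a minimal sketch. It uses a plain Python class rather than a Pydantic model, and `Component` is a hypothetical name, not a LangChain class: equality is built on `dict()`, so `dict()` must always succeed, while the "not implemented" error belongs in `save()`.

```python
class Component:
    """Hypothetical stand-in for a serializable object (not a LangChain class)."""

    def __init__(self, name: str) -> None:
        self.name = name

    def dict(self) -> dict:
        # Must never raise: equality checks below depend on it.
        return {"name": self.name}

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Component):
            return NotImplemented
        return self.dict() == other.dict()

    def save(self, path: str) -> None:
        # Saving is where an unsupported operation should be reported.
        raise NotImplementedError(f"{type(self).__name__} does not support saving")
```

With this split, comparing two objects stays safe even for subclasses that cannot be saved.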
Nuno Campos	653cf56e0e	Lint	2023-10-18 16:02:00 +01:00
Predrag Gruevski	debcf053eb	Fix `invalid escape sequence` warnings by using raw strings for regexes. (#11943 ) This code also generates warnings when our users' apps hit it, which is annoying and doesn't look great. Let's fix it.	2023-10-18 10:55:17 -04:00
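As a brief illustration of the fix in #11943, the sketch below writes a regex as a raw string so that backslashes reach the `re` module unchanged and Python does not warn about an invalid escape sequence. The pattern and the `find_versions` helper are illustrative assumptions, not code from the commit.

```python
import re

# Raw string: "\d" and "\." are passed to the regex engine as written.
# Without the r prefix, "\d" is an invalid string escape and triggers a warning.
VERSION_PATTERN = re.compile(r"\d+\.\d+")


def find_versions(text: str) -> list:
    """Return every dotted number (such as 23.10) found in the text."""
    return VERSION_PATTERN.findall(text)
```

A raw string and its doubly escaped equivalent are the same value, for example `r"\d" == "\\d"`.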
Nuno Campos	e4ae690244	Sort order	2023-10-18 15:42:13 +01:00
Nuno Campos	b753bf3323	Make prompt validation opt-in By default replace input_variables with the correct value	2023-10-18 10:46:22 +01:00
Nuno Campos	202acce0c9	Ensure dict() does not raise not implemented error, which should instead be raised in our custom method save()	2023-10-18 09:44:41 +01:00
Predrag Gruevski	392df7b2e3	Type hints on varargs and kwargs that take anything should be `Any`. (#11950 ) Type hinting `*args` as `List[Any]` means that each positional argument should be a list. Type hinting `**kwargs` as `Dict[str, Any]` means that each keyword argument should be a dict with string keys. This is almost never what we actually wanted, and doesn't seem to be what we want in any of the cases I'm replacing here.	2023-10-17 21:31:44 -04:00
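The distinction described in #11950 can be sketched as follows. The annotation on a variadic parameter describes each individual argument, not the collected tuple or dict. Both function names are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List


def too_strict(*args: List[Any], **kwargs: Dict[str, Any]) -> int:
    # To a type checker, every positional argument must itself be a list
    # and every keyword argument must itself be a dict with string keys.
    return len(args) + len(kwargs)


def accepts_anything(*args: Any, **kwargs: Any) -> int:
    # Each positional and keyword argument may be of any type.
    return len(args) + len(kwargs)
```

A call such as `accepts_anything(1, "a", key=2)` type-checks, whereas the same call to `too_strict` would be flagged by a type checker even though it runs without error.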
Eugene Yurtsev	908c7bf33e	Add documentation to tools (#11938 ) Add security notes to tools --------- Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>	2023-10-17 21:27:59 -04:00
Eugene Yurtsev	43dc669332	Update playwright documentation (#11949 ) Add security note to playwright tool	2023-10-17 21:22:26 -04:00
Daniel Chalef	2beb767ae5	zep: Memory Retriever MMR Support & Docs Updates (#11954 ) - Update Zep Memory and Retriever docstrings - Zep Memory Retriever: Add support for native MMR - Add MMR example to existing ZepRetriever Notebook @baskaryan	2023-10-17 16:35:11 -07:00
William FH	a27fa9bf10	Use traceable context (#11896 ) Example ``` from langchain.schema.runnable import RunnableLambda from langsmith import traceable chain = RunnableLambda(lambda x: x) @traceable(run_type = "chain") def my_traceable(a): chain.invoke(a) my_traceable(5) ``` Would have a nested result. This would NOT work for interleaving chains and traceables. E.g., things like this would still not work well ``` from langchain.schema.runnable import RunnableLambda from langsmith import traceable @traceable() def other_traceable(a): return a def foo(x): return other_traceable(x) chain = RunnableLambda(foo) @traceable(run_type = "chain") def my_traceable(a): chain.invoke(a) my_traceable(5) ```	2023-10-17 15:10:20 -07:00
Predrag Gruevski	dcd0392423	Upgrade to newer black (23.10) and ruff (first 0.1.x!) versions. (#11944 ) Minor lint dependency version upgrade to pick up latest functionality. Ruff's new v0.1 version comes with lots of nice features, like fix-safety guarantees and a preview mode for not-yet-stable features: https://astral.sh/blog/ruff-v0.1.0	2023-10-17 17:24:51 -04:00
Trayan Azarov	1fd21ed21c	Chroma batching (#11203 ) - Description: Chroma >= 0.4.10 added support for batch sizes validation of add/upsert. This batch size is dependent on the SQLite limits of the target system and varies. In this change, for Chroma>=0.4.10 batch splitting was added as the aforementioned validation is starting to surface in the Chroma community (users using LC) - Issue: N/A - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: t_azarov	2023-10-17 13:59:42 -07:00
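The batch splitting mentioned in #11203 can be sketched generically. This is not Chroma's API or the code from the commit: `split_into_batches` and `max_batch_size` are hypothetical names, and the sketch assumes the caller already knows the maximum batch size the target system accepts.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_batches(items: Sequence[T], max_batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most max_batch_size items."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(items), max_batch_size):
        yield list(items[start:start + max_batch_size])
```

Each yielded chunk could then be passed to a separate add or upsert call so that no single call exceeds the limit.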
Guy Korland	9373b9c004	Add Graph interface (#11012 ) - Description: Add a Graph interface - Tag maintainer: @baskaryan @hwchase17 - Twitter handle: @g_korland	2023-10-17 13:54:05 -07:00
DanielZzz	b647505280	feat: support ChatModels Qianfan `QianfanChatEndpoint` function_call (#11107 ) - Description: * feature for `QianfanChatEndpoint` function_call ability, add integration_test for it * add `model`, `endpoint` supported in calling params * add raw response in ChatModel Message - Issue: * #10867 * #11105 * #10215 - Dependencies: no - Tag maintainer: @baskaryan - Twitter handle: no	2023-10-17 13:33:55 -07:00
M Bharat lal	67300567d3	GCSFileLoader retrieve blob custom metadata and append to document metadata (#11066 ) - Description: GCSFileLoader retrieve blob's custom metadata and append to document's metadata - Issue: #9975, - Tag maintainer: @baskaryan please review Co-authored-by: b0l00ib <bharat.lal@walmart.com>	2023-10-17 12:17:59 -07:00
billytrend-cohere	f4742dce50	Add Cohere retrieval augmented generation to retrievers (#11483 ) Add Cohere retrieval augmented generation to retrievers --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-17 11:51:04 -07:00
刘方瑞	0a24ac7388	Revised notebook and add delete to MyScale vector store (#11848 ) - Description: - Add `.delete` to myscale vector store. - Revised vector store notebooks - Tag maintainer: @baskaryan - Twitter handle: @myscaledb @mpsk_liu	2023-10-17 11:42:21 -07:00
John Mai	3fb5e4d185	Add Baichuan chat model (#11923 ) Description: A large language model developed by Baichuan Intelligent Technology, https://www.baichuan-ai.com/home Issue: None Dependencies: None Tag maintainer: Twitter handle:	2023-10-17 11:30:57 -07:00
Eugene Yurtsev	9ecb7240a4	Add security note to recursive url loader (#11934 ) Add security note to recursive loader	2023-10-17 13:41:43 -04:00
maks-operlejn-ds	42dcc502c7	Anonymizer small fixes (#11915 )	2023-10-17 10:27:29 -07:00
Eugene Yurtsev	90e9ec6962	Sitemap specify default filter url (#11925 ) Specify default filter URL in sitemap loader and add a security note --------- Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>	2023-10-17 13:19:27 -04:00
Bagatur	ba0d729961	bump 316 (#11928 )	2023-10-17 09:47:57 -07:00
Eugene Yurtsev	12d7eaa0c2	Add security notices to toolkits (#11900 ) This adds security notices to toolkits init, and to several toolkits. We'll need to continue documenting the rest of the toolkits. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-17 11:45:09 -04:00
Eugene Yurtsev	5f4a697ce3	Add deprecation warnings (#11899 ) Add deprecation warnings Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-17 10:59:38 -04:00
Nuno Campos	8b79cf9566	Add lock for using global config enum weak map (#11920 )	2023-10-17 15:50:35 +01:00
Nuno Campos	2a8ded6c8c	Export merge_configs function (#11916 )	2023-10-17 15:36:11 +01:00
Nuno Campos	778e7c526e	Add comment	2023-10-17 15:29:39 +01:00
Nuno Campos	19319e1746	Allow configs with None values	2023-10-17 15:23:58 +01:00
Nuno Campos	b0d5882fe1	Export merge_configs function	2023-10-17 13:22:07 +01:00
Nuno Campos	12596b9a9b	Add validation for configurable keys passed to .with_config() - Fix some typing issues found while doing that	2023-10-17 08:50:31 +01:00
Nuno Campos	754aca794f	remove print	2023-10-17 08:46:07 +01:00
Nuno Campos	cf448a6314	Ensure that configurable fields with enums support deduplication	2023-10-17 08:25:21 +01:00
Leonid Ganeline	31f264169d	evaluation criteria (#11681 ) The updated value was: ` Criteria.MISOGYNY: "Is the submission misogynistic? If so, respond Y." ` The " If so, respond Y." should not be here. This substring is not present in any other criterion and should not be present here. I also added a synonym for "misogynistic", as is done in many other criteria.	2023-10-16 21:05:08 -07:00
Dmitry Tyumentsev	e8c1850369	Add YandexGPT LLM and Chat model (#11703 ) Description: Introducing an ability to work with the [YandexGPT](https://cloud.yandex.com/en/services/yandexgpt) language model.	2023-10-16 20:30:07 -07:00
Bagatur	c15701eebf	Revert "Add baichuan model" (#11901 ) cc @cloudscool, apologies your PR wasn't actually passing CI	2023-10-16 20:01:12 -07:00
cloudscool	c1d811c4bc	Add baichuan model	2023-10-16 19:27:35 -07:00
John Mai	0169d45ba8	Supported OutputFixingParser max_retries (#11754 ) Description: Supported OutputFixingParser max_retries - max_retries: Maximum number of retries for the parser. Issue: None Dependencies: None Tag maintainer: @baskaryan Twitter handle: @JohnMai95 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-16 19:25:47 -07:00
volodymyr-memsql	ff8e6981ff	SingleStoreDBChatMessageHistory: Add singlestoredb support for ChatMessageHistory (#11705 ) Description - Added the `SingleStoreDBChatMessageHistory` class that inherits from `BaseChatMessageHistory` and allows a SingleStoreDB database to be used as storage for chat message history. - Added integration test to check that everything works (requires `singlestoredb` to be installed) - Added notebook with usage example - Removed custom retriever for SingleStoreDB vector store (as it is useless) --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>	2023-10-16 21:59:45 -04:00
Mohammad Mohtashim	634ccb8ccd	test_stream_log_retriever Unit Test + Tool names fix (#11808 ) ## Description \| Tool \| Original Tool Name \| \|-----------------------------\|---------------------------\| \| open-meteo-api \| Open Meteo API \| \| news-api \| News API \| \| tmdb-api \| TMDB API \| \| podcast-api \| Podcast API \| \| golden_query \| Golden Query \| \| dall-e-image-generator \| Dall-E Image Generator \| \| twilio \| Text Message \| \| searx_search_results \| Searx Search Results \| \| dataforseo \| DataForSeo Results JSON \| When using these tools through `load_tools`, I encountered the following validation error: ```console openai.error.InvalidRequestError: 'TMDB API' does not match '^[a-zA-Z0-9_-]{1,64}$' - 'functions.0.name' ``` In order to avoid this error, I replaced spaces with hyphens in the tool names: \| Tool \| Corrected Tool Name \| \|-----------------------------\|---------------------------\| \| open-meteo-api \| Open-Meteo-API \| \| news-api \| News-API \| \| tmdb-api \| TMDB-API \| \| podcast-api \| Podcast-API \| \| golden_query \| Golden-Query \| \| dall-e-image-generator \| Dall-E-Image-Generator \| \| twilio \| Text-Message \| \| searx_search_results \| Searx-Search-Results \| \| dataforseo \| DataForSeo-Results-JSON \| This correction resolved the validation error. Additionally, a unit test, `tests/unit_tests/schema/runnable/test_runnable.py::test_stream_log_retriever`, was failing at random. Upon further investigation, I confirmed that the failure was not related to the above-mentioned changes. The `stream_log` variable was generating the order of logs in two ways at random. The reason for this behavior is unclear, but in the assertion, I included both possible orders to account for this variability.	2023-10-16 18:46:19 -07:00
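The tool-name fix in #11808 can be checked with a small sketch. The pattern is taken from the error message quoted in the commit; `to_function_name` is a hypothetical helper that mirrors the space-to-hyphen replacement described there, not a LangChain function.

```python
import re

# Pattern quoted in the validation error from the commit message.
FUNCTION_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_-]{1,64}$")


def is_valid_function_name(name: str) -> bool:
    """Check a tool name against the quoted function-name pattern."""
    return FUNCTION_NAME_PATTERN.match(name) is not None


def to_function_name(name: str) -> str:
    """Replace spaces with hyphens, as the commit did for the tool names."""
    return name.replace(" ", "-")
```

For example, `"TMDB API"` fails the check because of the space, while `to_function_name("TMDB API")` gives `"TMDB-API"`, which passes.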
Predrag Gruevski	7c0f1bf23f	Upgrade experimental package dependencies and use Poetry 1.6.1. (#11339 ) Part of upgrading our CI to use Poetry 1.6.1.	2023-10-16 21:13:31 -04:00
Eugene Yurtsev	c2c0814a94	Add security notice to file management tool (#11878 ) Add security notice to file management tool --------- Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>	2023-10-16 21:12:13 -04:00
zhaoshengbo	cb7e12f6ba	Adapt to the latest version of Alibaba Cloud OpenSearch vector store API (#11849 ) Hello Folks, Alibaba Cloud OpenSearch has released a new version of the vector storage engine, which has significantly improved performance compared to the previous version. The SDK has also changed, so the Alibaba OpenSearch vector store code needs to be adjusted to adapt. This PR includes: Adapt to the latest version of Alibaba Cloud OpenSearch API. More comprehensive unit testing. Improve documentation. I have read your contributing guidelines. And I have passed the tests below - [x] make format - [x] make lint - [x] make coverage - [x] make test --------- Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>	2023-10-16 18:07:24 -07:00
Lee	e669f9d731	Fix: Sitemap Document Loader Tests and Documentation (#11866 ) Description: While working on the Docusaurus site loader #9138, I noticed some outdated docs and tests for the Sitemap Loader. Issue: This is tangentially related to #6691 in reference to doc links. I plan on digging in to a few of these issues when I next find time.	2023-10-16 17:42:10 -07:00
Jean-Louis Queguiner	8b697ff0ee	feat(llm): add together.xyz as an LLM provider (#11892 ) - Description: added together.xyz as an LLM provider, - Issues: fix some linting issues - twitter handle @jilijeanlouis --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-16 17:08:04 -07:00
Leonid Kuligin	d269dd2e2f	added a multiturn search based on Vertex AI Search (#11885 ) - Description: Added a retriever based on multi-turn Vertex AI Search - Twitter handle: lkuligin	2023-10-16 17:05:12 -07:00
Leonid Kuligin	38ed55245f	added Vertex examples as attributes (#11890 ) - Description: added examples to Vertex chat models as optional class attributes, so that a model with examples can be used inside a chain - Twitter handle: lkuligin	2023-10-16 16:55:45 -07:00
eryk-dsai	5019f59724	fix: more robust check whether the HF model is quantized (#11891 ) Removes the check of `model.is_quantized` and adds more robust way of checking for 4bit and 8bit quantization in the `huggingface_pipeline.py` script. I had to make the original change on the outdated version of `transformers`, because the models had this property before. Seems redundant now. Fixes: https://github.com/langchain-ai/langchain/issues/11809 and https://github.com/langchain-ai/langchain/issues/11759	2023-10-16 16:54:20 -07:00
Eugene Yurtsev	210a48cfb5	Add security considerations (#11869 ) Add security considerations to existing graph tools.	2023-10-16 12:23:48 -04:00
Bagatur	25b1d65305	bump 315 (#11850 )	2023-10-16 00:50:54 -07:00
Nuno Campos	4321d192ea	Use a less specific return type for \| on Runnables (#11762 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-15 21:15:06 +01:00
Harrison Chase	a506302772	bearly tool (#11812 )	2023-10-14 16:03:58 -07:00
Harrison Chase	4a2f0c51a1	use get_llm_cache and set_llm_cache (#11741 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-14 09:29:30 -07:00
Harrison Chase	f3ad22e64a	pipe default key (#11788 )	2023-10-14 08:39:23 +01:00

1943 Commits