langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-08 07:10:35 +00:00

Author	SHA1	Message	Date
Erick Friis	0c81cd923e	oai v1 embeddings (#12969 ) Initial PR to get OpenAIEmbeddings working with the new sdk fyi @rlancemartin Fixes #12943 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-06 18:52:33 -08:00
Bagatur	fdbb45d79e	bump 331rc1 (#12965 )	2023-11-06 15:36:43 -08:00
Bagatur	3bb8030a6e	fix max_tokens (#12964 )	2023-11-06 15:36:05 -08:00
Bagatur	a9002a82b8	bump 331rc0 (#12963 )	2023-11-06 15:19:33 -08:00
Harrison Chase	c27400efeb	Support multimodal messages (#11320 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-06 15:14:18 -08:00
Bagatur	4f7dff9d66	Record system fingerprint chat openai (#12960 )	2023-11-06 14:25:53 -08:00
Bagatur	8e0cb2eb84	ChatOpenAI and AzureChatOpenAI openai>=1 compatible (#12948 )	2023-11-06 13:24:18 -08:00
Kacper Łukawski	52d0055a91	Add support of Cohere Embed v3 (#12940 ) Cohere released the new embedding API (Embed v3: https://txt.cohere.com/introducing-embed-v3/) that treats document and query embeddings differently. This PR updated the `CohereEmbeddings` to use them appropriately. It also works with the old models.	2023-11-06 15:06:58 -05:00
Praveen Venkateswaran	8e0dcb37d2	Add SecretStr for Symbl.ai Nebula API (#12896 ) Description: This PR masks API key secrets for the Nebula model from Symbl.ai Issue: #12165 Maintainer: @eyurtsev --------- Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>	2023-11-06 14:13:59 -05:00
Vinzenz Klass	59d0bd2150	feat: acquire advisory lock before creating extension in pgvector (#12935 ) - Description: Acquire advisory lock before attempting to create extension on postgres server, preventing errors in concurrent executions. - Issue: #12933 - Dependencies: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-06 14:00:39 -05:00
Eugene Yurtsev	b376854b26	Fix for anyscale chat model api key (#12938 ) * ChatAnyscale was missing coercion to SecretStr for anyscale api key * The model inherits from ChatOpenAI so it should not force the openai api key to be secret str until openai model has the same changes https://github.com/langchain-ai/langchain/issues/12841	2023-11-06 13:28:02 -05:00
hmasdev	622bf12c2e	fix regex pattern of structured output parser (#12929 ) - Description: fix the regex pattern of [StructuredChatOutputParser](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/agents/structured_chat/output_parser.py#L18) and add unit tests for the code change. - Issue: #12158 #12922 - Dependencies: None - Tag maintainer: - Twitter handle: @hmdev3 - NOTE: This PR conflicts #7495 . After #7495 is merged, I am going to update PR.	2023-11-06 07:53:14 -08:00
wemysschen	8d7144e6a6	fix baiducloud directory loader import file loader (#12924 ) Issue: fix baiducloud BOS directory loader imports its file loader --------- Co-authored-by: wemysschen <root@icoding-cwx.bcc-szzj.baidu.com>	2023-11-06 07:52:31 -08:00
Kacper Łukawski	621419f71e	Fix normalizing the cosine distance in Qdrant (#12934 ) Qdrant was incorrectly calculating the cosine similarity and returning `0.0` for the best match, instead of `1.0`. Internally Qdrant returns a cosine score from `-1.0` (worst match) to `1.0` (best match), and the current formula reflects it.	2023-11-06 07:36:59 -08:00
Hech	8fe6bcc662	Fix return metadata when searching for DingoDB (#12937 )	2023-11-06 07:35:36 -08:00
Jakub Novák	ada3d2cbd1	Add possibility to pass on_artifacts for a specific conversation (#12687 ) Makes it possible to pass on_artifacts to a single conversation via the run metadata: ```python result = agent.run( input=message.text, metadata={ "on_artifact": CALLBACK_FUNCTION }, ) ```	2023-11-06 07:29:47 -08:00
Bagatur	53f453f01a	bump 331 (#12932 )	2023-11-06 05:58:12 -08:00
Erick Friis	5000c7308e	cli template gitignores (#12914 ) - ap gitignore - package	2023-11-05 22:34:45 -08:00
Harrison Chase	aba407f774	use keys not items (#12918 )	2023-11-05 22:08:29 -08:00
wemysschen	e14aa37d59	fix bes vector store search (#12828 ) Issue: fix search body in baidu cloud vectorsearch --------- Co-authored-by: wemysschen <root@icoding-cwx.bcc-szzj.baidu.com>	2023-11-03 15:39:19 -07:00
Lance Martin	ea1ab391d4	Open Clip multimodal embeddings (#12754 )	2023-11-03 13:33:36 -07:00
Bagatur	ebee616822	bump 330 (#12853 )	2023-11-03 13:26:41 -07:00
Erick Friis	6c237716c4	Update readmes with new cli install (#12847 ) Old command still works. Just simplifying. Merge after releasing CLI 0.0.15	2023-11-03 12:10:32 -07:00
Erick Friis	7db49d3842	Confirm sys.path includes current dir for app serve (#12851 ) - Make sure sys.path is set properly for langchain app serve - bump	2023-11-03 11:37:20 -07:00
Erick Friis	1bc35f61cb	CLI 0.0.14, Uvicorn update and no more [serve] (#12845 ) Calls uvicorn directly from cli: Reload works if you define app by import string instead of object. (was doing subprocess in order to get reloading) Version bump to 0.0.14 Remove the need for [serve] for simplicity. Readmes are updated in #12847 to avoid cluttering this PR	2023-11-03 11:05:52 -07:00
William FH	18005c6384	Disable trace_on_chain_group auto-tracing (#12807 ) Previously we treated trace_on_chain_group as a command to always start tracing. This is unintuitive (makes the function do 2 things), and makes it harder to toggle tracing	2023-11-03 10:05:09 -07:00
Erick Friis	0da75b9ebd	Autopopulate module name in cli init (#12814 )	2023-11-02 23:45:38 -07:00
William FH	98aff29fbd	Add Dataset Page to printout (#12816 )	2023-11-02 20:36:56 -07:00
Manuel Rech	2e2b9c76d9	Keep also original query - multi_query.py (#12696 ) When you use a MultiQuery it might be useful to use the original query as well as the newly generated ones, to maximise the chances of retrieving the correct document. I haven't created an issue, as it seems a very small and easy change. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 18:15:02 -07:00
Bagatur	658a3a8607	FEAT: Merge TileDB vecstore (#12811 )	2023-11-02 17:40:32 -07:00
Akio Nishimura	c04647bb4e	Correct number of elements in config list in `batch()` and `abatch()` of `BaseLLM` (#12713 ) - Description: Correct number of elements in config list in `batch()` and `abatch()` of `BaseLLM` in case `max_concurrency` is not None. - Issue: #12643 - Twitter handle: @akionux --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 17:28:48 -07:00
James Braza	88b506b321	Adds missing `urllib.parse` for IDE warning of `PubMedAPIWrapper` (#12808 ) Resolves an IDE (PyCharm 2023.2.3 PE) warning around `urllib.parse.quote`, also enabling CTRL-click	2023-11-02 17:27:25 -07:00
Bagatur	a2bb0dd445	TileDB update import unit tests	2023-11-02 17:24:22 -07:00
Nikos Papailiou	2fdaa1e5fd	Add TileDB vectorstore implementation (#12624 ) - Description: Add [TileDB](https://tiledb.com) vectorstore implementation. TileDB offers ANN search capabilities using the [TileDB-Vector-Search](https://github.com/TileDB-Inc/TileDB-Vector-Search) module. It provides serverless execution of ANN queries and storage of vector indexes both on local disk and cloud object stores (i.e. AWS S3). More details in: - [Why TileDB as a Vector Database](https://tiledb.com/blog/why-tiledb-as-a-vector-database) - [TileDB 101: Vector Search](https://tiledb.com/blog/tiledb-101-vector-search) - Twitter handle: @tiledb	2023-11-02 17:21:03 -07:00
盐粒 Yanli	1b233798a0	feat: Support pgvecto.rs as a VectorStore (#12718 ) Support [pgvecto.rs](https://github.com/tensorchord/pgvecto.rs) as a new VectorStore type. This introduces a new dependency [pgvecto_rs](https://pypi.org/project/pgvecto_rs/) and upgrades SQLAlchemy to ^2. Related to https://github.com/tensorchord/pgvecto.rs/issues/11 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 17:16:04 -07:00
Daniel Chalef	0cbdba6a9b	zep: VectorStore: Use Native MMR (#12690 ) - refactor to use Zep's native MMR; update example - @baskaryan @eyurtsev	2023-11-02 16:45:42 -07:00
Daniel Chalef	cc3d3920e3	Zep: Summary Search and Example (#12686 ) Zep now has the ability to search over chat history summaries. This PR adds support for doing so. More here: https://blog.getzep.com/zep-v0-17/ @baskaryan @eyurtsev	2023-11-02 16:31:11 -07:00
Bagatur	526313002c	add import tests to all modules (#12806 )	2023-11-02 15:32:55 -07:00
Harrison Chase	6609a6033f	fix vectorstore imports (#12804 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 15:32:31 -07:00
Nuno Campos	f66a9d2adf	Automatically add configurable key to config_schema if config_specs is present (#12798 )	2023-11-02 21:46:15 +00:00
Praveen Venkateswaran	21eeba075c	enable the device_map parameter in huggingface pipeline (#12731 ) ### Enabling `device_map` in HuggingFacePipeline For multi-gpu settings with large models, the [accelerate](https://huggingface.co/docs/accelerate/usage_guides/big_modeling#using--accelerate) library provides the `device_map` parameter to automatically distribute the model across GPUs / disk. The [Transformers pipeline](`3520e37e86/src/transformers/pipelines/__init__.py (L543)`) enables users to specify `device` (or) `device_map`, and handles cases (with warnings) when both are specified. However, Langchain's HuggingFacePipeline only supports specifying `device` when calling transformers which limits large models and multi-gpu use-cases. Additionally, the [default value](`8bd3ce59cd/libs/langchain/langchain/llms/huggingface_pipeline.py (L72)`) of `device` is initialized to `-1` , which is incompatible with the transformers pipeline when `device_map` is specified. This PR addresses the addition of `device_map` as a parameter , and solves the incompatibility of `device = -1` when `device_map` is also specified. An additional test has been added for this feature. Additionally, some existing tests no longer work since 1. `max_new_tokens` has to be specified under `pipeline_kwargs` and not `model_kwargs` 2. The GPT2 tokenizer raises a `ValueError: Pipeline with tokenizer without pad_token cannot do batching`, since the `tokenizer.pad_token` is `None` ([related issue](https://github.com/huggingface/transformers/issues/19853) on the transformers repo). This PR handles fixing these tests as well. Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>	2023-11-02 14:29:06 -07:00
Mark Bell	3276aa3e17	__getattr__ should raise AttributeError not ImportError on missing attributes (#12801 ) [The python spec](https://docs.python.org/3/reference/datamodel.html#object.__getattr__) requires that `__getattr__` throw `AttributeError` for missing attributes but there are several places throwing `ImportError` in the current code base. This causes a specific problem with `hasattr` since it calls `__getattr__` then looks only for `AttributeError` exceptions. At present, calling `hasattr` on any of these modules will raise an unexpected exception that most code will not handle, as `hasattr` throwing exceptions is not expected. In our case this is triggered by an exception tracker (Airbrake) that attempts to collect the version of all installed modules with code that looks like: `if hasattr(mod, "__version__"):`. With `HEAD` this is causing our exception tracker to fail on all exceptions. I only changed instances of unknown attributes raising `ImportError` and left instances of known attributes raising `ImportError`. It feels a little weird but doesn't seem to break anything.	2023-11-02 17:08:54 -04:00
Illia	71d1a48b66	Use data from all Google search results in SerpApi.com wrapper (#12770 ) - Description: Use all Google search results data in SerpApi.com wrapper instead of the first one only - Tag maintainer: @hwchase17 _P.S. `libs/langchain/tests/integration_tests/utilities/test_serpapi.py` are not executed during the `make test`._	2023-11-02 13:31:27 -07:00
Nuno Campos	c4fdf78d03	Fix AddableDict raising exception when used with non-addable values (#12785 )	2023-11-02 18:56:29 +00:00
Erick Friis	49e283a0cd	CLI 0.0.13, Configurable Template Demo (#12796 )	2023-11-02 11:42:57 -07:00
Nuno Campos	d1c6ad7769	Fix on_llm_new_token(chunk=) for some chat models (#12784 ) It was passing in message instead of generation	2023-11-02 16:33:44 +00:00
Erick Friis	070823f294	CLI 0.0.12 (#12787 )	2023-11-02 08:29:27 -07:00
Bagatur	979501c0ca	bump 329 (#12778 )	2023-11-02 06:02:43 -07:00
Erick Friis	da821320d3	Fixes 'Nonetype' not iterable for ObsidianLoader (#12751 ) Implements #12726 from @Di3mex	2023-11-01 16:07:09 -07:00
Eugene Yurtsev	b1caae62fd	APIChain add restrictions to domains (CVE-2023-32786) (#12747 ) * Restrict the chain to specific domains by default * This is a breaking change, but it will fail loudly upon object instantiation -- so there should be no silent errors for users * Resolves CVE-2023-32786	2023-11-01 18:50:34 -04:00
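
The `hasattr` problem described in 3276aa3e17 (#12801) can be reproduced in isolation. The following is a minimal sketch, not the code from that PR: `make_lazy_module` is a hypothetical helper that builds a throwaway module with a module-level `__getattr__`, so the two exception types can be compared side by side.

```python
import types


def make_lazy_module(name, known, error_type):
    """Build a module whose __getattr__ raises `error_type` for unknown names.

    `make_lazy_module`, `known` and `error_type` are illustrative names,
    not part of LangChain.
    """
    mod = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails.
        if attr in known:
            return known[attr]
        raise error_type(f"module {name!r} has no attribute {attr!r}")

    # A module-level __getattr__ is looked up in the module's __dict__.
    mod.__getattr__ = __getattr__
    return mod


good = make_lazy_module("good_mod", {"x": 1}, AttributeError)
bad = make_lazy_module("bad_mod", {"x": 1}, ImportError)

# hasattr swallows AttributeError and reports False...
print(hasattr(good, "__version__"))

# ...but any other exception type escapes to the caller.
try:
    hasattr(bad, "__version__")
except ImportError as exc:
    print("hasattr raised:", exc)
```

Raising `AttributeError` for unknown names keeps `hasattr(mod, "__version__")` safe to call, which is the pattern the exception tracker in the commit message relies on.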

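The Qdrant fix in 621419f71e (#12934) states that the raw cosine score runs from `-1.0` (worst match) to `1.0` (best match) and that the best match should normalize to `1.0`. One linear mapping consistent with that description is sketched below; the function name is hypothetical and the exact formula used in the PR is an assumption here.

```python
def normalize_cosine_score(score: float) -> float:
    """Map a cosine score in [-1.0, 1.0] to a relevance value in [0.0, 1.0].

    Assumed mapping: (score + 1) / 2, so -1.0 -> 0.0 and 1.0 -> 1.0.
    """
    # Clamp first, since floating-point rounding can push a score
    # slightly outside the nominal range.
    clamped = max(-1.0, min(1.0, score))
    return (clamped + 1.0) / 2.0


for raw in (-1.0, 0.0, 1.0):
    print(raw, "->", normalize_cosine_score(raw))
```
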

1707 Commits