langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-10 01:10:59 +00:00

Author	SHA1	Message	Date
Dmitrii Rashchenko	aaabc1574f	Support of custom hugging face inference endpoints url (#14125 ) - Description: to support not only publicly available Hugging Face endpoints, but also protected ones (created with "Inference Endpoints" Hugging Face feature), I have added ability to specify custom api_url. But if not specified, default behaviour won't change - Issue: #9181, - Dependencies: no extra dependencies	2023-12-04 12:08:51 -08:00
Harrison Chase	e32185193e	Harrison/embass (#14242 ) Co-authored-by: Julius Lipp <lipp.julius@gmail.com>	2023-12-04 11:58:52 -08:00
umair mehmood	8504ec56e4	fixed: ModuleNotFoundError: No module named 'clarifai.auth' (#14215 ) Updated the clarifai imports fixed: #14175 @efriis @baskaryan	2023-12-04 11:53:34 -08:00
Hieu Lam	ca8a022cd9	Fixed OpenAIFunctionsAgent not returning when receiving AgentFinish (#14236 ) Description: The way the condition is checked in the `return_stopped_response` function of `OpenAIAgent` may not be correct, when the value returned is `AgentFinish` from the tools it does not work properly. Thanks for review, @baskaryan, @eyurtsev, @hwchase17.	2023-12-04 11:43:04 -08:00
Unai Garay Maestre	6826feea14	Adds `llm_chain_kwargs` to `BaseRetrievalQA.from_llm` (#14224 ) - Description: Adds `llm_chain_kwargs` to `BaseRetrievalQA.from_llm` so these can be passed to the LLM at runtime, - Issue: https://github.com/langchain-ai/langchain/issues/14216, --------- Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>	2023-12-04 11:34:01 -08:00
James Braza	6ce5dab38c	Clarifying descriptions in `GuardrailsOutputParser` (#14228 ) Upstreaming knowledge from https://github.com/guardrails-ai/guardrails/discussions/473 to LangChain	2023-12-04 11:33:22 -08:00
geret1	50aee687c6	langchain[patch]: Cerebrium model_api_request deprecation (#12704 ) - Description: As part of my conversation with the Cerebrium team, `model_api_request` will no longer be available in the cerebrium lib, so it needs to be replaced. - Issue: #12705, - Dependencies: Cerebrium team (agreed) - Tag maintainer: @eyurtsev - Twitter handle: No official Twitter account sorry :D --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-04 09:26:32 -08:00
William FH	246dc4f9cc	langchain[patch]: Pass kwargs to chat fireworks (#14183 ) Otherwise `.bind()` isn't really any good	2023-12-03 15:12:02 -08:00
Kaiboon Ee	e961c57fd2	langchain[patch]: Mask API key for Arcee LLM (#14193 ) - Description: Mask API key for Arcee LLM and its associated unit tests - Issue: https://github.com/langchain-ai/langchain/issues/12165 - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: `eekaiboon` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-03 15:11:43 -08:00
Daniyar Supiyev	092f302c0f	langchain[patch]: Asynchronous human-in-the-loop callback (#14195 ) Description: Adding a possibility to use asynchronous callback handler in human-in-the-loop validation tool. Very useful, for example, if you want to implement a validation over Telegram bot. Issue: - Dependencies: - --------- Co-authored-by: Daniyar_Supiyev <daniyar_supiyev@epam.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-03 14:57:07 -08:00
Mark Cusack	16c83f786c	Adds the Yellowbrick Data Warehouse as a supported vector store (#13820 ) - Description An integration to allow the Yellowbrick Data Warehouse to function as a vector store --------- Co-authored-by: markcusack <markcusack@markcusacksmac.lan> Co-authored-by: markcusack <markcusack@Mark-Cusack-sMac.local>	2023-12-03 13:35:53 -08:00
Hendrik Hogertz	e6862e6e7d	Fix Azure Openai function calling in streaming mode (#13768 ) - Description: This PR addresses an issue with the OpenAI API streaming response, where initially the key (arguments) is provided but the value is None. Subsequently, it updates with {"arguments": "{\n"}, leading to a type inconsistency that causes an exception. The specific error encountered is ValueError: additional_kwargs["arguments"] already exists in this message, but with a different type. This change aims to resolve this inconsistency and ensure smooth API interactions. - Issue: None. - Dependencies: None. - Tag maintainer: @eyurtsev This is an updated version of #13229 based on the refactored code. Credit goes to @superken01. Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-12-03 12:07:15 -08:00
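The type clash described in the Azure streaming fix above can be illustrated with a small merge helper. This is only a sketch under assumptions: `merge_streamed_kwargs` is a hypothetical name rather than a LangChain function, and treating `None` as "no value yet" is one plausible way to tolerate such chunks, not necessarily what the PR does.

```python
from typing import Any, Dict


def merge_streamed_kwargs(left: Dict[str, Any], right: Dict[str, Any]) -> Dict[str, Any]:
    """Merge two streamed chunks of keyword arguments (hypothetical helper)."""
    merged = dict(left)
    for key, value in right.items():
        if key not in merged or merged[key] is None:
            # First real value for this key: a None placeholder is overwritten.
            merged[key] = value
        elif value is None:
            # A later None carries no information; keep what we already have.
            continue
        elif type(merged[key]) is not type(value):
            raise ValueError(
                f'additional_kwargs["{key}"] already exists in this message, '
                "but with a different type."
            )
        elif isinstance(value, str):
            # String fragments of the same key are concatenated in order.
            merged[key] += value
        elif isinstance(value, dict):
            merged[key] = merge_streamed_kwargs(merged[key], value)
        else:
            raise ValueError(f"Cannot merge values of type {type(value).__name__}")
    return merged


# The sequence from the commit message: the key arrives with None first,
# then with string fragments.
chunks = [{"arguments": None}, {"arguments": "{\n"}, {"arguments": '"x": 1}'}]
result: Dict[str, Any] = {}
for chunk in chunks:
    result = merge_streamed_kwargs(result, chunk)
print(result)
```

Without the `None` branches, the second chunk would hit the type check and raise the `ValueError` quoted in the commit message.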
Nicolò Boschi	e204657b3c	AstraDB VectorStore: implement pre_delete_collection (#13780 ) - Description: some vector stores have a flag to try deleting the collection before creating it (such as `vectorpg`). This is a useful flag when prototyping indexing pipelines and also for integration tests. Added the bool flag `pre_delete_collection` to the constructor (default False) - Tag maintainer: @hemidactylus - Twitter handle: nicoloboschi --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-12-03 12:06:20 -08:00
Chelsea E. Manning	2780d2d4dd	Extend OpenAIEmbeddings class to support non-`tiktoken` based embeddings (#13884 ) - Description: This extends `OpenAIEmbeddings` to add support for non-`tiktoken` based embeddings, specifically for use with the new `text-generation-webui` API (`--extensions openai`) which does not support `tiktoken` encodings, but rather strings - Issue: Not found, - Dependencies: HuggingFace `transformers.AutoTokenizer` is a new dependency for running the model without `tiktoken` - Tag maintainer: @baskaryan based on last commit for `langchain-core` refactor - Twitter handle: @xychelsea Modified the tokenization process to be model-agnostic, allowing for both OpenAI and non-OpenAI model tokenizations, by setting the new default `bool` flag `tiktoken_enabled` to `False`. This requires HuggingFace’s AutoTokenizer and handling tokenization for models requiring different preprocessing steps to generate a chunked string request rather than a list of integers. Updated the embeddings generation process to accommodate non-OpenAI models. This includes converting tokenized text into embeddings using OpenAI’s and Hugging Face’s model architectures.	2023-12-03 12:04:17 -08:00
Changgeng Zhao	9b59bde93d	Update Hologres vector store: use hologres-vector (#13767 ) Hi, I made some code changes on the Hologres vector store to improve the data insertion performance. Also, this version of the code uses `hologres-vector` library. This library is more convenient for us to update, and more efficient in performance. The code has passed the format/lint/spell check. I have run the unit test for Hologres connecting to my own database. Please check this PR again and tell me if anything needs to change. Best, Changgeng, Developer @ Alibaba Cloud Co-authored-by: Changgeng Zhao <zhaochanggeng.zcg@alibaba-inc.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-12-03 11:50:45 -08:00
Nicolò Boschi	0de7cf898d	Ensure AstraDB integration tests clean up the environment (#13774 ) - Description: currently astra_db integration tests might leave orphan collections - Tag maintainer: @hemidactylus - Twitter handle: nicoloboschi	2023-12-03 11:14:42 -08:00
Chad Norvell	8a0951d934	Fix Mathpix PDF loader integration (#13949 ) - Description: Fixes the Mathpix PDF loader API integration. Specifically, ensures that Mathpix auth headers are provided for every request, and ensures that we recognize all errors that can occur during a request. Also, the option to provide API keys as kwargs never actually worked before, but now that's fixed too. - Issue: #11249 - Dependencies: None	2023-12-03 10:36:49 -08:00
gzyJoy	32d4bb4590	Added Slacktoolkit (#14012 ) - Description: This PR introduces the Slack toolkit to LangChain, which allows users to read and write to Slack using the Slack API. Specifically, we've added the following tools. 1. get_channel: Provides a summary of all the channels in a workspace. 2. get_message: Gets the message history of a channel. 3. send_message: Sends a message to a channel. 4. schedule_message: Sends a message to a channel at a specific time and date. - Issue: This pull request addresses [Add Slack Toolkit #11747](https://github.com/langchain-ai/langchain/issues/11747) - Dependencies: package `slack_sdk` Note: For this toolkit to function you will need to add a Slack app to your workspace. Additional info can be found [here](https://slack.com/help/articles/202035138-Add-apps-to-your-Slack-workspace). --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: ArianneLavada <ariannelavada@gmail.com> Co-authored-by: ArianneLavada <84357335+ArianneLavada@users.noreply.github.com> Co-authored-by: ariannelavada@gmail.com <you@example.com>	2023-12-03 10:25:38 -08:00
Richie	99e5ee6a84	fix(vectorstores): incorrect import for mongodb atlas DriverInfo (#14060 ) - Description: fix `import` issue for `mongodb atlas` vector store integration - Issue: none - Dependencies: none While trying to follow the official `langchain` [mongodb integration guide](https://python.langchain.com/docs/integrations/vectorstores/mongodb_atlas), an import error occurs. It is caused by an incorrect import location: - `from pymongo import DriverInfo` should be `from pymongo.driver_info import DriverInfo` - reference: [pymongo's DriverInfo class](https://pymongo.readthedocs.io/en/stable/api/pymongo/driver_info.html#pymongo.driver_info.DriverInfo) Thanks!	2023-12-03 10:22:13 -08:00
James Braza	3833882ab7	Removing extra `StdOutCallbackHandler` overridden methods (#14136 ) Unnecessarily overridden methods: - Give the idea the subclass is doing something special (when it isn't) - Block CTRL-click to the actual method This PR removes some unnecessarily overridden methods in `StdOutCallbackHandler` Supersedes https://github.com/langchain-ai/langchain/pull/12858	2023-12-03 09:38:49 -08:00
James Braza	052e23be3e	Added Python `logging` tracer (#14190 ) This PR creates a logging handler and adds a simple unit test of it Supersedes https://github.com/langchain-ai/langchain/pull/12862 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-12-03 09:36:30 -08:00
Bob Lin	62505043be	Closed #14069 (#14166 ) ### Description Fix #14069 ### Twitter handle [lin_bob57617](https://twitter.com/lin_bob57617)	2023-12-03 08:55:25 -08:00
Yong woo Song	9938086df0	Fix Html2TextTransformer for shallow copy (#14197 ) Hi, There is some unintended behavior in Html2TextTransformer. The current code is directly modifying the original documents that are passed as arguments to the function. Therefore, not only the return of the function but also the input variables are being modified simultaneously. To resolve this, I added unit test code as well. reference link: [Shallow vs Deep Copying of Python Objects](https://realpython.com/copying-python-objects/) Thanks! ☺️	2023-12-03 08:45:35 -08:00
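The mutation problem described in the Html2TextTransformer fix above can be reproduced with plain Python objects. This is a minimal sketch, assuming a stand-in `Doc` dataclass in place of LangChain's document class and `str.upper()` in place of the real HTML-to-text conversion; both function names are hypothetical.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Stand-in for a document object (not LangChain's class)."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def transform_mutating(docs: List[Doc]) -> List[Doc]:
    # Edits each input object directly, so the caller's list changes too.
    for doc in docs:
        doc.page_content = doc.page_content.upper()
    return docs


def transform_copying(docs: List[Doc]) -> List[Doc]:
    # Works on deep copies, leaving the caller's objects untouched.
    new_docs = []
    for doc in docs:
        new_doc = copy.deepcopy(doc)
        new_doc.page_content = new_doc.page_content.upper()
        new_docs.append(new_doc)
    return new_docs


originals = [Doc("<p>hello</p>")]
converted = transform_copying(originals)
print(originals[0].page_content)  # input is unchanged
print(converted[0].page_content)

mutated_input = [Doc("<p>hello</p>")]
transform_mutating(mutated_input)
print(mutated_input[0].page_content)  # input was modified in place
```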
h3l	818252b1f8	Fix: (issue #14127 ) Volc Engine MaaS import error (#14194 ) - Description: fix Volc Engine MaaS import error - Issue: [#14127](https://github.com/langchain-ai/langchain/issues/14127) - Dependencies: None - Tag maintainer: @baskaryan - Twitter handle: Co-authored-by: lvzhong <lvzhong@bytedance.com>	2023-12-03 08:43:23 -08:00
Bagatur	0bdb434383	langchain[patch]: Release langchain 0.0.345 (#14184 )	2023-12-02 15:53:49 -08:00
Bagatur	15c04a5670	core[patch]: Release 0.0.9 (#14182 )	2023-12-02 14:40:56 -08:00
James Braza	bdb6ae2ed3	core[patch]: `BaseTracer` helper method for `Run` lookup (#14139 ) I observed the same run ID extraction logic is repeated many times in `BaseTracer`. This PR creates a helper method for DRY code.	2023-12-02 14:05:50 -08:00
Harutaka Kawamura	41ee3be95f	langchain[patch]: Support passing parameters to `llms.Databricks` and `llms.Mlflow` (#14100 ) Before, we need to use `params` to pass extra parameters: ```python from langchain.llms import Databricks Databricks(..., params={"temperature": 0.0}) ``` Now, we can directly specify extra params: ```python from langchain.llms import Databricks Databricks(..., temperature=0.0) ```	2023-12-01 19:27:18 -08:00
Abdul	82102c99b3	langchain[patch]: Running SQLDatabaseChain adds prefix "SQLQuery:\n" (#14058 ) - Issue: https://github.com/langchain-ai/langchain/issues/12077 --------- Co-authored-by: Abdul Kader Maliyakkal <maliyakk@amazon.com>	2023-12-01 19:26:16 -08:00
Samuel Kemp	fd781c89cc	langchain[minor]: add azure ai data document loader (#13404 ) This PR adds an "Azure AI data" document loader, which allows Azure AI users to load their registered data assets as a document object in langchain. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-01 19:25:55 -08:00
James Braza	24385a00de	core[minor], langchain[patch], experimental[patch]: Added missing `py.typed` to `langchain_core` (#14143 ) See PR title. From what I can see, `poetry` will auto-include this. Please let me know if I am missing something here. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-01 19:15:23 -08:00
quantum00549	f7c257553d	langchain[patch]: fixed a bug that was causing the streaming transfer to not work… (#10827 ) … properly Fixed a bug that was causing the streaming transfer to not work properly. - Description: 1. The on_llm_new_token method in the streaming callback can now be called properly in streaming transfer mode. 2. In streaming transfer mode, LLM can now correctly output the complete response instead of just the first token. - Tag maintainer: @wangxuqi - Twitter handle: @kGX7XJjuYxzX9Km --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-01 18:57:50 -08:00
Eugene Yurtsev	6d0209e0aa	Improve file system blob loader and generic loader (#14004 ) * Add support for passing a specific file to the file system blob loader * Allow specifying a class parameter for the parser for the generic loader ```python class AudioLoader(GenericLoader): @staticmethod def get_parser(kwargs): return MyAudioParser(kwargs) ``` The intent of the GenericLoader is to provide on-ramps from different sources (e.g., web, s3, file system). An alternative is to use pipelining syntax or creating a Pipeline ``` FileSystemBlobLoader(...) | MyAudioParser ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-01 21:23:40 -05:00
Lance Martin	cbe4753e1a	Update Open CLIP embd (#14155 ) Prior default model required a large amount of RAM and often crashed the Jupyter notebook kernel.	2023-12-01 15:13:20 -08:00
Amyh102	b6d26d3f9f	infra[patch]: Add unit tests for Huggingface dataset loader (#14053 ) - Description: Add unit tests for huggingface dataset loader and sample huggingface dataset for future tests. Updates dependencies for `datasets` module. - Adds coverage for [previous pull request](https://github.com/langchain-ai/langchain/pull/13864) - Tag maintainer: @hwchase17 --------- Co-authored-by: Amy Han <amyhan@Amys-Air.lan> Co-authored-by: Amy Han <amyhan@Amys-MacBook-Air.local> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-01 12:42:31 -08:00
Govinda Totla	62a3473ac0	docs[patch]: add text_splitter.py test (#14025 ) Description: Add HTMLHeaderTextSplitter unit test Dependencies: none	2023-12-01 11:57:50 -08:00
axiangcoding	1b36ddf16c	docs[patch]: add deprecated note for ErnieChatBot (#14061 ) - Description: just a little change to the ErnieChatBot class description, suggesting that users use a more suitable class - Issue: none, - Dependencies: none, - Tag maintainer: @baskaryan , - Twitter handle: none	2023-12-01 11:16:31 -08:00
Devin Dahoon Kim	32da0a4d71	langchain[patch]: use async_embed_with_retry in _aget_len_safe_embeddings (#14110 ) Description: `embed_with_retry` is for sync operations and not for async operations. Use `async_embed_with_retry` for appropriate async operations. I'm using `OpenAIEmbeddings(http_client=httpx.AsyncClient())` with only async operations. However, I got an error when I used `embedding.aembed_documents` because `embed_with_retry` uses the sync OpenAI client with an async http client.	2023-12-01 10:47:07 -08:00
lijie	371bcb7580	langchain[patch]: set maxsplit when parse python function docstring (#14121 ) Description: when the description of an arg in a Python docstring contains ":", `_parse_python_function_docstring` raises `ValueError: too many values to unpack (expected 2)`. A sample description would be: """ Args: error_arg: this is an arg with an additional ":" symbol """ So, set the `maxsplit` parameter to fix it.	2023-12-01 10:46:53 -08:00
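The `maxsplit` fix above comes down to standard `str.split` behavior. A minimal sketch follows, assuming a hypothetical helper `parse_arg_line` that handles a single `name: description` line; it is not the body of `_parse_python_function_docstring`.

```python
from typing import Tuple


def parse_arg_line(line: str) -> Tuple[str, str]:
    """Split 'name: description' on the first colon only (hypothetical helper)."""
    # maxsplit=1 guarantees at most two parts, so extra colons in the
    # description stay inside the description.
    name, description = line.split(":", maxsplit=1)
    return name.strip(), description.strip()


line = 'error_arg: this is an arg with an additional ":" symbol'

# Without maxsplit the line yields three parts and two-name unpacking fails.
try:
    name, description = line.split(":")
except ValueError as exc:
    print(f"unbounded split failed: {exc}")

name, description = parse_arg_line(line)
print(name)
print(description)
```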
Harrison Chase	ae646701c4	Harrison/ibm (#14133 ) Co-authored-by: Mateusz Szewczyk <139469471+MateuszOssGit@users.noreply.github.com>	2023-12-01 12:44:11 -05:00
Eugene Yurtsev	943aa01c14	Improve indexing performance for Postgres (remote database) for refresh for async API (#14132 ) This PR speeds up the indexing api on the async path by batching the uid updates in the sql record manager (which may be remote).	2023-12-01 12:10:07 -05:00
William FH	528fc76d6a	Update Prompt Format Error (#14044 ) The number of times I try to format a string (especially in lcel) is embarrassingly high. Think this may be more actionable than the default error message. Now I get nice helpful errors ``` KeyError: "Input to ChatPromptTemplate is missing variable 'input'. Expected: ['input'] Received: ['dialogue']" ```	2023-12-01 09:06:35 -08:00
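The more actionable error in the prompt format change above can be approximated with the standard library. This is a sketch under assumptions: `format_with_helpful_error` and `expected_variables` are hypothetical names, and it wraps plain `str.format` rather than `ChatPromptTemplate`.

```python
import string
from typing import Any, List


def expected_variables(template: str) -> List[str]:
    """List the named replacement fields that appear in a format string."""
    return [name for _, name, _, _ in string.Formatter().parse(template) if name]


def format_with_helpful_error(template: str, **inputs: Any) -> str:
    """Format the template, re-raising KeyError with expected and received names."""
    try:
        return template.format(**inputs)
    except KeyError as exc:
        raise KeyError(
            f"Input to template is missing variable {exc}. "
            f"Expected: {expected_variables(template)} "
            f"Received: {list(inputs)}"
        ) from exc


try:
    format_with_helpful_error("Summarize: {input}", dialogue="hi there")
except KeyError as exc:
    print(exc.args[0])
```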
William FH	71c2e184b4	[Nits] Evaluation - Some Rendering Improvements (#14097 ) - Improve rendering of aggregate results at the end - flatten reference if present	2023-12-01 09:06:07 -08:00
Mark Scannell	9b0e46dcf0	Improve indexing performance for Postgres (remote database) for refresh (#14126 ) Description: By combining the document timestamp refresh within a single call to update(), this enables batching of multiple documents in a single SQL statement. This is important for non-local databases where tens of milliseconds has a huge impact on performance when doing document-by-document SQL statements. Issue: #11935 Dependencies: None Tag maintainer: @eyurtsev	2023-12-01 11:36:02 -05:00
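The batching idea in the refresh change above, one SQL statement for many documents instead of one statement per document, can be sketched with `sqlite3`. The table name `records`, its columns, and the helper `refresh_timestamps` are assumptions for illustration; the actual record manager schema and the code in the PR may differ.

```python
import sqlite3
from typing import Sequence


def refresh_timestamps(conn: sqlite3.Connection, keys: Sequence[str], now: float) -> int:
    """Set updated_at for all given keys in a single UPDATE statement."""
    if not keys:
        return 0
    # One placeholder per key; values are still passed as bound parameters.
    placeholders = ", ".join("?" for _ in keys)
    cursor = conn.execute(
        f"UPDATE records SET updated_at = ? WHERE key IN ({placeholders})",
        [now, *keys],
    )
    conn.commit()
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (key TEXT PRIMARY KEY, updated_at REAL)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)", [("a", 0.0), ("b", 0.0), ("c", 0.0)]
)

updated = refresh_timestamps(conn, ["a", "b"], now=100.0)
print(updated)
print(conn.execute("SELECT key, updated_at FROM records ORDER BY key").fetchall())
```

Against a remote database, each statement costs a network round trip, so collapsing N updates into one removes N - 1 round trips. Very large key lists may need chunking to stay under the database's bound-parameter limit.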
Jacob Lee	3328507f11	langchain[patch], experimental[minor]: Adds OllamaFunctions wrapper (#13330 ) CC @baskaryan @hwchase17 @jmorganca Having a bit of trouble importing `langchain_experimental` from a notebook, will figure it out tomorrow ~Ah and also is blocked by #13226~ --------- Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-30 16:13:57 -08:00
Bagatur	4063bf144a	langchain[patch]: release 0.0.344 (#14095 )	2023-11-30 15:57:11 -08:00
Bagatur	efce352d6b	core[patch]: release 0.0.8 (#14086 )	2023-11-30 15:12:06 -08:00
Harutaka Kawamura	0d08a692a3	langchain[minor]: Migrate mlflow and databricks classes to deployments APIs. (#13699 ) ## Description Related to https://github.com/mlflow/mlflow/pull/10420. MLflow AI gateway will be deprecated and replaced by the `mlflow.deployments` module. Happy to split this PR if it's too large. ``` pip install git+https://github.com/langchain-ai/langchain.git@refs/pull/13699/merge#subdirectory=libs/langchain ``` ## Dependencies Install mlflow from https://github.com/mlflow/mlflow/pull/10420: ``` pip install git+https://github.com/mlflow/mlflow.git@refs/pull/10420/merge ``` ## Testing plan The following code works fine on local and databricks: <details><summary>Click</summary> <p> ```python """ Setup ----- mlflow deployments start-server --config-path examples/gateway/openai/config.yaml databricks secrets create-scope <scope> databricks secrets put-secret <scope> openai-api-key --string-value $OPENAI_API_KEY Run --- python /path/to/this/file.py secrets/<scope>/openai-api-key """ from langchain.chat_models import ChatMlflow, ChatDatabricks from langchain.embeddings import MlflowEmbeddings, DatabricksEmbeddings from langchain.llms import Databricks, Mlflow from langchain.schema.messages import HumanMessage from langchain.chains.loading import load_chain from mlflow.deployments import get_deploy_client import uuid import sys import tempfile from langchain.chains import LLMChain from langchain.prompts import PromptTemplate ############################### # MLflow ############################### chat = ChatMlflow( target_uri="http://127.0.0.1:5000", endpoint="chat", params={"temperature": 0.1} ) print(chat([HumanMessage(content="hello")])) embeddings = MlflowEmbeddings(target_uri="http://127.0.0.1:5000", endpoint="embeddings") print(embeddings.embed_query("hello")[:3]) print(embeddings.embed_documents(["hello", "world"])[0][:3]) llm = Mlflow( target_uri="http://127.0.0.1:5000", endpoint="completions", params={"temperature": 0.1}, ) print(llm("I am")) 
llm_chain = LLMChain( llm=llm, prompt=PromptTemplate( input_variables=["adjective"], template="Tell me a {adjective} joke", ), ) print(llm_chain.run(adjective="funny")) # serialization/deserialization with tempfile.TemporaryDirectory() as tmpdir: print(tmpdir) path = f"{tmpdir}/llm.yaml" llm_chain.save(path) loaded_chain = load_chain(path) print(loaded_chain("funny")) ############################### # Databricks ############################### secret = sys.argv[1] client = get_deploy_client("databricks") # External - chat name = f"chat-{uuid.uuid4()}" client.create_endpoint( name=name, config={ "served_entities": [ { "name": "test", "external_model": { "name": "gpt-4", "provider": "openai", "task": "llm/v1/chat", "openai_config": { "openai_api_key": "{{" + secret + "}}", }, }, } ], }, ) try: chat = ChatDatabricks( target_uri="databricks", endpoint=name, params={"temperature": 0.1} ) print(chat([HumanMessage(content="hello")])) finally: client.delete_endpoint(endpoint=name) # External - embeddings name = f"embeddings-{uuid.uuid4()}" client.create_endpoint( name=name, config={ "served_entities": [ { "name": "test", "external_model": { "name": "text-embedding-ada-002", "provider": "openai", "task": "llm/v1/embeddings", "openai_config": { "openai_api_key": "{{" + secret + "}}", }, }, } ], }, ) try: embeddings = DatabricksEmbeddings(target_uri="databricks", endpoint=name) print(embeddings.embed_query("hello")[:3]) print(embeddings.embed_documents(["hello", "world"])[0][:3]) finally: client.delete_endpoint(endpoint=name) # External - completions name = f"completions-{uuid.uuid4()}" client.create_endpoint( name=name, config={ "served_entities": [ { "name": "test", "external_model": { "name": "gpt-3.5-turbo-instruct", "provider": "openai", "task": "llm/v1/completions", "openai_config": { "openai_api_key": "{{" + secret + "}}", }, }, } ], }, ) try: llm = Databricks( endpoint_name=name, model_kwargs={"temperature": 0.1}, ) print(llm("I am")) finally: 
client.delete_endpoint(endpoint=name) # Foundation model - chat chat = ChatDatabricks( endpoint="databricks-llama-2-70b-chat", params={"temperature": 0.1} ) print(chat([HumanMessage(content="hello")])) # Foundation model - embeddings embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en") print(embeddings.embed_query("hello")[:3]) # Foundation model - completions llm = Databricks( endpoint_name="databricks-mpt-7b-instruct", model_kwargs={"temperature": 0.1} ) print(llm("hello")) llm_chain = LLMChain( llm=llm, prompt=PromptTemplate( input_variables=["adjective"], template="Tell me a {adjective} joke", ), ) print(llm_chain.run(adjective="funny")) # serialization/deserialization with tempfile.TemporaryDirectory() as tmpdir: print(tmpdir) path = f"{tmpdir}/llm.yaml" llm_chain.save(path) loaded_chain = load_chain(path) print(loaded_chain("funny")) ``` Output: ``` content='Hello! How can I assist you today?' [-0.025058426, -0.01938856, -0.027781019] [-0.025058426, -0.01938856, -0.027781019] sorry, but I cannot continue the sentence as it is incomplete. Can you please provide more information or context? Sure, here's a classic one for you: Why don't scientists trust atoms? Because they make up everything! /var/folders/dz/cd_nvlf14g9g__n3ph0d_0pm0000gp/T/tmpx_4no6ad {'adjective': 'funny', 'text': "Sure, here's a classic one for you:\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything!"} content='Hello! How can I assist you today?' [-0.025058426, -0.01938856, -0.027781019] [-0.025058426, -0.01938856, -0.027781019] a 23 year old female and I am currently studying for my master's degree content="\nHello! It's nice to meet you. Is there something I can help you with or would you like to chat for a bit?" [0.051055908203125, 0.007221221923828125, 0.003879547119140625] [0.051055908203125, 0.007221221923828125, 0.003879547119140625] hello back Well, I don't really know many jokes, but I do know this funny story... 
/var/folders/dz/cd_nvlf14g9g__n3ph0d_0pm0000gp/T/tmp7_ds72ex {'adjective': 'funny', 'text': " Well, I don't really know many jokes, but I do know this funny story..."} ``` </p> </details> The existing workflow doesn't break: <details><summary>click</summary> <p> ```python import uuid import mlflow from mlflow.models import ModelSignature from mlflow.types.schema import ColSpec, Schema class MyModel(mlflow.pyfunc.PythonModel): def predict(self, context, model_input): return str(uuid.uuid4()) with mlflow.start_run(): mlflow.pyfunc.log_model( "model", python_model=MyModel(), pip_requirements=["mlflow==2.8.1", "cloudpickle<3"], signature=ModelSignature( inputs=Schema( [ ColSpec("string", "prompt"), ColSpec("string", "stop"), ] ), outputs=Schema( [ ColSpec(name=None, type="string"), ] ), ), registered_model_name=f"lang-{uuid.uuid4()}", ) # Manually create a serving endpoint with the registered model and run from langchain.llms import Databricks llm = Databricks(endpoint_name="<name>") llm("hello") # 9d0b2491-3d13-487c-bc02-1287f06ecae7 ``` </p> </details> ## Follow-up tasks (This PR is too large. I'll file a separate one for follow-up tasks.) - Update `docs/docs/integrations/providers/mlflow_ai_gateway.mdx` and `docs/docs/integrations/providers/databricks.md`. --------- Signed-off-by: harupy <17039389+harupy@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-30 15:06:58 -08:00
Jeremy Naccache	a14cf87576	core[patch]: Add kwargs to Langchain's dumps() to allow passing of json.dumps() … (#10628 ) …parameters. In Langchain's `dumps()` function, I've added a `kwargs` parameter. This allows users to pass additional parameters to the underlying `json.dumps()` function, providing greater flexibility and control over JSON serialization. Many parameters available in `json.dumps()` can be useful or even necessary in specific situations. For example, when using an Agent with return_intermediate_steps set to true, the output is a list of AgentAction objects. These objects can't be serialized without using Langchain's `dumps()` function. The issue arises when using the Agent with a language other than English, which may contain non-ASCII characters like 'é'. The default behavior of `json.dumps()` sets ensure_ascii to true, converting `{"name": "José"}` into `{"name": "Jos\u00e9"}`. This can make the output hard to read, especially in the case of intermediate steps in agent logs. By allowing users to pass additional parameters to `json.dumps()` via Langchain's dumps(), we can solve this problem. For instance, users can set `ensure_ascii=False` to maintain the original characters. This update also enables users to pass other useful `json.dumps()` parameters like `sort_keys`, providing even more flexibility. The implementation takes into account edge cases where a user might pass a "default" parameter, which is already defined by `dumps()`, or an "indent" parameter, which is also predefined if `pretty=True` is set. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-30 08:52:24 -08:00
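The `ensure_ascii` behavior described in the `dumps()` change above is standard `json.dumps` behavior. The wrapper below is a hypothetical sketch of keyword forwarding (`dumps_sketch` is not LangChain's `dumps()`), and its handling of `indent` is an assumption about one reasonable policy.

```python
import json
from typing import Any


def dumps_sketch(obj: Any, *, pretty: bool = False, **kwargs: Any) -> str:
    """Forward extra keyword arguments to json.dumps (hypothetical wrapper)."""
    if pretty:
        # Assumed policy: pretty=True supplies an indent unless the caller set one.
        kwargs.setdefault("indent", 2)
    return json.dumps(obj, **kwargs)


data = {"name": "José"}

escaped = dumps_sketch(data)                       # default: ensure_ascii=True
readable = dumps_sketch(data, ensure_ascii=False)  # keeps the original characters
ordered = dumps_sketch({"b": 1, "a": 2}, sort_keys=True)

print(escaped)
print(readable)
print(ordered)
```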
Yong woo Song	f4d520ccb5	Fix .env file path in integration_test README.md (#14028 ) ### Description Hello, The [integration_test README](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/tests) was indicating incorrect paths for the `.env.example` and `.env` files. `tests/.env.example` ->`tests/integration_tests/.env.example` While it’s a minor error, it could potentially lead to confusion for the document’s readers, so I’ve made the necessary corrections. Thank you! ☺️ ### Related Issue - https://github.com/langchain-ai/langchain/pull/2806	2023-11-29 22:14:28 -05:00
Rohan Dey	41a4c06a94	Added support for a Pandas DataFrame OutputParser (#13257 ) Description: Added support for a Pandas DataFrame OutputParser with format instructions, along with unit tests and a demo notebook. Namely, we've added the ability to request data from a DataFrame, have the LLM parse the request, and then use that request to retrieve a well-formatted response. Within LangChain, it seamlessly integrates with language models like OpenAI's `text-davinci-003`, facilitating streamlined interaction using the format instructions (just like the other output parsers). This parser structures its requests as `<operation/column/row>[<optional_array_params>]`. The instructions detail permissible operations, valid columns, and array formats, ensuring clarity and adherence to the required format. For example: - When the LLM receives the input: "Retrieve the mean of `num_legs` from rows 1 to 3." - The provided format instructions guide the LLM to structure the request as: "mean:num_legs[1..3]". The parser processes this formatted request, leveraging the LLM's understanding to extract the mean of `num_legs` from rows 1 to 3 within the Pandas DataFrame. This integration allows users to communicate requests naturally, with the LLM transforming these instructions into structured commands understood by the `PandasDataFrameOutputParser`. The format instructions act as a bridge between natural language queries and precise DataFrame operations, optimizing communication and data retrieval. Issue: - https://github.com/langchain-ai/langchain/issues/11532 Dependencies: No additional dependencies :) Tag maintainer: @baskaryan Twitter handle: No need. :) --------- Co-authored-by: Wasee Alam <waseealam@protonmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-29 22:08:50 -05:00
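The request format quoted above, for example `mean:num_legs[1..3]`, can be parsed with a short regular expression. This sketch only covers the `operation:target[start..end]` shape shown in the commit message; `ParsedRequest` and `parse_request` are hypothetical names, and the real `PandasDataFrameOutputParser` may accept other array forms.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# operation ":" target, optionally followed by "[start..end]"
_REQUEST_RE = re.compile(r"^(\w+):(\w+)(?:\[(\d+)\.\.(\d+)\])?$")


@dataclass(frozen=True)
class ParsedRequest:
    operation: str
    target: str
    row_range: Optional[Tuple[int, int]] = None


def parse_request(request: str) -> ParsedRequest:
    """Parse a request such as 'mean:num_legs[1..3]' (illustrative only)."""
    match = _REQUEST_RE.match(request.strip())
    if match is None:
        raise ValueError(f"Request does not match the expected format: {request!r}")
    operation, target, start, end = match.groups()
    row_range = (int(start), int(end)) if start is not None else None
    return ParsedRequest(operation, target, row_range)


print(parse_request("mean:num_legs[1..3]"))
print(parse_request("column:num_legs"))
```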
Masanori Taniguchi	235bdb9fa7	Support Vald secure connection (#13269 ) Description: Previously only insecure gRPC connections to Vald were supported; secure connections are now supported as well. In addition, gRPC metadata can be added to Vald requests to enable authentication with a token.	2023-11-29 22:07:29 -05:00
sudranga	d1d693b2a7	Fix issue where response_if_no_docs_found is not implemented on async… (#13297 ) Response_if_no_docs_found is not implemented in ConversationalRetrievalChain for async code paths. Implemented it and added test cases Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-29 22:06:13 -05:00
AthulVincent	67c55cb5b0	Implemented MongoDB Atlas Self-Query Retriever (#13321 ) # Description This PR implements Self-Query Retriever for MongoDB Atlas vector store. I've implemented the comparators and operators that are supported by MongoDB Atlas vector store according to the section titled "Atlas Vector Search Pre-Filter" from https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-stage/. Namely: ``` allowed_comparators = [ Comparator.EQ, Comparator.NE, Comparator.GT, Comparator.GTE, Comparator.LT, Comparator.LTE, Comparator.IN, Comparator.NIN, ] """Subset of allowed logical operators.""" allowed_operators = [ Operator.AND, Operator.OR ] ``` Translations from comparators/operators to MongoDB Atlas filter operators(you can find the syntax in the "Atlas Vector Search Pre-Filter" section from the previous link) are done using the following dictionary: ``` map_dict = { Operator.AND: "$and", Operator.OR: "$or", Comparator.EQ: "$eq", Comparator.NE: "$ne", Comparator.GTE: "$gte", Comparator.LTE: "$lte", Comparator.LT: "$lt", Comparator.GT: "$gt", Comparator.IN: "$in", Comparator.NIN: "$nin", } ``` In visit_structured_query() the filters are passed as "pre_filter" and not "filter" as in the MongoDB link above since langchain's implementation of MongoDB atlas vector store(libs\langchain\langchain\vectorstores\mongodb_atlas.py) in _similarity_search_with_score() sets the "filter" key to have the value of the "pre_filter" argument. ``` params["filter"] = pre_filter ``` Test cases and documentation have also been added. # Issue #11616 # Dependencies No new dependencies have been added. # Documentation I have created the notebook mongodb_atlas_self_query.ipynb outlining the steps to get the self-query mechanism working. I worked closely with [@Farhan-Faisal](https://github.com/Farhan-Faisal) on this PR. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-29 22:05:06 -05:00
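The comparator/operator mapping quoted in the commit above can be exercised with a short standalone sketch. It does not reproduce the PR's translator class: the helper names `comparison` and `operation` are hypothetical, plain strings stand in for the `Comparator` and `Operator` enums, and the output shape follows ordinary MongoDB query syntax, which is an assumption about what the translator emits.

```python
from typing import Any, Dict, List

# Same pairs as map_dict in the commit message, keyed by plain strings here.
MAP_DICT = {
    "and": "$and",
    "or": "$or",
    "eq": "$eq",
    "ne": "$ne",
    "gte": "$gte",
    "lte": "$lte",
    "lt": "$lt",
    "gt": "$gt",
    "in": "$in",
    "nin": "$nin",
}


def comparison(comparator: str, attribute: str, value: Any) -> Dict[str, Any]:
    """Build a single-field condition, e.g. {"year": {"$gte": 2000}}."""
    return {attribute: {MAP_DICT[comparator]: value}}


def operation(operator: str, filters: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Combine conditions under a logical operator, e.g. {"$and": [...]}."""
    return {MAP_DICT[operator]: filters}


pre_filter = operation(
    "and",
    [
        comparison("gte", "year", 2000),
        comparison("in", "genre", ["drama", "comedy"]),
    ],
)
```

Per the commit message, a dictionary like `pre_filter` is what would be passed as the `pre_filter` argument and end up under the `"filter"` key of the vector search parameters.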
Josef Zoller	c2e3963da4	Merriam-Webster Dictionary Tool (#12044 ) # Description We implemented a simple tool for accessing the Merriam-Webster Collegiate Dictionary API (https://dictionaryapi.com/products/api-collegiate-dictionary). Here's a simple usage example: ```py from langchain.llms import OpenAI from langchain.agents import load_tools, initialize_agent, AgentType llm = OpenAI() tools = load_tools(["serpapi", "merriam-webster"], llm=llm) # Serp API gives our agent access to Google agent = initialize_agent( tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True ) agent.run("What is the english word for the german word Himbeere? Define that word.") ``` Sample output: ``` > Entering new AgentExecutor chain... I need to find the english word for Himbeere and then get the definition of that word. Action: Search Action Input: "English word for Himbeere" Observation: {'type': 'translation_result'} Thought: Now I have the english word, I can look up the definition. Action: MerriamWebster Action Input: raspberry Observation: Definitions of 'raspberry': 1. rasp-ber-ry, noun: any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries 2. rasp-ber-ry, noun: a perennial plant (genus Rubus) of the rose family that bears raspberries 3. rasp-ber-ry, noun: a sound of contempt made by protruding the tongue between the lips and expelling air forcibly to produce a vibration; broadly : an expression of disapproval or contempt 4. black raspberry, noun: a raspberry (Rubus occidentalis) of eastern North America that has a purplish-black fruit and is the source of several cultivated varieties —called also blackcap Thought: I now know the final answer. 
Final Answer: Raspberry is an english word for Himbeere and it is defined as any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries. > Finished chain. ``` # Issue This closes #12039. # Dependencies We added no extra dependencies. --------- Co-authored-by: Lara <63805048+larkgz@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-29 20:28:29 -05:00
Mohammad Mohtashim	f3dd4a10cf	DROP BOX Loader Documentation Update (#14047 ) - Description: Update the document for drop box loader + made the messages more verbose when loading pdf file since people were getting confused - Issue: #13952 - Tag maintainer: @baskaryan, @eyurtsev, @hwchase17, --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-29 17:25:35 -08:00
Cheng (William) Huang	a00db4b28f	Add multi-input Reddit search tool (#13893 ) - Description: Added a tool called RedditSearchRun and an accompanying API wrapper, which searches Reddit for posts with support for time filtering, post sorting, query string and subreddit filtering. - Issue: #13891 - Dependencies: `praw` module is used to search Reddit - Tag maintainer: @baskaryan , and any of the other maintainers if needed - Twitter handle: None. Hello, This is our first PR and we hope that our changes will be helpful to the community. We have run `make format`, `make lint` and `make test` locally before submitting the PR. To our knowledge, our changes do not introduce any new errors. Our PR integrates the `praw` package which is already used by RedditPostsLoader in LangChain. Nonetheless, we have added integration tests and edited unit tests to test our changes. An example notebook is also provided. These changes were put together by me, @Anika2000, @CharlesXu123, and @Jeremy-Cheng-stack Thank you in advance to the maintainers for their time. --------- Co-authored-by: What-Is-A-Username <49571870+What-Is-A-Username@users.noreply.github.com> Co-authored-by: Anika2000 <anika.sultana@mail.utoronto.ca> Co-authored-by: Jeremy Cheng <81793294+Jeremy-Cheng-stack@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-29 20:16:40 -05:00
Jawad Arshad	00a6e8962c	langchain[minor]: Add serpapi tools (#13934 ) - Description: Added more of the endpoints supported by serpapi that are not yet supported in langchain, such as Google Trends, Google Finance, Google Jobs, and Google Lens - Issue: [Add support for many of the querying endpoints with serpapi #11811](https://github.com/langchain-ai/langchain/issues/11811) --------- Co-authored-by: zushenglu <58179949+zushenglu@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Ian Xu <ian.xu@mail.utoronto.ca> Co-authored-by: zushenglu <zushenglu1809@gmail.com> Co-authored-by: KevinT928 <96837880+KevinT928@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-29 14:02:57 -08:00
h3l	dbaeb163aa	langchain[minor]: add volcengine endpoint as LLM (#13942 ) - Description: Volc Engine MaaS is an enterprise-grade, large-model service platform designed for developers. You can visit its homepage at https://www.volcengine.com/docs/82379/1099455 for details. This change makes it easier for developers to integrate with the platform quickly. - Issue: None - Dependencies: volcengine - Tag maintainer: @baskaryan - Twitter handle: @he1v3tica --------- Co-authored-by: lvzhong <lvzhong@bytedance.com>	2023-11-29 13:16:42 -08:00
Mohammad Ahmad	1600ebe6c7	langchain[patch]: Mask API key for ForeFrontAI LLM (#14013 ) - Description: Mask API key for ForeFrontAI LLM and associated unit tests - Issue: https://github.com/langchain-ai/langchain/issues/12165 - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: `__mmahmad__` I made the API key non-optional since linting required adding validation for None, but the key is required per documentation: https://python.langchain.com/docs/integrations/llms/forefrontai	2023-11-29 13:12:19 -08:00
yoch	a0e859df51	langchain[patch]: fix cohere reranker init #12899 (#14029 ) - Description: use post field validation for `CohereRerank` - Issue: #12899 and #13058 - Dependencies: - Tag maintainer: @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-29 12:57:06 -08:00
123-fake-st	9bd6e9df36	update pdf document loaders' metadata source to url for online pdf (#13274 ) - Description: Update 5 pdf document loaders in `langchain.document_loaders.pdf`, to store a url in the metadata (instead of a temporary, local file path) if the user provides a web path to a pdf: `PyPDFium2Loader`, `PDFMinerLoader`, `PDFMinerPDFasHTMLLoader`, `PyMuPDFLoader`, and `PDFPlumberLoader` were updated. - The updates follow the approach used to update `PyPDFLoader` for the same behavior in #12092 - The `PyMuPDFLoader` changes required additional work in updating `langchain.document_loaders.parsers.pdf.PyMuPDFParser` to be able to process either an `io.BufferedReader` (from local pdf) or `io.BytesIO` (from online pdf) - The `PDFMinerPDFasHTMLLoader` change used a simpler approach since the metadata is assigned by the loader and not the parser - Issue: Fixes #7034 - Dependencies: None ```python # PyPDFium2Loader example: # old behavior >>> from langchain.document_loaders import PyPDFium2Loader >>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf') >>> docs = loader.load() >>> docs[0].metadata {'source': '/var/folders/7z/d5dt407n673drh1f5cm8spj40000gn/T/tmpm5oqa92f/tmp.pdf', 'page': 0} # new behavior >>> from langchain.document_loaders import PyPDFium2Loader >>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf') >>> docs = loader.load() >>> docs[0].metadata {'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0} ```	2023-11-29 15:07:46 -05:00
Toshish Jawale	6f64cb5078	Remove deprecated param and flexibility for prompt (#13310 ) - Description: Removed the deprecated parameter `penalty_alpha`, and switched the prompt from a JSON object to a plain string for better flexibility. - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: @symbldotai --------- Co-authored-by: toshishjawale <toshish@symbl.ai> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-29 14:48:25 -05:00
Tomaz Bratanic	3eb391561b	langchain[minor]: Reduce the number of tokens required to describe a Cypher/Neo4j schema (#13851 ) Instead of using JSON-like syntax to describe node and relationship properties we changed to a shorter and more concise schema description Old: ``` Node properties are the following: [{'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Movie'}, {'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Actor'}] Relationship properties are the following: [] The relationships are the following: ['(:Actor)-[:ACTED_IN]->(:Movie)'] ``` New: ``` Node properties are the following: Movie {name: STRING},Actor {name: STRING} Relationship properties are the following: The relationships are the following: (:Actor)-[:ACTED_IN]->(:Movie) ```	2023-11-29 11:13:12 -08:00
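The old and new node-property formats shown in the commit above differ only in serialization, which a short sketch can make concrete. The helper name `format_node_properties` is hypothetical and this is not the code from the PR; it simply converts the "Old" structure from the commit message into the "New" string.

```python
from typing import Any, Dict, List


def format_node_properties(nodes: List[Dict[str, Any]]) -> str:
    """Render node properties as 'Label {prop: TYPE, ...}', comma-separated."""
    parts = []
    for node in nodes:
        props = ", ".join(
            f"{prop['property']}: {prop['type']}" for prop in node["properties"]
        )
        # Doubled braces emit literal '{' and '}' around the property list.
        parts.append(f"{node['labels']} {{{props}}}")
    return ",".join(parts)


# The "Old" structure from the commit message.
old_schema = [
    {"properties": [{"property": "name", "type": "STRING"}], "labels": "Movie"},
    {"properties": [{"property": "name", "type": "STRING"}], "labels": "Actor"},
]
compact = format_node_properties(old_schema)
```

Dropping the repeated `'property'`, `'type'` and `'labels'` keys is where the token savings come from, since those keys previously appeared once per property.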
Sauhaard	7ec4dbeb80	langchain[minor]: Add StackExchange API integration (#14002 ) Implements [#12115](https://github.com/langchain-ai/langchain/issues/12115) Who can review? @baskaryan , @eyurtsev , @hwchase17 Integrated Stack Exchange API into Langchain, enabling access to diverse communities within the platform. This addition enhances Langchain's capabilities by allowing users to query Stack Exchange for specialized information and engage in discussions. The integration provides seamless interaction with Stack Exchange content, offering content from varied knowledge repositories. A notebook example and test cases were included to demonstrate the functionality and reliability of this integration. - Add StackExchange as a tool. - Add unit test for the StackExchange wrapper and tool. - Add documentation for the StackExchange wrapper and tool. If you have time, could you please review the code and provide any feedback as necessary! My team is welcome to any suggestions. --------- Co-authored-by: Yuval Kamani <yuvalkamani@gmail.com> Co-authored-by: Aryan Thakur <aryanthakur@Aryans-MacBook-Pro.local> Co-authored-by: Manas1818 <79381912+manas1818@users.noreply.github.com> Co-authored-by: aryan-thakur <61063777+aryan-thakur@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-29 10:32:07 -08:00
Bagatur	d4405bc94e	langchain[patch]: Release 0.0.343 (#14037 )	2023-11-29 10:31:03 -08:00
Yves Zumbühl	9c0ad0cebb	langchain[patch]: Improve HyDe with custom prompts and ability to supply the run_manager (#14016 ) - Description: The class only allowed selecting among a few predefined prompts from the paper. That is not ideal, since other use cases might need a custom prompt. The changes made allow for this. To be able to monitor those, I also added functionality to supply a custom run_manager. - Issue: no issue, but a new feature, - Dependencies: none, - Tag maintainer: @hwchase17, - Twitter handle: @yvesloy --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-29 09:40:53 -08:00
Chad Norvell	1c4bfb8c5f	langchain[patch]: Mathpix PDF loader supports arbitrary extra params (#13950 ) - Description: Support providing whatever extra parameters you want to the Mathpix PDF loader API request. - Issue: #12773 - Dependencies: None --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-29 02:12:32 -08:00
Unai Garay Maestre	9e2ae866c4	langchain[patch]: Adds progress bar to GooglePalmEmbeddings (#13812 ) - Description: Adds a tqdm progress bar to GooglePalmEmbeddings when embedding a list. - Issue: #13637 - Dependencies: TQDM as a main dependency (instead of extra) Signed-off-by: ugm2 <unaigaraymaestre@gmail.com> --------- Signed-off-by: ugm2 <unaigaraymaestre@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-29 01:58:53 -08:00
David Norman	a578076aea	Mask api key for Together LLM (#13981 ) - Description: Add unit tests and mask api key for Together LLM - Issue: the issue https://github.com/langchain-ai/langchain/issues/12165 , - Dependencies: N/A - Tag maintainer: ?, - Twitter handle: N/A --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-28 22:57:40 -05:00
Johnny	6463d2d0bd	small fix matching engine AttributeError - object has no attribute (#13763 ) This PR is fixing an attributeError: object endpoint has no attribute "_public_match_client" when using gcp matching engine with private VPC network. @baskaryan, @eyurtsev, @hwchase17. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-28 22:42:29 -05:00
Amyh102	750485eaa8	Add object parsing functionality (#13864 ) * Description: Parses huggingface dataset Sequence objects into strings for Document loading. * Issue: Fixes #10674 * Tag maintainer: @baskaryan @eyurtsev --------- Co-authored-by: Amy Han <amyhan@Amys-Air.lan> Co-authored-by: Amy Han <amyhan@Amys-MacBook-Air.local>	2023-11-28 22:33:16 -05:00
ggeutzzang	981f78f920	Fix: (issue #13825 ) Getting an error with DallEAPIWrapper (#13874 ) - Description: As of OpenAI's Python package 1.0, the existing DallEAPIWrapper does not work correctly, so the example in the LangChain Documentation link below does not work either. https://python.langchain.com/docs/integrations/tools/dalle_image_generator Also, since OpenAI only supports DALL-E version 2 or version 3, I modified the DallEAPIWrapper to support it. - Issue: #13825 - Twitter handle: ggeutzzang	2023-11-28 22:31:25 -05:00
Kunal	74045bf5c0	max length attribute for spacy splitter for large docs (#13875 ) For large documents the spaCy splitter fails with the error shown in the screenshot below. The reason is that its default max_length is 1000000 and there was no option to increase it, so this PR adds one. ![image](https://github.com/langchain-ai/langchain/assets/73680423/613625c3-0e21-4834-9aad-2a73cf56eecc) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-28 22:30:26 -05:00
Wang Wei	fe9341a29c	feat: Add ERNIE-Bot-8K model support for ErnieBotChat. (#13716 ) - Description: According to the document https://cloud.baidu.com/doc/WENXINWORKSHOP/s/6lp69is2a, add ERNIE-Bot-8K model support for ErnieBotChat. - Dependencies: Before using the ERNIE-Bot-8K, you should have the model's access authority.	2023-11-28 22:22:23 -05:00
Burak Ömür	0e462b72ef	Update openai/create_llm_result function to consider kwargs (#13815 ) - Description: updates `create_llm_result` function within `openai.py` to consider latest `params`, - Issue: #8928 - Dependencies: -, - Tag maintainer: - - Twitter handle: [burkomr](https://twitter.com/burkomr) --------- Co-authored-by: Burak Ömür <burakomur@retorio.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-28 22:02:38 -05:00
chyroc	f97ab84c6b	Merge pull request #13907 * feat: mask api_key for jina	2023-11-28 21:24:50 -05:00
nhywieza	9b86fb3fcb	secretStr for baichuan chat model api key (#13946 ) Merge pull request #13946 * secretStr for baichuan chat model api key	2023-11-28 21:20:23 -05:00
卢靖轩	aff1dba252	Merge pull request #13945 * feat: mask api key for nlpcloud	2023-11-28 21:16:36 -05:00
Leonid Kuligin	85bb3a418c	Switched VertexAI models from preview (#13657 ) - Description: VertexAI models are now GA, moved away from using preview ones from the SDK - Issue: #13606 --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-11-28 20:38:04 -05:00
Erick Friis	5eca1bd93f	Library Licenses (#13300 ) Same change as #8403 but in other libs also updates (c) LangChain Inc. instead of @hwchase17	2023-11-28 17:34:27 -08:00
Bagatur	14799b139a	infra[patch]: add base deps and fix docs lint (#13998 )	2023-11-28 17:27:37 -08:00
Théo LEBRUN	926d4cfda7	Set default region from boto3 session for Bedrock (#13694 ) - Description: Set default region from boto3 session for Bedrock - Issue: #13683	2023-11-28 20:26:54 -05:00
Snow	1a33e5b500	Repair Wikipedia document loader `load_max_docs` and improve test coverage. (#13769 ) Description: Repair Wikipedia document loader `load_max_docs` and improve test coverage. Issue: The Wikipedia document loader was not respecting the `load_max_docs` parameter (not reported) and would always return a maximum of 10 documents. This is because the API wrapper (in `utilities/wikipedia.py`) wasn't passing `top_k_results` to the underlying [Wikipedia library](https://wikipedia.readthedocs.io/en/latest/code.html#module-wikipedia). By default this library returns 10 results. The default number of results for the document loader has been reduced from 100 to 25. This is because loading 100 results takes a very long time and is an inconvenient default. It should possibly be 10. In addition, the documentation for the loader reported that there was a hard limit (300) on the number of documents returned. In actuality 300 is the maximum Wikipedia query character length set by the API wrapper. Tests have been added for the document loader (previously missing) and to test the correct numbers of documents are being returned by each class, both by default, and when overridden. Also repaired is the `assert_docs` test which has been updated to correctly test for the default metadata (which includes `source` in recent releases). Dependencies: nil Tag maintainer: @leo-gan Twitter handle: @queenvictoria	2023-11-28 20:26:40 -05:00
Bob Lin	04c4878306	Remove `python_repl` from _BASE_TOOLS (#13962 ) ### Description: Previously `python_repl` was a built-in tool, but now it has been moved to `langchain_experimental`. When I use `load_tools` I get an error: ```python In [1]: from langchain.agents import load_tools In [2]: load_tools(["python_repl"]) --------------------------------------------------------------------------- ImportError Traceback (most recent call last) Cell In[2], line 1 ----> 1 load_tools(["python_repl"]) File ~/workspace/langchain/libs/langchain/langchain/agents/load_tools.py:530, in load_tools(tool_names, llm, callbacks, kwargs) 528 tool_names.extend(requests_method_tools) 529 elif name in _BASE_TOOLS: --> 530 tools.append(_BASE_TOOLS[name]()) 531 elif name in _LLM_TOOLS: 532 if llm is None: File ~/workspace/langchain/libs/langchain/langchain/agents/load_tools.py:84, in _get_python_repl() 83 def _get_python_repl() -> BaseTool: ---> 84 raise ImportError( 85 "This tool has been moved to langchain experiment. " 86 "This tool has access to a python REPL. " 87 "For best practices make sure to sandbox this tool. " 88 "Read https://github.com/langchain-ai/langchain/blob/master/SECURITY.md " 89 "To keep using this code as is, install langchain experimental and " 90 "update relevant imports replacing 'langchain' with 'langchain_experimental'" 91 ) ImportError: This tool has been moved to langchain experiment. This tool has access to a python REPL. For best practices make sure to sandbox this tool. Read https://github.com/langchain-ai/langchain/blob/master/SECURITY.md To keep using this code as is, install langchain experimental and update relevant imports replacing 'langchain' with 'langchain_experimental' ``` In this case, it will be very confusing. 
Since it is no longer a built-in tool, it can be removed from `_BASE_TOOLS`. ### Issue: https://github.com/langchain-ai/langchain/issues/13858, https://github.com/langchain-ai/langchain/issues/13859, https://github.com/langchain-ai/langchain/issues/13856 ### Twitter handle: [lin_bob57617](https://twitter.com/lin_bob57617)	2023-11-28 20:13:54 -05:00
Leonid Ganeline	52eee458bb	renamed `google_vertex_ai_vector_search` notebook (#13484 ) The `integrations/vectorstores/matchingengine.ipynb` example has the "Google Vertex AI Vector Search" title. This puts the title in the wrong position in the ToC, which is sorted by file name. - Renamed `integrations/vectorstores/matchingengine.ipynb` to `integrations/vectorstores/google_vertex_ai_vector_search.ipynb`. - Updated the corresponding comment in the docstring - Rerouted the old URL to the new URL --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-28 16:58:29 -08:00
Leonid Ganeline	bf5787f58b	experimental[patch]: fixed namespace bug (#13585 ) It was : `from langchain.schema.prompts import BasePromptTemplate` but because of the breaking change in the ns, it is now `from langchain.schema.prompt_template import BasePromptTemplate` This bug prevents building the API Reference for the langchain_experimental	2023-11-28 16:40:27 -08:00
Taqi Jaffri	144710ad9a	langchain[minor]: Updated DocugamiLoader, includes breaking changes (#13265 ) There are the following main changes in this PR: 1. Rewrite of the DocugamiLoader to not do any XML parsing of the DGML format internally, and instead use the `dgml-utils` library we are separately working on. This is a very lightweight dependency. 2. Added MMR search type as an option to multi-vector retriever, similar to other retrievers. MMR is especially useful when using Docugami for RAG since we deal with large sets of documents within which a few might be duplicates and straight similarity based search doesn't give great results in many cases. We are @docugami on twitter, and I am @tjaffri --------- Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>	2023-11-28 15:56:22 -08:00
Bagatur	a20e8f8bb0	experimental[patch]: release 0.0.43 (#13570 )	2023-11-28 15:38:09 -08:00
Bagatur	d8fe987ef5	langchain[patch]: release 0.0.342 (#13992 )	2023-11-28 14:34:57 -08:00
david qiu	9fb6805be4	langchain[minor]: Add retriever for Knowledge Bases for Amazon Bedrock (#13980 ) - Description: Adds a retriever implementation for [Knowledge Bases for Amazon Bedrock](https://aws.amazon.com/bedrock/knowledge-bases/), a new service announced at AWS re:Invent, shortly before this PR was opened. This depends on the `bedrock-agent-runtime` service, which will be included in a future version of `boto3` and of `botocore`. We will open a follow-up PR documenting the minimum required versions of `boto3` and `botocore` after that information is available. - Issue: N/A - Dependencies: `boto3>=1.33.2, botocore>=1.33.2` - Tag maintainer: @baskaryan - Twitter handles: `@pjain7` `@dead_letter_q` This PR includes a documentation notebook under `docs/docs/integrations/retrievers`, which I (@dlqqq) have verified independently. EDIT: `bedrock-agent-runtime` service is now included in `boto3>=1.33.2`: `5cf793f493` --------- Co-authored-by: Piyush Jain <piyushjain@duck.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-28 14:10:23 -08:00
Bagatur	1aed2d1f08	core[patch]: release 0.0.7 (#13989 )	2023-11-28 14:05:01 -08:00
David Duong	eb67f07e32	Track RunnableAssign as a separate run trace (#13972 ) Addressing incorrect order being sent to callbacks / tracers, due to the nature of threading --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-11-28 22:02:31 +00:00
Nuno Campos	0f255bb6c4	In Runnable.stream_log build up final_output from adding output chunks (#12781 ) Add arg to omit streamed_output list, in cases where final_output is enough this saves bandwidth	2023-11-28 21:50:41 +00:00
Nuno Campos	970fe23feb	Fixes for opengpts release (#13960 )	2023-11-28 21:49:43 +00:00
David Duong	947daaf833	Exclude Bedrock client and credentials_profile_name fields from serialisation (#13603 )	2023-11-28 16:34:46 -05:00
Bagatur	48fbc5513d	infra[patch], langchain[patch]: fix test deps and upper bound langchain dep on core(#13984 )	2023-11-28 13:26:15 -08:00
Stefano Lottini	1fd724293b	Astra DB vector store, move constructor docstring to class docstring (#13784 ) This PR rearranges the docstring for the `AstraDB` vector store class so as to have all useful information in the _class_ docstring for ease of reading. (incidentally, due to an oversight, the docstring that was in the constructor ended up buried below some lines of code, thereby disappearing altogether from accessibility. Apologies.)	2023-11-28 16:25:44 -05:00
Johannes Foulds	fc40bd4cdb	AnthropicFunctions function_call compatibility (#13901 ) - Description: Updates to `AnthropicFunctions` to be compatible with the OpenAI `function_call` functionality. - Issue: The functionality to indicate `auto`, `none` and a forced function_call was not completely implemented in the existing code. - Dependencies: None - Tag maintainer: @baskaryan , and any of the other maintainers if needed. - Twitter handle: None I have specifically tested this functionality via AWS Bedrock with the Claude-2 and Claude-Instant models.	2023-11-28 16:22:55 -05:00
mengjincn	05ea4fd37d	fix merge None value and non None value error (#13703 )	2023-11-28 15:49:56 -05:00
Ali Orozgani	32d794f5a3	iMessage loader: implement message content extraction from attributed… (#13634 ) - Description: We are adding functionality to extract message content from the `attributedBody` field of the database, in case the content is not in the `text` field. - Issue: Closes #13326 and #10680 - Dependencies: None. - Tag maintainer: @eyurtsev, @hwchase17 --------- Co-authored-by: onotate <johnp.pham@mail.utoronto.ca>	2023-11-28 15:45:43 -05:00
William FH	e5256bcb69	[Evals] Add Project Tags (#13982 ) Add them to project extra	2023-11-28 11:38:59 -08:00
Nuno Campos	e0bcc98436	infra[patch]: Use langchain core in-tree as a dev dependency (#13957 ) Using the published version means master is broken for contributors whenever we make changes in one lib that depend on the other.	2023-11-28 09:23:43 -08:00
unifyh	2703a1b061	Fix `MarkdownHeaderTextSplitter` not recognizing tilde-fenced code blocks (#13511 ) - Description: Previously `MarkdownHeaderTextSplitter` did not consider tilde-fenced code blocks (https://spec.commonmark.org/0.30/#fenced-code-blocks). This PR fixes that. ````md # Bug caused by previous implementation: ~~~py foo() # This is a comment that would be considered header bar() ~~~ ```` - Tag maintainer: @baskaryan	2023-11-28 11:52:38 -05:00
Leonid Ganeline	7929b26017	office365 toolkit bug fixes (#13618 ) Several bug fixes: - emails: instead of `bcc` the `cc` is used. - errors in the truncation descriptions - no truncation of the `message_search` Several updates: - generalized UTC format - truncation limit can be changed now in _call()	2023-11-28 11:49:24 -05:00
William FH	60309341bd	Eval Error Key (#13974 )	2023-11-28 08:38:30 -08:00
Erick Friis	f9bef600f1	RELEASE: core 0.0.7 (#13973 )	2023-11-28 10:28:28 -05:00
Nicolas Bondoux	e17edc4d0b	RunnableLambda: create afunc instance from func when not provided (#13408 ) Fixes #13407. This workaround consists in letting the RunnableLambda create its self.afunc from its self.func when self.afunc is not provided; the change has no dependency. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Nuno Campos <nuno@langchain.dev>	2023-11-28 11:18:26 +00:00
Nuno Campos	391f200eaa	Implement stream() and astream() for agents (#12783 ) ``` ---- chunk 1 {'actions': [AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})])], 'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]} ---- chunk 2 {'messages': [FunctionMessage(content="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”", name='Search')], 'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), observation="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. 
A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”")]} ---- chunk 3 {'actions': [AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})])], 'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]} ---- chunk 4 {'messages': [FunctionMessage(content='25 years', name='Search')], 'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]), observation='25 years')]} ---- chunk 5 {'actions': [AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})])], 'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]} ---- chunk 6 {'messages': [FunctionMessage(content='Answer: 3.991298452658078', name='Calculator')], 'steps': [AgentStep(action=AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]), observation='Answer: 3.991298452658078')]} ---- chunk 7 {'messages': [AIMessage(content="Leonardo DiCaprio's current 
girlfriend is the Italian model Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 power is approximately 3.99.")], 'output': "Leonardo DiCaprio's current girlfriend is the Italian model " 'Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 ' 'power is approximately 3.99.'} ---- final {'actions': [AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]), AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})])], 'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}}), FunctionMessage(content="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. 
A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”", name='Search'), AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}}), FunctionMessage(content='25 years', name='Search'), AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}}), FunctionMessage(content='Answer: 3.991298452658078', name='Calculator'), AIMessage(content="Leonardo DiCaprio's current girlfriend is the Italian model Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 power is approximately 3.99.")], 'output': "Leonardo DiCaprio's current girlfriend is the Italian model " 'Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 ' 'power is approximately 3.99.', 'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), observation="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. 
A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”"), AgentStep(action=AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]), observation='25 years'), AgentStep(action=AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]), observation='Answer: 3.991298452658078')]} ```	2023-11-28 08:11:37 +00:00
Michael Feil	686162670e	langchain[minor]: Adding `infinity` embedding integration. (#13928 ) This adds an integration with https://github.com/michaelfeil/infinity. Users requested it in https://github.com/michaelfeil/infinity/issues/36 @saatvikshah Follows my implementation of gradient.ai. Feedback 1: Well done - I love your CI / repo / poetry setup - I adapted a lot in https://github.com/michaelfeil/infinity. Feedback 2: Not so good: The openai integration contains too much reverse engineering - in general projects such as michaelfeil/infinity and huggingface/text-embeddings-inference are compatible with the `pip install openai` package. Reverse engineering like this one is really hindering the use for me: `8e88ba16a8/libs/langchain/langchain/embeddings/openai.py (L347)` `8e88ba16a8/libs/langchain/langchain/embeddings/openai.py (L351)` - it is about preventing 3rd party providers from using the same url + uses interfaces of openai that are not publicly documented.	2023-11-27 16:43:47 -08:00
Bagatur	10a6e7cbb6	langchain[patch], core[patch]: Make common utils public (#13932 ) - rename `langchain_core.chat_models.base._generate_from_stream` -> `generate_from_stream` - rename `langchain_core.chat_models.base._agenerate_from_stream` -> `agenerate_from_stream` - export `langchain_core.utils.utils.build_extra_kwargs` from `langchain_core.utils`	2023-11-27 15:34:46 -08:00
Oleksandr Yaremchuk	c0277d06e8	experimental[patch] Update prompt injection model (#13930 ) - Description: The existing model used for Prompt Injection is quite outdated, so we fine-tuned and open-sourced a new model based on the same deberta-v3-base model from Microsoft - [laiyer/deberta-v3-base-prompt-injection](https://huggingface.co/laiyer/deberta-v3-base-prompt-injection). It supports more up-to-date injections and is less prone to false positives. - Dependencies: No - Tag maintainer: - - Twitter handle: @alex_yaremchuk --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-27 17:56:53 -05:00
Bob Lin	e6ebde9688	experimental[patch]: Add experimental.agent imports (#13839 ) - Description: The experimental package needs to be compatible with the usual way of importing agents. For example, if I use `from langchain.agents import create_pandas_dataframe_agent`, running the program prints the following: ``` Traceback (most recent call last): File "/Users/dongwm/test/main.py", line 1, in <module> from langchain.agents import create_pandas_dataframe_agent File "/Users/dongwm/test/venv/lib/python3.11/site-packages/langchain/agents/__init__.py", line 87, in __getattr__ raise ImportError( ImportError: create_pandas_dataframe_agent has been moved to langchain experimental. See https://github.com/langchain-ai/langchain/discussions/11680 for more information. Please update your import statement from: `langchain.agents.create_pandas_dataframe_agent` to `langchain_experimental.agents.create_pandas_dataframe_agent`. ``` But when I changed it to `from langchain_experimental.agents import create_pandas_dataframe_agent`, that import failed as well: ```python Traceback (most recent call last): File "/Users/dongwm/test/main.py", line 2, in <module> from langchain_experimental.agents import create_pandas_dataframe_agent ImportError: cannot import name 'create_pandas_dataframe_agent' from 'langchain_experimental.agents' (/Users/dongwm/test/venv/lib/python3.11/site-packages/langchain_experimental/agents/__init__.py) ``` I should use `from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent`. To solve the problem and keep it compatible, I added additional import code to the langchain_experimental package, so `from langchain_experimental.agents import create_pandas_dataframe_agent` now works. - Twitter handle: [lin_bob57617](https://twitter.com/lin_bob57617)	2023-11-27 14:03:47 -08:00
Tyler Titsworth	afcfa2a5e7	langchain[patch]: Add progress bar option to OllamaEmbeddings (#13882 ) - Description: Adds a tqdm progress bar to OllamaEmbeddings when embedding a list. - Issue: Related to #13637, but extended to Ollama. - Dependencies: `tqdm` made a necessary dependency. Thanks to @ugm2 for helping identify a common problem. Embeddings take a very long time to finish on local machines, and require a progress bar to help identify if one should even attempt the workload. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-27 13:56:13 -08:00
jeremyb-data	cd77fba562	Improvement: Weaviate multitenant adddocs (#13827 ) - Description: Added a line to pass the tenant parameter to add_data_object - Issue: An extra line added from the fix for #9956 - Dependencies: n/a - Tag maintainer: @baskaryan Tested locally, works as expected with the line change. --------- Co-authored-by: Simon Dai <simon6752@gmail.com>	2023-11-27 12:59:57 -08:00
jiangying	3e30cd8261	NIT: comment typo (#13817 )	2023-11-27 12:59:12 -08:00
Assaf Toledo	ba62ff89cc	BUGFIX: Support for elastic indices that don't return 'metadata' in '_source' (#13903 ) Description: Some Elastic indexes do not return a 'metadata' field in '_source'. However, prior to this PR, the code assumed there always is a 'metadata' field. This PR adds support for cases where the field is missing by adding it manually. Issue: #13869	2023-11-27 12:52:57 -08:00
Enric Soler Rastrollo	c156d0281a	BUGFIX: Use embedding key in azure_cosmos_db index creation (#13919 ) Description: Implement embedding key parametrisation Issue: https://github.com/langchain-ai/langchain/issues/13918 Dependencies: None Tag maintainer: @hwchase17 @izzymsft Twitter handle:@MaddogoS	2023-11-27 12:51:08 -08:00
Bagatur	ac67422a3d	IMPROVEMENT: import Document from core (#13905 )	2023-11-27 12:48:43 -08:00
chyroc	886bc2d50a	IMPROVEMENT: fix qianfan validate_environment typo (#13908 )	2023-11-27 11:17:27 -08:00
Chengzu Ou	4b8e053fe8	FEATURE: Add Databricks Vector Search as a new vector store (#13621 ) Description: This PR adds Databricks Vector Search as a new vector store in LangChain. - [x] Add `DatabricksVectorSearch` in `langchain/vectorstores/` - [x] Unit tests - [x] Add [`databricks-vectorsearch`](https://pypi.org/project/databricks-vectorsearch/) as a new optional dependency We ran the following checks: - `make format` passed ✅ - `make lint` failed but the failures were caused by other files + Files touched by this PR passed the linter ✅ - `make test` passed ✅ - `make coverage` failed but the failures were caused by other files. Tests added by or related to this PR all passed + langchain/vectorstores/databricks_vector_search.py test coverage 94% ✅ - `make spell_check` passed ✅ The example notebook and updates to the [provider's documentation page](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/providers/databricks.md) will be added later in a separate PR. Dependencies: Optional dependency: [`databricks-vectorsearch`](https://pypi.org/project/databricks-vectorsearch/) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-27 11:07:26 -08:00
Leonid Kuligin	25387db432	BUGFIX: add support for various OSS images from Vertex Model Garden (#13917 ) - Description: add support for various OSS images from Model Garden - Issue: #13370	2023-11-27 10:31:53 -08:00
Eugene Yurtsev	e186637921	Document Runnable Binding (#13927 ) Document runnable binding	2023-11-27 13:21:27 -05:00
Bagatur	46b3311190	RELEASE: 0.0.341 (#13926 )	2023-11-27 09:51:12 -08:00
umair mehmood	b3e08f9239	improvement: fix chat prompt loading from config (#13818 ) Add loader for loading chat prompt from config file. fixed: #13667 @efriis @baskaryan	2023-11-27 11:39:50 -05:00
Nuno Campos	8a3e0c9afa	Add option to prefix config keys in configurable_alts (#13714 )	2023-11-27 15:25:17 +00:00
ggeutzzang	3749af79ae	DOCS: fixed error in the docstring of RunnablePassthrough class (#13843 ) This pull request addresses an issue found in the example code within the docstring of `libs/core/langchain_core/runnables/passthrough.py` The original code snippet caused a `NameError` due to the missing import of `RunnableLambda`. The error was as follows: ``` 12 return "completion" 13 ---> 14 chain = RunnableLambda(fake_llm) | { 15 'original': RunnablePassthrough(), # Original LLM output 16 'parsed': lambda text: text[::-1] # Parsing logic NameError: name 'RunnableLambda' is not defined ``` To resolve this, I have modified the example code to include the necessary import statement for `RunnableLambda`. Additionally, I have adjusted the indentation in the code snippet to ensure consistency and readability. The modified code now successfully defines and utilizes `RunnableLambda`, ensuring that users referencing the docstring will have a functional and clear example to follow. There are no related GitHub issues for this particular change. Modified Code: ```python from langchain_core.runnables import RunnablePassthrough, RunnableParallel from langchain_core.runnables import RunnableLambda runnable = RunnableParallel( origin=RunnablePassthrough(), modified=lambda x: x+1 ) runnable.invoke(1) # {'origin': 1, 'modified': 2} def fake_llm(prompt: str) -> str: # Fake LLM for the example return "completion" chain = RunnableLambda(fake_llm) | { 'original': RunnablePassthrough(), # Original LLM output 'parsed': lambda text: text[::-1] # Parsing logic } chain.invoke('hello') # {'original': 'completion', 'parsed': 'noitelpmoc'} ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-27 00:06:55 -08:00
Dylan Williams	1983a39894	FEATURE: Add OneNote document loader (#13841 ) - Description: Added OneNote document loader - Issue: #12125 - Dependencies: msal Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-26 23:59:52 -08:00
Tomaz Bratanic	1ad65f7a98	BUGFIX: Fix bugs with Cypher validation (#13849 ) Fixes https://github.com/langchain-ai/langchain/issues/13803. Thanks to @sakusaku-rich	2023-11-26 19:30:11 -08:00
Harrison Chase	6a35831128	BUGFIX: export more types (#13886 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-26 19:15:34 -08:00
Yusuf Khan	935f78c944	FEATURE: Add retriever for Outline (#13889 ) - Description: Added a retriever for the Outline API to ask questions on knowledge base - Issue: resolves #11814 - Dependencies: None - Tag maintainer: @baskaryan	2023-11-26 18:56:12 -08:00
Bagatur	0efa59cbb8	RELEASE: 0.0.339rc3 (#13852 )	2023-11-25 10:37:30 -08:00
Bagatur	7222c42077	RELEASE: core 0.0.6 (#13853 )	2023-11-25 10:21:14 -08:00
raelix	c172605ea6	IMPROVEMENT: Added title metadata to GoogleDriveLoader for optional File Loaders (#13832 ) - Description: Simple change, I just added title metadata to GoogleDriveLoader for optional File Loaders - Dependencies: no dependencies - Tag maintainer: @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-24 18:53:55 -08:00
Stefano Lottini	19c68c7652	FEATURE: Astra DB, LLM cache classes (exact-match and semantic cache) (#13834 ) This PR provides idiomatic implementations for the exact-match and the semantic LLM caches using Astra DB as backend through the database's HTTP JSON API. These caches require the `astrapy` library as dependency. Comes with integration tests and example usage in the `llm_cache.ipynb` in the docs. @baskaryan this is the Astra DB counterpart for the Cassandra classes you merged some time ago, tagging you for your familiarity with the topic. Thank you! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-24 18:53:37 -08:00
Stefano Lottini	272df9dcae	Astra DB, chat message history (#13836 ) This PR adds a chat message history component that uses Astra DB for persistence through the JSON API. The `astrapy` package is required for this class to work. I have added tests and a small notebook, and updated the relevant references in the other docs pages. (@rlancemartin this is the counterpart of the Cassandra equivalent class you so helpfully reviewed back at the end of June) Thank you!	2023-11-24 18:12:29 -08:00
Bagatur	58f7e109ac	BUGFIX: Add import types and typevars from core (#13829 )	2023-11-24 17:04:10 -08:00
Bagatur	751226e067	bump 0.0.339rc2 (#13787 )	2023-11-23 12:50:09 -08:00
Bagatur	300ff01824	RELEASE: core 0.0.5 (#13786 )	2023-11-23 12:23:50 -08:00
Bagatur	72c108b003	IMPROVEMENT: filter global warnings properly (#13754 )	2023-11-22 16:26:37 -08:00
William FH	163bf165ed	Add Batch Size kwarg to the llm start callback (#13483 ) So you can more easily use the token counts directly from the API endpoint for batch size of 1	2023-11-22 14:47:57 -08:00
Bagatur	0be515f720	RELEASE: 0.0.339rc1 (#13746 )	2023-11-22 14:29:49 -08:00
Bagatur	2bc5bd67f7	RELEASE: core 0.0.4 (#13745 )	2023-11-22 13:57:28 -08:00
Bagatur	32d087fcb8	REFACTOR: combine core documents files (#13733 )	2023-11-22 10:10:26 -08:00
William FH	5b90fe5b1c	Fix locking (#13725 )	2023-11-22 07:37:25 -08:00
Bagatur	16af282429	BUGFIX: add prompt imports for backwards compat (#13702 )	2023-11-21 23:04:20 -08:00
Bagatur	e327bb4ba4	IMPROVEMENT: Conditionally import core type hints (#13700 )	2023-11-21 21:38:49 -08:00
dandanwei	d47ee1ae79	BUGFIX: redis vector store overwrites falsey metadata (#13652 ) - Description: This commit fixed the problem that Redis vector store will change the value of a metadata from 0 to empty when saving the document, which should be an un-intended behavior. - Issue: N/A - Dependencies: N/A	2023-11-21 20:16:23 -08:00
Bagatur	a21e84faf7	BUGFIX: llm backwards compat imports (#13698 )	2023-11-21 20:12:35 -08:00
Yujie Qian	ace9e64d62	IMPROVEMENT: VoyageEmbeddings embed_general_texts (#13620 ) - Description: add method embed_general_texts in VoyageEmbeddings to support input_type - Issue: - Dependencies: - Tag maintainer: - Twitter handle: @Voyage_AI_	2023-11-21 18:33:07 -08:00
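
Note on d47ee1ae79 (redis vector store overwrites falsey metadata): the message says a metadata value of 0 was saved as empty. A minimal sketch of that class of bug follows, assuming it came from a truthiness check. The function names are hypothetical and this is not the actual Redis vector store code.

```python
from typing import Any, Dict


def serialize_metadata_buggy(metadata: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical: `value or ""` treats every falsy value (0, False, "") as missing.
    return {key: value or "" for key, value in metadata.items()}


def serialize_metadata_fixed(metadata: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical: only None is replaced, so 0 and False survive as real values.
    return {key: "" if value is None else value for key, value in metadata.items()}


metadata = {"page": 0, "draft": False, "author": None}
buggy = serialize_metadata_buggy(metadata)
fixed = serialize_metadata_fixed(metadata)
```

With the explicit `is None` check, `page` stays `0` and `draft` stays `False`, while the truthiness version turns both into empty strings.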

2113 Commits
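
Note on e17edc4d0b (RunnableLambda: create afunc instance from func when not provided): the idea of deriving an async callable from a sync one can be sketched with the standard library alone. `LambdaSketch` is a hypothetical stand-in, and running the sync function in the default executor is an assumption, not necessarily what LangChain does.

```python
import asyncio
from functools import partial
from typing import Any, Awaitable, Callable, Optional


class LambdaSketch:
    """Hypothetical stand-in for a runnable that wraps a plain function."""

    def __init__(
        self,
        func: Callable[..., Any],
        afunc: Optional[Callable[..., Awaitable[Any]]] = None,
    ) -> None:
        self.func = func
        # Fall back to an async wrapper around `func` when no `afunc` is given.
        self.afunc = afunc if afunc is not None else self._to_async(func)

    @staticmethod
    def _to_async(func: Callable[..., Any]) -> Callable[..., Awaitable[Any]]:
        async def afunc(*args: Any, **kwargs: Any) -> Any:
            # Run the sync function in the default thread pool so it does not
            # block the event loop.
            loop = asyncio.get_running_loop()
            return await loop.run_in_executor(None, partial(func, *args, **kwargs))

        return afunc

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    async def ainvoke(self, value: Any) -> Any:
        return await self.afunc(value)


sketch = LambdaSketch(lambda x: x + 1)
sync_result = sketch.invoke(1)
async_result = asyncio.run(sketch.ainvoke(1))
```

Both call paths return the same value here, which is the behavior the commit message describes for the case where only `func` is supplied.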