diff --git a/MIGRATE.md b/MIGRATE.md index 6500865076..710f2d27b2 100644 --- a/MIGRATE.md +++ b/MIGRATE.md @@ -1,11 +1,11 @@ # Migrating -Please see the following guides for migratin LangChain code: +Please see the following guides for migrating LangChain code: * Migrate to [LangChain v0.3](https://python.langchain.com/docs/versions/v0_3/) * Migrate to [LangChain v0.2](https://python.langchain.com/docs/versions/v0_2/) * Migrating from [LangChain 0.0.x Chains](https://python.langchain.com/docs/versions/migrating_chains/) -* Upgrate to [LangGraph Memory](https://python.langchain.com/docs/versions/migrating_memory/) +* Upgrade to [LangGraph Memory](https://python.langchain.com/docs/versions/migrating_memory/) -The [LangChain CLI](https://python.langchain.com/docs/versions/v0_3/#migrate-using-langchain-cli) can help automatically upgrade your code to use non deprecated imports. +The [LangChain CLI](https://python.langchain.com/docs/versions/v0_3/#migrate-using-langchain-cli) can help you automatically upgrade your code to use non-deprecated imports. This will be especially helpful if you're still on either version 0.0.x or 0.1.x of LangChain. diff --git a/docs/docs/concepts/chat_history.mdx b/docs/docs/concepts/chat_history.mdx index 5c7c112460..57d22c2735 100644 --- a/docs/docs/concepts/chat_history.mdx +++ b/docs/docs/concepts/chat_history.mdx @@ -17,7 +17,7 @@ Most conversations start with a **system message** that sets the context for the The **assistant** may respond directly to the user or if configured with tools request that a [tool](/docs/concepts/tool_calling) be invoked to perform a specific task. -So a full conversation often involves a combination of two patterns of alternating messages: +A full conversation often involves a combination of two patterns of alternating messages: 1. The **user** and the **assistant** representing a back-and-forth conversation. 2. The **assistant** and **tool messages** representing an ["agentic" workflow](/docs/concepts/agents) where the assistant is invoking tools to perform specific tasks. diff --git a/docs/docs/concepts/chat_models.mdx b/docs/docs/concepts/chat_models.mdx index b42022161f..03133a253e 100644 --- a/docs/docs/concepts/chat_models.mdx +++ b/docs/docs/concepts/chat_models.mdx @@ -2,7 +2,7 @@ ## Overview -Large Language Models (LLMs) are advanced machine learning models that excel in a wide range of language-related tasks such as text generation, translation, summarization, question answering, and more, without needing task-specific tuning for every scenario. +Large Language Models (LLMs) are advanced machine learning models that excel in a wide range of language-related tasks such as text generation, translation, summarization, question answering, and more, without needing task-specific fine-tuning for every scenario. Modern LLMs are typically accessed through a chat model interface that takes a list of [messages](/docs/concepts/messages) as input and returns a [message](/docs/concepts/messages) as output.
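+For example, here is a minimal sketch of that message-based interface (this assumes the `langchain-openai` package is installed and an `OPENAI_API_KEY` environment variable is set; the model name is illustrative):
+
+```python
+from langchain_openai import ChatOpenAI
+
+# Standardized parameters such as `model`, `temperature`, and `timeout`
+# are passed when the chat model is constructed.
+llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.0, timeout=30)
+
+# The input is a list of messages; the output is a single message.
+response = llm.invoke(
+    [
+        ("system", "You are a helpful assistant."),
+        ("human", "Summarize what LangChain is in one sentence."),
+    ]
+)
+print(response.content)
+```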
@@ -85,7 +85,7 @@ Many chat models have standardized parameters that can be used to configure the | Parameter | Description | |----------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | `model` | The name or identifier of the specific AI model you want to use (e.g., `"gpt-3.5-turbo"` or `"gpt-4"`). | -| `temperature` | Controls the randomness of the model's output. A higher value (e.g., 1.0) makes responses more creative, while a lower value (e.g., 0.1) makes them more deterministic and focused. | +| `temperature` | Controls the randomness of the model's output. A higher value (e.g., 1.0) makes responses more creative, while a lower value (e.g., 0.0) makes them more deterministic and focused. | | `timeout` | The maximum time (in seconds) to wait for a response from the model before canceling the request. Ensures the request doesn’t hang indefinitely. | | `max_tokens` | Limits the total number of tokens (words and punctuation) in the response. This controls how long the output can be. | | `stop` | Specifies stop sequences that indicate when the model should stop generating tokens. For example, you might use specific strings to signal the end of a response. | @@ -97,9 +97,9 @@ Many chat models have standardized parameters that can be used to configure the Some important things to note: - Standard parameters only apply to model providers that expose parameters with the intended functionality. For example, some providers do not expose a configuration for maximum output tokens, so max_tokens can't be supported on these. -- Standard params are currently only enforced on integrations that have their own integration packages (e.g. `langchain-openai`, `langchain-anthropic`, etc.), they're not enforced on models in ``langchain-community``. +- Standard parameters are currently only enforced on integrations that have their own integration packages (e.g. `langchain-openai`, `langchain-anthropic`, etc.); they're not enforced on models in `langchain-community`. -ChatModels also accept other parameters that are specific to that integration. To find all the parameters supported by a ChatModel head to the [API reference](https://python.langchain.com/api_reference/) for that model. +Chat models also accept other parameters that are specific to that integration. To find all the parameters supported by a chat model, head to its [API reference](https://python.langchain.com/api_reference/). ## Tool calling @@ -150,7 +150,7 @@ An alternative approach is to use semantic caching, where you cache responses ba A semantic cache introduces a dependency on another model on the critical path of your application (e.g., the semantic cache may rely on an [embedding model](/docs/concepts/embedding_models) to convert text to a vector representation), and it's not guaranteed to capture the meaning of the input accurately. -However, there might be situations where caching chat model responses is beneficial. For example, if you have a chat model that is used to answer frequently asked questions, caching responses can help reduce the load on the model provider and improve response times. +However, there might be situations where caching chat model responses is beneficial. 
For example, if you have a chat model that is used to answer frequently asked questions, caching responses can help reduce the load on the model provider, lower costs, and improve response times. Please see the [how to cache chat model responses](/docs/how_to/chat_model_caching/) guide for more details. diff --git a/docs/docs/concepts/document_loaders.mdx b/docs/docs/concepts/document_loaders.mdx index d9b1f13bab..c38e81610e 100644 --- a/docs/docs/concepts/document_loaders.mdx +++ b/docs/docs/concepts/document_loaders.mdx @@ -29,7 +29,7 @@ loader = CSVLoader( data = loader.load() ``` -or if working with large datasets, you can use the `.lazy_load` method: +When working with large datasets, you can use the `.lazy_load` method: ```python for document in loader.lazy_load(): diff --git a/docs/docs/concepts/lcel.mdx b/docs/docs/concepts/lcel.mdx index 020bc6f8aa..da45da268b 100644 --- a/docs/docs/concepts/lcel.mdx +++ b/docs/docs/concepts/lcel.mdx @@ -6,7 +6,7 @@ The **L**ang**C**hain **E**xpression **L**anguage (LCEL) takes a [declarative](https://en.wikipedia.org/wiki/Declarative_programming) approach to building new [Runnables](/docs/concepts/runnables) from existing Runnables. -This means that you describe what you want to happen, rather than how you want it to happen, allowing LangChain to optimize the run-time execution of the chains. +This means that you describe what *should* happen, rather than *how* it should happen, allowing LangChain to optimize the run-time execution of the chains. We often refer to a `Runnable` created using LCEL as a "chain". It's important to remember that a "chain" is `Runnable` and it implements the full [Runnable Interface](/docs/concepts/runnables). @@ -20,8 +20,8 @@ We often refer to a `Runnable` created using LCEL as a "chain". It's important t LangChain optimizes the run-time execution of chains built with LCEL in a number of ways: -- **Optimize parallel execution**: Run Runnables in parallel using [RunnableParallel](#runnableparallel) or run multiple inputs through a given chain in parallel using the [Runnable Batch API](/docs/concepts/runnables/#optimized-parallel-execution-batch). Parallel execution can significantly reduce the latency as processing can be done in parallel instead of sequentially. -- **Guarantee Async support**: Any chain built with LCEL can be run asynchronously using the [Runnable Async API](/docs/concepts/runnables/#asynchronous-support). This can be useful when running chains in a server environment where you want to handle large number of requests concurrently. +- **Optimized parallel execution**: Run Runnables in parallel using [RunnableParallel](#runnableparallel) or run multiple inputs through a given chain in parallel using the [Runnable Batch API](/docs/concepts/runnables/#optimized-parallel-execution-batch). Parallel execution can significantly reduce the latency as processing can be done in parallel instead of sequentially. +- **Guaranteed Async support**: Any chain built with LCEL can be run asynchronously using the [Runnable Async API](/docs/concepts/runnables/#asynchronous-support). This can be useful when running chains in a server environment where you want to handle a large number of requests concurrently. - **Simplify streaming**: LCEL chains can be streamed, allowing for incremental output as the chain is executed. 
LangChain can optimize the streaming of the output to minimize the time-to-first-token (time elapsed until the first chunk of output from a [chat model](/docs/concepts/chat_models) or [llm](/docs/concepts/text_llms) comes out). Other benefits include: @@ -38,7 +38,7 @@ LCEL is an [orchestration solution](https://en.wikipedia.org/wiki/Orchestration_ While we have seen users run chains with hundreds of steps in production, we generally recommend using LCEL for simpler orchestration tasks. When the application requires complex state management, branching, cycles or multiple agents, we recommend that users take advantage of [LangGraph](/docs/concepts/architecture#langgraph). -In LangGraph, users define graphs that specify the flow of the application. This allows users to keep using LCEL within individual nodes when LCEL is needed, while making it easy to define complex orchestration logic that is more readable and maintainable. +In LangGraph, users define graphs that specify the application's flow. This allows users to keep using LCEL within individual nodes when LCEL is needed, while making it easy to define complex orchestration logic that is more readable and maintainable. Here are some guidelines: diff --git a/docs/docs/concepts/messages.mdx b/docs/docs/concepts/messages.mdx index d1b307b98c..c8765ab3d3 100644 --- a/docs/docs/concepts/messages.mdx +++ b/docs/docs/concepts/messages.mdx @@ -8,7 +8,7 @@ Messages are the unit of communication in [chat models](/docs/concepts/chat_models). They are used to represent the input and output of a chat model, as well as any additional context or metadata that may be associated with a conversation. -Each message has a **role** (e.g., "user", "assistant"), **content** (e.g., text, multimodal data), and additional metadata that can vary depending on the chat model provider. +Each message has a **role** (e.g., "user", "assistant") and **content** (e.g., text, multimodal data) with additional metadata that varies depending on the chat model provider. LangChain provides a unified message format that can be used across chat models, allowing users to work with different chat models without worrying about the specific details of the message format used by each model provider. @@ -39,6 +39,7 @@ The content of a message text or a list of dictionaries representing [multimodal Currently, most chat models support text as the primary content type, with some models also supporting multimodal data. However, support for multimodal data is still limited across most chat model providers. For more information see: +* [SystemMessage](#systemmessage) -- for content used to direct the conversation. * [HumanMessage](#humanmessage) -- for content in the input from the user. * [AIMessage](#aimessage) -- for content in the response from the model. * [Multimodality](/docs/concepts/multimodality) -- for more information on multimodal content. diff --git a/docs/docs/concepts/retrieval.mdx b/docs/docs/concepts/retrieval.mdx index a69fb8d4f9..0ded476b80 100644 --- a/docs/docs/concepts/retrieval.mdx +++ b/docs/docs/concepts/retrieval.mdx @@ -27,7 +27,7 @@ These systems accommodate various data formats: - Unstructured text (e.g., documents) is often stored in vector stores or lexical search indexes. - Structured data is typically housed in relational or graph databases with defined schemas. -Despite this diversity in data formats, modern AI applications increasingly aim to make all types of data accessible through natural language interfaces. 
+Despite the growing diversity in data formats, modern AI applications increasingly aim to make all types of data accessible through natural language interfaces. Models play a crucial role in this process by translating natural language queries into formats compatible with the underlying search index or database. This translation enables more intuitive and flexible interactions with complex data structures. @@ -41,7 +41,7 @@ This translation enables more intuitive and flexible interactions with complex d ## Query analysis -While users typically prefer to interact with retrieval systems using natural language, retrieval systems can specific query syntax or benefit from particular keywords. +While users typically prefer to interact with retrieval systems using natural language, these systems may require specific query syntax or benefit from certain keywords. Query analysis serves as a bridge between raw user input and optimized search queries. Some common applications of query analysis include: 1. **Query Re-writing**: Queries can be re-written or expanded to improve semantic or lexical searches. diff --git a/docs/docs/concepts/runnables.mdx b/docs/docs/concepts/runnables.mdx index 961942c67d..e37022aa52 100644 --- a/docs/docs/concepts/runnables.mdx +++ b/docs/docs/concepts/runnables.mdx @@ -1,6 +1,6 @@ # Runnable interface -The Runnable interface is foundational for working with LangChain components, and it's implemented across many of them, such as [language models](/docs/concepts/chat_models), [output parsers](/docs/concepts/output_parsers), [retrievers](/docs/concepts/retrievers), [compiled LangGraph graphs]( +The Runnable interface is the foundation for working with LangChain components, and it's implemented across many of them, such as [language models](/docs/concepts/chat_models), [output parsers](/docs/concepts/output_parsers), [retrievers](/docs/concepts/retrievers), [compiled LangGraph graphs]( https://langchain-ai.github.io/langgraph/concepts/low_level/#compiling-your-graph) and more. This guide covers the main concepts and methods of the Runnable interface, which allows developers to interact with various LangChain components in a consistent and predictable manner. @@ -42,7 +42,7 @@ Some Runnables may provide their own implementations of `batch` and `batch_as_co rely on a `batch` API provided by a model provider). :::note -The async versions of `abatch` and `abatch_as_completed` these rely on asyncio's [gather](https://docs.python.org/3/library/asyncio-task.html#asyncio.gather) and [as_completed](https://docs.python.org/3/library/asyncio-task.html#asyncio.as_completed) functions to run the `ainvoke` method in parallel. +The async versions of `abatch` and `abatch_as_completed` rely on asyncio's [gather](https://docs.python.org/3/library/asyncio-task.html#asyncio.gather) and [as_completed](https://docs.python.org/3/library/asyncio-task.html#asyncio.as_completed) functions to run the `ainvoke` method in parallel. ::: :::tip @@ -58,7 +58,7 @@ Runnables expose an asynchronous API, allowing them to be called using the `awai Please refer to the [Async Programming with LangChain](/docs/concepts/async) guide for more details. -## Streaming apis +## Streaming APIs Streaming is critical in making applications based on LLMs feel responsive to end-users. @@ -101,7 +101,7 @@ This is an advanced feature that is unnecessary for most users. You should proba skip this section unless you have a specific need to inspect the schema of a Runnable. 
::: -In some advanced uses, you may want to programmatically **inspect** the Runnable and determine what input and output types the Runnable expects and produces. +In more advanced use cases, you may want to programmatically **inspect** the Runnable and determine what input and output types the Runnable expects and produces. The Runnable interface provides methods to get the [JSON Schema](https://json-schema.org/) of the input and output types of a Runnable, as well as [Pydantic schemas](https://docs.pydantic.dev/latest/) for the input and output types. diff --git a/docs/docs/concepts/structured_outputs.mdx b/docs/docs/concepts/structured_outputs.mdx index a334ecc127..dad1c1a49c 100644 --- a/docs/docs/concepts/structured_outputs.mdx +++ b/docs/docs/concepts/structured_outputs.mdx @@ -119,11 +119,11 @@ json_object = json.loads(ai_msg.content) There are a few challenges when producing structured output with the above methods: -(1) If using tool calling, tool call arguments needs to be parsed from a dictionary back to the original schema. +(1) When tool calling is used, tool call arguments need to be parsed from a dictionary back to the original schema. (2) In addition, the model needs to be instructed to *always* use the tool when we want to enforce structured output, which is a provider specific setting. -(3) If using JSON mode, the output needs to be parsed into a JSON object. +(3) When JSON mode is used, the output needs to be parsed into a JSON object. With these challenges in mind, LangChain provides a helper function (`with_structured_output()`) to streamline the process. diff --git a/docs/docs/concepts/tools.mdx b/docs/docs/concepts/tools.mdx index 13bf00d43f..c459a5973b 100644 --- a/docs/docs/concepts/tools.mdx +++ b/docs/docs/concepts/tools.mdx @@ -6,7 +6,7 @@ ## Overview -The **tool** abstraction in LangChain associates a python **function** with a **schema** that defines the function's **name**, **description** and **input**. +The **tool** abstraction in LangChain associates a Python **function** with a **schema** that defines the function's **name**, **description** and **expected arguments**. **Tools** can be passed to [chat models](/docs/concepts/chat_models) that support [tool calling](/docs/concepts/tool_calling) allowing the model to request the execution of a specific function with specific inputs. @@ -14,7 +14,7 @@ The **tool** abstraction in LangChain associates a python **function** with a ** - Tools are a way to encapsulate a function and its schema in a way that can be passed to a chat model. - Create tools using the [@tool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.convert.tool.html) decorator, which simplifies the process of tool creation, supporting the following: - - Automatically infer the tool's **name**, **description** and **inputs**, while also supporting customization. + - Automatically infer the tool's **name**, **description** and **expected arguments**, while also supporting customization. - Defining tools that return **artifacts** (e.g. images, dataframes, etc.) - Hiding input arguments from the schema (and hence from the model) using **injected tool arguments**. diff --git a/docs/docs/concepts/why_langchain.mdx b/docs/docs/concepts/why_langchain.mdx index c6b1d41da3..584a080c95 100644 --- a/docs/docs/concepts/why_langchain.mdx +++ b/docs/docs/concepts/why_langchain.mdx @@ -1,9 +1,9 @@ -# Why langchain? +# Why LangChain? 
-The goal of `langchain` the Python package and LangChain the company is to make it as easy possible for developers to build applications that reason. +The goal of `langchain` the Python package and LangChain the company is to make it as easy as possible for developers to build applications that reason. While LangChain originally started as a single open source package, it has evolved into a company and a whole ecosystem. This page will talk about the LangChain ecosystem as a whole. -Most of the components within in the LangChain ecosystem can be used by themselves - so if you feel particularly drawn to certain components but not others, that is totally fine! Pick and choose whichever components you like best. +Most of the components within the LangChain ecosystem can be used by themselves - so if you feel particularly drawn to certain components but not others, that is totally fine! Pick and choose whichever components you like best for your own use case! ## Features @@ -17,8 +17,8 @@ LangChain exposes a standard interface for key components, making it easy to swi [Orchestration](https://en.wikipedia.org/wiki/Orchestration_(computing)) is crucial for building such applications. 3. **Observability and evaluation:** As applications become more complex, it becomes increasingly difficult to understand what is happening within them. -Furthermore, the pace of development can become rate-limited by the [paradox of choice](https://en.wikipedia.org/wiki/Paradox_of_choice): -for example, developers often wonder how to engineer their prompt or which LLM best balances accuracy, latency, and cost. +Furthermore, the pace of development can become rate-limited by the [paradox of choice](https://en.wikipedia.org/wiki/Paradox_of_choice). +For example, developers often wonder how to engineer their prompt or which LLM best balances accuracy, latency, and cost. [Observability](https://en.wikipedia.org/wiki/Observability) and evaluations can help developers monitor their applications and rapidly answer these types of questions with confidence. @@ -72,11 +72,11 @@ There are several common characteristics of LLM applications that this orchestra * **[Persistence](https://langchain-ai.github.io/langgraph/concepts/persistence/):** The application needs to maintain [short-term and / or long-term memory](https://langchain-ai.github.io/langgraph/concepts/memory/). * **[Human-in-the-loop](https://langchain-ai.github.io/langgraph/concepts/human_in_the_loop/):** The application needs human interaction, e.g., pausing, reviewing, editing, approving certain steps. -The recommended way to do orchestration for these complex applications is [LangGraph](https://langchain-ai.github.io/langgraph/concepts/high_level/). +The recommended way to orchestrate components for complex applications is [LangGraph](https://langchain-ai.github.io/langgraph/concepts/high_level/). LangGraph is a library that gives developers a high degree of control by expressing the flow of the application as a set of nodes and edges. LangGraph comes with built-in support for [persistence](https://langchain-ai.github.io/langgraph/concepts/persistence/), [human-in-the-loop](https://langchain-ai.github.io/langgraph/concepts/human_in_the_loop/), [memory](https://langchain-ai.github.io/langgraph/concepts/memory/), and other features. -It's particularly well suited for building [agents](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/) or [multi-agent](https://langchain-ai.github.io/langgraph/concepts/multi_agent/) applications. 
-Importantly, individual LangChain components can be used within LangGraph nodes, but you can also use LangGraph **without** using LangChain components. +It's particularly well suited for building [agents](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/) or [multi-agent](https://langchain-ai.github.io/langgraph/concepts/multi_agent/) applications. +Importantly, individual LangChain components can be used as LangGraph nodes, but you can also use LangGraph **without** using LangChain components. :::info[Further reading] diff --git a/docs/docs/how_to/chatbots_memory.ipynb b/docs/docs/how_to/chatbots_memory.ipynb index aa6e7002ca..011609a4a5 100644 --- a/docs/docs/how_to/chatbots_memory.ipynb +++ b/docs/docs/how_to/chatbots_memory.ipynb @@ -15,7 +15,7 @@ "source": [ "# How to add memory to chatbots\n", "\n", - "A key feature of chatbots is their ability to use content of previous conversation turns as context. This state management can take several forms, including:\n", + "A key feature of chatbots is their ability to use the content of previous conversational turns as context. This state management can take several forms, including:\n", "\n", "- Simply stuffing previous messages into a chat model prompt.\n", "- The above, but trimming old messages to reduce the amount of distracting information the model has to deal with.\n", @@ -185,7 +185,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - " We'll pass the latest input to the conversation here and let the LangGraph keep track of the conversation history using the checkpointer:" + " We'll pass the latest input to the conversation here and let LangGraph keep track of the conversation history using the checkpointer:" ] }, { diff --git a/docs/docs/how_to/custom_chat_model.ipynb b/docs/docs/how_to/custom_chat_model.ipynb index 708a0942c9..4fc502ca17 100644 --- a/docs/docs/how_to/custom_chat_model.ipynb +++ b/docs/docs/how_to/custom_chat_model.ipynb @@ -503,7 +503,7 @@ "\n", "Documentation:\n", "\n", - "* The model contains doc-strings for all initialization arguments, as these will be surfaced in the [APIReference](https://python.langchain.com/api_reference/langchain/index.html).\n", + "* The model contains doc-strings for all initialization arguments, as these will be surfaced in the [API Reference](https://python.langchain.com/api_reference/langchain/index.html).\n", "* The class doc-string for the model contains a link to the model API if the model is powered by a service.\n", "\n", "Tests:\n", diff --git a/docs/docs/how_to/output_parser_custom.ipynb b/docs/docs/how_to/output_parser_custom.ipynb index 26180b3fb6..a8cca984b6 100644 --- a/docs/docs/how_to/output_parser_custom.ipynb +++ b/docs/docs/how_to/output_parser_custom.ipynb @@ -238,7 +238,7 @@ "id": "3a96a846-1296-4d92-8e76-e29e583dee22", "metadata": {}, "source": [ - "Here's a simple parser that can parse a **string** representation of a booealn (e.g., `YES` or `NO`) and convert it into the corresponding `boolean` type." + "Here's a simple parser that can parse a **string** representation of a boolean (e.g., `YES` or `NO`) and convert it into the corresponding `boolean` type." 
] }, { diff --git a/docs/docs/integrations/llms/aleph_alpha.ipynb b/docs/docs/integrations/llms/aleph_alpha.ipynb index 70fc18af07..1c7f264dbb 100644 --- a/docs/docs/integrations/llms/aleph_alpha.ipynb +++ b/docs/docs/integrations/llms/aleph_alpha.ipynb @@ -7,7 +7,7 @@ "source": [ "# Aleph Alpha\n", "\n", - "[The Luminous series](https://docs.aleph-alpha.com/docs/introduction/luminous/) is a family of large language models.\n", + "[The Luminous series](https://docs.aleph-alpha.com/docs/category/luminous/) is a family of large language models.\n", "\n", "This example goes over how to use LangChain to interact with Aleph Alpha models" ] diff --git a/docs/docs/integrations/llms/cloudflare_workersai.ipynb b/docs/docs/integrations/llms/cloudflare_workersai.ipynb index 5c6652eb33..0232683535 100644 --- a/docs/docs/integrations/llms/cloudflare_workersai.ipynb +++ b/docs/docs/integrations/llms/cloudflare_workersai.ipynb @@ -7,7 +7,7 @@ "source": [ "# Cloudflare Workers AI\n", "\n", - "[Cloudflare AI documentation](https://developers.cloudflare.com/workers-ai/models/text-generation/) listed all generative text models available.\n", + "[Cloudflare AI documentation](https://developers.cloudflare.com/workers-ai/models/) lists all available generative text models.\n", "\n", "Both Cloudflare account ID and API token are required. Find how to obtain them from [this document](https://developers.cloudflare.com/workers-ai/get-started/rest-api/)." ] diff --git a/docs/docs/integrations/llms/forefrontai.ipynb b/docs/docs/integrations/llms/forefrontai.ipynb index 34dec0be5e..c06988e6e3 100644 --- a/docs/docs/integrations/llms/forefrontai.ipynb +++ b/docs/docs/integrations/llms/forefrontai.ipynb @@ -7,7 +7,7 @@ "# ForefrontAI\n", "\n", "\n", - "The `Forefront` platform gives you the ability to fine-tune and use [open-source large language models](https://docs.forefront.ai/forefront/master/models).\n", + "The `Forefront` platform gives you the ability to fine-tune and use [open-source large language models](https://docs.forefront.ai/get-started/models).\n", "\n", "This notebook goes over how to use Langchain with [ForefrontAI](https://www.forefront.ai/).\n" ] diff --git a/docs/docs/integrations/vectorstores/kinetica.ipynb b/docs/docs/integrations/vectorstores/kinetica.ipynb index 1d5344cf42..292098ce9a 100644 --- a/docs/docs/integrations/vectorstores/kinetica.ipynb +++ b/docs/docs/integrations/vectorstores/kinetica.ipynb @@ -33,35 +33,13 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.2.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.0\u001b[0m\n", - "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", - "Note: you may need to restart the kernel to use updated packages.\n", - "Requirement already satisfied: gpudb==7.2.0.0b in /home/anindyam/kinetica/kinetica-github/langchain/libs/langchain/.venv/lib/python3.8/site-packages (7.2.0.0b0)\n", - "Requirement already satisfied: future in /home/anindyam/kinetica/kinetica-github/langchain/libs/langchain/.venv/lib/python3.8/site-packages (from gpudb==7.2.0.0b) (0.18.3)\n", - "Requirement already satisfied: pyzmq in 
/home/anindyam/kinetica/kinetica-github/langchain/libs/langchain/.venv/lib/python3.8/site-packages (from gpudb==7.2.0.0b) (25.1.2)\n", - "\n", - "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.2.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.0\u001b[0m\n", - "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", - "Note: you may need to restart the kernel to use updated packages.\n", - "\n", - "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.2.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.0\u001b[0m\n", - "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", - "Note: you may need to restart the kernel to use updated packages.\n" - ] - } - ], + "outputs": [], "source": [ "# Pip install necessary package\n", "%pip install --upgrade --quiet langchain-openai langchain-community\n", - "%pip install gpudb==7.2.0.9\n", + "%pip install \"gpudb>=7.2.2.0\"\n", "%pip install --upgrade --quiet tiktoken" ] }, @@ -74,7 +52,7 @@ }, { "cell_type": "code", - "execution_count": 24, + "execution_count": 2, "metadata": {}, "outputs": [], "source": [ @@ -87,7 +65,7 @@ }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 3, "metadata": {}, "outputs": [ { @@ -96,7 +74,7 @@ "False" ] }, - "execution_count": 25, + "execution_count": 3, "metadata": {}, "output_type": "execute_result" } @@ -110,38 +88,30 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from langchain_community.document_loaders import TextLoader\n", "from langchain_community.vectorstores import (\n", - " DistanceStrategy,\n", " Kinetica,\n", " KineticaSettings,\n", ")\n", - "from langchain_core.documents import Document\n", - "from langchain_openai import OpenAIEmbeddings\n", - "from langchain_text_splitters import CharacterTextSplitter" + "from langchain_openai import OpenAIEmbeddings" ] }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 5, "metadata": {}, "outputs": [], "source": [ - "loader = TextLoader(\"../../how_to/state_of_the_union.txt\")\n", - "documents = loader.load()\n", - "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n", - "docs = text_splitter.split_documents(documents)\n", - "\n", - "embeddings = OpenAIEmbeddings()" + "embeddings = OpenAIEmbeddings(model=\"text-embedding-3-large\")" ] }, { "cell_type": "code", - "execution_count": 28, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -157,6 +127,81 @@ " return KineticaSettings(host=HOST, username=USERNAME, password=PASSWORD)" ] }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "from uuid import uuid4\n", + "\n", + "from langchain_core.documents import Document\n", + "\n", + "document_1 = Document(\n", + " page_content=\"I had chocolate chip pancakes and scrambled eggs for breakfast this morning.\",\n", + " metadata={\"source\": \"tweet\"},\n", + ")\n", + "\n", + "document_2 = Document(\n", + " page_content=\"The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees.\",\n", + " metadata={\"source\": \"news\"},\n", + ")\n", + 
"\n", + "document_3 = Document(\n", + " page_content=\"Building an exciting new project with LangChain - come check it out!\",\n", + " metadata={\"source\": \"tweet\"},\n", + ")\n", + "\n", + "document_4 = Document(\n", + " page_content=\"Robbers broke into the city bank and stole $1 million in cash.\",\n", + " metadata={\"source\": \"news\"},\n", + ")\n", + "\n", + "document_5 = Document(\n", + " page_content=\"Wow! That was an amazing movie. I can't wait to see it again.\",\n", + " metadata={\"source\": \"tweet\"},\n", + ")\n", + "\n", + "document_6 = Document(\n", + " page_content=\"Is the new iPhone worth the price? Read this review to find out.\",\n", + " metadata={\"source\": \"website\"},\n", + ")\n", + "\n", + "document_7 = Document(\n", + " page_content=\"The top 10 soccer players in the world right now.\",\n", + " metadata={\"source\": \"website\"},\n", + ")\n", + "\n", + "document_8 = Document(\n", + " page_content=\"LangGraph is the best framework for building stateful, agentic applications!\",\n", + " metadata={\"source\": \"tweet\"},\n", + ")\n", + "\n", + "document_9 = Document(\n", + " page_content=\"The stock market is down 500 points today due to fears of a recession.\",\n", + " metadata={\"source\": \"news\"},\n", + ")\n", + "\n", + "document_10 = Document(\n", + " page_content=\"I have a bad feeling I am going to get deleted :(\",\n", + " metadata={\"source\": \"tweet\"},\n", + ")\n", + "\n", + "documents = [\n", + " document_1,\n", + " document_2,\n", + " document_3,\n", + " document_4,\n", + " document_5,\n", + " document_6,\n", + " document_7,\n", + " document_8,\n", + " document_9,\n", + " document_10,\n", + "]\n", + "uuids = [str(uuid4()) for _ in range(len(documents))]" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -166,207 +211,92 @@ }, { "cell_type": "code", - "execution_count": 29, + "execution_count": 8, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "['05e5a484-0273-49d1-90eb-1276baca31de',\n", + " 'd98b808f-dc0b-4328-bdbf-88f6b2ab6040',\n", + " 'ba0968d4-e344-4285-ae0f-f5199b56f9d6',\n", + " 'a25393b8-6539-45b5-993e-ea16d01941ec',\n", + " '804a37e3-1278-4b60-8b02-36b159ee8c1a',\n", + " '9688b594-3dc6-41d2-a937-babf8ff24c2f',\n", + " '40f7b8fe-67c7-489a-a5a5-7d3965e33bba',\n", + " 'b4fc1376-c113-41e9-8f16-f9320517bedd',\n", + " '4d94d089-fdde-442b-84ab-36d9fe0670c8',\n", + " '66fdb79d-49ce-4b06-901a-fda6271baf2a']" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "# The Kinetica Module will try to create a table with the name of the collection.\n", "# So, make sure that the collection name is unique and the user has the permission to create a table.\n", "\n", - "COLLECTION_NAME = \"state_of_the_union_test\"\n", + "COLLECTION_NAME = \"langchain_example\"\n", "connection = create_config()\n", "\n", - "db = Kinetica.from_documents(\n", - " embedding=embeddings,\n", - " documents=docs,\n", + "db = Kinetica(\n", + " connection,\n", + " embeddings,\n", " collection_name=COLLECTION_NAME,\n", - " config=connection,\n", - ")" + ")\n", + "\n", + "db.add_documents(documents=documents, ids=uuids)" ] }, { "cell_type": "code", - "execution_count": 30, + "execution_count": 9, "metadata": {}, "outputs": [], "source": [ - "query = \"What did the president say about Ketanji Brown Jackson\"\n", - "docs_with_score = db.similarity_search_with_score(query)" + "# query = \"What did the president say about Ketanji Brown Jackson\"\n", + "# docs_with_score = 
db.similarity_search_with_score(query)" ] }, { "cell_type": "code", - "execution_count": 31, + "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "--------------------------------------------------------------------------------\n", - "Score: 0.6077010035514832\n", - "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n", "\n", - "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n", + "Similarity Search\n", + "* Building an exciting new project with LangChain - come check it out! [{'source': 'tweet'}]\n", + "* LangGraph is the best framework for building stateful, agentic applications! [{'source': 'tweet'}]\n", "\n", - "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n", - "\n", - "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n", - "--------------------------------------------------------------------------------\n", - "--------------------------------------------------------------------------------\n", - "Score: 0.6077010035514832\n", - "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n", - "\n", - "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n", - "\n", - "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n", - "\n", - "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n", - "--------------------------------------------------------------------------------\n", - "--------------------------------------------------------------------------------\n", - "Score: 0.6596046090126038\n", - "A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n", - "\n", - "And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n", - "\n", - "We can do both. At our border, we’ve installed new technology like cutting-edge scanners to better detect drug smuggling. \n", - "\n", - "We’ve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \n", - "\n", - "We’re putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. 
\n", - "\n", - "We’re securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.\n", - "--------------------------------------------------------------------------------\n", - "--------------------------------------------------------------------------------\n", - "Score: 0.6597143411636353\n", - "A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n", - "\n", - "And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n", - "\n", - "We can do both. At our border, we’ve installed new technology like cutting-edge scanners to better detect drug smuggling. \n", - "\n", - "We’ve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \n", - "\n", - "We’re putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \n", - "\n", - "We’re securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.\n", - "--------------------------------------------------------------------------------\n" + "Similarity search with score\n", + "* [SIM=0.945397] The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees. [{'source': 'news'}]\n" ] } ], "source": [ - "for doc, score in docs_with_score:\n", - " print(\"-\" * 80)\n", - " print(\"Score: \", score)\n", - " print(doc.page_content)\n", - " print(\"-\" * 80)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Maximal Marginal Relevance Search (MMR)\n", - "Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents." - ] - }, - { - "cell_type": "code", - "execution_count": 32, - "metadata": {}, - "outputs": [], - "source": [ - "docs_with_score = db.max_marginal_relevance_search_with_score(query)" - ] - }, - { - "cell_type": "code", - "execution_count": 33, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "--------------------------------------------------------------------------------\n", - "Score: 0.6077010035514832\n", - "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n", - "\n", - "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n", - "\n", - "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n", - "\n", - "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. 
One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n", - "--------------------------------------------------------------------------------\n", - "--------------------------------------------------------------------------------\n", - "Score: 0.6852865219116211\n", - "It is going to transform America and put us on a path to win the economic competition of the 21st Century that we face with the rest of the world—particularly with China. \n", - "\n", - "As I’ve told Xi Jinping, it is never a good bet to bet against the American people. \n", - "\n", - "We’ll create good jobs for millions of Americans, modernizing roads, airports, ports, and waterways all across America. \n", - "\n", - "And we’ll do it all to withstand the devastating effects of the climate crisis and promote environmental justice. \n", - "\n", - "We’ll build a national network of 500,000 electric vehicle charging stations, begin to replace poisonous lead pipes—so every child—and every American—has clean water to drink at home and at school, provide affordable high-speed internet for every American—urban, suburban, rural, and tribal communities. \n", - "\n", - "4,000 projects have already been announced. \n", - "\n", - "And tonight, I’m announcing that this year we will start fixing over 65,000 miles of highway and 1,500 bridges in disrepair.\n", - "--------------------------------------------------------------------------------\n", - "--------------------------------------------------------------------------------\n", - "Score: 0.6866700053215027\n", - "We can’t change how divided we’ve been. But we can change how we move forward—on COVID-19 and other issues we must face together. \n", - "\n", - "I recently visited the New York City Police Department days after the funerals of Officer Wilbert Mora and his partner, Officer Jason Rivera. \n", - "\n", - "They were responding to a 9-1-1 call when a man shot and killed them with a stolen gun. \n", - "\n", - "Officer Mora was 27 years old. \n", - "\n", - "Officer Rivera was 22. \n", - "\n", - "Both Dominican Americans who’d grown up on the same streets they later chose to patrol as police officers. \n", - "\n", - "I spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves. \n", - "\n", - "I’ve worked on these issues a long time. \n", - "\n", - "I know what works: Investing in crime prevention and community police officers who’ll walk the beat, who’ll know the neighborhood, and who can restore trust and safety.\n", - "--------------------------------------------------------------------------------\n", - "--------------------------------------------------------------------------------\n", - "Score: 0.6936529278755188\n", - "But cancer from prolonged exposure to burn pits ravaged Heath’s lungs and body. \n", - "\n", - "Danielle says Heath was a fighter to the very end. \n", - "\n", - "He didn’t know how to stop fighting, and neither did she. \n", - "\n", - "Through her pain she found purpose to demand we do better. \n", - "\n", - "Tonight, Danielle—we are. \n", - "\n", - "The VA is pioneering new ways of linking toxic exposures to diseases, already helping more veterans get benefits. \n", - "\n", - "And tonight, I’m announcing we’re expanding eligibility to veterans suffering from nine respiratory cancers. 
\n", - "\n", - "I’m also calling on Congress: pass a law to make sure veterans devastated by toxic exposures in Iraq and Afghanistan finally get the benefits and comprehensive health care they deserve. \n", - "\n", - "And fourth, let’s end cancer as we know it. \n", - "\n", - "This is personal to me and Jill, to Kamala, and to so many of you. \n", - "\n", - "Cancer is the #2 cause of death in America–second only to heart disease.\n", - "--------------------------------------------------------------------------------\n" - ] - } - ], - "source": [ - "for doc, score in docs_with_score:\n", - " print(\"-\" * 80)\n", - " print(\"Score: \", score)\n", - " print(doc.page_content)\n", - " print(\"-\" * 80)" + "print()\n", + "print(\"Similarity Search\")\n", + "results = db.similarity_search(\n", + " \"LangChain provides abstractions to make working with LLMs easy\",\n", + " k=2,\n", + " filter={\"source\": \"tweet\"},\n", + ")\n", + "for res in results:\n", + " print(f\"* {res.page_content} [{res.metadata}]\")\n", + "\n", + "print()\n", + "print(\"Similarity search with score\")\n", + "results = db.similarity_search_with_score(\n", + " \"Will it be hot tomorrow?\", k=1, filter={\"source\": \"news\"}\n", + ")\n", + "for res, score in results:\n", + " print(f\"* [SIM={score:3f}] {res.page_content} [{res.metadata}]\")" ] }, { @@ -381,7 +311,7 @@ }, { "cell_type": "code", - "execution_count": 34, + "execution_count": 11, "metadata": {}, "outputs": [], "source": [ @@ -402,16 +332,16 @@ }, { "cell_type": "code", - "execution_count": 35, + "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "['b94dc67c-ce7e-11ee-b8cb-b940b0e45762']" + "['68c4c679-c4d9-4f2d-bf01-f6c4f2181503']" ] }, - "execution_count": 35, + "execution_count": 12, "metadata": {}, "output_type": "execute_result" } @@ -422,7 +352,7 @@ }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 13, "metadata": {}, "outputs": [], "source": [ @@ -431,16 +361,16 @@ }, { "cell_type": "code", - "execution_count": 37, + "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "(Document(page_content='foo'), 0.0)" + "(Document(metadata={}, page_content='foo'), 0.0015394920483231544)" ] }, - "execution_count": 37, + "execution_count": 14, "metadata": {}, "output_type": "execute_result" } @@ -451,17 +381,17 @@ }, { "cell_type": "code", - "execution_count": 38, + "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "(Document(page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, we’ve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWe’ve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWe’re putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. 
\\n\\nWe’re securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', metadata={'source': '../../how_to/state_of_the_union.txt'}),\n", - " 0.6946534514427185)" + "(Document(metadata={'source': 'tweet'}, page_content='Building an exciting new project with LangChain - come check it out!'),\n", + " 1.2609431743621826)" ] }, - "execution_count": 38, + "execution_count": 15, "metadata": {}, "output_type": "execute_result" } @@ -481,12 +411,12 @@ }, { "cell_type": "code", - "execution_count": 39, + "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "db = Kinetica.from_documents(\n", - " documents=docs,\n", + " documents=documents,\n", " embedding=embeddings,\n", " collection_name=COLLECTION_NAME,\n", " config=connection,\n", @@ -496,7 +426,7 @@ }, { "cell_type": "code", - "execution_count": 40, + "execution_count": 17, "metadata": {}, "outputs": [], "source": [ @@ -505,17 +435,17 @@ }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "(Document(page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\n\\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \\n\\nWe can do both. At our border, we’ve installed new technology like cutting-edge scanners to better detect drug smuggling. \\n\\nWe’ve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \\n\\nWe’re putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. 
\\n\\nWe’re securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.', metadata={'source': '../../how_to/state_of_the_union.txt'}),\n", - " 0.6946534514427185)" + "(Document(metadata={'source': 'tweet'}, page_content='Building an exciting new project with LangChain - come check it out!'),\n", + " 1.260920763015747)" ] }, - "execution_count": 41, + "execution_count": 18, "metadata": {}, "output_type": "execute_result" } @@ -533,7 +463,7 @@ }, { "cell_type": "code", - "execution_count": 42, + "execution_count": 19, "metadata": {}, "outputs": [], "source": [ @@ -542,14 +472,14 @@ }, { "cell_type": "code", - "execution_count": 43, + "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "tags=['Kinetica', 'OpenAIEmbeddings'] vectorstore=\n" + "tags=['Kinetica', 'OpenAIEmbeddings'] vectorstore= search_kwargs={}\n" ] } ], @@ -574,7 +504,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.10" + "version": "3.10.12" } }, "nbformat": 4, diff --git a/docs/docs/troubleshooting/errors/INVALID_TOOL_RESULTS.ipynb b/docs/docs/troubleshooting/errors/INVALID_TOOL_RESULTS.ipynb index 4a1b9b3bdf..f3e62badc4 100644 --- a/docs/docs/troubleshooting/errors/INVALID_TOOL_RESULTS.ipynb +++ b/docs/docs/troubleshooting/errors/INVALID_TOOL_RESULTS.ipynb @@ -6,7 +6,7 @@ "source": [ "# INVALID_TOOL_RESULTS\n", "\n", - "You are passing too many, too few, or mismatched [`ToolMessages`](https://api.js.langchain.com/classes/_langchain_core.messages_tool.ToolMessage.html) to a model.\n", + "You are passing too many, too few, or mismatched [`ToolMessages`](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolMessage.html#toolmessage) to a model.\n", "\n", "When [using a model to call tools](/docs/concepts/tool_calling), the [`AIMessage`](https://api.js.langchain.com/classes/_langchain_core.messages.AIMessage.html)\n", "the model responds with will contain a `tool_calls` array. To continue the flow, the next messages you pass back to the model must\n", diff --git a/libs/community/langchain_community/callbacks/tracers/wandb.py b/libs/community/langchain_community/callbacks/tracers/wandb.py index 76dbb3d202..fcc2312f47 100644 --- a/libs/community/langchain_community/callbacks/tracers/wandb.py +++ b/libs/community/langchain_community/callbacks/tracers/wandb.py @@ -16,6 +16,7 @@ from typing import ( Union, ) +from langchain_core._api import warn_deprecated from langchain_core.output_parsers.pydantic import PydanticBaseModel from langchain_core.tracers.base import BaseTracer from langchain_core.tracers.schemas import Run @@ -325,6 +326,22 @@ class WandbTracer(BaseTracer): self._run_args = run_args self._ensure_run(should_print_url=(wandb.run is None)) self._io_serializer = io_serializer + warn_deprecated( + "0.3.8", + pending=False, + message=( + "Please use the `WeaveTracer` from the `weave` package instead. " + "The `WeaveTracer` is a more flexible and powerful tool for logging " + "and tracing your LangChain callables. " + "Find more information at https://weave-docs.wandb.ai/guides/integrations/langchain" + ), + alternative=( + "Please instantiate the WeaveTracer via " + "`from weave.integrations.langchain import WeaveTracer`. " + "For autologging, simply use `weave.init()` and log all traces " + "from your LangChain callables."
+ ), + ) def finish(self) -> None: """Waits for all asynchronous processes to finish and data to upload. diff --git a/libs/community/langchain_community/vectorstores/kinetica.py b/libs/community/langchain_community/vectorstores/kinetica.py index b9f987219b..8c4a4f9672 100644 --- a/libs/community/langchain_community/vectorstores/kinetica.py +++ b/libs/community/langchain_community/vectorstores/kinetica.py @@ -93,7 +93,7 @@ class Kinetica(VectorStore): To use, you should have the ``gpudb`` python package installed. Args: - kinetica_settings: Kinetica connection settings class. + config: Kinetica connection settings class. embedding_function: Any embedding function implementing `langchain.embeddings.base.Embeddings` interface. collection_name: The name of the collection to use. (default: langchain) @@ -170,7 +170,7 @@ except ImportError: raise ImportError( "Could not import Kinetica python API. " - "Please install it with `pip install gpudb==7.2.0.9`." + "Please install it with `pip install 'gpudb>=7.2.2.0'`." ) self.dimensions = dimensions @@ -199,7 +199,7 @@ except ImportError: raise ImportError( "Could not import Kinetica python API. " - "Please install it with `pip install gpudb==7.2.0.9`." + "Please install it with `pip install 'gpudb>=7.2.2.0'`." ) options = GPUdb.Options() @@ -290,7 +290,7 @@ except ImportError: raise ImportError( "Could not import Kinetica python API. " - "Please install it with `pip install gpudb==7.2.0.9`." + "Please install it with `pip install 'gpudb>=7.2.2.0'`." ) return GPUdbTable( _type=self.table_schema, @@ -428,7 +428,7 @@ k: int = 4, filter: Optional[dict] = None, ) -> List[Tuple[Document, float]]: - from gpudb import GPUdbException resp: Dict = self.__query_collection(embedding, k, filter) if resp and resp["status_info"]["status"] == "OK" and "records" in resp: @@ -436,9 +436,10 @@ records: OrderedDict = resp["records"] results = list(zip(*list(records.values()))) return self._results_to_docs_and_scores(results) - else: - self.logger.error(resp["status_info"]["message"]) - raise GPUdbException(resp["status_info"]["message"]) + + self.logger.error(resp["status_info"]["message"]) + return [] def similarity_search_by_vector( self, @@ -464,16 +465,20 @@ def _results_to_docs_and_scores(self, results: Any) -> List[Tuple[Document, float]]: """Return docs and scores from results.""" - docs = [ - ( - Document( - page_content=result[0], - metadata=json.loads(result[1]), - ), - result[2] if self.embedding_function is not None else None, - ) - for result in results - ] + docs = ( + [ + ( + Document( + page_content=result[0], + metadata=json.loads(result[1]), + ), + result[2] if self.embedding_function is not None else None, + ) + for result in results + ] + if len(results) > 0 + else [] + ) return docs def _select_relevance_score_fn(self) -> Callable[[float], float]: diff --git a/libs/core/langchain_core/callbacks/file.py b/libs/core/langchain_core/callbacks/file.py index 7ea1ff76f8..961b0c9bc2 100644 --- a/libs/core/langchain_core/callbacks/file.py +++ b/libs/core/langchain_core/callbacks/file.py @@ -13,7 +13,8 @@ class FileCallbackHandler(BaseCallbackHandler): """Callback Handler that writes to a file. Parameters: - file: The file to write to. + filename: The file to write to. + mode: The mode to open the file in. Defaults to "a". 
        color: The color to use for the text.
     """
diff --git a/libs/core/langchain_core/callbacks/manager.py b/libs/core/langchain_core/callbacks/manager.py
index 9a25734f93..821939d1a6 100644
--- a/libs/core/langchain_core/callbacks/manager.py
+++ b/libs/core/langchain_core/callbacks/manager.py
@@ -1298,7 +1298,7 @@ class CallbackManager(BaseCallbackManager):
         run_id: Optional[UUID] = None,
         **kwargs: Any,
     ) -> list[CallbackManagerForLLMRun]:
-        """Run when LLM starts running.
+        """Run when chat model starts running.

         Args:
             serialized (Dict[str, Any]): The serialized LLM.
diff --git a/libs/core/langchain_core/indexing/__init__.py b/libs/core/langchain_core/indexing/__init__.py
index 786914c00e..472f41e11a 100644
--- a/libs/core/langchain_core/indexing/__init__.py
+++ b/libs/core/langchain_core/indexing/__init__.py
@@ -7,6 +7,7 @@ if it's unchanged.
 from langchain_core.indexing.api import IndexingResult, aindex, index
 from langchain_core.indexing.base import (
+    DeleteResponse,
     DocumentIndex,
     InMemoryRecordManager,
     RecordManager,
@@ -15,6 +16,7 @@ from langchain_core.indexing.base import (

 __all__ = [
     "aindex",
+    "DeleteResponse",
     "DocumentIndex",
     "index",
     "IndexingResult",
diff --git a/libs/core/poetry.lock b/libs/core/poetry.lock
index e3e50e7bdd..158095ea4e 100644
--- a/libs/core/poetry.lock
+++ b/libs/core/poetry.lock
@@ -1,4 +1,4 @@
-# This file is automatically @generated by Poetry 1.8.2 and should not be changed by hand.
+# This file is automatically @generated by Poetry 1.8.3 and should not be changed by hand.

 [[package]]
 name = "annotated-types"
@@ -808,22 +808,22 @@ arrow = ">=0.15.0"

 [[package]]
 name = "jedi"
-version = "0.19.1"
+version = "0.19.2"
 description = "An autocompletion tool for Python that can be used for text editors."
 optional = false
 python-versions = ">=3.6"
 files = [
-    {file = "jedi-0.19.1-py2.py3-none-any.whl", hash = "sha256:e983c654fe5c02867aef4cdfce5a2fbb4a50adc0af145f70504238f18ef5e7e0"},
-    {file = "jedi-0.19.1.tar.gz", hash = "sha256:cf0496f3651bc65d7174ac1b7d043eff454892c708a87d1b683e57b569927ffd"},
+    {file = "jedi-0.19.2-py2.py3-none-any.whl", hash = "sha256:a8ef22bde8490f57fe5c7681a3c83cb58874daf72b4784de3cce5b6ef6edb5b9"},
+    {file = "jedi-0.19.2.tar.gz", hash = "sha256:4770dc3de41bde3966b02eb84fbcf557fb33cce26ad23da12c742fb50ecb11f0"},
 ]

 [package.dependencies]
-parso = ">=0.8.3,<0.9.0"
+parso = ">=0.8.4,<0.9.0"

 [package.extras]
 docs = ["Jinja2 (==2.11.3)", "MarkupSafe (==1.1.1)", "Pygments (==2.8.1)", "alabaster (==0.7.12)", "babel (==2.9.1)", "chardet (==4.0.0)", "commonmark (==0.8.1)", "docutils (==0.17.1)", "future (==0.18.2)", "idna (==2.10)", "imagesize (==1.2.0)", "mock (==1.0.1)", "packaging (==20.9)", "pyparsing (==2.4.7)", "pytz (==2021.1)", "readthedocs-sphinx-ext (==2.1.4)", "recommonmark (==0.5.0)", "requests (==2.25.1)", "six (==1.15.0)", "snowballstemmer (==2.1.0)", "sphinx (==1.8.5)", "sphinx-rtd-theme (==0.4.3)", "sphinxcontrib-serializinghtml (==1.1.4)", "sphinxcontrib-websupport (==1.2.4)", "urllib3 (==1.26.4)"]
 qa = ["flake8 (==5.0.4)", "mypy (==0.971)", "types-setuptools (==67.2.0.1)"]
-testing = ["Django", "attrs", "colorama", "docopt", "pytest (<7.0.0)"]
+testing = ["Django", "attrs", "colorama", "docopt", "pytest (<9.0.0)"]

 [[package]]
 name = "jinja2"
@@ -844,15 +844,18 @@ i18n = ["Babel (>=2.7)"]

 [[package]]
 name = "json5"
-version = "0.9.25"
+version = "0.9.28"
 description = "A Python implementation of the JSON5 data format."
 optional = false
-python-versions = ">=3.8"
+python-versions = ">=3.8.0"
 files = [
-    {file = "json5-0.9.25-py3-none-any.whl", hash = "sha256:34ed7d834b1341a86987ed52f3f76cd8ee184394906b6e22a1e0deb9ab294e8f"},
-    {file = "json5-0.9.25.tar.gz", hash = "sha256:548e41b9be043f9426776f05df8635a00fe06104ea51ed24b67f908856e151ae"},
+    {file = "json5-0.9.28-py3-none-any.whl", hash = "sha256:29c56f1accdd8bc2e037321237662034a7e07921e2b7223281a5ce2c46f0c4df"},
+    {file = "json5-0.9.28.tar.gz", hash = "sha256:1f82f36e615bc5b42f1bbd49dbc94b12563c56408c6ffa06414ea310890e9a6e"},
 ]

+[package.extras]
+dev = ["build (==1.2.2.post1)", "coverage (==7.5.3)", "mypy (==1.13.0)", "pip (==24.3.1)", "pylint (==3.2.3)", "ruff (==0.7.3)", "twine (==5.1.1)", "uv (==0.5.1)"]
+
 [[package]]
 name = "jsonpatch"
 version = "1.33"
@@ -1222,13 +1225,13 @@ url = "../text-splitters"

 [[package]]
 name = "langsmith"
-version = "0.1.141"
+version = "0.1.142"
 description = "Client library to connect to the LangSmith LLM Tracing and Evaluation Platform."
 optional = false
 python-versions = "<4.0,>=3.8.1"
 files = [
-    {file = "langsmith-0.1.141-py3-none-any.whl", hash = "sha256:e133c6bed9c6d274f19735696a169dea890d35d444ae61c25fb47082aa2d0f16"},
-    {file = "langsmith-0.1.141.tar.gz", hash = "sha256:f3f17d2abc7b8a3857ad8d492b535970109b5ea40e712df91a782522ed581516"},
+    {file = "langsmith-0.1.142-py3-none-any.whl", hash = "sha256:f639ca23c9a0bb77af5fb881679b2f66ff1f21f19d0bebf4e51375e7585a8b38"},
+    {file = "langsmith-0.1.142.tar.gz", hash = "sha256:f8a84d100f3052233ff0a1d66ae14c5dfc20b7e41a1601de011384f16ee6cb82"},
 ]

 [package.dependencies]
@@ -2751,13 +2754,13 @@ test = ["pytest", "ruff"]

 [[package]]
 name = "tomli"
-version = "2.0.2"
+version = "2.1.0"
 description = "A lil' TOML parser"
 optional = false
 python-versions = ">=3.8"
 files = [
-    {file = "tomli-2.0.2-py3-none-any.whl", hash = "sha256:2ebe24485c53d303f690b0ec092806a085f07af5a5aa1464f3931eec36caaa38"},
-    {file = "tomli-2.0.2.tar.gz", hash = "sha256:d46d457a85337051c36524bc5349dd91b1877838e2979ac5ced3e710ed8a60ed"},
+    {file = "tomli-2.1.0-py3-none-any.whl", hash = "sha256:a5c57c3d1c56f5ccdf89f6523458f60ef716e210fc47c4cfb188c5ba473e0391"},
+    {file = "tomli-2.1.0.tar.gz", hash = "sha256:3f646cae2aec94e17d04973e4249548320197cfabdf130015d023de4b74d8ab8"},
 ]

 [[package]]
@@ -2953,19 +2956,15 @@ files = [

 [[package]]
 name = "webcolors"
-version = "24.8.0"
+version = "24.11.1"
 description = "A library for working with the color formats defined by HTML and CSS."
 optional = false
-python-versions = ">=3.8"
+python-versions = ">=3.9"
 files = [
-    {file = "webcolors-24.8.0-py3-none-any.whl", hash = "sha256:fc4c3b59358ada164552084a8ebee637c221e4059267d0f8325b3b560f6c7f0a"},
-    {file = "webcolors-24.8.0.tar.gz", hash = "sha256:08b07af286a01bcd30d583a7acadf629583d1f79bfef27dd2c2c5c263817277d"},
+    {file = "webcolors-24.11.1-py3-none-any.whl", hash = "sha256:515291393b4cdf0eb19c155749a096f779f7d909f7cceea072791cb9095b92e9"},
+    {file = "webcolors-24.11.1.tar.gz", hash = "sha256:ecb3d768f32202af770477b8b65f318fa4f566c22948673a977b00d589dd80f6"},
 ]

-[package.extras]
-docs = ["furo", "sphinx", "sphinx-copybutton", "sphinx-inline-tabs", "sphinx-notfound-page", "sphinxext-opengraph"]
-tests = ["coverage[toml]"]
-
 [[package]]
 name = "webencodings"
 version = "0.5.1"
@@ -3006,13 +3005,13 @@ files = [

 [[package]]
 name = "zipp"
-version = "3.20.2"
+version = "3.21.0"
 description = "Backport of pathlib-compatible object wrapper for zip files"
 optional = false
-python-versions = ">=3.8"
+python-versions = ">=3.9"
 files = [
-    {file = "zipp-3.20.2-py3-none-any.whl", hash = "sha256:a817ac80d6cf4b23bf7f2828b7cabf326f15a001bea8b1f9b49631780ba28350"},
-    {file = "zipp-3.20.2.tar.gz", hash = "sha256:bc9eb26f4506fda01b81bcde0ca78103b6e62f991b381fec825435c836edbc29"},
+    {file = "zipp-3.21.0-py3-none-any.whl", hash = "sha256:ac1bbe05fd2991f160ebce24ffbac5f6d11d83dc90891255885223d42b3cd931"},
+    {file = "zipp-3.21.0.tar.gz", hash = "sha256:2c9958f6430a2040341a52eb608ed6dd93ef4392e02ffe219417c1b28b5dd1f4"},
 ]

 [package.extras]
diff --git a/libs/core/pyproject.toml b/libs/core/pyproject.toml
index 4c885666ba..274401db50 100644
--- a/libs/core/pyproject.toml
+++ b/libs/core/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"

 [tool.poetry]
 name = "langchain-core"
-version = "0.3.17"
+version = "0.3.18"
 description = "Building applications with LLMs through composability"
 authors = []
 license = "MIT"
@@ -84,17 +84,20 @@ classmethod-decorators = [ "classmethod", "langchain_core.utils.pydantic.pre_ini
 [tool.poetry.group.lint.dependencies]
 ruff = "^0.5"

+
 [tool.poetry.group.typing.dependencies]
 mypy = ">=1.10,<1.11"
 types-pyyaml = "^6.0.12.2"
 types-requests = "^2.28.11.5"
 types-jinja2 = "^2.11.9"

+
 [tool.poetry.group.dev.dependencies]
 jupyter = "^1.0.0"
 setuptools = "^67.6.1"
 grandalf = "^0.8"

+
 [tool.poetry.group.test.dependencies]
 pytest = "^8"
 freezegun = "^1.2.2"
@@ -113,12 +116,15 @@ python = "<3.12"
 version = ">=1.26.0,<3"
 python = ">=3.12"

+
 [tool.poetry.group.test_integration.dependencies]

+
 [tool.poetry.group.typing.dependencies.langchain-text-splitters]
 path = "../text-splitters"
 develop = true

+
 [tool.poetry.group.test.dependencies.langchain-standard-tests]
 path = "../standard-tests"
 develop = true
diff --git a/libs/core/tests/unit_tests/indexing/test_public_api.py b/libs/core/tests/unit_tests/indexing/test_public_api.py
index fce3d4f4f9..24ac092eb2 100644
--- a/libs/core/tests/unit_tests/indexing/test_public_api.py
+++ b/libs/core/tests/unit_tests/indexing/test_public_api.py
@@ -6,6 +6,7 @@ def test_all() -> None:
     assert __all__ == sorted(__all__, key=str.lower)
     assert set(__all__) == {
         "aindex",
+        "DeleteResponse",
         "DocumentIndex",
         "index",
         "IndexingResult",
diff --git a/libs/partners/xai/poetry.lock b/libs/partners/xai/poetry.lock
index c7e083d67c..95ba6848ae 100644
--- a/libs/partners/xai/poetry.lock
+++ b/libs/partners/xai/poetry.lock
@@ -720,7 +720,7 @@ files = [

 [[package]]
 name = "langchain-core"
-version = "0.3.16"
+version = "0.3.17"
 description = "Building applications with LLMs through composability"
 optional = false
 python-versions = ">=3.9,<4.0"
@@ -745,7 +745,7 @@ url = "../../core"

 [[package]]
 name = "langchain-openai"
-version = "0.2.7"
+version = "0.2.8"
 description = "An integration package connecting OpenAI and LangChain"
 optional = false
 python-versions = ">=3.9,<4.0"
@@ -753,7 +753,7 @@ files = []
 develop = true

 [package.dependencies]
-langchain-core = "^0.3.16"
+langchain-core = "^0.3.17"
 openai = "^1.54.0"
 tiktoken = ">=0.7,<1"

@@ -2071,4 +2071,4 @@ propcache = ">=0.2.0"
 [metadata]
 lock-version = "2.0"
 python-versions = ">=3.9,<4.0"
-content-hash = "352c1db7e0ce9fd87b2acf92db83e72dec6e82b2e83cb1678f47b175d982eacf"
+content-hash = "954aeccc9bb5a2c79b1fd5affaab2303d588dcda6447db5e866430de7f759823"
diff --git a/libs/partners/xai/pyproject.toml b/libs/partners/xai/pyproject.toml
index a8f026d08e..819223c97d 100644
--- a/libs/partners/xai/pyproject.toml
+++ b/libs/partners/xai/pyproject.toml
@@ -20,10 +20,10 @@ disallow_untyped_defs = "True"

 [tool.poetry.dependencies]
 python = ">=3.9,<4.0"
-langchain-openai = "^0.2"
-langchain-core = "^0.3"
-requests = "^2"
-aiohttp = "^3.9.1"
+langchain-openai = ">=0.2,<0.3"
+langchain-core = ">=0.3,<0.4"
+requests = ">=2,<3"
+aiohttp = ">=3.9.1,<4"

 [tool.ruff.lint]
 select = ["E", "F", "I", "D"]