DOCS: Concept Section Improvements & Updates (#27733)

Edited mainly the `Concepts` section in the LangChain documentation.

Overview:
* Updated some explanations to make the point more clear / Add missing
words for some documentations.
* Rephrased some sentences to make it shorter and more concise.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
This commit is contained in:
Zapiron 2024-11-14 00:01:27 +08:00 committed by GitHub
parent 02de346f6d
commit da7c79b794
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
10 changed files with 31 additions and 30 deletions

View File

@ -17,7 +17,7 @@ Most conversations start with a **system message** that sets the context for the
The **assistant** may respond directly to the user or if configured with tools request that a [tool](/docs/concepts/tool_calling) be invoked to perform a specific task.
So a full conversation often involves a combination of two patterns of alternating messages:
A full conversation often involves a combination of two patterns of alternating messages:
1. The **user** and the **assistant** representing a back-and-forth conversation.
2. The **assistant** and **tool messages** representing an ["agentic" workflow](/docs/concepts/agents) where the assistant is invoking tools to perform specific tasks.

View File

@ -2,7 +2,7 @@
## Overview
Large Language Models (LLMs) are advanced machine learning models that excel in a wide range of language-related tasks such as text generation, translation, summarization, question answering, and more, without needing task-specific tuning for every scenario.
Large Language Models (LLMs) are advanced machine learning models that excel in a wide range of language-related tasks such as text generation, translation, summarization, question answering, and more, without needing task-specific fine tuning for every scenario.
Modern LLMs are typically accessed through a chat model interface that takes a list of [messages](/docs/concepts/messages) as input and returns a [message](/docs/concepts/messages) as output.
@ -85,7 +85,7 @@ Many chat models have standardized parameters that can be used to configure the
| Parameter | Description |
|----------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `model` | The name or identifier of the specific AI model you want to use (e.g., `"gpt-3.5-turbo"` or `"gpt-4"`). |
| `temperature` | Controls the randomness of the model's output. A higher value (e.g., 1.0) makes responses more creative, while a lower value (e.g., 0.1) makes them more deterministic and focused. |
| `temperature` | Controls the randomness of the model's output. A higher value (e.g., 1.0) makes responses more creative, while a lower value (e.g., 0.0) makes them more deterministic and focused. |
| `timeout` | The maximum time (in seconds) to wait for a response from the model before canceling the request. Ensures the request doesnt hang indefinitely. |
| `max_tokens` | Limits the total number of tokens (words and punctuation) in the response. This controls how long the output can be. |
| `stop` | Specifies stop sequences that indicate when the model should stop generating tokens. For example, you might use specific strings to signal the end of a response. |
@ -97,9 +97,9 @@ Many chat models have standardized parameters that can be used to configure the
Some important things to note:
- Standard parameters only apply to model providers that expose parameters with the intended functionality. For example, some providers do not expose a configuration for maximum output tokens, so max_tokens can't be supported on these.
- Standard params are currently only enforced on integrations that have their own integration packages (e.g. `langchain-openai`, `langchain-anthropic`, etc.), they're not enforced on models in ``langchain-community``.
- Standard parameters are currently only enforced on integrations that have their own integration packages (e.g. `langchain-openai`, `langchain-anthropic`, etc.), they're not enforced on models in `langchain-community`.
ChatModels also accept other parameters that are specific to that integration. To find all the parameters supported by a ChatModel head to the [API reference](https://python.langchain.com/api_reference/) for that model.
Chat models also accept other parameters that are specific to that integration. To find all the parameters supported by a Chat model head to the their respective [API reference](https://python.langchain.com/api_reference/) for that model.
## Tool calling
@ -150,7 +150,7 @@ An alternative approach is to use semantic caching, where you cache responses ba
A semantic cache introduces a dependency on another model on the critical path of your application (e.g., the semantic cache may rely on an [embedding model](/docs/concepts/embedding_models) to convert text to a vector representation), and it's not guaranteed to capture the meaning of the input accurately.
However, there might be situations where caching chat model responses is beneficial. For example, if you have a chat model that is used to answer frequently asked questions, caching responses can help reduce the load on the model provider and improve response times.
However, there might be situations where caching chat model responses is beneficial. For example, if you have a chat model that is used to answer frequently asked questions, caching responses can help reduce the load on the model provider, costs, and improve response times.
Please see the [how to cache chat model responses](/docs/how_to/chat_model_caching/) guide for more details.

View File

@ -29,7 +29,7 @@ loader = CSVLoader(
data = loader.load()
```
or if working with large datasets, you can use the `.lazy_load` method:
When working with large datasets, you can use the `.lazy_load` method:
```python
for document in loader.lazy_load():

View File

@ -6,7 +6,7 @@
The **L**ang**C**hain **E**xpression **L**anguage (LCEL) takes a [declarative](https://en.wikipedia.org/wiki/Declarative_programming) approach to building new [Runnables](/docs/concepts/runnables) from existing Runnables.
This means that you describe what you want to happen, rather than how you want it to happen, allowing LangChain to optimize the run-time execution of the chains.
This means that you describe what *should* happen, rather than *how* it should happen, allowing LangChain to optimize the run-time execution of the chains.
We often refer to a `Runnable` created using LCEL as a "chain". It's important to remember that a "chain" is `Runnable` and it implements the full [Runnable Interface](/docs/concepts/runnables).
@ -20,8 +20,8 @@ We often refer to a `Runnable` created using LCEL as a "chain". It's important t
LangChain optimizes the run-time execution of chains built with LCEL in a number of ways:
- **Optimize parallel execution**: Run Runnables in parallel using [RunnableParallel](#runnableparallel) or run multiple inputs through a given chain in parallel using the [Runnable Batch API](/docs/concepts/runnables/#optimized-parallel-execution-batch). Parallel execution can significantly reduce the latency as processing can be done in parallel instead of sequentially.
- **Guarantee Async support**: Any chain built with LCEL can be run asynchronously using the [Runnable Async API](/docs/concepts/runnables/#asynchronous-support). This can be useful when running chains in a server environment where you want to handle large number of requests concurrently.
- **Optimized parallel execution**: Run Runnables in parallel using [RunnableParallel](#runnableparallel) or run multiple inputs through a given chain in parallel using the [Runnable Batch API](/docs/concepts/runnables/#optimized-parallel-execution-batch). Parallel execution can significantly reduce the latency as processing can be done in parallel instead of sequentially.
- **Guaranteed Async support**: Any chain built with LCEL can be run asynchronously using the [Runnable Async API](/docs/concepts/runnables/#asynchronous-support). This can be useful when running chains in a server environment where you want to handle large number of requests concurrently.
- **Simplify streaming**: LCEL chains can be streamed, allowing for incremental output as the chain is executed. LangChain can optimize the streaming of the output to minimize the time-to-first-token(time elapsed until the first chunk of output from a [chat model](/docs/concepts/chat_models) or [llm](/docs/concepts/text_llms) comes out).
Other benefits include:
@ -38,7 +38,7 @@ LCEL is an [orchestration solution](https://en.wikipedia.org/wiki/Orchestration_
While we have seen users run chains with hundreds of steps in production, we generally recommend using LCEL for simpler orchestration tasks. When the application requires complex state management, branching, cycles or multiple agents, we recommend that users take advantage of [LangGraph](/docs/concepts/architecture#langgraph).
In LangGraph, users define graphs that specify the flow of the application. This allows users to keep using LCEL within individual nodes when LCEL is needed, while making it easy to define complex orchestration logic that is more readable and maintainable.
In LangGraph, users define graphs that specify the application's flow. This allows users to keep using LCEL within individual nodes when LCEL is needed, while making it easy to define complex orchestration logic that is more readable and maintainable.
Here are some guidelines:

View File

@ -8,7 +8,7 @@
Messages are the unit of communication in [chat models](/docs/concepts/chat_models). They are used to represent the input and output of a chat model, as well as any additional context or metadata that may be associated with a conversation.
Each message has a **role** (e.g., "user", "assistant"), **content** (e.g., text, multimodal data), and additional metadata that can vary depending on the chat model provider.
Each message has a **role** (e.g., "user", "assistant") and **content** (e.g., text, multimodal data) with additional metadata that varies depending on the chat model provider.
LangChain provides a unified message format that can be used across chat models, allowing users to work with different chat models without worrying about the specific details of the message format used by each model provider.
@ -39,6 +39,7 @@ The content of a message text or a list of dictionaries representing [multimodal
Currently, most chat models support text as the primary content type, with some models also supporting multimodal data. However, support for multimodal data is still limited across most chat model providers.
For more information see:
* [SystemMessage](#systemmessage) -- for content which should be passed to direct the conversation
* [HumanMessage](#humanmessage) -- for content in the input from the user.
* [AIMessage](#aimessage) -- for content in the response from the model.
* [Multimodality](/docs/concepts/multimodality) -- for more information on multimodal content.

View File

@ -27,7 +27,7 @@ These systems accommodate various data formats:
- Unstructured text (e.g., documents) is often stored in vector stores or lexical search indexes.
- Structured data is typically housed in relational or graph databases with defined schemas.
Despite this diversity in data formats, modern AI applications increasingly aim to make all types of data accessible through natural language interfaces.
Despite the growing diversity in data formats, modern AI applications increasingly aim to make all types of data accessible through natural language interfaces.
Models play a crucial role in this process by translating natural language queries into formats compatible with the underlying search index or database.
This translation enables more intuitive and flexible interactions with complex data structures.
@ -41,7 +41,7 @@ This translation enables more intuitive and flexible interactions with complex d
## Query analysis
While users typically prefer to interact with retrieval systems using natural language, retrieval systems can specific query syntax or benefit from particular keywords.
While users typically prefer to interact with retrieval systems using natural language, these systems may require specific query syntax or benefit from certain keywords.
Query analysis serves as a bridge between raw user input and optimized search queries. Some common applications of query analysis include:
1. **Query Re-writing**: Queries can be re-written or expanded to improve semantic or lexical searches.

View File

@ -1,6 +1,6 @@
# Runnable interface
The Runnable interface is foundational for working with LangChain components, and it's implemented across many of them, such as [language models](/docs/concepts/chat_models), [output parsers](/docs/concepts/output_parsers), [retrievers](/docs/concepts/retrievers), [compiled LangGraph graphs](
The Runnable interface is the foundation for working with LangChain components, and it's implemented across many of them, such as [language models](/docs/concepts/chat_models), [output parsers](/docs/concepts/output_parsers), [retrievers](/docs/concepts/retrievers), [compiled LangGraph graphs](
https://langchain-ai.github.io/langgraph/concepts/low_level/#compiling-your-graph) and more.
This guide covers the main concepts and methods of the Runnable interface, which allows developers to interact with various LangChain components in a consistent and predictable manner.
@ -42,7 +42,7 @@ Some Runnables may provide their own implementations of `batch` and `batch_as_co
rely on a `batch` API provided by a model provider).
:::note
The async versions of `abatch` and `abatch_as_completed` these rely on asyncio's [gather](https://docs.python.org/3/library/asyncio-task.html#asyncio.gather) and [as_completed](https://docs.python.org/3/library/asyncio-task.html#asyncio.as_completed) functions to run the `ainvoke` method in parallel.
The async versions of `abatch` and `abatch_as_completed` relies on asyncio's [gather](https://docs.python.org/3/library/asyncio-task.html#asyncio.gather) and [as_completed](https://docs.python.org/3/library/asyncio-task.html#asyncio.as_completed) functions to run the `ainvoke` method in parallel.
:::
:::tip
@ -58,7 +58,7 @@ Runnables expose an asynchronous API, allowing them to be called using the `awai
Please refer to the [Async Programming with LangChain](/docs/concepts/async) guide for more details.
## Streaming apis
## Streaming APIs
<span data-heading-keywords="streaming-api"></span>
Streaming is critical in making applications based on LLMs feel responsive to end-users.
@ -101,7 +101,7 @@ This is an advanced feature that is unnecessary for most users. You should proba
skip this section unless you have a specific need to inspect the schema of a Runnable.
:::
In some advanced uses, you may want to programmatically **inspect** the Runnable and determine what input and output types the Runnable expects and produces.
In more advanced use cases, you may want to programmatically **inspect** the Runnable and determine what input and output types the Runnable expects and produces.
The Runnable interface provides methods to get the [JSON Schema](https://json-schema.org/) of the input and output types of a Runnable, as well as [Pydantic schemas](https://docs.pydantic.dev/latest/) for the input and output types.

View File

@ -119,11 +119,11 @@ json_object = json.loads(ai_msg.content)
There are a few challenges when producing structured output with the above methods:
(1) If using tool calling, tool call arguments needs to be parsed from a dictionary back to the original schema.
(1) When tool calling is used, tool call arguments needs to be parsed from a dictionary back to the original schema.
(2) In addition, the model needs to be instructed to *always* use the tool when we want to enforce structured output, which is a provider specific setting.
(3) If using JSON mode, the output needs to be parsed into a JSON object.
(3) When JSON mode is used, the output needs to be parsed into a JSON object.
With these challenges in mind, LangChain provides a helper function (`with_structured_output()`) to streamline the process.

View File

@ -6,7 +6,7 @@
## Overview
The **tool** abstraction in LangChain associates a python **function** with a **schema** that defines the function's **name**, **description** and **input**.
The **tool** abstraction in LangChain associates a Python **function** with a **schema** that defines the function's **name**, **description** and **expected arguments**.
**Tools** can be passed to [chat models](/docs/concepts/chat_models) that support [tool calling](/docs/concepts/tool_calling) allowing the model to request the execution of a specific function with specific inputs.
@ -14,7 +14,7 @@ The **tool** abstraction in LangChain associates a python **function** with a **
- Tools are a way to encapsulate a function and its schema in a way that can be passed to a chat model.
- Create tools using the [@tool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.convert.tool.html) decorator, which simplifies the process of tool creation, supporting the following:
- Automatically infer the tool's **name**, **description** and **inputs**, while also supporting customization.
- Automatically infer the tool's **name**, **description** and **expected arguments**, while also supporting customization.
- Defining tools that return **artifacts** (e.g. images, dataframes, etc.)
- Hiding input arguments from the schema (and hence from the model) using **injected tool arguments**.

View File

@ -1,9 +1,9 @@
# Why langchain?
# Why LangChain?
The goal of `langchain` the Python package and LangChain the company is to make it as easy possible for developers to build applications that reason.
The goal of `langchain` the Python package and LangChain the company is to make it as easy as possible for developers to build applications that reason.
While LangChain originally started as a single open source package, it has evolved into a company and a whole ecosystem.
This page will talk about the LangChain ecosystem as a whole.
Most of the components within in the LangChain ecosystem can be used by themselves - so if you feel particularly drawn to certain components but not others, that is totally fine! Pick and choose whichever components you like best.
Most of the components within the LangChain ecosystem can be used by themselves - so if you feel particularly drawn to certain components but not others, that is totally fine! Pick and choose whichever components you like best for your own use case!
## Features
@ -17,8 +17,8 @@ LangChain exposes a standard interface for key components, making it easy to swi
[Orchestration](https://en.wikipedia.org/wiki/Orchestration_(computing)) is crucial for building such applications.
3. **Observability and evaluation:** As applications become more complex, it becomes increasingly difficult to understand what is happening within them.
Furthermore, the pace of development can become rate-limited by the [paradox of choice](https://en.wikipedia.org/wiki/Paradox_of_choice):
for example, developers often wonder how to engineer their prompt or which LLM best balances accuracy, latency, and cost.
Furthermore, the pace of development can become rate-limited by the [paradox of choice](https://en.wikipedia.org/wiki/Paradox_of_choice).
For example, developers often wonder how to engineer their prompt or which LLM best balances accuracy, latency, and cost.
[Observability](https://en.wikipedia.org/wiki/Observability) and evaluations can help developers monitor their applications and rapidly answer these types of questions with confidence.
@ -72,11 +72,11 @@ There are several common characteristics of LLM applications that this orchestra
* **[Persistence](https://langchain-ai.github.io/langgraph/concepts/persistence/):** The application needs to maintain [short-term and / or long-term memory](https://langchain-ai.github.io/langgraph/concepts/memory/).
* **[Human-in-the-loop](https://langchain-ai.github.io/langgraph/concepts/human_in_the_loop/):** The application needs human interaction, e.g., pausing, reviewing, editing, approving certain steps.
The recommended way to do orchestration for these complex applications is [LangGraph](https://langchain-ai.github.io/langgraph/concepts/high_level/).
The recommended way to orchestrate components for complex applications is [LangGraph](https://langchain-ai.github.io/langgraph/concepts/high_level/).
LangGraph is a library that gives developers a high degree of control by expressing the flow of the application as a set of nodes and edges.
LangGraph comes with built-in support for [persistence](https://langchain-ai.github.io/langgraph/concepts/persistence/), [human-in-the-loop](https://langchain-ai.github.io/langgraph/concepts/human_in_the_loop/), [memory](https://langchain-ai.github.io/langgraph/concepts/memory/), and other features.
It's particularly well suited for building [agents](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/) or [multi-agent](https://langchain-ai.github.io/langgraph/concepts/multi_agent/) applications.
Importantly, individual LangChain components can be used within LangGraph nodes, but you can also use LangGraph **without** using LangChain components.
It's particularly well suited for building [agents](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/) or [multi-agent](https://langchain-ai.github.io/langgraph/concepts/multi_agent/) applications.
Importantly, individual LangChain components can be used as LangGraph nodes, but you can also use LangGraph **without** using LangChain components.
:::info[Further reading]