From b54727fbad8e27ff93498b218ad7949e8bac05ae Mon Sep 17 00:00:00 2001
From: Nuno Campos
Date: Thu, 12 Oct 2023 17:52:20 +0100
Subject: [PATCH] Nc/why lcel (#11717)

---
 docs/docs/expression_language/index.mdx |  3 +++
 docs/docs/expression_language/why.mdx   | 11 +++++++++++
 2 files changed, 14 insertions(+)
 create mode 100644 docs/docs/expression_language/why.mdx

diff --git a/docs/docs/expression_language/index.mdx b/docs/docs/expression_language/index.mdx
index 94b77aff5c..e4be132ca0 100644
--- a/docs/docs/expression_language/index.mdx
+++ b/docs/docs/expression_language/index.mdx
@@ -31,3 +31,6 @@ How to use core features of LCEL
 
 #### [Cookbook](/docs/expression_language/cookbook)
 Examples of common LCEL usage patterns
+
+#### [Why use LCEL](/docs/expression_language/why)
+A deeper dive into the benefits of LCEL
diff --git a/docs/docs/expression_language/why.mdx b/docs/docs/expression_language/why.mdx
new file mode 100644
index 0000000000..48ada98a76
--- /dev/null
+++ b/docs/docs/expression_language/why.mdx
@@ -0,0 +1,11 @@
+# Why use LCEL?
+
+The LangChain Expression Language was designed from day 1 to **support putting prototypes in production, with no code changes**, from the simplest “prompt + LLM” chain to the most complex chains (we’ve seen folks successfully run LCEL chains with 100s of steps in production). To highlight a few of the reasons you might want to use LCEL:
+
+- first-class support for streaming: when you build your chains with LCEL you get the best possible time-to-first-token (time elapsed until the first chunk of output comes out). For some chains this means that, e.g., we stream tokens straight from an LLM to a streaming output parser, and you get back parsed, incremental chunks of output at the same rate as the LLM provider outputs the raw tokens. We’re constantly improving streaming support; recently we added a [streaming JSON parser](https://twitter.com/LangChainAI/status/1709690468030914584), and more is in the works.
+- first-class async support: any chain built with LCEL can be called both with the synchronous API (e.g. in your Jupyter notebook while prototyping) and with the asynchronous API (e.g. in a [LangServe](https://github.com/langchain-ai/langserve) server). This enables using the same code for prototypes and in production, with great performance and the ability to handle many concurrent requests in the same server.
+- optimised parallel execution: whenever your LCEL chains have steps that can be executed in parallel (e.g. if you fetch documents from multiple retrievers), we automatically do so, both in the sync and the async interfaces, for the smallest possible latency.
+- support for retries and fallbacks: more recently we’ve added support for configuring retries and fallbacks for any part of your LCEL chain. This is a great way to make your chains more reliable at scale. We’re currently working on adding streaming support for retries/fallbacks, so you can get the added reliability without any latency cost.
+- accessing intermediate results: for more complex chains it’s often very useful to access the results of intermediate steps even before the final output is produced. This can be used to let end-users know something is happening, or even just to debug your chain. We’ve added support for [streaming intermediate results](https://x.com/LangChainAI/status/1711806009097044193?s=20), and it’s available on every LangServe server.
+- [input and output schemas](https://x.com/LangChainAI/status/1711805322195861934?s=20): this week we launched input and output schemas for LCEL, giving every LCEL chain Pydantic and JSONSchema schemas inferred from the structure of your chain. These can be used for validation of inputs and outputs, and are an integral part of LangServe.
+- tracing with LangSmith: all chains built with LCEL have first-class tracing support, which can be used to debug your chains or to understand what’s happening in production. To enable this, all you have to do is add your [LangSmith](https://www.langchain.com/langsmith) API key as an environment variable.