mirror of
https://github.com/hwchase17/langchain
synced 2024-11-10 01:10:59 +00:00
docs: update summarization guides (#25408)
This commit is contained in:
parent
32f5147523
commit
09c0823c3a
@ -315,6 +315,15 @@ For a high-level tutorial, check out [this guide](/docs/tutorials/graph/).
|
||||
- [How to: improve results with prompting](/docs/how_to/graph_prompting)
|
||||
- [How to: construct knowledge graphs](/docs/how_to/graph_constructing)
|
||||
|
||||
### Summarization
|
||||
|
||||
LLMs can summarize and otherwise distill desired information from text, including
|
||||
large volumes of text. For a high-level tutorial, check out [this guide](/docs/tutorials/summarization).
|
||||
|
||||
- [How to: summarize text in a single LLM call](/docs/how_to/summarize_stuff)
|
||||
- [How to: summarize text through parallelization](/docs/how_to/summarize_map_reduce)
|
||||
- [How to: summarize text through iterative refinement](/docs/how_to/summarize_refine)
|
||||
|
||||
## [LangGraph](https://langchain-ai.github.io/langgraph)
|
||||
|
||||
LangGraph is an extension of LangChain aimed at
|
||||
|
449
docs/docs/how_to/summarize_map_reduce.ipynb
Normal file
449
docs/docs/how_to/summarize_map_reduce.ipynb
Normal file
File diff suppressed because one or more lines are too long
333
docs/docs/how_to/summarize_refine.ipynb
Normal file
333
docs/docs/how_to/summarize_refine.ipynb
Normal file
File diff suppressed because one or more lines are too long
209
docs/docs/how_to/summarize_stuff.ipynb
Normal file
209
docs/docs/how_to/summarize_stuff.ipynb
Normal file
@ -0,0 +1,209 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c47f5b2f-e14c-43e7-a0ab-d71562636624",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"sidebar_position: 3\n",
|
||||
"keywords: [summarize, summarization, stuff, create_stuff_documents_chain]\n",
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "682a4f53-27db-43ef-a909-dd9ded76051b",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# How to summarize text in a single LLM call\n",
|
||||
"\n",
|
||||
"LLMs can summarize and otherwise distill desired information from text, including large volumes of text. In many cases, especially for models with larger context windows, this can be adequately achieved via a single LLM call.\n",
|
||||
"\n",
|
||||
"LangChain implements a simple [pre-built chain](https://api.python.langchain.com/en/latest/chains/langchain.chains.combine_documents.stuff.create_stuff_documents_chain.html) that \"stuffs\" a prompt with the desired context for summarization and other purposes. In this guide we demonstrate how to use the chain."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4aa52e84-d1b5-4b33-b4c4-541156686ef3",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Load chat model\n",
|
||||
"\n",
|
||||
"Let's first load a chat model:\n",
|
||||
"```{=mdx}\n",
|
||||
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
|
||||
"\n",
|
||||
"<ChatModelTabs\n",
|
||||
" customVarName=\"llm\"\n",
|
||||
"/>\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "e5f426fc-cea6-4351-8931-1e422d3c8b69",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# | output: false\n",
|
||||
"# | echo: false\n",
|
||||
"\n",
|
||||
"from langchain_openai import ChatOpenAI\n",
|
||||
"\n",
|
||||
"llm = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b137fe82-0a53-4910-b53e-b87a297f329d",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Load documents"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a81dc91d-ae72-4996-b809-d4a9050e815e",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Next, we need some documents to summarize. Below, we generate some toy documents for illustrative purposes. See the document loader [how-to guides](/docs/how_to/#document-loaders) and [integration pages](/docs/integrations/document_loaders/) for additional sources of data. The [summarization tutorial](/docs/tutorials/summarization) also includes an example summarizing a blog post."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "27c8fed0-b2d7-4549-a086-f5ee657efc41",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_core.documents import Document\n",
|
||||
"\n",
|
||||
"documents = [\n",
|
||||
" Document(page_content=\"Apples are red\", metadata={\"title\": \"apple_book\"}),\n",
|
||||
" Document(page_content=\"Blueberries are blue\", metadata={\"title\": \"blueberry_book\"}),\n",
|
||||
" Document(page_content=\"Bananas are yelow\", metadata={\"title\": \"banana_book\"}),\n",
|
||||
"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "84216044-6f1e-4b90-b4fa-29ec305abf51",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Load chain\n",
|
||||
"\n",
|
||||
"Below, we define a simple prompt and instantiate the chain with our chat model and documents:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "669afa40-2708-4fa1-841e-c74a67bd9175",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.chains.combine_documents import create_stuff_documents_chain\n",
|
||||
"from langchain_core.prompts import ChatPromptTemplate\n",
|
||||
"\n",
|
||||
"prompt = ChatPromptTemplate.from_template(\"Summarize this content: {context}\")\n",
|
||||
"chain = create_stuff_documents_chain(llm, prompt)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "74f3e276-f003-4112-ba14-c6952076c4f8",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Invoke chain\n",
|
||||
"\n",
|
||||
"Because the chain is a [Runnable](/docs/concepts/#runnable-interface), it implements the usual methods for invocation:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "0701bb7d-fbc6-497e-a577-25d56e6e43c6",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'The content describes the colors of three fruits: apples are red, blueberries are blue, and bananas are yellow.'"
|
||||
]
|
||||
},
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"result = chain.invoke({\"context\": documents})\n",
|
||||
"result"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "14fb5647-1458-43af-afb7-5aae7b8cab1d",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Streaming\n",
|
||||
"\n",
|
||||
"Note that the chain also supports streaming of individual output tokens:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "0d7a5f67-2ec8-4f90-b085-2969fcb14dce",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"|The| content| describes| the| colors| of| three| fruits|:| apples| are| red|,| blueberries| are| blue|,| and| bananas| are| yellow|.||"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"for chunk in chain.stream({\"context\": documents}):\n",
|
||||
" print(chunk, end=\"|\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f15c225a-db1d-48cf-b135-f588e7d615e6",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Next steps\n",
|
||||
"\n",
|
||||
"See the summarization [how-to guides](/docs/how_to/#summarization) for additional summarization strategies, including those designed for larger volumes of text.\n",
|
||||
"\n",
|
||||
"See also [this tutorial](/docs/tutorials/summarization) for more detail on summarization."
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.10.4"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
File diff suppressed because one or more lines are too long
Loading…
Reference in New Issue
Block a user