"Suppose you have a set of documents (PDFs, Notion pages, customer questions, etc.) and you want to summarize the content. \n",
"\n",
"LLMs are a great tool for this given their proficiency in understanding and synthesizing text.\n",
"source": [
"## Overview\n",
"\n",
"A central question for building a summarizer is how to pass your documents into the LLM's context window. Two common approaches for this are:\n",
"\n",
"1. `Stuff`: Simply \"stuff\" all your documents into a single prompt. This is the simplest approach (see [here](/docs/modules/chains/document/stuff) for more on the `StuffDocumentsChain`, which is used for this method).\n",
"source": [
"## Quickstart\n",
"\n",
"To give you a sneak preview, either pipeline can be wrapped in a single object: `load_summarize_chain`. \n",
"\n",
"Suppose we want to summarize a blog post. We can create this in a few lines of code.\n",
"source": [
"## Option 1. Stuff\n",
"\n",
"When we use `load_summarize_chain` with `chain_type=\"stuff\"`, we will use the [StuffDocumentsChain](/docs/modules/chains/document/stuff).\n",
"\n",
"The chain takes a list of documents, inserts them all into a single prompt, and passes that prompt to an LLM:"
"source": [
"## Option 2. Map-Reduce\n",
"\n",
"Let's unpack the map-reduce approach. For this, we'll first map each document to an individual summary using an `LLMChain`. Then we'll use a `ReduceDocumentsChain` to combine those summaries into a single global summary.\n",
" \n",
"First, we specify the `LLMChain` to use for mapping each document to an individual summary:"
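Stripped of the chain machinery, the overall pattern is just a map step followed by a reduce step. A model-free sketch, where `summarize` stands in for an LLM call:

```python
def map_reduce_summarize(docs, summarize):
    """Summarize each document, then summarize the combined summaries."""
    partial_summaries = [summarize(doc) for doc in docs]  # map step (one LLMChain call per doc)
    combined = "\n".join(partial_summaries)
    return summarize(combined)                            # reduce step (ReduceDocumentsChain)

# Toy "summarizer" for illustration: keep only the first sentence.
first_sentence = lambda text: text.split(".")[0] + "."
result = map_reduce_summarize(
    ["Alpha is first. More detail follows.", "Beta is second. Even more detail."],
    first_sentence,
)
```

In the real chain, the reduce step can also recursively collapse the partial summaries in batches when they are themselves too large for one call.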
"\n",
"**Real-world use-case**\n",
"\n",
"* See [this blog post](https://blog.langchain.dev/llms-to-improve-documentation/) for a case study on analyzing user interactions (questions about LangChain documentation)!\n",
"* The blog post and associated [repo](https://github.com/mendableai/QA_clustering) also introduce clustering as a means of summarization.\n",
"* This opens up a third path beyond the `stuff` or `map-reduce` approaches that is worth considering.\n",