diff --git a/docs/docs_skeleton/static/img/summarization_use_case_3.png b/docs/docs_skeleton/static/img/summarization_use_case_3.png new file mode 100644 index 0000000000..4296919237 Binary files /dev/null and b/docs/docs_skeleton/static/img/summarization_use_case_3.png differ diff --git a/docs/extras/use_cases/summarization.ipynb b/docs/extras/use_cases/summarization.ipynb index 95d1a9839a..cdc9c331c9 100644 --- a/docs/extras/use_cases/summarization.ipynb +++ b/docs/extras/use_cases/summarization.ipynb @@ -11,8 +11,6 @@ "\n", "## Use case\n", "\n", - "--- \n", - "\n", "Suppose you have a set of documents (PDFs, Notion pages, customer questions, etc.) and you want to summarize the content. \n", "\n", "LLMs are a great tool for this given their proficiency in understanding and synthesizing text.\n", @@ -37,8 +35,6 @@ "source": [ "## Overview\n", "\n", - "--- \n", - "\n", "A central question for building a summarizer is how to pass your documents into the LLM's context window. Two common approaches for this are:\n", "\n", "1. `Stuff`: Simply \"stuff\" all your documents into a single prompt. This is the simplest approach (see [here](/docs/modules/chains/document/stuff) for more on the `StuffDocumentsChain`, which is used for this method).\n", @@ -61,8 +57,6 @@ "source": [ "## Quickstart\n", "\n", - "--- \n", - "\n", "To give you a sneak preview, either pipeline can be wrapped in a single object: `load_summarize_chain`. \n", "\n", "Suppose we want to summarize a blog post. We can create this in a few lines of code.\n", @@ -136,8 +130,6 @@ "source": [ "## Option 1. Stuff\n", "\n", - "--- \n", - "\n", "When we use `load_summarize_chain` with `chain_type=\"stuff\"`, we will use the [StuffDocumentsChain](/docs/modules/chains/document/stuff).\n", "\n", "The chain will take a list of documents, insert them all into a prompt, and pass that prompt to an LLM:" ] }, @@ -201,8 +193,6 @@ "source": [ "## Option 2. 
Map-Reduce\n", "\n", - "---\n", - "\n", "Let's unpack the map-reduce approach. For this, we'll first map each document to an individual summary using an `LLMChain`. Then we'll use a `ReduceDocumentsChain` to combine those summaries into a single global summary.\n", " \n", "First, we specify the `LLMChain` to use for mapping each document to an individual summary:" ] }, @@ -361,7 +351,11 @@ "\n", "**Real-world use case**\n", "\n", - "* See [this blog post](https://blog.langchain.dev/llms-to-improve-documentation/) case-study on analyzing user interactions (questions about LangChain documentation)! " + "* See [this blog post](https://blog.langchain.dev/llms-to-improve-documentation/), a case study on analyzing user interactions (questions about LangChain documentation).\n", + "* The blog post and its associated [repo](https://github.com/mendableai/QA_clustering) also introduce clustering as a means of summarization.\n", + "* This opens up a third path, beyond the `stuff` and `map-reduce` approaches, that is worth considering.\n", + "\n", + "![Image description](/img/summarization_use_case_3.png)" ] }, { @@ -370,8 +364,6 @@ "metadata": {}, "source": [ "## Option 3. Refine", "\n", - "--- \n", " \n", "[Refine](/docs/modules/chains/document/refine) is similar to map-reduce:\n", "\n", @@ -497,9 +489,9 @@ ], "metadata": { "kernelspec": { - "display_name": "venv", + "display_name": "Python 3 (ipykernel)", "language": "python", - "name": "venv" + "name": "python3" }, "language_info": { "codemirror_mode": { @@ -511,7 +503,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.1" + "version": "3.10.9" } }, "nbformat": 4,