Update QA doc w/ Runnables (#11401)

commit 211a74941a (parent 5a1f614175)
Author: Lance Martin, committed by GitHub

(Binary image file added, 103 KiB; not shown in the diff.)

@@ -10,9 +10,13 @@
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/use_cases/question_answering/qa.ipynb)\n",
"\n",
"## Use case\n",
"Suppose you have some text documents (PDF, blog, Notion pages, etc.) and want to ask questions related to the contents of those documents. LLMs, given their proficiency in understanding text, are a great tool for this.\n",
"Suppose you have some text documents (PDF, blog, Notion pages, etc.) and want to ask questions related to the contents of those documents. \n",
"\n",
"In this walkthrough we'll go over how to build a question-answering over documents application using LLMs. Two very related use cases which we cover elsewhere are:\n",
"LLMs, given their proficiency in understanding text, are a great tool for this.\n",
"\n",
"In this walkthrough we'll go over how to build a question-answering over documents application using LLMs. \n",
"\n",
"Two very related use cases which we cover elsewhere are:\n",
"- [QA over structured data](/docs/use_cases/qa_structured/sql) (e.g., SQL)\n",
"- [QA over code](/docs/use_cases/code_understanding) (e.g., Python)\n",
"\n",
@@ -20,19 +24,21 @@
"\n",
"## Overview\n",
"The pipeline for converting raw unstructured data into a QA chain looks like this:\n",
"1. `Loading`: First we need to load our data. Unstructured data can be loaded from many sources. Use the [LangChain integration hub](https://integrations.langchain.com/) to browse the full set of loaders.\n",
"Each loader returns data as a LangChain [`Document`](/docs/components/schema/document).\n",
"1. `Loading`: First we need to load our data. Use the [LangChain integration hub](https://integrations.langchain.com/) to browse the full set of loaders. \n",
"2. `Splitting`: [Text splitters](/docs/modules/data_connection/document_transformers/) break `Documents` into splits of specified size\n",
"3. `Storage`: Storage (e.g., often a [vectorstore](/docs/modules/data_connection/vectorstores/)) will house [and often embed](https://www.pinecone.io/learn/vector-embeddings/) the splits\n",
"4. `Retrieval`: The app retrieves splits from storage (e.g., often [with similar embeddings](https://www.pinecone.io/learn/k-nearest-neighbor/) to the input question)\n",
"5. `Generation`: An [LLM](/docs/modules/model_io/models/llms/) produces an answer using a prompt that includes the question and the retrieved data\n",
"6. `Conversation` (Extension): Hold a multi-turn conversation by adding [Memory](/docs/modules/memory/) to your QA chain.\n",
"\n",
"![flow.jpeg](/img/qa_flow.jpeg)\n",
"\n",
"## Quickstart\n",
"\n",
"To give you a sneak preview, the above pipeline can be all be wrapped in a single object: `VectorstoreIndexCreator`. Suppose we want a QA app over this [blog post](https://lilianweng.github.io/posts/2023-06-23-agent/). We can create this in a few lines of code. First set environment variables and install packages:"
"Suppose we want a QA app over this [blog post](https://lilianweng.github.io/posts/2023-06-23-agent/). \n",
"\n",
"We can create this in a few lines of code. \n",
"\n",
"First set environment variables and install packages:"
]
},
{
@@ -42,7 +48,7 @@
"metadata": {},
"outputs": [],
"source": [
"pip install openai chromadb\n",
"pip install langchain openai chromadb langchainhub\n",
"\n",
"# Set env var OPENAI_API_KEY or load from a .env file\n",
"# import dotenv\n",
@@ -53,44 +59,118 @@
{
"cell_type": "code",
"execution_count": 1,
"id": "046cefc0",
"id": "820244ae-74b4-4593-b392-822979dd91b8",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import WebBaseLoader\n",
"from langchain.indexes import VectorstoreIndexCreator\n",
"# Load documents\n",
"\n",
"loader = WebBaseLoader(\"https://lilianweng.github.io/posts/2023-06-23-agent/\")\n",
"index = VectorstoreIndexCreator().from_loaders([loader])"
"from langchain.document_loaders import WebBaseLoader\n",
"loader = WebBaseLoader(\"https://lilianweng.github.io/posts/2023-06-23-agent/\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "f4bf8740",
"id": "c89a0aa7-1e7e-4557-90e5-a7ea87db00e7",
"metadata": {},
"outputs": [],
"source": [
"# Split documents\n",
"\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 0)\n",
"splits = text_splitter.split_documents(loader.load())"
]
},
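{
"cell_type": "markdown",
"id": "b8a2e1d0-3c4f-4e5a-9b6c-7d8e9f0a1b2c",
"metadata": {},
"source": [
"As a quick sanity check (a minimal sketch, assuming the `splits` computed above), we can count the splits and peek at the first one:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b8a2e1d0-3c4f-4e5a-9b6c-7d8e9f0a1b2d",
"metadata": {},
"outputs": [],
"source": [
"# Inspect the splits (sketch)\n",
"print(len(splits))\n",
"print(splits[0].page_content[:100])"
]
},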
{
"cell_type": "code",
"execution_count": 5,
"id": "000e46f6-dafc-4a43-8417-463d0614fd30",
"metadata": {},
"outputs": [],
"source": [
"# Embed and store splits\n",
"\n",
"from langchain.vectorstores import Chroma\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"vectorstore = Chroma.from_documents(documents=splits,embedding=OpenAIEmbeddings())\n",
"retriever = vectorstore.as_retriever()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "dacbde0b-7d45-4a2c-931d-81bb094aec94",
"metadata": {},
"outputs": [],
"source": [
"# Prompt \n",
"# https://smith.langchain.com/hub/rlm/rag-prompt\n",
"\n",
"from langchain import hub\n",
"rag_prompt = hub.pull(\"rlm/rag-prompt\")"
]
},
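{
"cell_type": "markdown",
"id": "c9d0e1f2-4a5b-4c6d-8e7f-1a2b3c4d5e01",
"metadata": {},
"source": [
"We can inspect what we pulled (a minimal sketch; `rlm/rag-prompt` is a chat prompt template, so it exposes its messages):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c9d0e1f2-4a5b-4c6d-8e7f-1a2b3c4d5e02",
"metadata": {},
"outputs": [],
"source": [
"# Inspect the prompt pulled from the hub (sketch)\n",
"print(rag_prompt.messages)"
]
},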
{
"cell_type": "code",
"execution_count": 7,
"id": "79b9fdae-c2bf-4cf6-884f-c19aa07dd975",
"metadata": {},
"outputs": [],
"source": [
"# LLM\n",
"\n",
"from langchain.chat_models import ChatOpenAI\n",
"llm = ChatOpenAI(model_name=\"gpt-3.5-turbo\", temperature=0)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "92c0f3ae-6ab2-4d04-9b22-1963b96b9db5",
"metadata": {},
"outputs": [],
"source": [
"# RAG chain \n",
"\n",
"from langchain.schema.runnable import RunnablePassthrough\n",
"rag_chain = (\n",
" {\"context\": retriever, \"question\": RunnablePassthrough()} \n",
" | rag_prompt \n",
" | llm \n",
")"
]
},
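{
"cell_type": "markdown",
"id": "d1e2f3a4-5b6c-4d7e-9f8a-2b3c4d5e6f01",
"metadata": {},
"source": [
"In the first step of the chain, the input question is sent to both entries of the dict: the `retriever` fetches relevant splits for `context`, while `RunnablePassthrough` forwards the question unchanged. A minimal sketch of the passthrough behavior:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d1e2f3a4-5b6c-4d7e-9f8a-2b3c4d5e6f02",
"metadata": {},
"outputs": [],
"source": [
"# RunnablePassthrough simply forwards its input (sketch)\n",
"from langchain.schema.runnable import RunnablePassthrough\n",
"RunnablePassthrough().invoke(\"What is Task Decomposition?\")  # returns the input unchanged"
]
},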
{
"cell_type": "code",
"execution_count": 10,
"id": "0d3b0f36-7b56-49c0-8e40-a1aa9ebcbf24",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It can be done using LLM with simple prompting, task-specific instructions, or with human inputs. Tree of Thoughts (Yao et al. 2023) is an extension of Chain of Thought (Wei et al. 2022) which explores multiple reasoning possibilities at each step.'"
"AIMessage(content='Task decomposition is the process of breaking down a task into smaller subgoals or steps. It can be done using simple prompting, task-specific instructions, or human inputs.')"
]
},
"execution_count": 3,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"index.query(\"What is Task Decomposition?\")"
"rag_chain.invoke(\"What is Task Decomposition?\")"
]
},
{
"cell_type": "markdown",
"id": "8224aad6",
"id": "639dc31a-7f16-40f6-ba2a-20e7c2ecfe60",
"metadata": {},
"source": [
"Ok, but what's going on under the hood, and how could we customize this for our specific use case? For that, let's take a look at how we can construct this pipeline piece by piece."
"[Here](https://smith.langchain.com/public/2270a675-74de-47ac-b111-b232d8340a64/r) is the LangSmith trace for this chain.\n",
"\n",
"Below we will explain each step in more detail."
]
},
{
@@ -100,7 +180,9 @@
"source": [
"## Step 1. Load\n",
"\n",
"Specify a `DocumentLoader` to load in your unstructured data as `Documents`. A `Document` is a piece of text (the `page_content`) and associated metadata."
"Specify a `DocumentLoader` to load in your unstructured data as `Documents`. \n",
"\n",
"A `Document` is a dict with text (`page_content`) and `metadata`."
]
},
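{
"cell_type": "markdown",
"id": "e2f3a4b5-6c7d-4e8f-9a0b-3c4d5e6f7a01",
"metadata": {},
"source": [
"For example (a minimal sketch reusing the `loader` from the Quickstart), we can load the blog post and inspect both fields:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e2f3a4b5-6c7d-4e8f-9a0b-3c4d5e6f7a02",
"metadata": {},
"outputs": [],
"source": [
"# Load and inspect a Document (sketch)\n",
"docs = loader.load()\n",
"print(docs[0].page_content[:100])\n",
"print(docs[0].metadata)"
]
},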
{
@@ -122,7 +204,7 @@
"metadata": {},
"source": [
"### Go deeper\n",
"- Browse the > 120 data loader integrations [here](https://integrations.langchain.com/).\n",
"- Browse the > 160 data loader integrations [here](https://integrations.langchain.com/).\n",
"- See further documentation on loaders [here](/docs/modules/data_connection/document_loaders/).\n",
"\n",
"## Step 2. Split\n",
@@ -150,7 +232,7 @@
"source": [
"### Go deeper\n",
"\n",
"- `DocumentSplitters` are just one type of the more generic `DocumentTransformers`, which can all be useful in this preprocessing step.\n",
"- `DocumentSplitters` are just one type of the more generic `DocumentTransformers`.\n",
"- See further documentation on transformers [here](/docs/modules/data_connection/document_transformers/).\n",
"- `Context-aware splitters` keep the location (\"context\") of each split in the original `Document`:\n",
" - [Markdown files](/docs/use_cases/question_answering/how_to/document-context-aware-QA)\n",
@@ -160,7 +242,10 @@
"## Step 3. Store\n",
"\n",
"To be able to look up our document splits, we first need to store them where we can later look them up.\n",
"The most common way to do this is to embed the contents of each document then store the embedding and document in a vector store, with the embedding being used to index the document."
"\n",
"The most common way to do this is to embed the contents of each document split.\n",
"\n",
"We store the embedding and splits in a vectorstore."
]
},
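{
"cell_type": "markdown",
"id": "f3a4b5c6-7d8e-4f9a-8b1c-4d5e6f7a8b01",
"metadata": {},
"source": [
"To make \"embed\" concrete (a minimal sketch, assuming `OpenAIEmbeddings` as in the Quickstart): each split is mapped to a fixed-length vector of floats, and these vectors are what the vectorstore indexes:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f3a4b5c6-7d8e-4f9a-8b1c-4d5e6f7a8b02",
"metadata": {},
"outputs": [],
"source": [
"# Embed a single string to see what gets stored (sketch)\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"vector = OpenAIEmbeddings().embed_query(\"Task decomposition\")\n",
"print(len(vector))  # one fixed-length vector, e.g., 1536 floats for OpenAI embeddings"
]
},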
{
@@ -193,7 +278,9 @@
"\n",
"## Step 4. Retrieve\n",
"\n",
"Retrieve relevant splits for any question using [similarity search](https://www.pinecone.io/learn/what-is-similarity-search/)."
"Retrieve relevant splits for any question using [similarity search](https://www.pinecone.io/learn/what-is-similarity-search/).\n",
"\n",
"This is simply \"top K\" retrieval where we select documents based on embedding similarity to the query."
]
},
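{
"cell_type": "markdown",
"id": "a4b5c6d7-8e9f-4a0b-9c2d-5e6f7a8b9c01",
"metadata": {},
"source": [
"A minimal sketch of top-K retrieval run directly against the `vectorstore` built in Step 3:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a4b5c6d7-8e9f-4a0b-9c2d-5e6f7a8b9c02",
"metadata": {},
"outputs": [],
"source": [
"# Top-K similarity search (sketch)\n",
"question = \"What are the approaches to Task Decomposition?\"\n",
"docs = vectorstore.similarity_search(question, k=4)\n",
"len(docs)"
]
},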
{
@@ -228,7 +315,9 @@
"\n",
"Vectorstores are commonly used for retrieval, but they are not the only option. For example, SVMs (see thread [here](https://twitter.com/karpathy/status/1647025230546886658?s=20)) can also be used.\n",
"\n",
"LangChain [has many retrievers](/docs/modules/data_connection/retrievers/) including, but not limited to, vectorstores. All retrievers implement a common method `get_relevant_documents()` (and its asynchronous variant `aget_relevant_documents()`)."
"LangChain [has many retrievers](/docs/modules/data_connection/retrievers/) including, but not limited to, vectorstores. \n",
"\n",
"All retrievers implement a common method `get_relevant_documents()` (and its asynchronous variant `aget_relevant_documents()`)."
]
},
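{
"cell_type": "markdown",
"id": "b5c6d7e8-9f0a-4b1c-8d3e-6f7a8b9c0d01",
"metadata": {},
"source": [
"For example (a minimal sketch, using the `retriever` from the Quickstart):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b5c6d7e8-9f0a-4b1c-8d3e-6f7a8b9c0d02",
"metadata": {},
"outputs": [],
"source": [
"# The common retriever interface (sketch)\n",
"relevant_docs = retriever.get_relevant_documents(\"What is Task Decomposition?\")\n",
"len(relevant_docs)"
]
},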
{
@@ -275,7 +364,6 @@
"outputs": [],
"source": [
"import logging\n",
"\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.retrievers.multi_query import MultiQueryRetriever\n",
"\n",
@@ -288,6 +376,20 @@
"len(unique_docs)"
]
},
{
"cell_type": "markdown",
"id": "ee8420e6-73a6-411b-a84d-74b096bddad7",
"metadata": {},
"source": [
"In addition, a useful concept for improving retrieval is decoupling the documents from the embedded search key.\n",
"\n",
"For example, we can embed a document summary or question that are likely to lead to the document being retrieved.\n",
"\n",
"See details in [here](docs/modules/data_connection/retrievers/multi_vector) on the multi-vector retriever for this purpose.\n",
"\n",
"![mv.png](/img/multi_vector.png)"
]
},
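{
"cell_type": "markdown",
"id": "c6d7e8f9-0a1b-4c2d-9e4f-7a8b9c0d1e01",
"metadata": {},
"source": [
"A minimal sketch of the idea (assumptions: the `splits`, `Chroma`, and `OpenAIEmbeddings` from above; the truncated page contents here stand in for LLM-generated summaries):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c6d7e8f9-0a1b-4c2d-9e4f-7a8b9c0d1e02",
"metadata": {},
"outputs": [],
"source": [
"# Multi-vector retrieval: embed a proxy (e.g., a summary), return the parent split (sketch)\n",
"import uuid\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain.schema import Document\n",
"\n",
"id_key = \"doc_id\"\n",
"doc_ids = [str(uuid.uuid4()) for _ in splits]\n",
"# Stand-in \"summaries\": truncated text; in practice, generate these with an LLM\n",
"summaries = [\n",
"    Document(page_content=s.page_content[:200], metadata={id_key: doc_ids[i]})\n",
"    for i, s in enumerate(splits)\n",
"]\n",
"\n",
"mv_vectorstore = Chroma(collection_name=\"summaries\", embedding_function=OpenAIEmbeddings())\n",
"mv_retriever = MultiVectorRetriever(vectorstore=mv_vectorstore, docstore=InMemoryStore(), id_key=id_key)\n",
"mv_retriever.vectorstore.add_documents(summaries)\n",
"mv_retriever.docstore.mset(list(zip(doc_ids, splits)))\n",
"\n",
"# The search keys are the summaries, but the full splits come back\n",
"mv_retriever.get_relevant_documents(\"What is Task Decomposition?\")"
]
},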
{
"cell_type": "markdown",
"id": "415d6824",
@@ -295,34 +397,44 @@
"source": [
"## Step 5. Generate\n",
"\n",
"Distill the retrieved documents into an answer using an LLM/Chat model (e.g., `gpt-3.5-turbo`) with `RetrievalQA` chain.\n"
"Distill the retrieved documents into an answer using an LLM/Chat model (e.g., `gpt-3.5-turbo`).\n",
"\n",
"We use the [Runnable](https://python.langchain.com/docs/expression_language/interface) protocol to define the chain.\n",
"\n",
"Runnable protocol pipes together components in a transparent way.\n",
"\n",
"We used a prompt for RAG that is checked into the LangChain prompt hub ([here](https://smith.langchain.com/hub/rlm/rag-prompt))."
]
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 11,
"id": "99fa1aec",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'query': 'What are the approaches to Task Decomposition?',\n",
" 'result': 'The approaches to task decomposition include:\\n\\n1. Simple prompting: This approach involves using simple prompts or questions to guide the agent in breaking down a task into smaller subgoals. For example, the agent can be prompted with \"Steps for XYZ\" or \"What are the subgoals for achieving XYZ?\" to facilitate task decomposition.\\n\\n2. Task-specific instructions: In this approach, task-specific instructions are provided to the agent to guide the decomposition process. For example, if the task is to write a novel, the agent can be instructed to \"Write a story outline\" as a step in the task decomposition.\\n\\n3. Human inputs: This approach involves incorporating human inputs in the task decomposition process. Humans can provide guidance, feedback, and assistance to the agent in breaking down complex tasks into manageable subgoals.\\n\\nThese approaches aim to enable efficient handling of complex tasks by breaking them down into smaller, more manageable subgoals.'}"
"AIMessage(content='Task decomposition is the process of breaking down a task into smaller subgoals or steps. It can be done using simple prompting, task-specific instructions, or human inputs.')"
]
},
"execution_count": 9,
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chains import RetrievalQA\n",
"from langchain.chat_models import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model_name=\"gpt-3.5-turbo\", temperature=0)\n",
"qa_chain = RetrievalQA.from_chain_type(llm,retriever=vectorstore.as_retriever())\n",
"qa_chain({\"query\": question})"
"\n",
"from langchain.schema.runnable import RunnablePassthrough\n",
"rag_chain = (\n",
" {\"context\": retriever, \"question\": RunnablePassthrough()} \n",
" | rag_prompt \n",
" | llm \n",
")\n",
"\n",
"rag_chain.invoke(\"What is Task Decomposition?\")"
]
},
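{
"cell_type": "markdown",
"id": "d7e8f9a0-1b2c-4d3e-8f5a-8b9c0d1e2f01",
"metadata": {},
"source": [
"The chain returns an `AIMessage`. If we want a plain string, we can append an output parser (a minimal sketch; `StrOutputParser` is not part of the cell above):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d7e8f9a0-1b2c-4d3e-8f5a-8b9c0d1e2f02",
"metadata": {},
"outputs": [],
"source": [
"# Append an output parser to get a string instead of an AIMessage (sketch)\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"\n",
"rag_chain_str = (\n",
"    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
"    | rag_prompt\n",
"    | llm\n",
"    | StrOutputParser()\n",
")\n",
"rag_chain_str.invoke(\"What is Task Decomposition?\")"
]
},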
{
@@ -330,12 +442,10 @@
"id": "f7d52c84",
"metadata": {},
"source": [
"Note, you can pass in an `LLM` or a `ChatModel` (like we did here) to the `RetrievalQA` chain.\n",
"\n",
"### Go deeper\n",
"\n",
"#### Choosing LLMs\n",
"- Browse the > 55 LLM and chat model integrations [here](https://integrations.langchain.com/).\n",
"- Browse the > 90 LLM and chat model integrations [here](https://integrations.langchain.com/).\n",
"- See further documentation on LLMs and chat models [here](/docs/modules/model_io/models/).\n",
"- See a guide on local LLMS [here](/docs/modules/use_cases/question_answering/how_to/local_retrieval_qa)."
]
@@ -347,28 +457,29 @@
"source": [
"#### Customizing the prompt\n",
"\n",
"The prompt in `RetrievalQA` chain can be easily customized."
"As shown above, we can load prompts (e.g., [this RAG prompt](https://smith.langchain.com/hub/rlm/rag-prompt)) from the prompt hub.\n",
"\n",
"The prompt can also be easily customized, as shown below."
]
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 12,
"id": "e4fee704",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The approaches to Task Decomposition are (1) using simple prompting by LLM, (2) using task-specific instructions, and (3) incorporating human inputs. Thanks for asking!'"
"AIMessage(content='Task decomposition is the process of breaking down a complicated task into smaller, more manageable subtasks or steps. It can be done using prompts, task-specific instructions, or human inputs. Thanks for asking!')"
]
},
"execution_count": 10,
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chains import RetrievalQA\n",
"from langchain.prompts import PromptTemplate\n",
"\n",
"template = \"\"\"Use the following pieces of context to answer the question at the end. \n",
@@ -378,229 +489,23 @@
"{context}\n",
"Question: {question}\n",
"Helpful Answer:\"\"\"\n",
"QA_CHAIN_PROMPT = PromptTemplate.from_template(template)\n",
"rag_prompt_custom = PromptTemplate.from_template(template)\n",
"\n",
"llm = ChatOpenAI(model_name=\"gpt-3.5-turbo\", temperature=0)\n",
"qa_chain = RetrievalQA.from_chain_type(\n",
" llm,\n",
" retriever=vectorstore.as_retriever(),\n",
" chain_type_kwargs={\"prompt\": QA_CHAIN_PROMPT}\n",
"rag_chain = (\n",
" {\"context\": retriever, \"question\": RunnablePassthrough()} \n",
" | rag_prompt_custom \n",
" | llm \n",
")\n",
"result = qa_chain({\"query\": question})\n",
"result[\"result\"]"
]
},
{
"cell_type": "markdown",
"id": "c825e9bf-6a56-46e4-8bbb-05441f76cb96",
"metadata": {},
"source": [
"We can also store and fetch prompts from the LangChain prompt hub.\n",
"\n",
"This will work with your [LangSmith API key](https://docs.smith.langchain.com/).\n",
"\n",
"For example, see [here](https://smith.langchain.com/hub/rlm/rag-prompt) is a common prompt for RAG.\n",
"\n",
"We can load this."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a896060f-ebc4-4236-a4ad-32960601c6e8",
"metadata": {},
"outputs": [],
"source": [
"pip install langchainhub"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "aef8e734-ba54-48ae-b959-1898618f2d90",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The approaches to task decomposition include using LLM with simple prompting, task-specific instructions, and human inputs.'"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# RAG prompt\n",
"from langchain import hub\n",
"QA_CHAIN_PROMPT_HUB = hub.pull(\"rlm/rag-prompt\")\n",
"\n",
"qa_chain = RetrievalQA.from_chain_type(\n",
" llm,\n",
" retriever=vectorstore.as_retriever(),\n",
" chain_type_kwargs={\"prompt\": QA_CHAIN_PROMPT_HUB}\n",
")\n",
"result = qa_chain({\"query\": question})\n",
"result[\"result\"]"
]
},
{
"cell_type": "markdown",
"id": "ff40e8db",
"metadata": {
"jp-MarkdownHeadingCollapsed": true
},
"source": [
"#### Return source documents\n",
"\n",
"The full set of retrieved documents used for answer distillation can be returned using `return_source_documents=True`."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "60004293",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4\n"
]
},
{
"data": {
"text/plain": [
"Document(page_content='Task decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.', metadata={'description': 'Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\\nAgent System Overview In a LLM-powered autonomous agent system, LLM functions as the agents brain, complemented by several key components:', 'language': 'en', 'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'title': \"LLM Powered Autonomous Agents | Lil'Log\"})"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chains import RetrievalQA\n",
"\n",
"qa_chain = RetrievalQA.from_chain_type(llm,retriever=vectorstore.as_retriever(),\n",
" return_source_documents=True)\n",
"result = qa_chain({\"query\": question})\n",
"print(len(result['source_documents']))\n",
"result['source_documents'][0]"
]
},
{
"cell_type": "markdown",
"id": "1b600236",
"metadata": {},
"source": [
"#### Return citations\n",
"\n",
"Answer citations can be returned using `RetrievalQAWithSourcesChain`."
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "948f6d19",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'question': 'What are the approaches to Task Decomposition?',\n",
" 'answer': 'The approaches to Task Decomposition include:\\n1. Using LLM with simple prompting, such as providing steps or subgoals for achieving a task.\\n2. Using task-specific instructions, such as providing a specific instruction like \"Write a story outline\" for writing a novel.\\n3. Using human inputs to decompose the task.\\nAnother approach is the Tree of Thoughts, which extends the Chain of Thought (CoT) technique by exploring multiple reasoning possibilities at each step and generating multiple thoughts per step, creating a tree structure. The search process can be BFS or DFS, and each state can be evaluated by a classifier or majority vote.\\nSources: https://lilianweng.github.io/posts/2023-06-23-agent/',\n",
" 'sources': ''}"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chains import RetrievalQAWithSourcesChain\n",
"\n",
"qa_chain = RetrievalQAWithSourcesChain.from_chain_type(llm,retriever=vectorstore.as_retriever())\n",
"\n",
"result = qa_chain({\"question\": question})\n",
"result"
]
},
{
"cell_type": "markdown",
"id": "73d0b138",
"metadata": {},
"source": [
"#### Customizing retrieved document processing\n",
"\n",
"Retrieved documents can be fed to an LLM for answer distillation in a few different ways.\n",
"\n",
"`stuff`, `refine`, `map-reduce`, and `map-rerank` chains for passing documents to an LLM prompt are well summarized [here](/docs/modules/chains/document/).\n",
" \n",
"`stuff` is commonly used because it simply \"stuffs\" all retrieved documents into the prompt.\n",
"\n",
"The [load_qa_chain](/docs/use_cases/question_answering/how_to/question_answering.html) is an easy way to pass documents to an LLM using these various approaches (e.g., see `chain_type`)."
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "29aa139f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'output_text': 'The approaches to task decomposition mentioned in the provided context are:\\n\\n1. Chain of thought (CoT): This approach involves instructing the language model to \"think step by step\" and decompose complex tasks into smaller and simpler steps. It enhances model performance on complex tasks by utilizing more test-time computation.\\n\\n2. Tree of Thoughts: This approach extends CoT by exploring multiple reasoning possibilities at each step. It decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS or DFS, and each state is evaluated by a classifier or majority vote.\\n\\n3. LLM with simple prompting: This approach involves using a language model with simple prompts like \"Steps for XYZ\" or \"What are the subgoals for achieving XYZ?\" to perform task decomposition.\\n\\n4. Task-specific instructions: This approach involves providing task-specific instructions to guide the language model in decomposing the task. For example, providing the instruction \"Write a story outline\" for the task of writing a novel.\\n\\n5. Human inputs: Task decomposition can also be done with human inputs, where humans provide guidance and input to break down the task into smaller subtasks.'}"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chains.question_answering import load_qa_chain\n",
"\n",
"chain = load_qa_chain(llm, chain_type=\"stuff\")\n",
"chain({\"input_documents\": unique_docs, \"question\": question},return_only_outputs=True)"
]
},
{
"cell_type": "markdown",
"id": "a8cb8cd1",
"metadata": {},
"source": [
"We can also pass the `chain_type` to `RetrievalQA`."
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "f68574bd",
"metadata": {},
"outputs": [],
"source": [
"qa_chain = RetrievalQA.from_chain_type(llm,retriever=vectorstore.as_retriever(),\n",
" chain_type=\"stuff\")\n",
"result = qa_chain({\"query\": question})"
"rag_chain.invoke(\"What is Task Decomposition?\")"
]
},
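{
"cell_type": "markdown",
"id": "e8f9a0b1-2c3d-4e4f-9a6b-9c0d1e2f3a01",
"metadata": {},
"source": [
"One more common customization (a minimal sketch, not shown above): the chain passes the retrieved `Document` list straight into the prompt; we can instead join just the page contents with a small formatting function:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e8f9a0b1-2c3d-4e4f-9a6b-9c0d1e2f3a02",
"metadata": {},
"outputs": [],
"source": [
"# Format retrieved Documents into a single context string (sketch)\n",
"def format_docs(docs):\n",
"    return \"\\n\\n\".join(doc.page_content for doc in docs)\n",
"\n",
"rag_chain = (\n",
"    {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
"    | rag_prompt_custom\n",
"    | llm\n",
")\n",
"rag_chain.invoke(\"What is Task Decomposition?\")"
]
},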
{
"cell_type": "markdown",
"id": "b33aeb5f",
"id": "5f5b6297-715a-444e-b3ef-a6d27382b435",
"metadata": {},
"source": [
"In summary, the user can choose the desired level of abstraction for QA:\n",
"\n",
"![summary_chains.png](/img/summary_chains.png)\n",
"\n",
"## Step 6. Chat\n",
"\n",
"See our [use-case on chat](/docs/use_cases/chatbots) for detail on this!"
"We can use [LangSmith](https://smith.langchain.com/public/129cac54-44d5-453a-9807-3bd4835e5f96/r) to see the trace."
]
}
],
@@ -620,7 +525,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.9.16"
}
},
"nbformat": 4,
