mirror of https://github.com/hwchase17/langchain
docs: cleanup rag use case (#15284)
parent 11accf8366
commit 27dca2d92f
@@ -1,3 +1,3 @@
-label: 'QA over structured data'
+label: 'Q&A over structured data'
 collapsed: false
-position: 0
+position: 0.1
@@ -1,2 +0,0 @@
-position: 0.1
-collapsed: true
@@ -0,0 +1,391 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "raw",
|
||||
"id": "023635f2-71cf-43f2-a2e2-a7b4ced30a74",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"sidebar_position: 2\n",
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "86fc5bb2-017f-434e-8cd6-53ab214a5604",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Add chat history\n",
|
||||
"\n",
|
||||
"In many Q&A applications we want to allow the user to have a back-and-forth conversation, meaning the application needs some sort of \"memory\" of past questions and answers, and some logic for incorporating those into its current thinking.\n",
|
||||
"\n",
|
||||
"In this guide we focus on **adding logic for incorporating historical messages, and NOT on chat history management.** Chat history management is [covered here](/docs/expression_language/how_to/message_history).\n",
|
||||
"\n",
|
||||
"We'll work off of the Q&A app we built over the [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) blog post by Lilian Weng in the [Quickstart](/docs/use_cases/question_answering/quickstart). We'll need to update two things about our existing app:\n",
|
||||
"\n",
|
||||
"1. **Prompt**: Update our prompt to support historical messages as an input.\n",
|
||||
"2. **Contextualizing questions**: Add a sub-chain that takes the latest user question and reformulates it in the context of the chat history. This is needed in case the latest question references some context from past messages. For example, if a user asks a follow-up question like \"Can you elaborate on the second point?\", this cannot be understood without the context of the previous message. Therefore we can't effectively perform retrieval with a question like this."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "487d8d79-5ee9-4aa4-9fdf-cd5f4303e099",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Setup\n",
|
||||
"\n",
|
||||
"### Dependencies\n",
|
||||
"\n",
|
||||
"We'll use an OpenAI chat model and embeddings and a Chroma vector store in this walkthrough, but everything shown here works with any [ChatModel](/docs/modules/model_io/chat/) or [LLM](/docs/modules/model_io/llms/), [Embeddings](/docs/modules/data_connection/text_embedding/), and [VectorStore](/docs/modules/data_connection/vectorstores/) or [Retriever](/docs/modules/data_connection/retrievers/). \n",
|
||||
"\n",
|
||||
"We'll use the following packages:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "28d272cd-4e31-40aa-bbb4-0be0a1f49a14",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#!pip install -U langchain langchain-community langchainhub openai chromadb bs4"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "51ef48de-70b6-4f43-8e0b-ab9b84c9c02a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We need to set environment variable `OPENAI_API_KEY`, which can be done directly or loaded from a `.env` file like so:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "143787ca-d8e6-4dc9-8281-4374f4d71720",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import getpass\n",
|
||||
"import os\n",
|
||||
"\n",
|
||||
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
|
||||
"\n",
|
||||
"# import dotenv\n",
|
||||
"\n",
|
||||
"# dotenv.load_dotenv()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "1665e740-ce01-4f09-b9ed-516db0bd326f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### LangSmith\n",
|
||||
"\n",
|
||||
"Many of the applications you build with LangChain will contain multiple steps with multiple invocations of LLM calls. As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent. The best way to do this is with [LangSmith](https://smith.langchain.com).\n",
|
||||
"\n",
|
||||
"Note that LangSmith is not needed, but it is helpful. If you do want to use LangSmith, after you sign up at the link above, make sure to set your environment variables to start logging traces:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "07411adb-3722-4f65-ab7f-8f6f57663d11",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
|
||||
"os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "fa6ba684-26cf-4860-904e-a4d51380c134",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Chain without chat history\n",
|
||||
"\n",
|
||||
"Here is the Q&A app we built over the [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) blog post by Lilian Weng in the [Quickstart](/docs/use_cases/question_answering/quickstart):"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "d8a913b1-0eea-442a-8a64-ec73333f104b",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import bs4\n",
|
||||
"from langchain import hub\n",
|
||||
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
|
||||
"from langchain_community.chat_models import ChatOpenAI\n",
|
||||
"from langchain_community.document_loaders import WebBaseLoader\n",
|
||||
"from langchain_community.embeddings import OpenAIEmbeddings\n",
|
||||
"from langchain_community.vectorstores import Chroma\n",
|
||||
"from langchain_core.output_parsers import StrOutputParser\n",
|
||||
"from langchain_core.runnables import RunnablePassthrough"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "820244ae-74b4-4593-b392-822979dd91b8",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Load, chunk and index the contents of the blog.\n",
|
||||
"loader = WebBaseLoader(\n",
|
||||
" web_paths=(\"https://lilianweng.github.io/posts/2023-06-23-agent/\",),\n",
|
||||
" bs_kwargs=dict(\n",
|
||||
" parse_only=bs4.SoupStrainer(\n",
|
||||
" class_=(\"post-content\", \"post-title\", \"post-header\")\n",
|
||||
" )\n",
|
||||
" ),\n",
|
||||
")\n",
|
||||
"docs = loader.load()\n",
|
||||
"\n",
|
||||
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)\n",
|
||||
"splits = text_splitter.split_documents(docs)\n",
|
||||
"vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())\n",
|
||||
"\n",
|
||||
"# Retrieve and generate using the relevant snippets of the blog.\n",
|
||||
"retriever = vectorstore.as_retriever()\n",
|
||||
"prompt = hub.pull(\"rlm/rag-prompt\")\n",
|
||||
"llm = ChatOpenAI(model_name=\"gpt-3.5-turbo\", temperature=0)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def format_docs(docs):\n",
|
||||
" return \"\\n\\n\".join(doc.page_content for doc in docs)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"rag_chain = (\n",
|
||||
" {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
|
||||
" | prompt\n",
|
||||
" | llm\n",
|
||||
" | StrOutputParser()\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "0d3b0f36-7b56-49c0-8e40-a1aa9ebcbf24",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It can be done through prompting techniques like Chain of Thought or Tree of Thoughts, or by using task-specific instructions or human inputs. Task decomposition helps agents plan ahead and manage complicated tasks more effectively.'"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"rag_chain.invoke(\"What is Task Decomposition?\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "776ae958-cbdc-4471-8669-c6087436f0b5",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Contextualizing the question\n",
|
||||
"\n",
|
||||
"First we'll need to define a sub-chain that takes historical messages and the latest user question, and reformulates the question if it makes reference to any information in the historical information.\n",
|
||||
"\n",
|
||||
"We'll use a prompt that includes a `MessagesPlaceholder` variable under the name \"chat_history\". This allows us to pass in a list of Messages to the prompt using the \"chat_history\" input key, and these messages will be inserted after the system message and before the human message containing the latest question."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 28,
|
||||
"id": "2b685428-8b82-4af1-be4f-7232c5d55b73",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
|
||||
"\n",
|
||||
"contextualize_q_system_prompt = \"\"\"Given a chat history and the latest user question \\\n",
|
||||
"which might reference context in the chat history, formulate a standalone question \\\n",
|
||||
"which can be understood without the chat history. Do NOT answer the question, \\\n",
|
||||
"just reformulate it if needed and otherwise return it as is.\"\"\"\n",
|
||||
"contextualize_q_prompt = ChatPromptTemplate.from_messages(\n",
|
||||
" [\n",
|
||||
" (\"system\", contextualize_q_system_prompt),\n",
|
||||
" MessagesPlaceholder(variable_name=\"chat_history\"),\n",
|
||||
" (\"human\", \"{question}\"),\n",
|
||||
" ]\n",
|
||||
")\n",
|
||||
"contextualize_q_chain = contextualize_q_prompt | llm | StrOutputParser()"
|
||||
]
|
||||
},
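{
"cell_type": "markdown",
"id": "0bb2790e-0000-4000-8000-0000000000a1",
"metadata": {},
"source": [
"To see where the history lands, we can format the prompt directly without calling the model. This is just a quick sanity check (a sketch, not part of the final chain) that reuses the `contextualize_q_prompt` defined above: the formatted prompt should contain the system message, then the chat history, then the latest question."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0bb2790e-0000-4000-8000-0000000000a2",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.messages import AIMessage, HumanMessage\n",
"\n",
"# Format the prompt (no model call) to see where the chat history\n",
"# messages are inserted relative to the system and human messages.\n",
"contextualize_q_prompt.invoke(\n",
"    {\n",
"        \"chat_history\": [\n",
"            HumanMessage(content=\"What does LLM stand for?\"),\n",
"            AIMessage(content=\"Large language model\"),\n",
"        ],\n",
"        \"question\": \"What is meant by large\",\n",
"    }\n",
").to_messages()"
]
},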
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "23cbd8d7-7162-4fb0-9e69-67ea4d4603a5",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Using this chain we can ask follow-up questions that reference past messages and have them reformulated into standalone questions:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 29,
|
||||
"id": "46ee9aa1-16f1-4509-8dae-f8c71f4ad47d",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'What is the definition of \"large\" in the context of a language model?'"
|
||||
]
|
||||
},
|
||||
"execution_count": 29,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from langchain_core.messages import AIMessage, HumanMessage\n",
|
||||
"\n",
|
||||
"contextualize_q_chain.invoke(\n",
|
||||
" {\n",
|
||||
" \"chat_history\": [\n",
|
||||
" HumanMessage(content=\"What does LLM stand for?\"),\n",
|
||||
" AIMessage(content=\"Large language model\"),\n",
|
||||
" ],\n",
|
||||
" \"question\": \"What is meant by large\",\n",
|
||||
" }\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "42a47168-4a1f-4e39-bd2d-d5b03609a243",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Chain with chat history\n",
|
||||
"\n",
|
||||
"And now we can build our full QA chain. \n",
|
||||
"\n",
|
||||
"Notice we add some routing functionality to only run the \"condense question chain\" when our chat history isn't empty. Here we're taking advantage of the fact that if a function in an LCEL chain returns another chain, that chain will itself be invoked."
|
||||
]
|
||||
},
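{
"cell_type": "markdown",
"id": "9c1d2a3b-0000-4000-8000-0000000000b1",
"metadata": {},
"source": [
"Before assembling the full chain, here is a minimal, self-contained sketch of that behavior (the names `route`, `uppercase`, and `demo_chain` are only for illustration): when a function used in an LCEL sequence returns another runnable, that runnable is invoked with the same input; when it returns a plain value, the value is passed along as-is."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9c1d2a3b-0000-4000-8000-0000000000b2",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.runnables import RunnableLambda\n",
"\n",
"# A tiny chain we might route to.\n",
"uppercase = RunnableLambda(lambda x: x[\"question\"].upper())\n",
"\n",
"\n",
"def route(input: dict):\n",
"    # Returning a Runnable makes LCEL invoke it with the same input;\n",
"    # returning a plain value passes that value along instead.\n",
"    if input.get(\"chat_history\"):\n",
"        return uppercase\n",
"    return input[\"question\"]\n",
"\n",
"\n",
"demo_chain = RunnableLambda(route) | RunnableLambda(lambda q: f\"question: {q}\")\n",
"\n",
"demo_chain.invoke({\"question\": \"hi there\", \"chat_history\": []})\n",
"# -> 'question: hi there'\n",
"# With a non-empty history, the returned runnable (uppercase) runs instead:\n",
"# demo_chain.invoke({\"question\": \"hi there\", \"chat_history\": [\"...\"]})\n",
"# -> 'question: HI THERE'"
]
},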
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 30,
|
||||
"id": "66f275f3-ddef-4678-b90d-ee64576878f9",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"qa_system_prompt = \"\"\"You are an assistant for question-answering tasks. \\\n",
|
||||
"Use the following pieces of retrieved context to answer the question. \\\n",
|
||||
"If you don't know the answer, just say that you don't know. \\\n",
|
||||
"Use three sentences maximum and keep the answer concise.\\\n",
|
||||
"\n",
|
||||
"{context}\"\"\"\n",
|
||||
"qa_prompt = ChatPromptTemplate.from_messages(\n",
|
||||
" [\n",
|
||||
" (\"system\", qa_system_prompt),\n",
|
||||
" MessagesPlaceholder(variable_name=\"chat_history\"),\n",
|
||||
" (\"human\", \"{question}\"),\n",
|
||||
" ]\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def contextualized_question(input: dict):\n",
|
||||
" if input.get(\"chat_history\"):\n",
|
||||
" return contextualize_q_chain\n",
|
||||
" else:\n",
|
||||
" return input[\"question\"]\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"rag_chain = (\n",
|
||||
" RunnablePassthrough.assign(\n",
|
||||
" context=contextualized_question | retriever | format_docs\n",
|
||||
" )\n",
|
||||
" | qa_prompt\n",
|
||||
" | llm\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 31,
|
||||
"id": "51fd0e54-5bb4-4a9a-b012-87a18ebe2bef",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"AIMessage(content='Common ways of task decomposition include:\\n\\n1. Using Chain of Thought (CoT): CoT is a prompting technique that instructs the model to \"think step by step\" and decompose complex tasks into smaller and simpler steps. This approach utilizes more computation at test-time and sheds light on the model\\'s thinking process.\\n\\n2. Prompting with LLM: Language Model (LLM) can be used to prompt the model with simple instructions like \"Steps for XYZ\" or \"What are the subgoals for achieving XYZ?\" This method guides the model to break down the task into manageable steps.\\n\\n3. Task-specific instructions: For certain tasks, task-specific instructions can be provided to guide the model in decomposing the task. For example, for writing a novel, the instruction \"Write a story outline\" can be given to help the model break down the task into smaller components.\\n\\n4. Human inputs: In some cases, human inputs can be used to assist in task decomposition. Humans can provide insights, expertise, and domain knowledge to help break down complex tasks into smaller subtasks.\\n\\nThese approaches aim to simplify complex tasks and enable more effective problem-solving and planning.')"
|
||||
]
|
||||
},
|
||||
"execution_count": 31,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"chat_history = []\n",
|
||||
"\n",
|
||||
"question = \"What is Task Decomposition?\"\n",
|
||||
"ai_msg = rag_chain.invoke({\"question\": question, \"chat_history\": chat_history})\n",
|
||||
"chat_history.extend([HumanMessage(content=question), ai_msg])\n",
|
||||
"\n",
|
||||
"second_question = \"What are common ways of doing it?\"\n",
|
||||
"rag_chain.invoke({\"question\": second_question, \"chat_history\": chat_history})"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "53263a65-4de2-4dd8-9291-6a8169ab6f1d",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
":::tip\n",
|
||||
"\n",
|
||||
"Check out the [LangSmith trace](https://smith.langchain.com/public/b3001782-bb30-476a-886b-12da17ec258f/r) \n",
|
||||
"\n",
|
||||
":::"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "fdf6c7e0-84f8-4747-b2ae-e84315152bd9",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Here we've gone over how to add application logic for incorporating historical outputs, but we're still manually updating the chat history and inserting it into each input. In a real Q&A application we'll want some way of persisting chat history and some way of automatically inserting and updating it.\n",
|
||||
"\n",
|
||||
"For this we can use:\n",
|
||||
"- [BaseChatMessageHistory](/docs/modules/memory/chat_messages/): Store chat history.\n",
|
||||
"- [RunnableWithMessageHistory](/docs/expression_language/how_to/message_history): Wrapper for an LCEL chain and a `BaseChatMessageHistory` that handles injecting chat history into inputs and updating it after each invocation.\n",
|
||||
"\n",
|
||||
"For a detailed walkthrough of how to use these classes together to create a stateful conversational chain, head to the [How to add message history (memory)](/docs/expression_language/how_to/message_history) LCEL page."
|
||||
]
|
||||
},
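{
"cell_type": "markdown",
"id": "5e7f8a9b-0000-4000-8000-0000000000c1",
"metadata": {},
"source": [
"As a rough sketch of how those pieces fit together (the in-memory `ChatMessageHistory` store, the `get_session_history` helper, and the session id below are placeholders; see the linked LCEL page for the full walkthrough), we can wrap `rag_chain` in a `RunnableWithMessageHistory`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5e7f8a9b-0000-4000-8000-0000000000c2",
"metadata": {},
"outputs": [],
"source": [
"# Rough sketch only: wire up automatic chat history management.\n",
"from langchain_community.chat_message_histories import ChatMessageHistory\n",
"from langchain_core.chat_history import BaseChatMessageHistory\n",
"from langchain_core.runnables.history import RunnableWithMessageHistory\n",
"\n",
"store = {}  # maps a session_id to its chat history\n",
"\n",
"\n",
"def get_session_history(session_id: str) -> BaseChatMessageHistory:\n",
"    if session_id not in store:\n",
"        store[session_id] = ChatMessageHistory()\n",
"    return store[session_id]\n",
"\n",
"\n",
"conversational_rag_chain = RunnableWithMessageHistory(\n",
"    rag_chain,\n",
"    get_session_history,\n",
"    input_messages_key=\"question\",\n",
"    history_messages_key=\"chat_history\",\n",
")\n",
"\n",
"# conversational_rag_chain.invoke(\n",
"#     {\"question\": \"What is Task Decomposition?\"},\n",
"#     config={\"configurable\": {\"session_id\": \"abc123\"}},\n",
"# )"
]
}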
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "poetry-venv",
|
||||
"language": "python",
|
||||
"name": "poetry-venv"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
@@ -1,340 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "88d7cc8c",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Text splitting by header\n",
|
||||
"\n",
|
||||
"Text splitting for vector storage often uses sentences or other delimiters [to keep related text together](https://www.pinecone.io/learn/chunking-strategies/). \n",
|
||||
"\n",
|
||||
"But many documents (such as `Markdown` files) have structure (headers) that can be explicitly used in splitting. \n",
|
||||
"\n",
|
||||
"The `MarkdownHeaderTextSplitter` lets a user split `Markdown` files files based on specified headers. \n",
|
||||
"\n",
|
||||
"This results in chunks that retain the header(s) that it came from in the metadata.\n",
|
||||
"\n",
|
||||
"This works nicely w/ `SelfQueryRetriever`.\n",
|
||||
"\n",
|
||||
"First, tell the retriever about our splits.\n",
|
||||
"\n",
|
||||
"Then, query based on the doc structure (e.g., \"summarize the doc introduction\"). \n",
|
||||
"\n",
|
||||
"Chunks only from that section of the Document will be filtered and used in chat / Q+A.\n",
|
||||
"\n",
|
||||
"Let's test this out on an [example Notion page](https://rlancemartin.notion.site/Auto-Evaluation-of-Metadata-Filtering-18502448c85240828f33716740f9574b?pvs=4)!\n",
|
||||
"\n",
|
||||
"First, I download the page to Markdown as explained [here](https://python.langchain.com/docs/ecosystem/integrations/notion)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"id": "2e587f65",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Load Notion page as a markdownfile file\n",
|
||||
"from langchain.document_loaders import NotionDirectoryLoader\n",
|
||||
"\n",
|
||||
"path = \"../Notion_DB/\"\n",
|
||||
"loader = NotionDirectoryLoader(path)\n",
|
||||
"docs = loader.load()\n",
|
||||
"md_file = docs[0].page_content"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 15,
|
||||
"id": "1cd3fd7e",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Let's create groups based on the section headers in our page\n",
|
||||
"from langchain.text_splitter import MarkdownHeaderTextSplitter\n",
|
||||
"\n",
|
||||
"headers_to_split_on = [\n",
|
||||
" (\"###\", \"Section\"),\n",
|
||||
"]\n",
|
||||
"markdown_splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split_on)\n",
|
||||
"md_header_splits = markdown_splitter.split_text(md_file)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4f73a609",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Now, perform text splitting on the header grouped documents. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 26,
|
||||
"id": "7fbff95f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Define our text splitter\n",
|
||||
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
|
||||
"\n",
|
||||
"chunk_size = 500\n",
|
||||
"chunk_overlap = 0\n",
|
||||
"text_splitter = RecursiveCharacterTextSplitter(\n",
|
||||
" chunk_size=chunk_size, chunk_overlap=chunk_overlap\n",
|
||||
")\n",
|
||||
"all_splits = text_splitter.split_documents(md_header_splits)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "5bd72546",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"This sets us up well do perform metadata filtering based on the document structure.\n",
|
||||
"\n",
|
||||
"Let's bring this all together by building a vectorstore first."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "b050b4de",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"! pip install chromadb"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 27,
|
||||
"id": "01d59c39",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Build vectorstore and keep the metadata\n",
|
||||
"from langchain.embeddings import OpenAIEmbeddings\n",
|
||||
"from langchain.vectorstores import Chroma\n",
|
||||
"\n",
|
||||
"vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "310346dd",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Let's create a `SelfQueryRetriever` that can filter based upon metadata we defined."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 28,
|
||||
"id": "7fd4d283",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Create retriever\n",
|
||||
"from langchain.chains.query_constructor.base import AttributeInfo\n",
|
||||
"from langchain.llms import OpenAI\n",
|
||||
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
|
||||
"\n",
|
||||
"# Define our metadata\n",
|
||||
"metadata_field_info = [\n",
|
||||
" AttributeInfo(\n",
|
||||
" name=\"Section\",\n",
|
||||
" description=\"Part of the document that the text comes from\",\n",
|
||||
" type=\"string or list[string]\",\n",
|
||||
" ),\n",
|
||||
"]\n",
|
||||
"document_content_description = \"Major sections of the document\"\n",
|
||||
"\n",
|
||||
"# Define self query retriever\n",
|
||||
"llm = OpenAI(temperature=0)\n",
|
||||
"retriever = SelfQueryRetriever.from_llm(\n",
|
||||
" llm, vectorstore, document_content_description, metadata_field_info, verbose=True\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "218b9820",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We can see that we can query *only for texts* in the `Introduction` of the document!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 29,
|
||||
"id": "d688db6e",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"query='Introduction' filter=Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='Section', value='Introduction') limit=None\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[Document(page_content='![Untitled](Auto-Evaluation%20of%20Metadata%20Filtering%2018502448c85240828f33716740f9574b/Untitled.png)', metadata={'Section': 'Introduction'}),\n",
|
||||
" Document(page_content='Q+A systems often use a two-step approach: retrieve relevant text chunks and then synthesize them into an answer. There many ways to approach this. For example, we recently [discussed](https://blog.langchain.dev/auto-evaluation-of-anthropic-100k-context-window/) the Retriever-Less option (at bottom in the below diagram), highlighting the Anthropic 100k context window model. Metadata filtering is an alternative approach that pre-filters chunks based on a user-defined criteria in a VectorDB using', metadata={'Section': 'Introduction'}),\n",
|
||||
" Document(page_content='metadata tags prior to semantic search.', metadata={'Section': 'Introduction'})]"
|
||||
]
|
||||
},
|
||||
"execution_count": 29,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Test\n",
|
||||
"retriever.get_relevant_documents(\"Summarize the Introduction section of the document\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f35999b3",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We can also look at other parts of the document."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 30,
|
||||
"id": "47929be4",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"query='Testing' filter=Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='Section', value='Testing') limit=None\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[Document(page_content='![Untitled](Auto-Evaluation%20of%20Metadata%20Filtering%2018502448c85240828f33716740f9574b/Untitled%202.png)', metadata={'Section': 'Testing'}),\n",
|
||||
" Document(page_content='`SelfQueryRetriever` works well in [many cases](https://twitter.com/hwchase17/status/1656791488569954304/photo/1). For example, given [this test case](https://twitter.com/hwchase17/status/1656791488569954304?s=20): \\n![Untitled](Auto-Evaluation%20of%20Metadata%20Filtering%2018502448c85240828f33716740f9574b/Untitled%201.png) \\nThe query can be nicely broken up into semantic query and metadata filter: \\n```python\\nsemantic query: \"prompt injection\"', metadata={'Section': 'Testing'}),\n",
|
||||
" Document(page_content='Below, we can see detailed results from the app: \\n- Kor extraction is above to perform the transformation between query and metadata format ✅\\n- Self-querying attempts to filter using the episode ID (`252`) in the query and fails 🚫\\n- Baseline returns docs from 3 different episodes (one from `252`), confusing the answer 🚫', metadata={'Section': 'Testing'}),\n",
|
||||
" Document(page_content='will use in retrieval [here](https://github.com/langchain-ai/auto-evaluator/blob/main/streamlit/kor_retriever_lex.py).', metadata={'Section': 'Testing'})]"
|
||||
]
|
||||
},
|
||||
"execution_count": 30,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"retriever.get_relevant_documents(\"Summarize the Testing section of the document\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "1af7720f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Now, we can create chat or Q+A apps that are aware of the explicit document structure. \n",
|
||||
"\n",
|
||||
"The ability to retain document structure for metadata filtering can be helpful for complicated or longer documents."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 31,
|
||||
"id": "565822a1",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"query='Testing' filter=Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='Section', value='Testing') limit=None\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'The Testing section of the document describes the evaluation of the `SelfQueryRetriever` component in comparison to a baseline model. The evaluation was performed on a test case where the query was broken down into a semantic query and a metadata filter. The results showed that the `SelfQueryRetriever` component was able to perform the transformation between query and metadata format, but failed to filter using the episode ID in the query. The baseline model returned documents from three different episodes, which confused the answer. The `SelfQueryRetriever` component was deemed to work well in many cases and will be used in retrieval.'"
|
||||
]
|
||||
},
|
||||
"execution_count": 31,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from langchain.chains import RetrievalQA\n",
|
||||
"from langchain.chat_models import ChatOpenAI\n",
|
||||
"\n",
|
||||
"llm = ChatOpenAI(model_name=\"gpt-3.5-turbo\", temperature=0)\n",
|
||||
"qa_chain = RetrievalQA.from_chain_type(llm, retriever=retriever)\n",
|
||||
"qa_chain.run(\"Summarize the Testing section of the document\")"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
},
|
||||
"vscode": {
|
||||
"interpreter": {
|
||||
"hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1"
|
||||
}
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|