docs: query analysis use case (#17766)

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
pull/18278/head
Bagatur 4 months ago committed by GitHub
parent 6782dac420
commit a6f0506aaf

@ -0,0 +1,385 @@
{
"cells": [
{
"cell_type": "raw",
"id": "df7d42b9-58a6-434c-a2d7-0b61142f6d3e",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 2\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Adding examples to the prompt\n",
"\n",
"As our query analysis becomes more complex, adding examples to the prompt can meaningfully improve performance.\n",
"\n",
"Let's take a look at how we can add examples for the LangChain YouTube video query analyzer we built in the [Quickstart](/docs/use_cases/query_analysis/quickstart)."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain-core langchain-openai"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "57396e23-c192-4d97-846b-5eacea4d6b8d",
"metadata": {},
"source": [
"## Query schema\n",
"\n",
"We'll define a query schema that we want our model to output. To make our query analysis a bit more interesting, we'll add a `sub_queries` field that contains more narrow questions derived from the top level question."
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "0b51dd76-820d-41a4-98c8-893f6fe0d1ea",
"metadata": {},
"outputs": [],
"source": [
"from typing import List, Optional\n",
"\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"sub_queries_description = \"\"\"\\\n",
"If the original question contains multiple distinct sub-questions, \\\n",
"or if there are more generic questions that would be helpful to answer in \\\n",
"order to answer the original question, write a list of all relevant sub-questions. \\\n",
"Make sure this list is comprehensive and covers all parts of the original question. \\\n",
"It's ok if there's redundancy in the sub-questions. \\\n",
"Make sure the sub-questions are as narrowly focused as possible.\"\"\"\n",
"\n",
"\n",
"class Search(BaseModel):\n",
" \"\"\"Search over a database of tutorial videos about a software library.\"\"\"\n",
"\n",
" query: str = Field(\n",
" ...,\n",
" description=\"Primary similarity search query applied to video transcripts.\",\n",
" )\n",
" sub_queries: List[str] = Field(\n",
" default_factory=list, description=sub_queries_description\n",
" )\n",
" publish_year: Optional[int] = Field(None, description=\"Year video was published\")"
]
},
{
"cell_type": "markdown",
"id": "f8b08c52-1ce9-4d8b-a779-cbe8efde51d1",
"metadata": {},
"source": [
"## Query generation"
]
},
{
"cell_type": "code",
"execution_count": 64,
"id": "783c03c3-8c72-4f88-9cf4-5829ce6745d6",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"system = \"\"\"You are an expert at converting user questions into database queries. \\\n",
"You have access to a database of tutorial videos about a software library for building LLM-powered applications. \\\n",
"Given a question, return a list of database queries optimized to retrieve the most relevant results.\n",
"\n",
"If there are acronyms or words you are not familiar with, do not try to rephrase them.\"\"\"\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" MessagesPlaceholder(\"examples\", optional=True),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"structured_llm = llm.with_structured_output(Search)\n",
"query_analyzer = {\"question\": RunnablePassthrough()} | prompt | structured_llm"
]
},
{
"cell_type": "markdown",
"id": "f403517a-b8e3-44ac-b0a6-02f8305635a2",
"metadata": {},
"source": [
"Let's try out our query analyzer without any examples in the prompt:"
]
},
{
"cell_type": "code",
"execution_count": 65,
"id": "0bcfce06-6f0c-4f9d-a1fc-dc29342d2aae",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(query='web voyager vs reflection agents', sub_queries=['difference between web voyager and reflection agents', 'do web voyager and reflection agents use langgraph'], publish_year=None)"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\n",
" \"what's the difference between web voyager and reflection agents? do both use langgraph?\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "00962b08-899c-465c-9a41-6459b207e0f2",
"metadata": {},
"source": [
"## Adding examples and tuning the prompt\n",
"\n",
"This works pretty well, but we probably want it to decompose the question even further to separate the queries about Web Voyager and Reflection Agents.\n",
"\n",
"To tune our query generation results, we can add some examples of inputs questions and gold standard output queries to our prompt."
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "15b4923d-a08e-452d-8889-9a09a57d1095",
"metadata": {},
"outputs": [],
"source": [
"examples = []"
]
},
{
"cell_type": "code",
"execution_count": 54,
"id": "da5330e6-827a-40e5-982b-b23b6286b758",
"metadata": {},
"outputs": [],
"source": [
"question = \"What's chat langchain, is it a langchain template?\"\n",
"query = Search(\n",
" query=\"What is chat langchain and is it a langchain template?\",\n",
" sub_queries=[\"What is chat langchain\", \"What is a langchain template\"],\n",
")\n",
"examples.append({\"input\": question, \"tool_calls\": [query]})"
]
},
{
"cell_type": "code",
"execution_count": 55,
"id": "580e857a-27df-4ecf-a19c-458dc9244ec8",
"metadata": {},
"outputs": [],
"source": [
"question = \"How to build multi-agent system and stream intermediate steps from it\"\n",
"query = Search(\n",
" query=\"How to build multi-agent system and stream intermediate steps from it\",\n",
" sub_queries=[\n",
" \"How to build multi-agent system\",\n",
" \"How to stream intermediate steps from multi-agent system\",\n",
" \"How to stream intermediate steps\",\n",
" ],\n",
")\n",
"\n",
"examples.append({\"input\": question, \"tool_calls\": [query]})"
]
},
{
"cell_type": "code",
"execution_count": 56,
"id": "fa63310d-69e3-4701-825c-fbb01f8a5a16",
"metadata": {},
"outputs": [],
"source": [
"question = \"LangChain agents vs LangGraph?\"\n",
"query = Search(\n",
" query=\"What's the difference between LangChain agents and LangGraph? How do you deploy them?\",\n",
" sub_queries=[\n",
" \"What are LangChain agents\",\n",
" \"What is LangGraph\",\n",
" \"How do you deploy LangChain agents\",\n",
" \"How do you deploy LangGraph\",\n",
" ],\n",
")\n",
"examples.append({\"input\": question, \"tool_calls\": [query]})"
]
},
{
"cell_type": "markdown",
"id": "bd21389c-f862-44e6-9d51-92db10979525",
"metadata": {},
"source": [
"Now we need to update our prompt template and chain so that the examples are included in each prompt. Since we're working with OpenAI function-calling, we'll need to do a bit of extra structuring to send example inputs and outputs to the model. We'll create a `tool_example_to_messages` helper function to handle this for us:"
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "68b03709-9a60-4acf-b96c-cafe1056c6f3",
"metadata": {},
"outputs": [],
"source": [
"import uuid\n",
"from typing import Dict\n",
"\n",
"from langchain_core.messages import (\n",
" AIMessage,\n",
" BaseMessage,\n",
" HumanMessage,\n",
" SystemMessage,\n",
" ToolMessage,\n",
")\n",
"\n",
"\n",
"def tool_example_to_messages(example: Dict) -> List[BaseMessage]:\n",
" messages: List[BaseMessage] = [HumanMessage(content=example[\"input\"])]\n",
" openai_tool_calls = []\n",
" for tool_call in example[\"tool_calls\"]:\n",
" openai_tool_calls.append(\n",
" {\n",
" \"id\": str(uuid.uuid4()),\n",
" \"type\": \"function\",\n",
" \"function\": {\n",
" \"name\": tool_call.__class__.__name__,\n",
" \"arguments\": tool_call.json(),\n",
" },\n",
" }\n",
" )\n",
" messages.append(\n",
" AIMessage(content=\"\", additional_kwargs={\"tool_calls\": openai_tool_calls})\n",
" )\n",
" tool_outputs = example.get(\"tool_outputs\") or [\n",
" \"You have correctly called this tool.\"\n",
" ] * len(openai_tool_calls)\n",
" for output, tool_call in zip(tool_outputs, openai_tool_calls):\n",
" messages.append(ToolMessage(content=output, tool_call_id=tool_call[\"id\"]))\n",
" return messages\n",
"\n",
"\n",
"example_msgs = [msg for ex in examples for msg in tool_example_to_messages(ex)]"
]
},
{
"cell_type": "code",
"execution_count": 58,
"id": "d9bf9f87-3e6b-4fc2-957b-949b077fab54",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import MessagesPlaceholder\n",
"\n",
"query_analyzer_with_examples = (\n",
" {\"question\": RunnablePassthrough()}\n",
" | prompt.partial(examples=example_msgs)\n",
" | structured_llm\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 62,
"id": "e565ccb0-3530-4782-b56b-d1f6d0a8e559",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(query='Difference between web voyager and reflection agents, do they both use LangGraph?', sub_queries=['What is Web Voyager', 'What are Reflection agents', 'Do Web Voyager and Reflection agents use LangGraph'], publish_year=None)"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer_with_examples.invoke(\n",
" \"what's the difference between web voyager and reflection agents? do both use langgraph?\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "e5ea49ff-be53-4072-8c25-08682bb31a19",
"metadata": {},
"source": [
"Thanks to our examples we get a slightly more decomposed search query. With some more prompt engineering and tuning of our examples we could improve query generation even more.\n",
"\n",
"You can see that the examples are passed to the model as messages in the [LangSmith trace](https://smith.langchain.com/public/aeaaafce-d2b1-4943-9a61-bc954e8fc6f2/r)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,83 @@
{
"cells": [
{
"cell_type": "raw",
"id": "a47da0d0-0927-4adb-93e6-99a434f732cf",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 0.3\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Query analysis\n",
"\n",
"In any question answering application we need to retrieve information based on a user question. The simplest way to do this involves passing the user question directly to a retriever. However, in many cases it can improve performance by \"optimizing\" the query in some way. This is typically done by an LLM. Specifically, this involves passing the raw question (or list of messages) into an LLM and returning one or more optimized queries, which typically contain a string and optionally other structured information.\n",
"\n",
"![Query Analysis](../../../static/img/query_analysis.png)\n",
"\n",
"## Background Information\n",
"\n",
"This guide assumes familiarity with the basic building blocks of a simple RAG application outlined in the [Q&A with RAG Quickstart](/docs/use_cases/question_answering/quickstart). Please read and understand that before diving in here.\n",
"\n",
"## Problems Solved\n",
"\n",
"Query analysis helps solves problems where the user question is not optimal to pass into the retriever. This can be the case when:\n",
"\n",
"* The retriever supports searches and filters against specific fields of the data, and user input could be referring to any of these fields,\n",
"* The user input contains multiple distinct questions in it,\n",
"* To get the relevant information multiple queries are needed,\n",
"* Search quality is sensitive to phrasing,\n",
"* There are multiple retrievers that could be searched over, and the user input could be reffering to any of them.\n",
"\n",
"Note that different problems will require different solutions. In order to determine what query analysis technique you should use, you will want to understand exactly what the problem with your current retrieval system is. This is best done by looking at failure data points of your current application and identifying common themes. Only once you know what your problems are can you begin to solve them.\n",
"\n",
"## Quickstart\n",
"\n",
"Head to the [quickstart](/docs/use_cases/query_analysis/quickstart) to see how to use query analysis in a basic end-to-end example. This will cover creating a simple index, showing a failure mode that occur when passing a raw user question to that index, and then an example of how query analysis can help address that issue. There are MANY different query analysis techniques (see below) and this end-to-end example will not show all of them.\n",
"\n",
"\n",
"## Techniques\n",
"\n",
"There are multiple techniques we support for going from raw question or list of messages into a more optimized query. These include:\n",
"\n",
"* [Query decomposition](/docs/use_cases/query_analysis/techniques/decomposition): If a user input contains multiple distinct questions, we can decompose the input into separate queries that will each be executed independently.\n",
"* [Query expansion](/docs/use_cases/query_analysis/techniques/expansion): If an index is sensitive to query phrasing, we can generate multiple paraphrased versions of the user question to increase our chances of retrieving a relevant result.\n",
"* [Hypothetical document embedding (HyDE)](/docs/use_cases/query_analysis/techniques/hyde): If we're working with a similarity search-based index, like a vector store, then searching on raw questions may not work well because their embeddings may not be very similar to those of the relevant documents. Instead it might help to have the model generate a hypothetical relevant document, and then use that to perform similarity search.\n",
"* [Query routing](/docs/use_cases/query_analysis/techniques/routing): If we have multiple indexes and only a subset are useful for any given user input, we can route the input to only retrieve results from the relevant ones.\n",
"* [Step back prompting](/docs/use_cases/query_analysis/techniques/step_back): Sometimes search quality and model generations can be tripped up by the specifics of a question. One way to handle this is to first generate a more abstract, \"step back\" question and to query based on both the original and step back question.\n",
"* [Query structuring](/docs/use_cases/query_analysis/techniques/structuring): If our documents have multiple searchable/filterable attributes, we can infer from any raw user question which specific attributes should be searched/filtered over. For example, when a user input specific something about video publication date, that should become a filter on the `publish_date` attribute of each document.\n",
"\n",
"## How to\n",
"\n",
"* [Add examples to prompt](/docs/use_cases/query_analysis/few_shot): As our query analysis becomes more complex, adding examples to the prompt can meaningfully improve performance."
]
}
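,
{
"cell_type": "markdown",
"id": "b7c1f2aa-3d4e-4c5b-9f60-1a2b3c4d5e6f",
"metadata": {},
"source": [
"As a rough sketch of the overall pattern (the `Search` schema below is purely illustrative; see the [quickstart](/docs/use_cases/query_analysis/quickstart) for a complete, working version), query analysis typically looks something like this:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c8d2a3bb-4e5f-4d6c-a071-2b3c4d5e6f70",
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch only: turn a raw user question into a structured query.\n",
"from typing import Optional\n",
"\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"\n",
"class Search(BaseModel):\n",
"    \"\"\"An optimized query derived from a raw user question.\"\"\"\n",
"\n",
"    query: str = Field(..., description=\"Similarity search query.\")\n",
"    publish_year: Optional[int] = Field(None, description=\"Year video was published\")\n",
"\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"structured_llm = llm.with_structured_output(Search)\n",
"structured_llm.invoke(\"videos on RAG published in 2023\")\n",
"# e.g. -> Search(query='RAG', publish_year=2023)"
]
}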
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv-2",
"language": "python",
"name": "poetry-venv-2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,591 @@
{
"cells": [
{
"cell_type": "raw",
"id": "df7d42b9-58a6-434c-a2d7-0b61142f6d3e",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 0\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Quickstart\n",
"\n",
"This example will show how to use query analysis in a basic end-to-end example. This will cover creating a simple index, showing a failure mode that occur when passing a raw user question to that index, and then an example of how query analysis can help address that issue. There are MANY different query analysis techniques and this end-to-end example will not show all of them.\n",
"\n",
"For the purpose of this example, we will do retrieval over the LangChain YouTube videos."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain langchain-community langchain-openai youtube-transcript-api pytube faiss-cpu"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "c20b48b8-16d7-4089-bc17-f2d240b3935a",
"metadata": {},
"source": [
"### Load documents\n",
"\n",
"We can use the `YouTubeLoader` to load transcripts of a few LangChain videos:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "ae6921e1-3d5a-431c-9999-29a5f33201e1",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.document_loaders import YoutubeLoader\n",
"\n",
"urls = [\n",
" \"https://www.youtube.com/watch?v=HAn9vnJy6S4\",\n",
" \"https://www.youtube.com/watch?v=dA1cHGACXCo\",\n",
" \"https://www.youtube.com/watch?v=ZcEMLz27sL4\",\n",
" \"https://www.youtube.com/watch?v=hvAPnpSfSGo\",\n",
" \"https://www.youtube.com/watch?v=EhlPDL4QrWY\",\n",
" \"https://www.youtube.com/watch?v=mmBo8nlu2j0\",\n",
" \"https://www.youtube.com/watch?v=rQdibOsL1ps\",\n",
" \"https://www.youtube.com/watch?v=28lC4fqukoc\",\n",
" \"https://www.youtube.com/watch?v=es-9MgxB-uc\",\n",
" \"https://www.youtube.com/watch?v=wLRHwKuKvOE\",\n",
" \"https://www.youtube.com/watch?v=ObIltMaRJvY\",\n",
" \"https://www.youtube.com/watch?v=DjuXACWYkkU\",\n",
" \"https://www.youtube.com/watch?v=o7C9ld6Ln-M\",\n",
"]\n",
"docs = []\n",
"for url in urls:\n",
" docs.extend(YoutubeLoader.from_youtube_url(url, add_video_info=True).load())"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "2b84918e",
"metadata": {},
"outputs": [],
"source": [
"import datetime\n",
"\n",
"# Add some additional metadata: what year the video was published\n",
"for doc in docs:\n",
" doc.metadata[\"publish_year\"] = int(\n",
" datetime.datetime.strptime(\n",
" doc.metadata[\"publish_date\"], \"%Y-%m-%d %H:%M:%S\"\n",
" ).strftime(\"%Y\")\n",
" )"
]
},
{
"cell_type": "markdown",
"id": "ce7da456-3023-4f04-bba1-f7e2c468c7fe",
"metadata": {},
"source": [
"Here are the titles of the videos we've loaded:"
]
},
{
"cell_type": "code",
"execution_count": 59,
"id": "3e1a99ee-1078-4373-b80a-630af48bf94a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['OpenGPTs',\n",
" 'Building a web RAG chatbot: using LangChain, Exa (prev. Metaphor), LangSmith, and Hosted Langserve',\n",
" 'Streaming Events: Introducing a new `stream_events` method',\n",
" 'LangGraph: Multi-Agent Workflows',\n",
" 'Build and Deploy a RAG app with Pinecone Serverless',\n",
" 'Auto-Prompt Builder (with Hosted LangServe)',\n",
" 'Build a Full Stack RAG App With TypeScript',\n",
" 'Getting Started with Multi-Modal LLMs',\n",
" 'SQL Research Assistant',\n",
" 'Skeleton-of-Thought: Building a New Template from Scratch',\n",
" 'Benchmarking RAG over LangChain Docs',\n",
" 'Building a Research Assistant from Scratch',\n",
" 'LangServe and LangChain Templates Webinar']"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[doc.metadata[\"title\"] for doc in docs]"
]
},
{
"cell_type": "markdown",
"id": "05a71032-14c3-4517-aa9a-3a5e88eaeb92",
"metadata": {},
"source": [
"Here's the metadata associated with each video. We can see that each document also has a title, view count, publication date, and length:"
]
},
{
"cell_type": "code",
"execution_count": 60,
"id": "c7748415-ddbf-4c55-a242-c28833c03caf",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'source': 'HAn9vnJy6S4',\n",
" 'title': 'OpenGPTs',\n",
" 'description': 'Unknown',\n",
" 'view_count': 7210,\n",
" 'thumbnail_url': 'https://i.ytimg.com/vi/HAn9vnJy6S4/hq720.jpg',\n",
" 'publish_date': '2024-01-31 00:00:00',\n",
" 'length': 1530,\n",
" 'author': 'LangChain',\n",
" 'publish_year': 2024}"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs[0].metadata"
]
},
{
"cell_type": "markdown",
"id": "5db72331-1e79-4910-8faa-473a0e370277",
"metadata": {},
"source": [
"And here's a sample from a document's contents:"
]
},
{
"cell_type": "code",
"execution_count": 61,
"id": "845149b7-130e-4228-ac80-d0a9286ef1d3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"hello today I want to talk about open gpts open gpts is a project that we built here at linkchain uh that replicates the GPT store in a few ways so it creates uh end user-facing friendly interface to create different Bots and these Bots can have access to different tools and they can uh be given files to retrieve things over and basically it's a way to create a variety of bots and expose the configuration of these Bots to end users it's all open source um it can be used with open AI it can be us\""
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs[0].page_content[:500]"
]
},
{
"cell_type": "markdown",
"id": "561697c8-b848-4b12-847c-ab6a8e2d1ae6",
"metadata": {},
"source": [
"### Indexing documents\n",
"\n",
"Whenever we perform retrieval we need to create an index of documents that we can query. We'll use a vector store to index our documents, and we'll chunk them first to make our retrievals more concise and precise:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "1f621694",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)\n",
"chunked_docs = text_splitter.split_documents(docs)\n",
"embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")\n",
"vectorstore = Chroma.from_documents(\n",
" chunked_docs,\n",
" embeddings,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "483d8d0a-5c1b-46b0-862c-a4eccfd5ae3c",
"metadata": {},
"source": [
"## Retrieval without query analysis\n",
"\n",
"We can perform similarity search on a user question directly to find chunks relevant to the question:"
]
},
{
"cell_type": "code",
"execution_count": 64,
"id": "09435e9b-57b4-41b1-b34a-449815bdfae0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Build and Deploy a RAG app with Pinecone Serverless\n",
"hi this is Lance from the Lang chain team and today we're going to be building and deploying a rag app using pine con serval list from scratch so we're going to kind of walk through all the code required to do this and I'll use these slides as kind of a guide to kind of lay the the ground work um so first what is rag so under capoy has this pretty nice visualization that shows LMS as a kernel of a new kind of operating system and of course one of the core components of our operating system is th\n"
]
}
],
"source": [
"search_results = vectorstore.similarity_search(\"how do I build a RAG agent\")\n",
"print(search_results[0].metadata[\"title\"])\n",
"print(search_results[0].page_content[:500])"
]
},
{
"cell_type": "markdown",
"id": "5a79ef1b-7edd-4b68-98e5-c0e4c0dd02e6",
"metadata": {},
"source": [
"This works pretty well! Our first result is quite relevant to the question.\n"
]
},
{
"cell_type": "markdown",
"id": "a891e8f5-ef0c-4ec0-b25f-eda7a5350a85",
"metadata": {},
"source": [
"What if we wanted to search for results from a specific time period?"
]
},
{
"cell_type": "code",
"execution_count": 65,
"id": "7adbfc11-ca01-4883-8978-e4f6e4a1d23d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"OpenGPTs\n",
"2024-01-31\n",
"hardcoded that it will always do a retrieval step here the assistant decides whether to do a retrieval step or not sometimes this is good sometimes this is bad sometimes it you don't need to do a retrieval step when I said hi it didn't need to call it tool um but other times you know the the llm might mess up and not realize that it needs to do a retrieval step and so the rag bot will always do a retrieval step so it's more focused there because this is also a simpler architecture so it's always\n"
]
}
],
"source": [
"search_results = vectorstore.similarity_search(\"videos on RAG published in 2023\")\n",
"print(search_results[0].metadata[\"title\"])\n",
"print(search_results[0].metadata[\"publish_date\"])\n",
"print(search_results[0].page_content[:500])"
]
},
{
"cell_type": "markdown",
"id": "4790e2db-3c6e-440b-b6e8-ebdd6600fda5",
"metadata": {},
"source": [
"Our first result is from 2024, and not very relevant to the input. Since we're just searching against document contents, there's no way for the results to be filtered on any document attributes.\n",
"\n",
"This is just one failure mode that can arise. Let's now take a look at how a basic form of query analysis can fix it!"
]
},
{
"cell_type": "markdown",
"id": "57396e23-c192-4d97-846b-5eacea4d6b8d",
"metadata": {},
"source": [
"## Query analysis\n",
"\n",
"To handle these failure modes we'll do some query structuring. This will involve defining a **query schema** that contains some date filters and use a function-calling model to convert a user question into a structured queries. \n",
"\n",
"### Query schema\n",
"In this case we'll have explicit min and max attributes for publication date so that it can be filtered on."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "0b51dd76-820d-41a4-98c8-893f6fe0d1ea",
"metadata": {},
"outputs": [],
"source": [
"from typing import Optional\n",
"\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class Search(BaseModel):\n",
" \"\"\"Search over a database of tutorial videos about a software library.\"\"\"\n",
"\n",
" query: str = Field(\n",
" ...,\n",
" description=\"Similarity search query applied to video transcripts.\",\n",
" )\n",
" publish_year: Optional[int] = Field(None, description=\"Year video was published\")"
]
},
{
"cell_type": "markdown",
"id": "f8b08c52-1ce9-4d8b-a779-cbe8efde51d1",
"metadata": {},
"source": [
"### Query generation\n",
"\n",
"To convert user questions to structured queries we'll make use of OpenAI's function-calling API. Specifically we'll use the new [ChatModel.with_structured_output()](/docs/guides/structured_output) constructor to handle passing the schema to the model and parsing the output."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "783c03c3-8c72-4f88-9cf4-5829ce6745d6",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/bagatur/langchain/libs/core/langchain_core/_api/beta_decorator.py:86: LangChainBetaWarning: The function `with_structured_output` is in beta. It is actively being worked on, so the API may change.\n",
" warn_beta(\n"
]
}
],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"system = \"\"\"You are an expert at converting user questions into database queries. \\\n",
"You have access to a database of tutorial videos about a software library for building LLM-powered applications. \\\n",
"Given a question, return a list of database queries optimized to retrieve the most relevant results.\n",
"\n",
"If there are acronyms or words you are not familiar with, do not try to rephrase them.\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"structured_llm = llm.with_structured_output(Search)\n",
"query_analyzer = {\"question\": RunnablePassthrough()} | prompt | structured_llm"
]
},
{
"cell_type": "markdown",
"id": "f403517a-b8e3-44ac-b0a6-02f8305635a2",
"metadata": {},
"source": [
"Let's see what queries our analyzer generates for the questions we searched earlier:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "bc1d3863",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(query='build RAG agent', publish_year=None)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\"how do I build a RAG agent\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "af62af17-4f90-4dbd-a8b4-dfff51f1db95",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(query='RAG', publish_year=2023)"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\"videos on RAG published in 2023\")"
]
},
{
"cell_type": "markdown",
"id": "c7c65b2f-7881-45fc-a47b-a4eaaf48245f",
"metadata": {},
"source": [
"## Retrieval with query analysis\n",
"\n",
"Our query analysis looks pretty good; now let's try using our generated queries to actually perform retrieval. \n",
"\n",
"**Note:** in our example, we specified `tool_choice=\"Search\"`. This will force the LLM to call one - and only one - function, meaning that we will always have one optimized query to look up. Note that this is not always the case - see other guides for how to deal with situations when no - or multiple - optmized queries are returned."
]
},
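{
"cell_type": "markdown",
"id": "d2f4b6c8-1a2b-4c3d-8e9f-0a1b2c3d4e5f",
"metadata": {},
"source": [
"For reference, here's a rough sketch of how you could make that forcing explicit yourself rather than relying on `with_structured_output`: bind the `Search` tool to the model along with an OpenAI-style `tool_choice`, then parse the tool call back into a `Search` object. This reuses the `prompt` and `llm` defined above and is only an alternative formulation; the rest of this guide doesn't depend on it."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e3a5c7d9-2b3c-4d4e-9f00-1b2c3d4e5f60",
"metadata": {},
"outputs": [],
"source": [
"# Sketch: explicitly force the model to call the \"Search\" tool exactly once.\n",
"from langchain.output_parsers import PydanticToolsParser\n",
"from langchain_core.utils.function_calling import convert_to_openai_tool\n",
"\n",
"forced_llm = llm.bind(\n",
"    tools=[convert_to_openai_tool(Search)],\n",
"    tool_choice={\"type\": \"function\", \"function\": {\"name\": \"Search\"}},\n",
")\n",
"# PydanticToolsParser returns a list of parsed tool calls, so this analyzer\n",
"# yields e.g. [Search(query='RAG tutorial', publish_year=2023)].\n",
"forced_query_analyzer = (\n",
"    {\"question\": RunnablePassthrough()}\n",
"    | prompt\n",
"    | forced_llm\n",
"    | PydanticToolsParser(tools=[Search])\n",
")"
]
},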
{
"cell_type": "code",
"execution_count": 8,
"id": "1e047d87",
"metadata": {},
"outputs": [],
"source": [
"from typing import List\n",
"\n",
"from langchain_core.documents import Document"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "8dac7866",
"metadata": {},
"outputs": [],
"source": [
"def retrieval(search: Search) -> List[Document]:\n",
" if search.publish_year is not None:\n",
" # This is syntax specific to Chroma,\n",
" # the vector database we are using.\n",
" _filter = {\"publish_year\": {\"$eq\": search.publish_year}}\n",
" else:\n",
" _filter = None\n",
" return vectorstore.similarity_search(search.query, filter=_filter)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "232ad8a7-7990-4066-9228-d35a555f7293",
"metadata": {},
"outputs": [],
"source": [
"retrieval_chain = query_analyzer | retrieval"
]
},
{
"cell_type": "markdown",
"id": "e6a4460c",
"metadata": {},
"source": [
"We can now run this chain on the problematic input from before, and see that it yields only results from that year!"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "e7f683b5-b1c5-4dec-b163-2efc162a2b51",
"metadata": {},
"outputs": [],
"source": [
"results = retrieval_chain.invoke(\"RAG tutorial published in 2023\")"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "1ad52512-b3e8-42a3-8701-d9e87fb8b46c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[('Getting Started with Multi-Modal LLMs', '2023-12-20 00:00:00'),\n",
" ('LangServe and LangChain Templates Webinar', '2023-11-02 00:00:00'),\n",
" ('Getting Started with Multi-Modal LLMs', '2023-12-20 00:00:00'),\n",
" ('Building a Research Assistant from Scratch', '2023-11-16 00:00:00')]"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[(doc.metadata[\"title\"], doc.metadata[\"publish_date\"]) for doc in results]"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,440 @@
{
"cells": [
{
"cell_type": "raw",
"id": "a47da0d0-0927-4adb-93e6-99a434f732cf",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 1\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Decomposition\n",
"\n",
"When a user asks a question there is no guarantee that the relevant results can be returned with a single query. Sometimes to answer a question we need to split it into distinct sub-questions, retrieve results for each sub-question, and then answer using the cumulative context.\n",
"\n",
"For example if a user asks: \"How is Web Voyager different from reflection agents\", and we have one document that explains Web Voyager and one that explains reflection agents but no document that compares the two, then we'd likely get better results by retrieving for both \"What is Web Voyager\" and \"What are reflection agents\" and combining the retrieved documents than by retrieving based on the user question directly.\n",
"\n",
"This process of splitting an input into multiple distinct sub-queries is what we refer to as **query decomposition**. It is also sometimes referred to as sub-query generation. In this guide we'll walk through an example of how to do decomposition, using our example of a Q&A bot over the LangChain YouTube videos from the [Quickstart](/docs/use_cases/query_analysis/quickstart)."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain langchain-openai"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "57396e23-c192-4d97-846b-5eacea4d6b8d",
"metadata": {},
"source": [
"## Query generation\n",
"\n",
"To convert user questions to a list of sub questions we'll use OpenAI's function-calling API, which can return multiple functions each turn:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "0b51dd76-820d-41a4-98c8-893f6fe0d1ea",
"metadata": {},
"outputs": [],
"source": [
"import datetime\n",
"from typing import Literal, Optional, Tuple\n",
"\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class SubQuery(BaseModel):\n",
" \"\"\"Search over a database of tutorial videos about a software library.\"\"\"\n",
"\n",
" sub_query: str = Field(\n",
" ...,\n",
" description=\"A very specific query against the database.\",\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "783c03c3-8c72-4f88-9cf4-5829ce6745d6",
"metadata": {},
"outputs": [],
"source": [
"from langchain.output_parsers import PydanticToolsParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"system = \"\"\"You are an expert at converting user questions into database queries. \\\n",
"You have access to a database of tutorial videos about a software library for building LLM-powered applications. \\\n",
"\n",
"Perform query decomposition. Given a user question, break it down into distinct sub questions that \\\n",
"you need to answer in order to answer the original question.\n",
"\n",
"If there are acronyms or words you are not familiar with, do not try to rephrase them.\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"llm_with_tools = llm.bind_tools([SubQuery])\n",
"parser = PydanticToolsParser(tools=[SubQuery])\n",
"query_analyzer = prompt | llm_with_tools | parser"
]
},
{
"cell_type": "markdown",
"id": "f403517a-b8e3-44ac-b0a6-02f8305635a2",
"metadata": {},
"source": [
"Let's try it out:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "92bc7bac-700d-4666-b523-f0f8c3644ad5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[SubQuery(sub_query='How to do rag')]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke({\"question\": \"how to do rag\"})"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "87590c6d-edd7-4805-bf68-c906907f9291",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[SubQuery(sub_query='How to use multi-modal models in a chain?'),\n",
" SubQuery(sub_query='How to turn a chain into a REST API?')]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\n",
" {\n",
" \"question\": \"how to use multi-modal models in a chain and turn chain into a rest api\"\n",
" }\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "4c949da8-b97e-45f5-937b-5c431e59edad",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[SubQuery(sub_query='What is Web Voyager and how does it differ from Reflection Agents?'),\n",
" SubQuery(sub_query='Do Web Voyager and Reflection Agents use Langgraph?')]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\n",
" {\n",
" \"question\": \"what's the difference between web voyager and reflection agents? do they use langgraph?\"\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"id": "51ba81f0-f00b-4656-b840-037bf4306c60",
"metadata": {},
"source": [
"## Adding examples and tuning the prompt\n",
"\n",
"This works pretty well, but we probably want it to decompose the last question even further to separate the queries about Web Voyager and Reflection Agents. If we aren't sure up front what types of queries will do best with our index, we can also intentionally include some redundancy in our queries, so that we return both sub queries and higher level queries. \n",
"\n",
"To tune our query generation results, we can add some examples of inputs questions and gold standard output queries to our prompt. We can also try to improve our system message."
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "4d00d74b-7bc7-4224-ad09-fff8e7aeeaff",
"metadata": {},
"outputs": [],
"source": [
"examples = []"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "171f3c37-36da-4a80-911e-d0447168b9d8",
"metadata": {},
"outputs": [],
"source": [
"question = \"What's chat langchain, is it a langchain template?\"\n",
"queries = [\n",
" SubQuery(sub_query=\"What is chat langchain\"),\n",
" SubQuery(sub_query=\"What is a langchain template\"),\n",
"]\n",
"examples.append({\"input\": question, \"tool_calls\": queries})"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "92523941-5aa0-4d1e-a795-1d14529b48c2",
"metadata": {},
"outputs": [],
"source": [
"question = \"How would I use LangGraph to build an automaton\"\n",
"queries = [\n",
" SubQuery(sub_query=\"How to build automaton with LangGraph\"),\n",
"]\n",
"examples.append({\"input\": question, \"tool_calls\": queries})"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "844df58a-abd3-4c06-9a59-b7eccbbefc0a",
"metadata": {},
"outputs": [],
"source": [
"question = \"How to build multi-agent system and stream intermediate steps from it\"\n",
"queries = [\n",
" SubQuery(sub_query=\"How to build multi-agent system\"),\n",
" SubQuery(sub_query=\"How to stream intermediate steps\"),\n",
" SubQuery(sub_query=\"How to stream intermediate steps from multi-agent system\"),\n",
"]\n",
"examples.append({\"input\": question, \"tool_calls\": queries})"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "45288345-0c5a-4c57-b007-8981ce21aedd",
"metadata": {},
"outputs": [],
"source": [
"question = \"What's the difference between LangChain agents and LangGraph?\"\n",
"queries = [\n",
" SubQuery(sub_query=\"What's the difference between LangChain agents and LangGraph?\"),\n",
" SubQuery(sub_query=\"What are LangChain agents\"),\n",
" SubQuery(sub_query=\"What is LangGraph\"),\n",
"]\n",
"examples.append({\"input\": question, \"tool_calls\": queries})"
]
},
{
"cell_type": "markdown",
"id": "c68ee464-9f72-4a76-96fb-a87aeb29daa3",
"metadata": {},
"source": [
"Now we need to update our prompt template and chain so that the examples are included in each prompt. Since we're working with OpenAI function-calling, we'll need to do a bit of extra structuring to send example inputs and outputs to the model. We'll create a `tool_example_to_messages` helper function to handle this for us:"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "80a33517-afa5-4152-a041-55e01eadf04d",
"metadata": {},
"outputs": [],
"source": [
"import uuid\n",
"from typing import Dict, List\n",
"\n",
"from langchain_core.messages import (\n",
" AIMessage,\n",
" BaseMessage,\n",
" HumanMessage,\n",
" SystemMessage,\n",
" ToolMessage,\n",
")\n",
"\n",
"\n",
"def tool_example_to_messages(example: Dict) -> List[BaseMessage]:\n",
" messages: List[BaseMessage] = [HumanMessage(content=example[\"input\"])]\n",
" openai_tool_calls = []\n",
" for tool_call in example[\"tool_calls\"]:\n",
" openai_tool_calls.append(\n",
" {\n",
" \"id\": str(uuid.uuid4()),\n",
" \"type\": \"function\",\n",
" \"function\": {\n",
" \"name\": tool_call.__class__.__name__,\n",
" \"arguments\": tool_call.json(),\n",
" },\n",
" }\n",
" )\n",
" messages.append(\n",
" AIMessage(content=\"\", additional_kwargs={\"tool_calls\": openai_tool_calls})\n",
" )\n",
" tool_outputs = example.get(\"tool_outputs\") or [\n",
" \"This is an example of a correct usage of this tool. Make sure to continue using the tool this way.\"\n",
" ] * len(openai_tool_calls)\n",
" for output, tool_call in zip(tool_outputs, openai_tool_calls):\n",
" messages.append(ToolMessage(content=output, tool_call_id=tool_call[\"id\"]))\n",
" return messages\n",
"\n",
"\n",
"example_msgs = [msg for ex in examples for msg in tool_example_to_messages(ex)]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "696f4bf1-467a-497a-8478-9bc6c84dda33",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import MessagesPlaceholder\n",
"\n",
"system = \"\"\"You are an expert at converting user questions into database queries. \\\n",
"You have access to a database of tutorial videos about a software library for building LLM-powered applications. \\\n",
"\n",
"Perform query decomposition. Given a user question, break it down into the most specific sub questions you can \\\n",
"which will help you answer the original question. Each sub question should be about a single concept/fact/idea.\n",
"\n",
"If there are acronyms or words you are not familiar with, do not try to rephrase them.\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" MessagesPlaceholder(\"examples\", optional=True),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"query_analyzer_with_examples = (\n",
" prompt.partial(examples=example_msgs) | llm_with_tools | parser\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "82824a2b-8985-430c-817a-6c8466bddf37",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[SubQuery(sub_query=\"What's the difference between web voyager and reflection agents\"),\n",
" SubQuery(sub_query='Do web voyager and reflection agents use LangGraph'),\n",
" SubQuery(sub_query='What is web voyager'),\n",
" SubQuery(sub_query='What are reflection agents')]"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer_with_examples.invoke(\n",
" {\n",
" \"question\": \"what's the difference between web voyager and reflection agents? do they use langgraph?\"\n",
" }\n",
")"
]
},
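{
"cell_type": "markdown",
"id": "f4b6d8ea-3c4d-4e5f-a011-2c3d4e5f6071",
"metadata": {},
"source": [
"The introduction described retrieving results for each sub-question and then answering over the cumulative context. As a rough sketch of that last step (this assumes a `retriever`, e.g. the vector store retriever built in the [Quickstart](/docs/use_cases/query_analysis/quickstart), which is not defined in this notebook), it might look something like:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "05c7e9fb-4d5e-4f60-b122-3d4e5f607182",
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: retrieve for each sub-question, then answer over the combined context.\n",
"# Assumes `retriever` is any LangChain retriever (e.g. vectorstore.as_retriever()).\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"\n",
"def retrieve_context(question: str) -> str:\n",
"    sub_queries = query_analyzer_with_examples.invoke({\"question\": question})\n",
"    docs = []\n",
"    for sub_query in sub_queries:\n",
"        docs.extend(retriever.invoke(sub_query.sub_query))\n",
"    # Deduplicate by content and join everything into one context string.\n",
"    return \"\\n\\n\".join({doc.page_content for doc in docs})\n",
"\n",
"\n",
"answer_prompt = ChatPromptTemplate.from_messages(\n",
"    [\n",
"        (\"system\", \"Answer the user question using the following context:\\n\\n{context}\"),\n",
"        (\"human\", \"{question}\"),\n",
"    ]\n",
")\n",
"answer_chain = (\n",
"    {\"context\": retrieve_context, \"question\": lambda question: question}\n",
"    | answer_prompt\n",
"    | llm\n",
"    | StrOutputParser()\n",
")"
]
},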
{
"cell_type": "code",
"execution_count": null,
"id": "7d872bf9-e70f-44f7-be86-4aa130ca9d17",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv-2",
"language": "python",
"name": "poetry-venv-2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,212 @@
{
"cells": [
{
"cell_type": "raw",
"id": "a47da0d0-0927-4adb-93e6-99a434f732cf",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 2\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Expansion\n",
"\n",
"Information retrieval systems can be sensitive to phrasing and specific keywords. To mitigate this, one classic retrieval technique is to generate multiple paraphrased versions of a query and return results for all versions of the query. This is called **query expansion**. LLMs are a great tool for generating these alternate versions of a query.\n",
"\n",
"Let's take a look at how we might do query expansion for our Q&A bot over the LangChain YouTube videos, which we started in the [Quickstart](/docs/use_cases/query_analysis/quickstart)."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain langchain-openai"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "f8b08c52-1ce9-4d8b-a779-cbe8efde51d1",
"metadata": {},
"source": [
"## Query generation\n",
"\n",
"To make sure we get multiple paraphrasings we'll use OpenAI's function-calling API."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "0b51dd76-820d-41a4-98c8-893f6fe0d1ea",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class ParaphrasedQuery(BaseModel):\n",
" \"\"\"You have performed query expansion to generate a paraphrasing of a question.\"\"\"\n",
"\n",
" paraphrased_query: str = Field(\n",
" ...,\n",
" description=\"A unique paraphrasing of the original question.\",\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "783c03c3-8c72-4f88-9cf4-5829ce6745d6",
"metadata": {},
"outputs": [],
"source": [
"from langchain.output_parsers import PydanticToolsParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"system = \"\"\"You are an expert at converting user questions into database queries. \\\n",
"You have access to a database of tutorial videos about a software library for building LLM-powered applications. \\\n",
"\n",
"Perform query expansion. If there are multiple common ways of phrasing a user question \\\n",
"or common synonyms for key words in the question, make sure to return multiple versions \\\n",
"of the query with the different phrasings.\n",
"\n",
"If there are acronyms or words you are not familiar with, do not try to rephrase them.\n",
"\n",
"Return at least 3 versions of the question.\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"llm_with_tools = llm.bind_tools([ParaphrasedQuery])\n",
"query_analyzer = prompt | llm_with_tools | PydanticToolsParser(tools=[ParaphrasedQuery])"
]
},
{
"cell_type": "markdown",
"id": "f403517a-b8e3-44ac-b0a6-02f8305635a2",
"metadata": {},
"source": [
"Let's see what queries our analyzer generates for the questions we searched earlier:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "af62af17-4f90-4dbd-a8b4-dfff51f1db95",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[ParaphrasedQuery(paraphrased_query='How to utilize multi-modal models sequentially and convert the sequence into a REST API'),\n",
" ParaphrasedQuery(paraphrased_query='Steps for using multi-modal models in a series and transforming the series into a RESTful API'),\n",
" ParaphrasedQuery(paraphrased_query='Guide on employing multi-modal models in a chain and converting the chain into a RESTful API')]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\n",
" {\n",
" \"question\": \"how to use multi-modal models in a chain and turn chain into a rest api\"\n",
" }\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "4d05b888-b0ff-4e00-abc9-98adfb1c92be",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[ParaphrasedQuery(paraphrased_query='How to stream events from LLM agent?'),\n",
" ParaphrasedQuery(paraphrased_query='How can I receive events from LLM agent in real-time?'),\n",
" ParaphrasedQuery(paraphrased_query='What is the process for capturing events from LLM agent?')]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke({\"question\": \"stream events from llm agent\"})"
]
}
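,
{
"cell_type": "markdown",
"id": "16d8fa0c-5e6f-4071-8233-4e5f60718293",
"metadata": {},
"source": [
"From here, a rough sketch of how you might actually use the expanded queries (this assumes a `retriever`, e.g. the vector store retriever built in the [Quickstart](/docs/use_cases/query_analysis/quickstart), which is not defined in this notebook) is to run retrieval for every paraphrasing and de-duplicate the combined results:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "27e90b1d-6f70-4182-9344-5f6071829304",
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: fan the paraphrased queries out to a retriever and merge results.\n",
"# Assumes `retriever` is any LangChain retriever (e.g. vectorstore.as_retriever()).\n",
"from typing import List\n",
"\n",
"from langchain_core.documents import Document\n",
"\n",
"\n",
"def expanded_retrieval(question: str) -> List[Document]:\n",
"    queries = query_analyzer.invoke({\"question\": question})\n",
"    docs = []\n",
"    seen = set()\n",
"    for query in queries:\n",
"        for doc in retriever.invoke(query.paraphrased_query):\n",
"            # De-duplicate on content so the same chunk isn't returned twice.\n",
"            if doc.page_content not in seen:\n",
"                seen.add(doc.page_content)\n",
"                docs.append(doc)\n",
"    return docs"
]
}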
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv-2",
"language": "python",
"name": "poetry-venv-2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,274 @@
{
"cells": [
{
"cell_type": "raw",
"id": "a47da0d0-0927-4adb-93e6-99a434f732cf",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 2\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# HyDE\n",
"\n",
"If we're working with a similarity search-based index, like a vector store, then searching on raw questions may not work well because their embeddings may not be very similar to those of the relevant documents. Instead it might help to have the model generate a hypothetical relevant document, and then use that to perform similarity search. This is the key idea behind [Hypothetical Document Embedding, or HyDE](https://arxiv.org/pdf/2212.10496.pdf).\n",
"\n",
"Let's take a look at how we might perform search via hypothetical documents for our Q&A bot over the LangChain YouTube videos."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain langchain-openai"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "f8b08c52-1ce9-4d8b-a779-cbe8efde51d1",
"metadata": {},
"source": [
"## Hypothetical document generation\n",
"\n",
"Ultimately generating a relevant hypothetical document reduces to trying to answer the user question. Since we're desiging a Q&A bot for LangChain YouTube videos, we'll provide some basic context about LangChain and prompt the model to use a more pedantic style so that we get more realistic hypothetical documents:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "783c03c3-8c72-4f88-9cf4-5829ce6745d6",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"system = \"\"\"You are an expert about a set of software for building LLM-powered applications called LangChain, LangGraph, LangServe, and LangSmith.\n",
"\n",
"LangChain is a Python framework that provides a large set of integrations that can easily be composed to build LLM applications.\n",
"LangGraph is a Python package built on top of LangChain that makes it easy to build stateful, multi-actor LLM applications.\n",
"LangServe is a Python package built on top of LangChain that makes it easy to deploy a LangChain application as a REST API.\n",
"LangSmith is a platform that makes it easy to trace and test LLM applications.\n",
"\n",
"Answer the user question as best you can. Answer as though you were writing a tutorial that addressed the user question.\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"qa_no_context = prompt | llm | StrOutputParser()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "af62af17-4f90-4dbd-a8b4-dfff51f1db95",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"To use multi-modal models in a chain and turn the chain into a REST API, you can leverage the capabilities of LangChain, LangGraph, and LangServe. Here's a step-by-step guide on how to achieve this:\n",
"\n",
"1. **Building a Multi-Modal Model with LangChain**:\n",
" - Start by defining your multi-modal model using LangChain. LangChain provides integrations with various deep learning frameworks like TensorFlow, PyTorch, and Hugging Face Transformers, making it easy to incorporate different modalities such as text, images, and audio.\n",
" - You can create separate components for each modality and then combine them in a chain to build a multi-modal model.\n",
"\n",
"2. **Building a Stateful, Multi-Actor Application with LangGraph**:\n",
" - Once you have your multi-modal model defined in LangChain, you can use LangGraph to build a stateful, multi-actor application around it.\n",
" - LangGraph allows you to define actors that interact with each other and maintain state, which is useful for handling multi-modal inputs and outputs in a chain.\n",
"\n",
"3. **Deploying the Chain as a REST API with LangServe**:\n",
" - After building your multi-modal model and application using LangChain and LangGraph, you can deploy the chain as a REST API using LangServe.\n",
" - LangServe simplifies the process of exposing your LangChain application as a REST API, allowing you to easily interact with your multi-modal model through HTTP requests.\n",
"\n",
"4. **Testing and Tracing with LangSmith**:\n",
" - To ensure the reliability and performance of your multi-modal model and REST API, you can use LangSmith for testing and tracing.\n",
" - LangSmith provides tools for tracing the execution of your LLM applications and running tests to validate their functionality.\n",
"\n",
"By following these steps and leveraging the capabilities of LangChain, LangGraph, LangServe, and LangSmith, you can effectively use multi-modal models in a chain and turn the chain into a REST API.\n"
]
}
],
"source": [
"answer = qa_no_context.invoke(\n",
" {\n",
" \"question\": \"how to use multi-modal models in a chain and turn chain into a rest api\"\n",
" }\n",
")\n",
"print(answer)"
]
},
{
"cell_type": "markdown",
"id": "3e58a714-9368-4e8e-a163-58bc4a5e56e6",
"metadata": {},
"source": [
"## Returning the hypothetical document and original question\n",
"\n",
"To increase our recall we may want to retrieve documents based on both the hypothetical document and the original question. We can easily return both like so:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "447ed63c-ba9f-4eaf-8ed8-b3235e45da4e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'question': 'how to use multi-modal models in a chain and turn chain into a rest api',\n",
" 'hypothetical_document': \"To use multi-modal models in a chain and turn the chain into a REST API, you can leverage the capabilities of LangChain, LangGraph, and LangServe. Here's a step-by-step guide on how to achieve this:\\n\\n1. **Set up your multi-modal models**: First, you need to create or import your multi-modal models. These models can include text, image, audio, or any other type of data that you want to process in your LLM application.\\n\\n2. **Build your LangGraph application**: Use LangGraph to build a stateful, multi-actor LLM application that incorporates your multi-modal models. LangGraph allows you to define the flow of data and interactions between different components of your application.\\n\\n3. **Integrate your models in LangChain**: LangChain provides integrations for various types of models and data sources. You can easily integrate your multi-modal models into your LangGraph application using LangChain's capabilities.\\n\\n4. **Deploy your LangChain application as a REST API using LangServe**: Once you have built your multi-modal LLM application using LangGraph and LangChain, you can deploy it as a REST API using LangServe. LangServe simplifies the process of exposing your LangChain application as a web service, making it accessible to other applications and users.\\n\\n5. **Test and trace your application using LangSmith**: Finally, you can use LangSmith to trace and test your multi-modal LLM application. LangSmith provides tools for monitoring the performance of your application, debugging any issues, and ensuring that it functions as expected.\\n\\nBy following these steps and leveraging the capabilities of LangChain, LangGraph, LangServe, and LangSmith, you can effectively use multi-modal models in a chain and turn the chain into a REST API.\"}"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.runnables import RunnablePassthrough\n",
"\n",
"hyde_chain = RunnablePassthrough.assign(hypothetical_document=qa_no_context)\n",
"\n",
"hyde_chain.invoke(\n",
" {\n",
" \"question\": \"how to use multi-modal models in a chain and turn chain into a rest api\"\n",
" }\n",
")"
]
},
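{
"cell_type": "markdown",
"id": "7b2fdbc1-5f08-4a3e-9a9d-2c1e4d6b8a01",
"metadata": {},
"source": [
"Of course, the hypothetical document is only useful once we actually search with it. Below is a minimal sketch of that retrieval step. It assumes we already have a `retriever` over the video transcripts (it is not defined in this notebook) and that we want results for both the hypothetical document and the original question:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9d4c3a55-6e1f-4b2a-8c7d-0f1e2a3b4c5d",
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: `retriever` stands in for whatever retriever indexes the\n",
"# video transcripts; it is not defined in this notebook.\n",
"\n",
"\n",
"def retrieve_with_hyde(question: str):\n",
"    # Generate the hypothetical document alongside the original question.\n",
"    inputs = hyde_chain.invoke({\"question\": question})\n",
"    # Search with both and merge the results.\n",
"    docs = retriever.invoke(inputs[\"hypothetical_document\"])\n",
"    docs += retriever.invoke(inputs[\"question\"])\n",
"    # De-duplicate by content while preserving order.\n",
"    unique = {doc.page_content: doc for doc in docs}\n",
"    return list(unique.values())"
]
},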
{
"cell_type": "markdown",
"id": "a42ed79f-cb6f-490e-8186-a4a05b223857",
"metadata": {},
"source": [
"## Using function-calling to get structured output\n",
"\n",
"If we were composing this technique with other query analysis techniques, we'd likely be using function calling to get out structured query objects. We can use function-calling for HyDE like so:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "2e1fecf6-9c07-4efa-80eb-8fb15392b25f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Query(answer='To use multi-modal models in a chain and turn the chain into a REST API, you can follow these steps:\\n\\n1. Use LangChain to build your multi-modal model by integrating different modalities such as text, image, and audio.\\n2. Utilize LangGraph, a Python package built on top of LangChain, to create a stateful, multi-actor LLM application that can handle interactions between different modalities.\\n3. Once your multi-modal model is built using LangChain and LangGraph, you can deploy it as a REST API using LangServe, another Python package that simplifies the process of creating REST APIs from LangChain applications.\\n4. Use LangSmith to trace and test your multi-modal model to ensure its functionality and performance.\\n\\nBy following these steps, you can effectively use multi-modal models in a chain and turn the chain into a REST API.')]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.output_parsers.openai_tools import PydanticToolsParser\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class Query(BaseModel):\n",
" answer: str = Field(\n",
" ...,\n",
" description=\"Answer the user question as best you can. Answer as though you were writing a tutorial that addressed the user question.\",\n",
" )\n",
"\n",
"\n",
"system = \"\"\"You are an expert about a set of software for building LLM-powered applications called LangChain, LangGraph, LangServe, and LangSmith.\n",
"\n",
"LangChain is a Python framework that provides a large set of integrations that can easily be composed to build LLM applications.\n",
"LangGraph is a Python package built on top of LangChain that makes it easy to build stateful, multi-actor LLM applications.\n",
"LangServe is a Python package built on top of LangChain that makes it easy to deploy a LangChain application as a REST API.\n",
"LangSmith is a platform that makes it easy to trace and test LLM applications.\"\"\"\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"llm_with_tools = llm.bind_tools([Query])\n",
"hyde_chain = prompt | llm_with_tools | PydanticToolsParser(tools=[Query])\n",
"hyde_chain.invoke(\n",
" {\n",
" \"question\": \"how to use multi-modal models in a chain and turn chain into a rest api\"\n",
" }\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv-2",
"language": "python",
"name": "poetry-venv-2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,262 @@
{
"cells": [
{
"cell_type": "raw",
"id": "a47da0d0-0927-4adb-93e6-99a434f732cf",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 2\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Routing\n",
"\n",
"Sometimes we have multiple indexes for different domains, and for different questions we want to query different subsets of these indexes. For example, suppose we had one vector store index for all of the LangChain python documentation and one for all of the LangChain js documentation. Given a question about LangChain usage, we'd want to infer which language the the question was referring to and query the appropriate docs. **Query routing** is the process of classifying which index or subset of indexes a query should be performed on."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain-core langchain-openai"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "f8b08c52-1ce9-4d8b-a779-cbe8efde51d1",
"metadata": {},
"source": [
"## Routing with function calling models\n",
"\n",
"With function-calling models it's simple to use models for classification, which is what routing comes down to:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "783c03c3-8c72-4f88-9cf4-5829ce6745d6",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/bagatur/langchain/libs/core/langchain_core/_api/beta_decorator.py:86: LangChainBetaWarning: The function `with_structured_output` is in beta. It is actively being worked on, so the API may change.\n",
" warn_beta(\n"
]
}
],
"source": [
"from typing import Literal\n",
"\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"\n",
"class RouteQuery(BaseModel):\n",
" \"\"\"Route a user query to the most relevant datasource.\"\"\"\n",
"\n",
" datasource: Literal[\"python_docs\", \"js_docs\", \"golang_docs\"] = Field(\n",
" ...,\n",
" description=\"Given a user question choose which datasource would be most relevant for answering their question\",\n",
" )\n",
"\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"structured_llm = llm.with_structured_output(RouteQuery)\n",
"\n",
"system = \"\"\"You are an expert at routing a user question to the appropriate data source.\n",
"\n",
"Based on the programming language the question is referring to, route it to the relevant data source.\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"\n",
"router = prompt | structured_llm"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "61c682f1-9c46-4d7e-b909-5cfdabf41544",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"RouteQuery(datasource='python_docs')"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"\"\"Why doesn't the following code work:\n",
"\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"prompt = ChatPromptTemplate.from_messages([\"human\", \"speak in {language}\"])\n",
"prompt.invoke(\"french\")\n",
"\"\"\"\n",
"router.invoke({\"question\": question})"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "3be4c9de-3b79-4f78-928c-0a65a0b87193",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"RouteQuery(datasource='js_docs')"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"\"\"Why doesn't the following code work:\n",
"\n",
"\n",
"import { ChatPromptTemplate } from \"@langchain/core/prompts\";\n",
"\n",
"\n",
"const chatPrompt = ChatPromptTemplate.fromMessages([\n",
" [\"human\", \"speak in {language}\"],\n",
"]);\n",
"\n",
"const formattedChatPrompt = await chatPrompt.invoke({\n",
" input_language: \"french\"\n",
"});\n",
"\"\"\"\n",
"router.invoke({\"question\": question})"
]
},
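{
"cell_type": "markdown",
"id": "0d3a9f7e-2b64-4c1d-8e5a-6f7a8b9c0d1e",
"metadata": {},
"source": [
"The router only classifies the question; downstream we still need to act on that classification. Here is a minimal sketch of dispatching on the result. The strings returned below are placeholders for whatever chain or retriever actually backs each datasource:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5e8b1c2d-7a9f-4e3b-b0c1-2d3e4f5a6b7c",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.runnables import RunnableLambda\n",
"\n",
"\n",
"def choose_route(result: RouteQuery) -> str:\n",
"    # In practice, return (or invoke) the chain or retriever for the chosen\n",
"    # datasource instead of a placeholder string.\n",
"    if result.datasource == \"python_docs\":\n",
"        return \"chain for python_docs\"\n",
"    elif result.datasource == \"js_docs\":\n",
"        return \"chain for js_docs\"\n",
"    return \"chain for golang_docs\"\n",
"\n",
"\n",
"full_chain = router | RunnableLambda(choose_route)\n",
"full_chain.invoke({\"question\": question})"
]
},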
{
"cell_type": "markdown",
"id": "38e01995-fa14-4f55-96e9-ae17d0d86e48",
"metadata": {},
"source": [
"## Routing to multiple indexes\n",
"\n",
"If we may want to query multiple indexes we can do that, too, by updating our schema to accept a List of data sources:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "af62af17-4f90-4dbd-a8b4-dfff51f1db95",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"RouteQuery(datasources=['python_docs', 'js_docs'])"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from typing import List\n",
"\n",
"\n",
"class RouteQuery(BaseModel):\n",
" \"\"\"Route a user query to the most relevant datasource.\"\"\"\n",
"\n",
" datasources: List[Literal[\"python_docs\", \"js_docs\", \"golang_docs\"]] = Field(\n",
" ...,\n",
" description=\"Given a user question choose which datasources would be most relevant for answering their question\",\n",
" )\n",
"\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"structured_llm = llm.with_structured_output(RouteQuery)\n",
"router = prompt | structured_llm\n",
"router.invoke(\n",
" {\n",
" \"question\": \"is there feature parity between the Python and JS implementations of OpenAI chat models\"\n",
" }\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv-2",
"language": "python",
"name": "poetry-venv-2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,244 @@
{
"cells": [
{
"cell_type": "raw",
"id": "a47da0d0-0927-4adb-93e6-99a434f732cf",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 2\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Step back prompting\n",
"\n",
"Sometimes search quality and model generations can be tripped up by the specifics of a question. One way to handle this is to first generate a more abstract, \"step back\" question and to query based on both the original and step back question.\n",
"\n",
"For example, if we ask a question of the form \"Why does my LangGraph agent astream_events return {LONG_TRACE} instead of {DESIRED_OUTPUT}\" we will likely retrieve more relevant documents if we search with the more generic question \"How does astream_events work with a LangGraph agent\" than if we search with the specific user question.\n",
"\n",
"Let's take a look at how we might use step back prompting in the context of our Q&A bot over the LangChain YouTube videos."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain-core langchain-openai"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "f8b08c52-1ce9-4d8b-a779-cbe8efde51d1",
"metadata": {},
"source": [
"## Step back question generation\n",
"\n",
"Generating good step back questions comes down to writing a good prompt:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "783c03c3-8c72-4f88-9cf4-5829ce6745d6",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"system = \"\"\"You are an expert at taking a specific question and extracting a more generic question that gets at \\\n",
"the underlying principles needed to answer the specific question.\n",
"\n",
"You will be asked about a set of software for building LLM-powered applications called LangChain, LangGraph, LangServe, and LangSmith.\n",
"\n",
"LangChain is a Python framework that provides a large set of integrations that can easily be composed to build LLM applications.\n",
"LangGraph is a Python package built on top of LangChain that makes it easy to build stateful, multi-actor LLM applications.\n",
"LangServe is a Python package built on top of LangChain that makes it easy to deploy a LangChain application as a REST API.\n",
"LangSmith is a platform that makes it easy to trace and test LLM applications.\n",
"\n",
"Given a specific user question about one or more of these products, write a more generic question that needs to be answered in order to answer the specific question. \\\n",
"\n",
"If you don't recognize a word or acronym to not try to rewrite it.\n",
"\n",
"Write concise questions.\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"step_back = prompt | llm | StrOutputParser()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "af62af17-4f90-4dbd-a8b4-dfff51f1db95",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"What are the specific methods or functions provided by LangGraph for extracting LLM calls from an event stream that includes various types of interactions and data sources?\n"
]
}
],
"source": [
"question = (\n",
" \"I built a LangGraph agent using Gemini Pro and tools like vectorstores and duckduckgo search. \"\n",
" \"How do I get just the LLM calls from the event stream\"\n",
")\n",
"result = step_back.invoke({\"question\": question})\n",
"print(result)"
]
},
{
"cell_type": "markdown",
"id": "3e58a714-9368-4e8e-a163-58bc4a5e56e6",
"metadata": {},
"source": [
"## Returning the stepback question and the original question\n",
"\n",
"To increase our recall we'll likely want to retrieve documents based on both the step back question and the original question. We can easily return both like so:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "447ed63c-ba9f-4eaf-8ed8-b3235e45da4e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'question': 'I built a LangGraph agent using Gemini Pro and tools like vectorstores and duckduckgo search. How do I get just the LLM calls from the event stream',\n",
" 'step_back': 'What are the specific methods or functions provided by LangGraph for extracting LLM calls from an event stream generated by an agent built using external tools like Gemini Pro, vectorstores, and DuckDuckGo search?'}"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.runnables import RunnablePassthrough\n",
"\n",
"step_back_and_original = RunnablePassthrough.assign(step_back=step_back)\n",
"\n",
"step_back_and_original.invoke({\"question\": question})"
]
},
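{
"cell_type": "markdown",
"id": "4f6e7d8c-1a2b-4c3d-9e0f-5a6b7c8d9e0f",
"metadata": {},
"source": [
"As a rough sketch of how both questions might be used downstream, we can retrieve for each and merge the results. This assumes a `retriever` over the video index, which is not defined in this notebook:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8a9b0c1d-2e3f-4a5b-8c7d-6e5f4a3b2c1d",
"metadata": {},
"outputs": [],
"source": [
"# `retriever` is a placeholder for whatever retriever backs the video index.\n",
"\n",
"\n",
"def retrieve_with_step_back(question: str):\n",
"    queries = step_back_and_original.invoke({\"question\": question})\n",
"    # Retrieve for the original and the step back question in one batch call.\n",
"    results = retriever.batch([queries[\"question\"], queries[\"step_back\"]])\n",
"    # Flatten and de-duplicate by content.\n",
"    docs = [doc for result in results for doc in result]\n",
"    unique = {doc.page_content: doc for doc in docs}\n",
"    return list(unique.values())"
]
},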
{
"cell_type": "markdown",
"id": "a42ed79f-cb6f-490e-8186-a4a05b223857",
"metadata": {},
"source": [
"## Using function-calling to get structured output\n",
"\n",
"If we were composing this technique with other query analysis techniques, we'd likely be using function calling to get out structured query objects. We can use function-calling for step back prompting like so:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "2e1fecf6-9c07-4efa-80eb-8fb15392b25f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[StepBackQuery(step_back_question='What are the steps to filter and extract specific types of calls from an event stream in a Python framework like LangGraph?')]"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.output_parsers.openai_tools import PydanticToolsParser\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class StepBackQuery(BaseModel):\n",
" step_back_question: str = Field(\n",
" ...,\n",
" description=\"Given a specific user question about one or more of these products, write a more generic question that needs to be answered in order to answer the specific question.\",\n",
" )\n",
"\n",
"\n",
"llm_with_tools = llm.bind_tools([StepBackQuery])\n",
"hyde_chain = prompt | llm_with_tools | PydanticToolsParser(tools=[StepBackQuery])\n",
"hyde_chain.invoke({\"question\": question})"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv-2",
"language": "python",
"name": "poetry-venv-2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,731 @@
{
"cells": [
{
"cell_type": "raw",
"id": "a47da0d0-0927-4adb-93e6-99a434f732cf",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 3\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Structuring\n",
"\n",
"One of the most important steps in retrieval is turning a text input into the right search and filter parameters. This process of extracting structured parameters from an unstructured input is what we refer to as **query structuring**.\n",
"\n",
"To illustrate, let's return to our example of a Q&A bot over the LangChain YouTube videos from the [Quickstart](/docs/use_cases/query_analysis/quickstart) and see what more complex structured queries might look like in this case."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain langchain-openai youtube-transcript-api pytube"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"# os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "c20b48b8-16d7-4089-bc17-f2d240b3935a",
"metadata": {},
"source": [
"### Load example document\n",
"\n",
"Let's load a representative document"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "ae6921e1-3d5a-431c-9999-29a5f33201e1",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.document_loaders import YoutubeLoader\n",
"\n",
"docs = YoutubeLoader.from_youtube_url(\n",
" \"https://www.youtube.com/watch?v=pbAd8O1Lvm4\", add_video_info=True\n",
").load()"
]
},
{
"cell_type": "markdown",
"id": "05a71032-14c3-4517-aa9a-3a5e88eaeb92",
"metadata": {},
"source": [
"Here's the metadata associated with a video:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "c7748415-ddbf-4c55-a242-c28833c03caf",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'source': 'pbAd8O1Lvm4',\n",
" 'title': 'Self-reflective RAG with LangGraph: Self-RAG and CRAG',\n",
" 'description': 'Unknown',\n",
" 'view_count': 9006,\n",
" 'thumbnail_url': 'https://i.ytimg.com/vi/pbAd8O1Lvm4/hq720.jpg',\n",
" 'publish_date': '2024-02-07 00:00:00',\n",
" 'length': 1058,\n",
" 'author': 'LangChain'}"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs[0].metadata"
]
},
{
"cell_type": "markdown",
"id": "5db72331-1e79-4910-8faa-473a0e370277",
"metadata": {},
"source": [
"And here's a sample from a document's contents:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "845149b7-130e-4228-ac80-d0a9286ef1d3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"hi this is Lance from Lang chain I'm going to be talking about using Lang graph to build a diverse and sophisticated rag flows so just to set the stage the basic rag flow you can see here starts with a question retrieval of relevant documents from an index which are passed into the context window of an llm for generation of an answer grounded in the ret documents so that's kind of the basic outline and we can see it's like a very linear path um in practice though you often encounter a few differ\""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs[0].page_content[:500]"
]
},
{
"cell_type": "markdown",
"id": "57396e23-c192-4d97-846b-5eacea4d6b8d",
"metadata": {},
"source": [
"## Query schema\n",
"\n",
"In order to generate structured queries we first need to define our query schema. We can see that each document has a title, view count, publication date, and length in seconds. Let's assume we've built an index that allows us to perform unstructured search over the contents and title of each document, and to use range filtering on view count, publication date, and length.\n",
"\n",
"To start we'll create a schema with explicit min and max attributes for view count, publication date, and video length so that those can be filtered on. And we'll add separate attributes for searches against the transcript contents versus the video title. \n",
"\n",
"We could alternatively create a more generic schema where instead of having one or more filter attributes for each filterable field, we have a single `filters` attribute that takes a list of (attribute, condition, value) tuples. We'll demonstrate how to do this as well. Which approach works best depends on the complexity of your index. If you have many filterable fields then it may be better to have a single `filters` query attribute. If you have only a few filterable fields and/or there are fields that can only be filtered in very specific ways, it can be helpful to have separate query attributes for them, each with their own description."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "0b51dd76-820d-41a4-98c8-893f6fe0d1ea",
"metadata": {},
"outputs": [],
"source": [
"import datetime\n",
"from typing import Literal, Optional, Tuple\n",
"\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class TutorialSearch(BaseModel):\n",
" \"\"\"Search over a database of tutorial videos about a software library.\"\"\"\n",
"\n",
" content_search: str = Field(\n",
" ...,\n",
" description=\"Similarity search query applied to video transcripts.\",\n",
" )\n",
" title_search: str = Field(\n",
" ...,\n",
" description=(\n",
" \"Alternate version of the content search query to apply to video titles. \"\n",
" \"Should be succinct and only include key words that could be in a video \"\n",
" \"title.\"\n",
" ),\n",
" )\n",
" min_view_count: Optional[int] = Field(\n",
" None,\n",
" description=\"Minimum view count filter, inclusive. Only use if explicitly specified.\",\n",
" )\n",
" max_view_count: Optional[int] = Field(\n",
" None,\n",
" description=\"Maximum view count filter, exclusive. Only use if explicitly specified.\",\n",
" )\n",
" earliest_publish_date: Optional[datetime.date] = Field(\n",
" None,\n",
" description=\"Earliest publish date filter, inclusive. Only use if explicitly specified.\",\n",
" )\n",
" latest_publish_date: Optional[datetime.date] = Field(\n",
" None,\n",
" description=\"Latest publish date filter, exclusive. Only use if explicitly specified.\",\n",
" )\n",
" min_length_sec: Optional[int] = Field(\n",
" None,\n",
" description=\"Minimum video length in seconds, inclusive. Only use if explicitly specified.\",\n",
" )\n",
" max_length_sec: Optional[int] = Field(\n",
" None,\n",
" description=\"Maximum video length in seconds, exclusive. Only use if explicitly specified.\",\n",
" )\n",
"\n",
" def pretty_print(self) -> None:\n",
" for field in self.__fields__:\n",
" if getattr(self, field) is not None and getattr(self, field) != getattr(\n",
" self.__fields__[field], \"default\", None\n",
" ):\n",
" print(f\"{field}: {getattr(self, field)}\")"
]
},
{
"cell_type": "markdown",
"id": "f8b08c52-1ce9-4d8b-a779-cbe8efde51d1",
"metadata": {},
"source": [
"## Query generation\n",
"\n",
"To convert user questions to structured queries we'll make use of a function-calling model, like ChatOpenAI. LangChain has some nice constructors that make it easy to specify a desired function call schema via a Pydantic class:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "783c03c3-8c72-4f88-9cf4-5829ce6745d6",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"system = \"\"\"You are an expert at converting user questions into database queries. \\\n",
"You have access to a database of tutorial videos about a software library for building LLM-powered applications. \\\n",
"Given a question, return a database query optimized to retrieve the most relevant results.\n",
"\n",
"If there are acronyms or words you are not familiar with, do not try to rephrase them.\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"structured_llm = llm.with_structured_output(TutorialSearch)\n",
"query_analyzer = prompt | structured_llm"
]
},
{
"cell_type": "markdown",
"id": "f403517a-b8e3-44ac-b0a6-02f8305635a2",
"metadata": {},
"source": [
"Let's try it out:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "92bc7bac-700d-4666-b523-f0f8c3644ad5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content_search: rag from scratch\n",
"title_search: rag from scratch\n"
]
}
],
"source": [
"query_analyzer.invoke({\"question\": \"rag from scratch\"}).pretty_print()"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "af62af17-4f90-4dbd-a8b4-dfff51f1db95",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content_search: chat langchain\n",
"title_search: chat langchain\n",
"earliest_publish_date: 2023-01-01\n",
"latest_publish_date: 2024-01-01\n"
]
}
],
"source": [
"query_analyzer.invoke(\n",
" {\"question\": \"videos on chat langchain published in 2023\"}\n",
").pretty_print()"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "87590c6d-edd7-4805-bf68-c906907f9291",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content_search: multi-modal models agent\n",
"title_search: multi-modal models agent\n",
"max_length_sec: 300\n"
]
}
],
"source": [
"query_analyzer.invoke(\n",
" {\n",
" \"question\": \"how to use multi-modal models in an agent, only videos under 5 minutes\"\n",
" }\n",
").pretty_print()"
]
},
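{
"cell_type": "markdown",
"id": "2c4d6e8f-0a1b-4c2d-8e3f-9a0b1c2d3e4f",
"metadata": {},
"source": [
"Before looking at alternative schemas, here is a rough sketch of how a structured query like this might be turned into filter conditions for an index. The `$gte`/`$lt` syntax is purely illustrative; the actual filter format depends on the database or vector store backing the index:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6b7c8d9e-1f2a-4b3c-9d4e-0f1a2b3c4d5e",
"metadata": {},
"outputs": [],
"source": [
"def as_filters(search: TutorialSearch) -> dict:\n",
"    \"\"\"Collect the non-null range filters into a single, illustrative dict.\"\"\"\n",
"    ranges = {\n",
"        \"view_count\": (search.min_view_count, search.max_view_count),\n",
"        \"publish_date\": (search.earliest_publish_date, search.latest_publish_date),\n",
"        \"length_sec\": (search.min_length_sec, search.max_length_sec),\n",
"    }\n",
"    filters = {}\n",
"    for field, (low, high) in ranges.items():\n",
"        conditions = {}\n",
"        if low is not None:\n",
"            conditions[\"$gte\"] = low\n",
"        if high is not None:\n",
"            conditions[\"$lt\"] = high\n",
"        if conditions:\n",
"            filters[field] = conditions\n",
"    return filters\n",
"\n",
"\n",
"as_filters(\n",
"    query_analyzer.invoke({\"question\": \"videos on chat langchain published in 2023\"})\n",
")"
]
},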
{
"cell_type": "markdown",
"id": "b35e7ddf-ed39-4e70-a980-29a4c2d93ebd",
"metadata": {},
"source": [
"## Alternative: Succinct schema\n",
"\n",
"If we have many filterable fields then having a verbose schema could harm performance, or may not even be possible given limitations on the size of function schemas. In these cases we can try more succinct query schemas that trade off some explicitness of direction for concision:"
]
},
{
"cell_type": "code",
"execution_count": 64,
"id": "81a036c0-c770-47dc-8b06-1dcfa403fdb1",
"metadata": {},
"outputs": [],
"source": [
"from typing import List, Literal, Union\n",
"\n",
"\n",
"class Filter(BaseModel):\n",
" field: Literal[\"view_count\", \"publish_date\", \"length_sec\"]\n",
" comparison: Literal[\"eq\", \"lt\", \"lte\", \"gt\", \"gte\"]\n",
" value: Union[int, datetime.date] = Field(\n",
" ...,\n",
" description=\"If field is publish_date then value must be a ISO-8601 format date\",\n",
" )\n",
"\n",
"\n",
"class TutorialSearch(BaseModel):\n",
" \"\"\"Search over a database of tutorial videos about a software library.\"\"\"\n",
"\n",
" content_search: str = Field(\n",
" ...,\n",
" description=\"Similarity search query applied to video transcripts.\",\n",
" )\n",
" title_search: str = Field(\n",
" ...,\n",
" description=(\n",
" \"Alternate version of the content search query to apply to video titles. \"\n",
" \"Should be succinct and only include key words that could be in a video \"\n",
" \"title.\"\n",
" ),\n",
" )\n",
" filters: List[Filter] = Field(\n",
" default_factory=list,\n",
" description=\"Filters over specific fields. Final condition is a logical conjunction of all filters.\",\n",
" )\n",
"\n",
" def pretty_print(self) -> None:\n",
" for field in self.__fields__:\n",
" if getattr(self, field) is not None and getattr(self, field) != getattr(\n",
" self.__fields__[field], \"default\", None\n",
" ):\n",
" print(f\"{field}: {getattr(self, field)}\")"
]
},
{
"cell_type": "code",
"execution_count": 65,
"id": "a1b48f5f-34d3-4abc-a652-936c593e6186",
"metadata": {},
"outputs": [],
"source": [
"structured_llm = llm.with_structured_output(TutorialSearch)\n",
"query_analyzer = prompt | structured_llm"
]
},
{
"cell_type": "markdown",
"id": "8f91b967-1b0e-4fff-9e00-63b1ea32ab2a",
"metadata": {},
"source": [
"Let's try it out:"
]
},
{
"cell_type": "code",
"execution_count": 66,
"id": "4fefa1ac-509d-41e8-bfa3-a0f1481d9bab",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content_search: rag from scratch\n",
"title_search: rag\n",
"filters: []\n"
]
}
],
"source": [
"query_analyzer.invoke({\"question\": \"rag from scratch\"}).pretty_print()"
]
},
{
"cell_type": "code",
"execution_count": 67,
"id": "81171733-c3b6-4356-8081-81a757a5daf7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content_search: chat langchain\n",
"title_search: 2023\n",
"filters: [Filter(field='publish_date', comparison='eq', value=datetime.date(2023, 1, 1))]\n"
]
}
],
"source": [
"query_analyzer.invoke(\n",
" {\"question\": \"videos on chat langchain published in 2023\"}\n",
").pretty_print()"
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "47965e1b-6c87-4dce-9791-0007aa5a6a94",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content_search: multi-modal models in an agent\n",
"title_search: multi-modal models agent\n",
"filters: [Filter(field='length_sec', comparison='lt', value=300), Filter(field='view_count', comparison='gte', value=276)]\n"
]
}
],
"source": [
"query_analyzer.invoke(\n",
" {\n",
" \"question\": \"how to use multi-modal models in an agent, only videos under 5 minutes and with over 276 views\"\n",
" }\n",
").pretty_print()"
]
},
{
"cell_type": "markdown",
"id": "49a1def0-246e-47fd-9f7f-bc5d18bcd802",
"metadata": {},
"source": [
"We can see that the analyzer handles integers well but struggles with date ranges. We can try adjusting our schema description and/or our prompt to correct this:"
]
},
{
"cell_type": "code",
"execution_count": 78,
"id": "cce5857c-8a20-4dc0-a216-5330ee567195",
"metadata": {},
"outputs": [],
"source": [
"class TutorialSearch(BaseModel):\n",
" \"\"\"Search over a database of tutorial videos about a software library.\"\"\"\n",
"\n",
" content_search: str = Field(\n",
" ...,\n",
" description=\"Similarity search query applied to video transcripts.\",\n",
" )\n",
" title_search: str = Field(\n",
" ...,\n",
" description=(\n",
" \"Alternate version of the content search query to apply to video titles. \"\n",
" \"Should be succinct and only include key words that could be in a video \"\n",
" \"title.\"\n",
" ),\n",
" )\n",
" filters: List[Filter] = Field(\n",
" default_factory=list,\n",
" description=(\n",
" \"Filters over specific fields. Final condition is a logical conjunction of all filters. \"\n",
" \"If a time period longer than one day is specified then it must result in filters that define a date range. \"\n",
" f\"Keep in mind the current date is {datetime.date.today().strftime('%m-%d-%Y')}.\"\n",
" ),\n",
" )\n",
"\n",
" def pretty_print(self) -> None:\n",
" for field in self.__fields__:\n",
" if getattr(self, field) is not None and getattr(self, field) != getattr(\n",
" self.__fields__[field], \"default\", None\n",
" ):\n",
" print(f\"{field}: {getattr(self, field)}\")\n",
"\n",
"\n",
"structured_llm = llm.with_structured_output(TutorialSearch)\n",
"query_analyzer = prompt | structured_llm"
]
},
{
"cell_type": "code",
"execution_count": 79,
"id": "d7e287b0-a434-49df-a12f-04369bd12679",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content_search: chat langchain\n",
"title_search: chat langchain\n",
"filters: [Filter(field='publish_date', comparison='gte', value=datetime.date(2023, 1, 1)), Filter(field='publish_date', comparison='lte', value=datetime.date(2023, 12, 31))]\n"
]
}
],
"source": [
"query_analyzer.invoke(\n",
" {\"question\": \"videos on chat langchain published in 2023\"}\n",
").pretty_print()"
]
},
{
"cell_type": "markdown",
"id": "b938f083-4690-4283-9429-070ff3f46c0b",
"metadata": {},
"source": [
"This seems to work!"
]
},
{
"cell_type": "markdown",
"id": "cc228c39-01f0-4475-b9bf-15d33033dbb7",
"metadata": {},
"source": [
"## Sorting: Going beyond search\n",
"\n",
"With certain indexes searching by field isn't the only way to retrieve results — we can also sort documents by a field and retrieve the top sorted results. With structured querying this is easy to accomodate by adding separate query fields that specify how to sort results."
]
},
{
"cell_type": "code",
"execution_count": 84,
"id": "2b7ec524-f625-483f-bafb-f8301fded7ed",
"metadata": {},
"outputs": [],
"source": [
"class TutorialSearch(BaseModel):\n",
" \"\"\"Search over a database of tutorial videos about a software library.\"\"\"\n",
"\n",
" content_search: str = Field(\n",
" \"\",\n",
" description=\"Similarity search query applied to video transcripts.\",\n",
" )\n",
" title_search: str = Field(\n",
" \"\",\n",
" description=(\n",
" \"Alternate version of the content search query to apply to video titles. \"\n",
" \"Should be succinct and only include key words that could be in a video \"\n",
" \"title.\"\n",
" ),\n",
" )\n",
" min_view_count: Optional[int] = Field(\n",
" None, description=\"Minimum view count filter, inclusive.\"\n",
" )\n",
" max_view_count: Optional[int] = Field(\n",
" None, description=\"Maximum view count filter, exclusive.\"\n",
" )\n",
" earliest_publish_date: Optional[datetime.date] = Field(\n",
" None, description=\"Earliest publish date filter, inclusive.\"\n",
" )\n",
" latest_publish_date: Optional[datetime.date] = Field(\n",
" None, description=\"Latest publish date filter, exclusive.\"\n",
" )\n",
" min_length_sec: Optional[int] = Field(\n",
" None, description=\"Minimum video length in seconds, inclusive.\"\n",
" )\n",
" max_length_sec: Optional[int] = Field(\n",
" None, description=\"Maximum video length in seconds, exclusive.\"\n",
" )\n",
" sort_by: Literal[\n",
" \"relevance\",\n",
" \"view_count\",\n",
" \"publish_date\",\n",
" \"length\",\n",
" ] = Field(\"relevance\", description=\"Attribute to sort by.\")\n",
" sort_order: Literal[\"ascending\", \"descending\"] = Field(\n",
" \"descending\", description=\"Whether to sort in ascending or descending order.\"\n",
" )\n",
"\n",
" def pretty_print(self) -> None:\n",
" for field in self.__fields__:\n",
" if getattr(self, field) is not None and getattr(self, field) != getattr(\n",
" self.__fields__[field], \"default\", None\n",
" ):\n",
" print(f\"{field}: {getattr(self, field)}\")\n",
"\n",
"\n",
"structured_llm = llm.with_structured_output(TutorialSearch)\n",
"query_analyzer = prompt | structured_llm"
]
},
{
"cell_type": "code",
"execution_count": 85,
"id": "15a399c3-e4c7-4cae-9f9e-ba22cf6cfcdc",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"title_search: LangChain\n",
"sort_by: publish_date\n"
]
}
],
"source": [
"query_analyzer.invoke(\n",
" {\"question\": \"What has LangChain released lately?\"}\n",
").pretty_print()"
]
},
{
"cell_type": "code",
"execution_count": 86,
"id": "0e613fa2-7be4-45ba-bc1a-8f1f02379d94",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"sort_by: length\n"
]
}
],
"source": [
"query_analyzer.invoke({\"question\": \"What are the longest videos?\"}).pretty_print()"
]
},
{
"cell_type": "markdown",
"id": "a1893ee0-4760-4b39-986f-770be50f0d0e",
"metadata": {},
"source": [
"We can even support searching and sorting together. This might look like first retrieving all results above a relevancy threshold and then sorting them according to the specified attribute:"
]
},
{
"cell_type": "code",
"execution_count": 88,
"id": "8d0285dc-a78f-4be5-b50c-a99f8137a5fa",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content_search: agents\n",
"sort_by: length\n",
"sort_order: ascending\n"
]
}
],
"source": [
"query_analyzer.invoke(\n",
" {\"question\": \"What are the shortest videos about agents?\"}\n",
").pretty_print()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv-2",
"language": "python",
"name": "poetry-venv-2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}


@ -137,15 +137,18 @@ def fix_filter_directive(
    if allowed_operators and filter.operator not in allowed_operators:
        return None
    args = [
        fix_filter_directive(
            arg,
            allowed_comparators=allowed_comparators,
            allowed_operators=allowed_operators,
            allowed_attributes=allowed_attributes,
        cast(
            FilterDirective,
            fix_filter_directive(
                arg,
                allowed_comparators=allowed_comparators,
                allowed_operators=allowed_operators,
                allowed_attributes=allowed_attributes,
            ),
        )
        for arg in filter.arguments
        if arg is not None
    ]
    args = [arg for arg in args if arg is not None]
    if not args:
        return None
    elif len(args) == 1 and filter.operator in (Operator.AND, Operator.OR):

@ -103,6 +103,13 @@ class Comparison(FilterDirective):
    attribute: str
    value: Any

    def __init__(
        self, comparator: Comparator, attribute: str, value: Any, **kwargs: Any
    ) -> None:
        super().__init__(
            comparator=comparator, attribute=attribute, value=value, **kwargs
        )


class Operation(FilterDirective):
    """A logical operation over other directives."""
@ -110,6 +117,11 @@ class Operation(FilterDirective):
    operator: Operator
    arguments: List[FilterDirective]

    def __init__(
        self, operator: Operator, arguments: List[FilterDirective], **kwargs: Any
    ):
        super().__init__(operator=operator, arguments=arguments, **kwargs)


class StructuredQuery(Expr):
    """A structured query."""
@ -120,3 +132,12 @@ class StructuredQuery(Expr):
"""Filtering expression."""
limit: Optional[int]
"""Limit on the number of results."""
def __init__(
self,
query: str,
filter: Optional[FilterDirective],
limit: Optional[int] = None,
**kwargs: Any,
):
super().__init__(query=query, filter=filter, limit=limit, **kwargs)

@ -68,14 +68,14 @@ class SupabaseVectorTranslator(Visitor):
        return self.visit_operation(
            Operation(
                operator=Operator.AND,
                arguments=(
                arguments=[
                    Comparison(
                        comparator=comparison.comparator,
                        attribute=comparison.attribute,
                        value=value,
                    )
                    for value in comparison.value
                ),
                ],
            )
        )
