From 6f30d68581f69345cfac53903873b6849338b190 Mon Sep 17 00:00:00 2001 From: Harrison Chase Date: Sat, 25 Feb 2023 13:27:24 -0800 Subject: [PATCH] add example of using agent with vectorstores (#1285) --- .../agents/examples/agent_vectorstore.ipynb | 409 ++++++++++++++++++ docs/modules/agents/how_to_guides.rst | 2 + 2 files changed, 411 insertions(+) create mode 100644 docs/modules/agents/examples/agent_vectorstore.ipynb diff --git a/docs/modules/agents/examples/agent_vectorstore.ipynb b/docs/modules/agents/examples/agent_vectorstore.ipynb new file mode 100644 index 0000000000..25d7e33d94 --- /dev/null +++ b/docs/modules/agents/examples/agent_vectorstore.ipynb @@ -0,0 +1,409 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "68b24990", + "metadata": {}, + "source": [ + "# Agents and Vectorstores\n", + "\n", + "This notebook covers how to combine agents and vectorstores. The use case for this is that you've ingested your data into a vectorstore and want to interact with it in an agentic manner.\n", + "\n", + "The recommended method for doing so is to create a VectorDBQAChain and then use that as a tool in the overall agent. Let's take a look at doing this below. You can do this with multiple different vectordbs, and use the agent as a way to route between them. There are two different ways of doing this - you can either let the agent use the vectorstores as normal tools, or you can set `return_direct=True` to really just use the agent as a router." 
+ ] + }, + { + "cell_type": "markdown", + "id": "9b22020a", + "metadata": {}, + "source": [ + "## Create the Vectorstore" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "id": "2e87c10a", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain.embeddings.openai import OpenAIEmbeddings\n", + "from langchain.vectorstores import Chroma\n", + "from langchain.text_splitter import CharacterTextSplitter\n", + "from langchain import OpenAI, VectorDBQA\n", + "llm = OpenAI(temperature=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "id": "f2675861", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Running Chroma using direct local API.\n", + "Using DuckDB in-memory for database. Data will be transient.\n" + ] + } + ], + "source": [ + "from langchain.document_loaders import TextLoader\n", + "loader = TextLoader('../../state_of_the_union.txt')\n", + "documents = loader.load()\n", + "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n", + "texts = text_splitter.split_documents(documents)\n", + "\n", + "embeddings = OpenAIEmbeddings()\n", + "docsearch = Chroma.from_documents(texts, embeddings, collection_name=\"state-of-union\")" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "id": "bc5403d4", + "metadata": {}, + "outputs": [], + "source": [ + "state_of_union = VectorDBQA.from_chain_type(llm=llm, chain_type=\"stuff\", vectorstore=docsearch)" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "id": "1431cded", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain.document_loaders import WebBaseLoader" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "id": "915d3ff3", + "metadata": {}, + "outputs": [], + "source": [ + "loader = WebBaseLoader(\"https://beta.ruff.rs/docs/faq/\")" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "id": "96a2edf8", + "metadata": {}, + "outputs": [ + { + "name": 
"stdout", + "output_type": "stream", + "text": [ + "Running Chroma using direct local API.\n", + "Using DuckDB in-memory for database. Data will be transient.\n" + ] + } + ], + "source": [ + "docs = loader.load()\n", + "ruff_texts = text_splitter.split_documents(docs)\n", + "ruff_db = Chroma.from_documents(ruff_texts, embeddings, collection_name=\"ruff\")\n", + "ruff = VectorDBQA.from_chain_type(llm=llm, chain_type=\"stuff\", vectorstore=ruff_db)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "71ecef90", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "c0a6c031", + "metadata": {}, + "source": [ + "## Create the Agent" + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "id": "eb142786", + "metadata": {}, + "outputs": [], + "source": [ + "# Import things that are needed generically\n", + "from langchain.agents import initialize_agent, Tool\n", + "from langchain.tools import BaseTool\n", + "from langchain.llms import OpenAI\n", + "from langchain import LLMMathChain, SerpAPIWrapper" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "id": "850bc4e9", + "metadata": {}, + "outputs": [], + "source": [ + "tools = [\n", + " Tool(\n", + " name = \"State of Union QA System\",\n", + " func=state_of_union.run,\n", + " description=\"useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.\"\n", + " ),\n", + " Tool(\n", + " name = \"Ruff QA System\",\n", + " func=ruff.run,\n", + " description=\"useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question.\"\n", + " ),\n", + "]" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "id": "fc47f230", + "metadata": {}, + "outputs": [], + "source": [ + "# Construct the agent. 
We will use the default agent type here.\n", + "# See documentation for a full list of options.\n", + "agent = initialize_agent(tools, llm, agent=\"zero-shot-react-description\", verbose=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "id": "10ca2db8", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m I need to find out what Biden said about Ketanji Brown Jackson in the State of the Union address.\n", + "Action: State of Union QA System\n", + "Action Input: What did Biden say about Ketanji Brown Jackson in the State of the Union address?\u001b[0m\n", + "Observation: \u001b[36;1m\u001b[1;3m Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.\u001b[0m\n", + "Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n", + "Final Answer: Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + }, + { + "data": { + "text/plain": [ + "\"Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.\"" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "agent.run(\"What did biden say about ketanji brown jackson in the state of the union address?\")" + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "id": "4e91b811", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m I need to find out the advantages of using ruff over flake8\n", + "Action: Ruff QA System\n", + "Action Input: What are the advantages of 
using ruff over flake8?\u001b[0m\n", + "Observation: \u001b[33;1m\u001b[1;3m Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.\u001b[0m\n", + "Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n", + "Final Answer: Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + }, + { + "data": { + "text/plain": [ + "'Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. 
Ruff also supports automatically fixing its own lint violations, which Flake8 does not.'" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "agent.run(\"Why use ruff over flake8?\")" + ] + }, + { + "cell_type": "markdown", + "id": "787a9b5e", + "metadata": {}, + "source": [ + "## Use the Agent solely as a router" + ] + }, + { + "cell_type": "markdown", + "id": "9161ba91", + "metadata": {}, + "source": [ + "You can also set `return_direct=True` if you intend to use the agent as a router and just want to directly return the result of the VectorDBQAChain.\n", + "\n", + "Notice that in the above examples the agent did some extra work after querying the VectorDBQAChain. You can avoid that and just return the result directly." + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "id": "f59b377e", + "metadata": {}, + "outputs": [], + "source": [ + "tools = [\n", + " Tool(\n", + " name = \"State of Union QA System\",\n", + " func=state_of_union.run,\n", + " description=\"useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.\",\n", + " return_direct=True\n", + " ),\n", + " Tool(\n", + " name = \"Ruff QA System\",\n", + " func=ruff.run,\n", + " description=\"useful for when you need to answer questions about ruff (a python linter). 
Input should be a fully formed question.\",\n", + " return_direct=True\n", + " ),\n", + "]" + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "id": "8615707a", + "metadata": {}, + "outputs": [], + "source": [ + "agent = initialize_agent(tools, llm, agent=\"zero-shot-react-description\", verbose=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "id": "36e718a9", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m I need to find out what Biden said about Ketanji Brown Jackson in the State of the Union address.\n", + "Action: State of Union QA System\n", + "Action Input: What did Biden say about Ketanji Brown Jackson in the State of the Union address?\u001b[0m\n", + "Observation: \u001b[36;1m\u001b[1;3m Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + }, + { + "data": { + "text/plain": [ + "\" Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.\"" + ] + }, + "execution_count": 50, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "agent.run(\"What did biden say about ketanji brown jackson in the state of the union address?\")" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "id": "edfd0a1a", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m I need to find out the advantages of using ruff over flake8\n", + "Action: Ruff QA System\n", + "Action Input: What are the advantages of using ruff over flake8?\u001b[0m\n", + "Observation: 
\u001b[33;1m\u001b[1;3m Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + }, + { + "data": { + "text/plain": [ + "' Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.'" + ] + }, + "execution_count": 51, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "agent.run(\"Why use ruff over flake8?\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5442de04", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.1" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/modules/agents/how_to_guides.rst b/docs/modules/agents/how_to_guides.rst index 5bb67e1e77..a8ac0ffcdb 100644 --- a/docs/modules/agents/how_to_guides.rst +++ b/docs/modules/agents/how_to_guides.rst @@ -7,6 +7,8 @@ The 
first category of how-to guides here cover specific parts of working with ag `Custom Tools <./examples/custom_tools.html>`_: How to create custom tools that an agent can use. +`Agents With Vectorstores <./examples/agent_vectorstore.html>`_: How to use vectorstores with agents. + `Intermediate Steps <./examples/intermediate_steps.html>`_: How to access and use intermediate steps to get more visibility into the internals of an agent. `Custom Agent <./examples/custom_agent.html>`_: How to create a custom agent (specifically, a custom LLM + prompt to drive that agent).