langchain/cookbook/custom_agent_with_plugin_retrieval_using_plugnplai.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ba5f8741",
   "metadata": {},
   "source": [
    "# Plug-and-Plai\n",
    "\n",
    "This notebook builds upon the idea of [plugin retrieval](./custom_agent_with_plugin_retrieval.html), but pulls all tools from `plugnplai` - a directory of AI Plugins."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fea4812c",
   "metadata": {},
   "source": [
    "## Set up environment\n",
    "\n",
    "Do necessary imports, etc."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aca08be8",
   "metadata": {},
   "source": [
    "Install plugnplai lib to get a list of active plugins from https://plugplai.com directory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "52e248c9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip available: \u001b[0m\u001b[31;49m22.3.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.1.1\u001b[0m\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
      "Note: you may need to restart the kernel to use updated packages.\n"
     ]
    }
   ],
   "source": [
    "pip install plugnplai -q"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "9af9734e",
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "from typing import Union\n",
    "\n",
    "import plugnplai\n",
    "from langchain.agents import (\n",
    "    AgentExecutor,\n",
    "    AgentOutputParser,\n",
    "    LLMSingleActionAgent,\n",
    ")\n",
    "from langchain.agents.agent_toolkits import NLAToolkit\n",
    "from langchain.chains import LLMChain\n",
    "from langchain.prompts import StringPromptTemplate\n",
    "from langchain.schema import AgentAction, AgentFinish\n",
    "from langchain_community.llms import OpenAI\n",
    "from langchain_community.tools.plugin import AIPlugin"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2f91d8b4",
   "metadata": {},
   "source": [
    "## Setup LLM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "a1a3b59c",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = OpenAI(temperature=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6df0253f",
   "metadata": {},
   "source": [
    "## Set up plugins\n",
    "\n",
    "Load and index plugins"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "9e0f7882",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get all plugins from plugnplai.com\n",
    "urls = plugnplai.get_plugins()\n",
    "\n",
    "#  Get ChatGPT plugins - only ChatGPT verified plugins\n",
    "urls = plugnplai.get_plugins(filter=\"ChatGPT\")\n",
    "\n",
    "#  Get working plugins - only tested plugins (in progress)\n",
    "urls = plugnplai.get_plugins(filter=\"working\")\n",
    "\n",
    "\n",
    "AI_PLUGINS = [AIPlugin.from_url(url + \"/.well-known/ai-plugin.json\") for url in urls]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17362717",
   "metadata": {},
   "source": [
    "## Tool Retriever\n",
    "\n",
    "We will use a vectorstore to create embeddings for each tool description. Then, for an incoming query we can create embeddings for that query and do a similarity search for relevant tools."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "77c4be4b",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.schema import Document\n",
    "from langchain_community.embeddings import OpenAIEmbeddings\n",
    "from langchain_community.vectorstores import FAISS"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "9092a158",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Attempting to load an OpenAPI 3.0.1 spec.  This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n",
      "Attempting to load an OpenAPI 3.0.1 spec.  This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n",
      "Attempting to load an OpenAPI 3.0.1 spec.  This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n",
      "Attempting to load an OpenAPI 3.0.2 spec.  This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n",
      "Attempting to load an OpenAPI 3.0.1 spec.  This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n",
      "Attempting to load an OpenAPI 3.0.1 spec.  This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n",
      "Attempting to load an OpenAPI 3.0.1 spec.  This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n",
      "Attempting to load an OpenAPI 3.0.1 spec.  This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n",
      "Attempting to load a Swagger 2.0 spec.  This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n"
     ]
    }
   ],
   "source": [
    "embeddings = OpenAIEmbeddings()\n",
    "docs = [\n",
    "    Document(\n",
    "        page_content=plugin.description_for_model,\n",
    "        metadata={\"plugin_name\": plugin.name_for_model},\n",
    "    )\n",
    "    for plugin in AI_PLUGINS\n",
    "]\n",
    "vector_store = FAISS.from_documents(docs, embeddings)\n",
    "toolkits_dict = {\n",
    "    plugin.name_for_model: NLAToolkit.from_llm_and_ai_plugin(llm, plugin)\n",
    "    for plugin in AI_PLUGINS\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "735a7566",
   "metadata": {},
   "outputs": [],
   "source": [
    "retriever = vector_store.as_retriever()\n",
    "\n",
    "\n",
    "def get_tools(query):\n",
    "    # Get documents, which contain the Plugins to use\n",
    "    docs = retriever.get_relevant_documents(query)\n",
    "    # Get the toolkits, one for each plugin\n",
    "    tool_kits = [toolkits_dict[d.metadata[\"plugin_name\"]] for d in docs]\n",
    "    # Get the tools: a separate NLAChain for each endpoint\n",
    "    tools = []\n",
    "    for tk in tool_kits:\n",
    "        tools.extend(tk.nla_tools)\n",
    "    return tools"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7699afd7",
   "metadata": {},
   "source": [
    "We can now test this retriever to see if it seems to work."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "425f2886",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['Milo.askMilo',\n",
       " 'Zapier_Natural_Language_Actions_(NLA)_API_(Dynamic)_-_Beta.search_all_actions',\n",
       " 'Zapier_Natural_Language_Actions_(NLA)_API_(Dynamic)_-_Beta.preview_a_zap',\n",
       " 'Zapier_Natural_Language_Actions_(NLA)_API_(Dynamic)_-_Beta.get_configuration_link',\n",
       " 'Zapier_Natural_Language_Actions_(NLA)_API_(Dynamic)_-_Beta.list_exposed_actions',\n",
       " 'SchoolDigger_API_V2.0.Autocomplete_GetSchools',\n",
       " 'SchoolDigger_API_V2.0.Districts_GetAllDistricts2',\n",
       " 'SchoolDigger_API_V2.0.Districts_GetDistrict2',\n",
       " 'SchoolDigger_API_V2.0.Rankings_GetSchoolRank2',\n",
       " 'SchoolDigger_API_V2.0.Rankings_GetRank_District',\n",
       " 'SchoolDigger_API_V2.0.Schools_GetAllSchools20',\n",
       " 'SchoolDigger_API_V2.0.Schools_GetSchool20',\n",
       " 'Speak.translate',\n",
       " 'Speak.explainPhrase',\n",
       " 'Speak.explainTask']"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tools = get_tools(\"What could I do today with my kiddo\")\n",
    "[t.name for t in tools]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "3aa88768",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['Open_AI_Klarna_product_Api.productsUsingGET',\n",
       " 'Milo.askMilo',\n",
       " 'Zapier_Natural_Language_Actions_(NLA)_API_(Dynamic)_-_Beta.search_all_actions',\n",
       " 'Zapier_Natural_Language_Actions_(NLA)_API_(Dynamic)_-_Beta.preview_a_zap',\n",
       " 'Zapier_Natural_Language_Actions_(NLA)_API_(Dynamic)_-_Beta.get_configuration_link',\n",
       " 'Zapier_Natural_Language_Actions_(NLA)_API_(Dynamic)_-_Beta.list_exposed_actions',\n",
       " 'SchoolDigger_API_V2.0.Autocomplete_GetSchools',\n",
       " 'SchoolDigger_API_V2.0.Districts_GetAllDistricts2',\n",
       " 'SchoolDigger_API_V2.0.Districts_GetDistrict2',\n",
       " 'SchoolDigger_API_V2.0.Rankings_GetSchoolRank2',\n",
       " 'SchoolDigger_API_V2.0.Rankings_GetRank_District',\n",
       " 'SchoolDigger_API_V2.0.Schools_GetAllSchools20',\n",
       " 'SchoolDigger_API_V2.0.Schools_GetSchool20']"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tools = get_tools(\"what shirts can i buy?\")\n",
    "[t.name for t in tools]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2e7a075c",
   "metadata": {},
   "source": [
    "## Prompt Template\n",
    "\n",
    "The prompt template is pretty standard, because we're not actually changing that much logic in the actual prompt template, but rather we are just changing how retrieval is done."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "339b1bb8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set up the base template\n",
    "template = \"\"\"Answer the following questions as best you can, but speaking as a pirate might speak. You have access to the following tools:\n",
    "\n",
    "{tools}\n",
    "\n",
    "Use the following format:\n",
    "\n",
    "Question: the input question you must answer\n",
    "Thought: you should always think about what to do\n",
    "Action: the action to take, should be one of [{tool_names}]\n",
    "Action Input: the input to the action\n",
    "Observation: the result of the action\n",
    "... (this Thought/Action/Action Input/Observation can repeat N times)\n",
    "Thought: I now know the final answer\n",
    "Final Answer: the final answer to the original input question\n",
    "\n",
    "Begin! Remember to speak as a pirate when giving your final answer. Use lots of \"Arg\"s\n",
    "\n",
    "Question: {input}\n",
    "{agent_scratchpad}\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1583acdc",
   "metadata": {},
   "source": [
    "The custom prompt template now has the concept of a tools_getter, which we call on the input to select the tools to use"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "fd969d31",
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Callable\n",
    "\n",
    "\n",
    "# Set up a prompt template\n",
    "class CustomPromptTemplate(StringPromptTemplate):\n",
    "    # The template to use\n",
    "    template: str\n",
    "    ############## NEW ######################\n",
    "    # The list of tools available\n",
    "    tools_getter: Callable\n",
    "\n",
    "    def format(self, **kwargs) -> str:\n",
    "        # Get the intermediate steps (AgentAction, Observation tuples)\n",
    "        # Format them in a particular way\n",
    "        intermediate_steps = kwargs.pop(\"intermediate_steps\")\n",
    "        thoughts = \"\"\n",
    "        for action, observation in intermediate_steps:\n",
    "            thoughts += action.log\n",
    "            thoughts += f\"\\nObservation: {observation}\\nThought: \"\n",
    "        # Set the agent_scratchpad variable to that value\n",
    "        kwargs[\"agent_scratchpad\"] = thoughts\n",
    "        ############## NEW ######################\n",
    "        tools = self.tools_getter(kwargs[\"input\"])\n",
    "        # Create a tools variable from the list of tools provided\n",
    "        kwargs[\"tools\"] = \"\\n\".join(\n",
    "            [f\"{tool.name}: {tool.description}\" for tool in tools]\n",
    "        )\n",
    "        # Create a list of tool names for the tools provided\n",
    "        kwargs[\"tool_names\"] = \", \".join([tool.name for tool in tools])\n",
    "        return self.template.format(**kwargs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "798ef9fb",
   "metadata": {},
   "outputs": [],
   "source": [
    "prompt = CustomPromptTemplate(\n",
    "    template=template,\n",
    "    tools_getter=get_tools,\n",
    "    # This omits the `agent_scratchpad`, `tools`, and `tool_names` variables because those are generated dynamically\n",
    "    # This includes the `intermediate_steps` variable because that is needed\n",
    "    input_variables=[\"input\", \"intermediate_steps\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ef3a1af3",
   "metadata": {},
   "source": [
    "## Output Parser\n",
    "\n",
    "The output parser is unchanged from the previous notebook, since we are not changing anything about the output format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "7c6fe0d3",
   "metadata": {},
   "outputs": [],
   "source": [
    "class CustomOutputParser(AgentOutputParser):\n",
    "    def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:\n",
    "        # Check if agent should finish\n",
    "        if \"Final Answer:\" in llm_output:\n",
    "            return AgentFinish(\n",
    "                # Return values is generally always a dictionary with a single `output` key\n",
    "                # It is not recommended to try anything else at the moment :)\n",
    "                return_values={\"output\": llm_output.split(\"Final Answer:\")[-1].strip()},\n",
    "                log=llm_output,\n",
    "            )\n",
    "        # Parse out the action and action input\n",
    "        regex = r\"Action\\s*\\d*\\s*:(.*?)\\nAction\\s*\\d*\\s*Input\\s*\\d*\\s*:[\\s]*(.*)\"\n",
    "        match = re.search(regex, llm_output, re.DOTALL)\n",
    "        if not match:\n",
    "            raise ValueError(f\"Could not parse LLM output: `{llm_output}`\")\n",
    "        action = match.group(1).strip()\n",
    "        action_input = match.group(2)\n",
    "        # Return the action and action input\n",
    "        return AgentAction(\n",
    "            tool=action, tool_input=action_input.strip(\" \").strip('\"'), log=llm_output\n",
    "        )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "d278706a",
   "metadata": {},
   "outputs": [],
   "source": [
    "output_parser = CustomOutputParser()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "170587b1",
   "metadata": {},
   "source": [
    "## Set up LLM, stop sequence, and the agent\n",
    "\n",
    "Also the same as the previous notebook"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "f9d4c374",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = OpenAI(temperature=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "9b1cc2a2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# LLM chain consisting of the LLM and a prompt\n",
    "llm_chain = LLMChain(llm=llm, prompt=prompt)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "e4f5092f",
   "metadata": {},
   "outputs": [],
   "source": [
    "tool_names = [tool.name for tool in tools]\n",
    "agent = LLMSingleActionAgent(\n",
    "    llm_chain=llm_chain,\n",
    "    output_parser=output_parser,\n",
    "    stop=[\"\\nObservation:\"],\n",
    "    allowed_tools=tool_names,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa8a5326",
   "metadata": {},
   "source": [
    "## Use the Agent\n",
    "\n",
    "Now we can use it!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "490604e9",
   "metadata": {},
   "outputs": [],
   "source": [
    "agent_executor = AgentExecutor.from_agent_and_tools(\n",
    "    agent=agent, tools=tools, verbose=True\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "653b1617",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
      "\u001b[32;1m\u001b[1;3mThought: I need to find a product API\n",
      "Action: Open_AI_Klarna_product_Api.productsUsingGET\n",
      "Action Input: shirts\u001b[0m\n",
      "\n",
      "Observation:\u001b[36;1m\u001b[1;3mI found 10 shirts from the API response. They range in price from $9.99 to $450.00 and come in a variety of materials, colors, and patterns.\u001b[0m\u001b[32;1m\u001b[1;3m I now know what shirts I can buy\n",
      "Final Answer: Arg, I found 10 shirts from the API response. They range in price from $9.99 to $450.00 and come in a variety of materials, colors, and patterns.\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'Arg, I found 10 shirts from the API response. They range in price from $9.99 to $450.00 and come in a variety of materials, colors, and patterns.'"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "agent_executor.run(\"what shirts can i buy?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2481ee76",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.3"
  },
  "vscode": {
   "interpreter": {
    "hash": "3ccef4e08d87aa1eeb90f63e0f071292ccb2e9c42e70f74ab2bf6f5493ca7bbc"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}