mirror of
https://github.com/hwchase17/langchain
synced 2024-10-29 17:07:25 +00:00
759 lines
23 KiB
Plaintext
759 lines
23 KiB
Plaintext
|
{
|
||
|
"cells": [
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "5e3cb542-933d-4bf3-a82b-d9d6395a7832",
|
||
|
"metadata": {
|
||
|
"tags": []
|
||
|
},
|
||
|
"source": [
|
||
|
"# Wikibase Agent\n",
|
||
|
"\n",
|
||
|
"This notebook demonstrates a very simple wikibase agent that uses sparql generation. Although this code is intended to work against any\n",
|
||
|
"wikibase instance, we use http://wikidata.org for testing.\n",
|
||
|
"\n",
|
||
|
"If you are interested in wikibases and sparql, please consider helping to improve this agent. Look [here](https://github.com/donaldziff/langchain-wikibase) for more details and open questions.\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "07d42966-7e99-4157-90dc-6704977dcf1b",
|
||
|
"metadata": {
|
||
|
"tags": []
|
||
|
},
|
||
|
"source": [
|
||
|
"## Preliminaries"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "9132f093-c61e-4b8d-abef-91ebef3fc85f",
|
||
|
"metadata": {
|
||
|
"tags": []
|
||
|
},
|
||
|
"source": [
|
||
|
"### API keys and other secrats\n",
|
||
|
"\n",
|
||
|
"We use an `.ini` file, like this: \n",
|
||
|
"```\n",
|
||
|
"[OPENAI]\n",
|
||
|
"OPENAI_API_KEY=xyzzy\n",
|
||
|
"[WIKIDATA]\n",
|
||
|
"WIKIDATA_USER_AGENT_HEADER=argle-bargle\n",
|
||
|
"```"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 1,
|
||
|
"id": "99567dfd-05a7-412f-abf0-9b9f4424acbd",
|
||
|
"metadata": {
|
||
|
"tags": []
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"['./secrets.ini']"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 1,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"import configparser\n",
|
||
|
"config = configparser.ConfigParser()\n",
|
||
|
"config.read('./secrets.ini')"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "332b6658-c978-41ca-a2be-4f8677fecaef",
|
||
|
"metadata": {
|
||
|
"tags": []
|
||
|
},
|
||
|
"source": [
|
||
|
"### OpenAI API Key\n",
|
||
|
"\n",
|
||
|
"An OpenAI API key is required unless you modify the code below to use another LLM provider."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 2,
|
||
|
"id": "dd328ee2-33cc-4e1e-aff7-cc0a2e05e2e6",
|
||
|
"metadata": {
|
||
|
"tags": []
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"openai_api_key = config['OPENAI']['OPENAI_API_KEY']\n",
|
||
|
"import os\n",
|
||
|
"os.environ.update({'OPENAI_API_KEY': openai_api_key})"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "42a9311b-600d-42bc-b000-2692ef87a213",
|
||
|
"metadata": {
|
||
|
"tags": []
|
||
|
},
|
||
|
"source": [
|
||
|
"### Wikidata user-agent header\n",
|
||
|
"\n",
|
||
|
"Wikidata policy requires a user-agent header. See https://meta.wikimedia.org/wiki/User-Agent_policy. However, at present this policy is not strictly enforced."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 3,
|
||
|
"id": "17ba657e-789d-40e1-b4b7-4f29ba06fe79",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"wikidata_user_agent_header = None if not config.has_section('WIKIDATA') else config['WIKIDATA']['WIKIDAtA_USER_AGENT_HEADER']"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "db08d308-050a-4fc8-93c9-8de4ae977ac3",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"### Enable tracing if desired"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 4,
|
||
|
"id": "77d2da08-fccd-4676-b77e-c0e89bf343cb",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"#import os\n",
|
||
|
"#os.environ[\"LANGCHAIN_HANDLER\"] = \"langchain\"\n",
|
||
|
"#os.environ[\"LANGCHAIN_SESSION\"] = \"default\" # Make sure this session actually exists. "
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "3dbc5bfc-48ce-4f90-873c-7336b21300c6",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"# Tools\n",
|
||
|
"\n",
|
||
|
"Three tools are provided for this simple agent:\n",
|
||
|
"* `ItemLookup`: for finding the q-number of an item\n",
|
||
|
"* `PropertyLookup`: for finding the p-number of a property\n",
|
||
|
"* `SparqlQueryRunner`: for running a sparql query"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "1f801b4e-6576-4914-aa4f-6f4c4e3c7924",
|
||
|
"metadata": {
|
||
|
"tags": []
|
||
|
},
|
||
|
"source": [
|
||
|
"## Item and Property lookup\n",
|
||
|
"\n",
|
||
|
"Item and Property lookup are implemented in a single method, using an elastic search endpoint. Not all wikibase instances have it, but wikidata does, and that's where we'll start."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 5,
|
||
|
"id": "42d23f0a-1c74-4c9c-85f2-d0e24204e96a",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"def get_nested_value(o: dict, path: list) -> any:\n",
|
||
|
" current = o\n",
|
||
|
" for key in path:\n",
|
||
|
" try:\n",
|
||
|
" current = current[key]\n",
|
||
|
" except:\n",
|
||
|
" return None\n",
|
||
|
" return current\n",
|
||
|
"\n",
|
||
|
"import requests\n",
|
||
|
"\n",
|
||
|
"from typing import Optional\n",
|
||
|
"\n",
|
||
|
"def vocab_lookup(search: str, entity_type: str = \"item\",\n",
|
||
|
" url: str = \"https://www.wikidata.org/w/api.php\",\n",
|
||
|
" user_agent_header: str = wikidata_user_agent_header,\n",
|
||
|
" srqiprofile: str = None,\n",
|
||
|
" ) -> Optional[str]: \n",
|
||
|
" headers = {\n",
|
||
|
" 'Accept': 'application/json'\n",
|
||
|
" }\n",
|
||
|
" if wikidata_user_agent_header is not None:\n",
|
||
|
" headers['User-Agent'] = wikidata_user_agent_header\n",
|
||
|
" \n",
|
||
|
" if entity_type == \"item\":\n",
|
||
|
" srnamespace = 0\n",
|
||
|
" srqiprofile = \"classic_noboostlinks\" if srqiprofile is None else srqiprofile\n",
|
||
|
" elif entity_type == \"property\":\n",
|
||
|
" srnamespace = 120\n",
|
||
|
" srqiprofile = \"classic\" if srqiprofile is None else srqiprofile\n",
|
||
|
" else:\n",
|
||
|
" raise ValueError(\"entity_type must be either 'property' or 'item'\") \n",
|
||
|
" \n",
|
||
|
" params = {\n",
|
||
|
" \"action\": \"query\",\n",
|
||
|
" \"list\": \"search\",\n",
|
||
|
" \"srsearch\": search,\n",
|
||
|
" \"srnamespace\": srnamespace,\n",
|
||
|
" \"srlimit\": 1,\n",
|
||
|
" \"srqiprofile\": srqiprofile,\n",
|
||
|
" \"srwhat\": 'text',\n",
|
||
|
" \"format\": \"json\"\n",
|
||
|
" }\n",
|
||
|
" \n",
|
||
|
" response = requests.get(url, headers=headers, params=params)\n",
|
||
|
" \n",
|
||
|
" if response.status_code == 200:\n",
|
||
|
" title = get_nested_value(response.json(), ['query', 'search', 0, 'title'])\n",
|
||
|
" if title is None:\n",
|
||
|
" return f\"I couldn't find any {entity_type} for '{search}'. Please rephrase your request and try again\"\n",
|
||
|
" # if there is a prefix, strip it off\n",
|
||
|
" return title.split(':')[-1]\n",
|
||
|
" else:\n",
|
||
|
" return \"Sorry, I got an error. Please try again.\""
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 6,
|
||
|
"id": "e52060fa-3614-43fb-894e-54e9b75d1e9f",
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"Q4180017\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"print(vocab_lookup(\"Malin 1\"))"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 7,
|
||
|
"id": "b23ab322-b2cf-404e-b36f-2bfc1d79b0d3",
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"P31\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"print(vocab_lookup(\"instance of\", entity_type=\"property\"))"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 8,
|
||
|
"id": "89020cc8-104e-42d0-ac32-885e590de515",
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"I couldn't find any item for 'Ceci n'est pas un q-item'. Please rephrase your request and try again\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"print(vocab_lookup(\"Ceci n'est pas un q-item\"))"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "78d66d8b-0e34-4d3f-a18d-c7284840ac76",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"## Sparql runner "
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "c6f60069-fbe0-4015-87fb-0e487cd914e7",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"This tool runs sparql - by default, wikidata is used."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 9,
|
||
|
"id": "b5b97a4d-2a39-4993-88d9-e7818c0a2853",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"import requests\n",
|
||
|
"from typing import List, Dict, Any\n",
|
||
|
"import json\n",
|
||
|
"\n",
|
||
|
"def run_sparql(query: str, url='https://query.wikidata.org/sparql',\n",
|
||
|
" user_agent_header: str = wikidata_user_agent_header) -> List[Dict[str, Any]]:\n",
|
||
|
" headers = {\n",
|
||
|
" 'Accept': 'application/json'\n",
|
||
|
" }\n",
|
||
|
" if wikidata_user_agent_header is not None:\n",
|
||
|
" headers['User-Agent'] = wikidata_user_agent_header\n",
|
||
|
"\n",
|
||
|
" response = requests.get(url, headers=headers, params={'query': query, 'format': 'json'})\n",
|
||
|
"\n",
|
||
|
" if response.status_code != 200:\n",
|
||
|
" return \"That query failed. Perhaps you could try a different one?\"\n",
|
||
|
" results = get_nested_value(response.json(),['results', 'bindings'])\n",
|
||
|
" return json.dumps(results)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 10,
|
||
|
"id": "149722ec-8bc1-4d4f-892b-e4ddbe8444c1",
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"'[{\"count\": {\"datatype\": \"http://www.w3.org/2001/XMLSchema#integer\", \"type\": \"literal\", \"value\": \"20\"}}]'"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 10,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"run_sparql(\"SELECT (COUNT(?children) as ?count) WHERE { wd:Q1339 wdt:P40 ?children . }\")"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "9f0302fd-ba35-4acc-ba32-1d7c9295c898",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"# Agent"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "3122a961-9673-4a52-b1cd-7d62fbdf8d96",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"## Wrap the tools"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 11,
|
||
|
"id": "cc41ae88-2e53-4363-9878-28b26430cb1e",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"from langchain.agents import Tool, AgentExecutor, LLMSingleActionAgent, AgentOutputParser\n",
|
||
|
"from langchain.prompts import StringPromptTemplate\n",
|
||
|
"from langchain import OpenAI, LLMChain\n",
|
||
|
"from typing import List, Union\n",
|
||
|
"from langchain.schema import AgentAction, AgentFinish\n",
|
||
|
"import re"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 12,
|
||
|
"id": "2810a3ce-b9c6-47ee-8068-12ca967cd0ea",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"# Define which tools the agent can use to answer user queries\n",
|
||
|
"tools = [\n",
|
||
|
" Tool(\n",
|
||
|
" name = \"ItemLookup\",\n",
|
||
|
" func=(lambda x: vocab_lookup(x, entity_type=\"item\")),\n",
|
||
|
" description=\"useful for when you need to know the q-number for an item\"\n",
|
||
|
" ),\n",
|
||
|
" Tool(\n",
|
||
|
" name = \"PropertyLookup\",\n",
|
||
|
" func=(lambda x: vocab_lookup(x, entity_type=\"property\")),\n",
|
||
|
" description=\"useful for when you need to know the p-number for a property\"\n",
|
||
|
" ),\n",
|
||
|
" Tool(\n",
|
||
|
" name = \"SparqlQueryRunner\",\n",
|
||
|
" func=run_sparql,\n",
|
||
|
" description=\"useful for getting results from a wikibase\"\n",
|
||
|
" ) \n",
|
||
|
"]"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "ab0f2778-a195-4a4a-a5b4-c1e809e1fb7b",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"## Prompts"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 13,
|
||
|
"id": "7bd4ba4f-57d6-4ceb-b932-3cb0d0509a24",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"# Set up the base template\n",
|
||
|
"template = \"\"\"\n",
|
||
|
"Answer the following questions by running a sparql query against a wikibase where the p and q items are \n",
|
||
|
"completely unknown to you. You will need to discover the p and q items before you can generate the sparql.\n",
|
||
|
"Do not assume you know the p and q items for any concepts. Always use tools to find all p and q items.\n",
|
||
|
"After you generate the sparql, you should run it. The results will be returned in json. \n",
|
||
|
"Summarize the json results in natural language.\n",
|
||
|
"\n",
|
||
|
"You may assume the following prefixes:\n",
|
||
|
"PREFIX wd: <http://www.wikidata.org/entity/>\n",
|
||
|
"PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n",
|
||
|
"PREFIX p: <http://www.wikidata.org/prop/>\n",
|
||
|
"PREFIX ps: <http://www.wikidata.org/prop/statement/>\n",
|
||
|
"\n",
|
||
|
"When generating sparql:\n",
|
||
|
"* Try to avoid \"count\" and \"filter\" queries if possible\n",
|
||
|
"* Never enclose the sparql in back-quotes\n",
|
||
|
"\n",
|
||
|
"You have access to the following tools:\n",
|
||
|
"\n",
|
||
|
"{tools}\n",
|
||
|
"\n",
|
||
|
"Use the following format:\n",
|
||
|
"\n",
|
||
|
"Question: the input question for which you must provide a natural language answer\n",
|
||
|
"Thought: you should always think about what to do\n",
|
||
|
"Action: the action to take, should be one of [{tool_names}]\n",
|
||
|
"Action Input: the input to the action\n",
|
||
|
"Observation: the result of the action\n",
|
||
|
"... (this Thought/Action/Action Input/Observation can repeat N times)\n",
|
||
|
"Thought: I now know the final answer\n",
|
||
|
"Final Answer: the final answer to the original input question\n",
|
||
|
"\n",
|
||
|
"Question: {input}\n",
|
||
|
"{agent_scratchpad}\"\"\"\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 14,
|
||
|
"id": "7e8d771a-64bb-4ec8-b472-6a9a40c6dd38",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"# Set up a prompt template\n",
|
||
|
"class CustomPromptTemplate(StringPromptTemplate):\n",
|
||
|
" # The template to use\n",
|
||
|
" template: str\n",
|
||
|
" # The list of tools available\n",
|
||
|
" tools: List[Tool]\n",
|
||
|
" \n",
|
||
|
" def format(self, **kwargs) -> str:\n",
|
||
|
" # Get the intermediate steps (AgentAction, Observation tuples)\n",
|
||
|
" # Format them in a particular way\n",
|
||
|
" intermediate_steps = kwargs.pop(\"intermediate_steps\")\n",
|
||
|
" thoughts = \"\"\n",
|
||
|
" for action, observation in intermediate_steps:\n",
|
||
|
" thoughts += action.log\n",
|
||
|
" thoughts += f\"\\nObservation: {observation}\\nThought: \"\n",
|
||
|
" # Set the agent_scratchpad variable to that value\n",
|
||
|
" kwargs[\"agent_scratchpad\"] = thoughts\n",
|
||
|
" # Create a tools variable from the list of tools provided\n",
|
||
|
" kwargs[\"tools\"] = \"\\n\".join([f\"{tool.name}: {tool.description}\" for tool in self.tools])\n",
|
||
|
" # Create a list of tool names for the tools provided\n",
|
||
|
" kwargs[\"tool_names\"] = \", \".join([tool.name for tool in self.tools])\n",
|
||
|
" return self.template.format(**kwargs)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 15,
|
||
|
"id": "f97dca78-fdde-4a70-9137-e34a21d14e64",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"prompt = CustomPromptTemplate(\n",
|
||
|
" template=template,\n",
|
||
|
" tools=tools,\n",
|
||
|
" # This omits the `agent_scratchpad`, `tools`, and `tool_names` variables because those are generated dynamically\n",
|
||
|
" # This includes the `intermediate_steps` variable because that is needed\n",
|
||
|
" input_variables=[\"input\", \"intermediate_steps\"]\n",
|
||
|
")"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "12c57d77-3c1e-4cde-9a83-7d2134392479",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"## Output parser \n",
|
||
|
"This is unchanged from langchain docs"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 16,
|
||
|
"id": "42da05eb-c103-4649-9d20-7143a8880721",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"class CustomOutputParser(AgentOutputParser):\n",
|
||
|
" \n",
|
||
|
" def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:\n",
|
||
|
" # Check if agent should finish\n",
|
||
|
" if \"Final Answer:\" in llm_output:\n",
|
||
|
" return AgentFinish(\n",
|
||
|
" # Return values is generally always a dictionary with a single `output` key\n",
|
||
|
" # It is not recommended to try anything else at the moment :)\n",
|
||
|
" return_values={\"output\": llm_output.split(\"Final Answer:\")[-1].strip()},\n",
|
||
|
" log=llm_output,\n",
|
||
|
" )\n",
|
||
|
" # Parse out the action and action input\n",
|
||
|
" regex = r\"Action: (.*?)[\\n]*Action Input:[\\s]*(.*)\"\n",
|
||
|
" match = re.search(regex, llm_output, re.DOTALL)\n",
|
||
|
" if not match:\n",
|
||
|
" raise ValueError(f\"Could not parse LLM output: `{llm_output}`\")\n",
|
||
|
" action = match.group(1).strip()\n",
|
||
|
" action_input = match.group(2)\n",
|
||
|
" # Return the action and action input\n",
|
||
|
" return AgentAction(tool=action, tool_input=action_input.strip(\" \").strip('\"'), log=llm_output)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 17,
|
||
|
"id": "d2b4d710-8cc9-4040-9269-59cf6c5c22be",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"output_parser = CustomOutputParser()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "48a758cb-93a7-4555-b69a-896d2d43c6f0",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"## Specify the LLM model"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 18,
|
||
|
"id": "72988c79-8f60-4b0f-85ee-6af32e8de9c2",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"from langchain.chat_models import ChatOpenAI\n",
|
||
|
"llm = ChatOpenAI(model=\"gpt-4\", temperature=0)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "95685d14-647a-4e24-ae2c-a8dd1e364921",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"## Agent and agent executor"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 19,
|
||
|
"id": "13d55765-bfa1-43b3-b7cb-00f52ebe7747",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"# LLM chain consisting of the LLM and a prompt\n",
|
||
|
"llm_chain = LLMChain(llm=llm, prompt=prompt)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 20,
|
||
|
"id": "b3f7ac3c-398e-49f9-baed-554f49a191c3",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"tool_names = [tool.name for tool in tools]\n",
|
||
|
"agent = LLMSingleActionAgent(\n",
|
||
|
" llm_chain=llm_chain, \n",
|
||
|
" output_parser=output_parser,\n",
|
||
|
" stop=[\"\\nObservation:\"], \n",
|
||
|
" allowed_tools=tool_names\n",
|
||
|
")"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 21,
|
||
|
"id": "65740577-272e-4853-8d47-b87784cfaba0",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"id": "66e3d13b-77cf-41d3-b541-b54535c14459",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"## Run it!"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 22,
|
||
|
"id": "6e97a07c-d7bf-4a35-9ab2-b59ae865c62c",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"# If you prefer in-line tracing, uncomment this line\n",
|
||
|
"# agent_executor.agent.llm_chain.verbose = True"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 23,
|
||
|
"id": "a11ca60d-f57b-4fe8-943e-a258e37463c7",
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"\n",
|
||
|
"\n",
|
||
|
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
|
||
|
"\u001b[32;1m\u001b[1;3mThought: I need to find the Q number for J.S. Bach.\n",
|
||
|
"Action: ItemLookup\n",
|
||
|
"Action Input: J.S. Bach\u001b[0m\n",
|
||
|
"\n",
|
||
|
"Observation:\u001b[36;1m\u001b[1;3mQ1339\u001b[0m\u001b[32;1m\u001b[1;3mI need to find the P number for children.\n",
|
||
|
"Action: PropertyLookup\n",
|
||
|
"Action Input: children\u001b[0m\n",
|
||
|
"\n",
|
||
|
"Observation:\u001b[33;1m\u001b[1;3mP1971\u001b[0m\u001b[32;1m\u001b[1;3mNow I can query the number of children J.S. Bach had.\n",
|
||
|
"Action: SparqlQueryRunner\n",
|
||
|
"Action Input: SELECT ?children WHERE { wd:Q1339 wdt:P1971 ?children }\u001b[0m\n",
|
||
|
"\n",
|
||
|
"Observation:\u001b[38;5;200m\u001b[1;3m[{\"children\": {\"datatype\": \"http://www.w3.org/2001/XMLSchema#decimal\", \"type\": \"literal\", \"value\": \"20\"}}]\u001b[0m\u001b[32;1m\u001b[1;3mI now know the final answer.\n",
|
||
|
"Final Answer: J.S. Bach had 20 children.\u001b[0m\n",
|
||
|
"\n",
|
||
|
"\u001b[1m> Finished chain.\u001b[0m\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"'J.S. Bach had 20 children.'"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 23,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"agent_executor.run(\"How many children did J.S. Bach have?\")"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 24,
|
||
|
"id": "d0b42a41-996b-4156-82e4-f0651a87ee34",
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"\n",
|
||
|
"\n",
|
||
|
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
|
||
|
"\u001b[32;1m\u001b[1;3mThought: To find Hakeem Olajuwon's Basketball-Reference.com NBA player ID, I need to first find his Wikidata item (Q-number) and then query for the relevant property (P-number).\n",
|
||
|
"Action: ItemLookup\n",
|
||
|
"Action Input: Hakeem Olajuwon\u001b[0m\n",
|
||
|
"\n",
|
||
|
"Observation:\u001b[36;1m\u001b[1;3mQ273256\u001b[0m\u001b[32;1m\u001b[1;3mNow that I have Hakeem Olajuwon's Wikidata item (Q273256), I need to find the P-number for the Basketball-Reference.com NBA player ID property.\n",
|
||
|
"Action: PropertyLookup\n",
|
||
|
"Action Input: Basketball-Reference.com NBA player ID\u001b[0m\n",
|
||
|
"\n",
|
||
|
"Observation:\u001b[33;1m\u001b[1;3mP2685\u001b[0m\u001b[32;1m\u001b[1;3mNow that I have both the Q-number for Hakeem Olajuwon (Q273256) and the P-number for the Basketball-Reference.com NBA player ID property (P2685), I can run a SPARQL query to get the ID value.\n",
|
||
|
"Action: SparqlQueryRunner\n",
|
||
|
"Action Input: \n",
|
||
|
"SELECT ?playerID WHERE {\n",
|
||
|
" wd:Q273256 wdt:P2685 ?playerID .\n",
|
||
|
"}\u001b[0m\n",
|
||
|
"\n",
|
||
|
"Observation:\u001b[38;5;200m\u001b[1;3m[{\"playerID\": {\"type\": \"literal\", \"value\": \"o/olajuha01\"}}]\u001b[0m\u001b[32;1m\u001b[1;3mI now know the final answer\n",
|
||
|
"Final Answer: Hakeem Olajuwon's Basketball-Reference.com NBA player ID is \"o/olajuha01\".\u001b[0m\n",
|
||
|
"\n",
|
||
|
"\u001b[1m> Finished chain.\u001b[0m\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"'Hakeem Olajuwon\\'s Basketball-Reference.com NBA player ID is \"o/olajuha01\".'"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 24,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"agent_executor.run(\"What is the Basketball-Reference.com NBA player ID of Hakeem Olajuwon?\")"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": null,
|
||
|
"id": "05fb3a3e-8a9f-482d-bd54-4c6e60ef60dd",
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": []
|
||
|
}
|
||
|
],
|
||
|
"metadata": {
|
||
|
"kernelspec": {
|
||
|
"display_name": "conda210",
|
||
|
"language": "python",
|
||
|
"name": "conda210"
|
||
|
},
|
||
|
"language_info": {
|
||
|
"codemirror_mode": {
|
||
|
"name": "ipython",
|
||
|
"version": 3
|
||
|
},
|
||
|
"file_extension": ".py",
|
||
|
"mimetype": "text/x-python",
|
||
|
"name": "python",
|
||
|
"nbconvert_exporter": "python",
|
||
|
"pygments_lexer": "ipython3",
|
||
|
"version": "3.9.16"
|
||
|
}
|
||
|
},
|
||
|
"nbformat": 4,
|
||
|
"nbformat_minor": 5
|
||
|
}
|