mirror of https://github.com/hwchase17/langchain
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Amazon Neptune with SPARQL\n",
    "\n",
    ">[Amazon Neptune](https://aws.amazon.com/neptune/) is a high-performance graph analytics and serverless database for superior scalability and availability.\n",
    ">\n",
    ">This example shows a QA chain that queries [Resource Description Framework (RDF)](https://en.wikipedia.org/wiki/Resource_Description_Framework) data\n",
    ">in an `Amazon Neptune` graph database using the `SPARQL` query language and returns a human-readable response.\n",
    ">\n",
    ">[SPARQL](https://en.wikipedia.org/wiki/SPARQL) is a standard query language for `RDF` graphs.\n",
    "\n",
    "\n",
    "This example uses a `NeptuneRdfGraph` class that connects to the Neptune database and loads its schema.\n",
    "The `NeptuneSparqlQAChain` connects the graph and the LLM so that you can ask natural language questions.\n",
    "\n",
    "This notebook demonstrates an example using organizational data.\n",
    "\n",
    "Requirements for running this notebook:\n",
    "- Neptune 1.2.x cluster accessible from this notebook\n",
    "- Kernel with Python 3.9 or higher\n",
    "- For Bedrock access, ensure the IAM role has this policy:\n",
    "\n",
    "```json\n",
    "{\n",
    "  \"Action\": [\n",
    "    \"bedrock:ListFoundationModels\",\n",
    "    \"bedrock:InvokeModel\"\n",
    "  ],\n",
    "  \"Resource\": \"*\",\n",
    "  \"Effect\": \"Allow\"\n",
    "}\n",
    "```\n",
    "\n",
    "- S3 bucket for staging sample data. The bucket should be in the same account/region as Neptune."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting up\n",
    "\n",
    "### Seed the W3C organizational data\n",
    "\n",
    "Seed the W3C organizational data: the W3C org ontology plus some sample instances.\n",
    "\n",
    "You will need an S3 bucket in the same region and account as Neptune. Set `STAGE_BUCKET` to the name of that bucket."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "STAGE_BUCKET = \"<bucket-name>\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash -s \"$STAGE_BUCKET\"\n",
    "\n",
    "rm -rf data\n",
    "mkdir -p data\n",
    "cd data\n",
    "echo getting org ontology and sample org instances\n",
    "wget http://www.w3.org/ns/org.ttl\n",
    "wget https://raw.githubusercontent.com/aws-samples/amazon-neptune-ontology-example-blog/main/data/example_org.ttl\n",
    "\n",
    "echo Copying org ttl to S3\n",
    "aws s3 cp org.ttl s3://$1/org.ttl\n",
    "aws s3 cp example_org.ttl s3://$1/example_org.ttl\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Bulk-load the org ttl - both ontology and instances"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load -s s3://{STAGE_BUCKET} -f turtle --store-to loadres --run"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_status {loadres['payload']['loadId']} --errors --details"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set up the chain"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install --upgrade --force-reinstall langchain"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install --upgrade --force-reinstall langchain-core"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install --upgrade --force-reinstall langchain-community"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Restart the kernel** so that the newly installed packages are picked up."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Prepare an example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "EXAMPLES = \"\"\"\n",
    "\n",
    "<question>\n",
    "Find organizations.\n",
    "</question>\n",
    "\n",
    "<sparql>\n",
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n",
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
    "PREFIX org: <http://www.w3.org/ns/org#>\n",
    "\n",
    "select ?org ?orgName where {{\n",
    "    ?org rdfs:label ?orgName .\n",
    "}}\n",
    "</sparql>\n",
    "\n",
    "<question>\n",
    "Find sites of an organization\n",
    "</question>\n",
    "\n",
    "<sparql>\n",
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n",
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
    "PREFIX org: <http://www.w3.org/ns/org#>\n",
    "\n",
    "select ?org ?orgName ?siteName where {{\n",
    "    ?org rdfs:label ?orgName .\n",
    "    ?org org:hasSite/rdfs:label ?siteName .\n",
    "}}\n",
    "</sparql>\n",
    "\n",
    "<question>\n",
    "Find suborganizations of an organization\n",
    "</question>\n",
    "\n",
    "<sparql>\n",
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n",
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
    "PREFIX org: <http://www.w3.org/ns/org#>\n",
    "\n",
    "select ?org ?orgName ?subName where {{\n",
    "    ?org rdfs:label ?orgName .\n",
    "    ?org org:hasSubOrganization/rdfs:label ?subName .\n",
    "}}\n",
    "</sparql>\n",
    "\n",
    "<question>\n",
    "Find organizational units of an organization\n",
    "</question>\n",
    "\n",
    "<sparql>\n",
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n",
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
    "PREFIX org: <http://www.w3.org/ns/org#>\n",
    "\n",
    "select ?org ?orgName ?unitName where {{\n",
    "    ?org rdfs:label ?orgName .\n",
    "    ?org org:hasUnit/rdfs:label ?unitName .\n",
    "}}\n",
    "</sparql>\n",
    "\n",
    "<question>\n",
    "Find members of an organization. Also find their manager, or the member they report to.\n",
    "</question>\n",
    "\n",
    "<sparql>\n",
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n",
    "PREFIX org: <http://www.w3.org/ns/org#>\n",
    "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n",
    "\n",
    "select * where {{\n",
    "    ?person rdf:type foaf:Person .\n",
    "    ?person org:memberOf ?org .\n",
    "    OPTIONAL {{ ?person foaf:firstName ?firstName . }}\n",
    "    OPTIONAL {{ ?person foaf:family_name ?lastName . }}\n",
    "    OPTIONAL {{ ?person org:reportsTo ?manager }} .\n",
    "}}\n",
    "</sparql>\n",
    "\n",
    "\n",
    "<question>\n",
    "Find change events, such as mergers and acquisitions, of an organization\n",
    "</question>\n",
    "\n",
    "<sparql>\n",
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n",
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
    "PREFIX org: <http://www.w3.org/ns/org#>\n",
    "\n",
    "select ?event ?origOrg ?resultingOrg where {{\n",
    "    ?org rdfs:label ?orgName .\n",
    "    ?event rdf:type org:ChangeEvent .\n",
    "    ?event org:originalOrganization ?origOrg .\n",
    "    ?event org:resultingOrganization ?resultingOrg .\n",
    "}}\n",
    "</sparql>\n",
    "\n",
    "\"\"\""
   ]
  },
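  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The SPARQL examples above write every brace twice (`{{` and `}}`). This is presumably because the examples are placed inside a prompt template that uses Python-style `{name}` placeholders, where a literal brace has to be doubled. The sketch below illustrates the effect with plain `str.format`; the `template` string is a made-up stand-in, not the prompt that `NeptuneSparqlQAChain` actually uses."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustration only: `template` is a hypothetical stand-in for a prompt template.\n",
    "sparql_example = \"select ?org where {{ ?org rdfs:label ?orgName . }}\"\n",
    "template = \"Examples: \" + sparql_example + \" Question: {prompt}\"\n",
    "\n",
    "# str.format fills {prompt} and collapses each doubled brace into a single one.\n",
    "rendered = template.format(prompt=\"Find organizations\")\n",
    "print(rendered)"
   ]
  },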
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import boto3\n",
    "from langchain.chains.graph_qa.neptune_sparql import NeptuneSparqlQAChain\n",
    "from langchain_community.chat_models import BedrockChat\n",
    "from langchain_community.graphs import NeptuneRdfGraph\n",
    "\n",
    "# Connect to the Neptune cluster; the graph object also loads the RDF schema.\n",
    "host = \"<your host>\"\n",
    "port = 8182  # change if different\n",
    "region = \"us-east-1\"  # change if different\n",
    "graph = NeptuneRdfGraph(host=host, port=port, use_iam_auth=True, region_name=region)\n",
    "\n",
    "# Optionally change the schema\n",
    "# elems = graph.get_schema_elements\n",
    "# change elems ...\n",
    "# graph.load_schema(elems)\n",
    "\n",
    "# Bedrock-hosted chat model used by the chain.\n",
    "MODEL_ID = \"anthropic.claude-v2\"\n",
    "bedrock_client = boto3.client(\"bedrock-runtime\")\n",
    "llm = BedrockChat(model_id=MODEL_ID, client=bedrock_client)\n",
    "\n",
    "# Build the QA chain from the LLM, the graph, and the example queries.\n",
    "chain = NeptuneSparqlQAChain.from_llm(\n",
    "    llm=llm,\n",
    "    graph=graph,\n",
    "    examples=EXAMPLES,\n",
    "    verbose=True,\n",
    "    top_k=10,\n",
    "    return_intermediate_steps=True,\n",
    "    return_direct=False,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Ask questions\n",
    "\n",
    "The answers depend on the data ingested above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chain.invoke(\"\"\"How many organizations are in the graph\"\"\")"
   ]
  },
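  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because the chain is created with `return_intermediate_steps=True`, the value returned by `chain.invoke` can carry more than the final answer. The helper below is a minimal sketch for reading it. It assumes the response is a dict with a `result` key and an `intermediate_steps` list whose entries hold a `query` (the generated SPARQL) or a `context` (the raw query results); these key names are assumptions, so check them against your installed version. `summarize_response` and `mock_response` are hypothetical names, and the mock dict is hand-written rather than real output. With a live chain you could pass `chain.invoke(...)` to it directly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def summarize_response(response):\n",
    "    \"\"\"Print the answer and any intermediate steps found in a chain response.\"\"\"\n",
    "    print(\"Answer:\", response.get(\"result\"))\n",
    "    # The key names used here are assumptions about the response layout.\n",
    "    for step in response.get(\"intermediate_steps\", []):\n",
    "        if \"query\" in step:\n",
    "            print(\"Generated SPARQL:\", step[\"query\"])\n",
    "        if \"context\" in step:\n",
    "            print(\"Query results:\", step[\"context\"])\n",
    "\n",
    "\n",
    "# Hand-written stand-in for a chain response; not real output.\n",
    "mock_response = {\n",
    "    \"result\": \"<answer text>\",\n",
    "    \"intermediate_steps\": [\n",
    "        {\"query\": \"<generated SPARQL>\"},\n",
    "        {\"context\": \"<raw query results>\"},\n",
    "    ],\n",
    "}\n",
    "summarize_response(mock_response)"
   ]
  },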
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chain.invoke(\"\"\"Are there any mergers or acquisitions\"\"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chain.invoke(\"\"\"Find organizations\"\"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chain.invoke(\"\"\"Find sites of MegaSystems or MegaFinancial\"\"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chain.invoke(\"\"\"Find a member who is manager of one or more members.\"\"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chain.invoke(\"\"\"Find five members and who their manager is.\"\"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chain.invoke(\n",
    "    \"\"\"Find org units or suborganizations of The Mega Group. What are the sites of those units?\"\"\"\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}