You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
langchain/docs/docs/integrations/graphs/arangodb.ipynb

827 lines
23 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"id": "c94240f5",
"metadata": {
"id": "c94240f5"
},
"source": [
"# ArangoDB\n",
"\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/arangodb/interactive_tutorials/blob/master/notebooks/Langchain.ipynb)\n",
"\n",
">[ArangoDB](https://github.com/arangodb/arangodb) is a scalable graph database system to drive value from\n",
">connected data, faster. Native graphs, an integrated search engine, and JSON support, via\n",
">a single query language. `ArangoDB` runs on-prem or in the cloud.\n",
"\n",
"This notebook shows how to use LLMs to provide a natural language interface to an [ArangoDB](https://github.com/arangodb/arangodb#readme) database."
]
},
{
"cell_type": "markdown",
"id": "dbc0ee68",
"metadata": {
"id": "dbc0ee68"
},
"source": [
"## Setting up\n",
"\n",
"You can get a local `ArangoDB` instance running via the [ArangoDB Docker image](https://hub.docker.com/_/arangodb): \n",
"\n",
"```\n",
"docker run -p 8529:8529 -e ARANGO_ROOT_PASSWORD= arangodb/arangodb\n",
"```\n",
"\n",
"An alternative is to use the [ArangoDB Cloud Connector package](https://github.com/arangodb/adb-cloud-connector#readme) to get a temporary cloud instance running:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "izi6YoFC8KRH",
"metadata": {
"id": "izi6YoFC8KRH"
},
"outputs": [],
"source": [
"%%capture\n",
"%pip install --upgrade --quiet python-arango # The ArangoDB Python Driver\n",
"%pip install --upgrade --quiet adb-cloud-connector # The ArangoDB Cloud Instance provisioner\n",
"%pip install --upgrade --quiet langchain-openai\n",
"%pip install --upgrade --quiet langchain"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "62812aad",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "62812aad",
"outputId": "f7ed8346-d88b-40d1-eaff-68e97e0e157e"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Log: requesting new credentials...\n",
"Succcess: new credentials acquired\n",
"{\n",
" \"dbName\": \"TUT3sp29s3pjf1io0h4cfdsq\",\n",
" \"username\": \"TUTo6nkwgzkizej3kysgdyeo8\",\n",
" \"password\": \"TUT9vx0qjqt42i9bq8uik4v9\",\n",
" \"hostname\": \"tutorials.arangodb.cloud\",\n",
" \"port\": 8529,\n",
" \"url\": \"https://tutorials.arangodb.cloud:8529\"\n",
"}\n"
]
}
],
"source": [
"# Instantiate ArangoDB Database\n",
"import json\n",
"\n",
"from adb_cloud_connector import get_temp_credentials\n",
"from arango import ArangoClient\n",
"\n",
"con = get_temp_credentials()\n",
"\n",
"db = ArangoClient(hosts=con[\"url\"]).db(\n",
" con[\"dbName\"], con[\"username\"], con[\"password\"], verify=True\n",
")\n",
"\n",
"print(json.dumps(con, indent=2))"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "0928915d",
"metadata": {
"id": "0928915d"
},
"outputs": [],
"source": [
"# Instantiate the ArangoDB-LangChain Graph\n",
"from langchain_community.graphs import ArangoGraph\n",
"\n",
"graph = ArangoGraph(db)"
]
},
{
"cell_type": "markdown",
"id": "995ea9b9",
"metadata": {
"id": "995ea9b9"
},
"source": [
"## Populating database\n",
"\n",
"We will rely on the `Python Driver` to import our [GameOfThrones](https://github.com/arangodb/example-datasets/tree/master/GameOfThrones) data into our database."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "fedd26b9",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "fedd26b9",
"outputId": "fc7d9067-e4f5-495e-cd0c-79135bc16fc2"
},
"outputs": [
{
"data": {
"text/plain": [
"{'error': False,\n",
" 'created': 4,\n",
" 'errors': 0,\n",
" 'empty': 0,\n",
" 'updated': 0,\n",
" 'ignored': 0,\n",
" 'details': []}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"if db.has_graph(\"GameOfThrones\"):\n",
" db.delete_graph(\"GameOfThrones\", drop_collections=True)\n",
"\n",
"db.create_graph(\n",
" \"GameOfThrones\",\n",
" edge_definitions=[\n",
" {\n",
" \"edge_collection\": \"ChildOf\",\n",
" \"from_vertex_collections\": [\"Characters\"],\n",
" \"to_vertex_collections\": [\"Characters\"],\n",
" },\n",
" ],\n",
")\n",
"\n",
"documents = [\n",
" {\n",
" \"_key\": \"NedStark\",\n",
" \"name\": \"Ned\",\n",
" \"surname\": \"Stark\",\n",
" \"alive\": True,\n",
" \"age\": 41,\n",
" \"gender\": \"male\",\n",
" },\n",
" {\n",
" \"_key\": \"CatelynStark\",\n",
" \"name\": \"Catelyn\",\n",
" \"surname\": \"Stark\",\n",
" \"alive\": False,\n",
" \"age\": 40,\n",
" \"gender\": \"female\",\n",
" },\n",
" {\n",
" \"_key\": \"AryaStark\",\n",
" \"name\": \"Arya\",\n",
" \"surname\": \"Stark\",\n",
" \"alive\": True,\n",
" \"age\": 11,\n",
" \"gender\": \"female\",\n",
" },\n",
" {\n",
" \"_key\": \"BranStark\",\n",
" \"name\": \"Bran\",\n",
" \"surname\": \"Stark\",\n",
" \"alive\": True,\n",
" \"age\": 10,\n",
" \"gender\": \"male\",\n",
" },\n",
"]\n",
"\n",
"edges = [\n",
" {\"_to\": \"Characters/NedStark\", \"_from\": \"Characters/AryaStark\"},\n",
" {\"_to\": \"Characters/NedStark\", \"_from\": \"Characters/BranStark\"},\n",
" {\"_to\": \"Characters/CatelynStark\", \"_from\": \"Characters/AryaStark\"},\n",
" {\"_to\": \"Characters/CatelynStark\", \"_from\": \"Characters/BranStark\"},\n",
"]\n",
"\n",
"db.collection(\"Characters\").import_bulk(documents)\n",
"db.collection(\"ChildOf\").import_bulk(edges)"
]
},
{
"cell_type": "markdown",
"id": "58c1a8ea",
"metadata": {
"id": "58c1a8ea"
},
"source": [
"## Getting and setting the ArangoDB schema\n",
"\n",
"An initial `ArangoDB Schema` is generated upon instantiating the `ArangoDBGraph` object. Below are the schema's getter & setter methods should you be interested in viewing or modifying the schema:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "4e3de44f",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "4e3de44f",
"outputId": "6102f0c6-4a94-4e00-b93e-eb8de2f7d9d5"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"Graph Schema\": [],\n",
" \"Collection Schema\": []\n",
"}\n"
]
}
],
"source": [
"# The schema should be empty here,\n",
"# since `graph` was initialized prior to ArangoDB Data ingestion (see above).\n",
"\n",
"import json\n",
"\n",
"print(json.dumps(graph.schema, indent=4))"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "1fe76ccd",
"metadata": {
"id": "1fe76ccd"
},
"outputs": [],
"source": [
"graph.set_schema()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "mZ679anj_-Er",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "mZ679anj_-Er",
"outputId": "e05229c7-bc61-4803-d720-47e3e9b2b350"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"Graph Schema\": [\n",
" {\n",
" \"graph_name\": \"GameOfThrones\",\n",
" \"edge_definitions\": [\n",
" {\n",
" \"edge_collection\": \"ChildOf\",\n",
" \"from_vertex_collections\": [\n",
" \"Characters\"\n",
" ],\n",
" \"to_vertex_collections\": [\n",
" \"Characters\"\n",
" ]\n",
" }\n",
" ]\n",
" }\n",
" ],\n",
" \"Collection Schema\": [\n",
" {\n",
" \"collection_name\": \"ChildOf\",\n",
" \"collection_type\": \"edge\",\n",
" \"edge_properties\": [\n",
" {\n",
" \"name\": \"_key\",\n",
" \"type\": \"str\"\n",
" },\n",
" {\n",
" \"name\": \"_id\",\n",
" \"type\": \"str\"\n",
" },\n",
" {\n",
" \"name\": \"_from\",\n",
" \"type\": \"str\"\n",
" },\n",
" {\n",
" \"name\": \"_to\",\n",
" \"type\": \"str\"\n",
" },\n",
" {\n",
" \"name\": \"_rev\",\n",
" \"type\": \"str\"\n",
" }\n",
" ],\n",
" \"example_edge\": {\n",
" \"_key\": \"266218884025\",\n",
" \"_id\": \"ChildOf/266218884025\",\n",
" \"_from\": \"Characters/AryaStark\",\n",
" \"_to\": \"Characters/NedStark\",\n",
" \"_rev\": \"_gVPKGSq---\"\n",
" }\n",
" },\n",
" {\n",
" \"collection_name\": \"Characters\",\n",
" \"collection_type\": \"document\",\n",
" \"document_properties\": [\n",
" {\n",
" \"name\": \"_key\",\n",
" \"type\": \"str\"\n",
" },\n",
" {\n",
" \"name\": \"_id\",\n",
" \"type\": \"str\"\n",
" },\n",
" {\n",
" \"name\": \"_rev\",\n",
" \"type\": \"str\"\n",
" },\n",
" {\n",
" \"name\": \"name\",\n",
" \"type\": \"str\"\n",
" },\n",
" {\n",
" \"name\": \"surname\",\n",
" \"type\": \"str\"\n",
" },\n",
" {\n",
" \"name\": \"alive\",\n",
" \"type\": \"bool\"\n",
" },\n",
" {\n",
" \"name\": \"age\",\n",
" \"type\": \"int\"\n",
" },\n",
" {\n",
" \"name\": \"gender\",\n",
" \"type\": \"str\"\n",
" }\n",
" ],\n",
" \"example_document\": {\n",
" \"_key\": \"NedStark\",\n",
" \"_id\": \"Characters/NedStark\",\n",
" \"_rev\": \"_gVPKGPi---\",\n",
" \"name\": \"Ned\",\n",
" \"surname\": \"Stark\",\n",
" \"alive\": true,\n",
" \"age\": 41,\n",
" \"gender\": \"male\"\n",
" }\n",
" }\n",
" ]\n",
"}\n"
]
}
],
"source": [
"# We can now view the generated schema\n",
"\n",
"import json\n",
"\n",
"print(json.dumps(graph.schema, indent=4))"
]
},
{
"cell_type": "markdown",
"id": "68a3c677",
"metadata": {
"id": "68a3c677"
},
"source": [
"## Querying the ArangoDB database\n",
"\n",
"We can now use the `ArangoDB Graph` QA Chain to inquire about our data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "635c4018",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"your-key-here\""
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "7476ce98",
"metadata": {
"id": "7476ce98"
},
"outputs": [],
"source": [
"from langchain.chains import ArangoGraphQAChain\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"chain = ArangoGraphQAChain.from_llm(\n",
" ChatOpenAI(temperature=0), graph=graph, verbose=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "ef8ee27b",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 261
},
"id": "ef8ee27b",
"outputId": "6008ee92-a3ec-4968-d48d-6a5b66403959"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ArangoGraphQAChain chain...\u001b[0m\n",
"AQL Query (1):\u001b[32;1m\u001b[1;3m\n",
"WITH Characters\n",
"FOR character IN Characters\n",
"FILTER character.name == \"Ned\" AND character.surname == \"Stark\"\n",
"RETURN character.alive\n",
"\u001b[0m\n",
"AQL Result:\n",
"\u001b[32;1m\u001b[1;3m[True]\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'Yes, Ned Stark is alive.'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"Is Ned Stark alive?\")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "9CSig1BgA76q",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 261
},
"id": "9CSig1BgA76q",
"outputId": "3060cf15-68e0-4f8a-cdfd-68af3f0e5fbf"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ArangoGraphQAChain chain...\u001b[0m\n",
"AQL Query (1):\u001b[32;1m\u001b[1;3m\n",
"WITH Characters\n",
"FOR character IN Characters\n",
"FILTER character.name == \"Arya\" && character.surname == \"Stark\"\n",
"RETURN character.age\n",
"\u001b[0m\n",
"AQL Result:\n",
"\u001b[32;1m\u001b[1;3m[11]\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'Arya Stark is 11 years old.'"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"How old is Arya Stark?\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "9Fzdic_pA_4y",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 298
},
"id": "9Fzdic_pA_4y",
"outputId": "9bd93580-964e-4c53-e273-6723dad5f375"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ArangoGraphQAChain chain...\u001b[0m\n",
"AQL Query (1):\u001b[32;1m\u001b[1;3m\n",
"WITH Characters, ChildOf\n",
"FOR v, e, p IN 1..1 OUTBOUND 'Characters/AryaStark' ChildOf\n",
" FILTER p.vertices[-1]._key == 'NedStark'\n",
" RETURN p\n",
"\u001b[0m\n",
"AQL Result:\n",
"\u001b[32;1m\u001b[1;3m[{'vertices': [{'_key': 'AryaStark', '_id': 'Characters/AryaStark', '_rev': '_gVPKGPi--B', 'name': 'Arya', 'surname': 'Stark', 'alive': True, 'age': 11, 'gender': 'female'}, {'_key': 'NedStark', '_id': 'Characters/NedStark', '_rev': '_gVPKGPi---', 'name': 'Ned', 'surname': 'Stark', 'alive': True, 'age': 41, 'gender': 'male'}], 'edges': [{'_key': '266218884025', '_id': 'ChildOf/266218884025', '_from': 'Characters/AryaStark', '_to': 'Characters/NedStark', '_rev': '_gVPKGSq---'}], 'weights': [0, 1]}]\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'Yes, Arya Stark and Ned Stark are related. According to the information retrieved from the database, there is a relationship between them. Arya Stark is the child of Ned Stark.'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"Are Arya Stark and Ned Stark related?\")"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "zq_oeDpAOXpF",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 261
},
"id": "zq_oeDpAOXpF",
"outputId": "a47f37b5-4d7b-41c6-fbc7-7b1abc25fa20"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ArangoGraphQAChain chain...\u001b[0m\n",
"AQL Query (1):\u001b[32;1m\u001b[1;3m\n",
"WITH Characters, ChildOf\n",
"FOR v, e IN 1..1 OUTBOUND 'Characters/AryaStark' ChildOf\n",
"FILTER v.alive == false\n",
"RETURN e\n",
"\u001b[0m\n",
"AQL Result:\n",
"\u001b[32;1m\u001b[1;3m[{'_key': '266218884027', '_id': 'ChildOf/266218884027', '_from': 'Characters/AryaStark', '_to': 'Characters/CatelynStark', '_rev': '_gVPKGSu---'}]\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'Yes, Arya Stark has a dead parent. The parent is Catelyn Stark.'"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"Does Arya Stark have a dead parent?\")"
]
},
{
"cell_type": "markdown",
"id": "Ob_3aGauGd7d",
"metadata": {
"id": "Ob_3aGauGd7d"
},
"source": [
"## Chain modifiers"
]
},
{
"cell_type": "markdown",
"id": "3P490E2dGiBp",
"metadata": {
"id": "3P490E2dGiBp"
},
"source": [
"You can alter the values of the following `ArangoDBGraphQAChain` class variables to modify the behaviour of your chain results\n"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "1B9h3PvzJ41T",
"metadata": {
"id": "1B9h3PvzJ41T"
},
"outputs": [],
"source": [
"# Specify the maximum number of AQL Query Results to return\n",
"chain.top_k = 10\n",
"\n",
"# Specify whether or not to return the AQL Query in the output dictionary\n",
"chain.return_aql_query = True\n",
"\n",
"# Specify whether or not to return the AQL JSON Result in the output dictionary\n",
"chain.return_aql_result = True\n",
"\n",
"# Specify the maximum amount of AQL Generation attempts that should be made\n",
"chain.max_aql_generation_attempts = 5\n",
"\n",
"# Specify a set of AQL Query Examples, which are passed to\n",
"# the AQL Generation Prompt Template to promote few-shot-learning.\n",
"# Defaults to an empty string.\n",
"chain.aql_examples = \"\"\"\n",
"# Is Ned Stark alive?\n",
"RETURN DOCUMENT('Characters/NedStark').alive\n",
"\n",
"# Is Arya Stark the child of Ned Stark?\n",
"FOR e IN ChildOf\n",
" FILTER e._from == \"Characters/AryaStark\" AND e._to == \"Characters/NedStark\"\n",
" RETURN e\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "49cnjYV-PUv3",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 209
},
"id": "49cnjYV-PUv3",
"outputId": "f05f0a86-f922-47d1-91b9-5380ef1f996d"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ArangoGraphQAChain chain...\u001b[0m\n",
"AQL Query (1):\u001b[32;1m\u001b[1;3m\n",
"RETURN DOCUMENT('Characters/NedStark').alive\n",
"\u001b[0m\n",
"AQL Result:\n",
"\u001b[32;1m\u001b[1;3m[True]\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'Yes, according to the information in the database, Ned Stark is alive.'"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"Is Ned Stark alive?\")\n",
"\n",
"# chain(\"Is Ned Stark alive?\") # Returns a dictionary with the AQL Query & AQL Result"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "nWfALJ8dPczE",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 244
},
"id": "nWfALJ8dPczE",
"outputId": "1235baae-f3f7-438e-ef24-658cd17f727d"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ArangoGraphQAChain chain...\u001b[0m\n",
"AQL Query (1):\u001b[32;1m\u001b[1;3m\n",
"FOR e IN ChildOf\n",
" FILTER e._from == \"Characters/BranStark\" AND e._to == \"Characters/NedStark\"\n",
" RETURN e\n",
"\u001b[0m\n",
"AQL Result:\n",
"\u001b[32;1m\u001b[1;3m[{'_key': '266218884026', '_id': 'ChildOf/266218884026', '_from': 'Characters/BranStark', '_to': 'Characters/NedStark', '_rev': '_gVPKGSq--_'}]\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'Yes, according to the information in the ArangoDB database, Bran Stark is indeed the child of Ned Stark.'"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"Is Bran Stark the child of Ned Stark?\")"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [
"995ea9b9",
"58c1a8ea",
"68a3c677",
"Ob_3aGauGd7d"
],
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}