{ "cells": [ { "cell_type": "markdown", "id": "d2777010", "metadata": {}, "source": [ "# HugeGraph QA Chain\n", "\n", "This notebook shows how to use LLMs to provide a natural language interface to a [HugeGraph](https://hugegraph.apache.org/cn/) database." ] }, { "cell_type": "markdown", "id": "f26dcbe4", "metadata": {}, "source": [ "You will need to have a running HugeGraph instance.\n", "You can start a local Docker container by running the following command:\n", "\n", "```\n", "docker run \\\n", " --name=graph \\\n", " -itd \\\n", " -p 8080:8080 \\\n", " hugegraph/hugegraph\n", "```\n", "\n", "To connect to HugeGraph from the application, install the Python SDK:\n", "\n", "```\n", "pip3 install hugegraph-python\n", "```" ] }, { "cell_type": "markdown", "id": "d64a29f1", "metadata": {}, "source": [ "If you are using the Docker container, wait a few seconds for the database to start. Then create the schema and write the graph data to the database." 
] }, { "cell_type": "code", "execution_count": 13, "id": "e53ab93e", "metadata": {}, "outputs": [], "source": [ "from hugegraph.connection import PyHugeGraph\n", "\n", "client = PyHugeGraph(\"localhost\", \"8080\", user=\"admin\", pwd=\"admin\", graph=\"hugegraph\")" ] }, { "cell_type": "markdown", "id": "b7c3a50e", "metadata": {}, "source": [ "First, we create the schema for a simple movie database:" ] }, { "cell_type": "code", "execution_count": 4, "id": "ef5372a8", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'create EdgeLabel success, Detail: \"b\\'{\"id\":1,\"name\":\"ActedIn\",\"source_label\":\"Person\",\"target_label\":\"Movie\",\"frequency\":\"SINGLE\",\"sort_keys\":[],\"nullable_keys\":[],\"index_labels\":[],\"properties\":[],\"status\":\"CREATED\",\"ttl\":0,\"enable_label_index\":true,\"user_data\":{\"~create_time\":\"2023-07-04 10:48:47.908\"}}\\'\"'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"\"\"schema\"\"\"\n", "schema = client.schema()\n", "schema.propertyKey(\"name\").asText().ifNotExist().create()\n", "schema.propertyKey(\"birthDate\").asText().ifNotExist().create()\n", "schema.vertexLabel(\"Person\").properties(\"name\", \"birthDate\").usePrimaryKeyId().primaryKeys(\"name\").ifNotExist().create()\n", "schema.vertexLabel(\"Movie\").properties(\"name\").usePrimaryKeyId().primaryKeys(\"name\").ifNotExist().create()\n", "schema.edgeLabel(\"ActedIn\").sourceLabel(\"Person\").targetLabel(\"Movie\").ifNotExist().create()" ] }, { "cell_type": "markdown", "id": "016f7989", "metadata": {}, "source": [ "Then we can insert some data." 
] }, { "cell_type": "code", "execution_count": 26, "id": "b7f4c370", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1:Robert De Niro--ActedIn-->2:The Godfather Part II" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"\"\"graph\"\"\"\n", "g = client.graph()\n", "g.addVertex(\"Person\", {\"name\": \"Al Pacino\", \"birthDate\": \"1940-04-25\"})\n", "g.addVertex(\"Person\", {\"name\": \"Robert De Niro\", \"birthDate\": \"1943-08-17\"})\n", "g.addVertex(\"Movie\", {\"name\": \"The Godfather\"})\n", "g.addVertex(\"Movie\", {\"name\": \"The Godfather Part II\"})\n", "g.addVertex(\"Movie\", {\"name\": \"The Godfather Coda The Death of Michael Corleone\"})\n", "\n", "g.addEdge(\"ActedIn\", \"1:Al Pacino\", \"2:The Godfather\", {})\n", "g.addEdge(\"ActedIn\", \"1:Al Pacino\", \"2:The Godfather Part II\", {})\n", "g.addEdge(\"ActedIn\", \"1:Al Pacino\", \"2:The Godfather Coda The Death of Michael Corleone\", {})\n", "g.addEdge(\"ActedIn\", \"1:Robert De Niro\", \"2:The Godfather Part II\", {})" ] }, { "cell_type": "markdown", "id": "5b8f7788", "metadata": {}, "source": [ "## Creating `HugeGraphQAChain`\n", "\n", "We can now create the `HugeGraph` and `HugeGraphQAChain`. To create the `HugeGraph`, we pass the connection details (username, password, address, port, and graph name) to the `HugeGraph` constructor." 
] }, { "cell_type": "code", "execution_count": 27, "id": "f1f68fcf", "metadata": { "is_executing": true }, "outputs": [], "source": [ "from langchain.chat_models import ChatOpenAI\n", "from langchain.chains import HugeGraphQAChain\n", "from langchain.graphs import HugeGraph" ] }, { "cell_type": "code", "execution_count": 28, "id": "b86ebfa7", "metadata": {}, "outputs": [], "source": [ "graph = HugeGraph(\n", " username=\"admin\",\n", " password=\"admin\",\n", " address=\"localhost\",\n", " port=8080,\n", " graph=\"hugegraph\"\n", ")" ] }, { "cell_type": "markdown", "id": "e262540b", "metadata": {}, "source": [ "## Refresh graph schema information\n", "\n", "If the schema of the database changes, you can refresh the schema information needed to generate Gremlin statements." ] }, { "cell_type": "code", "execution_count": 29, "id": "134dd8d6", "metadata": {}, "outputs": [], "source": [ "# graph.refresh_schema()" ] }, { "cell_type": "code", "execution_count": 30, "id": "e78b8e72", "metadata": { "ExecuteTime": {} }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Node properties: [name: Person, primary_keys: ['name'], properties: ['name', 'birthDate'], name: Movie, primary_keys: ['name'], properties: ['name']]\n", "Edge properties: [name: ActedIn, properties: []]\n", "Relationships: ['Person--ActedIn-->Movie']\n", "\n" ] } ], "source": [ "print(graph.get_schema)" ] }, { "cell_type": "markdown", "id": "5c27e813", "metadata": {}, "source": [ "## Querying the graph\n", "\n", "We can now use the Gremlin QA chain to ask questions about the graph." ] }, { "cell_type": "code", "execution_count": 31, "id": "3b23dead", "metadata": {}, "outputs": [], "source": [ "chain = HugeGraphQAChain.from_llm(\n", " ChatOpenAI(temperature=0), graph=graph, verbose=True\n", ")" ] }, { "cell_type": "code", "execution_count": 32, "id": "76aecc93", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "\u001b[1m> Entering new 
chain...\u001b[0m\n", "Generated gremlin:\n", "\u001b[32;1m\u001b[1;3mg.V().has('Movie', 'name', 'The Godfather').in('ActedIn').valueMap(true)\u001b[0m\n", "Full Context:\n", "\u001b[32;1m\u001b[1;3m[{'id': '1:Al Pacino', 'label': 'Person', 'name': ['Al Pacino'], 'birthDate': ['1940-04-25']}]\u001b[0m\n", "\n", "\u001b[1m> Finished chain.\u001b[0m\n" ] }, { "data": { "text/plain": [ "'Al Pacino played in The Godfather.'" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "chain.run(\"Who played in The Godfather?\")" ] }, { "cell_type": "code", "execution_count": null, "id": "869f0258", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "venv", "language": "python", "name": "venv" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" } }, "nbformat": 4, "nbformat_minor": 5 }