{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# KuzuQAChain\n", "\n", "This notebook shows how to use LLMs to provide a natural language interface to [Kùzu](https://kuzudb.com) database." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "[Kùzu](https://kuzudb.com) is an in-process property graph database management system. You can simply install it with `pip`:\n", "\n", "```bash\n", "pip install kuzu\n", "```\n", "\n", "Once installed, you can simply import it and start creating a database on the local machine and connect to it:\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import kuzu\n", "\n", "db = kuzu.Database(\"test_db\")\n", "conn = kuzu.Connection(db)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "First, we create the schema for a simple movie database:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.execute(\"CREATE NODE TABLE Movie (name STRING, PRIMARY KEY(name))\")\n", "conn.execute(\n", " \"CREATE NODE TABLE Person (name STRING, birthDate STRING, PRIMARY KEY(name))\"\n", ")\n", "conn.execute(\"CREATE REL TABLE ActedIn (FROM Person TO Movie)\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Then we can insert some data." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.execute(\"CREATE (:Person {name: 'Al Pacino', birthDate: '1940-04-25'})\")\n", "conn.execute(\"CREATE (:Person {name: 'Robert De Niro', birthDate: '1943-08-17'})\")\n", "conn.execute(\"CREATE (:Movie {name: 'The Godfather'})\")\n", "conn.execute(\"CREATE (:Movie {name: 'The Godfather: Part II'})\")\n", "conn.execute(\n", " \"CREATE (:Movie {name: 'The Godfather Coda: The Death of Michael Corleone'})\"\n", ")\n", "conn.execute(\n", " \"MATCH (p:Person), (m:Movie) WHERE p.name = 'Al Pacino' AND m.name = 'The Godfather' CREATE (p)-[:ActedIn]->(m)\"\n", ")\n", "conn.execute(\n", " \"MATCH (p:Person), (m:Movie) WHERE p.name = 'Al Pacino' AND m.name = 'The Godfather: Part II' CREATE (p)-[:ActedIn]->(m)\"\n", ")\n", "conn.execute(\n", " \"MATCH (p:Person), (m:Movie) WHERE p.name = 'Al Pacino' AND m.name = 'The Godfather Coda: The Death of Michael Corleone' CREATE (p)-[:ActedIn]->(m)\"\n", ")\n", "conn.execute(\n", " \"MATCH (p:Person), (m:Movie) WHERE p.name = 'Robert De Niro' AND m.name = 'The Godfather: Part II' CREATE (p)-[:ActedIn]->(m)\"\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Creating `KuzuQAChain`\n", "\n", "We can now create the `KuzuGraph` and `KuzuQAChain`. To create the `KuzuGraph` we simply need to pass the database object to the `KuzuGraph` constructor." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "from langchain.chat_models import ChatOpenAI\n", "from langchain.graphs import KuzuGraph\n", "from langchain.chains import KuzuQAChain" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "graph = KuzuGraph(db)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "chain = KuzuQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph, verbose=True)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Refresh graph schema information\n", "\n", "If the schema of database changes, you can refresh the schema information needed to generate Cypher statements." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# graph.refresh_schema()" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Node properties: [{'properties': [('name', 'STRING')], 'label': 'Movie'}, {'properties': [('name', 'STRING'), ('birthDate', 'STRING')], 'label': 'Person'}]\n", "Relationships properties: [{'properties': [], 'label': 'ActedIn'}]\n", "Relationships: ['(:Person)-[:ActedIn]->(:Movie)']\n", "\n" ] } ], "source": [ "print(graph.get_schema)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Querying the graph\n", "\n", "We can now use the `KuzuQAChain` to ask question of the graph" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "\u001b[1m> Entering new chain...\u001b[0m\n", "Generated Cypher:\n", "\u001b[32;1m\u001b[1;3mMATCH (p:Person)-[:ActedIn]->(m:Movie {name: 'The Godfather: Part II'}) RETURN p.name\u001b[0m\n", "Full Context:\n", "\u001b[32;1m\u001b[1;3m[{'p.name': 'Al Pacino'}, {'p.name': 'Robert De Niro'}]\u001b[0m\n", "\n", "\u001b[1m> Finished chain.\u001b[0m\n" ] }, { "data": { "text/plain": [ "'Al Pacino and Robert De Niro both played in The Godfather: Part II.'" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "chain.run(\"Who played in The Godfather: Part II?\")" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "\u001b[1m> Entering new chain...\u001b[0m\n", "Generated Cypher:\n", "\u001b[32;1m\u001b[1;3mMATCH (p:Person {name: 'Robert De Niro'})-[:ActedIn]->(m:Movie)\n", "RETURN m.name\u001b[0m\n", "Full Context:\n", "\u001b[32;1m\u001b[1;3m[{'m.name': 'The Godfather: Part II'}]\u001b[0m\n", "\n", "\u001b[1m> Finished chain.\u001b[0m\n" ] }, { "data": { "text/plain": [ "'Robert De Niro played in The Godfather: Part II.'" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "chain.run(\"Robert De Niro played in which movies?\")" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "\u001b[1m> Entering new chain...\u001b[0m\n", "Generated Cypher:\n", "\u001b[32;1m\u001b[1;3mMATCH (p:Person {name: 'Robert De Niro'})-[:ActedIn]->(m:Movie)\n", "RETURN p.birthDate\u001b[0m\n", "Full Context:\n", "\u001b[32;1m\u001b[1;3m[{'p.birthDate': '1943-08-17'}]\u001b[0m\n", "\n", "\u001b[1m> Finished chain.\u001b[0m\n" ] }, { "data": { "text/plain": [ "'Robert De Niro was born on August 17, 1943.'" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "chain.run(\"Robert De Niro is born in which year?\")" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "\u001b[1m> Entering new chain...\u001b[0m\n", "Generated Cypher:\n", "\u001b[32;1m\u001b[1;3mMATCH (p:Person)-[:ActedIn]->(m:Movie{name:'The Godfather: Part II'})\n", "WITH p, m, p.birthDate AS birthDate\n", "ORDER BY birthDate ASC\n", "LIMIT 1\n", "RETURN p.name\u001b[0m\n", "Full Context:\n", "\u001b[32;1m\u001b[1;3m[{'p.name': 'Al Pacino'}]\u001b[0m\n", "\n", "\u001b[1m> Finished chain.\u001b[0m\n" ] }, { "data": { "text/plain": [ "'The oldest actor who played in The Godfather: Part II is Al Pacino.'" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "chain.run(\"Who is the oldest actor who played in The Godfather: Part II?\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.4" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }