Support for SPARQL (#7165)

# [SPARQL](https://www.w3.org/TR/rdf-sparql-query/) for [LangChain](https://github.com/hwchase17/langchain)

## Description
LangChain support for knowledge graphs relying on W3C standards, using
RDFlib: SPARQL / RDF(S) / OWL, with a special focus on RDF.
* Works with local files, files from the web, and SPARQL endpoints
* Supports both SELECT and UPDATE queries
* Includes both a Jupyter notebook with an example and integration tests
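
A minimal usage sketch (mirroring the example notebook; assumes `rdflib` is installed and an OpenAI API key is configured):

```python
from langchain.chat_models import ChatOpenAI
from langchain.chains import GraphSparqlQAChain
from langchain.graphs import RdfGraph

# Load an RDF file from the web; changes are serialized to the local copy
graph = RdfGraph(
    source_file="http://www.w3.org/People/Berners-Lee/card",
    standard="rdf",
    local_copy="test.ttl",
)
chain = GraphSparqlQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph)
chain.run("What is Tim Berners-Lee's work homepage?")
```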

## Contribution compared to related PRs and discussions
* [Wikibase agent](https://github.com/hwchase17/langchain/pull/2690) -
uses SPARQL, but specifically for querying Wikibase
* [Cypher QA](https://github.com/hwchase17/langchain/pull/5078) - graph
DB question answering for Neo4j via Cypher
* [PR 6050](https://github.com/hwchase17/langchain/pull/6050) - attempts
something similar, but does not cover UPDATE queries and supports only
RDF
* Discussions on the [W3C mailing list](mailto:semantic-web@w3.org) about
combining LLMs (specifically ChatGPT) with knowledge graphs

## Dependencies
* [RDFlib](https://github.com/RDFLib/rdflib) (optional dependency; install with `pip install rdflib`)

## Tag maintainer
Graph databases / memory -> @hwchase17

@@ -0,0 +1,300 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "c94240f5",
"metadata": {},
"source": [
"# GraphSparqlQAChain\n",
"\n",
"Graph databases are an excellent choice for applications based on network-like models. To standardize the syntax and semantics of such graphs, the W3C recommends Semantic Web Technologies, cp. [Semantic Web](https://www.w3.org/standards/semanticweb/). [SPARQL](https://www.w3.org/TR/sparql11-query/) serves as a query language analogously to SQL or Cypher for these graphs. This notebook demonstrates the application of LLMs as a natural language interface to a graph database by generating SPARQL.\\\n",
"Disclaimer: To date, SPARQL query generation via LLMs is still a bit unstable. Be especially careful with UPDATE queries, which alter the graph."
]
},
{
"cell_type": "markdown",
"id": "dbc0ee68",
"metadata": {},
"source": [
"There are several sources you can run queries against, including files on the web, files you have available locally, SPARQL endpoints, e.g., [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page), and [triple stores](https://www.w3.org/wiki/LargeTripleStores)."
]
},
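{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch, a graph backed by a triple store can be constructed by passing `query_endpoint` (and, for write access, `update_endpoint`) instead of `source_file`. The endpoint below is an assumption for illustration; loading the schema of a very large store such as Wikidata can take a long time, so the cell is left commented out."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: a read-only graph backed by a public SPARQL endpoint.\n",
"# graph = RdfGraph(\n",
"#     query_endpoint=\"https://query.wikidata.org/sparql\",\n",
"#     standard=\"rdf\",\n",
"# )"
]
},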
{
"cell_type": "code",
"execution_count": 1,
"id": "62812aad",
"metadata": {
"pycharm": {
"is_executing": true
}
},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.chains import GraphSparqlQAChain\n",
"from langchain.graphs import RdfGraph"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "0928915d",
"metadata": {
"pycharm": {
"is_executing": true
}
},
"outputs": [],
"source": [
"graph = RdfGraph(\n",
" source_file=\"http://www.w3.org/People/Berners-Lee/card\",\n",
" standard=\"rdf\",\n",
" local_copy=\"test.ttl\",\n",
")"
]
},
{
"cell_type": "markdown",
"source": [
"Note that providing a `local_file` is necessary for storing changes locally if the source is read-only."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"id": "58c1a8ea",
"metadata": {},
"source": [
"## Refresh graph schema information\n",
"If the schema of the database changes, you can refresh the schema information needed to generate SPARQL queries."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "4e3de44f",
"metadata": {
"pycharm": {
"is_executing": true
}
},
"outputs": [],
"source": [
"graph.load_schema()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "1fe76ccd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"In the following, each IRI is followed by the local name and optionally its description in parentheses. \n",
"The RDF graph supports the following node types:\n",
"<http://xmlns.com/foaf/0.1/PersonalProfileDocument> (PersonalProfileDocument, None), <http://www.w3.org/ns/auth/cert#RSAPublicKey> (RSAPublicKey, None), <http://www.w3.org/2000/10/swap/pim/contact#Male> (Male, None), <http://xmlns.com/foaf/0.1/Person> (Person, None), <http://www.w3.org/2006/vcard/ns#Work> (Work, None)\n",
"The RDF graph supports the following relationships:\n",
"<http://www.w3.org/2000/01/rdf-schema#seeAlso> (seeAlso, None), <http://purl.org/dc/elements/1.1/title> (title, None), <http://xmlns.com/foaf/0.1/mbox_sha1sum> (mbox_sha1sum, None), <http://xmlns.com/foaf/0.1/maker> (maker, None), <http://www.w3.org/ns/solid/terms#oidcIssuer> (oidcIssuer, None), <http://www.w3.org/2000/10/swap/pim/contact#publicHomePage> (publicHomePage, None), <http://xmlns.com/foaf/0.1/openid> (openid, None), <http://www.w3.org/ns/pim/space#storage> (storage, None), <http://xmlns.com/foaf/0.1/name> (name, None), <http://www.w3.org/2000/10/swap/pim/contact#country> (country, None), <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> (type, None), <http://www.w3.org/ns/solid/terms#profileHighlightColor> (profileHighlightColor, None), <http://www.w3.org/ns/pim/space#preferencesFile> (preferencesFile, None), <http://www.w3.org/2000/01/rdf-schema#label> (label, None), <http://www.w3.org/ns/auth/cert#modulus> (modulus, None), <http://www.w3.org/2000/10/swap/pim/contact#participant> (participant, None), <http://www.w3.org/2000/10/swap/pim/contact#street2> (street2, None), <http://www.w3.org/2006/vcard/ns#locality> (locality, None), <http://xmlns.com/foaf/0.1/nick> (nick, None), <http://xmlns.com/foaf/0.1/homepage> (homepage, None), <http://creativecommons.org/ns#license> (license, None), <http://xmlns.com/foaf/0.1/givenname> (givenname, None), <http://www.w3.org/2006/vcard/ns#street-address> (street-address, None), <http://www.w3.org/2006/vcard/ns#postal-code> (postal-code, None), <http://www.w3.org/2000/10/swap/pim/contact#street> (street, None), <http://www.w3.org/2003/01/geo/wgs84_pos#lat> (lat, None), <http://xmlns.com/foaf/0.1/primaryTopic> (primaryTopic, None), <http://www.w3.org/2006/vcard/ns#fn> (fn, None), <http://www.w3.org/2003/01/geo/wgs84_pos#location> (location, None), <http://usefulinc.com/ns/doap#developer> (developer, None), <http://www.w3.org/2000/10/swap/pim/contact#city> (city, None), <http://www.w3.org/2006/vcard/ns#region> (region, None), <http://xmlns.com/foaf/0.1/member> (member, None), <http://www.w3.org/2003/01/geo/wgs84_pos#long> (long, None), <http://www.w3.org/2000/10/swap/pim/contact#address> (address, None), <http://xmlns.com/foaf/0.1/family_name> (family_name, None), <http://xmlns.com/foaf/0.1/account> (account, None), <http://xmlns.com/foaf/0.1/workplaceHomepage> (workplaceHomepage, None), <http://purl.org/dc/terms/title> (title, None), <http://www.w3.org/ns/solid/terms#publicTypeIndex> (publicTypeIndex, None), <http://www.w3.org/2000/10/swap/pim/contact#office> (office, None), <http://www.w3.org/2000/10/swap/pim/contact#homePage> (homePage, None), <http://xmlns.com/foaf/0.1/mbox> (mbox, None), <http://www.w3.org/2000/10/swap/pim/contact#preferredURI> (preferredURI, None), <http://www.w3.org/ns/solid/terms#profileBackgroundColor> (profileBackgroundColor, None), <http://schema.org/owns> (owns, None), <http://xmlns.com/foaf/0.1/based_near> (based_near, None), <http://www.w3.org/2006/vcard/ns#hasAddress> (hasAddress, None), <http://xmlns.com/foaf/0.1/img> (img, None), <http://www.w3.org/2000/10/swap/pim/contact#assistant> (assistant, None), <http://xmlns.com/foaf/0.1/title> (title, None), <http://www.w3.org/ns/auth/cert#key> (key, None), <http://www.w3.org/ns/ldp#inbox> (inbox, None), <http://www.w3.org/ns/solid/terms#editableProfile> (editableProfile, None), <http://www.w3.org/2000/10/swap/pim/contact#postalCode> (postalCode, None), <http://xmlns.com/foaf/0.1/weblog> (weblog, None), <http://www.w3.org/ns/auth/cert#exponent> (exponent, None), 
<http://rdfs.org/sioc/ns#avatar> (avatar, None)\n",
"\n"
]
}
],
"source": [
"graph.get_schema"
]
},
{
"cell_type": "markdown",
"id": "68a3c677",
"metadata": {},
"source": [
"## Querying the graph\n",
"\n",
"Now, you can use the graph SPARQL QA chain to ask questions about the graph."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "7476ce98",
"metadata": {
"pycharm": {
"is_executing": true
}
},
"outputs": [],
"source": [
"chain = GraphSparqlQAChain.from_llm(\n",
" ChatOpenAI(temperature=0), graph=graph, verbose=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "ef8ee27b",
"metadata": {
"pycharm": {
"is_executing": true
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001B[1m> Entering new GraphSparqlQAChain chain...\u001B[0m\n",
"Identified intent:\n",
"\u001B[32;1m\u001B[1;3mSELECT\u001B[0m\n",
"Generated SPARQL:\n",
"\u001B[32;1m\u001B[1;3mPREFIX foaf: <http://xmlns.com/foaf/0.1/>\n",
"SELECT ?homepage\n",
"WHERE {\n",
" ?person foaf:name \"Tim Berners-Lee\" .\n",
" ?person foaf:workplaceHomepage ?homepage .\n",
"}\u001B[0m\n",
"Full Context:\n",
"\u001B[32;1m\u001B[1;3m[]\u001B[0m\n",
"\n",
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
"data": {
"text/plain": [
"\"Tim Berners-Lee's work homepage is http://www.w3.org/People/Berners-Lee/.\""
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"What is Tim Berners-Lee's work homepage?\")"
]
},
{
"cell_type": "markdown",
"id": "af4b3294",
"metadata": {},
"source": [
"## Updating the graph\n",
"\n",
"Analogously, you can update the graph, i.e., insert triples, using natural language."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "fdf38841",
"metadata": {
"pycharm": {
"is_executing": true
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001B[1m> Entering new GraphSparqlQAChain chain...\u001B[0m\n",
"Identified intent:\n",
"\u001B[32;1m\u001B[1;3mUPDATE\u001B[0m\n",
"Generated SPARQL:\n",
"\u001B[32;1m\u001B[1;3mPREFIX foaf: <http://xmlns.com/foaf/0.1/>\n",
"INSERT {\n",
" ?person foaf:workplaceHomepage <http://www.w3.org/foo/bar/> .\n",
"}\n",
"WHERE {\n",
" ?person foaf:name \"Timothy Berners-Lee\" .\n",
"}\u001B[0m\n",
"\n",
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
"data": {
"text/plain": [
"'Successfully inserted triples into the graph.'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"Save that the person with the name 'Timothy Berners-Lee' has a work homepage at 'http://www.w3.org/foo/bar/'\")"
]
},
{
"cell_type": "markdown",
"id": "5e0f7fc1",
"metadata": {},
"source": [
"Let's verify the results:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "f874171b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(rdflib.term.URIRef('https://www.w3.org/'),),\n",
" (rdflib.term.URIRef('http://www.w3.org/foo/bar/'),)]"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = (\n",
" \"\"\"PREFIX foaf: <http://xmlns.com/foaf/0.1/>\\n\"\"\"\n",
" \"\"\"SELECT ?hp\\n\"\"\"\n",
" \"\"\"WHERE {\\n\"\"\"\n",
" \"\"\" ?person foaf:name \"Timothy Berners-Lee\" . \\n\"\"\"\n",
" \"\"\" ?person foaf:workplaceHomepage ?hp .\\n\"\"\"\n",
" \"\"\"}\"\"\"\n",
")\n",
"graph.query(query)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "lc",
"language": "python",
"name": "lc"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@@ -19,6 +19,7 @@ from langchain.chains.graph_qa.cypher import GraphCypherQAChain
from langchain.chains.graph_qa.hugegraph import HugeGraphQAChain
from langchain.chains.graph_qa.kuzu import KuzuQAChain
from langchain.chains.graph_qa.nebulagraph import NebulaGraphQAChain
from langchain.chains.graph_qa.sparql import GraphSparqlQAChain
from langchain.chains.hyde.base import HypotheticalDocumentEmbedder
from langchain.chains.llm import LLMChain
from langchain.chains.llm_bash.base import LLMBashChain
@@ -69,6 +70,7 @@ __all__ = [
"FlareChain",
"GraphCypherQAChain",
"GraphQAChain",
"GraphSparqlQAChain",
"HypotheticalDocumentEmbedder",
"KuzuQAChain",
"HugeGraphQAChain",

@@ -109,3 +109,90 @@ Helpful Answer:"""
CYPHER_QA_PROMPT = PromptTemplate(
input_variables=["context", "question"], template=CYPHER_QA_TEMPLATE
)
SPARQL_INTENT_TEMPLATE = """Task: Identify the intent of a prompt and return the appropriate SPARQL query type.
You are an assistant that distinguishes different types of prompts and returns the corresponding SPARQL query types.
Consider only the following query types:
* SELECT: this query type corresponds to questions
* UPDATE: this query type corresponds to all requests for deleting, inserting, or changing triples
Note: Be as concise as possible.
Do not include any explanations or apologies in your responses.
Do not respond to any questions that ask for anything else than for you to identify a SPARQL query type.
Do not include any unnecessary whitespaces or any text except the query type, i.e., either return 'SELECT' or 'UPDATE'.
The prompt is:
{prompt}
Helpful Answer:"""
SPARQL_INTENT_PROMPT = PromptTemplate(
input_variables=["prompt"], template=SPARQL_INTENT_TEMPLATE
)
SPARQL_GENERATION_SELECT_TEMPLATE = """Task: Generate a SPARQL SELECT statement for querying a graph database.
For instance, to find all email addresses of John Doe, the following query in backticks would be suitable:
```
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?email
WHERE {{
?person foaf:name "John Doe" .
?person foaf:mbox ?email .
}}
```
Instructions:
Use only the node types and properties provided in the schema.
Do not use any node types and properties that are not explicitly provided.
Include all necessary prefixes.
Schema:
{schema}
Note: Be as concise as possible.
Do not include any explanations or apologies in your responses.
Do not respond to any questions that ask for anything else than for you to construct a SPARQL query.
Do not include any text except the SPARQL query generated.
The question is:
{prompt}"""
SPARQL_GENERATION_SELECT_PROMPT = PromptTemplate(
input_variables=["schema", "prompt"], template=SPARQL_GENERATION_SELECT_TEMPLATE
)
SPARQL_GENERATION_UPDATE_TEMPLATE = """Task: Generate a SPARQL UPDATE statement for updating a graph database.
For instance, to add 'jane.doe@foo.bar' as a new email address for Jane Doe, the following query in backticks would be suitable:
```
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT {{
?person foaf:mbox <mailto:jane.doe@foo.bar> .
}}
WHERE {{
?person foaf:name "Jane Doe" .
}}
```
Instructions:
Make the query as short as possible and avoid adding unnecessary triples.
Use only the node types and properties provided in the schema.
Do not use any node types and properties that are not explicitly provided.
Include all necessary prefixes.
Schema:
{schema}
Note: Be as concise as possible.
Do not include any explanations or apologies in your responses.
Do not respond to any questions that ask for anything else than for you to construct a SPARQL query.
Return only the generated SPARQL query, nothing else.
The information to be inserted is:
{prompt}"""
SPARQL_GENERATION_UPDATE_PROMPT = PromptTemplate(
input_variables=["schema", "prompt"], template=SPARQL_GENERATION_UPDATE_TEMPLATE
)
SPARQL_QA_TEMPLATE = """Task: Generate a natural language response from the results of a SPARQL query.
You are an assistant that creates well-written and human understandable answers.
The information part contains the information provided, which you can use to construct an answer.
The information provided is authoritative, you must never doubt it or try to use your internal knowledge to correct it.
Make your response sound like the information is coming from an AI assistant, but don't add any information.
Information:
{context}
Question: {prompt}
Helpful Answer:"""
SPARQL_QA_PROMPT = PromptTemplate(
input_variables=["context", "prompt"], template=SPARQL_QA_TEMPLATE
)

@@ -0,0 +1,127 @@
"""
Question answering over an RDF or OWL graph using SPARQL.
"""
from __future__ import annotations
from typing import Any, Dict, List, Optional
from pydantic import Field
from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.graph_qa.prompts import (
SPARQL_GENERATION_SELECT_PROMPT,
SPARQL_GENERATION_UPDATE_PROMPT,
SPARQL_INTENT_PROMPT,
SPARQL_QA_PROMPT,
)
from langchain.chains.llm import LLMChain
from langchain.graphs.rdf_graph import RdfGraph
from langchain.prompts.base import BasePromptTemplate
class GraphSparqlQAChain(Chain):
"""
Chain for question-answering against an RDF or OWL graph by generating
SPARQL statements.
"""
graph: RdfGraph = Field(exclude=True)
sparql_generation_select_chain: LLMChain
sparql_generation_update_chain: LLMChain
sparql_intent_chain: LLMChain
qa_chain: LLMChain
input_key: str = "query" #: :meta private:
output_key: str = "result" #: :meta private:
@property
def input_keys(self) -> List[str]:
return [self.input_key]
@property
def output_keys(self) -> List[str]:
_output_keys = [self.output_key]
return _output_keys
@classmethod
def from_llm(
cls,
llm: BaseLanguageModel,
*,
qa_prompt: BasePromptTemplate = SPARQL_QA_PROMPT,
sparql_select_prompt: BasePromptTemplate = SPARQL_GENERATION_SELECT_PROMPT,
sparql_update_prompt: BasePromptTemplate = SPARQL_GENERATION_UPDATE_PROMPT,
sparql_intent_prompt: BasePromptTemplate = SPARQL_INTENT_PROMPT,
**kwargs: Any,
) -> GraphSparqlQAChain:
"""Initialize from LLM."""
qa_chain = LLMChain(llm=llm, prompt=qa_prompt)
sparql_generation_select_chain = LLMChain(llm=llm, prompt=sparql_select_prompt)
sparql_generation_update_chain = LLMChain(llm=llm, prompt=sparql_update_prompt)
sparql_intent_chain = LLMChain(llm=llm, prompt=sparql_intent_prompt)
return cls(
qa_chain=qa_chain,
sparql_generation_select_chain=sparql_generation_select_chain,
sparql_generation_update_chain=sparql_generation_update_chain,
sparql_intent_chain=sparql_intent_chain,
**kwargs,
)
def _call(
self,
inputs: Dict[str, Any],
run_manager: Optional[CallbackManagerForChainRun] = None,
) -> Dict[str, str]:
"""
Generate a SPARQL query, use it to retrieve a response from the graph
database, and answer the question.
"""
_run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
callbacks = _run_manager.get_child()
prompt = inputs[self.input_key]
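# Classify the prompt as either a SELECT question or an UPDATE request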
_intent = self.sparql_intent_chain.run({"prompt": prompt}, callbacks=callbacks)
intent = _intent.strip()
if intent == "SELECT":
sparql_generation_chain = self.sparql_generation_select_chain
elif intent == "UPDATE":
sparql_generation_chain = self.sparql_generation_update_chain
else:
raise ValueError(
"I am sorry, but this prompt seems to fit none of the currently "
"supported SPARQL query types, i.e., SELECT and UPDATE."
)
_run_manager.on_text("Identified intent:", end="\n", verbose=self.verbose)
_run_manager.on_text(intent, color="green", end="\n", verbose=self.verbose)
generated_sparql = sparql_generation_chain.run(
{"prompt": prompt, "schema": self.graph.get_schema}, callbacks=callbacks
)
_run_manager.on_text("Generated SPARQL:", end="\n", verbose=self.verbose)
_run_manager.on_text(
generated_sparql, color="green", end="\n", verbose=self.verbose
)
if intent == "SELECT":
context = self.graph.query(generated_sparql)
_run_manager.on_text("Full Context:", end="\n", verbose=self.verbose)
_run_manager.on_text(
str(context), color="green", end="\n", verbose=self.verbose
)
result = self.qa_chain(
{"prompt": prompt, "context": context},
callbacks=callbacks,
)
res = result[self.qa_chain.output_key]
elif intent == "UPDATE":
self.graph.update(generated_sparql)
res = "Successfully inserted triples into the graph."
else:
raise ValueError("Unsupported SPARQL query type.")
return {self.output_key: res}

@@ -4,5 +4,13 @@ from langchain.graphs.kuzu_graph import KuzuGraph
from langchain.graphs.nebula_graph import NebulaGraph
from langchain.graphs.neo4j_graph import Neo4jGraph
from langchain.graphs.networkx_graph import NetworkxEntityGraph
from langchain.graphs.rdf_graph import RdfGraph
__all__ = ["NetworkxEntityGraph", "Neo4jGraph", "NebulaGraph", "KuzuGraph", "HugeGraph"]
__all__ = [
"NetworkxEntityGraph",
"Neo4jGraph",
"NebulaGraph",
"KuzuGraph",
"HugeGraph",
"RdfGraph",
]

@@ -0,0 +1,279 @@
from __future__ import annotations
from typing import (
TYPE_CHECKING,
List,
Optional,
)
if TYPE_CHECKING:
import rdflib
prefixes = {
"owl": """PREFIX owl: <http://www.w3.org/2002/07/owl#>\n""",
"rdf": """PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n""",
"rdfs": """PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n""",
"xsd": """PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n""",
}
cls_query_rdf = prefixes["rdfs"] + (
"""SELECT DISTINCT ?cls ?com\n"""
"""WHERE { \n"""
""" ?instance a ?cls . \n"""
""" OPTIONAL { ?cls rdfs:comment ?com } \n"""
"""}"""
)
cls_query_rdfs = prefixes["rdfs"] + (
"""SELECT DISTINCT ?cls ?com\n"""
"""WHERE { \n"""
""" ?instance a/rdfs:subClassOf* ?cls . \n"""
""" OPTIONAL { ?cls rdfs:comment ?com } \n"""
"""}"""
)
cls_query_owl = cls_query_rdfs
rel_query_rdf = prefixes["rdfs"] + (
"""SELECT DISTINCT ?rel ?com\n"""
"""WHERE { \n"""
""" ?subj ?rel ?obj . \n"""
""" OPTIONAL { ?cls rdfs:comment ?com } \n"""
"""}"""
)
rel_query_rdfs = (
prefixes["rdf"]
+ prefixes["rdfs"]
+ (
"""SELECT DISTINCT ?rel ?com\n"""
"""WHERE { \n"""
""" ?rel a/rdfs:subPropertyOf* rdf:Property . \n"""
""" OPTIONAL { ?cls rdfs:comment ?com } \n"""
"""}"""
)
)
op_query_owl = (
prefixes["rdfs"]
+ prefixes["owl"]
+ (
"""SELECT DISTINCT ?op ?com\n"""
"""WHERE { \n"""
""" ?op a/rdfs:subPropertyOf* owl:ObjectProperty . \n"""
""" OPTIONAL { ?cls rdfs:comment ?com } \n"""
"""}"""
)
)
dp_query_owl = (
prefixes["rdfs"]
+ prefixes["owl"]
+ (
"""SELECT DISTINCT ?dp ?com\n"""
"""WHERE { \n"""
""" ?dp a/rdfs:subPropertyOf* owl:DatatypeProperty . \n"""
""" OPTIONAL { ?cls rdfs:comment ?com } \n"""
"""}"""
)
)
class RdfGraph:
"""
RDFlib wrapper for graph operations.
Modes:
* local: Local file - can be queried and changed
* online: Online file - can only be queried, changes can be stored locally
* store: Triple store - can be queried and changed if update_endpoint available
If a source file is given, its serialization should be specified as well.
"""
def __init__(
self,
source_file: Optional[str] = None,
serialization: Optional[str] = "ttl",
query_endpoint: Optional[str] = None,
update_endpoint: Optional[str] = None,
standard: Optional[str] = "rdf",
local_copy: Optional[str] = None,
) -> None:
"""
Set up the RDFlib graph
:param source_file: either a path for a local file or a URL
:param serialization: serialization of the input
:param query_endpoint: SPARQL endpoint for queries, read access
:param update_endpoint: SPARQL endpoint for UPDATE queries, write access
:param standard: RDF, RDFS, or OWL
:param local_copy: new local copy for storing changes
"""
self.source_file = source_file
self.serialization = serialization
self.query_endpoint = query_endpoint
self.update_endpoint = update_endpoint
self.standard = standard
self.local_copy = local_copy
try:
import rdflib
from rdflib.graph import DATASET_DEFAULT_GRAPH_ID as default
from rdflib.plugins.stores import sparqlstore
except ImportError:
raise ValueError(
"Could not import rdflib python package. "
"Please install it with `pip install rdflib`."
)
if self.standard not in (supported_standards := ("rdf", "rdfs", "owl")):
raise ValueError(
f"Invalid standard. Supported standards are: {supported_standards}."
)
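# Exactly one source must be specified: either a file (local or web) or a SPARQL endpoint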
if (
not source_file
and not query_endpoint
or source_file
and (query_endpoint or update_endpoint)
):
raise ValueError(
"Could not unambiguously initialize the graph wrapper. "
"Specify either a file (local or online) via the source_file "
"or a triple store via the endpoints."
)
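# A file source yields "online" or "local" mode; endpoints yield "store" mode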
if source_file:
if source_file.startswith("http"):
self.mode = "online"
else:
self.mode = "local"
if self.local_copy is None:
self.local_copy = self.source_file
self.graph = rdflib.Graph()
self.graph.parse(source_file, format=self.serialization)
if query_endpoint:
self.mode = "store"
if not update_endpoint:
self._store = sparqlstore.SPARQLStore()
self._store.open(query_endpoint)
else:
self._store = sparqlstore.SPARQLUpdateStore()
self._store.open((query_endpoint, update_endpoint))
self.graph = rdflib.Graph(self._store, identifier=default)
# Verify that the graph was loaded
if not len(self.graph):
raise AssertionError("The graph is empty.")
# Set schema
self.schema = ""
self.load_schema()
@property
def get_schema(self) -> str:
"""
Returns the schema of the graph database.
"""
return self.schema
def query(
self,
query: str,
) -> List[rdflib.query.ResultRow]:
"""
Query the graph.
"""
from rdflib.exceptions import ParserError
from rdflib.query import ResultRow
try:
res = self.graph.query(query)
except ParserError as e:
raise ValueError("Generated SPARQL statement is invalid\n" f"{e}")
return [r for r in res if isinstance(r, ResultRow)]
def update(
self,
query: str,
) -> None:
"""
Update the graph.
"""
from rdflib.exceptions import ParserError
try:
self.graph.update(query)
except ParserError as e:
raise ValueError("Generated SPARQL statement is invalid\n" f"{e}")
if self.local_copy:
self.graph.serialize(
destination=self.local_copy, format=self.local_copy.split(".")[-1]
)
else:
raise ValueError("No target file specified for saving the updated file.")
@staticmethod
def _get_local_name(iri: str) -> str:
if "#" in iri:
local_name = iri.split("#")[-1]
elif "/" in iri:
local_name = iri.split("/")[-1]
else:
raise ValueError(f"Unexpected IRI '{iri}', contains neither '#' nor '/'.")
return local_name
def _res_to_str(self, res: rdflib.query.ResultRow, var: str) -> str:
return (
"<"
+ res[var]
+ "> ("
+ self._get_local_name(res[var])
+ ", "
+ str(res["com"])
+ ")"
)
def load_schema(self) -> None:
"""
Load the graph schema information.
"""
def _rdf_s_schema(
classes: List[rdflib.query.ResultRow],
relationships: List[rdflib.query.ResultRow],
) -> str:
return (
f"In the following, each IRI is followed by the local name and "
f"optionally its description in parentheses. \n"
f"The RDF graph supports the following node types:\n"
f'{", ".join([self._res_to_str(r, "cls") for r in classes])}\n'
f"The RDF graph supports the following relationships:\n"
f'{", ".join([self._res_to_str(r, "rel") for r in relationships])}\n'
)
if self.standard == "rdf":
clss = self.query(cls_query_rdf)
rels = self.query(rel_query_rdf)
self.schema = _rdf_s_schema(clss, rels)
elif self.standard == "rdfs":
clss = self.query(cls_query_rdfs)
rels = self.query(rel_query_rdfs)
self.schema = _rdf_s_schema(clss, rels)
elif self.standard == "owl":
clss = self.query(cls_query_owl)
ops = self.query(cls_query_owl)
dps = self.query(cls_query_owl)
self.schema = (
f"In the following, each IRI is followed by the local name and "
f"optionally its description in parentheses. \n"
f"The OWL graph supports the following node types:\n"
f'{", ".join([self._res_to_str(r, "cls") for r in clss])}\n'
f"The OWL graph supports the following object properties, "
f"i.e., relationships between objects:\n"
f'{", ".join([self._res_to_str(r, "op") for r in ops])}\n'
f"The OWL graph supports the following data properties, "
f"i.e., relationships between objects and literals:\n"
f'{", ".join([self._res_to_str(r, "dp") for r in dps])}\n'
)
else:
raise ValueError(f"Mode '{self.standard}' is currently not supported.")

poetry.lock generated

@@ -642,12 +642,17 @@ category = "main"
optional = true
python-versions = ">=3.7"
files = [
{file = "awadb-0.3.3-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:871e2b10c79348d44522b8430af1ed1ad2632322c74abc20d8a3154de242da96"},
{file = "awadb-0.3.3-cp310-cp310-manylinux1_x86_64.whl", hash = "sha256:daebc108103c8cace41dfb3235fcfdda28ea48e6cd6548b6072f7ad49b64274b"},
{file = "awadb-0.3.3-cp311-cp311-macosx_10_13_universal2.whl", hash = "sha256:2bb3ca2f943448060b1bba4395dd99e2218d7f2149507a8fdfa7a3fd4cfe97ec"},
{file = "awadb-0.3.3-cp311-cp311-macosx_10_13_x86_64.whl", hash = "sha256:83e92963cde54a4382b0c299939865ce12e145853637642bc8e6eb22bf689386"},
{file = "awadb-0.3.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:6f249b04f38840146a5c17ffcd0f4da1bb00a39b8882c96e042acf58045faca2"},
{file = "awadb-0.3.3-cp311-cp311-manylinux1_x86_64.whl", hash = "sha256:7b99662af9f7b58e217661a70c295e40605900552bec6d8e9553d90dbf19c5c1"},
{file = "awadb-0.3.3-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:94be44e587f28fa26b2cade0b6f4c04689f50cb0c07183db5ee50e48fe2e9ae3"},
{file = "awadb-0.3.3-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:314929dc3a8d25c0f234a2b86c920543050f4eb298a6f68bd2c97c9fe3fb6224"},
{file = "awadb-0.3.3-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:35d8f580973e137864d2e00edbc7369cd01cf72b673d60fe902c7b3f983c76e9"},
{file = "awadb-0.3.3-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:8bfccff1c7373899153427d93d96a97ae5371e8a6f09ff4dcbd28fb9f3f63ff4"},
{file = "awadb-0.3.3-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:c791bdd0646ec620c8b0fa026915780ebf78c16169cd9da81f54409553ec0114"},
{file = "awadb-0.3.3-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:810021a90b873f668d8ab63e2c2747b2b2835bf0ae25f4223b6c94f06faffea4"},
]
@@ -3653,23 +3658,6 @@ cli = ["click (>=8.0.0,<9.0.0)", "pygments (>=2.0.0,<3.0.0)", "rich (>=10,<14)"]
http2 = ["h2 (>=3,<5)"]
socks = ["socksio (>=1.0.0,<2.0.0)"]
[[package]]
name = "hugegraph-python"
version = "1.0.0.12"
description = "Python client for HugeGraph"
optional = true
python-versions = "*"
files = [
{file = "hugegraph-python-1.0.0.12.tar.gz", hash = "sha256:06b2dded70c4f4570083f8b6e3a9edfebcf5ac4f07300727afad72389917ab85"},
{file = "hugegraph_python-1.0.0.12-py3-none-any.whl", hash = "sha256:69fe20edbe1a392d16afc74df5c94b3b96bc02c848e9ab5b5f18c112a9bc3ebe"},
]
[package.dependencies]
decorator = "5.1.1"
Requests = "2.31.0"
setuptools = "67.6.1"
urllib3 = "2.0.3"
[[package]]
name = "huggingface-hub"
version = "0.15.1"
@@ -4306,6 +4294,7 @@ optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*"
files = [
{file = "jsonpointer-2.4-py2.py3-none-any.whl", hash = "sha256:15d51bba20eea3165644553647711d150376234112651b4f1811022aecad7d7a"},
{file = "jsonpointer-2.4.tar.gz", hash = "sha256:585cee82b70211fa9e6043b7bb89db6e1aa49524340dde8ad6b63206ea689d88"},
]
[[package]]
@@ -8924,6 +8913,28 @@ files = [
[package.extras]
test = ["pytest (>=3.0)", "pytest-asyncio"]
[[package]]
name = "rdflib"
version = "6.3.2"
description = "RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information."
category = "main"
optional = true
python-versions = ">=3.7,<4.0"
files = [
{file = "rdflib-6.3.2-py3-none-any.whl", hash = "sha256:36b4e74a32aa1e4fa7b8719876fb192f19ecd45ff932ea5ebbd2e417a0247e63"},
{file = "rdflib-6.3.2.tar.gz", hash = "sha256:72af591ff704f4caacea7ecc0c5a9056b8553e0489dd4f35a9bc52dbd41522e0"},
]
[package.dependencies]
isodate = ">=0.6.0,<0.7.0"
pyparsing = ">=2.1.0,<4"
[package.extras]
berkeleydb = ["berkeleydb (>=18.1.0,<19.0.0)"]
html = ["html5lib (>=1.0,<2.0)"]
lxml = ["lxml (>=4.3.0,<5.0.0)"]
networkx = ["networkx (>=2.0.0,<3.0.0)"]
[[package]]
name = "redis"
version = "4.5.5"
@@ -12371,7 +12382,7 @@ cffi = {version = ">=1.11", markers = "platform_python_implementation == \"PyPy\
cffi = ["cffi (>=1.11)"]
[extras]
all = ["O365", "aleph-alpha-client", "anthropic", "arxiv", "atlassian-python-api", "awadb", "azure-ai-formrecognizer", "azure-ai-vision", "azure-cognitiveservices-speech", "azure-cosmos", "azure-identity", "beautifulsoup4", "clarifai", "clickhouse-connect", "cohere", "deeplake", "docarray", "duckduckgo-search", "elasticsearch", "esprima", "faiss-cpu", "google-api-python-client", "google-auth", "google-search-results", "gptcache", "html2text", "huggingface_hub", "jina", "jinja2", "jq", "lancedb", "langkit", "lark", "lxml", "manifest-ml", "momento", "nebula3-python", "neo4j", "networkx", "nlpcloud", "nltk", "nomic", "octoai-sdk", "openai", "openlm", "opensearch-py", "pdfminer-six", "pexpect", "pgvector", "pinecone-client", "pinecone-text", "psycopg2-binary", "pymongo", "pyowm", "pypdf", "pytesseract", "pyvespa", "qdrant-client", "redis", "requests-toolbelt", "sentence-transformers", "singlestoredb", "spacy", "steamship", "tensorflow-text", "tigrisdb", "tiktoken", "torch", "transformers", "weaviate-client", "wikipedia", "wolframalpha"]
all = ["O365", "aleph-alpha-client", "anthropic", "arxiv", "atlassian-python-api", "awadb", "azure-ai-formrecognizer", "azure-ai-vision", "azure-cognitiveservices-speech", "azure-cosmos", "azure-identity", "beautifulsoup4", "clarifai", "clickhouse-connect", "cohere", "deeplake", "docarray", "duckduckgo-search", "elasticsearch", "esprima", "faiss-cpu", "google-api-python-client", "google-auth", "google-search-results", "gptcache", "html2text", "huggingface_hub", "jina", "jinja2", "jq", "lancedb", "langkit", "lark", "lxml", "manifest-ml", "momento", "nebula3-python", "neo4j", "networkx", "nlpcloud", "nltk", "nomic", "octoai-sdk", "openai", "openlm", "opensearch-py", "pdfminer-six", "pexpect", "pgvector", "pinecone-client", "pinecone-text", "psycopg2-binary", "pymongo", "pyowm", "pypdf", "pytesseract", "pyvespa", "qdrant-client", "rdflib", "redis", "requests-toolbelt", "sentence-transformers", "singlestoredb", "spacy", "steamship", "tensorflow-text", "tigrisdb", "tiktoken", "torch", "transformers", "weaviate-client", "wikipedia", "wolframalpha"]
azure = ["azure-ai-formrecognizer", "azure-ai-vision", "azure-cognitiveservices-speech", "azure-core", "azure-cosmos", "azure-identity", "azure-search-documents", "openai"]
clarifai = ["clarifai"]
cohere = ["cohere"]
@@ -12387,4 +12398,4 @@ text-helpers = ["chardet"]
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "1ebf19951081f82f4af129d9e2ea12c40fdc519bf49dffa0cfa85eb28938710c"
content-hash = "e2450d84b1a0747c45b015e1071ed37265269c325362779cd8bd3b9caa94a9c9"

@@ -114,6 +114,7 @@ openllm = {version = ">=0.1.19", optional = true}
streamlit = {version = "^1.18.0", optional = true, python = ">=3.8.1,<3.9.7 || >3.9.7,<4.0"}
psychicapi = {version = "^0.8.0", optional = true}
cassio = {version = "^0.0.7", optional = true}
rdflib = {version = "^6.3.2", optional = true}
[tool.poetry.group.docs.dependencies]
autodoc_pydantic = "^1.8.0"
@@ -310,6 +311,7 @@ all = [
"awadb",
"esprima",
"octoai-sdk",
"rdflib",
]
# An extra used to be able to add extended testing.

@@ -0,0 +1,79 @@
"""Test RDF/ SPARQL Graph Database Chain."""
import os
from langchain.chains.graph_qa.sparql import GraphSparqlQAChain
from langchain.graphs import RdfGraph
from langchain.llms.openai import OpenAI
def test_connect_file_rdf() -> None:
"""
Test loading an online resource.
"""
berners_lee_card = "http://www.w3.org/People/Berners-Lee/card"
graph = RdfGraph(
source_file=berners_lee_card,
standard="rdf",
)
query = """SELECT ?s ?p ?o\n""" """WHERE { ?s ?p ?o }"""
output = graph.query(query)
assert len(output) == 86
def test_sparql_select() -> None:
"""
Test for generating and executing a simple SPARQL SELECT query.
"""
berners_lee_card = "http://www.w3.org/People/Berners-Lee/card"
graph = RdfGraph(
source_file=berners_lee_card,
standard="rdf",
)
chain = GraphSparqlQAChain.from_llm(OpenAI(temperature=0), graph=graph)
output = chain.run("What is Tim Berners-Lee's work homepage?")
expected_output = (
" The work homepage of Tim Berners-Lee is "
"http://www.w3.org/People/Berners-Lee/."
)
assert output == expected_output
def test_sparql_insert() -> None:
"""
Test for generating and executing a simple SPARQL INSERT query.
"""
berners_lee_card = "http://www.w3.org/People/Berners-Lee/card"
_local_copy = "test.ttl"
graph = RdfGraph(
source_file=berners_lee_card,
standard="rdf",
local_copy=_local_copy,
)
chain = GraphSparqlQAChain.from_llm(OpenAI(temperature=0), graph=graph)
chain.run(
"Save that the person with the name 'Timothy Berners-Lee' "
"has a work homepage at 'http://www.w3.org/foo/bar/'"
)
query = (
"""PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"""
"""SELECT ?hp\n"""
"""WHERE {\n"""
""" ?person foaf:name "Timothy Berners-Lee" . \n"""
""" ?person foaf:workplaceHomepage ?hp .\n"""
"""}"""
)
output = graph.query(query)
assert len(output) == 2
# clean up
try:
os.remove(_local_copy)
except OSError:
pass