mirror of
https://github.com/hwchase17/langchain
synced 2024-10-31 15:20:26 +00:00
2667ddc686
**Description: a description of the change** Fixed `make docs_build` and related scripts which caused errors. There are several changes. First, I made the build of the documentation and the API Reference into two separate commands. This is because it takes less time to build. The commands for documents are `make docs_build`, `make docs_clean`, and `make docs_linkcheck`. The commands for API Reference are `make api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`. It looked like `docs/.local_build.sh` could be used to build the documentation, so I used that. Since `.local_build.sh` was also building API Rerefence internally, I removed that process. `.local_build.sh` also added some Bash options to stop in error or so. Futher more added `cd "${SCRIPT_DIR}"` at the beginning so that the script will work no matter which directory it is executed in. `docs/api_reference/api_reference.rst` is removed, because which is generated by `docs/api_reference/create_api_rst.py`, and added it to .gitignore. Finally, the description of CONTRIBUTING.md was modified. **Issue: the issue # it fixes (if applicable)** https://github.com/hwchase17/langchain/issues/6413 **Dependencies: any dependencies required for this change** `nbdoc` was missing in group docs so it was added. I installed it with the `poetry add --group docs nbdoc` command. I am concerned if any modifications are needed to poetry.lock. I would greatly appreciate it if you could pay close attention to this file during the review. **Tag maintainer** - General / Misc / if you don't know who to tag: @baskaryan If this PR needs any additional changes, I'll be happy to make them! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>
303 lines
11 KiB
Plaintext
303 lines
11 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "c94240f5",
|
|
"metadata": {},
|
|
"source": [
|
|
"# GraphSparqlQAChain\n",
|
|
"\n",
|
|
"Graph databases are an excellent choice for applications based on network-like models. To standardize the syntax and semantics of such graphs, the W3C recommends Semantic Web Technologies, cp. [Semantic Web](https://www.w3.org/standards/semanticweb/). [SPARQL](https://www.w3.org/TR/sparql11-query/) serves as a query language analogously to SQL or Cypher for these graphs. This notebook demonstrates the application of LLMs as a natural language interface to a graph database by generating SPARQL.\\\n",
|
|
"Disclaimer: To date, SPARQL query generation via LLMs is still a bit unstable. Be especially careful with UPDATE queries, which alter the graph."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "dbc0ee68",
|
|
"metadata": {},
|
|
"source": [
|
|
"There are several sources you can run queries against, including files on the web, files you have available locally, SPARQL endpoints, e.g., [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page), and [triple stores](https://www.w3.org/wiki/LargeTripleStores)."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"id": "62812aad",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"is_executing": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"from langchain.chat_models import ChatOpenAI\n",
|
|
"from langchain.chains import GraphSparqlQAChain\n",
|
|
"from langchain.graphs import RdfGraph"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 8,
|
|
"id": "0928915d",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"is_executing": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"graph = RdfGraph(\n",
|
|
" source_file=\"http://www.w3.org/People/Berners-Lee/card\",\n",
|
|
" standard=\"rdf\",\n",
|
|
" local_copy=\"test.ttl\",\n",
|
|
")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"source": [
|
|
"Note that providing a `local_file` is necessary for storing changes locally if the source is read-only."
|
|
],
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"id": "7af596b5"
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "58c1a8ea",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Refresh graph schema information\n",
|
|
"If the schema of the database changes, you can refresh the schema information needed to generate SPARQL queries."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 9,
|
|
"id": "4e3de44f",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"is_executing": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"graph.load_schema()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 10,
|
|
"id": "1fe76ccd",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"In the following, each IRI is followed by the local name and optionally its description in parentheses. \n",
|
|
"The RDF graph supports the following node types:\n",
|
|
"<http://xmlns.com/foaf/0.1/PersonalProfileDocument> (PersonalProfileDocument, None), <http://www.w3.org/ns/auth/cert#RSAPublicKey> (RSAPublicKey, None), <http://www.w3.org/2000/10/swap/pim/contact#Male> (Male, None), <http://xmlns.com/foaf/0.1/Person> (Person, None), <http://www.w3.org/2006/vcard/ns#Work> (Work, None)\n",
|
|
"The RDF graph supports the following relationships:\n",
|
|
"<http://www.w3.org/2000/01/rdf-schema#seeAlso> (seeAlso, None), <http://purl.org/dc/elements/1.1/title> (title, None), <http://xmlns.com/foaf/0.1/mbox_sha1sum> (mbox_sha1sum, None), <http://xmlns.com/foaf/0.1/maker> (maker, None), <http://www.w3.org/ns/solid/terms#oidcIssuer> (oidcIssuer, None), <http://www.w3.org/2000/10/swap/pim/contact#publicHomePage> (publicHomePage, None), <http://xmlns.com/foaf/0.1/openid> (openid, None), <http://www.w3.org/ns/pim/space#storage> (storage, None), <http://xmlns.com/foaf/0.1/name> (name, None), <http://www.w3.org/2000/10/swap/pim/contact#country> (country, None), <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> (type, None), <http://www.w3.org/ns/solid/terms#profileHighlightColor> (profileHighlightColor, None), <http://www.w3.org/ns/pim/space#preferencesFile> (preferencesFile, None), <http://www.w3.org/2000/01/rdf-schema#label> (label, None), <http://www.w3.org/ns/auth/cert#modulus> (modulus, None), <http://www.w3.org/2000/10/swap/pim/contact#participant> (participant, None), <http://www.w3.org/2000/10/swap/pim/contact#street2> (street2, None), <http://www.w3.org/2006/vcard/ns#locality> (locality, None), <http://xmlns.com/foaf/0.1/nick> (nick, None), <http://xmlns.com/foaf/0.1/homepage> (homepage, None), <http://creativecommons.org/ns#license> (license, None), <http://xmlns.com/foaf/0.1/givenname> (givenname, None), <http://www.w3.org/2006/vcard/ns#street-address> (street-address, None), <http://www.w3.org/2006/vcard/ns#postal-code> (postal-code, None), <http://www.w3.org/2000/10/swap/pim/contact#street> (street, None), <http://www.w3.org/2003/01/geo/wgs84_pos#lat> (lat, None), <http://xmlns.com/foaf/0.1/primaryTopic> (primaryTopic, None), <http://www.w3.org/2006/vcard/ns#fn> (fn, None), <http://www.w3.org/2003/01/geo/wgs84_pos#location> (location, None), <http://usefulinc.com/ns/doap#developer> (developer, None), <http://www.w3.org/2000/10/swap/pim/contact#city> (city, None), <http://www.w3.org/2006/vcard/ns#region> (region, None), <http://xmlns.com/foaf/0.1/member> (member, None), <http://www.w3.org/2003/01/geo/wgs84_pos#long> (long, None), <http://www.w3.org/2000/10/swap/pim/contact#address> (address, None), <http://xmlns.com/foaf/0.1/family_name> (family_name, None), <http://xmlns.com/foaf/0.1/account> (account, None), <http://xmlns.com/foaf/0.1/workplaceHomepage> (workplaceHomepage, None), <http://purl.org/dc/terms/title> (title, None), <http://www.w3.org/ns/solid/terms#publicTypeIndex> (publicTypeIndex, None), <http://www.w3.org/2000/10/swap/pim/contact#office> (office, None), <http://www.w3.org/2000/10/swap/pim/contact#homePage> (homePage, None), <http://xmlns.com/foaf/0.1/mbox> (mbox, None), <http://www.w3.org/2000/10/swap/pim/contact#preferredURI> (preferredURI, None), <http://www.w3.org/ns/solid/terms#profileBackgroundColor> (profileBackgroundColor, None), <http://schema.org/owns> (owns, None), <http://xmlns.com/foaf/0.1/based_near> (based_near, None), <http://www.w3.org/2006/vcard/ns#hasAddress> (hasAddress, None), <http://xmlns.com/foaf/0.1/img> (img, None), <http://www.w3.org/2000/10/swap/pim/contact#assistant> (assistant, None), <http://xmlns.com/foaf/0.1/title> (title, None), <http://www.w3.org/ns/auth/cert#key> (key, None), <http://www.w3.org/ns/ldp#inbox> (inbox, None), <http://www.w3.org/ns/solid/terms#editableProfile> (editableProfile, None), <http://www.w3.org/2000/10/swap/pim/contact#postalCode> (postalCode, None), <http://xmlns.com/foaf/0.1/weblog> (weblog, None), <http://www.w3.org/ns/auth/cert#exponent> (exponent, None), <http://rdfs.org/sioc/ns#avatar> (avatar, None)\n",
|
|
"\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"graph.get_schema"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "68a3c677",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Querying the graph\n",
|
|
"\n",
|
|
"Now, you can use the graph SPARQL QA chain to ask questions about the graph."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 11,
|
|
"id": "7476ce98",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"is_executing": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"chain = GraphSparqlQAChain.from_llm(\n",
|
|
" ChatOpenAI(temperature=0), graph=graph, verbose=True\n",
|
|
")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 12,
|
|
"id": "ef8ee27b",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"is_executing": true
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"\n",
|
|
"\n",
|
|
"\u001b[1m> Entering new GraphSparqlQAChain chain...\u001b[0m\n",
|
|
"Identified intent:\n",
|
|
"\u001b[32;1m\u001b[1;3mSELECT\u001b[0m\n",
|
|
"Generated SPARQL:\n",
|
|
"\u001b[32;1m\u001b[1;3mPREFIX foaf: <http://xmlns.com/foaf/0.1/>\n",
|
|
"SELECT ?homepage\n",
|
|
"WHERE {\n",
|
|
" ?person foaf:name \"Tim Berners-Lee\" .\n",
|
|
" ?person foaf:workplaceHomepage ?homepage .\n",
|
|
"}\u001b[0m\n",
|
|
"Full Context:\n",
|
|
"\u001b[32;1m\u001b[1;3m[]\u001b[0m\n",
|
|
"\n",
|
|
"\u001b[1m> Finished chain.\u001b[0m\n"
|
|
]
|
|
},
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"\"Tim Berners-Lee's work homepage is http://www.w3.org/People/Berners-Lee/.\""
|
|
]
|
|
},
|
|
"execution_count": 12,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"chain.run(\"What is Tim Berners-Lee's work homepage?\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "af4b3294",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Updating the graph\n",
|
|
"\n",
|
|
"Analogously, you can update the graph, i.e., insert triples, using natural language."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 14,
|
|
"id": "fdf38841",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"is_executing": true
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"\n",
|
|
"\n",
|
|
"\u001b[1m> Entering new GraphSparqlQAChain chain...\u001b[0m\n",
|
|
"Identified intent:\n",
|
|
"\u001b[32;1m\u001b[1;3mUPDATE\u001b[0m\n",
|
|
"Generated SPARQL:\n",
|
|
"\u001b[32;1m\u001b[1;3mPREFIX foaf: <http://xmlns.com/foaf/0.1/>\n",
|
|
"INSERT {\n",
|
|
" ?person foaf:workplaceHomepage <http://www.w3.org/foo/bar/> .\n",
|
|
"}\n",
|
|
"WHERE {\n",
|
|
" ?person foaf:name \"Timothy Berners-Lee\" .\n",
|
|
"}\u001b[0m\n",
|
|
"\n",
|
|
"\u001b[1m> Finished chain.\u001b[0m\n"
|
|
]
|
|
},
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"'Successfully inserted triples into the graph.'"
|
|
]
|
|
},
|
|
"execution_count": 14,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"chain.run(\n",
|
|
" \"Save that the person with the name 'Timothy Berners-Lee' has a work homepage at 'http://www.w3.org/foo/bar/'\"\n",
|
|
")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "5e0f7fc1",
|
|
"metadata": {},
|
|
"source": [
|
|
"Let's verify the results:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 15,
|
|
"id": "f874171b",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"[(rdflib.term.URIRef('https://www.w3.org/'),),\n",
|
|
" (rdflib.term.URIRef('http://www.w3.org/foo/bar/'),)]"
|
|
]
|
|
},
|
|
"execution_count": 15,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"query = (\n",
|
|
" \"\"\"PREFIX foaf: <http://xmlns.com/foaf/0.1/>\\n\"\"\"\n",
|
|
" \"\"\"SELECT ?hp\\n\"\"\"\n",
|
|
" \"\"\"WHERE {\\n\"\"\"\n",
|
|
" \"\"\" ?person foaf:name \"Timothy Berners-Lee\" . \\n\"\"\"\n",
|
|
" \"\"\" ?person foaf:workplaceHomepage ?hp .\\n\"\"\"\n",
|
|
" \"\"\"}\"\"\"\n",
|
|
")\n",
|
|
"graph.query(query)"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "lc",
|
|
"language": "python",
|
|
"name": "lc"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.11.4"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
} |