langchain/docs/extras/integrations/retrievers/vespa.ipynb

139 lines
3.6 KiB
Plaintext
Raw Normal View History

{
"cells": [
{
"cell_type": "markdown",
"id": "ce0f17b9",
"metadata": {},
"source": [
"# Vespa\n",
"\n",
">[Vespa](https://vespa.ai/) is a fully featured search engine and vector database. It supports vector search (ANN), lexical search, and search in structured data, all in the same query.\n",
"\n",
"This notebook shows how to use `Vespa.ai` as a LangChain retriever.\n",
"\n",
"In order to create a retriever, we use [pyvespa](https://pyvespa.readthedocs.io/en/latest/index.html) to\n",
"create a connection a `Vespa` service."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7e6a11ab-38bd-4920-ba11-60cb2f075754",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#!pip install pyvespa"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "c10dd962",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from vespa.application import Vespa\n",
"\n",
"vespa_app = Vespa(url=\"https://doc-search.vespa.oath.cloud\")"
]
},
{
"cell_type": "markdown",
"id": "3df4ce53",
"metadata": {},
"source": [
"This creates a connection to a `Vespa` service, here the Vespa documentation search service.\n",
"Using `pyvespa` package, you can also connect to a\n",
"[Vespa Cloud instance](https://pyvespa.readthedocs.io/en/latest/deploy-vespa-cloud.html)\n",
"or a local\n",
"[Docker instance](https://pyvespa.readthedocs.io/en/latest/deploy-docker.html).\n",
"\n",
"\n",
"After connecting to the service, you can set up the retriever:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7ccca1f4",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"from langchain.retrievers.vespa_retriever import VespaRetriever\n",
"\n",
"vespa_query_body = {\n",
" \"yql\": \"select content from paragraph where userQuery()\",\n",
" \"hits\": 5,\n",
" \"ranking\": \"documentation\",\n",
" \"locale\": \"en-us\",\n",
"}\n",
"vespa_content_field = \"content\"\n",
"retriever = VespaRetriever(vespa_app, vespa_query_body, vespa_content_field)"
]
},
{
"cell_type": "markdown",
"id": "1e7e34e1",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"This sets up a LangChain retriever that fetches documents from the Vespa application.\n",
"Here, up to 5 results are retrieved from the `content` field in the `paragraph` document type,\n",
"using `doumentation` as the ranking method. The `userQuery()` is replaced with the actual query\n",
"passed from LangChain.\n",
"\n",
"Please refer to the [pyvespa documentation](https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa.html#Query)\n",
"for more information.\n",
"\n",
"Now you can return the results and continue using the results in LangChain."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f47a2bfe",
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"retriever.get_relevant_documents(\"what is vespa?\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}