{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ce0f17b9",
   "metadata": {},
   "source": [
    "# Vespa retriever\n",
    "\n",
    "This notebook shows how to use Vespa.ai as a LangChain retriever.\n",
    "Vespa.ai is a platform for highly efficient structured text and vector search.\n",
    "Please refer to [Vespa.ai](https://vespa.ai) for more information.\n",
    "\n",
    "In order to create a retriever, we use [pyvespa](https://pyvespa.readthedocs.io/en/latest/index.html) to\n",
    "create a connection a Vespa service."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "c10dd962",
   "metadata": {},
   "outputs": [],
   "source": [
    "from vespa.application import Vespa\n",
    "\n",
    "vespa_app = Vespa(url=\"https://doc-search.vespa.oath.cloud\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3df4ce53",
   "metadata": {},
   "source": [
    "This creates a connection to a Vespa service, here the Vespa documentation search service.\n",
    "Using pyvespa, you can also connect to a\n",
    "[Vespa Cloud instance](https://pyvespa.readthedocs.io/en/latest/deploy-vespa-cloud.html)\n",
    "or a local\n",
    "[Docker instance](https://pyvespa.readthedocs.io/en/latest/deploy-docker.html).\n",
    "\n",
    "\n",
    "After connecting to the service, you can set up the retriever:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7ccca1f4",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from langchain.retrievers.vespa_retriever import VespaRetriever\n",
    "\n",
    "vespa_query_body = {\n",
    "    \"yql\": \"select content from paragraph where userQuery()\",\n",
    "    \"hits\": 5,\n",
    "    \"ranking\": \"documentation\",\n",
    "    \"locale\": \"en-us\"\n",
    "}\n",
    "vespa_content_field = \"content\"\n",
    "retriever = VespaRetriever(vespa_app, vespa_query_body, vespa_content_field)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1e7e34e1",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "This sets up a LangChain retriever that fetches documents from the Vespa application.\n",
    "Here, up to 5 results are retrieved from the `content` field in the `paragraph` document type,\n",
    "using `doumentation` as the ranking method. The `userQuery()` is replaced with the actual query\n",
    "passed from LangChain.\n",
    "\n",
    "Please refer to the [pyvespa documentation](https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa.html#Query)\n",
    "for more information.\n",
    "\n",
    "Now you can return the results and continue using the results in LangChain."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "f47a2bfe",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "retriever.get_relevant_documents(\"what is vespa?\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}