langchain/docs/modules/indexes/retrievers/examples/chatgpt-plugin.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "1edb9e6b",
   "metadata": {},
   "source": [
    "# ChatGPT Plugin\n",
    "\n",
    ">[OpenAI plugins](https://platform.openai.com/docs/plugins/introduction) connect ChatGPT to third-party applications. These plugins enable ChatGPT to interact with APIs defined by developers, enhancing ChatGPT's capabilities and allowing it to perform a wide range of actions.\n",
    "\n",
    ">Plugins can allow ChatGPT to do things like:\n",
    ">- Retrieve real-time information; e.g., sports scores, stock prices, the latest news, etc.\n",
    ">- Retrieve knowledge-base information; e.g., company docs, personal notes, etc.\n",
    ">- Perform actions on behalf of the user; e.g., booking a flight, ordering food, etc.\n",
    "\n",
    "This notebook shows how to use the ChatGPT Retriever Plugin within LangChain."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "bbe89ca0",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# STEP 1: Load\n",
    "\n",
    "# Load documents using LangChain's DocumentLoaders\n",
    "# This is from https://langchain.readthedocs.io/en/latest/modules/document_loaders/examples/csv.html\n",
    "\n",
    "from langchain.document_loaders.csv_loader import CSVLoader\n",
    "loader = CSVLoader(file_path='../../document_loaders/examples/example_data/mlb_teams_2012.csv')\n",
    "data = loader.load()\n",
    "\n",
    "\n",
    "# STEP 2: Convert\n",
    "\n",
    "# Convert Documents to the format expected by https://github.com/openai/chatgpt-retrieval-plugin\n",
    "from typing import List\n",
    "from langchain.docstore.document import Document\n",
    "import json\n",
    "\n",
    "def write_json(path: str, documents: List[Document]) -> None:\n",
    "    # Keep only the text of each document, one {\"text\": ...} record per Document\n",
    "    results = [{\"text\": doc.page_content} for doc in documents]\n",
    "    with open(path, \"w\") as f:\n",
    "        json.dump(results, f, indent=2)\n",
    "\n",
    "write_json(\"foo.json\", data)\n",
    "\n",
    "# STEP 3: Use\n",
    "\n",
    "# Ingest foo.json as you would any other JSON file, using the scripts in https://github.com/openai/chatgpt-retrieval-plugin/tree/main/scripts/process_json\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0474661d",
   "metadata": {},
   "source": [
    "## Using the ChatGPT Retriever Plugin\n",
    "\n",
    "Once the documents have been ingested into the ChatGPT Retriever Plugin, how do we actually query it?\n",
    "\n",
    "The code below walks through how to do that."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb27da9f-d574-425d-8fab-92b03b997568",
   "metadata": {},
   "source": [
    "To use `ChatGPTPluginRetriever`, we first need to set the OpenAI API key."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "b5d8c9e9-839f-42e9-933a-08195797dd4c",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdin",
     "output_type": "stream",
     "text": [
      "OpenAI API Key: ········\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "import getpass\n",
    "\n",
    "os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "39d6074e",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from langchain.retrievers import ChatGPTPluginRetriever"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "33fd23d1",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "retriever = ChatGPTPluginRetriever(url=\"http://0.0.0.0:8000\", bearer_token=\"foo\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "16250bdf",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Document(page_content=\"This is Alice's phone number: 123-456-7890\", lookup_str='', metadata={'id': '456_0', 'metadata': {'source': 'email', 'source_id': '567', 'url': None, 'created_at': '1609592400.0', 'author': 'Alice', 'document_id': '456'}, 'embedding': None, 'score': 0.925571561}, lookup_index=0),\n",
       " Document(page_content='This is a document about something', lookup_str='', metadata={'id': '123_0', 'metadata': {'source': 'file', 'source_id': 'https://example.com/doc1', 'url': 'https://example.com/doc1', 'created_at': '1609502400.0', 'author': 'Alice', 'document_id': '123'}, 'embedding': None, 'score': 0.6987589}, lookup_index=0),\n",
       " Document(page_content='Team: Angels \"Payroll (millions)\": 154.49 \"Wins\": 89', lookup_str='', metadata={'id': '59c2c0c1-ae3f-4272-a1da-f44a723ea631_0', 'metadata': {'source': None, 'source_id': None, 'url': None, 'created_at': None, 'author': None, 'document_id': '59c2c0c1-ae3f-4272-a1da-f44a723ea631'}, 'embedding': None, 'score': 0.697888613}, lookup_index=0)]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "retriever.get_relevant_documents(\"alice's phone number\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c8b5794b",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
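
Step 2 of the notebook's first code cell turns LangChain documents into a JSON list of `{"text": ...}` records. The sketch below reproduces that conversion so it can be tried without LangChain installed. It rests on a few assumptions: `SimpleDocument` is a hypothetical stand-in for `langchain.docstore.document.Document` that keeps only `page_content`, `to_records` is a helper name introduced here for illustration, and the second sample row is a made-up placeholder rather than data from `mlb_teams_2012.csv`.

```python
import json
import os
import tempfile
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for LangChain's Document (page_content only)."""

    page_content: str


def to_records(documents: List[SimpleDocument]) -> List[Dict[str, str]]:
    """Build one {"text": ...} record per document, as write_json does."""
    return [{"text": doc.page_content} for doc in documents]


def write_json(path: str, documents: List[SimpleDocument]) -> None:
    """Write the records to `path` as an indented JSON list."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(to_records(documents), f, indent=2)


# The first row is copied from the retrieved output shown in the notebook;
# the second row is a made-up placeholder.
docs = [
    SimpleDocument('Team: Angels "Payroll (millions)": 154.49 "Wins": 89'),
    SimpleDocument('Team: Example "Payroll (millions)": 0.0 "Wins": 0'),
]

# Write to a temporary directory, then read the file back to check its shape.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "foo.json")
    write_json(out_path, docs)
    with open(out_path, encoding="utf-8") as f:
        loaded = json.load(f)

print(len(loaded), sorted(loaded[0].keys()))
```

Reading the file back shows a JSON list in which every item has a single `text` key, which is the shape the notebook's `write_json` produces before the file is handed to the plugin's ingestion scripts.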