add docs for openai retriever ingest (#1969)

Harrison Chase 2023-03-24 08:24:33 -07:00 committed by GitHub
parent 47d37db2d2
commit 6ec5780547


@@ -1,15 +1,75 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "1edb9e6b",
"metadata": {},
"source": [
"# ChatGPT Plugin Retriever\n",
"\n",
"This notebook shows how to use the ChatGPT Retriever Plugin within LangChain."
]
},
{
"cell_type": "markdown",
"id": "074b0004",
"metadata": {},
"source": [
"## Create\n",
"\n",
"First, let's go over how to create the ChatGPT Retriever Plugin.\n",
"\n",
"To set up the ChatGPT Retriever Plugin, please follow the instructions [here](https://github.com/openai/chatgpt-retrieval-plugin).\n",
"\n",
"You can also populate the ChatGPT Retriever Plugin with documents loaded through LangChain document loaders. The code below walks through how to do that."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "bbe89ca0",
"metadata": {},
"outputs": [],
"source": [
"# STEP 1: Load\n",
"\n",
"# Load documents using LangChain's DocumentLoaders\n",
"# This is from https://langchain.readthedocs.io/en/latest/modules/document_loaders/examples/csv.html\n",
"\n",
"from langchain.document_loaders.csv_loader import CSVLoader\n",
"loader = CSVLoader(file_path='../../document_loaders/examples/example_data/mlb_teams_2012.csv')\n",
"data = loader.load()\n",
"\n",
"\n",
"# STEP 2: Convert\n",
"\n",
"# Convert Documents to the format expected by https://github.com/openai/chatgpt-retrieval-plugin\n",
"from typing import List\n",
"from langchain.docstore.document import Document\n",
"import json\n",
"\n",
"def write_json(path: str, documents: List[Document]) -> None:\n",
"    # Keep only each document's text, one {\"text\": ...} object per document\n",
"    results = [{\"text\": doc.page_content} for doc in documents]\n",
"    with open(path, \"w\") as f:\n",
"        json.dump(results, f, indent=2)\n",
"\n",
"write_json(\"foo.json\", data)\n",
"\n",
"# STEP 3: Use\n",
"\n",
"# Ingest foo.json as you would any other JSON file, following https://github.com/openai/chatgpt-retrieval-plugin/tree/main/scripts/process_json\n"
]
},
{
"cell_type": "markdown",
"id": "0474661d",
"metadata": {},
"source": [
"## Using the ChatGPT Retriever Plugin\n",
"\n",
"Okay, so we've created the ChatGPT Retriever Plugin, but how do we actually use it?\n",
"\n",
"The code below walks through how to do that."
]
},
{