mirror of https://github.com/openai/openai-cookbook
synced 2024-11-04 06:00:33 +00:00
updates model name in text description of Q&A notebook
parent 7272515849
commit 5c7e36f574
@@ -167,13 +167,14 @@
 ]
 },
 {
+"attachments": {},
 "cell_type": "markdown",
 "id": "ee85ee77-d8d2-4788-b57e-0785f2d7e2e3",
 "metadata": {},
 "source": [
 "Adding extra information into the prompt only works when the dataset of extra content that the model may need to know is small enough to fit in a single prompt. What do we do when we need the model to choose relevant contextual information from within a large body of information?\n",
 "\n",
-"**In the remainder of this notebook, we will demonstrate a method for augmenting GPT-3 with a large body of additional contextual information by using document embeddings and retrieval.** This method answers queries in two steps: first it retrieves the information relevant to the query, then it writes an answer tailored to the question based on the retrieved information. The first step uses the [Embedding API](https://beta.openai.com/docs/guides/embeddings), the second step uses the [Completions API](https://beta.openai.com/docs/guides/completion/introduction).\n",
+"**In the remainder of this notebook, we will demonstrate a method for augmenting GPT-3 with a large body of additional contextual information by using document embeddings and retrieval.** This method answers queries in two steps: first it retrieves the information relevant to the query, then it writes an answer tailored to the question based on the retrieved information. The first step uses the [Embeddings API](https://beta.openai.com/docs/guides/embeddings), the second step uses the [Completions API](https://beta.openai.com/docs/guides/completion/introduction).\n",
 " \n",
 "The steps are:\n",
 "* Preprocess the contextual information by splitting it into chunks and create an embedding vector for each chunk.\n",
@@ -308,15 +309,14 @@
 ]
 },
 {
+"attachments": {},
 "cell_type": "markdown",
 "id": "a17b88b9-7ea2-491e-9727-12617c74a77d",
 "metadata": {},
 "source": [
 "We preprocess the document sections by creating an embedding vector for each section. An embedding is a vector of numbers that helps us understand how semantically similar or different the texts are. The closer two embeddings are to each other, the more similar are their contents. See the [documentation on OpenAI embeddings](https://beta.openai.com/docs/guides/embeddings) for more information.\n",
 "\n",
-"This indexing stage can be executed offline and only runs once to precompute the indexes for the dataset so that each piece of content can be retrieved later. Since this is a small example, we will store and search the embeddings locally. If you have a larger dataset, consider using a vector search engine like [Pinecone](https://www.pinecone.io/) or [Weaviate](https://github.com/semi-technologies/weaviate) to power the search.\n",
-"\n",
-"For the purposes of this tutorial we chose to use Curie embeddings, which are 4096-dimensional embeddings at a very good price and performance point. Since we will be using these embeddings for retrieval, we’ll use the \"search\" embeddings (see the [documentation](https://beta.openai.com/docs/guides/embeddings))."
+"This indexing stage can be executed offline and only runs once to precompute the indexes for the dataset so that each piece of content can be retrieved later. Since this is a small example, we will store and search the embeddings locally. If you have a larger dataset, consider using a vector search engine like [Pinecone](https://www.pinecone.io/) or [Weaviate](https://github.com/semi-technologies/weaviate) to power the search."
 ]
 },
 {
@@ -725,11 +725,12 @@
 ]
 },
 {
+"attachments": {},
 "cell_type": "markdown",
 "id": "7b48d155-d2d4-447c-ab8e-5a5b4722b07c",
 "metadata": {},
 "source": [
-"Wow! By combining the Embeddings and Completions APIs, we have created a question-answering model which can answer questions using a large base of additional knowledge. It also understands when it doesn't know the answer! \n",
+"By combining the Embeddings and Completions APIs, we have created a question-answering model which can answer questions using a large base of additional knowledge. It also understands when it doesn't know the answer! \n",
 "\n",
 "For this example we have used a dataset of Wikipedia articles, but that dataset could be replaced with books, articles, documentation, service manuals, or much much more. **We can't wait to see what you create with GPT-3!**\n",
 "\n",