"<span style=\"color:orange; font-weight:bold\">Note: To answer questions based on text documents, we recommend the procedure in <a href=\"https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb\">Question Answering using Embeddings</a>. Some of the code below may rely on <a href=\"https://github.com/openai/openai-cookbook/tree/main/transition_guides_for_deprecated_API_endpoints\">deprecated API endpoints</a>.</span>"
"# 3. Train a fine-tuning model specialized for Q&A\n",
"This notebook will utilize the dataset of context, question and answer pairs to additionally create adversarial questions and context pairs, where the question was not generated on that context. In those cases the model will be prompted to answer \"No sufficient context for answering the question\". We will also train a discriminator model, which predicts whether the question can be answered based on the context or not.\n",
"\n",
"We will add hard adversarial examples as well, which will be based either on semantically similar sections, or neighbouring sections, originating from the same article."
" - random negative example, where the random context is paired with the question \n",
" - two hard negative examples\n",
" - one originating from the same wikipedia article\n",
" - another, which is most similar to the correct context\n",
"\n",
"This process is noisy, as sometimes the question might be answerable given a different context, but on average we hope this won't affect the peformance too much.\n",
"\n",
"We apply the same process of dataset creation for both the discriminator, and the Q&A answering model. We apply the process separately for the training and testing set, to ensure that the examples from the traing set don't feature within the test set."
" rows.append({\"prompt\":f\"{random_context}\\nQuestion: {q[2:].strip()}\\nAnswer:\", \"completion\":f\" No appropriate context found to answer the question.\"})\n",
"\n",
" return pd.DataFrame(rows) "
]
},
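{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the prompt/completion format concrete, here is an illustrative sketch of the kind of rows `create_fine_tuning_dataset` produces. The example texts and the positive completion are made up for illustration; only the template (context, then `Question:`, then `Answer:`) and the no-context completion match the function above.\n",
"\n",
"```python\n",
"# Positive example: the question was generated from this context, so the\n",
"# completion is the actual answer.\n",
"positive_row = {\n",
"    \"prompt\": \"Sputnik 1 was launched on 4 October 1957.\\nQuestion: When was Sputnik 1 launched?\\nAnswer:\",\n",
"    \"completion\": \" On 4 October 1957.\",\n",
"}\n",
"\n",
"# Negative example: the question is paired with a random or hard-negative\n",
"# context, so the model learns to decline to answer.\n",
"negative_row = {\n",
"    \"prompt\": \"The 2020 Summer Olympics were held in Tokyo.\\nQuestion: When was Sputnik 1 launched?\\nAnswer:\",\n",
"    \"completion\": \" No appropriate context found to answer the question.\",\n",
"}\n",
"\n",
"# For the discriminator, the prompt is built from the same context and question,\n",
"# and the completion is simply \" yes\" or \" no\".\n",
"```"
]
},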
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We apply the same process of dataset creation for both the discriminator, and the Q&A answering model. We apply the process separately for the training and testing set, to ensure that the examples from the traing set don't feature within the test set."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": []
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"for name, is_disc in [('discriminator', True), ('qa', False)]:\n",
" for train_test, dt in [('train', train_df), ('test', test_df)]:\n",
" ft = create_fine_tuning_dataset(dt, discriminator=is_disc, n_negative=1, add_related=True)\n",
"!openai api fine_tunes.create -t \"olympics-data/qa_train.jsonl\" -v \"olympics-data/qa_test.jsonl\" --batch_size 16"
]
},
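{
"cell_type": "markdown",
"metadata": {},
"source": [
"It can also be worth sanity-checking the generated files. The snippet below is a minimal sketch, assuming the loop above wrote `olympics-data/qa_train.jsonl` where the CLI command expects it.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Load the Q&A training file and inspect the balance of answerable vs.\n",
"# unanswerable (no-context) examples.\n",
"qa_train = pd.read_json(\"olympics-data/qa_train.jsonl\", lines=True)\n",
"no_context = qa_train[\"completion\"].str.startswith(\" No appropriate context\")\n",
"print(f\"{len(qa_train)} rows, {no_context.sum()} of which are negative (no-context) examples\")\n",
"print(qa_train.head(2))\n",
"```"
]
},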
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3.3 Using the fine-tuned models\n",
"\n",
"We will now use the fine-tuned discriminator and the fine-tuned Q&A model. By requesting logprobs, we can see how certain the discriminator is in a `yes` vs `no` answer."
" result = openai.Completion.create(model=answering_model, prompt=prompt, max_tokens=30, temperature=0, top_p=1, n=1, stop=['.','\\n'])\n",
" return result['choices'][0]['text']\n",
"\n",
"apply_ft_qa_answer('The first human-made object in space was the Soviet Union satellite Sputnik 1 on 4 October 1957.', \n",
" 'What was the first human-made object in space?', ft_qa)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see that the model can answer the question, when the context is appropriate."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' The Soviet Union was the first country to successfully launch a satellite into space'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"apply_ft_qa_answer('The first human-made object in space was the Soviet Union satellite Sputnik 1 on 4 October 1957.',\n",
" 'What is impressive about the Soviet Union?', ft_qa)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' No appropriate context found to answer the question'"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"apply_ft_qa_answer('The first human-made object in space was the Soviet Union satellite Sputnik 1 on 4 October 1957.',\n",
" 'How many cars were produced in the Soviet Union in 1970?', ft_qa)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see that the model knows when to answer the question, and when to say that insufficient context is present to answer the question."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also combine a discriminator and a base model, or a fine-tuned Q&A model. Discriminator can essentially serve as a decision whether the question can be answered given the context or not."
]
},
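{
"cell_type": "markdown",
"metadata": {},
"source": [
"The notebook's implementation of this combination is not shown in this excerpt, so the cell below is only a sketch of the idea: read the discriminator's `logprobs` to turn the ` yes` token into a probability, and only ask the Q&A model when that probability clears a threshold. The prompt suffix (` Related:`), the helper names (`yes_probability`, `answer_if_confident`) and the default threshold are assumptions; `apply_ft_qa_answer` is the helper defined above, and the `openai.Completion` endpoint it relies on is deprecated, as noted at the top of this notebook.\n",
"\n",
"```python\n",
"import math\n",
"import openai\n",
"\n",
"def yes_probability(context, question, discriminator_model):\n",
"    # The prompt suffix must match whatever format the discriminator was\n",
"    # fine-tuned on; \" Related:\" here is an assumption for illustration.\n",
"    prompt = f\"{context}\\nQuestion: {question}\\n Related:\"\n",
"    result = openai.Completion.create(model=discriminator_model, prompt=prompt,\n",
"                                      max_tokens=1, temperature=0, top_p=1, n=1, logprobs=2)\n",
"    # top_logprobs holds, per generated token, a dict of candidate tokens\n",
"    # mapped to their log-probabilities.\n",
"    top_logprobs = result['choices'][0]['logprobs']['top_logprobs'][0]\n",
"    return math.exp(top_logprobs[\" yes\"]) if \" yes\" in top_logprobs else 0.0\n",
"\n",
"def answer_if_confident(context, question, discriminator_model, answering_model, threshold=0.5):\n",
"    # Only ask the Q&A model when the discriminator is sufficiently confident\n",
"    # that the question is answerable from the context.\n",
"    if yes_probability(context, question, discriminator_model) < threshold:\n",
"        return \" No appropriate context found to answer the question.\"\n",
"    return apply_ft_qa_answer(context, question, answering_model)\n",
"```\n",
"\n",
"Raising `threshold` makes the pipeline more conservative about answering, which is the fine-grained control discussed below."
]
},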
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' Weather could cause a sport event to have no crowd'"
" \"Crowdless games are a rare although not unheard-of occurrence in sports. \\\n",
" When they do occur, it is usually the result of events beyond the control \\\n",
" of the teams or fans, such as weather-related concerns, public health concerns, \\\n",
" or wider civil disturbances unrelated to the game. For instance, \\\n",
" the COVID-19 pandemic caused many sports leagues around the world \\\n",
" to be played behind closed doors.\",\n",
" \"Could weather cause a sport event to have no crowd?\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The above function illustrates how to potentially combine a discriminator and a fine-tuned Q&A model. This gives a more fine-grained control over how certain we want the model to be before it answers the question.\n",
"\n",
"We'll now take a look on how answers endpoint works - combining search to retrieve the relevant context from a knowledge base, and then using the fine-tuned Q&A model to answer the question."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3.4 Answering the question based on a knowledge base\n",
"Finally we can use a logic similar to the [/answers](https://beta.openai.com/docs/api-reference/answers) endpoint, where we first search for the relevant context, and then ask a Q&A model to answer the question given that context. If you'd like to see the implementation details, check out the [`answers_with_ft.py`](answers_with_ft.py) file."
]
},
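{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `answer_question` helper used in the next cell relies on the deprecated file-based search endpoint via `olympics_search_fileid`. As the note at the top of this notebook suggests, the same retrieve-then-answer pattern can also be sketched with embeddings. The cell below is an illustrative sketch under that assumption, not the implementation in `answers_with_ft.py`; it assumes a DataFrame `df` of sections with a `context` column (as built earlier in this series), the `apply_ft_qa_answer` helper from above, and an embedding model choice of our own.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import openai\n",
"\n",
"EMBEDDING_MODEL = \"text-embedding-ada-002\"  # assumed choice of embedding model\n",
"\n",
"def embed(texts):\n",
"    # Returns one embedding vector per input text (batch this for a large corpus).\n",
"    resp = openai.Embedding.create(input=texts, model=EMBEDDING_MODEL)\n",
"    return np.array([d['embedding'] for d in resp['data']])\n",
"\n",
"# Embed every section once up front.\n",
"context_embeddings = embed(df['context'].tolist())\n",
"\n",
"def answer_from_knowledge_base(question, answering_model):\n",
"    # Retrieve the most similar section, then let the fine-tuned Q&A model\n",
"    # answer; it falls back to \"No appropriate context found ...\" when the\n",
"    # retrieved section does not actually contain the answer.\n",
"    q_emb = embed([question])[0]\n",
"    best_context = df['context'].iloc[int(np.argmax(context_embeddings @ q_emb))]\n",
"    return apply_ft_qa_answer(best_context, question, answering_model)\n",
"```"
]
},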
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\" Canada won the Women's football tournament at the 2020 Olympic games\""
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from answers_with_ft import answer_question\n",
"answer_question(olympics_search_fileid, ft_qa, \"Which country won the Women's football tournament at the 2020 Olympic games?\")"