{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Note: To answer questions based on text documents, we recommend the procedure in Question Answering using Embeddings. Some of the code below may rely on deprecated API endpoints." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 3. Train a fine-tuned model specialized for Q&A\n", "This notebook uses the dataset of context, question and answer pairs to create additional adversarial question and context pairs, in which the question was not generated from that context. In those cases the model will be prompted to answer \"No sufficient context for answering the question\". We will also train a discriminator model, which predicts whether or not the question can be answered from the context.\n", "\n", "We will also add hard adversarial examples, based either on semantically similar sections or on neighbouring sections of the same article." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | title | heading | content | tokens | context | questions | answers |
|---|---|---|---|---|---|---|---|
| 0 | 2020 Summer Olympics | Summary | The 2020 Summer Olympics (Japanese: 2020年夏季オリン... | 713 | 2020 Summer Olympics\\nSummary\\n\\nThe 2020 Summ... | 1. What is the 2020 Summer Olympics?\\n2. When ... | 1. The 2020 Summer Olympics is an internationa... |
| 1 | 2020 Summer Olympics | Host city selection | The International Olympic Committee (IOC) vote... | 126 | 2020 Summer Olympics\\nHost city selection\\n\\nT... | 1. \\n2. \\n3. \\n4. | 1. What is the International Olympic Committee... |
| 2 | 2020 Summer Olympics | Impact of the COVID-19 pandemic | In January 2020, concerns were raised about th... | 369 | 2020 Summer Olympics\\nImpact of the COVID-19 p... | 1. What was the COVID-19 pandemic?\\n2. How did... | 1. The COVID-19 pandemic was a pandemic that o... |
| 3 | 2020 Summer Olympics | Qualifying event cancellation and postponement | Concerns about the pandemic began to affect qu... | 298 | 2020 Summer Olympics\\nQualifying event cancell... | 1. What was the original location of the Asia ... | 1. The original location of the Asia & Oceania... |
| 4 | 2020 Summer Olympics | Effect on doping tests | Mandatory doping tests were being severely res... | 163 | 2020 Summer Olympics\\nEffect on doping tests\\n... | 1. What was the COVID-19 pandemic?\\n2. What di... | 1. The COVID-19 pandemic was a pandemic that o... |
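The adversarial pairing described in the introduction can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual implementation: the function name `make_examples`, the record fields (`title`, `context`, `question`, `answer`, one question per record) and the `hard` flag are assumptions made for the example. It covers only the "same article" flavour of hard negatives; picking semantically similar sections would need an embedding search, which is left out here.

```python
import random

# Answer used whenever the question was not generated from the given context
# (string taken from the notebook's introduction).
NO_CONTEXT_ANSWER = "No sufficient context for answering the question"


def make_examples(sections, rng, hard=True):
    """Build positive and adversarial (context, question, answer) examples.

    sections: list of dicts with hypothetical keys
              'title', 'context', 'question', 'answer'.
    rng:      a random.Random instance, so sampling is reproducible.
    hard:     if True, the mismatched context comes from another section of
              the same article; otherwise from a different article.
    """
    examples = []
    for i, sec in enumerate(sections):
        # Positive example: the question paired with its own context.
        examples.append({
            "context": sec["context"],
            "question": sec["question"],
            "answer": sec["answer"],
        })

        # Candidate contexts that the question was NOT generated from.
        if hard:
            candidates = [s for j, s in enumerate(sections)
                          if j != i and s["title"] == sec["title"]]
        else:
            candidates = [s for s in sections if s["title"] != sec["title"]]
        if not candidates:
            continue  # no suitable mismatched context for this section

        # Adversarial example: same question, mismatched context.
        other = rng.choice(candidates)
        examples.append({
            "context": other["context"],
            "question": sec["question"],
            "answer": NO_CONTEXT_ANSWER,
        })
    return examples


# Toy data for illustration only; not taken from the dataset above.
toy_sections = [
    {"title": "A", "context": "A/one", "question": "q1", "answer": "a1"},
    {"title": "A", "context": "A/two", "question": "q2", "answer": "a2"},
    {"title": "B", "context": "B/one", "question": "q3", "answer": "a3"},
]
toy_examples = make_examples(toy_sections, random.Random(0), hard=True)
```

A discriminator could be trained on the same pairs by replacing the answer with a yes/no label that says whether the question is answerable from the context.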