{ "cells": [ { "cell_type": "markdown", "id": "984169ca", "metadata": {}, "source": [ "# Question Answering Benchmarking: Paul Graham Essay\n", "\n", "Here we go over how to benchmark performance on a question answering task over a Paul Graham essay.\n", "\n", "It is highly recommended that you do any evaluation/benchmarking with tracing enabled. See [here](https://python.langchain.com/docs/modules/callbacks/how_to/tracing) for an explanation of what tracing is and how to set it up." ] }, { "cell_type": "markdown", "id": "8a16b75d", "metadata": {}, "source": [ "## Loading the data\n", "First, let's load the data." ] }, { "cell_type": "code", "execution_count": 2, "id": "5b2d5e98", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Found cached dataset json (/Users/harrisonchase/.cache/huggingface/datasets/LangChainDatasets___json/LangChainDatasets--question-answering-paul-graham-76e8f711e038d742/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9264acfe710b4faabf060f0fcf4f7308", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/1 [00:00