{ "cells": [ { "cell_type": "markdown", "id": "3cadcf88", "metadata": {}, "source": [ "# Using HuggingFace Datasets\n", "\n", "This example shows how to use HuggingFace datasets to evaluate models. Specifically, we show how to load evaluation examples from HuggingFace's `datasets` package." ] }, { "cell_type": "markdown", "id": "0e3ce977", "metadata": {}, "source": [ "## Setup\n", "\n", "For demonstration purposes, we will evaluate a simple question-answering system." ] }, { "cell_type": "code", "execution_count": 1, "id": "4c10054f", "metadata": {}, "outputs": [], "source": [ "from langchain.prompts import PromptTemplate\n", "from langchain.chains import LLMChain\n", "from langchain.llms import OpenAI" ] }, { "cell_type": "code", "execution_count": 2, "id": "9abdf160", "metadata": {}, "outputs": [], "source": [ "prompt = PromptTemplate(template=\"Question: {question}\\nAnswer:\", input_variables=[\"question\"])" ] }, { "cell_type": "code", "execution_count": 3, "id": "d41ef7bb", "metadata": {}, "outputs": [], "source": [ "llm = OpenAI(model_name=\"text-davinci-003\", temperature=0)\n", "chain = LLMChain(llm=llm, prompt=prompt)" ] }, { "cell_type": "markdown", "id": "cbea2132", "metadata": {}, "source": [ "## Examples\n", "\n", "Now we load a dataset from HuggingFace and convert it to a list of dictionaries for easier use." ] }, { "cell_type": "code", "execution_count": 4, "id": "d2373cf1", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Found cached dataset truthful_qa (/Users/harrisonchase/.cache/huggingface/datasets/truthful_qa/generation/1.1.0/70210b72382652635215516e59663843b88eda16bd2acef909fb46700beb039a)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "92216d733c694ab4bfa812614f2223a4", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/1 [00:00