From a094b7f807f72167d362754cfeb18dc683b88bfd Mon Sep 17 00:00:00 2001
From: dev2049 <130488702+dev2049@users.noreply.github.com>
Date: Wed, 12 Apr 2023 17:05:20 -0700
Subject: [PATCH] Improve eval chain prompt (#2798)

Eval chain is currently very sensitive to differences in phrasing, punctuation, and tangential information. This prompt has worked better for me on my examples.

More general q: Do we have any framework for evaluating default prompt changes? Could maybe start doing some regression testing?
---
 langchain/evaluation/qa/eval_prompt.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/langchain/evaluation/qa/eval_prompt.py b/langchain/evaluation/qa/eval_prompt.py
index d270fddd..1fff5fa5 100644
--- a/langchain/evaluation/qa/eval_prompt.py
+++ b/langchain/evaluation/qa/eval_prompt.py
@@ -2,7 +2,7 @@
 from langchain.prompts import PromptTemplate
 
 template = """You are a teacher grading a quiz.
-You are given a question, the student's answer, and the true answer, and are asked to score it as either CORRECT or INCORRECT.
+You are given a question, the student's answer, and the true answer, and are asked to score the student answer as either CORRECT or INCORRECT.
 
 Example Format:
 QUESTION: question here
@@ -10,7 +10,7 @@ STUDENT ANSWER: student's answer here
 TRUE ANSWER: true answer here
 GRADE: CORRECT or INCORRECT here
 
-Please remember to grade them based on being factually accurate. Begin!
+Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin!
 
 QUESTION: {query}
 STUDENT ANSWER: {result}
@@ -29,7 +29,7 @@ CONTEXT: context the question is about here
 STUDENT ANSWER: student's answer here
 GRADE: CORRECT or INCORRECT here
 
-Please remember to grade them based on being factually accurate. Begin!
+Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin!
 
 QUESTION: {query}
 CONTEXT: {context}
@@ -51,7 +51,7 @@ STUDENT ANSWER: student's answer here
 EXPLANATION: step by step reasoning here
 GRADE: CORRECT or INCORRECT here
 
-Please remember to grade them based on being factually accurate. Begin!
+Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin!
 
 QUESTION: {query}
 CONTEXT: {context}
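
The commit message asks whether default prompt changes could be regression tested. One possible shape for such a check is sketched here. Nothing in it is part of LangChain: `Case`, `parse_grade`, `run_regression`, `strict_grader`, and `lenient_grader` are hypothetical names, and the two graders are plain-Python stand-ins for a real eval chain call. They only mimic the behaviors the old and new prompts aim for, so the example runs without a model.

```python
import re
from typing import Callable, List, NamedTuple, Optional


class Case(NamedTuple):
    query: str
    result: str    # the student answer
    answer: str    # the true answer
    expected: str  # "CORRECT" or "INCORRECT"


# Cases mirror what the commit message lists: phrasing, punctuation,
# extra (non-conflicting) information, plus one genuinely wrong answer.
CASES = [
    Case("What is the capital of France?", "The capital is Paris.", "Paris", "CORRECT"),
    Case("What is the capital of France?", "paris!", "Paris", "CORRECT"),
    Case("What is the capital of France?", "Paris, which is also its largest city", "Paris", "CORRECT"),
    Case("What is the capital of France?", "Lyon", "Paris", "INCORRECT"),
]


def parse_grade(output: str) -> Optional[str]:
    """Return the last standalone CORRECT/INCORRECT token, or None if absent."""
    # \b keeps "CORRECT" from matching inside "INCORRECT".
    grades = re.findall(r"\b(CORRECT|INCORRECT)\b", output)
    return grades[-1] if grades else None


def run_regression(grade_fn: Callable[[str, str, str], str], cases: List[Case]) -> List[Case]:
    """Return the cases whose parsed grade differs from the expected label."""
    return [c for c in cases if parse_grade(grade_fn(c.query, c.result, c.answer)) != c.expected]


def strict_grader(query: str, result: str, answer: str) -> str:
    """Stand-in for an over-sensitive grader: exact string match only."""
    return "GRADE: CORRECT" if result == answer else "GRADE: INCORRECT"


def lenient_grader(query: str, result: str, answer: str) -> str:
    """Stand-in for a grader that ignores case, punctuation, and extra text."""
    def norm(s: str) -> str:
        return re.sub(r"[^a-z0-9 ]", "", s.lower())
    return "GRADE: CORRECT" if norm(answer) in norm(result) else "GRADE: INCORRECT"


strict_failures = run_regression(strict_grader, CASES)
lenient_failures = run_regression(lenient_grader, CASES)
print(len(strict_failures), len(lenient_failures))
```

In a real setup, `grade_fn` would wrap a call to the eval chain built from each prompt version, and the number of failing cases before and after a prompt change would be compared on a fixed case set.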