From a094b7f807f72167d362754cfeb18dc683b88bfd Mon Sep 17 00:00:00 2001
From: dev2049 <130488702+dev2049@users.noreply.github.com>
Date: Wed, 12 Apr 2023 17:05:20 -0700
Subject: [PATCH] Improve eval chain prompt (#2798)

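The eval chain's grading is currently very sensitive to differences in
phrasing, punctuation, and tangential information. The revised prompt
tells the grader to judge factual accuracy only, to ignore punctuation
and phrasing differences, and to accept extra information that does not
conflict with the true answer. It has worked better on my own examples.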

More general question: do we have any framework for evaluating changes
to default prompts? We could start doing some regression testing.
---
 langchain/evaluation/qa/eval_prompt.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
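
On the regression-testing question in the message above, here is a
minimal sketch of what such a check might look like. Everything in it is
hypothetical and not part of this patch: the `Case` fields mirror the
`{query}` and `{result}` template variables (the `answer` field name is
an assumption), and the two graders are offline stand-ins for "LLM + old
prompt" and "LLM + new prompt" so that the sketch runs without a model.

```python
import string
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Case:
    query: str
    result: str    # student answer
    answer: str    # true answer (field name is an assumption)
    expected: str  # human label: "CORRECT" or "INCORRECT"


# Hand-labeled cases targeting the sensitivities named in the commit message.
CASES: List[Case] = [
    Case("Capital of France?", "paris.", "Paris", "CORRECT"),  # punctuation, case
    Case("Capital of France?", "Paris, a city on the Seine", "Paris", "CORRECT"),  # extra info
    Case("Capital of France?", "Lyon", "Paris", "INCORRECT"),  # wrong fact
]

Grader = Callable[[Case], str]


def agreement(grader: Grader, cases: List[Case]) -> float:
    """Fraction of cases where the grader's verdict matches the human label."""
    hits = sum(grader(c).strip().upper() == c.expected for c in cases)
    return hits / len(cases)


def assert_no_regression(old: Grader, new: Grader, cases: List[Case]) -> None:
    """Fail if the new grader agrees with the human labels less often than the old one."""
    old_score, new_score = agreement(old, cases), agreement(new, cases)
    assert new_score >= old_score, f"regressed: {old_score:.2f} -> {new_score:.2f}"


def _norm(text: str) -> str:
    """Lowercase and strip punctuation so surface differences are ignored."""
    return text.lower().translate(str.maketrans("", "", string.punctuation)).strip()


# Stand-in graders; real ones would format the prompt and call the LLM.
def strict_grader(c: Case) -> str:
    return "CORRECT" if c.result == c.answer else "INCORRECT"


def lenient_grader(c: Case) -> str:
    return "CORRECT" if _norm(c.answer) in _norm(c.result) else "INCORRECT"


if __name__ == "__main__":
    assert_no_regression(strict_grader, lenient_grader, CASES)
    print(f"old={agreement(strict_grader, CASES):.2f} new={agreement(lenient_grader, CASES):.2f}")
```

In a real setup the two graders would wrap the eval chain with the old
and new templates, and the labeled cases would live in a checked-in
fixture so that every default-prompt change is scored against the same
set.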

diff --git a/langchain/evaluation/qa/eval_prompt.py b/langchain/evaluation/qa/eval_prompt.py
index d270fddd..1fff5fa5 100644
--- a/langchain/evaluation/qa/eval_prompt.py
+++ b/langchain/evaluation/qa/eval_prompt.py
@@ -2,7 +2,7 @@
 from langchain.prompts import PromptTemplate
 
 template = """You are a teacher grading a quiz.
-You are given a question, the student's answer, and the true answer, and are asked to score it as either CORRECT or INCORRECT.
+You are given a question, the student's answer, and the true answer, and are asked to score the student answer as either CORRECT or INCORRECT.
 
 Example Format:
 QUESTION: question here
@@ -10,7 +10,7 @@ STUDENT ANSWER: student's answer here
 TRUE ANSWER: true answer here
 GRADE: CORRECT or INCORRECT here
 
-Please remember to grade them based on being factually accurate. Begin!
+Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin! 
 
 QUESTION: {query}
 STUDENT ANSWER: {result}
@@ -29,7 +29,7 @@ CONTEXT: context the question is about here
 STUDENT ANSWER: student's answer here
 GRADE: CORRECT or INCORRECT here
 
-Please remember to grade them based on being factually accurate. Begin!
+Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin! 
 
 QUESTION: {query}
 CONTEXT: {context}
@@ -51,7 +51,7 @@ STUDENT ANSWER: student's answer here
 EXPLANATION: step by step reasoning here
 GRADE: CORRECT or INCORRECT here
 
-Please remember to grade them based on being factually accurate. Begin!
+Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin! 
 
 QUESTION: {query}
 CONTEXT: {context}