Improve eval chain prompt (#2798)

The eval chain is currently very sensitive to differences in phrasing, punctuation, and tangential information. This prompt has worked better for me on my examples. A more general question: do we have any framework for evaluating changes to default prompts? We could start doing some regression testing.
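One possible shape for the regression testing raised above is sketched here. It is only an illustration under stated assumptions: `Case`, `CASES`, `stub_grader`, and `run_regression` are hypothetical names that do not exist in the repository, and `stub_grader` is a deterministic stand-in for the real LLM-backed grader so the sketch runs offline.

```python
import string
from typing import Callable, List, NamedTuple


class Case(NamedTuple):
    """One graded example; field names mirror the prompt variables."""
    query: str
    result: str    # student answer
    answer: str    # true answer
    expected: str  # "CORRECT" or "INCORRECT"


# Hypothetical fixtures: punctuation noise, extra information, and a wrong answer.
CASES: List[Case] = [
    Case("What is the capital of France?", "paris.", "Paris", "CORRECT"),
    Case("What is the capital of France?", "Paris, a city on the Seine", "Paris", "CORRECT"),
    Case("What is the capital of France?", "Lyon", "Paris", "INCORRECT"),
]


def _normalize(text: str) -> str:
    """Lowercase and strip punctuation so surface noise does not affect the comparison."""
    return text.lower().translate(str.maketrans("", "", string.punctuation)).strip()


def stub_grader(query: str, result: str, answer: str) -> str:
    """Stand-in for the LLM call (assumption): CORRECT if the true answer appears in the student answer."""
    return "CORRECT" if _normalize(answer) in _normalize(result) else "INCORRECT"


def run_regression(grader: Callable[[str, str, str], str], cases: List[Case]) -> List[Case]:
    """Return the cases whose grade differs from the expected grade."""
    return [c for c in cases if grader(c.query, c.result, c.answer) != c.expected]


if __name__ == "__main__":
    failures = run_regression(stub_grader, CASES)
    print(f"{len(CASES) - len(failures)}/{len(CASES)} cases passed")
```

To compare two prompt versions, the same `CASES` could be run through a grader built from each prompt and the two failure lists diffed. Because a real grader calls a model, its results may vary between runs.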
2023-04-12 17:05:20 -07:00
commit a094b7f807
parent 1c7fb31bba
1 changed file with 4 additions and 4 deletions
--- a/langchain/evaluation/qa/eval_prompt.py
+++ b/langchain/evaluation/qa/eval_prompt.py
@@ -2,7 +2,7 @@
 from langchain.prompts import PromptTemplate
 template = """You are a teacher grading a quiz.
-You are given a question, the student's answer, and the true answer, and are asked to score it as either CORRECT or INCORRECT.
+You are given a question, the student's answer, and the true answer, and are asked to score the student answer as either CORRECT or INCORRECT.
 Example Format:
 QUESTION: question here
@@ -10,7 +10,7 @@ STUDENT ANSWER: student's answer here
 TRUE ANSWER: true answer here
 GRADE: CORRECT or INCORRECT here
-Please remember to grade them based on being factually accurate. Begin!
+Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin! 
 QUESTION: {query}
 STUDENT ANSWER: {result}
@@ -29,7 +29,7 @@ CONTEXT: context the question is about here
 STUDENT ANSWER: student's answer here
 GRADE: CORRECT or INCORRECT here
-Please remember to grade them based on being factually accurate. Begin!
+Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin! 
 QUESTION: {query}
 CONTEXT: {context}
@@ -51,7 +51,7 @@ STUDENT ANSWER: student's answer here
 EXPLANATION: step by step reasoning here
 GRADE: CORRECT or INCORRECT here
-Please remember to grade them based on being factually accurate. Begin!
+Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin! 
 QUESTION: {query}
 CONTEXT: {context}