Improve eval chain prompt (#2798)

The eval chain is currently very sensitive to differences in phrasing,
punctuation, and tangential information. This prompt has worked better
for me on my examples.

More general question: do we have any framework for evaluating changes to
default prompts? We could start with some regression testing.
dev2049 2023-04-12 17:05:20 -07:00 committed by GitHub
parent 1c7fb31bba
commit a094b7f807
GPG Key ID: 4AEE18F83AFDEB23

@@ -2,7 +2,7 @@
 from langchain.prompts import PromptTemplate
 template = """You are a teacher grading a quiz.
-You are given a question, the student's answer, and the true answer, and are asked to score it as either CORRECT or INCORRECT.
+You are given a question, the student's answer, and the true answer, and are asked to score the student answer as either CORRECT or INCORRECT.
 Example Format:
 QUESTION: question here
@@ -10,7 +10,7 @@ STUDENT ANSWER: student's answer here
 TRUE ANSWER: true answer here
 GRADE: CORRECT or INCORRECT here
-Please remember to grade them based on being factually accurate. Begin!
+Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin!
 QUESTION: {query}
 STUDENT ANSWER: {result}
@@ -29,7 +29,7 @@ CONTEXT: context the question is about here
 STUDENT ANSWER: student's answer here
 GRADE: CORRECT or INCORRECT here
-Please remember to grade them based on being factually accurate. Begin!
+Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin!
 QUESTION: {query}
 CONTEXT: {context}
@@ -51,7 +51,7 @@ STUDENT ANSWER: student's answer here
 EXPLANATION: step by step reasoning here
 GRADE: CORRECT or INCORRECT here
-Please remember to grade them based on being factually accurate. Begin!
+Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin!
 QUESTION: {query}
 CONTEXT: {context}
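
To see what the grader actually receives after this change, here is a rough sketch that fills an abbreviated copy of the first updated template using plain `str.format` instead of `PromptTemplate`. The `{answer}` placeholder for the true answer and the trailing `GRADE:` line are assumptions, since those lines are not visible in the diff; `build_prompt` is a hypothetical helper.

```python
# Abbreviated copy of the updated QA grading template (example-format lines omitted).
# ASSUMPTION: the `{answer}` placeholder and the final "GRADE:" line are not
# shown in the diff and are filled in here for illustration only.
TEMPLATE = """You are a teacher grading a quiz.
You are given a question, the student's answer, and the true answer, and are asked to score the student answer as either CORRECT or INCORRECT.

Grade the student answers based ONLY on their factual accuracy. Ignore differences in punctuation and phrasing between the student answer and true answer. It is OK if the student answer contains more information than the true answer, as long as it does not contain any conflicting statements. Begin!

QUESTION: {query}
STUDENT ANSWER: {result}
TRUE ANSWER: {answer}
GRADE:"""


def build_prompt(query: str, result: str, answer: str) -> str:
    """Fill the template with one question, the student's answer, and the true answer."""
    return TEMPLATE.format(query=query, result=result, answer=answer)


if __name__ == "__main__":
    print(
        build_prompt(
            query="What is the capital of France?",
            result="Paris, which is also its largest city",
            answer="Paris",
        )
    )
```

Printing the filled prompt for a few borderline cases (extra detail, different punctuation) is a cheap way to check that the new instructions and the inputs line up before spending model calls on them.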