langchain/docs/use_cases/evaluation
Vashisht Madhavan aa439ac2ff
Adding an in-context QA evaluation chain + chain of thought reasoning chain for improved accuracy (#2444)
Right now, eval chains require an answer for every question. Collecting this ground truth is cumbersome, so this PR gets around the issue with two things:

* Adding a context param in `ContextQAEvalChain` and simply evaluating
whether the question is answered accurately from the context
* Adding chain-of-thought explanation prompting to improve the accuracy
of this evaluation without ground truth (see the sketch below)
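
A minimal sketch of how the context-based evaluation might be invoked, assuming the new chain exposes the same `from_llm` constructor and `evaluate` interface as the existing `QAEvalChain`; the key names and example data are illustrative, not prescriptive:

```python
from langchain.llms import OpenAI
from langchain.evaluation.qa import ContextQAEvalChain

# Each example carries a question and the retrieved context -- no ground-truth answer needed.
examples = [
    {
        "query": "What did the president say about Ketanji Brown Jackson?",
        "context": "The president nominated Ketanji Brown Jackson to the Supreme Court.",
    }
]
# Predictions come from whatever QA chain is being evaluated.
predictions = [{"result": "He nominated her to the Supreme Court."}]

llm = OpenAI(temperature=0)
eval_chain = ContextQAEvalChain.from_llm(llm)
graded = eval_chain.evaluate(
    examples,
    predictions,
    question_key="query",
    context_key="context",
    prediction_key="result",
)
print(graded)  # e.g. [{"text": "CORRECT"}]
```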

This also gets to feature parity with openai/evals, which has the same
contextual eval without ground truth.

TODO in follow-up:
* Better prompt inheritance. There should be no need for a separate prompt
for CoT reasoning; figure out how to merge the two.

---------

Co-authored-by: Vashisht Madhavan <vashishtmadhavan@Vashs-MacBook-Pro.local>
2023-04-06 22:32:41 -07:00
agent_benchmarking.ipynb bump version to 131 (#2391) 2023-04-04 07:21:50 -07:00
agent_vectordb_sota_pg.ipynb bump version to 131 (#2391) 2023-04-04 07:21:50 -07:00
benchmarking_template.ipynb Harrison/agent eval (#1620) 2023-03-14 12:37:48 -07:00
data_augmented_question_answering.ipynb WIP: Harrison/base retriever (#1765) 2023-03-24 07:46:49 -07:00
huggingface_datasets.ipynb Update huggingface_datasets.ipynb (#1417) 2023-03-04 00:22:31 -08:00
llm_math.ipynb Harrison/llm math (#1808) 2023-03-20 07:53:26 -07:00
qa_benchmarking_pg.ipynb WIP: Harrison/base retriever (#1765) 2023-03-24 07:46:49 -07:00
qa_benchmarking_sota.ipynb WIP: Harrison/base retriever (#1765) 2023-03-24 07:46:49 -07:00
qa_generation.ipynb Harrison/agent eval (#1620) 2023-03-14 12:37:48 -07:00
question_answering.ipynb Adding an in-context QA evaluation chain + chain of thought reasoning chain for improved accuracy (#2444) 2023-04-06 22:32:41 -07:00
sql_qa_benchmarking_chinook.ipynb Harrison/agent eval (#1620) 2023-03-14 12:37:48 -07:00