mirror of
https://github.com/hwchase17/langchain
synced 2024-10-29 17:07:25 +00:00
cc60fed3be
The notebook shows preference scoring between two chains and reports a Wilson score interval and p-value. I think I'll add the option to insert ground truth labels, but that doesn't have to be in this PR. |
||
---|---|---|
agent_benchmarking.ipynb | ||
agent_vectordb_sota_pg.ipynb | ||
benchmarking_template.ipynb | ||
comparisons.ipynb | ||
criteria_eval_chain.ipynb | ||
data_augmented_question_answering.ipynb | ||
generic_agent_evaluation.ipynb | ||
huggingface_datasets.ipynb | ||
index.mdx | ||
llm_math.ipynb | ||
openapi_eval.ipynb | ||
qa_benchmarking_pg.ipynb | ||
qa_benchmarking_sota.ipynb | ||
qa_generation.ipynb | ||
question_answering.ipynb | ||
sql_qa_benchmarking_chinook.ipynb |
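
The commit message above mentions reporting a Wilson score interval and a p-value for pairwise preference scores. As a rough illustration of those two statistics (not the notebook's actual code), here is a minimal sketch using only the Python standard library. The function names `wilson_interval` and `binomial_two_sided_p` are hypothetical. The sketch assumes ties are dropped before counting and that the p-value comes from an exact two-sided binomial test against a 50% win rate; the notebook may use a different test.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a win rate (hypothetical helper).

    wins: comparisons in which chain A was preferred over chain B
    n:    total comparisons, ties excluded (an assumption of this sketch)
    z:    normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if n <= 0 or not 0 <= wins <= n:
        raise ValueError("need n > 0 and 0 <= wins <= n")
    p_hat = wins / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


def binomial_two_sided_p(wins: int, n: int) -> float:
    """Exact two-sided binomial p-value against a null win rate of 0.5.

    Under the null the distribution is symmetric, so the two-sided p-value
    is twice the smaller tail probability, capped at 1.
    """
    if n <= 0 or not 0 <= wins <= n:
        raise ValueError("need n > 0 and 0 <= wins <= n")
    k = min(wins, n - wins)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Illustrative counts only: chain A preferred in 7 of 10 comparisons.
    low, high = wilson_interval(7, 10)
    p_value = binomial_two_sided_p(7, 10)
    print(f"win rate 0.70, Wilson interval ({low:.3f}, {high:.3f}), p = {p_value:.3f}")
```

With so few comparisons the interval is wide and includes 0.5, which is why a preference score is more informative when reported with its interval than as a bare win rate.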