mirror of
https://github.com/hwchase17/langchain
synced 2024-10-29 17:07:25 +00:00
cc60fed3be
Notebook shows preference scoring between two chains and reports the Wilson score interval and p-value. I think I'll add the option to insert ground truth labels, but that doesn't have to be in this PR. |
||
---|---|---|
comparison | ||
criteria | ||
qa | ||
run_evaluators | ||
__init__.py |
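
The commit message above mentions reporting a Wilson score interval and a p-value for preference scoring between two chains. The notebook itself is not shown in this listing, so the following is only a minimal sketch of how such statistics could be computed from a count of pairwise preferences, using the standard library alone. The function names `wilson_interval` and `two_sided_binomial_p` are hypothetical and are not taken from the repository; the null hypothesis of "no preference" (p = 0.5) is likewise an assumption.

```python
from math import comb, sqrt


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    `wins` is how often one chain was preferred out of `n` comparisons.
    z = 1.96 corresponds to roughly 95% confidence (assumed default).
    """
    if n <= 0 or not 0 <= wins <= n:
        raise ValueError("need n > 0 and 0 <= wins <= n")
    p_hat = wins / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def two_sided_binomial_p(wins: int, n: int) -> float:
    """Exact two-sided binomial test against the null p = 0.5.

    Sums the probability of every outcome that is no more likely than
    the observed one. Integer arithmetic keeps the comparison exact.
    """
    if n <= 0 or not 0 <= wins <= n:
        raise ValueError("need n > 0 and 0 <= wins <= n")
    observed = comb(n, wins)
    tail = sum(comb(n, i) for i in range(n + 1) if comb(n, i) <= observed)
    return tail / 2 ** n
```

For instance, if one chain is preferred in 7 of 10 comparisons, `wilson_interval(7, 10)` gives an interval of roughly 0.397 to 0.892 and `two_sided_binomial_p(7, 10)` gives 0.34375, so a sample that small would not support a clear preference.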