Zander Chase
cc60fed3be
Add a Pairwise Comparison Chain (#6703)
...
Notebook shows preference scoring between two chains and reports the Wilson
score interval and p-value.
I think I'll add the option to insert ground-truth labels, but that doesn't
have to be in this PR.
2023-06-26 20:47:41 -07:00
Zander Chase
c460b04c64
Update String Evaluator (#6615)
...
- Add protocol for `evaluate_strings`
- Move the criteria evaluator out so it's not restricted to being
applied to traced runs
2023-06-26 14:16:14 -07:00
Davis Chase
3298bf4f00
docs/fix links (#6498)
2023-06-20 14:06:50 -07:00
Davis Chase
d3c2eab0b3
Docs nit (#6350)
2023-06-18 20:58:12 -07:00
Davis Chase
6640293087
fix eval guide links (#6319)
2023-06-16 17:53:46 -07:00
Davis Chase
24b2af5218
nit (#6305)
2023-06-16 16:21:27 -07:00
Davis Chase
87e502c6bc
Doc refactor (#6300)
...
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-16 11:52:56 -07:00