Ruixi Fan
0b69a7e9ab
[Document fix] Fix an expired link in qa_benchmarking_pg.ipynb ( #7110 )
## Change description
- Description: Fix an expired link that points to the readthedocs site.
- Dependencies: No
2023-07-03 19:03:16 -06:00
Johnny Lim
9dc77614e3
Polish reference docs ( #7045 )
This PR fixes broken links in the reference docs.
2023-07-02 08:08:51 -06:00
William FH
8c73037dff
Simplify eval arg names ( #6944 )
It'll be easier to switch between these evaluators if the names of the
prediction arguments are consistent.
2023-06-30 07:47:53 -07:00
Zander Chase
e1fdb67440
Update description in Evals notebook ( #6808 )
2023-06-27 00:26:49 -07:00
Zander Chase
ad028bbb80
Permit Constitutional Principles ( #6807 )
In the criteria evaluator.
2023-06-27 00:23:54 -07:00
Zander Chase
d7dbf4aefe
Clean up agent trajectory interface ( #6799 )
- Enable reference
- Enable not specifying tools at the start
- Add methods with keywords
2023-06-26 22:54:04 -07:00
Zander Chase
cc60fed3be
Add a Pairwise Comparison Chain ( #6703 )
The notebook shows preference scoring between two chains and reports the
Wilson score interval and p-value.
I think I'll add the option to insert ground truth labels, but that doesn't
have to be in this PR.
2023-06-26 20:47:41 -07:00
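For readers unfamiliar with the statistics named in #6703, here is a minimal sketch of how a Wilson score interval and a two-sided p-value could be computed from pairwise preference counts. It is not taken from the notebook: the function names `wilson_interval` and `binomial_two_sided_p` are hypothetical, the 95% level (`z = 1.96`) is an assumption, and the p-value assumes an exact binomial test against a 50/50 null, which may differ from what the notebook actually does.

```python
import math
from typing import Tuple


def wilson_interval(wins: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a win rate of `wins` out of `n` comparisons.

    z = 1.96 corresponds to an (assumed) 95% confidence level.
    """
    if n <= 0 or not 0 <= wins <= n:
        raise ValueError("need n > 0 and 0 <= wins <= n")
    p_hat = wins / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


def binomial_two_sided_p(wins: int, n: int) -> float:
    """Exact two-sided binomial test against the null of no preference (p = 0.5).

    Sums the probability of every outcome that is no more likely than the
    observed one.
    """
    if n <= 0 or not 0 <= wins <= n:
        raise ValueError("need n > 0 and 0 <= wins <= n")
    pmf = [math.comb(n, k) * 0.5 ** n for k in range(n + 1)]
    observed = pmf[wins]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


if __name__ == "__main__":
    # Hypothetical counts: chain A preferred in 7 of 10 comparisons.
    low, high = wilson_interval(7, 10)
    print(f"Wilson 95% interval: ({low:.4f}, {high:.4f})")
    print(f"two-sided p-value:   {binomial_two_sided_p(7, 10):.5f}")
```

With counts this small the interval is wide and the p-value is well above 0.05, which is the kind of result that suggests collecting more comparisons before declaring one chain preferred.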
Zander Chase
c460b04c64
Update String Evaluator ( #6615 )
- Add protocol for `evaluate_strings`
- Move the criteria evaluator out so it's not restricted to being
applied on traced runs
2023-06-26 14:16:14 -07:00
Davis Chase
3298bf4f00
docs/fix links ( #6498 )
2023-06-20 14:06:50 -07:00
Davis Chase
d3c2eab0b3
Docs nit ( #6350 )
2023-06-18 20:58:12 -07:00
Davis Chase
6640293087
fix eval guide links ( #6319 )
2023-06-16 17:53:46 -07:00
Davis Chase
24b2af5218
nit ( #6305 )
2023-06-16 16:21:27 -07:00
Davis Chase
87e502c6bc
Doc refactor ( #6300 )
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-16 11:52:56 -07:00