13 Commits (80e86b602eb48d75e1ad518dc31ad7151d03c2d0)

Author SHA1 Message Date
Johnny Lim 9dc77614e3
Polish reference docs (#7045)
This PR fixes broken links in the reference docs.
1 year ago
William FH 8c73037dff
Simplify eval arg names (#6944)
It'll be easier to switch between these if the names of predictions are
consistent
1 year ago
Zander Chase e1fdb67440
Update description in Evals notebook (#6808) 1 year ago
Zander Chase ad028bbb80
Permit Constitutional Principles (#6807)
In the criteria evaluator.
1 year ago
Zander Chase d7dbf4aefe
Clean up agent trajectory interface (#6799)
- Enable reference
- Enable not specifying tools at the start
- Add methods with keywords
1 year ago
Zander Chase cc60fed3be
Add a Pairwise Comparison Chain (#6703)
Notebook shows preference scoring between two chains and reports the Wilson
score interval + p-value

I think I'll add the option to insert ground truth labels, but that doesn't
have to be in this PR
1 year ago
Zander Chase c460b04c64
Update String Evaluator (#6615)
- Add protocol for `evaluate_strings` 
- Move the criteria evaluator out so it's not restricted to being
applied on traced runs
1 year ago
Davis Chase 4fabd02d25
Add OpenLLM wrapper (#6578)
LLM wrapper for models served with OpenLLM

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: Chaoyu <paranoyang@gmail.com>
1 year ago
Davis Chase 3298bf4f00
docs/fix links (#6498) 1 year ago
Davis Chase d3c2eab0b3
Docs nit (#6350) 1 year ago
Davis Chase 6640293087
fix eval guide links (#6319) 1 year ago
Davis Chase 24b2af5218
nit (#6305) 1 year ago
Davis Chase 87e502c6bc
Doc refactor (#6300)
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago