forked from Archives/langchain
31303d0b11
This PR adds evaluation metrics for data-augmented QA, producing a report like this at the end of the notebook: ![Screen Shot 2023-03-08 at 8 53 23 AM](https://user-images.githubusercontent.com/398875/223731199-8eb8e77f-5ff3-40a2-a23e-f3bede623344.png) The score calculation is based on the [Critique](https://docs.inspiredco.ai/critique/) toolkit, an API-based toolkit (like OpenAI) with minimal dependencies, so it should be easy for people to run if they choose. The code could be simplified further by adding a chain that calls Critique directly, but that is probably best saved for another PR if necessary. Any comments or change requests are welcome! |
1 year ago | |
---|---|---|
evaluation | 1 year ago | |
agents.md | 1 year ago | |
chatbots.md | 1 year ago | |
combine_docs.md | 1 year ago | |
evaluation.rst | 1 year ago | |
generate_examples.ipynb | 1 year ago | |
model_laboratory.ipynb | 1 year ago | |
question_answering.md | 1 year ago | |
summarization.md | 1 year ago |