# Reflexion: Language Agents with Verbal Reinforcement Learning
![Reflexion RL diagram](./figures/reflexion_rl.pdf)

This repo holds the code, demos, and logs for: [Reflexion: Language Agents with Verbal Reinforcement Learning. Noah Shinn, Federico Cassano, Beck Labash, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao. _Preprint_, 2023](https://arxiv.org/abs/2303.11366)

![Reflexion tasks](./figures/reflexion_tasks.pdf)

We release the LeetcodeHardGym [here](https://github.com/GammaTauAI/leetcode-hard-gym).
### Another Note
Rerunning these experiments may not be feasible for individual developers, since GPT-4 access is limited and the API charges are significant. All runs from the paper, along with additional results, are logged in `./programming_runs/root` for programming, `./alfworld_runs/root` for decision-making, and `./hotpotqa_runs/root` for reasoning. Programming runs can be checked with the validation scripts for [Python](https://github.com/noahshinn024/reflexion/blob/main/programming/validate_py_results.py) and [Rust](https://github.com/noahshinn024/reflexion/blob/main/programming/validate_rs_results.py), which run the logged solutions against the unit tests provided by their respective benchmarks.
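
As a rough illustration of what validating a logged programming run involves, the sketch below reads a JSONL log and reruns each recorded solution against its tests in a separate Python process. The record fields (`solution`, `test`) and the one-record-per-line layout are assumptions made for this example; the linked validation scripts remain the authoritative reference for the actual log format.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_tests(solution: str, test: str, timeout_s: float = 10.0) -> bool:
    """Run solution + test code in a fresh interpreter; True if it exits with 0."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(solution + "\n\n" + test + "\n", encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            # A solution that hangs counts as a failure.
            return False
        return proc.returncode == 0


def validate_log(log_path: str) -> tuple[int, int]:
    """Return (passed, total) over every record in a JSONL log.

    Assumes each non-empty line is a JSON object with hypothetical
    "solution" and "test" fields holding Python source code.
    """
    passed = total = 0
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            record = json.loads(line)
            total += 1
            if passes_tests(record["solution"], record["test"]):
                passed += 1
    return passed, total
```

Calling `validate_log("path/to/run.jsonl")` on a log in that assumed layout would return the number of passing solutions and the total. Because this still executes model-generated code, it should only be run inside an isolated environment.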
### Warning
Please do not run the Reflexion programming agent in an insecure environment: the generated code is executed without being validated first.
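
One common way to isolate untrusted code is to execute it in a container with no network access and capped resources. The sketch below only assembles a `docker run` command for a single generated script; the function name, image, mount point, and limits are illustrative assumptions, not settings used by this repository. Note that the agent itself needs API access, so a no-network container like this fits the execution of candidate scripts rather than the whole agent.

```python
from pathlib import Path


def build_sandbox_command(
    workdir: str,
    script: str,
    image: str = "python:3.11-slim",  # assumed image, not prescribed by the repo
    memory: str = "512m",
    cpus: str = "1",
) -> list[str]:
    """Build a `docker run` command that executes `script` in a locked-down container."""
    host_dir = Path(workdir).resolve()
    return [
        "docker", "run",
        "--rm",                              # remove the container when it exits
        "--network", "none",                 # no network access
        "--memory", memory,                  # cap memory usage
        "--cpus", cpus,                      # cap CPU usage
        "--pids-limit", "128",               # limit the number of processes
        "--read-only",                       # read-only root filesystem
        "--tmpfs", "/tmp",                   # scratch space that vanishes on exit
        "--volume", f"{host_dir}:/work:ro",  # mount the code read-only
        "--workdir", "/work",
        image,
        "python", script,
    ]
```

The returned list can be passed to `subprocess.run` together with a timeout. A container reduces the blast radius of bad generated code but is not a complete security boundary, so a disposable virtual machine may be preferable for stronger isolation.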
### Other Notes

Check out the code for the original draft [here](https://github.com/noahshinn024/reflexion-draft).

Read the original blog [here](https://nanothoughts.substack.com/p/reflecting-on-reflexion).

Check out an interesting type-inference implementation here: [OpenTau](https://github.com/GammaTauAI/opentau).

If you have any questions, contact [noahshinn024@gmail.com](mailto:noahshinn024@gmail.com).

### Cite
```bibtex
@article{shinn2023reflexion,
title={Reflexion: Language Agents with Verbal Reinforcement Learning},
author={Shinn, Noah and Cassano, Federico and Labash, Beck and Gopinath, Ashwin and Narasimhan, Karthik and Yao, Shunyu},
journal={arXiv preprint arXiv:2303.11366},
year={2023}
}
```