Reflexion: Language Agents with Verbal Reinforcement Learning

Reflexion RL diagram

This repo holds the code, demos, and logs for the paper "Reflexion: Language Agents with Verbal Reinforcement Learning" by Noah Shinn, Federico Cassano, Beck Labash, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. Preprint, 2023.

Reflexion tasks

We release the LeetcodeHardGym here.

Another Note

Due to the nature of these experiments, it may not be feasible for individual developers to rerun the results, as access to GPT-4 is limited and its API charges are significant. All runs from the paper and additional results are logged in ./programming_runs/root for programming, ./alfworld_runs/root for decision-making, and ./hotpotqa_runs/root for reasoning. Programming runs can be validated with the scripts here and here, which check the Python and Rust solutions against the unit tests provided by their respective benchmarks.
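
As a rough illustration of what checking a logged Python solution against unit tests can look like, here is a minimal sketch. It is not the repository's validation script: the function name `passes_tests`, the timeout value, and the assumption that the tests are plain `assert` statements are all hypothetical. It runs the candidate in a separate interpreter process with a timeout, which guards against hangs but does not sandbox the code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_tests(solution: str, tests: str, timeout_s: float = 10.0) -> bool:
    """Return True if `tests` (plain assert statements) pass against `solution`.

    Hypothetical helper for illustration; not part of the Reflexion codebase.
    """
    # Concatenate the candidate solution and its tests into one script.
    program = solution + "\n\n" + tests + "\n"
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(program, encoding="utf-8")
        try:
            # Run in a fresh interpreter so a crash or hang cannot take down
            # the caller. This is NOT a security boundary.
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            # Treat a solution that does not finish in time as a failure.
            return False
    # A failed assert or any uncaught exception gives a nonzero exit code.
    return result.returncode == 0


if __name__ == "__main__":
    solution = "def add(a, b):\n    return a + b"
    tests = "assert add(1, 2) == 3\nassert add(-1, 1) == 0"
    print(passes_tests(solution, tests))
```

Because the candidate still runs with the caller's permissions, a check like this should only be run inside an isolated environment such as a container or a throwaway virtual machine.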

Warning

Please do not run the Reflexion programming agent in an insecure environment, as the generated code is not validated before execution.

Other Notes

Check out the code for the original draft here

Read the original blog here

Check out an interesting type-inference implementation here: OpenTau

If you have any questions, contact noahshinn024@gmail.com

Cite

@article{shinn2023reflexion,
  title={Reflexion: an autonomous agent with dynamic memory and self-reflection},
  author={Shinn, Noah and Labash, Beck and Gopinath, Ashwin},
  journal={arXiv preprint arXiv:2303.11366},
  year={2023}
}