mirror of https://github.com/GammaTauAI/reflexion-human-eval synced 2024-11-11 19:10:53 +00:00

Noah Shinn 5269ef4ae0 start v2		2023-05-21 15:52:27 +02:00
alfworld_runs	start v2	2023-05-21 15:48:05 +02:00
benchmarks	rs hardest 50 results	2023-04-18 17:45:36 -04:00
figures	start v2	2023-05-21 15:52:27 +02:00
programming_runs	reinit submodules	2023-05-21 15:51:39 +02:00
webshop_runs	alfworld and webshop	2023-05-21 15:35:36 +02:00
.gitignore	Leetcode Hard: Python3 Benchmark	2023-04-06 01:39:31 -04:00
.gitmodules	reinit submodules	2023-05-21 15:51:39 +02:00
LICENSE	Initial commit	2023-03-22 02:38:53 -04:00
README.md	start v2	2023-05-21 15:52:27 +02:00

Reflexion: Language Agents with Verbal Reinforcement Learning

This repo holds the code, demos, and logs for the paper "Reflexion: Language Agents with Verbal Reinforcement Learning" by Noah Shinn, Federico Cassano, Beck Labash, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao (preprint, 2023).

We release the LeetcodeHardGym here

Another Note

Individual developers may not be able to rerun these experiments, because access to GPT-4 is limited and the API charges are significant. All runs from the paper, along with additional results, are logged in ./programming_runs/root for programming, ./alfworld_runs/root for decision-making, and ./hotpotqa_runs/root for reasoning. Programming runs can be validated with the scripts here and here, which check the Python and Rust solutions against the unit tests provided by their respective benchmarks.
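As a rough illustration of what validating a logged Python solution against benchmark unit tests can look like, here is a minimal sketch. It is not the validation script shipped with this repo: the function name `passes_tests`, its string arguments, and the timeout value are all assumptions made for this example, and the format of the log files is not modeled. The candidate code runs in a separate Python process with a timeout, which guards against crashes and infinite loops but is not a sandbox, so it should still be run inside a container or VM.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_tests(solution: str, tests: str, timeout_s: float = 10.0) -> bool:
    """Return True if `tests` (plain assert statements) pass against `solution`.

    Hypothetical helper: both strings are written to one temporary script and
    executed in a child Python process. A non-zero exit code or a timeout
    counts as a failure.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(solution + "\n\n" + tests + "\n", encoding="utf-8")
        try:
            result = subprocess.run(
                # -I runs the interpreter in isolated mode (ignores user
                # site-packages and PYTHON* environment variables).
                [sys.executable, "-I", str(script)],
                capture_output=True,
                timeout=timeout_s,
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            return False
    return result.returncode == 0


if __name__ == "__main__":
    # Toy solution and tests, standing in for one logged benchmark item.
    solution = "def add(a, b):\n    return a + b\n"
    tests = "assert add(1, 2) == 3\nassert add(-1, 1) == 0\n"
    print(passes_tests(solution, tests))
```

Treating a timeout as a failure keeps a non-terminating solution from stalling the whole validation pass.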

Warning

Please do not run the Reflexion programming agent in an insecure environment, as the generated code is not validated before execution.

Other Notes

Check out the code for the original draft here.

Read the original blog here.

Check out an interesting type-inference implementation here: OpenTau.

If you have any questions, contact noahshinn024@gmail.com.

Cite

@article{shinn2023reflexion,
  title={Reflexion: an autonomous agent with dynamic memory and self-reflection},
  author={Shinn, Noah and Labash, Beck and Gopinath, Ashwin},
  journal={arXiv preprint arXiv:2303.11366},
  year={2023}
}