Reflexion: Language Agents with Verbal Reinforcement Learning

This repo holds the code, demos, and logs for the Reflexion paper (v2 not out yet): Reflexion: Language Agents with Verbal Reinforcement Learning. Noah Shinn, Federico Cassano, Beck Labash, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao. Preprint, 2023

We release the LeetcodeHardGym here

Note

decision-making: ./alfworld_runs and ./webshop_runs
programming: v2 not released yet, to be cleaned soon
reasoning: ./hotpotqa_runs

To Run: decision-making (AlfWorld)

Clone this repo and move to the AlfWorld directory

git clone https://github.com/noahshinn024/reflexion && cd ./reflexion/alfworld_runs

Specify the run parameters in ./run_reflexion.sh:

num_trials: number of iterative learning steps
num_envs: number of task-environment pairs per trial
run_name: the name for this run
use_memory: use persistent memory to store self-reflections (turn off for a baseline run)
is_resume: use a logging directory to resume a previous run
resume_dir: the logging directory from which to resume the previous run
start_trial_num: when resuming a run, the trial number at which to start
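As a rough illustration of how these parameters relate to each other, here is a minimal Python sketch of a command-line parser that accepts them. The flag names mirror the parameter names above, but the functions `build_parser` and `parse_run_args`, the defaults, and the resume check are assumptions made for illustration. The script that ./run_reflexion.sh actually invokes may define and validate them differently.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the run parameters described above."""
    parser = argparse.ArgumentParser(description="Sketch of Reflexion run parameters")
    parser.add_argument("--num_trials", type=int, required=True,
                        help="number of iterative learning steps")
    parser.add_argument("--num_envs", type=int, required=True,
                        help="number of task-environment pairs per trial")
    parser.add_argument("--run_name", type=str, required=True,
                        help="the name for this run")
    parser.add_argument("--use_memory", action="store_true",
                        help="store self-reflections across trials (omit for a baseline run)")
    parser.add_argument("--is_resume", action="store_true",
                        help="resume a previous run from a logging directory")
    parser.add_argument("--resume_dir", type=str, default="",
                        help="logging directory of the run to resume")
    parser.add_argument("--start_trial_num", type=int, default=0,
                        help="trial number at which a resumed run starts")
    return parser


def parse_run_args(argv: List[str]) -> argparse.Namespace:
    """Parse arguments and apply an assumed consistency check for resumed runs."""
    args = build_parser().parse_args(argv)
    # Assumption: resuming only makes sense when a logging directory is given.
    if args.is_resume and not args.resume_dir:
        raise ValueError("--resume_dir is required when --is_resume is set")
    return args
```

Under these assumptions, a baseline run simply omits `--use_memory`, and a resumed run passes `--is_resume`, `--resume_dir`, and `--start_trial_num` together.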

Run the trial

./run_reflexion.sh

The logs will be sent to ./root/<run_name>.

To Run: reasoning (HotPotQA)

Clone this repo and move to the HotPotQA directory

git clone https://github.com/noahshinn024/reflexion && cd ./reflexion/hotpotqa_runs

Another Note

Due to the nature of these experiments, it may not be feasible for individual developers to reproduce the results, since GPT-4 access is limited and API charges are significant. All runs from the paper and additional results are logged in ./alfworld_runs/root for decision-making and ./hotpotqa_runs/root for reasoning.
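To browse the logged runs without rerunning anything, a small helper along these lines may be handy. It is only a sketch: the function name `list_logged_runs` is hypothetical, and it assumes each run is stored as a subdirectory of the log root (matching the ./root/<run_name> layout mentioned above). The internal format of the log files is not assumed.

```python
from pathlib import Path
from typing import List


def list_logged_runs(root: Path) -> List[str]:
    """Return the sorted names of run directories under a log root.

    Assumes one subdirectory per run; returns an empty list if the root is missing.
    """
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())


if __name__ == "__main__":
    # Paths are relative to the repository root.
    for log_root in (Path("./alfworld_runs/root"), Path("./hotpotqa_runs/root")):
        print(log_root, list_logged_runs(log_root))
```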

Other Notes

Check out the code for the original draft here

Read the original blog here

Check out an interesting type-inference implementation here: OpenTau

For all questions, contact noahshinn024@gmail.com

Cite

@article{shinn2023reflexion,
  title={Reflexion: an autonomous agent with dynamic memory and self-reflection},
  author={Shinn, Noah and Labash, Beck and Gopinath, Ashwin},
  journal={arXiv preprint arXiv:2303.11366},
  year={2023}
}