Add Tree-of-Thought Prompting

Dave Hulbert 2023-06-18 15:56:50 +01:00 committed by GitHub
parent 391ae1795c
commit 5fa58789ff

@@ -31,3 +31,13 @@ Code available [here](https://github.com/princeton-nlp/tree-of-thought-llm) and
At a high level, the main ideas of [Yao et al. (2023)](https://arxiv.org/abs/2305.10601) and [Long (2023)](https://arxiv.org/abs/2305.08291) are similar. Both enhance an LLM's capability for complex problem solving through tree search via a multi-round conversation. One of the main differences is that [Yao et al. (2023)](https://arxiv.org/abs/2305.10601) leverages DFS/BFS/beam search, while the tree search strategy (i.e. when to backtrack, how many levels to backtrack, etc.) proposed in [Long (2023)](https://arxiv.org/abs/2305.08291) is driven by a "ToT Controller" trained through reinforcement learning. DFS/BFS/beam search are generic search strategies with no adaptation to specific problems. In comparison, a ToT Controller trained through RL might be able to learn from new data sets or through self-play (as AlphaGo did, versus brute-force search), and hence the RL-based ToT system can continue to evolve and learn new knowledge even with a fixed LLM.
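To make the generic tree search concrete, below is a minimal sketch of a BFS/beam-style search over intermediate thoughts. The `llm` callable, prompt wording, scoring scheme, and default parameters are assumptions for illustration only, not the implementation from either paper.

```python
def tot_bfs(problem, llm, steps=3, breadth=5, keep=2):
    """Beam-style breadth-first search over intermediate thoughts.

    `llm` is any callable that takes a prompt string and returns text;
    it stands in for whatever model API is actually used.
    """
    frontier = [problem]
    for _ in range(steps):
        candidates = []
        for state in frontier:
            for _ in range(breadth):
                # Ask the model to propose one more intermediate reasoning step.
                thought = llm(f"{state}\n\nPropose the next step of reasoning:")
                candidates.append(f"{state}\n{thought}")

        def score(candidate):
            # Ask the model to rate how promising a partial solution is.
            reply = llm(
                f"{candidate}\n\nRate how promising this reasoning is, "
                "from 0 to 10. Answer with a number only:"
            )
            try:
                return float(reply.strip().split()[0])
            except ValueError:
                return 0.0

        # Keep only the most promising partial solutions (pruning the tree).
        frontier = sorted(candidates, key=score, reverse=True)[:keep]
    return frontier[0]
```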
[Hulbert (2023)](https://github.com/dave1010/tree-of-thought-prompting) has proposed Tree-of-Thought Prompting, which applies the main concept from ToT frameworks as a simple prompting technique, getting the LLM to evaluate intermediate thoughts in a single prompt. A sample ToT prompt is:
```
Imagine three different experts are answering this question.
All experts will write down 1 step of their thinking,
then share it with the group.
Then all experts will go on to the next step, etc.
If any expert realises they're wrong at any point then they leave.
The question is...
```
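As a rough usage sketch, the ToT prompt is simply prepended to the actual question and sent as a single chat message. This assumes the 2023-era (pre-1.0) `openai` Python client, and the question below is an illustrative one, not taken from Hulbert's repo:

```python
import openai  # assumes the pre-1.0 (2023-era) openai Python client

TOT_PROMPT = """Imagine three different experts are answering this question.
All experts will write down 1 step of their thinking,
then share it with the group.
Then all experts will go on to the next step, etc.
If any expert realises they're wrong at any point then they leave.
The question is..."""

# Illustrative question for demonstration purposes only.
question = "I have 3 apples, eat one, then buy two more. How many apples do I have?"

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": f"{TOT_PROMPT} {question}"}],
)
print(response["choices"][0]["message"]["content"])
```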