Update tot.en.mdx

2024-11-02 15:40:13 +00:00 · 2023-06-14 17:58:42 -07:00 · 2023-06-14 17:58:42 -07:00 · 7932c65d1e
commit 7932c65d1e
parent 3c375604e1
1 changed files with 3 additions and 0 deletions
--- a/pages/techniques/tot.en.mdx
+++ b/pages/techniques/tot.en.mdx
@@ -28,3 +28,6 @@ From the results reported in the figure below, ToT substantially outperforms the
 Image Source: [Yao et el. (2023)](https://arxiv.org/abs/2305.10601) 

 Code available [here](https://github.com/princeton-nlp/tree-of-thought-llm) and [here](https://github.com/jieyilong/tree-of-thought-puzzle-solver)
+
+At a high level, the main ideas of [Yao et al. (2023)](https://arxiv.org/abs/2305.10601) and [Long (2023)](https://arxiv.org/abs/2305.08291) are similar. Both enhance an LLM's capability for complex problem solving through tree search via a multi-round conversation. One of the main differences is that [Yao et al. (2023)](https://arxiv.org/abs/2305.10601) leverages DFS/BFS/beam search, while the tree search strategy (i.e., when to backtrack and by how many levels) proposed in [Long (2023)](https://arxiv.org/abs/2305.08291) is driven by a "ToT Controller" trained through reinforcement learning. DFS/BFS/beam search are generic solution search strategies with no adaptation to specific problems. In comparison, a ToT Controller trained through RL might be able to learn from new data sets or through self-play (analogous to AlphaGo versus brute-force search), and hence the RL-based ToT system can continue to evolve and learn new knowledge even with a fixed LLM.
+
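
To make the "generic search strategy" point in the added paragraph concrete, here is a minimal sketch of beam search over a tree of partial solutions. It is not code from either paper or repository. The names `beam_search`, `propose`, `score`, and `is_goal` are hypothetical, and the toy number puzzle stands in for LLM calls: in a real ToT setup, `propose` and `score` would presumably be prompts to the model. The fixed "keep the top `beam_width` candidates" rule is the part that an RL-trained controller, as described for Long (2023), would replace with a learned decision.

```python
from typing import Callable, List, Optional, Tuple

# A "thought" here is a partial solution: the sequence of numbers visited so far.
State = Tuple[int, ...]


def beam_search(
    root: State,
    propose: Callable[[State], List[State]],
    score: Callable[[State], float],
    is_goal: Callable[[State], bool],
    beam_width: int = 2,
    max_depth: int = 6,
) -> Optional[State]:
    """Expand the tree level by level, keeping only the best `beam_width` states."""
    frontier = [root]
    for _ in range(max_depth):
        # Generate every child of every state currently in the beam.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            return None
        for candidate in candidates:
            if is_goal(candidate):
                return candidate
        # Fixed, problem-agnostic pruning rule: keep the highest-scoring states.
        frontier = sorted(candidates, key=score, reverse=True)[:beam_width]
    return None


# Toy stand-ins (assumptions for illustration only, not part of ToT itself).
TARGET = 22


def propose(state: State) -> List[State]:
    last = state[-1]
    return [state + (last + 3,), state + (last * 2,)]


def score(state: State) -> float:
    # Higher is better: states closer to the target score higher.
    return -abs(TARGET - state[-1])


def is_goal(state: State) -> bool:
    return state[-1] == TARGET


solution = beam_search((1,), propose, score, is_goal)
print(solution)
```

Because the pruning rule never changes, the search behaves the same way on every problem. Swapping the `sorted(...)[:beam_width]` line for a policy that decides when to backtrack, and by how many levels, is roughly where a trained controller would plug in.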