thoughtsculpt

pull/454/head
Elvis Saravia 1 month ago
parent 228304392e
commit 738aa55d4b

Two binary image files added (86 KiB and 140 KiB); previews not shown.

@@ -2,6 +2,8 @@
"llm-agents": "LLM Agents",
"rag": "RAG for LLMs",
"llm-reasoning": "LLM Reasoning",
"thoughtsculpt": "ThoughtSculpt",
"infini-attention": "Infini-Attention",
"guided-cot": "LM-Guided CoT",
"trustworthiness-in-llms": "Trustworthiness in LLMs",
"llm-tokenization": "LLM Tokenization",

@@ -0,0 +1,27 @@
# Efficient Infinite Context Transformers
import {Bleed} from 'nextra-theme-docs'
<Bleed>
<iframe width="100%"
height="415px"
src="https://www.youtube.com/embed/tOaTaQ8ZGRo?si=pFP-KiLe63Ppl9Pd" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowFullScreen
/>
</Bleed>
A new [paper](https://arxiv.org/abs/2404.07143) by Google integrates compressive memory into a vanilla dot-product attention layer.
The goal is to enable Transformer LLMs to effectively process infinitely long inputs with bounded memory footprint and computation.
They propose a new attention technique called Infini-attention, which incorporates a compressive memory module into a vanilla attention mechanism.
!["Infini-Attention"](../../img/research/infini-attention.png)
It builds both masked local attention and long-term linear attention into a single Transformer block. This allows the Infini-Transformer model to efficiently handle both long- and short-range contextual dependencies.
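To make the mechanism more concrete, here is a minimal, single-head PyTorch sketch of the idea as I read it from the paper (not the authors' code): causal dot-product attention runs locally within each segment, a linear compressive memory (a small matrix plus a normalization vector) is queried with ELU-based features, a learned gate blends the two outputs, and the memory is then updated with the segment's keys and values. All class and variable names are illustrative.

```python
import torch
import torch.nn.functional as F

class InfiniAttentionHead(torch.nn.Module):
    """Single attention head combining local causal attention with a
    compressive memory read, in the spirit of Infini-attention."""

    def __init__(self, d_model: int, d_head: int):
        super().__init__()
        self.q = torch.nn.Linear(d_model, d_head, bias=False)
        self.k = torch.nn.Linear(d_model, d_head, bias=False)
        self.v = torch.nn.Linear(d_model, d_head, bias=False)
        self.gate = torch.nn.Parameter(torch.zeros(1))  # learned blending scalar
        self.d_head = d_head

    def forward(self, segment, memory=None, norm=None):
        # segment: (batch, seg_len, d_model)
        q, k, v = self.q(segment), self.k(segment), self.v(segment)

        # 1) Masked (causal) dot-product attention within the segment.
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5
        causal = torch.triu(torch.ones_like(scores), diagonal=1).bool()
        local = torch.softmax(scores.masked_fill(causal, float("-inf")), -1) @ v

        # 2) Read long-term context from the compressive memory
        #    (a d_head x d_head matrix plus a normalization vector).
        if memory is None:
            memory = q.new_zeros(q.size(0), self.d_head, self.d_head)
            norm = q.new_zeros(q.size(0), self.d_head)
        sigma_q = F.elu(q) + 1.0
        retrieved = (sigma_q @ memory) / (sigma_q @ norm.unsqueeze(-1) + 1e-6)

        # 3) Blend memory-based and local attention with a learned gate.
        g = torch.sigmoid(self.gate)
        out = g * retrieved + (1.0 - g) * local

        # 4) Update the memory with this segment's keys and values
        #    (simple associative update; the paper also describes a delta-rule variant).
        sigma_k = F.elu(k) + 1.0
        memory = memory + sigma_k.transpose(-2, -1) @ v
        norm = norm + sigma_k.sum(dim=1)
        return out, memory, norm
```

Processing a long input then amounts to splitting it into fixed-size segments and threading `memory` and `norm` through successive calls, which is what keeps per-segment compute and memory bounded regardless of total sequence length.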
This approach outperforms baseline models on long-context language modeling while achieving a 114x memory compression ratio!
They also show that a 1B LLM can naturally scale to a 1M sequence length and that an 8B model achieves a new SoTA result on a 500K-length book summarization task.
Given how important long-context LLMs are becoming, an effective memory system could unlock powerful reasoning, planning, continual adaptation, and capabilities not previously seen in LLMs.

@@ -0,0 +1,27 @@
# Reasoning with Intermediate Revision and Search for LLMs
import {Bleed} from 'nextra-theme-docs'
<Bleed>
<iframe width="100%"
height="415px"
src="https://www.youtube.com/embed/13fr5m6ezOM?si=DH3XYfzbMsg9aeIx" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowFullScreen
/>
</Bleed>
This work by Chi et al. (2024) presents an approach for general reasoning and search on tasks that can be decomposed into components.
The proposed graph-based framework, THOUGHTSCULPT, incorporates iterative self-revision capabilities and allows an LLM to build an interwoven network of thoughts.
Unlike approaches such as Tree-of-Thoughts, which shape the reasoning process with a tree, this new approach uses Monte Carlo Tree Search (MCTS) to efficiently navigate the search space.
The method uses an LLM-powered thought evaluator to provide feedback on candidate partial outputs. A thought generator component then produces potential solutions. Together, the thought evaluator and thought generator make up the expansion phase, which helps refine the current solution.
!["ThoughtSculpt"](../../img/research/thoughtsculpt.png)
Finally, the decision simulator (which acts as part of the MCTS process) simulates consecutive lines of thought to evaluate the potential value of a path.
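To illustrate how these components could fit together, below is a minimal sketch of an MCTS loop in this style. It is not the paper's implementation: `generate_revisions` stands in for the LLM-powered thought generator, `score_solution` for the thought evaluator, and the short rollout plays the role of the decision simulator; all names are hypothetical.

```python
import math
import random

class Node:
    """A candidate solution in the search graph."""
    def __init__(self, solution, parent=None):
        self.solution, self.parent = solution, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    # Standard UCB1 score used by MCTS to balance exploration and exploitation.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def thoughtsculpt_style_search(initial_solution, generate_revisions,
                               score_solution, iterations=50, rollout_depth=3):
    root = Node(initial_solution)
    for _ in range(iterations):
        # Selection: descend by UCB until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=ucb)

        # Expansion: the thought generator proposes revised candidate solutions.
        for revision in generate_revisions(node.solution):
            node.children.append(Node(revision, parent=node))
        if node.children:
            node = random.choice(node.children)

        # Simulation: play out consecutive revisions (the decision simulator's role)
        # and let the thought evaluator score the resulting solution.
        rollout = node.solution
        for _ in range(rollout_depth):
            candidates = generate_revisions(rollout)
            if not candidates:
                break
            rollout = random.choice(candidates)
        reward = score_solution(rollout)

        # Backpropagation: push the evaluator's feedback up the path.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent

    best = max(root.children, key=lambda n: n.visits) if root.children else root
    return best.solution
```

In practice, `generate_revisions` and `score_solution` would wrap prompted LLM calls, and it is the evaluator's feedback, propagated back through the tree, that steers the search toward promising revisions.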
Thanks to its capacity for continuous thought iteration, THOUGHTSCULPT is particularly well suited to tasks such as open-ended generation, multi-step reasoning, and creative ideation.
We will likely see more advanced approaches that use similar concepts and search algorithms to elevate the reasoning capabilities of LLMs and their ability to tackle problems that require complex reasoning and planning. This is a great paper for keeping track of this research trend.