# LLM In-Context Recall is Prompt Dependent

import {Bleed} from 'nextra-theme-docs'

<iframe width="100%"
  height="415px"
  src="https://www.youtube.com/embed/2cNO76lIZ4s?si=tbbdo-vnr56YQ077" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
  allowFullScreen
  />

This new [paper by Machlab and Battle (2024)](https://arxiv.org/abs/2404.08865) analyzes the in-context recall performance of different LLMs using several needle-in-a-haystack tests.

It shows that LLMs differ in how well they recall facts at different context lengths and placement depths, and that a model's recall performance is significantly affected by small changes in the prompt.

!["Needle In the HayStack Performance"](../../img/research/haystack-performance.png)
*Source: [Machlab and Battle (2024)](https://arxiv.org/abs/2404.08865)*
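To make the test setup concrete, here is a minimal sketch of how a needle-in-a-haystack prompt might be assembled: a fact (the "needle") is placed at a chosen fractional depth inside filler text (the "haystack"), followed by a question about that fact. The function name `build_haystack_prompt`, the filler sentences, the needle, and the question are all hypothetical and chosen for illustration; this is not the paper's actual test harness.

```python
def build_haystack_prompt(filler_sentences, needle, depth, question):
    """Insert `needle` at a fractional `depth` of the filler text.

    depth=0.0 places the needle at the start, depth=1.0 at the end.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    # Convert the fractional depth into a sentence index.
    position = round(depth * len(filler_sentences))
    sentences = filler_sentences[:position] + [needle] + filler_sentences[position:]
    context = " ".join(sentences)
    return f"{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical filler, needle, and question (assumptions for illustration only).
filler = [f"Filler sentence number {i}." for i in range(10)]
needle = "The secret code is 7421."
prompt = build_haystack_prompt(
    filler, needle, depth=0.5, question="What is the secret code?"
)
print(prompt)
```

Varying the number of filler sentences changes the context length, and varying `depth` changes where the fact sits, which are the two axes shown in the figure above.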


In addition, the interplay between prompt content and training data can degrade the response quality.

A model's recall ability can be improved by increasing its size, enhancing the attention mechanism, trying different training strategies, and applying fine-tuning.

An important practical tip from the paper: "Continued evaluation will further inform the selection of LLMs for individual use cases, maximizing their impact and efficiency in real-world applications as the technology continues to evolve."

The takeaways from this paper are the importance of careful prompt design, establishing a continuous evaluation protocol, and testing different model enhancement strategies to improve recall and utility.
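One way to set up a simple, repeatable evaluation is to score recall over a grid of context lengths and needle depths. The sketch below is only an illustration under stated assumptions: `make_prompt`, `recall_grid`, and `stub_model` are hypothetical names, and `stub_model` is a toy stand-in that only "sees" the last 200 characters of the prompt. In practice you would replace it with a call to the model you are evaluating.

```python
import re
from itertools import product


def make_prompt(n_filler, depth, needle, question):
    """Place `needle` at fractional `depth` inside `n_filler` filler sentences."""
    filler = [f"Filler sentence number {i}." for i in range(n_filler)]
    pos = round(depth * n_filler)
    return " ".join(filler[:pos] + [needle] + filler[pos:]) + f"\n\n{question}"


def recall_grid(model, lengths, depths, needle, question, expected):
    """Map each (length, depth) pair to True if the answer contains `expected`."""
    return {
        (n, d): expected in model(make_prompt(n, d, needle, question))
        for n, d in product(lengths, depths)
    }


def stub_model(prompt, window=200):
    """Toy stand-in for an LLM call: it only 'sees' the last `window` characters."""
    match = re.search(r"secret code is (\d+)", prompt[-window:])
    return match.group(1) if match else "I don't know."


# Hypothetical needle, question, and grid values (assumptions for illustration).
results = recall_grid(
    stub_model,
    lengths=[5, 50],
    depths=[0.0, 1.0],
    needle="The secret code is 7421.",
    question="What is the secret code?",
    expected="7421",
)
for (length, depth), recalled in sorted(results.items()):
    print(f"length={length:>3} depth={depth:.1f} recalled={recalled}")
```

Re-running the same grid whenever the prompt wording or the model changes gives a direct view of how sensitive recall is to those changes.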