rag-faithfulness

pull/8/head
Elvis Saravia 3 weeks ago
parent ac2b46c623
commit 8dcc7bffd6

Binary file not shown.


@ -1,5 +1,7 @@
# Llama 3
import {Bleed} from 'nextra-theme-docs'
Meta recently [introduced](https://llama.meta.com/llama3/) their new family of large language models (LLMs) called Llama 3. This release includes pre-trained and instruction-tuned models with 8B and 70B parameters.
## Llama 3 Architecture Details
@ -29,4 +31,19 @@ The pretrained models also outperform other models on several benchmarks like AG
Meta also reported that they will be releasing a 400B parameter model, which is still training and coming soon! There are also efforts around multimodal support, multilingual capabilities, and longer context windows in the pipeline. The current checkpoint for Llama 3 400B (as of April 15, 2024) produces the following results on common benchmarks like MMLU and Big-Bench Hard:
The licensing information for the Llama 3 models can be found on the [model card](https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md).
!["Llama 3 400B"](../../img/llama3/llama-400b.png)
*Source: [Meta AI](https://ai.meta.com/blog/meta-llama-3/)*
The licensing information for the Llama 3 models can be found on the [model card](https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md).
## Extended Review of Llama 3
Here is a longer review of Llama 3:
<Bleed>
<iframe width="100%"
height="415px"
src="https://www.youtube.com/embed/h2aEmciRd6U?si=m7-xXu5IWpB-6mE0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowFullScreen
/>
</Bleed>

@ -2,6 +2,7 @@
"llm-agents": "LLM Agents",
"rag": "RAG for LLMs",
"llm-reasoning": "LLM Reasoning",
"rag-faithfulness": "RAG Faithfulness",
"llm-recall": "LLM In-Context Recall",
"rag_hallucinations": "RAG Reduces Hallucination",
"synthetic_data": "Synthetic Data",

@ -0,0 +1,26 @@
# How Faithful are RAG Models?
import {Bleed} from 'nextra-theme-docs'
<Bleed>
<iframe width="100%"
height="415px"
src="https://www.youtube.com/embed/eEU1dWVE8QQ?si=b-qgCU8nibBCSX8H" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowFullScreen
/>
</Bleed>
This new paper by [Wu et al. (2024)](https://arxiv.org/abs/2404.10198) aims to quantify the tug-of-war between RAG and LLMs' internal prior.
The analysis focuses on GPT-4 and other LLMs on question answering tasks.
It finds that providing correct retrieved information fixes most of the model mistakes (94% accuracy).
!["RAG Faithfulness"](../../img/research/rag-faith.png)
*Source: [Wu et al. (2024)](https://arxiv.org/abs/2404.10198)*
When the documents contain more incorrect values and the LLM's internal prior is weak, the LLM is more likely to recite incorrect information. However, the LLMs are found to be more resistant when they have a stronger prior.
The paper also reports that "the more the modified information deviates from the model's prior, the less likely the model is to prefer it."
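As a rough illustration of the trend quoted above, the sketch below buckets question-answer records by how far the value in the retrieved document deviates from the model's prior answer, then computes how often the model's final answer follows the document. This is not the paper's code or protocol: the function name `context_preference_by_deviation`, the record fields (`prior_value`, `context_value`, `answer_with_context`), the relative-deviation measure, and the bucket edges are all assumptions made for illustration, and the records are made-up toy values rather than results from the paper.

```python
from collections import defaultdict


def context_preference_by_deviation(records, bucket_edges):
    """Per deviation bucket, return the fraction of answers that follow the context.

    Hypothetical record fields (assumed for this sketch):
      prior_value         -- numeric answer the model gives without any document
      context_value       -- (possibly modified) numeric value in the document
      answer_with_context -- numeric answer the model gives with the document
    """
    followed = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        prior = rec["prior_value"]
        if prior == 0:
            # Relative deviation is undefined for a zero prior; skip the record.
            continue
        # Relative distance between the document value and the model's prior.
        deviation = abs(rec["context_value"] - prior) / abs(prior)
        # Use the first edge that bounds the deviation; overflow goes to +inf.
        bucket = next((e for e in bucket_edges if deviation <= e), float("inf"))
        total[bucket] += 1
        if rec["answer_with_context"] == rec["context_value"]:
            followed[bucket] += 1
    return {b: followed[b] / total[b] for b in sorted(total)}


# Made-up toy records, only to show the call shape.
toy_records = [
    {"prior_value": 10.0, "context_value": 11.0, "answer_with_context": 11.0},
    {"prior_value": 10.0, "context_value": 12.0, "answer_with_context": 10.0},
    {"prior_value": 10.0, "context_value": 30.0, "answer_with_context": 10.0},
]
rates = context_preference_by_deviation(toy_records, bucket_edges=[0.5, 1.0])
print(rates)
```

On real evaluation logs, a preference rate that falls as the bucket edge grows would be consistent with the trend the paper describes. Exact-match comparison is a simplification; free-text answers would need a different matching step.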
Many developers and companies are using RAG systems in production. This work highlights the importance of assessing risks when using LLMs given different kinds of contextual information that may contain supporting, contradicting, or completely incorrect information.