## LLaMA: Open and Efficient Foundation Language Models

<Callout emoji="⚠️">
  This section is under heavy development.
</Callout>


import {Screenshot} from 'components/screenshot'
import { Callout, FileTree } from 'nextra-theme-docs'
import LLAMA1 from '../../img/llama-1.png'


## What's new?

This paper introduces a collection of foundation language models ranging from 7B to 65B parameters. 

The models are trained on trillions of tokens using only publicly available datasets.

The work by [(Hoffmann et al. 2022)](https://arxiv.org/abs/2203.15556) shows that, for a given compute budget, smaller models trained on much more data can outperform their larger counterparts. Hoffmann et al. recommend training a 10B model on 200B tokens. However, the LLaMA paper finds that the performance of a 7B model continues to improve even after 1T tokens.

<Screenshot src={LLAMA1} alt="LLAMA1" />

This work focuses on training models (LLaMA) that achieve the best possible performance at various inference budgets, by training on more tokens. 
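
To make the trade-off concrete, here is a rough back-of-the-envelope sketch comparing the 10B-on-200B-tokens recommendation with the 7B-on-1T-tokens setting mentioned above. It assumes the commonly used approximation that training a dense transformer costs about `6 * parameters * tokens` FLOPs. That rule of thumb is an assumption brought in for illustration and is not taken from the LLaMA paper; the function names are hypothetical.

```python
# Rough training-compute comparison.
# ASSUMPTION: training FLOPs ~= 6 * parameters * tokens (a common rule of thumb
# for dense transformers; not a figure from the LLaMA paper).

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6.0 * params * tokens


def tokens_per_param(params: float, tokens: float) -> float:
    """Number of training tokens seen per model parameter."""
    return tokens / params


# (parameters, training tokens) for the two settings discussed in the text.
configs = {
    "10B model, 200B tokens": (10e9, 200e9),
    "7B model, 1T tokens": (7e9, 1e12),
}

for name, (n_params, n_tokens) in configs.items():
    flops = training_flops(n_params, n_tokens)
    ratio = tokens_per_param(n_params, n_tokens)
    print(f"{name}: ~{flops:.2e} training FLOPs, ~{ratio:.0f} tokens per parameter")
```

Under this approximation the 7B run costs roughly 3.5x more training compute (about 4.2e22 versus 1.2e22 FLOPs) and sees about 143 tokens per parameter instead of 20. The extra cost is paid once at training time, while the smaller model is cheaper every time it is served, which is the inference-budget argument the paper makes.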


## Capabilities & Key Results

Overall, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks despite being 10x smaller and small enough to run on a single GPU. LLaMA-65B is competitive with models like Chinchilla-70B and PaLM-540B.
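
As a rough illustration of why model size matters for single-GPU inference, the sketch below estimates the memory needed just to hold the weights. It assumes nominal parameter counts and a fixed number of bytes per parameter, and it ignores activations, the KV cache, and framework overhead, so real requirements are higher. The helper name and the dtype table are hypothetical and not part of the LLaMA codebase.

```python
# Weight-only memory estimate.
# ASSUMPTIONS: nominal parameter counts, fixed bytes per parameter, and no
# accounting for activations, KV cache, or framework overhead.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Memory in GB (1e9 bytes) needed to store the model weights alone."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


models = {
    "LLaMA-13B": 13e9,
    "GPT-3 (175B)": 175e9,
}

for name, n_params in models.items():
    print(f"{name}: ~{weight_memory_gb(n_params, 'fp16'):.0f} GB of weights in fp16")
```

With 16-bit weights, a 13B model needs about 26 GB for its parameters, while a 175B model needs about 350 GB, which is far beyond the memory of a single accelerator.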


*Paper:* [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)

*Code:* https://github.com/facebookresearch/llama

## References

- [LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention](https://arxiv.org/abs/2303.16199) (March 2023)
- [GPT4All](https://github.com/nomic-ai/gpt4all) (March 2023)
- [ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge](https://arxiv.org/abs/2303.14070) (March 2023)
- [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) (March 2023)