mixtral 8x22b

pull/422/merge
Elvis Saravia 1 month ago
parent ebb1ff94db
commit 83799613db

Binary file not shown (new image, 80 KiB).

Binary file not shown (new image, 100 KiB).

Binary file not shown (new image, 98 KiB).

@@ -13,6 +13,7 @@
"mistral-7b": "Mistral 7B",
"mistral-large": "Mistral Large",
"mixtral": "Mixtral",
"mixtral-8x22b": "Mixtral 8x22B",
"olmo": "OLMo",
"phi-2": "Phi-2",
"sora": "Sora",

@@ -0,0 +1,28 @@
# Mixtral 8x22B
Mixtral 8x22B is a new open large language model (LLM) released by Mistral AI. It is a sparse mixture-of-experts (SMoE) model that uses 39B active parameters out of a total of 141B parameters.
## Capabilities
Mixtral 8x22B is trained to be a cost-efficient model with capabilities that include multilingual understanding, math reasoning, code generation, native function calling support, and constrained output support. The model supports a context window of 64K tokens, which enables high-performing information recall on large documents.
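As a quick illustration of how the model can be queried, the snippet below is a minimal sketch using the `mistralai` Python client. The model identifier `open-mixtral-8x22b` and the v0.x client interface shown here are assumptions based on Mistral's hosted API around the time of release; consult the Mistral documentation for the current interface.

```python
import os

from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

# Assumes a MISTRAL_API_KEY environment variable and the v0.x mistralai client.
client = MistralClient(api_key=os.environ["MISTRAL_API_KEY"])

# "open-mixtral-8x22b" is the assumed API identifier for Mixtral 8x22B.
response = client.chat(
    model="open-mixtral-8x22b",
    messages=[
        ChatMessage(
            role="user",
            content="Summarize the key idea behind sparse mixture-of-experts models in two sentences.",
        )
    ],
)

print(response.choices[0].message.content)
```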
Mistral AI claims that Mixtral 8x22B delivers one of the best performance-to-cost ratios among community models and that it is notably fast due to its sparse activations.
!["Mixtral 8x22B Performance"](../../img/mixtral/mixtral-8-cost.png)
*Source: [Mistral AI Blog](https://mistral.ai/news/mixtral-8x22b/)*
## Results
According to the [official reported results](https://mistral.ai/news/mixtral-8x22b/), Mixtral 8x22B (with 39B active parameters) outperforms state-of-the-art open models such as Command R+ and Llama 2 70B on several reasoning and knowledge benchmarks, including MMLU, HellaSwag, TriviaQA, and Natural Questions.
!["Mixtral 8x22B Reasoning and Knowledge Performance"](../../img/mixtral/mixtral-8-reasoning.png)
*Source: [Mistral AI Blog](https://mistral.ai/news/mixtral-8x22b/)*
Mixtral 8x22B outperforms all open models on coding and math tasks when evaluated on benchmarks such as GSM8K, HumanEval, and MATH. Mixtral 8x22B Instruct is reported to achieve a score of 90% on GSM8K (maj@8).
!["Mixtral 8x22B Reasoning and Knowledge Performance"](../../img/mixtral/mixtral-8-maths.png)
*Source: [Mistral AI Blog](https://mistral.ai/news/mixtral-8x22b/)*
More information on Mixtral 8x22B and how to use it is available here: https://docs.mistral.ai/getting-started/open_weight_models/#operation/listModels
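Since native function calling is one of the advertised capabilities, the sketch below shows roughly how a tool schema could be passed to the model through the same client. The OpenAI-style tool schema and the `tools`/`tool_choice` parameters are assumptions to be checked against the documentation above, and `get_weather` is a hypothetical function used only for illustration.

```python
import os

from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

client = MistralClient(api_key=os.environ["MISTRAL_API_KEY"])

# Hypothetical tool definition (JSON Schema parameters), used only to
# illustrate the native function calling support mentioned above.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat(
    model="open-mixtral-8x22b",
    messages=[ChatMessage(role="user", content="What's the weather in Paris right now?")],
    tools=tools,
    tool_choice="auto",
)

# If the model decides to call the tool, the call (name + JSON arguments)
# is returned instead of a plain text answer.
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)
else:
    print(response.choices[0].message.content)
```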
The model is released under the Apache 2.0 license.