Mixtral
Mistral's sparse mixture-of-experts open model — GPT-3.5 quality at a fraction of the compute cost.
Overview
Mixtral 8x7B is Mistral AI's sparse mixture-of-experts language model. It reached GPT-3.5-class benchmark performance while activating only about 13 billion of its roughly 47 billion parameters for each token. Released in December 2023, it showed that a mixture-of-experts architecture could match much larger dense models at substantially lower compute per token, which led to wide adoption in cost-sensitive production deployments.
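In each transformer layer, a router sends every token to 2 of 8 expert feed-forward networks and combines their outputs, so only a fraction of the weights do work for any given token. The experts are not hand-assigned to topics; the router learns which experts to use during training. All experts still have to be held in memory, but per-token compute stays close to that of a much smaller dense model. Mistral reports particularly strong results on coding, mathematics, and multilingual tasks.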
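The routing step described above can be sketched in a few lines of plain Python. This is a toy illustration, not Mistral's implementation: the function names (`top2_route`, `moe_layer`), the example router scores, and the scalar "experts" are all hypothetical, and a real model would use learned feed-forward networks operating on tensors.

```python
import math

def top2_route(gate_logits):
    """Return [(expert_index, weight), (expert_index, weight)] for one token."""
    # Rank experts by router score and keep the two highest.
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:2]
    # Softmax over the two selected scores only, so the weights sum to 1.
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_layer(token, gate_logits, experts):
    """Weighted sum of the two selected experts' outputs for one token."""
    out = [0.0] * len(token)
    for idx, weight in top2_route(gate_logits):
        expert_out = experts[idx](token)  # only 2 of the 8 experts ever run
        out = [o + weight * v for o, v in zip(out, expert_out)]
    return out

def make_expert(k):
    # Hypothetical stand-in for a feed-forward expert: scales input by (k + 1).
    return lambda vec: [(k + 1) * v for v in vec]

EXPERTS = [make_expert(k) for k in range(8)]

token = [1.0, 2.0]
# Made-up router scores for the 8 experts; experts 1 and 4 tie for the top.
gate_logits = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 1.0]

routes = top2_route(gate_logits)
output = moe_layer(token, gate_logits, EXPERTS)
print(routes)
print(output)
```

The point of the sketch is that the six unselected experts are never called, which is where the compute saving comes from, while all eight still have to exist in memory.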
Mixtral 8x22B, released in April 2024, scaled the same architecture to roughly 141 billion total parameters with about 39 billion active per token, and at release ranked among the strongest open-weight models on common benchmarks. Both models are released under the Apache 2.0 licence, which permits commercial use. Inference providers including Together AI, Fireworks, and Groq have offered Mixtral hosting at low per-token prices, and the model is often used in RAG applications where cost per query matters.
Pricing
Pricing shown for reference only. These figures reflect RECATOOLS research as of 8 May 2026 and may be out of date or incomplete. This is not financial or purchasing advice — always confirm the current price on the provider’s official website before making any decision.
Use cases
Mixtral is Mistral AI's open-weight sparse mixture-of-experts model line (8x7B and 8x22B), notable for delivering strong quality at lower active-parameter cost than dense models of similar size. Apache-2.0 licensing makes it genuinely usable commercially, and it runs self-hosted, through Mistral's own API, or via third-party inference hosts.
It suits teams that want a capable open model they can host themselves for data control, and developers comparing open alternatives to GPT-class models. Caveats: it has been overtaken by newer Mistral and competitor releases, so it is no longer frontier; multilingual coverage is decent but English-centric; and the larger variant typically needs several high-memory GPUs, because all experts must be loaded even though only two are used per token.
Notable facts
- Mistral first released Mixtral 8x7B as a BitTorrent magnet link posted on Twitter, ahead of any official announcement, an unconventional launch that drew wide developer attention.
- Of its roughly 47 billion total parameters, Mixtral 8x7B uses only about 13 billion per token, so compute per token is close to that of a 13B dense model. Memory use is not: all 47 billion parameters must be loaded.
- According to Mistral, Mixtral 8x7B outperforms Llama 2 70B on most benchmarks with about 6x faster inference, and matches or outperforms GPT-3.5 on most standard benchmarks.
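The total and active parameter figures in the list above can be roughly reproduced with some arithmetic. The sketch below assumes the commonly published Mixtral 8x7B hyperparameters (hidden size 4096, expert feed-forward size 14336, 32 layers, 8 key/value heads of 128 dimensions, 32,000-token vocabulary); treat these as assumptions to check against the model's own configuration file. The helper names `param_count` and `weights_gb` are hypothetical.

```python
# Assumed Mixtral 8x7B hyperparameters (verify against the published config).
HIDDEN = 4096        # model width
FFN = 14336          # width of each expert's feed-forward block
LAYERS = 32
EXPERTS = 8          # experts per layer
ACTIVE_EXPERTS = 2   # experts used per token
KV_DIM = 1024        # 8 key/value heads x 128 dims (grouped-query attention)
VOCAB = 32000

def param_count(experts_used):
    """Approximate parameter count when `experts_used` experts run per layer."""
    expert = 3 * HIDDEN * FFN                               # gate, up, down projections
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM   # q and o, plus k and v
    router = HIDDEN * EXPERTS                               # per-layer gating weights
    per_layer = experts_used * expert + attention + router
    embeddings = 2 * VOCAB * HIDDEN                         # input embedding + output head
    return LAYERS * per_layer + embeddings                  # small norm weights ignored

def weights_gb(params, bytes_per_param=2):
    """Memory needed for the weights alone, in gigabytes (default: 16-bit)."""
    return params * bytes_per_param / 1e9

total = param_count(EXPERTS)          # everything that must sit in memory
active = param_count(ACTIVE_EXPERTS)  # what is actually used for one token

print(f"total:  {total / 1e9:.1f}B parameters")
print(f"active: {active / 1e9:.1f}B parameters per token")
print(f"16-bit weights: {weights_gb(total):.0f} GB")
```

Under these assumptions the totals come out at about 46.7 billion parameters overall and about 12.9 billion per token, which is where the rounded "47 billion" and "13 billion" figures come from. The weight memory estimate excludes the KV cache and activations, so real deployments need more headroom.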
Frequently asked questions
About this listing
This entry was compiled from publicly available data including Mixtral's official website, press releases, documentation, and reputable third-party publications. RECATOOLS is not affiliated with Mixtral unless explicitly stated.
Third-party AI tools update their pricing, features, availability, and policies frequently. Information here may be outdated by the time you read this — we make reasonable efforts to keep listings current, but cannot guarantee absolute accuracy.