Mixtral

Mistral's open-weight mixture-of-experts model, Apache 2.0 licensed

Researched 8 May 2026, 20:44 SGT · Published 8 May 2026, 08:00 SGT · Reviewed 11 Jul 2026

RECATOOLS Score: 7.2 / 10

Scored on: Capability, Value for money, Ease of use, ASEAN readiness, API quality

Founded: 2023, Paris, France

Users: 500k+ API users

Launched: Dec 2023

Developer: Mistral AI

Overview

Mixtral 8x7B (Dec 2023) and 8x22B (Apr 2024) are Mistral AI's open-weight sparse mixture-of-experts models. They deliver benchmark performance ranging from GPT-3.5 class (8x7B) to near GPT-4 class (8x22B) at a lower active-parameter cost than dense models, and both are Apache 2.0 licensed. Within Mistral's own lineup they are now superseded by the Mistral 3 family.

Pricing

Pricing shown for reference only. These figures reflect RECATOOLS research as of 11 Jul 2026 and may be out of date or incomplete. This is not financial or purchasing advice — always confirm the current price on the provider’s official website before making any decision.

Free

Free to download and self-host

Use cases

Cost-efficient RAG pipelines where per-query cost is critical
Multilingual content generation across European and Asian languages
High-throughput coding assistance at lower inference cost than GPT-4

ASEAN Perspective

Mixtral in Southeast Asia

ASEAN-region availability and pricing notes coming soon. Drop the editorial team a note via /contact/ if you can supply local context (Singapore/Malaysia/Indonesia/Thailand/Vietnam).

RECATOOLS Verdict

Mixtral (8x7B and 8x22B) was Mistral AI's proof that sparse mixture-of-experts could match dense models at a fraction of the active-parameter cost, and the Apache 2.0 license made it genuinely usable in production without licensing headaches. It still runs cheaply through Together AI, Fireworks, Groq and self-hosted setups.

It's legacy now, though — Mistral's own roadmap has moved on to the Mistral 3 family (including a 41B-active/675B-total Large 3), and competing open models have closed the gap since 2023-24. Mixtral remains a reasonable pick for teams that already have it in production or want a well-understood, cheap-to-run open model, but nobody should deploy it new in 2026 expecting frontier quality.

Independent AI-assisted assessment by RECATOOLS.

What people say

Mixtral 8x7B shipped in December 2023, and it has aged the way most 2023-era open models have: still functional, still cheap, no longer anyone's first choice. At launch it was a genuine surprise: a sparse mixture-of-experts model that routes each token through 2 of 8 expert networks in every layer, uses roughly 13B of its 47B parameters per token, and landed close to GPT-3.5 on most benchmarks while running faster than a dense model of similar total size.

Mixtral 8x22B followed in April 2024, scaling the same idea up to something that traded blows with GPT-4 on several benchmarks — a genuinely notable result for an open-weight release at the time. Both ship under Apache 2.0, so there's no restrictive license blocking commercial use, and inference providers like Together AI, Fireworks and Groq still host it cheaply for anyone who wants it in a RAG pipeline or cost-sensitive production system.

What's changed since is Mistral's own roadmap. The company has moved on to the Mistral 3 line — including Mistral Large 3, a 41B-active/675B-total model — plus new products like Vibe (the May 2026 rename of Le Chat) and enterprise tooling under Forge, alongside an $830M raise in March 2026 for new datacenters. Against that backdrop, and against newer open releases from Meta, Alibaba and others, Mixtral reads as a solid, well-documented legacy option rather than a model anyone should reach for first. Fine for teams already running it or for learning mixture-of-experts architecture; skip it for anything needing current-generation quality.

Summary of public user & expert reviews, compiled by RECATOOLS.

Notable facts

Mixtral 8x7B was released by Mistral via a BitTorrent magnet link on Twitter before any official announcement — a guerrilla marketing approach that generated massive developer interest.
Despite having about 47 billion total parameters, Mixtral 8x7B activates only about 13 billion per token. Per-token compute is therefore close to that of a 13B dense model, with the knowledge capacity of a 47B model, although all 47B parameters must still be held in memory.
Mixtral 8x7B outperforms Llama 2 70B and matches GPT-3.5 on most benchmarks, with Mistral citing 6x faster inference than Llama 2 70B.
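
The 47B-total and 13B-active figures can be roughly reproduced with some arithmetic. The sketch below is a back-of-the-envelope count, not an official breakdown: the hyperparameters are assumptions taken from the publicly released Mixtral 8x7B model configuration rather than from this listing, and small terms such as normalisation weights are ignored.

```python
# Assumed Mixtral 8x7B hyperparameters (from the public model config,
# not from this listing). Normalisation weights are ignored as negligible.
HIDDEN = 4096          # model (embedding) dimension
FFN_HIDDEN = 14336     # hidden size inside each expert feed-forward block
LAYERS = 32
Q_HEADS = 32
KV_HEADS = 8           # grouped-query attention
HEAD_DIM = 128
VOCAB = 32000
EXPERTS = 8
ACTIVE_EXPERTS = 2     # experts used per token in each layer


def count_params(experts_used: int) -> int:
    """Approximate parameter count when `experts_used` experts are counted per layer."""
    expert_ffn = 3 * HIDDEN * FFN_HIDDEN  # gate, up and down projections
    attention = (2 * HIDDEN * Q_HEADS * HEAD_DIM      # query and output projections
                 + 2 * HIDDEN * KV_HEADS * HEAD_DIM)  # key and value projections
    router = HIDDEN * EXPERTS             # linear gate that scores the experts
    per_layer = experts_used * expert_ffn + attention + router
    embeddings = 2 * VOCAB * HIDDEN       # input embedding plus untied output head
    return LAYERS * per_layer + embeddings


total = count_params(EXPERTS)
active = count_params(ACTIVE_EXPERTS)
print(f"total  ~ {total / 1e9:.1f}B parameters")
print(f"active ~ {active / 1e9:.1f}B parameters per token")
```

Under these assumptions the count comes to roughly 46.7B total and 12.9B active. It also shows why the total is not 8 x 7B = 56B: only the feed-forward experts are replicated eight times, while the attention layers and embeddings are shared.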

Frequently asked questions

What is mixture-of-experts?

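MoE replaces a single feed-forward block with several 'expert' networks plus a learned router that sends each token to a small subset of them (2 of 8 in Mixtral). Only the selected experts run for that token, which keeps inference cost low while the full set of experts provides high capacity.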
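
A minimal sketch of top-2 routing, assuming toy scalar "experts" and hand-written router scores in place of real neural networks. The function names (`top_k_route`, `moe_layer`) are hypothetical and chosen for illustration; this is not Mistral's implementation, only the general idea of picking the highest-scoring experts and mixing their outputs.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_layer(token, experts, router_logits, k=2):
    """Run only the selected experts and return their weighted sum."""
    routes = top_k_route(router_logits, k)
    return sum(weight * experts[i](token) for i, weight in routes)


# Toy setup: eight "experts" that simply scale their input by 1..8.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# In a real model a learned linear layer produces these scores per token.
logits = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

print(top_k_route(logits))             # experts 4 and 1 are selected
print(moe_layer(1.0, experts, logits))
```

Six of the eight experts are never called for this token, which is where the inference savings come from.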

Is Mixtral free?

Yes. Model weights are free under Apache 2.0. Cloud inference via third-party providers has per-token costs.

How does Mixtral compare to Llama?

Mixtral 8x7B outperforms Llama 2 70B on most benchmarks while being much faster and cheaper to run, because the MoE architecture uses only about 13B parameters per token.

Can I fine-tune Mixtral?

Yes. The Apache 2.0 licence permits fine-tuning and commercial deployment.

What hardware do I need to run Mixtral locally?

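Mixtral 8x7B requires approximately 48GB of GPU VRAM when the weights are quantized to 8 bits (roughly 94GB at 16-bit precision, or about 24GB at 4-bit), or it can run with CPU offloading on systems with 64GB+ RAM.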
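
As a rough check on that figure, weight memory is simply parameter count times bytes per parameter. The sketch below uses the 47B total from this listing; the helper name `weight_memory_gb` is hypothetical, and the estimate covers weights only, so the KV cache, activations and framework overhead come on top.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory needed for the model weights alone, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


MIXTRAL_8X7B_PARAMS = 47e9  # approximate total parameter count quoted in this listing

for bits in (16, 8, 4):
    gb = weight_memory_gb(MIXTRAL_8X7B_PARAMS, bits)
    print(f"{bits:>2}-bit weights: ~{gb:.1f} GB")
```

On this arithmetic, a figure of about 48GB lines up with 8-bit weights, while unquantized 16-bit weights need roughly twice that. All experts must be resident in memory even though only two per layer run for any given token.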

About this listing

Researched on Friday, 8 May 2026 at 20:44 SGT (UTC+8)

Published on Friday, 8 May 2026 at 08:00 SGT (UTC+8)

Last reviewed Saturday, 11 July 2026

This entry was compiled from publicly available data including Mixtral's official website, press releases, documentation, and reputable third-party publications. RECATOOLS is not affiliated with Mixtral unless explicitly stated.

Data accuracy

Third-party AI tools update their pricing, features, availability, and policies frequently. Information here may be outdated by the time you read this — we make reasonable efforts to keep listings current, but cannot guarantee absolute accuracy.

For the latest details, please refer to Mixtral directly →
