Beluga

Llama-based instruction-tuned model family from Stability AI that ranked at the top of the Hugging Face Open LLM Leaderboard in mid-2023, trained to follow system prompts closely.

LLMs & Chat · Open Source · Has API
RECATOOLS Score: 4.3 / 10
Capability: 4
Value for money: 6
Ease of use: 4
ASEAN readiness: 6
API quality: 3

Founded: 2023
HQ: London, United Kingdom
Users: 200k+ downloads
Launched: Aug 2023
Developer: Stability AI

Overview

Stable Beluga (originally released under the name FreeWilly) is a series of open-weight instruction-tuned models from Stability AI and its CarperAI lab, fine-tuned from Llama models on an internal Orca-style dataset of synthetic reasoning data. Stable Beluga models reached top rankings on the Hugging Face Open LLM Leaderboard in mid-2023, demonstrating Stability AI's capability beyond image generation.

The models were fine-tuned with system prompts included in the training examples, an approach this listing refers to as System Prompt Tuning. The aim is to teach the model to follow complex system prompts and conditional instructions more reliably. In practice this is meant to help Beluga models role-play specific personas, respect detailed content constraints, and keep their behaviour consistent across long conversations.
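As a rough illustration of how a system prompt is supplied at inference time, the sketch below assembles a single-turn prompt in the `### System:` / `### User:` / `### Assistant:` layout shown on the Stable Beluga model cards. The helper name `build_beluga_prompt` and the example persona are hypothetical, and the exact whitespace is an assumption that should be checked against the model card for the checkpoint in use.

```python
def build_beluga_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt with System, User and Assistant sections.

    Hypothetical helper: the section layout follows the Stable Beluga
    model cards, but the exact spacing is an assumption.
    """
    # Each section is a header line followed by its text. Sections are
    # separated by a blank line, and the prompt ends with an open
    # Assistant header so the model continues from that point.
    return (
        f"### System:\n{system.strip()}\n\n"
        f"### User:\n{user.strip()}\n\n"
        "### Assistant:\n"
    )


prompt = build_beluga_prompt(
    system="You are a ship's navigator. Answer in one sentence and stay in character.",
    user="Which way is north?",
)
print(prompt)
```

The resulting string would be passed to whatever text-generation runtime hosts the weights. Keeping the persona in the System section rather than in the user turn is what lets the same constraints persist across turns.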

Stable Beluga 2 (70B) was one of the highest-performing open models at its release, and Stability AI described it as comparable to GPT-3.5 on some tasks. The 7B and 13B versions offered capable options for deployment on consumer hardware. Although Stability AI later faced financial difficulties that reduced its model development output, the Beluga series remains a useful reference for research into Llama-based instruction tuning.

Pricing

Pricing shown for reference only. These figures reflect RECATOOLS research as of 8 May 2026 and may be out of date or incomplete. This is not financial or purchasing advice — always confirm the current price on the provider’s official website before making any decision.

Free plan: Free (fully free)

Use cases

  • Research into instruction tuning with complex system prompts for specific persona behaviour
  • Building persona-consistent AI assistants that reliably maintain character specifications
  • Academic comparison of different instruction tuning methodologies

ASEAN Perspective

Beluga in Southeast Asia

ASEAN-region availability and pricing notes coming soon. Drop the editorial team a note via /contact/ if you can supply local context (Singapore/Malaysia/Indonesia/Thailand/Vietnam).

RECATOOLS Verdict

Stable Beluga is Stability AI's fine-tuned, Llama-based open large-language-model family released in 2023, an early strong entrant in the open-weight chat-model space that was competitive at launch. As open weights on Hugging Face it is freely usable for research and self-hosting.

The field has moved on substantially since release; newer open models (Llama 3.x, Qwen, Mistral, Gemma) outperform it on most benchmarks, so it is now mainly of historical or comparative interest. It suits researchers studying model lineage or running constrained legacy setups; teams wanting current capability should choose a newer open model. ASEAN use is unconstrained as it runs locally, but there is no managed API or support.

Independent AI-assisted assessment by RECATOOLS.

Notable facts

  • Stable Beluga 2 ranked at or near the top of the Hugging Face Open LLM Leaderboard in mid-2023. Its reported score on MMLU, a multi-subject knowledge benchmark, was in the high 60s (percent), not 90%.
  • Stability AI released Beluga models at a time when the company was primarily known for image generation, surprising many observers with its LLM research output.
  • Training on examples that include system prompts, the approach this listing calls System Prompt Tuning, is now a common way to build role-specific AI assistants that maintain a persona.

Frequently asked questions

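Is Stable Beluga free?
Yes. The weights are free to download from Hugging Face for research use.
Can Beluga be used commercially?
The models were released under a non-commercial licence, so commercial use is restricted. Check the current terms on the model card before deploying.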
What is System Prompt Tuning?
A fine-tuning technique that specifically trains models to reliably follow complex system-level instructions and persona specifications.
What happened to Stability AI's LLM development?
Stability AI faced financial difficulties in 2024 and reduced LLM investment, but the Beluga model weights remain available.
How does Beluga compare to Nous Hermes?
Both are strong Llama-based instruction models from the same era. Beluga draws on Orca-style synthetic reasoning data, while Nous Hermes has had a more active community fine-tuning ecosystem.

About this listing

This entry was compiled from publicly available data including Beluga's official website, press releases, documentation, and reputable third-party publications. RECATOOLS is not affiliated with Beluga unless explicitly stated.

Data accuracy

Third-party AI tools update their pricing, features, availability, and policies frequently. Information here may be outdated by the time you read this — we make reasonable efforts to keep listings current, but cannot guarantee absolute accuracy.

