Wan (Wan 2.1)
Alibaba's open-source video AI that beat Sora on VBench — run it locally, free, with an Apache 2.0 licence.
Overview
Wan 2.1 is a suite of open-source video foundation models released by Alibaba's Wan Team in February 2025 under the Apache 2.0 licence. At launch, the flagship 14B-parameter model topped the VBench leaderboard with a score of 86.22%, ahead of OpenAI Sora (84.28%), Tencent HunyuanVideo (83.24%), and Runway Gen-3 (82.32%), and it was the only open-source model in the global top five. A lightweight 1.3B variant runs on as little as 8.19 GB of VRAM, which puts local generation within reach of consumer GPUs such as the RTX 3080 or RTX 4070.
The model family covers text-to-video, image-to-video, first-last-frame-to-video, and video editing tasks, with native bilingual text rendering in both Chinese and English inside generated frames. Models are freely downloadable from Hugging Face and ModelScope, and the official wan.video platform offers a hosted consumer product with tiered paid plans. A technical report (arXiv 2503.20314) by Team Wan — comprising more than 60 named researchers at Alibaba — details the diffusion-transformer architecture, Wan-VAE, and flow-matching training pipeline that underpin the series.
Pricing
Pricing shown for reference only. These figures reflect RECATOOLS research as of 16 Jun 2026 and may be out of date or incomplete. This is not financial or purchasing advice — always confirm the current price on the provider’s official website before making any decision.
Use cases
What you can produce with Wan (Wan 2.1)
- 5-second text-to-video clips at 480p or 720p resolution
- Image-to-video animations from a single still photograph
- First-and-last-frame interpolation videos for controlled scene transitions
- Video editing and stylisation via the VACE model variant
- Bilingual video content with legible Chinese and English text rendered inside frames
- Locally hosted inference pipeline on consumer NVIDIA GPUs (RTX 3080 and above)
- Fine-tuned derivative models built on the Apache 2.0 open weights
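For the local-inference use case, the first practical question is which variant fits a given GPU. The sketch below is illustrative only: it encodes the two VRAM figures quoted in this listing (8.19 GB for the 1.3B variant, 24 GB recommended for the 14B model) and returns the largest variant that fits. The function name `pick_variant` and the table `VRAM_FLOOR_GB` are hypothetical helpers, not part of any Wan tooling, and real memory use will vary with resolution, offloading, and precision settings.

```python
from typing import Optional

# VRAM figures as quoted in this listing; treat them as rough guides,
# not as official minimums from the Wan team.
VRAM_FLOOR_GB = {
    "1.3B": 8.19,   # lightweight variant
    "14B": 24.0,    # flagship variant ("24 GB+ recommended")
}


def pick_variant(available_vram_gb: float) -> Optional[str]:
    """Return the largest variant whose quoted VRAM floor fits, or None."""
    fitting = [
        (needed_gb, name)
        for name, needed_gb in VRAM_FLOOR_GB.items()
        if needed_gb <= available_vram_gb
    ]
    if not fitting:
        return None
    # max() on (needed_gb, name) tuples picks the most demanding variant that fits.
    return max(fitting)[1]


if __name__ == "__main__":
    for vram in (8.0, 12.0, 24.0):
        print(f"{vram:>5.1f} GB -> {pick_variant(vram)}")
```

A 12 GB card would be steered to the 1.3B variant under these figures, while anything below 8.19 GB returns `None`, meaning neither variant fits without further memory-saving measures.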
ASEAN Perspective
Wan (Wan 2.1) in Southeast Asia
Wan 2.1 is notably relevant to the ASEAN region. It is developed by Alibaba, whose Cloud division has its deepest Southeast Asia footprint in Singapore and Malaysia (five data centres in Malaysia alone as of mid-2025), and it natively supports Chinese text rendering, which is useful for Chinese-language content creators across Singapore, Malaysia, and Vietnam. The model is actively deployed by ASEAN-facing AI platforms such as GladCube's Dra Vis and TabSpace.ai via Alibaba Cloud's Model Studio. Its zero-cost local deployment path is a significant equaliser for developers and SMEs in cost-sensitive markets like Indonesia, the Philippines, and Vietnam, where commercial video AI subscriptions can be prohibitively expensive for many creators.
Wan 2.1 is arguably the most significant open-source video model released to date. At launch it was the only freely downloadable model family that could genuinely challenge commercial incumbents, scoring 86.22% on VBench against Sora's 84.28%, and its Apache 2.0 licence allows commercial deployment, fine-tuning, and on-premise hosting with no royalties. The 1.3B variant's roughly 8 GB VRAM floor puts serious AI video generation within reach of hobbyists and indie studios. Value for money is exceptional: the weights are free and the hosted API is competitively priced against Runway or Kling.
The caveats are real. Generation on consumer hardware is slow (roughly four minutes for a five-second clip on an RTX 4090), and local setup requires comfort with Python, CUDA, and Hugging Face tooling. The 14B model demands a high-end GPU (24 GB+ VRAM recommended). By mid-2026 the competitive picture has also shifted: Kling 3.0 leads on character consistency and 1080p quality, while Sora 2 dominates photorealism and long-form generation. Wan 2.1 remains the go-to for developers needing open weights, fine-tuning access, or cost-controlled API integration.
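To put the speed caveat in concrete terms, the quoted figure (about four minutes of compute per five-second clip on an RTX 4090) works out to roughly 48 seconds of GPU time per second of video. The sketch below is a back-of-the-envelope estimator under that single assumption; the listing does not say which variant or resolution the timing refers to, and the names `render_time_minutes`, `MINUTES_PER_CLIP`, and `CLIP_LENGTH_S` are hypothetical.

```python
import math

# Assumption taken from this listing: ~4 minutes per 5-second clip on an RTX 4090.
MINUTES_PER_CLIP = 4.0
CLIP_LENGTH_S = 5.0


def render_time_minutes(total_video_seconds: float) -> float:
    """Estimate GPU minutes needed to generate the given amount of footage.

    Footage is produced in whole 5-second clips, so the clip count is rounded up.
    """
    if total_video_seconds <= 0:
        return 0.0
    clips = math.ceil(total_video_seconds / CLIP_LENGTH_S)
    return clips * MINUTES_PER_CLIP


if __name__ == "__main__":
    for seconds in (5, 30, 60):
        print(f"{seconds:>3d} s of video -> ~{render_time_minutes(seconds):.0f} min of compute")
```

Under this assumption, one minute of finished footage needs twelve clips and about 48 minutes of generation time on a single card, before counting retries or discarded takes.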
What people say
Wan 2.1 set the benchmark for open-source video AI at its February 2025 release, achieving a verified 86.22% VBench score that surpassed Sora and Luma at launch. Its Apache 2.0 licence, freely downloadable weights, and roughly 8 GB VRAM minimum make it uniquely accessible. On the downside, generation on consumer hardware is slow, setup requires technical proficiency, and the 14B model demands a high-end GPU. By 2026 newer closed models have reclaimed quality leadership in specific domains. For developers and cost-focused creators, Wan 2.1 remains the most compelling open-source option available.
Summary of public user & expert reviews, compiled by RECATOOLS.
Notable facts
- Wan 2.1 was the only open-source model to rank in the global top five on the VBench video benchmark leaderboard at its February 2025 launch.
- The 1.3B lightweight variant requires just 8.19 GB of VRAM — enough to run on a standard RTX 3080 or 4070 gaming GPU.
- The technical paper has more than 60 named co-authors, making it one of the largest team-authored AI model reports of 2025.
- Wan 2.1 was the first video generation model capable of rendering legible text in both Chinese and English characters directly inside generated video frames.
About this listing
This entry was compiled from publicly available data including Wan (Wan 2.1)'s official website, press releases, documentation, and reputable third-party publications. RECATOOLS is not affiliated with Wan (Wan 2.1) unless explicitly stated.
Third-party AI tools update their pricing, features, availability, and policies frequently. Information here may be outdated by the time you read this — we make reasonable efforts to keep listings current, but cannot guarantee absolute accuracy.