DeepInfra
Serverless inference for open models.
Overview
A pay-per-token cloud for running open-source LLMs and other models behind a simple, OpenAI-compatible API.
ASEAN Perspective
DeepInfra in Southeast Asia
ASEAN-region availability and pricing notes coming soon. Drop the editorial team a note via /contact/ if you can supply local context (Singapore/Malaysia/Indonesia/Thailand/Vietnam).
DeepInfra is an inference-hosting platform that serves a broad catalogue of open-source models (LLMs, embeddings, image, speech) through a simple, OpenAI-compatible, pay-per-token API at aggressively low prices. It is a practical way to run open models without managing GPUs.
It suits developers and startups who want cheap, hosted access to Llama, Mistral, DeepSeek and similar models with minimal lock-in. Caveats: it is a developer infrastructure service, not an end-user product; you are responsible for prompt/quality engineering; and SLAs, support depth and consistency are lighter than hyperscalers. Pricing is its headline strength. Global API access works from ASEAN; docs are solid if terse.
About this listing
This entry was compiled from publicly available data including DeepInfra's official website, press releases, documentation, and reputable third-party publications. RECATOOLS is not affiliated with DeepInfra unless explicitly stated.
Third-party AI tools update their pricing, features, availability, and policies frequently. Information here may be outdated by the time you read this — we make reasonable efforts to keep listings current, but cannot guarantee absolute accuracy.
For the latest details, please refer to DeepInfra directly →
Spotted something out of date? Suggest an update →
Alternatives to DeepInfra
More in LLMs & Chat