Tag: Inference

18 AI tools, 2 articles

AI tools (18)

Production model serving that hit $600M ARR in 2026.

Cerebras Inference

1,800+ tokens/sec inference on wafer-scale silicon.

Open-model inference from $0.02 per million tokens.

Flat monthly rate for unlimited use of 20,000+ open LLMs

Production inference platform for open-weights LLMs

Gitee AI 模力方舟

OSChina's serverless hub for open model inference, tuning and apps

GPU marketplace and pay-per-token inference for 25+ open AI models

Lambda Inference

Pay-per-token inference API that Lambda itself is winding down

Acquired by NVIDIA in 2025, now DGX Cloud Lepton.

200+ models and per-second GPU cloud, priced low.

One API key, one bill, 300-plus LLMs

Brokered GPU inference: pay per token, skip the long-term contract

China's largest independent AI cloud, now filing for a Hong Kong IPO

Run any open AI model via API — no infrastructure

SambaNova Cloud

Open models at record tokens-per-second on RDU silicon

SiliconFlow 硅基流动 (SiliconCloud)

OpenAI-compatible API access to 200+ open models, billed per token

High-performance inference for open-weights LLMs

Volcengine Ark 火山方舟 (ByteDance)

ByteDance's MaaS platform for Doubao, DeepSeek, GLM and Kimi

Articles (2)

Rows of servers in a data centre, illustrating the AI inference infrastructure Baseten provides.

AI & ML, 2 min

Baseten is in talks to raise US$1 billion at an US$11 billion valuation as inference money keeps flowing

Baseten, which rents Nvidia servers to companies running AI models, is in talks to raise US$1 billion at an US...

AI Tools Desk, 1 Jun

Laptop running a local process on a desk, illustrating on-device AI model inference.

Developer Tools, 2 min

Lablup open-sources MLXcel, an Apple-Silicon inference engine, under Apache 2.0

Lablup has released MLXcel, an open-source engine for running AI models on Apple Silicon, under the permissive...

Kenji Tanaka, 1 Jun
