Lablup has open-sourced MLXcel, an inference engine for running AI models on Apple Silicon, releasing it under the permissive Apache 2.0 licence. The company frames the move around a goal bigger than the code: spreading the capacity to do AI inference more widely, rather than leaving it concentrated in a handful of clouds. The project is live on GitHub.
What it is
Inference is the model doing its job (answering a prompt, classifying an image), as opposed to the training that built it. MLXcel is built to run that workload efficiently on Apple's own chips, the M-series silicon in modern Macs. It joins a growing set of Apple-Silicon inference tools, but the licence is the part that matters. Per Lablup's announcement, the engine is released under Apache 2.0, which lets anyone use, modify and ship it commercially, provided they keep the licence and attribution notices intact, and which carries none of the copyleft obligations that come with more restrictive terms.
Why decentralised inference is the point
Most AI compute today runs in a few large data centres. That is efficient, but it also concentrates power, cost and dependency in a small number of providers. Tools that run capable models on hardware people already own push in the other direction. A developer with a Mac can prototype, test and even serve smaller models without renting cloud GPUs. For privacy-sensitive work, there is a further argument: inference that stays on the device never sends data to a provider at all.
The caveat
On-device inference on a laptop does not replace a data centre for large models or heavy traffic. The ceiling is the hardware in front of you. What an engine like MLXcel changes is the floor: more people able to run useful models locally, and a healthier open ecosystem around Apple Silicon. For a developer weighing whether to reach for a cloud API by default, that floor is where the practical decisions get made.
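The hardware ceiling can be made concrete with rough arithmetic. The Python sketch below is not taken from MLXcel or from Lablup's announcement. It estimates only the size of a model's weights and compares that with a machine's memory. The function names, the example model sizes and the 70% usable-memory share are all illustrative assumptions, and the estimate ignores activations, the key-value cache and the operating system's own needs.

```python
# Rough sketch: do a model's weights fit in a Mac's unified memory?
# Every number here is an illustrative assumption, not an MLXcel measurement.


def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_in_memory(
    params_billion: float,
    bits_per_weight: int,
    memory_gb: float,
    usable_fraction: float = 0.7,  # assumed share of memory available to the model
) -> bool:
    """True if the weights fit inside the assumed usable share of memory."""
    needed = weight_footprint_gb(params_billion, bits_per_weight)
    return needed <= memory_gb * usable_fraction


if __name__ == "__main__":
    # Hypothetical combinations: (parameters in billions, bits per weight, memory in GB)
    for params, bits, memory in [(7, 16, 16), (7, 4, 16), (70, 16, 64)]:
        size = weight_footprint_gb(params, bits)
        verdict = "fits" if fits_in_memory(params, bits, memory) else "does not fit"
        print(f"{params}B at {bits}-bit: about {size:.1f} GB of weights, "
              f"{verdict} in {memory} GB")
```

Under these assumptions, quantising a model to fewer bits per weight is what moves it from out of reach to runnable on a given machine, while the largest models stay beyond a laptop regardless.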