Run private AI on your Mac
CereVault Serve
A lightweight Mac menu bar server for running and sharing local AI models across your trusted network. Built on Apple's MLX framework, so nothing leaves your devices unless you ask it to.
Your Mac becomes a personal AI server.
Curated model catalog
Pick from Llama 3.2, Gemma 2, Qwen 2.5, and Phi 3.5 with one click. Each model downloads directly to your Mac.
Lightweight menu bar app
Lives quietly in your menu bar, with no Dock icon and no windows in your way until you need them.
Private by default
Models run entirely on your Mac. Network access is loopback-only until you explicitly turn on device sharing.
Local network sharing
Flip one toggle to let your iPhone, iPad, and other Macs on your Wi-Fi connect through CereVault Serve.
OpenAI-compatible API
Works with any client that supports a custom base URL, with streaming, token-by-token responses.
Storage you control
Manage downloaded models in one place and free disk space anytime, typically 1–3 GB per model.
Models from a public registry, served locally.
When you choose to download a model, the files come directly from Hugging Face, a public model registry, and that connection is between your Mac and Hugging Face only. After that, everything runs on your machine. No accounts, no subscriptions, no analytics, and no data sent to the developer or any third party.
Requirements
- Apple Silicon Mac (M1 or later)
- macOS 14 or later
- 1–3 GB free disk space per downloaded model
- Wi-Fi or local network for device sharing (optional)