Open-source LLMs
in your pocket
Astra is an offline AI assistant for your phone. Built on the llama.cpp inference engine, it ships with Qwen3.5 0.8B and pulls major open-source models from Hugging Face and ModelScope in one tap. Local inference · Zero data upload · Works in airplane mode.
Why choose Astra
An on-device LLM runtime designed for mobile — cloud-grade intelligence, returned to you.
Fully offline
Subway, basement, airplane mode — Astra never goes dark. Every inference runs locally, no network required.
Privacy first
Your chats never leave the device. No uploads, no logs, no cloud sync. Privacy is the default.
Powered by llama.cpp
A best-in-class open-source inference engine, deeply optimized for mobile CPU / GPU / NPU.
Two model hubs
Seamless integration with Hugging Face and ModelScope. Search → download → load, in one flow.
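For the curious, the same search → download → load flow can be reproduced by hand with the Hugging Face CLI and llama.cpp. A minimal sketch — the repo and file names below are illustrative, not Astra's actual defaults:

```shell
# Fetch a quantized GGUF model from Hugging Face
# (repo and file names are examples, not Astra's built-ins)
huggingface-cli download Qwen/Qwen2.5-0.5B-Instruct-GGUF \
  qwen2.5-0.5b-instruct-q4_k_m.gguf --local-dir ./models

# Load it with llama.cpp's CLI and chat — fully offline once downloaded
llama-cli -m ./models/qwen2.5-0.5b-instruct-q4_k_m.gguf \
  -p "Hello from airplane mode!"
```

Astra wraps this whole flow behind a single tap; nothing here is required to use the app.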
Ready out of the box
Ships with Qwen3.5 0.8B — chat the moment you install, no setup, no extra downloads.
All platforms
Android and iOS are live now; macOS and Windows clients are on the way.
Popular models, one tap away
Deeply integrated with Hugging Face and ModelScope — search, install, done.
Alibaba's lightweight Tongyi Qianwen instruct model — Astra's built-in default, excellent in Chinese.
Meta's mid-size Llama 3.2 — strong multilingual abilities.
Google's Gemma 2 instruct — balanced quality and efficiency.
DeepSeek R1 distilled — strong chain-of-thought reasoning.
Microsoft's tiny powerhouse — exceptional parameter efficiency.
01.AI's bilingual Chinese-English chat model.
Set sail for the stars
Free, no account required. Run your own LLM on your own phone.