Key Takeaways
- Home control rewards low latency and reliable function-calling, not maximum model size
- A 4B model fits low-power hardware; an 8B model suits a mini PC with a GPU or NPU
- Gemma 3 4B (Google), Qwen3 4B (Alibaba), and Qwen3 8B (Alibaba) are common, well-supported choices
- Qwen3, Gemma 3, and Phi-4-mini have proven Home Assistant tool-calling support today
- Pick a model with strong support for the language you speak to it
- For deep model rankings and mechanics, link out to the local-llms cluster
What Matters for Home Control
Three things decide a good home-control model: latency, reliable function-calling, and size that fits your hardware. Benchmark leaderboards matter far less here than responsiveness.
- Latency: a voice command should feel near-instant; smaller models on capable hardware respond faster.
- Function-calling: the model must emit structured device actions reliably β this is the decisive capability.
- Fit: the model must run comfortably on the box that also hosts Home Assistant β see best hardware for a local smart home.
The Shortlist
These small models are common, well-supported choices for home control across different hardware budgets. Use a 4B model on light hardware and an 8B model when you have a GPU or NPU. Gemma 4 (June 2026) is the newest option; Qwen3, Gemma 3, and Phi-4-mini have proven Home Assistant tool-calling support today.
- Gemma 3 4B (Google): a 4-billion-parameter model with broad multilingual coverage (140+ languages), a strong low-power choice β Ollama tag
gemma3:4b. - Qwen3 4B (Alibaba): a fast 4B model with reliable tool use and good multilingual support, low latency on a CPU or integrated GPU β
qwen3:4b. - Phi-4-mini (Microsoft): a compact 3.8B model that punches above its size for instruction-following β
phi4-mini. - Llama 3.2 3B (Meta): a widely-supported 3B baseline that runs on modest hardware with good function-calling β
llama3.2:3b. - Qwen3 8B (Alibaba): the best quality here on a GPU or NPU and a Home Assistant favourite for tool-calling β
qwen3:8b.
Comparison
Pick by hardware and language: smaller models for CPU-only or Pi-class hardware, 8B for a GPU-equipped mini PC. Sizes below are approximate at common 4-bit quantization; the Ollama tag is the exact model to pull.
- Footprints are approximate and depend on quantization β for VRAM and quantization depth, link out to the local-llms cluster.
| Model | Params | Approx. footprint (Q4) | Ollama tag | Best for |
|---|---|---|---|---|
| Gemma 3 4B | 4B | ~3 GB | gemma3:4b | Low-power host, 140+ languages |
| Qwen3 4B | 4B | ~2.5β3 GB | qwen3:4b | Low latency, multilingual, tools |
| Phi-4-mini | 3.8B | ~2.5β3 GB | phi4-mini | Strong instruction-following |
| Llama 3.2 3B | 3B | ~2β3 GB | llama3.2:3b | Widely-supported baseline |
| Qwen3 8B | 8B | ~5 GB | qwen3:8b | Best quality on GPU/NPU; HA favourite |
Picks by Hardware Budget
Choose a 4B model on a Pi or CPU-only mini PC; choose an 8B model when you have a GPU or NPU. This keeps responses snappy at every tier.
- Raspberry Pi / low-power: Gemma 3 4B or Qwen3 4B, accepting slower responses.
- Mini PC (CPU only): Qwen3 4B or Phi-4-mini as a responsive default.
- Mini PC with GPU/NPU: Qwen3 8B for the best quality at acceptable latency β see best mini PCs for Home Assistant + local AI.
How to Pick
Start with a 4B model, confirm latency and reliable device actions, then move to 8B only if quality is lacking. Test with your real commands before committing.
- Install via the Ollama integration and test your common commands.
- If responses are slow, drop a size or add a GPU/NPU.
- If actions are unreliable, prefer a model known for function-calling.
- For deep model rankings and mechanics, see best local LLMs 2026 (cross-cluster) β this guide stays home-control-specific.
FAQ
What is the smallest usable model for home control?
A 3B model such as Llama 3.2 3B is the practical floor for reliable device control on low-power hardware, trading some understanding for speed. A 4B model like Gemma 3 4B or Qwen3 4B is the better balance if your hardware allows it.
Does a home-control model need a GPU?
No for 4B models, which run on CPU or an integrated GPU. A GPU or NPU mainly lets you run an 8B model such as Qwen3 8B at low latency for better understanding. Match the model to your hardware.
Which models support function-calling?
Modern small models including Qwen3, Gemma 3, and Phi-4-mini have proven Home Assistant tool/function-calling support, which is the capability that lets them emit reliable device actions. Prefer a model documented to support it for home control.
What is the best model for a Raspberry Pi?
A 4B model like Gemma 3 4B or Qwen3 4B is the practical ceiling on a Raspberry Pi, and responses will be slower than on a mini PC. For a snappy assistant, a mini PC with a GPU/NPU running Qwen3 8B is the better host.