Skip to main content
PromptQuorumPromptQuorum
Home/Smart Home/Build a Fully Local Voice Assistant for Your Smart Home (2026)
Local AI & LLMs in the Smart Home

Build a Fully Local Voice Assistant for Your Smart Home (2026)

Β·11 min readΒ·By Hans Kuepper Β· Founder of PromptQuorum, multi-model AI dispatch tool Β· PromptQuorum

A fully local voice assistant combines Home Assistant Assist (intent), local Whisper (speech-to-text), Piper (text-to-speech), and a local LLM (reasoning) β€” all connected over the Wyoming protocol and running on your own hardware. No audio or commands leave the house, and it works offline.

You can replace Alexa or Google with a fully local voice assistant built from Home Assistant Assist, local Whisper for speech-to-text, Piper for text-to-speech, and a local LLM as the brain. This guide covers the offline voice stack, each component, the Wyoming protocol that connects them, and the hardware you need β€” all private and working without the cloud.

Key Takeaways

  • Home Assistant Assist is the local voice pipeline that ties everything together
  • Whisper handles speech-to-text locally; pick a model size for your accuracy/speed/hardware trade-off
  • Piper handles text-to-speech locally with natural-sounding voices
  • The Wyoming protocol connects Assist to the Whisper and Piper services
  • Add a wake-word engine (such as openWakeWord) for hands-free triggering
  • Optional: set a local LLM as the conversation agent for natural-language understanding

The Fully-Local Voice Stack

A local voice assistant is four roles on your own hardware: capture and transcribe (Whisper), understand (Assist intents or a local LLM), respond (Piper), and trigger (wake word). Each runs offline; the Wyoming protocol wires them together.

ComponentRoleLocal?Notes
AssistPipeline + intentYesBuilt into Home Assistant
WhisperSpeech-to-textYesModel size sets accuracy/speed
PiperText-to-speechYesNatural local voices
Wake wordHands-free triggerYese.g. openWakeWord
Local LLMUnderstanding (optional)YesVia Ollama as conversation agent

Home Assistant Assist

Assist is the built-in voice pipeline that routes audio through speech-to-text, an agent, and text-to-speech. It is configured under Settings β†’ Voice assistants.

  • Assist works with built-in intents out of the box (no LLM required) for common commands.
  • You select the STT engine (Whisper), the TTS engine (Piper), and the conversation agent.
  • Use multiple pipelines if you want a fast intent-only assistant and a separate LLM-powered one.

Whisper for Local Speech-to-Text

Whisper transcribes your speech locally; larger Whisper models are more accurate but need more compute. Add it as the Whisper (faster-whisper) add-on and connect via Wyoming.

  • Whisper ships in sizes from tiny to large β€” smaller is faster, larger is more accurate.
  • For a focused STT setup (models, hardware, accuracy), see local Whisper + Home Assistant.
  • Whisper is multilingual, so non-English commands transcribe without a cloud service.

Piper for Local Text-to-Speech

Piper generates spoken responses locally with natural-sounding voices, fast enough for real-time replies on modest hardware. Add it as the Piper add-on and select a voice.

  • Piper offers multiple languages and voices; pick one per pipeline.
  • It runs well on a Raspberry Pi for typical response lengths.
  • No audio is sent anywhere β€” the speech is synthesised on your device.

The Wyoming Protocol

Wyoming is the protocol Home Assistant uses to connect Assist to local voice services like Whisper and Piper. It lets the speech services run as separate add-ons or on separate machines.

  • Each service (Whisper, Piper, wake word) runs as a Wyoming endpoint.
  • Assist discovers and uses them through the Wyoming integration.
  • This modularity means you can offload Whisper to a more powerful box if needed.

Adding the LLM Brain

Set a local LLM as the conversation agent to understand natural language instead of only fixed intents. This is optional but unlocks flexible phrasing.

Hardware Needs

A mini PC comfortably runs Assist, Whisper, Piper, and a small LLM; a Raspberry Pi handles intent-only voice but struggles with large Whisper models and LLM inference. Microphone hardware (voice satellites) captures audio around the house.

FAQ

Can a local voice assistant fully replace Alexa?

For smart-home control and many routines, yes β€” Assist with Whisper, Piper, and a local LLM covers natural-language device control and responses. It does not replicate every third-party Alexa skill or cloud shopping feature, but it covers the core home-control use case privately.

Does a local voice assistant work offline?

Yes. Speech-to-text (Whisper), text-to-speech (Piper), intent handling, and an optional local LLM all run on your hardware, so the assistant works with no internet. Only remote access from outside the home needs connectivity.

How accurate is local speech recognition?

Accuracy depends on the Whisper model size and your microphone. Larger Whisper models are more accurate but slower; a mid-size model on a mini PC gives a good balance for home commands. See the local Whisper guide for sizing.

What hardware do I need for a local voice assistant?

A mini PC for the full stack (LLM + larger Whisper), or a Raspberry Pi for an intent-only assistant, plus microphone/speaker voice-satellite hardware for room coverage. A GPU or NPU lowers LLM and large-Whisper latency.

Can I use a custom wake word?

Yes. A local wake-word engine such as openWakeWord supports custom wake words and runs on your hardware, so hands-free triggering needs no cloud.

← Back to Smart Home