Voice control

Pocster · 2026-02-11T11:33:35Z

I’m on a roll here!

Getting Alexa to run your own automations and use your own wording is a PITA. Lots of “let’s stop you” from Amazon.

So I had a HA voice assistant. It’s OK, but its microphones are poor.

So! Let’s make something that’s as reliable as Alexa, but local and with no LLM.

A zillion discussions with chat, and we have a planned approach. Bought an office-style conference speaker unit, so multi-directional etc. The Mac mini has good processing for noise/background suppression before passing the audio on. Speech back will be via a Squeezebox player.

VOICE SYSTEM – SOFTWARE STACK SUMMARY

Wake Word Detection
Software options:

Porcupine (Picovoice)
openWakeWord
Snowboy (older / legacy)

Purpose:

Continuously listens for a wake phrase locally with very low CPU.

Speech-to-Text (STT)
Software:

Whisper (OpenAI Whisper local model)
whisper.cpp (faster C++ local version)
Faster-Whisper
OpenAI Whisper API (cloud option)

Purpose:

Converts recorded audio into text.

Speaker Identification (Voice ID)
Software options:

Resemblyzer (voice embeddings)
pyannote.audio
SpeechBrain speaker recognition
Picovoice Eagle (commercial)

Purpose:

Creates a voice fingerprint and compares it against enrolled users.

Important:

This runs separately from Whisper.

Voice identity ≠ transcript content.
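As a rough illustration of the "compare against enrolled users" step, here is a minimal sketch in plain Python. It assumes some other tool (Resemblyzer, for example) has already turned each voice sample into a fixed-length embedding vector; the names `identify_speaker` and `ENROLLED`, the toy 3-value vectors and the 0.75 threshold are all made up for the example, and a real threshold would need tuning against your own recordings.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def identify_speaker(embedding, enrolled, threshold=0.75):
    """Return the best-matching enrolled name, or None if nobody is close enough."""
    best_name, best_score = None, -1.0
    for name, reference in enrolled.items():
        score = cosine_similarity(embedding, reference)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None

# Toy enrolment data (assumption): real embeddings have far more dimensions.
ENROLLED = {
    "pocster": [0.9, 0.1, 0.2],
    "swmbo": [0.1, 0.8, 0.3],
}
```

Returning `None` for an unrecognised voice lets the later permission step treat "unknown speaker" as its own case rather than guessing.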

Intent Parsing / Command Understanding

If rule-based:

Home Assistant built-in intent engine
Rhasspy
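To show what "rule-based" means in practice, here is a small sketch using only Python's `re` module. It is not how Home Assistant's intent engine or Rhasspy work internally; the intent names, the `RULES` table and `parse_intent` are invented for illustration, with the phrases borrowed from the examples in this thread.

```python
import re

# Hypothetical rules: (intent name, pattern). Named groups become slots.
RULES = [
    ("play_random_album", re.compile(r"^play a random album by (?P<artist>.+)$")),
    ("disable_device", re.compile(r"^disable (?P<device>.+)$")),
    ("radio_power", re.compile(r"^(?:turn )?radio (?P<state>on|off)$")),
]

def parse_intent(transcript):
    """Return (intent, slots) for the first matching rule, else (None, {})."""
    # Normalise the STT output: lower-case, drop punctuation, collapse spaces.
    text = re.sub(r"[^\w\s]", "", transcript.lower()).strip()
    text = re.sub(r"\s+", " ", text)
    for intent, pattern in RULES:
        match = pattern.match(text)
        if match:
            return intent, match.groupdict()
    return None, {}
```

Anything that matches no rule falls through to `(None, {})`, so an unexpected phrase does nothing instead of triggering a best guess.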

Permission & Policy Layer

Software:

Home Assistant user permissions
Custom Python logic
Node-RED (optional orchestration layer)

Purpose:

Checks:

Who spoke?
Are they authorised?
Does this require confirmation?

Implements:

“Pocster is that OK?” → wait for verified response.
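The three checks above could be sketched as a small policy table in custom Python. This is one possible reading of the confirmation flow, not a settled design: it assumes that a request from someone who is not directly allowed is held until a named approver's voice is verified. `Policy`, `POLICIES`, `check_permission` and the intent and speaker names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Policy:
    allowed: frozenset               # speakers who may run the intent directly
    approver: Optional[str] = None   # who may approve a request from anyone else

# Hypothetical policy table keyed by intent name.
POLICIES = {
    "radio_power": Policy(allowed=frozenset({"pocster", "swmbo"})),
    "disable_device": Policy(allowed=frozenset({"pocster"}), approver="pocster"),
}

def check_permission(intent, speaker, approved_by=None):
    """Return 'allow', 'confirm' or 'deny' for an intent and an identified speaker."""
    policy = POLICIES.get(intent)
    if policy is None:
        return "deny"        # no policy defined: refuse by default
    if speaker in policy.allowed:
        return "allow"
    if policy.approver is None:
        return "deny"        # nobody is able to approve this intent
    # Someone else asked: proceed only once the approver's voice is verified.
    return "allow" if approved_by == policy.approver else "confirm"
```

A "confirm" result is where the system would ask “Pocster is that OK?” and then call `check_permission` again with `approved_by` set to whoever the speaker ID step verified.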

Execution Layer

Software:

Home Assistant
MQTT broker (Mosquitto)
ESPHome
Custom Python services

Purpose:

Triggers actual devices, UI events, or automations.
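One way a custom Python service might trigger a device is through Home Assistant's REST API, which accepts a POST to `/api/services/<domain>/<service>` with a long-lived access token. The sketch below only builds the request so it can be inspected without a running server; the helper name `build_service_request`, the host name and the entity id `switch.jamma_cabinet` are assumptions for the example.

```python
import json
import urllib.request

def build_service_request(base_url, token, domain, service, entity_id):
    """Build (but do not send) a Home Assistant service-call request."""
    url = f"{base_url.rstrip('/')}/api/services/{domain}/{service}"
    body = json.dumps({"entity_id": entity_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Example with placeholder values (assumptions, not a real host or token).
request = build_service_request(
    "http://homeassistant.local:8123",
    "YOUR_LONG_LIVED_TOKEN",
    "switch",
    "turn_off",
    "switch.jamma_cabinet",
)
# To actually send it: urllib.request.urlopen(request, timeout=5)
```

Keeping the build and send steps separate makes it easy to log or dry-run a command before anything physical switches.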

Text-to-Speech (TTS)

Software:

Piper (local neural TTS)
Coqui TTS
ElevenLabs (cloud)
Home Assistant TTS integrations

Purpose:

System speaks back to the user.

Clean Stack Example (Fully Local Setup)

Wake word: openWakeWord

STT: whisper.cpp

Voice ID: Resemblyzer

Intent: Home Assistant or local LLM via Ollama

Permissions: Custom Python layer

Execution: Home Assistant + MQTT

TTS: Piper

That is the full named software stack for your speech recognition + speaker ID + command system.

SteamyTea · 2026-02-11T15:01:17Z

WTF

Just put a switch or dial on the wall FFS.

Bramco · 2026-02-11T15:09:58Z

7 minutes ago, SteamyTea said:

WTF

Just put a switch or dial on the wall FFS.

That would make the tiling job harder and, worse still, make it the next job in the queue....

ProDave · 2026-02-11T15:11:12Z

The thing that gets me about spending countless hours configuring and setting up some custom home-brew voice control system is how you easily back up all the configured software, so WHEN it goes wrong you can just reinstall it in a flash and it will all just work again.

It's bad enough rebuilding my Pi music box each time it crashes, and there is not much customisation of that.

SteamyTea · 2026-02-11T15:16:42Z

Just now, ProDave said:

bad enough rebuilding my Pi music box each time it crashes, and there is not much customisation of that

I think that may be a problem with the OS. Dedicated hardware does not need the same sort of heavy overhead systems to run.

But yes, (expletive deleted)ing stupid idea.

Pocster · 2026-02-11T16:44:47Z

1 hour ago, ProDave said:

The thing that gets me about spending countless hours configuring and setting up some custom home-brew voice control system is how you easily back up all the configured software, so WHEN it goes wrong you can just reinstall it in a flash and it will all just work again.

It's bad enough rebuilding my Pi music box each time it crashes, and there is not much customisation of that.

Easily! Mac Time Machine backs up automatically every day/week.

Pocster · 2026-02-11T16:53:13Z

1 hour ago, SteamyTea said:

WTF

Just put a switch or dial on the wall FFS.

Clearly you don’t understand the issue. You cannot get Alexa to do exactly what you want without jumping through hoops. “Alexa, play a random album by Coldplay”, and it selects a random Coldplay album from your Squeezebox and streams it to a default streamer. Or “Alexa, disable JAMMA cabinet”, which recognises my voice only and does that.

You can frig some of these but it’s a PITA; also Amazon can cause issues.

I can and will have a dashboard when done, where these things are selectable. Home Assistant Voice can do these things with some effort but, as said, its microphones are crap. “OK Nabu, radio on”; ask it then to turn the radio off and it won’t be able to mask out the background radio!!!
Once you have a stable, reliable system there are many things that can be achieved; the limitation is people’s imagination.

Posting here will of course land on lots of “why bother” views.

Take my home cinema setup. 1 button does about 6 things (SWMBO friendly), yet some of this hardware has no Bluetooth/Zigbee/WiFi, so making dumb things work in the chain can be challenging.
Some of you have no vision…. 😆

Pocster · 2026-02-11T16:54:58Z

Remember this is local and no LLM overkill.

It was either look at this today or painting…

Pocster · 2026-02-11T16:58:09Z

1 hour ago, Bramco said:

That would make the tiling job harder and, worse still, make it the next job in the queue....

You’ve got it! After building for 10 yrs, a bit of variety and some fun projects are required!

SteamyTea · 2026-02-11T18:16:56Z
