Setting Up Wake Word Voice Activation for Your OpenClaw Agent
A common question in the OpenClaw Discord: "Is there wake word functionality and how do I set it up?" The short answer is yes – but it depends on your setup and what you're trying to achieve.
What Is Wake Word?
Wake word functionality lets you trigger your AI agent by speaking a phrase (like "Hey Jarvis" or "OK Computer") instead of typing. This is especially useful for hands-free operation when running OpenClaw on a home server, Raspberry Pi, or always-on setup.
Current Options
OpenClaw itself doesn't have built-in wake word detection – it's designed to be modular. Instead, you pipe audio through a wake word detector before it reaches your agent. Here are the main approaches:
Option 1: Use the Whisper + Porcupine Flow
The most popular community setup combines:
- Porcupine (by Picovoice) for wake word detection – it's lightweight, runs locally, and supports custom wake words
- Whisper (local or API) for speech-to-text after the wake word triggers
The flow works like this:
Microphone → Porcupine (listening) → Wake word detected →
Record speech → Whisper (transcribe) → Send to OpenClaw →
TTS response → Speaker
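The glue between those stages can be sketched in a few lines of Python. Everything below is illustrative: the detector, recorder, transcriber, and agent client are stand-in callables you'd replace with Porcupine, your recording logic, Whisper, and your actual OpenClaw endpoint – none of these names are OpenClaw APIs.

```python
def run_pipeline(frames, detect_wake_word, record_utterance, transcribe, send_to_agent):
    """Watch a stream of audio frames; on wake word, capture, transcribe, forward.

    All five callables are stand-ins for the real components:
      detect_wake_word(frame) -> bool   (e.g. porcupine.process(frame) >= 0)
      record_utterance() -> bytes       (record until silence, e.g. via a VAD)
      transcribe(audio) -> str          (e.g. Whisper)
      send_to_agent(text) -> response   (HTTP/WebSocket call to the agent)
    """
    for frame in frames:
        if detect_wake_word(frame):
            audio = record_utterance()
            text = transcribe(audio)
            return send_to_agent(text)
    return None  # stream ended without a wake word
```

In a real setup the loop would run forever instead of returning, but the shape is the same: cheap always-on detection gates the expensive transcription step.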
Option 2: Node-Based Wake Word
If you're running OpenClaw with a paired node (desktop app or mobile), some nodes have built-in push-to-talk or wake word capabilities. Check the node documentation for your platform.
Option 3: Home Assistant Integration
Many users integrate OpenClaw with Home Assistant, which has robust voice pipeline support including wake word detection via Wyoming protocol. Your HA voice satellite handles the wake word, then sends transcribed text to OpenClaw.
Events and Subscriptions
For those building custom integrations, OpenClaw emits events you can hook into:
- message.received – fires when any message arrives (text or transcribed audio)
- agent.reply – fires when the agent responds
You can subscribe via the WebSocket API or use the event hooks in custom skills.
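The hook pattern looks something like the sketch below. Note this is a generic event dispatcher to show the shape of a subscription, not the actual OpenClaw WebSocket client – only the two event names come from the docs above.

```python
class EventBus:
    """Minimal event dispatcher illustrating the subscribe/emit pattern."""

    def __init__(self):
        self._handlers = {}

    def on(self, event, handler):
        """Register a callback for an event name."""
        self._handlers.setdefault(event, []).append(handler)

    def emit(self, event, payload):
        """Invoke every handler registered for this event."""
        for handler in self._handlers.get(event, []):
            handler(payload)


bus = EventBus()
bus.on("message.received", lambda msg: print("heard:", msg))
bus.on("agent.reply", lambda msg: print("agent:", msg))
```

With a real WebSocket API you'd wire `emit` to incoming frames, but your handlers stay the same.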
Viewing Transcriptions
Want to see what your agent heard? Check your session logs:
openclaw logs --follow
Or enable verbose logging in your config to see full message payloads, including transcribed text.
Pro Tips from the Community
- Latency matters – Use local Whisper (via faster-whisper or whisper.cpp) for snappier response times
- Custom wake words – Porcupine lets you train custom wake words, so your agent can respond to its actual name
- Noise handling – Add a voice activity detector (VAD) before the speech-to-text step to avoid sending silence/noise to Whisper
- TTS integration – Complete the loop by piping agent responses through a TTS service (ElevenLabs, Coqui, or local Piper)
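To make the VAD tip concrete, here's the crudest possible version: an RMS-energy gate over 16-bit PCM frames. It's a toy sketch to show where a VAD sits in the pipeline – a real detector like webrtcvad or Silero VAD will handle noise far better, and the threshold here is an arbitrary placeholder you'd tune for your mic.

```python
import math
import struct

def is_speech(frame_bytes, threshold=500):
    """Return True if a frame of 16-bit little-endian PCM exceeds an RMS threshold.

    Crude energy-based VAD: good enough to skip obvious silence before
    handing audio to Whisper, but easily fooled by steady background noise.
    """
    n_samples = len(frame_bytes) // 2
    samples = struct.unpack("<%dh" % n_samples, frame_bytes)
    rms = math.sqrt(sum(s * s for s in samples) / n_samples)
    return rms > threshold
```

You'd call this on each recorded frame and only forward stretches where it returns True (plus a little padding on either side, so word edges aren't clipped).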
Example Porcupine Setup
import struct

import pvporcupine
import pyaudio

porcupine = pvporcupine.create(
    access_key='YOUR_ACCESS_KEY',
    keywords=['jarvis']  # Or your custom wake word
)

pa = pyaudio.PyAudio()
stream = pa.open(
    rate=porcupine.sample_rate,
    channels=1,
    format=pyaudio.paInt16,
    input=True,
    frames_per_buffer=porcupine.frame_length
)

while True:
    pcm = stream.read(porcupine.frame_length)
    pcm = struct.unpack_from("h" * porcupine.frame_length, pcm)
    keyword_index = porcupine.process(pcm)
    if keyword_index >= 0:
        print("Wake word detected!")
        # Start recording and send to Whisper...
Resources
- Porcupine Wake Word Engine
- Whisper.cpp for fast local transcription
- Home Assistant Voice
- Ask in #help or #home-automation on the OpenClaw Discord for specific setup help
Have you built a wake word setup for your OpenClaw agent? Share it in #showcase!