Wake Word Detection - "Hey Alfred"
Overview
The Alfred Mobile app now includes offline wake word detection using Vosk, an open-source speech recognition toolkit. This allows hands-free voice interaction by continuously listening for the wake phrase.
Wake Words
The app listens for:
- "Hey Alfred"
- "Alfred"
When either phrase is detected, voice input automatically starts.
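As a rough illustration of the matching step, the sketch below checks a transcript for either wake phrase on whole-word boundaries. The names `WAKE_PHRASES` and `findWakePhrase` are hypothetical and chosen for this example; the actual logic in WakeWordDetector.kt may differ.

```kotlin
// Hypothetical helper for illustration; not taken from WakeWordDetector.kt.
val WAKE_PHRASES = listOf("hey alfred", "alfred")

/** Returns the first wake phrase found in [transcript] as whole words, or null. */
fun findWakePhrase(transcript: String): String? {
    // Lowercase, then turn every run of non-alphanumeric characters into one space.
    val normalized = transcript.lowercase()
        .replace(Regex("[^a-z0-9]+"), " ")
        .trim()
    // Pad with spaces so a phrase only matches on word boundaries.
    val padded = " $normalized "
    return WAKE_PHRASES.firstOrNull { padded.contains(" $it ") }
}

fun main() {
    println(findWakePhrase("Hey Alfred, what's the weather?")) // hey alfred
    println(findWakePhrase("ask alfred about it"))             // alfred
    println(findWakePhrase("alfredo sauce"))                   // null
}
```

Matching on word boundaries keeps longer words such as "alfredo" from triggering, although it cannot help when the recognizer itself mishears a similar-sounding word as "alfred".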
How to Use
1. Enable Wake Word Mode
In the app's status bar (below the top bar), you'll see two toggle chips:
- Wake Word (keyboard icon) - Enable/disable continuous listening
- Voice Off/On (speaker icon) - Enable/disable TTS responses
Tap Wake Word to enable continuous listening mode. The chip will turn blue and say "Always On".
2. Say the Wake Word
With wake word mode enabled, the app continuously listens for "Hey Alfred" or "Alfred" while it is open.
When detected:
- You'll see a system message: "Wake word detected!"
- Voice input automatically starts (microphone icon appears)
- Speak your command/question
- Voice input stops after a pause (up to 10 seconds of silence is allowed for natural pauses)
- Message auto-sends to Alfred
3. Normal Conversation
After the wake word triggers voice input:
- Speech pauses: The app allows up to 10 seconds of silence for natural speaking rhythm
- Auto-send: Your message sends automatically when voice input completes
- Wake word loops: After sending, wake word detection resumes automatically
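The loop described above (wake word, capture, auto-send, resume) can be sketched as a small state machine. This is a minimal sketch under stated assumptions: `VoiceState`, `VoiceLoop`, and `SILENCE_TIMEOUT_MS` are hypothetical names, timestamps are passed in explicitly so the logic stays testable, and the 10-second allowance is interpreted here as "end capture after 10 seconds without speech".

```kotlin
// Hypothetical names throughout; this is not the app's actual code.
enum class VoiceState { WAITING_FOR_WAKE_WORD, CAPTURING_SPEECH, SENDING }

const val SILENCE_TIMEOUT_MS = 10_000L // assumed reading of the 10-second pause allowance

class VoiceLoop(private val send: (String) -> Unit) {
    var state = VoiceState.WAITING_FOR_WAKE_WORD
        private set
    private val words = StringBuilder()
    private var lastSpeechAtMs = 0L

    /** Wake word heard: start capturing a new message. */
    fun onWakeWord(nowMs: Long) {
        if (state != VoiceState.WAITING_FOR_WAKE_WORD) return
        words.clear()
        lastSpeechAtMs = nowMs
        state = VoiceState.CAPTURING_SPEECH
    }

    /** Recognized speech arrived: append it and reset the silence clock. */
    fun onSpeech(text: String, nowMs: Long) {
        if (state != VoiceState.CAPTURING_SPEECH) return
        if (words.isNotEmpty()) words.append(' ')
        words.append(text)
        lastSpeechAtMs = nowMs
    }

    /** Call periodically; ends capture once the silence allowance has elapsed. */
    fun onTick(nowMs: Long) {
        if (state != VoiceState.CAPTURING_SPEECH) return
        if (nowMs - lastSpeechAtMs < SILENCE_TIMEOUT_MS) return
        state = VoiceState.SENDING
        if (words.isNotEmpty()) send(words.toString()) // auto-send
        state = VoiceState.WAITING_FOR_WAKE_WORD       // wake word detection resumes
    }
}

fun main() {
    val sent = mutableListOf<String>()
    val loop = VoiceLoop { sent.add(it) }
    loop.onWakeWord(nowMs = 0)
    loop.onSpeech("what time", nowMs = 1_000)
    loop.onSpeech("is it", nowMs = 6_000) // 5 s pause: still capturing
    loop.onTick(nowMs = 12_000)           // only 6 s of silence: keep waiting
    loop.onTick(nowMs = 16_000)           // 10 s of silence: send and resume
    println(sent)       // [what time is it]
    println(loop.state) // WAITING_FOR_WAKE_WORD
}
```

Ignoring events that arrive in the wrong state means a second wake word spoken mid-sentence is treated as ordinary speech rather than restarting capture.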
4. Enable TTS (Optional)
For a full voice conversation experience:
- Enable Voice On (speaker icon)
- Say "Hey Alfred" → speak your question → Alfred responds verbally
- Say "Hey Alfred" again for the next question
Technical Details
Model
- Vosk Small English Model (vosk-model-small-en-us-0.15)
- Size: ~39MB
- Location: app/src/main/assets/vosk-model/
- On-device processing: No internet required, completely private
Accuracy
- Works best in quiet environments
- Optimized for American English
- May occasionally false-trigger on similar-sounding words
Privacy
- All speech recognition happens on-device
- No audio data sent to external servers
- Only transcribed text is sent to OpenClaw gateway (as with manual voice input)
Performance
- CPU usage: Low (the small Vosk model is lightweight)
- Battery impact: Moderate when wake word mode is enabled (continuous microphone access)
- Latency: ~100-500ms from wake word to voice input activation
Permissions
- Microphone: Required for wake word detection
- Requested automatically when you enable wake word mode
Troubleshooting
Wake word not detecting
- Check microphone permission - Grant in Android settings if denied
- Speak clearly - Say "Hey Alfred" or "Alfred" distinctly
- Reduce background noise - Works best in quiet environments
- Check volume - Speak at normal conversation volume
Battery drain
- Wake word mode uses continuous microphone access
- Disable wake word mode when not needed
- Use manual voice button for single commands
False positives
- Vosk may occasionally trigger on similar words ("Elford", "Alpha Fred", etc.)
- This is normal for lightweight on-device models
- False triggers will just open voice input briefly
Architecture
WakeWordDetector.kt
├── Vosk Model (assets/vosk-model/)
├── Continuous audio recording (16kHz)
├── Partial result processing
└── Wake word matching ("alfred", "hey alfred")
MainScreen.kt
├── Wake word toggle chip
├── Initialize detector on launch
├── Auto-trigger VoiceInputManager on detection
└── Display "Wake word detected!" message
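To make the "partial result processing" and "wake word matching" steps more concrete: Vosk reports partial results as a small JSON string with a `partial` field. The sketch below pulls that field out and checks it for the wake word. `extractPartial` and `isWakeWord` are hypothetical names, and a regex is used only to keep the example dependency-free; on Android a JSON parser such as `org.json.JSONObject` would be the more usual choice.

```kotlin
// Hypothetical helpers for illustration; not taken from WakeWordDetector.kt.

// Captures the value of the "partial" field in a Vosk partial-result JSON string.
val PARTIAL_FIELD = Regex("\"partial\"\\s*:\\s*\"([^\"]*)\"")

/** Returns the partial transcript, or null if the JSON has no "partial" field. */
fun extractPartial(json: String): String? =
    PARTIAL_FIELD.find(json)?.groupValues?.get(1)

/** True if the partial transcript contains "alfred" as a whole word. */
fun isWakeWord(partialJson: String): Boolean {
    val text = extractPartial(partialJson)?.lowercase() ?: return false
    // Both wake phrases ("alfred", "hey alfred") contain the word "alfred",
    // so checking for that single word covers either one.
    return text.split(Regex("\\s+")).contains("alfred")
}

fun main() {
    println(isWakeWord("""{ "partial" : "hey alfred" }"""))  // true
    println(isWakeWord("""{ "partial" : "" }"""))            // false
    println(isWakeWord("""{ "text" : "alfred" }"""))         // false
}
```

Reacting to partial results rather than waiting for a final result is one plausible way to keep the wake-word-to-voice-input latency low.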
Future Enhancements
Potential improvements:
- Custom wake word training
- Background service (wake word works when app is backgrounded)
- Larger/more accurate Vosk model option
- Multi-language support
- Configurable wake words via settings
Comparison: Wake Word vs Manual Voice
| Feature | Wake Word Mode | Manual Voice Button |
|---|---|---|
| Activation | Say "Hey Alfred" | Tap microphone button |
| Hands-free | ✅ Yes | ❌ No (requires tap) |
| Battery impact | Moderate | Low |
| Privacy | Full (on-device) | Full (on-device) |
| Accuracy | Good | Excellent |
| Background use | Not yet (app must be open) | Not yet (app must be open) |
Related Files
- Wake word logic: app/src/main/java/com/openclaw/alfred/voice/WakeWordDetector.kt
- UI integration: app/src/main/java/com/openclaw/alfred/ui/screens/MainScreen.kt
- Voice input: app/src/main/java/com/openclaw/alfred/voice/VoiceInputManager.kt
- TTS: app/src/main/java/com/openclaw/alfred/voice/TTSManager.kt
- Model: app/src/main/assets/vosk-model/
Enjoy hands-free conversations with Alfred! 🎤