# Wake Word Detection - "Hey Alfred"
## Overview
The Alfred Mobile app now includes **offline wake word detection** using [Vosk](https://alphacephei.com/vosk/), an open-source speech recognition toolkit. This allows hands-free voice interaction by continuously listening for the wake phrase.
## Wake Words
The app listens for:
- **"Hey Alfred"**
- **"Alfred"**
When either phrase is detected, voice input automatically starts.
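
As a rough illustration of how a transcript could be checked for these phrases, here is a minimal sketch. The names `WAKE_PHRASES`, `normalize`, and `containsWakeWord` are hypothetical and not taken from `WakeWordDetector.kt`, whose actual matching logic may differ.

```kotlin
// Hypothetical helper for illustration; the real logic lives in WakeWordDetector.kt.
val WAKE_PHRASES = listOf("hey alfred", "alfred")

/** Lowercases the text, replaces non-letters with spaces, and collapses whitespace. */
fun normalize(text: String): String =
    text.lowercase()
        .replace(Regex("[^a-z\\s]"), " ")
        .trim()
        .replace(Regex("\\s+"), " ")

/** Returns true when any wake phrase appears as whole words in the transcript. */
fun containsWakeWord(transcript: String): Boolean {
    // Padding with spaces makes the first and last words match like any other word.
    val padded = " ${normalize(transcript)} "
    return WAKE_PHRASES.any { phrase -> padded.contains(" $phrase ") }
}
```

With this sketch, `containsWakeWord("Hey Alfred, what's the weather?")` returns `true`, while `containsWakeWord("alfredo sauce")` returns `false` because matching is done on whole words.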
## How to Use
### 1. Enable Wake Word Mode
In the app's status bar (below the top bar), you'll see two toggle chips:
- **Wake Word** (keyboard icon) - Enable/disable continuous listening
- **Voice Off/On** (speaker icon) - Enable/disable TTS responses
Tap **Wake Word** to enable continuous listening mode. The chip turns blue and its label changes to **"Always On"**.
### 2. Say the Wake Word
With wake word mode enabled, the app continuously listens for "Hey Alfred" or "Alfred" for as long as it stays open (background listening is not supported yet).
When detected:
1. You'll see a system message: "Wake word detected!"
2. Voice input automatically starts (microphone icon appears)
3. Speak your command/question
4. Voice input stops after a sustained pause (up to 10 seconds of silence is allowed for natural pauses)
5. Message auto-sends to Alfred
### 3. Normal Conversation
After the wake word triggers voice input:
- **Speech pauses**: The app allows up to 10 seconds of silence for natural speaking rhythm
- **Auto-send**: Your message sends automatically when voice input completes
- **Detection resumes**: After sending, wake word detection restarts automatically
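
The cycle above (listen, capture, send, listen again) can be pictured as a small state machine. The following is only a sketch under stated assumptions: `VoiceState`, `VoiceEvent`, `SILENCE_TIMEOUT_MS`, and `nextState` are hypothetical names, not the app's actual `VoiceInputManager` API, and it assumes that capture ends once silence reaches the full 10 seconds.

```kotlin
// Hypothetical names throughout; this is not the app's actual API.
enum class VoiceState { LISTENING_FOR_WAKE_WORD, CAPTURING_SPEECH, SENDING }

sealed interface VoiceEvent {
    object WakeWordDetected : VoiceEvent
    data class Silence(val millis: Long) : VoiceEvent
    object MessageSent : VoiceEvent
}

// The 10-second silence allowance described above (assumed to be the cutoff).
const val SILENCE_TIMEOUT_MS = 10_000L

/** Pure transition function: returns the next state for a given event. */
fun nextState(state: VoiceState, event: VoiceEvent): VoiceState = when (state) {
    VoiceState.LISTENING_FOR_WAKE_WORD -> {
        if (event is VoiceEvent.WakeWordDetected) VoiceState.CAPTURING_SPEECH else state
    }
    VoiceState.CAPTURING_SPEECH -> {
        // Shorter pauses are tolerated; only a full timeout ends capture and auto-sends.
        val timedOut = event is VoiceEvent.Silence && event.millis >= SILENCE_TIMEOUT_MS
        if (timedOut) VoiceState.SENDING else state
    }
    VoiceState.SENDING -> {
        // Once the message is sent, wake word detection resumes.
        if (event is VoiceEvent.MessageSent) VoiceState.LISTENING_FOR_WAKE_WORD else state
    }
}
```

Keeping the transition logic in a pure function like this makes the loop easy to unit-test without a microphone; the real app presumably drives equivalent transitions from recognizer callbacks.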
### 4. Enable TTS (Optional)
For a full voice conversation experience:
1. Enable **Voice On** (speaker icon)
2. Say "Hey Alfred" → speak your question → Alfred responds verbally
3. Say "Hey Alfred" again for the next question
## Technical Details
### Model
- **Vosk Small English Model** (vosk-model-small-en-us-0.15)
- **Size**: ~39 MB
- **Location**: `app/src/main/assets/vosk-model/`
- **On-device processing**: No internet required, completely private
### Accuracy
- Works best in quiet environments
- Optimized for American English
- May occasionally false-trigger on similar-sounding words
### Privacy
- All speech recognition happens **on-device**
- No audio data sent to external servers
- Only transcribed text is sent to the OpenClaw gateway (as with manual voice input)
### Performance
- **CPU usage**: Low (the small Vosk model is lightweight)
- **Battery impact**: Moderate while wake word mode is enabled (continuous microphone access)
- **Latency**: ~100-500 ms from wake word to voice input activation
### Permissions
- **Microphone**: Required for wake word detection
- Requested automatically when you enable wake word mode
## Troubleshooting
### Wake word not detecting
1. **Check microphone permission** - Grant it in Android settings if it was denied
2. **Speak clearly** - Say "Hey Alfred" or "Alfred" distinctly
3. **Reduce background noise** - Works best in quiet environments
4. **Check volume** - Speak at normal conversation volume
### Battery drain
- Wake word mode uses continuous microphone access
- Disable wake word mode when not needed
- Use the manual voice button for single commands
### False positives
- Vosk may occasionally trigger on similar words ("Elford", "Alpha Fred", etc.)
- This is normal for lightweight on-device models
- False triggers will just open voice input briefly
## Architecture
```
WakeWordDetector.kt
├── Vosk Model (assets/vosk-model/)
├── Continuous audio recording (16kHz)
├── Partial result processing
└── Wake word matching ("alfred", "hey alfred")
MainScreen.kt
├── Wake word toggle chip
├── Initialize detector on launch
├── Auto-trigger VoiceInputManager on detection
└── Display "Wake word detected!" message
```
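
To make the "partial result processing" step more concrete: Vosk reports in-progress recognition as a small JSON object with a `partial` field. The sketch below shows one way such a result could be inspected for the wake word. The helper names `extractPartialText` and `partialHasWakeWord` are hypothetical, and the regex-based parsing is a simplification chosen to keep the example dependency-free; the real detector may well use a JSON parser instead.

```kotlin
// Hypothetical helpers; WakeWordDetector.kt may parse and match differently.
private val PARTIAL_FIELD = Regex("\"partial\"\\s*:\\s*\"([^\"]*)\"")

/** Pulls the text out of a Vosk-style partial result such as {"partial" : "hey alfred"}. */
fun extractPartialText(json: String): String =
    PARTIAL_FIELD.find(json)?.groupValues?.get(1).orEmpty()

/** Checks the partial transcript for the word "alfred", which also covers "hey alfred". */
fun partialHasWakeWord(json: String): Boolean =
    extractPartialText(json)
        .lowercase()
        .split(Regex("\\s+"))
        .contains("alfred")
```

Checking partial results rather than waiting for a final result is what lets detection fire while the user is still speaking, at the cost of occasionally reacting to text the recognizer would later revise.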
## Future Enhancements
Potential improvements:
- [ ] Custom wake word training
- [ ] Background service (wake word works when app is backgrounded)
- [ ] Larger/more accurate Vosk model option
- [ ] Multi-language support
- [ ] Configurable wake words via settings
## Comparison: Wake Word vs Manual Voice
| Feature | Wake Word Mode | Manual Voice Button |
|---------|---------------|-------------------|
| Activation | Say "Hey Alfred" | Tap microphone button |
| Hands-free | ✅ Yes | ❌ No (requires tap) |
| Battery impact | Moderate | Low |
| Privacy | Full (on-device) | Full (on-device) |
| Accuracy | Good | Excellent |
| Background use | Not yet (app must be open) | Not yet (app must be open) |
## Related Files
- **Wake word logic**: `app/src/main/java/com/openclaw/alfred/voice/WakeWordDetector.kt`
- **UI integration**: `app/src/main/java/com/openclaw/alfred/ui/screens/MainScreen.kt`
- **Voice input**: `app/src/main/java/com/openclaw/alfred/voice/VoiceInputManager.kt`
- **TTS**: `app/src/main/java/com/openclaw/alfred/voice/TTSManager.kt`
- **Model**: `app/src/main/assets/vosk-model/`
---
**Enjoy hands-free conversations with Alfred!** 🎤