# Wake Word Detection - "Hey Alfred"

## Overview

The Alfred Mobile app now includes **offline wake word detection** using [Vosk](https://alphacephei.com/vosk/), an open-source speech recognition toolkit. This allows hands-free voice interaction by continuously listening for the wake phrase.

## Wake Words

The app listens for:

- **"Hey Alfred"**
- **"Alfred"**

When either phrase is detected, voice input automatically starts.
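
As a rough illustration of the matching step, the sketch below checks a lowercase transcript for "alfred" as a whole word, which covers both "Hey Alfred" and "Alfred". The helper name `containsWakeWord` and the regex are assumptions made for this example; the actual logic in `WakeWordDetector.kt` may differ.

```kotlin
// Hypothetical helper for illustration; not taken from WakeWordDetector.kt.
// Matching "alfred" as a whole word also covers "hey alfred".
val WAKE_WORD = Regex("\\balfred\\b")

/** Returns true when the transcript contains "alfred" as a standalone word. */
fun containsWakeWord(transcript: String): Boolean =
    WAKE_WORD.containsMatchIn(transcript.lowercase())

fun main() {
    println(containsWakeWord("hey alfred what's the weather")) // true
    println(containsWakeWord("Alfred"))                        // true
    println(containsWakeWord("alfredo sauce"))                 // false
}
```

Matching on a word boundary avoids triggering on longer words that merely contain the name, although it cannot prevent false triggers caused by the recognizer itself mishearing similar-sounding speech.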

## How to Use

### 1. Enable Wake Word Mode

In the app's status bar (below the top bar), you'll see two toggle chips:

- **Wake Word** (keyboard icon) - Enable/disable continuous listening
- **Voice Off/On** (speaker icon) - Enable/disable TTS responses

Tap **Wake Word** to enable continuous listening mode. The chip turns blue and reads **"Always On"**.

### 2. Say the Wake Word

With wake word mode enabled, the app continuously listens for "Hey Alfred" or "Alfred" while it is open.

When detected:

1. You'll see a system message: "Wake word detected!"
2. Voice input automatically starts (microphone icon appears)
3. Speak your command/question
4. Voice input stops after a pause (up to 10 seconds of silence is allowed for natural pauses)
5. Message auto-sends to Alfred

### 3. Normal Conversation

After the wake word triggers voice input:

- **Speech pauses**: The app allows up to 10 seconds of silence for natural speaking rhythm
- **Auto-send**: Your message sends automatically when voice input completes
- **Wake word loop**: After the message is sent, wake word detection resumes automatically
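
The detect, listen, send, and resume cycle described above can be pictured as a small state machine. The sketch below is only a model of that flow: the state and event names, the `nextState` function, and the `SILENCE_TIMEOUT_MS` constant are assumptions for illustration and are not taken from the app's source.

```kotlin
// Hypothetical model of the conversation loop; all names are illustrative.
const val SILENCE_TIMEOUT_MS = 10_000L // 10 seconds of allowed silence

enum class VoiceState { WAKE_LISTENING, VOICE_INPUT, SENDING }
enum class VoiceEvent { WAKE_WORD_DETECTED, SILENCE_TIMEOUT, MESSAGE_SENT }

/** True once the gap since the last detected speech reaches the timeout. */
fun isSilenceTimedOut(lastSpeechAtMs: Long, nowMs: Long): Boolean =
    nowMs - lastSpeechAtMs >= SILENCE_TIMEOUT_MS

/** Returns the next state; events that do not apply leave the state unchanged. */
fun nextState(state: VoiceState, event: VoiceEvent): VoiceState = when {
    state == VoiceState.WAKE_LISTENING && event == VoiceEvent.WAKE_WORD_DETECTED ->
        VoiceState.VOICE_INPUT
    state == VoiceState.VOICE_INPUT && event == VoiceEvent.SILENCE_TIMEOUT ->
        VoiceState.SENDING
    state == VoiceState.SENDING && event == VoiceEvent.MESSAGE_SENT ->
        VoiceState.WAKE_LISTENING // wake word detection resumes
    else -> state
}

fun main() {
    var state = VoiceState.WAKE_LISTENING
    val events = listOf(
        VoiceEvent.WAKE_WORD_DETECTED,
        VoiceEvent.SILENCE_TIMEOUT,
        VoiceEvent.MESSAGE_SENT,
    )
    for (event in events) {
        state = nextState(state, event)
        println("$event -> $state")
    }
}
```

Keeping the transitions in one pure function makes the loop easy to test without a microphone, and ignoring out-of-order events means a stray wake word during voice input does not restart the capture.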

### 4. Enable TTS (Optional)

For a full voice conversation experience:

1. Enable **Voice On** (speaker icon)
2. Say "Hey Alfred" → speak your question → Alfred responds verbally
3. Say "Hey Alfred" again for the next question

## Technical Details

### Model

- **Vosk Small English Model** (vosk-model-small-en-us-0.15)
- **Size**: ~39 MB
- **Location**: `app/src/main/assets/vosk-model/`
- **On-device processing**: No internet required, completely private

### Accuracy

- Works best in quiet environments
- Optimized for American English
- May occasionally false-trigger on similar-sounding words

### Privacy

- All speech recognition happens **on-device**
- No audio data is sent to external servers
- Only transcribed text is sent to the OpenClaw gateway (as with manual voice input)

### Performance

- **CPU usage**: Low (Vosk uses a lightweight model)
- **Battery impact**: Moderate when wake word mode is enabled (continuous microphone access)
- **Latency**: ~100-500 ms from wake word to voice input activation

### Permissions

- **Microphone**: Required for wake word detection
- Requested automatically when you enable wake word mode

## Troubleshooting

### Wake word not detecting

1. **Check microphone permission** - Grant in Android settings if denied
2. **Speak clearly** - Say "Hey Alfred" or "Alfred" distinctly
3. **Reduce background noise** - Works best in quiet environments
4. **Check volume** - Speak at normal conversation volume

### Battery drain

- Wake word mode uses continuous microphone access
- Disable wake word mode when not needed
- Use the manual voice button for single commands

### False positives

- Vosk may occasionally trigger on similar words ("Elford", "Alpha Fred", etc.)
- This is normal for lightweight on-device models
- A false trigger only opens voice input briefly

## Architecture

```
WakeWordDetector.kt
├── Vosk Model (assets/vosk-model/)
├── Continuous audio recording (16 kHz)
├── Partial result processing
└── Wake word matching ("alfred", "hey alfred")

MainScreen.kt
├── Wake word toggle chip
├── Initialize detector on launch
├── Auto-trigger VoiceInputManager on detection
└── Display "Wake word detected!" message
```
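
To make the "partial result processing" step more concrete: Vosk reports partial hypotheses as small JSON strings of the form `{ "partial" : "hey alfred" }`. The sketch below pulls that field out with a regular expression so it stays dependency-free. The function name `extractPartial` is an assumption for this example, and on Android a JSON parser such as `org.json.JSONObject` would likely be the more robust choice.

```kotlin
// Hypothetical helper; the real parsing in WakeWordDetector.kt may differ.
// Captures the string value of the "partial" field in a Vosk partial result.
val PARTIAL_FIELD = Regex("\"partial\"\\s*:\\s*\"([^\"]*)\"")

/** Returns the partial transcript, or null when the JSON has no "partial" field. */
fun extractPartial(json: String): String? =
    PARTIAL_FIELD.find(json)?.groupValues?.get(1)

fun main() {
    println(extractPartial("{ \"partial\" : \"hey alfred\" }")) // hey alfred
    println(extractPartial("{ \"text\" : \"final result\" }"))   // null
}
```

Working on partial results rather than waiting for the final transcript is what lets the wake phrase be matched while the user is still speaking.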

## Future Enhancements

Potential improvements:

- [ ] Custom wake word training
- [ ] Background service (wake word works when app is backgrounded)
- [ ] Larger/more accurate Vosk model option
- [ ] Multi-language support
- [ ] Configurable wake words via settings

## Comparison: Wake Word vs Manual Voice

| Feature | Wake Word Mode | Manual Voice Button |
|---------|----------------|---------------------|
| Activation | Say "Hey Alfred" | Tap microphone button |
| Hands-free | ✅ Yes | ❌ No (requires tap) |
| Battery impact | Moderate | Low |
| Privacy | Full (on-device) | Full (on-device) |
| Accuracy | Good | Excellent |
| Background use | Not yet (app must be open) | Not yet (app must be open) |

## Related Files

- **Wake word logic**: `app/src/main/java/com/openclaw/alfred/voice/WakeWordDetector.kt`
- **UI integration**: `app/src/main/java/com/openclaw/alfred/ui/screens/MainScreen.kt`
- **Voice input**: `app/src/main/java/com/openclaw/alfred/voice/VoiceInputManager.kt`
- **TTS**: `app/src/main/java/com/openclaw/alfred/voice/TTSManager.kt`
- **Model**: `app/src/main/assets/vosk-model/`

---

**Enjoy hands-free conversations with Alfred!** 🎤