Initial commit: Alfred Mobile - AI Assistant Android App
- OAuth authentication via Authentik
- WebSocket connection to OpenClaw gateway
- Configurable gateway URL with first-run setup
- User preferences sync across devices
- Multi-user support with custom assistant names
- ElevenLabs TTS integration (local + remote)
- FCM push notifications for alarms
- Voice input via Google Speech API
- No hardcoded secrets or internal IPs in tracked files
WAKE_WORD.md

# Wake Word Detection - "Hey Alfred"

## Overview

The Alfred Mobile app now includes **offline wake word detection** using [Vosk](https://alphacephei.com/vosk/), an open-source speech recognition toolkit. This allows hands-free voice interaction by continuously listening for the wake phrase.

## Wake Words

The app listens for:

- **"Hey Alfred"**
- **"Alfred"**

When either phrase is detected, voice input automatically starts.
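
The phrase check can be sketched in a few lines of Kotlin. This is a minimal sketch, not the code in `WakeWordDetector.kt`: the names `WAKE_PHRASES`, `normalize`, and `findWakePhrase` are hypothetical, and it assumes the recognizer hands back plain transcribed text. It matches whole words only, so a word such as "alfredo" does not count as "alfred".

```kotlin
// Hypothetical helper names; the app's real matching logic may differ.
val WAKE_PHRASES = listOf("hey alfred", "alfred")

/** Lowercases the text and collapses every run of non-letters into a single space. */
fun normalize(text: String): String =
    text.lowercase()
        .replace(Regex("[^a-z]+"), " ")
        .trim()

/** Returns the first wake phrase that appears as whole words in the transcript, or null. */
fun findWakePhrase(transcript: String, phrases: List<String> = WAKE_PHRASES): String? {
    // Padding with spaces lets a plain substring test act as a whole-word test.
    val padded = " ${normalize(transcript)} "
    return phrases.firstOrNull { padded.contains(" $it ") }
}
```

For example, `findWakePhrase("Hey, Alfred!")` returns `"hey alfred"`, while `findWakePhrase("fettuccine alfredo")` returns `null`.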

## How to Use

### 1. Enable Wake Word Mode

In the app's status bar (below the top bar), you'll see two toggle chips:

- **Wake Word** (keyboard icon) - Enable/disable continuous listening
- **Voice Off/On** (speaker icon) - Enable/disable TTS responses

Tap **Wake Word** to enable continuous listening mode. The chip will turn blue and say **"Always On"**.

### 2. Say the Wake Word

With wake word mode enabled, the app continuously listens for "Hey Alfred" or "Alfred" in the background.

When detected:

1. You'll see a system message: "Wake word detected!"
2. Voice input automatically starts (microphone icon appears)
3. Speak your command or question
4. Voice input stops after a pause (up to 10 seconds of silence is allowed for natural pauses)
5. The message is sent to Alfred automatically
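
The detection steps above can be read as a small state machine. The sketch below is an illustration only: `VoiceState`, `VoiceEvent`, `nextState`, and `replay` are hypothetical names, and the real flow in `MainScreen.kt` may be organized differently. It assumes one state for waiting on the wake word, one for capturing speech, and one for sending, and it returns to waiting once the message has been sent.

```kotlin
// Hypothetical states and events; not taken from the app's source.
enum class VoiceState { WAITING_FOR_WAKE_WORD, CAPTURING_SPEECH, SENDING }
enum class VoiceEvent { WAKE_WORD_DETECTED, SPEECH_ENDED, MESSAGE_SENT }

/** Pure transition function: events that do not apply to the current state are ignored. */
fun nextState(state: VoiceState, event: VoiceEvent): VoiceState = when {
    state == VoiceState.WAITING_FOR_WAKE_WORD && event == VoiceEvent.WAKE_WORD_DETECTED ->
        VoiceState.CAPTURING_SPEECH
    state == VoiceState.CAPTURING_SPEECH && event == VoiceEvent.SPEECH_ENDED ->
        VoiceState.SENDING
    state == VoiceState.SENDING && event == VoiceEvent.MESSAGE_SENT ->
        VoiceState.WAITING_FOR_WAKE_WORD
    else -> state
}

/** Applies a sequence of events in order, starting from the waiting state by default. */
fun replay(
    events: List<VoiceEvent>,
    start: VoiceState = VoiceState.WAITING_FOR_WAKE_WORD
): VoiceState = events.fold(start) { state, event -> nextState(state, event) }
```

Keeping the transition function pure makes it easy to unit test without a microphone or an Android device.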

### 3. Normal Conversation

After the wake word triggers voice input:

- **Speech pauses**: The app allows up to 10 seconds of silence for natural speaking rhythm
- **Auto-send**: Your message sends automatically when voice input completes
- **Wake word loops**: After sending, wake word detection resumes automatically
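
One way to reason about the 10-second pause allowance is a timer that remembers when speech was last heard. The class below is a hedged sketch, not the app's implementation: `SilenceTimer` and its methods are hypothetical, and the app may instead rely on the platform recognizer's own silence handling. Timestamps are passed in by the caller so the logic can run without Android; on a device they could come from `SystemClock.elapsedRealtime()`.

```kotlin
/**
 * Hypothetical helper that reports when the allowed pause has been exceeded.
 * startMs is the moment voice input began; maxSilenceMs defaults to the
 * 10 seconds described in this document.
 */
class SilenceTimer(startMs: Long, private val maxSilenceMs: Long = 10_000L) {
    private var lastActivityMs: Long = startMs

    /** Call whenever the recognizer reports that the user is speaking. */
    fun onSpeech(nowMs: Long) {
        lastActivityMs = nowMs
    }

    /** True once more than maxSilenceMs has passed since the last speech. */
    fun hasTimedOut(nowMs: Long): Boolean = nowMs - lastActivityMs > maxSilenceMs
}
```

A silence of exactly 10 seconds is still allowed in this sketch; input ends only once the limit is exceeded.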

### 4. Enable TTS (Optional)

For a full voice conversation experience:

1. Enable **Voice On** (speaker icon)
2. Say "Hey Alfred" → speak your question → Alfred responds verbally
3. Say "Hey Alfred" again for the next question

## Technical Details

### Model

- **Vosk Small English Model** (vosk-model-small-en-us-0.15)
- **Size**: ~39 MB
- **Location**: `app/src/main/assets/vosk-model/`
- **On-device processing**: No internet required, completely private

### Accuracy

- Works best in quiet environments
- Optimized for American English
- May occasionally false-trigger on similar-sounding words

### Privacy

- All speech recognition happens **on-device**
- No audio data is sent to external servers
- Only transcribed text is sent to the OpenClaw gateway (as with manual voice input)

### Performance

- **CPU usage**: Low (Vosk uses a lightweight model)
- **Battery impact**: Moderate when wake word mode is enabled (continuous microphone access)
- **Latency**: ~100-500 ms from wake word to voice input activation
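
Part of that latency comes from how much audio is buffered before the recognizer sees it. The arithmetic below is a rough sketch under stated assumptions: the 16 kHz sample rate comes from the architecture notes in this document, while 16-bit mono PCM and the function names are assumptions, and the buffer size the app actually uses is not stated here.

```kotlin
// Sample rate is taken from this document; the other two values are assumptions.
val SAMPLE_RATE_HZ = 16_000
val BYTES_PER_SAMPLE = 2   // assumption: 16-bit PCM
val CHANNELS = 1           // assumption: mono

/** Bytes of raw audio produced per second of recording. */
fun bytesPerSecond(): Int = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS

/** How many milliseconds of audio a buffer of the given size holds. */
fun bufferDurationMs(bufferBytes: Int): Double =
    bufferBytes * 1000.0 / bytesPerSecond()

/** Buffer size in bytes needed to hold the given duration of audio. */
fun bufferBytesFor(durationMs: Int): Int = bytesPerSecond() * durationMs / 1000
```

Under these assumptions the stream is 32,000 bytes per second, so a 3,200-byte buffer holds 100 ms of audio and a 200 ms buffer needs 6,400 bytes.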

### Permissions

- **Microphone**: Required for wake word detection
- Requested automatically when you enable wake word mode

## Troubleshooting

### Wake word not detecting

1. **Check microphone permission** - Grant in Android settings if denied
2. **Speak clearly** - Say "Hey Alfred" or "Alfred" distinctly
3. **Reduce background noise** - Works best in quiet environments
4. **Check volume** - Speak at normal conversation volume

### Battery drain

- Wake word mode uses continuous microphone access
- Disable wake word mode when not needed
- Use manual voice button for single commands

### False positives

- Vosk may occasionally trigger on similar words ("Elford", "Alpha Fred", etc.)
- This is normal for lightweight on-device models
- False triggers will just open voice input briefly
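
A common way to limit the cost of spurious or repeated triggers is a short cooldown after each detection. This is a hypothetical mitigation, not something this document says the app does: `TriggerCooldown` and the 2-second default are assumptions chosen for illustration. It is relevant because a recognizer that emits partial results can report the same phrase several times while the user is still speaking.

```kotlin
/**
 * Hypothetical guard that suppresses detections arriving too soon after
 * the previous accepted one. Timestamps are supplied by the caller.
 */
class TriggerCooldown(private val cooldownMs: Long = 2_000L) {
    private var lastTriggerMs: Long? = null

    /** Returns true if this detection should fire, false if it falls inside the cooldown. */
    fun shouldTrigger(nowMs: Long): Boolean {
        val last = lastTriggerMs
        if (last != null && nowMs - last < cooldownMs) return false
        lastTriggerMs = nowMs
        return true
    }
}
```

Only accepted detections restart the window, so a burst of repeated partial results produces a single trigger.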

## Architecture

```
WakeWordDetector.kt
├── Vosk Model (assets/vosk-model/)
├── Continuous audio recording (16kHz)
├── Partial result processing
└── Wake word matching ("alfred", "hey alfred")

MainScreen.kt
├── Wake word toggle chip
├── Initialize detector on launch
├── Auto-trigger VoiceInputManager on detection
└── Display "Wake word detected!" message
```
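
The "partial result processing" step can be illustrated with a small parser. Vosk reports partial hypotheses as a JSON string with a `partial` field; the sketch below pulls that field out with a regular expression so it stays free of dependencies. The function names are hypothetical, and an Android implementation would more likely use `org.json.JSONObject`. Since both wake phrases contain the word "alfred", the check only looks for that word.

```kotlin
// Matches the "partial" field of a Vosk partial-result JSON string.
private val PARTIAL_FIELD = Regex("\"partial\"\\s*:\\s*\"([^\"]*)\"")

/** Returns the partial transcript, or null if the JSON has no "partial" field. */
fun extractPartialText(hypothesisJson: String): String? =
    PARTIAL_FIELD.find(hypothesisJson)?.groupValues?.get(1)

/** True if the partial transcript contains "alfred" as a whole word. */
fun partialHasWakeWord(hypothesisJson: String): Boolean =
    extractPartialText(hypothesisJson)?.split(' ')?.contains("alfred") == true
```

Final results use a different field name, so this sketch deliberately returns `null` for them and leaves their handling to the caller.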

## Future Enhancements

Potential improvements:

- [ ] Custom wake word training
- [ ] Background service (wake word works when app is backgrounded)
- [ ] Larger/more accurate Vosk model option
- [ ] Multi-language support
- [ ] Configurable wake words via settings

## Comparison: Wake Word vs Manual Voice

| Feature | Wake Word Mode | Manual Voice Button |
|---------|----------------|---------------------|
| Activation | Say "Hey Alfred" | Tap microphone button |
| Hands-free | ✅ Yes | ❌ No (requires tap) |
| Battery impact | Moderate | Low |
| Privacy | Full (on-device) | Full (on-device) |
| Accuracy | Good | Excellent |
| Background use | Not yet (app must be open) | Not yet (app must be open) |

## Related Files

- **Wake word logic**: `app/src/main/java/com/openclaw/alfred/voice/WakeWordDetector.kt`
- **UI integration**: `app/src/main/java/com/openclaw/alfred/ui/screens/MainScreen.kt`
- **Voice input**: `app/src/main/java/com/openclaw/alfred/voice/VoiceInputManager.kt`
- **TTS**: `app/src/main/java/com/openclaw/alfred/voice/TTSManager.kt`
- **Model**: `app/src/main/assets/vosk-model/`

---

**Enjoy hands-free conversations with Alfred!** 🎤