Initial commit: Alfred Mobile - AI Assistant Android App

- OAuth authentication via Authentik
- WebSocket connection to OpenClaw gateway
- Configurable gateway URL with first-run setup
- User preferences sync across devices
- Multi-user support with custom assistant names
- ElevenLabs TTS integration (local + remote)
- FCM push notifications for alarms
- Voice input via Google Speech API
- No hardcoded secrets or internal IPs in tracked files
2026-02-09 11:12:51 -08:00
commit 6d4ae2e5c3
92 changed files with 15173 additions and 0 deletions
--- a/WAKE_WORD_ALTERNATIVES.md
+++ b/WAKE_WORD_ALTERNATIVES.md
@@ -0,0 +1,161 @@
+# Open-Source Wake Word Alternatives
+
+## Vosk (Recommended ✅)
+
+**Best open-source option for Android**
+
+**Pros:**
+- ✅ Fully open source (Apache 2.0)
+- ✅ Actively maintained
+- ✅ Excellent Android support
+- ✅ Small models (20-50MB)
+- ✅ Fast, on-device processing
+- ✅ No API keys, no accounts
+- ✅ Works offline
+- ✅ Can do continuous keyword spotting
+
+**Cons:**
+- ❌ Not as battery-optimized as Porcupine
+- ❌ Slightly larger model size
+- ❌ More CPU intensive
+
+**Implementation:**
+```kotlin
+// Add to build.gradle.kts
+implementation("com.alphacephei:vosk-android:0.3.47")
+
+// Download small model (~40MB)
+// https://alphacephei.com/vosk/models
+// vosk-model-small-en-us-0.15.zip
+```
+
+**How it works:**
+- Continuous speech recognition
+- Listen for "alfred" or "hey alfred" in the audio stream
+- When detected, trigger voice input
+- Can even extract what the user said after the wake word!
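+
+A minimal sketch of the detection step, assuming the caller has already pulled the plain transcript out of the recognizer's result (Vosk reports hypotheses as JSON strings with a `text` or `partial` field). `WAKE_PHRASES` and `containsWakeWord` are hypothetical names for illustration, not part of the Vosk API:
+
+```kotlin
+// Hypothetical helper, not a Vosk API. Input is plain transcript text.
+val WAKE_PHRASES = listOf("hey alfred", "alfred")
+
+fun containsWakeWord(transcript: String): Boolean {
+    // Normalize case and whitespace, then split into words.
+    val words = transcript.lowercase().trim()
+        .split(Regex("\\s+"))
+        .filter { it.isNotEmpty() }
+    // Compare whole words only, so "alfredo" does not trigger the assistant.
+    return WAKE_PHRASES.any { phrase ->
+        val phraseWords = phrase.split(" ")
+        words.windowed(phraseWords.size).any { it == phraseWords }
+    }
+}
+```
+
+Matching on whole words rather than substrings is a design choice to cut down on false triggers; whether it is strict enough in practice would need testing against real recognizer output.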
+
+**Battery Impact:**
+- Moderate (roughly 2-3% per hour; an estimate that will vary by device and model)
+- Can be optimized with shorter recognition windows
+
+---
+
+## Pocketsphinx
+
+**The OG open-source speech recognition**
+
+**Pros:**
+- ✅ Fully open source (BSD license)
+- ✅ Mature, proven technology (CMU)
+- ✅ Android library available
+- ✅ Very customizable
+- ✅ No external dependencies
+
+**Cons:**
+- ❌ Lower accuracy than modern solutions
+- ❌ Older API, less documentation
+- ❌ Harder to set up
+- ❌ Higher battery usage
+
+**Implementation:**
+```kotlin
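+// Add to build.gradle.kts
+// Note: this coordinate may not resolve from Maven Central; the library has
+// typically been shipped as an AAR alongside the pocketsphinx-android demo project.
+implementation("edu.cmu.pocketsphinx:pocketsphinx-android:5prealpha-SNAPSHOT")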
+```
+
+---
+
+## Android AlwaysOnHotwordDetector
+
+**Built into Android (5.0+, API level 21)**
+
+**Pros:**
+- ✅ Zero dependencies
+- ✅ System-level battery optimization
+- ✅ Built into Android
+
+**Cons:**
+- ❌ Only works with system wake words ("Ok Google", etc.)
+- ❌ Can't train custom "Alfred" wake word
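+- ❌ Only available to the active `VoiceInteractionService` (the user's default assistant), and keyphrase enrollment needs system-level privileges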
+- ❌ Limited control
+
+**Not recommended** for custom wake words.
+
+---
+
+## TensorFlow Lite + Custom Model
+
+**Roll your own**
+
+**Pros:**
+- ✅ Complete control
+- ✅ Open source
+- ✅ Can be very efficient if done right
+
+**Cons:**
+- ❌ Need to train your own model
+- ❌ Need training data (recordings of "Alfred")
+- ❌ Complex implementation
+- ❌ High development time (weeks)
+
+**Not recommended** unless you want a fun project.
+
+---
+
+## Recommendation: Vosk
+
+**Why Vosk is the best choice:**
+
+1. **True Open Source**
+   - No vendor lock-in
+   - Apache 2.0 license
+   - Active community
+
+2. **Good Balance**
+   - Decent battery life (not as good as Porcupine, but acceptable)
+   - Good accuracy
+   - Easy to implement
+   - Well-documented
+
+3. **Bonus Features**
+   - Can transcribe what the user said AFTER "Alfred"
+   - So "Hey Alfred, what's the weather" could extract "what's the weather" directly
+   - This could skip the voice input step entirely!
+
+4. **No Account/API Key Required**
+   - Just download the model
+   - Bundle it with the app
+   - Done!
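+
+The command extraction described under Bonus Features could look roughly like the sketch below. `extractCommand` and its wake phrase list are hypothetical names for illustration, not part of Vosk, and the function assumes the caller has already obtained the plain transcript text from the recognizer:
+
+```kotlin
+// Hypothetical helper, not a Vosk API.
+// Returns the text following the wake phrase, or null when no wake phrase
+// is present or nothing was said after it.
+fun extractCommand(transcript: String): String? {
+    val text = transcript.lowercase().trim().replace(Regex("\\s+"), " ")
+    // Longest phrase first, so "hey alfred" is preferred over "alfred".
+    val wakePhrases = listOf("hey alfred", "alfred")
+    for (phrase in wakePhrases) {
+        val match = Regex("\\b" + Regex.escape(phrase) + "\\b").find(text) ?: continue
+        // Keep everything after the wake phrase, dropping a leading comma.
+        val command = text.substring(match.range.last + 1)
+            .trim().trimStart(',').trim()
+        return command.ifEmpty { null }
+    }
+    return null
+}
+```
+
+For example, `extractCommand("Hey Alfred, what's the weather")` returns `"what's the weather"`. A null result could fall back to the regular voice input step.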
+
+---
+
+## Implementation Complexity
+
+**Vosk:**
+- Setup: ~30 minutes (download model, add dependency)
+- Code: ~1-2 hours
+- Total: ~2-3 hours
+
+**Pocketsphinx:**
+- Setup: ~1 hour (configure, download models)
+- Code: ~3-4 hours (harder API)
+- Total: ~4-5 hours
+
+---
+
+## My Recommendation
+
+**Go with Vosk.**
+
+It's the best balance of:
+- Open source ethos ✅
+- Easy implementation ✅
+- Good accuracy ✅
+- Reasonable battery usage ✅
+- Active development ✅
+
+And the bonus feature of potentially extracting the full command ("Hey Alfred, what's the weather?") means we could make the UX even better than Porcupine!
+
+Want me to implement Vosk wake word detection?