# Open-Source Wake Word Alternatives

## Vosk (Recommended ✅)

**Best open-source option for Android**

**Pros:**
- ✅ Fully open source (Apache 2.0)
- ✅ Actively maintained
- ✅ Excellent Android support
- ✅ Small models (20-50MB)
- ✅ Fast, on-device processing
- ✅ No API keys, no accounts
- ✅ Works offline
- ✅ Can do continuous keyword spotting

**Cons:**
- ❌ Not as battery-optimized as Porcupine
- ❌ Slightly larger model size than Porcupine
- ❌ More CPU intensive

**Implementation:**
```kotlin
// Add to build.gradle.kts
implementation("com.alphacephei:vosk-android:0.3.47")

// Download small model (~40MB)
// https://alphacephei.com/vosk/models
// vosk-model-small-en-us-0.15.zip
```

**How it works:**
- Runs continuous speech recognition on the microphone stream
- Listens for "alfred" or "hey alfred" in the recognized text
- When detected, triggers voice input
- Can even extract what the user said after the wake word!

**Battery Impact:**
- Moderate (~2-3% per hour)
- Can be optimized with shorter recognition windows

---

## Pocketsphinx

**The OG open-source speech recognition**

**Pros:**
- ✅ Fully open source (BSD license)
- ✅ Mature, proven technology (CMU)
- ✅ Android library available
- ✅ Very customizable
- ✅ No external dependencies

**Cons:**
- ❌ Lower accuracy than modern solutions
- ❌ Older API, less documentation
- ❌ Harder to set up
- ❌ Higher battery usage

**Implementation:**
```kotlin
// Add to build.gradle.kts
implementation("edu.cmu.pocketsphinx:pocketsphinx-android:5prealpha-SNAPSHOT")
```

---

## Android AlwaysOnHotwordDetector

**Built into Android (API 21 / Android 5.0+)**

**Pros:**
- ✅ Zero dependencies
- ✅ System-level battery optimization
- ✅ Built into Android

**Cons:**
- ❌ Only works with system wake words ("Ok Google", etc.)
- ❌ Can't train a custom "Alfred" wake word
- ❌ Requires special system-level permissions that regular apps generally can't get
- ❌ Limited control

**Not recommended** for custom wake words.


---

## TensorFlow Lite + Custom Model

**Roll your own**

**Pros:**
- ✅ Complete control
- ✅ Open source
- ✅ Can be very efficient if done right

**Cons:**
- ❌ Need to train your own model
- ❌ Need training data (recordings of "Alfred")
- ❌ Complex implementation
- ❌ High development time (weeks)

**Not recommended** unless you want a fun project.

---

## Recommendation: Vosk

**Why Vosk is the best choice:**

1. **True Open Source**
   - No vendor lock-in
   - Apache 2.0 license
   - Active community

2. **Good Balance**
   - Decent battery life (not as good as Porcupine, but acceptable)
   - Good accuracy
   - Easy to implement
   - Well-documented

3. **Bonus Features**
   - Can transcribe what they said AFTER "Alfred"
   - So "Hey Alfred, what's the weather" could extract "what's the weather" directly
   - This could skip the voice input step entirely!

4. **No Account/API Key Required**
   - Just download the model
   - Bundle it with the app
   - Done!

---

## Implementation Complexity

**Vosk:**
- Setup: ~30 minutes (download model, add dependency)
- Code: ~1-2 hours
- Total: ~2-3 hours

**Pocketsphinx:**
- Setup: ~1 hour (configure, download models)
- Code: ~3-4 hours (harder API)
- Total: ~4-5 hours

---

## My Recommendation

**Go with Vosk.** It's the best balance of:
- Open source ethos ✅
- Easy implementation ✅
- Good accuracy ✅
- Reasonable battery usage ✅
- Active development ✅

And the bonus feature of potentially extracting the full command ("Hey Alfred, what's the weather?") means we could make the UX even better than Porcupine!

Want me to implement Vosk wake word detection?
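
To make the "extract the full command" idea concrete, here is a minimal sketch of just the text-matching step in plain Kotlin. It assumes the recognized text has already been pulled out of the recognizer's result as a plain string, so it does not touch the Vosk API at all. The names `WAKE_PHRASES` and `extractCommand` are hypothetical, chosen for illustration.

```kotlin
// Hypothetical wake phrases, longest first so "hey alfred" is tried before "alfred".
val WAKE_PHRASES = listOf("hey alfred", "alfred")

/**
 * Returns the text spoken after the wake phrase, an empty string if the
 * wake phrase was the last thing said, or null if no wake phrase was found.
 */
fun extractCommand(transcript: String): String? {
    // Normalize defensively: lowercase, drop punctuation, collapse whitespace.
    val normalized = transcript.lowercase()
        .replace(Regex("[^a-z0-9' ]"), " ")
        .replace(Regex("\\s+"), " ")
        .trim()

    for (phrase in WAKE_PHRASES) {
        // Word boundaries keep "alfredo" from matching "alfred".
        val match = Regex("\\b${Regex.escape(phrase)}\\b").find(normalized) ?: continue
        return normalized.substring(match.range.last + 1).trim()
    }
    return null
}
```

A non-null, non-empty result could be sent straight to the command handler, while an empty string would fall back to opening the normal voice input. Real transcripts will contain misrecognitions of "Alfred", so the phrase list would likely need tuning on device.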