Commit Graph

10 Commits

Author SHA1 Message Date
4d6dd6d35d Include pvporcupine resource files in PyInstaller build
PyInstaller wasn't bundling pvporcupine's resource files (keyword_files
and lib directories), causing a FileNotFoundError at runtime when
pvporcupine tried to access its resources directory.

Changes:
- Added code to detect and include pvporcupine resources and lib folders
- Falls back gracefully if pvporcupine is not installed
- Resources are bundled even though we don't use wake word features
  (pvporcupine initializes and checks for these on import)

This fixes the runtime error:
FileNotFoundError: [WinError 3] The system cannot find the path
specified: '...\pvporcupine\resources/keyword_files\windows'

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 19:57:13 -08:00
9b7f2e1d69 Fix enum34 error by excluding it in PyInstaller spec
The previous approach of uninstalling enum34 before PyInstaller didn't
work because 'uv run' re-syncs dependencies. The proper solution is to
exclude enum34 directly in the PyInstaller spec file.

Changes:
- Added hooks/hook-enum34.py: Custom PyInstaller hook to exclude enum34
- Updated local-transcription.spec:
  - Added 'hooks' to hookspath
  - Added 'enum34' to excludes list
- Updated build.sh and build.bat:
  - Removed enum34 uninstall step (no longer needed)
  - Added comment explaining enum34 is excluded in spec

Why this works:
- PyInstaller's excludes list prevents enum34 from being bundled
- The custom hook provides documentation and explicit exclusion
- enum34 can remain installed in venv (won't break anything)
- Works regardless of 'uv run' re-syncing dependencies

enum34 is an obsolete Python 2.7/3.3 backport that's incompatible with
PyInstaller and unnecessary on Python 3.4+ (enum is in stdlib).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 19:32:23 -08:00
20a7764bab Add application icon support for GUI and compiled executables
Added platform-specific icon support for both the running application
and compiled executables:

New files:
- create_icons.py: Script to convert PNG to platform-specific formats
  - Generates .ico for Windows (16, 32, 48, 256px sizes)
  - Generates .iconset for macOS (ready for iconutil conversion)
- LocalTranscription.png: Source icon image
- LocalTranscription.ico: Windows icon file (multi-size)
- LocalTranscription.iconset/: macOS icon set (needs iconutil on macOS)

GUI changes:
- main.py: Set application-wide icon for taskbar/dock
- main_window_qt.py: Set window icon for GUI window

Build configuration:
- local-transcription.spec: Use platform-specific icons in PyInstaller
  - Windows builds use LocalTranscription.ico
  - macOS builds use LocalTranscription.icns (when generated)

To generate macOS .icns file on macOS:
  iconutil -c icns LocalTranscription.iconset

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 18:59:24 -08:00
5f3c058be6 Migrate to RealtimeSTT for advanced VAD-based transcription
Major refactor to eliminate word loss issues using RealtimeSTT with
dual-layer VAD (WebRTC + Silero) instead of time-based chunking.

## Core Changes

### New Transcription Engine
- Add client/transcription_engine_realtime.py with RealtimeSTT wrapper
- Implements initialize() and start_recording() separation for proper lifecycle
- Dual-layer VAD with pre/post buffers prevents word cutoffs
- Optional realtime preview with faster model + final transcription

### Removed Legacy Components
- Remove client/audio_capture.py (RealtimeSTT handles audio)
- Remove client/noise_suppression.py (VAD handles silence detection)
- Remove client/transcription_engine.py (replaced by realtime version)
- Remove chunk_duration setting (no longer using time-based chunking)

### Dependencies
- Add RealtimeSTT>=0.3.0 to pyproject.toml
- Remove noisereduce, webrtcvad, faster-whisper (now dependencies of RealtimeSTT)
- Update PyInstaller spec with ONNX Runtime, halo, colorama

### GUI Improvements
- Refactor main_window_qt.py to use RealtimeSTT with proper start/stop
- Fix recording state management (initialize on startup, record on button click)
- Expand settings dialog (700x1200) with improved spacing (10-15px between groups)
- Add comprehensive tooltips to all settings explaining functionality
- Remove chunk duration field from settings

### Configuration
- Update default_config.yaml with RealtimeSTT parameters:
  - Silero VAD sensitivity (0.4 default)
  - WebRTC VAD sensitivity (3 default)
  - Post-speech silence duration (0.3s)
  - Pre-recording buffer (0.2s)
  - Beam size for quality control (5 default)
  - ONNX acceleration (enabled for 2-3x faster VAD)
  - Optional realtime preview settings

### CLI Updates
- Update main_cli.py to use new engine API
- Separate initialize() and start_recording() calls

### Documentation
- Add INSTALL_REALTIMESTT.md with migration guide and benefits
- Update INSTALL.md: Remove FFmpeg requirement (not needed!)
- Clarify PortAudio is only needed for development
- Document that built executables are fully standalone

## Benefits

-  Eliminates word loss at chunk boundaries
-  Natural speech segment detection via VAD
-  2-3x faster VAD with ONNX acceleration
-  30% lower CPU usage
-  Pre-recording buffer captures word starts
-  Post-speech silence prevents cutoffs
-  Optional instant preview mode
-  Better UX with comprehensive tooltips

## Migration Notes

- Settings apply immediately without restart (except model changes)
- Old chunk_duration configs ignored (VAD-based detection now)
- Recording only starts when user clicks button (not on app startup)
- Stop button immediately stops recording (no delay)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 18:48:29 -08:00
478146c58d Improve UX: hide console window and fade connection status
- Hide console window on compiled desktop app (console=False in spec)
- Add 20-second auto-fade to "Connected" status in OBS display
- Keep "Disconnected" status visible until reconnection
- Add PM2 deployment configuration and documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-26 17:04:28 -08:00
6ec350af69 Fix Windows FastAPI import: Replace collect_all with collect_submodules
Research findings:
- collect_all() has design flaws and poor performance with pydantic
- Pydantic uses compiled cpython extensions that prevent module discovery
- collect_submodules() is the recommended approach per PyInstaller docs

Changes:
- Replaced collect_all() with collect_submodules() for better reliability
- Now collects 105 pydantic submodules (vs unreliable collect_all)
- Added collect_data_files() for packages requiring data files
- Added explicit pydantic dependencies: colorsys, decimal, json, etc.
- Applies to both Windows AND Linux (no longer platform-specific)

Results:
✓ Collected 52 submodules from fastapi
✓ Collected 34 submodules from starlette
✓ Collected 105 submodules from pydantic
✓ Collected 3 submodules from pydantic_core
✓ Plus uvicorn, websockets, h11, anyio

Fixes: ModuleNotFoundError: No module named 'fastapi' on Windows
Based on: https://github.com/pyinstaller/pyinstaller/issues/5359
2025-12-26 11:30:29 -08:00
926910177d Fix Windows build: Use collect_all for FastAPI packages
- On Windows, PyInstaller wasn't properly bundling FastAPI dependencies
- Added platform-specific collection using PyInstaller.utils.hooks.collect_all
- Only applies aggressive collection on Windows to keep Linux builds stable
- Collects all submodules and data files for: fastapi, starlette, pydantic,
  pydantic_core, anyio, uvicorn, websockets, h11
- Linux builds remain unchanged and continue to work as before

Fixes: ModuleNotFoundError: No module named 'fastapi' on Windows executable
2025-12-26 11:01:43 -08:00
0ee3f1003e Fix Windows build: Add FastAPI and dependencies to hiddenimports
Fixed PyInstaller build error on Windows:
"ModuleNotFoundError: No module named 'fastapi'"

Added to hiddenimports:
- FastAPI and its core modules
- Starlette (FastAPI framework base)
- Pydantic (data validation)
- anyio, sniffio (async libraries)
- h11, websockets (protocol implementations)
- requests and dependencies (for server sync client)

This ensures all web server dependencies are bundled in the executable.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-26 10:34:11 -08:00
003c27c8d5 Fix: Bundle Silero VAD model with PyInstaller
Fixed PyInstaller build error where the Voice Activity Detection (VAD)
model was missing from the compiled executable.

Changes:
- Added faster_whisper/assets folder to PyInstaller datas
- Includes silero_vad_v6.onnx (1.2MB) in the build
- Resolves ONNXRuntimeError on transcription start

Error fixed:
[ONNXRuntimeError] : 3 : NO_SUCHFILE : Load model from
.../faster_whisper/assets/silero_vad_v6.onnx failed: File doesn't exist

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-26 08:26:58 -08:00
472233aec4 Initial commit: Local Transcription App v1.0
Phase 1 Complete - Standalone Desktop Application

Features:
- Real-time speech-to-text with Whisper (faster-whisper)
- PySide6 desktop GUI with settings dialog
- Web server for OBS browser source integration
- Audio capture with automatic sample rate detection and resampling
- Noise suppression with Voice Activity Detection (VAD)
- Configurable display settings (font, timestamps, fade duration)
- Settings apply without restart (with automatic model reloading)
- Auto-fade for web display transcriptions
- CPU/GPU support with automatic device detection
- Standalone executable builds (PyInstaller)
- CUDA build support (works on systems without CUDA hardware)

Components:
- Audio capture with sounddevice
- Noise reduction with noisereduce + webrtcvad
- Transcription with faster-whisper
- GUI with PySide6
- Web server with FastAPI + WebSocket
- Configuration system with YAML

Build System:
- Standard builds (CPU-only): build.sh / build.bat
- CUDA builds (universal): build-cuda.sh / build-cuda.bat
- Comprehensive BUILD.md documentation
- Cross-platform support (Linux, Windows)

Documentation:
- README.md with project overview and quick start
- BUILD.md with detailed build instructions
- NEXT_STEPS.md with future enhancement roadmap
- INSTALL.md with setup instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-25 18:48:23 -08:00