**Splash Screen Features:**
- Shows "Local Transcription" branding during startup
- Displays progress messages as app initializes
- Prevents users from clicking multiple times while loading
- Clean dark theme matching app design
**Implementation:**
- Created splash screen with custom pixmap drawing
- Updates messages during initialization phases:
- "Loading configuration..."
- "Creating user interface..."
- "Starting web server..."
- "Loading Whisper model..."
- Automatically closes when main window is ready
- Always stays on top to remain visible
**Benefits:**
- Better user experience during model loading (2-5 seconds)
- Prevents multiple app instances from confusion
- Professional appearance
- Clear feedback that app is starting
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
**Model Reload Fixes:**
- Properly disconnect signals before reconnecting to prevent duplicate connections
- Wait for previous model loader thread to finish before starting new one
- Add garbage collection after unloading model to free memory
- Improve error handling in model reload callback
**Settings Dialog:**
- Remove duplicate success message (callback handles it)
- Only show message if no callback is defined
**Transcription Engine:**
- Explicitly delete model reference before setting to None
- Force garbage collection to ensure memory is freed
This prevents crashes when switching models, especially when done
multiple times in succession or while the app is under load.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Major improvements to display configuration and server architecture:
**Display Enhancements:**
- Add URL parameters for display customization (timestamps, maxlines, fontsize, fontfamily)
- Fix max lines enforcement to prevent scroll bars in OBS
- Apply font family and size settings to both local and sync displays
- Remove auto-scroll, enforce overflow:hidden for clean OBS integration
**Node.js Server:**
- Add timestamps toggle: timestamps=true/false
- Add max lines limit: maxlines=50
- Add font configuration: fontsize=16, fontfamily=Arial
- Update index page with URL parameters documentation
- Improve display URLs in room generation
**Local Web Server:**
- Add max_lines, font_family, font_size configuration
- Respect settings from GUI configuration
- Apply changes immediately without restart
**Architecture:**
- Remove PHP server implementation (Node.js recommended)
- Update all documentation to reference Node.js server
- Update default config URLs to Node.js endpoints
- Clean up 1700+ lines of PHP code
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add comprehensive error handling to prevent crashes during model reload
- Implement automatic port fallback (8080-8084) for web server conflicts
- Configure uvicorn to work properly with PyInstaller console=False builds
- Add proper web server shutdown on app close to release ports
- Improve error reporting with full tracebacks for debugging
Fixes:
- App crashing when switching models
- Web server not starting after app crash (port conflict)
- Web server failing silently in compiled builds without console
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added a clickable link in the status bar that opens the web display
in the default browser. This makes it easy for users to access the
OBS browser source without manually typing the URL.
Features:
- Shows "🌐 Open Web Display" link in green
- Tooltip shows the full URL
- Opens in default browser when clicked
- Reads host/port from config automatically
Location: Status bar, after user name
URL format: http://127.0.0.1:8080 (or configured host:port)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Changes:
1. Changed UI text from "Recording" to "Transcribing" for clarity
2. Implemented overlapping audio chunks to prevent word cutoff
Audio Overlap Feature:
- Added overlap_duration parameter (default: 0.5 seconds)
- Audio chunks now overlap by 0.5s to capture words at boundaries
- Prevents missed words when chunks are processed separately
- Configurable via audio.overlap_duration in config.yaml
How it works:
- Each 3-second chunk includes 0.5s from the previous chunk
- Buffer advances by (chunk_size - overlap_size) instead of full chunk
- Ensures words at chunk boundaries are captured in at least one chunk
- No duplicate transcription due to Whisper's context handling
Example with 3s chunks and 0.5s overlap:
Chunk 1: [0.0s - 3.0s]
Chunk 2: [2.5s - 5.5s] <- 0.5s overlap
Chunk 3: [5.0s - 8.0s] <- 0.5s overlap
Files modified:
- client/audio_capture.py: Implemented overlapping buffer logic
- config/default_config.yaml: Added overlap_duration setting
- gui/main_window_qt.py: Updated UI text, passed overlap param
- main_cli.py: Passed overlap param
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Phase 1 Complete - Standalone Desktop Application
Features:
- Real-time speech-to-text with Whisper (faster-whisper)
- PySide6 desktop GUI with settings dialog
- Web server for OBS browser source integration
- Audio capture with automatic sample rate detection and resampling
- Noise suppression with Voice Activity Detection (VAD)
- Configurable display settings (font, timestamps, fade duration)
- Settings apply without restart (with automatic model reloading)
- Auto-fade for web display transcriptions
- CPU/GPU support with automatic device detection
- Standalone executable builds (PyInstaller)
- CUDA build support (works on systems without CUDA hardware)
Components:
- Audio capture with sounddevice
- Noise reduction with noisereduce + webrtcvad
- Transcription with faster-whisper
- GUI with PySide6
- Web server with FastAPI + WebSocket
- Configuration system with YAML
Build System:
- Standard builds (CPU-only): build.sh / build.bat
- CUDA builds (universal): build-cuda.sh / build-cuda.bat
- Comprehensive BUILD.md documentation
- Cross-platform support (Linux, Windows)
Documentation:
- README.md with project overview and quick start
- BUILD.md with detailed build instructions
- NEXT_STEPS.md with future enhancement roadmap
- INSTALL.md with setup instructions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>