Phase 2 implementation: Multiple streamers can now merge their captions
into a single stream using a PHP server.
PHP Server (server/php/):
- server.php: API endpoint for sending/streaming transcriptions
- display.php: Web page for viewing merged captions in OBS
- config.php: Server configuration
- .htaccess: Security settings
- README.md: Comprehensive deployment guide
Features:
- Room-based isolation (multiple groups on same server)
- Passphrase authentication per room
- Real-time streaming via Server-Sent Events (SSE)
- Different colors for each user
- File-based storage (no database required)
- Auto-cleanup of old rooms
- Works on standard PHP hosting
Client-Side:
- client/server_sync.py: HTTP client for sending to PHP server
- Settings dialog updated with server sync options
- Config updated with server_sync section
Server Configuration:
- URL: Server endpoint (e.g., http://example.com/transcription/server.php)
- Room: Unique room name for your group
- Passphrase: Shared secret for authentication
OBS Integration:
Display URL format:
http://example.com/transcription/display.php?room=ROOM&passphrase=PASS&fade=10×tamps=true
NOTE: Main window integration pending (client sends transcriptions)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added a clickable link in the status bar that opens the web display
in the default browser. This makes it easy for users to access the
OBS browser source without manually typing the URL.
Features:
- Shows "🌐 Open Web Display" link in green
- Tooltip shows the full URL
- Opens in default browser when clicked
- Reads host/port from config automatically
Location: Status bar, after user name
URL format: http://127.0.0.1:8080 (or configured host:port)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Changes:
1. Changed UI text from "Recording" to "Transcribing" for clarity
2. Implemented overlapping audio chunks to prevent word cutoff
Audio Overlap Feature:
- Added overlap_duration parameter (default: 0.5 seconds)
- Audio chunks now overlap by 0.5s to capture words at boundaries
- Prevents missed words when chunks are processed separately
- Configurable via audio.overlap_duration in config.yaml
How it works:
- Each 3-second chunk includes 0.5s from the previous chunk
- Buffer advances by (chunk_size - overlap_size) instead of full chunk
- Ensures words at chunk boundaries are captured in at least one chunk
- No duplicate transcription due to Whisper's context handling
Example with 3s chunks and 0.5s overlap:
Chunk 1: [0.0s - 3.0s]
Chunk 2: [2.5s - 5.5s] <- 0.5s overlap
Chunk 3: [5.0s - 8.0s] <- 0.5s overlap
Files modified:
- client/audio_capture.py: Implemented overlapping buffer logic
- config/default_config.yaml: Added overlap_duration setting
- gui/main_window_qt.py: Updated UI text, passed overlap param
- main_cli.py: Passed overlap param
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Phase 1 Complete - Standalone Desktop Application
Features:
- Real-time speech-to-text with Whisper (faster-whisper)
- PySide6 desktop GUI with settings dialog
- Web server for OBS browser source integration
- Audio capture with automatic sample rate detection and resampling
- Noise suppression with Voice Activity Detection (VAD)
- Configurable display settings (font, timestamps, fade duration)
- Settings apply without restart (with automatic model reloading)
- Auto-fade for web display transcriptions
- CPU/GPU support with automatic device detection
- Standalone executable builds (PyInstaller)
- CUDA build support (works on systems without CUDA hardware)
Components:
- Audio capture with sounddevice
- Noise reduction with noisereduce + webrtcvad
- Transcription with faster-whisper
- GUI with PySide6
- Web server with FastAPI + WebSocket
- Configuration system with YAML
Build System:
- Standard builds (CPU-only): build.sh / build.bat
- CUDA builds (universal): build-cuda.sh / build-cuda.bat
- Comprehensive BUILD.md documentation
- Cross-platform support (Linux, Windows)
Documentation:
- README.md with project overview and quick start
- BUILD.md with detailed build instructions
- NEXT_STEPS.md with future enhancement roadmap
- INSTALL.md with setup instructions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>