Major improvements to display configuration and server architecture:
**Display Enhancements:**
- Add URL parameters for display customization (timestamps, maxlines, fontsize, fontfamily)
- Fix max lines enforcement to prevent scroll bars in OBS
- Apply font family and size settings to both local and sync displays
- Remove auto-scroll, enforce overflow:hidden for clean OBS integration
**Node.js Server:**
- Add timestamps toggle: timestamps=true/false
- Add max lines limit: maxlines=50
- Add font configuration: fontsize=16, fontfamily=Arial
- Update index page with URL parameters documentation
- Improve display URLs in room generation
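The URL parameters listed above can be assembled programmatically. The following is a minimal sketch, not the project's own code: the helper name `build_display_url` and the base URL `http://localhost:3000/display` are assumptions for illustration, and only the four parameter names (`timestamps`, `maxlines`, `fontsize`, `fontfamily`) come from this commit.

```python
from urllib.parse import urlencode


def build_display_url(base_url, timestamps=True, maxlines=50,
                      fontsize=16, fontfamily="Arial"):
    """Build a display URL carrying the customization parameters.

    base_url is hypothetical; the parameter names are the ones
    documented in this commit.
    """
    params = {
        # The server expects the literal strings "true" / "false".
        "timestamps": "true" if timestamps else "false",
        "maxlines": maxlines,
        "fontsize": fontsize,
        "fontfamily": fontfamily,
    }
    # urlencode escapes values, so a font family containing spaces is safe.
    return f"{base_url}?{urlencode(params)}"


url = build_display_url("http://localhost:3000/display", fontfamily="Courier New")
print(url)
```

Using `urlencode` rather than string concatenation keeps values such as multi-word font names correctly escaped.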
**Local Web Server:**
- Add max_lines, font_family, font_size configuration
- Respect settings from GUI configuration
- Apply changes immediately without restart
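One way to enforce a `max_lines` limit that can change at runtime is a bounded buffer that drops the oldest captions. This is a sketch under assumptions: the class `CaptionBuffer` and its methods are hypothetical names, and the local web server may implement the limit differently.

```python
from collections import deque


class CaptionBuffer:
    """Keeps only the newest max_lines captions (hypothetical helper)."""

    def __init__(self, max_lines=50):
        self._lines = deque(maxlen=max_lines)

    def add(self, text):
        # When the deque is full, appending discards the oldest entry.
        self._lines.append(text)

    def set_max_lines(self, max_lines):
        # Rebuild with the new bound; the newest lines are the ones kept,
        # so a changed setting takes effect without a restart.
        self._lines = deque(self._lines, maxlen=max_lines)

    def lines(self):
        return list(self._lines)


buf = CaptionBuffer(max_lines=3)
for caption in ["a", "b", "c", "d", "e"]:
    buf.add(caption)
print(buf.lines())
```

Because the buffer never holds more than `max_lines` entries, the display has nothing extra to scroll.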
**Architecture:**
- Remove PHP server implementation (Node.js recommended)
- Update all documentation to reference Node.js server
- Update default config URLs to Node.js endpoints
- Delete 1700+ lines of PHP code
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Phase 2 implementation: Multiple streamers can now merge their captions
into a single stream using a PHP server.
PHP Server (server/php/):
- server.php: API endpoint for sending/streaming transcriptions
- display.php: Web page for viewing merged captions in OBS
- config.php: Server configuration
- .htaccess: Security settings
- README.md: Comprehensive deployment guide
Features:
- Room-based isolation (multiple groups on same server)
- Passphrase authentication per room
- Real-time streaming via Server-Sent Events (SSE)
- Different colors for each user
- File-based storage (no database required)
- Auto-cleanup of old rooms
- Works on standard PHP hosting
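For readers unfamiliar with Server-Sent Events, the wire format is plain text: optional `event:` lines, one or more `data:` lines, and a blank line ending each message. The sketch below shows that framing in Python for illustration only. The payload fields (`user`, `text`) and the function name are assumptions, and server.php may structure its messages differently.

```python
import json


def format_sse_event(payload, event=None):
    """Frame a JSON payload as a single SSE message (illustrative)."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps escapes newlines, so the payload fits on one data line.
    lines.append(f"data: {json.dumps(payload)}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


message = format_sse_event({"user": "alice", "text": "hello"})
print(message, end="")
```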
Client-Side:
- client/server_sync.py: HTTP client for sending to PHP server
- Settings dialog updated with server sync options
- Config updated with server_sync section
Server Configuration:
- URL: Server endpoint (e.g., http://example.com/transcription/server.php)
- Room: Unique room name for your group
- Passphrase: Shared secret for authentication
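The three settings above map naturally onto a small config object and an HTTP POST. This is a hedged sketch, not the contents of client/server_sync.py: the JSON field names (`room`, `passphrase`, `user_name`, `text`) and the helper `build_send_request` are assumptions, and the request is only constructed here, not sent.

```python
import json
from dataclasses import dataclass
from urllib.request import Request


@dataclass
class ServerSyncConfig:
    url: str          # server endpoint
    room: str         # unique room name for the group
    passphrase: str   # shared secret for authentication


def build_send_request(config, user_name, text):
    """Build (but do not send) a POST carrying one transcription."""
    body = json.dumps({
        "room": config.room,              # field names are hypothetical
        "passphrase": config.passphrase,
        "user_name": user_name,
        "text": text,
    }).encode("utf-8")
    return Request(
        config.url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


cfg = ServerSyncConfig(
    url="http://example.com/transcription/server.php",
    room="myroom",
    passphrase="secret",
)
req = build_send_request(cfg, "alice", "hello world")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the actual send.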
OBS Integration:
Display URL format:
http://example.com/transcription/display.php?room=ROOM&passphrase=PASS&fade=10&timestamps=true
NOTE: Main window integration pending (client sends transcriptions)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Phase 1 Complete - Standalone Desktop Application
Features:
- Real-time speech-to-text with Whisper (faster-whisper)
- PySide6 desktop GUI with settings dialog
- Web server for OBS browser source integration
- Audio capture with automatic sample rate detection and resampling
- Noise suppression with Voice Activity Detection (VAD)
- Configurable display settings (font, timestamps, fade duration)
- Settings apply without restart (with automatic model reloading)
- Auto-fade for web display transcriptions
- CPU/GPU support with automatic device detection
- Standalone executable builds (PyInstaller)
- CUDA build support (CUDA builds also run on systems without CUDA hardware)
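The resampling feature exists because Whisper models expect 16 kHz audio, while capture devices often run at 44.1 or 48 kHz. The following is a dependency-free sketch of the idea using linear interpolation. It is not the application's resampler: the function name is hypothetical, and a production implementation would normally apply an anti-aliasing filter before downsampling.

```python
def resample_linear(samples, src_rate, dst_rate=16000):
    """Resample a mono signal by linear interpolation (illustrative only)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    out_len = int(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(out_len):
        pos = i * step
        left = int(pos)
        # Clamp so the last output sample never reads past the input.
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


halved = resample_linear([0.0, 1.0, 2.0, 3.0], src_rate=32000, dst_rate=16000)
print(halved)
```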
Components:
- Audio capture with sounddevice
- Noise reduction with noisereduce + webrtcvad
- Transcription with faster-whisper
- GUI with PySide6
- Web server with FastAPI + WebSocket
- Configuration system with YAML
Build System:
- Standard builds (CPU-only): build.sh / build.bat
- CUDA builds (universal): build-cuda.sh / build-cuda.bat
- Comprehensive BUILD.md documentation
- Cross-platform support (Linux, Windows)
Documentation:
- README.md with project overview and quick start
- BUILD.md with detailed build instructions
- NEXT_STEPS.md with future enhancement roadmap
- INSTALL.md with setup instructions
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>