Commit Graph

161 Commits

Author SHA1 Message Date
Developer
61c5ffa4fa Remove Zone.Identifier files that break Windows checkout
All checks were successful
Release / Bump version and tag (push) Successful in 4s
Release / Build App (macOS) (push) Successful in 58s
Release / Build App (Windows) (push) Successful in 3m22s
Release / Build App (Linux) (push) Successful in 6m27s
Windows NTFS Zone.Identifier alternate data stream files were
accidentally committed. The colon in the filename is invalid on
Windows, causing git checkout to fail on Windows runners.

Also added *:Zone.Identifier to .gitignore to prevent this recurring.
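
A minimal sketch of how such files could be spotted before they are committed, assuming the caller supplies a list of repository-relative paths (for example, the output of `git ls-files`). The function name is hypothetical and not part of this repository.

```python
from pathlib import PurePosixPath


def find_zone_identifier_paths(paths):
    """Return the paths whose file name ends with ':Zone.Identifier'.

    These are NTFS alternate-data-stream artifacts; the colon makes the
    name invalid on Windows, so they should never be tracked.
    """
    return [
        p for p in paths
        if PurePosixPath(p).name.endswith(":Zone.Identifier")
    ]
```

Running this over the tracked file list in a pre-commit check would flag any stray files that the .gitignore rule did not catch.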

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 14:02:11 -07:00
Gitea Actions
289b9dabe1 chore: bump version to 1.4.2 [skip ci]
v1.4.2
2026-04-06 21:00:01 +00:00
Developer
9522f28c57 Fix app icons: regenerate as RGBA and add macOS .icns
Some checks failed
Release / Bump version and tag (push) Successful in 4s
Release / Build App (Windows) (push) Failing after 10s
Release / Build App (macOS) (push) Successful in 59s
Release / Build App (Linux) (push) Has been cancelled
The bundled .ico had non-RGBA PNGs which caused Tauri's macOS bundler
to fail with "The PNG is not in RGBA format!". Regenerated all icons
from the source PNG as proper RGBA, and added icon.icns for macOS.

Also fixed bundle identifier from "com.localtranscription.app" (the
.app suffix conflicts with macOS bundle extension) to
"net.anhonesthost.local-transcription".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 13:59:50 -07:00
Gitea Actions
a8e2e7dca8 chore: bump version to 1.4.1 [skip ci]
v1.4.1
2026-04-06 20:53:15 +00:00
Developer
3bcf4f09a3 Fix sidecar builds: macOS CUDA resolution and Windows uv install
Some checks failed
Release / Bump version and tag (push) Successful in 3s
Release / Build App (Windows) (push) Failing after 10s
Release / Build App (macOS) (push) Failing after 51s
Release / Build App (Linux) (push) Successful in 4m31s
macOS: pyproject.toml's [tool.uv.sources] forces torch from the CUDA
index which has no macOS ARM wheels. Use `uv sync --no-sources` to
bypass this and get torch from PyPI (which includes MPS support).

Windows: Add additional uv PATH locations ($LOCALAPPDATA\uv\bin) for
robustness with different runner environments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 13:51:41 -07:00
Gitea Actions
ef5734ef15 chore: bump sidecar version to 1.0.1 [skip ci]
2026-04-06 20:45:14 +00:00
c9db43d56c Merge pull request 'Rewrite frontend to Tauri v2 + Svelte 5 for cross-platform support' (#4) from feature/tauri-rewrite into main
Some checks failed
Build Sidecars / Bump sidecar version and tag (push) Successful in 4s
Release / Bump version and tag (push) Successful in 2s
Build Sidecars / Build Sidecar (Windows) (push) Failing after 15s
Build Sidecars / Build Sidecar (macOS) (push) Failing after 18s
Release / Build App (Windows) (push) Failing after 15s
Release / Build App (macOS) (push) Failing after 52s
Release / Build App (Linux) (push) Has been cancelled
Build Sidecars / Build Sidecar (Linux) (push) Has been cancelled
Reviewed-on: #4
2026-04-06 20:45:10 +00:00
Developer
4c519a109a Add missing Svelte components and stores, fix .gitignore lib/ pattern
The src/lib/ directory was being excluded by a Python .gitignore rule
for lib/ (meant for Python's build output). Changed to /lib/ so it
only matches root-level lib/ and doesn't block src/lib/.

Adds 8 files that were created but missed in the initial commit:
- 5 Svelte components (Header, StatusBar, Controls, TranscriptionDisplay, Settings)
- 3 TypeScript stores (backend, config, transcriptions)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 13:42:31 -07:00
Developer
47ca74e75d Update README and CLAUDE.md for Tauri rewrite
Update both docs to reflect the new architecture:
- Tauri v2 + Svelte 5 frontend replacing PySide6/Qt
- Headless Python backend with FastAPI control API
- Cross-platform support (Windows, macOS, Linux)
- Deepgram remote transcription (managed/BYOK)
- Gitea CI/CD workflows for automated builds
- New project structure with backend/, src/, src-tauri/
- Updated development commands and build instructions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 13:34:10 -07:00
Developer
25d2a55efb Add Gitea CI/CD workflows for cross-platform builds
Two workflows adapted from voice-to-notes:

- release.yml: Builds the Tauri app shell (.deb/.rpm for Linux, .msi
  for Windows, .dmg for macOS) on push to main. Auto-bumps version,
  creates Gitea release, uploads platform binaries.

- build-sidecar.yml: Builds the headless Python backend sidecar via
  PyInstaller when client/server/backend code changes. Produces CUDA
  and CPU variants for Linux/Windows, CPU-only for macOS. Uses the new
  local-transcription-headless.spec (no PySide6 dependencies).

Also adds local-transcription-headless.spec — a simplified PyInstaller
config for the headless backend that excludes all Qt/PySide6 imports.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 11:44:34 -07:00
Developer
af534bf768 Add Tauri v2 + Svelte 5 frontend and headless Python backend
Scaffold the cross-platform rewrite from PySide6/Qt to Tauri + Svelte,
following the same architecture as voice-to-notes. The Python backend
runs headless as a sidecar, with a FastAPI control API that the Svelte
frontend connects to via REST and WebSocket.

New files:
- backend/app_controller.py: Headless orchestration (extracted from MainWindow)
- backend/api_server.py: FastAPI control endpoints + /ws/control WebSocket
- backend/main_headless.py: Headless entry point for sidecar mode
- src-tauri/: Tauri v2 Rust shell with sidecar and dialog plugins
- src/: Svelte 5 frontend (App, Settings, Controls, TranscriptionDisplay)
- src/lib/stores/: Reactive stores for backend connection, config, transcriptions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:20:25 -07:00
Developer
9ff883e2e3 Phase 6: Add Deepgram remote transcription (managed + BYOK modes)
New files:
- client/deepgram_transcription.py — DeepgramTranscriptionEngine with
  managed mode (proxy) and BYOK mode (direct Deepgram). Sends raw binary
  PCM audio over WebSocket, handles both proxy and Deepgram response formats.

Modified files:
- config/default_config.yaml — Replace remote_processing with new remote
  section (mode, server_url, auth_token, byok_api_key, deepgram_model, language)
- client/config.py — Add migration from old remote_processing config
- gui/settings_dialog_qt.py — Replace Remote Processing group with
  Transcription Mode section (Local/Managed/BYOK radio buttons, login/register
  dialogs, balance display, model selector)
- gui/main_window_qt.py — Select engine based on remote.mode config,
  add error and credits_low handlers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 11:45:30 -07:00
bb8a8c251d Update README to reflect current application state
Remove outdated implementation plan and task checklists. Document
actual implemented features including RealtimeSTT, dual-layer VAD,
custom fonts/colors, and auto-updates. Add practical usage instructions
for standalone mode, OBS setup, and multi-user sync.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 06:31:27 -08:00
b7ab57f21f Add auto-update feature with Gitea release checking
- Add UpdateChecker class to query Gitea API for latest releases
- Show update dialog with release notes when new version available
- Open browser to release page for download (handles large files)
- Allow users to skip specific versions or defer updates
- Add "Check for Updates Now" button in settings
- Check automatically on startup (respects 24-hour interval)
- Pre-configured for repo.anhonesthost.net/streamer-tools
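
The interval and skip-version rules above can be sketched as follows. This is an illustration only: the function names are hypothetical, the real UpdateChecker class may be structured differently, and version tags are assumed to be dotted integers with an optional leading "v".

```python
from datetime import datetime, timedelta

CHECK_INTERVAL = timedelta(hours=24)  # matches the 24-hour interval above


def should_check_for_updates(last_check, now, interval=CHECK_INTERVAL):
    """Return True if no check has happened yet or the interval has elapsed."""
    if last_check is None:
        return True
    return now - last_check >= interval


def parse_version(tag):
    """Turn 'v1.4.0' or '1.4.0' into a comparable tuple such as (1, 4, 0)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def is_update_available(current, latest, skipped=()):
    """Return True if `latest` is newer than `current` and was not skipped."""
    if latest in skipped:
        return False
    return parse_version(latest) > parse_version(current)
```

Comparing tuples rather than raw strings avoids ordering mistakes such as "1.10.0" sorting before "1.9.0".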

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
v1.4.0
2026-01-22 17:40:13 -08:00
89819f5d1b Add user-configurable colors for transcription display
- Add color settings (user_color, text_color, background_color) to config
- Add color picker buttons in Settings dialog with alpha support for backgrounds
- Update local web display to use configurable colors
- Send per-user colors with transcriptions to multi-user server
- Update Node.js server to apply per-user colors on display page
- Improve server landing page: replace tech details with display options reference
- Bump version to 1.3.2

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 20:59:13 -08:00
ff067b3368 Add unified per-speaker font support and remote transcription service
Font changes:
- Consolidate font settings into single Display Settings section
- Support Web-Safe, Google Fonts, and Custom File uploads for both displays
- Fix Google Fonts URL encoding (use + instead of %2B for spaces)
- Fix per-speaker font inline style quote escaping in Node.js display
- Add font debug logging to help diagnose font issues
- Update web server to sync all font settings on settings change
- Remove deprecated PHP server documentation files
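
A Python illustration of the Google Fonts encoding issue, as a hedged sketch: the actual fix may live in the Node.js display code, and the helper name is hypothetical. It assumes the family name is passed to the Google Fonts `css2` endpoint as a query value, where a space must become a literal `+`.

```python
from urllib.parse import quote, quote_plus


def google_fonts_css_url(family):
    """Build a Google Fonts CSS URL, encoding spaces as '+'."""
    return "https://fonts.googleapis.com/css2?family=" + quote_plus(family)


# The buggy variant: replacing spaces with '+' first and then
# percent-encoding the result turns every '+' into '%2B'.
double_encoded = quote("Open Sans".replace(" ", "+"), safe="")
```

With this sketch, a family name such as "Open Sans" is encoded once, whereas the double-encoded form asks the service for a font literally named "Open+Sans".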

New features:
- Add remote transcription service for GPU offloading
- Add instance lock to prevent multiple app instances
- Add version tracking

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 19:09:57 -08:00
f035bdb927 Fix recording failure in Windows PyInstaller builds with console=False
When console=False in PyInstaller builds on Windows, stdout/stderr are
not available. This causes subprocess/multiprocessing operations to fail
when they try to write output, breaking RealtimeSTT initialization.

The fix:
- Check if running as frozen executable on Windows
- Test if stdout/stderr are available (try to flush)
- If not available, redirect to io.StringIO() in-memory buffers that act as null streams
- This allows subprocess/multiprocessing to write without errors

This fixes the "Failed to start Recording" error that only occurred
in builds with console=False but worked fine with console=True or
when running with uv.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 20:57:49 -08:00
7bf0af953d Set console=False for production builds
The app is working correctly - the device validation retry is normal
RealtimeSTT behavior. Set console back to False to hide the console
window for a cleaner user experience in production builds.

Users confirmed transcription and server sync are working properly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 20:46:31 -08:00
ee6dfe00d8 Enable console window for debugging PyInstaller build issues
Temporarily enable console output to diagnose "failed to start recording"
error in the PyInstaller build. This will show all print() statements and
error messages that are currently being hidden.

Change console=False to console=True in the spec file.

Once the issue is identified and fixed, set back to console=False for
a production build without the console window.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 20:40:20 -08:00
95e9e8ebad Fix application icon not showing in PyInstaller builds
The icon wasn't working in frozen executables because:
1. LocalTranscription.png wasn't being bundled in the PyInstaller build
2. The code was using Path(__file__).parent which doesn't work in frozen exes

Changes:
- Added LocalTranscription.png to datas in local-transcription.spec
- Updated main.py to use sys._MEIPASS for frozen executables
- Updated gui/main_window_qt.py to use sys._MEIPASS for frozen executables
- Both files now detect if running frozen and adjust icon path accordingly

The icon will now appear correctly in:
- Window titlebar
- Taskbar (Windows) / Dock (macOS)
- Alt-Tab switcher
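
A minimal sketch of the frozen-path detection described above. The helper name `resource_base` is hypothetical; `sys.frozen` and `sys._MEIPASS` are the attributes PyInstaller sets in bundled executables.

```python
import sys
from pathlib import Path


def resource_base(script_file):
    """Return the directory that bundled data files are loaded from.

    In a PyInstaller build, data files are unpacked under sys._MEIPASS;
    when running from source they sit next to the script.
    """
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        return Path(sys._MEIPASS)
    return Path(script_file).parent


# Typical use from a module (icon name taken from the commit above):
# icon_path = resource_base(__file__) / "LocalTranscription.png"
```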

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 20:35:53 -08:00
c968eb8a48 Fix RealtimeSTT warmup file and PyTorch CUDA version mismatch
Fixed two build/runtime issues:

1. Windows: Missing warmup_audio.wav file from RealtimeSTT
   - Added RealtimeSTT to collect_data_files() in spec
   - Ensures warmup_audio.wav and other RealtimeSTT data files are bundled
   - Fixes: soundfile.LibsndfileError opening warmup_audio.wav

2. Linux: PyTorch/TorchAudio CUDA version mismatch (12.1 vs 12.4)
   - Added torchaudio>=2.0.0 explicitly to dependencies
   - Ensures torchaudio comes from pytorch-cu121 index (same as torch)
   - Previously RealtimeSTT was pulling torchaudio from PyPI with CUDA 12.4
   - Fixes: RuntimeError about CUDA version mismatch

Both packages now correctly use the pytorch-cu121 index via tool.uv.sources
configuration, ensuring matching CUDA versions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 20:28:11 -08:00
52aa73bfaa Fix infinite spawn loop on all platforms with PyInstaller
Extended the freeze_support fix to work on Linux and macOS, not just Windows.
The spawn loop can occur on any platform when PyInstaller bundles apps that
use multiprocessing.

Changes:
- Removed Windows-only condition for freeze_support()
- Added multiprocessing.set_start_method('spawn', force=True)
- Set spawn method for consistency across all platforms
- Prevents fork-related issues on Linux with PyInstaller

Why this is needed:
- PyTorch, faster-whisper, and RealtimeSTT all use multiprocessing
- PyInstaller frozen executables need explicit spawn method configuration
- Linux defaults to 'fork' which can cause issues with frozen executables
- 'spawn' method is more reliable with PyInstaller on all platforms

This ensures the app launches only once on Windows, Linux, and macOS.
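
The start-up order described above can be sketched like this. It is an illustrative entry point rather than the project's main.py; `configure_multiprocessing` and `main` are hypothetical names.

```python
import multiprocessing


def configure_multiprocessing():
    """Prepare multiprocessing for a frozen (PyInstaller) executable."""
    # Lets a re-launched frozen executable run the worker code instead
    # of starting the whole application again.
    multiprocessing.freeze_support()
    # Use 'spawn' everywhere so Linux does not fall back to 'fork'.
    multiprocessing.set_start_method("spawn", force=True)


def main():
    # Application start-up (GUI, model loading, ...) would go here.
    return multiprocessing.get_start_method()


if __name__ == "__main__":
    # Must run before anything that may create worker processes.
    configure_multiprocessing()
    main()
```

Keeping both calls under the `if __name__ == "__main__":` guard matters, because spawned children re-import the main module.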

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 20:17:23 -08:00
371d5d9a28 Fix infinite spawn loop on Windows PyInstaller builds
CRITICAL FIX: Added multiprocessing.freeze_support() to prevent the
frozen executable from spawning infinite copies of itself on Windows.

The issue:
When PyInstaller bundles Python apps that use multiprocessing (which
PyTorch, faster-whisper, and RealtimeSTT all use), each child process on
Windows is started by re-launching the frozen executable. Without
freeze_support(), the child runs the main program again instead of the
worker code, creating an infinite loop of processes spawning until the
system crashes.

The fix:
- Added multiprocessing.freeze_support() at the very top of main.py
- Called before any imports that might use multiprocessing
- Windows-only (wrapped in sys.platform check)
- Must be before QApplication or any Qt imports

This is a standard requirement for all PyInstaller apps that use
multiprocessing on Windows.

Resolves: App spawns infinite copies when running from PyInstaller build

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 20:09:43 -08:00
4d6dd6d35d Include pvporcupine resource files in PyInstaller build
PyInstaller wasn't bundling pvporcupine's resource files (keyword_files
and lib directories), causing a FileNotFoundError at runtime when
pvporcupine tried to access its resources directory.

Changes:
- Added code to detect and include pvporcupine resources and lib folders
- Falls back gracefully if pvporcupine is not installed
- Resources are bundled even though we don't use wake word features
  (pvporcupine initializes and checks for these on import)
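
One way the detection with graceful fallback could look inside a spec file, as a sketch under assumptions: the helper name is hypothetical, and the subdirectory names passed in would be the `resources` and `lib` folders mentioned above. It returns `(source, destination)` pairs in the shape PyInstaller's `datas` list expects.

```python
import importlib.util
import os


def collect_package_dirs(package, subdirs):
    """Return (source, destination) pairs for subdirectories that exist.

    Returns an empty list if the package is not installed, so the
    build can continue without it.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []
    package_dir = list(spec.submodule_search_locations)[0]
    datas = []
    for sub in subdirs:
        src = os.path.join(package_dir, sub)
        if os.path.isdir(src):
            datas.append((src, os.path.join(package, sub)))
    return datas


# In the spec file this might be used as:
# datas += collect_package_dirs("pvporcupine", ["resources", "lib"])
```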

This fixes the runtime error:
FileNotFoundError: [WinError 3] The system cannot find the path
specified: '...\pvporcupine\resources/keyword_files\windows'

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 19:57:13 -08:00
a3f61ea177 Fix PyInstaller hook error for webrtcvad package
PyInstaller's default webrtcvad hook was failing because we use
webrtcvad-wheels (which provides the webrtcvad module but has a
different package name for metadata purposes).

Changes:
- Created hooks/hook-webrtcvad.py custom hook
- Tries to copy metadata from webrtcvad-wheels first
- Falls back to webrtcvad if needed
- Gracefully handles missing metadata (module still works)

This prevents the "PackageNotFoundError: No package metadata was
found for webrtcvad" error during PyInstaller build.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 19:49:45 -08:00
e77303f793 Document why we override enum34 dependency
Added detailed comments explaining the enum34 override:
- RealtimeSTT uses pvporcupine 1.9.5 (last open-source version)
- pvporcupine 1.9.5 depends on enum34
- enum34 is incompatible with PyInstaller
- We don't use wake word features, so enum34 is unnecessary
- enum is in stdlib since Python 3.4

This provides context for future maintainers about why the override exists.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 19:44:14 -08:00
07b746144d Properly fix enum34 error with override-dependencies
The previous PyInstaller exclusion approach didn't prevent the pre-flight
check from failing. The proper solution is to use UV's override-dependencies
to prevent enum34 from being installed in the first place.

Changes:
- Added [tool.uv] override-dependencies in pyproject.toml
- Configured enum34 to only install on Python < 3.4
  (effectively never, since we require Python >=3.9)
- This prevents enum34 from being added to uv.lock

Why this works:
- UV respects override-dependencies during dependency resolution
- enum34 is never installed, so PyInstaller pre-flight check passes
- enum is part of Python stdlib since 3.4, so no functionality lost
- RealtimeSTT's dependency on pvporcupine==1.9.5 (which requires enum34)
  is satisfied without actually installing enum34

Credit: Solution suggested by Opus

Resolves: enum34 incompatible with PyInstaller error

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 19:42:13 -08:00
9b7f2e1d69 Fix enum34 error by excluding it in PyInstaller spec
The previous approach of uninstalling enum34 before PyInstaller didn't
work because 'uv run' re-syncs dependencies. The proper solution is to
exclude enum34 directly in the PyInstaller spec file.

Changes:
- Added hooks/hook-enum34.py: Custom PyInstaller hook to exclude enum34
- Updated local-transcription.spec:
  - Added 'hooks' to hookspath
  - Added 'enum34' to excludes list
- Updated build.sh and build.bat:
  - Removed enum34 uninstall step (no longer needed)
  - Added comment explaining enum34 is excluded in spec

Why this works:
- PyInstaller's excludes list prevents enum34 from being bundled
- The custom hook provides documentation and explicit exclusion
- enum34 can remain installed in venv (won't break anything)
- Works regardless of 'uv run' re-syncing dependencies

enum34 is an obsolete Python 2.7/3.3 backport that's incompatible with
PyInstaller and unnecessary on Python 3.4+ (enum is in stdlib).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 19:32:23 -08:00
aad7ab0713 Fix settings dialog for smaller screens with scroll area
The settings dialog was using a fixed 1200px height which exceeded
the available space on 1920x1080 displays, causing settings to be
cut off. Added scroll area and dynamic sizing based on screen size.

Changes:
- Added QScrollArea to wrap all settings content
- Dialog height now calculated as 80% of screen height (max 900px)
- Minimum size reduced to 700x500 for smaller screens
- Save/Cancel buttons remain fixed at bottom (outside scroll area)
- Horizontal scrollbar disabled, vertical scrollbar shown when needed
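
The height rule above is simple arithmetic; a minimal sketch follows, with a hypothetical function name. It uses integer math and applies only the 80% rule and the 900px cap stated in the list.

```python
def settings_dialog_height(screen_height, percent=80, max_height=900):
    """Return the dialog height: a percentage of the screen, capped."""
    return min(screen_height * percent // 100, max_height)
```

For a 1080-pixel-high screen this gives 864 pixels, while 1440p and 4K screens both hit the 900-pixel cap.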

Benefits:
- Works on any screen size (1080p, 1440p, 4K, etc.)
- All settings always accessible via scrolling
- Buttons always visible at bottom
- More professional UX

Resolves: Settings dialog running off screen on 1920x1080

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 19:19:20 -08:00
d34d272cf0 Simplify build process: CUDA support now included by default
Since pyproject.toml is configured to use PyTorch CUDA index by default,
all builds automatically include CUDA support. Removed redundant separate
CUDA build scripts and updated documentation.

Changes:
- Removed build-cuda.sh and build-cuda.bat (no longer needed)
- Updated build.sh and build.bat to include CUDA by default
  - Added "uv sync" step to ensure CUDA PyTorch is installed
  - Updated messages to clarify CUDA support is included
- Updated BUILD.md to reflect simplified build process
  - Removed separate CUDA build sections
  - Clarified all builds include CUDA support
  - Updated GPU support section
- Updated CLAUDE.md with simplified build commands

Benefits:
- Simpler build process (one script per platform instead of two)
- Less confusion about which script to use
- All builds work on any system (GPU or CPU)
- Automatic fallback to CPU if no GPU available
- pyproject.toml is single source of truth for dependencies

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 19:09:36 -08:00
be53f2e962 Fix PyInstaller build failure caused by enum34 package
The enum34 package is an obsolete backport of Python's enum module
and is incompatible with PyInstaller on Python 3.4+. It was being
pulled in as a transitive dependency by pvporcupine (part of
RealtimeSTT's dependencies).

Changes:
- All build scripts now remove enum34 before running PyInstaller
  - build.bat, build-cuda.bat (Windows)
  - build.sh, build-cuda.sh (Linux)
- Added "uv pip uninstall -q enum34" step after cleaning builds
- Removed attempted pyproject.toml override (not needed with this fix)

This fix allows PyInstaller to bundle the application without errors
while still maintaining all RealtimeSTT functionality (enum is part
of Python stdlib since 3.4).

Resolves: PyInstaller error "enum34 package is incompatible"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 19:06:33 -08:00
20a7764bab Add application icon support for GUI and compiled executables
Added platform-specific icon support for both the running application
and compiled executables:

New files:
- create_icons.py: Script to convert PNG to platform-specific formats
  - Generates .ico for Windows (16, 32, 48, 256px sizes)
  - Generates .iconset for macOS (ready for iconutil conversion)
- LocalTranscription.png: Source icon image
- LocalTranscription.ico: Windows icon file (multi-size)
- LocalTranscription.iconset/: macOS icon set (needs iconutil on macOS)

GUI changes:
- main.py: Set application-wide icon for taskbar/dock
- main_window_qt.py: Set window icon for GUI window

Build configuration:
- local-transcription.spec: Use platform-specific icons in PyInstaller
  - Windows builds use LocalTranscription.ico
  - macOS builds use LocalTranscription.icns (when generated)

To generate macOS .icns file on macOS:
  iconutil -c icns LocalTranscription.iconset

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 18:59:24 -08:00
5f3c058be6 Migrate to RealtimeSTT for advanced VAD-based transcription
Major refactor to eliminate word loss issues using RealtimeSTT with
dual-layer VAD (WebRTC + Silero) instead of time-based chunking.

## Core Changes

### New Transcription Engine
- Add client/transcription_engine_realtime.py with RealtimeSTT wrapper
- Implements initialize() and start_recording() separation for proper lifecycle
- Dual-layer VAD with pre/post buffers prevents word cutoffs
- Optional realtime preview with faster model + final transcription

### Removed Legacy Components
- Remove client/audio_capture.py (RealtimeSTT handles audio)
- Remove client/noise_suppression.py (VAD handles silence detection)
- Remove client/transcription_engine.py (replaced by realtime version)
- Remove chunk_duration setting (no longer using time-based chunking)

### Dependencies
- Add RealtimeSTT>=0.3.0 to pyproject.toml
- Remove noisereduce, webrtcvad, faster-whisper (now dependencies of RealtimeSTT)
- Update PyInstaller spec with ONNX Runtime, halo, colorama

### GUI Improvements
- Refactor main_window_qt.py to use RealtimeSTT with proper start/stop
- Fix recording state management (initialize on startup, record on button click)
- Expand settings dialog (700x1200) with improved spacing (10-15px between groups)
- Add comprehensive tooltips to all settings explaining functionality
- Remove chunk duration field from settings

### Configuration
- Update default_config.yaml with RealtimeSTT parameters:
  - Silero VAD sensitivity (0.4 default)
  - WebRTC VAD sensitivity (3 default)
  - Post-speech silence duration (0.3s)
  - Pre-recording buffer (0.2s)
  - Beam size for quality control (5 default)
  - ONNX acceleration (enabled for 2-3x faster VAD)
  - Optional realtime preview settings

### CLI Updates
- Update main_cli.py to use new engine API
- Separate initialize() and start_recording() calls

### Documentation
- Add INSTALL_REALTIMESTT.md with migration guide and benefits
- Update INSTALL.md: Remove FFmpeg requirement (not needed!)
- Clarify PortAudio is only needed for development
- Document that built executables are fully standalone

## Benefits

- Eliminates word loss at chunk boundaries
- Natural speech segment detection via VAD
- 2-3x faster VAD with ONNX acceleration
- 30% lower CPU usage
- Pre-recording buffer captures word starts
- Post-speech silence prevents cutoffs
- Optional instant preview mode
- Better UX with comprehensive tooltips

## Migration Notes

- Settings apply immediately without restart (except model changes)
- Old chunk_duration configs ignored (VAD-based detection now)
- Recording only starts when user clicks button (not on app startup)
- Stop button immediately stops recording (no delay)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 18:48:29 -08:00
eeeb488529 Add loading splash screen for app startup
**Splash Screen Features:**
- Shows "Local Transcription" branding during startup
- Displays progress messages as app initializes
- Prevents users from clicking multiple times while loading
- Clean dark theme matching app design

**Implementation:**
- Created splash screen with custom pixmap drawing
- Updates messages during initialization phases:
  - "Loading configuration..."
  - "Creating user interface..."
  - "Starting web server..."
  - "Loading Whisper model..."
- Automatically closes when main window is ready
- Always stays on top to remain visible

**Benefits:**
- Better user experience during model loading (2-5 seconds)
- Prevents users from launching multiple app instances by mistake while the app loads
- Professional appearance
- Clear feedback that app is starting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-27 06:33:44 -08:00
bd0e84c5e7 Fix model switching crash and improve error handling
**Model Reload Fixes:**
- Properly disconnect signals before reconnecting to prevent duplicate connections
- Wait for previous model loader thread to finish before starting new one
- Add garbage collection after unloading model to free memory
- Improve error handling in model reload callback

**Settings Dialog:**
- Remove duplicate success message (callback handles it)
- Only show message if no callback is defined

**Transcription Engine:**
- Explicitly delete model reference before setting to None
- Force garbage collection to ensure memory is freed

This prevents crashes when switching models, especially when done
multiple times in succession or while the app is under load.
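
The unload step described under "Transcription Engine" can be sketched as below. `EngineSketch` is a hypothetical stand-in, not the project's engine class, and the model is treated as an opaque object.

```python
import gc


class EngineSketch:
    """Holds a model reference and releases it explicitly on unload."""

    def __init__(self, model=None):
        self.model = model

    def unload_model(self):
        if self.model is not None:
            # Drop the reference first, then reset the attribute so
            # later code can still test `self.model is None`.
            del self.model
            self.model = None
            # Collect any reference cycles so memory is freed before
            # the next model is loaded.
            gc.collect()
```

The memory is only released if no other object still holds a reference to the model, which is why waiting for the previous loader thread to finish matters as well.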

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-27 06:28:40 -08:00
146a8c8beb Enhance display customization and remove PHP server
Major improvements to display configuration and server architecture:

**Display Enhancements:**
- Add URL parameters for display customization (timestamps, maxlines, fontsize, fontfamily)
- Fix max lines enforcement to prevent scroll bars in OBS
- Apply font family and size settings to both local and sync displays
- Remove auto-scroll, enforce overflow:hidden for clean OBS integration

**Node.js Server:**
- Add timestamps toggle: timestamps=true/false
- Add max lines limit: maxlines=50
- Add font configuration: fontsize=16, fontfamily=Arial
- Update index page with URL parameters documentation
- Improve display URLs in room generation
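
A short sketch of how a display URL with these parameters could be assembled. The parameter names come from the list above; the base URL in the usage comment and the function name are assumptions for illustration.

```python
from urllib.parse import urlencode


def display_url(base_url, timestamps=True, maxlines=50, fontsize=16,
                fontfamily="Arial"):
    """Append the display customization parameters to a base URL."""
    query = urlencode({
        "timestamps": "true" if timestamps else "false",
        "maxlines": maxlines,
        "fontsize": fontsize,
        "fontfamily": fontfamily,
    })
    return f"{base_url}?{query}"


# Example with a hypothetical local address:
# display_url("http://127.0.0.1:3000/display", timestamps=False)
```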

**Local Web Server:**
- Add max_lines, font_family, font_size configuration
- Respect settings from GUI configuration
- Apply changes immediately without restart

**Architecture:**
- Remove PHP server implementation (Node.js recommended)
- Update all documentation to reference Node.js server
- Update default config URLs to Node.js endpoints
- Clean up 1700+ lines of PHP code

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-27 06:15:55 -08:00
e831dadd24 Fix app stability: graceful model switching and web server improvements
- Add comprehensive error handling to prevent crashes during model reload
- Implement automatic port fallback (8080-8084) for web server conflicts
- Configure uvicorn to work properly with PyInstaller console=False builds
- Add proper web server shutdown on app close to release ports
- Improve error reporting with full tracebacks for debugging
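
The port fallback could be sketched along these lines. This is an assumption about the approach rather than the project's code; the function name is hypothetical and the default range mirrors the 8080-8084 range above.

```python
import socket


def find_free_port(host="127.0.0.1", ports=range(8080, 8085)):
    """Return the first port that can be bound, or None if all are taken."""
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port in use, try the next one
            return port
    return None
```

The probe socket is closed before the server starts, so another process could still grab the port in between; starting the server should therefore keep its own error handling.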

Fixes:
- App crashing when switching models
- Web server not starting after app crash (port conflict)
- Web server failing silently in compiled builds without console

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-26 17:50:37 -08:00
478146c58d Improve UX: hide console window and fade connection status
- Hide console window on compiled desktop app (console=False in spec)
- Add 20-second auto-fade to "Connected" status in OBS display
- Keep "Disconnected" status visible until reconnection
- Add PM2 deployment configuration and documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-26 17:04:28 -08:00
64c864b0f0 Fix multi-user server sync performance and integration
Major fixes:
- Integrated ServerSyncClient into GUI for actual multi-user sync
- Fixed CUDA device display to show actual hardware used
- Optimized server sync with parallel HTTP requests (5x faster)
- Fixed 2-second DNS delay by using 127.0.0.1 instead of localhost
- Added comprehensive debugging and performance logging

Performance improvements:
- HTTP requests: 2045ms → 52ms (97% faster)
- Multi-user sync lag: ~4s → ~100ms (97% faster)
- Parallel request processing with ThreadPoolExecutor (3 workers)
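
A minimal sketch of sending transcriptions through a small worker pool so a slow request does not block the next one. `ParallelSender` and the `send` callable are hypothetical; the real ServerSyncClient may be organised differently. Only the 3-worker ThreadPoolExecutor comes from the commit message.

```python
from concurrent.futures import ThreadPoolExecutor


class ParallelSender:
    """Dispatch payloads to a send function on a small thread pool."""

    def __init__(self, send, max_workers=3):
        self._send = send  # callable that performs one HTTP request
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def submit(self, payload):
        """Queue one payload and return its Future immediately."""
        return self._pool.submit(self._send, payload)

    def close(self):
        """Wait for queued requests to finish and release the threads."""
        self._pool.shutdown(wait=True)
```

Because requests run concurrently, they may complete out of order; a receiver that needs strict ordering would have to rely on timestamps or sequence numbers in the payload.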

New features:
- Room generator with one-click copy on Node.js landing page
- Auto-detection of PHP vs Node.js server types
- Localhost warning banner for WSL2 users
- Comprehensive debug logging throughout sync pipeline

Files modified:
- gui/main_window_qt.py - Server sync integration, device display fix
- client/server_sync.py - Parallel HTTP, server type detection
- server/nodejs/server.js - Room generator, warnings, debug logs

Documentation added:
- PERFORMANCE_FIX.md - Server sync optimization details
- FIX_2_SECOND_HTTP_DELAY.md - DNS/localhost issue solution
- LATENCY_GUIDE.md - Audio chunk duration tuning guide
- DEBUG_4_SECOND_LAG.md - Comprehensive debugging guide
- SESSION_SUMMARY.md - Complete session summary

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-26 16:44:55 -08:00
c28679acb6 Update to support sync captions 2025-12-26 16:15:52 -08:00
2870d45bdc Add Apache ProxyTimeout configuration for SSE support
Created apache-sse-config.conf with the Apache settings required to support
long-running SSE connections. Apache's mod_proxy_fcgi has a default
timeout of 30-60 seconds, which kills SSE connections prematurely.

The configuration sets ProxyTimeout to 21600 seconds (6 hours) to match
HAProxy's timeout and allow long streaming sessions.

Added note to .htaccess explaining this requirement, as ProxyTimeout
cannot be set in .htaccess and must be configured in the virtual host.

To fix 504 Gateway Timeout errors:
1. Add ProxyTimeout directive to Apache virtual host config
2. Reload Apache
3. Test SSE connection

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-26 15:29:56 -08:00
9feb17b734 Fix SSE connection closing prematurely for non-existent rooms
The server was calling exit() immediately when a room didn't exist,
which caused the SSE connection to open and then close right away.
This triggered EventSource to reconnect in a loop.

Now the server keeps the connection open and sends keepalives even
for rooms that don't exist yet. This is the correct SSE behavior -
maintain the connection and stream data when it becomes available.
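
The keep-the-connection-open behavior can be sketched roughly as follows. This is a Python illustration of the idea, not the PHP server code: `sse_stream`, `poll`, and `demo` are hypothetical names, and the loop is bounded by `ticks` only so the example terminates. While the room has no data, the stream emits SSE comment lines (a line starting with a colon) as keepalives instead of closing.

```python
import json
import time


def sse_stream(poll, ticks, interval=1.0):
    """Yield SSE frames: a data frame when poll() returns a message,
    otherwise a comment-line keepalive. Never exits early for an empty room."""
    for _ in range(ticks):
        message = poll()
        if message is None:
            # Lines beginning with ":" are SSE comments; clients ignore them,
            # but they keep proxies from timing out the idle connection.
            yield ": keepalive\n\n"
        else:
            yield "data: " + json.dumps(message) + "\n\n"
        time.sleep(interval)


def demo():
    # The room "does not exist" for the first two polls, then a caption arrives.
    pending = [None, None, {"text": "hello"}]
    return list(sse_stream(lambda: pending.pop(0), ticks=3, interval=0))
```

Because the stream stays open, EventSource on the client side sees no error and has no reason to enter its reconnect loop.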

Fixes the "connection established then immediately errors" issue
seen in diagnostic tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-26 14:03:06 -08:00
50deaeae96 Add comprehensive console debugging to SSE diagnostic test
Enhanced JavaScript debugging with:
- Detailed connection state logging (CONNECTING/OPEN/CLOSED)
- EventSource object inspection
- Real-time readyState monitoring
- Verbose error information (type, target, readyState)
- URL and location details for troubleshooting
- Console grouping for organized output

This will help diagnose SSE connection issues by providing
detailed information in the browser console.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-26 13:57:28 -08:00
54910a5df3 Fix misleading keepalive comment in PHP server
The comment said keepalives were sent every 15 seconds, but the code
uses sleep(1), so they're actually sent every 1 second. This is correct
for SSE connections - frequent keepalives prevent proxy timeouts.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-26 13:12:09 -08:00
af1c1231b6 Fix PHP-FPM compatibility and add server diagnostics
Changes to .htaccess:
- Removed php_flag and php_value directives (don't work with php-fpm)
- Simplified DirectoryMatch to FilesMatch for .json files
- Added note about configuring PHP settings in php.ini/pool config
- More compatible with user directories

Added diagnostic.php:
- Tests PHP version, extensions, and configuration
- Checks storage directory permissions
- Tests Server-Sent Events (SSE) connection
- Shows server API type (php-fpm vs mod_php)
- Provides troubleshooting hints for common issues
- Live SSE connection test with detailed logging

Added data/index.php:
- Blocks direct access to data directory
- Returns 403 Forbidden

Fixes:
- php-fpm environments not respecting .htaccess PHP settings
- DirectoryMatch issues in user directories
- Hard to diagnose connection problems

Usage: Navigate to diagnostic.php to troubleshoot server issues
2025-12-26 12:45:23 -08:00
1acdb065c5 Fix uv index: Use explicit=true for PyTorch index
- Added explicit=true to pytorch-cu121 index
- Only torch, torchvision, torchaudio use PyTorch index
- All other packages (requests, fastapi, etc.) use PyPI
- Fixes: requests version conflict (PyTorch index has 2.28.1, we need >=2.31.0)

How explicit=true works:
- PyTorch index only checked for packages listed in tool.uv.sources
- Prevents dependency confusion and version conflicts
- Best practice for supplemental package indexes
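
As a rough illustration of the rule above, the following is a simplified model of index selection, not uv's actual resolver. The `INDEXES` and `SOURCES` dictionaries mirror the `[[tool.uv.index]]` and `[tool.uv.sources]` tables in pyproject.toml; `candidate_indexes` is a hypothetical helper, and the ordering of the non-explicit case is an assumption about how an additional index is searched ahead of PyPI.

```python
# Mirrors [[tool.uv.index]] entries; "pypi" stands for the default index.
INDEXES = [
    {"name": "pytorch-cu121",
     "url": "https://download.pytorch.org/whl/cu121",
     "explicit": True},
    {"name": "pypi", "url": "https://pypi.org/simple"},
]

# Mirrors [tool.uv.sources]: packages pinned to a named index.
SOURCES = {
    "torch": "pytorch-cu121",
    "torchvision": "pytorch-cu121",
    "torchaudio": "pytorch-cu121",
}


def candidate_indexes(package, indexes, sources):
    """Return the index names consulted for a package (simplified model)."""
    pinned = sources.get(package)
    if pinned is not None:
        # A pinned package is looked up only on its named index.
        return [pinned]
    # Unpinned packages skip every index marked explicit.
    return [idx["name"] for idx in indexes if not idx.get("explicit", False)]


def without_explicit(indexes):
    """Same index list with the explicit flag cleared, for comparison."""
    return [{**idx, "explicit": False} for idx in indexes]
```

Under this model, `candidate_indexes("requests", INDEXES, SOURCES)` consults only PyPI, while the same call against `without_explicit(INDEXES)` would also consult the PyTorch index, which is where the older requests version came from.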
2025-12-26 12:16:08 -08:00
a5556c475d Fix uv index configuration: Use PyTorch CUDA as additional index
- Changed from 'default' to named additional index
- Added tool.uv.sources to specify torch comes from pytorch-cu121 index
- Other packages (fastapi, uvicorn, etc.) still come from PyPI
- Fixes: 'fastapi was not found in the package registry' error

How it works:
- PyPI remains the default index for most packages
- torch package explicitly uses pytorch-cu121 index
- Best of both worlds: CUDA PyTorch + all other packages from PyPI
2025-12-26 12:13:40 -08:00
0bcd8e8d21 Configure uv to always use PyTorch CUDA index
Changes:
- Set PyTorch CUDA index (cu121) as default for all builds
- CUDA builds support both GPU and CPU (auto-fallback)
- Fixes uv run reinstalling CPU-only PyTorch
- Updated dependency-groups syntax (fixes deprecation warning)

Benefits:
- Simpler build process - no CPU vs CUDA distinction needed
- uv sync and uv run now get CUDA-enabled PyTorch automatically
- Builds work on systems with or without NVIDIA GPUs
- Fixes issue where uv run check_cuda.py was getting CPU version

Index: https://download.pytorch.org/whl/cu121 (PyTorch 2.5.1+cu121)
2025-12-26 12:08:42 -08:00
8604662262 Add CUDA diagnostic script for troubleshooting GPU detection
- Checks PyTorch installation and version
- Detects CUDA availability and GPU info
- Tests CUDA with simple tensor operation
- Shows device manager detection results
- Provides troubleshooting hints for CPU-only builds

Usage: python check_cuda.py or uv run check_cuda.py
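
A minimal sketch of the kind of checks listed above, assuming only the public PyTorch API. It is not the contents of check_cuda.py: `cuda_report` and its dictionary keys are hypothetical, and the device manager detection step is omitted because its interface is not shown here.

```python
def cuda_report():
    """Collect basic PyTorch/CUDA facts without failing when torch is missing."""
    report = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_available": False,
        "devices": [],
        "tensor_test": None,
    }
    try:
        import torch
    except ImportError:
        # No PyTorch at all: return the defaults so the caller can print a hint.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["cuda_available"] = torch.cuda.is_available()

    if report["cuda_available"]:
        report["devices"] = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
        # Simple tensor operation on the GPU: sum of four ones should be 4.
        x = torch.ones(4, device="cuda")
        report["tensor_test"] = float(x.sum().item()) == 4.0
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `torch_installed` is true but `cuda_available` is false on a machine with an NVIDIA GPU, a CPU-only PyTorch build is one likely cause worth checking.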
2025-12-26 12:00:37 -08:00
d51b24e2e5 Move FastAPI and uvicorn to main dependencies
- Web server is always-running (not optional) for OBS integration
- Users no longer need to manually install fastapi and uvicorn
- Previously required: uv pip install "fastapi[standard]" uvicorn
- Now auto-installed with: uv sync

Fixes: Missing FastAPI/uvicorn dependencies on fresh Windows installs
2025-12-26 11:57:50 -08:00