The Start Transcription button now shows an error message when starting
fails instead of silently reverting. Common causes:
- Missing PortAudio library on Linux
- Audio device not accessible
- Deepgram connection failure
Also added error details to backend console output and captured
the last error from the Deepgram engine for better diagnostics.
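
A minimal sketch of the error-capture pattern described above, assuming the engine exposes a last_error attribute and start() returns a boolean. FakeDeepgramEngine, start_transcription, and the response shape are hypothetical names used for illustration only, not the project's actual API:

```python
class FakeDeepgramEngine:
    """Hypothetical stand-in for the Deepgram engine (illustration only)."""

    def __init__(self, fail_with=None):
        self.last_error = None
        self._fail_with = fail_with

    def start(self) -> bool:
        # Record the reason for a failure so callers can report it later.
        if self._fail_with:
            self.last_error = self._fail_with
            return False
        self.last_error = None
        return True


def start_transcription(engine) -> dict:
    """Start the engine and return a result the UI can display."""
    if engine.start():
        return {"ok": True, "error": None}
    detail = engine.last_error or "Unknown error"
    # Mirror the failure detail to the backend console for diagnostics.
    print(f"[backend] Failed to start transcription: {detail}")
    return {"ok": False, "error": detail}
```

With this shape the frontend can show result["error"] next to the button instead of reverting without explanation.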
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CUDA sidecars are ~2 GB and too slow to upload from the Windows runner.
Cloud (Deepgram) provides faster transcription anyway. Removed:
- CUDA build steps from Windows and Linux sidecar workflows
- CUDA option from the SidecarSetup download screen
Remaining sidecar variants:
- Cloud (Deepgram): ~50 MB - recommended for most users
- Local CPU: ~500 MB - for offline/privacy use
CUDA can be revisited once the managed Deepgram service is ready.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Rust backend emits {downloaded, total, phase, message}, but the
Svelte component was reading event.payload.progress, which doesn't
exist, so the progress value was NaN. The component now calculates the
percentage from downloaded/total.
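
The arithmetic is sketched here in Python for brevity; the real component is Svelte, and progress_percent is a hypothetical helper name. The sketch assumes the payload fields listed above and guards against a missing or zero total, which would otherwise produce an invalid value:

```python
def progress_percent(payload: dict) -> float:
    """Return download progress as 0-100, tolerating a missing or zero total."""
    downloaded = payload.get("downloaded") or 0
    total = payload.get("total") or 0
    if total <= 0:
        # Total size not known yet: report 0 rather than dividing by zero.
        return 0.0
    # Clamp in case more bytes arrive than the reported total.
    return min(100.0, downloaded / total * 100.0)
```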
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After calling POST /api/stop, the button stayed on "Stop Transcription"
because the state update depended on the WebSocket broadcast, which can
be delayed or missed (event loop threading issue).
Fix: poll GET /api/status immediately after start/stop API calls to
update the UI state directly, rather than waiting for the WebSocket.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New features:
- Settings > Transcription Engine > "Change Transcription Engine"
button stops the sidecar, deletes downloaded files, and reloads
the app to show the engine selection screen
- Improved SidecarSetup descriptions with detailed explanations
of each variant and "Recommended" tag on Cloud (Deepgram)
- Cloud option listed first as the recommended choice
- New reset_sidecar Tauri command that cleans up sidecar files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lightweight Deepgram-only sidecar that excludes PyTorch, faster-whisper,
RealtimeSTT, and CUDA. Only includes audio capture + WebSocket streaming
to Deepgram. Requires a Deepgram API key (BYOK or managed mode).
Changes:
- client/models.py: Extracted TranscriptionResult into standalone module
so deepgram_transcription.py doesn't transitively import torch
- backend/app_controller.py: Made RealtimeTranscriptionEngine and
DeviceManager imports lazy (only loaded when remote.mode == "local")
- local-transcription-cloud.spec: PyInstaller spec excluding all ML deps
- SidecarSetup.svelte: Added "Cloud Only (Deepgram)" variant option
- build-sidecar-cloud.yml: CI workflow building the cloud sidecar for all three OSes
- sidecar-release.yml: Dispatches cloud build alongside CPU/CUDA builds
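
The lazy-import change in app_controller.py can be sketched roughly as follows. This is an illustration under assumptions, not the actual code: LOCAL_ENGINE_MODULE is a hypothetical module path and load_local_engine_module is a hypothetical helper.

```python
import importlib

# Hypothetical module path; the real engine module name is not given here.
LOCAL_ENGINE_MODULE = "client.transcription_engine_realtime"


def load_local_engine_module(mode: str, module_name: str = LOCAL_ENGINE_MODULE):
    """Import the heavy local-engine module only when running in local mode."""
    if mode != "local":
        # Cloud modes never touch the ML stack, so the import is skipped and
        # a build without those packages can still start.
        return None
    # Raises ImportError if the module is absent, e.g. in a cloud-only build.
    return importlib.import_module(module_name)
```

Because the import happens inside the function, a build that excludes the ML dependencies only fails if local mode is actually requested.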
Sidecar download options are now:
- Standard (CPU): ~500 MB - local Whisper on any computer
- GPU Accelerated (CUDA): ~2 GB - local Whisper with NVIDIA GPU
- Cloud Only (Deepgram): ~50 MB - requires API key, no local models
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Restored the font configuration that was missing from the Tauri
rewrite. Settings now include:
- Font Source: System Font, Web-Safe, Google Font
- System Font: text input for any installed font family
- Web-Safe: dropdown with 13 universal fonts (Arial, Courier New, etc.)
- Google Font: dropdown with 35 fonts organized by category
(Sans Serif, Serif, Monospace, Display, Handwriting)
- Font Size: range slider (8-32px)
All font settings are saved to config and applied to the OBS web
display and server sync.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On launch, after confirming the sidecar is installed, the app now
checks for a newer sidecar version via the Gitea API. If an update
is available, it shows a prompt with "Update Now" or "Skip":
- Update Now: shows the SidecarSetup download screen
- Skip: launches the existing sidecar version
The update check is non-blocking -- if it fails (no internet, API
error), the app silently proceeds with the current version.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Test suite covering all three layers:
Python backend (25 tests):
- AppController: state machine, start/stop, callbacks, settings reload
- API server: REST endpoints, config CRUD, status, devices
- Config: dot-notation get/set, persistence, nested paths
- Main headless: ready event port format validation
Svelte frontend (14 tests via Vitest):
- Backend store: exported properties/methods, port derivation, URLs
- Config store: method names (fetchConfig not loadConfig), defaults
- Transcriptions store: add/clear/plaintext
- File extension regression: ensures $state runes only in .svelte.ts
Rust sidecar (24 tests via cargo test):
- Platform/arch detection, asset name construction
- Ready event deserialization (with extra fields tolerance)
- Path construction, version read/write, old version cleanup
- Zip extraction, SidecarManager lifecycle
CI workflow (.gitea/workflows/test.yml):
- Runs on push to main and PRs
- Three parallel jobs: Python, Frontend, Rust
Also fixes three bugs found during test planning:
- Settings: /api/check-updates -> GET /api/check-update
- Settings: /api/remote/login -> /api/login
- Settings: /api/remote/register -> /api/register
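
For reference, the dot-notation get/set behaviour exercised by the Config tests can be sketched as below. DotConfig is a hypothetical minimal class written for illustration; the project's real Config class also handles persistence, which is omitted here:

```python
class DotConfig:
    """Minimal nested-dict config with dot-notation access (illustration only)."""

    def __init__(self, data=None):
        self._data = data if data is not None else {}

    def get(self, path, default=None):
        node = self._data
        for key in path.split("."):
            # Stop as soon as the path leaves the nested dict structure.
            if not isinstance(node, dict) or key not in node:
                return default
            node = node[key]
        return node

    def set(self, path, value):
        *parents, leaf = path.split(".")
        node = self._data
        for key in parents:
            child = node.get(key)
            if not isinstance(child, dict):
                # Create (or overwrite) intermediate levels as needed.
                child = {}
                node[key] = child
            node = child
        node[leaf] = value
```

A test along these lines would set a nested key such as remote.mode and then read it back, including the missing-path case that falls through to the default.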
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three issues fixed:
1. Port mismatch: The sidecar reported the OBS port (8080) in the
ready event, but the frontend needs the API port (8081). It now reports
the API port so WebSocket and REST requests go to the right place.
2. Broadcast from wrong thread: Engine init fires state_changed from
a background thread, but _broadcast_control used get_event_loop(),
which in that thread does not return the loop uvicorn is running on.
Now captures the uvicorn event loop at startup via on_event("startup").
3. Missed ready state: If the engine finished initializing before the
WebSocket client connected, the "ready" state_changed event was never
received. Added status polling (GET /api/status) on WebSocket connect
that retries every 2s while appState is "initializing".
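
The cross-thread broadcast fix in item 2 can be illustrated with plain asyncio. This is a sketch under assumptions: ControlBroadcaster, capture_loop, and the message shape are hypothetical, and capture_loop stands in for whatever runs inside the server's startup hook.

```python
import asyncio


class ControlBroadcaster:
    """Schedules broadcasts on a captured event loop (illustration only)."""

    def __init__(self):
        self._loop = None
        self.sent = []  # stands in for messages sent to WebSocket clients

    def capture_loop(self):
        # Must be called from code running on the server's loop (startup hook).
        self._loop = asyncio.get_running_loop()

    async def _send(self, message):
        self.sent.append(message)

    def broadcast_threadsafe(self, message):
        """Safe to call from any thread once the loop has been captured."""
        if self._loop is None:
            return None
        return asyncio.run_coroutine_threadsafe(self._send(message), self._loop)


async def demo():
    broadcaster = ControlBroadcaster()
    broadcaster.capture_loop()

    def engine_init_worker():
        # Runs in a background thread, like the engine initialization.
        future = broadcaster.broadcast_threadsafe(
            {"type": "state_changed", "state": "ready"}
        )
        future.result(timeout=5)  # wait until the loop has run the coroutine

    await asyncio.to_thread(engine_init_worker)
    return broadcaster.sent
```

The key point is that the worker thread never looks up a loop itself; it only uses the one captured while running on the server's loop.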
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fixed method call from saveConfig (doesn't exist) to updateConfig
- Save button shows "Saving..." while in progress, disabled during save
- Green "Settings saved!" message appears on success before closing
- Red error message shown on failure
- Cancel button disabled during save
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BYOK mode connects directly to Deepgram (wss://api.deepgram.com),
so the server URL field was incorrect. Now:
- BYOK shows a Deepgram API Key field with link to console.deepgram.com
- Managed shows the Server URL field (for the transcription proxy)
- Local shows neither
- API key is saved as remote.byok_api_key in config
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The config store exports fetchConfig() but App.svelte was calling
the nonexistent loadConfig(), causing a TypeError that prevented
the sidecar from launching.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Svelte 5 runes ($state, $derived, $effect) are only compiled in
.svelte and .svelte.ts files. The stores used runes in plain .ts
files, which meant $state was treated as an undefined function at
runtime, crashing the JS before anything rendered.
- Renamed backend.ts -> backend.svelte.ts
- Renamed config.ts -> config.svelte.ts
- Renamed transcriptions.ts -> transcriptions.svelte.ts
- Added .svelte.ts to Vite resolve extensions
- Added missing obsUrl/syncUrl getters to backend store
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Added write_log Tauri command that writes to frontend.log in app data dir
- App.svelte now logs each startup step (Tauri import, sidecar check, launch)
- Startup overlays use inline styles as fallback so they're visible even if
CSS variables fail to load
- Debug status shown on the checking/connecting screens
- Rust side logs startup info to app.log (resource dir, data dir)
Log file locations: %APPDATA%/net.anhonesthost.local-transcription/ (Windows)
or ~/Library/Application Support/net.anhonesthost.local-transcription/ (macOS)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On first launch, the app now prompts users to download the Python
sidecar (CPU or CUDA variant) from Gitea releases, matching the
voice-to-notes pattern. On subsequent launches, it auto-launches
the sidecar and connects.
New Rust module (src-tauri/src/sidecar/):
- download_sidecar: streams download with progress events, extracts zip
- check_sidecar: verifies installed sidecar binary exists
- check_sidecar_update: compares local vs latest release version
- SidecarManager: launches binary, waits for ready JSON, manages lifecycle
- Dev mode: runs `python -m backend.main_headless` directly
- start_sidecar/stop_sidecar/get_sidecar_port: Tauri commands
New Svelte component (SidecarSetup.svelte):
- First-time setup overlay with CPU/CUDA variant selection
- Download progress bar with byte counter
- Error state with retry, success state with auto-continue
Updated App.svelte state machine:
- checking -> needs_setup -> starting -> connected
- Falls back to direct connection in browser dev mode
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
macOS sidecar: `uv run` re-resolves dependencies using CUDA sources
even after `uv sync --no-sources`. Use UV_NO_SOURCES=1 env var instead
so it applies to all uv commands in the step.
Blank window: When the Tauri app starts without the Python backend
running, it showed a completely blank window. Now shows a "Connecting
to backend..." spinner, or an error state with instructions to start
the backend manually.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The src/lib/ directory was being excluded by a Python .gitignore rule
for lib/ (meant for Python's build output). Changed to /lib/ so it
only matches root-level lib/ and doesn't block src/lib/.
Adds 8 files that were created but missed in the initial commit:
- 5 Svelte components (Header, StatusBar, Controls, TranscriptionDisplay, Settings)
- 3 TypeScript stores (backend, config, transcriptions)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Scaffold the cross-platform rewrite from PySide6/Qt to Tauri + Svelte,
following the same architecture as voice-to-notes. The Python backend
runs headless as a sidecar, with a FastAPI control API that the Svelte
frontend connects to via REST and WebSocket.
New files:
- backend/app_controller.py: Headless orchestration (extracted from MainWindow)
- backend/api_server.py: FastAPI control endpoints + /ws/control WebSocket
- backend/main_headless.py: Headless entry point for sidecar mode
- src-tauri/: Tauri v2 Rust shell with sidecar and dialog plugins
- src/: Svelte 5 frontend (App, Settings, Controls, TranscriptionDisplay)
- src/lib/stores/: Reactive stores for backend connection, config, transcriptions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>