New features:
- Settings > Transcription Engine > "Change Transcription Engine"
button stops the sidecar, deletes downloaded files, and reloads
the app to show the engine selection screen
- Improved SidecarSetup descriptions with detailed explanations
of each variant and "Recommended" tag on Cloud (Deepgram)
- Cloud option listed first as the recommended choice
- New reset_sidecar Tauri command that cleans up sidecar files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lightweight Deepgram-only sidecar that excludes PyTorch, faster-whisper,
RealtimeSTT, and CUDA. Only includes audio capture + WebSocket streaming
to Deepgram. Requires a Deepgram API key (BYOK or managed mode).
Changes:
- client/models.py: Extracted TranscriptionResult into standalone module
so deepgram_transcription.py doesn't transitively import torch
- backend/app_controller.py: Made RealtimeTranscriptionEngine and
DeviceManager imports lazy (only loaded when remote.mode == "local")
- local-transcription-cloud.spec: PyInstaller spec excluding all ML deps
- SidecarSetup.svelte: Added "Cloud Only (Deepgram)" variant option
- build-sidecar-cloud.yml: CI workflow building cloud sidecar for all three OSes
- sidecar-release.yml: Dispatches cloud build alongside CPU/CUDA builds
Sidecar download options are now:
- Standard (CPU): ~500 MB - local Whisper on any computer
- GPU Accelerated (CUDA): ~2 GB - local Whisper with NVIDIA GPU
- Cloud Only (Deepgram): ~50 MB - requires API key, no local models
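
The lazy-import change can be illustrated with a minimal sketch. The
function name and the default module name are hypothetical; they stand
in for whichever module pulls in the heavy ML dependencies:

```python
import importlib


def load_local_engine(remote_mode, module_name="client.local_engine"):
    """Import the local-engine module only when local mode is selected.

    `module_name` is a hypothetical placeholder for the module that
    transitively imports torch / faster-whisper.
    """
    if remote_mode != "local":
        # Cloud modes return early, so the heavy module is never imported.
        return None
    return importlib.import_module(module_name)
```

In a cloud-only build the import is never reached, so the excluded
packages do not need to be present in the bundle.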
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Restored the font configuration that was missing from the Tauri
rewrite. Settings now include:
- Font Source: System Font, Web-Safe, Google Font
- System Font: text input for any installed font family
- Web-Safe: dropdown with 13 universal fonts (Arial, Courier New, etc.)
- Google Font: dropdown with 35 fonts organized by category
(Sans Serif, Serif, Monospace, Display, Handwriting)
- Font Size: range slider (8-32px)
All font settings are saved to config and applied to the OBS web
display and server sync.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three changes to reduce transcription delay:
1. Send loop: queue.get() was blocking the asyncio event loop, stalling
the receive loop and delaying transcription results. Now uses
run_in_executor() to avoid blocking the event loop.
2. Block size: reduced from 4096 (~256ms) to 1024 (~64ms) for more
frequent, smaller audio chunks. Deepgram handles streaming better
with smaller packets.
3. Added punctuate=true and smart_format=true to Deepgram BYOK
params for cleaner transcription output.
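
A rough sketch of changes 1 and 2, not the actual sidecar code. The
16 kHz sample rate is an assumption inferred from the ~256ms / ~64ms
figures, and the None stop sentinel and the `send` callable are
hypothetical:

```python
import asyncio
import queue


def block_duration_ms(block_size, sample_rate=16000):
    """Milliseconds of audio per block (sample rate is an assumption)."""
    return block_size * 1000 / sample_rate


async def send_loop(audio_queue, send):
    """Forward queued audio chunks without blocking the event loop."""
    loop = asyncio.get_running_loop()
    while True:
        # queue.Queue.get() blocks, so run it in the default thread pool
        # and await the result; the receive loop keeps running meanwhile.
        chunk = await loop.run_in_executor(None, audio_queue.get)
        if chunk is None:  # hypothetical stop sentinel
            break
        await send(chunk)
```

Under that assumption, block_duration_ms(4096) is 256.0 and
block_duration_ms(1024) is 64.0.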
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Dev mode: use `uv run python` instead of bare `python` to ensure
the project venv is used. Also use CARGO_MANIFEST_DIR to find the
project root reliably.
2. Engine reload: changing remote.mode (local/managed/byok) now
triggers a full engine reload. Previously only model and device
changes triggered reload, so switching to Deepgram had no effect
until the app was restarted.
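
The reload decision in item 2 might look roughly like this. Only
remote.mode is a key name taken from this change; the model and device
key names, and the flat dotted-key dict, are assumptions made for
illustration:

```python
# Settings whose change requires rebuilding the transcription engine.
# "transcription.model" and "transcription.device" are hypothetical names.
RELOAD_KEYS = ("transcription.model", "transcription.device", "remote.mode")


def needs_engine_reload(old, new):
    """Return True if any engine-affecting setting differs."""
    return any(old.get(key) != new.get(key) for key in RELOAD_KEYS)
```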
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Removed push triggers from both coordinator workflows. They now
only run via workflow_dispatch (manual "Run workflow" button).
Re-enable push triggers once the build pipeline is stable.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two issues causing all builds to fail:
1. Cleanup steps deleted git tags along with releases. Since builds
are dispatched asynchronously, they tried to checkout tags that
had already been deleted. Now cleanup only deletes releases (which
frees storage by removing assets) but preserves git tags.
2. Linux/macOS build workflows used $GITHUB_OUTPUT step outputs for
the tag, which is unreliable on Gitea runners. Switched to the
same job-level env var pattern (RELEASE_TAG) that works on Windows.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PyInstaller frozen executables buffer stdout when it is piped to the
parent process (no TTY). Even with flush=True in Python, the OS-level
pipe buffer can delay output. This prevented the ready event from
reaching the Tauri app, causing the "Starting sidecar..." hang.
Fix: set PYTHONUNBUFFERED=1 env var on both prod and dev sidecar
commands, plus -u flag for dev mode Python.
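
The real launcher lives on the Rust side; the following is only a
Python analog of the same idea, with hypothetical helper names. It uses
sys.executable where the dev command would use `uv run python`:

```python
import os
import subprocess
import sys


def dev_command(script_args):
    """Dev-mode command line: -u forces unbuffered stdout/stderr."""
    return [sys.executable, "-u", *script_args]


def spawn_unbuffered(args):
    """Spawn a child with PYTHONUNBUFFERED=1 and stdout captured via a pipe."""
    env = dict(os.environ, PYTHONUNBUFFERED="1")
    return subprocess.Popen(args, stdout=subprocess.PIPE, env=env, text=True)
```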
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two issues caused the app to freeze on "Starting sidecar...":
1. wait_for_ready() used a blocking BufReader::lines() iterator
with a timeout check between lines. If the sidecar produced no
stdout output (crashed, missing binary, or slow model loading),
the read blocked forever. Now uses a background thread with
mpsc::recv_timeout() for a real 120s deadline.
2. start_sidecar was a synchronous Tauri command that blocked the
main thread during the entire sidecar startup (up to 120s).
Now async via tokio::task::spawn_blocking, keeping the UI responsive.
Also logs all sidecar stdout lines to stderr with [sidecar-stdout]
prefix for debugging.
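
The pattern in item 1 (blocking reads on a background thread, a timed
receive on the caller's side) is sketched below in Python rather than
Rust, with queue.Queue standing in for the mpsc channel. The "ready"
substring match is a simplification; the real event format is not shown
here:

```python
import queue
import sys
import threading
import time


def wait_for_ready(stream, timeout_s=120.0, marker="ready"):
    """Return the first line containing `marker`; raise on EOF or timeout."""
    lines = queue.Queue()

    def pump():
        # Blocking reads happen here, off the caller's thread.
        for line in stream:
            lines.put(line)
        lines.put(None)  # EOF: the process closed its stdout

    threading.Thread(target=pump, daemon=True).start()
    deadline = time.monotonic() + timeout_s
    while True:
        remaining = deadline - time.monotonic()
        try:
            # The timed get enforces the deadline even if no line ever arrives.
            line = lines.get(timeout=max(remaining, 0.0))
        except queue.Empty:
            raise TimeoutError(f"sidecar not ready after {timeout_s}s") from None
        if line is None:
            raise RuntimeError("sidecar exited before sending the ready event")
        print(f"[sidecar-stdout] {line.rstrip()}", file=sys.stderr)
        if marker in line:
            return line
```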
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Major version bump reflecting the architecture change from PySide6/Qt
to Tauri v2 + Svelte 5 with cross-platform support for Windows,
macOS, and Linux.
Key changes since v1.4.0:
- Tauri v2 native desktop shell replacing PySide6/Qt
- Svelte 5 reactive frontend
- Headless Python backend as a downloadable sidecar
- Deepgram cloud transcription (managed + BYOK)
- Gitea CI/CD with per-OS builds and automated releases
- Sidecar auto-update checking on startup
- 63-test suite (Python + Svelte + Rust)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On launch, after confirming the sidecar is installed, the app now
checks for a newer sidecar version via the Gitea API. If an update
is available, shows a prompt with "Update Now" or "Skip":
- Update Now: shows the SidecarSetup download screen
- Skip: launches the existing sidecar version
The update check is non-fatal -- if it fails (no internet, API
error), the app silently proceeds with the current version.
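
A sketch of the fail-open behaviour, assuming tags of the form
sidecar-v1.0.3. The function names are hypothetical, and the network
call is passed in as a callable so the logic stays self-contained:

```python
def parse_version(tag):
    """'sidecar-v1.0.3' -> (1, 0, 3); raises ValueError on odd tags."""
    return tuple(int(part) for part in tag.rsplit("v", 1)[-1].split("."))


def check_for_update(current_tag, fetch_latest_tag):
    """Return the newer tag, or None if up to date or the check failed."""
    try:
        latest = fetch_latest_tag()
        if parse_version(latest) > parse_version(current_tag):
            return latest
    except Exception:
        # No internet, API error, unparsable tag: proceed with current version.
        pass
    return None
```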
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two bugs preventing sidecar from starting:
1. Directory was "sidecar-sidecar-v1.0.3" (double prefix) because
sidecar_dir_for_version() prepended "sidecar-" to a version that
already contained it. Now uses the tag directly as the dir name.
2. After a crash, the Python InstanceLock PID file at
~/.local-transcription/app.lock remained, blocking the next launch
with "Another instance is already running". Now clears the stale
lock file before spawning the sidecar.
Also fixed cleanup_old_versions() and tests to match the corrected
directory naming.
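
Both fixes, expressed as a Python sketch of the logic (the actual code
is on the Rust side; clear_stale_lock is a hypothetical name):

```python
from pathlib import Path


def sidecar_dir_for_version(base_dir, tag):
    """Use the release tag itself as the directory name, with no extra prefix."""
    return Path(base_dir) / tag


def clear_stale_lock(lock_path):
    """Remove a leftover lock file; return True if one was removed."""
    try:
        Path(lock_path).unlink()
        return True
    except FileNotFoundError:
        return False
```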
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the sidecar process exits before sending the ready event, the
error message now includes the last 10 lines of stderr. Stderr is
captured in a background thread and written to sidecar.log in the
app data directory.
This helps diagnose why the PyInstaller sidecar fails to start
(missing DLLs, import errors, permission issues, etc.).
Log location: %APPDATA%\net.anhonesthost.local-transcription\sidecar.log
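
A minimal Python sketch of the capture pattern (background reader, full
log on disk, bounded tail in memory). The function names and the exact
error text are hypothetical:

```python
import collections
import threading


def capture_stderr(stream, log_path, keep=10):
    """Copy `stream` to `log_path` and keep the last `keep` lines in memory."""
    tail = collections.deque(maxlen=keep)

    def pump():
        with open(log_path, "a", encoding="utf-8") as log:
            for line in stream:
                log.write(line)
                log.flush()
                tail.append(line.rstrip("\n"))

    thread = threading.Thread(target=pump, daemon=True)
    thread.start()
    return tail, thread


def early_exit_message(tail):
    """Build the error shown when the sidecar dies before the ready event."""
    return "Sidecar exited before ready. Last stderr lines:\n" + "\n".join(tail)
```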
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- test.yml: use uv venv instead of pip --break-system-packages
- release.yml: inline test job that must pass before version bump;
only triggers on source file changes (src/, src-tauri/, package.json)
- sidecar-release.yml: inline Python test job that must pass before
sidecar version bump
- Both coordinators use `needs: test` so builds never start if tests fail
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both release.yml and sidecar-release.yml were updating version.py,
causing merge conflicts when both ran on the same push. Now:
- release.yml (app) owns: package.json, tauri.conf.json, Cargo.toml, version.py
- sidecar-release.yml owns: pyproject.toml only
Also deleted the stale sidecar-v1.0.4 tag that failed to push.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Test suite covering all three layers:
Python backend (25 tests):
- AppController: state machine, start/stop, callbacks, settings reload
- API server: REST endpoints, config CRUD, status, devices
- Config: dot-notation get/set, persistence, nested paths
- Main headless: ready event port format validation
Svelte frontend (14 tests via Vitest):
- Backend store: exported properties/methods, port derivation, URLs
- Config store: method names (fetchConfig not loadConfig), defaults
- Transcriptions store: add/clear/plaintext
- File extension regression: ensures $state runes only in .svelte.ts
Rust sidecar (24 tests via cargo test):
- Platform/arch detection, asset name construction
- Ready event deserialization (with extra fields tolerance)
- Path construction, version read/write, old version cleanup
- Zip extraction, SidecarManager lifecycle
CI workflow (.gitea/workflows/test.yml):
- Runs on push to main and PRs
- Three parallel jobs: Python, Frontend, Rust
Also fixes three bugs found during test planning:
- Settings: /api/check-updates -> GET /api/check-update
- Settings: /api/remote/login -> /api/login
- Settings: /api/remote/register -> /api/register
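
For reference, the dot-notation config access exercised by the Python
tests can be sketched as follows. This is an illustrative
implementation, not the project's Config class; config_get and
config_set are hypothetical names:

```python
def config_get(cfg, path, default=None):
    """Read a nested value via a dotted path such as 'remote.mode'."""
    node = cfg
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def config_set(cfg, path, value):
    """Write a nested value, creating intermediate dicts as needed."""
    *parents, leaf = path.split(".")
    node = cfg
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value
```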
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three issues fixed:
1. Port mismatch: The sidecar reported the OBS port (8080) in the
ready event but the frontend needs the API port (8081). Now reports
the API port so WebSocket/REST connects to the right place.
2. Broadcast from wrong thread: Engine init fires state_changed from
a background thread, but _broadcast_control used get_event_loop()
which returns the wrong loop. Now captures the uvicorn event loop
at startup via on_event("startup").
3. Missed ready state: If the engine finishes before the WebSocket
client connects, the "ready" state_changed was never received.
Added status polling (GET /api/status) on WebSocket connect that
retries every 2s while appState is "initializing".
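
Issue 2 can be sketched as below. The loop is captured once from a
coroutine running on the server's loop, and background threads hand
work to it with asyncio.run_coroutine_threadsafe. Using that call is an
assumption about how the captured loop is used, and the Broadcaster
class and its method names are hypothetical:

```python
import asyncio
import threading


class Broadcaster:
    def __init__(self):
        self.loop = None
        self.sent = []

    async def on_startup(self):
        # Runs on the server's loop, so this is the loop that owns the sockets.
        self.loop = asyncio.get_running_loop()

    async def _send(self, message):
        # Stand-in for writing to every connected WebSocket client.
        self.sent.append(message)

    def broadcast_from_thread(self, message):
        """Safe to call from any thread once on_startup has run."""
        if self.loop is None:
            raise RuntimeError("startup hook has not run yet")
        return asyncio.run_coroutine_threadsafe(self._send(message), self.loop)
```

The returned concurrent.futures.Future lets the calling thread wait for
delivery or ignore the result.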
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fixed method call from saveConfig (doesn't exist) to updateConfig
- Save button shows "Saving..." while in progress, disabled during save
- Green "Settings saved!" message appears on success before closing
- Red error message shown on failure
- Cancel button disabled during save
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BYOK mode connects directly to Deepgram (wss://api.deepgram.com),
so the server URL field was incorrect. Now:
- BYOK shows a Deepgram API Key field with link to console.deepgram.com
- Managed shows the Server URL field (for the transcription proxy)
- Local shows neither
- API key is saved as remote.byok_api_key in config
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The config store exports fetchConfig() but App.svelte was calling
the nonexistent loadConfig(), causing a TypeError that prevented
the sidecar from launching.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tauri v2 requires explicit permission grants. The SidecarSetup
component uses listen() from @tauri-apps/api/event to receive
download progress, which requires core:event:allow-listen.
Added default capability with core, event, shell, dialog, and
process permissions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Svelte 5 runes ($state, $derived, $effect) are only compiled in
.svelte and .svelte.ts files. The stores used runes in plain .ts
files, which meant $state was treated as an undefined function at
runtime, crashing the JS before anything rendered.
- Renamed backend.ts -> backend.svelte.ts
- Renamed config.ts -> config.svelte.ts
- Renamed transcriptions.ts -> transcriptions.svelte.ts
- Added .svelte.ts to Vite resolve extensions
- Added missing obsUrl/syncUrl getters to backend store
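
A regression check for this class of bug could look like the sketch
below. It is a naive text scan (it would also match runes mentioned in
comments or strings), and misplaced_rune_files is a hypothetical name:

```python
import re
from pathlib import Path

# Rune names taken from the description above.
RUNE = re.compile(r"\$(state|derived|effect)\b")


def misplaced_rune_files(root):
    """Return plain .ts files under `root` that appear to use Svelte runes."""
    offenders = []
    for path in sorted(Path(root).rglob("*.ts")):
        if path.name.endswith(".svelte.ts"):
            continue  # runes are compiled in these files
        if RUNE.search(path.read_text(encoding="utf-8")):
            offenders.append(path)
    return offenders
```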
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Step outputs via GITHUB_OUTPUT are unreliable with act runner on
Windows (BOM encoding issues). Replaced with job-level env var
RELEASE_TAG set directly from inputs.tag, and checkout ref also
uses inputs.tag directly. Eliminated the Determine tag step
entirely, since no intermediate output is needed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The act runner on Windows doesn't have bash available. Switched back
to PowerShell with the inputs.tag fallback chain. Uses Out-File for
GITHUB_OUTPUT instead of echo redirection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The workflow_dispatch input was accessed as github.event.inputs.tag
which can be empty depending on the Gitea runner. Now tries both
inputs.tag (modern syntax) and github.event.inputs.tag as fallback,
with a final fallback to the latest matching git tag.
Also switched Windows Determine-tag steps from PowerShell to bash
(via Git Bash) for consistency with the other platforms.
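
The fallback chain, written out as a Python sketch rather than the
workflow script itself. Treating "latest" as the highest version number
among tags with the given prefix is an assumption, and resolve_tag is a
hypothetical name:

```python
def version_key(tag, prefix):
    """'v2.0.10' with prefix 'v' -> (2, 0, 10)."""
    return tuple(int(part) for part in tag[len(prefix):].split("."))


def resolve_tag(inputs_tag, event_inputs_tag, git_tags, prefix="v"):
    """Pick the tag to build: explicit inputs first, then the latest git tag."""
    for candidate in (inputs_tag, event_inputs_tag):
        if candidate:  # skips both None and the empty string
            return candidate
    matching = [tag for tag in git_tags if tag.startswith(prefix)]
    if not matching:
        raise ValueError(f"no git tag matching {prefix}*")
    return max(matching, key=lambda tag: version_key(tag, prefix))
```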
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously per-OS build workflows triggered on tag push events, but
Gitea doesn't fire events for tags pushed by other workflows. Now:
- release.yml dispatches build-app-{linux,windows,macos}.yml via
the Gitea API after creating the tag and release
- sidecar-release.yml dispatches build-sidecar-{linux,windows,macos}.yml
Per-OS workflows changed from push+dispatch triggers to dispatch-only
with tag as a required input. To re-run a failed build for the same
version, just dispatch the specific OS workflow with the same tag --
upload logic replaces existing assets automatically.
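
A sketch of the fan-out, building the requests without sending them.
The endpoint path, the payload shape, and the default ref are
assumptions modelled on a GitHub-style workflow dispatch route; check
them against the API documentation for the Gitea version in use. The
function name is hypothetical:

```python
import json
import urllib.request

OS_NAMES = ("linux", "windows", "macos")


def build_dispatch_requests(base_url, owner, repo, kind, tag, token, ref="main"):
    """Build one dispatch request per OS for build-{kind}-{os}.yml."""
    requests = []
    for os_name in OS_NAMES:
        workflow = f"build-{kind}-{os_name}.yml"
        # Assumed route; not confirmed by this change.
        url = (f"{base_url}/api/v1/repos/{owner}/{repo}"
               f"/actions/workflows/{workflow}/dispatches")
        body = json.dumps({"ref": ref, "inputs": {"tag": tag}}).encode("utf-8")
        requests.append(urllib.request.Request(
            url,
            data=body,
            method="POST",
            headers={"Authorization": f"token {token}",
                     "Content-Type": "application/json"},
        ))
    return requests
```

Each request would then be sent with urllib.request.urlopen; re-running
a single failed platform means sending only that platform's request
with the same tag.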
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Added write_log Tauri command that writes to frontend.log in app data dir
- App.svelte now logs each startup step (Tauri import, sidecar check, launch)
- Startup overlays use inline styles as fallback so they're visible even if
CSS variables fail to load
- Debug status shown on the checking/connecting screens
- Rust side logs startup info to app.log (resource dir, data dir)
Log file locations: %APPDATA%/net.anhonesthost.local-transcription/ (Windows)
or ~/Library/Application Support/net.anhonesthost.local-transcription/ (macOS)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Refactored from 2 monolithic workflows into 8 targeted ones:
Coordinators (version bump + tag + release creation):
- release.yml: bumps app version, tags v*, creates Gitea release
- sidecar-release.yml: bumps sidecar version, tags sidecar-v*
Per-OS app builds (triggered by v* tags or workflow_dispatch):
- build-app-linux.yml: .deb, .rpm, .AppImage
- build-app-windows.yml: .msi, -setup.exe
- build-app-macos.yml: .dmg
Per-OS sidecar builds (triggered by sidecar-v* tags or workflow_dispatch):
- build-sidecar-linux.yml: CUDA + CPU variants
- build-sidecar-windows.yml: CUDA + CPU variants
- build-sidecar-macos.yml: CPU only
Each build workflow can be re-triggered independently without
re-running the version bump or rebuilding other platforms.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>