Engines now read user.name from the config object at transcription time
instead of caching it at init, so name changes take effect immediately.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Run apply_settings in thread pool executor to prevent engine reload
from blocking the HTTP response (caused "TypeError: Failed to fetch")
- Flatten nested config objects into dot-notation keys before saving
so partial updates don't wipe out unincluded keys like auth_token
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Expose is_cloud_only flag in /api/status response
- Add isCloudOnly to backend store state
- Conditionally hide Local (Whisper) radio button in Settings
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Start Transcription button now shows the error message when it fails
instead of silently reverting. Common causes:
- Missing PortAudio library on Linux
- Audio device not accessible
- Deepgram connection failure
Also added error details to backend console output and captured
the last error from the Deepgram engine for better diagnostics.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On first launch, the cloud sidecar now:
1. Detects it's the cloud variant (DeviceManager import fails)
2. Auto-switches config from "local" to "byok" mode
3. Shows "Setup needed: Open Settings > Remote Transcription >
enter your Deepgram API key" as a friendly status message
4. Stays in READY state so the UI is fully accessible
The user can then open Settings, enter their Deepgram API key,
save, and start transcribing without needing to know about modes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The cloud sidecar excludes the local Whisper engine module, but on
first launch the config defaults to remote.mode="local" which tries
to import it. Now catches the ImportError gracefully and shows an
error message telling the user to switch to Cloud (Deepgram) mode
in Settings. The API server still starts so Settings is accessible.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lightweight Deepgram-only sidecar that excludes PyTorch, faster-whisper,
RealtimeSTT, and CUDA. Only includes audio capture + WebSocket streaming
to Deepgram. Requires a Deepgram API key (BYOK or managed mode).
Changes:
- client/models.py: Extracted TranscriptionResult into standalone module
so deepgram_transcription.py doesn't transitively import torch
- backend/app_controller.py: Made RealtimeTranscriptionEngine and
DeviceManager imports lazy (only loaded when remote.mode == "local")
- local-transcription-cloud.spec: PyInstaller spec excluding all ML deps
- SidecarSetup.svelte: Added "Cloud Only (Deepgram)" variant option
- build-sidecar-cloud.yml: CI workflow building cloud sidecar for all 3 OS
- sidecar-release.yml: Dispatches cloud build alongside CPU/CUDA builds
Sidecar download options are now:
- Standard (CPU): ~500 MB - local Whisper on any computer
- GPU Accelerated (CUDA): ~2 GB - local Whisper with NVIDIA GPU
- Cloud Only (Deepgram): ~50 MB - requires API key, no local models
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Dev mode: use `uv run python` instead of bare `python` to ensure
the project venv is used. Also use CARGO_MANIFEST_DIR to find the
project root reliably.
2. Engine reload: changing remote.mode (local/managed/byok) now
triggers a full engine reload. Previously only model and device
changes triggered reload, so switching to Deepgram had no effect
until the app was restarted.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Scaffold the cross-platform rewrite from PySide6/Qt to Tauri + Svelte,
following the same architecture as voice-to-notes. The Python backend
runs headless as a sidecar, with a FastAPI control API that the Svelte
frontend connects to via REST and WebSocket.
New files:
- backend/app_controller.py: Headless orchestration (extracted from MainWindow)
- backend/api_server.py: FastAPI control endpoints + /ws/control WebSocket
- backend/main_headless.py: Headless entry point for sidecar mode
- src-tauri/: Tauri v2 Rust shell with sidecar and dialog plugins
- src/: Svelte 5 frontend (App, Settings, Controls, TranscriptionDisplay)
- src/lib/stores/: Reactive stores for backend connection, config, transcriptions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>