Lightweight Deepgram-only sidecar that excludes PyTorch, faster-whisper,
RealtimeSTT, and CUDA. Only includes audio capture + WebSocket streaming
to Deepgram. Requires a Deepgram API key (BYOK or managed mode).
Changes:
- client/models.py: Extracted TranscriptionResult into standalone module
so deepgram_transcription.py doesn't transitively import torch
- backend/app_controller.py: Made RealtimeTranscriptionEngine and
DeviceManager imports lazy (only loaded when remote.mode == "local")
- local-transcription-cloud.spec: PyInstaller spec excluding all ML deps
- SidecarSetup.svelte: Added "Cloud Only (Deepgram)" variant option
- build-sidecar-cloud.yml: CI workflow building cloud sidecar for all 3 OS
- sidecar-release.yml: Dispatches cloud build alongside CPU/CUDA builds
Sidecar download options are now:
- Standard (CPU): ~500 MB - local Whisper on any computer
- GPU Accelerated (CUDA): ~2 GB - local Whisper with NVIDIA GPU
- Cloud Only (Deepgram): ~50 MB - requires API key, no local models
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Dev mode: use `uv run python` instead of bare `python` to ensure
the project venv is used. Also use CARGO_MANIFEST_DIR to find the
project root reliably.
2. Engine reload: changing remote.mode (local/managed/byok) now
triggers a full engine reload. Previously only model and device
changes triggered reload, so switching to Deepgram had no effect
until the app was restarted.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Test suite covering all three layers:
Python backend (25 tests):
- AppController: state machine, start/stop, callbacks, settings reload
- API server: REST endpoints, config CRUD, status, devices
- Config: dot-notation get/set, persistence, nested paths
- Main headless: ready event port format validation
Svelte frontend (14 tests via Vitest):
- Backend store: exported properties/methods, port derivation, URLs
- Config store: method names (fetchConfig not loadConfig), defaults
- Transcriptions store: add/clear/plaintext
- File extension regression: ensures $state runes only in .svelte.ts
Rust sidecar (24 tests via cargo test):
- Platform/arch detection, asset name construction
- Ready event deserialization (with extra fields tolerance)
- Path construction, version read/write, old version cleanup
- Zip extraction, SidecarManager lifecycle
CI workflow (.gitea/workflows/test.yml):
- Runs on push to main and PRs
- Three parallel jobs: Python, Frontend, Rust
Also fixes three bugs found during test planning:
- Settings: /api/check-updates -> GET /api/check-update
- Settings: /api/remote/login -> /api/login
- Settings: /api/remote/register -> /api/register
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three issues fixed:
1. Port mismatch: The sidecar reported the OBS port (8080) in the
ready event but the frontend needs the API port (8081). Now reports
the API port so WebSocket/REST connects to the right place.
2. Broadcast from wrong thread: Engine init fires state_changed from
a background thread, but _broadcast_control used get_event_loop()
which returns the wrong loop. Now captures the uvicorn event loop
at startup via on_event("startup").
3. Missed ready state: If the engine finishes before the WebSocket
client connects, the "ready" state_changed was never received.
Added status polling (GET /api/status) on WebSocket connect that
retries every 2s while appState is "initializing".
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Scaffold the cross-platform rewrite from PySide6/Qt to Tauri + Svelte,
following the same architecture as voice-to-notes. The Python backend
runs headless as a sidecar, with a FastAPI control API that the Svelte
frontend connects to via REST and WebSocket.
New files:
- backend/app_controller.py: Headless orchestration (extracted from MainWindow)
- backend/api_server.py: FastAPI control endpoints + /ws/control WebSocket
- backend/main_headless.py: Headless entry point for sidecar mode
- src-tauri/: Tauri v2 Rust shell with sidecar and dialog plugins
- src/: Svelte 5 frontend (App, Settings, Controls, TranscriptionDisplay)
- src/lib/stores/: Reactive stores for backend connection, config, transcriptions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>