Compare commits

...

130 Commits
v1.4.1 ... main

Author SHA1 Message Date
Gitea Actions
d263be2ac1 chore: bump sidecar version to 1.0.15 [skip ci] 2026-04-12 17:44:51 +00:00
Developer
1c8c6ad7e8 Fix display user not updating locally until app restart
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 3m12s
Engines now read user.name from the config object at transcription time
instead of caching it at init, so name changes take effect immediately.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:40:46 -07:00
Gitea Actions
023bc0218b chore: bump version to 2.0.20 [skip ci] 2026-04-12 02:12:38 +00:00
Gitea Actions
634506f902 chore: bump sidecar version to 1.0.14 [skip ci] 2026-04-12 01:59:52 +00:00
Developer
8c7f4e8008 Fix sidecar pipe crash on state changes and show logged-in state in settings
All checks were successful
Tests / Python Backend Tests (push) Successful in 14s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m0s
The state callback in main_headless.py wrote events to stdout
synchronously, so an EINVAL on the Tauri sidecar pipe (Windows) bubbled
up through _set_state and tore down engine init and reload_engine. That
turned PUT /api/config into a "Failed to fetch" for the user. The print
is now pipe-safe and api_server isolates the chained callback so a
future misbehaving listener cannot break the engine state machine.

Settings also now persists remote.email on login and shows a "Logged in
as <email>" indicator with a Log out button when an auth_token is
present, instead of leaving the email/password fields blank on reload.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 18:55:43 -07:00
Developer
b8d718caa6 Add login success/failure feedback message
All checks were successful
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m6s
Shows "Logged in successfully!" or "Login failed" after clicking
the managed mode Login button.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 20:30:11 -07:00
Gitea Actions
d92005bf95 chore: bump version to 2.0.19 [skip ci] 2026-04-11 03:09:49 +00:00
Gitea Actions
e90d154b83 chore: bump sidecar version to 1.0.13 [skip ci] 2026-04-11 03:09:45 +00:00
Developer
fa749b571d Fix cloud-only detection: check for cloud device presence, not exclusivity
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m3s
The compute-devices endpoint always includes "auto" alongside "cloud",
so checking every() never matched. Use some() to detect cloud sidecar.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 20:06:50 -07:00
Developer
ef188e1f67 Fix managed mode WebSocket URL when server_url uses https://
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m21s
The URL builder was prepending wss:// to the full https:// URL, producing
an invalid wss://https://... URL. Now properly converts https→wss and
http→ws before appending the /ws/transcribe path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 20:03:58 -07:00
Gitea Actions
f7b9695418 chore: bump version to 2.0.18 [skip ci] 2026-04-11 02:49:54 +00:00
Gitea Actions
b4c0589b04 chore: bump sidecar version to 1.0.12 [skip ci] 2026-04-11 02:44:12 +00:00
Developer
66c441b17f Revert macOS workflow to pre-signing state
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 1m59s
Remove all signing env vars and setup steps. The local act runner's
keychain interferes with Tauri's auto-detection. Will re-add signing
once Apple Developer verification is complete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:41:46 -07:00
Developer
94bc704950 Fix settings save blocking event loop and overwriting config keys
Some checks failed
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Has been cancelled
- Run apply_settings in thread pool executor to prevent engine reload
  from blocking the HTTP response (caused "TypeError: Failed to fetch")
- Flatten nested config objects into dot-notation keys before saving
  so partial updates don't wipe out unincluded keys like auth_token

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:40:51 -07:00
Developer
7900d2d9f2 Detect cloud-only sidecar from compute devices (no sidecar rebuild needed)
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 1m57s
Use the existing /api/compute-devices response to determine if only cloud
is available, instead of relying on the backend's is_cloud_only status field.
Hides Local (Whisper) option when the sidecar only supports cloud.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:36:13 -07:00
Developer
e0396df7b0 Use ad-hoc signing when no Apple certificate is configured
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m4s
Prevents Tauri from auto-detecting local keychain certificates on the
build machine, which causes SecKeychainItemImport failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:30:44 -07:00
Gitea Actions
ad89735822 chore: bump version to 2.0.17 [skip ci] 2026-04-11 02:27:20 +00:00
Developer
f0b5890eba Hide Local (Whisper) mode option when using cloud-only sidecar
All checks were successful
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m3s
- Expose is_cloud_only flag in /api/status response
- Add isCloudOnly to backend store state
- Conditionally hide Local (Whisper) radio button in Settings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:24:01 -07:00
Developer
8df1ab9817 Remove macOS signing config until Apple Developer verification completes
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m0s
hardenedRuntime triggers code signing which fails without a valid certificate.
Entitlements.plist and Info.plist remain for when signing is re-enabled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:21:20 -07:00
Gitea Actions
34a165fc05 chore: bump version to 2.0.16 [skip ci] 2026-04-11 02:15:32 +00:00
Developer
8f4e5cc099 Default managed mode to transcribe.shadowdao.com and simplify login UI
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m4s
- Set default server_url to https://transcribe.shadowdao.com
- Remove Server URL field from managed mode settings (users don't need to configure it)
- Replace Register button with link to website signup page
- Add fallback to default URL in login handler for existing users with empty config

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:12:50 -07:00
Developer
16f9ac2ab8 Add code signing config for Windows (Azure Artifact Signing) and macOS (Apple notarization)
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 1m58s
CI workflows now support code signing when secrets are configured:
- macOS: Apple Developer certificate + App Store Connect API key for notarization
- Windows: Azure Artifact Signing via signtool + dlib
- Both are no-ops when secrets aren't set (backwards-compatible)
- Add Entitlements.plist (mic, network) and Info.plist (NSMicrophoneUsageDescription)
- Add SIGNING.md with full setup guide for both platforms

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 18:02:46 -07:00
Developer
cd325102e2 Update docs for cloud-first UX and shared captions
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m13s
- README: document cloud-first quick start, shared captions workflow
  (create room, join via share code, share existing room), and
  self-hosting option
- README: update default remote.mode from local to byok in config table
- CLAUDE.md: reflect cloud-first default, settings gating, and shared
  captions features

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 16:10:46 -07:00
Gitea Actions
d220158dd7 chore: bump version to 2.0.15 [skip ci] 2026-04-10 19:38:00 +00:00
Developer
8670e19acc Add "Share Current Room" button to copy existing room config as share code
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 1m58s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 12:26:29 -07:00
Gitea Actions
812cc4ac5e chore: bump version to 2.0.14 [skip ci] 2026-04-10 19:15:02 +00:00
Developer
4aa19eee86 Fix test: align remote.mode in no-reload settings test
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 1m59s
The default remote.mode changed from 'local' to 'byok', causing
the apply_settings test to detect a mode mismatch and trigger an
unexpected engine reload. Pin remote.mode to 'local' in the test
to match the controller's assumed current mode.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 12:01:11 -07:00
Developer
b8dfe0f1ba Cloud-first UX: default to Deepgram, gate start button, add room sharing
Some checks failed
Tests / Python Backend Tests (push) Failing after 6s
Tests / Frontend Tests (push) Successful in 9s
Tests / Rust Sidecar Tests (push) Successful in 2m1s
- Change default transcription mode from local to byok (cloud/Deepgram)
- Move Transcription Mode selector to top of settings for visibility
- Hide local-only settings (model, VAD, timing) when cloud mode selected
- Disable Start button until API key (byok) or login (managed) is configured
- Add room creation and share code flow to Shared Captions section
- Add POST /api/create-room endpoint to Node.js sync server
- Update default sync URL placeholder to caption.shadowdao.com

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 11:58:49 -07:00
Gitea Actions
5837b97a20 chore: bump sidecar version to 1.0.11 [skip ci] 2026-04-08 21:15:05 +00:00
Gitea Actions
ab09a3e9da chore: bump version to 2.0.13 [skip ci] 2026-04-08 21:09:40 +00:00
Developer
5343a28a08 Bundle sounddevice PortAudio library in sidecar builds
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m4s
On macOS, sounddevice ships its own PortAudio dylib in the
_sounddevice_data directory. PyInstaller wasn't collecting it,
causing "Error querying device -1" when the sidecar tried to
open an audio stream.

Added data collection for _sounddevice_data in both cloud and
headless PyInstaller specs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 14:07:04 -07:00
Developer
f0bf026133 Handle ExitRequested to stop sidecar on macOS Cmd+Q
All checks were successful
Tests / Python Backend Tests (push) Successful in 4s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 1m56s
On macOS, Cmd+Q triggers ExitRequested before Exit. If the app is
force-quit or closed via Cmd+Q, the Exit event may not fire,
leaving the sidecar process orphaned with ports 8080/8081 in use.

Now handles both ExitRequested and Exit to ensure the sidecar is
always stopped when the app closes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 14:02:25 -07:00
Developer
37a029d1c6 Show app version from Tauri instead of sidecar
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 1m59s
The version label was reading from backendStore.version which comes
from the sidecar's version.py (hardcoded at build time). Now uses
Tauri's getVersion() API which reads from tauri.conf.json -- the
actual app version that gets bumped by the release workflow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 13:45:53 -07:00
Gitea Actions
5ec030387f chore: bump sidecar version to 1.0.10 [skip ci] 2026-04-08 20:27:00 +00:00
Gitea Actions
4d9bdba903 chore: bump version to 2.0.12 [skip ci] 2026-04-08 20:22:08 +00:00
Gitea Actions
a7a3bcd102 chore: bump version to 2.0.11 [skip ci] 2026-04-08 20:10:12 +00:00
Developer
115d93482a Always poll status after start/stop, even on API error
All checks were successful
Tests / Python Backend Tests (push) Successful in 4s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 1m55s
When apiPost throws (e.g. 400 "Already transcribing"), pollStatus
never ran because it was in the try block. The button stayed stuck
on "Start" even though transcription was running.

Moved pollStatus to the finally block so it always syncs the UI
with actual backend state. Also suppresses the error message for
400 responses since they just mean the state is already correct.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 13:03:34 -07:00
Developer
fb672cbaef Update Cargo.lock
Some checks failed
Tests / Python Backend Tests (push) Successful in 4s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 13:02:03 -07:00
Gitea Actions
d8c79be094 chore: bump sidecar version to 1.0.9 [skip ci] 2026-04-08 19:52:47 +00:00
Developer
2811f5bb9c Fix release workflow false failure on successful dispatch
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 1m59s
The line [ "\$HTTP_CODE" != "204" ] && cat ... returns exit code 1
when the condition is false (all dispatches succeeded). Since it
was the last command in the loop, the step reported failure.
Changed to if/then/fi which doesn't leak the test exit code.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 12:49:53 -07:00
Gitea Actions
30127d68e7 chore: bump version to 2.0.10 [skip ci] 2026-04-08 19:46:58 +00:00
Developer
ae61c8c75a Fix Start button not updating: unblock the event loop
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m4s
start_transcription() blocks up to 15s waiting for the Deepgram
WebSocket to connect. Running it synchronously in the async endpoint
blocked the entire uvicorn event loop, preventing:
- pollStatus from completing (frozen HTTP request)
- WebSocket broadcasts from being sent
- Any other API requests from being handled

Fix: run start/stop/reload in thread pool via run_in_executor so
the event loop stays responsive during long-running operations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 12:43:49 -07:00
Developer
2654200fe9 Switch sidecar-release to GITHUB_ENV to match release.yml
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m4s
Same fix as release.yml -- replaced step outputs with GITHUB_ENV
variables to avoid the act runner format bug. Also removed the
has_changes conditional since sidecar-release is now manual-only
(workflow_dispatch always means we want to build).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 12:33:28 -07:00
Developer
cae0c0b265 Fix false job failure: use GITHUB_ENV instead of step outputs
Some checks failed
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Has been cancelled
The act runner has a Go format bug that evaluates step outputs at
cleanup time and crashes with %!t(string=...), marking the job as
failed even though all steps succeeded.

Replaced steps.bump.outputs.* with GITHUB_ENV variables which
persist across steps without triggering the runner's output
evaluation bug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 12:32:20 -07:00
Gitea Actions
91b27ac22e chore: bump sidecar version to 1.0.8 [skip ci] 2026-04-08 19:27:02 +00:00
Gitea Actions
1210acd07f chore: bump version to 2.0.9 [skip ci] 2026-04-08 19:23:05 +00:00
Developer
352615c15c Fix Deepgram broken pipe: wait for WebSocket before starting audio
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m0s
Audio capture started immediately after spawning the WebSocket thread,
but the WebSocket hadn't connected yet. Audio chunks sent to the
unconnected WebSocket caused a broken pipe error.

Fix: added a threading.Event that start_recording() waits on (up to
15s) before opening the audio stream. The event is set in _ws_lifecycle
after the WebSocket connects and handshake completes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 12:18:47 -07:00
Developer
a3bcc5bee5 Show transcription start errors in UI, improve error logging
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m5s
Start Transcription button now shows the error message when it fails
instead of silently reverting. Common causes:
- Missing PortAudio library on Linux
- Audio device not accessible
- Deepgram connection failure

Also added error details to backend console output and captured
the last error from the Deepgram engine for better diagnostics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 12:15:43 -07:00
Developer
b91fe876f9 Stop sidecar process when the app exits
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m6s
The sidecar process was orphaned when the Tauri app closed, leaving
ports 8080/8081 in use. On next launch the new sidecar couldn't bind
those ports and failed to start.

Added RunEvent::Exit handler that stops the sidecar before the app
process terminates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:54:53 -07:00
Developer
7e04d6b4af Fix Linux CPU sidecar bundling CUDA, add cleanup workflow
All checks were successful
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m9s
Linux CPU sidecar: PyPI's default torch on Linux includes CUDA
(~800MB). UV_NO_SOURCES only bypasses our custom CUDA index but
still gets CUDA-enabled torch from PyPI. Now explicitly installs
CPU-only torch from pytorch.org/whl/cpu after sync. Same fix
applied to Windows.

New cleanup-releases.yml workflow (manual trigger):
- Configurable: keep N app releases, keep N sidecar releases
- Dry run mode (default) shows what would be deleted without deleting
- Protects v1.4.0 (last pre-Tauri release)
- Shows release sizes in MB

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:21:16 -07:00
Developer
15c4e262b9 Document macOS quarantine workaround in README
All checks were successful
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m1s
macOS Gatekeeper blocks unsigned apps with "damaged" error.
Added xattr -cr command to Troubleshooting section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:02:55 -07:00
Gitea Actions
2246723220 chore: bump sidecar version to 1.0.7 [skip ci] 2026-04-08 17:05:36 +00:00
Gitea Actions
1c586738f3 chore: bump version to 2.0.8 [skip ci] 2026-04-08 16:58:00 +00:00
Developer
fb02a24334 Remove CUDA sidecar builds, keep CPU + Cloud only
All checks were successful
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m3s
CUDA sidecars are ~2GB and too slow to upload from the Windows runner.
Cloud (Deepgram) provides faster transcription anyway. Removed:

- CUDA build steps from Windows and Linux sidecar workflows
- CUDA option from the SidecarSetup download screen

Remaining sidecar variants:
- Cloud (Deepgram): ~50 MB - recommended for most users
- Local CPU: ~500 MB - for offline/privacy use

CUDA can be revisited once the managed Deepgram service is ready.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 09:49:36 -07:00
Developer
ce64cacc5e Use max compression for sidecar zips to reduce upload size
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m1s
zip -9 on Linux, 7z -mx=9 on Windows. Compression takes longer but
produces smaller files which upload faster over the network.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 09:42:26 -07:00
Gitea Actions
14a7ca3b30 chore: bump sidecar version to 1.0.6 [skip ci] 2026-04-08 16:26:36 +00:00
Gitea Actions
5b7387f9c6 chore: bump version to 2.0.7 [skip ci] 2026-04-08 16:21:51 +00:00
Developer
293362baa1 Cloud sidecar auto-detects variant and guides user to configure
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m7s
On first launch, the cloud sidecar now:
1. Detects it's the cloud variant (DeviceManager import fails)
2. Auto-switches config from "local" to "byok" mode
3. Shows "Setup needed: Open Settings > Remote Transcription >
   enter your Deepgram API key" as a friendly status message
4. Stays in READY state so the UI is fully accessible

The user can then open Settings, enter their Deepgram API key,
save, and start transcribing without needing to know about modes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 09:17:06 -07:00
Developer
41f50dedec Fix cloud sidecar crash on first launch
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 3m11s
The cloud sidecar excludes the local Whisper engine module, but on
first launch the config defaults to remote.mode="local" which tries
to import it. Now catches the ImportError gracefully and shows an
error message telling the user to switch to Cloud (Deepgram) mode
in Settings. The API server still starts so Settings is accessible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 09:12:17 -07:00
Developer
d8b7811153 Fix NaN% in sidecar download progress
All checks were successful
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m4s
The Rust backend emits {downloaded, total, phase, message} but the
Svelte component was reading event.payload.progress which doesn't
exist, resulting in NaN. Now calculates percentage from downloaded/total.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 09:06:39 -07:00
Developer
ec8922672c Fix Stop Transcription button not updating after click
All checks were successful
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m9s
After calling POST /api/stop, the button stayed on "Stop Transcription"
because the state update depended on the WebSocket broadcast which can
be delayed or missed (event loop threading issue).

Fix: poll GET /api/status immediately after start/stop API calls to
update the UI state directly, rather than waiting for the WebSocket.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 07:26:06 -07:00
Gitea Actions
375669f657 chore: bump sidecar version to 1.0.5 [skip ci] 2026-04-08 00:43:01 +00:00
Gitea Actions
c8b11fb0ad chore: bump version to 2.0.6 [skip ci] 2026-04-08 00:37:28 +00:00
Developer
273a926f03 Fix YAML parse error: use block scalar for echo with colons
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m7s
Gitea's YAML parser treats `echo "text: value"` as a mapping when
on a single `run:` line. Using block scalar (`run: |`) avoids this.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:21:42 -07:00
Gitea Actions
5bbbc38875 chore: bump version to 2.0.5 [skip ci] 2026-04-08 00:19:25 +00:00
Developer
d50be6654d Fix dispatch failures and disable automatic cleanup
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m8s
1. Quote RELEASE_TAG env vars in all workflow files. Unquoted
   ${{ inputs.tag }} caused YAML parse errors on some Gitea runners,
   making dispatch return HTTP 500 for Linux/macOS.

2. Disable automatic release cleanup in both coordinators. The cleanup
   races with async builds -- it deletes the release before builds
   finish uploading their assets. Clean up old releases manually
   from the Gitea UI instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:16:36 -07:00
Developer
68abf49018 Log dispatch error responses for debugging
Some checks failed
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Has been cancelled
Show the Gitea API response body when dispatch returns non-204,
to help diagnose why Linux/macOS dispatches return HTTP 500.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:14:23 -07:00
Gitea Actions
8cc2a3ec7a chore: bump version to 2.0.4 [skip ci] 2026-04-08 00:09:39 +00:00
Developer
8aa9dfc644 Update Cargo.lock and generated Tauri schemas
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m10s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:03:40 -07:00
Developer
3f16aa838d Add ability to change transcription engine from Settings
Some checks failed
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Has been cancelled
New features:
- Settings > Transcription Engine > "Change Transcription Engine"
  button stops the sidecar, deletes downloaded files, and reloads
  the app to show the engine selection screen
- Improved SidecarSetup descriptions with detailed explanations
  of each variant and "Recommended" tag on Cloud (Deepgram)
- Cloud option listed first as the recommended choice
- New reset_sidecar Tauri command that cleans up sidecar files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:02:31 -07:00
Developer
3d3d7ec3c5 Add cloud-only sidecar variant (~50MB vs 500MB-2GB)
All checks were successful
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 1m59s
Lightweight Deepgram-only sidecar that excludes PyTorch, faster-whisper,
RealtimeSTT, and CUDA. Only includes audio capture + WebSocket streaming
to Deepgram. Requires a Deepgram API key (BYOK or managed mode).

Changes:
- client/models.py: Extracted TranscriptionResult into standalone module
  so deepgram_transcription.py doesn't transitively import torch
- backend/app_controller.py: Made RealtimeTranscriptionEngine and
  DeviceManager imports lazy (only loaded when remote.mode == "local")
- local-transcription-cloud.spec: PyInstaller spec excluding all ML deps
- SidecarSetup.svelte: Added "Cloud Only (Deepgram)" variant option
- build-sidecar-cloud.yml: CI workflow building cloud sidecar for all 3 OS
- sidecar-release.yml: Dispatches cloud build alongside CPU/CUDA builds

Sidecar download options are now:
- Standard (CPU): ~500 MB - local Whisper on any computer
- GPU Accelerated (CUDA): ~2 GB - local Whisper with NVIDIA GPU
- Cloud Only (Deepgram): ~50 MB - requires API key, no local models

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:57:43 -07:00
Developer
bb039399fc Add font source/family settings matching v1.4.0 feature set
All checks were successful
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m11s
Restored the font configuration that was missing from the Tauri
rewrite. Settings now include:

- Font Source: System Font, Web-Safe, Google Font
- System Font: text input for any installed font family
- Web-Safe: dropdown with 13 universal fonts (Arial, Courier New, etc.)
- Google Font: dropdown with 35 fonts organized by category
  (Sans Serif, Serif, Monospace, Display, Handwriting)
- Font Size: range slider (8-32px)

All font settings are saved to config and applied to the OBS web
display and server sync.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:40:52 -07:00
Developer
9dcb14e92c Fix Deepgram streaming latency
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 9s
Tests / Rust Sidecar Tests (push) Successful in 2m5s
Three changes to reduce transcription delay:

1. Send loop: queue.get() was blocking the asyncio event loop, stalling
   the receive loop and delaying transcription results. Now uses
   run_in_executor() to avoid blocking the event loop.

2. Block size: reduced from 4096 (~256ms) to 1024 (~64ms) for more
   frequent, smaller audio chunks. Deepgram handles streaming better
   with smaller packets.

3. Added punctuate=true and smart_format=true to Deepgram BYOK
   params for cleaner transcription output.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:31:50 -07:00
Developer
8db9b8298b Fix dev mode sidecar launch and engine reload on mode change
All checks were successful
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 1m57s
1. Dev mode: use `uv run python` instead of bare `python` to ensure
   the project venv is used. Also use CARGO_MANIFEST_DIR to find the
   project root reliably.

2. Engine reload: changing remote.mode (local/managed/byok) now
   triggers a full engine reload. Previously only model and device
   changes triggered reload, so switching to Deepgram had no effect
   until the app was restarted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:25:07 -07:00
Developer
411779f578 Make release and sidecar-release manual-only while testing
All checks were successful
Tests / Python Backend Tests (push) Successful in 4s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 1m59s
Removed push triggers from both coordinator workflows. They now
only run via workflow_dispatch (manual "Run workflow" button).
Re-enable push triggers once the build pipeline is stable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:04:06 -07:00
Developer
bc6055a707 Add workflow_dispatch trigger to release.yml
Some checks failed
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Has been cancelled
Allows manually triggering app releases from the Gitea Actions UI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:03:18 -07:00
Gitea Actions
e42a922507 chore: bump sidecar version to 1.0.4 [skip ci] 2026-04-07 20:01:20 +00:00
Developer
8fc2d11c5f Fix builds failing to checkout: stop deleting tags, fix tag passing
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m3s
Two issues causing all builds to fail:

1. Cleanup steps deleted git tags along with releases. Since builds
   are dispatched asynchronously, they tried to checkout tags that
   had already been deleted. Now cleanup only deletes releases (which
   frees storage by removing assets) but preserves git tags.

2. Linux/macOS build workflows used $GITHUB_OUTPUT step outputs for
   the tag, which is unreliable on Gitea runners. Switched to the
   same job-level env var pattern (RELEASE_TAG) that works on Windows.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:27:13 -07:00
Gitea Actions
11832e911b chore: bump version to 2.0.3 [skip ci] 2026-04-07 19:21:16 +00:00
Developer
18e6b974c0 Fix sidecar stdout buffering: set PYTHONUNBUFFERED=1
All checks were successful
Release / Run Tests (push) Successful in 24s
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 3m17s
Release / Bump version and tag (push) Successful in 4s
PyInstaller frozen executables buffer stdout when piped to a
subprocess (no TTY). Even with flush=True in Python, the OS-level
pipe buffer can delay output. This prevented the ready event from
reaching the Tauri app, causing the "Starting sidecar..." hang.

Fix: set PYTHONUNBUFFERED=1 env var on both prod and dev sidecar
commands, plus -u flag for dev mode Python.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:17:18 -07:00
Gitea Actions
08e464daaf chore: bump version to 2.0.2 [skip ci] 2026-04-07 19:15:05 +00:00
Developer
5d22adcaa4 Fix app hanging on sidecar startup
All checks were successful
Release / Run Tests (push) Successful in 11s
Tests / Python Backend Tests (push) Successful in 4s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 3m13s
Release / Bump version and tag (push) Successful in 14s
Two issues caused the app to freeze on "Starting sidecar...":

1. wait_for_ready() used a blocking BufReader::lines() iterator
   with a timeout check between lines. If the sidecar produced no
   stdout output (crashed, missing binary, or slow model loading),
   the read blocked forever. Now uses a background thread with
   mpsc::recv_timeout() for a real 120s deadline.

2. start_sidecar was a synchronous Tauri command that blocked the
   main thread during the entire sidecar startup (up to 120s).
   Now async via tokio::spawn_blocking, keeping the UI responsive.

Also logs all sidecar stdout lines to stderr with [sidecar-stdout]
prefix for debugging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:11:13 -07:00
Gitea Actions
36b4f7dad5 chore: bump version to 2.0.1 [skip ci] 2026-04-07 15:58:01 +00:00
Developer
1ecb23b83f Bump to v2.0.0 — cross-platform Tauri rewrite
All checks were successful
Release / Run Tests (push) Successful in 9s
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 6s
Tests / Rust Sidecar Tests (push) Successful in 2m7s
Release / Bump version and tag (push) Successful in 4s
Major version bump reflecting the architecture change from PySide6/Qt
to Tauri v2 + Svelte 5 with cross-platform support for Windows,
macOS, and Linux.

Key changes since v1.4.0:
- Tauri v2 native desktop shell replacing PySide6/Qt
- Svelte 5 reactive frontend
- Headless Python backend as a downloadable sidecar
- Deepgram cloud transcription (managed + BYOK)
- Gitea CI/CD with per-OS builds and automated releases
- Sidecar auto-update checking on startup
- 63-test suite (Python + Svelte + Rust)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:55:25 -07:00
Developer
4b88871a9b Add sidecar update check on startup
Some checks failed
Release / Run Tests (push) Successful in 11s
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Release / Bump version and tag (push) Has been cancelled
Tests / Rust Sidecar Tests (push) Has been cancelled
On launch, after confirming the sidecar is installed, the app now
checks for a newer sidecar version via the Gitea API. If an update
is available, shows a prompt with "Update Now" or "Skip":

- Update Now: shows the SidecarSetup download screen
- Skip: launches the existing sidecar version

The update check is non-blocking -- if it fails (no internet, API
error), the app silently proceeds with the current version.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:54:18 -07:00
Gitea Actions
0ae48a67d5 chore: bump version to 1.4.21 [skip ci] 2026-04-07 15:51:32 +00:00
Developer
924cae6c75 Fix double-prefix sidecar dir and stale PID lock on startup
All checks were successful
Release / Run Tests (push) Successful in 29s
Tests / Python Backend Tests (push) Successful in 8s
Tests / Frontend Tests (push) Successful in 10s
Tests / Rust Sidecar Tests (push) Successful in 4m7s
Release / Bump version and tag (push) Successful in 4s
Two bugs preventing sidecar from starting:

1. Directory was "sidecar-sidecar-v1.0.3" (double prefix) because
   sidecar_dir_for_version() prepended "sidecar-" to a version that
   already contained it. Now uses the tag directly as the dir name.

2. After a crash, the Python InstanceLock PID file at
   ~/.local-transcription/app.lock remained, blocking the next launch
   with "Another instance is already running". Now clears the stale
   lock file before spawning the sidecar.

Also fixed cleanup_old_versions() and tests to match the corrected
directory naming.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:46:31 -07:00
Gitea Actions
5139936e18 chore: bump version to 1.4.20 [skip ci] 2026-04-07 15:44:54 +00:00
Developer
47724f1ac0 Capture sidecar stderr to sidecar.log for crash debugging
All checks were successful
Release / Run Tests (push) Successful in 15s
Tests / Python Backend Tests (push) Successful in 7s
Tests / Frontend Tests (push) Successful in 10s
Tests / Rust Sidecar Tests (push) Successful in 2m30s
Release / Bump version and tag (push) Successful in 12s
When the sidecar process exits before sending the ready event, the
error message now includes the last 10 lines of stderr. Stderr is
captured in a background thread and written to sidecar.log in the
app data directory.

This helps diagnose why the PyInstaller sidecar fails to start
(missing DLLs, import errors, permission issues, etc.).

Log location: %APPDATA%\net.anhonesthost.local-transcription\sidecar.log

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:41:40 -07:00
Developer
3b204be37e Add automatic cleanup of old releases to save storage
All checks were successful
Tests / Python Backend Tests (push) Successful in 13s
Tests / Frontend Tests (push) Successful in 23s
Tests / Rust Sidecar Tests (push) Successful in 2m8s
- App releases: keeps latest 3 + v1.4.0 (last pre-Tauri version),
  deletes older releases and their tags
- Sidecar releases: keeps latest 2, deletes older releases and tags
  (sidecars are large, ~500MB-2GB each)

Cleanup runs after creating new releases, before triggering builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:10:55 -07:00
Developer
4c02a48135 Fix CI: use uv for test venv, gate builds on tests, reduce build triggers
All checks were successful
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m7s
- test.yml: use uv venv instead of pip --break-system-packages
- release.yml: inline test job that must pass before version bump;
  only triggers on source file changes (src/, src-tauri/, package.json)
- sidecar-release.yml: inline Python test job that must pass before
  sidecar version bump
- Both coordinators use `needs: test` so builds never start if tests fail

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:05:49 -07:00
Gitea Actions
997e97c19a chore: bump version to 1.4.19 [skip ci] 2026-04-07 15:00:21 +00:00
Developer
6ca8fc41b2 Fix sidecar release conflict: stop modifying version.py
Some checks failed
Release / Bump version and tag (push) Successful in 4s
Tests / Python Backend Tests (push) Failing after 3s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m21s
Both release.yml and sidecar-release.yml were updating version.py,
causing merge conflicts when both ran on the same push. Now:
- release.yml (app) owns: package.json, tauri.conf.json, Cargo.toml, version.py
- sidecar-release.yml owns: pyproject.toml only

Also deleted the stale sidecar-v1.0.4 tag that failed to push.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:00:09 -07:00
Gitea Actions
d9d90563cc chore: bump version to 1.4.18 [skip ci] 2026-04-07 14:49:51 +00:00
Developer
5a674ed199 Add test suite (63 tests) and CI workflow, fix Settings API bugs
Some checks failed
Release / Bump version and tag (push) Successful in 4s
Sidecar Release / Bump sidecar version and tag (push) Failing after 3s
Tests / Python Backend Tests (push) Failing after 3s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 3m10s
Test suite covering all three layers:

Python backend (25 tests):
- AppController: state machine, start/stop, callbacks, settings reload
- API server: REST endpoints, config CRUD, status, devices
- Config: dot-notation get/set, persistence, nested paths
- Main headless: ready event port format validation

Svelte frontend (14 tests via Vitest):
- Backend store: exported properties/methods, port derivation, URLs
- Config store: method names (fetchConfig not loadConfig), defaults
- Transcriptions store: add/clear/plaintext
- File extension regression: ensures $state runes only in .svelte.ts

Rust sidecar (24 tests via cargo test):
- Platform/arch detection, asset name construction
- Ready event deserialization (with extra fields tolerance)
- Path construction, version read/write, old version cleanup
- Zip extraction, SidecarManager lifecycle

CI workflow (.gitea/workflows/test.yml):
- Runs on push to main and PRs
- Three parallel jobs: Python, Frontend, Rust

Also fixes three bugs found during test planning:
- Settings: /api/check-updates -> GET /api/check-update
- Settings: /api/remote/login -> /api/login
- Settings: /api/remote/register -> /api/register

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:48:36 -07:00
Gitea Actions
9d78fce3f0 chore: bump version to 1.4.17 [skip ci] 2026-04-07 14:35:53 +00:00
Developer
a8de39de84 Fix OBS display and Start button not working
All checks were successful
Release / Bump version and tag (push) Successful in 12s
Sidecar Release / Bump sidecar version and tag (push) Successful in 6s
Three issues fixed:

1. Port mismatch: The sidecar reported the OBS port (8080) in the
   ready event but the frontend needs the API port (8081). Now reports
   the API port so WebSocket/REST connects to the right place.

2. Broadcast from wrong thread: Engine init fires state_changed from
   a background thread, but _broadcast_control used get_event_loop()
   which returns the wrong loop. Now captures the uvicorn event loop
   at startup via on_event("startup").

3. Missed ready state: If the engine finishes before the WebSocket
   client connects, the "ready" state_changed was never received.
   Added status polling (GET /api/status) on WebSocket connect that
   retries every 2s while appState is "initializing".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:35:41 -07:00
Gitea Actions
bc82584dff chore: bump version to 1.4.16 [skip ci] 2026-04-07 13:47:41 +00:00
Developer
4d0b4ee1c5 Add save confirmation and fix saveConfig -> updateConfig
All checks were successful
Release / Bump version and tag (push) Successful in 16s
- Fixed method call from saveConfig (doesn't exist) to updateConfig
- Save button shows "Saving..." while in progress, disabled during save
- Green "Settings saved!" message appears on success before closing
- Red error message shown on failure
- Cancel button disabled during save

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:41:01 -07:00
Gitea Actions
c73e9de0ac chore: bump version to 1.4.15 [skip ci] 2026-04-07 13:39:50 +00:00
Developer
288c6ad6a3 Fix BYOK settings: show Deepgram API key instead of server URL
All checks were successful
Release / Bump version and tag (push) Successful in 20s
BYOK mode connects directly to Deepgram (wss://api.deepgram.com),
so the server URL field was incorrect. Now:
- BYOK shows a Deepgram API Key field with link to console.deepgram.com
- Managed shows the Server URL field (for the transcription proxy)
- Local shows neither
- API key is saved as remote.byok_api_key in config

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 06:39:31 -07:00
Gitea Actions
af8046f9b1 chore: bump version to 1.4.14 [skip ci] 2026-04-07 02:33:33 +00:00
Developer
6003885519 Fix configStore.loadConfig -> fetchConfig method name
All checks were successful
Release / Bump version and tag (push) Successful in 10s
The config store exports fetchConfig() but App.svelte was calling
the nonexistent loadConfig(), causing a TypeError that prevented
the sidecar from launching.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:33:24 -07:00
Gitea Actions
8829846b53 chore: bump version to 1.4.13 [skip ci] 2026-04-07 02:20:51 +00:00
Developer
cf449d9338 Add Tauri ACL capabilities for event listener
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Tauri v2 requires explicit permission grants. The SidecarSetup
component uses listen() from @tauri-apps/api/event to receive
download progress, which requires core:event:allow-listen.

Added default capability with core, event, shell, dialog, and
process permissions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:20:44 -07:00
Gitea Actions
5a6910834c chore: bump version to 1.4.12 [skip ci] 2026-04-07 02:16:01 +00:00
Developer
a6c7eb5d5e Fix blank screen: rename stores to .svelte.ts for rune support
All checks were successful
Release / Bump version and tag (push) Successful in 7s
Svelte 5 runes ($state, $derived, $effect) are only compiled in
.svelte and .svelte.ts files. The stores used runes in plain .ts
files, which meant $state was treated as an undefined function at
runtime, crashing the JS before anything rendered.

- Renamed backend.ts -> backend.svelte.ts
- Renamed config.ts -> config.svelte.ts
- Renamed transcriptions.ts -> transcriptions.svelte.ts
- Added .svelte.ts to Vite resolve extensions
- Added missing obsUrl/syncUrl getters to backend store

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:15:52 -07:00
Gitea Actions
135d5d534b chore: bump version to 1.4.11 [skip ci] 2026-04-07 02:06:55 +00:00
Developer
76f34fe17d Fix Windows tag passing: use env var instead of step outputs
All checks were successful
Release / Bump version and tag (push) Successful in 4s
Step outputs via GITHUB_OUTPUT are unreliable with act runner on
Windows (BOM encoding issues). Replaced with job-level env var
RELEASE_TAG set directly from inputs.tag, and checkout ref also
uses inputs.tag directly. Eliminated the Determine tag step
entirely — no intermediate output needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:06:50 -07:00
Gitea Actions
68ad31b6a7 chore: bump version to 1.4.10 [skip ci] 2026-04-07 02:01:11 +00:00
Developer
fcbe405e23 Fix Windows tag step: use PowerShell instead of bash
All checks were successful
Release / Bump version and tag (push) Successful in 4s
The act runner on Windows doesn't have bash available. Switched back
to PowerShell with the inputs.tag fallback chain. Uses Out-File for
GITHUB_OUTPUT instead of echo redirection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:00:34 -07:00
Gitea Actions
4adfd2adc6 chore: bump version to 1.4.9 [skip ci] 2026-04-07 01:59:23 +00:00
Developer
f3843d59f1 Fix empty tag in dispatched Windows builds
All checks were successful
Release / Bump version and tag (push) Successful in 7s
The workflow_dispatch input was accessed as github.event.inputs.tag
which can be empty depending on the Gitea runner. Now tries both
inputs.tag (modern syntax) and github.event.inputs.tag as fallback,
with a final fallback to the latest matching git tag.

Also switched Windows Determine-tag steps from PowerShell to bash
(via Git Bash) for consistency with the other platforms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 18:59:17 -07:00
Gitea Actions
ad68251e04 chore: bump version to 1.4.8 [skip ci] 2026-04-07 00:50:20 +00:00
Developer
9468d01a88 Coordinators now dispatch per-OS builds via API
All checks were successful
Release / Bump version and tag (push) Successful in 7s
Previously per-OS build workflows triggered on tag push events, but
Gitea doesn't fire events for tags pushed by other workflows. Now:

- release.yml dispatches build-app-{linux,windows,macos}.yml via
  the Gitea API after creating the tag and release
- sidecar-release.yml dispatches build-sidecar-{linux,windows,macos}.yml

Per-OS workflows changed from push+dispatch triggers to dispatch-only
with tag as a required input. To re-run a failed build for the same
version, just dispatch the specific OS workflow with the same tag --
upload logic replaces existing assets automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 17:50:13 -07:00
Gitea Actions
a3151ad55e chore: bump version to 1.4.7 [skip ci] 2026-04-07 00:40:25 +00:00
Developer
5bff40e9b4 Add debug logging to file and fix blank startup screen
All checks were successful
Release / Bump version and tag (push) Successful in 5s
- Added write_log Tauri command that writes to frontend.log in app data dir
- App.svelte now logs each startup step (Tauri import, sidecar check, launch)
- Startup overlays use inline styles as fallback so they're visible even if
  CSS variables fail to load
- Debug status shown on the checking/connecting screens
- Rust side logs startup info to app.log (resource dir, data dir)

Log files location: %APPDATA%/net.anhonesthost.local-transcription/ (Windows)
or ~/Library/Application Support/net.anhonesthost.local-transcription/ (macOS)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 17:40:18 -07:00
Gitea Actions
0ccb02ba27 chore: bump version to 1.4.6 [skip ci] 2026-04-07 00:39:08 +00:00
Developer
aa4033b412 Split CI workflows into per-OS files for independent re-runs
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Refactored from 2 monolithic workflows into 8 targeted ones:

Coordinators (version bump + tag + release creation):
- release.yml: bumps app version, tags v*, creates Gitea release
- sidecar-release.yml: bumps sidecar version, tags sidecar-v*

Per-OS app builds (triggered by v* tags or workflow_dispatch):
- build-app-linux.yml: .deb, .rpm, .AppImage
- build-app-windows.yml: .msi, -setup.exe
- build-app-macos.yml: .dmg

Per-OS sidecar builds (triggered by sidecar-v* tags or workflow_dispatch):
- build-sidecar-linux.yml: CUDA + CPU variants
- build-sidecar-windows.yml: CUDA + CPU variants
- build-sidecar-macos.yml: CPU only

Each build workflow can be re-triggered independently without
re-running the version bump or rebuilding other platforms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 17:35:25 -07:00
Gitea Actions
b4b9435317 chore: bump sidecar version to 1.0.3 [skip ci] 2026-04-07 00:27:28 +00:00
Gitea Actions
ee1d4f8643 chore: bump version to 1.4.5 [skip ci] 2026-04-07 00:23:37 +00:00
Developer
4a186d1de6 Fix CPU sidecar builds bundling CUDA torch instead of CPU
All checks were successful
Release / Bump version and tag (push) Successful in 7s
Release / Build App (macOS) (push) Successful in 1m8s
Release / Build App (Windows) (push) Successful in 2m8s
Release / Build App (Linux) (push) Successful in 3m23s
The CPU build steps used `uv run pyinstaller` which re-resolves
dependencies from pyproject.toml's [tool.uv.sources] before running,
pulling CUDA torch back in after the CPU-only reinstall. This made
CPU and CUDA zips the same size.

Fix: run pyinstaller directly from the venv (.venv/bin/pyinstaller
on Linux/macOS, .venv\Scripts\pyinstaller.exe on Windows) to skip
uv's dependency resolution entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 17:23:28 -07:00
Gitea Actions
fff37992b1 chore: bump version to 1.4.4 [skip ci] 2026-04-07 00:05:15 +00:00
Developer
8afe3230d3 Add sidecar download, setup screen, and auto-launch
Some checks failed
Release / Bump version and tag (push) Successful in 3s
Release / Build App (macOS) (push) Successful in 1m9s
Release / Build App (Linux) (push) Successful in 5m36s
Release / Build App (Windows) (push) Has been cancelled
On first launch, the app now prompts users to download the Python
sidecar (CPU or CUDA variant) from Gitea releases, matching the
voice-to-notes pattern. On subsequent launches, it auto-launches
the sidecar and connects.

New Rust module (src-tauri/src/sidecar/):
- download_sidecar: streams download with progress events, extracts zip
- check_sidecar: verifies installed sidecar binary exists
- check_sidecar_update: compares local vs latest release version
- SidecarManager: launches binary, waits for ready JSON, manages lifecycle
- Dev mode: runs `python -m backend.main_headless` directly
- start_sidecar/stop_sidecar/get_sidecar_port: Tauri commands

New Svelte component (SidecarSetup.svelte):
- First-time setup overlay with CPU/CUDA variant selection
- Download progress bar with byte counter
- Error state with retry, success state with auto-continue

Updated App.svelte state machine:
- checking -> needs_setup -> starting -> connected
- Falls back to direct connection in browser dev mode

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 17:02:56 -07:00
Developer
04e7fb1a99 Fix macOS sidecar build and blank window on startup
Some checks failed
Release / Bump version and tag (push) Has been cancelled
Release / Build App (Linux) (push) Has been cancelled
Release / Build App (Windows) (push) Has been cancelled
Release / Build App (macOS) (push) Has been cancelled
macOS sidecar: `uv run` re-resolves dependencies using CUDA sources
even after `uv sync --no-sources`. Use UV_NO_SOURCES=1 env var instead
so it applies to all uv commands in the step.

Blank window: When the Tauri app starts without the Python backend
running, it showed a completely blank window. Now shows a "Connecting
to backend..." spinner, or an error state with instructions to start
the backend manually.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:55:03 -07:00
Gitea Actions
9a282215c9 chore: bump sidecar version to 1.0.2 [skip ci] 2026-04-06 23:49:18 +00:00
Gitea Actions
cc2d17a627 chore: bump version to 1.4.3 [skip ci] 2026-04-06 21:02:18 +00:00
Developer
61c5ffa4fa Remove Zone.Identifier files that break Windows checkout
All checks were successful
Release / Bump version and tag (push) Successful in 4s
Release / Build App (macOS) (push) Successful in 58s
Release / Build App (Windows) (push) Successful in 3m22s
Release / Build App (Linux) (push) Successful in 6m27s
Windows NTFS Zone.Identifier alternate data stream files were
accidentally committed. The colon in the filename is invalid on
Windows, causing git checkout to fail on Windows runners.

Also added *:Zone.Identifier to .gitignore to prevent this recurring.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 14:02:11 -07:00
Gitea Actions
289b9dabe1 chore: bump version to 1.4.2 [skip ci] 2026-04-06 21:00:01 +00:00
Developer
9522f28c57 Fix app icons: regenerate as RGBA and add macOS .icns
Some checks failed
Release / Bump version and tag (push) Successful in 4s
Release / Build App (Windows) (push) Failing after 10s
Release / Build App (macOS) (push) Successful in 59s
Release / Build App (Linux) (push) Has been cancelled
The bundled .ico had non-RGBA PNGs which caused Tauri's macOS bundler
to fail with "The PNG is not in RGBA format!". Regenerated all icons
from the source PNG as proper RGBA, and added icon.icns for macOS.

Also fixed bundle identifier from "com.localtranscription.app" (the
.app suffix conflicts with macOS bundle extension) to
"net.anhonesthost.local-transcription".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 13:59:50 -07:00
66 changed files with 6906 additions and 925 deletions

View File

@@ -0,0 +1,95 @@
name: Build App (Linux)
on:
workflow_dispatch:
inputs:
tag:
description: 'Release tag to build (e.g. v1.4.5)'
required: true
jobs:
build-linux:
name: Build App (Linux)
runs-on: ubuntu-latest
env:
NODE_VERSION: "20"
RELEASE_TAG: "${{ inputs.tag }}"
steps:
- name: Show tag
run: |
echo "Building for tag: ${RELEASE_TAG}"
- uses: actions/checkout@v4
with:
ref: ${{ inputs.tag }}
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
- name: Install Rust stable
run: |
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable
echo "$HOME/.cargo/bin" >> $GITHUB_PATH
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y libgtk-3-dev libwebkit2gtk-4.1-dev libappindicator3-dev librsvg2-dev patchelf xdg-utils rpm
- name: Install npm dependencies
run: npm ci
- name: Build Tauri app
run: npm run tauri build
- name: Upload to release
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
sudo apt-get install -y jq
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
TAG="${RELEASE_TAG}"
echo "Release tag: ${TAG}"
echo "Waiting for release ${TAG} to be available..."
RELEASE_ID=""
for i in $(seq 1 30); do
RELEASE_JSON=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/tags/${TAG}")
RELEASE_ID=$(echo "$RELEASE_JSON" | jq -r '.id // empty')
if [ -n "${RELEASE_ID}" ] && [ "${RELEASE_ID}" != "null" ]; then
echo "Found release: ${TAG} (ID: ${RELEASE_ID})"
break
fi
echo "Attempt ${i}/30: Release not ready yet, retrying in 10s..."
sleep 10
done
if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
echo "ERROR: Failed to find release for tag ${TAG} after 30 attempts."
exit 1
fi
find src-tauri/target/release/bundle -type f \( -name "*.deb" -o -name "*.rpm" -o -name "*.AppImage" \) | while IFS= read -r file; do
filename=$(basename "$file")
encoded_name=$(echo "$filename" | sed 's/ /%20/g')
echo "Uploading ${filename} ($(du -h "$file" | cut -f1))..."
ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
if [ -n "${ASSET_ID}" ]; then
curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
fi
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
-H "Authorization: token ${BUILD_TOKEN}" \
-H "Content-Type: application/octet-stream" \
-T "$file" \
"${REPO_API}/releases/${RELEASE_ID}/assets?name=${encoded_name}")
echo "Upload response: HTTP ${HTTP_CODE}"
done

View File

@@ -0,0 +1,93 @@
name: Build App (macOS)
on:
workflow_dispatch:
inputs:
tag:
description: 'Release tag to build (e.g. v1.4.5)'
required: true
jobs:
build-macos:
name: Build App (macOS)
runs-on: macos-latest
env:
NODE_VERSION: "20"
RELEASE_TAG: "${{ inputs.tag }}"
steps:
- name: Show tag
run: |
echo "Building for tag: ${RELEASE_TAG}"
- uses: actions/checkout@v4
with:
ref: ${{ inputs.tag }}
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
- name: Install Rust stable
run: |
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable
echo "$HOME/.cargo/bin" >> $GITHUB_PATH
- name: Install system dependencies
run: brew install --quiet create-dmg || true
- name: Install npm dependencies
run: npm ci
- name: Build Tauri app
run: npm run tauri build
- name: Upload to release
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
which jq || brew install jq
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
TAG="${RELEASE_TAG}"
echo "Release tag: ${TAG}"
echo "Waiting for release ${TAG} to be available..."
RELEASE_ID=""
for i in $(seq 1 30); do
RELEASE_JSON=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/tags/${TAG}")
RELEASE_ID=$(echo "$RELEASE_JSON" | jq -r '.id // empty')
if [ -n "${RELEASE_ID}" ] && [ "${RELEASE_ID}" != "null" ]; then
echo "Found release: ${TAG} (ID: ${RELEASE_ID})"
break
fi
echo "Attempt ${i}/30: Release not ready yet, retrying in 10s..."
sleep 10
done
if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
echo "ERROR: Failed to find release for tag ${TAG} after 30 attempts."
exit 1
fi
find src-tauri/target/release/bundle -type f -name "*.dmg" | while IFS= read -r file; do
filename=$(basename "$file")
encoded_name=$(echo "$filename" | sed 's/ /%20/g')
echo "Uploading ${filename} ($(du -h "$file" | cut -f1))..."
ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
if [ -n "${ASSET_ID}" ]; then
curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
fi
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
-H "Authorization: token ${BUILD_TOKEN}" \
-H "Content-Type: application/octet-stream" \
-T "$file" \
"${REPO_API}/releases/${RELEASE_ID}/assets?name=${encoded_name}")
echo "Upload response: HTTP ${HTTP_CODE}"
done

View File

@@ -0,0 +1,154 @@
name: Build App (Windows)
on:
workflow_dispatch:
inputs:
tag:
description: 'Release tag to build (e.g. v1.4.5)'
required: true
env:
NODE_VERSION: "20"
jobs:
build-windows:
name: Build App (Windows)
runs-on: windows-latest
env:
RELEASE_TAG: "${{ inputs.tag }}"
steps:
- name: Show tag
shell: powershell
run: |
Write-Host "Building for tag: $env:RELEASE_TAG"
- uses: actions/checkout@v4
with:
ref: ${{ inputs.tag }}
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
- name: Install Rust stable
shell: powershell
run: |
if (Get-Command rustup -ErrorAction SilentlyContinue) {
rustup default stable
} else {
Invoke-WebRequest -Uri https://win.rustup.rs/x86_64 -OutFile rustup-init.exe
.\rustup-init.exe -y --default-toolchain stable
echo "$env:USERPROFILE\.cargo\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
}
- name: Install npm dependencies
shell: powershell
run: npm ci
- name: Setup Azure Artifact Signing
shell: powershell
env:
AZURE_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
AZURE_SIGNING_ENDPOINT: ${{ secrets.AZURE_SIGNING_ENDPOINT }}
AZURE_SIGNING_ACCOUNT: ${{ secrets.AZURE_SIGNING_ACCOUNT }}
AZURE_CERT_PROFILE: ${{ secrets.AZURE_CERT_PROFILE }}
run: |
if (-not $env:AZURE_CLIENT_ID) {
Write-Host "No Azure signing secrets configured, skipping code signing setup"
return
}
Write-Host "Setting up Azure Artifact Signing..."
# Install Artifact Signing client tools
nuget install Microsoft.ArtifactSigning.Client -x -OutputDirectory .\signing-tools
$dlibPath = (Resolve-Path ".\signing-tools\Microsoft.ArtifactSigning.Client*\bin\x64\Azure.CodeSigning.Dlib.dll").Path
# Write metadata.json
@{
Endpoint = $env:AZURE_SIGNING_ENDPOINT
CodeSigningAccountName = $env:AZURE_SIGNING_ACCOUNT
CertificateProfileName = $env:AZURE_CERT_PROFILE
} | ConvertTo-Json | Out-File -Encoding UTF8 metadata.json
$metadataPath = (Resolve-Path "metadata.json").Path
# Inject signCommand into tauri.conf.json for this build
$conf = Get-Content src-tauri\tauri.conf.json -Raw | ConvertFrom-Json
$signCmd = "signtool.exe sign /v /fd SHA256 /tr http://timestamp.acs.microsoft.com /td SHA256 /dlib `"$dlibPath`" /dmdf `"$metadataPath`" %1"
$conf.bundle.windows | Add-Member -NotePropertyName "signCommand" -NotePropertyValue $signCmd -Force
$conf | ConvertTo-Json -Depth 10 | Set-Content src-tauri\tauri.conf.json -Encoding UTF8
- name: Build Tauri app
shell: powershell
env:
AZURE_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
AZURE_CLIENT_SECRET: ${{ secrets.AZURE_CLIENT_SECRET }}
AZURE_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
run: npm run tauri build
- name: Upload to release
shell: powershell
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
$REPO_API = "${{ github.server_url }}/api/v1/repos/${{ github.repository }}"
$Headers = @{ "Authorization" = "token $env:BUILD_TOKEN" }
$TAG = $env:RELEASE_TAG
Write-Host "Release tag: $TAG"
if (-not $TAG) {
Write-Host "ERROR: RELEASE_TAG is empty"
exit 1
}
Write-Host "Waiting for release $TAG to be available..."
$RELEASE_ID = $null
for ($i = 1; $i -le 30; $i++) {
try {
$release = Invoke-RestMethod -Uri "$REPO_API/releases/tags/$TAG" -Headers $Headers -ErrorAction Stop
$RELEASE_ID = $release.id
if ($RELEASE_ID) {
Write-Host "Found release: $TAG (ID: $RELEASE_ID)"
break
}
} catch {}
Write-Host "Attempt ${i}/30: Release not ready yet, retrying in 10s..."
Start-Sleep -Seconds 10
}
if (-not $RELEASE_ID) {
Write-Host "ERROR: Failed to find release for tag $TAG after 30 attempts."
exit 1
}
Get-ChildItem -Path src-tauri\target\release\bundle -Recurse -Include *.msi,*-setup.exe | ForEach-Object {
$filename = $_.Name
$encodedName = [System.Uri]::EscapeDataString($filename)
$size = [math]::Round($_.Length / 1MB, 1)
Write-Host "Uploading $filename ($size MB)..."
try {
$assets = Invoke-RestMethod -Uri "$REPO_API/releases/$RELEASE_ID/assets" -Headers $Headers
$existing = $assets | Where-Object { $_.name -eq $filename }
if ($existing) {
Invoke-RestMethod -Uri "$REPO_API/releases/$RELEASE_ID/assets/$($existing.id)" -Method Delete -Headers $Headers
}
} catch {}
$uploadUrl = "$REPO_API/releases/$RELEASE_ID/assets?name=$encodedName"
$result = curl.exe --fail --silent --show-error `
-X POST `
-H "Authorization: token $env:BUILD_TOKEN" `
-H "Content-Type: application/octet-stream" `
-T "$($_.FullName)" `
"$uploadUrl" 2>&1
if ($LASTEXITCODE -eq 0) {
Write-Host "Upload successful: $filename"
} else {
Write-Host "WARNING: Upload failed for ${filename}: $result"
}
}

View File

@@ -0,0 +1,229 @@
name: Build Sidecar (Cloud)
on:
workflow_dispatch:
inputs:
tag:
description: 'Sidecar release tag to build (e.g. sidecar-v1.0.5)'
required: true
jobs:
build-cloud-linux:
name: Build Cloud Sidecar (Linux)
runs-on: ubuntu-latest
env:
PYTHON_VERSION: "3.11"
RELEASE_TAG: "${{ inputs.tag }}"
steps:
- name: Show tag
run: |
echo "Building cloud sidecar for tag ${RELEASE_TAG}"
- uses: actions/checkout@v4
with:
ref: ${{ inputs.tag }}
- name: Install uv
run: |
curl -LsSf https://astral.sh/uv/install.sh | sh
echo "$HOME/.local/bin" >> $GITHUB_PATH
- name: Set up Python
run: uv python install ${{ env.PYTHON_VERSION }}
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y portaudio19-dev
- name: Build cloud sidecar
env:
UV_NO_SOURCES: "1"
run: |
uv venv
uv pip install pyinstaller numpy sounddevice fastapi uvicorn websockets pydantic requests pyyaml packaging
.venv/bin/pyinstaller local-transcription-cloud.spec
- name: Package
run: |
cd dist/local-transcription-backend && zip -r ../../sidecar-linux-x86_64-cloud.zip .
- name: Upload to release
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
sudo apt-get install -y jq
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
TAG="${RELEASE_TAG}"
for i in $(seq 1 30); do
RELEASE_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/tags/${TAG}" | jq -r '.id // empty')
if [ -n "${RELEASE_ID}" ] && [ "${RELEASE_ID}" != "null" ]; then
echo "Found release ${TAG} (ID: ${RELEASE_ID})"
break
fi
echo "Attempt ${i}/30: waiting for release..."
sleep 10
done
if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
echo "ERROR: Release not found"; exit 1
fi
for file in sidecar-*-cloud.zip; do
filename=$(basename "$file")
ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
[ -n "${ASSET_ID}" ] && curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" "${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
curl -s -o /dev/null -w "Upload ${filename}: HTTP %{http_code}\n" -X POST \
-H "Authorization: token ${BUILD_TOKEN}" -H "Content-Type: application/octet-stream" \
-T "$file" "${REPO_API}/releases/${RELEASE_ID}/assets?name=${filename}"
done
build-cloud-windows:
name: Build Cloud Sidecar (Windows)
runs-on: windows-latest
env:
PYTHON_VERSION: "3.11"
RELEASE_TAG: "${{ inputs.tag }}"
steps:
- name: Show tag
shell: powershell
run: Write-Host "Building cloud sidecar for tag $env:RELEASE_TAG"
- uses: actions/checkout@v4
with:
ref: ${{ inputs.tag }}
- name: Install uv
shell: powershell
run: |
if (Get-Command uv -ErrorAction SilentlyContinue) {
Write-Host "uv already installed"
} else {
irm https://astral.sh/uv/install.ps1 | iex
$uvPaths = @("$env:USERPROFILE\.local\bin", "$env:USERPROFILE\.cargo\bin", "$env:LOCALAPPDATA\uv\bin")
foreach ($p in $uvPaths) { if (Test-Path $p) { echo $p | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append } }
}
- name: Set up Python
shell: powershell
run: uv python install ${{ env.PYTHON_VERSION }}
- name: Build cloud sidecar
shell: powershell
env:
UV_NO_SOURCES: "1"
run: |
uv venv
uv pip install pyinstaller numpy sounddevice fastapi uvicorn websockets pydantic requests pyyaml packaging
.venv\Scripts\pyinstaller.exe local-transcription-cloud.spec
- name: Package
shell: powershell
run: |
if (-not (Get-Command 7z -ErrorAction SilentlyContinue)) { choco install 7zip -y }
7z a -tzip -mx=5 sidecar-windows-x86_64-cloud.zip .\dist\local-transcription-backend\*
- name: Upload to release
shell: powershell
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
$REPO_API = "${{ github.server_url }}/api/v1/repos/${{ github.repository }}"
$Headers = @{ "Authorization" = "token $env:BUILD_TOKEN" }
$TAG = $env:RELEASE_TAG
$RELEASE_ID = $null
for ($i = 1; $i -le 30; $i++) {
try {
$release = Invoke-RestMethod -Uri "$REPO_API/releases/tags/$TAG" -Headers $Headers -ErrorAction Stop
$RELEASE_ID = $release.id
if ($RELEASE_ID) { Write-Host "Found release $TAG (ID: $RELEASE_ID)"; break }
} catch {}
Write-Host "Attempt ${i}/30: waiting..."; Start-Sleep -Seconds 10
}
if (-not $RELEASE_ID) { Write-Host "ERROR: Release not found"; exit 1 }
Get-ChildItem -Path . -Filter "sidecar-*-cloud.zip" | ForEach-Object {
$fn = $_.Name; $enc = [System.Uri]::EscapeDataString($fn)
try {
$assets = Invoke-RestMethod -Uri "$REPO_API/releases/$RELEASE_ID/assets" -Headers $Headers
$existing = $assets | Where-Object { $_.name -eq $fn }
if ($existing) { Invoke-RestMethod -Uri "$REPO_API/releases/$RELEASE_ID/assets/$($existing.id)" -Method Delete -Headers $Headers }
} catch {}
curl.exe --fail -s -X POST -H "Authorization: token $env:BUILD_TOKEN" -H "Content-Type: application/octet-stream" -T "$($_.FullName)" "$REPO_API/releases/$RELEASE_ID/assets?name=$enc"
Write-Host "Uploaded $fn"
}
build-cloud-macos:
name: Build Cloud Sidecar (macOS)
runs-on: macos-latest
env:
PYTHON_VERSION: "3.11"
RELEASE_TAG: "${{ inputs.tag }}"
steps:
- name: Show tag
run: |
echo "Building cloud sidecar for tag ${RELEASE_TAG}"
- uses: actions/checkout@v4
with:
ref: ${{ inputs.tag }}
- name: Install uv
run: |
curl -LsSf https://astral.sh/uv/install.sh | sh
echo "$HOME/.local/bin" >> $GITHUB_PATH
- name: Set up Python
run: uv python install ${{ env.PYTHON_VERSION }}
- name: Install system dependencies
run: brew install portaudio
- name: Build cloud sidecar
env:
UV_NO_SOURCES: "1"
run: |
uv venv
uv pip install pyinstaller numpy sounddevice fastapi uvicorn websockets pydantic requests pyyaml packaging
.venv/bin/pyinstaller local-transcription-cloud.spec
- name: Package
run: |
cd dist/local-transcription-backend && zip -r ../../sidecar-macos-aarch64-cloud.zip .
- name: Upload to release
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
which jq || brew install jq
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
TAG="${RELEASE_TAG}"
for i in $(seq 1 30); do
RELEASE_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/tags/${TAG}" | jq -r '.id // empty')
if [ -n "${RELEASE_ID}" ] && [ "${RELEASE_ID}" != "null" ]; then
echo "Found release ${TAG} (ID: ${RELEASE_ID})"
break
fi
echo "Attempt ${i}/30: waiting for release..."
sleep 10
done
if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
echo "ERROR: Release not found"; exit 1
fi
for file in sidecar-*-cloud.zip; do
filename=$(basename "$file")
ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
[ -n "${ASSET_ID}" ] && curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" "${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
curl -s -o /dev/null -w "Upload ${filename}: HTTP %{http_code}\n" -X POST \
-H "Authorization: token ${BUILD_TOKEN}" -H "Content-Type: application/octet-stream" \
-T "$file" "${REPO_API}/releases/${RELEASE_ID}/assets?name=${filename}"
done

View File

@@ -0,0 +1,101 @@
name: Build Sidecar (Linux)
on:
workflow_dispatch:
inputs:
tag:
description: 'Sidecar release tag to build (e.g. sidecar-v1.0.3)'
required: true
jobs:
build-sidecar-linux:
name: Build Sidecar (Linux)
runs-on: ubuntu-latest
env:
PYTHON_VERSION: "3.11"
RELEASE_TAG: "${{ inputs.tag }}"
steps:
- name: Show tag
run: |
echo "Building for tag: ${RELEASE_TAG}"
- uses: actions/checkout@v4
with:
ref: ${{ inputs.tag }}
- name: Install uv
run: |
if command -v uv &> /dev/null; then
echo "uv already installed: $(uv --version)"
else
curl -LsSf https://astral.sh/uv/install.sh | sh
echo "$HOME/.local/bin" >> $GITHUB_PATH
fi
- name: Set up Python
run: uv python install ${{ env.PYTHON_VERSION }}
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y portaudio19-dev
- name: Build sidecar (CPU)
run: |
uv sync --no-sources
# PyPI's default torch on Linux includes CUDA (~800MB).
# Replace with CPU-only torch from the dedicated index.
uv pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu --force-reinstall
.venv/bin/pyinstaller local-transcription-headless.spec
- name: Package sidecar (CPU)
run: |
cd dist/local-transcription-backend && zip -9 -r ../../sidecar-linux-x86_64-cpu.zip .
- name: Upload to sidecar release
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
sudo apt-get install -y jq
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
TAG="${RELEASE_TAG}"
echo "Waiting for sidecar release ${TAG} to be available..."
for i in $(seq 1 30); do
RELEASE_JSON=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/tags/${TAG}")
RELEASE_ID=$(echo "$RELEASE_JSON" | jq -r '.id // empty')
if [ -n "${RELEASE_ID}" ] && [ "${RELEASE_ID}" != "null" ]; then
echo "Found sidecar release: ${TAG} (ID: ${RELEASE_ID})"
break
fi
echo "Attempt ${i}/30: Release not ready yet, retrying in 10s..."
sleep 10
done
if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
echo "ERROR: Failed to find sidecar release for tag ${TAG} after 30 attempts."
exit 1
fi
for file in sidecar-*.zip; do
filename=$(basename "$file")
encoded_name=$(echo "$filename" | sed 's/ /%20/g')
echo "Uploading ${filename} ($(du -h "$file" | cut -f1))..."
ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
if [ -n "${ASSET_ID}" ]; then
curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
fi
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
-H "Authorization: token ${BUILD_TOKEN}" \
-H "Content-Type: application/octet-stream" \
-T "$file" \
"${REPO_API}/releases/${RELEASE_ID}/assets?name=${encoded_name}")
echo "Upload response: HTTP ${HTTP_CODE}"
done

View File

@@ -0,0 +1,101 @@
name: Build Sidecar (macOS)
on:
workflow_dispatch:
inputs:
tag:
description: 'Sidecar release tag to build (e.g. sidecar-v1.0.3)'
required: true
jobs:
build-sidecar-macos:
name: Build Sidecar (macOS)
runs-on: macos-latest
env:
PYTHON_VERSION: "3.11"
RELEASE_TAG: "${{ inputs.tag }}"
steps:
- name: Show tag
run: |
echo "Building for tag: ${RELEASE_TAG}"
- uses: actions/checkout@v4
with:
ref: ${{ inputs.tag }}
- name: Install uv
run: |
if command -v uv &> /dev/null; then
echo "uv already installed: $(uv --version)"
else
curl -LsSf https://astral.sh/uv/install.sh | sh
echo "$HOME/.local/bin" >> $GITHUB_PATH
fi
- name: Set up Python
run: uv python install ${{ env.PYTHON_VERSION }}
- name: Install system dependencies
run: brew install portaudio
- name: Build sidecar (CPU)
env:
UV_NO_SOURCES: "1"
run: |
# UV_NO_SOURCES bypasses pyproject.toml's [tool.uv.sources] which forces
# torch from the CUDA index (no macOS ARM wheels there).
# Default PyPI torch includes MPS (Apple Silicon GPU) support.
uv sync
.venv/bin/pyinstaller local-transcription-headless.spec
- name: Package sidecar (CPU)
run: |
cd dist/local-transcription-backend && zip -r ../../sidecar-macos-aarch64-cpu.zip .
- name: Upload to sidecar release
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
which jq || brew install jq
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
TAG="${RELEASE_TAG}"
echo "Waiting for sidecar release ${TAG} to be available..."
for i in $(seq 1 30); do
RELEASE_JSON=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/tags/${TAG}")
RELEASE_ID=$(echo "$RELEASE_JSON" | jq -r '.id // empty')
if [ -n "${RELEASE_ID}" ] && [ "${RELEASE_ID}" != "null" ]; then
echo "Found sidecar release: ${TAG} (ID: ${RELEASE_ID})"
break
fi
echo "Attempt ${i}/30: Release not ready yet, retrying in 10s..."
sleep 10
done
if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
echo "ERROR: Failed to find sidecar release for tag ${TAG} after 30 attempts."
exit 1
fi
for file in sidecar-*.zip; do
filename=$(basename "$file")
encoded_name=$(echo "$filename" | sed 's/ /%20/g')
echo "Uploading ${filename} ($(du -h "$file" | cut -f1))..."
ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
if [ -n "${ASSET_ID}" ]; then
curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
fi
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
-H "Authorization: token ${BUILD_TOKEN}" \
-H "Content-Type: application/octet-stream" \
-T "$file" \
"${REPO_API}/releases/${RELEASE_ID}/assets?name=${encoded_name}")
echo "Upload response: HTTP ${HTTP_CODE}"
done

View File

@@ -0,0 +1,135 @@
name: Build Sidecar (Windows)
on:
workflow_dispatch:
inputs:
tag:
description: 'Sidecar release tag to build (e.g. sidecar-v1.0.3)'
required: true
jobs:
build-sidecar-windows:
name: Build Sidecar (Windows)
runs-on: windows-latest
env:
PYTHON_VERSION: "3.11"
RELEASE_TAG: "${{ inputs.tag }}"
steps:
- name: Show tag
shell: powershell
run: |
Write-Host "Building for tag: $env:RELEASE_TAG"
- uses: actions/checkout@v4
with:
ref: ${{ inputs.tag }}
- name: Install uv
shell: powershell
run: |
if (Get-Command uv -ErrorAction SilentlyContinue) {
Write-Host "uv already installed: $(uv --version)"
} else {
irm https://astral.sh/uv/install.ps1 | iex
$uvPaths = @(
"$env:USERPROFILE\.local\bin",
"$env:USERPROFILE\.cargo\bin",
"$env:LOCALAPPDATA\uv\bin"
)
foreach ($p in $uvPaths) {
if (Test-Path $p) {
echo $p | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
}
}
}
- name: Set up Python
shell: powershell
run: uv python install ${{ env.PYTHON_VERSION }}
- name: Install 7-Zip
shell: powershell
run: |
if (-not (Get-Command 7z -ErrorAction SilentlyContinue)) {
choco install 7zip -y
}
- name: Build sidecar (CPU)
shell: powershell
run: |
$env:UV_NO_SOURCES = "1"
uv sync
# PyPI's default torch includes CUDA. Replace with CPU-only.
uv pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu --force-reinstall
.venv\Scripts\pyinstaller.exe local-transcription-headless.spec
- name: Package sidecar (CPU)
shell: powershell
run: |
7z a -tzip -mx=9 sidecar-windows-x86_64-cpu.zip .\dist\local-transcription-backend\*
- name: Upload to sidecar release
shell: powershell
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
$REPO_API = "${{ github.server_url }}/api/v1/repos/${{ github.repository }}"
$Headers = @{ "Authorization" = "token $env:BUILD_TOKEN" }
$TAG = $env:RELEASE_TAG
Write-Host "Release tag: $TAG"
if (-not $TAG) {
Write-Host "ERROR: RELEASE_TAG is empty"
exit 1
}
Write-Host "Waiting for sidecar release $TAG to be available..."
$RELEASE_ID = $null
for ($i = 1; $i -le 30; $i++) {
try {
$release = Invoke-RestMethod -Uri "$REPO_API/releases/tags/$TAG" -Headers $Headers -ErrorAction Stop
$RELEASE_ID = $release.id
if ($RELEASE_ID) {
Write-Host "Found sidecar release: $TAG (ID: $RELEASE_ID)"
break
}
} catch {}
Write-Host "Attempt ${i}/30: Release not ready yet, retrying in 10s..."
Start-Sleep -Seconds 10
}
if (-not $RELEASE_ID) {
Write-Host "ERROR: Failed to find sidecar release for tag $TAG after 30 attempts."
exit 1
}
Get-ChildItem -Path . -Filter "sidecar-*.zip" | ForEach-Object {
$filename = $_.Name
$encodedName = [System.Uri]::EscapeDataString($filename)
$size = [math]::Round($_.Length / 1MB, 1)
Write-Host "Uploading $filename ($size MB)..."
try {
$assets = Invoke-RestMethod -Uri "$REPO_API/releases/$RELEASE_ID/assets" -Headers $Headers
$existing = $assets | Where-Object { $_.name -eq $filename }
if ($existing) {
Invoke-RestMethod -Uri "$REPO_API/releases/$RELEASE_ID/assets/$($existing.id)" -Method Delete -Headers $Headers
}
} catch {}
$uploadUrl = "$REPO_API/releases/$RELEASE_ID/assets?name=$encodedName"
$result = curl.exe --fail --silent --show-error `
-X POST `
-H "Authorization: token $env:BUILD_TOKEN" `
-H "Content-Type: application/octet-stream" `
-T "$($_.FullName)" `
"$uploadUrl" 2>&1
if ($LASTEXITCODE -eq 0) {
Write-Host "Upload successful: $filename"
} else {
Write-Host "WARNING: Upload failed for ${filename}: $result"
}
}

View File

@@ -1,425 +0,0 @@
name: Build Sidecars
on:
push:
branches: [main]
paths:
- 'client/**'
- 'server/**'
- 'backend/**'
- 'pyproject.toml'
- 'local-transcription-headless.spec'
workflow_dispatch:
jobs:
bump-sidecar-version:
name: Bump sidecar version and tag
if: "!contains(github.event.head_commit.message, '[skip ci]')"
runs-on: ubuntu-latest
outputs:
version: ${{ steps.bump.outputs.version }}
tag: ${{ steps.bump.outputs.tag }}
has_changes: ${{ steps.check_changes.outputs.has_changes }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 2
- name: Check for backend changes
id: check_changes
run: |
# If triggered by workflow_dispatch, always build
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
echo "has_changes=true" >> $GITHUB_OUTPUT
exit 0
fi
# Check if relevant files changed in this commit
CHANGED=$(git diff --name-only HEAD~1 HEAD -- client/ server/ backend/ pyproject.toml local-transcription-headless.spec 2>/dev/null || echo "")
if [ -n "$CHANGED" ]; then
echo "has_changes=true" >> $GITHUB_OUTPUT
echo "Backend changes detected: $CHANGED"
else
echo "has_changes=false" >> $GITHUB_OUTPUT
echo "No backend changes detected, skipping sidecar build"
fi
- name: Configure git
if: steps.check_changes.outputs.has_changes == 'true'
run: |
git config user.name "Gitea Actions"
git config user.email "actions@gitea.local"
- name: Bump sidecar patch version
if: steps.check_changes.outputs.has_changes == 'true'
id: bump
run: |
# Read current version from pyproject.toml
CURRENT=$(grep '^version = ' pyproject.toml | head -1 | sed 's/version = "\(.*\)"/\1/')
echo "Current sidecar version: ${CURRENT}"
# Increment patch number
MAJOR=$(echo "${CURRENT}" | cut -d. -f1)
MINOR=$(echo "${CURRENT}" | cut -d. -f2)
PATCH=$(echo "${CURRENT}" | cut -d. -f3)
NEW_PATCH=$((PATCH + 1))
NEW_VERSION="${MAJOR}.${MINOR}.${NEW_PATCH}"
echo "New sidecar version: ${NEW_VERSION}"
# Update pyproject.toml
sed -i "s/^version = \"${CURRENT}\"/version = \"${NEW_VERSION}\"/" pyproject.toml
# Update version.py
sed -i "s/__version__ = \"${CURRENT}\"/__version__ = \"${NEW_VERSION}\"/" version.py
sed -i "s/__version_info__ = .*/__version_info__ = (${MAJOR}, ${MINOR}, ${NEW_PATCH})/" version.py
echo "version=${NEW_VERSION}" >> $GITHUB_OUTPUT
echo "tag=sidecar-v${NEW_VERSION}" >> $GITHUB_OUTPUT
- name: Commit and tag
if: steps.check_changes.outputs.has_changes == 'true'
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
NEW_VERSION="${{ steps.bump.outputs.version }}"
TAG="${{ steps.bump.outputs.tag }}"
git add pyproject.toml version.py
git commit -m "chore: bump sidecar version to ${NEW_VERSION} [skip ci]"
git tag "${TAG}"
REMOTE_URL=$(git remote get-url origin | sed "s|://|://gitea-actions:${BUILD_TOKEN}@|")
git pull --rebase "${REMOTE_URL}" main || true
git push "${REMOTE_URL}" HEAD:main
git push "${REMOTE_URL}" "${TAG}"
- name: Create Gitea release
if: steps.check_changes.outputs.has_changes == 'true'
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
TAG="${{ steps.bump.outputs.tag }}"
VERSION="${{ steps.bump.outputs.version }}"
RELEASE_NAME="Sidecar v${VERSION}"
curl -s -X POST \
-H "Authorization: token ${BUILD_TOKEN}" \
-H "Content-Type: application/json" \
-d "{\"tag_name\": \"${TAG}\", \"name\": \"${RELEASE_NAME}\", \"body\": \"Automated sidecar build.\", \"draft\": false, \"prerelease\": false}" \
"${REPO_API}/releases"
echo "Created release: ${RELEASE_NAME}"
# ── Linux sidecar (CUDA + CPU) ──
build-sidecar-linux:
name: Build Sidecar (Linux)
needs: bump-sidecar-version
if: needs.bump-sidecar-version.outputs.has_changes == 'true'
runs-on: ubuntu-latest
env:
PYTHON_VERSION: "3.11"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ needs.bump-sidecar-version.outputs.tag }}
- name: Install uv
run: |
if command -v uv &> /dev/null; then
echo "uv already installed: $(uv --version)"
else
curl -LsSf https://astral.sh/uv/install.sh | sh
echo "$HOME/.local/bin" >> $GITHUB_PATH
fi
- name: Set up Python
run: uv python install ${{ env.PYTHON_VERSION }}
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y portaudio19-dev
- name: Build sidecar (CUDA)
run: |
uv sync --frozen || uv sync
uv run pyinstaller local-transcription-headless.spec
- name: Package sidecar (CUDA)
run: |
cd dist/local-transcription-backend && zip -r ../../sidecar-linux-x86_64-cuda.zip .
- name: Build sidecar (CPU)
run: |
rm -rf dist/local-transcription-backend build/
uv pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu --force-reinstall
uv run pyinstaller local-transcription-headless.spec
- name: Package sidecar (CPU)
run: |
cd dist/local-transcription-backend && zip -r ../../sidecar-linux-x86_64-cpu.zip .
- name: Upload to sidecar release
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
sudo apt-get install -y jq
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
TAG="${{ needs.bump-sidecar-version.outputs.tag }}"
echo "Waiting for sidecar release ${TAG} to be available..."
for i in $(seq 1 30); do
RELEASE_JSON=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/tags/${TAG}")
RELEASE_ID=$(echo "$RELEASE_JSON" | jq -r '.id // empty')
if [ -n "${RELEASE_ID}" ] && [ "${RELEASE_ID}" != "null" ]; then
echo "Found sidecar release: ${TAG} (ID: ${RELEASE_ID})"
break
fi
echo "Attempt ${i}/30: Release not ready yet, retrying in 10s..."
sleep 10
done
if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
echo "ERROR: Failed to find sidecar release for tag ${TAG} after 30 attempts."
exit 1
fi
for file in sidecar-*.zip; do
filename=$(basename "$file")
encoded_name=$(echo "$filename" | sed 's/ /%20/g')
echo "Uploading ${filename} ($(du -h "$file" | cut -f1))..."
ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
if [ -n "${ASSET_ID}" ]; then
curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
fi
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
-H "Authorization: token ${BUILD_TOKEN}" \
-H "Content-Type: application/octet-stream" \
-T "$file" \
"${REPO_API}/releases/${RELEASE_ID}/assets?name=${encoded_name}")
echo "Upload response: HTTP ${HTTP_CODE}"
done
# ── Windows sidecar (CUDA + CPU) ──
build-sidecar-windows:
name: Build Sidecar (Windows)
needs: bump-sidecar-version
if: needs.bump-sidecar-version.outputs.has_changes == 'true'
runs-on: windows-latest
env:
PYTHON_VERSION: "3.11"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ needs.bump-sidecar-version.outputs.tag }}
- name: Install uv
shell: powershell
run: |
if (Get-Command uv -ErrorAction SilentlyContinue) {
Write-Host "uv already installed: $(uv --version)"
} else {
irm https://astral.sh/uv/install.ps1 | iex
# Add both possible uv install locations to PATH
$uvPaths = @(
"$env:USERPROFILE\.local\bin",
"$env:USERPROFILE\.cargo\bin",
"$env:LOCALAPPDATA\uv\bin"
)
foreach ($p in $uvPaths) {
if (Test-Path $p) {
echo $p | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
}
}
}
- name: Set up Python
shell: powershell
run: uv python install ${{ env.PYTHON_VERSION }}
- name: Install 7-Zip
shell: powershell
run: |
if (-not (Get-Command 7z -ErrorAction SilentlyContinue)) {
choco install 7zip -y
}
- name: Build sidecar (CUDA)
shell: powershell
run: |
uv sync --frozen
if ($LASTEXITCODE -ne 0) { uv sync }
uv run pyinstaller local-transcription-headless.spec
- name: Package sidecar (CUDA)
shell: powershell
run: |
7z a -tzip -mx=5 sidecar-windows-x86_64-cuda.zip .\dist\local-transcription-backend\*
- name: Build sidecar (CPU)
shell: powershell
run: |
Remove-Item -Recurse -Force dist\local-transcription-backend, build -ErrorAction SilentlyContinue
uv pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu --force-reinstall
uv run pyinstaller local-transcription-headless.spec
- name: Package sidecar (CPU)
shell: powershell
run: |
7z a -tzip -mx=5 sidecar-windows-x86_64-cpu.zip .\dist\local-transcription-backend\*
- name: Upload to sidecar release
shell: powershell
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
$REPO_API = "${{ github.server_url }}/api/v1/repos/${{ github.repository }}"
$Headers = @{ "Authorization" = "token $env:BUILD_TOKEN" }
$TAG = "${{ needs.bump-sidecar-version.outputs.tag }}"
Write-Host "Waiting for sidecar release ${TAG} to be available..."
$RELEASE_ID = $null
for ($i = 1; $i -le 30; $i++) {
try {
$release = Invoke-RestMethod -Uri "${REPO_API}/releases/tags/${TAG}" -Headers $Headers -ErrorAction Stop
$RELEASE_ID = $release.id
if ($RELEASE_ID) {
Write-Host "Found sidecar release: ${TAG} (ID: ${RELEASE_ID})"
break
}
} catch {}
Write-Host "Attempt ${i}/30: Release not ready yet, retrying in 10s..."
Start-Sleep -Seconds 10
}
if (-not $RELEASE_ID) {
Write-Host "ERROR: Failed to find sidecar release for tag ${TAG} after 30 attempts."
exit 1
}
Get-ChildItem -Path . -Filter "sidecar-*.zip" | ForEach-Object {
$filename = $_.Name
$encodedName = [System.Uri]::EscapeDataString($filename)
$size = [math]::Round($_.Length / 1MB, 1)
Write-Host "Uploading ${filename} (${size} MB)..."
try {
$assets = Invoke-RestMethod -Uri "${REPO_API}/releases/${RELEASE_ID}/assets" -Headers $Headers
$existing = $assets | Where-Object { $_.name -eq $filename }
if ($existing) {
Invoke-RestMethod -Uri "${REPO_API}/releases/${RELEASE_ID}/assets/$($existing.id)" -Method Delete -Headers $Headers
}
} catch {}
$uploadUrl = "${REPO_API}/releases/${RELEASE_ID}/assets?name=${encodedName}"
$result = curl.exe --fail --silent --show-error `
-X POST `
-H "Authorization: token $env:BUILD_TOKEN" `
-H "Content-Type: application/octet-stream" `
-T "$($_.FullName)" `
"$uploadUrl" 2>&1
if ($LASTEXITCODE -eq 0) {
Write-Host "Upload successful: ${filename}"
} else {
Write-Host "WARNING: Upload failed for ${filename}: ${result}"
}
}
# ── macOS sidecar (CPU only — no CUDA on macOS) ──
build-sidecar-macos:
name: Build Sidecar (macOS)
needs: bump-sidecar-version
if: needs.bump-sidecar-version.outputs.has_changes == 'true'
runs-on: macos-latest
env:
PYTHON_VERSION: "3.11"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ needs.bump-sidecar-version.outputs.tag }}
- name: Install uv
run: |
if command -v uv &> /dev/null; then
echo "uv already installed: $(uv --version)"
else
curl -LsSf https://astral.sh/uv/install.sh | sh
echo "$HOME/.local/bin" >> $GITHUB_PATH
fi
- name: Set up Python
run: uv python install ${{ env.PYTHON_VERSION }}
- name: Install system dependencies
run: brew install portaudio
- name: Build sidecar (CPU)
run: |
# --no-sources bypasses pyproject.toml's [tool.uv.sources] which forces
# torch from the CUDA index (no macOS ARM wheels there)
# Default PyPI torch includes MPS (Apple Silicon GPU) support
uv sync --no-sources
uv run pyinstaller local-transcription-headless.spec
- name: Package sidecar (CPU)
run: |
cd dist/local-transcription-backend && zip -r ../../sidecar-macos-aarch64-cpu.zip .
- name: Upload to sidecar release
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
which jq || brew install jq
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
TAG="${{ needs.bump-sidecar-version.outputs.tag }}"
echo "Waiting for sidecar release ${TAG} to be available..."
for i in $(seq 1 30); do
RELEASE_JSON=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/tags/${TAG}")
RELEASE_ID=$(echo "$RELEASE_JSON" | jq -r '.id // empty')
if [ -n "${RELEASE_ID}" ] && [ "${RELEASE_ID}" != "null" ]; then
echo "Found sidecar release: ${TAG} (ID: ${RELEASE_ID})"
break
fi
echo "Attempt ${i}/30: Release not ready yet, retrying in 10s..."
sleep 10
done
if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
echo "ERROR: Failed to find sidecar release for tag ${TAG} after 30 attempts."
exit 1
fi
for file in sidecar-*.zip; do
filename=$(basename "$file")
encoded_name=$(echo "$filename" | sed 's/ /%20/g')
echo "Uploading ${filename} ($(du -h "$file" | cut -f1))..."
ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
if [ -n "${ASSET_ID}" ]; then
curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
fi
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
-H "Authorization: token ${BUILD_TOKEN}" \
-H "Content-Type: application/octet-stream" \
-T "$file" \
"${REPO_API}/releases/${RELEASE_ID}/assets?name=${encoded_name}")
echo "Upload response: HTTP ${HTTP_CODE}"
done

View File

@@ -0,0 +1,102 @@
name: Cleanup Old Releases
on:
workflow_dispatch:
inputs:
keep_app_releases:
description: 'Number of app releases to keep'
required: false
default: '3'
keep_sidecar_releases:
description: 'Number of sidecar releases to keep'
required: false
default: '2'
dry_run:
description: 'Dry run (show what would be deleted without deleting)'
required: false
default: 'true'
jobs:
cleanup:
name: Cleanup Old Releases
runs-on: ubuntu-latest
steps:
- name: Cleanup releases
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
KEEP_APP="${{ inputs.keep_app_releases }}"
KEEP_SIDECAR="${{ inputs.keep_sidecar_releases }}"
DRY_RUN="${{ inputs.dry_run }}"
echo "=== Cleanup Configuration ==="
echo "Keep app releases: ${KEEP_APP}"
echo "Keep sidecar releases: ${KEEP_SIDECAR}"
echo "Dry run: ${DRY_RUN}"
echo ""
# Fetch all releases
ALL_RELEASES=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases?limit=50")
# ── App releases (v* tags, not sidecar-v*) ──
echo "=== App Releases ==="
APP_RELEASES=$(echo "$ALL_RELEASES" | jq -c '[.[] | select(.tag_name | startswith("v")) | select(.tag_name | startswith("sidecar") | not)]')
APP_TOTAL=$(echo "$APP_RELEASES" | jq 'length')
echo "Found ${APP_TOTAL} app releases, keeping ${KEEP_APP}"
if [ "$APP_TOTAL" -gt "$KEEP_APP" ]; then
echo "$APP_RELEASES" | jq -c ".[$KEEP_APP:][]" | while read -r release; do
ID=$(echo "$release" | jq -r '.id')
TAG=$(echo "$release" | jq -r '.tag_name')
SIZE=$(echo "$release" | jq '[.assets[]?.size // 0] | add // 0')
SIZE_MB=$(echo "scale=1; $SIZE / 1048576" | bc 2>/dev/null || echo "?")
# Protect v1.4.0 (last pre-Tauri release)
if [ "$TAG" = "v1.4.0" ]; then
echo " PROTECT ${TAG} (${SIZE_MB} MB)"
continue
fi
if [ "$DRY_RUN" = "true" ]; then
echo " WOULD DELETE ${TAG} (ID: ${ID}, ${SIZE_MB} MB)"
else
echo " DELETING ${TAG} (ID: ${ID}, ${SIZE_MB} MB)..."
curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${ID}"
fi
done
else
echo " Nothing to clean up"
fi
echo ""
# ── Sidecar releases (sidecar-v* tags) ──
echo "=== Sidecar Releases ==="
SIDECAR_RELEASES=$(echo "$ALL_RELEASES" | jq -c '[.[] | select(.tag_name | startswith("sidecar-v"))]')
SIDECAR_TOTAL=$(echo "$SIDECAR_RELEASES" | jq 'length')
echo "Found ${SIDECAR_TOTAL} sidecar releases, keeping ${KEEP_SIDECAR}"
if [ "$SIDECAR_TOTAL" -gt "$KEEP_SIDECAR" ]; then
echo "$SIDECAR_RELEASES" | jq -c ".[$KEEP_SIDECAR:][]" | while read -r release; do
ID=$(echo "$release" | jq -r '.id')
TAG=$(echo "$release" | jq -r '.tag_name')
SIZE=$(echo "$release" | jq '[.assets[]?.size // 0] | add // 0')
SIZE_MB=$(echo "scale=1; $SIZE / 1048576" | bc 2>/dev/null || echo "?")
if [ "$DRY_RUN" = "true" ]; then
echo " WOULD DELETE ${TAG} (ID: ${ID}, ${SIZE_MB} MB)"
else
echo " DELETING ${TAG} (ID: ${ID}, ${SIZE_MB} MB)..."
curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${ID}"
fi
done
else
echo " Nothing to clean up"
fi
echo ""
echo "=== Done ==="

View File

@@ -1,17 +1,41 @@
name: Release
on:
push:
branches: [main]
workflow_dispatch:
jobs:
bump-version:
name: Bump version and tag
test:
name: Run Tests
if: "!contains(github.event.head_commit.message, '[skip ci]')"
runs-on: ubuntu-latest
outputs:
new_version: ${{ steps.bump.outputs.new_version }}
tag: ${{ steps.bump.outputs.tag }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
- name: Install npm deps
run: npm ci
- name: Frontend tests
run: npx vitest run
- name: Install uv
run: |
curl -LsSf https://astral.sh/uv/install.sh | sh
echo "$HOME/.local/bin" >> $GITHUB_PATH
- name: Python tests
run: |
uv venv .testvenv
VIRTUAL_ENV=.testvenv uv pip install pytest httpx pytest-asyncio anyio fastapi pydantic pyyaml uvicorn requests
.testvenv/bin/python -m pytest backend/tests/ client/tests/ -v --tb=short
bump-version:
name: Bump version and tag
needs: test
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
@@ -23,13 +47,10 @@ jobs:
git config user.email "actions@gitea.local"
- name: Bump patch version
id: bump
run: |
# Read current version from package.json
CURRENT=$(grep '"version"' package.json | head -1 | sed 's/.*"version": *"\([^"]*\)".*/\1/')
echo "Current version: ${CURRENT}"
# Increment patch number
MAJOR=$(echo "${CURRENT}" | cut -d. -f1)
MINOR=$(echo "${CURRENT}" | cut -d. -f2)
PATCH=$(echo "${CURRENT}" | cut -d. -f3)
@@ -37,264 +58,56 @@ jobs:
NEW_VERSION="${MAJOR}.${MINOR}.${NEW_PATCH}"
echo "New version: ${NEW_VERSION}"
# Update package.json
sed -i "s/\"version\": \"${CURRENT}\"/\"version\": \"${NEW_VERSION}\"/" package.json
# Update src-tauri/tauri.conf.json
sed -i "s/\"version\": \"${CURRENT}\"/\"version\": \"${NEW_VERSION}\"/" src-tauri/tauri.conf.json
# Update src-tauri/Cargo.toml
sed -i "s/^version = \"${CURRENT}\"/version = \"${NEW_VERSION}\"/" src-tauri/Cargo.toml
# Update version.py
sed -i "s/__version__ = \"${CURRENT}\"/__version__ = \"${NEW_VERSION}\"/" version.py
sed -i "s/__version_info__ = .*/__version_info__ = (${MAJOR}, ${MINOR}, ${NEW_PATCH})/" version.py
echo "new_version=${NEW_VERSION}" >> $GITHUB_OUTPUT
echo "tag=v${NEW_VERSION}" >> $GITHUB_OUTPUT
# Write to env file instead of step outputs (avoids act runner bug)
echo "NEW_VERSION=${NEW_VERSION}" >> $GITHUB_ENV
echo "RELEASE_TAG=v${NEW_VERSION}" >> $GITHUB_ENV
- name: Commit and tag
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
NEW_VERSION="${{ steps.bump.outputs.new_version }}"
git add package.json src-tauri/tauri.conf.json src-tauri/Cargo.toml version.py
git commit -m "chore: bump version to ${NEW_VERSION} [skip ci]"
git tag "v${NEW_VERSION}"
git tag "${RELEASE_TAG}"
REMOTE_URL=$(git remote get-url origin | sed "s|://|://gitea-actions:${BUILD_TOKEN}@|")
git pull --rebase "${REMOTE_URL}" main || true
git push "${REMOTE_URL}" HEAD:main
git push "${REMOTE_URL}" "v${NEW_VERSION}"
git push "${REMOTE_URL}" "${RELEASE_TAG}"
- name: Create Gitea release
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
TAG="${{ steps.bump.outputs.tag }}"
RELEASE_NAME="Local Transcription ${TAG}"
RELEASE_NAME="Local Transcription ${RELEASE_TAG}"
curl -s -X POST \
-H "Authorization: token ${BUILD_TOKEN}" \
-H "Content-Type: application/json" \
-d "{\"tag_name\": \"${TAG}\", \"name\": \"${RELEASE_NAME}\", \"body\": \"Automated build.\", \"draft\": false, \"prerelease\": false}" \
-d "{\"tag_name\": \"${RELEASE_TAG}\", \"name\": \"${RELEASE_NAME}\", \"body\": \"Automated build.\", \"draft\": false, \"prerelease\": false}" \
"${REPO_API}/releases"
echo "Created release: ${RELEASE_NAME}"
# ── Platform builds (run after version bump) ──
build-linux:
name: Build App (Linux)
needs: bump-version
runs-on: ubuntu-latest
env:
NODE_VERSION: "20"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ needs.bump-version.outputs.tag }}
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
- name: Install Rust stable
run: |
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable
echo "$HOME/.cargo/bin" >> $GITHUB_PATH
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y libgtk-3-dev libwebkit2gtk-4.1-dev libappindicator3-dev librsvg2-dev patchelf xdg-utils rpm
- name: Install npm dependencies
run: npm ci
- name: Build Tauri app
run: npm run tauri build
- name: Upload to release
- name: Trigger per-OS app builds
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
sudo apt-get install -y jq
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
TAG="${{ needs.bump-version.outputs.tag }}"
echo "Release tag: ${TAG}"
RELEASE_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/tags/${TAG}" | jq -r '.id // empty')
if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
echo "ERROR: Failed to find release for tag ${TAG}."
exit 1
fi
echo "Release ID: ${RELEASE_ID}"
find src-tauri/target/release/bundle -type f \( -name "*.deb" -o -name "*.rpm" -o -name "*.AppImage" \) | while IFS= read -r file; do
filename=$(basename "$file")
encoded_name=$(echo "$filename" | sed 's/ /%20/g')
echo "Uploading ${filename} ($(du -h "$file" | cut -f1))..."
ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
if [ -n "${ASSET_ID}" ]; then
curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
fi
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
for workflow in build-app-linux.yml build-app-windows.yml build-app-macos.yml; do
echo "Dispatching ${workflow} for ${RELEASE_TAG}..."
HTTP_CODE=$(curl -s -w "%{http_code}" -o /tmp/dispatch_resp.txt -X POST \
-H "Authorization: token ${BUILD_TOKEN}" \
-H "Content-Type: application/octet-stream" \
-T "$file" \
"${REPO_API}/releases/${RELEASE_ID}/assets?name=${encoded_name}")
echo "Upload response: HTTP ${HTTP_CODE}"
done
build-windows:
name: Build App (Windows)
needs: bump-version
runs-on: windows-latest
env:
NODE_VERSION: "20"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ needs.bump-version.outputs.tag }}
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
- name: Install Rust stable
shell: powershell
run: |
if (Get-Command rustup -ErrorAction SilentlyContinue) {
rustup default stable
} else {
Invoke-WebRequest -Uri https://win.rustup.rs/x86_64 -OutFile rustup-init.exe
.\rustup-init.exe -y --default-toolchain stable
echo "$env:USERPROFILE\.cargo\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
}
- name: Install npm dependencies
shell: powershell
run: npm ci
- name: Build Tauri app
shell: powershell
run: npm run tauri build
- name: Upload to release
shell: powershell
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
$REPO_API = "${{ github.server_url }}/api/v1/repos/${{ github.repository }}"
$Headers = @{ "Authorization" = "token $env:BUILD_TOKEN" }
$TAG = "${{ needs.bump-version.outputs.tag }}"
Write-Host "Release tag: ${TAG}"
$release = Invoke-RestMethod -Uri "${REPO_API}/releases/tags/${TAG}" -Headers $Headers -ErrorAction Stop
$RELEASE_ID = $release.id
Write-Host "Release ID: ${RELEASE_ID}"
Get-ChildItem -Path src-tauri\target\release\bundle -Recurse -Include *.msi,*-setup.exe | ForEach-Object {
$filename = $_.Name
$encodedName = [System.Uri]::EscapeDataString($filename)
$size = [math]::Round($_.Length / 1MB, 1)
Write-Host "Uploading ${filename} (${size} MB)..."
try {
$assets = Invoke-RestMethod -Uri "${REPO_API}/releases/${RELEASE_ID}/assets" -Headers $Headers
$existing = $assets | Where-Object { $_.name -eq $filename }
if ($existing) {
Invoke-RestMethod -Uri "${REPO_API}/releases/${RELEASE_ID}/assets/$($existing.id)" -Method Delete -Headers $Headers
}
} catch {}
$uploadUrl = "${REPO_API}/releases/${RELEASE_ID}/assets?name=${encodedName}"
$result = curl.exe --fail --silent --show-error `
-X POST `
-H "Authorization: token $env:BUILD_TOKEN" `
-H "Content-Type: application/octet-stream" `
-T "$($_.FullName)" `
"$uploadUrl" 2>&1
if ($LASTEXITCODE -eq 0) {
Write-Host "Upload successful: ${filename}"
} else {
Write-Host "WARNING: Upload failed for ${filename}: ${result}"
}
}
build-macos:
name: Build App (macOS)
needs: bump-version
runs-on: macos-latest
env:
NODE_VERSION: "20"
steps:
- uses: actions/checkout@v4
with:
ref: ${{ needs.bump-version.outputs.tag }}
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
- name: Install Rust stable
run: |
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable
echo "$HOME/.cargo/bin" >> $GITHUB_PATH
- name: Install system dependencies
run: brew install --quiet create-dmg || true
- name: Install npm dependencies
run: npm ci
- name: Build Tauri app
run: npm run tauri build
- name: Upload to release
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
which jq || brew install jq
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
TAG="${{ needs.bump-version.outputs.tag }}"
echo "Release tag: ${TAG}"
RELEASE_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/tags/${TAG}" | jq -r '.id // empty')
if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
echo "ERROR: Failed to find release for tag ${TAG}."
exit 1
fi
echo "Release ID: ${RELEASE_ID}"
find src-tauri/target/release/bundle -type f -name "*.dmg" | while IFS= read -r file; do
filename=$(basename "$file")
encoded_name=$(echo "$filename" | sed 's/ /%20/g')
echo "Uploading ${filename} ($(du -h "$file" | cut -f1))..."
ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
if [ -n "${ASSET_ID}" ]; then
curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
"${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
fi
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
-H "Authorization: token ${BUILD_TOKEN}" \
-H "Content-Type: application/octet-stream" \
-T "$file" \
"${REPO_API}/releases/${RELEASE_ID}/assets?name=${encoded_name}")
echo "Upload response: HTTP ${HTTP_CODE}"
-H "Content-Type: application/json" \
-d "{\"ref\": \"main\", \"inputs\": {\"tag\": \"${RELEASE_TAG}\"}}" \
"${REPO_API}/actions/workflows/${workflow}/dispatches")
echo " -> HTTP ${HTTP_CODE}"
if [ "$HTTP_CODE" != "204" ]; then cat /tmp/dispatch_resp.txt; echo ""; fi
done

View File

@@ -0,0 +1,102 @@
name: Sidecar Release
on:
workflow_dispatch:
jobs:
test:
name: Run Tests
if: "!contains(github.event.head_commit.message, '[skip ci]')"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install uv
run: |
curl -LsSf https://astral.sh/uv/install.sh | sh
echo "$HOME/.local/bin" >> $GITHUB_PATH
- name: Python tests
run: |
uv venv .testvenv
VIRTUAL_ENV=.testvenv uv pip install pytest httpx pytest-asyncio anyio fastapi pydantic pyyaml uvicorn requests
.testvenv/bin/python -m pytest backend/tests/ client/tests/ -v --tb=short
bump-sidecar-version:
name: Bump sidecar version and tag
needs: test
if: "!contains(github.event.head_commit.message, '[skip ci]')"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 2
- name: Configure git
run: |
git config user.name "Gitea Actions"
git config user.email "actions@gitea.local"
- name: Bump sidecar patch version
run: |
CURRENT=$(grep '^version = ' pyproject.toml | head -1 | sed 's/version = "\(.*\)"/\1/')
echo "Current sidecar version: ${CURRENT}"
MAJOR=$(echo "${CURRENT}" | cut -d. -f1)
MINOR=$(echo "${CURRENT}" | cut -d. -f2)
PATCH=$(echo "${CURRENT}" | cut -d. -f3)
NEW_PATCH=$((PATCH + 1))
NEW_VERSION="${MAJOR}.${MINOR}.${NEW_PATCH}"
echo "New sidecar version: ${NEW_VERSION}"
sed -i "s/^version = \"${CURRENT}\"/version = \"${NEW_VERSION}\"/" pyproject.toml
# Write to env file instead of step outputs (avoids act runner bug)
echo "NEW_VERSION=${NEW_VERSION}" >> $GITHUB_ENV
echo "RELEASE_TAG=sidecar-v${NEW_VERSION}" >> $GITHUB_ENV
- name: Commit and tag
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
git add pyproject.toml
git commit -m "chore: bump sidecar version to ${NEW_VERSION} [skip ci]"
git tag "${RELEASE_TAG}"
REMOTE_URL=$(git remote get-url origin | sed "s|://|://gitea-actions:${BUILD_TOKEN}@|")
git pull --rebase "${REMOTE_URL}" main || true
git push "${REMOTE_URL}" HEAD:main
git push "${REMOTE_URL}" "${RELEASE_TAG}"
- name: Create Gitea release
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
RELEASE_NAME="Sidecar v${NEW_VERSION}"
curl -s -X POST \
-H "Authorization: token ${BUILD_TOKEN}" \
-H "Content-Type: application/json" \
-d "{\"tag_name\": \"${RELEASE_TAG}\", \"name\": \"${RELEASE_NAME}\", \"body\": \"Automated sidecar build.\", \"draft\": false, \"prerelease\": false}" \
"${REPO_API}/releases"
echo "Created release: ${RELEASE_NAME}"
- name: Trigger per-OS sidecar builds
env:
BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
run: |
REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
for workflow in build-sidecar-linux.yml build-sidecar-windows.yml build-sidecar-macos.yml build-sidecar-cloud.yml; do
echo "Dispatching ${workflow} for ${RELEASE_TAG}..."
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
-H "Authorization: token ${BUILD_TOKEN}" \
-H "Content-Type: application/json" \
-d "{\"ref\": \"main\", \"inputs\": {\"tag\": \"${RELEASE_TAG}\"}}" \
"${REPO_API}/actions/workflows/${workflow}/dispatches")
echo " -> HTTP ${HTTP_CODE}"
done
# NOTE: Automatic cleanup disabled -- it races with async builds.
# Clean up old releases manually from the Gitea UI when needed.

66
.gitea/workflows/test.yml Normal file
View File

@@ -0,0 +1,66 @@
name: Tests
on:
push:
branches: [main]
pull_request:
branches: [main]
workflow_dispatch:
jobs:
python-tests:
name: Python Backend Tests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install uv
run: |
if command -v uv &> /dev/null; then
echo "uv already installed: $(uv --version)"
else
curl -LsSf https://astral.sh/uv/install.sh | sh
echo "$HOME/.local/bin" >> $GITHUB_PATH
fi
- name: Run pytest
run: |
uv venv .testvenv
VIRTUAL_ENV=.testvenv uv pip install pytest httpx pytest-asyncio anyio fastapi pydantic pyyaml uvicorn requests
.testvenv/bin/python -m pytest backend/tests/ client/tests/ -v --tb=short
frontend-tests:
name: Frontend Tests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
- name: Install dependencies
run: npm ci
- name: Run Vitest
run: npx vitest run
rust-tests:
name: Rust Sidecar Tests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install Rust
run: |
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable
echo "$HOME/.cargo/bin" >> $GITHUB_PATH
- name: Install Tauri system dependencies
run: |
sudo apt-get update
sudo apt-get install -y libgtk-3-dev libwebkit2gtk-4.1-dev libappindicator3-dev librsvg2-dev patchelf
- name: Run cargo test
working-directory: src-tauri
run: cargo test

3
.gitignore vendored
View File

@@ -63,3 +63,6 @@ dist/
# Tauri
src-tauri/target/
# Windows NTFS alternate data streams
*:Zone.Identifier

View File

@@ -11,9 +11,11 @@ Local Transcription is a cross-platform desktop application for real-time speech
**Key Features:**
- Cross-platform desktop app (Windows, macOS, Linux) via Tauri v2 + Svelte 5
- Headless Python backend with FastAPI control API
- Dual transcription modes: local Whisper or cloud Deepgram (managed/BYOK)
- Cloud-first: defaults to Deepgram (BYOK) transcription; local Whisper also supported
- Settings UI hides local-only options (model, VAD, timing) when in cloud mode
- Start button gated on API key / login — shows guidance if not configured
- Shared Captions: create rooms, share via codes, join with one click (hosted at caption.shadowdao.com)
- Built-in web server for OBS browser source at `http://localhost:8080`
- Optional multi-user sync via Node.js server
- CUDA, MPS (Apple Silicon), and CPU support
- Auto-updates, custom fonts, configurable colors
@@ -64,8 +66,14 @@ local-transcription/
│ ├── web_display.py # FastAPI OBS display server (WebSocket + HTML)
│ └── nodejs/ # Optional multi-user sync server
├── .gitea/workflows/ # CI/CD
│ ├── release.yml # Tauri app builds (Linux/Windows/macOS)
── build-sidecar.yml # Python sidecar builds (CUDA + CPU)
│ ├── release.yml # Coordinator: version bump, tag, release creation
── build-app-linux.yml # Linux Tauri app build (triggered by v* tag)
│ ├── build-app-windows.yml # Windows Tauri app build (triggered by v* tag)
│ ├── build-app-macos.yml # macOS Tauri app build (triggered by v* tag)
│ ├── sidecar-release.yml # Sidecar coordinator: version bump, tag, release
│ ├── build-sidecar-linux.yml # Linux sidecar build (triggered by sidecar-v* tag)
│ ├── build-sidecar-windows.yml # Windows sidecar build (triggered by sidecar-v* tag)
│ └── build-sidecar-macos.yml # macOS sidecar build (triggered by sidecar-v* tag)
├── config/default_config.yaml # Default settings template
├── main.py # Legacy PySide6 GUI entry point
├── main_cli.py # CLI version for testing
@@ -205,12 +213,21 @@ Uses Svelte 5 runes throughout (`$state`, `$derived`, `$effect`, `$props`). No S
## CI/CD
Two Gitea Actions workflows in `.gitea/workflows/`:
Eight Gitea Actions workflows in `.gitea/workflows/`, split into coordinators and per-OS builders:
- **`release.yml`**: Triggers on push to `main`. Auto-bumps version, builds Tauri app on Linux/Windows/macOS, uploads `.deb`, `.rpm`, `.msi`, `.dmg` to Gitea release.
- **`build-sidecar.yml`**: Triggers on changes to `client/`, `server/`, `backend/`, `pyproject.toml`. Builds headless Python sidecar via PyInstaller. CUDA + CPU for Linux/Windows, CPU-only for macOS.
**App release (Tauri):**
- **`release.yml`**: Coordinator. Triggers on push to `main`. Auto-bumps version in package.json/tauri.conf.json/Cargo.toml/version.py, commits, tags `v{VERSION}`, creates Gitea release.
- **`build-app-linux.yml`**: Triggers on `v*` tag push or `workflow_dispatch`. Builds Tauri app, uploads `.deb`/`.rpm`/`.AppImage`.
- **`build-app-windows.yml`**: Triggers on `v*` tag push or `workflow_dispatch`. Builds Tauri app, uploads `.msi`/`*-setup.exe`.
- **`build-app-macos.yml`**: Triggers on `v*` tag push or `workflow_dispatch`. Builds Tauri app, uploads `.dmg`.
Both require a `BUILD_TOKEN` secret (Gitea API token with release write access).
**Sidecar release (Python backend):**
- **`sidecar-release.yml`**: Coordinator. Triggers on push to `main` with changes in `client/`, `server/`, `backend/`, `pyproject.toml`, or `local-transcription-headless.spec`. Bumps version in pyproject.toml/version.py, tags `sidecar-v{VERSION}`, creates Gitea release.
- **`build-sidecar-linux.yml`**: Triggers on `sidecar-v*` tag push or `workflow_dispatch`. Builds CUDA + CPU sidecars via PyInstaller.
- **`build-sidecar-windows.yml`**: Triggers on `sidecar-v*` tag push or `workflow_dispatch`. Builds CUDA + CPU sidecars via PyInstaller.
- **`build-sidecar-macos.yml`**: Triggers on `sidecar-v*` tag push or `workflow_dispatch`. Builds CPU-only sidecar via PyInstaller.
All per-OS build workflows can be re-run independently via `workflow_dispatch` with an optional `tag` input. All require a `BUILD_TOKEN` secret (Gitea API token with release write access).
## Common Patterns
@@ -258,9 +275,29 @@ Both require a `BUILD_TOKEN` secret (Gitea API token with release write access).
- `Info.plist` must include `NSMicrophoneUsageDescription` for mic access
- No CUDA builds — CPU/MPS only
## Code Signing
Code signing is configured for Windows and macOS to eliminate install warnings (SmartScreen / Gatekeeper). See [SIGNING.md](SIGNING.md) for full setup details.
**Status (as of 2026-04-10):** CI workflow changes are committed. Waiting on identity verification for both platforms before secrets can be configured.
**How it works:**
- macOS: Tauri auto-signs when `APPLE_CERTIFICATE` and related env vars are set in CI. Notarization uses App Store Connect API key.
- Windows: Azure Artifact Signing via `signtool.exe` + dlib. CI workflow injects `signCommand` into `tauri.conf.json` at build time when `AZURE_CLIENT_ID` is set.
- Both are no-ops when secrets aren't configured — unsigned builds work as before.
**Key files:**
- `src-tauri/Entitlements.plist` — macOS hardened runtime entitlements (mic, network)
- `src-tauri/Info.plist` — macOS microphone usage description
- `.gitea/workflows/build-app-macos.yml` — Apple signing + notarization
- `.gitea/workflows/build-app-windows.yml` — Azure Artifact Signing
**Secrets required (12 total):** See [SIGNING.md](SIGNING.md) for the full list — 6 Apple secrets, 6 Azure secrets.
## Related Documentation
- [README.md](README.md) — User-facing documentation
- [BUILD.md](BUILD.md) — Detailed build instructions
- [INSTALL.md](INSTALL.md) — Installation guide
- [SIGNING.md](SIGNING.md) — Code signing setup guide
- [server/nodejs/README.md](server/nodejs/README.md) — Node.js server setup

View File

@@ -7,14 +7,14 @@ A real-time speech-to-text desktop application for streamers. Runs locally on yo
## Features
- **Real-Time Transcription**: Live speech-to-text using Whisper models with minimal latency
- **Cloud-First**: Defaults to Deepgram cloud transcription — get started with just an API key
- **Cross-Platform**: Native desktop app for Windows, macOS, and Linux via [Tauri](https://tauri.app/)
- **Dual Transcription Modes**: Local (Whisper) or cloud (Deepgram) with managed billing or BYOK
- **CPU & GPU Support**: Automatic detection of CUDA (NVIDIA), MPS (Apple Silicon), or CPU fallback
- **Advanced Voice Detection**: Dual-layer VAD (WebRTC + Silero) for accurate speech detection
- **Dual Transcription Modes**: Cloud (Deepgram) or local (Whisper) with automatic GPU/CPU detection
- **Shared Captions**: Create a room and share a code so others can join — no server setup needed
- **OBS Integration**: Built-in web server for browser source capture at `http://localhost:8080`
- **Multi-User Sync**: Optional Node.js server to sync transcriptions across multiple users
- **Custom Fonts**: Support for system fonts, web-safe fonts, Google Fonts, and custom font files
- **Customizable Colors**: User-configurable colors for name, text, and background
- **Advanced Voice Detection**: Dual-layer VAD (WebRTC + Silero) for accurate speech detection
- **Noise Suppression**: Built-in audio preprocessing to reduce background noise
- **Auto-Updates**: Automatic update checking with release notes display
@@ -87,27 +87,30 @@ For detailed build instructions, see [BUILD.md](BUILD.md).
## Usage
### Standalone Mode
### Quick Setup (Cloud — Recommended)
1. Launch the application
2. Select your microphone from the audio device dropdown
3. Choose a Whisper model (smaller = faster, larger = more accurate):
2. Open **Settings** — the transcription mode defaults to **Cloud (Deepgram)**
3. Get a free API key at [console.deepgram.com](https://console.deepgram.com) and paste it in Settings
4. Select your microphone from the audio device dropdown
5. Click **Start Transcription**
6. Transcriptions appear in the main window and at `http://localhost:8080`
> The Start button is disabled until an API key is entered. Local-only settings (model, VAD, timing) are hidden in cloud mode to keep things simple.
### Local Mode (Whisper)
For offline/on-device transcription, switch to **Local (Whisper)** in Settings:
1. Choose a Whisper model (smaller = faster, larger = more accurate):
- `tiny.en` / `tiny` — Fastest, good for quick captions
- `base.en` / `base` — Balanced speed and accuracy
- `small.en` / `small` — Better accuracy
- `medium.en` / `medium` — High accuracy
- `large-v3` — Best accuracy (requires more resources)
4. Click **Start** to begin transcription
5. Transcriptions appear in the main window and at `http://localhost:8080`
### Remote Transcription (Deepgram)
Instead of local Whisper models, you can use cloud-based transcription:
- **Managed mode**: Sign up via the transcription proxy for metered billing
- **BYOK mode**: Bring your own Deepgram API key for direct access
Configure in Settings > Remote Transcription.
2. Select compute device (Auto/CUDA/CPU) and compute type
3. Tune VAD sensitivity and timing settings as needed
4. Click **Start Transcription**
### OBS Browser Source Setup
@@ -117,18 +120,42 @@ Configure in Settings > Remote Transcription.
4. Set dimensions (e.g., 1920x300)
5. Check "Shutdown source when not visible" for performance
### Multi-User Mode (Optional)
### Shared Captions (Multi-User)
For syncing transcriptions across multiple users (e.g., multi-host streams or translation teams):
Share live captions across multiple users using the hosted service at `https://caption.shadowdao.com/` — no server setup required.
1. Deploy the Node.js server (see [server/nodejs/README.md](server/nodejs/README.md))
2. In the app settings, enable **Server Sync**
3. Enter the server URL (e.g., `http://your-server:3000/api/send`)
4. Set a room name and passphrase (shared with other users)
5. In OBS, use the server's display URL with your room name:
```
http://your-server:3000/display?room=YOURROOM&timestamps=true&maxlines=50
```
#### Creating a Room
1. Open **Settings** and enable **Shared Captions**
2. Click **Create Room** — this generates a room name and passphrase automatically
3. A **share code** is generated and copied to your clipboard
4. Send the share code to anyone who should join
#### Joining a Room
1. Open **Settings** and enable **Shared Captions**
2. Paste the share code you received into the **"Paste share code to join"** field
3. Click **Join** — the server URL, room, and passphrase are auto-filled
4. Click **Save**
#### Sharing an Existing Room
If you already have a room configured and want to invite others:
1. Open **Settings** and scroll to **Shared Captions**
2. Click **Share Current Room** — generates a share code from your current config and copies it to the clipboard
3. Send the code to others
#### OBS Display for Shared Rooms
In OBS, add a Browser source pointing to the server's display URL:
```
https://caption.shadowdao.com/display?room=YOURROOM&timestamps=true&maxlines=50
```
#### Self-Hosting
You can also self-host the sync server. See [server/nodejs/README.md](server/nodejs/README.md) for setup instructions, then enter your own server URL in the Shared Captions settings.
## Configuration
@@ -144,7 +171,7 @@ Settings are stored at `~/.local-transcription/config.yaml` and can be modified
| `transcription.silero_sensitivity` | VAD sensitivity (0-1, lower = more sensitive) | `0.4` |
| `transcription.post_speech_silence_duration` | Silence before finalizing (seconds) | `0.3` |
| `transcription.continuous_mode` | Fast speaker mode for quick talkers | `false` |
| `remote.mode` | Transcription mode (local/managed/byok) | `local` |
| `remote.mode` | Transcription mode (local/managed/byok) | `byok` |
| `display.show_timestamps` | Show timestamps with transcriptions | `true` |
| `display.fade_after_seconds` | Fade out time (0 = never) | `10` |
| `display.font_source` | Font type (System Font/Web-Safe/Google Font/Custom File) | `System Font` |
@@ -267,6 +294,15 @@ Both workflows require a `BUILD_TOKEN` secret in the repo settings (Gitea API to
## Troubleshooting
### macOS: "App is damaged and can't be opened"
macOS Gatekeeper blocks unsigned applications. Since the app is not yet signed with an Apple Developer certificate, you need to remove the quarantine flag before opening:
```bash
xattr -cr "/Applications/Local Transcription.app"
```
Then open the app normally. You only need to do this once after downloading.
### Model Loading Issues
- Models download automatically on first use to `~/.cache/huggingface/`
- First run requires internet connection

136
SIGNING.md Normal file
View File

@@ -0,0 +1,136 @@
# Code Signing Setup
This document explains how to configure code signing for Local Transcription so that Windows and macOS installers are trusted by the operating system.
## Overview
Without code signing:
- **Windows**: SmartScreen shows "Windows protected your PC" warnings
- **macOS**: Gatekeeper blocks the app — "app can't be opened because it is from an unidentified developer"
The CI/CD workflows are configured to sign automatically when the required secrets are present. Without secrets, builds still work — they just produce unsigned installers.
---
## Windows — Azure Artifact Signing
**Cost**: ~$9.99/month (up to 5,000 signatures)
### 1. Create an Azure Account
Sign up at https://azure.microsoft.com if you don't already have one.
### 2. Set Up Artifact Signing
1. In the Azure Portal, search for **Artifact Signing**
2. Create a new **Artifact Signing Account**
- Choose a region (e.g., West US 2) — note this for the endpoint URL
- The endpoint will be like `https://wus2.codesigning.azure.net/`
3. Complete **Identity Verification** (required before you can create certificate profiles)
4. Create a **Certificate Profile** with type "Public Trust" for code signing
### 3. Create an App Registration (Service Principal)
This allows CI to authenticate to Azure:
1. Go to **Azure Active Directory** > **App registrations** > **New registration**
2. Name it (e.g., `local-transcription-signing`)
3. After creation, note the **Application (client) ID** and **Directory (tenant) ID**
4. Go to **Certificates & secrets** > **New client secret** — note the secret value
5. Grant the app registration the **Artifact Signing Certificate Profile Signer** role on your Artifact Signing Account
### 4. Add Gitea Secrets
In your Gitea repository, go to **Settings** > **Actions** > **Secrets** and add:
| Secret Name | Value |
|-------------|-------|
| `AZURE_CLIENT_ID` | App registration Application (client) ID |
| `AZURE_CLIENT_SECRET` | App registration client secret value |
| `AZURE_TENANT_ID` | Directory (tenant) ID |
| `AZURE_SIGNING_ENDPOINT` | Artifact Signing endpoint URL (e.g., `https://wus2.codesigning.azure.net/`) |
| `AZURE_SIGNING_ACCOUNT` | Artifact Signing account name |
| `AZURE_CERT_PROFILE` | Certificate profile name |
---
## macOS — Apple Developer Code Signing + Notarization
**Cost**: $99/year (Apple Developer Program)
### 1. Enroll in the Apple Developer Program
Sign up at https://developer.apple.com/programs/
### 2. Create a Developer ID Certificate
1. Open **Xcode** > **Settings** > **Accounts** > select your team > **Manage Certificates**
2. Click **+** > **Developer ID Application**
3. Or create via the Apple Developer portal: **Certificates, Identifiers & Profiles** > **Certificates** > **+** > **Developer ID Application**
### 3. Export the Certificate as .p12
1. Open **Keychain Access**
2. Find your **Developer ID Application** certificate
3. Right-click > **Export** > save as `.p12` with a password
4. Base64-encode it:
```bash
base64 -i certificate.p12 | tr -d '\n'
```
### 4. Create an App Store Connect API Key
This is used for notarization (submitting the app to Apple for verification):
1. Go to https://appstoreconnect.apple.com/access/integrations/api
2. Click **Generate API Key**
3. Give it a name and **Developer** role (minimum)
4. Download the `.p8` private key file (you can only download it once)
5. Note the **Key ID** and **Issuer ID** shown on the page
### 5. Find Your Signing Identity
Your signing identity looks like:
```
Developer ID Application: Your Name (TEAMID)
```
You can find it by running:
```bash
security find-identity -v -p codesigning
```
### 6. Add Gitea Secrets
| Secret Name | Value |
|-------------|-------|
| `APPLE_CERTIFICATE` | Base64-encoded .p12 certificate (from step 3) |
| `APPLE_CERTIFICATE_PASSWORD` | Password used when exporting the .p12 |
| `APPLE_SIGNING_IDENTITY` | Full identity string (e.g., `Developer ID Application: Your Name (TEAMID)`) |
| `APPLE_API_KEY` | App Store Connect API Key ID |
| `APPLE_API_ISSUER` | API issuer UUID |
| `APPLE_API_KEY_CONTENT` | Full contents of the `.p8` private key file |
---
## Verifying Signing Works
### Trigger a Build
Both build workflows use `workflow_dispatch`, so you can trigger them manually in Gitea:
1. Go to **Actions** > select the workflow > **Run workflow**
2. Enter the release tag (e.g., `v2.0.15`)
### Check macOS
After installing the `.dmg`, the app should open without any Gatekeeper warnings. You can also verify from the command line:
```bash
codesign -dv --verbose=4 /Applications/Local\ Transcription.app
spctl --assess --type execute /Applications/Local\ Transcription.app
```
### Check Windows
After running the `.msi` or `-setup.exe`, there should be no SmartScreen warning. The installer properties should show your organization name as the publisher.

View File

@@ -73,8 +73,15 @@ class APIServer:
original_state_cb = self.controller.on_state_changed
def on_state_changed(state: str, message: str):
# Isolate the upstream callback so a failure there (e.g. a
# broken stdout pipe in main_headless) cannot propagate into
# _set_state and tear down engine init / reload_engine /
# apply_settings request handling.
if original_state_cb:
try:
original_state_cb(state, message)
except Exception:
pass
self._broadcast_control({"type": "state_changed", "state": state, "message": message})
self.controller.on_state_changed = on_state_changed
@@ -99,11 +106,19 @@ class APIServer:
self.controller.on_credits_low = on_credits_low
def set_event_loop(self, loop: asyncio.AbstractEventLoop):
"""Set the event loop used for broadcasting (call from uvicorn startup)."""
self._event_loop = loop
def _broadcast_control(self, data: dict):
"""Send a message to all connected /ws/control clients."""
if not self.control_connections:
return
loop = getattr(self, '_event_loop', None)
if loop is None:
return
message = json.dumps(data)
disconnected = []
@@ -111,7 +126,7 @@ class APIServer:
try:
asyncio.run_coroutine_threadsafe(
ws.send_text(message),
asyncio.get_event_loop(),
loop,
)
except Exception:
disconnected.append(ws)
@@ -124,6 +139,10 @@ class APIServer:
app = self.app
ctrl = self.controller
@app.on_event("startup")
async def on_startup():
self.set_event_loop(asyncio.get_event_loop())
# ── Status ─────────────────────────────────────────────
@app.get("/api/status")
@@ -139,14 +158,24 @@ class APIServer:
@app.post("/api/start")
async def start_transcription():
success, message = ctrl.start_transcription()
import asyncio
# Run in thread pool to avoid blocking the event loop
# (start_recording can block up to 15s waiting for Deepgram WS)
loop = asyncio.get_event_loop()
success, message = await loop.run_in_executor(
None, ctrl.start_transcription
)
if not success:
raise HTTPException(status_code=400, detail=message)
return {"status": "ok", "message": message}
@app.post("/api/stop")
async def stop_transcription():
success, message = ctrl.stop_transcription()
import asyncio
loop = asyncio.get_event_loop()
success, message = await loop.run_in_executor(
None, ctrl.stop_transcription
)
if not success:
raise HTTPException(status_code=400, detail=message)
return {"status": "ok", "message": message}
@@ -190,7 +219,11 @@ class APIServer:
@app.put("/api/config")
async def update_config(update: ConfigUpdate):
engine_reloaded, message = ctrl.apply_settings(update.settings)
import asyncio
loop = asyncio.get_event_loop()
engine_reloaded, message = await loop.run_in_executor(
None, ctrl.apply_settings, update.settings
)
return {
"status": "ok",
"message": message,
@@ -211,7 +244,11 @@ class APIServer:
@app.post("/api/reload-engine")
async def reload_engine():
success, message = ctrl.reload_engine()
import asyncio
loop = asyncio.get_event_loop()
success, message = await loop.run_in_executor(
None, ctrl.reload_engine
)
if not success:
raise HTTPException(status_code=500, detail=message)
return {"status": "ok", "message": message}
@@ -243,6 +280,7 @@ class APIServer:
data = resp.json()
ctrl.config.set('remote.auth_token', data.get('token', ''))
ctrl.config.set('remote.server_url', req.server_url)
ctrl.config.set('remote.email', req.email)
return {"status": "ok", "token": data.get('token', '')}
else:
raise HTTPException(status_code=resp.status_code, detail=resp.text)

View File

@@ -18,13 +18,18 @@ import sys
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
from client.config import Config
from client.device_utils import DeviceManager
from client.transcription_engine_realtime import RealtimeTranscriptionEngine, TranscriptionResult
from client.models import TranscriptionResult
from client.deepgram_transcription import DeepgramTranscriptionEngine
from client.server_sync import ServerSyncClient
from server.web_display import TranscriptionWebServer
from version import __version__
# Heavy imports (torch, RealtimeSTT, faster-whisper) are deferred so
# the cloud-only sidecar build can exclude them entirely.
# Imported lazily in _initialize_engine() when remote.mode == "local".
RealtimeTranscriptionEngine = None
DeviceManager = None
class AppState:
"""Enum-like class for application states."""
@@ -89,7 +94,24 @@ class AppController:
def __init__(self, config: Optional[Config] = None):
self.config = config or Config()
self.device_manager = DeviceManager()
# DeviceManager is only needed for local Whisper mode.
# Lazy-import to keep the cloud-only sidecar lightweight.
global DeviceManager
if DeviceManager is None:
try:
from client.device_utils import DeviceManager as _DM
DeviceManager = _DM
except ImportError:
DeviceManager = None
self.device_manager = DeviceManager() if DeviceManager else None
self.is_cloud_only = DeviceManager is None
# If this is the cloud-only sidecar and mode is still "local",
# auto-switch to "byok" so the engine doesn't try to load Whisper.
if self.is_cloud_only and self.config.get('remote.mode', 'local') == 'local':
self.config.set('remote.mode', 'byok')
# State
self._state = AppState.INITIALIZING
@@ -243,21 +265,17 @@ class AppController:
def _initialize_engine(self):
"""Initialize the transcription engine in a background thread."""
device_config = self.config.get('transcription.device', 'auto')
self.device_manager.set_device(device_config)
audio_device_str = self.config.get('audio.input_device', 'default')
audio_device = None if audio_device_str == 'default' else int(audio_device_str)
model = self.config.get('transcription.model', 'base.en')
language = self.config.get('transcription.language', 'en')
device = self.device_manager.get_device_for_whisper()
device_config = self.config.get('transcription.device', 'auto')
compute_type = self.config.get('transcription.compute_type', 'default')
self.current_model_size = model
self.current_device_config = device_config
user_name = self.config.get('user.name', 'User')
continuous_mode = self.config.get('transcription.continuous_mode', False)
if continuous_mode:
@@ -274,7 +292,6 @@ class AppController:
if remote_mode in ('managed', 'byok'):
self.transcription_engine = DeepgramTranscriptionEngine(
config=self.config,
user_name=user_name,
input_device_index=audio_device,
)
self.transcription_engine.set_callbacks(
@@ -284,6 +301,27 @@ class AppController:
self.transcription_engine.set_error_callback(self._on_remote_error)
self.transcription_engine.set_credits_low_callback(self._on_credits_low)
else:
# Lazy-import heavy local transcription dependencies
global RealtimeTranscriptionEngine
if RealtimeTranscriptionEngine is None:
try:
from client.transcription_engine_realtime import RealtimeTranscriptionEngine as _RTE
RealtimeTranscriptionEngine = _RTE
except ImportError:
# Cloud-only sidecar -- local engine not available
self._set_state(
AppState.ERROR,
"Local transcription not available in this build. "
"Please switch to Cloud (Deepgram) mode in Settings."
)
return
if self.device_manager:
self.device_manager.set_device(device_config)
device = self.device_manager.get_device_for_whisper()
else:
device = "cpu"
self.transcription_engine = RealtimeTranscriptionEngine(
model=model,
device=device,
@@ -303,7 +341,7 @@ class AppController:
initial_prompt=self.config.get('transcription.initial_prompt', ''),
no_log_file=self.config.get('transcription.no_log_file', True),
input_device_index=audio_device,
user_name=user_name,
app_config=self.config,
)
self.transcription_engine.set_callbacks(
realtime_callback=self._on_realtime_transcription,
@@ -332,6 +370,14 @@ class AppController:
device_display = "Unknown"
self._set_state(AppState.READY, f"Ready | Device: {device_display}")
else:
# Cloud sidecar with no API key -- show helpful setup message
# instead of a scary error. The user needs to enter their key.
if self.is_cloud_only:
self._set_state(
AppState.READY,
"Setup needed: Open Settings > Remote Transcription > enter your Deepgram API key"
)
else:
self._set_state(AppState.ERROR, message)
@@ -348,7 +394,14 @@ class AppController:
try:
success = self.transcription_engine.start_recording()
if not success:
return False, "Failed to start recording"
import logging
# Check if there's a recent error in the logger
err_detail = getattr(self.transcription_engine, '_last_error', '')
msg = f"Failed to start recording"
if err_detail:
msg += f": {err_detail}"
print(f"ERROR: {msg}")
return False, msg
# Start server sync if enabled
if self.config.get('server_sync.enabled', False):
@@ -553,8 +606,17 @@ class AppController:
Returns (engine_reload_needed, message).
"""
if new_config:
for key, value in new_config.items():
self.config.set(key, value)
# Flatten nested dicts into dot-notation keys so we merge
# individual values instead of replacing entire sections
# (e.g. remote.mode instead of overwriting all of remote)
def _flatten(d, prefix=""):
for k, v in d.items():
full_key = f"{prefix}{k}" if not prefix else f"{prefix}.{k}"
if isinstance(v, dict):
_flatten(v, full_key)
else:
self.config.set(full_key, v)
_flatten(new_config)
# Update web server display settings
if self.web_server:
@@ -577,12 +639,18 @@ class AppController:
if self.config.get('server_sync.enabled', False):
self._start_server_sync()
# Check if model/device changed
# Check if model/device/remote mode changed -- any of these require
# a full engine reload since they change which engine class is used
new_model = self.config.get('transcription.model', 'base.en')
new_device = self.config.get('transcription.device', 'auto')
new_remote_mode = self.config.get('remote.mode', 'local')
current_remote_mode = 'local'
if self.transcription_engine:
current_remote_mode = getattr(self.transcription_engine, 'mode', 'local')
engine_reload_needed = (
self.current_model_size != new_model
or self.current_device_config != new_device
or current_remote_mode != new_remote_mode
)
if engine_reload_needed:
@@ -596,7 +664,7 @@ class AppController:
host = self.config.get('web_server.host', '127.0.0.1')
port = self.actual_web_port or self.config.get('web_server.port', 8080)
device_info = self.device_manager.get_device_info()
device_info = self.device_manager.get_device_info() if self.device_manager else []
remote_mode = self.config.get('remote.mode', 'local')
if remote_mode in ('managed', 'byok') and self.transcription_engine:
@@ -621,6 +689,7 @@ class AppController:
"transcription_count": len(self.transcriptions),
"remote_mode": remote_mode,
"server_sync_enabled": self.config.get('server_sync.enabled', False),
"is_cloud_only": self.is_cloud_only,
}
def get_audio_devices(self) -> list[dict]:
@@ -640,10 +709,13 @@ class AppController:
def get_compute_devices(self) -> list[dict]:
"""List available compute devices."""
device_info = self.device_manager.get_device_info()
devices = [{"id": "auto", "name": "Auto-detect"}]
if self.device_manager:
device_info = self.device_manager.get_device_info()
for dev_id, dev_name in device_info:
devices.append({"id": dev_id, "name": dev_name})
else:
devices.append({"id": "cloud", "name": "Cloud (Deepgram)"})
return devices
# ── Update Checking ────────────────────────────────────────────

View File

@@ -75,10 +75,16 @@ def main():
# Create controller and initialize
controller = AppController(config=config)
# Wire a state callback that prints the ready event
# Wire a state callback that prints state events for the parent
# process to read. Stdout writes can fail with EINVAL on Windows
# when the parent stops reading the sidecar pipe; swallow those
# so the engine state machine isn't taken down by a logging path.
def on_state_changed(state, message):
event = {"event": "state", "state": state, "message": message}
try:
print(json.dumps(event), flush=True)
except (OSError, ValueError):
pass
controller.on_state_changed = on_state_changed
@@ -88,11 +94,16 @@ def main():
# Create API server wrapping the controller
api_server = APIServer(controller)
# Determine actual port (web server may have shifted if port was in use)
actual_port = controller.actual_web_port or args.port
# OBS display runs on the configured port, API server on port+1
obs_port = controller.actual_web_port or args.port
api_port = obs_port + 1
# Print ready event so Tauri can discover the port
print(json.dumps({"event": "ready", "port": actual_port}), flush=True)
# Print ready event so Tauri can discover the API port
print(json.dumps({
"event": "ready",
"port": api_port,
"obs_port": obs_port,
}), flush=True)
# Run the API server (blocks)
import uvicorn
@@ -104,7 +115,7 @@ def main():
uvicorn.run(
api_server.app,
host=args.host,
port=actual_port + 1, # API on port+1, OBS display on the main port
port=api_port,
log_level="error",
access_log=False,
)

View File

159
backend/tests/conftest.py Normal file
View File

@@ -0,0 +1,159 @@
"""Shared fixtures for backend tests.
Heavy third-party modules (torch, sounddevice, numpy, RealtimeSTT, etc.) are
stubbed at the *sys.modules* level before any backend code is imported. This
lets the test suite run on a plain Python install without GPU drivers, audio
hardware, or heavyweight ML libraries.
"""
import sys
import types
from pathlib import Path
from unittest.mock import MagicMock
import pytest
# ── Project root on sys.path ────────────────────────────────────────
project_root = Path(__file__).resolve().parent.parent.parent
if str(project_root) not in sys.path:
sys.path.insert(0, str(project_root))
# ── Stub heavy modules before anything imports them ─────────────────
def _stub(name: str) -> types.ModuleType:
"""Create a stub module and register it in sys.modules if not already present."""
if name in sys.modules:
return sys.modules[name]
mod = types.ModuleType(name)
sys.modules[name] = mod
return mod
# numpy -- must behave like a real module for `import numpy as np`
_np = _stub("numpy")
_np.float32 = float
_np.float64 = float
_np.int16 = int
_np.ndarray = MagicMock
_np.array = MagicMock(return_value=MagicMock())
_np.zeros = MagicMock(return_value=MagicMock())
_np.frombuffer = MagicMock(return_value=MagicMock())
# torch + sub-modules
_torch = _stub("torch")
_torch.cuda = MagicMock()
_torch.cuda.is_available = MagicMock(return_value=False)
_torch.backends = MagicMock()
_torch.backends.mps = MagicMock()
_torch.backends.mps.is_available = MagicMock(return_value=False)
_stub("torch.cuda")
_stub("torch.backends")
_stub("torch.backends.mps")
_stub("torchaudio")
# sounddevice
_sd = _stub("sounddevice")
_sd.query_devices = MagicMock(return_value=[])
# RealtimeSTT (imported by transcription_engine_realtime)
_rtstt = _stub("RealtimeSTT")
_rtstt.AudioToTextRecorder = MagicMock
# faster_whisper (sometimes imported transitively)
_stub("faster_whisper")
# noisereduce
_stub("noisereduce")
# scipy
_scipy = _stub("scipy")
_stub("scipy.signal")
_stub("scipy.io")
_stub("scipy.io.wavfile")
# webrtcvad
_stub("webrtcvad")
# openwakeword
_stub("openwakeword")
# pvporcupine
_stub("pvporcupine")
# PySide6 (should not be needed, but just in case)
_stub("PySide6")
_stub("PySide6.QtWidgets")
_stub("PySide6.QtCore")
_stub("PySide6.QtGui")
# websockets
_ws = _stub("websockets")
_ws.connect = MagicMock
# deepgram (cloud transcription)
_stub("deepgram")
# ── Fixtures ────────────────────────────────────────────────────────
@pytest.fixture
def mock_config(tmp_path):
"""Return a Config object backed by a temporary file.
This avoids touching the real user config at ~/.local-transcription/.
"""
config_file = tmp_path / "test_config.yaml"
from client.config import Config
config = Config(config_path=str(config_file))
return config
@pytest.fixture
def controller(mock_config):
"""Return an AppController wired to *mock_config* without starting heavy
subsystems (engine, web server, device manager).
The transcription engine, web server thread, and DeviceManager are all
replaced with lightweight mocks so the test suite can run without a GPU,
audio hardware, or a free network port.
"""
from unittest.mock import patch
with patch("backend.app_controller.DeviceManager") as MockDM, \
patch("backend.app_controller.RealtimeTranscriptionEngine"), \
patch("backend.app_controller.DeepgramTranscriptionEngine"), \
patch("backend.app_controller.TranscriptionWebServer"), \
patch("backend.app_controller.ServerSyncClient"):
# DeviceManager stub
dm_instance = MagicMock()
dm_instance.get_device_info.return_value = [("cpu", "CPU")]
dm_instance.get_device_for_whisper.return_value = "cpu"
MockDM.return_value = dm_instance
from backend.app_controller import AppController
ctrl = AppController(config=mock_config)
yield ctrl
@pytest.fixture
def api_client(controller):
"""Return an httpx.AsyncClient speaking ASGI to the APIServer's FastAPI app.
Usage in tests::
async def test_something(api_client):
resp = await api_client.get("/api/status")
assert resp.status_code == 200
"""
from backend.api_server import APIServer
import httpx
api = APIServer(controller)
transport = httpx.ASGITransport(app=api.app)
client = httpx.AsyncClient(transport=transport, base_url="http://testserver")
return client

View File

@@ -0,0 +1,150 @@
"""Tests for backend.api_server.APIServer REST endpoints."""
import sys
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
import pytest_asyncio
# Ensure project root is on path
project_root = Path(__file__).resolve().parent.parent.parent
sys.path.insert(0, str(project_root))
# ── GET /api/status ─────────────────────────────────────────────────
@pytest.mark.asyncio
async def test_get_status(api_client):
resp = await api_client.get("/api/status")
assert resp.status_code == 200
data = resp.json()
assert "state" in data
assert "is_transcribing" in data
assert "version" in data
assert "web_server" in data
# ── GET /api/config ─────────────────────────────────────────────────
@pytest.mark.asyncio
async def test_get_config(api_client):
resp = await api_client.get("/api/config")
assert resp.status_code == 200
data = resp.json()
# The config should be a dict (the raw config mapping)
assert isinstance(data, dict)
# ── PUT /api/config ─────────────────────────────────────────────────
@pytest.mark.asyncio
async def test_put_config(api_client, controller):
"""Updating config via PUT should persist and return success."""
# Patch reload_engine to avoid heavy lifting
controller.reload_engine = MagicMock(return_value=(True, "ok"))
controller.current_model_size = "base.en"
controller.current_device_config = "auto"
controller.config.set("transcription.model", "base.en")
controller.config.set("transcription.device", "auto")
resp = await api_client.put(
"/api/config",
json={"settings": {"display.font_size": 24}},
)
assert resp.status_code == 200
body = resp.json()
assert body["status"] == "ok"
# Verify the value was actually saved
assert controller.config.get("display.font_size") == 24
# ── POST /api/start (engine not ready) ─────────────────────────────
@pytest.mark.asyncio
async def test_start_when_not_ready(api_client, controller):
"""Starting transcription without an engine should return 400."""
controller.transcription_engine = None
resp = await api_client.post("/api/start")
assert resp.status_code == 400
# ── POST /api/clear ─────────────────────────────────────────────────
@pytest.mark.asyncio
async def test_clear(api_client, controller):
from client.models import TranscriptionResult
from datetime import datetime
controller.transcriptions = [
TranscriptionResult(text="One", is_final=True, timestamp=datetime.now(), user_name="U"),
]
resp = await api_client.post("/api/clear")
assert resp.status_code == 200
body = resp.json()
assert body["cleared"] == 1
assert len(controller.transcriptions) == 0
# ── GET /api/audio-devices ──────────────────────────────────────────
@pytest.mark.asyncio
async def test_get_audio_devices(api_client, controller):
"""Audio devices endpoint should return a list, even when mocked."""
# Mock sounddevice so the test works without audio hardware
with patch("backend.app_controller.AppController.get_audio_devices",
return_value=[{"index": 0, "name": "Mock Mic"}]):
resp = await api_client.get("/api/audio-devices")
assert resp.status_code == 200
data = resp.json()
assert "devices" in data
assert len(data["devices"]) >= 1
# ── GET /api/compute-devices ────────────────────────────────────────
@pytest.mark.asyncio
async def test_get_compute_devices(api_client, controller):
resp = await api_client.get("/api/compute-devices")
assert resp.status_code == 200
data = resp.json()
assert "devices" in data
# At minimum we get the "Auto-detect" entry
assert any(d["id"] == "auto" for d in data["devices"])
# ── GET /api/check-update ──────────────────────────────────────────
@pytest.mark.asyncio
async def test_check_update(api_client, controller):
"""check-update should return a dict with an 'available' key."""
with patch.object(controller, "check_for_updates",
return_value={"available": False, "current_version": "1.0.0"}):
resp = await api_client.get("/api/check-update")
assert resp.status_code == 200
data = resp.json()
assert "available" in data
# ── GET /api/version ────────────────────────────────────────────────
@pytest.mark.asyncio
async def test_version(api_client):
resp = await api_client.get("/api/version")
assert resp.status_code == 200
data = resp.json()
assert "version" in data
# Should be a non-empty string
assert isinstance(data["version"], str)
assert len(data["version"]) > 0

View File

@@ -0,0 +1,183 @@
"""Tests for backend.app_controller.AppController."""
import sys
from datetime import datetime
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
# Ensure project root is on path
project_root = Path(__file__).resolve().parent.parent.parent
sys.path.insert(0, str(project_root))
from backend.app_controller import AppState
# ── basic state ─────────────────────────────────────────────────────
def test_initial_state(controller):
"""A freshly constructed controller should be INITIALIZING and not transcribing."""
assert controller.state == AppState.INITIALIZING
assert controller.is_transcribing is False
# ── start / stop ────────────────────────────────────────────────────
def test_start_transcription_without_engine(controller):
"""Starting transcription before the engine is ready should fail gracefully."""
controller.transcription_engine = None
success, message = controller.start_transcription()
assert success is False
assert "not ready" in message.lower()
def test_start_stop_cycle(controller):
"""Full start -> stop cycle with a mocked engine that reports ready."""
engine = MagicMock()
engine.is_ready.return_value = True
engine.start_recording.return_value = True
controller.transcription_engine = engine
# Start
ok, msg = controller.start_transcription()
assert ok is True
assert controller.is_transcribing is True
assert controller.state == AppState.TRANSCRIBING
# Stop
ok, msg = controller.stop_transcription()
assert ok is True
assert controller.is_transcribing is False
engine.stop_recording.assert_called_once()
def test_double_start_rejected(controller):
"""Calling start_transcription twice should reject the second call."""
engine = MagicMock()
engine.is_ready.return_value = True
engine.start_recording.return_value = True
controller.transcription_engine = engine
controller.start_transcription()
success, message = controller.start_transcription()
assert success is False
assert "already" in message.lower()
# ── transcription storage ───────────────────────────────────────────
def test_clear_transcriptions(controller):
"""clear_transcriptions should empty the list and return the count."""
from client.models import TranscriptionResult
controller.transcriptions = [
TranscriptionResult(text="Hello", is_final=True, timestamp=datetime.now(), user_name="Alice"),
TranscriptionResult(text="World", is_final=True, timestamp=datetime.now(), user_name="Bob"),
]
count = controller.clear_transcriptions()
assert count == 2
assert len(controller.transcriptions) == 0
def test_get_transcriptions_text_with_timestamps(controller):
"""get_transcriptions_text should include [HH:MM:SS] prefixes when requested."""
from client.models import TranscriptionResult
ts = datetime(2025, 1, 15, 10, 30, 45)
controller.transcriptions = [
TranscriptionResult(text="Test line", is_final=True, timestamp=ts, user_name="User"),
]
text = controller.get_transcriptions_text(include_timestamps=True)
assert "[10:30:45]" in text
assert "User:" in text
assert "Test line" in text
# ── settings / engine reload ────────────────────────────────────────
def test_apply_settings_triggers_reload_on_model_change(controller):
"""Changing the transcription model should trigger an engine reload."""
controller.current_model_size = "base.en"
controller.current_device_config = "auto"
# Patch reload_engine so it doesn't actually try to spin up threads
controller.reload_engine = MagicMock(return_value=(True, "reloaded"))
reloaded, msg = controller.apply_settings({
"transcription.model": "small.en",
})
assert reloaded is True
controller.reload_engine.assert_called_once()
def test_apply_settings_no_reload_when_same(controller):
"""If model and device haven't changed, no reload should happen."""
controller.current_model_size = "base.en"
controller.current_device_config = "auto"
# Ensure config returns the same values
controller.config.set("transcription.model", "base.en")
controller.config.set("transcription.device", "auto")
# Remote mode must also match (no engine means current mode is 'local')
controller.config.set("remote.mode", "local")
controller.reload_engine = MagicMock(return_value=(True, "reloaded"))
reloaded, msg = controller.apply_settings({
"display.font_size": 20,
})
assert reloaded is False
controller.reload_engine.assert_not_called()
# ── transcription callbacks ─────────────────────────────────────────
def test_on_final_transcription_callback_fires(controller):
"""_on_final_transcription should append and invoke on_transcription callback."""
from client.models import TranscriptionResult
received = []
controller.on_transcription = lambda data: received.append(data)
controller.is_transcribing = True
controller._set_state(AppState.TRANSCRIBING)
result = TranscriptionResult(
text="Hello world",
is_final=True,
timestamp=datetime.now(),
user_name="Tester",
)
controller._on_final_transcription(result)
assert len(controller.transcriptions) == 1
assert len(received) == 1
assert received[0]["text"] == "Hello world"
assert received[0]["user_name"] == "Tester"
assert received[0]["is_preview"] is False
def test_on_final_transcription_ignored_when_not_transcribing(controller):
"""If the controller is not in transcribing state the callback should be a no-op."""
from client.models import TranscriptionResult
controller.is_transcribing = False
result = TranscriptionResult(
text="Should be ignored",
is_final=True,
timestamp=datetime.now(),
user_name="Ghost",
)
controller._on_final_transcription(result)
assert len(controller.transcriptions) == 0

View File

@@ -0,0 +1,56 @@
"""Tests for backend.main_headless ready-event JSON format."""
import sys
from pathlib import Path
import pytest
# Ensure project root is on path
project_root = Path(__file__).resolve().parent.parent.parent
sys.path.insert(0, str(project_root))
def test_ready_event_reports_api_port_not_obs_port():
"""The ready JSON printed by main_headless must set ``port`` to
``obs_port + 1`` (the API port), not the OBS display port.
From main_headless.py::
obs_port = controller.actual_web_port or args.port
api_port = obs_port + 1
print(json.dumps({
"event": "ready",
"port": api_port,
"obs_port": obs_port,
}), flush=True)
We verify this contract by reading the source and checking the
structure directly (running main() would start a real server).
"""
import ast
import textwrap
source_path = project_root / "backend" / "main_headless.py"
source = source_path.read_text()
# Verify the key relationships exist in the source:
# 1. api_port = obs_port + 1
assert "api_port = obs_port + 1" in source, (
"Expected `api_port = obs_port + 1` in main_headless.py"
)
# 2. The ready event JSON uses api_port for "port", not obs_port
assert '"port": api_port' in source or "'port': api_port" in source, (
"The ready event should report api_port as 'port'"
)
# 3. obs_port is also included separately
assert '"obs_port": obs_port' in source or "'obs_port': obs_port" in source, (
"The ready event should also include 'obs_port'"
)
# 4. Verify the event name
assert '"event": "ready"' in source or "'event': 'ready'" in source, (
"The ready event should have event='ready'"
)

View File

@@ -17,7 +17,7 @@ from datetime import datetime
from queue import Queue, Empty
from typing import Optional, Callable
from client.transcription_engine_realtime import TranscriptionResult
from client.models import TranscriptionResult
logger = logging.getLogger(__name__)
@@ -36,18 +36,16 @@ class DeepgramTranscriptionEngine:
# Construction / configuration
# ------------------------------------------------------------------ #
def __init__(self, config, user_name: str = "User", input_device_index: Optional[int] = None):
def __init__(self, config, input_device_index: Optional[int] = None):
"""
Initialise the engine from a :class:`client.config.Config` object.
Args:
config: Application ``Config`` instance.
user_name: Display name attached to transcriptions.
input_device_index: Index of the audio input device to use
(``None`` for the system default).
"""
self.config = config
self.user_name = user_name
self.input_device_index = input_device_index
# Mode: 'managed' (proxy) or 'byok' (direct Deepgram)
@@ -67,7 +65,7 @@ class DeepgramTranscriptionEngine:
# Audio parameters
self.sample_rate: int = 16000
self.channels: int = 1
self.blocksize: int = 4096
self.blocksize: int = 1024 # ~64ms chunks for lower latency streaming
# Callbacks
self.realtime_callback: Optional[Callable[[TranscriptionResult], None]] = None
@@ -156,17 +154,30 @@ class DeepgramTranscriptionEngine:
return True
self._stop_event.clear()
self._ws_connected = threading.Event()
self._is_recording = True
# Start the asyncio event-loop thread (handles WS send/receive)
self._thread = threading.Thread(target=self._run_event_loop, daemon=True)
self._thread.start()
# Wait for the WebSocket to connect before starting audio capture.
# Without this, audio chunks arrive before the WS is open -> broken pipe.
if not self._ws_connected.wait(timeout=15):
logger.error("Timed out waiting for Deepgram WebSocket connection")
print("ERROR: Timed out waiting for Deepgram WebSocket connection")
self._last_error = "Timed out connecting to Deepgram"
self._is_recording = False
self._stop_event.set()
return False
# Start the audio capture stream
try:
self._start_audio_stream()
except Exception as exc:
logger.error("Failed to open audio stream: %s", exc)
print(f"ERROR: Failed to open audio stream: {exc}")
self._last_error = f"Audio stream error: {exc}"
self._is_recording = False
self._stop_event.set()
return False
@@ -283,6 +294,11 @@ class DeepgramTranscriptionEngine:
if not await self._managed_handshake():
return
# Signal that the WebSocket is connected and ready
logger.info("WebSocket connected to Deepgram")
if hasattr(self, '_ws_connected'):
self._ws_connected.set()
# Run send and receive concurrently
await asyncio.gather(
self._send_loop(),
@@ -302,9 +318,13 @@ class DeepgramTranscriptionEngine:
def _build_ws_url_and_headers(self):
"""Return ``(url, headers)`` depending on the current mode."""
if self.mode == "managed":
# Ensure the server URL uses wss:// and append the path
# Convert HTTP(S) URLs to WS(S) for WebSocket connection
url = self.server_url.rstrip("/")
if not url.startswith("ws://") and not url.startswith("wss://"):
if url.startswith("https://"):
url = "wss://" + url[len("https://"):]
elif url.startswith("http://"):
url = "ws://" + url[len("http://"):]
elif not url.startswith("ws://") and not url.startswith("wss://"):
url = f"wss://{url}"
url = f"{url}/ws/transcribe"
return url, {}
@@ -314,6 +334,8 @@ class DeepgramTranscriptionEngine:
f"model={self.deepgram_model}"
f"&language={self.language}"
"&interim_results=true"
"&punctuate=true"
"&smart_format=true"
"&encoding=linear16"
f"&sample_rate={self.sample_rate}"
f"&channels={self.channels}"
@@ -370,10 +392,16 @@ class DeepgramTranscriptionEngine:
async def _send_loop(self):
"""Drain the audio queue and push raw PCM bytes over the WebSocket."""
loop = asyncio.get_event_loop()
while not self._stop_event.is_set():
try:
pcm_bytes = self._audio_queue.get(timeout=0.1)
except Empty:
# Use run_in_executor to avoid blocking the async event loop
# (which would stall the receive loop and delay transcriptions)
pcm_bytes = await asyncio.wait_for(
loop.run_in_executor(None, lambda: self._audio_queue.get(timeout=0.5)),
timeout=1.0,
)
except (Empty, asyncio.TimeoutError):
continue
try:
@@ -424,7 +452,7 @@ class DeepgramTranscriptionEngine:
text=text,
is_final=is_final,
timestamp=datetime.now(),
user_name=self.user_name,
user_name=self.config.get('user.name', 'User'),
)
if is_final:
if self.final_callback:
@@ -475,7 +503,7 @@ class DeepgramTranscriptionEngine:
text=transcript,
is_final=is_final,
timestamp=datetime.now(),
user_name=self.user_name,
user_name=self.config.get('user.name', 'User'),
)
if is_final:
if self.final_callback:
@@ -506,10 +534,6 @@ class DeepgramTranscriptionEngine:
pass
self._ws = None
def set_user_name(self, user_name: str):
"""Update the user name attached to future transcriptions."""
self.user_name = user_name
def is_recording_active(self) -> bool:
"""Return ``True`` if audio is currently being captured."""
return self._is_recording

29
client/models.py Normal file
View File

@@ -0,0 +1,29 @@
"""Shared data models used across transcription engines."""
from datetime import datetime
class TranscriptionResult:
"""Represents a transcription result."""
def __init__(self, text: str, is_final: bool, timestamp: datetime, user_name: str = ""):
"""
Initialize transcription result.
Args:
text: Transcribed text
is_final: Whether this is a final transcription or realtime preview
timestamp: Timestamp of transcription
user_name: Name of the user/speaker
"""
self.text = text.strip()
self.is_final = is_final
self.timestamp = timestamp
self.user_name = user_name
def __repr__(self) -> str:
time_str = self.timestamp.strftime("%H:%M:%S")
prefix = "[FINAL]" if self.is_final else "[PREVIEW]"
if self.user_name and self.user_name.strip():
return f"{prefix} [{time_str}] {self.user_name}: {self.text}"
return f"{prefix} [{time_str}] {self.text}"

0
client/tests/__init__.py Normal file
View File

View File

@@ -0,0 +1,78 @@
"""Tests for client.config.Config."""
import sys
from pathlib import Path
import pytest
# Ensure project root is on path
project_root = Path(__file__).resolve().parent.parent.parent
sys.path.insert(0, str(project_root))
from client.config import Config
@pytest.fixture
def cfg(tmp_path):
"""A Config backed by a temp file so we never touch the real user config."""
return Config(config_path=str(tmp_path / "test_config.yaml"))
# ── dot-notation get ────────────────────────────────────────────────
def test_dot_notation_get(cfg):
"""Config.get should traverse nested dicts using dot-separated keys."""
cfg.config = {"audio": {"sample_rate": 16000}}
assert cfg.get("audio.sample_rate") == 16000
# ── dot-notation set ────────────────────────────────────────────────
def test_dot_notation_set(cfg):
"""Config.set should create/update nested values via dot-separated keys."""
cfg.set("audio.sample_rate", 44100)
assert cfg.config["audio"]["sample_rate"] == 44100
# Also readable via .get
assert cfg.get("audio.sample_rate") == 44100
# ── missing key returns default ─────────────────────────────────────
def test_missing_key_returns_default(cfg):
"""Accessing a nonexistent key should return the supplied default."""
assert cfg.get("nonexistent.path", "fallback") == "fallback"
assert cfg.get("also.missing") is None # default default is None
# ── nested set creates intermediate dicts ───────────────────────────
def test_nested_set_creates_path(cfg):
"""Setting a deeply nested key should create all intermediate dicts."""
cfg.config = {}
cfg.set("a.b.c.d", 42)
assert cfg.config["a"]["b"]["c"]["d"] == 42
assert cfg.get("a.b.c.d") == 42
# ── save and reload round-trip ──────────────────────────────────────
def test_save_and_reload(tmp_path):
"""Values persisted via save() should survive a fresh Config load."""
config_file = str(tmp_path / "roundtrip.yaml")
# Create and populate
cfg1 = Config(config_path=config_file)
cfg1.set("user.name", "TestUser")
cfg1.set("transcription.model", "tiny.en")
# Load a fresh instance from the same file
cfg2 = Config(config_path=config_file)
assert cfg2.get("user.name") == "TestUser"
assert cfg2.get("transcription.model") == "tiny.en"

View File

@@ -8,30 +8,8 @@ from threading import Lock
import logging
class TranscriptionResult:
"""Represents a transcription result."""
def __init__(self, text: str, is_final: bool, timestamp: datetime, user_name: str = ""):
"""
Initialize transcription result.
Args:
text: Transcribed text
is_final: Whether this is a final transcription or realtime preview
timestamp: Timestamp of transcription
user_name: Name of the user/speaker
"""
self.text = text.strip()
self.is_final = is_final
self.timestamp = timestamp
self.user_name = user_name
def __repr__(self) -> str:
time_str = self.timestamp.strftime("%H:%M:%S")
prefix = "[FINAL]" if self.is_final else "[PREVIEW]"
if self.user_name and self.user_name.strip():
return f"{prefix} [{time_str}] {self.user_name}: {self.text}"
return f"{prefix} [{time_str}] {self.text}"
# Re-export TranscriptionResult from the shared models module for backward compatibility
from client.models import TranscriptionResult # noqa: F401
def to_dict(self) -> dict:
"""Convert to dictionary."""
@@ -80,8 +58,8 @@ class RealtimeTranscriptionEngine:
no_log_file: bool = True,
# Audio device
input_device_index: Optional[int] = None,
# User name
user_name: str = ""
# App config (for reading user.name at transcription time)
app_config=None
):
"""
Initialize RealtimeSTT transcription engine.
@@ -104,7 +82,7 @@ class RealtimeTranscriptionEngine:
initial_prompt: Optional prompt to guide transcription
no_log_file: Disable RealtimeSTT logging
input_device_index: Audio input device index
user_name: User name for transcriptions
app_config: App Config object for reading user.name dynamically
"""
self.model = model
self.language = language
@@ -122,7 +100,7 @@ class RealtimeTranscriptionEngine:
self.enable_realtime = enable_realtime_transcription
self.realtime_model = realtime_model
self.realtime_processing_pause = realtime_processing_pause
self.user_name = user_name
self.app_config = app_config
# Callbacks
self.realtime_callback: Optional[Callable[[TranscriptionResult], None]] = None
@@ -184,6 +162,11 @@ class RealtimeTranscriptionEngine:
self.realtime_callback = realtime_callback
self.final_callback = final_callback
def _get_user_name(self) -> str:
if self.app_config:
return self.app_config.get('user.name', '')
return ''
def _on_realtime_transcription(self, text: str):
"""Internal callback for realtime transcriptions."""
if self.realtime_callback and text.strip():
@@ -191,7 +174,7 @@ class RealtimeTranscriptionEngine:
text=text,
is_final=False,
timestamp=datetime.now(),
user_name=self.user_name
user_name=self._get_user_name()
)
self.realtime_callback(result)
@@ -202,7 +185,7 @@ class RealtimeTranscriptionEngine:
text=text,
is_final=True,
timestamp=datetime.now(),
user_name=self.user_name
user_name=self._get_user_name()
)
self.final_callback(result)
@@ -428,10 +411,6 @@ class RealtimeTranscriptionEngine:
if self.is_recording:
print("VAD settings updated. Restart transcription to apply changes.")
def set_user_name(self, user_name: str):
"""Set the user name for transcriptions."""
self.user_name = user_name
def __repr__(self) -> str:
return f"RealtimeTranscriptionEngine(model={self.model}, device={self.device}, running={self.is_recording})"

View File

@@ -42,7 +42,7 @@ transcription:
server_sync:
enabled: false
url: "http://localhost:3000/api/send"
url: ""
room: "default"
passphrase: ""
# Font settings are now in the display section (shared for local and server sync)
@@ -69,9 +69,10 @@ web_server:
host: "127.0.0.1"
remote:
mode: local # local | managed | byok
server_url: "" # Proxy server URL for managed mode (e.g., wss://your-proxy.com)
mode: byok # local | managed | byok
server_url: "https://transcribe.shadowdao.com" # Proxy server URL for managed mode
auth_token: "" # JWT stored after login (managed mode)
email: "" # Email of the logged-in managed-mode account (for UI display)
byok_api_key: "" # Deepgram API key for BYOK mode
deepgram_model: nova-2 # Deepgram model to use
language: en-US # Language code

View File

@@ -401,7 +401,6 @@ class MainWindow(QMainWindow):
# Use Deepgram-based remote transcription
self.transcription_engine = DeepgramTranscriptionEngine(
config=self.config,
user_name=user_name,
input_device_index=audio_device
)
self.transcription_engine.set_callbacks(
@@ -431,7 +430,7 @@ class MainWindow(QMainWindow):
initial_prompt=self.config.get('transcription.initial_prompt', ''),
no_log_file=self.config.get('transcription.no_log_file', True),
input_device_index=audio_device,
user_name=user_name
app_config=self.config
)
# Set up callbacks for transcription results

View File

@@ -0,0 +1,169 @@
# -*- mode: python ; coding: utf-8 -*-
"""PyInstaller spec file for cloud-only Local Transcription backend.
This builds a lightweight sidecar (~50MB) that only supports Deepgram
cloud transcription (managed + BYOK). No local Whisper models, no
PyTorch, no CUDA -- just audio capture and WebSocket streaming.
"""
import sys
import os
block_cipher = None
is_windows = sys.platform == 'win32'
from PyInstaller.utils.hooks import collect_submodules, collect_data_files
# Data files
datas = [
('config/default_config.yaml', 'config'),
]
# Collect sounddevice's bundled PortAudio library (_sounddevice_data)
try:
import sounddevice
sd_path = os.path.dirname(sounddevice.__file__)
sd_data = os.path.join(sd_path, '_sounddevice_data')
if os.path.exists(sd_data):
datas.append((sd_data, '_sounddevice_data'))
print(f" + Collected sounddevice PortAudio data from {sd_data}")
# Also collect the package itself
sd_datas = collect_data_files('sounddevice')
if sd_datas:
datas += sd_datas
print(f" + Collected {len(sd_datas)} sounddevice data files")
except ImportError:
print(" - Warning: sounddevice not found")
# Hidden imports -- only lightweight deps needed for Deepgram streaming
hiddenimports = [
'sounddevice',
'_sounddevice_data',
'numpy',
# FastAPI and dependencies
'fastapi',
'fastapi.routing',
'fastapi.responses',
'starlette',
'starlette.applications',
'starlette.routing',
'starlette.responses',
'starlette.websockets',
'starlette.middleware',
'starlette.middleware.cors',
'pydantic',
'pydantic.fields',
'pydantic.main',
'anyio',
'anyio._backends',
'anyio._backends._asyncio',
'sniffio',
# Uvicorn
'uvicorn',
'uvicorn.logging',
'uvicorn.loops',
'uvicorn.loops.auto',
'uvicorn.protocols',
'uvicorn.protocols.http',
'uvicorn.protocols.http.auto',
'uvicorn.protocols.http.h11_impl',
'uvicorn.protocols.websockets',
'uvicorn.protocols.websockets.auto',
'uvicorn.protocols.websockets.wsproto_impl',
'uvicorn.lifespan',
'uvicorn.lifespan.on',
'h11',
'websockets',
'websockets.legacy',
'websockets.legacy.server',
# HTTP client
'requests',
'urllib3',
'certifi',
'charset_normalizer',
]
# Collect submodules for key packages
print("Collecting submodules for cloud backend packages...")
for package in ['fastapi', 'starlette', 'pydantic', 'pydantic_core', 'anyio', 'uvicorn', 'websockets', 'h11']:
try:
submodules = collect_submodules(package)
hiddenimports += submodules
print(f" + Collected {len(submodules)} submodules from {package}")
except Exception as e:
print(f" - Warning: Could not collect {package}: {e}")
# Collect data files
for package in ['fastapi', 'starlette', 'pydantic', 'uvicorn']:
try:
data_files = collect_data_files(package)
if data_files:
datas += data_files
except Exception:
pass
# Pydantic critical deps
hiddenimports += [
'colorsys', 'decimal', 'json', 'ipaddress', 'pathlib', 'uuid',
'email.message', 'typing_extensions',
]
a = Analysis(
['backend/main_headless.py'],
pathex=[],
binaries=[],
datas=datas,
hiddenimports=hiddenimports,
hookspath=['hooks'],
hooksconfig={},
runtime_hooks=[],
excludes=[
# Exclude all heavy ML/local transcription deps
'torch', 'torchaudio', 'torchvision',
'faster_whisper', 'ctranslate2',
'RealtimeSTT', 'webrtcvad', 'webrtcvad_wheels',
'silero_vad', 'onnxruntime',
'openwakeword', 'pvporcupine', 'pyaudio',
'noisereduce', 'scipy',
# Exclude GUI frameworks
'PySide6', 'PyQt5', 'PyQt6', 'tkinter',
# Exclude other unnecessary heavy packages
'matplotlib', 'PIL', 'cv2',
],
win_no_prefer_redirects=False,
win_private_assemblies=False,
cipher=block_cipher,
noarchive=False,
)
pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)
exe = EXE(
pyz,
a.scripts,
[],
exclude_binaries=True,
name='local-transcription-backend',
debug=False,
bootloader_ignore_signals=False,
strip=False,
upx=True,
console=True,
disable_windowed_traceback=False,
argv_emulation=False,
target_arch=None,
codesign_identity=None,
entitlements_file=None,
icon='LocalTranscription.ico' if is_windows else None,
)
coll = COLLECT(
exe,
a.binaries,
a.zipfiles,
a.datas,
strip=False,
upx=True,
upx_exclude=[],
name='local-transcription-backend',
)

View File

@@ -38,6 +38,21 @@ datas = [
(vad_assets_path, 'faster_whisper/assets'),
] + pvporcupine_data_files
# Collect sounddevice's bundled PortAudio library (_sounddevice_data)
try:
import sounddevice
sd_path = os.path.dirname(sounddevice.__file__)
sd_data = os.path.join(sd_path, '_sounddevice_data')
if os.path.exists(sd_data):
datas.append((sd_data, '_sounddevice_data'))
print(f" + Collected sounddevice PortAudio data from {sd_data}")
sd_datas = collect_data_files('sounddevice')
if sd_datas:
datas += sd_datas
print(f" + Collected {len(sd_datas)} sounddevice data files")
except ImportError:
print(" - Warning: sounddevice not found")
# Hidden imports -- NO PySide6/Qt needed for headless backend
hiddenimports = [
# Transcription engine
@@ -46,6 +61,7 @@ hiddenimports = [
'faster_whisper.vad',
'ctranslate2',
'sounddevice',
'_sounddevice_data',
'scipy',
'scipy.signal',
'numpy',

View File

@@ -90,7 +90,7 @@ class TranscriptionCLI:
initial_prompt=self.config.get('transcription.initial_prompt', ''),
no_log_file=True,
input_device_index=audio_device,
user_name=user_name
app_config=self.config
)
# Set up callbacks

1085
package-lock.json generated

File diff suppressed because it is too large Load Diff

View File

@@ -1,7 +1,7 @@
{
"name": "local-transcription",
"private": true,
"version": "1.4.1",
"version": "2.0.20",
"type": "module",
"scripts": {
"dev": "vite dev",
@@ -12,16 +12,19 @@
"devDependencies": {
"@sveltejs/vite-plugin-svelte": "^5.0.0",
"@tauri-apps/cli": "^2.0.0",
"@testing-library/svelte": "^5.3.1",
"@tsconfig/svelte": "^5.0.0",
"jsdom": "^29.0.2",
"svelte": "^5.0.0",
"svelte-check": "^4.0.0",
"typescript": "~5.6.0",
"vite": "^6.0.0"
"vite": "^6.0.0",
"vitest": "^4.1.3"
},
"dependencies": {
"@tauri-apps/api": "^2.0.0",
"@tauri-apps/plugin-dialog": "^2.0.0",
"@tauri-apps/plugin-shell": "^2.0.0",
"@tauri-apps/plugin-process": "^2.0.0"
"@tauri-apps/plugin-process": "^2.0.0",
"@tauri-apps/plugin-shell": "^2.0.0"
}
}

View File

@@ -1,6 +1,6 @@
[project]
name = "local-transcription"
version = "1.0.1"
version = "1.0.15"
description = "A standalone desktop application for real-time speech-to-text transcription using Whisper models"
readme = "README.md"
requires-python = ">=3.9"

View File

@@ -703,6 +703,36 @@ app.post('/api/send', async (req, res) => {
}
});
// Create room explicitly (no transcription needed)
app.post('/api/create-room', async (req, res) => {
try {
const { room, passphrase } = req.body;
if (!room || !passphrase) {
return res.status(400).json({ error: 'Missing room or passphrase' });
}
// Check if room already exists
const existing = await loadRoom(room);
if (existing) {
const valid = await verifyPassphrase(room, passphrase);
if (!valid) {
return res.status(401).json({ error: 'Room exists with different passphrase' });
}
return res.json({ status: 'ok', room, created: false, message: 'Room already exists' });
}
// Create the room (verifyPassphrase creates it if it doesn't exist)
await verifyPassphrase(room, passphrase);
console.log(`[Room] Created room "${room}"`);
res.json({ status: 'ok', room, created: true });
} catch (err) {
console.error('Error in /api/create-room:', err);
res.status(500).json({ error: err.message });
}
});
// List transcriptions
app.get('/api/list', async (req, res) => {
try {

498
src-tauri/Cargo.lock generated
View File

@@ -47,6 +47,15 @@ version = "1.0.102"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7f202df86484c868dbad7eaa557ef785d5c66295e41b460ef922eca0723b842c"
[[package]]
name = "arbitrary"
version = "1.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c3d036a3c4ab069c7b410a2ce876bd74808d2d0888a82667669f8e783a898bf1"
dependencies = [
"derive_arbitrary",
]
[[package]]
name = "atk"
version = "0.18.2"
@@ -307,8 +316,10 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c673075a2e0e5f4a1dde27ce9dee1ea4558c7ffe648f576438a20ca1d2acc4b0"
dependencies = [
"iana-time-zone",
"js-sys",
"num-traits",
"serde",
"wasm-bindgen",
"windows-link 0.2.1",
]
@@ -338,6 +349,16 @@ dependencies = [
"version_check",
]
[[package]]
name = "core-foundation"
version = "0.9.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "91e195e091a93c46f7102ec7818a2aa394e1e1771c3ab4825963fa03e45afb8f"
dependencies = [
"core-foundation-sys",
"libc",
]
[[package]]
name = "core-foundation"
version = "0.10.1"
@@ -361,9 +382,9 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "064badf302c3194842cf2c5d61f56cc88e54a759313879cdf03abdd27d0c3b97"
dependencies = [
"bitflags 2.11.0",
"core-foundation",
"core-foundation 0.10.1",
"core-graphics-types",
"foreign-types",
"foreign-types 0.5.0",
"libc",
]
@@ -374,7 +395,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3d44a101f213f6c4cdc1853d4b78aef6db6bdfa3468798cc1d9912f4735013eb"
dependencies = [
"bitflags 2.11.0",
"core-foundation",
"core-foundation 0.10.1",
"libc",
]
@@ -515,6 +536,17 @@ dependencies = [
"serde_core",
]
[[package]]
name = "derive_arbitrary"
version = "1.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1e567bd82dcff979e4b03460c307b3cdc9e96fde3d73bed1496d2bc75d9dd62a"
dependencies = [
"proc-macro2",
"quote",
"syn 2.0.117",
]
[[package]]
name = "derive_more"
version = "0.99.20"
@@ -792,6 +824,15 @@ version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "77ce24cb58228fbb8aa041425bb1050850ac19177686ea6e0f41a70416f56fdb"
[[package]]
name = "foreign-types"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f6f339eb8adc052cd2ca78910fda869aefa38d22d5cb648e6485e4d3fc06f3b1"
dependencies = [
"foreign-types-shared 0.1.1",
]
[[package]]
name = "foreign-types"
version = "0.5.0"
@@ -799,7 +840,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d737d9aa519fb7b749cbc3b962edcf310a8dd1f4b67c91c4f83975dbdd17d965"
dependencies = [
"foreign-types-macros",
"foreign-types-shared",
"foreign-types-shared 0.3.1",
]
[[package]]
@@ -813,6 +854,12 @@ dependencies = [
"syn 2.0.117",
]
[[package]]
name = "foreign-types-shared"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "00b0228411908ca8685dba7fc2cdd70ec9990a6e753e89b6ac91a84c40fbaf4b"
[[package]]
name = "foreign-types-shared"
version = "0.3.1"
@@ -1222,6 +1269,25 @@ dependencies = [
"syn 2.0.117",
]
[[package]]
name = "h2"
version = "0.4.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2f44da3a8150a6703ed5d34e164b875fd14c2cdab9af1252a9a1020bde2bdc54"
dependencies = [
"atomic-waker",
"bytes",
"fnv",
"futures-core",
"futures-sink",
"http",
"indexmap 2.13.1",
"slab",
"tokio",
"tokio-util",
"tracing",
]
[[package]]
name = "hashbrown"
version = "0.12.3"
@@ -1332,6 +1398,7 @@ dependencies = [
"bytes",
"futures-channel",
"futures-core",
"h2",
"http",
"http-body",
"httparse",
@@ -1342,6 +1409,38 @@ dependencies = [
"want",
]
[[package]]
name = "hyper-rustls"
version = "0.27.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e3c93eb611681b207e1fe55d5a71ecf91572ec8a6705cdb6857f7d8d5242cf58"
dependencies = [
"http",
"hyper",
"hyper-util",
"rustls",
"rustls-pki-types",
"tokio",
"tokio-rustls",
"tower-service",
]
[[package]]
name = "hyper-tls"
version = "0.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "70206fc6890eaca9fde8a0bf71caa2ddfc9fe045ac9e5c70df101a7dbde866e0"
dependencies = [
"bytes",
"http-body-util",
"hyper",
"hyper-util",
"native-tls",
"tokio",
"tokio-native-tls",
"tower-service",
]
[[package]]
name = "hyper-util"
version = "0.1.20"
@@ -1360,9 +1459,11 @@ dependencies = [
"percent-encoding",
"pin-project-lite",
"socket2",
"system-configuration",
"tokio",
"tower-service",
"tracing",
"windows-registry",
]
[[package]]
@@ -1766,6 +1867,12 @@ dependencies = [
"libc",
]
[[package]]
name = "linux-raw-sys"
version = "0.12.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32a66949e030da00e8c7d4434b251670a91556f4144941d37452769c25d58a53"
[[package]]
name = "litemap"
version = "0.8.2"
@@ -1774,8 +1881,12 @@ checksum = "92daf443525c4cce67b150400bc2316076100ce0b3686209eb8cf3c31612e6f0"
[[package]]
name = "local-transcription"
version = "1.4.0"
version = "2.0.12"
dependencies = [
"bytes",
"chrono",
"futures-util",
"reqwest 0.12.28",
"serde",
"serde_json",
"tauri",
@@ -1783,6 +1894,9 @@ dependencies = [
"tauri-plugin-dialog",
"tauri-plugin-process",
"tauri-plugin-shell",
"tempfile",
"tokio",
"zip",
]
[[package]]
@@ -1911,6 +2025,23 @@ dependencies = [
"windows-sys 0.60.2",
]
[[package]]
name = "native-tls"
version = "0.2.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "465500e14ea162429d264d44189adc38b199b62b1c21eea9f69e4b73cb03bbf2"
dependencies = [
"libc",
"log",
"openssl",
"openssl-probe",
"openssl-sys",
"schannel",
"security-framework",
"security-framework-sys",
"tempfile",
]
[[package]]
name = "ndk"
version = "0.9.0"
@@ -2132,6 +2263,50 @@ dependencies = [
"pathdiff",
]
[[package]]
name = "openssl"
version = "0.10.76"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "951c002c75e16ea2c65b8c7e4d3d51d5530d8dfa7d060b4776828c88cfb18ecf"
dependencies = [
"bitflags 2.11.0",
"cfg-if",
"foreign-types 0.3.2",
"libc",
"once_cell",
"openssl-macros",
"openssl-sys",
]
[[package]]
name = "openssl-macros"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a948666b637a0f465e8564c73e89d4dde00d72d4d473cc972f390fc3dcee7d9c"
dependencies = [
"proc-macro2",
"quote",
"syn 2.0.117",
]
[[package]]
name = "openssl-probe"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7c87def4c32ab89d880effc9e097653c8da5d6ef28e6b539d313baaacfbafcbe"
[[package]]
name = "openssl-sys"
version = "0.9.112"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "57d55af3b3e226502be1526dfdba67ab0e9c96fc293004e79576b2b9edb0dbdb"
dependencies = [
"cc",
"libc",
"pkg-config",
"vcpkg",
]
[[package]]
name = "option-ext"
version = "0.2.0"
@@ -2727,6 +2902,49 @@ version = "0.8.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dc897dd8d9e8bd1ed8cdad82b5966c3e0ecae09fb1907d58efaa013543185d0a"
[[package]]
name = "reqwest"
version = "0.12.28"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "eddd3ca559203180a307f12d114c268abf583f59b03cb906fd0b3ff8646c1147"
dependencies = [
"base64 0.22.1",
"bytes",
"encoding_rs",
"futures-core",
"futures-util",
"h2",
"http",
"http-body",
"http-body-util",
"hyper",
"hyper-rustls",
"hyper-tls",
"hyper-util",
"js-sys",
"log",
"mime",
"native-tls",
"percent-encoding",
"pin-project-lite",
"rustls-pki-types",
"serde",
"serde_json",
"serde_urlencoded",
"sync_wrapper",
"tokio",
"tokio-native-tls",
"tokio-util",
"tower",
"tower-http",
"tower-service",
"url",
"wasm-bindgen",
"wasm-bindgen-futures",
"wasm-streams 0.4.2",
"web-sys",
]
[[package]]
name = "reqwest"
version = "0.13.2"
@@ -2757,7 +2975,7 @@ dependencies = [
"url",
"wasm-bindgen",
"wasm-bindgen-futures",
"wasm-streams",
"wasm-streams 0.5.0",
"web-sys",
]
@@ -2785,6 +3003,20 @@ dependencies = [
"windows-sys 0.60.2",
]
[[package]]
name = "ring"
version = "0.17.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a4689e6c2294d81e88dc6261c768b63bc4fcdb852be6d1352498b114f61383b7"
dependencies = [
"cc",
"cfg-if",
"getrandom 0.2.17",
"libc",
"untrusted",
"windows-sys 0.52.0",
]
[[package]]
name = "rustc-hash"
version = "2.1.2"
@@ -2800,12 +3032,64 @@ dependencies = [
"semver",
]
[[package]]
name = "rustix"
version = "1.1.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b6fe4565b9518b83ef4f91bb47ce29620ca828bd32cb7e408f0062e9930ba190"
dependencies = [
"bitflags 2.11.0",
"errno",
"libc",
"linux-raw-sys",
"windows-sys 0.61.2",
]
[[package]]
name = "rustls"
version = "0.23.37"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "758025cb5fccfd3bc2fd74708fd4682be41d99e5dff73c377c0646c6012c73a4"
dependencies = [
"once_cell",
"rustls-pki-types",
"rustls-webpki",
"subtle",
"zeroize",
]
[[package]]
name = "rustls-pki-types"
version = "1.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "be040f8b0a225e40375822a563fa9524378b9d63112f53e19ffff34df5d33fdd"
dependencies = [
"zeroize",
]
[[package]]
name = "rustls-webpki"
version = "0.103.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "df33b2b81ac578cabaf06b89b0631153a3f416b0a886e8a7a1707fb51abbd1ef"
dependencies = [
"ring",
"rustls-pki-types",
"untrusted",
]
[[package]]
name = "rustversion"
version = "1.0.22"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b39cdef0fa800fc44525c84ccb54a029961a8215f9619753635a9c0d2538d46d"
[[package]]
name = "ryu"
version = "1.0.23"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9774ba4a74de5f7b1c1451ed6cd5285a32eddb5cccb8cc655a4e50009e06477f"
[[package]]
name = "same-file"
version = "1.0.6"
@@ -2815,6 +3099,15 @@ dependencies = [
"winapi-util",
]
[[package]]
name = "schannel"
version = "0.1.29"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "91c1b7e4904c873ef0710c1f407dde2e6287de2bebc1bbbf7d430bb7cbffd939"
dependencies = [
"windows-sys 0.61.2",
]
[[package]]
name = "schemars"
version = "0.8.22"
@@ -2872,6 +3165,29 @@ version = "1.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "94143f37725109f92c262ed2cf5e59bce7498c01bcc1502d7b9afe439a4e9f49"
[[package]]
name = "security-framework"
version = "3.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b7f4bc775c73d9a02cde8bf7b2ec4c9d12743edf609006c7facc23998404cd1d"
dependencies = [
"bitflags 2.11.0",
"core-foundation 0.10.1",
"core-foundation-sys",
"libc",
"security-framework-sys",
]
[[package]]
name = "security-framework-sys"
version = "2.17.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6ce2691df843ecc5d231c0b14ece2acc3efb62c0a398c7e1d875f3983ce020e3"
dependencies = [
"core-foundation-sys",
"libc",
]
[[package]]
name = "selectors"
version = "0.24.0"
@@ -3014,6 +3330,18 @@ dependencies = [
"serde_core",
]
[[package]]
name = "serde_urlencoded"
version = "0.7.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d3491c14715ca2294c4d6a88f15e84739788c1d030eed8c110436aafdaa2f3fd"
dependencies = [
"form_urlencoded",
"itoa",
"ryu",
"serde",
]
[[package]]
name = "serde_with"
version = "3.18.0"
@@ -3294,6 +3622,12 @@ version = "0.11.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7da8b5736845d9f2fcb837ea5d9e2628564b3b043a70948a3f0b778838c5fb4f"
[[package]]
name = "subtle"
version = "2.6.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "13c2bddecc57b384dee18652358fb23172facb8a2c51ccc10d74c157bdea3292"
[[package]]
name = "swift-rs"
version = "1.0.7"
@@ -3347,6 +3681,27 @@ dependencies = [
"syn 2.0.117",
]
[[package]]
name = "system-configuration"
version = "0.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a13f3d0daba03132c0aa9767f98351b3488edc2c100cda2d2ec2b04f3d8d3c8b"
dependencies = [
"bitflags 2.11.0",
"core-foundation 0.9.4",
"system-configuration-sys",
]
[[package]]
name = "system-configuration-sys"
version = "0.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8e1d1b10ced5ca923a1fcb8d03e96b8d3268065d724548c0211415ff6ac6bac4"
dependencies = [
"core-foundation-sys",
"libc",
]
[[package]]
name = "system-deps"
version = "6.2.2"
@@ -3368,7 +3723,7 @@ checksum = "9103edf55f2da3c82aea4c7fab7c4241032bfeea0e71fa557d98e00e7ce7cc20"
dependencies = [
"bitflags 2.11.0",
"block2",
"core-foundation",
"core-foundation 0.10.1",
"core-graphics",
"crossbeam-channel",
"dispatch2",
@@ -3445,7 +3800,7 @@ dependencies = [
"percent-encoding",
"plist",
"raw-window-handle",
"reqwest",
"reqwest 0.13.2",
"serde",
"serde_json",
"serde_repr",
@@ -3719,6 +4074,19 @@ dependencies = [
"toml 0.9.12+spec-1.1.0",
]
[[package]]
name = "tempfile"
version = "3.27.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32497e9a4c7b38532efcdebeef879707aa9f794296a4f0244f6f69e9bc8574bd"
dependencies = [
"fastrand",
"getrandom 0.4.2",
"once_cell",
"rustix",
"windows-sys 0.61.2",
]
[[package]]
name = "tendril"
version = "0.4.3"
@@ -3830,11 +4198,45 @@ dependencies = [
"bytes",
"libc",
"mio",
"parking_lot",
"pin-project-lite",
"signal-hook-registry",
"socket2",
"tokio-macros",
"windows-sys 0.61.2",
]
[[package]]
name = "tokio-macros"
version = "2.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "385a6cb71ab9ab790c5fe8d67f1645e6c450a7ce006a33de03daa956cf70a496"
dependencies = [
"proc-macro2",
"quote",
"syn 2.0.117",
]
[[package]]
name = "tokio-native-tls"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bbae76ab933c85776efabc971569dd6119c580d8f5d448769dec1764bf796ef2"
dependencies = [
"native-tls",
"tokio",
]
[[package]]
name = "tokio-rustls"
version = "0.26.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1729aa945f29d91ba541258c8df89027d5792d85a8841fb65e8bf0f4ede4ef61"
dependencies = [
"rustls",
"tokio",
]
[[package]]
name = "tokio-util"
version = "0.7.18"
@@ -4116,6 +4518,12 @@ version = "0.2.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ebc1c04c71510c7f702b52b7c350734c9ff1295c464a03335b00bb84fc54f853"
[[package]]
name = "untrusted"
version = "0.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8ecb6da28b8a351d773b68d5825ac39017e680750f980f3a1a85cd8dd28a47c1"
[[package]]
name = "url"
version = "2.5.8"
@@ -4165,6 +4573,12 @@ dependencies = [
"wasm-bindgen",
]
[[package]]
name = "vcpkg"
version = "0.2.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "accd4ea62f7bb7a82fe23066fb0957d48ef677f6eeb8215f372f52e48bb32426"
[[package]]
name = "version-compare"
version = "0.2.1"
@@ -4323,6 +4737,19 @@ dependencies = [
"wasmparser",
]
[[package]]
name = "wasm-streams"
version = "0.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "15053d8d85c7eccdbefef60f06769760a563c7f0a9d6902a13d35c7800b0ad65"
dependencies = [
"futures-util",
"js-sys",
"wasm-bindgen",
"wasm-bindgen-futures",
"web-sys",
]
[[package]]
name = "wasm-streams"
version = "0.5.0"
@@ -4599,6 +5026,17 @@ dependencies = [
"windows-link 0.1.3",
]
[[package]]
name = "windows-registry"
version = "0.6.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "02752bf7fbdcce7f2a27a742f798510f3e5ad88dbe84871e5168e2120c3d5720"
dependencies = [
"windows-link 0.2.1",
"windows-result 0.4.1",
"windows-strings 0.5.1",
]
[[package]]
name = "windows-result"
version = "0.3.4"
@@ -4644,6 +5082,15 @@ dependencies = [
"windows-targets 0.42.2",
]
[[package]]
name = "windows-sys"
version = "0.52.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "282be5f36a8ce781fad8c8ae18fa3f9beff57ec1b52cb3de0789201425d9a33d"
dependencies = [
"windows-targets 0.52.6",
]
[[package]]
name = "windows-sys"
version = "0.59.0"
@@ -5132,6 +5579,12 @@ dependencies = [
"synstructure",
]
[[package]]
name = "zeroize"
version = "1.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b97154e67e32c85465826e8bcc1c59429aaaf107c1e4a9e53c8d8ccd5eff88d0"
[[package]]
name = "zerotrie"
version = "0.2.4"
@@ -5165,8 +5618,37 @@ dependencies = [
"syn 2.0.117",
]
[[package]]
name = "zip"
version = "2.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fabe6324e908f85a1c52063ce7aa26b68dcb7eb6dbc83a2d148403c9bc3eba50"
dependencies = [
"arbitrary",
"crc32fast",
"crossbeam-utils",
"displaydoc",
"flate2",
"indexmap 2.13.1",
"memchr",
"thiserror 2.0.18",
"zopfli",
]
[[package]]
name = "zmij"
version = "1.0.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa"
[[package]]
name = "zopfli"
version = "0.8.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f05cd8797d63865425ff89b5c4a48804f35ba0ce8d125800027ad6017d2b5249"
dependencies = [
"bumpalo",
"crc32fast",
"log",
"simd-adler32",
]

View File

@@ -1,6 +1,6 @@
[package]
name = "local-transcription"
version = "1.4.1"
version = "2.0.20"
description = "Real-time speech-to-text transcription for streamers"
authors = ["Local Transcription Contributors"]
edition = "2021"
@@ -19,3 +19,12 @@ tauri-plugin-dialog = "2"
tauri-plugin-process = "2"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
reqwest = { version = "0.12", features = ["json", "stream"] }
futures-util = "0.3"
zip = { version = "2", default-features = false, features = ["deflate"] }
bytes = "1"
tokio = { version = "1", features = ["full"] }
chrono = "0.4"
[dev-dependencies]
tempfile = "3"

View File

@@ -0,0 +1,14 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>com.apple.security.device.audio-input</key>
<true/>
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.network.server</key>
<true/>
<key>com.apple.security.cs.allow-unsigned-executable-memory</key>
<true/>
</dict>
</plist>

8
src-tauri/Info.plist Normal file
View File

@@ -0,0 +1,8 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>NSMicrophoneUsageDescription</key>
<string>Local Transcription needs microphone access for real-time speech-to-text transcription.</string>
</dict>
</plist>

View File

@@ -0,0 +1,14 @@
{
"identifier": "default",
"description": "Default permissions for the main window",
"windows": ["main"],
"permissions": [
"core:default",
"core:event:default",
"core:event:allow-listen",
"core:event:allow-emit",
"shell:default",
"dialog:default",
"process:default"
]
}

View File

@@ -1 +1 @@
{}
{"default":{"identifier":"default","description":"Default permissions for the main window","local":true,"windows":["main"],"permissions":["core:default","core:event:default","core:event:allow-listen","core:event:allow-emit","shell:default","dialog:default","process:default"]}}

Binary file not shown.

Before

Width:  |  Height:  |  Size: 41 KiB

After

Width:  |  Height:  |  Size: 20 KiB

BIN
src-tauri/icons/icon.icns Normal file

Binary file not shown.

Binary file not shown.

Before

Width:  |  Height:  |  Size: 23 KiB

After

Width:  |  Height:  |  Size: 739 B

Binary file not shown.

Before

Width:  |  Height:  |  Size: 52 KiB

After

Width:  |  Height:  |  Size: 41 KiB

View File

@@ -1,9 +1,98 @@
mod sidecar;
use std::sync::Mutex;
use tauri::Manager;
/// App log directory, set during setup.
static LOG_DIR: std::sync::OnceLock<std::path::PathBuf> = std::sync::OnceLock::new();
/// Write a log message to the app's log file (for debugging).
#[tauri::command]
fn write_log(message: String) {
if let Some(log_dir) = LOG_DIR.get() {
let log_path = log_dir.join("frontend.log");
use std::io::Write;
if let Ok(mut f) = std::fs::OpenOptions::new()
.create(true)
.append(true)
.open(&log_path)
{
let _ = writeln!(f, "[{}] {}", chrono::Local::now().format("%H:%M:%S%.3f"), message);
}
}
eprintln!("[frontend] {}", message);
}
#[cfg_attr(mobile, tauri::mobile_entry_point)]
pub fn run() {
tauri::Builder::default()
.plugin(tauri_plugin_shell::init())
.plugin(tauri_plugin_dialog::init())
.plugin(tauri_plugin_process::init())
.run(tauri::generate_context!())
.expect("error while running tauri application");
.manage(sidecar::ManagedSidecar(std::sync::Arc::new(Mutex::new(
sidecar::SidecarManager::new(),
))))
.setup(|app| {
let resource_dir = app
.path()
.resource_dir()
.expect("failed to resolve resource dir");
let data_dir = app
.path()
.app_data_dir()
.expect("failed to resolve app data dir");
std::fs::create_dir_all(&data_dir).expect("failed to create app data dir");
// Set up logging
LOG_DIR.set(data_dir.clone()).ok();
let log_path = data_dir.join("app.log");
if let Ok(mut f) = std::fs::OpenOptions::new()
.create(true)
.append(true)
.open(&log_path)
{
use std::io::Write;
let _ = writeln!(f, "\n=== App started at {} ===", chrono::Local::now());
let _ = writeln!(f, "Resource dir: {}", resource_dir.display());
let _ = writeln!(f, "Data dir: {}", data_dir.display());
}
sidecar::init_dirs(resource_dir, data_dir);
Ok(())
})
.invoke_handler(tauri::generate_handler![
sidecar::check_sidecar,
sidecar::download_sidecar,
sidecar::check_sidecar_update,
sidecar::get_sidecar_port,
sidecar::start_sidecar,
sidecar::stop_sidecar,
sidecar::reset_sidecar,
write_log,
])
.build(tauri::generate_context!())
.expect("error while building tauri application")
.run(|app, event| {
match event {
tauri::RunEvent::Exit => {
if let Some(state) = app.try_state::<sidecar::ManagedSidecar>() {
if let Ok(mut mgr) = state.0.lock() {
eprintln!("[app] Stopping sidecar on exit...");
mgr.stop();
}
}
}
tauri::RunEvent::ExitRequested { .. } => {
// Also stop sidecar on exit request (Cmd+Q on macOS)
if let Some(state) = app.try_state::<sidecar::ManagedSidecar>() {
if let Ok(mut mgr) = state.0.lock() {
eprintln!("[app] Stopping sidecar on exit request...");
mgr.stop();
}
}
}
_ => {}
}
});
}

1082
src-tauri/src/sidecar/mod.rs Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -1,7 +1,7 @@
{
"productName": "Local Transcription",
"version": "1.4.1",
"identifier": "com.localtranscription.app",
"version": "2.0.20",
"identifier": "net.anhonesthost.local-transcription",
"build": {
"frontendDist": "../dist",
"devUrl": "http://localhost:1420",
@@ -30,9 +30,13 @@
"icons/32x32.png",
"icons/128x128.png",
"icons/128x128@2x.png",
"icons/icon.icns",
"icons/icon.ico",
"icons/icon.png"
]
],
"windows": {
"digestAlgorithm": "sha256"
}
},
"plugins": {
"shell": {

View File

@@ -5,13 +5,22 @@
import Controls from "$lib/components/Controls.svelte";
import TranscriptionDisplay from "$lib/components/TranscriptionDisplay.svelte";
import Settings from "$lib/components/Settings.svelte";
import SidecarSetup from "$lib/components/SidecarSetup.svelte";
import { backendStore } from "$lib/stores/backend";
import { configStore } from "$lib/stores/config";
type SidecarState = "checking" | "needs_setup" | "update_available" | "starting" | "connected";
let showSettings = $state(false);
let sidecarState = $state<SidecarState>("checking");
let debugLog = $state("");
let availableUpdate = $state("");
let appVersion = $state("");
let obsDisplayUrl = $derived(backendStore.obsUrl);
let syncDisplayUrl = $derived(backendStore.syncUrl);
let isConnected = $derived(backendStore.connectionState === "connected");
let connectionState = $derived(backendStore.connectionState);
function openSettings() {
showSettings = true;
@@ -21,9 +30,94 @@
showSettings = false;
}
onMount(() => {
let tauriInvoke: ((cmd: string, args?: Record<string, unknown>) => Promise<unknown>) | null = null;
function log(msg: string) {
console.log(`[App] ${msg}`);
debugLog = msg;
// Also write to file via Tauri if available
tauriInvoke?.("write_log", { message: msg });
}
async function checkAndLaunchSidecar() {
try {
log("Importing Tauri API...");
const { invoke } = await import("@tauri-apps/api/core");
tauriInvoke = invoke;
log("Checking if sidecar is installed...");
sidecarState = "checking";
const installed = await invoke<boolean>("check_sidecar");
log(`Sidecar installed: ${installed}`);
if (!installed) {
sidecarState = "needs_setup";
return;
}
// Check for sidecar updates before launching
try {
log("Checking for sidecar updates...");
const update = await invoke<string | null>("check_sidecar_update");
if (update) {
log(`Sidecar update available: ${update}`);
availableUpdate = update;
sidecarState = "update_available";
return;
}
} catch (err) {
log(`Update check failed (non-fatal): ${err}`);
}
await launchSidecar();
} catch (err) {
// Not running in Tauri (browser dev mode) - skip sidecar check
// and connect directly to localhost:8081
log(`Tauri not available (${err}), using dev mode`);
sidecarState = "starting";
backendStore.setPort(8081);
backendStore.connect();
configStore.loadConfig();
configStore.fetchConfig();
}
}
async function launchSidecar() {
try {
const { invoke } = await import("@tauri-apps/api/core");
log("Starting sidecar...");
sidecarState = "starting";
await invoke("start_sidecar");
log("Getting sidecar port...");
const port = await invoke<number>("get_sidecar_port");
log(`Sidecar ready on port ${port}`);
backendStore.setPort(port);
backendStore.connect();
configStore.fetchConfig();
} catch (err) {
// If sidecar launch fails, still try connecting to default port
log(`Sidecar launch failed: ${err}, trying default port`);
sidecarState = "starting";
backendStore.connect();
configStore.fetchConfig();
}
}
async function onSidecarReady() {
await launchSidecar();
}
onMount(() => {
// Get app version from Tauri
import("@tauri-apps/api/app").then(({ getVersion }) =>
getVersion().then((v) => { appVersion = v; })
).catch(() => {
// Browser dev mode -- read from package.json or use fallback
appVersion = "dev";
});
checkAndLaunchSidecar();
return () => {
backendStore.disconnect();
@@ -31,7 +125,73 @@
});
</script>
<div class="app-shell">
{#if sidecarState === "checking"}
<div class="connecting-overlay" style="background:#1e1e1e;color:#e0e0e0;display:flex;align-items:center;justify-content:center;height:100%;width:100%;">
<div class="connecting-content" style="text-align:center;">
<div class="connecting-icon">
<div class="spinner"></div>
</div>
<h2 style="font-size:20px;margin:16px 0 8px;">Local Transcription</h2>
<p style="color:#a0a0a0;font-size:14px;">Checking setup...</p>
{#if debugLog}
<p style="color:#707070;font-size:11px;margin-top:12px;">{debugLog}</p>
{/if}
</div>
</div>
{:else if sidecarState === "needs_setup"}
<SidecarSetup onComplete={onSidecarReady} />
{:else if sidecarState === "update_available"}
<div class="connecting-overlay" style="background:#1e1e1e;color:#e0e0e0;display:flex;align-items:center;justify-content:center;height:100%;width:100%;">
<div class="connecting-content" style="text-align:center;max-width:400px;">
<h2 style="font-size:20px;margin:0 0 12px;">Sidecar Update Available</h2>
<p style="color:#a0a0a0;font-size:14px;margin:0 0 20px;">
A new version of the transcription engine is available ({availableUpdate}).
</p>
<div style="display:flex;gap:10px;justify-content:center;">
<button
style="padding:8px 20px;border:1px solid #555;border-radius:6px;background:transparent;color:#e0e0e0;cursor:pointer;"
onclick={() => launchSidecar()}
>Skip</button>
<button
style="padding:8px 20px;border:none;border-radius:6px;background:#4CAF50;color:white;cursor:pointer;font-weight:500;"
onclick={() => { sidecarState = "needs_setup"; }}
>Update Now</button>
</div>
</div>
</div>
{:else if !isConnected}
<div class="connecting-overlay" style="background:#1e1e1e;color:#e0e0e0;display:flex;align-items:center;justify-content:center;height:100%;width:100%;">
<div class="connecting-content" style="text-align:center;">
<div class="connecting-icon">
{#if connectionState === "error"}
<svg width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="#e74c3c" stroke-width="2">
<circle cx="12" cy="12" r="10"/>
<line x1="15" y1="9" x2="9" y2="15"/>
<line x1="9" y1="9" x2="15" y2="15"/>
</svg>
{:else}
<div class="spinner"></div>
{/if}
</div>
<h2 style="font-size:20px;margin:16px 0 8px;">Local Transcription</h2>
{#if connectionState === "error"}
<p style="color:#a0a0a0;">Cannot connect to backend</p>
<p class="hint">Make sure the Python backend is running:<br>
<code>uv run python -m backend.main_headless</code></p>
{:else}
<p style="color:#a0a0a0;">Connecting to backend...</p>
{/if}
{#if debugLog}
<p style="color:#707070;font-size:11px;margin-top:12px;">{debugLog}</p>
{/if}
</div>
</div>
{:else}
<div class="app-shell">
<Header onSettingsClick={openSettings} />
<StatusBar />
@@ -50,14 +210,76 @@
<TranscriptionDisplay />
<Controls />
<div class="version-label">v{backendStore.version}</div>
</div>
<div class="version-label">v{appVersion || backendStore.version}</div>
</div>
{#if showSettings}
{#if showSettings}
<Settings onClose={closeSettings} />
{/if}
{/if}
<style>
.connecting-overlay {
display: flex;
align-items: center;
justify-content: center;
height: 100%;
width: 100%;
background-color: var(--bg-primary);
}
.connecting-content {
text-align: center;
color: var(--text-primary);
}
.connecting-content h2 {
margin: 16px 0 8px;
font-size: 20px;
font-weight: 600;
}
.connecting-content p {
margin: 4px 0;
color: var(--text-secondary);
font-size: 14px;
}
.connecting-content .hint {
margin-top: 16px;
font-size: 12px;
color: var(--text-muted);
}
.connecting-content code {
display: inline-block;
margin-top: 4px;
padding: 4px 8px;
background: var(--bg-tertiary);
border-radius: 4px;
font-size: 12px;
color: var(--text-primary);
}
.connecting-icon {
display: flex;
justify-content: center;
margin-bottom: 8px;
}
.spinner {
width: 40px;
height: 40px;
border: 3px solid var(--border-color);
border-top-color: var(--accent-color, #4CAF50);
border-radius: 50%;
animation: spin 0.8s linear infinite;
}
@keyframes spin {
to { transform: rotate(360deg); }
}
.app-shell {
display: flex;
flex-direction: column;

View File

@@ -1,5 +1,6 @@
<script lang="ts">
import { backendStore } from "$lib/stores/backend";
import { configStore } from "$lib/stores/config";
import { transcriptionStore } from "$lib/stores/transcriptions";
let isTranscribing = $derived(backendStore.appState === "transcribing");
@@ -8,18 +9,39 @@
);
let isLoading = $state(false);
let remoteMode = $derived(configStore.config.remote.mode);
let byokApiKey = $derived(configStore.config.remote.byok_api_key);
let authToken = $derived(configStore.config.remote.auth_token);
let cloudConfigured = $derived(
remoteMode === "local" ||
(remoteMode === "byok" && byokApiKey.trim() !== "") ||
(remoteMode === "managed" && authToken.trim() !== "")
);
let errorMessage = $state("");
async function toggleTranscription() {
if (isLoading) return;
isLoading = true;
errorMessage = "";
try {
if (isTranscribing) {
await backendStore.apiPost("/api/stop");
} else {
await backendStore.apiPost("/api/start");
}
} catch (err) {
console.error("Failed to toggle transcription:", err);
} catch (err: unknown) {
const msg = err instanceof Error ? err.message : String(err);
// Ignore "Already transcribing/not transcribing" -- just sync the state
if (!msg.includes("400")) {
console.error("Failed to toggle transcription:", msg);
errorMessage = msg;
}
} finally {
// Always poll status to sync UI with actual backend state,
// even if the API call failed (e.g. "Already transcribing")
await backendStore.pollStatus();
isLoading = false;
}
}
@@ -83,7 +105,7 @@
<button
class={isTranscribing ? "danger" : "primary"}
onclick={toggleTranscription}
disabled={!isReady || isLoading}
disabled={!isReady || isLoading || !cloudConfigured}
>
{#if isLoading}
...
@@ -101,9 +123,43 @@
<button onclick={saveTranscriptions} disabled={!backendStore.connected}>
Save
</button>
{#if errorMessage}
<span class="error-msg">{errorMessage}</span>
{/if}
{#if !cloudConfigured && isReady}
<div class="cloud-warning">
{#if remoteMode === "byok"}
<span>API key required. Get one at
<a href="https://console.deepgram.com" target="_blank" rel="noopener">console.deepgram.com</a>,
then enter it in Settings.</span>
{:else if remoteMode === "managed"}
<span>Login required. Open Settings to log in.</span>
{/if}
</div>
{/if}
</div>
<style>
.error-msg {
color: #f44336;
font-size: 12px;
margin-left: 8px;
}
.cloud-warning {
font-size: 12px;
color: #ff9800;
margin-left: 8px;
flex: 1;
}
.cloud-warning a {
color: #4fc3f7;
text-decoration: underline;
}
.controls {
display: flex;
align-items: center;

View File

@@ -27,6 +27,10 @@
let showTimestamps = $state(true);
let fadeSeconds = $state(10);
let maxLines = $state(100);
let fontSource = $state("System Font");
let fontFamily = $state("Courier");
let websafeFont = $state("Arial");
let googleFont = $state("Roboto");
let fontSize = $state(12);
let userColor = $state("#4CAF50");
let textColor = $state("#FFFFFF");
@@ -37,10 +41,26 @@
let syncPassphrase = $state("");
let remoteMode = $state("local");
let remoteServerUrl = $state("");
let byokApiKey = $state("");
let managedEmail = $state("");
let managedPassword = $state("");
let managedLoggedIn = $state(false);
let autoCheckUpdates = $state(true);
let isCloudMode = $derived(remoteMode === "managed" || remoteMode === "byok");
let isCloudOnly = $derived(
computeDevices.some(d => d.id === "cloud")
);
// Room creation / join state
let shareCode = $state("");
let joinCode = $state("");
let roomCreating = $state(false);
let roomCreateMessage = $state("");
let saving = $state(false);
let saveMessage = $state("");
// Fetched device lists
let audioDevices = $state<{ id: string; name: string }[]>([]);
let computeDevices = $state<{ id: string; name: string }[]>([]);
@@ -95,6 +115,10 @@
showTimestamps = cfg.display.show_timestamps;
fadeSeconds = cfg.display.fade_after_seconds;
maxLines = cfg.display.max_lines;
fontSource = cfg.display.font_source ?? "System Font";
fontFamily = cfg.display.font_family ?? "Courier";
websafeFont = cfg.display.websafe_font ?? "Arial";
googleFont = cfg.display.google_font ?? "Roboto";
fontSize = cfg.display.font_size;
userColor = cfg.display.user_color;
textColor = cfg.display.text_color;
@@ -107,6 +131,9 @@
syncPassphrase = cfg.server_sync.passphrase;
remoteMode = cfg.remote.mode;
remoteServerUrl = cfg.remote.server_url;
byokApiKey = cfg.remote.byok_api_key ?? "";
managedEmail = cfg.remote.email ?? "";
managedLoggedIn = !!(cfg.remote.auth_token && cfg.remote.email);
autoCheckUpdates = cfg.updates.auto_check;
});
@@ -169,6 +196,10 @@
show_timestamps: showTimestamps,
fade_after_seconds: fadeSeconds,
max_lines: maxLines,
font_source: fontSource,
font_family: fontFamily,
websafe_font: websafeFont,
google_font: googleFont,
font_size: fontSize,
user_color: userColor,
text_color: textColor,
@@ -182,18 +213,24 @@
},
remote: {
mode: remoteMode,
server_url: remoteServerUrl,
server_url: remoteServerUrl || MANAGED_SERVER_URL,
byok_api_key: byokApiKey,
},
updates: {
auto_check: autoCheckUpdates,
},
};
saving = true;
saveMessage = "";
try {
await configStore.saveConfig(updates);
onClose();
await configStore.updateConfig(updates);
saveMessage = "Settings saved!";
setTimeout(() => onClose(), 600);
} catch (err) {
console.error("Failed to save settings:", err);
saveMessage = `Error: ${err}`;
saving = false;
}
}
@@ -203,31 +240,161 @@
async function handleCheckUpdates() {
try {
await backendStore.apiPost("/api/check-updates");
await backendStore.apiGet("/api/check-update");
} catch (err) {
console.error("Failed to check for updates:", err);
}
}
async function handleManagedLogin() {
async function handleChangeSidecar() {
try {
await backendStore.apiPost("/api/remote/login", {
email: managedEmail,
password: managedPassword,
});
const { invoke } = await import("@tauri-apps/api/core");
await invoke("reset_sidecar");
// Force a page reload which will re-trigger the setup flow
window.location.reload();
} catch (err) {
console.error("Login failed:", err);
console.error("Failed to reset sidecar:", err);
saveMessage = `Error: ${err}`;
}
}
async function handleManagedRegister() {
const MANAGED_SERVER_URL = "https://transcribe.shadowdao.com";
let loginMessage = $state("");
async function handleManagedLogin() {
loginMessage = "";
try {
await backendStore.apiPost("/api/remote/register", {
await backendStore.apiPost("/api/login", {
email: managedEmail,
password: managedPassword,
server_url: remoteServerUrl || MANAGED_SERVER_URL,
});
loginMessage = "Logged in successfully!";
managedPassword = "";
managedLoggedIn = true;
await configStore.fetchConfig();
} catch (err) {
console.error("Register failed:", err);
console.error("Login failed:", err);
loginMessage = "Login failed. Check your email and password.";
}
}
async function handleManagedLogout() {
try {
await configStore.updateConfig({
remote: { auth_token: "", email: "" },
});
managedLoggedIn = false;
managedPassword = "";
loginMessage = "";
} catch (err) {
console.error("Logout failed:", err);
loginMessage = `Error: ${err}`;
}
}
const CAPTION_SERVER = "https://caption.shadowdao.com";
function generateRandomName(): string {
const adjectives = ['swift', 'bright', 'cosmic', 'electric', 'turbo', 'mega', 'ultra', 'super', 'hyper', 'alpha'];
const nouns = ['phoenix', 'dragon', 'tiger', 'falcon', 'comet', 'storm', 'blaze', 'thunder', 'frost', 'nebula'];
const num = Math.floor(Math.random() * 10000);
return `${adjectives[Math.floor(Math.random() * adjectives.length)]}-${nouns[Math.floor(Math.random() * nouns.length)]}-${num}`;
}
function generateRandomPassphrase(): string {
const chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
let result = '';
for (let i = 0; i < 16; i++) {
result += chars.charAt(Math.floor(Math.random() * chars.length));
}
return result;
}
function encodeShareCode(url: string, room: string, passphrase: string): string {
return btoa(JSON.stringify({ url, room, passphrase }));
}
function decodeShareCode(code: string): { url: string; room: string; passphrase: string } | null {
try {
const json = JSON.parse(atob(code.trim()));
if (json.url && json.room && json.passphrase) {
return json;
}
return null;
} catch {
return null;
}
}
async function handleCreateRoom() {
roomCreating = true;
roomCreateMessage = "";
shareCode = "";
const room = generateRandomName();
const passphrase = generateRandomPassphrase();
const serverSendUrl = `${CAPTION_SERVER}/api/send`;
try {
const resp = await fetch(`${CAPTION_SERVER}/api/create-room`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ room, passphrase }),
});
if (!resp.ok) {
const err = await resp.json().catch(() => ({ error: "Request failed" }));
roomCreateMessage = `Error: ${err.error || resp.statusText}`;
return;
}
syncUrl = serverSendUrl;
syncRoom = room;
syncPassphrase = passphrase;
syncEnabled = true;
shareCode = encodeShareCode(serverSendUrl, room, passphrase);
roomCreateMessage = "Room created! Share the code below with others.";
} catch (err) {
roomCreateMessage = `Error: ${err instanceof Error ? err.message : String(err)}`;
} finally {
roomCreating = false;
}
}
function handleJoinRoom() {
const decoded = decodeShareCode(joinCode);
if (!decoded) {
roomCreateMessage = "Invalid share code. Please check and try again.";
return;
}
syncUrl = decoded.url;
syncRoom = decoded.room;
syncPassphrase = decoded.passphrase;
syncEnabled = true;
joinCode = "";
roomCreateMessage = "Room joined! Fields have been auto-filled.";
}
async function handleShareCurrentRoom() {
const code = encodeShareCode(syncUrl, syncRoom, syncPassphrase);
shareCode = code;
try {
await navigator.clipboard.writeText(code);
roomCreateMessage = "Share code copied to clipboard!";
} catch {
roomCreateMessage = "Share code generated. Copy it from the field below.";
}
}
async function copyShareCode() {
try {
await navigator.clipboard.writeText(shareCode);
roomCreateMessage = "Share code copied to clipboard!";
} catch {
roomCreateMessage = "Failed to copy. Please select and copy manually.";
}
}
@@ -292,7 +459,100 @@
</div>
</section>
<!-- Transcription Settings -->
<!-- Remote Transcription (moved up for cloud-first UX) -->
<section class="settings-section">
<h3>Transcription Mode</h3>
<div class="radio-group">
<label>
<input
type="radio"
name="remote-mode"
value="byok"
bind:group={remoteMode}
/>
Cloud (Deepgram)
</label>
<label>
<input
type="radio"
name="remote-mode"
value="managed"
bind:group={remoteMode}
/>
Managed Service
</label>
{#if !isCloudOnly}
<label>
<input
type="radio"
name="remote-mode"
value="local"
bind:group={remoteMode}
/>
Local (Whisper)
</label>
{/if}
</div>
{#if remoteMode === "byok"}
<div class="field">
<label for="byok-key">Deepgram API Key</label>
<input
id="byok-key"
type="password"
bind:value={byokApiKey}
placeholder="Enter your Deepgram API key"
/>
<p style="font-size: 11px; color: var(--text-muted); margin-top: 4px;">
Get a key at <a href="https://console.deepgram.com" target="_blank" rel="noopener" style="color: var(--accent-blue);">console.deepgram.com</a>
</p>
</div>
{/if}
{#if remoteMode === "managed"}
<div class="managed-auth">
{#if managedLoggedIn}
<p style="font-size: 13px; margin: 0 0 8px;">
<span style="color: var(--accent-green, #4CAF50);">✓ Logged in</span>
as <strong>{managedEmail}</strong>
</p>
<div class="auth-buttons">
<button onclick={handleManagedLogout}>Log out</button>
</div>
{:else}
<div class="field">
<label for="managed-email">Email</label>
<input
id="managed-email"
type="email"
bind:value={managedEmail}
placeholder="email@example.com"
/>
</div>
<div class="field">
<label for="managed-password">Password</label>
<input
id="managed-password"
type="password"
bind:value={managedPassword}
/>
</div>
<div class="auth-buttons">
<button onclick={handleManagedLogin}>Login</button>
</div>
<p style="font-size: 11px; color: var(--text-muted); margin-top: 8px;">
Don't have an account? <a href="https://transcribe.shadowdao.com/register.html" target="_blank" rel="noopener" style="color: var(--accent-blue);">Sign up here</a>
</p>
{/if}
{#if loginMessage}
<p style="font-size: 12px; margin-top: 6px; color: {loginMessage.startsWith('Logged') ? 'var(--accent-green, #4CAF50)' : 'var(--accent-red, #f44336)'};">
{loginMessage}
</p>
{/if}
</div>
{/if}
</section>
{#if !isCloudMode}
<!-- Transcription Settings (local Whisper only) -->
<section class="settings-section">
<h3>Transcription Settings</h3>
<div class="field">
@@ -438,6 +698,7 @@
/>
</div>
</section>
{/if}
<!-- Display Settings -->
<section class="settings-section">
@@ -474,6 +735,95 @@
bind:value={maxLines}
/>
</div>
<div class="field">
<label for="font-source">Font Source</label>
<select id="font-source" bind:value={fontSource}>
<option value="System Font">System Font</option>
<option value="Web-Safe">Web-Safe</option>
<option value="Google Font">Google Font</option>
</select>
</div>
{#if fontSource === "System Font"}
<div class="field">
<label for="font-family">System Font Family</label>
<input id="font-family" type="text" bind:value={fontFamily} placeholder="Courier" />
</div>
{/if}
{#if fontSource === "Web-Safe"}
<div class="field">
<label for="websafe-font">Web-Safe Font</label>
<select id="websafe-font" bind:value={websafeFont}>
<option value="Arial">Arial</option>
<option value="Arial Black">Arial Black</option>
<option value="Comic Sans MS">Comic Sans MS</option>
<option value="Courier New">Courier New</option>
<option value="Georgia">Georgia</option>
<option value="Impact">Impact</option>
<option value="Lucida Console">Lucida Console</option>
<option value="Lucida Sans Unicode">Lucida Sans Unicode</option>
<option value="Palatino Linotype">Palatino Linotype</option>
<option value="Tahoma">Tahoma</option>
<option value="Times New Roman">Times New Roman</option>
<option value="Trebuchet MS">Trebuchet MS</option>
<option value="Verdana">Verdana</option>
</select>
</div>
{/if}
{#if fontSource === "Google Font"}
<div class="field">
<label for="google-font">Google Font</label>
<select id="google-font" bind:value={googleFont}>
<optgroup label="Sans Serif">
<option value="Roboto">Roboto</option>
<option value="Open Sans">Open Sans</option>
<option value="Lato">Lato</option>
<option value="Montserrat">Montserrat</option>
<option value="Poppins">Poppins</option>
<option value="Nunito">Nunito</option>
<option value="Raleway">Raleway</option>
<option value="Ubuntu">Ubuntu</option>
<option value="Rubik">Rubik</option>
<option value="Work Sans">Work Sans</option>
<option value="Inter">Inter</option>
<option value="Outfit">Outfit</option>
<option value="Quicksand">Quicksand</option>
<option value="Comfortaa">Comfortaa</option>
<option value="Varela Round">Varela Round</option>
</optgroup>
<optgroup label="Serif">
<option value="Playfair Display">Playfair Display</option>
<option value="Merriweather">Merriweather</option>
<option value="Lora">Lora</option>
<option value="PT Serif">PT Serif</option>
<option value="Crimson Text">Crimson Text</option>
</optgroup>
<optgroup label="Monospace">
<option value="Roboto Mono">Roboto Mono</option>
<option value="Source Code Pro">Source Code Pro</option>
<option value="Fira Code">Fira Code</option>
<option value="JetBrains Mono">JetBrains Mono</option>
<option value="IBM Plex Mono">IBM Plex Mono</option>
</optgroup>
<optgroup label="Display">
<option value="Bebas Neue">Bebas Neue</option>
<option value="Oswald">Oswald</option>
<option value="Righteous">Righteous</option>
<option value="Bangers">Bangers</option>
<option value="Permanent Marker">Permanent Marker</option>
</optgroup>
<optgroup label="Handwriting">
<option value="Pacifico">Pacifico</option>
<option value="Lobster">Lobster</option>
<option value="Dancing Script">Dancing Script</option>
<option value="Caveat">Caveat</option>
<option value="Satisfy">Satisfy</option>
</optgroup>
</select>
<p style="font-size: 11px; color: var(--text-muted); margin-top: 4px;">
Browse more at <a href="https://fonts.google.com" target="_blank" rel="noopener" style="color: var(--accent-blue);">fonts.google.com</a>
</p>
</div>
{/if}
<div class="field">
<label for="font-size">Font Size: {fontSize}px</label>
<input
@@ -504,11 +854,11 @@
</div>
</section>
<!-- Server Sync -->
<!-- Server Sync (Shared Captions) -->
<section class="settings-section">
<h3>Server Sync</h3>
<h3>Shared Captions</h3>
<div class="field-row">
<label for="sync-enabled">Enable Server Sync</label>
<label for="sync-enabled">Enable Shared Captions</label>
<input
id="sync-enabled"
type="checkbox"
@@ -516,13 +866,57 @@
/>
</div>
{#if syncEnabled}
<div class="room-actions">
<div class="room-buttons-row">
<button
onclick={handleCreateRoom}
disabled={roomCreating}
class="secondary"
>
{roomCreating ? "Creating..." : "Create Room"}
</button>
<button
onclick={handleShareCurrentRoom}
disabled={!syncUrl.trim() || !syncRoom.trim() || !syncPassphrase.trim()}
class="secondary"
>
Share Current Room
</button>
</div>
<div class="join-row">
<input
type="text"
bind:value={joinCode}
placeholder="Paste share code to join"
class="join-input"
/>
<button onclick={handleJoinRoom} disabled={!joinCode.trim()} class="secondary">
Join
</button>
</div>
</div>
{#if roomCreateMessage}
<p class="room-message" class:error={roomCreateMessage.startsWith("Error")}>{roomCreateMessage}</p>
{/if}
{#if shareCode}
<div class="share-code-box">
<label>Share Code</label>
<div class="share-code-row">
<input type="text" value={shareCode} readonly class="share-code-input" />
<button onclick={copyShareCode} class="secondary">Copy</button>
</div>
</div>
{/if}
<div class="field">
<label for="sync-url">Server URL</label>
<input
id="sync-url"
type="url"
bind:value={syncUrl}
placeholder="http://localhost:3000/api/send"
placeholder="https://caption.shadowdao.com/api/send"
/>
</div>
<div class="field">
@@ -540,76 +934,6 @@
{/if}
</section>
<!-- Remote Transcription -->
<section class="settings-section">
<h3>Remote Transcription</h3>
<div class="radio-group">
<label>
<input
type="radio"
name="remote-mode"
value="local"
bind:group={remoteMode}
/>
Local
</label>
<label>
<input
type="radio"
name="remote-mode"
value="managed"
bind:group={remoteMode}
/>
Managed
</label>
<label>
<input
type="radio"
name="remote-mode"
value="byok"
bind:group={remoteMode}
/>
BYOK (Bring Your Own Key)
</label>
</div>
{#if remoteMode !== "local"}
<div class="field">
<label for="remote-url">Server URL</label>
<input
id="remote-url"
type="url"
bind:value={remoteServerUrl}
placeholder="wss://your-proxy.com"
/>
</div>
{/if}
{#if remoteMode === "managed"}
<div class="managed-auth">
<div class="field">
<label for="managed-email">Email</label>
<input
id="managed-email"
type="email"
bind:value={managedEmail}
placeholder="email@example.com"
/>
</div>
<div class="field">
<label for="managed-password">Password</label>
<input
id="managed-password"
type="password"
bind:value={managedPassword}
/>
</div>
<div class="auth-buttons">
<button onclick={handleManagedLogin}>Login</button>
<button onclick={handleManagedRegister}>Register</button>
</div>
</div>
{/if}
</section>
<!-- Updates -->
<section class="settings-section">
<h3>Updates</h3>
@@ -623,11 +947,27 @@
</div>
<button onclick={handleCheckUpdates}>Check Now</button>
</section>
<!-- Transcription Engine -->
<section class="settings-section">
<h3>Transcription Engine</h3>
<p style="font-size: 12px; color: var(--text-secondary); margin-bottom: 12px;">
Switch between local (Whisper) and cloud (Deepgram) transcription engines.
This will stop the current engine, remove the downloaded files, and restart
with the new engine selection.
</p>
<button class="danger-btn" onclick={handleChangeSidecar}>Change Transcription Engine</button>
</section>
</div>
<div class="settings-footer">
<button onclick={handleCancel}>Cancel</button>
<button class="primary" onclick={handleSave}>Save</button>
{#if saveMessage}
<span class="save-message" class:error={saveMessage.startsWith("Error")}>{saveMessage}</span>
{/if}
<button onclick={handleCancel} disabled={saving}>Cancel</button>
<button class="primary" onclick={handleSave} disabled={saving}>
{saving ? "Saving..." : "Save"}
</button>
</div>
</div>
</div>
@@ -771,10 +1111,107 @@
.settings-footer {
display: flex;
align-items: center;
justify-content: flex-end;
gap: 8px;
padding: 16px 20px;
border-top: 1px solid var(--border-color);
flex-shrink: 0;
}
.save-message {
margin-right: auto;
font-size: 13px;
color: #4CAF50;
}
.save-message.error {
color: #f44336;
}
.room-actions {
display: flex;
flex-direction: column;
gap: 8px;
margin-bottom: 12px;
}
.room-buttons-row {
display: flex;
gap: 8px;
}
.join-row {
display: flex;
gap: 8px;
}
.join-input {
flex: 1;
}
.room-message {
font-size: 12px;
color: #4CAF50;
margin-bottom: 8px;
}
.room-message.error {
color: #f44336;
}
.share-code-box {
margin-bottom: 12px;
}
.share-code-box label {
display: block;
margin-bottom: 4px;
font-size: 12px;
color: var(--text-secondary);
}
.share-code-row {
display: flex;
gap: 8px;
}
.share-code-input {
flex: 1;
font-size: 11px;
font-family: monospace;
}
.secondary {
background: transparent;
border: 1px solid var(--border-color);
color: var(--text-primary);
padding: 6px 12px;
border-radius: 6px;
cursor: pointer;
font-size: 13px;
}
.secondary:hover {
background: var(--bg-tertiary);
}
.secondary:disabled {
opacity: 0.5;
cursor: not-allowed;
}
.danger-btn {
background: transparent;
border: 1px solid var(--accent-red, #f44336);
color: var(--accent-red, #f44336);
padding: 8px 16px;
border-radius: 6px;
cursor: pointer;
font-size: 13px;
}
.danger-btn:hover {
background: rgba(244, 67, 54, 0.1);
}
</style>

View File

@@ -0,0 +1,420 @@
<script lang="ts">
import { invoke } from "@tauri-apps/api/core";
import { listen } from "@tauri-apps/api/event";
import { onMount } from "svelte";
interface Props {
onComplete: () => void;
}
let { onComplete }: Props = $props();
type SetupState = "choose" | "downloading" | "error" | "success";
let setupState = $state<SetupState>("choose");
let variant = $state<"cpu" | "cuda">("cpu");
let progress = $state(0);
let progressMessage = $state("");
let errorMessage = $state("");
let unlisten: (() => void) | null = null;
onMount(() => {
return () => {
if (unlisten) {
unlisten();
unlisten = null;
}
};
});
async function startDownload() {
setupState = "downloading";
progress = 0;
progressMessage = "Starting download...";
errorMessage = "";
try {
// Listen for progress events from the Tauri backend
unlisten = await listen<{ downloaded: number; total: number; phase: string; message: string }>(
"sidecar-download-progress",
(event) => {
const { downloaded, total, message } = event.payload;
progress = total > 0 ? (downloaded / total) * 100 : 0;
progressMessage = message;
}
);
await invoke("download_sidecar", { variant });
// Download complete
setupState = "success";
if (unlisten) {
unlisten();
unlisten = null;
}
// Brief pause to show success, then proceed
setTimeout(() => {
onComplete();
}, 1500);
} catch (err) {
setupState = "error";
errorMessage = err instanceof Error ? err.message : String(err);
if (unlisten) {
unlisten();
unlisten = null;
}
}
}
function retry() {
setupState = "choose";
progress = 0;
progressMessage = "";
errorMessage = "";
}
</script>
<div class="setup-overlay">
<div class="setup-card">
<div class="setup-header">
<h1 class="app-title">Local Transcription</h1>
<h2 class="setup-heading">First-Time Setup</h2>
</div>
{#if setupState === "choose"}
<p class="setup-description">
Choose a transcription engine. You can change this later in Settings.
</p>
<div class="variant-options">
<label class="variant-option" class:selected={variant === "cloud"}>
<input
type="radio"
name="variant"
value="cloud"
bind:group={variant}
/>
<div class="variant-info">
<span class="variant-name">Cloud (Deepgram)</span>
<span class="variant-desc">~50 MB download</span>
<span class="variant-detail">
Fast, accurate streaming transcription via Deepgram's servers.
Requires internet and a Deepgram API key.
Best for most users — low resource usage, works on any hardware.
</span>
<span class="variant-tag recommended">Recommended</span>
</div>
</label>
<label class="variant-option" class:selected={variant === "cpu"}>
<input
type="radio"
name="variant"
value="cpu"
bind:group={variant}
/>
<div class="variant-info">
<span class="variant-name">Local - CPU</span>
<span class="variant-desc">~500 MB download</span>
<span class="variant-detail">
Runs Whisper AI models locally on your CPU. No internet needed
after download. Good for privacy or offline use, but slower and
uses more system resources than cloud.
</span>
</div>
</label>
</div>
<button class="download-btn" onclick={startDownload}>
Download & Install
</button>
{:else if setupState === "downloading"}
<div class="progress-section">
<p class="progress-message">{progressMessage}</p>
<div class="progress-bar-track">
<div
class="progress-bar-fill"
style="width: {progress}%"
></div>
</div>
<p class="progress-percent">{Math.round(progress)}%</p>
</div>
{:else if setupState === "error"}
<div class="error-section">
<div class="error-icon">
<svg width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="#f44336" stroke-width="2">
<circle cx="12" cy="12" r="10"/>
<line x1="15" y1="9" x2="9" y2="15"/>
<line x1="9" y1="9" x2="15" y2="15"/>
</svg>
</div>
<p class="error-title">Download Failed</p>
<p class="error-message">{errorMessage}</p>
<button class="retry-btn" onclick={retry}>
Try Again
</button>
</div>
{:else if setupState === "success"}
<div class="success-section">
<div class="success-icon">
<svg width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="#4CAF50" stroke-width="2">
<circle cx="12" cy="12" r="10"/>
<polyline points="16 9 10.5 15 8 12.5"/>
</svg>
</div>
<p class="success-title">Setup Complete</p>
<p class="success-message">The transcription engine is ready to go.</p>
</div>
{/if}
</div>
</div>
<style>
.setup-overlay {
display: flex;
align-items: center;
justify-content: center;
height: 100%;
width: 100%;
background-color: #1e1e1e;
}
.setup-card {
background-color: #2a2a2a;
border-radius: 12px;
padding: 40px;
max-width: 480px;
width: 100%;
margin: 20px;
box-shadow: 0 8px 32px rgba(0, 0, 0, 0.4);
}
.setup-header {
text-align: center;
margin-bottom: 24px;
}
.app-title {
font-size: 24px;
font-weight: 700;
color: #e0e0e0;
margin-bottom: 4px;
}
.setup-heading {
font-size: 16px;
font-weight: 500;
color: #a0a0a0;
}
.setup-description {
font-size: 14px;
color: #a0a0a0;
line-height: 1.6;
text-align: center;
margin-bottom: 24px;
}
.variant-options {
display: flex;
flex-direction: column;
gap: 12px;
margin-bottom: 24px;
}
.variant-option {
display: flex;
align-items: center;
gap: 12px;
padding: 14px 16px;
border: 2px solid #444;
border-radius: 8px;
cursor: pointer;
transition: border-color 0.15s ease, background-color 0.15s ease;
}
.variant-option:hover {
background-color: #333;
border-color: #555;
}
.variant-option.selected {
border-color: #4CAF50;
background-color: rgba(76, 175, 80, 0.08);
}
.variant-option input[type="radio"] {
width: 18px;
height: 18px;
flex-shrink: 0;
}
.variant-info {
display: flex;
flex-direction: column;
gap: 2px;
}
.variant-name {
font-size: 14px;
font-weight: 600;
color: #e0e0e0;
}
.variant-desc {
font-size: 12px;
color: #888;
}
.variant-detail {
font-size: 11px;
color: #666;
line-height: 1.4;
margin-top: 2px;
}
.variant-tag {
display: inline-block;
font-size: 10px;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.5px;
padding: 2px 6px;
border-radius: 3px;
margin-top: 4px;
width: fit-content;
}
.variant-tag.recommended {
background: rgba(76, 175, 80, 0.15);
color: #4CAF50;
}
.download-btn {
display: block;
width: 100%;
padding: 12px 24px;
font-size: 15px;
font-weight: 600;
color: white;
background-color: #4CAF50;
border: none;
border-radius: 8px;
cursor: pointer;
transition: background-color 0.15s ease;
}
.download-btn:hover {
background-color: #45a049;
}
.download-btn:active {
transform: scale(0.98);
}
/* Progress state */
.progress-section {
text-align: center;
padding: 20px 0;
}
.progress-message {
font-size: 14px;
color: #a0a0a0;
margin-bottom: 16px;
}
.progress-bar-track {
width: 100%;
height: 8px;
background-color: #3a3a3a;
border-radius: 4px;
overflow: hidden;
margin-bottom: 8px;
}
.progress-bar-fill {
height: 100%;
background-color: #4CAF50;
border-radius: 4px;
transition: width 0.3s ease;
}
.progress-percent {
font-size: 13px;
color: #707070;
}
/* Error state */
.error-section {
text-align: center;
padding: 10px 0;
}
.error-icon {
display: flex;
justify-content: center;
margin-bottom: 12px;
}
.error-title {
font-size: 18px;
font-weight: 600;
color: #f44336;
margin-bottom: 8px;
}
.error-message {
font-size: 13px;
color: #a0a0a0;
margin-bottom: 20px;
word-break: break-word;
}
.retry-btn {
display: inline-block;
padding: 10px 28px;
font-size: 14px;
font-weight: 600;
color: white;
background-color: #4CAF50;
border: none;
border-radius: 8px;
cursor: pointer;
transition: background-color 0.15s ease;
}
.retry-btn:hover {
background-color: #45a049;
}
/* Success state */
.success-section {
text-align: center;
padding: 20px 0;
}
.success-icon {
display: flex;
justify-content: center;
margin-bottom: 12px;
}
.success-title {
font-size: 18px;
font-weight: 600;
color: #4CAF50;
margin-bottom: 4px;
}
.success-message {
font-size: 14px;
color: #a0a0a0;
}
</style>

View File

@@ -19,6 +19,7 @@ interface BackendState {
wsConnection: WebSocket | null;
version: string;
lastError: string;
isCloudOnly: boolean;
}
let state = $state<BackendState>({
@@ -30,6 +31,7 @@ let state = $state<BackendState>({
wsConnection: null,
version: "1.4.0",
lastError: "",
isCloudOnly: false,
});
let reconnectTimer: ReturnType<typeof setTimeout> | null = null;
@@ -54,6 +56,38 @@ async function apiFetch(path: string, options?: RequestInit): Promise<Response>
return fetch(url, { ...options, headers });
}
// ── Status polling ──────────────────────────────────────────────────
let statusPollTimer: ReturnType<typeof setTimeout> | null = null;
async function pollStatus() {
try {
const resp = await fetch(apiUrl("/api/status"));
if (resp.ok) {
const data = await resp.json();
if (data.state) {
state.appState = data.state as AppState;
}
if (data.engine_device) {
state.deviceInfo = data.engine_device;
}
if (data.version) {
state.version = data.version;
}
if (data.is_cloud_only !== undefined) {
state.isCloudOnly = data.is_cloud_only;
}
}
} catch {
// API not ready yet, will retry
}
// Keep polling every 2s while still initializing
if (state.appState === "initializing" && state.connectionState === "connected") {
statusPollTimer = setTimeout(pollStatus, 2000);
}
}
// ── WebSocket management ─────────────────────────────────────────────
function connectWebSocket() {
@@ -80,6 +114,9 @@ function _openSocket() {
clearTimeout(reconnectTimer);
reconnectTimer = null;
}
// Poll status to catch engine ready state that may have been
// missed (engine can finish before WebSocket connects)
pollStatus();
};
ws.onmessage = (event) => {
@@ -132,6 +169,10 @@ function _scheduleReconnect() {
}
function disconnect() {
if (statusPollTimer) {
clearTimeout(statusPollTimer);
statusPollTimer = null;
}
if (reconnectTimer) {
clearTimeout(reconnectTimer);
reconnectTimer = null;
@@ -249,15 +290,27 @@ export const backendStore = {
get lastError() {
return state.lastError;
},
get isCloudOnly() {
return state.isCloudOnly;
},
get apiBaseUrl() {
return `http://localhost:${state.port}`;
},
get wsUrl() {
return `ws://localhost:${state.port}/ws/control`;
},
get obsUrl() {
// OBS display runs on the web server port (one below the API port)
const obsPort = state.port > 0 ? state.port - 1 : 8080;
return `http://localhost:${obsPort}`;
},
get syncUrl() {
return "";
},
setPort,
connect: connectWebSocket,
disconnect,
pollStatus,
apiUrl,
apiFetch,
apiGet,

View File

@@ -0,0 +1,77 @@
import { describe, it, expect, vi, beforeEach } from "vitest";
import { backendStore } from "./backend.svelte.ts";
// Mock WebSocket globally so the store module can reference it
class MockWebSocket {
onopen: ((ev: Event) => void) | null = null;
onclose: ((ev: CloseEvent) => void) | null = null;
onmessage: ((ev: MessageEvent) => void) | null = null;
onerror: ((ev: Event) => void) | null = null;
close = vi.fn();
}
vi.stubGlobal("WebSocket", MockWebSocket);
// Mock fetch to prevent real network calls
vi.stubGlobal(
"fetch",
vi.fn(() =>
Promise.resolve({
ok: true,
json: () => Promise.resolve({}),
})
)
);
describe("backend store", () => {
beforeEach(() => {
backendStore.disconnect();
backendStore.setPort(8081);
});
it("test_exports_expected_properties", () => {
expect(backendStore).toHaveProperty("port");
expect(backendStore).toHaveProperty("connectionState");
expect(backendStore).toHaveProperty("connected");
expect(backendStore).toHaveProperty("appState");
expect(backendStore).toHaveProperty("stateMessage");
expect(backendStore).toHaveProperty("deviceInfo");
expect(backendStore).toHaveProperty("version");
expect(backendStore).toHaveProperty("lastError");
expect(backendStore).toHaveProperty("apiBaseUrl");
expect(backendStore).toHaveProperty("wsUrl");
expect(backendStore).toHaveProperty("obsUrl");
expect(backendStore).toHaveProperty("syncUrl");
});
it("test_exports_expected_methods", () => {
expect(typeof backendStore.setPort).toBe("function");
expect(typeof backendStore.connect).toBe("function");
expect(typeof backendStore.disconnect).toBe("function");
expect(typeof backendStore.apiUrl).toBe("function");
expect(typeof backendStore.apiFetch).toBe("function");
expect(typeof backendStore.apiGet).toBe("function");
expect(typeof backendStore.apiPost).toBe("function");
expect(typeof backendStore.apiPut).toBe("function");
});
it("test_obsUrl_derives_from_port", () => {
backendStore.setPort(8081);
expect(backendStore.obsUrl).toBe("http://localhost:8080");
});
it("test_apiBaseUrl_uses_port", () => {
backendStore.setPort(8081);
expect(backendStore.apiBaseUrl).toBe("http://localhost:8081");
});
it("test_wsUrl_uses_port", () => {
backendStore.setPort(8081);
expect(backendStore.wsUrl).toBe("ws://localhost:8081/ws/control");
});
it("test_initial_state", () => {
// After disconnect() in beforeEach, state should be disconnected
expect(backendStore.connectionState).toBe("disconnected");
expect(backendStore.appState).toBe("initializing");
});
});

View File

@@ -65,6 +65,7 @@ export interface AppConfig {
mode: string;
server_url: string;
auth_token: string;
email: string;
byok_api_key: string;
deepgram_model: string;
language: string;
@@ -107,7 +108,7 @@ function getDefaultConfig(): AppConfig {
},
server_sync: {
enabled: false,
url: "http://localhost:3000/api/send",
url: "",
room: "default",
passphrase: "",
},
@@ -128,9 +129,10 @@ function getDefaultConfig(): AppConfig {
},
web_server: { port: 8080, host: "127.0.0.1" },
remote: {
mode: "local",
mode: "byok",
server_url: "",
auth_token: "",
email: "",
byok_api_key: "",
deepgram_model: "nova-2",
language: "en-US",

View File

@@ -0,0 +1,48 @@
import { describe, it, expect, vi } from "vitest";
// Mock fetch so the backend store module doesn't make real requests
vi.stubGlobal(
"fetch",
vi.fn(() =>
Promise.resolve({
ok: true,
json: () => Promise.resolve({}),
})
)
);
// Mock WebSocket for the backend store dependency
class MockWebSocket {
onopen: ((ev: Event) => void) | null = null;
onclose: ((ev: CloseEvent) => void) | null = null;
onmessage: ((ev: MessageEvent) => void) | null = null;
onerror: ((ev: Event) => void) | null = null;
close = vi.fn();
}
vi.stubGlobal("WebSocket", MockWebSocket);
import { configStore } from "./config.svelte.ts";
describe("config store", () => {
it("test_has_fetchConfig_method", () => {
expect(typeof configStore.fetchConfig).toBe("function");
});
it("test_has_updateConfig_method", () => {
expect(typeof configStore.updateConfig).toBe("function");
});
it("test_config_defaults_have_expected_keys", () => {
const cfg = configStore.config;
expect(cfg).toHaveProperty("user");
expect(cfg).toHaveProperty("audio");
expect(cfg).toHaveProperty("transcription");
expect(cfg).toHaveProperty("display");
expect(cfg).toHaveProperty("remote");
expect(cfg).toHaveProperty("updates");
});
it("test_remote_config_has_byok_api_key", () => {
expect(configStore.config.remote.byok_api_key).toBeDefined();
});
});

View File

@@ -0,0 +1,34 @@
import { describe, it, expect } from "vitest";
import * as fs from "node:fs";
import * as path from "node:path";
describe("store file extensions", () => {
it("test_store_files_use_svelte_ts_extension", () => {
const storesDir = path.resolve(__dirname);
const files = fs.readdirSync(storesDir);
// Find .ts files that are NOT .svelte.ts and NOT test files
const plainTsFiles = files.filter(
(f) =>
f.endsWith(".ts") &&
!f.endsWith(".svelte.ts") &&
!f.endsWith(".test.ts")
);
for (const file of plainTsFiles) {
const content = fs.readFileSync(path.join(storesDir, file), "utf-8");
expect(content).not.toMatch(
/\$state\s*[<(]/,
`${file} uses $state() but does not have .svelte.ts extension`
);
expect(content).not.toMatch(
/\$derived\s*[<(]/,
`${file} uses $derived() but does not have .svelte.ts extension`
);
expect(content).not.toMatch(
/\$effect\s*[<(]/,
`${file} uses $effect() but does not have .svelte.ts extension`
);
}
});
});

View File

@@ -0,0 +1,71 @@
import { describe, it, expect, vi, beforeEach } from "vitest";
// Mock WebSocket for the backend store dependency (loaded transitively)
class MockWebSocket {
onopen: ((ev: Event) => void) | null = null;
onclose: ((ev: CloseEvent) => void) | null = null;
onmessage: ((ev: MessageEvent) => void) | null = null;
onerror: ((ev: Event) => void) | null = null;
close = vi.fn();
}
vi.stubGlobal("WebSocket", MockWebSocket);
vi.stubGlobal(
"fetch",
vi.fn(() =>
Promise.resolve({
ok: true,
json: () => Promise.resolve({}),
})
)
);
import { transcriptionStore } from "./transcriptions.svelte.ts";
describe("transcriptions store", () => {
beforeEach(() => {
transcriptionStore.clearAll();
});
it("test_addTranscription", () => {
transcriptionStore.addTranscription({
text: "Hello world",
user_name: "TestUser",
timestamp: "12:00:00",
});
expect(transcriptionStore.items.length).toBe(1);
expect(transcriptionStore.items[0].text).toBe("Hello world");
expect(transcriptionStore.items[0].userName).toBe("TestUser");
expect(transcriptionStore.items[0].timestamp).toBe("12:00:00");
expect(transcriptionStore.items[0].isPreview).toBe(false);
});
it("test_clearAll", () => {
transcriptionStore.addTranscription({ text: "One" });
transcriptionStore.addTranscription({ text: "Two" });
expect(transcriptionStore.items.length).toBe(2);
transcriptionStore.clearAll();
expect(transcriptionStore.items.length).toBe(0);
});
it("test_getPlainText", () => {
transcriptionStore.addTranscription({
text: "Hello",
user_name: "Alice",
timestamp: "10:00",
});
transcriptionStore.addTranscription({
text: "World",
user_name: "Bob",
timestamp: "10:01",
});
const text = transcriptionStore.getPlainText();
expect(text).toContain("[10:00] Alice: Hello");
expect(text).toContain("[10:01] Bob: World");
// Lines separated by newline
expect(text.split("\n").length).toBe(2);
});
});

View File

@@ -1,7 +1,7 @@
"""Version information for Local Transcription."""
__version__ = "1.4.1"
__version_info__ = (1, 4, 1)
__version__ = "2.0.20"
__version_info__ = (2, 0, 20)
# Version history:
# 1.4.0 - Auto-update feature:

View File

@@ -10,6 +10,7 @@ export default defineConfig({
alias: {
$lib: path.resolve("./src/lib"),
},
extensions: [".svelte.ts", ".ts", ".svelte", ".js", ".mjs", ".mts"],
},
server: {
port: 1420,
@@ -18,4 +19,8 @@ export default defineConfig({
ignored: ["**/src-tauri/**", "**/client/**", "**/server/**", "**/backend/**", "**/gui/**"],
},
},
test: {
environment: "jsdom",
include: ["src/**/*.test.ts"],
},
});