Compare commits

18 Commits

Author SHA1 Message Date
Gitea Actions
b4c0589b04 chore: bump sidecar version to 1.0.12 [skip ci] 2026-04-11 02:44:12 +00:00
Developer
66c441b17f Revert macOS workflow to pre-signing state
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 1m59s
Remove all signing env vars and setup steps. The local act runner's
keychain interferes with Tauri's auto-detection. Will re-add signing
once Apple Developer verification is complete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:41:46 -07:00
Developer
94bc704950 Fix settings save blocking event loop and overwriting config keys
Some checks failed
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Has been cancelled
- Run apply_settings in thread pool executor to prevent engine reload
  from blocking the HTTP response (caused "TypeError: Failed to fetch")
- Flatten nested config objects into dot-notation keys before saving
  so partial updates don't wipe out unincluded keys like auth_token

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:40:51 -07:00
Developer
7900d2d9f2 Detect cloud-only sidecar from compute devices (no sidecar rebuild needed)
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 1m57s
Use the existing /api/compute-devices response to determine if only cloud
is available, instead of relying on the backend's is_cloud_only status field.
Hides Local (Whisper) option when the sidecar only supports cloud.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:36:13 -07:00
Developer
e0396df7b0 Use ad-hoc signing when no Apple certificate is configured
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m4s
Prevents Tauri from auto-detecting local keychain certificates on the
build machine, which causes SecKeychainItemImport failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:30:44 -07:00
Gitea Actions
ad89735822 chore: bump version to 2.0.17 [skip ci] 2026-04-11 02:27:20 +00:00
Developer
f0b5890eba Hide Local (Whisper) mode option when using cloud-only sidecar
All checks were successful
Tests / Python Backend Tests (push) Successful in 6s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m3s
- Expose is_cloud_only flag in /api/status response
- Add isCloudOnly to backend store state
- Conditionally hide Local (Whisper) radio button in Settings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:24:01 -07:00
Developer
8df1ab9817 Remove macOS signing config until Apple Developer verification completes
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m0s
hardenedRuntime triggers code signing which fails without a valid certificate.
Entitlements.plist and Info.plist remain for when signing is re-enabled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:21:20 -07:00
Gitea Actions
34a165fc05 chore: bump version to 2.0.16 [skip ci] 2026-04-11 02:15:32 +00:00
Developer
8f4e5cc099 Default managed mode to transcribe.shadowdao.com and simplify login UI
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 2m4s
- Set default server_url to https://transcribe.shadowdao.com
- Remove Server URL field from managed mode settings (users don't need to configure it)
- Replace Register button with link to website signup page
- Add fallback to default URL in login handler for existing users with empty config

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:12:50 -07:00
Developer
16f9ac2ab8 Add code signing config for Windows (Azure Artifact Signing) and macOS (Apple notarization)
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 1m58s
CI workflows now support code signing when secrets are configured:
- macOS: Apple Developer certificate + App Store Connect API key for notarization
- Windows: Azure Artifact Signing via signtool + dlib
- Both are no-ops when secrets aren't set (backwards-compatible)
- Add Entitlements.plist (mic, network) and Info.plist (NSMicrophoneUsageDescription)
- Add SIGNING.md with full setup guide for both platforms

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 18:02:46 -07:00
Developer
cd325102e2 Update docs for cloud-first UX and shared captions
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 2m13s
- README: document cloud-first quick start, shared captions workflow
  (create room, join via share code, share existing room), and
  self-hosting option
- README: update default remote.mode from local to byok in config table
- CLAUDE.md: reflect cloud-first default, settings gating, and shared
  captions features

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 16:10:46 -07:00
Gitea Actions
d220158dd7 chore: bump version to 2.0.15 [skip ci] 2026-04-10 19:38:00 +00:00
Developer
8670e19acc Add "Share Current Room" button to copy existing room config as share code
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 7s
Tests / Rust Sidecar Tests (push) Successful in 1m58s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 12:26:29 -07:00
Gitea Actions
812cc4ac5e chore: bump version to 2.0.14 [skip ci] 2026-04-10 19:15:02 +00:00
Developer
4aa19eee86 Fix test: align remote.mode in no-reload settings test
All checks were successful
Tests / Python Backend Tests (push) Successful in 5s
Tests / Frontend Tests (push) Successful in 8s
Tests / Rust Sidecar Tests (push) Successful in 1m59s
The default remote.mode changed from 'local' to 'byok', causing
the apply_settings test to detect a mode mismatch and trigger an
unexpected engine reload. Pin remote.mode to 'local' in the test
to match the controller's assumed current mode.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 12:01:11 -07:00
Developer
b8dfe0f1ba Cloud-first UX: default to Deepgram, gate start button, add room sharing
Some checks failed
Tests / Python Backend Tests (push) Failing after 6s
Tests / Frontend Tests (push) Successful in 9s
Tests / Rust Sidecar Tests (push) Successful in 2m1s
- Change default transcription mode from local to byok (cloud/Deepgram)
- Move Transcription Mode selector to top of settings for visibility
- Hide local-only settings (model, VAD, timing) when cloud mode selected
- Disable Start button until API key (byok) or login (managed) is configured
- Add room creation and share code flow to Shared Captions section
- Add POST /api/create-room endpoint to Node.js sync server
- Update default sync URL placeholder to caption.shadowdao.com

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 11:58:49 -07:00
Gitea Actions
5837b97a20 chore: bump sidecar version to 1.0.11 [skip ci] 2026-04-08 21:15:05 +00:00
21 changed files with 697 additions and 143 deletions

@@ -46,8 +46,45 @@ jobs:
shell: powershell
run: npm ci
- name: Setup Azure Artifact Signing
shell: powershell
env:
AZURE_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
AZURE_SIGNING_ENDPOINT: ${{ secrets.AZURE_SIGNING_ENDPOINT }}
AZURE_SIGNING_ACCOUNT: ${{ secrets.AZURE_SIGNING_ACCOUNT }}
AZURE_CERT_PROFILE: ${{ secrets.AZURE_CERT_PROFILE }}
run: |
if (-not $env:AZURE_CLIENT_ID) {
Write-Host "No Azure signing secrets configured, skipping code signing setup"
return
}
Write-Host "Setting up Azure Artifact Signing..."
# Install Artifact Signing client tools
nuget install Microsoft.ArtifactSigning.Client -x -OutputDirectory .\signing-tools
$dlibPath = (Resolve-Path ".\signing-tools\Microsoft.ArtifactSigning.Client*\bin\x64\Azure.CodeSigning.Dlib.dll").Path
# Write metadata.json
@{
Endpoint = $env:AZURE_SIGNING_ENDPOINT
CodeSigningAccountName = $env:AZURE_SIGNING_ACCOUNT
CertificateProfileName = $env:AZURE_CERT_PROFILE
} | ConvertTo-Json | Out-File -Encoding UTF8 metadata.json
$metadataPath = (Resolve-Path "metadata.json").Path
# Inject signCommand into tauri.conf.json for this build
$conf = Get-Content src-tauri\tauri.conf.json -Raw | ConvertFrom-Json
$signCmd = "signtool.exe sign /v /fd SHA256 /tr http://timestamp.acs.microsoft.com /td SHA256 /dlib `"$dlibPath`" /dmdf `"$metadataPath`" %1"
$conf.bundle.windows | Add-Member -NotePropertyName "signCommand" -NotePropertyValue $signCmd -Force
$conf | ConvertTo-Json -Depth 10 | Set-Content src-tauri\tauri.conf.json -Encoding UTF8
- name: Build Tauri app
shell: powershell
env:
AZURE_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
AZURE_CLIENT_SECRET: ${{ secrets.AZURE_CLIENT_SECRET }}
AZURE_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
run: npm run tauri build
- name: Upload to release

@@ -11,9 +11,11 @@ Local Transcription is a cross-platform desktop application for real-time speech
**Key Features:**
- Cross-platform desktop app (Windows, macOS, Linux) via Tauri v2 + Svelte 5
- Headless Python backend with FastAPI control API
- Dual transcription modes: local Whisper or cloud Deepgram (managed/BYOK)
- Cloud-first: defaults to Deepgram (BYOK) transcription; local Whisper also supported
- Settings UI hides local-only options (model, VAD, timing) when in cloud mode
- Start button gated on API key / login — shows guidance if not configured
- Shared Captions: create rooms, share via codes, join with one click (hosted at caption.shadowdao.com)
- Built-in web server for OBS browser source at `http://localhost:8080`
- Optional multi-user sync via Node.js server
- CUDA, MPS (Apple Silicon), and CPU support
- Auto-updates, custom fonts, configurable colors
@@ -273,9 +275,29 @@ All per-OS build workflows can be re-run independently via `workflow_dispatch` w
- `Info.plist` must include `NSMicrophoneUsageDescription` for mic access
- No CUDA builds — CPU/MPS only
## Code Signing
Code signing is configured for Windows and macOS to eliminate install warnings (SmartScreen / Gatekeeper). See [SIGNING.md](SIGNING.md) for full setup details.
**Status (as of 2026-04-10):** CI workflow changes are committed. Waiting on identity verification for both platforms before secrets can be configured.
**How it works:**
- macOS: Tauri auto-signs when `APPLE_CERTIFICATE` and related env vars are set in CI. Notarization uses App Store Connect API key.
- Windows: Azure Artifact Signing via `signtool.exe` + dlib. CI workflow injects `signCommand` into `tauri.conf.json` at build time when `AZURE_CLIENT_ID` is set.
- Both are no-ops when secrets aren't configured — unsigned builds work as before.
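As a rough illustration of the Windows injection step described above, the following Python sketch builds the same `signCommand` string and merges it into a parsed `tauri.conf.json`. The helper name `inject_sign_command` and the sample paths are hypothetical; the actual workflow performs this step in PowerShell.

```python
import json


def inject_sign_command(conf: dict, dlib_path: str, metadata_path: str) -> dict:
    """Return a copy of a tauri.conf.json dict with bundle.windows.signCommand set.

    Hypothetical helper that mirrors the PowerShell step in the Windows workflow.
    """
    sign_cmd = (
        "signtool.exe sign /v /fd SHA256 "
        "/tr http://timestamp.acs.microsoft.com /td SHA256 "
        f'/dlib "{dlib_path}" /dmdf "{metadata_path}" %1'
    )
    # Deep-copy via a JSON round trip so the caller's dict is left untouched.
    updated = json.loads(json.dumps(conf))
    # Create bundle.windows if missing, keeping any existing keys such as digestAlgorithm.
    windows = updated.setdefault("bundle", {}).setdefault("windows", {})
    windows["signCommand"] = sign_cmd
    return updated


# Sample paths are placeholders, not real locations on the build machine.
conf = {"bundle": {"windows": {"digestAlgorithm": "sha256"}}}
signed = inject_sign_command(
    conf,
    r"C:\tools\Azure.CodeSigning.Dlib.dll",
    r"C:\build\metadata.json",
)
```

Because the command is only added to the copy used for the build, the committed `tauri.conf.json` stays free of machine-specific paths.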
**Key files:**
- `src-tauri/Entitlements.plist` — macOS hardened runtime entitlements (mic, network)
- `src-tauri/Info.plist` — macOS microphone usage description
- `.gitea/workflows/build-app-macos.yml` — Apple signing + notarization
- `.gitea/workflows/build-app-windows.yml` — Azure Artifact Signing
**Secrets required (12 total):** See [SIGNING.md](SIGNING.md) for the full list — 6 Apple secrets, 6 Azure secrets.
## Related Documentation
- [README.md](README.md) — User-facing documentation
- [BUILD.md](BUILD.md) — Detailed build instructions
- [INSTALL.md](INSTALL.md) — Installation guide
- [SIGNING.md](SIGNING.md) — Code signing setup guide
- [server/nodejs/README.md](server/nodejs/README.md) — Node.js server setup

@@ -7,14 +7,14 @@ A real-time speech-to-text desktop application for streamers. Runs locally on yo
## Features
- **Real-Time Transcription**: Live speech-to-text using Whisper models with minimal latency
- **Cloud-First**: Defaults to Deepgram cloud transcription — get started with just an API key
- **Cross-Platform**: Native desktop app for Windows, macOS, and Linux via [Tauri](https://tauri.app/)
- **Dual Transcription Modes**: Local (Whisper) or cloud (Deepgram) with managed billing or BYOK
- **CPU & GPU Support**: Automatic detection of CUDA (NVIDIA), MPS (Apple Silicon), or CPU fallback
- **Advanced Voice Detection**: Dual-layer VAD (WebRTC + Silero) for accurate speech detection
- **Dual Transcription Modes**: Cloud (Deepgram) or local (Whisper) with automatic GPU/CPU detection
- **Shared Captions**: Create a room and share a code so others can join — no server setup needed
- **OBS Integration**: Built-in web server for browser source capture at `http://localhost:8080`
- **Multi-User Sync**: Optional Node.js server to sync transcriptions across multiple users
- **Custom Fonts**: Support for system fonts, web-safe fonts, Google Fonts, and custom font files
- **Customizable Colors**: User-configurable colors for name, text, and background
- **Advanced Voice Detection**: Dual-layer VAD (WebRTC + Silero) for accurate speech detection
- **Noise Suppression**: Built-in audio preprocessing to reduce background noise
- **Auto-Updates**: Automatic update checking with release notes display
@@ -87,27 +87,30 @@ For detailed build instructions, see [BUILD.md](BUILD.md).
## Usage
### Standalone Mode
### Quick Setup (Cloud — Recommended)
1. Launch the application
2. Select your microphone from the audio device dropdown
3. Choose a Whisper model (smaller = faster, larger = more accurate):
2. Open **Settings** — the transcription mode defaults to **Cloud (Deepgram)**
3. Get a free API key at [console.deepgram.com](https://console.deepgram.com) and paste it in Settings
4. Select your microphone from the audio device dropdown
5. Click **Start Transcription**
6. Transcriptions appear in the main window and at `http://localhost:8080`
> The Start button stays disabled until an API key is entered (or, in managed mode, until you log in). Local-only settings (model, VAD, timing) are hidden in cloud mode to keep things simple.
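
The gating rule can be summarized in a few lines. This is a minimal Python sketch of the check the frontend performs in Svelte; the function name `can_start` is hypothetical and does not exist in the codebase.

```python
def can_start(remote_mode: str, byok_api_key: str = "", auth_token: str = "") -> bool:
    """Return True when the Start button should be enabled.

    Hypothetical mirror of the frontend's `cloudConfigured` expression.
    """
    if remote_mode == "local":
        # Local Whisper needs no credentials.
        return True
    if remote_mode == "byok":
        # BYOK requires a non-blank Deepgram API key.
        return byok_api_key.strip() != ""
    if remote_mode == "managed":
        # Managed mode requires a stored login token.
        return auth_token.strip() != ""
    # Unknown modes are treated as not configured.
    return False
```

For example, `can_start("byok", "   ")` is `False` because a whitespace-only key counts as empty.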
### Local Mode (Whisper)
For offline/on-device transcription, switch to **Local (Whisper)** in Settings:
1. Choose a Whisper model (smaller = faster, larger = more accurate):
- `tiny.en` / `tiny` — Fastest, good for quick captions
- `base.en` / `base` — Balanced speed and accuracy
- `small.en` / `small` — Better accuracy
- `medium.en` / `medium` — High accuracy
- `large-v3` — Best accuracy (requires more resources)
4. Click **Start** to begin transcription
5. Transcriptions appear in the main window and at `http://localhost:8080`
### Remote Transcription (Deepgram)
Instead of local Whisper models, you can use cloud-based transcription:
- **Managed mode**: Sign up via the transcription proxy for metered billing
- **BYOK mode**: Bring your own Deepgram API key for direct access
Configure in Settings > Remote Transcription.
2. Select compute device (Auto/CUDA/CPU) and compute type
3. Tune VAD sensitivity and timing settings as needed
4. Click **Start Transcription**
### OBS Browser Source Setup
@@ -117,18 +120,42 @@ Configure in Settings > Remote Transcription.
4. Set dimensions (e.g., 1920x300)
5. Check "Shutdown source when not visible" for performance
### Multi-User Mode (Optional)
### Shared Captions (Multi-User)
For syncing transcriptions across multiple users (e.g., multi-host streams or translation teams):
Share live captions across multiple users using the hosted service at `https://caption.shadowdao.com/` — no server setup required.
1. Deploy the Node.js server (see [server/nodejs/README.md](server/nodejs/README.md))
2. In the app settings, enable **Server Sync**
3. Enter the server URL (e.g., `http://your-server:3000/api/send`)
4. Set a room name and passphrase (shared with other users)
5. In OBS, use the server's display URL with your room name:
```
http://your-server:3000/display?room=YOURROOM&timestamps=true&maxlines=50
```
#### Creating a Room
1. Open **Settings** and enable **Shared Captions**
2. Click **Create Room** — this generates a room name and passphrase automatically
3. A **share code** is generated and copied to your clipboard
4. Send the share code to anyone who should join
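
The room name and passphrase generated in step 2 follow an adjective-noun-number pattern plus a 16-character alphanumeric string. The sketch below shows one way to produce equivalent values in Python. It is an assumption-laden illustration: the app generates these in the frontend with `Math.random()`, whereas this sketch swaps in Python's `secrets` module, and the function names are hypothetical.

```python
import secrets
import string

# Word lists copied from the frontend's generator.
ADJECTIVES = ["swift", "bright", "cosmic", "electric", "turbo",
              "mega", "ultra", "super", "hyper", "alpha"]
NOUNS = ["phoenix", "dragon", "tiger", "falcon", "comet",
         "storm", "blaze", "thunder", "frost", "nebula"]


def generate_room_name() -> str:
    """Build a name like 'cosmic-falcon-4821' (hypothetical helper)."""
    number = secrets.randbelow(10000)  # 0..9999, same range as the frontend
    return f"{secrets.choice(ADJECTIVES)}-{secrets.choice(NOUNS)}-{number}"


def generate_passphrase(length: int = 16) -> str:
    """Build an alphanumeric passphrase using a cryptographically secure RNG."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def create_room_payload() -> dict:
    """JSON body expected by POST /api/create-room: room and passphrase."""
    return {"room": generate_room_name(), "passphrase": generate_passphrase()}


payload = create_room_payload()
```

The passphrase is the only thing protecting a room, so a secure random source is a reasonable choice here even though the room name itself is not secret.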
#### Joining a Room
1. Open **Settings** and enable **Shared Captions**
2. Paste the share code you received into the **"Paste share code to join"** field
3. Click **Join** — the server URL, room, and passphrase are auto-filled
4. Click **Save**
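
A share code is the server URL, room, and passphrase serialized as JSON and then Base64-encoded. The Python sketch below illustrates that round trip. The function names are hypothetical, and the output is not guaranteed to be byte-identical to what the app produces (JSON whitespace may differ), although either form decodes to the same three fields.

```python
import base64
import json
from typing import Optional


def encode_share_code(url: str, room: str, passphrase: str) -> str:
    """Pack the three connection fields into a Base64 string."""
    payload = json.dumps({"url": url, "room": room, "passphrase": passphrase})
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


def decode_share_code(code: str) -> Optional[dict]:
    """Return the decoded fields, or None if the code is malformed or incomplete."""
    try:
        data = json.loads(base64.b64decode(code.strip(), validate=True))
    except ValueError:
        # Covers invalid Base64, invalid UTF-8, and invalid JSON.
        return None
    if isinstance(data, dict) and all(data.get(k) for k in ("url", "room", "passphrase")):
        return data
    return None


code = encode_share_code("https://caption.shadowdao.com/api/send", "swift-comet-42", "s3cretPass")
decoded = decode_share_code(code)
```

Note that Base64 is an encoding, not encryption: anyone holding the share code can read the passphrase, so treat the code itself as a secret.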
#### Sharing an Existing Room
If you already have a room configured and want to invite others:
1. Open **Settings** and scroll to **Shared Captions**
2. Click **Share Current Room** — generates a share code from your current config and copies it to the clipboard
3. Send the code to others
#### OBS Display for Shared Rooms
In OBS, add a Browser source pointing to the server's display URL:
```
https://caption.shadowdao.com/display?room=YOURROOM&timestamps=true&maxlines=50
```
#### Self-Hosting
You can also self-host the sync server. See [server/nodejs/README.md](server/nodejs/README.md) for setup instructions, then enter your own server URL in the Shared Captions settings.
## Configuration
@@ -144,7 +171,7 @@ Settings are stored at `~/.local-transcription/config.yaml` and can be modified
| `transcription.silero_sensitivity` | VAD sensitivity (0-1, lower = more sensitive) | `0.4` |
| `transcription.post_speech_silence_duration` | Silence before finalizing (seconds) | `0.3` |
| `transcription.continuous_mode` | Fast speaker mode for quick talkers | `false` |
| `remote.mode` | Transcription mode (local/managed/byok) | `local` |
| `remote.mode` | Transcription mode (local/managed/byok) | `byok` |
| `display.show_timestamps` | Show timestamps with transcriptions | `true` |
| `display.fade_after_seconds` | Fade out time (0 = never) | `10` |
| `display.font_source` | Font type (System Font/Web-Safe/Google Font/Custom File) | `System Font` |
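
The dot-notation keys in this table are also how partial updates are merged: a nested settings object is flattened to dotted keys so that only the supplied values change. The sketch below demonstrates the idea with a plain dict standing in for the config store; `flatten` and the sample values are hypothetical, not the controller's actual code.

```python
def flatten(settings: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dot-notation keys, e.g. {'remote': {'mode': 'x'}} -> {'remote.mode': 'x'}."""
    flat = {}
    for key, value in settings.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so each leaf value gets its own dotted key.
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat


# A plain dict stands in for the real config store here.
stored = {"remote.mode": "byok", "remote.auth_token": "abc"}

# A partial update that only mentions remote.mode leaves auth_token intact.
stored.update(flatten({"remote": {"mode": "managed"}}))
```

Had the update replaced the whole `remote` section instead, `auth_token` would have been lost, which is the failure the flattening avoids.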

136
SIGNING.md Normal file
@@ -0,0 +1,136 @@
# Code Signing Setup
This document explains how to configure code signing for Local Transcription so that Windows and macOS installers are trusted by the operating system.
## Overview
Without code signing:
- **Windows**: SmartScreen shows "Windows protected your PC" warnings
- **macOS**: Gatekeeper blocks the app — "app can't be opened because it is from an unidentified developer"
The CI/CD workflows are configured to sign automatically when the required secrets are present. Without secrets, builds still work — they just produce unsigned installers.
---
## Windows — Azure Artifact Signing
**Cost**: ~$9.99/month (up to 5,000 signatures)
### 1. Create an Azure Account
Sign up at https://azure.microsoft.com if you don't already have one.
### 2. Set Up Artifact Signing
1. In the Azure Portal, search for **Artifact Signing**
2. Create a new **Artifact Signing Account**
- Choose a region (e.g., West US 2) — note this for the endpoint URL
- The endpoint will look like `https://wus2.codesigning.azure.net/`
3. Complete **Identity Verification** (required before you can create certificate profiles)
4. Create a **Certificate Profile** with type "Public Trust" for code signing
### 3. Create an App Registration (Service Principal)
This allows CI to authenticate to Azure:
1. Go to **Microsoft Entra ID** (formerly Azure Active Directory) > **App registrations** > **New registration**
2. Name it (e.g., `local-transcription-signing`)
3. After creation, note the **Application (client) ID** and **Directory (tenant) ID**
4. Go to **Certificates & secrets** > **New client secret** — note the secret value
5. Grant the app registration the **Artifact Signing Certificate Profile Signer** role on your Artifact Signing Account
### 4. Add Gitea Secrets
In your Gitea repository, go to **Settings** > **Actions** > **Secrets** and add:
| Secret Name | Value |
|-------------|-------|
| `AZURE_CLIENT_ID` | App registration Application (client) ID |
| `AZURE_CLIENT_SECRET` | App registration client secret value |
| `AZURE_TENANT_ID` | Directory (tenant) ID |
| `AZURE_SIGNING_ENDPOINT` | Artifact Signing endpoint URL (e.g., `https://wus2.codesigning.azure.net/`) |
| `AZURE_SIGNING_ACCOUNT` | Artifact Signing account name |
| `AZURE_CERT_PROFILE` | Certificate profile name |
---
## macOS — Apple Developer Code Signing + Notarization
**Cost**: $99/year (Apple Developer Program)
### 1. Enroll in the Apple Developer Program
Sign up at https://developer.apple.com/programs/
### 2. Create a Developer ID Certificate
1. Open **Xcode** > **Settings** > **Accounts** > select your team > **Manage Certificates**
2. Click **+** > **Developer ID Application**
3. Alternatively, create the certificate in the Apple Developer portal: **Certificates, Identifiers & Profiles** > **Certificates** > **+** > **Developer ID Application**
### 3. Export the Certificate as .p12
1. Open **Keychain Access**
2. Find your **Developer ID Application** certificate
3. Right-click > **Export** > save as `.p12` with a password
4. Base64-encode it:
```bash
base64 -i certificate.p12 | tr -d '\n'
```
### 4. Create an App Store Connect API Key
This is used for notarization (submitting the app to Apple for verification):
1. Go to https://appstoreconnect.apple.com/access/integrations/api
2. Click **Generate API Key**
3. Give it a name and **Developer** role (minimum)
4. Download the `.p8` private key file (you can only download it once)
5. Note the **Key ID** and **Issuer ID** shown on the page
### 5. Find Your Signing Identity
Your signing identity looks like:
```
Developer ID Application: Your Name (TEAMID)
```
You can find it by running:
```bash
security find-identity -v -p codesigning
```
### 6. Add Gitea Secrets
| Secret Name | Value |
|-------------|-------|
| `APPLE_CERTIFICATE` | Base64-encoded .p12 certificate (from step 3) |
| `APPLE_CERTIFICATE_PASSWORD` | Password used when exporting the .p12 |
| `APPLE_SIGNING_IDENTITY` | Full identity string (e.g., `Developer ID Application: Your Name (TEAMID)`) |
| `APPLE_API_KEY` | App Store Connect API Key ID |
| `APPLE_API_ISSUER` | API issuer UUID |
| `APPLE_API_KEY_CONTENT` | Full contents of the `.p8` private key file |
---
## Verifying Signing Works
### Trigger a Build
Both build workflows use `workflow_dispatch`, so you can trigger them manually in Gitea:
1. Go to **Actions** > select the workflow > **Run workflow**
2. Enter the release tag (e.g., `v2.0.15`)
### Check macOS
After installing the `.dmg`, the app should open without any Gatekeeper warnings. You can also verify from the command line:
```bash
codesign -dv --verbose=4 /Applications/Local\ Transcription.app
spctl --assess --type execute /Applications/Local\ Transcription.app
```
### Check Windows
After running the `.msi` or `-setup.exe`, there should be no SmartScreen warning. The installer properties should show your organization name as the publisher.

@@ -212,7 +212,11 @@ class APIServer:
@app.put("/api/config")
async def update_config(update: ConfigUpdate):
engine_reloaded, message = ctrl.apply_settings(update.settings)
import asyncio
# Run the blocking settings update in the default thread pool executor
loop = asyncio.get_running_loop()
engine_reloaded, message = await loop.run_in_executor(
None, ctrl.apply_settings, update.settings
)
return {
"status": "ok",
"message": message,

@@ -608,8 +608,17 @@ class AppController:
Returns (engine_reload_needed, message).
"""
if new_config:
for key, value in new_config.items():
self.config.set(key, value)
# Flatten nested dicts into dot-notation keys so we merge
# individual values instead of replacing entire sections
# (e.g. remote.mode instead of overwriting all of remote)
def _flatten(d, prefix=""):
for k, v in d.items():
full_key = f"{prefix}{k}" if not prefix else f"{prefix}.{k}"
if isinstance(v, dict):
_flatten(v, full_key)
else:
self.config.set(full_key, v)
_flatten(new_config)
# Update web server display settings
if self.web_server:
@@ -682,6 +691,7 @@ class AppController:
"transcription_count": len(self.transcriptions),
"remote_mode": remote_mode,
"server_sync_enabled": self.config.get('server_sync.enabled', False),
"is_cloud_only": self.is_cloud_only,
}
def get_audio_devices(self) -> list[dict]:

@@ -125,6 +125,8 @@ def test_apply_settings_no_reload_when_same(controller):
# Ensure config returns the same values
controller.config.set("transcription.model", "base.en")
controller.config.set("transcription.device", "auto")
# Remote mode must also match (no engine means current mode is 'local')
controller.config.set("remote.mode", "local")
controller.reload_engine = MagicMock(return_value=(True, "reloaded"))

@@ -42,7 +42,7 @@ transcription:
server_sync:
enabled: false
url: "http://localhost:3000/api/send"
url: ""
room: "default"
passphrase: ""
# Font settings are now in the display section (shared for local and server sync)
@@ -69,8 +69,8 @@ web_server:
host: "127.0.0.1"
remote:
mode: local # local | managed | byok
server_url: "" # Proxy server URL for managed mode (e.g., wss://your-proxy.com)
mode: byok # local | managed | byok
server_url: "https://transcribe.shadowdao.com" # Proxy server URL for managed mode
auth_token: "" # JWT stored after login (managed mode)
byok_api_key: "" # Deepgram API key for BYOK mode
deepgram_model: nova-2 # Deepgram model to use

@@ -1,7 +1,7 @@
{
"name": "local-transcription",
"private": true,
"version": "2.0.13",
"version": "2.0.17",
"type": "module",
"scripts": {
"dev": "vite dev",

@@ -1,6 +1,6 @@
[project]
name = "local-transcription"
version = "1.0.10"
version = "1.0.12"
description = "A standalone desktop application for real-time speech-to-text transcription using Whisper models"
readme = "README.md"
requires-python = ">=3.9"

@@ -703,6 +703,36 @@ app.post('/api/send', async (req, res) => {
}
});
// Create room explicitly (no transcription needed)
app.post('/api/create-room', async (req, res) => {
try {
const { room, passphrase } = req.body;
if (!room || !passphrase) {
return res.status(400).json({ error: 'Missing room or passphrase' });
}
// Check if room already exists
const existing = await loadRoom(room);
if (existing) {
const valid = await verifyPassphrase(room, passphrase);
if (!valid) {
return res.status(401).json({ error: 'Room exists with different passphrase' });
}
return res.json({ status: 'ok', room, created: false, message: 'Room already exists' });
}
// Create the room (verifyPassphrase creates it if it doesn't exist)
await verifyPassphrase(room, passphrase);
console.log(`[Room] Created room "${room}"`);
res.json({ status: 'ok', room, created: true });
} catch (err) {
console.error('Error in /api/create-room:', err);
res.status(500).json({ error: err.message });
}
});
// List transcriptions
app.get('/api/list', async (req, res) => {
try {

2
src-tauri/Cargo.lock generated
@@ -1881,7 +1881,7 @@ checksum = "92daf443525c4cce67b150400bc2316076100ce0b3686209eb8cf3c31612e6f0"
[[package]]
name = "local-transcription"
version = "2.0.8"
version = "2.0.12"
dependencies = [
"bytes",
"chrono",

@@ -1,6 +1,6 @@
[package]
name = "local-transcription"
version = "2.0.13"
version = "2.0.17"
description = "Real-time speech-to-text transcription for streamers"
authors = ["Local Transcription Contributors"]
edition = "2021"

@@ -0,0 +1,14 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>com.apple.security.device.audio-input</key>
<true/>
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.network.server</key>
<true/>
<key>com.apple.security.cs.allow-unsigned-executable-memory</key>
<true/>
</dict>
</plist>

8
src-tauri/Info.plist Normal file
@@ -0,0 +1,8 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>NSMicrophoneUsageDescription</key>
<string>Local Transcription needs microphone access for real-time speech-to-text transcription.</string>
</dict>
</plist>

@@ -1,6 +1,6 @@
{
"productName": "Local Transcription",
"version": "2.0.13",
"version": "2.0.17",
"identifier": "net.anhonesthost.local-transcription",
"build": {
"frontendDist": "../dist",
@@ -33,7 +33,10 @@
"icons/icon.icns",
"icons/icon.ico",
"icons/icon.png"
]
],
"windows": {
"digestAlgorithm": "sha256"
}
},
"plugins": {
"shell": {

@@ -1,5 +1,6 @@
<script lang="ts">
import { backendStore } from "$lib/stores/backend";
import { configStore } from "$lib/stores/config";
import { transcriptionStore } from "$lib/stores/transcriptions";
let isTranscribing = $derived(backendStore.appState === "transcribing");
@@ -8,6 +9,16 @@
);
let isLoading = $state(false);
let remoteMode = $derived(configStore.config.remote.mode);
let byokApiKey = $derived(configStore.config.remote.byok_api_key);
let authToken = $derived(configStore.config.remote.auth_token);
let cloudConfigured = $derived(
remoteMode === "local" ||
(remoteMode === "byok" && byokApiKey.trim() !== "") ||
(remoteMode === "managed" && authToken.trim() !== "")
);
let errorMessage = $state("");
async function toggleTranscription() {
@@ -94,7 +105,7 @@
<button
class={isTranscribing ? "danger" : "primary"}
onclick={toggleTranscription}
disabled={!isReady || isLoading}
disabled={!isReady || isLoading || !cloudConfigured}
>
{#if isLoading}
...
@@ -116,6 +127,18 @@
{#if errorMessage}
<span class="error-msg">{errorMessage}</span>
{/if}
{#if !cloudConfigured && isReady}
<div class="cloud-warning">
{#if remoteMode === "byok"}
<span>API key required. Get one at
<a href="https://console.deepgram.com" target="_blank" rel="noopener">console.deepgram.com</a>,
then enter it in Settings.</span>
{:else if remoteMode === "managed"}
<span>Login required. Open Settings to log in.</span>
{/if}
</div>
{/if}
</div>
<style>
@@ -125,6 +148,18 @@
margin-left: 8px;
}
.cloud-warning {
font-size: 12px;
color: #ff9800;
margin-left: 8px;
flex: 1;
}
.cloud-warning a {
color: #4fc3f7;
text-decoration: underline;
}
.controls {
display: flex;
align-items: center;

@@ -46,6 +46,17 @@
let managedPassword = $state("");
let autoCheckUpdates = $state(true);
let isCloudMode = $derived(remoteMode === "managed" || remoteMode === "byok");
let isCloudOnly = $derived(
computeDevices.length > 0 && computeDevices.every(d => d.id === "cloud")
);
// Room creation / join state
let shareCode = $state("");
let joinCode = $state("");
let roomCreating = $state(false);
let roomCreateMessage = $state("");
let saving = $state(false);
let saveMessage = $state("");
@@ -199,7 +210,7 @@
},
remote: {
mode: remoteMode,
server_url: remoteServerUrl,
server_url: remoteServerUrl || MANAGED_SERVER_URL,
byok_api_key: byokApiKey,
},
updates: {
@@ -244,25 +255,121 @@
}
}
const MANAGED_SERVER_URL = "https://transcribe.shadowdao.com";
async function handleManagedLogin() {
try {
await backendStore.apiPost("/api/login", {
email: managedEmail,
password: managedPassword,
server_url: remoteServerUrl || MANAGED_SERVER_URL,
});
} catch (err) {
console.error("Login failed:", err);
}
}
async function handleManagedRegister() {
const CAPTION_SERVER = "https://caption.shadowdao.com";
function generateRandomName(): string {
const adjectives = ['swift', 'bright', 'cosmic', 'electric', 'turbo', 'mega', 'ultra', 'super', 'hyper', 'alpha'];
const nouns = ['phoenix', 'dragon', 'tiger', 'falcon', 'comet', 'storm', 'blaze', 'thunder', 'frost', 'nebula'];
const num = Math.floor(Math.random() * 10000);
return `${adjectives[Math.floor(Math.random() * adjectives.length)]}-${nouns[Math.floor(Math.random() * nouns.length)]}-${num}`;
}
function generateRandomPassphrase(): string {
const chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
let result = '';
for (let i = 0; i < 16; i++) {
result += chars.charAt(Math.floor(Math.random() * chars.length));
}
return result;
}
function encodeShareCode(url: string, room: string, passphrase: string): string {
return btoa(JSON.stringify({ url, room, passphrase }));
}
function decodeShareCode(code: string): { url: string; room: string; passphrase: string } | null {
try {
await backendStore.apiPost("/api/register", {
email: managedEmail,
password: managedPassword,
const json = JSON.parse(atob(code.trim()));
if (json.url && json.room && json.passphrase) {
return json;
}
return null;
} catch {
return null;
}
}
async function handleCreateRoom() {
roomCreating = true;
roomCreateMessage = "";
shareCode = "";
const room = generateRandomName();
const passphrase = generateRandomPassphrase();
const serverSendUrl = `${CAPTION_SERVER}/api/send`;
try {
const resp = await fetch(`${CAPTION_SERVER}/api/create-room`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ room, passphrase }),
});
if (!resp.ok) {
const err = await resp.json().catch(() => ({ error: "Request failed" }));
roomCreateMessage = `Error: ${err.error || resp.statusText}`;
return;
}
syncUrl = serverSendUrl;
syncRoom = room;
syncPassphrase = passphrase;
syncEnabled = true;
shareCode = encodeShareCode(serverSendUrl, room, passphrase);
roomCreateMessage = "Room created! Share the code below with others.";
} catch (err) {
console.error("Register failed:", err);
roomCreateMessage = `Error: ${err instanceof Error ? err.message : String(err)}`;
} finally {
roomCreating = false;
}
}
function handleJoinRoom() {
const decoded = decodeShareCode(joinCode);
if (!decoded) {
roomCreateMessage = "Invalid share code. Please check and try again.";
return;
}
syncUrl = decoded.url;
syncRoom = decoded.room;
syncPassphrase = decoded.passphrase;
syncEnabled = true;
joinCode = "";
roomCreateMessage = "Room joined! Fields have been auto-filled.";
}
async function handleShareCurrentRoom() {
const code = encodeShareCode(syncUrl, syncRoom, syncPassphrase);
shareCode = code;
try {
await navigator.clipboard.writeText(code);
roomCreateMessage = "Share code copied to clipboard!";
} catch {
roomCreateMessage = "Share code generated. Copy it from the field below.";
}
}
async function copyShareCode() {
try {
await navigator.clipboard.writeText(shareCode);
roomCreateMessage = "Share code copied to clipboard!";
} catch {
roomCreateMessage = "Failed to copy. Please select and copy manually.";
}
}
@@ -327,7 +434,85 @@
</div>
</section>
<!-- Transcription Settings -->
<!-- Remote Transcription (moved up for cloud-first UX) -->
<section class="settings-section">
<h3>Transcription Mode</h3>
<div class="radio-group">
<label>
<input
type="radio"
name="remote-mode"
value="byok"
bind:group={remoteMode}
/>
Cloud (Deepgram)
</label>
<label>
<input
type="radio"
name="remote-mode"
value="managed"
bind:group={remoteMode}
/>
Managed Service
</label>
{#if !isCloudOnly}
<label>
<input
type="radio"
name="remote-mode"
value="local"
bind:group={remoteMode}
/>
Local (Whisper)
</label>
{/if}
</div>
{#if remoteMode === "byok"}
<div class="field">
<label for="byok-key">Deepgram API Key</label>
<input
id="byok-key"
type="password"
bind:value={byokApiKey}
placeholder="Enter your Deepgram API key"
/>
<p style="font-size: 11px; color: var(--text-muted); margin-top: 4px;">
Get a key at <a href="https://console.deepgram.com" target="_blank" rel="noopener" style="color: var(--accent-blue);">console.deepgram.com</a>
</p>
</div>
{/if}
{#if remoteMode === "managed"}
<div class="managed-auth">
<div class="field">
<label for="managed-email">Email</label>
<input
id="managed-email"
type="email"
bind:value={managedEmail}
placeholder="email@example.com"
/>
</div>
<div class="field">
<label for="managed-password">Password</label>
<input
id="managed-password"
type="password"
bind:value={managedPassword}
/>
</div>
<div class="auth-buttons">
<button onclick={handleManagedLogin}>Login</button>
</div>
<p style="font-size: 11px; color: var(--text-muted); margin-top: 8px;">
Don't have an account? <a href="https://transcribe.shadowdao.com/register.html" target="_blank" rel="noopener" style="color: var(--accent-blue);">Sign up here</a>
</p>
</div>
{/if}
</section>
{#if !isCloudMode}
<!-- Transcription Settings (local Whisper only) -->
<section class="settings-section">
<h3>Transcription Settings</h3>
<div class="field">
@@ -473,6 +658,7 @@
/>
</div>
</section>
{/if}
<!-- Display Settings -->
<section class="settings-section">
@@ -628,11 +814,11 @@
</div>
</section>
<!-- Server Sync -->
<!-- Server Sync (Shared Captions) -->
<section class="settings-section">
<h3>Server Sync</h3>
<h3>Shared Captions</h3>
<div class="field-row">
<label for="sync-enabled">Enable Server Sync</label>
<label for="sync-enabled">Enable Shared Captions</label>
<input
id="sync-enabled"
type="checkbox"
@@ -640,13 +826,57 @@
/>
</div>
{#if syncEnabled}
<div class="room-actions">
<div class="room-buttons-row">
<button
onclick={handleCreateRoom}
disabled={roomCreating}
class="secondary"
>
{roomCreating ? "Creating..." : "Create Room"}
</button>
<button
onclick={handleShareCurrentRoom}
disabled={!syncUrl.trim() || !syncRoom.trim() || !syncPassphrase.trim()}
class="secondary"
>
Share Current Room
</button>
</div>
<div class="join-row">
<input
type="text"
bind:value={joinCode}
placeholder="Paste share code to join"
class="join-input"
/>
<button onclick={handleJoinRoom} disabled={!joinCode.trim()} class="secondary">
Join
</button>
</div>
</div>
{#if roomCreateMessage}
<p class="room-message" class:error={roomCreateMessage.startsWith("Error")}>{roomCreateMessage}</p>
{/if}
{#if shareCode}
<div class="share-code-box">
<label>Share Code</label>
<div class="share-code-row">
<input type="text" value={shareCode} readonly class="share-code-input" />
<button onclick={copyShareCode} class="secondary">Copy</button>
</div>
</div>
{/if}
<div class="field">
<label for="sync-url">Server URL</label>
<input
id="sync-url"
type="url"
bind:value={syncUrl}
placeholder="http://localhost:3000/api/send"
placeholder="https://caption.shadowdao.com/api/send"
/>
</div>
<div class="field">
@@ -664,90 +894,6 @@
{/if}
</section>
<!-- Remote Transcription -->
<section class="settings-section">
<h3>Remote Transcription</h3>
<div class="radio-group">
<label>
<input
type="radio"
name="remote-mode"
value="local"
bind:group={remoteMode}
/>
Local
</label>
<label>
<input
type="radio"
name="remote-mode"
value="managed"
bind:group={remoteMode}
/>
Managed
</label>
<label>
<input
type="radio"
name="remote-mode"
value="byok"
bind:group={remoteMode}
/>
BYOK (Bring Your Own Key)
</label>
</div>
{#if remoteMode === "managed"}
<div class="field">
<label for="remote-url">Server URL</label>
<input
id="remote-url"
type="url"
bind:value={remoteServerUrl}
placeholder="wss://your-proxy.com"
/>
</div>
{/if}
{#if remoteMode === "byok"}
<div class="field">
<label for="byok-key">Deepgram API Key</label>
<input
id="byok-key"
type="password"
bind:value={byokApiKey}
placeholder="Enter your Deepgram API key"
/>
<p style="font-size: 11px; color: var(--text-muted); margin-top: 4px;">
Get a key at <a href="https://console.deepgram.com" target="_blank" rel="noopener" style="color: var(--accent-blue);">console.deepgram.com</a>
</p>
</div>
{/if}
{#if remoteMode === "managed"}
<div class="managed-auth">
<div class="field">
<label for="managed-email">Email</label>
<input
id="managed-email"
type="email"
bind:value={managedEmail}
placeholder="email@example.com"
/>
</div>
<div class="field">
<label for="managed-password">Password</label>
<input
id="managed-password"
type="password"
bind:value={managedPassword}
/>
</div>
<div class="auth-buttons">
<button onclick={handleManagedLogin}>Login</button>
<button onclick={handleManagedRegister}>Register</button>
</div>
</div>
{/if}
</section>
<!-- Updates -->
<section class="settings-section">
<h3>Updates</h3>
@@ -943,6 +1089,78 @@
color: #f44336;
}
.room-actions {
display: flex;
flex-direction: column;
gap: 8px;
margin-bottom: 12px;
}
.room-buttons-row {
display: flex;
gap: 8px;
}
.join-row {
display: flex;
gap: 8px;
}
.join-input {
flex: 1;
}
.room-message {
font-size: 12px;
color: #4CAF50;
margin-bottom: 8px;
}
.room-message.error {
color: #f44336;
}
.share-code-box {
margin-bottom: 12px;
}
.share-code-box label {
display: block;
margin-bottom: 4px;
font-size: 12px;
color: var(--text-secondary);
}
.share-code-row {
display: flex;
gap: 8px;
}
.share-code-input {
flex: 1;
font-size: 11px;
font-family: monospace;
}
.secondary {
background: transparent;
border: 1px solid var(--border-color);
color: var(--text-primary);
padding: 6px 12px;
border-radius: 6px;
cursor: pointer;
font-size: 13px;
}
.secondary:hover {
background: var(--bg-tertiary);
}
.secondary:disabled {
opacity: 0.5;
cursor: not-allowed;
}
.danger-btn {
background: transparent;
border: 1px solid var(--accent-red, #f44336);
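
Review note on the share-code helpers in this file: `encodeShareCode` passes `JSON.stringify(...)` straight to `btoa`, which throws if the string contains a character above U+00FF. The generated room names and passphrases are ASCII, but the sync fields can also be filled in by hand before "Share Current Room" is used. The sketch below is one possible way to make the round trip Unicode-safe and to check field types on decode. It is not part of this commit; `ShareCode`, `encodeShareCodeUtf8`, `decodeShareCodeUtf8`, and `isNonEmptyString` are hypothetical names.

```typescript
// Hypothetical shape mirroring the object serialized by encodeShareCode in the diff.
interface ShareCode {
  url: string;
  room: string;
  passphrase: string;
}

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.length > 0;
}

// Encode through UTF-8 bytes so btoa() only ever sees code units 0..255.
function encodeShareCodeUtf8(code: ShareCode): string {
  const bytes = new TextEncoder().encode(JSON.stringify(code));
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Returns null for anything that is not base64-encoded JSON with three
// non-empty string fields, matching the null-on-failure contract in the diff.
function decodeShareCodeUtf8(text: string): ShareCode | null {
  try {
    const binary = atob(text.trim());
    const bytes = new Uint8Array(binary.length);
    for (let i = 0; i < binary.length; i++) {
      bytes[i] = binary.charCodeAt(i);
    }
    const parsed: unknown = JSON.parse(new TextDecoder().decode(bytes));
    if (typeof parsed !== "object" || parsed === null) {
      return null;
    }
    const { url, room, passphrase } = parsed as Record<string, unknown>;
    if (!isNonEmptyString(url) || !isNonEmptyString(room) || !isNonEmptyString(passphrase)) {
      return null;
    }
    return { url, room, passphrase };
  } catch {
    return null;
  }
}
```

Either way, base64 is an encoding rather than encryption: anyone holding the share code can read the passphrase inside it, so the code should be treated as a secret.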


@@ -19,6 +19,7 @@ interface BackendState {
wsConnection: WebSocket | null;
version: string;
lastError: string;
isCloudOnly: boolean;
}
let state = $state<BackendState>({
@@ -30,6 +31,7 @@ let state = $state<BackendState>({
wsConnection: null,
version: "1.4.0",
lastError: "",
isCloudOnly: false,
});
let reconnectTimer: ReturnType<typeof setTimeout> | null = null;
@@ -72,6 +74,9 @@ async function pollStatus() {
if (data.version) {
state.version = data.version;
}
if (data.is_cloud_only !== undefined) {
state.isCloudOnly = data.is_cloud_only;
}
}
} catch {
// API not ready yet, will retry
@@ -285,6 +290,9 @@ export const backendStore = {
get lastError() {
return state.lastError;
},
get isCloudOnly() {
return state.isCloudOnly;
},
get apiBaseUrl() {
return `http://localhost:${state.port}`;
},
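
Review note: this store still records `is_cloud_only` from the status poll, while the settings page derives the same flag from the compute device list. The derivation can be expressed as a small pure function, sketched below for illustration. `ComputeDevice` is a hypothetical minimal shape (the real `/api/compute-devices` entries may carry more fields), and the `"cpu"` id in the usage is an assumed example value, not one taken from the source.

```typescript
// Hypothetical minimal shape of one compute device entry.
interface ComputeDevice {
  id: string;
}

function isCloudOnly(devices: readonly ComputeDevice[]): boolean {
  // every() returns true for an empty array, so the length check keeps the
  // "devices not loaded yet" state from being treated as cloud-only.
  return devices.length > 0 && devices.every((d) => d.id === "cloud");
}

// Usage with assumed example ids:
const cloudOnly = isCloudOnly([{ id: "cloud" }]);
const mixed = isCloudOnly([{ id: "cloud" }, { id: "cpu" }]);
const notLoaded = isCloudOnly([]);
```

With an empty list the function reports `false`, so the Local (Whisper) option stays visible until the device list has actually loaded.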


@@ -107,7 +107,7 @@ function getDefaultConfig(): AppConfig {
},
server_sync: {
enabled: false,
url: "http://localhost:3000/api/send",
url: "",
room: "default",
passphrase: "",
},
@@ -128,7 +128,7 @@ function getDefaultConfig(): AppConfig {
},
web_server: { port: 8080, host: "127.0.0.1" },
remote: {
mode: "local",
mode: "byok",
server_url: "",
auth_token: "",
byok_api_key: "",
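
Review note: the defaults above are nested by section, and the commit message for the settings-save fix describes flattening nested config objects into dot-notation keys so that a partial update does not wipe keys it leaves out, such as `auth_token`. A minimal sketch of that idea follows, assuming plain nested objects. `flattenConfig` and `ConfigValue` are hypothetical names; the project's actual implementation is not shown in this diff and may differ.

```typescript
type ConfigValue =
  | string
  | number
  | boolean
  | null
  | ConfigValue[]
  | { [key: string]: ConfigValue };

// Turns { remote: { mode: "byok" } } into { "remote.mode": "byok" }.
// Arrays and null are kept as leaf values rather than descended into.
function flattenConfig(
  obj: { [key: string]: ConfigValue },
  prefix: string = ""
): Record<string, ConfigValue> {
  const out: Record<string, ConfigValue> = {};
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      const nested = flattenConfig(value, path);
      for (const nestedKey of Object.keys(nested)) {
        out[nestedKey] = nested[nestedKey];
      }
    } else {
      out[path] = value;
    }
  }
  return out;
}

// A partial update that omits remote.auth_token produces no key for it,
// so a key-by-key save would leave the stored token untouched.
const flat = flattenConfig({
  remote: { mode: "byok", server_url: "" },
  web_server: { port: 8080 },
});
```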


@@ -1,7 +1,7 @@
"""Version information for Local Transcription."""
__version__ = "2.0.13"
__version_info__ = (2, 0, 13)
__version__ = "2.0.17"
__version_info__ = (2, 0, 17)
# Version history:
# 1.4.0 - Auto-update feature: