chore: bump version to 1.4.4 [skip ci]

Add sidecar download, setup screen, and auto-launch
On first launch, the app now prompts users to download the Python sidecar (CPU or CUDA variant) from Gitea releases, matching the voice-to-notes pattern. On subsequent launches, it auto-launches the sidecar and connects.

New Rust module (src-tauri/src/sidecar/):
- download_sidecar: streams download with progress events, extracts zip
- check_sidecar: verifies installed sidecar binary exists
- check_sidecar_update: compares local vs latest release version
- SidecarManager: launches binary, waits for ready JSON, manages lifecycle
- Dev mode: runs `python -m backend.main_headless` directly
- start_sidecar/stop_sidecar/get_sidecar_port: Tauri commands

New Svelte component (SidecarSetup.svelte):
- First-time setup overlay with CPU/CUDA variant selection
- Download progress bar with byte counter
- Error state with retry, success state with auto-continue

Updated App.svelte state machine:
- checking -> needs_setup -> starting -> connected
- Falls back to direct connection in browser dev mode

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
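
The App.svelte state machine named in the commit message can be sketched roughly as below. This is a minimal illustration, not the code from this commit: the four state names come from the message, while the `AppEvent` type, its event names, and the `nextState` function are hypothetical and introduced only for this sketch.

```typescript
// States taken from the commit message; everything else is an assumption.
type AppState = "checking" | "needs_setup" | "starting" | "connected";

// Hypothetical events that could drive the transitions.
type AppEvent =
  | { kind: "sidecar_missing" }    // check_sidecar found no installed binary
  | { kind: "sidecar_found" }      // sidecar already installed
  | { kind: "download_complete" }  // setup overlay finished downloading
  | { kind: "sidecar_ready" }      // sidecar reported ready
  | { kind: "browser_dev_mode" };  // no Tauri shell, connect directly

function nextState(state: AppState, event: AppEvent): AppState {
  switch (state) {
    case "checking":
      if (event.kind === "sidecar_missing") return "needs_setup";
      if (event.kind === "sidecar_found") return "starting";
      // Browser dev mode skips setup and launch entirely.
      if (event.kind === "browser_dev_mode") return "connected";
      return state;
    case "needs_setup":
      return event.kind === "download_complete" ? "starting" : state;
    case "starting":
      return event.kind === "sidecar_ready" ? "connected" : state;
    case "connected":
      return state;
  }
}
```

Under these assumptions, a first launch walks through `checking`, `needs_setup`, `starting`, and `connected`, while later launches skip `needs_setup` because the sidecar is already installed.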
2026-04-07 00:05:15 +00:00 · 2026-04-06 17:02:56 -07:00 · 2026-04-06 16:55:03 -07:00 · 2026-04-06 23:49:18 +00:00 · 2026-04-06 21:02:18 +00:00 · 2026-04-06 14:02:11 -07:00
55 changed files with 20298 additions and 723 deletions
--- a/.claude/settings.local.json
+++ b/.claude/settings.local.json
@@ -0,0 +1,9 @@
+{
+  "permissions": {
+    "allow": [
+      "Bash(python3:*)",
+      "Bash(node --check:*)",
+      "Bash(ls:*)"
+    ]
+  }
+}
--- a/.gitea/workflows/build-sidecar.yml
+++ b/.gitea/workflows/build-sidecar.yml
@@ -0,0 +1,428 @@
+name: Build Sidecars
+
+on:
+  push:
+    branches: [main]
+    paths:
+      - 'client/**'
+      - 'server/**'
+      - 'backend/**'
+      - 'pyproject.toml'
+      - 'local-transcription-headless.spec'
+  workflow_dispatch:
+
+jobs:
+  bump-sidecar-version:
+    name: Bump sidecar version and tag
+    if: "!contains(github.event.head_commit.message, '[skip ci]')"
+    runs-on: ubuntu-latest
+    outputs:
+      version: ${{ steps.bump.outputs.version }}
+      tag: ${{ steps.bump.outputs.tag }}
+      has_changes: ${{ steps.check_changes.outputs.has_changes }}
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 2
+
+      - name: Check for backend changes
+        id: check_changes
+        run: |
+          # If triggered by workflow_dispatch, always build
+          if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
+            echo "has_changes=true" >> $GITHUB_OUTPUT
+            exit 0
+          fi
+          # Check if relevant files changed in this commit
+          CHANGED=$(git diff --name-only HEAD~1 HEAD -- client/ server/ backend/ pyproject.toml local-transcription-headless.spec 2>/dev/null || echo "")
+          if [ -n "$CHANGED" ]; then
+            echo "has_changes=true" >> $GITHUB_OUTPUT
+            echo "Backend changes detected: $CHANGED"
+          else
+            echo "has_changes=false" >> $GITHUB_OUTPUT
+            echo "No backend changes detected, skipping sidecar build"
+          fi
+
+      - name: Configure git
+        if: steps.check_changes.outputs.has_changes == 'true'
+        run: |
+          git config user.name "Gitea Actions"
+          git config user.email "actions@gitea.local"
+
+      - name: Bump sidecar patch version
+        if: steps.check_changes.outputs.has_changes == 'true'
+        id: bump
+        run: |
+          # Read current version from pyproject.toml
+          CURRENT=$(grep '^version = ' pyproject.toml | head -1 | sed 's/version = "\(.*\)"/\1/')
+          echo "Current sidecar version: ${CURRENT}"
+
+          # Increment patch number
+          MAJOR=$(echo "${CURRENT}" | cut -d. -f1)
+          MINOR=$(echo "${CURRENT}" | cut -d. -f2)
+          PATCH=$(echo "${CURRENT}" | cut -d. -f3)
+          NEW_PATCH=$((PATCH + 1))
+          NEW_VERSION="${MAJOR}.${MINOR}.${NEW_PATCH}"
+          echo "New sidecar version: ${NEW_VERSION}"
+
+          # Update pyproject.toml
+          sed -i "s/^version = \"${CURRENT}\"/version = \"${NEW_VERSION}\"/" pyproject.toml
+
+          # Update version.py
+          sed -i "s/__version__ = \"${CURRENT}\"/__version__ = \"${NEW_VERSION}\"/" version.py
+          sed -i "s/__version_info__ = .*/__version_info__ = (${MAJOR}, ${MINOR}, ${NEW_PATCH})/" version.py
+
+          echo "version=${NEW_VERSION}" >> $GITHUB_OUTPUT
+          echo "tag=sidecar-v${NEW_VERSION}" >> $GITHUB_OUTPUT
+
+      - name: Commit and tag
+        if: steps.check_changes.outputs.has_changes == 'true'
+        env:
+          BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
+        run: |
+          NEW_VERSION="${{ steps.bump.outputs.version }}"
+          TAG="${{ steps.bump.outputs.tag }}"
+          git add pyproject.toml version.py
+          git commit -m "chore: bump sidecar version to ${NEW_VERSION} [skip ci]"
+          git tag "${TAG}"
+
+          REMOTE_URL=$(git remote get-url origin | sed "s|://|://gitea-actions:${BUILD_TOKEN}@|")
+          git pull --rebase "${REMOTE_URL}" main || true
+          git push "${REMOTE_URL}" HEAD:main
+          git push "${REMOTE_URL}" "${TAG}"
+
+      - name: Create Gitea release
+        if: steps.check_changes.outputs.has_changes == 'true'
+        env:
+          BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
+        run: |
+          REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
+          TAG="${{ steps.bump.outputs.tag }}"
+          VERSION="${{ steps.bump.outputs.version }}"
+          RELEASE_NAME="Sidecar v${VERSION}"
+
+          curl -s -X POST \
+            -H "Authorization: token ${BUILD_TOKEN}" \
+            -H "Content-Type: application/json" \
+            -d "{\"tag_name\": \"${TAG}\", \"name\": \"${RELEASE_NAME}\", \"body\": \"Automated sidecar build.\", \"draft\": false, \"prerelease\": false}" \
+            "${REPO_API}/releases"
+          echo "Created release: ${RELEASE_NAME}"
+
+  # ── Linux sidecar (CUDA + CPU) ──
+
+  build-sidecar-linux:
+    name: Build Sidecar (Linux)
+    needs: bump-sidecar-version
+    if: needs.bump-sidecar-version.outputs.has_changes == 'true'
+    runs-on: ubuntu-latest
+    env:
+      PYTHON_VERSION: "3.11"
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          ref: ${{ needs.bump-sidecar-version.outputs.tag }}
+
+      - name: Install uv
+        run: |
+          if command -v uv &> /dev/null; then
+            echo "uv already installed: $(uv --version)"
+          else
+            curl -LsSf https://astral.sh/uv/install.sh | sh
+            echo "$HOME/.local/bin" >> $GITHUB_PATH
+          fi
+
+      - name: Set up Python
+        run: uv python install ${{ env.PYTHON_VERSION }}
+
+      - name: Install system dependencies
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y portaudio19-dev
+
+      - name: Build sidecar (CUDA)
+        run: |
+          uv sync --frozen || uv sync
+          uv run pyinstaller local-transcription-headless.spec
+
+      - name: Package sidecar (CUDA)
+        run: |
+          cd dist/local-transcription-backend && zip -r ../../sidecar-linux-x86_64-cuda.zip .
+
+      - name: Build sidecar (CPU)
+        run: |
+          rm -rf dist/local-transcription-backend build/
+          uv pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu --force-reinstall
+          # --no-sync: a plain uv run may re-sync the env and restore the locked CUDA torch
+          uv run --no-sync pyinstaller local-transcription-headless.spec
+
+      - name: Package sidecar (CPU)
+        run: |
+          cd dist/local-transcription-backend && zip -r ../../sidecar-linux-x86_64-cpu.zip .
+
+      - name: Upload to sidecar release
+        env:
+          BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
+        run: |
+          sudo apt-get install -y jq
+          REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
+          TAG="${{ needs.bump-sidecar-version.outputs.tag }}"
+
+          echo "Waiting for sidecar release ${TAG} to be available..."
+          for i in $(seq 1 30); do
+            RELEASE_JSON=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
+              "${REPO_API}/releases/tags/${TAG}")
+            RELEASE_ID=$(echo "$RELEASE_JSON" | jq -r '.id // empty')
+
+            if [ -n "${RELEASE_ID}" ] && [ "${RELEASE_ID}" != "null" ]; then
+              echo "Found sidecar release: ${TAG} (ID: ${RELEASE_ID})"
+              break
+            fi
+
+            echo "Attempt ${i}/30: Release not ready yet, retrying in 10s..."
+            sleep 10
+          done
+
+          if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
+            echo "ERROR: Failed to find sidecar release for tag ${TAG} after 30 attempts."
+            exit 1
+          fi
+
+          for file in sidecar-*.zip; do
+            filename=$(basename "$file")
+            encoded_name=$(echo "$filename" | sed 's/ /%20/g')
+            echo "Uploading ${filename} ($(du -h "$file" | cut -f1))..."
+
+            ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
+              "${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
+            if [ -n "${ASSET_ID}" ]; then
+              curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
+                "${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
+            fi
+
+            HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
+              -H "Authorization: token ${BUILD_TOKEN}" \
+              -H "Content-Type: application/octet-stream" \
+              -T "$file" \
+              "${REPO_API}/releases/${RELEASE_ID}/assets?name=${encoded_name}")
+            echo "Upload response: HTTP ${HTTP_CODE}"
+          done
+
+  # ── Windows sidecar (CUDA + CPU) ──
+
+  build-sidecar-windows:
+    name: Build Sidecar (Windows)
+    needs: bump-sidecar-version
+    if: needs.bump-sidecar-version.outputs.has_changes == 'true'
+    runs-on: windows-latest
+    env:
+      PYTHON_VERSION: "3.11"
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          ref: ${{ needs.bump-sidecar-version.outputs.tag }}
+
+      - name: Install uv
+        shell: powershell
+        run: |
+          if (Get-Command uv -ErrorAction SilentlyContinue) {
+            Write-Host "uv already installed: $(uv --version)"
+          } else {
+            irm https://astral.sh/uv/install.ps1 | iex
+            # Add every uv install location that exists to PATH
+            $uvPaths = @(
+              "$env:USERPROFILE\.local\bin",
+              "$env:USERPROFILE\.cargo\bin",
+              "$env:LOCALAPPDATA\uv\bin"
+            )
+            foreach ($p in $uvPaths) {
+              if (Test-Path $p) {
+                echo $p | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
+              }
+            }
+          }
+
+      - name: Set up Python
+        shell: powershell
+        run: uv python install ${{ env.PYTHON_VERSION }}
+
+      - name: Install 7-Zip
+        shell: powershell
+        run: |
+          if (-not (Get-Command 7z -ErrorAction SilentlyContinue)) {
+            choco install 7zip -y
+          }
+
+      - name: Build sidecar (CUDA)
+        shell: powershell
+        run: |
+          uv sync --frozen
+          if ($LASTEXITCODE -ne 0) { uv sync }
+          uv run pyinstaller local-transcription-headless.spec
+
+      - name: Package sidecar (CUDA)
+        shell: powershell
+        run: |
+          7z a -tzip -mx=5 sidecar-windows-x86_64-cuda.zip .\dist\local-transcription-backend\*
+
+      - name: Build sidecar (CPU)
+        shell: powershell
+        run: |
+          Remove-Item -Recurse -Force dist\local-transcription-backend, build -ErrorAction SilentlyContinue
+          uv pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu --force-reinstall
+          # --no-sync: a plain uv run may re-sync the env and restore the locked CUDA torch
+          uv run --no-sync pyinstaller local-transcription-headless.spec
+
+      - name: Package sidecar (CPU)
+        shell: powershell
+        run: |
+          7z a -tzip -mx=5 sidecar-windows-x86_64-cpu.zip .\dist\local-transcription-backend\*
+
+      - name: Upload to sidecar release
+        shell: powershell
+        env:
+          BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
+        run: |
+          $REPO_API = "${{ github.server_url }}/api/v1/repos/${{ github.repository }}"
+          $Headers = @{ "Authorization" = "token $env:BUILD_TOKEN" }
+          $TAG = "${{ needs.bump-sidecar-version.outputs.tag }}"
+
+          Write-Host "Waiting for sidecar release ${TAG} to be available..."
+          $RELEASE_ID = $null
+
+          for ($i = 1; $i -le 30; $i++) {
+            try {
+              $release = Invoke-RestMethod -Uri "${REPO_API}/releases/tags/${TAG}" -Headers $Headers -ErrorAction Stop
+              $RELEASE_ID = $release.id
+
+              if ($RELEASE_ID) {
+                Write-Host "Found sidecar release: ${TAG} (ID: ${RELEASE_ID})"
+                break
+              }
+            } catch {}
+
+            Write-Host "Attempt ${i}/30: Release not ready yet, retrying in 10s..."
+            Start-Sleep -Seconds 10
+          }
+
+          if (-not $RELEASE_ID) {
+            Write-Host "ERROR: Failed to find sidecar release for tag ${TAG} after 30 attempts."
+            exit 1
+          }
+
+          Get-ChildItem -Path . -Filter "sidecar-*.zip" | ForEach-Object {
+            $filename = $_.Name
+            $encodedName = [System.Uri]::EscapeDataString($filename)
+            $size = [math]::Round($_.Length / 1MB, 1)
+            Write-Host "Uploading ${filename} (${size} MB)..."
+
+            try {
+              $assets = Invoke-RestMethod -Uri "${REPO_API}/releases/${RELEASE_ID}/assets" -Headers $Headers
+              $existing = $assets | Where-Object { $_.name -eq $filename }
+              if ($existing) {
+                Invoke-RestMethod -Uri "${REPO_API}/releases/${RELEASE_ID}/assets/$($existing.id)" -Method Delete -Headers $Headers
+              }
+            } catch {}
+
+            $uploadUrl = "${REPO_API}/releases/${RELEASE_ID}/assets?name=${encodedName}"
+            $result = curl.exe --fail --silent --show-error `
+              -X POST `
+              -H "Authorization: token $env:BUILD_TOKEN" `
+              -H "Content-Type: application/octet-stream" `
+              -T "$($_.FullName)" `
+              "$uploadUrl" 2>&1
+            if ($LASTEXITCODE -eq 0) {
+              Write-Host "Upload successful: ${filename}"
+            } else {
+              Write-Host "WARNING: Upload failed for ${filename}: ${result}"
+            }
+          }
+
+  # ── macOS sidecar (CPU only — no CUDA on macOS) ──
+
+  build-sidecar-macos:
+    name: Build Sidecar (macOS)
+    needs: bump-sidecar-version
+    if: needs.bump-sidecar-version.outputs.has_changes == 'true'
+    runs-on: macos-latest
+    env:
+      PYTHON_VERSION: "3.11"
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          ref: ${{ needs.bump-sidecar-version.outputs.tag }}
+
+      - name: Install uv
+        run: |
+          if command -v uv &> /dev/null; then
+            echo "uv already installed: $(uv --version)"
+          else
+            curl -LsSf https://astral.sh/uv/install.sh | sh
+            echo "$HOME/.local/bin" >> $GITHUB_PATH
+          fi
+
+      - name: Set up Python
+        run: uv python install ${{ env.PYTHON_VERSION }}
+
+      - name: Install system dependencies
+        run: brew install portaudio
+
+      - name: Build sidecar (CPU)
+        env:
+          UV_NO_SOURCES: "1"
+        run: |
+          # UV_NO_SOURCES bypasses pyproject.toml's [tool.uv.sources] which forces
+          # torch from the CUDA index (no macOS ARM wheels there).
+          # Applies to both uv sync AND uv run (which re-resolves).
+          # Default PyPI torch includes MPS (Apple Silicon GPU) support.
+          uv sync
+          uv run pyinstaller local-transcription-headless.spec
+
+      - name: Package sidecar (CPU)
+        run: |
+          cd dist/local-transcription-backend && zip -r ../../sidecar-macos-aarch64-cpu.zip .
+
+      - name: Upload to sidecar release
+        env:
+          BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
+        run: |
+          which jq || brew install jq
+          REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
+          TAG="${{ needs.bump-sidecar-version.outputs.tag }}"
+
+          echo "Waiting for sidecar release ${TAG} to be available..."
+          for i in $(seq 1 30); do
+            RELEASE_JSON=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
+              "${REPO_API}/releases/tags/${TAG}")
+            RELEASE_ID=$(echo "$RELEASE_JSON" | jq -r '.id // empty')
+
+            if [ -n "${RELEASE_ID}" ] && [ "${RELEASE_ID}" != "null" ]; then
+              echo "Found sidecar release: ${TAG} (ID: ${RELEASE_ID})"
+              break
+            fi
+
+            echo "Attempt ${i}/30: Release not ready yet, retrying in 10s..."
+            sleep 10
+          done
+
+          if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
+            echo "ERROR: Failed to find sidecar release for tag ${TAG} after 30 attempts."
+            exit 1
+          fi
+
+          for file in sidecar-*.zip; do
+            filename=$(basename "$file")
+            encoded_name=$(echo "$filename" | sed 's/ /%20/g')
+            echo "Uploading ${filename} ($(du -h "$file" | cut -f1))..."
+
+            ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
+              "${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
+            if [ -n "${ASSET_ID}" ]; then
+              curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
+                "${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
+            fi
+
+            HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
+              -H "Authorization: token ${BUILD_TOKEN}" \
+              -H "Content-Type: application/octet-stream" \
+              -T "$file" \
+              "${REPO_API}/releases/${RELEASE_ID}/assets?name=${encoded_name}")
+            echo "Upload response: HTTP ${HTTP_CODE}"
+          done
--- a/.gitea/workflows/release.yml
+++ b/.gitea/workflows/release.yml
@@ -0,0 +1,300 @@
+name: Release
+
+on:
+  push:
+    branches: [main]
+
+jobs:
+  bump-version:
+    name: Bump version and tag
+    if: "!contains(github.event.head_commit.message, '[skip ci]')"
+    runs-on: ubuntu-latest
+    outputs:
+      new_version: ${{ steps.bump.outputs.new_version }}
+      tag: ${{ steps.bump.outputs.tag }}
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Configure git
+        run: |
+          git config user.name "Gitea Actions"
+          git config user.email "actions@gitea.local"
+
+      - name: Bump patch version
+        id: bump
+        run: |
+          # Read current version from package.json
+          CURRENT=$(grep '"version"' package.json | head -1 | sed 's/.*"version": *"\([^"]*\)".*/\1/')
+          echo "Current version: ${CURRENT}"
+
+          # Increment patch number
+          MAJOR=$(echo "${CURRENT}" | cut -d. -f1)
+          MINOR=$(echo "${CURRENT}" | cut -d. -f2)
+          PATCH=$(echo "${CURRENT}" | cut -d. -f3)
+          NEW_PATCH=$((PATCH + 1))
+          NEW_VERSION="${MAJOR}.${MINOR}.${NEW_PATCH}"
+          echo "New version: ${NEW_VERSION}"
+
+          # Update package.json
+          sed -i "s/\"version\": \"${CURRENT}\"/\"version\": \"${NEW_VERSION}\"/" package.json
+
+          # Update src-tauri/tauri.conf.json
+          sed -i "s/\"version\": \"${CURRENT}\"/\"version\": \"${NEW_VERSION}\"/" src-tauri/tauri.conf.json
+
+          # Update src-tauri/Cargo.toml
+          sed -i "s/^version = \"${CURRENT}\"/version = \"${NEW_VERSION}\"/" src-tauri/Cargo.toml
+
+          # Update version.py
+          sed -i "s/__version__ = \"${CURRENT}\"/__version__ = \"${NEW_VERSION}\"/" version.py
+          sed -i "s/__version_info__ = .*/__version_info__ = (${MAJOR}, ${MINOR}, ${NEW_PATCH})/" version.py
+
+          echo "new_version=${NEW_VERSION}" >> $GITHUB_OUTPUT
+          echo "tag=v${NEW_VERSION}" >> $GITHUB_OUTPUT
+
+      - name: Commit and tag
+        env:
+          BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
+        run: |
+          NEW_VERSION="${{ steps.bump.outputs.new_version }}"
+          git add package.json src-tauri/tauri.conf.json src-tauri/Cargo.toml version.py
+          git commit -m "chore: bump version to ${NEW_VERSION} [skip ci]"
+          git tag "v${NEW_VERSION}"
+
+          REMOTE_URL=$(git remote get-url origin | sed "s|://|://gitea-actions:${BUILD_TOKEN}@|")
+          git pull --rebase "${REMOTE_URL}" main || true
+          git push "${REMOTE_URL}" HEAD:main
+          git push "${REMOTE_URL}" "v${NEW_VERSION}"
+
+      - name: Create Gitea release
+        env:
+          BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
+        run: |
+          REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
+          TAG="${{ steps.bump.outputs.tag }}"
+          RELEASE_NAME="Local Transcription ${TAG}"
+
+          curl -s -X POST \
+            -H "Authorization: token ${BUILD_TOKEN}" \
+            -H "Content-Type: application/json" \
+            -d "{\"tag_name\": \"${TAG}\", \"name\": \"${RELEASE_NAME}\", \"body\": \"Automated build.\", \"draft\": false, \"prerelease\": false}" \
+            "${REPO_API}/releases"
+          echo "Created release: ${RELEASE_NAME}"
+
+  # ── Platform builds (run after version bump) ──
+
+  build-linux:
+    name: Build App (Linux)
+    needs: bump-version
+    runs-on: ubuntu-latest
+    env:
+      NODE_VERSION: "20"
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          ref: ${{ needs.bump-version.outputs.tag }}
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: ${{ env.NODE_VERSION }}
+
+      - name: Install Rust stable
+        run: |
+          curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable
+          echo "$HOME/.cargo/bin" >> $GITHUB_PATH
+
+      - name: Install system dependencies
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y libgtk-3-dev libwebkit2gtk-4.1-dev libappindicator3-dev librsvg2-dev patchelf xdg-utils rpm
+
+      - name: Install npm dependencies
+        run: npm ci
+
+      - name: Build Tauri app
+        run: npm run tauri build
+
+      - name: Upload to release
+        env:
+          BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
+        run: |
+          sudo apt-get install -y jq
+          REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
+          TAG="${{ needs.bump-version.outputs.tag }}"
+          echo "Release tag: ${TAG}"
+
+          RELEASE_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
+            "${REPO_API}/releases/tags/${TAG}" | jq -r '.id // empty')
+
+          if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
+            echo "ERROR: Failed to find release for tag ${TAG}."
+            exit 1
+          fi
+          echo "Release ID: ${RELEASE_ID}"
+
+          find src-tauri/target/release/bundle -type f \( -name "*.deb" -o -name "*.rpm" -o -name "*.AppImage" \) | while IFS= read -r file; do
+            filename=$(basename "$file")
+            encoded_name=$(echo "$filename" | sed 's/ /%20/g')
+            echo "Uploading ${filename} ($(du -h "$file" | cut -f1))..."
+
+            ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
+              "${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
+            if [ -n "${ASSET_ID}" ]; then
+              curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
+                "${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
+            fi
+
+            HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
+              -H "Authorization: token ${BUILD_TOKEN}" \
+              -H "Content-Type: application/octet-stream" \
+              -T "$file" \
+              "${REPO_API}/releases/${RELEASE_ID}/assets?name=${encoded_name}")
+            echo "Upload response: HTTP ${HTTP_CODE}"
+          done
+
+  build-windows:
+    name: Build App (Windows)
+    needs: bump-version
+    runs-on: windows-latest
+    env:
+      NODE_VERSION: "20"
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          ref: ${{ needs.bump-version.outputs.tag }}
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: ${{ env.NODE_VERSION }}
+
+      - name: Install Rust stable
+        shell: powershell
+        run: |
+          if (Get-Command rustup -ErrorAction SilentlyContinue) {
+            rustup default stable
+          } else {
+            Invoke-WebRequest -Uri https://win.rustup.rs/x86_64 -OutFile rustup-init.exe
+            .\rustup-init.exe -y --default-toolchain stable
+            echo "$env:USERPROFILE\.cargo\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
+          }
+
+      - name: Install npm dependencies
+        shell: powershell
+        run: npm ci
+
+      - name: Build Tauri app
+        shell: powershell
+        run: npm run tauri build
+
+      - name: Upload to release
+        shell: powershell
+        env:
+          BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
+        run: |
+          $REPO_API = "${{ github.server_url }}/api/v1/repos/${{ github.repository }}"
+          $Headers = @{ "Authorization" = "token $env:BUILD_TOKEN" }
+          $TAG = "${{ needs.bump-version.outputs.tag }}"
+          Write-Host "Release tag: ${TAG}"
+
+          $release = Invoke-RestMethod -Uri "${REPO_API}/releases/tags/${TAG}" -Headers $Headers -ErrorAction Stop
+          $RELEASE_ID = $release.id
+          Write-Host "Release ID: ${RELEASE_ID}"
+
+          Get-ChildItem -Path src-tauri\target\release\bundle -Recurse -Include *.msi,*-setup.exe | ForEach-Object {
+            $filename = $_.Name
+            $encodedName = [System.Uri]::EscapeDataString($filename)
+            $size = [math]::Round($_.Length / 1MB, 1)
+            Write-Host "Uploading ${filename} (${size} MB)..."
+
+            try {
+              $assets = Invoke-RestMethod -Uri "${REPO_API}/releases/${RELEASE_ID}/assets" -Headers $Headers
+              $existing = $assets | Where-Object { $_.name -eq $filename }
+              if ($existing) {
+                Invoke-RestMethod -Uri "${REPO_API}/releases/${RELEASE_ID}/assets/$($existing.id)" -Method Delete -Headers $Headers
+              }
+            } catch {}
+
+            $uploadUrl = "${REPO_API}/releases/${RELEASE_ID}/assets?name=${encodedName}"
+            $result = curl.exe --fail --silent --show-error `
+              -X POST `
+              -H "Authorization: token $env:BUILD_TOKEN" `
+              -H "Content-Type: application/octet-stream" `
+              -T "$($_.FullName)" `
+              "$uploadUrl" 2>&1
+            if ($LASTEXITCODE -eq 0) {
+              Write-Host "Upload successful: ${filename}"
+            } else {
+              Write-Host "WARNING: Upload failed for ${filename}: ${result}"
+            }
+          }
+
+  build-macos:
+    name: Build App (macOS)
+    needs: bump-version
+    runs-on: macos-latest
+    env:
+      NODE_VERSION: "20"
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          ref: ${{ needs.bump-version.outputs.tag }}
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: ${{ env.NODE_VERSION }}
+
+      - name: Install Rust stable
+        run: |
+          curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable
+          echo "$HOME/.cargo/bin" >> $GITHUB_PATH
+
+      - name: Install system dependencies
+        run: brew install --quiet create-dmg || true
+
+      - name: Install npm dependencies
+        run: npm ci
+
+      - name: Build Tauri app
+        run: npm run tauri build
+
+      - name: Upload to release
+        env:
+          BUILD_TOKEN: ${{ secrets.BUILD_TOKEN }}
+        run: |
+          which jq || brew install jq
+          REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
+          TAG="${{ needs.bump-version.outputs.tag }}"
+          echo "Release tag: ${TAG}"
+
+          RELEASE_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
+            "${REPO_API}/releases/tags/${TAG}" | jq -r '.id // empty')
+
+          if [ -z "${RELEASE_ID}" ] || [ "${RELEASE_ID}" = "null" ]; then
+            echo "ERROR: Failed to find release for tag ${TAG}."
+            exit 1
+          fi
+          echo "Release ID: ${RELEASE_ID}"
+
+          find src-tauri/target/release/bundle -type f -name "*.dmg" | while IFS= read -r file; do
+            filename=$(basename "$file")
+            encoded_name=$(echo "$filename" | sed 's/ /%20/g')
+            echo "Uploading ${filename} ($(du -h "$file" | cut -f1))..."
+
+            ASSET_ID=$(curl -s -H "Authorization: token ${BUILD_TOKEN}" \
+              "${REPO_API}/releases/${RELEASE_ID}/assets" | jq -r ".[] | select(.name == \"${filename}\") | .id // empty")
+            if [ -n "${ASSET_ID}" ]; then
+              curl -s -X DELETE -H "Authorization: token ${BUILD_TOKEN}" \
+                "${REPO_API}/releases/${RELEASE_ID}/assets/${ASSET_ID}"
+            fi
+
+            HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
+              -H "Authorization: token ${BUILD_TOKEN}" \
+              -H "Content-Type: application/octet-stream" \
+              -T "$file" \
+              "${REPO_API}/releases/${RELEASE_ID}/assets?name=${encoded_name}")
+            echo "Upload response: HTTP ${HTTP_CODE}"
+          done
--- a/.gitignore
+++ b/.gitignore
@@ -10,8 +10,8 @@ dist/
 downloads/
 eggs/
 .eggs/
-lib/
-lib64/
+/lib/
+/lib64/
 parts/
 sdist/
 var/
@@ -54,3 +54,15 @@ models/

 # PyInstaller
 *.spec.lock
+
+# Node.js
+node_modules/
+
+# Vite / Svelte build output
+dist/
+
+# Tauri
+src-tauri/target/
+
+# Windows NTFS alternate data streams
+*:Zone.Identifier
--- a/2025-live-transcription-research.md:Zone.Identifier
+++ b/2025-live-transcription-research.md:Zone.Identifier
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -4,52 +4,108 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 ## Project Overview

-Local Transcription is a desktop application for real-time speech-to-text transcription designed for streamers. It uses Whisper models (via faster-whisper) to transcribe audio locally with optional multi-user server synchronization.
+Local Transcription is a cross-platform desktop application for real-time speech-to-text transcription designed for streamers. It supports local Whisper models and cloud-based Deepgram transcription, with OBS browser source integration and optional multi-user sync.
+
+**Architecture:** Two-process model — a Tauri v2 shell (Svelte 5 frontend) communicates with a headless Python backend (sidecar) via REST API and WebSocket.

 **Key Features:**
-- Standalone desktop GUI (PySide6/Qt)
-- Local transcription with CPU/GPU support
-- Built-in web server for OBS browser source integration
-- Optional Node.js-based multi-user server for syncing transcriptions across users
-- Noise suppression and Voice Activity Detection (VAD)
-- Cross-platform builds (Linux/Windows) with PyInstaller
+- Cross-platform desktop app (Windows, macOS, Linux) via Tauri v2 + Svelte 5
+- Headless Python backend with FastAPI control API
+- Dual transcription modes: local Whisper or cloud Deepgram (managed/BYOK)
+- Built-in web server for OBS browser source at `http://localhost:8080`
+- Optional multi-user sync via Node.js server
+- CUDA, MPS (Apple Silicon), and CPU support
+- Auto-updates, custom fonts, configurable colors
+
+> **Legacy GUI:** The original PySide6/Qt GUI (`main.py`, `gui/`) still works during the transition. New features should target the Tauri frontend and headless backend.

 ## Project Structure

 ```
 local-transcription/
-├── client/                   # Core transcription logic
-│   ├── audio_capture.py      # Audio input and buffering
-│   ├── transcription_engine.py # Whisper model integration
-│   ├── noise_suppression.py  # VAD and noise reduction
-│   ├── device_utils.py       # CPU/GPU device management
-│   ├── config.py             # Configuration management
-│   └── server_sync.py        # Multi-user server sync client
-├── gui/                      # Desktop application UI
-│   ├── main_window_qt.py     # Main application window (PySide6)
-│   ├── settings_dialog_qt.py # Settings dialog (PySide6)
-│   └── transcription_display_qt.py # Display widget
-├── server/                   # Web display servers
-│   ├── web_display.py        # FastAPI server for OBS browser source (local)
-│   └── nodejs/               # Optional multi-user Node.js server
-│       ├── server.js         # Multi-user sync server with WebSocket
-│       ├── package.json      # Node.js dependencies
-│       └── README.md         # Server deployment documentation
-├── config/                   # Example configuration files
-│   └── default_config.yaml   # Default settings template
-├── main.py                   # GUI application entry point
-├── main_cli.py              # CLI version for testing
-└── pyproject.toml           # Dependencies and build config
+├── src/                             # Svelte 5 frontend (Tauri UI)
+│   ├── App.svelte                   # Main app shell
+│   ├── app.css                      # Global dark theme styles
+│   ├── main.ts                      # Svelte mount point
+│   ├── lib/components/              # UI components
+│   │   ├── Header.svelte            # Title bar + settings button
+│   │   ├── StatusBar.svelte         # State indicator, device, user info
+│   │   ├── Controls.svelte          # Start/Stop, Clear, Save buttons
+│   │   ├── TranscriptionDisplay.svelte  # Scrolling transcript view
+│   │   └── Settings.svelte          # Full settings modal (all sections)
+│   └── lib/stores/                  # Svelte 5 reactive stores ($state/$derived)
+│       ├── backend.ts               # WebSocket + REST API client
+│       ├── config.ts                # App configuration fetch/update
+│       └── transcriptions.ts        # Transcript data management
+├── src-tauri/                       # Tauri v2 Rust shell
+│   ├── src/lib.rs                   # Plugin registration (shell, dialog, process)
+│   ├── src/main.rs                  # Entry point
+│   ├── tauri.conf.json              # Window, bundle, plugin config
+│   └── Cargo.toml                   # Rust dependencies
+├── backend/                         # Headless Python backend (the sidecar)
+│   ├── app_controller.py            # Core orchestration (engine, sync, config)
+│   ├── api_server.py                # FastAPI REST endpoints + /ws/control
+│   └── main_headless.py             # Headless entry point (prints JSON to stdout)
+├── client/                          # Core transcription modules (used by backend)
+│   ├── audio_capture.py             # Audio input handling
+│   ├── transcription_engine_realtime.py  # RealtimeSTT / Whisper engine
+│   ├── deepgram_transcription.py    # Deepgram WebSocket cloud transcription
+│   ├── noise_suppression.py         # VAD and noise reduction
+│   ├── device_utils.py              # CPU/GPU/MPS detection
+│   ├── config.py                    # YAML config management (~/.local-transcription/)
+│   ├── server_sync.py               # Multi-user server sync client
+│   ├── instance_lock.py             # Single-instance PID lock
+│   └── update_checker.py            # Gitea release update checker
+├── gui/                             # Legacy PySide6/Qt GUI (still functional)
+│   ├── main_window_qt.py            # Main window (orchestration lives here in legacy)
+│   ├── settings_dialog_qt.py        # Settings dialog
+│   └── transcription_display_qt.py  # Display widget
+├── server/
+│   ├── web_display.py               # FastAPI OBS display server (WebSocket + HTML)
+│   └── nodejs/                      # Optional multi-user sync server
+├── .gitea/workflows/                # CI/CD
+│   ├── release.yml                  # Tauri app builds (Linux/Windows/macOS)
+│   └── build-sidecar.yml            # Python sidecar builds (CUDA + CPU)
+├── config/default_config.yaml       # Default settings template
+├── main.py                          # Legacy PySide6 GUI entry point
+├── main_cli.py                      # CLI version for testing
+├── version.py                       # Version string (__version__)
+├── local-transcription.spec         # PyInstaller config (legacy, includes PySide6)
+├── local-transcription-headless.spec # PyInstaller config (headless sidecar, no Qt)
+├── pyproject.toml                   # Python deps (uv, CUDA PyTorch index)
+├── package.json                     # Node/Tauri deps
+└── vite.config.ts                   # Vite build config ($lib alias)
 ```

 ## Development Commands

-### Installation and Setup
+### Frontend (Tauri + Svelte)
 ```bash
-# Install dependencies (creates .venv automatically)
+# Install npm dependencies
+npm install
+
+# Run Tauri in development mode (hot-reload)
+npm run tauri dev
+
+# Build frontend only (for testing)
+npx vite build
+
+# Type-check Svelte
+npx svelte-check
+
+# Check Rust compiles
+cd src-tauri && cargo check
+```
+
+### Backend (Python)
+```bash
+# Install Python dependencies
 uv sync

-# Run the GUI application
+# Run the headless backend standalone (for development)
+uv run python -m backend.main_headless --port 8080
+
+# Run the legacy PySide6 GUI
 uv run python main.py

 # Run CLI version (headless, for testing)
@@ -57,257 +113,154 @@ uv run python main_cli.py

 # List available audio devices
 uv run python main_cli.py --list-devices
-
-# Install with CUDA support (if needed)
-uv pip install torch --index-url https://download.pytorch.org/whl/cu121
 ```

-### Building Executables
+### Building
 ```bash
-# Linux (includes CUDA support - works on both GPU and CPU systems)
-./build.sh
+# Build Tauri app (produces platform installer)
+npm run tauri build

-# Windows (includes CUDA support - works on both GPU and CPU systems)
-build.bat
+# Build headless Python sidecar (no PySide6)
+uv run pyinstaller local-transcription-headless.spec
+# Output: dist/local-transcription-backend/

-# Manual build with PyInstaller
-uv sync                          # Install dependencies (includes CUDA PyTorch)
-uv pip uninstall -q enum34       # Remove incompatible enum34 package
+# Build legacy PySide6 app
 uv run pyinstaller local-transcription.spec
+# Or use: ./build.sh (Linux) / build.bat (Windows)
 ```

-**Important:** All builds include CUDA support via `pyproject.toml` configuration. CUDA builds can be created on systems without NVIDIA GPUs. The PyTorch CUDA runtime is bundled, and the app automatically falls back to CPU if no GPU is available.
-
 ### Testing
 ```bash
-# Run component tests
 uv run python test_components.py
-
-# Check CUDA availability
 uv run python check_cuda.py
-
-# Test web server manually
-uv run python -m uvicorn server.web_display:app --reload
 ```

-## Architecture
+## Architecture Details

-### Audio Processing Pipeline
+### Communication: Tauri <-> Python Backend

-1. **Audio Capture** ([client/audio_capture.py](client/audio_capture.py))
-   - Captures audio from microphone/system using sounddevice
-   - Handles automatic sample rate detection and resampling
-   - Uses chunking with overlap for better transcription quality
-   - Default: 3-second chunks with 0.5s overlap
+The Svelte frontend connects to the Python backend via two channels:

-2. **Noise Suppression** ([client/noise_suppression.py](client/noise_suppression.py))
-   - Applies noisereduce for background noise reduction
-   - Voice Activity Detection (VAD) using webrtcvad
-   - Skips silent segments to improve performance
+**REST API** (on port 8081 by default):
+- `GET /api/status` — app state, device info, version
+- `POST /api/start` / `POST /api/stop` — transcription control
+- `GET /api/config` / `PUT /api/config` — read/write settings (dot-notation keys)
+- `GET /api/audio-devices` / `GET /api/compute-devices` — device enumeration
+- `POST /api/reload-engine` — reload with new model/device
+- `GET /api/transcriptions` / `POST /api/clear` — transcript management
+- `POST /api/save-file` — write text to a file path
+- `GET /api/check-update` / `POST /api/skip-version` — update management
+- `POST /api/login` / `POST /api/register` / `GET /api/balance` — managed mode proxy

-3. **Transcription** ([client/transcription_engine.py](client/transcription_engine.py))
-   - Uses faster-whisper for efficient inference
-   - Supports CPU, CUDA, and Apple MPS (Mac)
-   - Models: tiny, base, small, medium, large
-   - Thread-safe model loading with locks
+**WebSocket** `/ws/control`:
+- Pushes real-time events: `state_changed`, `transcription`, `preview`, `error`, `credits_low`
+- Client sends keepalive pings

-4. **Display** ([gui/main_window_qt.py](gui/main_window_qt.py))
-   - PySide6/Qt-based desktop GUI
-   - Real-time transcription display with scrolling
-   - Settings panel with live updates (no restart needed)
+The OBS display server runs separately on port 8080 (`GET /` for HTML, `WebSocket /ws` for transcriptions).

-### Web Server Architecture
+### Backend Process Lifecycle

-**Local Web Server** ([server/web_display.py](server/web_display.py))
- Always runs when GUI starts (port 8080 by default)
- FastAPI with WebSocket for real-time updates
- Used for OBS browser source integration
- Single-user (displays only local transcriptions)
+1. `main_headless.py` starts, acquires instance lock, creates `AppController`
+2. `AppController.initialize()` starts the OBS web server (port 8080) and engine init thread
+3. `APIServer` wraps the controller with FastAPI routes, runs on port 8081
+4. Backend prints `{"event": "ready", "port": 8080}` to stdout for Tauri to discover
+5. On shutdown: engine stopped, web server stopped, lock released
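
Step 4 of the lifecycle above can be sketched as follows. This is a minimal JavaScript illustration of the logic only; the actual launcher lives in the Rust shell. The helper names `parseReadyLine` and `findReadyPort` are hypothetical, and treating non-JSON lines as ordinary log output is an assumption.

```javascript
// Hypothetical helper: inspect one line of backend stdout.
// Returns the announced port, or null if the line is not a ready event.
function parseReadyLine(line) {
  let msg;
  try {
    msg = JSON.parse(line);
  } catch {
    return null; // not JSON, so treat it as ordinary log output
  }
  if (msg === null || typeof msg !== 'object') return null;
  if (msg.event !== 'ready') return null;
  const validPort = Number.isInteger(msg.port) && msg.port > 0 && msg.port < 65536;
  return validPort ? msg.port : null;
}

// Hypothetical helper: scan stdout lines in order until a ready event appears.
function findReadyPort(lines) {
  for (const line of lines) {
    const port = parseReadyLine(line.trim());
    if (port !== null) return port;
  }
  return null;
}
```

A caller would typically pair this with a timeout, so that a backend that never prints the ready line is reported as a startup failure rather than waited on forever.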

-**Multi-User Server** (Optional - for syncing across multiple users)
+### Headless Backend vs Legacy GUI

-**Node.js WebSocket Server** ([server/nodejs/](server/nodejs/)) - **RECOMMENDED**
- Real-time WebSocket support (< 100ms latency)
- Handles 100+ concurrent users
- Easy deployment to VPS/cloud hosting (Railway, Heroku, DigitalOcean, or any VPS)
- Configurable display options via URL parameters:
-  - `timestamps=true/false` - Show/hide timestamps
-  - `maxlines=50` - Maximum visible lines (prevents scroll bars in OBS)
-  - `fontsize=16` - Font size in pixels
-  - `fontfamily=Arial` - Font family
-  - `fade=10` - Seconds before text fades (0 = never)
+The `AppController` class (`backend/app_controller.py`) extracts all orchestration logic from `gui/main_window_qt.py` into a Qt-free class. The mapping:

-See [server/nodejs/README.md](server/nodejs/README.md) for deployment instructions
+| Legacy (MainWindow) | Headless (AppController) |
+|---------------------|--------------------------|
+| `_initialize_components()` | `_initialize_engine()` |
+| `_start_transcription()` | `start_transcription()` |
+| `_stop_transcription()` | `stop_transcription()` |
+| `_on_settings_saved()` | `apply_settings()` |
+| `_reload_engine()` | `reload_engine()` |
+| `_start_web_server_if_enabled()` | `_start_web_server()` |
+| `_start_server_sync()` | `_start_server_sync()` |
+| Qt signals | Callbacks (`on_state_changed`, `on_transcription`, etc.) |

-### Configuration System
+### Threading Model (Headless)

- Config stored at `~/.local-transcription/config.yaml`
- Managed by [client/config.py](client/config.py)
- Settings apply immediately without restart (except model changes)
- YAML format with nested keys (e.g., `transcription.model`)
+- Main thread: Uvicorn (FastAPI) event loop
+- Engine init thread: Downloads models, initializes VAD
+- Web server thread: Separate asyncio loop for OBS display
+- Audio capture: Runs in engine callback threads
+- All results flow through `AppController` callbacks -> `APIServer` WebSocket broadcast

-### Device Management
+### Svelte Frontend

- [client/device_utils.py](client/device_utils.py) handles CPU/GPU detection
- Auto-detects CUDA, MPS (Mac), or falls back to CPU
- Compute types: float32 (best quality), float16 (GPU), int8 (fastest)
- Thread-safe device selection
+Uses Svelte 5 runes throughout (`$state`, `$derived`, `$effect`, `$props`). No Svelte 4 patterns.

-## Key Implementation Details
+**Stores** (`src/lib/stores/`):
+- `backend.ts` — WebSocket connection + REST helpers (`apiGet`, `apiPost`, `apiPut`), auto-reconnect
+- `config.ts` — fetches/updates config from backend API
+- `transcriptions.ts` — manages transcript list, listens for `CustomEvent`s from backend store

-### PyInstaller Build Configuration
+**Key patterns:**
+- Backend store dispatches `CustomEvent`s on `window` for cross-store communication
+- Settings component collects all changed values into a `Record<string, any>` with dot-notation keys, sends via `PUT /api/config`
+- Controls use the Tauri dialog plugin for native file save and fall back to a blob download
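
The dot-notation pattern used for `PUT /api/config` can be sketched as below. This is an illustration under assumptions, not the Settings component's code: `flatten` and `changedKeys` are hypothetical helpers, and all setting names other than `transcription.model` are made up for the example.

```javascript
// Hypothetical helper: flatten a nested settings object into dot-notation keys.
function flatten(obj, prefix = '', out = {}) {
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      flatten(value, path, out);
    } else {
      out[path] = value;
    }
  }
  return out;
}

// Hypothetical helper: keep only the keys whose value differs from the saved config.
function changedKeys(saved, edited) {
  const before = flatten(saved);
  const after = flatten(edited);
  const changes = {};
  for (const [path, value] of Object.entries(after)) {
    if (JSON.stringify(before[path]) !== JSON.stringify(value)) {
      changes[path] = value;
    }
  }
  return changes;
}

const saved = { transcription: { model: 'base', device: 'auto' }, display: { font_size: 16 } };
const edited = { transcription: { model: 'small', device: 'auto' }, display: { font_size: 16 } };
console.log(changedKeys(saved, edited));
```

Sending only the changed keys keeps the request body small and avoids overwriting values the user never touched.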

- [local-transcription.spec](local-transcription.spec) controls build
- UPX compression enabled for smaller executables
- Hidden imports required for PySide6, faster-whisper, torch
- Console mode enabled by default (set `console=False` to hide)
+## CI/CD

-### Threading Model
+Two Gitea Actions workflows in `.gitea/workflows/`:

- Main thread: Qt GUI event loop
- Audio thread: Captures and processes audio chunks
- Web server thread: Runs FastAPI server
- Transcription: Runs in callback thread from audio capture
- All transcription results communicated via Qt signals
+- **`release.yml`**: Triggers on push to `main`. Auto-bumps version, builds Tauri app on Linux/Windows/macOS, uploads `.deb`, `.rpm`, `.msi`, `.dmg` to Gitea release.
+- **`build-sidecar.yml`**: Triggers on pushes to `main` that change `client/`, `server/`, `backend/`, `pyproject.toml`, or `local-transcription-headless.spec`, and on manual dispatch. Builds headless Python sidecar via PyInstaller. CUDA + CPU for Linux/Windows, CPU-only for macOS.

-### Server Sync (Optional Multi-User Feature)
-
- [client/server_sync.py](client/server_sync.py) handles server communication
- Toggle in Settings: "Enable Server Sync"
- Sends transcriptions to Node.js server via HTTP POST
- Real-time updates via WebSocket to display page
- Per-speaker font support (Web-Safe, Google Fonts, Custom uploads)
- Falls back gracefully if server unavailable
+Both require a `BUILD_TOKEN` secret (Gitea API token with release write access).

 ## Common Patterns

 ### Adding a New Setting

-1. Add to [config/default_config.yaml](config/default_config.yaml)
-2. Update [client/config.py](client/config.py) if validation needed
-3. Add UI control in [gui/settings_dialog_qt.py](gui/settings_dialog_qt.py)
-4. Apply setting in relevant component (no restart if possible)
-5. Emit signal to update display if needed
+1. Add default to [config/default_config.yaml](config/default_config.yaml)
+2. Add UI control in [src/lib/components/Settings.svelte](src/lib/components/Settings.svelte)
+3. Ensure the setting is included in the save handler's config update
+4. Apply in `AppController.apply_settings()` or the relevant component
+5. For legacy GUI: also update [gui/settings_dialog_qt.py](gui/settings_dialog_qt.py)
+
+### Adding a New API Endpoint
+
+1. Add route in [backend/api_server.py](backend/api_server.py) `_setup_routes()`
+2. Add supporting logic in [backend/app_controller.py](backend/app_controller.py) if needed
+3. Call from Svelte via `backendStore.apiGet/apiPost/apiPut`

 ### Modifying Transcription Display

- Local GUI: [gui/transcription_display_qt.py](gui/transcription_display_qt.py)
- Local web display (OBS): [server/web_display.py](server/web_display.py) (HTML in `_get_html()`)
+- Tauri UI: [src/lib/components/TranscriptionDisplay.svelte](src/lib/components/TranscriptionDisplay.svelte)
+- OBS display: [server/web_display.py](server/web_display.py) (HTML in `_get_html()`)
 - Multi-user display: [server/nodejs/server.js](server/nodejs/server.js) (display page in `/display` route)

-### Adding a New Model Size
-
- Update [client/transcription_engine.py](client/transcription_engine.py)
- Add to model selector in [gui/settings_dialog_qt.py](gui/settings_dialog_qt.py)
- Update CLI argument choices in [main_cli.py](main_cli.py)
-
 ## Dependencies

-**Core:**
- `faster-whisper`: Optimized Whisper inference
- `torch`: ML framework (CUDA-enabled via special index)
- `PySide6`: Qt6 bindings for GUI
- `sounddevice`: Cross-platform audio I/O
- `noisereduce`, `webrtcvad`: Audio preprocessing
-
-**Web Server:**
- `fastapi`, `uvicorn`: Web server and ASGI
- `websockets`: Real-time communication
-
-**Build:**
- `pyinstaller`: Create standalone executables
- `uv`: Fast package manager
-
-**PyTorch CUDA Index:**
- Configured in [pyproject.toml](pyproject.toml) under `[[tool.uv.index]]`
- Uses PyTorch's custom wheel repository for CUDA builds
- Automatically installed with `uv sync` when using CUDA build scripts
+**Frontend:** Tauri v2, Svelte 5, Vite, TypeScript
+**Backend:** Python 3.9+, FastAPI, Uvicorn, RealtimeSTT, faster-whisper, PyTorch (CUDA), sounddevice
+**Build:** PyInstaller (sidecar), Tauri CLI (app), uv (Python packages)
+**CI:** Gitea Actions with platform-specific runners

 ## Platform-Specific Notes

 ### Linux
- Uses PulseAudio/ALSA for audio
- Build scripts use bash (`.sh` files)
- Executable: `dist/LocalTranscription/LocalTranscription`
+- Tauri needs: `libgtk-3-dev`, `libwebkit2gtk-4.1-dev`, `libappindicator3-dev`, `librsvg2-dev`, `patchelf`
+- Audio: PulseAudio/ALSA via sounddevice

 ### Windows
- Uses Windows Audio/WASAPI
- Build scripts use batch (`.bat` files)
- Executable: `dist\LocalTranscription\LocalTranscription.exe`
- Requires Visual C++ Redistributable on target systems
+- Tauri needs: WebView2 (usually pre-installed on Windows 10+)
+- Audio: WASAPI via sounddevice

-### Cross-Building
- **Cannot cross-compile** - must build on target platform
- CI/CD should use platform-specific runners
-
-## Troubleshooting
-
-### Model Loading Issues
- Models download to `~/.cache/huggingface/`
- First run requires internet connection
- Check disk space (models: 75MB-3GB depending on size)
-
-### Audio Device Issues
- Run `uv run python main_cli.py --list-devices`
- Check permissions (microphone access)
- Try different device indices in settings
-
-### GPU Not Detected
- Run `uv run python check_cuda.py`
- Install CUDA drivers (not CUDA toolkit - bundled in build)
- Verify PyTorch sees GPU: `python -c "import torch; print(torch.cuda.is_available())"`
-
-### Web Server Port Conflicts
- Default port: 8080
- Change in [gui/main_window_qt.py](gui/main_window_qt.py) or config
- Use `lsof -i :8080` (Linux) or `netstat -ano | findstr :8080` (Windows)
-
-## OBS Integration
-
-### Local Display (Single User)
-1. Start Local Transcription app
-2. In OBS: Add "Browser" source
-3. URL: `http://localhost:8080`
-4. Set dimensions (e.g., 1920x300)
-
-### Multi-User Display (Node.js Server)
-1. Deploy Node.js server (see [server/nodejs/README.md](server/nodejs/README.md))
-2. Each user configures Server URL: `http://your-server:3000/api/send`
-3. Enter same room name and passphrase
-4. In OBS: Add "Browser" source
-5. URL: `http://your-server:3000/display?room=ROOM&fade=10&timestamps=true&maxlines=50&fontsize=16`
-6. Customize URL parameters as needed:
-   - `timestamps=false` - Hide timestamps
-   - `maxlines=30` - Show max 30 lines (prevents scroll bars)
-   - `fontsize=18` - Larger font
-   - `fontfamily=Courier` - Different font
-
-## Performance Optimization
-
-**For Real-Time Transcription:**
- Use `tiny` or `base` model (faster)
- Enable GPU if available (5-10x faster)
- Increase chunk_duration for better accuracy (higher latency)
- Decrease chunk_duration for lower latency (less context)
- Enable VAD to skip silent audio
-
-**For Build Size Reduction:**
- Don't bundle models (download on demand)
- Use CPU-only build if no GPU users
- Enable UPX compression (already in spec)
-
-## Phase Status
-
- ✅ **Phase 1**: Standalone desktop application (complete)
- ✅ **Web Server**: Local OBS integration (complete)
- ✅ **Builds**: PyInstaller executables (complete)
- ✅ **Phase 2**: Multi-user Node.js server (complete, optional)
- ⏸️ **Phase 3+**: Advanced features (see [NEXT_STEPS.md](NEXT_STEPS.md))
+### macOS
+- Tauri needs: Xcode Command Line Tools
+- Audio: CoreAudio via sounddevice
+- GPU: MPS (Apple Silicon) detected by `device_utils.py`
+- `Info.plist` must include `NSMicrophoneUsageDescription` for mic access
+- No CUDA builds — CPU/MPS only

 ## Related Documentation

- [README.md](README.md) - User-facing documentation
- [BUILD.md](BUILD.md) - Detailed build instructions
- [INSTALL.md](INSTALL.md) - Installation guide
- [NEXT_STEPS.md](NEXT_STEPS.md) - Future enhancements
- [server/nodejs/README.md](server/nodejs/README.md) - Node.js server setup and deployment
+- [README.md](README.md) — User-facing documentation
+- [BUILD.md](BUILD.md) — Detailed build instructions
+- [INSTALL.md](INSTALL.md) — Installation guide
+- [server/nodejs/README.md](server/nodejs/README.md) — Node.js server setup
--- a/DEEPGRAM_PROXY_PLAN.md
+++ b/DEEPGRAM_PROXY_PLAN.md
@@ -0,0 +1,574 @@
+# Deepgram Proxy Service — Build Plan
+
+## Project Overview
+
+Build a standalone hosted service that acts as a Deepgram proxy for the Local Transcription
+desktop app. Users can either provide their own Deepgram API key (BYOK) or use the managed
+service with prepaid credits purchased via Stripe.
+
+This is a **separate repository** from `local-transcription`. The desktop app will be updated
+in a second phase to support both modes.
+
+---
+
+## Repository Structure
+
+```
+transcription-proxy/
+├── src/
+│   ├── server.js              # Express app entry point
+│   ├── config.js              # Environment config loader
+│   ├── db/
+│   │   ├── index.js           # node-postgres pool setup
+│   │   └── migrations/        # SQL migration files (numbered)
+│   │       ├── 001_users.sql
+│   │       ├── 002_credits.sql
+│   │       ├── 003_sessions.sql
+│   │       └── 004_usage_ledger.sql
+│   ├── middleware/
+│   │   ├── auth.js            # JWT verification middleware
+│   │   └── rateLimit.js       # Per-user rate limiting
+│   ├── routes/
+│   │   ├── auth.js            # POST /auth/register, /auth/login, /auth/refresh
+│   │   ├── billing.js         # POST /billing/checkout, GET /billing/balance
+│   │   └── account.js         # GET /account/me, GET /account/usage
+│   ├── websocket/
+│   │   └── proxy.js           # WebSocket proxy handler (core feature)
+│   └── webhooks/
+│       └── stripe.js          # POST /webhooks/stripe
+├── web/                       # Simple frontend dashboard
+│   ├── index.html             # Landing / login page
+│   ├── dashboard.html         # Balance, usage history, buy credits
+│   └── assets/
+│       ├── app.js
+│       └── style.css
+├── .env.example
+├── package.json
+├── docker-compose.yml         # Postgres + app for local dev
+└── CLAUDE.md                  # This file (after renaming)
+```
+
+---
+
+## Technology Stack
+
+- **Runtime**: Node.js 20+
+- **Framework**: Express 4
+- **WebSocket**: `ws` library (not socket.io — keep it lean)
+- **Database**: PostgreSQL 15+ via `pg` (node-postgres)
+- **Auth**: JWT via `jsonwebtoken`, passwords hashed with `bcrypt`
+- **Payments**: Stripe Node SDK (`stripe`)
+- **Environment**: `dotenv`
+- **Dev tooling**: `nodemon` for dev, no TypeScript (keep it simple)
+
+---
+
+## Database Schema
+
+Run migrations in order. Use a simple `schema_migrations` table to track applied migrations.
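
The ordering rule can be sketched as a pure function. This is a minimal sketch, assuming the numeric filename prefix is the migration version; `pendingMigrations` is a hypothetical name, and the real runner would read the directory and the `schema_migrations` table instead of taking arrays.

```javascript
// Hypothetical helper: given migration filenames and the versions already
// recorded in schema_migrations, return the files still to apply, in order.
function pendingMigrations(filenames, appliedVersions) {
  const applied = new Set(appliedVersions);
  return filenames
    .map((name) => {
      const match = /^(\d+)_.+\.sql$/.exec(name);
      return match ? { version: Number(match[1]), name } : null;
    })
    .filter((m) => m !== null && !applied.has(m.version)) // skip non-migrations and applied ones
    .sort((a, b) => a.version - b.version)
    .map((m) => m.name);
}
```

Because `001_users.sql` itself creates `schema_migrations`, a runner would presumably need to treat a missing table as "nothing applied yet" on the very first start.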
+
+### 001_users.sql
+```sql
+CREATE TABLE schema_migrations (
+  version INTEGER PRIMARY KEY,
+  applied_at TIMESTAMPTZ DEFAULT NOW()
+);
+
+CREATE TABLE users (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  email TEXT UNIQUE NOT NULL,
+  password_hash TEXT NOT NULL,
+  stripe_customer_id TEXT UNIQUE,
+  created_at TIMESTAMPTZ DEFAULT NOW(),
+  updated_at TIMESTAMPTZ DEFAULT NOW()
+);
+```
+
+### 002_credits.sql
+```sql
+CREATE TABLE credit_balance (
+  user_id UUID PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
+  seconds_remaining INTEGER NOT NULL DEFAULT 0,
+  updated_at TIMESTAMPTZ DEFAULT NOW()
+);
+```
+
+### 003_sessions.sql
+```sql
+CREATE TABLE transcription_sessions (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  user_id UUID NOT NULL REFERENCES users(id),
+  mode TEXT NOT NULL CHECK (mode IN ('managed', 'byok')),
+  started_at TIMESTAMPTZ DEFAULT NOW(),
+  ended_at TIMESTAMPTZ,
+  seconds_used INTEGER NOT NULL DEFAULT 0,
+  deepgram_model TEXT,
+  status TEXT NOT NULL DEFAULT 'active' CHECK (status IN ('active', 'completed', 'terminated'))
+);
+
+CREATE INDEX idx_sessions_user_id ON transcription_sessions(user_id);
+CREATE INDEX idx_sessions_started_at ON transcription_sessions(started_at);
+```
+
+### 004_usage_ledger.sql
+```sql
+CREATE TABLE usage_ledger (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  user_id UUID NOT NULL REFERENCES users(id),
+  session_id UUID REFERENCES transcription_sessions(id),
+  recorded_at TIMESTAMPTZ DEFAULT NOW(),
+  seconds INTEGER NOT NULL,
+  description TEXT  -- e.g. 'session_usage', 'credit_purchase', 'manual_adjustment'
+);
+
+CREATE INDEX idx_ledger_user_id ON usage_ledger(user_id);
+```
+
+---
+
+## Environment Variables (.env.example)
+
+```env
+# Server
+PORT=3000
+NODE_ENV=development
+
+# Database
+DATABASE_URL=postgresql://user:password@localhost:5432/transcription_proxy
+
+# Auth
+JWT_SECRET=changeme_use_long_random_string
+JWT_EXPIRY=7d
+
+# Stripe
+STRIPE_SECRET_KEY=sk_test_...
+STRIPE_WEBHOOK_SECRET=whsec_...
+
+# Deepgram
+DEEPGRAM_API_KEY=your_deepgram_key_here
+
+# Pricing (seconds per dollar — adjust for your margin)
+# Default: 1000 seconds per $1 (about $0.06/min charged), which covers Deepgram's $0.006/min cost plus margin
+CREDITS_PER_DOLLAR=1000
+```
+
+---
+
+## Phase 1 — Core Server & Auth
+
+### Goals
+- Working Express app with Postgres connection
+- Migration runner
+- User registration and login
+- JWT middleware
+
+### Tasks
+
+1. **Scaffold project**
+   - `npm init`, install dependencies: `express ws pg jsonwebtoken bcrypt stripe dotenv`
+   - Dev dependencies: `nodemon`
+   - Add `start` and `dev` scripts to package.json
+
+2. **Database connection** (`src/db/index.js`)
+   - Export a `pg.Pool` instance using `DATABASE_URL`
+   - Export a `migrate()` function that reads `src/db/migrations/*.sql` in order,
+     checks `schema_migrations` table, and applies unapplied ones
+   - Call `migrate()` on server startup before listening
+
+3. **Auth routes** (`src/routes/auth.js`)
+   - `POST /auth/register` — validate email/password, hash password with bcrypt (cost 12),
+     insert user, insert empty credit_balance row, return JWT
+   - `POST /auth/login` — verify credentials, return JWT + refresh token
+   - `POST /auth/refresh` — validate refresh token, return new JWT
+   - Passwords: minimum 8 characters, validate email format
+
+4. **JWT middleware** (`src/middleware/auth.js`)
+   - Verify `Authorization: Bearer <token>` header
+   - Attach `req.user = { id, email }` on success
+   - Return 401 on failure
+   - Export as `requireAuth` middleware
+
+5. **Basic health check**
+   - `GET /health` returns `{ status: 'ok', db: 'connected' }`
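
The check that `requireAuth` performs can be sketched without any dependencies. The plan calls for the `jsonwebtoken` package; the sketch below instead verifies an HS256 token by hand with `node:crypto`, purely to show what the middleware has to establish. The function names and the `{ id, email, exp }` claim layout are assumptions.

```javascript
const crypto = require('node:crypto');

const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');
const hmac = (data, secret) => crypto.createHmac('sha256', secret).update(data).digest();

// Helper for the example only: build an HS256 token so the sketch is self-contained.
function signHs256(payload, secret) {
  const signingInput = `${encode({ alg: 'HS256', typ: 'JWT' })}.${encode(payload)}`;
  return `${signingInput}.${hmac(signingInput, secret).toString('base64url')}`;
}

// Returns { id, email } for a valid, unexpired token, otherwise null
// (the middleware would answer 401 on null).
function userFromAuthHeader(headerValue, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const match = /^Bearer ([^\s]+)$/.exec(headerValue || '');
  if (!match) return null;
  const parts = match[1].split('.');
  if (parts.length !== 3) return null;
  const [header, body, signature] = parts;

  // Compare signatures in constant time; lengths must match first.
  const expected = hmac(`${header}.${body}`, secret);
  const given = Buffer.from(signature, 'base64url');
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) return null;

  let claims;
  try {
    const parsedHeader = JSON.parse(Buffer.from(header, 'base64url').toString('utf8'));
    if (parsedHeader === null || parsedHeader.alg !== 'HS256') return null;
    claims = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
  } catch {
    return null;
  }
  if (claims === null || typeof claims !== 'object') return null;
  if (typeof claims.exp === 'number' && claims.exp <= nowSeconds) return null;
  return { id: claims.id, email: claims.email };
}
```

In the real service the library call would replace the manual verification, and the middleware would only add the header parsing and the `req.user` assignment around it.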
+
+---
+
+## Phase 2 — Billing & Credits
+
+### Goals
+- Stripe Checkout session creation for credit purchases
+- Webhook handler to fulfill purchases
+- Balance endpoint
+
+### Payment Methods
+
+Use **Stripe Dynamic Payment Methods** — do NOT hardcode `payment_method_types` in the
+Checkout Session. Instead, leave it unset and manage everything from the Stripe Dashboard.
+
+Enable the following in the Stripe Dashboard under Settings → Payment Methods:
+- **Cards** (Visa, Mastercard, Amex, Discover) — on by default
+- **PayPal** — enable manually
+- **Apple Pay** — on by default, shows automatically on Safari/iOS
+- **Google Pay** — enable manually (one toggle)
+- **Cash App Pay** — enable manually (popular with streaming audiences)
+- **Link** — Stripe's saved payment network, on by default
+
+Stripe will automatically show the most relevant methods to each user based on their
+location and device. No code changes are needed to add or remove methods in future —
+it's all dashboard config.
+
+### Credit Packages
+
+Define these as constants in `src/config.js`:
+
+```javascript
+CREDIT_PACKAGES: [
+  { id: 'pack_500',  label: '500 minutes',  seconds: 30000,  price_cents: 300  },
+  { id: 'pack_1200', label: '1200 minutes', seconds: 72000,  price_cents: 600  },
+  { id: 'pack_3000', label: '3000 minutes', seconds: 180000, price_cents: 1200 },
+]
+```
+
+Adjust pricing to cover Deepgram costs ($0.006/min = $0.0001/sec) plus margin and
+Stripe fees (~2.9% + $0.30).
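
The pricing check described above can be worked through per package. This is a rough sketch: the package figures and the Deepgram rate are copied from the plan, while the fee formula is only the approximate "2.9% + $0.30" rule, rounded to whole cents. `netCents` is a hypothetical name.

```javascript
// $0.0001 per second is 1 cent per 100 seconds of audio.
const DEEPGRAM_CENTS_PER_100_SECONDS = 1;

const packages = [
  { id: 'pack_500', seconds: 30000, price_cents: 300 },
  { id: 'pack_1200', seconds: 72000, price_cents: 600 },
  { id: 'pack_3000', seconds: 180000, price_cents: 1200 },
];

// What is left of the sale price after Deepgram usage and the Stripe fee,
// assuming the buyer uses every second of the package.
function netCents(pkg) {
  const deepgramCost = (pkg.seconds / 100) * DEEPGRAM_CENTS_PER_100_SECONDS;
  const stripeFee = Math.round(pkg.price_cents * 0.029) + 30;
  return pkg.price_cents - deepgramCost - stripeFee;
}

for (const pkg of packages) {
  console.log(pkg.id, netCents(pkg));
}
```

Under these assumptions every listed package nets out negative if fully used; `pack_500`, for example, costs as much in Deepgram usage as it sells for before the Stripe fee is counted. That suggests the listed prices are placeholders to be adjusted as the plan says.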
+
+### Tasks
+
+1. **Stripe customer creation**
+   - On user registration, create a Stripe customer and store `stripe_customer_id`
+   - Do this asynchronously (don't block registration response)
+
+2. **Billing routes** (`src/routes/billing.js`)
+   - `GET /billing/packages` — return credit package list (no auth required)
+   - `POST /billing/checkout` — requires auth, accepts `{ package_id }`,
+     creates Stripe Checkout Session using dynamic payment methods (do NOT pass
+     `payment_method_types` — omitting it enables dynamic methods automatically),
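+     include session-level `metadata` containing `user_id` and `package_id` (metadata set under `payment_intent_data` lands on the PaymentIntent, not on the session the webhook receives),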
+     returns `{ checkout_url }`
+   - `GET /billing/balance` — requires auth, returns `{ seconds_remaining, minutes_remaining }`
+
+3. **Stripe webhook** (`src/webhooks/stripe.js`)
+   - Mount at `POST /webhooks/stripe` with raw body (use `express.raw()` for this route only)
+   - Verify signature with `stripe.webhooks.constructEvent()`
+   - Handle `checkout.session.completed`:
+     - Extract `user_id` and `package_id` from metadata
+     - Add seconds to `credit_balance`
+     - Insert row into `usage_ledger` with description `'credit_purchase'`
+   - Handle `payment_intent.payment_failed`: log it (no action needed for prepaid)
+
+4. **Success/cancel pages**
+   - Stripe Checkout redirects to `GET /billing/success?session_id=...` and `/billing/cancel`
+   - These can be simple HTML responses or redirects to the web dashboard
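
The fulfilment step of the webhook can be sketched against in-memory stand-ins for the tables. This assumes signature verification has already happened and that `user_id` and `package_id` are available on the session's `metadata`. The duplicate-delivery guard is an addition not found in the plan, and `handleStripeEvent` plus the `db` object are hypothetical.

```javascript
// Stand-in for the CREDIT_PACKAGES constant, keyed by package id.
const PACKAGES = new Map([['pack_500', { seconds: 30000 }]]);

// db = { balance: Map<userId, seconds>, ledger: Array, fulfilled: Set<sessionId> }
// Returns true when credits were added.
function handleStripeEvent(event, db) {
  if (event.type !== 'checkout.session.completed') return false;
  const session = event.data.object;
  const { user_id: userId, package_id: packageId } = session.metadata || {};
  const pkg = PACKAGES.get(packageId);
  if (!userId || !pkg) return false; // unknown user or package: nothing to credit

  // Assumption (not in the plan): skip sessions that were already fulfilled,
  // since the same event can be delivered more than once.
  if (db.fulfilled.has(session.id)) return false;
  db.fulfilled.add(session.id);

  db.balance.set(userId, (db.balance.get(userId) || 0) + pkg.seconds);
  db.ledger.push({ user_id: userId, seconds: pkg.seconds, description: 'credit_purchase' });
  return true;
}
```

With Postgres, the balance update, the ledger insert, and the duplicate check would belong in one transaction so a crash between them cannot credit a user twice or not at all.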
+
+---
+
+## Phase 3 — WebSocket Proxy (Core Feature)
+
+This is the most critical component. The proxy sits between the desktop client and Deepgram,
+forwarding audio while tracking usage in real time.
+
+### Connection Flow
+
+```
+Client connects → validate JWT → check credit balance → open Deepgram upstream
+     ↓
+Audio chunks arrive → forward to Deepgram → record usage every 10 seconds
+     ↓
+Transcription arrives from Deepgram → forward to client
+     ↓
+Client disconnects (or credits exhausted) → close upstream → finalize session
+```
+
+### WebSocket Protocol
+
+**Client connects to**: `wss://your-domain/ws/transcribe`
+
+**Client sends as first message** (JSON):
+```json
+{
+  "type": "auth",
+  "token": "<JWT>",
+  "config": {
+    "model": "nova-2",
+    "language": "en-US",
+    "interim_results": true,
+    "endpointing": 300
+  }
+}
+```
+
+**After auth success, client sends**: raw audio binary frames (PCM 16kHz mono)
+
+**Server sends to client**:
+```json
+{ "type": "ready" }
+{ "type": "transcript", "text": "...", "is_final": true, "confidence": 0.98 }
+{ "type": "error", "code": "insufficient_credits", "message": "..." }
+{ "type": "credits_low", "seconds_remaining": 300 }
+{ "type": "session_end", "seconds_used": 120 }
+```
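+
+The `transcript` message is a trimmed-down form of what Deepgram sends upstream. A minimal sketch of that mapping follows; the function name `toClientMessage` is hypothetical, and the Deepgram field names (`channel.alternatives[0].transcript`, `confidence`, `is_final`) reflect the streaming response format as understood here, so they should be checked against Deepgram's documentation.
+
+```javascript
+// Convert one parsed Deepgram message into the client-facing shape.
+// Returns null for messages that should not be forwarded as transcripts.
+function toClientMessage(dg) {
+  if (dg.type !== 'Results') return null;
+  const alt = dg.channel && dg.channel.alternatives && dg.channel.alternatives[0];
+  if (!alt || !alt.transcript) return null; // skip empty results (silence)
+  return {
+    type: 'transcript',
+    text: alt.transcript,
+    is_final: Boolean(dg.is_final),
+    confidence: alt.confidence,
+  };
+}
+```
+
+Keeping this as a pure function makes it easy to unit-test with captured Deepgram payloads, separately from the socket plumbing.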
+
+### Tasks (`src/websocket/proxy.js`)
+
+1. **Upgrade handler**
+   - Create the WebSocket server with `new ws.Server({ noServer: true })` so it shares the
+     existing HTTP server instead of binding its own port
+   - In `server.on('upgrade', ...)`, hand requests for `/ws/transcribe` to `wss.handleUpgrade()`
+
+2. **Auth handshake**
+   - First message must be `{ type: 'auth', token: '...' }` — received within 5 seconds
+     or connection is terminated
+   - Verify JWT, load user's credit balance from DB
+   - If balance is 0 or negative, send `insufficient_credits` error and close
+
+3. **Deepgram upstream connection**
+   - Open a WebSocket to Deepgram's streaming API:
+     `wss://api.deepgram.com/v1/listen?model=nova-2&language=en-US&interim_results=true`
+   - Auth header: `Authorization: Token <DEEPGRAM_API_KEY>`
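+   - Build the query string from the client's `config` object, passing through only whitelisted params
+   - Because the client sends raw PCM with no container, also set Deepgram's `encoding` and
+     `sample_rate` params (for example `encoding=linear16&sample_rate=16000` for 16-bit audio)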
+
+4. **Audio forwarding**
+   - All binary messages from client → forward directly to Deepgram upstream
+   - All messages from Deepgram → parse JSON, reformat, forward to client
+
+5. **Usage tracking**
+   - Create a `transcription_sessions` row on connection
+   - Maintain an in-memory `secondsUsed` counter per connection
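+   - Deepgram includes a `duration` field in each `{ type: 'Results', ... }` message; use it for
+     accurate second counting. With `interim_results` enabled, count only results marked
+     `is_final: true`, since interim results cover the same audio more than once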
+   - Every 10 seconds (or on disconnect), write current `secondsUsed` to DB:
+     - Update `transcription_sessions.seconds_used`
+     - Decrement `credit_balance.seconds_remaining`
+     - Insert into `usage_ledger`
+   - If `seconds_remaining` hits 0: send `insufficient_credits`, close connection
+
+6. **Cleanup on disconnect**
+   - Mark session as `completed`, set `ended_at`
+   - Do final usage flush to DB
+   - Close Deepgram upstream if still open
+
+7. **Error handling**
+   - If Deepgram upstream closes unexpectedly, notify client and close
+   - If client sends malformed data, log and continue (don't crash)
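+
+The usage-tracking task can be sketched as a small per-connection object that the proxy feeds with every parsed Deepgram message. This is one possible shape, not the required design: the `UsageTracker` class and its method names are hypothetical, and it counts only `is_final` results on the assumption that interim results repeat audio already covered.
+
+```javascript
+class UsageTracker {
+  constructor(secondsRemaining) {
+    this.secondsRemaining = secondsRemaining; // balance loaded at auth time
+    this.secondsUsed = 0; // total for this session
+    this.unflushed = 0; // used since the last DB write
+  }
+
+  // Feed every parsed Deepgram message; only final results are counted.
+  record(dg) {
+    if (dg.type === 'Results' && dg.is_final && typeof dg.duration === 'number') {
+      this.secondsUsed += dg.duration;
+      this.unflushed += dg.duration;
+    }
+  }
+
+  // True once this session has consumed the balance it started with.
+  isExhausted() {
+    return this.secondsUsed >= this.secondsRemaining;
+  }
+
+  // Returns the seconds to write to the DB and resets the pending counter.
+  takeFlush() {
+    const delta = this.unflushed;
+    this.unflushed = 0;
+    return delta;
+  }
+}
+```
+
+A timer would call `takeFlush()` on the flush interval and once more on disconnect, applying the returned delta to `transcription_sessions`, `credit_balance`, and `usage_ledger`. Because a user may hold more than one connection, the DB balance remains the source of truth; `isExhausted()` is only a fast local check.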
+
+---
+
+## Phase 4 — Account Routes & Rate Limiting
+
+### Tasks
+
+1. **Account routes** (`src/routes/account.js`)
+   - `GET /account/me` — returns `{ email, credits: { seconds_remaining, minutes_remaining }, created_at }`
+   - `GET /account/usage` — returns last 30 days of `usage_ledger` entries grouped by day,
+     plus list of last 10 sessions with duration
+
+2. **Rate limiting** (`src/middleware/rateLimit.js`)
+   - Use in-memory rate limiting (no Redis needed at this scale)
+   - Auth endpoints: max 10 requests per minute per IP
+   - WebSocket connections: max 2 concurrent connections per user
+     (store active connections in a `Map<userId, Set<ws>>`)
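+
+A minimal sketch of the per-user connection cap, using the `Map<userId, Set<ws>>` structure mentioned above. The function names `tryRegister` and `unregister` are hypothetical; the proxy would call them after the auth handshake and in the socket's close handler.
+
+```javascript
+const MAX_CONNECTIONS_PER_USER = 2;
+const activeConnections = new Map(); // userId -> Set of open sockets
+
+// Returns true and records the socket if the user is under the limit,
+// false if the connection should be rejected.
+function tryRegister(userId, ws) {
+  let sockets = activeConnections.get(userId);
+  if (!sockets) {
+    sockets = new Set();
+    activeConnections.set(userId, sockets);
+  }
+  if (sockets.size >= MAX_CONNECTIONS_PER_USER) return false;
+  sockets.add(ws);
+  return true;
+}
+
+// Remove a socket on disconnect and drop empty entries so the Map
+// does not grow without bound.
+function unregister(userId, ws) {
+  const sockets = activeConnections.get(userId);
+  if (!sockets) return;
+  sockets.delete(ws);
+  if (sockets.size === 0) activeConnections.delete(userId);
+}
+```
+
+This state lives in one process, which matches the single-process assumption behind in-memory rate limiting; it would need a shared store if the service were ever scaled out.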
+
+---
+
+## Phase 5 — Web Dashboard
+
+A simple, functional HTML/CSS/JS dashboard. No framework — vanilla JS is fine.
+This is a developer-friendly streamer tool, not a consumer SaaS, so clean and
+functional beats flashy.
+
+### Pages
+
+**`/` (Landing / Login)**
+- Brief product description (what this is, why it exists)
+- Login form and link to register
+- Link to GitHub/Gitea repo
+
+**`/dashboard` (Post-login)**
+- Current credit balance (minutes remaining, prominently displayed)
+- "Buy Credits" section showing the three packages with Stripe Checkout buttons
+- Usage chart: last 30 days bar chart (vanilla canvas or a small CDN chart lib)
+- Recent sessions table: date, duration, status
+
+**`/register`**
+- Registration form
+
+### Implementation Notes
+- Store JWT in `localStorage`, attach as `Authorization` header on API calls
+- Redirect to `/` if JWT missing or expired
+- Keep CSS minimal but readable — this is a utility dashboard
+
+---
+
+## Phase 6 — Desktop App Integration
+
+Changes needed in the `local-transcription` Python repo.
+
+### New file: `client/remote_transcription.py`
+
+This module replaces `transcription_engine_realtime.py` when remote mode is active.
+
+```python
+# Pseudocode / spec for Claude Code to implement
+
+class RemoteTranscriptionEngine:
+    """
+    Connects to the transcription proxy WebSocket and streams audio.
+    Provides the same callback interface as the local engine so the
+    rest of the app doesn't need to change.
+    """
+
+    def __init__(self, config, on_transcript_callback):
+        # config contains: server_url, auth_token (or byok_api_key), model
+        ...
+
+    def start(self):
+        # Open WebSocket connection
+        # Send auth message
+        # Start audio capture thread (reuse existing audio_capture.py)
+        ...
+
+    def stop(self):
+        # Close WebSocket gracefully
+        ...
+
+    def _on_audio_chunk(self, audio_data):
+        # Called by audio_capture.py with raw PCM data
+        # Send as binary WebSocket frame
+        ...
+
+    def _on_server_message(self, message):
+        # Parse JSON from server
+        # On type='transcript': call on_transcript_callback
+        # On type='credits_low': trigger UI warning
+        # On type='error': surface to user
+        ...
+```
+
+### BYOK Mode
+
+When the user provides their own Deepgram key, the client connects directly to Deepgram instead of the proxy:
+- Endpoint: `wss://api.deepgram.com/v1/listen?...`
+- Auth: `Authorization: Token <user_key>`
+- No session tracking (Deepgram handles billing directly to the user)
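+- Same `RemoteTranscriptionEngine` class, with a different URL and auth header. No `auth` first
+  message is sent, and responses arrive in Deepgram's native `Results` format rather than the
+  proxy's `transcript` messages, so `_on_server_message` has to handle both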
+
+### Settings Changes (`gui/settings_dialog_qt.py`)
+
+Add a new "Transcription Mode" section:
+
+```
+Transcription Mode:
+  ○ Local (Whisper)          [existing behavior]
+  ○ Remote - Managed         [requires login]
+  ○ Remote - BYOK            [requires Deepgram API key]
+
+[If Managed selected]:
+  Server URL: [____________]
+  [Login / Register]  [View Balance: 420 min remaining]
+
+[If BYOK selected]:
+  Deepgram API Key: [____________]
+  Model: [nova-2 ▼]
+```
+
+### Config additions (`config/default_config.yaml`)
+
+```yaml
+remote:
+  mode: local           # local | managed | byok
+  server_url: ""        # proxy server URL for managed mode
+  auth_token: ""        # JWT stored after login
+  byok_api_key: ""      # Deepgram key for BYOK mode
+  deepgram_model: nova-2
+  language: en-US
+```
+
+---
+
+## Build & Deployment Notes
+
+### Docker Compose (local dev)
+
+```yaml
+version: '3.8'
+services:
+  db:
+    image: postgres:15
+    environment:
+      POSTGRES_DB: transcription_proxy
+      POSTGRES_USER: user
+      POSTGRES_PASSWORD: password
+    ports:
+      - "5432:5432"
+    volumes:
+      - pgdata:/var/lib/postgresql/data
+
+  app:
+    build: .
+    ports:
+      - "3000:3000"
+    environment:
+      DATABASE_URL: postgresql://user:password@db:5432/transcription_proxy
+    depends_on:
+      - db
+    volumes:
+      - .:/app
+      - /app/node_modules
+
+volumes:
+  pgdata:
+```
+
+### Production Deployment
+
+This service is a good fit for deployment on AnHonestHost WHP as a containerized app,
+or on a small DigitalOcean/Linode VPS. Requirements are light:
+- 512MB RAM is sufficient
+- Postgres can be the same instance as other services or managed (e.g., Supabase free tier)
+- Needs a public domain with SSL for WebSocket (`wss://`) to work from desktop clients
+
+Reverse proxy config (Nginx or HAProxy) should:
+- Proxy HTTP → `localhost:3000`
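+- Pass `Upgrade` and `Connection` headers for WebSocket support (in Nginx this also needs
+  `proxy_http_version 1.1`)
+- Use a long read timeout, since sessions can be long: `proxy_read_timeout 3600` in Nginx, or
+  `timeout tunnel` in HAProxy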
+
+---
+
+## Implementation Order
+
+Build and test in this sequence:
+
+1. Project scaffold + DB connection + migrations
+2. Auth (register/login/JWT) — test with curl
+3. Stripe billing + webhook — test with Stripe CLI (`stripe listen`)
+4. WebSocket proxy — test with a simple browser WebSocket client first
+5. Usage tracking and credit decrement
+6. Account/usage routes
+7. Web dashboard
+8. Desktop app integration (separate PR in local-transcription repo)
+
+---
+
+## Key Decisions & Rationale
+
+| Decision | Choice | Reason |
+|---|---|---|
+| Credits model | Prepaid | No surprise charges, simpler billing, better for irregular streamer usage |
+| WebSocket library | `ws` | Lightweight, no abstraction overhead, plays well with raw binary audio |
+| Auth | JWT (stateless) | Desktop app holds token locally; no session store needed |
+| DB driver | `node-postgres` (pg) | No ORM overhead; schema is simple enough for raw SQL |
+| Migrations | Raw SQL files | No dependency on Knex/Prisma; easy to inspect and reason about |
+| Rate limiting | In-memory | Redis is overkill for this scale; single-process Node is fine initially |
+| Frontend | Vanilla JS | Dashboard is simple utility UI; no framework justified |
+
+---
+
+## What This Plan Does NOT Cover (Future Work)
+
+- OAuth / social login
+- Admin panel for managing users
+- Refund / credit adjustment tooling
+- Email verification
+- Password reset flow
+- Multi-language support beyond Deepgram's defaults
+- Analytics / aggregated usage reporting
+- Self-hosted Whisper inference as a third backend option
--- a/README.md
+++ b/README.md
@@ -1,494 +1,318 @@
-# Local Transcription for Streamers
+# Local Transcription

-A local speech-to-text application designed for streamers that provides real-time transcription using Whisper or similar models. Multiple users can run the application locally and sync their transcriptions to a centralized web stream that can be easily captured in OBS or other streaming software.
+A real-time speech-to-text desktop application for streamers. Runs locally on your machine with GPU or CPU, displays transcriptions via OBS browser source, and optionally syncs with other users through a multi-user server.
+
+**Version 1.4.0**

 ## Features

- **Standalone Desktop Application**: Use locally with built-in GUI display - no server required
- **Local Transcription**: Run Whisper (or compatible models) locally on your machine
- **CPU/GPU Support**: Choose between CPU or GPU processing based on your hardware
- **Real-time Processing**: Live audio transcription with minimal latency
+- **Real-Time Transcription**: Live speech-to-text using Whisper models with minimal latency
+- **Cross-Platform**: Native desktop app for Windows, macOS, and Linux via [Tauri](https://tauri.app/)
+- **Dual Transcription Modes**: Local (Whisper) or cloud (Deepgram) with managed billing or BYOK
+- **CPU & GPU Support**: Automatic detection of CUDA (NVIDIA), MPS (Apple Silicon), or CPU fallback
+- **Advanced Voice Detection**: Dual-layer VAD (WebRTC + Silero) for accurate speech detection
+- **OBS Integration**: Built-in web server for browser source capture at `http://localhost:8080`
+- **Multi-User Sync**: Optional Node.js server to sync transcriptions across multiple users
+- **Custom Fonts**: Support for system fonts, web-safe fonts, Google Fonts, and custom font files
+- **Customizable Colors**: User-configurable colors for name, text, and background
 - **Noise Suppression**: Built-in audio preprocessing to reduce background noise
- **User Configuration**: Set your display name and preferences through the GUI
- **Optional Multi-user Sync**: Connect to a server to sync transcriptions with other users
- **OBS Integration**: Web-based output designed for easy browser source capture
- **Privacy-First**: All processing happens locally; only transcription text is shared
- **Customizable**: Configure model size, language, and streaming settings
+- **Auto-Updates**: Automatic update checking with release notes display
+
+## Architecture
+
+The application uses a two-process architecture:
+
+1. **Tauri Shell** (Svelte 5 frontend) — lightweight native window (~50MB) rendering the UI
+2. **Python Backend** (sidecar) — headless process running transcription, audio capture, and the OBS web server
+
+The Tauri frontend communicates with the Python backend via REST API and WebSocket, following the same pattern as [voice-to-notes](https://repo.anhonesthost.net/MacroPad/voice-to-notes).
+
+```
+Tauri App (user launches this)
+  └─ Spawns Python backend as sidecar
+       ├─ FastAPI REST API (control endpoints)
+       ├─ WebSocket /ws/control (real-time state + transcriptions)
+       ├─ OBS web display at http://localhost:8080
+       └─ Transcription engine (Whisper or Deepgram)
+```
+
+> **Legacy GUI**: The original PySide6/Qt desktop GUI (`main.py`) still works alongside the new Tauri frontend during the transition period.

 ## Quick Start

 ### Running from Source

 ```bash
-# Install dependencies
+# Install Python dependencies
 uv sync

-# Run the application
+# Run the Tauri app (frontend + backend)
+npm install
+npm run tauri dev
+
+# Or run just the headless backend (for development)
+uv run python -m backend.main_headless
+
+# Or run the legacy PySide6 GUI
 uv run python main.py
 ```

-### Building Standalone Executables
+### Using Pre-Built Executables

-To create standalone executables for distribution:
+Download the latest release from the [releases page](https://repo.anhonesthost.net/streamer-tools/local-transcription/releases):
+
+- **App installer** (Tauri shell): `.msi` (Windows), `.dmg` (macOS), `.deb`/`.rpm`/`.AppImage` (Linux)
+- **Sidecar** (Python backend): Download the matching `sidecar-*` zip for your platform (CUDA or CPU)
+
+### Building from Source

-**Linux:**
 ```bash
-./build.sh
-```
+# Build the Tauri app
+npm install
+npm run tauri build
+# Output: src-tauri/target/release/bundle/

-**Windows:**
-```cmd
+# Build the Python sidecar (headless, no Qt)
+uv sync
+uv run pyinstaller local-transcription-headless.spec
+# Output: dist/local-transcription-backend/
+
+# Build the legacy PySide6 app (Linux)
+./build.sh
+# Build the legacy PySide6 app (Windows)
 build.bat
 ```

 For detailed build instructions, see [BUILD.md](BUILD.md).

-## Architecture Overview
+## Usage

-The application can run in two modes:
+### Standalone Mode

-### Standalone Mode (No Server Required):
-1. **Desktop Application**: Captures audio, performs speech-to-text, and displays transcriptions locally in a GUI window
+1. Launch the application
+2. Select your microphone from the audio device dropdown
+3. Choose a Whisper model (smaller = faster, larger = more accurate):
+   - `tiny.en` / `tiny` — Fastest, good for quick captions
+   - `base.en` / `base` — Balanced speed and accuracy
+   - `small.en` / `small` — Better accuracy
+   - `medium.en` / `medium` — High accuracy
+   - `large-v3` — Best accuracy (requires more resources)
+4. Click **Start** to begin transcription
+5. Transcriptions appear in the main window and at `http://localhost:8080`

-### Multi-user Sync Mode (Optional):
-1. **Local Transcription Client**: Captures audio, performs speech-to-text, and sends results to the web server
-2. **Centralized Web Server**: Aggregates transcriptions from multiple clients and serves a web stream
-3. **Web Stream Interface**: Browser-accessible page displaying synchronized transcriptions (for OBS capture)
+### Remote Transcription (Deepgram)

-## Use Cases
+Instead of local Whisper models, you can use cloud-based transcription:

- **Multi-language Streams**: Multiple translators transcribing in different languages
- **Accessibility**: Provide real-time captions for viewers
- **Collaborative Podcasts**: Multiple hosts with separate transcriptions
- **Gaming Commentary**: Track who said what in multiplayer sessions
+- **Managed mode**: Sign up via the transcription proxy for metered billing
+- **BYOK mode**: Bring your own Deepgram API key for direct access

---
+Configure in Settings > Remote Transcription.

-## Implementation Plan
+### OBS Browser Source Setup

-### Phase 1: Standalone Desktop Application
+1. Start the Local Transcription app
+2. In OBS, add a **Browser** source
+3. Set URL to `http://localhost:8080`
+4. Set dimensions (e.g., 1920x300)
+5. Check "Shutdown source when not visible" for performance

-**Objective**: Build a fully functional standalone transcription app with GUI that works without any server
+### Multi-User Mode (Optional)

-#### Components:
-1. **Audio Capture Module**
-   - Capture system audio or microphone input
-   - Support multiple audio sources (virtual audio cables, physical devices)
-   - Real-time audio buffering with configurable chunk sizes
-   - **Noise Suppression**: Preprocess audio to reduce background noise
-   - Libraries: `pyaudio`, `sounddevice`, `noisereduce`, `webrtcvad`
+For syncing transcriptions across multiple users (e.g., multi-host streams or translation teams):

-2. **Noise Suppression Engine**
-   - Real-time noise reduction using RNNoise or noisereduce
-   - Adjustable noise reduction strength
-   - Optional VAD (Voice Activity Detection) to skip silent segments
-   - Libraries: `noisereduce`, `rnnoise-python`, `webrtcvad`
+1. Deploy the Node.js server (see [server/nodejs/README.md](server/nodejs/README.md))
+2. In the app settings, enable **Server Sync**
+3. Enter the server URL (e.g., `http://your-server:3000/api/send`)
+4. Set a room name and passphrase (shared with other users)
+5. In OBS, use the server's display URL with your room name:
+   ```
+   http://your-server:3000/display?room=YOURROOM&timestamps=true&maxlines=50
+   ```

-3. **Transcription Engine**
-   - Integrate OpenAI Whisper (or alternatives: faster-whisper, whisper.cpp)
-   - Support multiple model sizes (tiny, base, small, medium, large)
-   - CPU and GPU inference options
-   - Model management and automatic downloading
-   - Libraries: `openai-whisper`, `faster-whisper`, `torch`
+## Configuration

-4. **Device Selection**
-   - Auto-detect available compute devices (CPU, CUDA, MPS for Mac)
-   - Allow user to specify preferred device via GUI
-   - Graceful fallback if GPU unavailable
-   - Display device status and performance metrics
+Settings are stored at `~/.local-transcription/config.yaml` and can be modified through the GUI settings panel or the REST API.

-5. **Desktop GUI Application**
-   - Cross-platform GUI using PyQt6, Tkinter, or CustomTkinter
-   - Main transcription display window (scrolling text area)
-   - Settings panel for configuration
-   - User name input field
-   - Audio input device selector
-   - Model size selector
-   - CPU/GPU toggle
-   - Start/Stop transcription button
-   - Optional: System tray integration
-   - Libraries: `PyQt6`, `customtkinter`, or `tkinter`
+### Key Settings

-6. **Local Display**
-   - Real-time transcription display in GUI window
-   - Scrolling text with timestamps
-   - User name/label shown with transcriptions
-   - Copy transcription to clipboard
-   - Optional: Save transcription to file (TXT, SRT, VTT)
+| Setting | Description | Default |
+|---------|-------------|---------|
+| `transcription.model` | Whisper model to use | `base.en` |
+| `transcription.device` | Processing device (auto/cuda/cpu) | `auto` |
+| `transcription.enable_realtime_transcription` | Show preview while speaking | `false` |
+| `transcription.silero_sensitivity` | VAD sensitivity (0-1, lower = more sensitive) | `0.4` |
+| `transcription.post_speech_silence_duration` | Silence before finalizing (seconds) | `0.3` |
+| `transcription.continuous_mode` | Fast speaker mode for quick talkers | `false` |
+| `remote.mode` | Transcription mode (local/managed/byok) | `local` |
+| `display.show_timestamps` | Show timestamps with transcriptions | `true` |
+| `display.fade_after_seconds` | Fade out time (0 = never) | `10` |
+| `display.font_source` | Font type (System Font/Web-Safe/Google Font/Custom File) | `System Font` |
+| `web_server.port` | Local web server port | `8080` |

-#### Tasks:
- [ ] Set up project structure and dependencies
- [ ] Implement audio capture with device selection
- [ ] Add noise suppression and VAD preprocessing
- [ ] Integrate Whisper model loading and inference
- [ ] Add CPU/GPU device detection and selection logic
- [ ] Create real-time audio buffer processing pipeline
- [ ] Design and implement GUI layout (main window)
- [ ] Add settings panel with user name configuration
- [ ] Implement local transcription display area
- [ ] Add start/stop controls and status indicators
- [ ] Test transcription accuracy and latency
- [ ] Test noise suppression effectiveness
-
---
-
-### Phase 2: Web Server and Sync System
-
-**Objective**: Create a centralized server to aggregate and serve transcriptions
-
-#### Components:
-1. **Web Server**
-   - FastAPI or Flask-based REST API
-   - WebSocket support for real-time updates
-   - User/client registration and management
-   - Libraries: `fastapi`, `uvicorn`, `websockets`
-
-2. **Transcription Aggregator**
-   - Receive transcription chunks from multiple clients
-   - Associate transcriptions with user IDs/names
-   - Timestamp management and synchronization
-   - Buffer management for smooth streaming
-
-3. **Database/Storage** (Optional)
-   - Store transcription history (SQLite for simplicity)
-   - Session management
-   - Export functionality (SRT, VTT, TXT formats)
-
-#### API Endpoints:
- `POST /api/register` - Register a new client
- `POST /api/transcription` - Submit transcription chunk
- `WS /api/stream` - WebSocket for real-time transcription stream
- `GET /stream` - Web page for OBS browser source
-
-#### Tasks:
- [ ] Set up FastAPI server with CORS support
- [ ] Implement WebSocket handler for real-time streaming
- [ ] Create client registration system
- [ ] Build transcription aggregation logic
- [ ] Add timestamp synchronization
- [ ] Create data models for clients and transcriptions
-
---
-
-### Phase 3: Client-Server Communication (Optional Multi-user Mode)
-
-**Objective**: Add optional server connectivity to enable multi-user transcription sync
-
-#### Components:
-1. **HTTP/WebSocket Client**
-   - Register client with server on startup
-   - Send transcription chunks as they're generated
-   - Handle connection drops and reconnection
-   - Libraries: `requests`, `websockets`
-
-2. **Configuration System**
-   - Config file for server URL, API keys, user settings
-   - Model preferences (size, language)
-   - Audio input settings
-   - Format: YAML or JSON
-
-3. **Status Monitoring**
-   - Connection status indicator
-   - Transcription queue health
-   - Error handling and logging
-
-#### Tasks:
- [ ] Add "Enable Server Sync" toggle to GUI
- [ ] Add server URL configuration field in settings
- [ ] Implement WebSocket client for sending transcriptions
- [ ] Add configuration file support (YAML/JSON)
- [ ] Create connection management with auto-reconnect
- [ ] Add local logging and error handling
- [ ] Add server connection status indicator to GUI
- [ ] Allow app to function normally if server is unavailable
-
---
-
-### Phase 4: Web Stream Interface (OBS Integration)
-
-**Objective**: Create a web page that displays synchronized transcriptions for OBS
-
-#### Components:
-1. **Web Frontend**
-   - HTML/CSS/JavaScript page for displaying transcriptions
-   - Responsive design with customizable styling
-   - Auto-scroll with configurable retention window
-   - Libraries: Vanilla JS or lightweight framework (Alpine.js, htmx)
-
-2. **Styling Options**
-   - Customizable fonts, colors, sizes
-   - Background transparency for OBS chroma key
-   - User name/ID display options
-   - Timestamp display (optional)
-
-3. **Display Modes**
-   - Scrolling captions (like live TV captions)
-   - Multi-user panel view (separate sections per user)
-   - Overlay mode (minimal UI for transparency)
-
-#### Tasks:
- [ ] Create HTML template for transcription display
- [ ] Implement WebSocket client in JavaScript
- [ ] Add CSS styling with OBS-friendly transparency
- [ ] Create customization controls (URL parameters or UI)
- [ ] Test with OBS browser source
- [ ] Add configurable retention/scroll behavior
-
---
-
-### Phase 5: Advanced Features
-
-**Objective**: Enhance functionality and user experience
-
-#### Features:
-1. **Language Detection**
-   - Auto-detect spoken language
-   - Multi-language support in single stream
-   - Language selector in GUI
-
-2. **Speaker Diarization** (Optional)
-   - Identify different speakers
-   - Label transcriptions by speaker
-   - Useful for multi-host streams
-
-3. **Profanity Filtering**
-   - Optional word filtering/replacement
-   - Customizable filter lists
-   - Toggle in GUI settings
-
-4. **Advanced Noise Profiles**
-   - Save and load custom noise profiles
-   - Adaptive noise suppression
-   - Different profiles for different environments
-
-5. **Export Functionality**
-   - Save transcriptions in multiple formats (TXT, SRT, VTT, JSON)
-   - Export button in GUI
-   - Automatic session saving
-
-6. **Hotkey Support**
-   - Global hotkeys to start/stop transcription
-   - Mute/unmute hotkey
-   - Quick save hotkey
-
-7. **Docker Support**
-   - Containerized server deployment
-   - Docker Compose for easy multi-component setup
-   - Pre-built images for easy deployment
-
-8. **Themes and Customization**
-   - Dark/light theme toggle
-   - Customizable font sizes and colors for display
-   - OBS-friendly transparent overlay mode
-
-#### Tasks:
- [ ] Add language detection and multi-language support
- [ ] Implement speaker diarization
- [ ] Create optional profanity filter
- [ ] Add export functionality (SRT, VTT, plain text, JSON)
- [ ] Implement global hotkey support
- [ ] Create Docker containers for server component
- [ ] Add theme customization options
- [ ] Create advanced noise profile management
-
---
-
-## Technology Stack
-
-### Local Client:
- **Python 3.9+**
- **GUI**: PyQt6 / CustomTkinter / tkinter
- **Audio**: PyAudio / sounddevice
- **Noise Suppression**: noisereduce / rnnoise-python
- **VAD**: webrtcvad
- **ML Framework**: PyTorch (for Whisper)
- **Transcription**: openai-whisper / faster-whisper
- **Networking**: websockets, requests (optional for server sync)
- **Config**: PyYAML / json
-
-### Server:
- **Backend**: FastAPI / Flask
- **WebSocket**: python-websockets / FastAPI WebSockets
- **Server**: Uvicorn / Gunicorn
- **Database** (optional): SQLite / PostgreSQL
- **CORS**: fastapi-cors
-
-### Web Interface:
- **Frontend**: HTML5, CSS3, JavaScript (ES6+)
- **Real-time**: WebSocket API
- **Styling**: CSS Grid/Flexbox for layout
-
---
+See [config/default_config.yaml](config/default_config.yaml) for all available options.

 ## Project Structure

 ```
 local-transcription/
- client/                      # Local transcription client
-    __init__.py
-    audio_capture.py         # Audio input handling
-    transcription_engine.py  # Whisper integration
-    network_client.py        # Server communication
-    config.py                # Configuration management
-    main.py                  # Client entry point
- server/                      # Centralized web server
-    __init__.py
-    api.py                   # FastAPI routes
-    websocket_handler.py     # WebSocket management
-    models.py                # Data models
-    database.py              # Optional DB layer
-    main.py                  # Server entry point
- web/                         # Web stream interface
-    index.html               # OBS browser source page
-    styles.css               # Customizable styling
-    app.js                   # WebSocket client & UI logic
- config/
-    client_config.example.yaml
-    server_config.example.yaml
- tests/
-    test_audio.py
-    test_transcription.py
-    test_server.py
- requirements.txt             # Python dependencies
- README.md
- main.py                      # Combined launcher (optional)
+├── src/                             # Svelte 5 frontend (Tauri UI)
+│   ├── App.svelte                   # Main app shell
+│   ├── lib/components/              # UI components
+│   │   ├── Header.svelte
+│   │   ├── StatusBar.svelte
+│   │   ├── Controls.svelte
+│   │   ├── TranscriptionDisplay.svelte
+│   │   └── Settings.svelte
+│   └── lib/stores/                  # Reactive state management
+│       ├── backend.ts               # WebSocket + REST API client
+│       ├── config.ts                # App configuration
+│       └── transcriptions.ts        # Transcription data
+├── src-tauri/                       # Tauri v2 Rust shell
+│   ├── src/main.rs
+│   └── tauri.conf.json
+├── backend/                         # Headless Python backend (sidecar)
+│   ├── app_controller.py            # Orchestration logic (engine, sync, config)
+│   ├── api_server.py                # FastAPI REST + WebSocket control API
+│   └── main_headless.py             # Headless entry point
+├── client/                          # Core transcription modules
+│   ├── audio_capture.py             # Audio input handling
+│   ├── transcription_engine_realtime.py  # RealtimeSTT / Whisper
+│   ├── deepgram_transcription.py    # Deepgram cloud transcription
+│   ├── noise_suppression.py         # VAD and noise reduction
+│   ├── device_utils.py              # CPU/GPU/MPS detection
+│   ├── config.py                    # Configuration management
+│   ├── server_sync.py               # Multi-user server client
+│   └── update_checker.py            # Auto-update functionality
+├── gui/                             # Legacy PySide6/Qt GUI
+│   ├── main_window_qt.py
+│   ├── settings_dialog_qt.py
+│   └── transcription_display_qt.py
+├── server/                          # Web servers
+│   ├── web_display.py               # Local FastAPI server for OBS
+│   └── nodejs/                      # Multi-user sync server
+├── .gitea/workflows/                # CI/CD
+│   ├── release.yml                  # Tauri app builds (all platforms)
+│   └── build-sidecar.yml            # Python sidecar builds (CUDA + CPU)
+├── config/
+│   └── default_config.yaml          # Default settings template
+├── main.py                          # Legacy GUI entry point
+├── main_cli.py                      # CLI version (for testing)
+├── local-transcription.spec         # PyInstaller config (legacy, with PySide6)
+├── local-transcription-headless.spec # PyInstaller config (headless sidecar)
+├── pyproject.toml                   # Python dependencies
+└── package.json                     # Node.js / Tauri dependencies
 ```

---
+## Technology Stack

-## Installation (Planned)
+### Frontend (Tauri)
+- **Tauri v2** — Native cross-platform shell (Rust)
+- **Svelte 5** — Reactive UI framework (TypeScript)
+- **Vite** — Frontend build tool

-### Prerequisites:
- Python 3.9 or higher
- CUDA-capable GPU (optional, for GPU acceleration)
- FFmpeg (required by Whisper)
+### Backend (Python Sidecar)
+- **Python 3.9+**
+- **FastAPI + Uvicorn** — REST API and WebSocket server
+- **RealtimeSTT** — Real-time speech-to-text with advanced VAD
+- **faster-whisper** — Optimized Whisper model inference (CTranslate2)
+- **PyTorch** — ML framework (CUDA-enabled builds available)
+- **sounddevice** — Cross-platform audio capture
+- **webrtcvad + silero_vad** — Voice activity detection

-### Steps:
+### Multi-User Server (Optional)
+- **Node.js + Express + WebSocket** — Real-time sync server

-1. **Clone the repository**
-   ```bash
-   git clone <repository-url>
-   cd local-transcription
-   ```
+### Build & CI/CD
+- **PyInstaller** — Python sidecar packaging
+- **Tauri CLI** — App bundling (.msi, .dmg, .deb, .rpm, .AppImage)
+- **Gitea Actions** — Automated cross-platform builds
+- **uv** — Fast Python package manager

-2. **Install dependencies**
-   ```bash
-   pip install -r requirements.txt
-   ```
+## CI/CD

-3. **Download Whisper models**
-   ```bash
-   # Models will be auto-downloaded on first run
-   # Or manually download:
-   python -c "import whisper; whisper.load_model('base')"
-   ```
+Two Gitea Actions workflows in `.gitea/workflows/`:

-4. **Configure client**
-   ```bash
-   cp config/client_config.example.yaml config/client_config.yaml
-   # Edit config/client_config.yaml with your settings
-   ```
+| Workflow | Trigger | Produces |
+|----------|---------|----------|
+| `release.yml` | Push to `main` | Tauri app installers for all platforms |
+| `build-sidecar.yml` | Push to `main` that changes `client/`, `server/`, `backend/`, `pyproject.toml`, or `local-transcription-headless.spec`; manual dispatch | Python sidecar zips (CUDA + CPU) |

-5. **Run the server** (one instance)
-   ```bash
-   python server/main.py
-   ```
+Both workflows require a `BUILD_TOKEN` secret in the repo settings (Gitea API token with release write access).

-6. **Run the client** (on each user's machine)
-   ```bash
-   python client/main.py
-   ```
+### Release Artifacts

-7. **Add to OBS**
-   - Add a Browser Source
-   - URL: `http://<server-ip>:8000/stream`
-   - Set width/height as needed
-   - Check "Shutdown source when not visible" for performance
+| Platform | App Installer | Sidecar (CUDA) | Sidecar (CPU) |
+|----------|--------------|----------------|---------------|
+| Linux x86_64 | `.deb`, `.rpm`, `.AppImage` | `sidecar-linux-x86_64-cuda.zip` | `sidecar-linux-x86_64-cpu.zip` |
+| Windows x86_64 | `.msi`, `-setup.exe` | `sidecar-windows-x86_64-cuda.zip` | `sidecar-windows-x86_64-cpu.zip` |
+| macOS ARM64 | `.dmg` | — | `sidecar-macos-aarch64-cpu.zip` |

---
+## System Requirements

-## Configuration (Planned)
+### Minimum
+- 4GB RAM
+- Any modern CPU

-### Client Configuration:
-```yaml
-user:
-  name: "Streamer1"          # Display name for transcriptions
-  id: "unique-user-id"       # Optional unique identifier
+### Recommended (for local real-time transcription)
+- 8GB+ RAM
+- NVIDIA GPU with CUDA support (for GPU acceleration)

-audio:
-  input_device: "default"    # or specific device index
-  sample_rate: 16000
-  chunk_duration: 2.0        # seconds
+### For Building
+- **Tauri app**: Node.js 20+, Rust stable, platform SDK (see [Tauri prerequisites](https://tauri.app/start/prerequisites/))
+- **Python sidecar**: Python 3.9+, uv, PyInstaller
+- **Linux**: `libgtk-3-dev`, `libwebkit2gtk-4.1-dev`, `libappindicator3-dev`, `librsvg2-dev`, `patchelf`
+- **Windows**: Visual Studio Build Tools, WebView2
+- **macOS**: Xcode Command Line Tools

-noise_suppression:
-  enabled: true              # Enable/disable noise reduction
-  strength: 0.7              # 0.0 to 1.0 - reduction strength
-  method: "noisereduce"      # "noisereduce" or "rnnoise"
+## Troubleshooting

-transcription:
-  model: "base"              # tiny, base, small, medium, large
-  device: "cuda"             # cpu, cuda, mps
-  language: "en"             # or "auto" for detection
-  task: "transcribe"         # or "translate"
+### Model Loading Issues
+- Models download automatically on first use to `~/.cache/huggingface/`
+- First run requires internet connection
+- Check disk space (models range from 75MB to 3GB)

-processing:
-  use_vad: true              # Voice Activity Detection
-  min_confidence: 0.5        # Minimum transcription confidence
-
-server_sync:
-  enabled: false             # Enable multi-user server sync
-  url: "ws://localhost:8000" # Server URL (when enabled)
-  api_key: ""                # Optional API key
-
-display:
-  show_timestamps: true      # Show timestamps in local display
-  max_lines: 100             # Maximum lines to keep in display
-  font_size: 12              # GUI font size
+### Audio Device Issues
+```bash
+# List available audio devices
+uv run python main_cli.py --list-devices
 ```
+- Ensure microphone permissions are granted (especially on macOS)
+- Try different device indices in settings

-### Server Configuration:
-```yaml
-server:
-  host: "0.0.0.0"
-  port: 8000
-  api_key_required: false
-
-stream:
-  max_clients: 10
-  buffer_size: 100         # messages to buffer
-  retention_time: 300      # seconds
-
-database:
-  enabled: false
-  path: "transcriptions.db"
+### GPU Not Detected
+```bash
+# Check CUDA availability
+uv run python -c "import torch; print(torch.cuda.is_available())"
 ```
+- Install NVIDIA drivers (CUDA toolkit is bundled in CUDA sidecar builds)
+- The app automatically falls back to CPU if no GPU is available

---
+### Web Server Port Conflicts
+- Default port is 8080; the app tries ports 8080-8084 automatically
+- Change the port in settings or edit the config file
+- Check for conflicts: `lsof -i :8080` (Linux/macOS) or `netstat -ano | findstr :8080` (Windows)

-## Roadmap
+## Use Cases

- [x] Project planning and architecture design
- [ ] Phase 1: Standalone desktop application with GUI
- [ ] Phase 2: Web server and sync system (optional multi-user mode)
- [ ] Phase 3: Client-server communication (optional)
- [ ] Phase 4: Web stream interface for OBS (optional)
- [ ] Phase 5: Advanced features (hotkeys, themes, Docker, etc.)
-
---
+- **Live Streaming Captions**: Add real-time captions to your Twitch/YouTube streams
+- **Multi-Language Translation**: Multiple translators transcribing in different languages
+- **Accessibility**: Provide captions for hearing-impaired viewers
+- **Podcast Recording**: Real-time transcription for multi-host shows
+- **Gaming Commentary**: Track who said what in multiplayer sessions

 ## Contributing

-Contributions are welcome! Please feel free to submit issues or pull requests.
-
---
+Contributions are welcome! Please feel free to submit issues or pull requests at the [repository](https://repo.anhonesthost.net/streamer-tools/local-transcription).

 ## License

-[Choose appropriate license - MIT, Apache 2.0, etc.]
-
---
+MIT License

 ## Acknowledgments

- OpenAI Whisper for the excellent speech recognition model
- The streaming community for inspiration and use cases
+- [OpenAI Whisper](https://github.com/openai/whisper) for the speech recognition model
+- [RealtimeSTT](https://github.com/KoljaB/RealtimeSTT) for real-time transcription capabilities
+- [faster-whisper](https://github.com/SYSTRAN/faster-whisper) for optimized inference
+- [Tauri](https://tauri.app/) for the cross-platform desktop framework
+- [Deepgram](https://deepgram.com/) for the cloud transcription API
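
Review note on the README's "Web Server Port Conflicts" section: the fallback over ports 8080-8084 can also be probed ahead of time. The following is a minimal sketch, not the app's actual implementation. `find_free_port` is a hypothetical helper introduced here for illustration; it relies only on the standard `socket` module to check whether a port can be bound.

```python
import socket
from typing import Optional


def find_free_port(host: str = "127.0.0.1", start: int = 8080, attempts: int = 5) -> Optional[int]:
    """Return the first port in [start, start + attempts) that can be bound, else None.

    Hypothetical helper: the defaults mirror the 8080-8084 range named in the README.
    """
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Port is in use (or binding is not permitted); try the next one.
                continue
            return port
    return None
```

A probe like this is only advisory: another process can take the port between the check and the real server start, so the server itself should still handle a bind failure.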
--- a/backend/__init__.py
+++ b/backend/__init__.py
@@ -0,0 +1 @@
+"""Backend package for headless transcription service."""
--- a/backend/api_server.py
+++ b/backend/api_server.py
@@ -0,0 +1,323 @@
+"""FastAPI control API server for the headless transcription backend.
+
+Extends the existing OBS display server with REST endpoints and a
+control WebSocket channel so that a Tauri (or any other) frontend
+can drive the application.
+"""
+
+import asyncio
+import json
+from datetime import datetime
+from typing import List, Optional
+
+from fastapi import FastAPI, WebSocket, HTTPException
+from fastapi.middleware.cors import CORSMiddleware
+from pydantic import BaseModel
+
+from backend.app_controller import AppController
+
+
+# ── Request / Response Models ──────────────────────────────────────
+
+class ConfigUpdate(BaseModel):
+    """Batch config update payload. Keys use dot-notation."""
+    settings: dict  # e.g. {"user.name": "Alice", "transcription.model": "small.en"}
+
+
+class LoginRequest(BaseModel):
+    email: str
+    password: str
+    server_url: str
+
+
+class RegisterRequest(BaseModel):
+    email: str
+    password: str
+    server_url: str
+
+
+class SkipVersionRequest(BaseModel):
+    version: str
+
+
+class SaveFileRequest(BaseModel):
+    path: str
+    text: str
+
+
+# ── API Server ─────────────────────────────────────────────────────
+
+class APIServer:
+    """Wraps AppController with a FastAPI application exposing control endpoints."""
+
+    def __init__(self, controller: AppController):
+        self.controller = controller
+        self.control_connections: List[WebSocket] = []
+
+        self.app = FastAPI(title="Local Transcription API", version="1.0.0")
+
+        # Allow Tauri webview origin
+        self.app.add_middleware(
+            CORSMiddleware,
+            allow_origins=["*"],  # Tauri uses tauri://localhost or https://tauri.localhost
+            allow_credentials=True,
+            allow_methods=["*"],
+            allow_headers=["*"],
+        )
+
+        self._setup_routes()
+        self._wire_controller_callbacks()
+
+    def _wire_controller_callbacks(self):
+        """Wire AppController callbacks to broadcast over /ws/control."""
+        original_state_cb = self.controller.on_state_changed
+
+        def on_state_changed(state: str, message: str):
+            if original_state_cb:
+                original_state_cb(state, message)
+            self._broadcast_control({"type": "state_changed", "state": state, "message": message})
+
+        self.controller.on_state_changed = on_state_changed
+
+        def on_transcription(data: dict):
+            self._broadcast_control({"type": "transcription", **data})
+
+        self.controller.on_transcription = on_transcription
+
+        def on_preview(data: dict):
+            self._broadcast_control({"type": "preview", **data})
+
+        self.controller.on_preview = on_preview
+
+        def on_error(msg: str):
+            self._broadcast_control({"type": "error", "message": msg})
+
+        self.controller.on_error = on_error
+
+        def on_credits_low(seconds: int):
+            self._broadcast_control({"type": "credits_low", "seconds_remaining": seconds})
+
+        self.controller.on_credits_low = on_credits_low
+
+    def _broadcast_control(self, data: dict):
+        """Send a message to all connected /ws/control clients."""
+        if not self.control_connections:
+            return
+
+        message = json.dumps(data)
+        disconnected = []
+
+        for ws in self.control_connections:
+            try:
+                asyncio.run_coroutine_threadsafe(
+                    ws.send_text(message),
+                    asyncio.get_event_loop(),
+                )
+            except Exception:
+                disconnected.append(ws)
+
+        for ws in disconnected:
+            self.control_connections.remove(ws)
+
+    def _setup_routes(self):
+        """Register all API routes."""
+        app = self.app
+        ctrl = self.controller
+
+        # ── Status ─────────────────────────────────────────────
+
+        @app.get("/api/status")
+        async def get_status():
+            return ctrl.get_status()
+
+        @app.get("/api/version")
+        async def get_version():
+            from version import __version__
+            return {"version": __version__}
+
+        # ── Transcription Control ──────────────────────────────
+
+        @app.post("/api/start")
+        async def start_transcription():
+            success, message = ctrl.start_transcription()
+            if not success:
+                raise HTTPException(status_code=400, detail=message)
+            return {"status": "ok", "message": message}
+
+        @app.post("/api/stop")
+        async def stop_transcription():
+            success, message = ctrl.stop_transcription()
+            if not success:
+                raise HTTPException(status_code=400, detail=message)
+            return {"status": "ok", "message": message}
+
+        @app.post("/api/clear")
+        async def clear_transcriptions():
+            count = ctrl.clear_transcriptions()
+            return {"status": "ok", "cleared": count}
+
+        @app.get("/api/transcriptions")
+        async def get_transcriptions():
+            show_timestamps = ctrl.config.get('display.show_timestamps', True)
+            return {
+                "count": len(ctrl.transcriptions),
+                "text": ctrl.get_transcriptions_text(include_timestamps=show_timestamps),
+                "items": [
+                    {
+                        "text": r.text,
+                        "user_name": r.user_name,
+                        "timestamp": r.timestamp.strftime("%H:%M:%S") if r.timestamp else None,
+                    }
+                    for r in ctrl.transcriptions
+                ],
+            }
+
+        @app.post("/api/save-file")
+        async def save_file(req: SaveFileRequest):
+            """Save text to a file (used by Tauri frontend after dialog)."""
+            from pathlib import Path
+            try:
+                Path(req.path).write_text(req.text, encoding="utf-8")
+                return {"status": "ok", "path": req.path}
+            except Exception as e:
+                raise HTTPException(status_code=500, detail=str(e))
+
+        # ── Configuration ──────────────────────────────────────
+
+        @app.get("/api/config")
+        async def get_config():
+            return ctrl.config.config
+
+        @app.put("/api/config")
+        async def update_config(update: ConfigUpdate):
+            engine_reloaded, message = ctrl.apply_settings(update.settings)
+            return {
+                "status": "ok",
+                "message": message,
+                "engine_reloaded": engine_reloaded,
+            }
+
+        # ── Devices ────────────────────────────────────────────
+
+        @app.get("/api/audio-devices")
+        async def get_audio_devices():
+            return {"devices": ctrl.get_audio_devices()}
+
+        @app.get("/api/compute-devices")
+        async def get_compute_devices():
+            return {"devices": ctrl.get_compute_devices()}
+
+        # ── Engine ─────────────────────────────────────────────
+
+        @app.post("/api/reload-engine")
+        async def reload_engine():
+            success, message = ctrl.reload_engine()
+            if not success:
+                raise HTTPException(status_code=500, detail=message)
+            return {"status": "ok", "message": message}
+
+        # ── Updates ────────────────────────────────────────────
+
+        @app.get("/api/check-update")
+        async def check_update():
+            return ctrl.check_for_updates()
+
+        @app.post("/api/skip-version")
+        async def skip_version(req: SkipVersionRequest):
+            ctrl.skip_version(req.version)
+            return {"status": "ok"}
+
+        # ── Managed Mode Auth Proxy ────────────────────────────
+
+        @app.post("/api/login")
+        async def login(req: LoginRequest):
+            """Proxy login to the transcription proxy server."""
+            import requests as http_requests
+            try:
+                resp = http_requests.post(
+                    f"{req.server_url}/api/auth/login",
+                    json={"email": req.email, "password": req.password},
+                    timeout=10,
+                )
+                if resp.status_code == 200:
+                    data = resp.json()
+                    ctrl.config.set('remote.auth_token', data.get('token', ''))
+                    ctrl.config.set('remote.server_url', req.server_url)
+                    return {"status": "ok", "token": data.get('token', '')}
+                else:
+                    raise HTTPException(status_code=resp.status_code, detail=resp.text)
+            except http_requests.RequestException as e:
+                raise HTTPException(status_code=502, detail=str(e))
+
+        @app.post("/api/register")
+        async def register(req: RegisterRequest):
+            """Proxy registration to the transcription proxy server."""
+            import requests as http_requests
+            try:
+                resp = http_requests.post(
+                    f"{req.server_url}/api/auth/register",
+                    json={"email": req.email, "password": req.password},
+                    timeout=10,
+                )
+                if resp.status_code in (200, 201):
+                    return {"status": "ok", "data": resp.json()}
+                else:
+                    raise HTTPException(status_code=resp.status_code, detail=resp.text)
+            except http_requests.RequestException as e:
+                raise HTTPException(status_code=502, detail=str(e))
+
+        @app.get("/api/balance")
+        async def get_balance():
+            """Proxy balance check to the transcription proxy server."""
+            import requests as http_requests
+            server_url = ctrl.config.get('remote.server_url', '')
+            token = ctrl.config.get('remote.auth_token', '')
+            if not server_url or not token:
+                raise HTTPException(status_code=400, detail="Not logged in to managed service")
+            try:
+                resp = http_requests.get(
+                    f"{server_url}/api/billing/balance",
+                    headers={"Authorization": f"Bearer {token}"},
+                    timeout=10,
+                )
+                if resp.status_code == 200:
+                    return resp.json()
+                else:
+                    raise HTTPException(status_code=resp.status_code, detail=resp.text)
+            except http_requests.RequestException as e:
+                raise HTTPException(status_code=502, detail=str(e))
+
+        # ── Control WebSocket ──────────────────────────────────
+
+        @app.websocket("/ws/control")
+        async def websocket_control(websocket: WebSocket):
+            """WebSocket channel for real-time state and transcription push."""
+            await websocket.accept()
+            self.control_connections.append(websocket)
+
+            # Send current status on connect
+            try:
+                await websocket.send_json({
+                    "type": "state_changed",
+                    "state": ctrl.state,
+                    "message": "Connected",
+                })
+            except Exception:
+                pass
+
+            try:
+                while True:
+                    # Keep alive -- client sends pings
+                    await websocket.receive_text()
+            except Exception:
+                if websocket in self.control_connections:
+                    self.control_connections.remove(websocket)
+
+        # ── Mount the existing OBS display routes ──────────────
+        # The OBS display (GET / and /ws) is handled by the
+        # TranscriptionWebServer which shares the same Uvicorn
+        # instance. It is mounted as a sub-application under /obs,
+        # so on this app it is served at /obs/ and /obs/ws.
+
+        if ctrl.web_server:
+            app.mount("/obs", ctrl.web_server.app)
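
Review note on `_broadcast_control` in this file: `asyncio.get_event_loop()` resolves the event loop for the calling thread, and in a thread that has no loop set it raises `RuntimeError`, which the code above would treat as a disconnected client. Controller callbacks can fire from worker threads (for example the engine init thread), so that loop is not necessarily the one that owns the WebSocket connections. Below is a minimal sketch of one alternative, assuming the server loop is captured once at startup. `Broadcaster`, `FakeSocket`, `bind_loop`, and `demo` are hypothetical names for illustration and are not part of this commit.

```python
import asyncio
import json
import threading
from typing import List, Optional


class FakeSocket:
    """Stand-in for a WebSocket; records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    async def send_text(self, message: str) -> None:
        self.sent.append(message)


class Broadcaster:
    """Schedules sends on an explicitly captured event loop."""

    def __init__(self) -> None:
        self.connections: List[FakeSocket] = []
        self._loop: Optional[asyncio.AbstractEventLoop] = None

    def bind_loop(self, loop: asyncio.AbstractEventLoop) -> None:
        # Call once with the loop that owns the connections.
        self._loop = loop

    def broadcast(self, data: dict) -> list:
        """Callable from any thread; returns futures for the scheduled sends."""
        if self._loop is None:
            return []
        message = json.dumps(data)
        return [
            asyncio.run_coroutine_threadsafe(ws.send_text(message), self._loop)
            for ws in list(self.connections)
        ]


def demo() -> List[str]:
    # Run an event loop in a background thread, as a server would.
    loop = asyncio.new_event_loop()
    server = threading.Thread(target=loop.run_forever, daemon=True)
    server.start()

    hub = Broadcaster()
    hub.bind_loop(loop)
    sock = FakeSocket()
    hub.connections.append(sock)

    # Broadcast from the main thread, which is not the loop's thread.
    for fut in hub.broadcast({"type": "state_changed", "state": "ready"}):
        fut.result(timeout=5)

    loop.call_soon_threadsafe(loop.stop)
    server.join(timeout=5)
    loop.close()
    return sock.sent
```

In a FastAPI handler the loop could be captured with `asyncio.get_running_loop()`. Waiting on the returned futures is only safe from a thread other than the loop's own; code running on the loop thread should leave them unawaited or attach a done callback.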
--- a/backend/app_controller.py
+++ b/backend/app_controller.py
@@ -0,0 +1,692 @@
+"""Headless application controller for transcription backend.
+
+Extracts orchestration logic from gui/main_window_qt.py into a
+Qt-free class that manages engine lifecycle, web server, server sync,
+and configuration -- all accessible via callbacks instead of Qt signals.
+"""
+
+import asyncio
+import time
+from datetime import datetime
+from pathlib import Path
+from threading import Thread, Lock
+from typing import Callable, List, Optional
+
+import sys
+
+# Add project root to path
+sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
+
+from client.config import Config
+from client.device_utils import DeviceManager
+from client.transcription_engine_realtime import RealtimeTranscriptionEngine, TranscriptionResult
+from client.deepgram_transcription import DeepgramTranscriptionEngine
+from client.server_sync import ServerSyncClient
+from server.web_display import TranscriptionWebServer
+from version import __version__
+
+
+class AppState:
+    """Enum-like class for application states."""
+    INITIALIZING = "initializing"
+    READY = "ready"
+    TRANSCRIBING = "transcribing"
+    RELOADING = "reloading"
+    ERROR = "error"
+
+
+class WebServerThread(Thread):
+    """Thread for running the web server."""
+
+    def __init__(self, web_server: TranscriptionWebServer):
+        super().__init__(daemon=True)
+        self.web_server = web_server
+        self.loop: Optional[asyncio.AbstractEventLoop] = None
+        self.error: Optional[Exception] = None
+
+    def run(self):
+        try:
+            self.loop = asyncio.new_event_loop()
+            asyncio.set_event_loop(self.loop)
+            self.loop.run_until_complete(self.web_server.start())
+        except Exception as e:
+            self.error = e
+            print(f"ERROR: Web server failed to start: {e}")
+
+
+class EngineInitThread(Thread):
+    """Thread for initializing the transcription engine without blocking."""
+
+    def __init__(self, engine, on_complete: Callable[[bool, str], None]):
+        super().__init__(daemon=True)
+        self.engine = engine
+        self.on_complete = on_complete
+
+    def run(self):
+        try:
+            success = self.engine.initialize()
+            if success:
+                self.on_complete(True, "Engine initialized successfully")
+            else:
+                self.on_complete(False, "Failed to initialize engine")
+        except Exception as e:
+            self.on_complete(False, f"Error initializing engine: {e}")
+
+
+class AppController:
+    """Headless controller managing the transcription application lifecycle.
+
+    This replaces the orchestration logic that previously lived in MainWindow.
+    It manages:
+    - Transcription engine lifecycle (init, start, stop, reload)
+    - Web server for OBS display
+    - Server sync for multi-user mode
+    - Configuration
+    - Update checking
+
+    All state changes are communicated via callbacks, making it UI-agnostic.
+    """
+
+    def __init__(self, config: Optional[Config] = None):
+        self.config = config or Config()
+        self.device_manager = DeviceManager()
+
+        # State
+        self._state = AppState.INITIALIZING
+        self._state_lock = Lock()
+        self.is_transcribing = False
+
+        # Engine
+        self.transcription_engine = None
+        self._engine_init_thread: Optional[EngineInitThread] = None
+        self.current_model_size: Optional[str] = None
+        self.current_device_config: Optional[str] = None
+
+        # Web server
+        self.web_server: Optional[TranscriptionWebServer] = None
+        self.web_server_thread: Optional[WebServerThread] = None
+        self.actual_web_port: Optional[int] = None
+
+        # Server sync
+        self.server_sync_client: Optional[ServerSyncClient] = None
+
+        # Transcription storage
+        self.transcriptions: List[TranscriptionResult] = []
+
+        # Callbacks for state notifications (set by the frontend / API server)
+        self.on_state_changed: Optional[Callable[[str, str], None]] = None  # (state, message)
+        self.on_transcription: Optional[Callable[[dict], None]] = None  # final transcription
+        self.on_preview: Optional[Callable[[dict], None]] = None  # realtime preview
+        self.on_error: Optional[Callable[[str], None]] = None
+        self.on_credits_low: Optional[Callable[[int], None]] = None
+
+    @property
+    def state(self) -> str:
+        with self._state_lock:
+            return self._state
+
+    def _set_state(self, state: str, message: str = ""):
+        with self._state_lock:
+            self._state = state
+        if self.on_state_changed:
+            self.on_state_changed(state, message)
+
+    # ── Lifecycle ──────────────────────────────────────────────────
+
+    def initialize(self):
+        """Initialize the web server and transcription engine.
+
+        Call this once at startup. Non-blocking -- engine init happens
+        in a background thread.
+        """
+        self._set_state(AppState.INITIALIZING, "Starting web server...")
+        self._start_web_server()
+
+        self._set_state(AppState.INITIALIZING, "Loading transcription engine...")
+        self._initialize_engine()
+
+    def shutdown(self):
+        """Gracefully shut down all components."""
+        # Stop transcription
+        if self.is_transcribing:
+            self.stop_transcription()
+
+        # Stop web server
+        if self.web_server_thread and self.web_server_thread.is_alive():
+            try:
+                if self.web_server_thread.loop:
+                    self.web_server_thread.loop.call_soon_threadsafe(
+                        self.web_server_thread.loop.stop
+                    )
+            except Exception as e:
+                print(f"Warning: Error stopping web server: {e}")
+
+        # Stop transcription engine
+        if self.transcription_engine:
+            try:
+                self.transcription_engine.stop()
+            except Exception as e:
+                print(f"Warning: Error stopping engine: {e}")
+
+        # Wait for engine init thread
+        if self._engine_init_thread and self._engine_init_thread.is_alive():
+            self._engine_init_thread.join(timeout=5)
+
+    # ── Web Server ─────────────────────────────────────────────────
+
+    def _start_web_server(self):
+        """Start the FastAPI web server for OBS display."""
+        try:
+            host = self.config.get('web_server.host', '127.0.0.1')
+            port = self.config.get('web_server.port', 8080)
+
+            # Gather display settings
+            ws_kwargs = self._get_web_server_kwargs(host, port)
+
+            # Try up to 5 ports
+            ports_to_try = [port] + [port + i for i in range(1, 5)]
+
+            for try_port in ports_to_try:
+                print(f"Attempting to start web server at http://{host}:{try_port}")
+                ws_kwargs['port'] = try_port
+
+                self.web_server = TranscriptionWebServer(**ws_kwargs)
+                self.web_server_thread = WebServerThread(self.web_server)
+                self.web_server_thread.start()
+
+                time.sleep(0.5)
+
+                if self.web_server_thread.error:
+                    error_str = str(self.web_server_thread.error)
+                    if "address already in use" in error_str.lower() or "errno 98" in error_str.lower():
+                        print(f"Port {try_port} is in use, trying next port...")
+                        self.web_server = None
+                        self.web_server_thread = None
+                        continue
+                    else:
+                        print(f"Web server failed to start: {self.web_server_thread.error}")
+                        self.web_server = None
+                        self.web_server_thread = None
+                        break
+                else:
+                    self.actual_web_port = try_port
+                    print(f"Web server started at http://{host}:{try_port}")
+                    return
+
+            print("WARNING: Could not start web server on any port")
+
+        except Exception as e:
+            print(f"ERROR: Failed to initialize web server: {e}")
+            self.web_server = None
+            self.web_server_thread = None
+
+    def _get_web_server_kwargs(self, host: str, port: int) -> dict:
+        """Build kwargs dict for TranscriptionWebServer from config."""
+        return dict(
+            host=host,
+            port=port,
+            show_timestamps=self.config.get('display.show_timestamps', True),
+            fade_after_seconds=self.config.get('display.fade_after_seconds', 10),
+            max_lines=self.config.get('display.max_lines', 50),
+            font_family=self.config.get('display.font_family', 'Arial'),
+            font_size=self.config.get('display.font_size', 16),
+            fonts_dir=self.config.fonts_dir,
+            font_source=self.config.get('display.font_source', 'System Font'),
+            websafe_font=self.config.get('display.websafe_font', 'Arial'),
+            google_font=self.config.get('display.google_font', 'Roboto'),
+            user_color=self.config.get('display.user_color', '#4CAF50'),
+            text_color=self.config.get('display.text_color', '#FFFFFF'),
+            background_color=self.config.get('display.background_color', '#000000B3'),
+        )
+
+    # ── Transcription Engine ───────────────────────────────────────
+
+    def _initialize_engine(self):
+        """Initialize the transcription engine in a background thread."""
+        device_config = self.config.get('transcription.device', 'auto')
+        self.device_manager.set_device(device_config)
+
+        audio_device_str = self.config.get('audio.input_device', 'default')
+        audio_device = None if audio_device_str == 'default' else int(audio_device_str)
+
+        model = self.config.get('transcription.model', 'base.en')
+        language = self.config.get('transcription.language', 'en')
+        device = self.device_manager.get_device_for_whisper()
+        compute_type = self.config.get('transcription.compute_type', 'default')
+
+        self.current_model_size = model
+        self.current_device_config = device_config
+
+        user_name = self.config.get('user.name', 'User')
+        continuous_mode = self.config.get('transcription.continuous_mode', False)
+
+        if continuous_mode:
+            post_speech_silence = 0.15
+            min_gap = 0.0
+            min_recording = 0.3
+        else:
+            post_speech_silence = self.config.get('transcription.post_speech_silence_duration', 0.3)
+            min_gap = self.config.get('transcription.min_gap_between_recordings', 0.0)
+            min_recording = self.config.get('transcription.min_length_of_recording', 0.5)
+
+        remote_mode = self.config.get('remote.mode', 'local')
+
+        if remote_mode in ('managed', 'byok'):
+            self.transcription_engine = DeepgramTranscriptionEngine(
+                config=self.config,
+                user_name=user_name,
+                input_device_index=audio_device,
+            )
+            self.transcription_engine.set_callbacks(
+                realtime_callback=self._on_realtime_transcription,
+                final_callback=self._on_final_transcription,
+            )
+            self.transcription_engine.set_error_callback(self._on_remote_error)
+            self.transcription_engine.set_credits_low_callback(self._on_credits_low)
+        else:
+            self.transcription_engine = RealtimeTranscriptionEngine(
+                model=model,
+                device=device,
+                language=language,
+                compute_type=compute_type,
+                enable_realtime_transcription=self.config.get('transcription.enable_realtime_transcription', False),
+                realtime_model=self.config.get('transcription.realtime_model', 'tiny.en'),
+                realtime_processing_pause=self.config.get('transcription.realtime_processing_pause', 0.1),
+                silero_sensitivity=self.config.get('transcription.silero_sensitivity', 0.4),
+                silero_use_onnx=self.config.get('transcription.silero_use_onnx', True),
+                webrtc_sensitivity=self.config.get('transcription.webrtc_sensitivity', 3),
+                post_speech_silence_duration=post_speech_silence,
+                min_length_of_recording=min_recording,
+                min_gap_between_recordings=min_gap,
+                pre_recording_buffer_duration=self.config.get('transcription.pre_recording_buffer_duration', 0.2),
+                beam_size=self.config.get('transcription.beam_size', 5),
+                initial_prompt=self.config.get('transcription.initial_prompt', ''),
+                no_log_file=self.config.get('transcription.no_log_file', True),
+                input_device_index=audio_device,
+                user_name=user_name,
+            )
+            self.transcription_engine.set_callbacks(
+                realtime_callback=self._on_realtime_transcription,
+                final_callback=self._on_final_transcription,
+            )
+
+        # Start init in background thread
+        self._engine_init_thread = EngineInitThread(
+            self.transcription_engine,
+            self._on_engine_ready,
+        )
+        self._engine_init_thread.start()
+
+    def _on_engine_ready(self, success: bool, message: str):
+        """Called from EngineInitThread when engine init completes."""
+        if success:
+            remote_mode = self.config.get('remote.mode', 'local')
+            if remote_mode in ('managed', 'byok'):
+                mode_label = 'Managed' if remote_mode == 'managed' else 'BYOK'
+                device_display = f"Deepgram ({mode_label})"
+            elif self.transcription_engine:
+                actual_device = self.transcription_engine.device
+                compute_type = self.transcription_engine.compute_type
+                device_display = f"{actual_device.upper()} ({compute_type})"
+            else:
+                device_display = "Unknown"
+
+            self._set_state(AppState.READY, f"Ready | Device: {device_display}")
+        else:
+            self._set_state(AppState.ERROR, message)
+
+    # ── Transcription Control ──────────────────────────────────────
+
+    def start_transcription(self) -> tuple[bool, str]:
+        """Start transcription. Returns (success, message)."""
+        if self.is_transcribing:
+            return False, "Already transcribing"
+
+        if not self.transcription_engine or not self.transcription_engine.is_ready():
+            return False, "Transcription engine not ready"
+
+        try:
+            success = self.transcription_engine.start_recording()
+            if not success:
+                return False, "Failed to start recording"
+
+            # Start server sync if enabled
+            if self.config.get('server_sync.enabled', False):
+                self._start_server_sync()
+
+            self.is_transcribing = True
+            self._set_state(AppState.TRANSCRIBING, "Transcribing...")
+            return True, "Transcription started"
+
+        except Exception as e:
+            return False, f"Failed to start transcription: {e}"
+
+    def stop_transcription(self) -> tuple[bool, str]:
+        """Stop transcription. Returns (success, message)."""
+        if not self.is_transcribing:
+            return False, "Not transcribing"
+
+        try:
+            if self.transcription_engine:
+                self.transcription_engine.stop_recording()
+
+            if self.server_sync_client:
+                self.server_sync_client.stop()
+                self.server_sync_client = None
+
+            self.is_transcribing = False
+            self._set_state(AppState.READY, "Ready")
+            return True, "Transcription stopped"
+
+        except Exception as e:
+            return False, f"Failed to stop transcription: {e}"
+
+    def clear_transcriptions(self) -> int:
+        """Clear stored transcriptions. Returns count of cleared items."""
+        count = len(self.transcriptions)
+        self.transcriptions.clear()
+        return count
+
+    def get_transcriptions_text(self, include_timestamps: bool = True) -> str:
+        """Get all transcriptions as formatted text."""
+        lines = []
+        for result in self.transcriptions:
+            parts = []
+            if include_timestamps and result.timestamp:
+                parts.append(f"[{result.timestamp.strftime('%H:%M:%S')}]")
+            if result.user_name and result.user_name.strip():
+                parts.append(f"{result.user_name}:")
+            parts.append(result.text)
+            lines.append(" ".join(parts))
+        return "\n".join(lines)
+
+    def reload_engine(self) -> tuple[bool, str]:
+        """Reload the transcription engine with current config settings."""
+        try:
+            was_transcribing = self.is_transcribing
+            if was_transcribing:
+                self.stop_transcription()
+
+            self._set_state(AppState.RELOADING, "Reloading engine...")
+
+            # Wait for any existing init thread
+            if self._engine_init_thread and self._engine_init_thread.is_alive():
+                self._engine_init_thread.join(timeout=10)
+
+            # Stop current engine
+            if self.transcription_engine:
+                try:
+                    self.transcription_engine.stop()
+                except Exception as e:
+                    print(f"Warning: Error stopping engine: {e}")
+
+            # Re-initialize
+            self._initialize_engine()
+            return True, "Engine reload initiated"
+
+        except Exception as e:
+            self._set_state(AppState.ERROR, f"Engine reload failed: {e}")
+            return False, str(e)
+
+    # ── Transcription Callbacks ────────────────────────────────────
+
+    def _on_realtime_transcription(self, result: TranscriptionResult):
+        """Handle realtime (preview) transcription."""
+        if not self.is_transcribing:
+            return
+
+        try:
+            # Broadcast to web server
+            if self.web_server and self.web_server_thread and self.web_server_thread.loop:
+                asyncio.run_coroutine_threadsafe(
+                    self.web_server.broadcast_preview(
+                        result.text, result.user_name, result.timestamp
+                    ),
+                    self.web_server_thread.loop,
+                )
+
+            # Send to server sync
+            if self.server_sync_client:
+                self.server_sync_client.send_preview(result.text, result.timestamp)
+
+            # Notify frontend
+            if self.on_preview:
+                self.on_preview({
+                    "text": result.text,
+                    "user_name": result.user_name,
+                    "timestamp": result.timestamp.strftime("%H:%M:%S") if result.timestamp else None,
+                    "is_preview": True,
+                })
+
+        except Exception as e:
+            print(f"Error handling realtime transcription: {e}")
+
+    def _on_final_transcription(self, result: TranscriptionResult):
+        """Handle final transcription."""
+        if not self.is_transcribing:
+            return
+
+        try:
+            self.transcriptions.append(result)
+
+            # Broadcast to web server
+            if self.web_server and self.web_server_thread and self.web_server_thread.loop:
+                asyncio.run_coroutine_threadsafe(
+                    self.web_server.broadcast_transcription(
+                        result.text, result.user_name, result.timestamp
+                    ),
+                    self.web_server_thread.loop,
+                )
+
+            # Send to server sync
+            if self.server_sync_client:
+                self.server_sync_client.send_transcription(
+                    result.text, result.timestamp
+                )
+
+            # Notify frontend
+            if self.on_transcription:
+                self.on_transcription({
+                    "text": result.text,
+                    "user_name": result.user_name,
+                    "timestamp": result.timestamp.strftime("%H:%M:%S") if result.timestamp else None,
+                    "is_preview": False,
+                })
+
+        except Exception as e:
+            print(f"Error handling final transcription: {e}")
+
+    def _on_remote_error(self, error_msg: str):
+        """Handle error from remote transcription service."""
+        print(f"Remote transcription error: {error_msg}")
+        if self.on_error:
+            self.on_error(error_msg)
+
+    def _on_credits_low(self, seconds_remaining: int):
+        """Handle low credits warning from proxy."""
+        if self.on_credits_low:
+            self.on_credits_low(seconds_remaining)
+
+    # ── Server Sync ────────────────────────────────────────────────
+
+    def _start_server_sync(self):
+        """Start server sync client."""
+        try:
+            url = self.config.get('server_sync.url', '')
+            if not url:
+                print("Server sync enabled but no URL configured")
+                return
+
+            room = self.config.get('server_sync.room', 'default')
+            passphrase = self.config.get('server_sync.passphrase', '')
+            user_name = self.config.get('user.name', 'User')
+            fonts_dir = self.config.fonts_dir
+
+            font_source = self.config.get('display.font_source', 'System Font')
+            if font_source == "System Font":
+                font_source = "None"
+
+            self.server_sync_client = ServerSyncClient(
+                url=url,
+                room=room,
+                passphrase=passphrase,
+                user_name=user_name,
+                fonts_dir=fonts_dir,
+                font_source=font_source,
+                websafe_font=self.config.get('display.websafe_font', '') or None,
+                google_font=self.config.get('display.google_font', '') or None,
+                custom_font_file=self.config.get('display.custom_font_file', '') or None,
+                user_color=self.config.get('display.user_color', '#4CAF50'),
+                text_color=self.config.get('display.text_color', '#FFFFFF'),
+                background_color=self.config.get('display.background_color', '#000000B3'),
+            )
+            self.server_sync_client.start()
+
+        except Exception as e:
+            print(f"Error starting server sync: {e}")
+
+    # ── Configuration ──────────────────────────────────────────────
+
+    def apply_settings(self, new_config: Optional[dict] = None) -> tuple[bool, str]:
+        """Apply settings changes. If new_config is provided, merge it first.
+
+        Returns (engine_reload_needed, message).
+        """
+        if new_config:
+            for key, value in new_config.items():
+                self.config.set(key, value)
+
+        # Update web server display settings
+        if self.web_server:
+            self.web_server.show_timestamps = self.config.get('display.show_timestamps', True)
+            self.web_server.fade_after_seconds = self.config.get('display.fade_after_seconds', 10)
+            self.web_server.max_lines = self.config.get('display.max_lines', 50)
+            self.web_server.font_family = self.config.get('display.font_family', 'Arial')
+            self.web_server.font_size = self.config.get('display.font_size', 16)
+            self.web_server.font_source = self.config.get('display.font_source', 'System Font')
+            self.web_server.websafe_font = self.config.get('display.websafe_font', 'Arial')
+            self.web_server.google_font = self.config.get('display.google_font', 'Roboto')
+            self.web_server.user_color = self.config.get('display.user_color', '#4CAF50')
+            self.web_server.text_color = self.config.get('display.text_color', '#FFFFFF')
+            self.web_server.background_color = self.config.get('display.background_color', '#000000B3')
+
+        # Restart server sync if running
+        if self.is_transcribing and self.server_sync_client:
+            self.server_sync_client.stop()
+            self.server_sync_client = None
+            if self.config.get('server_sync.enabled', False):
+                self._start_server_sync()
+
+        # Check if model/device changed
+        new_model = self.config.get('transcription.model', 'base.en')
+        new_device = self.config.get('transcription.device', 'auto')
+        engine_reload_needed = (
+            self.current_model_size != new_model
+            or self.current_device_config != new_device
+        )
+
+        if engine_reload_needed:
+            self.reload_engine()
+            return True, "Settings applied. Engine reloading with new model/device."
+        else:
+            return False, "Settings applied successfully."
+
+    def get_status(self) -> dict:
+        """Get current application status as a dict."""
+        host = self.config.get('web_server.host', '127.0.0.1')
+        port = self.actual_web_port or self.config.get('web_server.port', 8080)
+
+        device_info = self.device_manager.get_device_info()
+
+        remote_mode = self.config.get('remote.mode', 'local')
+        if remote_mode in ('managed', 'byok') and self.transcription_engine:
+            mode_label = 'Managed' if remote_mode == 'managed' else 'BYOK'
+            engine_device = f"Deepgram ({mode_label})"
+        elif self.transcription_engine and hasattr(self.transcription_engine, 'device'):
+            engine_device = f"{self.transcription_engine.device.upper()} ({self.transcription_engine.compute_type})"
+        else:
+            engine_device = "Not initialized"
+
+        return {
+            "state": self.state,
+            "is_transcribing": self.is_transcribing,
+            "version": __version__,
+            "engine_device": engine_device,
+            "web_server": {
+                "host": host,
+                "port": port,
+                "url": f"http://{host}:{port}",
+                "running": self.web_server_thread is not None and self.web_server_thread.is_alive(),
+            },
+            "transcription_count": len(self.transcriptions),
+            "remote_mode": remote_mode,
+            "server_sync_enabled": self.config.get('server_sync.enabled', False),
+        }
+
+    def get_audio_devices(self) -> list[dict]:
+        """List available audio input devices."""
+        import sounddevice as sd
+        devices = []
+        try:
+            device_list = sd.query_devices()
+            for i, device in enumerate(device_list):
+                if device['max_input_channels'] > 0:
+                    devices.append({"index": i, "name": device['name']})
+        except Exception:
+            pass
+        if not devices:
+            devices = [{"index": 0, "name": "Default"}]
+        return devices
+
+    def get_compute_devices(self) -> list[dict]:
+        """List available compute devices."""
+        device_info = self.device_manager.get_device_info()
+        devices = [{"id": "auto", "name": "Auto-detect"}]
+        for dev_id, dev_name in device_info:
+            devices.append({"id": dev_id, "name": dev_name})
+        return devices
+
+    # ── Update Checking ────────────────────────────────────────────
+
+    def check_for_updates(self) -> dict:
+        """Check for updates synchronously. Always returns a dict with an 'available' flag."""
+        from client.update_checker import UpdateChecker
+
+        gitea_url = self.config.get('updates.gitea_url', 'https://repo.anhonesthost.net')
+        owner = self.config.get('updates.owner', 'streamer-tools')
+        repo = self.config.get('updates.repo', 'local-transcription')
+
+        if not gitea_url or not owner or not repo:
+            return {"available": False, "error": "Update checking not configured"}
+
+        checker = UpdateChecker(
+            current_version=__version__,
+            gitea_url=gitea_url,
+            owner=owner,
+            repo=repo,
+        )
+
+        try:
+            release_info = checker.check_for_update()
+            self.config.set('updates.last_check', datetime.now().isoformat())
+
+            if release_info:
+                skipped = self.config.get('updates.skipped_versions', [])
+                return {
+                    "available": True,
+                    "version": release_info.version,
+                    "download_url": release_info.download_url,
+                    "release_notes": release_info.release_notes,
+                    "skipped": release_info.version in skipped,
+                }
+            else:
+                return {"available": False, "current_version": __version__}
+        except Exception as e:
+            return {"available": False, "error": str(e)}
+
+    def skip_version(self, version: str):
+        """Mark a version as skipped for update notifications."""
+        skipped = self.config.get('updates.skipped_versions', [])
+        if version not in skipped:
+            skipped.append(version)
+            self.config.set('updates.skipped_versions', skipped)
--- a/backend/main_headless.py
+++ b/backend/main_headless.py
@@ -0,0 +1,126 @@
+#!/usr/bin/env python3
+"""Headless entry point for the Local Transcription backend.
+
+Runs the transcription engine + API server without any GUI (no PySide6).
+Designed to be launched as a Tauri sidecar or run standalone for development.
+
+Usage:
+    python -m backend.main_headless [--port PORT] [--host HOST]
+
+The backend prints the actual port to stdout as JSON on startup:
+    {"event": "ready", "port": 8080}
+
+This lets the Tauri shell discover the web server port; the API listens on port + 1.
+"""
+
+import argparse
+import json
+import multiprocessing
+import os
+import signal
+import sys
+from pathlib import Path
+
+# Must be called before anything else for PyInstaller compatibility
+multiprocessing.freeze_support()
+
+if __name__ == "__main__":
+    try:
+        multiprocessing.set_start_method('spawn', force=True)
+    except RuntimeError:
+        pass
+
+# Add project root to path
+project_root = Path(__file__).resolve().parent.parent
+sys.path.insert(0, str(project_root))
+os.chdir(project_root)
+
+from client.instance_lock import InstanceLock
+
+
+def main():
+    parser = argparse.ArgumentParser(description="Local Transcription headless backend")
+    parser.add_argument("--host", default="127.0.0.1", help="API server host (default: 127.0.0.1)")
+    parser.add_argument("--port", type=int, default=8080, help="Web server port; the API uses PORT + 1 (default: 8080)")
+    args = parser.parse_args()
+
+    instance_lock = InstanceLock()
+    if not instance_lock.acquire():
+        print(json.dumps({"event": "error", "message": "Another instance is already running"}),
+              flush=True)
+        sys.exit(1)
+
+    controller = None  # bound before the handlers are installed so an early signal finds it
+
+    def handle_shutdown(signum, frame):
+        print(json.dumps({"event": "shutdown"}), flush=True)
+        if controller:
+            controller.shutdown()
+        instance_lock.release()
+        sys.exit(0)
+
+    signal.signal(signal.SIGTERM, handle_shutdown)
+    signal.signal(signal.SIGINT, handle_shutdown)
+
+    try:
+        from backend.app_controller import AppController
+        from backend.api_server import APIServer
+
+        # Override web server host and port from CLI args
+        from client.config import Config
+        config = Config()
+        config.set('web_server.host', args.host)
+        config.set('web_server.port', args.port)
+
+        # Create controller and initialize
+        controller = AppController(config=config)
+
+        # Wire a state callback that prints each state change as a JSON event
+        def on_state_changed(state, message):
+            event = {"event": "state", "state": state, "message": message}
+            print(json.dumps(event), flush=True)
+
+        controller.on_state_changed = on_state_changed
+
+        # Initialize engine + web server
+        controller.initialize()
+
+        # Create API server wrapping the controller
+        api_server = APIServer(controller)
+
+        # Determine actual port (web server may have shifted if port was in use)
+        actual_port = controller.actual_web_port or args.port
+
+        # Print ready event so Tauri can discover the port
+        print(json.dumps({"event": "ready", "port": actual_port}), flush=True)
+
+        # Run the API server (blocks)
+        import uvicorn
+        import logging
+
+        logging.getLogger("uvicorn").setLevel(logging.ERROR)
+        logging.getLogger("uvicorn.access").setLevel(logging.ERROR)
+
+        uvicorn.run(
+            api_server.app,
+            host=args.host,
+            port=actual_port + 1,  # API on port+1, OBS display on the main port
+            log_level="error",
+            access_log=False,
+        )
+
+    except KeyboardInterrupt:
+        print(json.dumps({"event": "shutdown", "reason": "keyboard_interrupt"}), flush=True)
+    except Exception as e:
+        print(json.dumps({"event": "error", "message": str(e)}), flush=True)
+        import traceback
+        traceback.print_exc()
+        sys.exit(1)
+    finally:
+        if controller:
+            controller.shutdown()
+        instance_lock.release()
+
+
+if __name__ == "__main__":
+    main()
--- a/client/config.py
+++ b/client/config.py
@@ -48,6 +48,25 @@ class Config:
            # Save the default configuration
            self.save()

+        # Migrate remote_processing -> remote
+        self._migrate_remote_config()
+
+    def _migrate_remote_config(self):
+        """Migrate old remote_processing config to new remote config."""
+        if 'remote_processing' in self.config and 'remote' not in self.config:
+            old = self.config['remote_processing']
+            self.config['remote'] = {
+                'mode': 'managed' if old.get('enabled', False) else 'local',
+                'server_url': old.get('server_url', ''),
+                'auth_token': '',
+                'byok_api_key': old.get('api_key', ''),
+                'deepgram_model': 'nova-2',
+                'language': 'en-US',
+                'fallback_to_local': old.get('fallback_to_local', True),
+            }
+            del self.config['remote_processing']
+            self.save()
+
    def save(self) -> None:
        """Save current configuration to file."""
        with open(self.config_path, 'w') as f:
--- a/client/deepgram_transcription.py
+++ b/client/deepgram_transcription.py
@@ -0,0 +1,528 @@
+"""Deepgram-based transcription engine using WebSocket streaming.
+
+Supports two modes:
+  - Managed mode: connects to a proxy server that handles Deepgram credentials
+  - BYOK mode: connects directly to the Deepgram API with a user-provided key
+
+Implements the same duck-type interface as RealtimeTranscriptionEngine so
+MainWindow can use it as a drop-in replacement.
+"""
+
+import asyncio
+import json
+import logging
+import numpy as np
+import threading
+from datetime import datetime
+from queue import Queue, Empty
+from typing import Optional, Callable
+
+from client.transcription_engine_realtime import TranscriptionResult
+
+logger = logging.getLogger(__name__)
+
+
+class DeepgramTranscriptionEngine:
+    """
+    Transcription engine that streams audio to Deepgram via WebSocket.
+
+    In managed mode the connection goes through a proxy at
+    ``wss://<server>/ws/transcribe`` which handles authentication and
+    Deepgram credentials.  In BYOK (bring-your-own-key) mode the
+    connection goes directly to the Deepgram API.
+    """
+
+    # ------------------------------------------------------------------ #
+    #  Construction / configuration
+    # ------------------------------------------------------------------ #
+
+    def __init__(self, config, user_name: str = "User", input_device_index: Optional[int] = None):
+        """
+        Initialise the engine from a :class:`client.config.Config` object.
+
+        Args:
+            config: Application ``Config`` instance.
+            user_name: Display name attached to transcriptions.
+            input_device_index: Index of the audio input device to use
+                (``None`` for the system default).
+        """
+        self.config = config
+        self.user_name = user_name
+        self.input_device_index = input_device_index
+
+        # Mode: 'managed' (proxy) or 'byok' (direct Deepgram)
+        self.mode: str = config.get("remote.mode", "managed")
+
+        # Managed-mode settings
+        self.server_url: str = config.get("remote.server_url", "")
+        self.auth_token: str = config.get("remote.auth_token", "")
+
+        # BYOK-mode settings
+        self.byok_api_key: str = config.get("remote.byok_api_key", "")
+
+        # Deepgram model / language (used in both modes)
+        self.deepgram_model: str = config.get("remote.deepgram_model", "nova-2")
+        self.language: str = config.get("remote.language", "en-US")
+
+        # Audio parameters
+        self.sample_rate: int = 16000
+        self.channels: int = 1
+        self.blocksize: int = 4096
+
+        # Callbacks
+        self.realtime_callback: Optional[Callable[[TranscriptionResult], None]] = None
+        self.final_callback: Optional[Callable[[TranscriptionResult], None]] = None
+        self._on_error: Optional[Callable[[str], None]] = None
+        self._on_credits_low: Optional[Callable[[int], None]] = None
+
+        # Internal state
+        self._is_initialized: bool = False
+        self._is_recording: bool = False
+        self._stop_event: threading.Event = threading.Event()
+        self._audio_queue: Queue = Queue()
+
+        # Asyncio event loop running in a daemon thread
+        self._loop: Optional[asyncio.AbstractEventLoop] = None
+        self._thread: Optional[threading.Thread] = None
+
+        # WebSocket handle (set inside the async context)
+        self._ws = None
+
+        # sounddevice InputStream
+        self._stream = None
+
+    # ------------------------------------------------------------------ #
+    #  Callback setters
+    # ------------------------------------------------------------------ #
+
+    def set_callbacks(
+        self,
+        realtime_callback: Optional[Callable[[TranscriptionResult], None]] = None,
+        final_callback: Optional[Callable[[TranscriptionResult], None]] = None,
+    ):
+        """Set transcription result callbacks (matches RealtimeTranscriptionEngine API)."""
+        self.realtime_callback = realtime_callback
+        self.final_callback = final_callback
+
+    def set_error_callback(self, fn: Optional[Callable[[str], None]]):
+        """Set a callback invoked on errors.  ``fn`` receives a string message."""
+        self._on_error = fn
+
+    def set_credits_low_callback(self, fn: Optional[Callable[[int], None]]):
+        """Set a callback for low-credit warnings.  ``fn`` receives seconds remaining."""
+        self._on_credits_low = fn
+
+    # ------------------------------------------------------------------ #
+    #  Public interface (duck-typed with RealtimeTranscriptionEngine)
+    # ------------------------------------------------------------------ #
+
+    def initialize(self) -> bool:
+        """Validate configuration and mark the engine as ready.
+
+        Returns ``True`` when the engine is ready to start recording.
+        """
+        if self._is_initialized:
+            return True
+
+        if self.mode == "managed":
+            if not self.server_url:
+                logger.error("Managed mode requires a server URL (remote.server_url)")
+                return False
+            if not self.auth_token:
+                logger.error("Managed mode requires an auth token (remote.auth_token)")
+                return False
+        elif self.mode == "byok":
+            if not self.byok_api_key:
+                logger.error("BYOK mode requires an API key (remote.byok_api_key)")
+                return False
+        else:
+            logger.error("Unknown remote mode: %s (expected 'managed' or 'byok')", self.mode)
+            return False
+
+        self._is_initialized = True
+        logger.info("DeepgramTranscriptionEngine initialised in %s mode", self.mode)
+        return True
+
+    def start_recording(self) -> bool:
+        """Open the audio stream and connect the WebSocket.
+
+        Returns ``True`` on success.
+        """
+        if not self._is_initialized:
+            logger.error("Engine not initialised -- call initialize() first")
+            return False
+
+        if self._is_recording:
+            return True
+
+        self._stop_event.clear()
+        self._is_recording = True
+
+        # Start the asyncio event-loop thread (handles WS send/receive)
+        self._thread = threading.Thread(target=self._run_event_loop, daemon=True)
+        self._thread.start()
+
+        # Start the audio capture stream
+        try:
+            self._start_audio_stream()
+        except Exception as exc:
+            logger.error("Failed to open audio stream: %s", exc)
+            self._is_recording = False
+            self._stop_event.set()
+            return False
+
+        logger.info("Recording started")
+        return True
+
+    def stop_recording(self):
+        """Stop audio capture and close the WebSocket."""
+        if not self._is_recording:
+            return
+
+        self._is_recording = False
+        self._stop_event.set()
+
+        # Stop audio stream
+        self._stop_audio_stream()
+
+        # Close WebSocket from outside the event-loop thread
+        if self._ws is not None and self._loop is not None and not self._loop.is_closed():
+            asyncio.run_coroutine_threadsafe(self._close_ws(), self._loop)
+
+        # Wait for the thread to finish
+        if self._thread is not None:
+            self._thread.join(timeout=5)
+            self._thread = None
+
+        logger.info("Recording stopped")
+
+    def stop(self):
+        """Full shutdown -- stop recording and release all resources."""
+        self.stop_recording()
+        self._is_initialized = False
+        logger.info("DeepgramTranscriptionEngine shut down")
+
+    def is_ready(self) -> bool:
+        """Return ``True`` if the engine has been successfully initialised."""
+        return self._is_initialized
+
+    # ------------------------------------------------------------------ #
+    #  Audio capture (sounddevice)
+    # ------------------------------------------------------------------ #
+
+    def _start_audio_stream(self):
+        """Open a ``sounddevice.InputStream`` that feeds the audio queue."""
+        import sounddevice as sd
+
+        def _audio_callback(indata, frames, time_info, status):  # noqa: ARG001
+            if status:
+                logger.warning("Audio stream status: %s", status)
+            if self._is_recording:
+                # float32 -> int16 PCM bytes
+                pcm = (np.clip(indata, -1.0, 1.0) * 32767).astype(np.int16).tobytes()
+                self._audio_queue.put(pcm)
+
+        self._stream = sd.InputStream(
+            samplerate=self.sample_rate,
+            blocksize=self.blocksize,
+            channels=self.channels,
+            dtype="float32",
+            device=self.input_device_index,
+            callback=_audio_callback,
+        )
+        self._stream.start()
+
+    def _stop_audio_stream(self):
+        """Close the audio input stream."""
+        if self._stream is not None:
+            try:
+                self._stream.stop()
+                self._stream.close()
+            except Exception as exc:
+                logger.debug("Error closing audio stream: %s", exc)
+            finally:
+                self._stream = None
+
+    # ------------------------------------------------------------------ #
+    #  Asyncio event-loop (runs in daemon thread)
+    # ------------------------------------------------------------------ #
+
+    def _run_event_loop(self):
+        """Entry point for the daemon thread -- runs the async event loop."""
+        self._loop = asyncio.new_event_loop()
+        asyncio.set_event_loop(self._loop)
+        try:
+            self._loop.run_until_complete(self._ws_lifecycle())
+        except Exception as exc:
+            logger.error("Event-loop error: %s", exc)
+        finally:
+            try:
+                self._loop.run_until_complete(self._loop.shutdown_asyncgens())
+            except Exception:
+                pass
+            self._loop.close()
+            self._loop = None
+
+    async def _ws_lifecycle(self):
+        """Connect, authenticate (if managed), then run send/receive loops."""
+        import websockets
+
+        try:
+            ws_url, extra_headers = self._build_ws_url_and_headers()
+
+            logger.info("Connecting to %s", ws_url)
+            self._ws = await websockets.connect(
+                ws_url,
+                additional_headers=extra_headers,
+                ping_interval=20,
+                ping_timeout=10,
+            )
+
+            # Managed mode: send auth message and wait for ready
+            if self.mode == "managed":
+                if not await self._managed_handshake():
+                    return
+
+            # Run send and receive concurrently
+            await asyncio.gather(
+                self._send_loop(),
+                self._receive_loop(),
+            )
+
+        except asyncio.CancelledError:
+            pass
+        except Exception as exc:
+            msg = f"WebSocket error: {exc}"
+            logger.error(msg)
+            if self._on_error:
+                self._on_error(msg)
+        finally:
+            await self._close_ws()
+
+    def _build_ws_url_and_headers(self):
+        """Return ``(url, headers)`` depending on the current mode."""
+        if self.mode == "managed":
+            # Default to wss:// when no WebSocket scheme is given, then append the path
+            url = self.server_url.rstrip("/")
+            if not url.startswith("ws://") and not url.startswith("wss://"):
+                url = f"wss://{url}"
+            url = f"{url}/ws/transcribe"
+            return url, {}
+
+        # BYOK -- connect directly to Deepgram
+        params = (
+            f"model={self.deepgram_model}"
+            f"&language={self.language}"
+            "&interim_results=true"
+            "&encoding=linear16"
+            f"&sample_rate={self.sample_rate}"
+            f"&channels={self.channels}"
+        )
+        url = f"wss://api.deepgram.com/v1/listen?{params}"
+        headers = {"Authorization": f"Token {self.byok_api_key}"}
+        return url, headers
+
+    # -- managed-mode handshake ---------------------------------------- #
+
+    async def _managed_handshake(self) -> bool:
+        """Send auth message and wait for ``ready`` (managed mode).
+
+        Returns ``True`` on success.
+        """
+        auth_msg = {
+            "type": "auth",
+            "token": self.auth_token,
+            "config": {
+                "model": self.deepgram_model,
+                "language": self.language,
+                "sample_rate": self.sample_rate,
+                "channels": self.channels,
+                "encoding": "linear16",
+                "interim_results": True,
+            },
+        }
+        await self._ws.send(json.dumps(auth_msg))
+
+        try:
+            raw = await asyncio.wait_for(self._ws.recv(), timeout=15)
+            data = json.loads(raw)
+            if data.get("type") == "ready":
+                logger.info("Managed proxy is ready")
+                return True
+
+            if data.get("type") == "error":
+                err = data.get("message", "unknown error")
+                logger.error("Auth error from proxy: %s", err)
+                if self._on_error:
+                    self._on_error(f"Proxy auth error: {err}")
+                return False
+
+            logger.warning("Unexpected handshake message: %s", data)
+            return False
+
+        except asyncio.TimeoutError:
+            logger.error("Timed out waiting for proxy ready message")
+            if self._on_error:
+                self._on_error("Timed out waiting for proxy ready message")
+            return False
+
+    # -- send loop ----------------------------------------------------- #
+
+    async def _send_loop(self):
+        """Drain the audio queue and push raw PCM bytes over the WebSocket."""
+        while not self._stop_event.is_set():
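+            try:
+                # Block in a worker thread so the event loop (and the
+                # receive loop) keeps running while the queue is empty.
+                pcm_bytes = await asyncio.get_running_loop().run_in_executor(
+                    None, self._audio_queue.get, True, 0.1
+                )
+            except Empty:
+                continue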
+
+            try:
+                await self._ws.send(pcm_bytes)
+            except Exception as exc:
+                if not self._stop_event.is_set():
+                    logger.error("Send error: %s", exc)
+                break
+
+    # -- receive loop -------------------------------------------------- #
+
+    async def _receive_loop(self):
+        """Listen for messages from the WebSocket and dispatch them."""
+        while not self._stop_event.is_set():
+            try:
+                raw = await asyncio.wait_for(self._ws.recv(), timeout=1.0)
+            except asyncio.TimeoutError:
+                continue
+            except Exception as exc:
+                if not self._stop_event.is_set():
+                    logger.error("Receive error: %s", exc)
+                break
+
+            try:
+                data = json.loads(raw)
+            except (json.JSONDecodeError, TypeError):
+                logger.debug("Non-JSON message received, ignoring")
+                continue
+
+            if self.mode == "managed":
+                self._handle_managed_message(data)
+            else:
+                self._handle_byok_message(data)
+
+    # ------------------------------------------------------------------ #
+    #  Message handlers
+    # ------------------------------------------------------------------ #
+
+    def _handle_managed_message(self, data: dict):
+        """Process a message from the managed proxy."""
+        msg_type = data.get("type", "")
+
+        if msg_type == "transcript":
+            text = data.get("text", "")
+            is_final = data.get("is_final", False)
+            if text.strip():
+                result = TranscriptionResult(
+                    text=text,
+                    is_final=is_final,
+                    timestamp=datetime.now(),
+                    user_name=self.user_name,
+                )
+                if is_final:
+                    if self.final_callback:
+                        self.final_callback(result)
+                else:
+                    if self.realtime_callback:
+                        self.realtime_callback(result)
+
+        elif msg_type == "credits_low":
+            seconds_remaining = data.get("seconds_remaining", 0)
+            logger.warning("Credits low -- %d seconds remaining", seconds_remaining)
+            if self._on_credits_low:
+                self._on_credits_low(int(seconds_remaining))
+
+        elif msg_type == "error":
+            code = data.get("code", "")
+            message = data.get("message", "Unknown error")
+            logger.error("Proxy error [%s]: %s", code, message)
+            if self._on_error:
+                self._on_error(f"[{code}] {message}" if code else message)
+
+        elif msg_type == "session_end":
+            seconds_used = data.get("seconds_used", 0)
+            logger.info("Session ended -- %d seconds used", seconds_used)
+
+        elif msg_type == "ready":
+            # May arrive again after reconnects; safe to ignore.
+            logger.debug("Received ready message (already connected)")
+
+        else:
+            logger.debug("Unhandled managed message type: %s", msg_type)
+
+    def _handle_byok_message(self, data: dict):
+        """Process a message received directly from the Deepgram API."""
+        msg_type = data.get("type", "")
+
+        if msg_type == "Results":
+            channel = data.get("channel", {})
+            alternatives = channel.get("alternatives", [])
+            if not alternatives:
+                return
+
+            transcript = alternatives[0].get("transcript", "")
+            is_final = data.get("is_final", False)
+
+            if transcript.strip():
+                result = TranscriptionResult(
+                    text=transcript,
+                    is_final=is_final,
+                    timestamp=datetime.now(),
+                    user_name=self.user_name,
+                )
+                if is_final:
+                    if self.final_callback:
+                        self.final_callback(result)
+                else:
+                    if self.realtime_callback:
+                        self.realtime_callback(result)
+
+        elif msg_type == "Metadata":
+            logger.debug("Deepgram metadata: %s", data)
+
+        elif msg_type == "UtteranceEnd":
+            logger.debug("Deepgram utterance end")
+
+        else:
+            logger.debug("Unhandled Deepgram message type: %s", msg_type)
+
+    # ------------------------------------------------------------------ #
+    #  Helpers
+    # ------------------------------------------------------------------ #
+
+    async def _close_ws(self):
+        """Close the WebSocket connection if open."""
+        if self._ws is not None:
+            try:
+                await self._ws.close()
+            except Exception:
+                pass
+            self._ws = None
+
+    def set_user_name(self, user_name: str):
+        """Update the user name attached to future transcriptions."""
+        self.user_name = user_name
+
+    def is_recording_active(self) -> bool:
+        """Return ``True`` if audio is currently being captured."""
+        return self._is_recording
+
+    def __repr__(self) -> str:
+        return (
+            f"DeepgramTranscriptionEngine(mode={self.mode}, "
+            f"recording={self._is_recording})"
+        )
+
+    def __del__(self):
+        """Best-effort cleanup."""
+        try:
+            self.stop()
+        except Exception:
+            pass
--- a/config/default_config.yaml
+++ b/config/default_config.yaml
@@ -68,11 +68,14 @@ web_server:
  port: 8080
  host: "127.0.0.1"

-remote_processing:
-  enabled: false  # Enable remote transcription offloading
-  server_url: ""  # WebSocket URL of remote transcription service (e.g., ws://your-server:8765/ws/transcribe)
-  api_key: ""  # API key for authentication
-  fallback_to_local: true  # Fall back to local processing if remote fails
+remote:
+  mode: local  # local | managed | byok
+  server_url: ""  # Proxy server URL for managed mode (e.g., wss://your-proxy.com)
+  auth_token: ""  # JWT stored after login (managed mode)
+  byok_api_key: ""  # Deepgram API key for BYOK mode
+  deepgram_model: nova-2  # Deepgram model to use
+  language: en-US  # Language code
+  fallback_to_local: true  # Fall back to local Whisper if remote fails

 updates:
  auto_check: true  # Check for updates on startup
--- a/gui/main_window_qt.py
+++ b/gui/main_window_qt.py
@@ -18,6 +18,7 @@ sys.path.append(str(Path(__file__).resolve().parent.parent))
 from client.config import Config
 from client.device_utils import DeviceManager
 from client.transcription_engine_realtime import RealtimeTranscriptionEngine, TranscriptionResult
+from client.deepgram_transcription import DeepgramTranscriptionEngine
 from client.server_sync import ServerSyncClient
 from gui.settings_dialog_qt import SettingsDialog
 from server.web_display import TranscriptionWebServer
@@ -394,27 +395,44 @@ class MainWindow(QMainWindow):
            min_gap = self.config.get('transcription.min_gap_between_recordings', 0.0)
            min_recording = self.config.get('transcription.min_length_of_recording', 0.5)

-        self.transcription_engine = RealtimeTranscriptionEngine(
-            model=model,
-            device=device,
-            language=language,
-            compute_type=compute_type,
-            enable_realtime_transcription=self.config.get('transcription.enable_realtime_transcription', False),
-            realtime_model=self.config.get('transcription.realtime_model', 'tiny.en'),
-            realtime_processing_pause=self.config.get('transcription.realtime_processing_pause', 0.1),
-            silero_sensitivity=self.config.get('transcription.silero_sensitivity', 0.4),
-            silero_use_onnx=self.config.get('transcription.silero_use_onnx', True),
-            webrtc_sensitivity=self.config.get('transcription.webrtc_sensitivity', 3),
-            post_speech_silence_duration=post_speech_silence,
-            min_length_of_recording=min_recording,
-            min_gap_between_recordings=min_gap,
-            pre_recording_buffer_duration=self.config.get('transcription.pre_recording_buffer_duration', 0.2),
-            beam_size=self.config.get('transcription.beam_size', 5),
-            initial_prompt=self.config.get('transcription.initial_prompt', ''),
-            no_log_file=self.config.get('transcription.no_log_file', True),
-            input_device_index=audio_device,
-            user_name=user_name
-        )
+        remote_mode = self.config.get('remote.mode', 'local')
+
+        if remote_mode in ('managed', 'byok'):
+            # Use Deepgram-based remote transcription
+            self.transcription_engine = DeepgramTranscriptionEngine(
+                config=self.config,
+                user_name=user_name,
+                input_device_index=audio_device
+            )
+            self.transcription_engine.set_callbacks(
+                realtime_callback=self._on_realtime_transcription,
+                final_callback=self._on_final_transcription
+            )
+            self.transcription_engine.set_error_callback(self._on_remote_error)
+            self.transcription_engine.set_credits_low_callback(self._on_credits_low)
+        else:
+            # Use local Whisper transcription
+            self.transcription_engine = RealtimeTranscriptionEngine(
+                model=model,
+                device=device,
+                language=language,
+                compute_type=compute_type,
+                enable_realtime_transcription=self.config.get('transcription.enable_realtime_transcription', False),
+                realtime_model=self.config.get('transcription.realtime_model', 'tiny.en'),
+                realtime_processing_pause=self.config.get('transcription.realtime_processing_pause', 0.1),
+                silero_sensitivity=self.config.get('transcription.silero_sensitivity', 0.4),
+                silero_use_onnx=self.config.get('transcription.silero_use_onnx', True),
+                webrtc_sensitivity=self.config.get('transcription.webrtc_sensitivity', 3),
+                post_speech_silence_duration=post_speech_silence,
+                min_length_of_recording=min_recording,
+                min_gap_between_recordings=min_gap,
+                pre_recording_buffer_duration=self.config.get('transcription.pre_recording_buffer_duration', 0.2),
+                beam_size=self.config.get('transcription.beam_size', 5),
+                initial_prompt=self.config.get('transcription.initial_prompt', ''),
+                no_log_file=self.config.get('transcription.no_log_file', True),
+                input_device_index=audio_device,
+                user_name=user_name
+            )

        # Set up callbacks for transcription results
        self.transcription_engine.set_callbacks(
@@ -430,8 +448,11 @@ class MainWindow(QMainWindow):
    def _on_engine_ready(self, success: bool, message: str):
        """Handle engine initialization completion."""
        if success:
-            # Update device label with actual device used
-            if self.transcription_engine:
+            remote_mode = self.config.get('remote.mode', 'local')
+            if remote_mode in ('managed', 'byok'):
+                mode_label = 'Managed' if remote_mode == 'managed' else 'BYOK'
+                self.device_label.setText(f"Device: Deepgram ({mode_label})")
+            elif self.transcription_engine:
                actual_device = self.transcription_engine.device
                compute_type = self.transcription_engine.compute_type
                device_display = f"{actual_device.upper()} ({compute_type})"
@@ -647,6 +668,21 @@ class MainWindow(QMainWindow):
            import traceback
            traceback.print_exc()

+    def _on_remote_error(self, error_msg: str):
+        """Handle error from remote transcription service."""
+        print(f"Remote transcription error: {error_msg}")
+        self.status_label.setText(f"⚠ Remote error: {error_msg}")
+
+        # Fallback notice only: this handler updates the status text, it does not switch engines
+        if self.config.get('remote.fallback_to_local', True) and self.is_transcribing:
+            print("Falling back to local transcription...")
+            self.status_label.setText("⚠ Remote failed — falling back to local")
+
+    def _on_credits_low(self, seconds_remaining: int):
+        """Handle low credits warning from proxy."""
+        minutes, seconds = divmod(max(seconds_remaining, 0), 60)
+        self.status_label.setText(f"⚠ Credits low: {minutes} min {seconds} s remaining")
+
    def _clear_transcriptions(self):
        """Clear all transcriptions."""
        if not self.transcriptions:
--- a/gui/settings_dialog_qt.py
+++ b/gui/settings_dialog_qt.py
@@ -4,7 +4,7 @@ from PySide6.QtWidgets import (
    QDialog, QVBoxLayout, QHBoxLayout, QFormLayout,
    QLabel, QLineEdit, QComboBox, QCheckBox, QSlider,
    QPushButton, QMessageBox, QGroupBox, QScrollArea, QWidget,
-    QFileDialog, QColorDialog
+    QFileDialog, QColorDialog, QRadioButton
 )
 from PySide6.QtCore import Qt
 from PySide6.QtGui import QScreen, QFontDatabase, QColor
@@ -487,46 +487,91 @@ class SettingsDialog(QDialog):
        server_group.setLayout(server_layout)
        content_layout.addWidget(server_group)

-        # Remote Processing Group
-        remote_group = QGroupBox("Remote Processing (GPU Offload)")
-        remote_layout = QFormLayout()
-        remote_layout.setSpacing(10)
+        # Transcription Mode Group
+        mode_group = QGroupBox("Transcription Mode")
+        mode_layout = QVBoxLayout()
+        mode_layout.setSpacing(10)

-        self.remote_enabled_check = QCheckBox()
-        self.remote_enabled_check.setToolTip(
-            "Enable remote transcription processing:\n"
-            "• Offload transcription to a GPU-equipped server\n"
-            "• Reduces local CPU/GPU usage\n"
-            "• Requires running the remote transcription service"
-        )
-        remote_layout.addRow("Enable Remote Processing:", self.remote_enabled_check)
+        # Radio buttons for mode selection
+        self.mode_local_radio = QRadioButton("Local (Whisper)")
+        self.mode_local_radio.setToolTip("Transcribe locally using Whisper models")
+        self.mode_managed_radio = QRadioButton("Remote - Managed")
+        self.mode_managed_radio.setToolTip("Use the transcription proxy service with prepaid credits")
+        self.mode_byok_radio = QRadioButton("Remote - BYOK (Bring Your Own Key)")
+        self.mode_byok_radio.setToolTip("Connect directly to Deepgram with your own API key")

-        self.remote_url_input = QLineEdit()
-        self.remote_url_input.setPlaceholderText("ws://your-server:8765/ws/transcribe")
-        self.remote_url_input.setToolTip(
-            "WebSocket URL of the remote transcription service:\n"
-            "• Format: ws://host:port/ws/transcribe\n"
-            "• Use wss:// for secure connections"
-        )
-        remote_layout.addRow("Server URL:", self.remote_url_input)
+        mode_layout.addWidget(self.mode_local_radio)
+        mode_layout.addWidget(self.mode_managed_radio)
+        mode_layout.addWidget(self.mode_byok_radio)

-        self.remote_api_key_input = QLineEdit()
-        self.remote_api_key_input.setEchoMode(QLineEdit.Password)
-        self.remote_api_key_input.setPlaceholderText("your-api-key")
-        self.remote_api_key_input.setToolTip(
-            "API key for authentication with the remote service"
-        )
-        remote_layout.addRow("API Key:", self.remote_api_key_input)
+        # Managed mode fields (shown when managed radio selected)
+        self.managed_widget = QWidget()
+        managed_layout = QFormLayout()
+        managed_layout.setSpacing(8)

-        self.remote_fallback_check = QCheckBox("Enable")
-        self.remote_fallback_check.setChecked(True)
-        self.remote_fallback_check.setToolTip(
-            "Fall back to local transcription if remote service is unavailable"
-        )
-        remote_layout.addRow("Fallback to Local:", self.remote_fallback_check)
+        self.managed_server_url = QLineEdit()
+        self.managed_server_url.setPlaceholderText("wss://your-proxy-server.com")
+        managed_layout.addRow("Server URL:", self.managed_server_url)

-        remote_group.setLayout(remote_layout)
-        content_layout.addWidget(remote_group)
+        # Login/Register buttons in a row
+        auth_widget = QWidget()
+        auth_layout = QHBoxLayout()
+        auth_layout.setContentsMargins(0, 0, 0, 0)
+        self.managed_login_btn = QPushButton("Login")
+        self.managed_login_btn.clicked.connect(self._managed_login)
+        self.managed_register_btn = QPushButton("Register")
+        self.managed_register_btn.clicked.connect(self._managed_register)
+        auth_layout.addWidget(self.managed_login_btn)
+        auth_layout.addWidget(self.managed_register_btn)
+        auth_layout.addStretch()
+        auth_widget.setLayout(auth_layout)
+        managed_layout.addRow("Account:", auth_widget)
+
+        self.managed_balance_label = QLabel("Not logged in")
+        managed_layout.addRow("Balance:", self.managed_balance_label)
+
+        self.managed_fallback_check = QCheckBox("Enable")
+        self.managed_fallback_check.setChecked(True)
+        self.managed_fallback_check.setToolTip("Fall back to local Whisper if remote fails")
+        managed_layout.addRow("Fallback to Local:", self.managed_fallback_check)
+
+        self.managed_widget.setLayout(managed_layout)
+        mode_layout.addWidget(self.managed_widget)
+
+        # BYOK mode fields (shown when BYOK radio selected)
+        self.byok_widget = QWidget()
+        byok_layout = QFormLayout()
+        byok_layout.setSpacing(8)
+
+        self.byok_api_key_input = QLineEdit()
+        self.byok_api_key_input.setEchoMode(QLineEdit.Password)
+        self.byok_api_key_input.setPlaceholderText("your-deepgram-api-key")
+        byok_layout.addRow("Deepgram API Key:", self.byok_api_key_input)
+
+        self.byok_model_combo = QComboBox()
+        self.byok_model_combo.addItems(["nova-2", "nova-2-general", "nova-2-meeting", "nova-2-phonecall", "whisper-large", "whisper-medium", "whisper-small"])
+        byok_layout.addRow("Model:", self.byok_model_combo)
+
+        self.byok_language_input = QLineEdit()
+        self.byok_language_input.setText("en-US")
+        self.byok_language_input.setPlaceholderText("en-US")
+        byok_layout.addRow("Language:", self.byok_language_input)
+
+        self.byok_fallback_check = QCheckBox("Enable")
+        self.byok_fallback_check.setChecked(True)
+        self.byok_fallback_check.setToolTip("Fall back to local Whisper if Deepgram fails")
+        byok_layout.addRow("Fallback to Local:", self.byok_fallback_check)
+
+        self.byok_widget.setLayout(byok_layout)
+        mode_layout.addWidget(self.byok_widget)
+
+        mode_group.setLayout(mode_layout)
+        content_layout.addWidget(mode_group)
+
+        # Connect radio buttons to show/hide relevant widgets
+        self.mode_local_radio.toggled.connect(self._on_mode_changed)
+        self.mode_managed_radio.toggled.connect(self._on_mode_changed)
+        self.mode_byok_radio.toggled.connect(self._on_mode_changed)

        # Updates Group
        updates_group = QGroupBox("Software Updates")
@@ -794,11 +839,28 @@ class SettingsDialog(QDialog):
        self.server_room_input.setText(self.config.get('server_sync.room', 'default'))
        self.server_passphrase_input.setText(self.config.get('server_sync.passphrase', ''))

-        # Remote processing settings
-        self.remote_enabled_check.setChecked(self.config.get('remote_processing.enabled', False))
-        self.remote_url_input.setText(self.config.get('remote_processing.server_url', ''))
-        self.remote_api_key_input.setText(self.config.get('remote_processing.api_key', ''))
-        self.remote_fallback_check.setChecked(self.config.get('remote_processing.fallback_to_local', True))
+        # Transcription mode settings
+        mode = self.config.get('remote.mode', 'local')
+        if mode == 'managed':
+            self.mode_managed_radio.setChecked(True)
+        elif mode == 'byok':
+            self.mode_byok_radio.setChecked(True)
+        else:
+            self.mode_local_radio.setChecked(True)
+
+        self.managed_server_url.setText(self.config.get('remote.server_url', ''))
+        self.managed_fallback_check.setChecked(self.config.get('remote.fallback_to_local', True))
+        self.byok_api_key_input.setText(self.config.get('remote.byok_api_key', ''))
+        self.byok_model_combo.setCurrentText(self.config.get('remote.deepgram_model', 'nova-2'))
+        self.byok_language_input.setText(self.config.get('remote.language', 'en-US'))
+        self.byok_fallback_check.setChecked(self.config.get('remote.fallback_to_local', True))
+
+        # Trigger visibility update
+        self._on_mode_changed()
+
+        # Update balance if managed mode and has token
+        if self.config.get('remote.auth_token'):
+            self._update_managed_balance()

        # Update settings
        self.update_auto_check.setChecked(self.config.get('updates.auto_check', True))
@@ -869,11 +931,21 @@ class SettingsDialog(QDialog):
            self.config.set('server_sync.room', self.server_room_input.text())
            self.config.set('server_sync.passphrase', self.server_passphrase_input.text())

-            # Remote processing settings
-            self.config.set('remote_processing.enabled', self.remote_enabled_check.isChecked())
-            self.config.set('remote_processing.server_url', self.remote_url_input.text())
-            self.config.set('remote_processing.api_key', self.remote_api_key_input.text())
-            self.config.set('remote_processing.fallback_to_local', self.remote_fallback_check.isChecked())
+            # Transcription mode settings
+            if self.mode_managed_radio.isChecked():
+                self.config.set('remote.mode', 'managed')
+            elif self.mode_byok_radio.isChecked():
+                self.config.set('remote.mode', 'byok')
+            else:
+                self.config.set('remote.mode', 'local')
+
+            self.config.set('remote.server_url', self.managed_server_url.text())
+            self.config.set('remote.fallback_to_local',
+                self.managed_fallback_check.isChecked() if self.mode_managed_radio.isChecked()
+                else self.byok_fallback_check.isChecked())
+            self.config.set('remote.byok_api_key', self.byok_api_key_input.text())
+            self.config.set('remote.deepgram_model', self.byok_model_combo.currentText())
+            self.config.set('remote.language', self.byok_language_input.text())

            # Update settings
            self.config.set('updates.auto_check', self.update_auto_check.isChecked())
@@ -892,6 +964,194 @@ class SettingsDialog(QDialog):
        except Exception as e:
            QMessageBox.critical(self, "Error", f"Failed to save settings:\n{e}")

+    def _on_mode_changed(self):
+        """Show/hide mode-specific widgets based on selected radio button."""
+        self.managed_widget.setVisible(self.mode_managed_radio.isChecked())
+        self.byok_widget.setVisible(self.mode_byok_radio.isChecked())
+
+    def _managed_login(self):
+        """Open a login dialog and authenticate with the managed proxy server."""
+        import json
+        import urllib.request
+        import urllib.error
+
+        dialog = QDialog(self)
+        dialog.setWindowTitle("Login")
+        dialog.setMinimumWidth(350)
+        layout = QFormLayout()
+
+        email_input = QLineEdit()
+        email_input.setPlaceholderText("you@example.com")
+        layout.addRow("Email:", email_input)
+
+        password_input = QLineEdit()
+        password_input.setEchoMode(QLineEdit.Password)
+        layout.addRow("Password:", password_input)
+
+        button_layout = QHBoxLayout()
+        cancel_btn = QPushButton("Cancel")
+        cancel_btn.clicked.connect(dialog.reject)
+        login_btn = QPushButton("Login")
+        login_btn.setDefault(True)
+        button_layout.addStretch()
+        button_layout.addWidget(cancel_btn)
+        button_layout.addWidget(login_btn)
+        layout.addRow("", button_layout)
+
+        dialog.setLayout(layout)
+
+        def do_login():
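+            server_url = self.managed_server_url.text().strip().rstrip('/')
+            # urllib only speaks HTTP(S); map a ws(s):// proxy URL to http(s)://
+            if server_url.startswith(('ws://', 'wss://')):
+                server_url = 'http' + server_url[2:]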
+            if not server_url:
+                QMessageBox.warning(dialog, "Error", "Please enter a Server URL first.")
+                return
+            payload = json.dumps({
+                "email": email_input.text(),
+                "password": password_input.text()
+            }).encode('utf-8')
+            req = urllib.request.Request(
+                f"{server_url}/auth/login",
+                data=payload,
+                headers={"Content-Type": "application/json"},
+                method="POST"
+            )
+            try:
+                with urllib.request.urlopen(req, timeout=10) as resp:
+                    data = json.loads(resp.read().decode('utf-8'))
+                token = data.get('token', '')
+                if token:
+                    self.config.set('remote.auth_token', token)
+                    self._update_managed_balance()
+                    QMessageBox.information(dialog, "Success", "Logged in successfully.")
+                    dialog.accept()
+                else:
+                    QMessageBox.warning(dialog, "Error", "Login succeeded but no token received.")
+            except urllib.error.HTTPError as e:
+                try:
+                    body = json.loads(e.read().decode('utf-8'))
+                    msg = body.get('detail', body.get('message', str(e)))
+                except Exception:
+                    msg = str(e)
+                QMessageBox.warning(dialog, "Login Failed", msg)
+            except Exception as e:
+                QMessageBox.warning(dialog, "Error", f"Could not connect to server:\n{e}")
+
+        login_btn.clicked.connect(do_login)
+        dialog.exec()
+
+    def _managed_register(self):
+        """Open a registration dialog and create an account on the managed proxy server."""
+        import json
+        import urllib.request
+        import urllib.error
+
+        dialog = QDialog(self)
+        dialog.setWindowTitle("Register")
+        dialog.setMinimumWidth(350)
+        layout = QFormLayout()
+
+        email_input = QLineEdit()
+        email_input.setPlaceholderText("you@example.com")
+        layout.addRow("Email:", email_input)
+
+        password_input = QLineEdit()
+        password_input.setEchoMode(QLineEdit.Password)
+        layout.addRow("Password:", password_input)
+
+        confirm_input = QLineEdit()
+        confirm_input.setEchoMode(QLineEdit.Password)
+        layout.addRow("Confirm Password:", confirm_input)
+
+        button_layout = QHBoxLayout()
+        cancel_btn = QPushButton("Cancel")
+        cancel_btn.clicked.connect(dialog.reject)
+        register_btn = QPushButton("Register")
+        register_btn.setDefault(True)
+        button_layout.addStretch()
+        button_layout.addWidget(cancel_btn)
+        button_layout.addWidget(register_btn)
+        layout.addRow("", button_layout)
+
+        dialog.setLayout(layout)
+
+        def do_register():
+            if password_input.text() != confirm_input.text():
+                QMessageBox.warning(dialog, "Error", "Passwords do not match.")
+                return
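+            server_url = self.managed_server_url.text().strip().rstrip('/')
+            # urllib only speaks HTTP(S); map a ws(s):// proxy URL to http(s)://
+            if server_url.startswith(('ws://', 'wss://')):
+                server_url = 'http' + server_url[2:]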
+            if not server_url:
+                QMessageBox.warning(dialog, "Error", "Please enter a Server URL first.")
+                return
+            payload = json.dumps({
+                "email": email_input.text(),
+                "password": password_input.text()
+            }).encode('utf-8')
+            req = urllib.request.Request(
+                f"{server_url}/auth/register",
+                data=payload,
+                headers={"Content-Type": "application/json"},
+                method="POST"
+            )
+            try:
+                with urllib.request.urlopen(req, timeout=10) as resp:
+                    data = json.loads(resp.read().decode('utf-8'))
+                token = data.get('token', '')
+                if token:
+                    self.config.set('remote.auth_token', token)
+                    self._update_managed_balance()
+                    QMessageBox.information(dialog, "Success", "Account created and logged in.")
+                    dialog.accept()
+                else:
+                    QMessageBox.information(dialog, "Success",
+                        "Account created. Please log in.")
+                    dialog.accept()
+            except urllib.error.HTTPError as e:
+                try:
+                    body = json.loads(e.read().decode('utf-8'))
+                    msg = body.get('detail', body.get('message', str(e)))
+                except Exception:
+                    msg = str(e)
+                QMessageBox.warning(dialog, "Registration Failed", msg)
+            except Exception as e:
+                QMessageBox.warning(dialog, "Error", f"Could not connect to server:\n{e}")
+
+        register_btn.clicked.connect(do_register)
+        dialog.exec()
+
+    def _update_managed_balance(self):
+        """Fetch and display the current account balance from the managed proxy server."""
+        import json
+        import urllib.request
+        import urllib.error
+
+        server_url = self.managed_server_url.text().rstrip('/')
+        token = self.config.get('remote.auth_token', '')
+        if not server_url or not token:
+            self.managed_balance_label.setText("Not logged in")
+            return
+
+        req = urllib.request.Request(
+            f"{server_url}/billing/balance",
+            headers={
+                "Authorization": f"Bearer {token}",
+                "Content-Type": "application/json"
+            },
+            method="GET"
+        )
+        try:
+            with urllib.request.urlopen(req, timeout=10) as resp:
+                data = json.loads(resp.read().decode('utf-8'))
+            balance = data.get('balance', data.get('credits', 'N/A'))
+            self.managed_balance_label.setText(str(balance))
+        except urllib.error.HTTPError as e:
+            if e.code == 401:
+                self.managed_balance_label.setText("Session expired - please login again")
+                self.config.set('remote.auth_token', '')
+            else:
+                self.managed_balance_label.setText("Error fetching balance")
+        except Exception:
+            self.managed_balance_label.setText("Could not connect to server")
+
    def _check_for_updates_now(self):
        """Manually check for updates."""
        from version import __version__
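
The error handling in `do_register` above looks up `detail`, then `message`, then falls back to the exception text. Below is a minimal sketch of that lookup as a standalone helper, assuming the server returns a JSON object on errors. The name `extract_error_message` is hypothetical and not part of this change.

```python
import json


def extract_error_message(raw_body: bytes, fallback: str) -> str:
    """Return a displayable error string from a JSON error body.

    Hypothetical helper: mirrors the detail -> message -> fallback order
    used in the dialog code and always returns a string.
    """
    try:
        body = json.loads(raw_body.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        # Body was not valid UTF-8 or not valid JSON.
        return fallback
    if not isinstance(body, dict):
        return fallback
    msg = body.get("detail", body.get("message", fallback))
    # Structured details (lists or dicts) are serialised so they can be
    # passed to a message box, which expects text.
    return msg if isinstance(msg, str) else json.dumps(msg)
```

Keeping the parsing separate from the Qt calls would make this path testable without a running dialog.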
--- a/index.html
+++ b/index.html
@@ -0,0 +1,13 @@
+<!doctype html>
+<html lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <link rel="icon" type="image/png" href="/LocalTranscription.png" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <title>Local Transcription</title>
+  </head>
+  <body>
+    <div id="app"></div>
+    <script type="module" src="/src/main.ts"></script>
+  </body>
+</html>
--- a/local-transcription-headless.spec
+++ b/local-transcription-headless.spec
@@ -0,0 +1,184 @@
+# -*- mode: python ; coding: utf-8 -*-
+"""PyInstaller spec file for headless Local Transcription backend (no PySide6/Qt).
+
+This builds the Python sidecar for the Tauri frontend.
+Much simpler than local-transcription.spec since all Qt dependencies are removed.
+"""
+
+import sys
+import os
+
+block_cipher = None
+is_windows = sys.platform == 'win32'
+
+from PyInstaller.utils.hooks import collect_submodules, collect_data_files
+
+# Find faster_whisper assets folder
+import faster_whisper
+faster_whisper_path = os.path.dirname(faster_whisper.__file__)
+vad_assets_path = os.path.join(faster_whisper_path, 'assets')
+
+# pvporcupine resources (indirect dependency from RealtimeSTT)
+try:
+    import pvporcupine
+    pvporcupine_path = os.path.dirname(pvporcupine.__file__)
+    pvporcupine_resources = os.path.join(pvporcupine_path, 'resources')
+    pvporcupine_lib = os.path.join(pvporcupine_path, 'lib')
+    pvporcupine_data_files = []
+    if os.path.exists(pvporcupine_resources):
+        pvporcupine_data_files.append((pvporcupine_resources, 'pvporcupine/resources'))
+    if os.path.exists(pvporcupine_lib):
+        pvporcupine_data_files.append((pvporcupine_lib, 'pvporcupine/lib'))
+except ImportError:
+    pvporcupine_data_files = []
+
+# Data files
+datas = [
+    ('config/default_config.yaml', 'config'),
+    (vad_assets_path, 'faster_whisper/assets'),
+] + pvporcupine_data_files
+
+# Hidden imports -- NO PySide6/Qt needed for headless backend
+hiddenimports = [
+    # Transcription engine
+    'faster_whisper',
+    'faster_whisper.transcribe',
+    'faster_whisper.vad',
+    'ctranslate2',
+    'sounddevice',
+    'scipy',
+    'scipy.signal',
+    'numpy',
+    # RealtimeSTT
+    'RealtimeSTT',
+    'RealtimeSTT.audio_recorder',
+    'webrtcvad',
+    'webrtcvad_wheels',
+    'silero_vad',
+    # PyTorch
+    'torch',
+    'torch.nn',
+    'torch.nn.functional',
+    'torchaudio',
+    'onnxruntime',
+    'onnxruntime.capi',
+    'onnxruntime.capi.onnxruntime_pybind11_state',
+    'pyaudio',
+    'halo',
+    'colorama',
+    # FastAPI and dependencies
+    'fastapi',
+    'fastapi.routing',
+    'fastapi.responses',
+    'starlette',
+    'starlette.applications',
+    'starlette.routing',
+    'starlette.responses',
+    'starlette.websockets',
+    'starlette.middleware',
+    'starlette.middleware.cors',
+    'pydantic',
+    'pydantic.fields',
+    'pydantic.main',
+    'anyio',
+    'anyio._backends',
+    'anyio._backends._asyncio',
+    'sniffio',
+    # Uvicorn
+    'uvicorn',
+    'uvicorn.logging',
+    'uvicorn.loops',
+    'uvicorn.loops.auto',
+    'uvicorn.protocols',
+    'uvicorn.protocols.http',
+    'uvicorn.protocols.http.auto',
+    'uvicorn.protocols.http.h11_impl',
+    'uvicorn.protocols.websockets',
+    'uvicorn.protocols.websockets.auto',
+    'uvicorn.protocols.websockets.wsproto_impl',
+    'uvicorn.lifespan',
+    'uvicorn.lifespan.on',
+    'h11',
+    'websockets',
+    'websockets.legacy',
+    'websockets.legacy.server',
+    # HTTP client
+    'requests',
+    'urllib3',
+    'certifi',
+    'charset_normalizer',
+]
+
+# Collect submodules for key packages
+print("Collecting submodules for backend packages...")
+for package in ['fastapi', 'starlette', 'pydantic', 'pydantic_core', 'anyio', 'uvicorn', 'websockets', 'h11', 'httptools', 'uvloop']:
+    try:
+        submodules = collect_submodules(package)
+        hiddenimports += submodules
+        print(f"  + Collected {len(submodules)} submodules from {package}")
+    except Exception as e:
+        print(f"  - Warning: Could not collect {package}: {e}")
+
+# Collect data files
+for package in ['fastapi', 'starlette', 'pydantic', 'uvicorn', 'RealtimeSTT']:
+    try:
+        data_files = collect_data_files(package)
+        if data_files:
+            datas += data_files
+            print(f"  + Collected {len(data_files)} data files from {package}")
+    except Exception:
+        pass
+
+# Pydantic critical deps
+hiddenimports += [
+    'colorsys', 'decimal', 'json', 'ipaddress', 'pathlib', 'uuid',
+    'email.message', 'typing_extensions',
+]
+
+a = Analysis(
+    ['backend/main_headless.py'],
+    pathex=[],
+    binaries=[],
+    datas=datas,
+    hiddenimports=hiddenimports,
+    hookspath=['hooks'],
+    hooksconfig={},
+    runtime_hooks=[],
+    excludes=['enum34', 'PySide6', 'PyQt5', 'PyQt6', 'tkinter'],
+    win_no_prefer_redirects=False,
+    win_private_assemblies=False,
+    cipher=block_cipher,
+    noarchive=False,
+)
+
+pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)
+
+exe = EXE(
+    pyz,
+    a.scripts,
+    [],
+    exclude_binaries=True,
+    name='local-transcription-backend',
+    debug=False,
+    bootloader_ignore_signals=False,
+    strip=False,
+    upx=True,
+    console=True,  # Headless backend needs console for JSON output
+    disable_windowed_traceback=False,
+    argv_emulation=False,
+    target_arch=None,
+    codesign_identity=None,
+    entitlements_file=None,
+    icon='LocalTranscription.ico' if is_windows else None,
+)
+
+coll = COLLECT(
+    exe,
+    a.binaries,
+    a.zipfiles,
+    a.datas,
+    strip=False,
+    upx=True,
+    upx_exclude=[],
+    name='local-transcription-backend',
+)
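
The spec above lists hidden imports by hand, so a renamed or uninstalled module may only show up as a failure in the frozen binary. Below is a minimal sketch of a pre-build check, assuming it runs in the same environment as PyInstaller. The function name `unresolved_modules` is hypothetical and not part of this change.

```python
import importlib.util


def unresolved_modules(names):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec imports parent packages for dotted names, so a
            # missing parent raises instead of returning None.
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

Calling `unresolved_modules(hiddenimports)` before `Analysis(...)` and printing the result would surface stale entries early. Note that `find_spec` imports parent packages of dotted names as a side effect.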
--- a/package-lock.json
+++ b/package-lock.json
--- a/package.json
+++ b/package.json
@@ -0,0 +1,27 @@
+{
+  "name": "local-transcription",
+  "private": true,
+  "version": "1.4.4",
+  "type": "module",
+  "scripts": {
+    "dev": "vite dev",
+    "build": "vite build",
+    "preview": "vite preview",
+    "tauri": "tauri"
+  },
+  "devDependencies": {
+    "@sveltejs/vite-plugin-svelte": "^5.0.0",
+    "@tauri-apps/cli": "^2.0.0",
+    "@tsconfig/svelte": "^5.0.0",
+    "svelte": "^5.0.0",
+    "svelte-check": "^4.0.0",
+    "typescript": "~5.6.0",
+    "vite": "^6.0.0"
+  },
+  "dependencies": {
+    "@tauri-apps/api": "^2.0.0",
+    "@tauri-apps/plugin-dialog": "^2.0.0",
+    "@tauri-apps/plugin-shell": "^2.0.0",
+    "@tauri-apps/plugin-process": "^2.0.0"
+  }
+}
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "local-transcription"
-version = "1.0.0"
+version = "1.0.2"
 description = "A standalone desktop application for real-time speech-to-text transcription using Whisper models"
 readme = "README.md"
 requires-python = ">=3.9"
--- a/src-tauri/Cargo.lock
+++ b/src-tauri/Cargo.lock
--- a/src-tauri/Cargo.toml
+++ b/src-tauri/Cargo.toml
@@ -0,0 +1,26 @@
+[package]
+name = "local-transcription"
+version = "1.4.4"
+description = "Real-time speech-to-text transcription for streamers"
+authors = ["Local Transcription Contributors"]
+edition = "2021"
+
+[lib]
+name = "local_transcription_lib"
+crate-type = ["lib", "cdylib", "staticlib"]
+
+[build-dependencies]
+tauri-build = { version = "2", features = [] }
+
+[dependencies]
+tauri = { version = "2", features = [] }
+tauri-plugin-shell = "2"
+tauri-plugin-dialog = "2"
+tauri-plugin-process = "2"
+serde = { version = "1", features = ["derive"] }
+serde_json = "1"
+reqwest = { version = "0.12", features = ["json", "stream"] }
+futures-util = "0.3"
+zip = { version = "2", default-features = false, features = ["deflate"] }
+bytes = "1"
+tokio = { version = "1", features = ["full"] }
--- a/src-tauri/build.rs
+++ b/src-tauri/build.rs
@@ -0,0 +1,3 @@
+fn main() {
+    tauri_build::build()
+}
--- a/src-tauri/gen/schemas/acl-manifests.json
+++ b/src-tauri/gen/schemas/acl-manifests.json
--- a/src-tauri/gen/schemas/capabilities.json
+++ b/src-tauri/gen/schemas/capabilities.json
@@ -0,0 +1 @@
+{}
--- a/src-tauri/gen/schemas/desktop-schema.json
+++ b/src-tauri/gen/schemas/desktop-schema.json
--- a/src-tauri/gen/schemas/linux-schema.json
+++ b/src-tauri/gen/schemas/linux-schema.json
--- a/src-tauri/icons/128x128.png
+++ b/src-tauri/icons/128x128.png
--- a/src-tauri/icons/128x128@2x.png
+++ b/src-tauri/icons/128x128@2x.png
--- a/src-tauri/icons/32x32.png
+++ b/src-tauri/icons/32x32.png
--- a/src-tauri/icons/icon.icns
+++ b/src-tauri/icons/icon.icns
--- a/src-tauri/icons/icon.ico
+++ b/src-tauri/icons/icon.ico
--- a/src-tauri/icons/icon.png
+++ b/src-tauri/icons/icon.png
--- a/src-tauri/src/lib.rs
+++ b/src-tauri/src/lib.rs
@@ -0,0 +1,41 @@
+mod sidecar;
+
+use std::sync::Mutex;
+use tauri::Manager;
+
+#[cfg_attr(mobile, tauri::mobile_entry_point)]
+pub fn run() {
+    tauri::Builder::default()
+        .plugin(tauri_plugin_shell::init())
+        .plugin(tauri_plugin_dialog::init())
+        .plugin(tauri_plugin_process::init())
+        .manage(sidecar::ManagedSidecar(Mutex::new(
+            sidecar::SidecarManager::new(),
+        )))
+        .setup(|app| {
+            let resource_dir = app
+                .path()
+                .resource_dir()
+                .expect("failed to resolve resource dir");
+            let data_dir = app
+                .path()
+                .app_data_dir()
+                .expect("failed to resolve app data dir");
+
+            // Ensure the data directory exists
+            std::fs::create_dir_all(&data_dir).expect("failed to create app data dir");
+
+            sidecar::init_dirs(resource_dir, data_dir);
+            Ok(())
+        })
+        .invoke_handler(tauri::generate_handler![
+            sidecar::check_sidecar,
+            sidecar::download_sidecar,
+            sidecar::check_sidecar_update,
+            sidecar::get_sidecar_port,
+            sidecar::start_sidecar,
+            sidecar::stop_sidecar,
+        ])
+        .run(tauri::generate_context!())
+        .expect("error while running tauri application");
+}
--- a/src-tauri/src/main.rs
+++ b/src-tauri/src/main.rs
@@ -0,0 +1,6 @@
+// Prevents additional console window on Windows in release
+#![cfg_attr(not(debug_assertions), windows_subsystem = "windows")]
+
+fn main() {
+    local_transcription_lib::run()
+}
--- a/src-tauri/src/sidecar/mod.rs
+++ b/src-tauri/src/sidecar/mod.rs
@@ -0,0 +1,580 @@
+use std::io::BufRead;
+use std::path::PathBuf;
+use std::sync::Mutex;
+
+use serde::{Deserialize, Serialize};
+use tauri::{AppHandle, Emitter};
+
+const REPO_API: &str =
+    "https://repo.anhonesthost.net/api/v1/repos/streamer-tools/local-transcription";
+
+const BINARY_NAME: &str = if cfg!(windows) {
+    "local-transcription-backend.exe"
+} else {
+    "local-transcription-backend"
+};
+
+// ---------------------------------------------------------------------------
+// Directory state (initialised once during Tauri setup)
+// ---------------------------------------------------------------------------
+
+static DIRS: std::sync::OnceLock<SidecarDirs> = std::sync::OnceLock::new();
+
+struct SidecarDirs {
+    #[allow(dead_code)]
+    resource_dir: PathBuf,
+    data_dir: PathBuf,
+}
+
+/// Called from Tauri `setup` to persist the resource / data directories.
+pub fn init_dirs(resource_dir: PathBuf, data_dir: PathBuf) {
+    let _ = DIRS.set(SidecarDirs {
+        resource_dir,
+        data_dir,
+    });
+}
+
+fn data_dir() -> &'static PathBuf {
+    &DIRS.get().expect("sidecar::init_dirs not called").data_dir
+}
+
+// ---------------------------------------------------------------------------
+// Version helpers
+// ---------------------------------------------------------------------------
+
+fn version_file() -> PathBuf {
+    data_dir().join("sidecar-version.txt")
+}
+
+fn read_installed_version() -> Option<String> {
+    std::fs::read_to_string(version_file())
+        .ok()
+        .map(|s| s.trim().to_string())
+        .filter(|s| !s.is_empty())
+}
+
+fn sidecar_dir_for_version(version: &str) -> PathBuf {
+    data_dir().join(format!("sidecar-{version}"))
+}
+
+fn binary_path_for_version(version: &str) -> PathBuf {
+    sidecar_dir_for_version(version).join(BINARY_NAME)
+}
+
+// ---------------------------------------------------------------------------
+// Gitea API types
+// ---------------------------------------------------------------------------
+
+#[derive(Debug, Deserialize)]
+struct GiteaRelease {
+    tag_name: String,
+    assets: Vec<GiteaAsset>,
+}
+
+#[derive(Debug, Deserialize)]
+struct GiteaAsset {
+    name: String,
+    browser_download_url: String,
+    size: u64,
+}
+
+// ---------------------------------------------------------------------------
+// Platform / arch detection
+// ---------------------------------------------------------------------------
+
+fn platform_token() -> &'static str {
+    if cfg!(target_os = "windows") {
+        "windows"
+    } else if cfg!(target_os = "macos") {
+        "macos"
+    } else {
+        "linux"
+    }
+}
+
+fn arch_token() -> &'static str {
+    if cfg!(target_arch = "aarch64") {
+        "aarch64"
+    } else {
+        "x86_64"
+    }
+}
+
+/// Build the expected asset prefix, e.g. `sidecar-linux-x86_64-cuda`.
+fn asset_prefix(variant: &str) -> String {
+    format!("sidecar-{}-{}-{}", platform_token(), arch_token(), variant)
+}
+
+// ---------------------------------------------------------------------------
+// Tauri commands
+// ---------------------------------------------------------------------------
+
+/// Returns `true` when a sidecar binary is installed and the file exists.
+#[tauri::command]
+pub fn check_sidecar() -> bool {
+    if let Some(version) = read_installed_version() {
+        binary_path_for_version(&version).exists()
+    } else {
+        false
+    }
+}
+
+/// Download progress payload emitted via `sidecar-download-progress`.
+#[derive(Clone, Serialize)]
+struct DownloadProgress {
+    downloaded: u64,
+    total: u64,
+    phase: String, // "downloading" | "extracting" | "done" | "error"
+    message: String,
+}
+
+/// Download & install the latest sidecar release.
+///
+/// `variant` is typically `"cuda"` or `"cpu"`.
+#[tauri::command]
+pub async fn download_sidecar(app: AppHandle, variant: String) -> Result<String, String> {
+    use futures_util::StreamExt;
+
+    let emit = |progress: DownloadProgress| {
+        let _ = app.emit("sidecar-download-progress", progress);
+    };
+
+    // 1. Fetch releases from Gitea (filter to sidecar-v* tags) ---------------
+    emit(DownloadProgress {
+        downloaded: 0,
+        total: 0,
+        phase: "downloading".into(),
+        message: "Fetching release info...".into(),
+    });
+
+    let releases_url = format!("{REPO_API}/releases?limit=20");
+    let client = reqwest::Client::new();
+    let releases: Vec<GiteaRelease> = client
+        .get(&releases_url)
+        .send()
+        .await
+        .map_err(|e| format!("Failed to fetch releases: {e}"))?
+        .json()
+        .await
+        .map_err(|e| format!("Failed to parse releases: {e}"))?;
+
+    // Find the latest release whose tag starts with `sidecar-v`
+    let release = releases
+        .into_iter()
+        .find(|r| r.tag_name.starts_with("sidecar-v"))
+        .ok_or_else(|| "No sidecar release found".to_string())?;
+
+    let version = release.tag_name.clone(); // e.g. "sidecar-v1.0.2"
+
+    // 2. Find matching asset ----------------------------------------------------
+    let prefix = asset_prefix(&variant);
+    let asset = release
+        .assets
+        .iter()
+        .find(|a| a.name.starts_with(&prefix) && a.name.ends_with(".zip"))
+        .ok_or_else(|| {
+            format!(
+                "No asset matching '{}' in release {}. Available: {}",
+                prefix,
+                version,
+                release
+                    .assets
+                    .iter()
+                    .map(|a| a.name.as_str())
+                    .collect::<Vec<_>>()
+                    .join(", ")
+            )
+        })?;
+
+    let total_size = asset.size;
+    let download_url = asset.browser_download_url.clone();
+
+    // 3. Stream download ---------------------------------------------------------
+    emit(DownloadProgress {
+        downloaded: 0,
+        total: total_size,
+        phase: "downloading".into(),
+        message: format!("Downloading {}...", asset.name),
+    });
+
+    let response = client
+        .get(&download_url)
+        .send()
+        .await
+        .map_err(|e| format!("Download request failed: {e}"))?;
+
+    if !response.status().is_success() {
+        return Err(format!("Download failed with status {}", response.status()));
+    }
+
+    let tmp_zip = data_dir().join("_sidecar_download.zip");
+    let mut file = tokio::fs::File::create(&tmp_zip)
+        .await
+        .map_err(|e| format!("Cannot create temp file: {e}"))?;
+
+    let mut stream = response.bytes_stream();
+    let mut downloaded: u64 = 0;
+
+    use tokio::io::AsyncWriteExt;
+    while let Some(chunk) = stream.next().await {
+        let chunk = chunk.map_err(|e| format!("Download stream error: {e}"))?;
+        file.write_all(&chunk)
+            .await
+            .map_err(|e| format!("Write error: {e}"))?;
+        downloaded += chunk.len() as u64;
+
+        emit(DownloadProgress {
+            downloaded,
+            total: total_size,
+            phase: "downloading".into(),
+            message: format!(
+                "Downloading... {:.1} / {:.1} MB",
+                downloaded as f64 / 1_048_576.0,
+                total_size as f64 / 1_048_576.0
+            ),
+        });
+    }
+
+    file.flush()
+        .await
+        .map_err(|e| format!("Flush error: {e}"))?;
+    drop(file);
+
+    // 4. Extract zip -------------------------------------------------------------
+    emit(DownloadProgress {
+        downloaded,
+        total: total_size,
+        phase: "extracting".into(),
+        message: "Extracting sidecar...".into(),
+    });
+
+    let dest_dir = sidecar_dir_for_version(&version);
+    if dest_dir.exists() {
+        std::fs::remove_dir_all(&dest_dir)
+            .map_err(|e| format!("Cannot clean old dir: {e}"))?;
+    }
+    std::fs::create_dir_all(&dest_dir)
+        .map_err(|e| format!("Cannot create sidecar dir: {e}"))?;
+
+    // Extraction is blocking I/O -- offload to a spawn_blocking thread.
+    let zip_path = tmp_zip.clone();
+    let dest = dest_dir.clone();
+    tokio::task::spawn_blocking(move || extract_zip(&zip_path, &dest))
+        .await
+        .map_err(|e| format!("Join error: {e}"))?
+        .map_err(|e| format!("Extraction error: {e}"))?;
+
+    // Remove the temp zip
+    let _ = std::fs::remove_file(&tmp_zip);
+
+    // 5. Set executable permissions on Unix -------------------------------------
+    #[cfg(unix)]
+    {
+        use std::os::unix::fs::PermissionsExt;
+        let bin = dest_dir.join(BINARY_NAME);
+        if bin.exists() {
+            let mut perms = std::fs::metadata(&bin)
+                .map_err(|e| format!("metadata error: {e}"))?
+                .permissions();
+            perms.set_mode(0o755);
+            std::fs::set_permissions(&bin, perms)
+                .map_err(|e| format!("chmod error: {e}"))?;
+        }
+    }
+
+    // 6. Write version file & clean up old versions ----------------------------
+    std::fs::write(version_file(), &version)
+        .map_err(|e| format!("Failed to write version file: {e}"))?;
+
+    cleanup_old_versions(&version);
+
+    emit(DownloadProgress {
+        downloaded,
+        total: total_size,
+        phase: "done".into(),
+        message: "Sidecar installed successfully".into(),
+    });
+
+    Ok(version)
+}
+
+/// Check if there is a newer sidecar release than the installed one.
+/// Returns `Some(tag_name)` when an update is available, or `None`.
+#[tauri::command]
+pub async fn check_sidecar_update() -> Result<Option<String>, String> {
+    let installed = match read_installed_version() {
+        Some(v) => v,
+        None => return Ok(None),
+    };
+
+    let releases_url = format!("{REPO_API}/releases?limit=20");
+    let releases: Vec<GiteaRelease> = reqwest::Client::new()
+        .get(&releases_url)
+        .send()
+        .await
+        .map_err(|e| format!("Failed to fetch releases: {e}"))?
+        .json()
+        .await
+        .map_err(|e| format!("Failed to parse releases: {e}"))?;
+
+    let latest = releases
+        .iter()
+        .find(|r| r.tag_name.starts_with("sidecar-v"));
+
+    match latest {
+        Some(rel) if rel.tag_name != installed => Ok(Some(rel.tag_name.clone())),
+        _ => Ok(None),
+    }
+}
+
+// ---------------------------------------------------------------------------
+// Zip extraction helper
+// ---------------------------------------------------------------------------
+
+fn extract_zip(zip_path: &std::path::Path, dest: &std::path::Path) -> Result<(), String> {
+    let file =
+        std::fs::File::open(zip_path).map_err(|e| format!("Cannot open zip: {e}"))?;
+    let mut archive =
+        zip::ZipArchive::new(file).map_err(|e| format!("Invalid zip: {e}"))?;
+
+    for i in 0..archive.len() {
+        let mut entry = archive
+            .by_index(i)
+            .map_err(|e| format!("Zip entry error: {e}"))?;
+        let entry_path = match entry.enclosed_name() {
+            Some(p) => p.to_owned(),
+            None => continue,
+        };
+
+        let out_path = dest.join(&entry_path);
+
+        if entry.is_dir() {
+            std::fs::create_dir_all(&out_path)
+                .map_err(|e| format!("mkdir error: {e}"))?;
+        } else {
+            if let Some(parent) = out_path.parent() {
+                std::fs::create_dir_all(parent)
+                    .map_err(|e| format!("mkdir error: {e}"))?;
+            }
+            let mut outfile = std::fs::File::create(&out_path)
+                .map_err(|e| format!("create file error: {e}"))?;
+            std::io::copy(&mut entry, &mut outfile)
+                .map_err(|e| format!("copy error: {e}"))?;
+        }
+    }
+    Ok(())
+}
+
+// ---------------------------------------------------------------------------
+// Cleanup old versions
+// ---------------------------------------------------------------------------
+
+fn cleanup_old_versions(current_version: &str) {
+    let data = data_dir();
+    let current_dir_name = format!("sidecar-{current_version}");
+    if let Ok(entries) = std::fs::read_dir(data) {
+        for entry in entries.flatten() {
+            let name = entry.file_name().to_string_lossy().to_string();
+            if name.starts_with("sidecar-")       // dirs are sidecar-<tag>, e.g. sidecar-sidecar-v1.0.1
+                && name != current_dir_name
+                && entry.path().is_dir()
+            {
+                let _ = std::fs::remove_dir_all(entry.path());
+            }
+        }
+    }
+}
+
+// ---------------------------------------------------------------------------
+// SidecarManager — launch / stop / query the backend process
+// ---------------------------------------------------------------------------
+
+#[derive(Debug, Serialize, Deserialize)]
+struct ReadyEvent {
+    event: String,
+    port: u16,
+}
+
+pub struct SidecarManager {
+    child: Option<std::process::Child>,
+    port: Option<u16>,
+}
+
+impl SidecarManager {
+    pub fn new() -> Self {
+        Self {
+            child: None,
+            port: None,
+        }
+    }
+
+    /// Returns `true` when the child process is still alive.
+    pub fn is_running(&mut self) -> bool {
+        match &mut self.child {
+            Some(child) => match child.try_wait() {
+                Ok(Some(_)) => {
+                    // Process has exited
+                    self.child = None;
+                    self.port = None;
+                    false
+                }
+                Ok(None) => true,
+                Err(_) => false,
+            },
+            None => false,
+        }
+    }
+
+    /// Start the sidecar if it is not already running. Returns the port.
+    pub fn ensure_running(&mut self) -> Result<u16, String> {
+        if self.is_running() {
+            return self
+                .port
+                .ok_or_else(|| "Sidecar running but port unknown".into());
+        }
+
+        let is_dev = cfg!(debug_assertions)
+            || std::env::var("LOCAL_TRANSCRIPTION_DEV")
+                .map(|v| v == "1")
+                .unwrap_or(false);
+
+        let mut cmd = if is_dev {
+            self.build_dev_command()?
+        } else {
+            self.build_prod_command()?
+        };
+
+        // Hide the sidecar's console window on Windows (debug and release builds).
+        #[cfg(windows)]
+        {
+            use std::os::windows::process::CommandExt;
+            const CREATE_NO_WINDOW: u32 = 0x08000000;
+            cmd.creation_flags(CREATE_NO_WINDOW);
+        }
+
+        cmd.stdout(std::process::Stdio::piped());
+        // stderr is never read here; discard it so a full pipe cannot block the sidecar.
+        cmd.stderr(std::process::Stdio::null());
+
+        let mut child = cmd.spawn().map_err(|e| format!("Failed to spawn sidecar: {e}"))?;
+
+        // Wait for the `{"event":"ready","port":...}` line on stdout.
+        let stdout = child
+            .stdout
+            .take()
+            .ok_or("Failed to capture sidecar stdout")?;
+
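+        // If the handshake fails, kill the child so it is not left running unmanaged.
+        let port = Self::wait_for_ready(stdout).map_err(|e| {
+            let _ = child.kill();
+            let _ = child.wait();
+            e
+        })?;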
+
+        self.child = Some(child);
+        self.port = Some(port);
+        Ok(port)
+    }
+
+    /// Stop the sidecar process if running.
+    pub fn stop(&mut self) {
+        if let Some(mut child) = self.child.take() {
+            let _ = child.kill();
+            let _ = child.wait();
+        }
+        self.port = None;
+    }
+
+    /// Return the port the sidecar is listening on, if known.
+    pub fn port(&self) -> Option<u16> {
+        self.port
+    }
+
+    // -- private helpers -------------------------------------------------------
+
+    fn build_dev_command(&self) -> Result<std::process::Command, String> {
+        let mut cmd = std::process::Command::new("python");
+        cmd.args(["-m", "backend.main_headless"]);
+
+        // Walk up from the resource dir until we reach the directory that
+        // contains the `backend` package (the project root).
+        if let Some(dirs) = DIRS.get() {
+            let project_root = dirs
+                .resource_dir
+                .ancestors()
+                .find(|p| p.join("backend").is_dir());
+            if let Some(root) = project_root {
+                cmd.current_dir(root);
+            }
+        }
+
+        Ok(cmd)
+    }
+
+    fn build_prod_command(&self) -> Result<std::process::Command, String> {
+        let version = read_installed_version()
+            .ok_or("No sidecar version installed")?;
+        let bin = binary_path_for_version(&version);
+        if !bin.exists() {
+            return Err(format!("Sidecar binary not found at {}", bin.display()));
+        }
+        let mut cmd = std::process::Command::new(&bin);
+        cmd.current_dir(
+            bin.parent()
+                .ok_or("Cannot determine sidecar parent dir")?,
+        );
+        Ok(cmd)
+    }
+
+    fn wait_for_ready(stdout: std::process::ChildStdout) -> Result<u16, String> {
+        let reader = std::io::BufReader::new(stdout);
+        let timeout = std::time::Duration::from_secs(120);
+        let start = std::time::Instant::now();
+
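+        // Note: `lines()` blocks, so the timeout is only checked after a line arrives;
+        // a sidecar that hangs without writing anything is not interrupted.
+        for line in reader.lines() {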
+            if start.elapsed() > timeout {
+                return Err("Timed out waiting for sidecar ready event".into());
+            }
+            let line = line.map_err(|e| format!("IO error reading stdout: {e}"))?;
+            if let Ok(evt) = serde_json::from_str::<ReadyEvent>(&line) {
+                if evt.event == "ready" {
+                    return Ok(evt.port);
+                }
+            }
+            // Ignore other lines (e.g. log output)
+        }
+        Err("Sidecar process exited before sending ready event".into())
+    }
+}
+
+// ---------------------------------------------------------------------------
+// Tauri-managed SidecarManager state & commands
+// ---------------------------------------------------------------------------
+
+/// Wrapper so we can store `SidecarManager` in Tauri's managed state.
+pub struct ManagedSidecar(pub Mutex<SidecarManager>);
+
+#[tauri::command]
+pub fn get_sidecar_port(state: tauri::State<'_, ManagedSidecar>) -> Result<Option<u16>, String> {
+    let mut mgr = state
+        .0
+        .lock()
+        .map_err(|e| format!("Lock error: {e}"))?;
+    // Refresh running status before returning port
+    if !mgr.is_running() {
+        return Ok(None);
+    }
+    Ok(mgr.port())
+}
+
+#[tauri::command]
+pub fn start_sidecar(state: tauri::State<'_, ManagedSidecar>) -> Result<u16, String> {
+    let mut mgr = state
+        .0
+        .lock()
+        .map_err(|e| format!("Lock error: {e}"))?;
+    mgr.ensure_running()
+}
+
+#[tauri::command]
+pub fn stop_sidecar(state: tauri::State<'_, ManagedSidecar>) -> Result<(), String> {
+    let mut mgr = state
+        .0
+        .lock()
+        .map_err(|e| format!("Lock error: {e}"))?;
+    mgr.stop();
+    Ok(())
+}
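
`wait_for_ready` above scans the sidecar's stdout for a JSON line with `event` set to `ready` and a `port` field, and ignores everything else. Below is a minimal sketch of both sides of that handshake in Python. The function names are hypothetical, and the real `backend.main_headless` may emit the line differently.

```python
import json


def format_ready_line(port: int) -> str:
    """Build the single-line JSON message the launcher waits for."""
    return json.dumps({"event": "ready", "port": port})


def parse_ready_port(lines):
    """Return the port from the first ready event, or None if none is seen.

    Mirrors the Rust loop: non-JSON lines and other events are skipped.
    """
    for line in lines:
        try:
            evt = json.loads(line)
        except ValueError:
            continue  # plain log output
        if not isinstance(evt, dict) or evt.get("event") != "ready":
            continue
        port = evt.get("port")
        # The Rust side deserialises the port as u16.
        if isinstance(port, int) and not isinstance(port, bool) and 0 <= port <= 65535:
            return port
    return None
```

On the backend side the line would be written with something like `print(format_ready_line(port), flush=True)`. The explicit flush matters because Python block-buffers stdout when it is a pipe, so without it the launcher could wait until the timeout.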
--- a/src-tauri/tauri.conf.json
+++ b/src-tauri/tauri.conf.json
@@ -0,0 +1,43 @@
+{
+  "productName": "Local Transcription",
+  "version": "1.4.4",
+  "identifier": "net.anhonesthost.local-transcription",
+  "build": {
+    "frontendDist": "../dist",
+    "devUrl": "http://localhost:1420",
+    "beforeDevCommand": "npm run dev",
+    "beforeBuildCommand": "npm run build"
+  },
+  "app": {
+    "windows": [
+      {
+        "title": "Local Transcription",
+        "width": 800,
+        "height": 600,
+        "minWidth": 640,
+        "minHeight": 480,
+        "resizable": true
+      }
+    ],
+    "security": {
+      "csp": null
+    }
+  },
+  "bundle": {
+    "active": true,
+    "targets": "all",
+    "icon": [
+      "icons/32x32.png",
+      "icons/128x128.png",
+      "icons/128x128@2x.png",
+      "icons/icon.icns",
+      "icons/icon.ico",
+      "icons/icon.png"
+    ]
+  },
+  "plugins": {
+    "shell": {
+      "open": true
+    }
+  }
+}
--- a/src/App.svelte
+++ b/src/App.svelte
@@ -0,0 +1,253 @@
+<script lang="ts">
+  import { onMount } from "svelte";
+  import Header from "$lib/components/Header.svelte";
+  import StatusBar from "$lib/components/StatusBar.svelte";
+  import Controls from "$lib/components/Controls.svelte";
+  import TranscriptionDisplay from "$lib/components/TranscriptionDisplay.svelte";
+  import Settings from "$lib/components/Settings.svelte";
+  import SidecarSetup from "$lib/components/SidecarSetup.svelte";
+  import { backendStore } from "$lib/stores/backend";
+  import { configStore } from "$lib/stores/config";
+
+  type SidecarState = "checking" | "needs_setup" | "starting" | "connected";
+
+  let showSettings = $state(false);
+  let sidecarState = $state<SidecarState>("checking");
+
+  let obsDisplayUrl = $derived(backendStore.obsUrl);
+  let syncDisplayUrl = $derived(backendStore.syncUrl);
+  let isConnected = $derived(backendStore.connectionState === "connected");
+  let connectionState = $derived(backendStore.connectionState);
+
+  function openSettings() {
+    showSettings = true;
+  }
+
+  function closeSettings() {
+    showSettings = false;
+  }
+
+  async function checkAndLaunchSidecar() {
+    try {
+      const { invoke } = await import("@tauri-apps/api/core");
+
+      // Check if sidecar is installed
+      sidecarState = "checking";
+      const installed = await invoke<boolean>("check_sidecar");
+
+      if (!installed) {
+        sidecarState = "needs_setup";
+        return;
+      }
+
+      await launchSidecar();
+    } catch {
+      // Tauri API unavailable (browser dev mode) or check_sidecar failed:
+      // skip the sidecar check and connect directly to localhost:8081
+      sidecarState = "starting";
+      backendStore.setPort(8081);
+      backendStore.connect();
+      configStore.loadConfig();
+    }
+  }
+
+  async function launchSidecar() {
+    try {
+      const { invoke } = await import("@tauri-apps/api/core");
+
+      sidecarState = "starting";
+      await invoke("start_sidecar");
+
+      const port = await invoke<number | null>("get_sidecar_port");
+      if (port !== null) backendStore.setPort(port);
+      backendStore.connect();
+      configStore.loadConfig();
+    } catch {
+      // If sidecar launch fails, still try connecting to default port
+      sidecarState = "starting";
+      backendStore.connect();
+      configStore.loadConfig();
+    }
+  }
+
+  async function onSidecarReady() {
+    await launchSidecar();
+  }
+
+  onMount(() => {
+    checkAndLaunchSidecar();
+
+    return () => {
+      backendStore.disconnect();
+    };
+  });
+</script>
+
+{#if sidecarState === "checking"}
+  <div class="connecting-overlay">
+    <div class="connecting-content">
+      <div class="connecting-icon">
+        <div class="spinner"></div>
+      </div>
+      <h2>Local Transcription</h2>
+      <p>Checking setup...</p>
+    </div>
+  </div>
+
+{:else if sidecarState === "needs_setup"}
+  <SidecarSetup onComplete={onSidecarReady} />
+
+{:else if !isConnected}
+  <div class="connecting-overlay">
+    <div class="connecting-content">
+      <div class="connecting-icon">
+        {#if connectionState === "error"}
+          <svg width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="#e74c3c" stroke-width="2">
+            <circle cx="12" cy="12" r="10"/>
+            <line x1="15" y1="9" x2="9" y2="15"/>
+            <line x1="9" y1="9" x2="15" y2="15"/>
+          </svg>
+        {:else}
+          <div class="spinner"></div>
+        {/if}
+      </div>
+      <h2>Local Transcription</h2>
+      {#if connectionState === "error"}
+        <p>Cannot connect to backend</p>
+        <p class="hint">Make sure the Python backend is running:<br>
+          <code>uv run python -m backend.main_headless</code></p>
+      {:else}
+        <p>Connecting to backend...</p>
+      {/if}
+    </div>
+  </div>
+
+{:else}
+  <div class="app-shell">
+    <Header onSettingsClick={openSettings} />
+    <StatusBar />
+
+    <div class="display-links">
+      <span class="link-label">OBS:</span>
+      <a href={obsDisplayUrl} target="_blank" rel="noopener">{obsDisplayUrl}</a>
+      {#if syncDisplayUrl}
+        <span class="link-separator">|</span>
+        <span class="link-label">Sync:</span>
+        <a href={syncDisplayUrl} target="_blank" rel="noopener"
+          >{syncDisplayUrl}</a
+        >
+      {/if}
+    </div>
+
+    <TranscriptionDisplay />
+    <Controls />
+
+    <div class="version-label">v{backendStore.version}</div>
+  </div>
+
+  {#if showSettings}
+    <Settings onClose={closeSettings} />
+  {/if}
+{/if}
+
+<style>
+  .connecting-overlay {
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    height: 100%;
+    width: 100%;
+    background-color: var(--bg-primary);
+  }
+
+  .connecting-content {
+    text-align: center;
+    color: var(--text-primary);
+  }
+
+  .connecting-content h2 {
+    margin: 16px 0 8px;
+    font-size: 20px;
+    font-weight: 600;
+  }
+
+  .connecting-content p {
+    margin: 4px 0;
+    color: var(--text-secondary);
+    font-size: 14px;
+  }
+
+  .connecting-content .hint {
+    margin-top: 16px;
+    font-size: 12px;
+    color: var(--text-muted);
+  }
+
+  .connecting-content code {
+    display: inline-block;
+    margin-top: 4px;
+    padding: 4px 8px;
+    background: var(--bg-tertiary);
+    border-radius: 4px;
+    font-size: 12px;
+    color: var(--text-primary);
+  }
+
+  .connecting-icon {
+    display: flex;
+    justify-content: center;
+    margin-bottom: 8px;
+  }
+
+  .spinner {
+    width: 40px;
+    height: 40px;
+    border: 3px solid var(--border-color);
+    border-top-color: var(--accent-color, #4CAF50);
+    border-radius: 50%;
+    animation: spin 0.8s linear infinite;
+  }
+
+  @keyframes spin {
+    to { transform: rotate(360deg); }
+  }
+
+  .app-shell {
+    display: flex;
+    flex-direction: column;
+    height: 100%;
+    width: 100%;
+    background-color: var(--bg-primary);
+  }
+
+  .display-links {
+    display: flex;
+    align-items: center;
+    gap: 6px;
+    padding: 6px 20px;
+    font-size: 12px;
+    background-color: var(--bg-primary);
+    border-bottom: 1px solid var(--border-color);
+    flex-shrink: 0;
+  }
+
+  .link-label {
+    color: var(--text-secondary);
+    font-weight: 500;
+  }
+
+  .link-separator {
+    color: var(--text-muted);
+    margin: 0 4px;
+  }
+
+  .version-label {
+    position: fixed;
+    bottom: 6px;
+    right: 12px;
+    font-size: 11px;
+    color: var(--text-muted);
+    pointer-events: none;
+    z-index: 10;
+  }
+</style>
--- a/src/app.css
+++ b/src/app.css
@@ -0,0 +1,312 @@
+/* Global dark theme styles for Local Transcription */
+
+:root {
+  --bg-primary: #1e1e1e;
+  --bg-secondary: #2d2d2d;
+  --bg-tertiary: #3a3a3a;
+  --bg-hover: #454545;
+  --text-primary: #e0e0e0;
+  --text-secondary: #a0a0a0;
+  --text-muted: #707070;
+  --accent-green: #4caf50;
+  --accent-green-hover: #45a049;
+  --accent-red: #f44336;
+  --accent-red-hover: #d32f2f;
+  --accent-blue: #2196f3;
+  --accent-blue-hover: #1976d2;
+  --accent-orange: #ff9800;
+  --border-color: #444;
+  --border-color-light: #555;
+  --scrollbar-track: #2d2d2d;
+  --scrollbar-thumb: #555;
+  --scrollbar-thumb-hover: #777;
+}
+
+*,
+*::before,
+*::after {
+  box-sizing: border-box;
+  margin: 0;
+  padding: 0;
+}
+
+html,
+body {
+  height: 100%;
+  width: 100%;
+  overflow: hidden;
+}
+
+body {
+  background-color: var(--bg-primary);
+  color: var(--text-primary);
+  font-family: system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto,
+    "Helvetica Neue", Arial, sans-serif;
+  font-size: 14px;
+  line-height: 1.5;
+  -webkit-font-smoothing: antialiased;
+  -moz-osx-font-smoothing: grayscale;
+}
+
+#app {
+  height: 100%;
+  width: 100%;
+  display: flex;
+  flex-direction: column;
+}
+
+/* Buttons */
+button {
+  font-family: inherit;
+  font-size: 13px;
+  font-weight: 500;
+  padding: 8px 16px;
+  border: 1px solid var(--border-color);
+  border-radius: 6px;
+  background-color: var(--bg-secondary);
+  color: var(--text-primary);
+  cursor: pointer;
+  transition: background-color 0.15s ease, border-color 0.15s ease,
+    transform 0.1s ease;
+  user-select: none;
+}
+
+button:hover {
+  background-color: var(--bg-hover);
+  border-color: var(--border-color-light);
+}
+
+button:active {
+  transform: scale(0.98);
+}
+
+button:disabled {
+  opacity: 0.5;
+  cursor: not-allowed;
+  transform: none;
+}
+
+button.primary {
+  background-color: var(--accent-green);
+  border-color: var(--accent-green);
+  color: white;
+}
+
+button.primary:hover {
+  background-color: var(--accent-green-hover);
+}
+
+button.danger {
+  background-color: var(--accent-red);
+  border-color: var(--accent-red);
+  color: white;
+}
+
+button.danger:hover {
+  background-color: var(--accent-red-hover);
+}
+
+/* Inputs and Selects */
+input[type="text"],
+input[type="password"],
+input[type="number"],
+input[type="url"],
+input[type="email"],
+select,
+textarea {
+  font-family: inherit;
+  font-size: 13px;
+  padding: 8px 12px;
+  border: 1px solid var(--border-color);
+  border-radius: 6px;
+  background-color: var(--bg-secondary);
+  color: var(--text-primary);
+  outline: none;
+  transition: border-color 0.15s ease;
+  width: 100%;
+}
+
+input[type="text"]:focus,
+input[type="password"]:focus,
+input[type="number"]:focus,
+input[type="url"]:focus,
+input[type="email"]:focus,
+select:focus,
+textarea:focus {
+  border-color: var(--accent-blue);
+}
+
+input[type="text"]::placeholder,
+input[type="password"]::placeholder,
+input[type="url"]::placeholder {
+  color: var(--text-muted);
+}
+
+select {
+  appearance: none;
+  background-image: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' width='12' height='12' viewBox='0 0 12 12'%3E%3Cpath fill='%23a0a0a0' d='M6 8L1 3h10z'/%3E%3C/svg%3E");
+  background-repeat: no-repeat;
+  background-position: right 10px center;
+  padding-right: 30px;
+}
+
+/* Color input */
+input[type="color"] {
+  width: 50px;
+  height: 36px;
+  border: 1px solid var(--border-color);
+  border-radius: 6px;
+  background-color: var(--bg-secondary);
+  cursor: pointer;
+  padding: 2px;
+}
+
+input[type="color"]::-webkit-color-swatch-wrapper {
+  padding: 2px;
+}
+
+input[type="color"]::-webkit-color-swatch {
+  border: none;
+  border-radius: 3px;
+}
+
+/* Range slider */
+input[type="range"] {
+  -webkit-appearance: none;
+  appearance: none;
+  width: 100%;
+  height: 6px;
+  background: var(--bg-tertiary);
+  border-radius: 3px;
+  outline: none;
+  cursor: pointer;
+}
+
+input[type="range"]::-webkit-slider-thumb {
+  -webkit-appearance: none;
+  appearance: none;
+  width: 16px;
+  height: 16px;
+  border-radius: 50%;
+  background: var(--accent-blue);
+  cursor: pointer;
+  border: 2px solid var(--bg-primary);
+}
+
+input[type="range"]::-moz-range-thumb {
+  width: 16px;
+  height: 16px;
+  border-radius: 50%;
+  background: var(--accent-blue);
+  cursor: pointer;
+  border: 2px solid var(--bg-primary);
+}
+
+/* Toggle / Checkbox styled as switch */
+input[type="checkbox"] {
+  position: relative;
+  width: 40px;
+  height: 22px;
+  -webkit-appearance: none;
+  appearance: none;
+  background-color: var(--bg-tertiary);
+  border-radius: 11px;
+  cursor: pointer;
+  transition: background-color 0.2s ease;
+  flex-shrink: 0;
+}
+
+input[type="checkbox"]::after {
+  content: "";
+  position: absolute;
+  top: 2px;
+  left: 2px;
+  width: 18px;
+  height: 18px;
+  background-color: var(--text-secondary);
+  border-radius: 50%;
+  transition: transform 0.2s ease, background-color 0.2s ease;
+}
+
+input[type="checkbox"]:checked {
+  background-color: var(--accent-green);
+}
+
+input[type="checkbox"]:checked::after {
+  transform: translateX(18px);
+  background-color: white;
+}
+
+/* Radio buttons */
+input[type="radio"] {
+  -webkit-appearance: none;
+  appearance: none;
+  width: 18px;
+  height: 18px;
+  border: 2px solid var(--border-color);
+  border-radius: 50%;
+  background-color: var(--bg-secondary);
+  cursor: pointer;
+  position: relative;
+  flex-shrink: 0;
+}
+
+input[type="radio"]:checked {
+  border-color: var(--accent-blue);
+}
+
+input[type="radio"]:checked::after {
+  content: "";
+  position: absolute;
+  top: 3px;
+  left: 3px;
+  width: 8px;
+  height: 8px;
+  background-color: var(--accent-blue);
+  border-radius: 50%;
+}
+
+/* Scrollbar */
+::-webkit-scrollbar {
+  width: 8px;
+  height: 8px;
+}
+
+::-webkit-scrollbar-track {
+  background: var(--scrollbar-track);
+  border-radius: 4px;
+}
+
+::-webkit-scrollbar-thumb {
+  background: var(--scrollbar-thumb);
+  border-radius: 4px;
+}
+
+::-webkit-scrollbar-thumb:hover {
+  background: var(--scrollbar-thumb-hover);
+}
+
+/* Firefox scrollbar */
+* {
+  scrollbar-width: thin;
+  scrollbar-color: var(--scrollbar-thumb) var(--scrollbar-track);
+}
+
+/* Links */
+a {
+  color: var(--accent-blue);
+  text-decoration: none;
+}
+
+a:hover {
+  text-decoration: underline;
+}
+
+/* Label */
+label {
+  font-size: 13px;
+  color: var(--text-secondary);
+  display: flex;
+  align-items: center;
+  gap: 8px;
+}
--- a/src/lib/components/Controls.svelte
+++ b/src/lib/components/Controls.svelte
@@ -0,0 +1,116 @@
+<script lang="ts">
+  import { backendStore } from "$lib/stores/backend";
+  import { transcriptionStore } from "$lib/stores/transcriptions";
+
+  let isTranscribing = $derived(backendStore.appState === "transcribing");
+  let isReady = $derived(
+    backendStore.appState === "ready" || backendStore.appState === "transcribing"
+  );
+  let isLoading = $state(false);
+
+  async function toggleTranscription() {
+    if (isLoading) return;
+    isLoading = true;
+    try {
+      if (isTranscribing) {
+        await backendStore.apiPost("/api/stop");
+      } else {
+        await backendStore.apiPost("/api/start");
+      }
+    } catch (err) {
+      console.error("Failed to toggle transcription:", err);
+    } finally {
+      isLoading = false;
+    }
+  }
+
+  async function clearTranscriptions() {
+    try {
+      await backendStore.apiPost("/api/clear");
+      transcriptionStore.clearAll();
+    } catch (err) {
+      console.error("Failed to clear:", err);
+    }
+  }
+
+  async function saveTranscriptions() {
+    try {
+      // Get transcription text from backend or local store
+      let text: string;
+      try {
+        const data = await backendStore.apiGet<{ text: string }>("/api/transcriptions");
+        text = data.text || transcriptionStore.getPlainText();
+      } catch {
+        text = transcriptionStore.getPlainText();
+      }
+
+      if (!text.trim()) {
+        console.warn("No transcriptions to save");
+        return;
+      }
+
+      // Try Tauri dialog for native save, fall back to browser download
+      try {
+        const { save } = await import("@tauri-apps/plugin-dialog");
+        const filePath = await save({
+          defaultPath: "transcription.txt",
+          filters: [
+            { name: "Text Files", extensions: ["txt"] },
+            { name: "All Files", extensions: ["*"] },
+          ],
+        });
+        if (filePath) {
+          // Write via backend API
+          await backendStore.apiPost("/api/save-file", { path: filePath, text });
+        }
+      } catch {
+        // Fallback: browser-style download
+        const blob = new Blob([text], { type: "text/plain" });
+        const url = URL.createObjectURL(blob);
+        const a = document.createElement("a");
+        a.href = url;
+        a.download = "transcription.txt";
+        a.click();
+        URL.revokeObjectURL(url);
+      }
+    } catch (err) {
+      console.error("Failed to save:", err);
+    }
+  }
+</script>
+
+<div class="controls">
+  <button
+    class={isTranscribing ? "danger" : "primary"}
+    onclick={toggleTranscription}
+    disabled={!isReady || isLoading}
+  >
+    {#if isLoading}
+      ...
+    {:else if isTranscribing}
+      Stop Transcription
+    {:else}
+      Start Transcription
+    {/if}
+  </button>
+
+  <button onclick={clearTranscriptions} disabled={!backendStore.connected}>
+    Clear
+  </button>
+
+  <button onclick={saveTranscriptions} disabled={!backendStore.connected}>
+    Save
+  </button>
+</div>
+
+<style>
+  .controls {
+    display: flex;
+    align-items: center;
+    gap: 8px;
+    padding: 10px 20px;
+    background-color: var(--bg-secondary);
+    border-top: 1px solid var(--border-color);
+    flex-shrink: 0;
+  }
+</style>
--- a/src/lib/components/Header.svelte
+++ b/src/lib/components/Header.svelte
@@ -0,0 +1,82 @@
+<script lang="ts">
+  interface Props {
+    onSettingsClick: () => void;
+  }
+
+  let { onSettingsClick }: Props = $props();
+</script>
+
+<header class="app-header">
+  <h1 class="app-title">Local Transcription</h1>
+  <button class="settings-btn" onclick={onSettingsClick} title="Settings">
+    <svg
+      width="20"
+      height="20"
+      viewBox="0 0 24 24"
+      fill="none"
+      stroke="currentColor"
+      stroke-width="2"
+      stroke-linecap="round"
+      stroke-linejoin="round"
+    >
+      <circle cx="12" cy="12" r="3"></circle>
+      <path
+        d="M19.4 15a1.65 1.65 0 0 0 .33 1.82l.06.06a2 2 0 0 1
+        0 2.83 2 2 0 0 1-2.83 0l-.06-.06a1.65 1.65 0 0
+        0-1.82-.33 1.65 1.65 0 0 0-1 1.51V21a2 2 0 0 1-2
+        2 2 2 0 0 1-2-2v-.09A1.65 1.65 0 0 0 9 19.4a1.65
+        1.65 0 0 0-1.82.33l-.06.06a2 2 0 0 1-2.83 0 2 2
+        0 0 1 0-2.83l.06-.06A1.65 1.65 0 0 0 4.68
+        15a1.65 1.65 0 0 0-1.51-1H3a2 2 0 0 1-2-2 2 2 0
+        0 1 2-2h.09A1.65 1.65 0 0 0 4.6 9a1.65 1.65 0 0
+        0-.33-1.82l-.06-.06a2 2 0 0 1 0-2.83 2 2 0 0 1
+        2.83 0l.06.06A1.65 1.65 0 0 0 9 4.68a1.65 1.65 0
+        0 0 1-1.51V3a2 2 0 0 1 2-2 2 2 0 0 1 2
+        2v.09a1.65 1.65 0 0 0 1 1.51 1.65 1.65 0 0 0
+        1.82-.33l.06-.06a2 2 0 0 1 2.83 0 2 2 0 0 1 0
+        2.83l-.06.06a1.65 1.65 0 0 0-.33 1.82V9a1.65 1.65
+        0 0 0 1.51 1H21a2 2 0 0 1 2 2 2 2 0 0
+        1-2 2h-.09a1.65 1.65 0 0 0-1.51 1z"
+      ></path>
+    </svg>
+  </button>
+</header>
+
+<style>
+  .app-header {
+    display: flex;
+    align-items: center;
+    justify-content: space-between;
+    padding: 12px 20px;
+    background-color: var(--bg-secondary);
+    border-bottom: 1px solid var(--border-color);
+    flex-shrink: 0;
+  }
+
+  .app-title {
+    font-size: 24px;
+    font-weight: 700;
+    color: var(--text-primary);
+    letter-spacing: -0.5px;
+  }
+
+  .settings-btn {
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    width: 36px;
+    height: 36px;
+    padding: 0;
+    border: 1px solid var(--border-color);
+    border-radius: 8px;
+    background-color: transparent;
+    color: var(--text-secondary);
+    cursor: pointer;
+    transition: color 0.15s ease, background-color 0.15s ease;
+  }
+
+  .settings-btn:hover {
+    color: var(--text-primary);
+    background-color: var(--bg-tertiary);
+  }
+</style>
--- a/src/lib/components/Settings.svelte
+++ b/src/lib/components/Settings.svelte
@@ -0,0 +1,780 @@
+<script lang="ts">
+  import { configStore } from "$lib/stores/config";
+  import { backendStore } from "$lib/stores/backend";
+
+  interface Props {
+    onClose: () => void;
+  }
+
+  let { onClose }: Props = $props();
+
+  // Local copies of config values for editing
+  let userName = $state("");
+  let audioDevice = $state("default");
+  let model = $state("base.en");
+  let language = $state("en");
+  let computeDevice = $state("auto");
+  let computeType = $state("default");
+  let enableRealtime = $state(false);
+  let realtimeModel = $state("tiny.en");
+  let realtimeProcessingPause = $state(0.1);
+  let sileroSensitivity = $state(0.4);
+  let webrtcSensitivity = $state(3);
+  let postSpeechSilence = $state(0.3);
+  let minRecordingLength = $state(0.5);
+  let minGapBetween = $state(0);
+  let continuousMode = $state(false);
+  let showTimestamps = $state(true);
+  let fadeSeconds = $state(10);
+  let maxLines = $state(100);
+  let fontSize = $state(12);
+  let userColor = $state("#4CAF50");
+  let textColor = $state("#FFFFFF");
+  let backgroundColor = $state("#000000");
+  let syncEnabled = $state(false);
+  let syncUrl = $state("");
+  let syncRoom = $state("default");
+  let syncPassphrase = $state("");
+  let remoteMode = $state("local");
+  let remoteServerUrl = $state("");
+  let managedEmail = $state("");
+  let managedPassword = $state("");
+  let autoCheckUpdates = $state(true);
+
+  // Fetched device lists
+  let audioDevices = $state<{ id: string; name: string }[]>([]);
+  let computeDevices = $state<{ id: string; name: string }[]>([]);
+
+  // Model options
+  const modelOptions = [
+    "tiny",
+    "tiny.en",
+    "base",
+    "base.en",
+    "small",
+    "small.en",
+    "medium",
+    "medium.en",
+    "large-v1",
+    "large-v2",
+    "large-v3",
+  ];
+
+  const computeTypeOptions = [
+    { value: "default", label: "Default" },
+    { value: "int8", label: "int8 (Fastest)" },
+    { value: "float16", label: "float16 (GPU)" },
+    { value: "float32", label: "float32 (Best Quality)" },
+  ];
+
+  const webrtcOptions = [
+    { value: 0, label: "0 (Most Sensitive)" },
+    { value: 1, label: "1" },
+    { value: 2, label: "2" },
+    { value: 3, label: "3 (Least Sensitive)" },
+  ];
+
+  // Sync local copies from the config store (re-runs if configStore.config changes)
+  $effect(() => {
+    const cfg = configStore.config;
+    userName = cfg.user.name;
+    audioDevice = cfg.audio.input_device;
+    model = cfg.transcription.model;
+    language = cfg.transcription.language;
+    computeDevice = cfg.transcription.device;
+    computeType = cfg.transcription.compute_type;
+    enableRealtime = cfg.transcription.enable_realtime_transcription;
+    realtimeModel = cfg.transcription.realtime_model;
+    realtimeProcessingPause = cfg.transcription.realtime_processing_pause;
+    sileroSensitivity = cfg.transcription.silero_sensitivity;
+    webrtcSensitivity = cfg.transcription.webrtc_sensitivity;
+    postSpeechSilence = cfg.transcription.post_speech_silence_duration;
+    minRecordingLength = cfg.transcription.min_length_of_recording;
+    minGapBetween = cfg.transcription.min_gap_between_recordings;
+    continuousMode = cfg.transcription.continuous_mode;
+    showTimestamps = cfg.display.show_timestamps;
+    fadeSeconds = cfg.display.fade_after_seconds;
+    maxLines = cfg.display.max_lines;
+    fontSize = cfg.display.font_size;
+    userColor = cfg.display.user_color;
+    textColor = cfg.display.text_color;
+    // Strip alpha from background color for the color picker (only supports 6-char hex)
+    const bgHex = cfg.display.background_color.replace("#", "");
+    backgroundColor = "#" + bgHex.substring(0, 6);
+    syncEnabled = cfg.server_sync.enabled;
+    syncUrl = cfg.server_sync.url;
+    syncRoom = cfg.server_sync.room;
+    syncPassphrase = cfg.server_sync.passphrase;
+    remoteMode = cfg.remote.mode;
+    remoteServerUrl = cfg.remote.server_url;
+    autoCheckUpdates = cfg.updates.auto_check;
+  });
+
+  // Fetch audio devices and compute devices on mount
+  $effect(() => {
+    fetchAudioDevices();
+    fetchComputeDevices();
+  });
+
+  async function fetchAudioDevices() {
+    try {
+      const data = await backendStore.apiGet<{
+        devices: { id: string; name: string }[];
+      }>("/api/audio-devices");
+      audioDevices = data.devices ?? [];
+    } catch {
+      audioDevices = [];
+    }
+  }
+
+  async function fetchComputeDevices() {
+    try {
+      const data = await backendStore.apiGet<{
+        devices: { id: string; name: string }[];
+      }>("/api/compute-devices");
+      computeDevices = data.devices ?? [];
+    } catch {
+      computeDevices = [
+        { id: "auto", name: "Auto" },
+        { id: "cpu", name: "CPU" },
+        { id: "cuda", name: "CUDA (GPU)" },
+      ];
+    }
+  }
+
+  async function handleSave() {
+    const updates = {
+      user: {
+        name: userName,
+      },
+      audio: {
+        input_device: audioDevice,
+      },
+      transcription: {
+        model,
+        device: computeDevice,
+        language,
+        compute_type: computeType,
+        enable_realtime_transcription: enableRealtime,
+        realtime_model: realtimeModel,
+        realtime_processing_pause: realtimeProcessingPause,
+        silero_sensitivity: sileroSensitivity,
+        webrtc_sensitivity: webrtcSensitivity,
+        post_speech_silence_duration: postSpeechSilence,
+        min_length_of_recording: minRecordingLength,
+        min_gap_between_recordings: minGapBetween,
+        continuous_mode: continuousMode,
+      },
+      display: {
+        show_timestamps: showTimestamps,
+        fade_after_seconds: fadeSeconds,
+        max_lines: maxLines,
+        font_size: fontSize,
+        user_color: userColor,
+        text_color: textColor,
+        background_color: backgroundColor,
+      },
+      server_sync: {
+        enabled: syncEnabled,
+        url: syncUrl,
+        room: syncRoom,
+        passphrase: syncPassphrase,
+      },
+      remote: {
+        mode: remoteMode,
+        server_url: remoteServerUrl,
+      },
+      updates: {
+        auto_check: autoCheckUpdates,
+      },
+    };
+
+    try {
+      await configStore.saveConfig(updates);
+      onClose();
+    } catch (err) {
+      console.error("Failed to save settings:", err);
+    }
+  }
+
+  function handleCancel() {
+    onClose();
+  }
+
+  async function handleCheckUpdates() {
+    try {
+      await backendStore.apiPost("/api/check-updates");
+    } catch (err) {
+      console.error("Failed to check for updates:", err);
+    }
+  }
+
+  async function handleManagedLogin() {
+    try {
+      await backendStore.apiPost("/api/remote/login", {
+        email: managedEmail,
+        password: managedPassword,
+      });
+    } catch (err) {
+      console.error("Login failed:", err);
+    }
+  }
+
+  async function handleManagedRegister() {
+    try {
+      await backendStore.apiPost("/api/remote/register", {
+        email: managedEmail,
+        password: managedPassword,
+      });
+    } catch (err) {
+      console.error("Register failed:", err);
+    }
+  }
+
+  function handleOverlayClick(e: MouseEvent) {
+    if ((e.target as HTMLElement).classList.contains("settings-overlay")) {
+      handleCancel();
+    }
+  }
+
+  function handleKeydown(e: KeyboardEvent) {
+    if (e.key === "Escape") {
+      handleCancel();
+    }
+  }
+</script>
+
+<svelte:window onkeydown={handleKeydown} />
+
+<!-- svelte-ignore a11y_click_events_have_key_events a11y_no_static_element_interactions -->
+<div class="settings-overlay" role="presentation" onclick={handleOverlayClick}>
+  <div class="settings-panel">
+    <div class="settings-header">
+      <h2>Settings</h2>
+      <button class="close-btn" aria-label="Close settings" onclick={handleCancel}>
+        <svg
+          width="18"
+          height="18"
+          viewBox="0 0 24 24"
+          fill="none"
+          stroke="currentColor"
+          stroke-width="2"
+          stroke-linecap="round"
+          stroke-linejoin="round"
+        >
+          <line x1="18" y1="6" x2="6" y2="18"></line>
+          <line x1="6" y1="6" x2="18" y2="18"></line>
+        </svg>
+      </button>
+    </div>
+
+    <div class="settings-content">
+      <!-- User Settings -->
+      <section class="settings-section">
+        <h3>User Settings</h3>
+        <div class="field">
+          <label for="user-name">Display Name</label>
+          <input id="user-name" type="text" bind:value={userName} />
+        </div>
+      </section>
+
+      <!-- Audio Settings -->
+      <section class="settings-section">
+        <h3>Audio Settings</h3>
+        <div class="field">
+          <label for="audio-device">Audio Device</label>
+          <select id="audio-device" bind:value={audioDevice}>
+            <option value="default">Default</option>
+            {#each audioDevices as device}
+              <option value={device.id}>{device.name}</option>
+            {/each}
+          </select>
+        </div>
+      </section>
+
+      <!-- Transcription Settings -->
+      <section class="settings-section">
+        <h3>Transcription Settings</h3>
+        <div class="field">
+          <label for="model">Model</label>
+          <select id="model" bind:value={model}>
+            {#each modelOptions as opt}
+              <option value={opt}>{opt}</option>
+            {/each}
+          </select>
+        </div>
+        <div class="field">
+          <label for="language">Language</label>
+          <input id="language" type="text" bind:value={language} placeholder="en" />
+        </div>
+        <div class="field">
+          <label for="compute-device">Compute Device</label>
+          <select id="compute-device" bind:value={computeDevice}>
+            {#each computeDevices as dev}
+              <option value={dev.id}>{dev.name}</option>
+            {/each}
+          </select>
+        </div>
+        <div class="field">
+          <label for="compute-type">Compute Type</label>
+          <select id="compute-type" bind:value={computeType}>
+            {#each computeTypeOptions as opt}
+              <option value={opt.value}>{opt.label}</option>
+            {/each}
+          </select>
+        </div>
+      </section>
+
+      <!-- Realtime Preview -->
+      <section class="settings-section">
+        <h3>Realtime Preview</h3>
+        <div class="field-row">
+          <label for="enable-realtime">Enable Realtime Preview</label>
+          <input
+            id="enable-realtime"
+            type="checkbox"
+            bind:checked={enableRealtime}
+          />
+        </div>
+        {#if enableRealtime}
+          <div class="field">
+            <label for="realtime-model">Realtime Model</label>
+            <select id="realtime-model" bind:value={realtimeModel}>
+              {#each modelOptions as opt}
+                <option value={opt}>{opt}</option>
+              {/each}
+            </select>
+          </div>
+          <div class="field">
+            <label for="realtime-pause"
+              >Processing Pause: {realtimeProcessingPause.toFixed(2)}s</label
+            >
+            <input
+              id="realtime-pause"
+              type="range"
+              min="0.01"
+              max="1.0"
+              step="0.01"
+              bind:value={realtimeProcessingPause}
+            />
+          </div>
+        {/if}
+      </section>
+
+      <!-- VAD Settings -->
+      <section class="settings-section">
+        <h3>VAD Settings</h3>
+        <div class="field">
+          <label for="silero-sensitivity"
+            >Silero Sensitivity: {sileroSensitivity.toFixed(2)}</label
+          >
+          <input
+            id="silero-sensitivity"
+            type="range"
+            min="0.0"
+            max="1.0"
+            step="0.05"
+            bind:value={sileroSensitivity}
+          />
+        </div>
+        <div class="field">
+          <label for="webrtc-sensitivity">WebRTC Sensitivity</label>
+          <select id="webrtc-sensitivity" bind:value={webrtcSensitivity}>
+            {#each webrtcOptions as opt}
+              <option value={opt.value}>{opt.label}</option>
+            {/each}
+          </select>
+        </div>
+      </section>
+
+      <!-- Timing -->
+      <section class="settings-section">
+        <h3>Timing</h3>
+        <div class="field">
+          <label for="post-speech-silence"
+            >Post-Speech Silence: {postSpeechSilence.toFixed(2)}s</label
+          >
+          <input
+            id="post-speech-silence"
+            type="range"
+            min="0.1"
+            max="3.0"
+            step="0.1"
+            bind:value={postSpeechSilence}
+          />
+        </div>
+        <div class="field">
+          <label for="min-recording"
+            >Min Recording Length: {minRecordingLength.toFixed(2)}s</label
+          >
+          <input
+            id="min-recording"
+            type="range"
+            min="0.1"
+            max="5.0"
+            step="0.1"
+            bind:value={minRecordingLength}
+          />
+        </div>
+        <div class="field">
+          <label for="min-gap"
+            >Min Gap Between Recordings: {minGapBetween.toFixed(2)}s</label
+          >
+          <input
+            id="min-gap"
+            type="range"
+            min="0"
+            max="3.0"
+            step="0.1"
+            bind:value={minGapBetween}
+          />
+        </div>
+        <div class="field-row">
+          <label for="continuous-mode">Continuous Mode</label>
+          <input
+            id="continuous-mode"
+            type="checkbox"
+            bind:checked={continuousMode}
+          />
+        </div>
+      </section>
+
+      <!-- Display Settings -->
+      <section class="settings-section">
+        <h3>Display Settings</h3>
+        <div class="field-row">
+          <label for="show-timestamps">Show Timestamps</label>
+          <input
+            id="show-timestamps"
+            type="checkbox"
+            bind:checked={showTimestamps}
+          />
+        </div>
+        <div class="field">
+          <label for="fade-seconds"
+            >Fade After Seconds: {fadeSeconds} (0 = never)</label
+          >
+          <input
+            id="fade-seconds"
+            type="range"
+            min="0"
+            max="60"
+            step="1"
+            bind:value={fadeSeconds}
+          />
+        </div>
+        <div class="field">
+          <label for="max-lines">Max Lines: {maxLines}</label>
+          <input
+            id="max-lines"
+            type="range"
+            min="10"
+            max="500"
+            step="10"
+            bind:value={maxLines}
+          />
+        </div>
+        <div class="field">
+          <label for="font-size">Font Size: {fontSize}px</label>
+          <input
+            id="font-size"
+            type="range"
+            min="8"
+            max="32"
+            step="1"
+            bind:value={fontSize}
+          />
+        </div>
+      </section>
+
+      <!-- Color Settings -->
+      <section class="settings-section">
+        <h3>Color Settings</h3>
+        <div class="field-row">
+          <label for="user-color">User Color</label>
+          <input id="user-color" type="color" bind:value={userColor} />
+        </div>
+        <div class="field-row">
+          <label for="text-color">Text Color</label>
+          <input id="text-color" type="color" bind:value={textColor} />
+        </div>
+        <div class="field-row">
+          <label for="bg-color">Background Color</label>
+          <input id="bg-color" type="color" bind:value={backgroundColor} />
+        </div>
+      </section>
+
+      <!-- Server Sync -->
+      <section class="settings-section">
+        <h3>Server Sync</h3>
+        <div class="field-row">
+          <label for="sync-enabled">Enable Server Sync</label>
+          <input
+            id="sync-enabled"
+            type="checkbox"
+            bind:checked={syncEnabled}
+          />
+        </div>
+        {#if syncEnabled}
+          <div class="field">
+            <label for="sync-url">Server URL</label>
+            <input
+              id="sync-url"
+              type="url"
+              bind:value={syncUrl}
+              placeholder="http://localhost:3000/api/send"
+            />
+          </div>
+          <div class="field">
+            <label for="sync-room">Room</label>
+            <input id="sync-room" type="text" bind:value={syncRoom} />
+          </div>
+          <div class="field">
+            <label for="sync-passphrase">Passphrase</label>
+            <input
+              id="sync-passphrase"
+              type="password"
+              bind:value={syncPassphrase}
+            />
+          </div>
+        {/if}
+      </section>
+
+      <!-- Remote Transcription -->
+      <section class="settings-section">
+        <h3>Remote Transcription</h3>
+        <div class="radio-group">
+          <label>
+            <input
+              type="radio"
+              name="remote-mode"
+              value="local"
+              bind:group={remoteMode}
+            />
+            Local
+          </label>
+          <label>
+            <input
+              type="radio"
+              name="remote-mode"
+              value="managed"
+              bind:group={remoteMode}
+            />
+            Managed
+          </label>
+          <label>
+            <input
+              type="radio"
+              name="remote-mode"
+              value="byok"
+              bind:group={remoteMode}
+            />
+            BYOK (Bring Your Own Key)
+          </label>
+        </div>
+        {#if remoteMode !== "local"}
+          <div class="field">
+            <label for="remote-url">Server URL</label>
+            <input
+              id="remote-url"
+              type="url"
+              bind:value={remoteServerUrl}
+              placeholder="wss://your-proxy.com"
+            />
+          </div>
+        {/if}
+        {#if remoteMode === "managed"}
+          <div class="managed-auth">
+            <div class="field">
+              <label for="managed-email">Email</label>
+              <input
+                id="managed-email"
+                type="email"
+                bind:value={managedEmail}
+                placeholder="email@example.com"
+              />
+            </div>
+            <div class="field">
+              <label for="managed-password">Password</label>
+              <input
+                id="managed-password"
+                type="password"
+                bind:value={managedPassword}
+              />
+            </div>
+            <div class="auth-buttons">
+              <button onclick={handleManagedLogin}>Login</button>
+              <button onclick={handleManagedRegister}>Register</button>
+            </div>
+          </div>
+        {/if}
+      </section>
+
+      <!-- Updates -->
+      <section class="settings-section">
+        <h3>Updates</h3>
+        <div class="field-row">
+          <label for="auto-check-updates">Auto-Check for Updates</label>
+          <input
+            id="auto-check-updates"
+            type="checkbox"
+            bind:checked={autoCheckUpdates}
+          />
+        </div>
+        <button onclick={handleCheckUpdates}>Check Now</button>
+      </section>
+    </div>
+
+    <div class="settings-footer">
+      <button onclick={handleCancel}>Cancel</button>
+      <button class="primary" onclick={handleSave}>Save</button>
+    </div>
+  </div>
+</div>
+
+<style>
+  .settings-overlay {
+    position: fixed;
+    top: 0;
+    left: 0;
+    right: 0;
+    bottom: 0;
+    background-color: rgba(0, 0, 0, 0.6);
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    z-index: 1000;
+  }
+
+  .settings-panel {
+    background-color: var(--bg-primary);
+    border: 1px solid var(--border-color);
+    border-radius: 12px;
+    width: 560px;
+    max-width: 95vw;
+    max-height: 85vh;
+    display: flex;
+    flex-direction: column;
+    box-shadow: 0 8px 32px rgba(0, 0, 0, 0.5);
+  }
+
+  .settings-header {
+    display: flex;
+    align-items: center;
+    justify-content: space-between;
+    padding: 16px 20px;
+    border-bottom: 1px solid var(--border-color);
+    flex-shrink: 0;
+  }
+
+  .settings-header h2 {
+    font-size: 18px;
+    font-weight: 600;
+    color: var(--text-primary);
+  }
+
+  .close-btn {
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    width: 32px;
+    height: 32px;
+    padding: 0;
+    border: none;
+    border-radius: 6px;
+    background-color: transparent;
+    color: var(--text-secondary);
+    cursor: pointer;
+  }
+
+  .close-btn:hover {
+    background-color: var(--bg-tertiary);
+    color: var(--text-primary);
+  }
+
+  .settings-content {
+    flex: 1;
+    overflow-y: auto;
+    padding: 16px 20px;
+  }
+
+  .settings-section {
+    margin-bottom: 24px;
+  }
+
+  .settings-section:last-child {
+    margin-bottom: 0;
+  }
+
+  .settings-section h3 {
+    font-size: 14px;
+    font-weight: 600;
+    color: var(--accent-blue);
+    text-transform: uppercase;
+    letter-spacing: 0.5px;
+    margin-bottom: 12px;
+    padding-bottom: 6px;
+    border-bottom: 1px solid var(--border-color);
+  }
+
+  .field {
+    margin-bottom: 12px;
+  }
+
+  .field label {
+    display: block;
+    margin-bottom: 4px;
+    font-size: 12px;
+    color: var(--text-secondary);
+  }
+
+  .field-row {
+    display: flex;
+    align-items: center;
+    justify-content: space-between;
+    margin-bottom: 12px;
+  }
+
+  .field-row label {
+    font-size: 13px;
+    color: var(--text-primary);
+  }
+
+  .radio-group {
+    display: flex;
+    flex-direction: column;
+    gap: 8px;
+    margin-bottom: 12px;
+  }
+
+  .radio-group label {
+    display: flex;
+    align-items: center;
+    gap: 8px;
+    font-size: 13px;
+    color: var(--text-primary);
+    cursor: pointer;
+  }
+
+  .managed-auth {
+    margin-top: 8px;
+    padding: 12px;
+    background-color: var(--bg-secondary);
+    border-radius: 8px;
+  }
+
+  .auth-buttons {
+    display: flex;
+    gap: 8px;
+    margin-top: 8px;
+  }
+
+  .settings-footer {
+    display: flex;
+    justify-content: flex-end;
+    gap: 8px;
+    padding: 16px 20px;
+    border-top: 1px solid var(--border-color);
+    flex-shrink: 0;
+  }
+</style>
--- a/src/lib/components/SidecarSetup.svelte
+++ b/src/lib/components/SidecarSetup.svelte
@@ -0,0 +1,384 @@
+<script lang="ts">
+  import { invoke } from "@tauri-apps/api/core";
+  import { listen } from "@tauri-apps/api/event";
+  import { onMount } from "svelte";
+
+  interface Props {
+    onComplete: () => void;
+  }
+
+  let { onComplete }: Props = $props();
+
+  type SetupState = "choose" | "downloading" | "error" | "success";
+
+  let setupState = $state<SetupState>("choose");
+  let variant = $state<"cpu" | "cuda">("cpu");
+  let progress = $state(0);
+  let progressMessage = $state("");
+  let errorMessage = $state("");
+
+  let unlisten: (() => void) | null = null;
+
+  onMount(() => {
+    return () => {
+      if (unlisten) {
+        unlisten();
+        unlisten = null;
+      }
+    };
+  });
+
+  async function startDownload() {
+    setupState = "downloading";
+    progress = 0;
+    progressMessage = "Starting download...";
+    errorMessage = "";
+
+    try {
+      // Listen for progress events from the Tauri backend
+      unlisten = await listen<{ progress: number; message: string }>(
+        "sidecar-download-progress",
+        (event) => {
+          progress = event.payload.progress;
+          progressMessage = event.payload.message;
+        }
+      );
+
+      await invoke("download_sidecar", { variant });
+
+      // Download complete
+      setupState = "success";
+      if (unlisten) {
+        unlisten();
+        unlisten = null;
+      }
+
+      // Brief pause to show success, then proceed
+      setTimeout(() => {
+        onComplete();
+      }, 1500);
+    } catch (err) {
+      setupState = "error";
+      errorMessage = err instanceof Error ? err.message : String(err);
+      if (unlisten) {
+        unlisten();
+        unlisten = null;
+      }
+    }
+  }
+
+  function retry() {
+    setupState = "choose";
+    progress = 0;
+    progressMessage = "";
+    errorMessage = "";
+  }
+</script>
+
+<div class="setup-overlay">
+  <div class="setup-card">
+    <div class="setup-header">
+      <h1 class="app-title">Local Transcription</h1>
+      <h2 class="setup-heading">First-Time Setup</h2>
+    </div>
+
+    {#if setupState === "choose"}
+      <p class="setup-description">
+        The app needs to download its transcription engine before you can start.
+        Choose the version that best fits your hardware.
+      </p>
+
+      <div class="variant-options">
+        <label class="variant-option" class:selected={variant === "cpu"}>
+          <input
+            type="radio"
+            name="variant"
+            value="cpu"
+            bind:group={variant}
+          />
+          <div class="variant-info">
+            <span class="variant-name">Standard (CPU)</span>
+            <span class="variant-desc">Works on all computers (~500 MB download)</span>
+          </div>
+        </label>
+
+        <label class="variant-option" class:selected={variant === "cuda"}>
+          <input
+            type="radio"
+            name="variant"
+            value="cuda"
+            bind:group={variant}
+          />
+          <div class="variant-info">
+            <span class="variant-name">GPU Accelerated (CUDA)</span>
+            <span class="variant-desc">Faster transcription with NVIDIA GPU (~2 GB download)</span>
+          </div>
+        </label>
+      </div>
+
+      <button class="download-btn" onclick={startDownload}>
+        Download & Install
+      </button>
+
+    {:else if setupState === "downloading"}
+      <div class="progress-section">
+        <p class="progress-message">{progressMessage}</p>
+        <div class="progress-bar-track">
+          <div
+            class="progress-bar-fill"
+            style="width: {progress}%"
+          ></div>
+        </div>
+        <p class="progress-percent">{Math.round(progress)}%</p>
+      </div>
+
+    {:else if setupState === "error"}
+      <div class="error-section">
+        <div class="error-icon">
+          <svg width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="#f44336" stroke-width="2">
+            <circle cx="12" cy="12" r="10"/>
+            <line x1="15" y1="9" x2="9" y2="15"/>
+            <line x1="9" y1="9" x2="15" y2="15"/>
+          </svg>
+        </div>
+        <p class="error-title">Download Failed</p>
+        <p class="error-message">{errorMessage}</p>
+        <button class="retry-btn" onclick={retry}>
+          Try Again
+        </button>
+      </div>
+
+    {:else if setupState === "success"}
+      <div class="success-section">
+        <div class="success-icon">
+          <svg width="48" height="48" viewBox="0 0 24 24" fill="none" stroke="#4CAF50" stroke-width="2">
+            <circle cx="12" cy="12" r="10"/>
+            <polyline points="16 9 10.5 15 8 12.5"/>
+          </svg>
+        </div>
+        <p class="success-title">Setup Complete</p>
+        <p class="success-message">The transcription engine is ready to go.</p>
+      </div>
+    {/if}
+  </div>
+</div>
+
+<style>
+  .setup-overlay {
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    height: 100%;
+    width: 100%;
+    background-color: #1e1e1e;
+  }
+
+  .setup-card {
+    background-color: #2a2a2a;
+    border-radius: 12px;
+    padding: 40px;
+    max-width: 480px;
+    width: 100%;
+    margin: 20px;
+    box-shadow: 0 8px 32px rgba(0, 0, 0, 0.4);
+  }
+
+  .setup-header {
+    text-align: center;
+    margin-bottom: 24px;
+  }
+
+  .app-title {
+    font-size: 24px;
+    font-weight: 700;
+    color: #e0e0e0;
+    margin-bottom: 4px;
+  }
+
+  .setup-heading {
+    font-size: 16px;
+    font-weight: 500;
+    color: #a0a0a0;
+  }
+
+  .setup-description {
+    font-size: 14px;
+    color: #a0a0a0;
+    line-height: 1.6;
+    text-align: center;
+    margin-bottom: 24px;
+  }
+
+  .variant-options {
+    display: flex;
+    flex-direction: column;
+    gap: 12px;
+    margin-bottom: 24px;
+  }
+
+  .variant-option {
+    display: flex;
+    align-items: center;
+    gap: 12px;
+    padding: 14px 16px;
+    border: 2px solid #444;
+    border-radius: 8px;
+    cursor: pointer;
+    transition: border-color 0.15s ease, background-color 0.15s ease;
+  }
+
+  .variant-option:hover {
+    background-color: #333;
+    border-color: #555;
+  }
+
+  .variant-option.selected {
+    border-color: #4CAF50;
+    background-color: rgba(76, 175, 80, 0.08);
+  }
+
+  .variant-option input[type="radio"] {
+    width: 18px;
+    height: 18px;
+    flex-shrink: 0;
+  }
+
+  .variant-info {
+    display: flex;
+    flex-direction: column;
+    gap: 2px;
+  }
+
+  .variant-name {
+    font-size: 14px;
+    font-weight: 600;
+    color: #e0e0e0;
+  }
+
+  .variant-desc {
+    font-size: 12px;
+    color: #888;
+  }
+
+  .download-btn {
+    display: block;
+    width: 100%;
+    padding: 12px 24px;
+    font-size: 15px;
+    font-weight: 600;
+    color: white;
+    background-color: #4CAF50;
+    border: none;
+    border-radius: 8px;
+    cursor: pointer;
+    transition: background-color 0.15s ease;
+  }
+
+  .download-btn:hover {
+    background-color: #45a049;
+  }
+
+  .download-btn:active {
+    transform: scale(0.98);
+  }
+
+  /* Progress state */
+  .progress-section {
+    text-align: center;
+    padding: 20px 0;
+  }
+
+  .progress-message {
+    font-size: 14px;
+    color: #a0a0a0;
+    margin-bottom: 16px;
+  }
+
+  .progress-bar-track {
+    width: 100%;
+    height: 8px;
+    background-color: #3a3a3a;
+    border-radius: 4px;
+    overflow: hidden;
+    margin-bottom: 8px;
+  }
+
+  .progress-bar-fill {
+    height: 100%;
+    background-color: #4CAF50;
+    border-radius: 4px;
+    transition: width 0.3s ease;
+  }
+
+  .progress-percent {
+    font-size: 13px;
+    color: #707070;
+  }
+
+  /* Error state */
+  .error-section {
+    text-align: center;
+    padding: 10px 0;
+  }
+
+  .error-icon {
+    display: flex;
+    justify-content: center;
+    margin-bottom: 12px;
+  }
+
+  .error-title {
+    font-size: 18px;
+    font-weight: 600;
+    color: #f44336;
+    margin-bottom: 8px;
+  }
+
+  .error-message {
+    font-size: 13px;
+    color: #a0a0a0;
+    margin-bottom: 20px;
+    word-break: break-word;
+  }
+
+  .retry-btn {
+    display: inline-block;
+    padding: 10px 28px;
+    font-size: 14px;
+    font-weight: 600;
+    color: white;
+    background-color: #4CAF50;
+    border: none;
+    border-radius: 8px;
+    cursor: pointer;
+    transition: background-color 0.15s ease;
+  }
+
+  .retry-btn:hover {
+    background-color: #45a049;
+  }
+
+  /* Success state */
+  .success-section {
+    text-align: center;
+    padding: 20px 0;
+  }
+
+  .success-icon {
+    display: flex;
+    justify-content: center;
+    margin-bottom: 12px;
+  }
+
+  .success-title {
+    font-size: 18px;
+    font-weight: 600;
+    color: #4CAF50;
+    margin-bottom: 4px;
+  }
+
+  .success-message {
+    font-size: 14px;
+    color: #a0a0a0;
+  }
+</style>
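Reviewer note on SidecarSetup.svelte: the component moves between four setup states (choose, downloading, error, success). The sketch below restates those transitions as a pure function so they can be checked in isolation. It is an illustration only, not code from this patch: the `SetupEvent` names and `nextSetupState` are hypothetical, and the real component sets `setupState` directly inside `startDownload()` and `retry()`.

```typescript
// Hypothetical restatement of the SidecarSetup transitions; not part of the diff.
type SetupState = "choose" | "downloading" | "error" | "success";

// Assumed event names, introduced here for illustration only.
type SetupEvent = "start" | "finish" | "fail" | "retry";

function nextSetupState(current: SetupState, event: SetupEvent): SetupState {
  // "Download & Install" clicked on the variant chooser
  if (current === "choose" && event === "start") return "downloading";
  // invoke("download_sidecar") resolved
  if (current === "downloading" && event === "finish") return "success";
  // invoke("download_sidecar") rejected
  if (current === "downloading" && event === "fail") return "error";
  // "Try Again" clicked: back to the variant chooser
  if (current === "error" && event === "retry") return "choose";
  // Any other combination leaves the state unchanged
  return current;
}
```

In this reading, "success" is terminal: the component stays there until the delayed `onComplete()` call hands control back to the parent.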
--- a/src/lib/components/StatusBar.svelte
+++ b/src/lib/components/StatusBar.svelte
@@ -0,0 +1,106 @@
+<script lang="ts">
+  import { backendStore } from "$lib/stores/backend";
+  import { configStore } from "$lib/stores/config";
+
+  let statusColor = $derived.by(() => {
+    switch (backendStore.appState) {
+      case "initializing":
+        return "#ff9800";
+      case "ready":
+        return "#4caf50";
+      case "transcribing":
+        return "#f44336";
+      case "error":
+        return "#f44336";
+      default:
+        return "#888";
+    }
+  });
+
+  let isPulsing = $derived(backendStore.appState === "transcribing");
+  let userName = $derived(configStore.config.user.name);
+</script>
+
+<div class="status-bar">
+  <div class="status-left">
+    <span
+      class="status-indicator"
+      class:pulsing={isPulsing}
+      style="background-color: {statusColor}"
+    ></span>
+    <span class="state-message">{backendStore.stateMessage}</span>
+  </div>
+  <div class="status-right">
+    {#if backendStore.deviceInfo}
+      <span class="device-info">{backendStore.deviceInfo}</span>
+      <span class="separator">|</span>
+    {/if}
+    <span class="user-name">{userName}</span>
+  </div>
+</div>
+
+<style>
+  .status-bar {
+    display: flex;
+    align-items: center;
+    justify-content: space-between;
+    padding: 6px 20px;
+    background-color: var(--bg-secondary);
+    border-bottom: 1px solid var(--border-color);
+    font-size: 12px;
+    flex-shrink: 0;
+  }
+
+  .status-left {
+    display: flex;
+    align-items: center;
+    gap: 8px;
+  }
+
+  .status-right {
+    display: flex;
+    align-items: center;
+    gap: 8px;
+    color: var(--text-secondary);
+  }
+
+  .status-indicator {
+    width: 10px;
+    height: 10px;
+    border-radius: 50%;
+    flex-shrink: 0;
+  }
+
+  .status-indicator.pulsing {
+    animation: pulse 1.5s ease-in-out infinite;
+  }
+
+  @keyframes pulse {
+    0%,
+    100% {
+      opacity: 1;
+      box-shadow: 0 0 0 0 rgba(244, 67, 54, 0.4);
+    }
+    50% {
+      opacity: 0.7;
+      box-shadow: 0 0 0 6px rgba(244, 67, 54, 0);
+    }
+  }
+
+  .state-message {
+    color: var(--text-primary);
+  }
+
+  .device-info {
+    color: var(--text-secondary);
+  }
+
+  .separator {
+    color: var(--text-muted);
+  }
+
+  .user-name {
+    color: var(--accent-green);
+    font-weight: 500;
+  }
+</style>
--- a/src/lib/components/TranscriptionDisplay.svelte
+++ b/src/lib/components/TranscriptionDisplay.svelte
@@ -0,0 +1,110 @@
+<script lang="ts">
+  import { transcriptionStore } from "$lib/stores/transcriptions";
+  import { configStore } from "$lib/stores/config";
+
+  let container: HTMLDivElement | undefined = $state();
+  let showTimestamps = $derived(configStore.config.display.show_timestamps);
+  let items = $derived(transcriptionStore.items);
+
+  $effect(() => {
+    // Read items.length so this effect re-runs (and auto-scrolls) whenever items change
+    void items.length;
+    if (container) {
+      requestAnimationFrame(() => {
+        if (container) {
+          container.scrollTop = container.scrollHeight;
+        }
+      });
+    }
+  });
+</script>
+
+<div class="transcription-display" bind:this={container}>
+  {#each items as item (item.id)}
+    <div class="transcription-item" class:preview={item.isPreview}>
+      {#if showTimestamps && item.timestamp}
+        <span class="timestamp">[{item.timestamp}]</span>
+      {/if}
+      {#if item.userName}
+        <span class="user-name">{item.userName}:</span>
+      {/if}
+      {#if item.isPreview}
+        <span class="preview-indicator">[...]</span>
+      {/if}
+      <span class="text">{item.text}</span>
+    </div>
+  {:else}
+    <div class="empty-state">
+      Transcriptions will appear here...
+    </div>
+  {/each}
+</div>
+
+<style>
+  .transcription-display {
+    flex: 1;
+    overflow-y: auto;
+    padding: 12px 20px;
+    display: flex;
+    flex-direction: column;
+    gap: 6px;
+  }
+
+  .transcription-item {
+    padding: 6px 10px;
+    border-radius: 4px;
+    background-color: rgba(255, 255, 255, 0.03);
+    animation: fadeIn 0.2s ease-out;
+    line-height: 1.6;
+    word-wrap: break-word;
+  }
+
+  .transcription-item.preview {
+    font-style: italic;
+    opacity: 0.7;
+  }
+
+  .timestamp {
+    color: #888;
+    font-size: 0.85em;
+    margin-right: 8px;
+    font-family: monospace;
+  }
+
+  .user-name {
+    color: #4caf50;
+    font-weight: 700;
+    margin-right: 6px;
+  }
+
+  .preview-indicator {
+    color: #888;
+    font-size: 0.85em;
+    margin-right: 4px;
+  }
+
+  .text {
+    color: #ffffff;
+  }
+
+  .empty-state {
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    height: 100%;
+    color: var(--text-muted);
+    font-size: 15px;
+    font-style: italic;
+  }
+
+  @keyframes fadeIn {
+    from {
+      opacity: 0;
+      transform: translateY(4px);
+    }
+    to {
+      opacity: 1;
+      transform: translateY(0);
+    }
+  }
+</style>
--- a/src/lib/stores/backend.ts
+++ b/src/lib/stores/backend.ts
@@ -0,0 +1,266 @@
+/**
+ * Backend store - manages WebSocket connection and REST API communication
+ * with the Python backend server running on localhost.
+ *
+ * The backend port defaults to 8081 but can be updated at runtime via
+ * `setPort()`. The WebSocket connects to /ws/control for real-time push
+ * of transcriptions, previews, and state changes.
+ */
+
+export type ConnectionState = "connecting" | "connected" | "disconnected" | "error";
+export type AppState = "initializing" | "ready" | "transcribing" | "reloading" | "error";
+
+interface BackendState {
+  port: number;
+  connectionState: ConnectionState;
+  appState: AppState;
+  stateMessage: string;
+  deviceInfo: string;
+  wsConnection: WebSocket | null;
+  version: string;
+  lastError: string;
+}
+
+let state = $state<BackendState>({
+  port: 8081,
+  connectionState: "disconnected",
+  appState: "initializing",
+  stateMessage: "Connecting to backend...",
+  deviceInfo: "",
+  wsConnection: null,
+  version: "1.4.0",
+  lastError: "",
+});
+
+let reconnectTimer: ReturnType<typeof setTimeout> | null = null;
+let reconnectAttempts = 0;
+const MAX_RECONNECT_DELAY_MS = 30_000;
+const BASE_RECONNECT_DELAY_MS = 1_000;
+
+// ── URL helpers ──────────────────────────────────────────────────────
+
+function apiUrl(path: string): string {
+  const normalised = path.startsWith("/") ? path : `/${path}`;
+  return `http://localhost:${state.port}${normalised}`;
+}
+
+async function apiFetch(path: string, options?: RequestInit): Promise<Response> {
+  const url = apiUrl(path);
+  const method = options?.method?.toUpperCase() ?? "GET";
+  const headers = new Headers(options?.headers);
+  if (method !== "GET" && !headers.has("Content-Type")) {
+    headers.set("Content-Type", "application/json");
+  }
+  return fetch(url, { ...options, headers });
+}
+
+// ── WebSocket management ─────────────────────────────────────────────
+
+function connectWebSocket() {
+  // Tear down any existing connection
+  disconnect();
+
+  state.connectionState = "connecting";
+  reconnectAttempts = 0;
+
+  _openSocket();
+}
+
+function _openSocket() {
+  const wsUrl = `ws://localhost:${state.port}/ws/control`;
+
+  try {
+    const ws = new WebSocket(wsUrl);
+
+    ws.onopen = () => {
+      state.connectionState = "connected";
+      state.lastError = "";
+      reconnectAttempts = 0;
+      if (reconnectTimer) {
+        clearTimeout(reconnectTimer);
+        reconnectTimer = null;
+      }
+    };
+
+    ws.onmessage = (event) => {
+      try {
+        const data = JSON.parse(event.data);
+        handleWebSocketMessage(data);
+      } catch {
+        // Ignore malformed frames; this also swallows errors thrown by handleWebSocketMessage
+      }
+    };
+
+    ws.onclose = () => {
+      state.wsConnection = null;
+      if (state.connectionState !== "disconnected") {
+        state.connectionState = "error";
+        state.stateMessage = "Disconnected from backend";
+        _scheduleReconnect();
+      }
+    };
+
+    ws.onerror = () => {
+      state.lastError = "WebSocket error";
+      // onclose fires after this, which handles reconnect
+    };
+
+    state.wsConnection = ws;
+  } catch {
+    state.connectionState = "error";
+    state.stateMessage = "Failed to connect";
+    _scheduleReconnect();
+  }
+}
+
+function _scheduleReconnect() {
+  if (reconnectTimer) return;
+
+  const delay = Math.min(
+    BASE_RECONNECT_DELAY_MS * Math.pow(2, reconnectAttempts),
+    MAX_RECONNECT_DELAY_MS,
+  );
+  reconnectAttempts++;
+
+  reconnectTimer = setTimeout(() => {
+    reconnectTimer = null;
+    if (state.connectionState !== "disconnected") {
+      state.connectionState = "connecting";
+      _openSocket();
+    }
+  }, delay);
+}
+
+function disconnect() {
+  if (reconnectTimer) {
+    clearTimeout(reconnectTimer);
+    reconnectTimer = null;
+  }
+  state.connectionState = "disconnected";
+  if (state.wsConnection) {
+    const ws = state.wsConnection;
+    ws.onclose = null;
+    ws.onerror = null;
+    ws.close();
+    state.wsConnection = null;
+  }
+}
+
+// ── WebSocket message handling ───────────────────────────────────────
+
+function handleWebSocketMessage(data: Record<string, unknown>) {
+  // Handle state changes locally
+  if (data.type === "state_changed") {
+    if (data.state) {
+      state.appState = data.state as AppState;
+    }
+    if (data.message) {
+      state.stateMessage = data.message as string;
+    }
+  }
+
+  if (data.type === "error") {
+    state.lastError = (data.message as string) ?? "Unknown error";
+  }
+
+  // Dispatch to window for other stores (transcriptions, etc.)
+  if (data.type === "transcription") {
+    window.dispatchEvent(
+      new CustomEvent("backend:transcription", { detail: data })
+    );
+  } else if (data.type === "preview") {
+    window.dispatchEvent(
+      new CustomEvent("backend:preview", { detail: data })
+    );
+  } else if (data.type === "credits_low") {
+    window.dispatchEvent(
+      new CustomEvent("backend:credits_low", { detail: data })
+    );
+  }
+}
+
+// ── Port management ──────────────────────────────────────────────────
+
+function setPort(newPort: number) {
+  if (newPort === state.port) return;
+  state.port = newPort;
+  // Reconnect with new port if we had a connection
+  if (state.connectionState !== "disconnected") {
+    connectWebSocket();
+  }
+}
+
+// ── Typed REST helpers ───────────────────────────────────────────────
+
+async function apiGet<T = unknown>(path: string): Promise<T> {
+  const resp = await apiFetch(path);
+  if (!resp.ok) throw new Error(`GET ${path} failed: ${resp.status}`);
+  return resp.json();
+}
+
+async function apiPost<T = unknown>(
+  path: string,
+  body?: unknown
+): Promise<T> {
+  const resp = await apiFetch(path, {
+    method: "POST",
+    body: body !== undefined ? JSON.stringify(body) : undefined,
+  });
+  if (!resp.ok) throw new Error(`POST ${path} failed: ${resp.status}`);
+  return resp.json();
+}
+
+async function apiPut<T = unknown>(
+  path: string,
+  body?: unknown
+): Promise<T> {
+  const resp = await apiFetch(path, {
+    method: "PUT",
+    body: body !== undefined ? JSON.stringify(body) : undefined,
+  });
+  if (!resp.ok) throw new Error(`PUT ${path} failed: ${resp.status}`);
+  return resp.json();
+}
+
+// ── Public API ───────────────────────────────────────────────────────
+
+export const backendStore = {
+  get port() {
+    return state.port;
+  },
+  get connectionState() {
+    return state.connectionState;
+  },
+  get connected() {
+    return state.connectionState === "connected";
+  },
+  get appState() {
+    return state.appState;
+  },
+  get stateMessage() {
+    return state.stateMessage;
+  },
+  get deviceInfo() {
+    return state.deviceInfo;
+  },
+  get version() {
+    return state.version;
+  },
+  get lastError() {
+    return state.lastError;
+  },
+  get apiBaseUrl() {
+    return `http://localhost:${state.port}`;
+  },
+  get wsUrl() {
+    return `ws://localhost:${state.port}/ws/control`;
+  },
+  setPort,
+  connect: connectWebSocket,
+  disconnect,
+  apiUrl,
+  apiFetch,
+  apiGet,
+  apiPost,
+  apiPut,
+};
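Reviewer note on backend.ts: `_scheduleReconnect()` uses exponential backoff with a cap. The sketch below isolates that arithmetic, using the same base and maximum as the patch, to show the resulting delay schedule. `reconnectDelayMs` is a hypothetical helper written for this note; the patch computes the delay inline.

```typescript
// Hypothetical helper mirroring the delay arithmetic in _scheduleReconnect(); not part of the diff.
const BASE_RECONNECT_DELAY_MS = 1000; // same value as the patch (written there as 1_000)
const MAX_RECONNECT_DELAY_MS = 30000; // same value as the patch (written there as 30_000)

function reconnectDelayMs(attempt: number): number {
  // Double the delay on every failed attempt, but never exceed the cap.
  return Math.min(
    BASE_RECONNECT_DELAY_MS * Math.pow(2, attempt),
    MAX_RECONNECT_DELAY_MS,
  );
}

// Delays for the first seven attempts: 1000, 2000, 4000, 8000, 16000, 30000, 30000
const schedule: number[] = [0, 1, 2, 3, 4, 5, 6].map(reconnectDelayMs);
console.log(schedule.join(", "));
```

Because the cap is applied with `Math.min`, the delay stays at 30 seconds however large the attempt counter grows. The counter resets to 0 in `onopen` and in `connectWebSocket()`.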
--- a/src/lib/stores/config.ts
+++ b/src/lib/stores/config.ts
@@ -0,0 +1,243 @@
+/**
+ * Config store - manages application configuration loaded from
+ * and saved to the Python backend via the backend store's API helpers.
+ *
+ * The backend accepts PUT /api/config with `{ settings: { "dot.key": value } }`.
+ */
+
+import { backendStore } from "$lib/stores/backend";
+
+export interface AppConfig {
+  user: {
+    name: string;
+    id: string;
+  };
+  audio: {
+    input_device: string;
+    sample_rate: number;
+  };
+  transcription: {
+    model: string;
+    device: string;
+    language: string;
+    compute_type: string;
+    enable_realtime_transcription: boolean;
+    realtime_model: string;
+    realtime_processing_pause: number;
+    silero_sensitivity: number;
+    silero_use_onnx: boolean;
+    webrtc_sensitivity: number;
+    post_speech_silence_duration: number;
+    min_length_of_recording: number;
+    min_gap_between_recordings: number;
+    pre_recording_buffer_duration: number;
+    beam_size: number;
+    initial_prompt: string;
+    no_log_file: boolean;
+    continuous_mode: boolean;
+  };
+  server_sync: {
+    enabled: boolean;
+    url: string;
+    room: string;
+    passphrase: string;
+  };
+  display: {
+    show_timestamps: boolean;
+    max_lines: number;
+    font_source: string;
+    font_family: string;
+    websafe_font: string;
+    google_font: string;
+    custom_font_file: string;
+    font_size: number;
+    theme: string;
+    fade_after_seconds: number;
+    user_color: string;
+    text_color: string;
+    background_color: string;
+  };
+  web_server: {
+    port: number;
+    host: string;
+  };
+  remote: {
+    mode: string;
+    server_url: string;
+    auth_token: string;
+    byok_api_key: string;
+    deepgram_model: string;
+    language: string;
+    fallback_to_local: boolean;
+  };
+  updates: {
+    auto_check: boolean;
+    gitea_url: string;
+    owner: string;
+    repo: string;
+    skipped_versions: string[];
+    last_check: string;
+    check_interval_hours: number;
+  };
+}
+
+function getDefaultConfig(): AppConfig {
+  return {
+    user: { name: "User", id: "" },
+    audio: { input_device: "default", sample_rate: 16000 },
+    transcription: {
+      model: "base.en",
+      device: "auto",
+      language: "en",
+      compute_type: "default",
+      enable_realtime_transcription: false,
+      realtime_model: "tiny.en",
+      realtime_processing_pause: 0.1,
+      silero_sensitivity: 0.4,
+      silero_use_onnx: true,
+      webrtc_sensitivity: 3,
+      post_speech_silence_duration: 0.3,
+      min_length_of_recording: 0.5,
+      min_gap_between_recordings: 0,
+      pre_recording_buffer_duration: 0.2,
+      beam_size: 5,
+      initial_prompt: "",
+      no_log_file: true,
+      continuous_mode: false,
+    },
+    server_sync: {
+      enabled: false,
+      url: "http://localhost:3000/api/send",
+      room: "default",
+      passphrase: "",
+    },
+    display: {
+      show_timestamps: true,
+      max_lines: 100,
+      font_source: "System Font",
+      font_family: "Courier",
+      websafe_font: "Arial",
+      google_font: "Roboto",
+      custom_font_file: "",
+      font_size: 12,
+      theme: "dark",
+      fade_after_seconds: 10,
+      user_color: "#4CAF50",
+      text_color: "#FFFFFF",
+      background_color: "#000000B3",
+    },
+    web_server: { port: 8080, host: "127.0.0.1" },
+    remote: {
+      mode: "local",
+      server_url: "",
+      auth_token: "",
+      byok_api_key: "",
+      deepgram_model: "nova-2",
+      language: "en-US",
+      fallback_to_local: true,
+    },
+    updates: {
+      auto_check: true,
+      gitea_url: "https://repo.anhonesthost.net",
+      owner: "streamer-tools",
+      repo: "local-transcription",
+      skipped_versions: [],
+      last_check: "",
+      check_interval_hours: 24,
+    },
+  };
+}
+
+let config = $state<AppConfig>(getDefaultConfig());
+let loading = $state(false);
+let error = $state("");
+
+/**
+ * Fetch the full configuration tree from the backend.
+ * GET /api/config
+ */
+async function fetchConfig(): Promise<void> {
+  loading = true;
+  error = "";
+
+  try {
+    const data = await backendStore.apiGet<Record<string, unknown>>("/api/config");
+    // Deep merge with defaults to ensure all keys exist
+    config = deepMerge(getDefaultConfig(), data) as AppConfig;
+  } catch (err) {
+    error = err instanceof Error ? err.message : String(err);
+    console.error("[config] fetchConfig failed:", error);
+  } finally {
+    loading = false;
+  }
+}
+
+function deepMerge(target: Record<string, unknown>, source: Record<string, unknown>): Record<string, unknown> {
+  const result = { ...target };
+  for (const key of Object.keys(source)) {
+    if (
+      source[key] &&
+      typeof source[key] === "object" &&
+      !Array.isArray(source[key]) &&
+      target[key] &&
+      typeof target[key] === "object" &&
+      !Array.isArray(target[key])
+    ) {
+      result[key] = deepMerge(
+        target[key] as Record<string, unknown>,
+        source[key] as Record<string, unknown>
+      );
+    } else {
+      result[key] = source[key];
+    }
+  }
+  return result;
+}
+
+/**
+ * Send a batch of setting updates to the backend.
+ * PUT /api/config with body `{ settings: { "dot.key": value, ... } }`
+ *
+ * Keys use dot-notation, e.g. `{ "transcription.model": "small.en" }`.
+ *
+ * Returns the response payload on success, or throws on failure.
+ */
+async function updateConfig(
+  settings: Record<string, unknown>,
+): Promise<{ status: string; message: string; engine_reloaded: boolean }> {
+  loading = true;
+  error = "";
+
+  try {
+    const result = await backendStore.apiPut<{
+      status: string;
+      message: string;
+      engine_reloaded: boolean;
+    }>("/api/config", { settings });
+
+    // Refresh the local config tree so the UI stays in sync
+    await fetchConfig();
+
+    return result;
+  } catch (err) {
+    error = err instanceof Error ? err.message : String(err);
+    console.error("[config] updateConfig failed:", error);
+    throw err;
+  } finally {
+    loading = false;
+  }
+}
+
+export const configStore = {
+  get config() {
+    return config;
+  },
+  get loading() {
+    return loading;
+  },
+  get error() {
+    return error;
+  },
+  fetchConfig,
+  updateConfig,
+};
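Review note: `updateConfig` takes a flat map of dot-notation keys, while `AppConfig` is a nested tree, so callers have to convert between the two shapes. The sketch below shows one way to do that conversion. `flattenToDotKeys` is a hypothetical helper and is not part of this diff. It assumes the backend treats arrays (such as `updates.skipped_versions`) as whole values rather than indexed paths.

```typescript
// Hypothetical helper, not part of this diff: flattens a nested partial
// config into the dot-notation map that updateConfig() expects.
type Settings = Record<string, unknown>;

function flattenToDotKeys(partial: Settings, prefix = ""): Settings {
  const out: Settings = {};
  for (const [key, value] of Object.entries(partial)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const isPlainObject =
      value !== null && typeof value === "object" && !Array.isArray(value);
    if (isPlainObject) {
      // Recurse into sections such as `transcription` or `display`.
      Object.assign(out, flattenToDotKeys(value as Settings, path));
    } else {
      // Leaves are sent as-is. Arrays are assumed to be replaced whole.
      out[path] = value;
    }
  }
  return out;
}

const settings = flattenToDotKeys({
  transcription: { model: "small.en", beam_size: 5 },
  updates: { skipped_versions: ["1.4.3"] },
});
// `settings` could then be passed to configStore.updateConfig(settings).
```

An empty nested object produces no keys in this sketch, so a caller cannot use it to clear a whole section.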
--- a/src/lib/stores/transcriptions.ts
+++ b/src/lib/stores/transcriptions.ts
@@ -0,0 +1,109 @@
+/**
+ * Transcriptions store - manages the list of transcription items
+ * received from the backend via WebSocket.
+ */
+
+export interface TranscriptionItem {
+  id: string;
+  text: string;
+  userName: string;
+  timestamp: string;
+  isPreview: boolean;
+}
+
+let items = $state<TranscriptionItem[]>([]);
+let nextId = 0;
+
+function generateId(): string {
+  return `t-${Date.now()}-${nextId++}`;
+}
+
+function addTranscription(data: {
+  text?: string;
+  user_name?: string;
+  timestamp?: string;
+}) {
+  // When a final transcription arrives, remove any existing preview
+  const previewIndex = items.findIndex((item) => item.isPreview);
+  if (previewIndex !== -1) {
+    items.splice(previewIndex, 1);
+  }
+
+  items.push({
+    id: generateId(),
+    text: data.text ?? "",
+    userName: data.user_name ?? "",
+    timestamp: data.timestamp ?? "",
+    isPreview: false,
+  });
+
+  // Keep a reasonable limit
+  if (items.length > 500) {
+    items.splice(0, items.length - 500);
+  }
+}
+
+function setPreview(data: {
+  text?: string;
+  user_name?: string;
+  timestamp?: string;
+}) {
+  const existingIndex = items.findIndex((item) => item.isPreview);
+  const previewItem: TranscriptionItem = {
+    id: existingIndex !== -1 ? items[existingIndex].id : generateId(),
+    text: data.text ?? "",
+    userName: data.user_name ?? "",
+    timestamp: data.timestamp ?? "",
+    isPreview: true,
+  };
+
+  if (existingIndex !== -1) {
+    items[existingIndex] = previewItem;
+  } else {
+    items.push(previewItem);
+  }
+}
+
+function clearAll() {
+  items.length = 0;
+}
+
+function getPlainText(): string {
+  return items
+    .filter((item) => !item.isPreview)
+    .map((item) => {
+      let line = "";
+      if (item.timestamp) line += `[${item.timestamp}] `;
+      if (item.userName) line += `${item.userName}: `;
+      line += item.text;
+      return line;
+    })
+    .join("\n");
+}
+
+// Listen for backend events
+if (typeof window !== "undefined") {
+  window.addEventListener("backend:transcription", ((e: CustomEvent) => {
+    addTranscription(e.detail);
+  }) as EventListener);
+
+  window.addEventListener("backend:preview", ((e: CustomEvent) => {
+    setPreview(e.detail);
+  }) as EventListener);
+}
+
+export const transcriptionStore = {
+  get items() {
+    return items;
+  },
+  get currentPreview(): TranscriptionItem | null {
+    return items.find((item) => item.isPreview) ?? null;
+  },
+  get transcriptions(): TranscriptionItem[] {
+    return items.filter((item) => !item.isPreview);
+  },
+  addTranscription,
+  setPreview,
+  clearAll,
+  getPlainText,
+};
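Review note: the store keeps at most one preview item, updates it in place as partial text arrives, and drops it when the final transcription lands. The standalone sketch below shows those semantics without Svelte runes, so it can run outside the app. `applyPreview`, `applyFinal`, and the reduced `Item` shape are illustrative names, not code from this diff.

```typescript
// Standalone sketch (no Svelte runes) of the preview/final semantics
// implemented by setPreview() and addTranscription() in the store.
interface Item {
  text: string;
  isPreview: boolean;
}

const MAX_ITEMS = 500; // same cap the store applies

function applyPreview(items: Item[], text: string): void {
  const index = items.findIndex((item) => item.isPreview);
  const preview: Item = { text, isPreview: true };
  if (index !== -1) {
    items[index] = preview; // update the single preview slot in place
  } else {
    items.push(preview);
  }
}

function applyFinal(items: Item[], text: string): void {
  const index = items.findIndex((item) => item.isPreview);
  if (index !== -1) {
    items.splice(index, 1); // the final text supersedes the preview
  }
  items.push({ text, isPreview: false });
  if (items.length > MAX_ITEMS) {
    items.splice(0, items.length - MAX_ITEMS); // drop the oldest entries
  }
}

const log: Item[] = [];
applyPreview(log, "hel");
applyPreview(log, "hello wor");
applyFinal(log, "Hello, world.");
applyPreview(log, "next");
```

After this sequence `log` holds one final item followed by one preview item, which is the state the `transcriptions` and `currentPreview` getters split apart.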
--- a/src/main.ts
+++ b/src/main.ts
@@ -0,0 +1,6 @@
+import App from "./App.svelte";
+import { mount } from "svelte";
+import "./app.css";
+
+const app = mount(App, { target: document.getElementById("app")! });
+export default app;
--- a/svelte.config.js
+++ b/svelte.config.js
@@ -0,0 +1,5 @@
+import { vitePreprocess } from "@sveltejs/vite-plugin-svelte";
+
+export default {
+  preprocess: vitePreprocess(),
+};
--- a/tsconfig.json
+++ b/tsconfig.json
@@ -0,0 +1,15 @@
+{
+  "extends": "@tsconfig/svelte/tsconfig.json",
+  "compilerOptions": {
+    "target": "ESNext",
+    "useDefineForClassFields": true,
+    "module": "ESNext",
+    "resolveJsonModule": true,
+    "allowJs": true,
+    "checkJs": true,
+    "isolatedModules": true,
+    "moduleDetection": "force",
+    "strict": true
+  },
+  "include": ["src/**/*.ts", "src/**/*.svelte"]
+}
--- a/version.py
+++ b/version.py
@@ -1,7 +1,7 @@
 """Version information for Local Transcription."""

-__version__ = "1.4.0"
-__version_info__ = (1, 4, 0)
+__version__ = "1.4.4"
+__version_info__ = (1, 4, 4)

 # Version history:
 # 1.4.0 - Auto-update feature:
--- a/vite.config.ts
+++ b/vite.config.ts
@@ -0,0 +1,21 @@
+import { defineConfig } from "vite";
+import { svelte } from "@sveltejs/vite-plugin-svelte";
+import path from "path";
+
+// https://vitejs.dev/config/
+export default defineConfig({
+  plugins: [svelte()],
+  clearScreen: false,
+  resolve: {
+    alias: {
+      $lib: path.resolve("./src/lib"),
+    },
+  },
+  server: {
+    port: 1420,
+    strictPort: true,
+    watch: {
+      ignored: ["**/src-tauri/**", "**/client/**", "**/server/**", "**/backend/**", "**/gui/**"],
+    },
+  },
+});
Author	SHA1	Message	Date
Gitea Actions	fff37992b1	chore: bump version to 1.4.4 [skip ci]	2026-04-07 00:05:15 +00:00
Developer	8afe3230d3	Add sidecar download, setup screen, and auto-launch On first launch, the app now prompts users to download the Python sidecar (CPU or CUDA variant) from Gitea releases, matching the voice-to-notes pattern. On subsequent launches, it auto-launches the sidecar and connects. New Rust module (src-tauri/src/sidecar/): - download_sidecar: streams download with progress events, extracts zip - check_sidecar: verifies installed sidecar binary exists - check_sidecar_update: compares local vs latest release version - SidecarManager: launches binary, waits for ready JSON, manages lifecycle - Dev mode: runs `python -m backend.main_headless` directly - start_sidecar/stop_sidecar/get_sidecar_port: Tauri commands New Svelte component (SidecarSetup.svelte): - First-time setup overlay with CPU/CUDA variant selection - Download progress bar with byte counter - Error state with retry, success state with auto-continue Updated App.svelte state machine: - checking -> needs_setup -> starting -> connected - Falls back to direct connection in browser dev mode Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 17:02:56 -07:00
Developer	04e7fb1a99	Fix macOS sidecar build and blank window on startup macOS sidecar: `uv run` re-resolves dependencies using CUDA sources even after `uv sync --no-sources`. Use UV_NO_SOURCES=1 env var instead so it applies to all uv commands in the step. Blank window: When the Tauri app starts without the Python backend running, it showed a completely blank window. Now shows a "Connecting to backend..." spinner, or an error state with instructions to start the backend manually. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 16:55:03 -07:00
Gitea Actions	9a282215c9	chore: bump sidecar version to 1.0.2 [skip ci]	2026-04-06 23:49:18 +00:00
Gitea Actions	cc2d17a627	chore: bump version to 1.4.3 [skip ci]	2026-04-06 21:02:18 +00:00
Developer	61c5ffa4fa	Remove Zone.Identifier files that break Windows checkout Windows NTFS Zone.Identifier alternate data stream files were accidentally committed. The colon in the filename is invalid on Windows, causing git checkout to fail on Windows runners. Also added *:Zone.Identifier to .gitignore to prevent this recurring. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 14:02:11 -07:00
Gitea Actions	289b9dabe1	chore: bump version to 1.4.2 [skip ci]	2026-04-06 21:00:01 +00:00
Developer	9522f28c57	Fix app icons: regenerate as RGBA and add macOS .icns The bundled .ico had non-RGBA PNGs which caused Tauri's macOS bundler to fail with "The PNG is not in RGBA format!". Regenerated all icons from the source PNG as proper RGBA, and added icon.icns for macOS. Also fixed bundle identifier from "com.localtranscription.app" (the .app suffix conflicts with macOS bundle extension) to "net.anhonesthost.local-transcription". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 13:59:50 -07:00
Gitea Actions	a8e2e7dca8	chore: bump version to 1.4.1 [skip ci]	2026-04-06 20:53:15 +00:00
Developer	3bcf4f09a3	Fix sidecar builds: macOS CUDA resolution and Windows uv install macOS: pyproject.toml's [tool.uv.sources] forces torch from the CUDA index which has no macOS ARM wheels. Use `uv sync --no-sources` to bypass this and get torch from PyPI (which includes MPS support). Windows: Add additional uv PATH locations ($LOCALAPPDATA\uv\bin) for robustness with different runner environments. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 13:51:41 -07:00
Gitea Actions	ef5734ef15	chore: bump sidecar version to 1.0.1 [skip ci]	2026-04-06 20:45:14 +00:00
jknapp	c9db43d56c	Merge pull request 'Rewrite frontend to Tauri v2 + Svelte 5 for cross-platform support' (#4) from feature/tauri-rewrite into main Reviewed-on: #4	2026-04-06 20:45:10 +00:00
Developer	4c519a109a	Add missing Svelte components and stores, fix .gitignore lib/ pattern The src/lib/ directory was being excluded by a Python .gitignore rule for lib/ (meant for Python's build output). Changed to /lib/ so it only matches root-level lib/ and doesn't block src/lib/. Adds 8 files that were created but missed in the initial commit: - 5 Svelte components (Header, StatusBar, Controls, TranscriptionDisplay, Settings) - 3 TypeScript stores (backend, config, transcriptions) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 13:42:31 -07:00
Developer	47ca74e75d	Update README and CLAUDE.md for Tauri rewrite Update both docs to reflect the new architecture: - Tauri v2 + Svelte 5 frontend replacing PySide6/Qt - Headless Python backend with FastAPI control API - Cross-platform support (Windows, macOS, Linux) - Deepgram remote transcription (managed/BYOK) - Gitea CI/CD workflows for automated builds - New project structure with backend/, src/, src-tauri/ - Updated development commands and build instructions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 13:34:10 -07:00
Developer	25d2a55efb	Add Gitea CI/CD workflows for cross-platform builds Two workflows adapted from voice-to-notes: - release.yml: Builds the Tauri app shell (.deb/.rpm for Linux, .msi for Windows, .dmg for macOS) on push to main. Auto-bumps version, creates Gitea release, uploads platform binaries. - build-sidecar.yml: Builds the headless Python backend sidecar via PyInstaller when client/server/backend code changes. Produces CUDA and CPU variants for Linux/Windows, CPU-only for macOS. Uses the new local-transcription-headless.spec (no PySide6 dependencies). Also adds local-transcription-headless.spec — a simplified PyInstaller config for the headless backend that excludes all Qt/PySide6 imports. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 11:44:34 -07:00
Developer	af534bf768	Add Tauri v2 + Svelte 5 frontend and headless Python backend Scaffold the cross-platform rewrite from PySide6/Qt to Tauri + Svelte, following the same architecture as voice-to-notes. The Python backend runs headless as a sidecar, with a FastAPI control API that the Svelte frontend connects to via REST and WebSocket. New files: - backend/app_controller.py: Headless orchestration (extracted from MainWindow) - backend/api_server.py: FastAPI control endpoints + /ws/control WebSocket - backend/main_headless.py: Headless entry point for sidecar mode - src-tauri/: Tauri v2 Rust shell with sidecar and dialog plugins - src/: Svelte 5 frontend (App, Settings, Controls, TranscriptionDisplay) - src/lib/stores/: Reactive stores for backend connection, config, transcriptions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 10:20:25 -07:00
Developer	9ff883e2e3	Phase 6: Add Deepgram remote transcription (managed + BYOK modes) New files: - client/deepgram_transcription.py — DeepgramTranscriptionEngine with managed mode (proxy) and BYOK mode (direct Deepgram). Sends raw binary PCM audio over WebSocket, handles both proxy and Deepgram response formats. Modified files: - config/default_config.yaml — Replace remote_processing with new remote section (mode, server_url, auth_token, byok_api_key, deepgram_model, language) - client/config.py — Add migration from old remote_processing config - gui/settings_dialog_qt.py — Replace Remote Processing group with Transcription Mode section (Local/Managed/BYOK radio buttons, login/register dialogs, balance display, model selector) - gui/main_window_qt.py — Select engine based on remote.mode config, add error and credits_low handlers Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 11:45:30 -07:00
jknapp	bb8a8c251d	Update README to reflect current application state Remove outdated implementation plan and task checklists. Document actual implemented features including RealtimeSTT, dual-layer VAD, custom fonts/colors, and auto-updates. Add practical usage instructions for standalone mode, OBS setup, and multi-user sync. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-23 06:31:27 -08:00
				`@@ -0,0 +1 @@`
				`"""Backend package for headless transcription service."""`