Add cloud-only sidecar variant (~50MB vs 500MB-2GB)

Lightweight Deepgram-only sidecar that excludes PyTorch, faster-whisper, RealtimeSTT, and CUDA. Only includes audio capture + WebSocket streaming to Deepgram. Requires a Deepgram API key (BYOK or managed mode).

Changes:
- client/models.py: Extracted TranscriptionResult into standalone module so deepgram_transcription.py doesn't transitively import torch
- backend/app_controller.py: Made RealtimeTranscriptionEngine and DeviceManager imports lazy (only loaded when remote.mode == "local")
- local-transcription-cloud.spec: PyInstaller spec excluding all ML deps
- SidecarSetup.svelte: Added "Cloud Only (Deepgram)" variant option
- build-sidecar-cloud.yml: CI workflow building cloud sidecar for all 3 OS
- sidecar-release.yml: Dispatches cloud build alongside CPU/CUDA builds

Sidecar download options are now:
- Standard (CPU): ~500 MB - local Whisper on any computer
- GPU Accelerated (CUDA): ~2 GB - local Whisper with NVIDIA GPU
- Cloud Only (Deepgram): ~50 MB - requires API key, no local models

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
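
The lazy-import change in backend/app_controller.py could look roughly like the sketch below. This is illustrative only, not the code from this commit: the `AppController` shape, the `start` method, the `importer` parameter, and the `LOCAL_ENGINE_MODULE` path are all assumptions made for the example. It shows the local engine module being imported only on the `mode == "local"` branch, so a cloud-only build never pulls in the ML dependencies.

```python
import importlib
from typing import Any, Callable, Optional

# Hypothetical module path, used only for illustration; the real layout may differ.
LOCAL_ENGINE_MODULE = "client.transcription_engine"


class AppController:
    """Illustrative controller that defers the heavy import until it is needed."""

    def __init__(
        self,
        mode: str,
        importer: Callable[[str], Any] = importlib.import_module,
    ) -> None:
        self.mode = mode
        # The importer is injectable so the behavior can be checked without torch installed.
        self._importer = importer
        self.engine_module: Optional[Any] = None

    def start(self) -> str:
        if self.mode == "local":
            # Heavy ML dependencies are imported only on this branch.
            self.engine_module = self._importer(LOCAL_ENGINE_MODULE)
            return "local"
        # Any other mode never touches the local engine module.
        return "cloud"
```

Passing a recording stub as `importer` confirms that a controller created with a non-local mode performs no import at all, which is the property the cloud-only PyInstaller spec relies on when it excludes the ML packages.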
2026-04-07 16:57:43 -07:00
parent bb039399fc
commit 3d3d7ec3c5
10 changed files with 469 additions and 42 deletions
--- a/.gitea/workflows/sidecar-release.yml
+++ b/.gitea/workflows/sidecar-release.yml
@@ -118,7 +118,7 @@ jobs:
          REPO_API="${GITHUB_SERVER_URL}/api/v1/repos/${GITHUB_REPOSITORY}"
          TAG="${{ steps.bump.outputs.tag }}"

-          for workflow in build-sidecar-linux.yml build-sidecar-windows.yml build-sidecar-macos.yml; do
+          for workflow in build-sidecar-linux.yml build-sidecar-windows.yml build-sidecar-macos.yml build-sidecar-cloud.yml; do
            echo "Dispatching ${workflow} for ${TAG}..."
            HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
              -H "Authorization: token ${BUILD_TOKEN}" \