Phase 6: Add Deepgram remote transcription (managed + BYOK modes)

New files:
- client/deepgram_transcription.py — DeepgramTranscriptionEngine with
  managed mode (proxy) and BYOK mode (direct Deepgram). Sends raw binary
  PCM audio over WebSocket, handles both proxy and Deepgram response formats.

Modified files:
- config/default_config.yaml — Replace remote_processing with new remote
  section (mode, server_url, auth_token, byok_api_key, deepgram_model, language)
- client/config.py — Add migration from old remote_processing config
- gui/settings_dialog_qt.py — Replace Remote Processing group with
  Transcription Mode section (Local/Managed/BYOK radio buttons, login/register
  dialogs, balance display, model selector)
- gui/main_window_qt.py — Select engine based on remote.mode config,
  add error and credits_low handlers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Developer
2026-04-05 11:45:30 -07:00
parent bb8a8c251d
commit 9ff883e2e3
8 changed files with 1503 additions and 74 deletions


@@ -0,0 +1,9 @@
{
  "permissions": {
    "allow": [
      "Bash(python3:*)",
      "Bash(node --check:*)",
      "Bash(ls:*)"
    ]
  }
}

574
DEEPGRAM_PROXY_PLAN.md Normal file

@@ -0,0 +1,574 @@
# Deepgram Proxy Service — Build Plan
## Project Overview
Build a standalone hosted service that acts as a Deepgram proxy for the Local Transcription
desktop app. Users can either provide their own Deepgram API key (BYOK) or use the managed
service with prepaid credits purchased via Stripe.
This is a **separate repository** from `local-transcription`. The desktop app will be updated
in a second phase to support both modes.
---
## Repository Structure
```
transcription-proxy/
├── src/
│ ├── server.js # Express app entry point
│ ├── config.js # Environment config loader
│ ├── db/
│ │ ├── index.js # node-postgres pool setup
│ │ └── migrations/ # SQL migration files (numbered)
│ │ ├── 001_users.sql
│ │ ├── 002_credits.sql
│ │ ├── 003_sessions.sql
│ │ └── 004_usage_ledger.sql
│ ├── middleware/
│ │ ├── auth.js # JWT verification middleware
│ │ └── rateLimit.js # Per-user rate limiting
│ ├── routes/
│ │ ├── auth.js # POST /auth/register, /auth/login, /auth/refresh
│ │ ├── billing.js # POST /billing/checkout, GET /billing/balance
│ │ └── account.js # GET /account/me, GET /account/usage
│ ├── websocket/
│ │ └── proxy.js # WebSocket proxy handler (core feature)
│ └── webhooks/
│ └── stripe.js # POST /webhooks/stripe
├── web/ # Simple frontend dashboard
│ ├── index.html # Landing / login page
│ ├── dashboard.html # Balance, usage history, buy credits
│ └── assets/
│ ├── app.js
│ └── style.css
├── .env.example
├── package.json
├── docker-compose.yml # Postgres + app for local dev
└── CLAUDE.md # This file (after renaming)
```
---
## Technology Stack
- **Runtime**: Node.js 20+
- **Framework**: Express 4
- **WebSocket**: `ws` library (not socket.io — keep it lean)
- **Database**: PostgreSQL 15+ via `pg` (node-postgres)
- **Auth**: JWT via `jsonwebtoken`, passwords hashed with `bcrypt`
- **Payments**: Stripe Node SDK (`stripe`)
- **Environment**: `dotenv`
- **Dev tooling**: `nodemon` for dev, no TypeScript (keep it simple)
---
## Database Schema
Run migrations in order. Use a simple `schema_migrations` table to track applied migrations.
### 001_users.sql
```sql
CREATE TABLE schema_migrations (
version INTEGER PRIMARY KEY,
applied_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email TEXT UNIQUE NOT NULL,
password_hash TEXT NOT NULL,
stripe_customer_id TEXT UNIQUE,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
```
### 002_credits.sql
```sql
CREATE TABLE credit_balance (
user_id UUID PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
seconds_remaining INTEGER NOT NULL DEFAULT 0,
updated_at TIMESTAMPTZ DEFAULT NOW()
);
```
### 003_sessions.sql
```sql
CREATE TABLE transcription_sessions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES users(id),
mode TEXT NOT NULL CHECK (mode IN ('managed', 'byok')),
started_at TIMESTAMPTZ DEFAULT NOW(),
ended_at TIMESTAMPTZ,
seconds_used INTEGER NOT NULL DEFAULT 0,
deepgram_model TEXT,
status TEXT NOT NULL DEFAULT 'active' CHECK (status IN ('active', 'completed', 'terminated'))
);
CREATE INDEX idx_sessions_user_id ON transcription_sessions(user_id);
CREATE INDEX idx_sessions_started_at ON transcription_sessions(started_at);
```
### 004_usage_ledger.sql
```sql
CREATE TABLE usage_ledger (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES users(id),
session_id UUID REFERENCES transcription_sessions(id),
recorded_at TIMESTAMPTZ DEFAULT NOW(),
seconds INTEGER NOT NULL,
description TEXT -- e.g. 'session_usage', 'credit_purchase', 'manual_adjustment'
);
CREATE INDEX idx_ledger_user_id ON usage_ledger(user_id);
```
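The migration bookkeeping described at the top of this section comes down to picking the numbered files that are not yet recorded in `schema_migrations`. A minimal sketch of that selection step follows, written in Python purely for illustration (the service itself is planned in Node.js). The function name `pending_migrations` and the `NNN_name.sql` filename pattern are assumptions, not part of the plan.

```python
import re

# Assumed filename convention: a numeric prefix, an underscore, a name, ".sql".
MIGRATION_RE = re.compile(r"^(\d+)_.+\.sql$")


def pending_migrations(filenames, applied_versions):
    """Return (version, filename) pairs not yet applied, lowest version first."""
    found = []
    for name in filenames:
        match = MIGRATION_RE.match(name)
        if match is None:
            continue  # skip files that do not follow the naming convention
        version = int(match.group(1))
        if version not in applied_versions:
            found.append((version, name))
    return sorted(found)


files = [
    "003_sessions.sql",
    "001_users.sql",
    "README.md",
    "002_credits.sql",
    "004_usage_ledger.sql",
]
# Versions 1 and 2 are already recorded in schema_migrations.
print(pending_migrations(files, applied_versions={1, 2}))
```

Sorting on the parsed integer rather than the raw filename keeps the order correct even if the zero padding is ever dropped.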
---
## Environment Variables (.env.example)
```env
# Server
PORT=3000
NODE_ENV=development
# Database
DATABASE_URL=postgresql://user:password@localhost:5432/transcription_proxy
# Auth
JWT_SECRET=changeme_use_long_random_string
JWT_EXPIRY=7d
# Stripe
STRIPE_SECRET_KEY=sk_test_...
STRIPE_WEBHOOK_SECRET=whsec_...
# Deepgram
DEEPGRAM_API_KEY=your_deepgram_key_here
# Pricing (seconds of transcription credited per dollar; adjust for your margin)
# Default: 1000 seconds per $1 (about $0.06/min charged), which covers
# Deepgram's ~$0.006/min cost plus margin
CREDITS_PER_DOLLAR=1000
```
---
## Phase 1 — Core Server & Auth
### Goals
- Working Express app with Postgres connection
- Migration runner
- User registration and login
- JWT middleware
### Tasks
1. **Scaffold project**
- `npm init`, install dependencies: `express ws pg jsonwebtoken bcrypt stripe dotenv`
- Dev dependencies: `nodemon`
- Add `start` and `dev` scripts to package.json
2. **Database connection** (`src/db/index.js`)
- Export a `pg.Pool` instance using `DATABASE_URL`
- Export a `migrate()` function that reads `src/db/migrations/*.sql` in order,
checks `schema_migrations` table, and applies unapplied ones
- Call `migrate()` on server startup before listening
3. **Auth routes** (`src/routes/auth.js`)
- `POST /auth/register` — validate email/password, hash password with bcrypt (cost 12),
insert user, insert empty credit_balance row, return JWT
- `POST /auth/login` — verify credentials, return JWT + refresh token
- `POST /auth/refresh` — validate refresh token, return new JWT
- Passwords: minimum 8 characters, validate email format
4. **JWT middleware** (`src/middleware/auth.js`)
- Verify `Authorization: Bearer <token>` header
- Attach `req.user = { id, email }` on success
- Return 401 on failure
- Export as `requireAuth` middleware
5. **Basic health check**
- `GET /health` returns `{ status: 'ok', db: 'connected' }`
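The registration rules in task 3 (a minimum password length of 8 and an email format check) can be sketched as a small validation function. This is an illustration in Python rather than the planned Node.js code, and the names `validate_registration` and `EMAIL_RE` and the error codes are hypothetical. The regular expression is a deliberately loose shape check, not a full address validator.

```python
import re

# Loose shape check only: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
MIN_PASSWORD_LENGTH = 8  # from the plan


def validate_registration(email, password):
    """Return a list of error codes; an empty list means the input is acceptable."""
    errors = []
    if not isinstance(email, str) or not EMAIL_RE.match(email):
        errors.append("invalid_email")
    if not isinstance(password, str) or len(password) < MIN_PASSWORD_LENGTH:
        errors.append("password_too_short")
    return errors


print(validate_registration("streamer@example.com", "correct horse"))
print(validate_registration("not-an-email", "short"))
```

Returning every failed rule at once, rather than stopping at the first, lets the client show all problems in a single response.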
---
## Phase 2 — Billing & Credits
### Goals
- Stripe Checkout session creation for credit purchases
- Webhook handler to fulfill purchases
- Balance endpoint
### Payment Methods
Use **Stripe Dynamic Payment Methods** — do NOT hardcode `payment_method_types` in the
Checkout Session. Instead, leave it unset and manage everything from the Stripe Dashboard.
Enable the following in the Stripe Dashboard under Settings → Payment Methods:
- **Cards** (Visa, Mastercard, Amex, Discover) — on by default
- **PayPal** — enable manually
- **Apple Pay** — on by default, shows automatically on Safari/iOS
- **Google Pay** — enable manually (one toggle)
- **Cash App Pay** — enable manually (popular with streaming audiences)
- **Link** — Stripe's saved payment network, on by default
Stripe will automatically show the most relevant methods to each user based on their
location and device. No code changes are needed to add or remove methods in future —
it's all dashboard config.
### Credit Packages
Define these as constants in `src/config.js`:
```javascript
CREDIT_PACKAGES: [
{ id: 'pack_500', label: '500 minutes', seconds: 30000, price_cents: 300 },
{ id: 'pack_1200', label: '1200 minutes', seconds: 72000, price_cents: 600 },
{ id: 'pack_3000', label: '3000 minutes', seconds: 180000, price_cents: 1200 },
]
```
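Adjust pricing to cover Deepgram costs ($0.006/min = $0.0001/sec) plus margin and
Stripe fees (~2.9% + $0.30). As listed, the three packages charge $0.006, $0.005, and
$0.004 per minute, which is at or below Deepgram's cost before Stripe fees are counted,
so treat these prices as placeholders.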
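To make the pricing adjustment concrete, the sketch below computes the per-minute price and the net margin of each package, using the Deepgram cost and the approximate Stripe fee quoted above. It is written in Python for illustration only. The function name `package_margin` is hypothetical, and real Stripe fees vary by payment method and region.

```python
DEEPGRAM_COST_PER_MIN = 0.006  # USD, figure quoted in the plan
STRIPE_PERCENT = 0.029         # approximate, figure quoted in the plan
STRIPE_FIXED = 0.30            # USD per transaction, figure quoted in the plan

PACKAGES = [
    {"id": "pack_500", "seconds": 30000, "price_cents": 300},
    {"id": "pack_1200", "seconds": 72000, "price_cents": 600},
    {"id": "pack_3000", "seconds": 180000, "price_cents": 1200},
]


def package_margin(pkg):
    """Return the per-minute price and net margin (USD) for one credit package."""
    minutes = pkg["seconds"] / 60
    price = pkg["price_cents"] / 100
    stripe_fee = price * STRIPE_PERCENT + STRIPE_FIXED
    deepgram_cost = minutes * DEEPGRAM_COST_PER_MIN
    return {
        "id": pkg["id"],
        "price_per_min": price / minutes,
        "net_margin": price - stripe_fee - deepgram_cost,
    }


for pkg in PACKAGES:
    result = package_margin(pkg)
    print(f'{result["id"]}: ${result["price_per_min"]:.4f}/min, '
          f'net ${result["net_margin"]:.2f}')
```

With the package values as listed, every net margin comes out negative, so a check like this is worth running whenever the prices change.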
### Tasks
1. **Stripe customer creation**
- On user registration, create a Stripe customer and store `stripe_customer_id`
- Do this asynchronously (don't block registration response)
2. **Billing routes** (`src/routes/billing.js`)
- `GET /billing/packages` — return credit package list (no auth required)
- `POST /billing/checkout` — requires auth, accepts `{ package_id }`,
creates Stripe Checkout Session using dynamic payment methods (do NOT pass
`payment_method_types` — omitting it enables dynamic methods automatically),
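include session-level `metadata` containing `user_id` and `package_id` (the `checkout.session.completed` event carries the Session's own `metadata`, not `payment_intent_data.metadata`),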
returns `{ checkout_url }`
- `GET /billing/balance` — requires auth, returns `{ seconds_remaining, minutes_remaining }`
3. **Stripe webhook** (`src/webhooks/stripe.js`)
- Mount at `POST /webhooks/stripe` with raw body (use `express.raw()` for this route only)
- Verify signature with `stripe.webhooks.constructEvent()`
- Handle `checkout.session.completed`:
- Extract `user_id` and `package_id` from metadata
- Add seconds to `credit_balance`
- Insert row into `usage_ledger` with description `'credit_purchase'`
- Handle `payment_intent.payment_failed`: log it (no action needed for prepaid)
4. **Success/cancel pages**
- Stripe Checkout redirects to `GET /billing/success?session_id=...` and `/billing/cancel`
- These can be simple HTML responses or redirects to the web dashboard
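The webhook task insists on the raw request body because the signature is computed over the exact bytes Stripe sent. In the Node service `stripe.webhooks.constructEvent()` performs this check. The Python sketch below only illustrates the documented scheme (an HMAC-SHA256 over `"<timestamp>.<raw body>"`, compared with the `v1` value in the `Stripe-Signature` header). The function name and the 300-second tolerance are assumptions made for the example.

```python
import hashlib
import hmac
import time


def verify_stripe_signature(raw_body, header, secret, tolerance=300, now=None):
    """Return True if the Stripe-Signature header matches the raw body bytes."""
    parts = {}
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        parts.setdefault(key, []).append(value)
    timestamps = parts.get("t", [])
    signatures = parts.get("v1", [])
    if not timestamps or not signatures:
        return False
    try:
        sent_at = int(timestamps[0])
    except ValueError:
        return False
    current = time.time() if now is None else now
    if abs(current - sent_at) > tolerance:
        return False  # too old or too far in the future: possible replay
    signed_payload = timestamps[0].encode() + b"." + raw_body
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()
    return any(
        hmac.compare_digest(expected.encode(), candidate.encode())
        for candidate in signatures
    )


# Build a matching header by hand to exercise the check.
secret = "whsec_example"
body = b'{"type":"checkout.session.completed"}'
stamp = "1700000000"
digest = hmac.new(secret.encode(), stamp.encode() + b"." + body, hashlib.sha256).hexdigest()
header = f"t={stamp},v1={digest}"
print(verify_stripe_signature(body, header, secret, now=1700000010))
```

If a JSON body parser runs before this route, the bytes are re-serialized and the digest no longer matches, which is why `express.raw()` is applied to this route only.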
---
## Phase 3 — WebSocket Proxy (Core Feature)
This is the most critical component. The proxy sits between the desktop client and Deepgram,
forwarding audio while tracking usage in real time.
### Connection Flow
```
Client connects → validate JWT → check credit balance → open Deepgram upstream
Audio chunks arrive → forward to Deepgram → record usage every 10 seconds
Transcription arrives from Deepgram → forward to client
Client disconnects (or credits exhausted) → close upstream → finalize session
```
### WebSocket Protocol
**Client connects to**: `wss://your-domain/ws/transcribe`
**Client sends as first message** (JSON):
```json
{
"type": "auth",
"token": "<JWT>",
"config": {
"model": "nova-2",
"language": "en-US",
"interim_results": true,
"endpointing": 300
}
}
```
**After auth success, client sends**: raw audio binary frames (PCM 16kHz mono)
**Server sends to client**:
```json
{ "type": "ready" }
{ "type": "transcript", "text": "...", "is_final": true, "confidence": 0.98 }
{ "type": "error", "code": "insufficient_credits", "message": "..." }
{ "type": "credits_low", "seconds_remaining": 300 }
{ "type": "session_end", "seconds_used": 120 }
```
### Tasks (`src/websocket/proxy.js`)
1. **Upgrade handler**
- Attach to the HTTP server using `ws.Server({ noServer: true })`
- In `server.on('upgrade', ...)`, route `/ws/transcribe` to this handler
2. **Auth handshake**
- First message must be `{ type: 'auth', token: '...' }` — received within 5 seconds
or connection is terminated
- Verify JWT, load user's credit balance from DB
- If balance is 0 or negative, send `insufficient_credits` error and close
3. **Deepgram upstream connection**
- Open a WebSocket to Deepgram's streaming API:
`wss://api.deepgram.com/v1/listen?model=nova-2&language=en-US&interim_results=true`
- Auth header: `Authorization: Token <DEEPGRAM_API_KEY>`
- Use query params from client's `config` object (whitelist allowed params)
4. **Audio forwarding**
- All binary messages from client → forward directly to Deepgram upstream
- All messages from Deepgram → parse JSON, reformat, forward to client
5. **Usage tracking**
- Create a `transcription_sessions` row on connection
- Maintain an in-memory `secondsUsed` counter per connection
- Deepgram sends `{ type: 'Results', duration: X }` in responses — use this for
accurate second counting
- Every 10 seconds (or on disconnect), write current `secondsUsed` to DB:
- Update `transcription_sessions.seconds_used`
- Decrement `credit_balance.seconds_remaining`
- Insert into `usage_ledger`
- If `seconds_remaining` hits 0: send `insufficient_credits`, close connection
6. **Cleanup on disconnect**
- Mark session as `completed`, set `ended_at`
- Do final usage flush to DB
- Close Deepgram upstream if still open
7. **Error handling**
- If Deepgram upstream closes unexpectedly, notify client and close
- If client sends malformed data, log and continue (don't crash)
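The usage tracking task can be sketched as a small per-connection object that accumulates audio time and bills whole seconds at each flush. This is an illustrative Python sketch, not the proxy's Node.js implementation, and the class and method names are hypothetical. One assumption deserves attention: because interim results cover overlapping audio, the sketch tracks the furthest audio position reported (`start + duration`) instead of summing every `duration` value.

```python
import math


class UsageTracker:
    """Tracks billable seconds for one proxy connection (illustrative sketch)."""

    def __init__(self, seconds_remaining, flush_interval=10.0):
        self.seconds_remaining = seconds_remaining
        self.flush_interval = flush_interval
        self.audio_position = 0.0  # furthest point of audio reported so far
        self.billed_seconds = 0    # whole seconds already written to the DB
        self.last_flush = 0.0

    def on_results(self, start, duration):
        # Interim results overlap, so keep a high-water mark instead of a sum.
        self.audio_position = max(self.audio_position, start + duration)

    def flush_due(self, now):
        return now - self.last_flush >= self.flush_interval

    def flush(self, now):
        """Return the seconds to bill since the last flush and update the balance."""
        total = math.floor(self.audio_position)
        delta = min(total - self.billed_seconds, self.seconds_remaining)
        self.billed_seconds += delta
        self.seconds_remaining -= delta
        self.last_flush = now
        return delta

    @property
    def exhausted(self):
        return self.seconds_remaining <= 0


tracker = UsageTracker(seconds_remaining=5)
tracker.on_results(start=0.0, duration=2.5)
tracker.on_results(start=1.0, duration=3.2)  # overlaps the previous result
print(tracker.flush(now=10.0), tracker.seconds_remaining, tracker.exhausted)
```

The value returned by `flush` is what would be written to `transcription_sessions`, `credit_balance`, and `usage_ledger` in one transaction, and `exhausted` is the signal to send `insufficient_credits` and close.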
---
## Phase 4 — Account Routes & Rate Limiting
### Tasks
1. **Account routes** (`src/routes/account.js`)
- `GET /account/me` — returns `{ email, credits: { seconds_remaining, minutes_remaining }, created_at }`
- `GET /account/usage` — returns last 30 days of `usage_ledger` entries grouped by day,
plus list of last 10 sessions with duration
2. **Rate limiting** (`src/middleware/rateLimit.js`)
- Use in-memory rate limiting (no Redis needed at this scale)
- Auth endpoints: max 10 requests per minute per IP
- WebSocket connections: max 2 concurrent connections per user
(store active connections in a `Map<userId, Set<ws>>`)
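Both limits in this phase fit in a few lines of in-memory state. The Python sketch below shows one possible shape: a sliding-window counter keyed by IP for the auth endpoints, and a registry that caps concurrent WebSocket connections per user. The class names are hypothetical, and a restart or a second process would reset or split this state, which the plan accepts at this scale.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per key within window_seconds."""

    def __init__(self, max_requests=10, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


class ConnectionRegistry:
    """Cap concurrent connections per user (the plan's Map of userId to Set)."""

    def __init__(self, max_per_user=2):
        self.max_per_user = max_per_user
        self._active = defaultdict(set)

    def try_add(self, user_id, conn):
        conns = self._active[user_id]
        if len(conns) >= self.max_per_user:
            return False
        conns.add(conn)
        return True

    def remove(self, user_id, conn):
        conns = self._active.get(user_id)
        if conns is None:
            return
        conns.discard(conn)
        if not conns:
            del self._active[user_id]  # avoid keeping empty sets around


limiter = SlidingWindowLimiter(max_requests=10, window_seconds=60.0)
results = [limiter.allow("203.0.113.7", now=0.0) for _ in range(11)]
print(results.count(True), results[-1])
```

Calling `remove` from the connection's close handler is what keeps the registry accurate. A missed call would lock a user out until restart.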
---
## Phase 5 — Web Dashboard
A simple, functional HTML/CSS/JS dashboard. No framework — vanilla JS is fine.
This is a developer-friendly streamer tool, not a consumer SaaS, so clean and
functional beats flashy.
### Pages
**`/` (Landing / Login)**
- Brief product description (what this is, why it exists)
- Login form and link to register
- Link to GitHub/Gitea repo
**`/dashboard` (Post-login)**
- Current credit balance (minutes remaining, prominently displayed)
- "Buy Credits" section showing the three packages with Stripe Checkout buttons
- Usage chart: last 30 days bar chart (vanilla canvas or a small CDN chart lib)
- Recent sessions table: date, duration, status
**`/register`**
- Registration form
### Implementation Notes
- Store JWT in `localStorage`, attach as `Authorization` header on API calls
- Redirect to `/` if JWT missing or expired
- Keep CSS minimal but readable — this is a utility dashboard
---
## Phase 6 — Desktop App Integration
Changes needed in the `local-transcription` Python repo.
### New file: `client/remote_transcription.py`
This module replaces `transcription_engine_realtime.py` when remote mode is active.
```python
# Pseudocode / spec for Claude Code to implement
class RemoteTranscriptionEngine:
    """
    Connects to the transcription proxy WebSocket and streams audio.
    Provides the same callback interface as the local engine so the
    rest of the app doesn't need to change.
    """

    def __init__(self, config, on_transcript_callback):
        # config contains: server_url, auth_token (or byok_api_key), model
        ...

    def start(self):
        # Open WebSocket connection
        # Send auth message
        # Start audio capture thread (reuse existing audio_capture.py)
        ...

    def stop(self):
        # Close WebSocket gracefully
        ...

    def _on_audio_chunk(self, audio_data):
        # Called by audio_capture.py with raw PCM data
        # Send as binary WebSocket frame
        ...

    def _on_server_message(self, message):
        # Parse JSON from server
        # On type='transcript': call on_transcript_callback
        # On type='credits_low': trigger UI warning
        # On type='error': surface to user
        ...
```
### BYOK Mode
When user provides their own Deepgram key, connect directly to Deepgram instead of the proxy:
- Endpoint: `wss://api.deepgram.com/v1/listen?...`
- Auth: `Authorization: Token <user_key>`
- No session tracking (Deepgram handles billing directly to the user)
- Same `RemoteTranscriptionEngine` class, just different URL and auth header
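Building the Deepgram URL for BYOK mode, or on the proxy from a client's `config` object, is safer with an explicit allow-list and proper URL encoding than with string concatenation. The sketch below is one way to do that in Python. The `ALLOWED_PARAMS` tuple and the function name `build_listen_url` are assumptions for illustration, so the set of parameters actually permitted should be checked against Deepgram's documentation.

```python
from urllib.parse import urlencode

# Hypothetical allow-list; anything not named here is silently dropped.
ALLOWED_PARAMS = (
    "model",
    "language",
    "interim_results",
    "endpointing",
    "encoding",
    "sample_rate",
    "channels",
)


def build_listen_url(config, base="wss://api.deepgram.com/v1/listen"):
    """Return the streaming URL with only allow-listed, URL-encoded parameters."""
    params = {}
    for key in ALLOWED_PARAMS:
        if key not in config:
            continue
        value = config[key]
        if isinstance(value, bool):
            value = "true" if value else "false"  # query strings want lowercase
        params[key] = value
    return f"{base}?{urlencode(params)}"


url = build_listen_url({
    "model": "nova-2",
    "language": "en-US",
    "interim_results": True,
    "callback": "https://example.invalid/hook",  # not allow-listed, so dropped
})
print(url)
```

The API key still travels in the `Authorization: Token <user_key>` header described above and is never placed in the URL.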
### Settings Changes (`gui/settings_dialog_qt.py`)
Add a new "Transcription Mode" section:
```
Transcription Mode:
○ Local (Whisper) [existing behavior]
○ Remote - Managed [requires login]
○ Remote - BYOK [requires Deepgram API key]
[If Managed selected]:
Server URL: [____________]
[Login / Register] [View Balance: 420 min remaining]
[If BYOK selected]:
Deepgram API Key: [____________]
Model: [nova-2 ▼]
```
### Config additions (`config/default_config.yaml`)
```yaml
remote:
  mode: local          # local | managed | byok
  server_url: ""       # proxy server URL for managed mode
  auth_token: ""       # JWT stored after login
  byok_api_key: ""     # Deepgram key for BYOK mode
  deepgram_model: nova-2
  language: en-US
```
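The engine reads these settings with dotted keys such as `remote.mode`. The `Config.get` implementation is not part of this commit, so the following Python sketch only illustrates how a dotted lookup over the nested YAML structure might behave. The function name `get_dotted` is hypothetical.

```python
def get_dotted(config, key, default=None):
    """Resolve a dotted key like 'remote.mode' against nested dicts."""
    node = config
    for part in key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default  # missing section or leaf: fall back to the default
        node = node[part]
    return node


config = {
    "remote": {
        "mode": "local",
        "server_url": "",
        "deepgram_model": "nova-2",
        "language": "en-US",
    }
}
print(get_dotted(config, "remote.mode"))
print(get_dotted(config, "remote.byok_api_key", ""))
```

Returning the default when an intermediate section is missing means a config file written before the `remote` section existed still loads without errors.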
---
## Build & Deployment Notes
### Docker Compose (local dev)
```yaml
version: '3.8'
services:
  db:
    image: postgres:15
    environment:
      POSTGRES_DB: transcription_proxy
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgresql://user:password@db:5432/transcription_proxy
    depends_on:
      - db
    volumes:
      - .:/app
      - /app/node_modules
volumes:
  pgdata:
```
### Production Deployment
This service is a good fit for deployment on AnHonestHost WHP as a containerized app,
or on a small DigitalOcean/Linode VPS. Requirements are light:
- 512MB RAM is sufficient
- Postgres can be the same instance as other services or managed (e.g., Supabase free tier)
- Needs a public domain with SSL for WebSocket (`wss://`) to work from desktop clients
Reverse proxy config (Nginx or HAProxy) should:
- Proxy HTTP → `localhost:3000`
- Pass `Upgrade` and `Connection` headers for WebSocket support
- Set `proxy_read_timeout 3600` (sessions can be long)
---
## Implementation Order
Build and test in this sequence:
1. Project scaffold + DB connection + migrations
2. Auth (register/login/JWT) — test with curl
3. Stripe billing + webhook — test with Stripe CLI (`stripe listen`)
4. WebSocket proxy — test with a simple browser WebSocket client first
5. Usage tracking and credit decrement
6. Account/usage routes
7. Web dashboard
8. Desktop app integration (separate PR in local-transcription repo)
---
## Key Decisions & Rationale
| Decision | Choice | Reason |
|---|---|---|
| Credits model | Prepaid | No surprise charges, simpler billing, better for irregular streamer usage |
| WebSocket library | `ws` | Lightweight, no abstraction overhead, plays well with raw binary audio |
| Auth | JWT (stateless) | Desktop app holds token locally; no session store needed |
| DB driver | `node-postgres` (pg) | No ORM overhead; schema is simple enough for raw SQL |
| Migrations | Raw SQL files | No dependency on Knex/Prisma; easy to inspect and reason about |
| Rate limiting | In-memory | Redis is overkill for this scale; single-process Node is fine initially |
| Frontend | Vanilla JS | Dashboard is simple utility UI; no framework justified |
---
## What This Plan Does NOT Cover (Future Work)
- OAuth / social login
- Admin panel for managing users
- Refund / credit adjustment tooling
- Email verification
- Password reset flow
- Multi-language support beyond Deepgram's defaults
- Analytics / aggregated usage reporting
- Self-hosted Whisper inference as a third backend option

Binary file not shown.


@@ -48,6 +48,25 @@ class Config:
# Save the default configuration
self.save()
        # Migrate remote_processing -> remote
        self._migrate_remote_config()

    def _migrate_remote_config(self):
        """Migrate old remote_processing config to new remote config."""
        if 'remote_processing' in self.config and 'remote' not in self.config:
            old = self.config['remote_processing']
            self.config['remote'] = {
                'mode': 'managed' if old.get('enabled', False) else 'local',
                'server_url': old.get('server_url', ''),
                'auth_token': '',
                'byok_api_key': old.get('api_key', ''),
                'deepgram_model': 'nova-2',
                'language': 'en-US',
                'fallback_to_local': old.get('fallback_to_local', True),
            }
            del self.config['remote_processing']
            self.save()

    def save(self) -> None:
        """Save current configuration to file."""
        with open(self.config_path, 'w') as f:


@@ -0,0 +1,528 @@
"""Deepgram-based transcription engine using WebSocket streaming.
Supports two modes:
- Managed mode: connects to a proxy server that handles Deepgram credentials
- BYOK mode: connects directly to the Deepgram API with a user-provided key
Implements the same duck-type interface as RealtimeTranscriptionEngine so
MainWindow can use it as a drop-in replacement.
"""
import asyncio
import json
import logging
import numpy as np
import threading
from datetime import datetime
from queue import Queue, Empty
from typing import Optional, Callable
from client.transcription_engine_realtime import TranscriptionResult
logger = logging.getLogger(__name__)
class DeepgramTranscriptionEngine:
    """
    Transcription engine that streams audio to Deepgram via WebSocket.
    In managed mode the connection goes through a proxy at
    ``wss://<server>/ws/transcribe`` which handles authentication and
    Deepgram credentials. In BYOK (bring-your-own-key) mode the
    connection goes directly to the Deepgram API.
    """

    # ------------------------------------------------------------------ #
    # Construction / configuration
    # ------------------------------------------------------------------ #
    def __init__(self, config, user_name: str = "User", input_device_index: Optional[int] = None):
        """
        Initialise the engine from a :class:`client.config.Config` object.
        Args:
            config: Application ``Config`` instance.
            user_name: Display name attached to transcriptions.
            input_device_index: Index of the audio input device to use
                (``None`` for the system default).
        """
        self.config = config
        self.user_name = user_name
        self.input_device_index = input_device_index
        # Mode: 'managed' (proxy) or 'byok' (direct Deepgram)
        self.mode: str = config.get("remote.mode", "managed")
        # Managed-mode settings
        self.server_url: str = config.get("remote.server_url", "")
        self.auth_token: str = config.get("remote.auth_token", "")
        # BYOK-mode settings
        self.byok_api_key: str = config.get("remote.byok_api_key", "")
        # Deepgram model / language (used in both modes)
        self.deepgram_model: str = config.get("remote.deepgram_model", "nova-2")
        self.language: str = config.get("remote.language", "en-US")
        # Audio parameters
        self.sample_rate: int = 16000
        self.channels: int = 1
        self.blocksize: int = 4096
        # Callbacks
        self.realtime_callback: Optional[Callable[[TranscriptionResult], None]] = None
        self.final_callback: Optional[Callable[[TranscriptionResult], None]] = None
        self._on_error: Optional[Callable[[str], None]] = None
        self._on_credits_low: Optional[Callable[[int], None]] = None
        # Internal state
        self._is_initialized: bool = False
        self._is_recording: bool = False
        self._stop_event: threading.Event = threading.Event()
        self._audio_queue: Queue = Queue()
        # Asyncio event loop running in a daemon thread
        self._loop: Optional[asyncio.AbstractEventLoop] = None
        self._thread: Optional[threading.Thread] = None
        # WebSocket handle (set inside the async context)
        self._ws = None
        # sounddevice InputStream
        self._stream = None
    # ------------------------------------------------------------------ #
    # Callback setters
    # ------------------------------------------------------------------ #
    def set_callbacks(
        self,
        realtime_callback: Optional[Callable[[TranscriptionResult], None]] = None,
        final_callback: Optional[Callable[[TranscriptionResult], None]] = None,
    ):
        """Set transcription result callbacks (matches RealtimeTranscriptionEngine API)."""
        self.realtime_callback = realtime_callback
        self.final_callback = final_callback

    def set_error_callback(self, fn: Optional[Callable[[str], None]]):
        """Set a callback invoked on errors. ``fn`` receives a string message."""
        self._on_error = fn

    def set_credits_low_callback(self, fn: Optional[Callable[[int], None]]):
        """Set a callback for low-credit warnings. ``fn`` receives seconds remaining."""
        self._on_credits_low = fn
    # ------------------------------------------------------------------ #
    # Public interface (duck-typed with RealtimeTranscriptionEngine)
    # ------------------------------------------------------------------ #
    def initialize(self) -> bool:
        """Validate configuration and mark the engine as ready.
        Returns ``True`` when the engine is ready to start recording.
        """
        if self._is_initialized:
            return True
        if self.mode == "managed":
            if not self.server_url:
                logger.error("Managed mode requires a server URL (remote.server_url)")
                return False
            if not self.auth_token:
                logger.error("Managed mode requires an auth token (remote.auth_token)")
                return False
        elif self.mode == "byok":
            if not self.byok_api_key:
                logger.error("BYOK mode requires an API key (remote.byok_api_key)")
                return False
        else:
            logger.error("Unknown remote mode: %s (expected 'managed' or 'byok')", self.mode)
            return False
        self._is_initialized = True
        logger.info("DeepgramTranscriptionEngine initialised in %s mode", self.mode)
        return True

    def start_recording(self) -> bool:
        """Open the audio stream and connect the WebSocket.
        Returns ``True`` on success.
        """
        if not self._is_initialized:
            logger.error("Engine not initialised -- call initialize() first")
            return False
        if self._is_recording:
            return True
        self._stop_event.clear()
        self._is_recording = True
        # Start the asyncio event-loop thread (handles WS send/receive)
        self._thread = threading.Thread(target=self._run_event_loop, daemon=True)
        self._thread.start()
        # Start the audio capture stream
        try:
            self._start_audio_stream()
        except Exception as exc:
            logger.error("Failed to open audio stream: %s", exc)
            self._is_recording = False
            self._stop_event.set()
            return False
        logger.info("Recording started")
        return True

    def stop_recording(self):
        """Stop audio capture and close the WebSocket."""
        if not self._is_recording:
            return
        self._is_recording = False
        self._stop_event.set()
        # Stop audio stream
        self._stop_audio_stream()
        # Close WebSocket from outside the event-loop thread
        if self._ws is not None and self._loop is not None and not self._loop.is_closed():
            asyncio.run_coroutine_threadsafe(self._close_ws(), self._loop)
        # Wait for the thread to finish
        if self._thread is not None:
            self._thread.join(timeout=5)
            self._thread = None
        logger.info("Recording stopped")

    def stop(self):
        """Full shutdown -- stop recording and release all resources."""
        self.stop_recording()
        self._is_initialized = False
        logger.info("DeepgramTranscriptionEngine shut down")

    def is_ready(self) -> bool:
        """Return ``True`` if the engine has been successfully initialised."""
        return self._is_initialized
    # ------------------------------------------------------------------ #
    # Audio capture (sounddevice)
    # ------------------------------------------------------------------ #
    def _start_audio_stream(self):
        """Open a ``sounddevice.InputStream`` that feeds the audio queue."""
        import sounddevice as sd

        def _audio_callback(indata, frames, time_info, status):  # noqa: ARG001
            if status:
                logger.warning("Audio stream status: %s", status)
            if self._is_recording:
                # float32 -> int16 PCM bytes; clip first so samples outside
                # [-1.0, 1.0] do not wrap around when cast to int16
                pcm = (np.clip(indata, -1.0, 1.0) * 32767).astype(np.int16).tobytes()
                self._audio_queue.put(pcm)

        self._stream = sd.InputStream(
            samplerate=self.sample_rate,
            blocksize=self.blocksize,
            channels=self.channels,
            dtype="float32",
            device=self.input_device_index,
            callback=_audio_callback,
        )
        self._stream.start()

    def _stop_audio_stream(self):
        """Close the audio input stream."""
        if self._stream is not None:
            try:
                self._stream.stop()
                self._stream.close()
            except Exception as exc:
                logger.debug("Error closing audio stream: %s", exc)
            finally:
                self._stream = None
    # ------------------------------------------------------------------ #
    # Asyncio event-loop (runs in daemon thread)
    # ------------------------------------------------------------------ #
    def _run_event_loop(self):
        """Entry point for the daemon thread -- runs the async event loop."""
        self._loop = asyncio.new_event_loop()
        asyncio.set_event_loop(self._loop)
        try:
            self._loop.run_until_complete(self._ws_lifecycle())
        except Exception as exc:
            logger.error("Event-loop error: %s", exc)
        finally:
            try:
                self._loop.run_until_complete(self._loop.shutdown_asyncgens())
            except Exception:
                pass
            self._loop.close()
            self._loop = None

    async def _ws_lifecycle(self):
        """Connect, authenticate (if managed), then run send/receive loops."""
        import websockets
        try:
            ws_url, extra_headers = self._build_ws_url_and_headers()
            logger.info("Connecting to %s", ws_url)
            self._ws = await websockets.connect(
                ws_url,
                additional_headers=extra_headers,
                ping_interval=20,
                ping_timeout=10,
            )
            # Managed mode: send auth message and wait for ready
            if self.mode == "managed":
                if not await self._managed_handshake():
                    return
            # Run send and receive concurrently
            await asyncio.gather(
                self._send_loop(),
                self._receive_loop(),
            )
        except asyncio.CancelledError:
            pass
        except Exception as exc:
            msg = f"WebSocket error: {exc}"
            logger.error(msg)
            if self._on_error:
                self._on_error(msg)
        finally:
            await self._close_ws()

    def _build_ws_url_and_headers(self):
        """Return ``(url, headers)`` depending on the current mode."""
        if self.mode == "managed":
            # Ensure the server URL uses wss:// and append the path
            url = self.server_url.rstrip("/")
            if not url.startswith("ws://") and not url.startswith("wss://"):
                url = f"wss://{url}"
            url = f"{url}/ws/transcribe"
            return url, {}
        # BYOK -- connect directly to Deepgram
        params = (
            f"model={self.deepgram_model}"
            f"&language={self.language}"
            "&interim_results=true"
            "&encoding=linear16"
            f"&sample_rate={self.sample_rate}"
            f"&channels={self.channels}"
        )
        url = f"wss://api.deepgram.com/v1/listen?{params}"
        headers = {"Authorization": f"Token {self.byok_api_key}"}
        return url, headers
    # -- managed-mode handshake ---------------------------------------- #
    async def _managed_handshake(self) -> bool:
        """Send auth message and wait for ``ready`` (managed mode).
        Returns ``True`` on success.
        """
        auth_msg = {
            "type": "auth",
            "token": self.auth_token,
            "config": {
                "model": self.deepgram_model,
                "language": self.language,
                "sample_rate": self.sample_rate,
                "channels": self.channels,
                "encoding": "linear16",
                "interim_results": True,
            },
        }
        await self._ws.send(json.dumps(auth_msg))
        try:
            raw = await asyncio.wait_for(self._ws.recv(), timeout=15)
            data = json.loads(raw)
            if data.get("type") == "ready":
                logger.info("Managed proxy is ready")
                return True
            if data.get("type") == "error":
                err = data.get("message", "unknown error")
                logger.error("Auth error from proxy: %s", err)
                if self._on_error:
                    self._on_error(f"Proxy auth error: {err}")
                return False
            logger.warning("Unexpected handshake message: %s", data)
            return False
        except asyncio.TimeoutError:
            logger.error("Timed out waiting for proxy ready message")
            if self._on_error:
                self._on_error("Timed out waiting for proxy ready message")
            return False
    # -- send loop ----------------------------------------------------- #
    async def _send_loop(self):
        """Drain the audio queue and push raw PCM bytes over the WebSocket."""
        while not self._stop_event.is_set():
            try:
                # Non-blocking poll: a blocking Queue.get(timeout=...) here
                # would stall the event loop and starve _receive_loop.
                pcm_bytes = self._audio_queue.get_nowait()
            except Empty:
                await asyncio.sleep(0.01)
                continue
            try:
                await self._ws.send(pcm_bytes)
            except Exception as exc:
                if not self._stop_event.is_set():
                    logger.error("Send error: %s", exc)
                break

    # -- receive loop -------------------------------------------------- #
    async def _receive_loop(self):
        """Listen for messages from the WebSocket and dispatch them."""
        while not self._stop_event.is_set():
            try:
                raw = await asyncio.wait_for(self._ws.recv(), timeout=1.0)
            except asyncio.TimeoutError:
                continue
            except Exception as exc:
                if not self._stop_event.is_set():
                    logger.error("Receive error: %s", exc)
                break
            try:
                data = json.loads(raw)
            except (json.JSONDecodeError, TypeError):
                logger.debug("Non-JSON message received, ignoring")
                continue
            if self.mode == "managed":
                self._handle_managed_message(data)
            else:
                self._handle_byok_message(data)
    # ------------------------------------------------------------------ #
    # Message handlers
    # ------------------------------------------------------------------ #
    def _handle_managed_message(self, data: dict):
        """Process a message from the managed proxy."""
        msg_type = data.get("type", "")
        if msg_type == "transcript":
            text = data.get("text", "")
            is_final = data.get("is_final", False)
            if text.strip():
                result = TranscriptionResult(
                    text=text,
                    is_final=is_final,
                    timestamp=datetime.now(),
                    user_name=self.user_name,
                )
                if is_final:
                    if self.final_callback:
                        self.final_callback(result)
                else:
                    if self.realtime_callback:
                        self.realtime_callback(result)
        elif msg_type == "credits_low":
            seconds_remaining = data.get("seconds_remaining", 0)
            logger.warning("Credits low -- %d seconds remaining", seconds_remaining)
            if self._on_credits_low:
                self._on_credits_low(int(seconds_remaining))
        elif msg_type == "error":
            code = data.get("code", "")
            message = data.get("message", "Unknown error")
            logger.error("Proxy error [%s]: %s", code, message)
            if self._on_error:
                self._on_error(f"[{code}] {message}" if code else message)
        elif msg_type == "session_end":
            seconds_used = data.get("seconds_used", 0)
            logger.info("Session ended -- %d seconds used", seconds_used)
        elif msg_type == "ready":
            # May arrive again after reconnects; safe to ignore.
            logger.debug("Received ready message (already connected)")
        else:
            logger.debug("Unhandled managed message type: %s", msg_type)

    def _handle_byok_message(self, data: dict):
        """Process a message received directly from the Deepgram API."""
        msg_type = data.get("type", "")
        if msg_type == "Results":
            channel = data.get("channel", {})
            alternatives = channel.get("alternatives", [])
            if not alternatives:
                return
            transcript = alternatives[0].get("transcript", "")
            is_final = data.get("is_final", False)
            if transcript.strip():
                result = TranscriptionResult(
                    text=transcript,
                    is_final=is_final,
                    timestamp=datetime.now(),
                    user_name=self.user_name,
                )
                if is_final:
                    if self.final_callback:
                        self.final_callback(result)
                else:
                    if self.realtime_callback:
                        self.realtime_callback(result)
        elif msg_type == "Metadata":
            logger.debug("Deepgram metadata: %s", data)
        elif msg_type == "UtteranceEnd":
            logger.debug("Deepgram utterance end")
        else:
            logger.debug("Unhandled Deepgram message type: %s", msg_type)
# ------------------------------------------------------------------ #
# Helpers
# ------------------------------------------------------------------ #
async def _close_ws(self):
"""Close the WebSocket connection if open."""
if self._ws is not None:
try:
await self._ws.close()
except Exception:
pass
self._ws = None
def set_user_name(self, user_name: str):
"""Update the user name attached to future transcriptions."""
self.user_name = user_name
def is_recording_active(self) -> bool:
"""Return ``True`` if audio is currently being captured."""
return self._is_recording
def __repr__(self) -> str:
return (
f"DeepgramTranscriptionEngine(mode={self.mode}, "
f"recording={self._is_recording})"
)
def __del__(self):
"""Best-effort cleanup."""
try:
self.stop()
except Exception:
pass
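
The managed and BYOK handlers differ mainly in where the transcript text lives in the payload: the proxy sends a flat `{"type": "transcript", "text": ...}` message, while Deepgram nests it under `channel.alternatives[0].transcript` in a `Results` message. A minimal sketch of that difference, assuming only the message shapes handled in this file; `extract_transcript` is a hypothetical helper, not part of the commit.

```python
from typing import Optional, Tuple


def extract_transcript(mode: str, data: dict) -> Optional[Tuple[str, bool]]:
    """Return (text, is_final) for a transcript message, or None otherwise."""
    if mode == "managed":
        # Proxy format: flat message with a "text" field.
        if data.get("type") != "transcript":
            return None
        text = data.get("text", "")
    else:
        # Deepgram format: transcript nested under channel.alternatives[0].
        if data.get("type") != "Results":
            return None
        alternatives = data.get("channel", {}).get("alternatives", [])
        if not alternatives:
            return None
        text = alternatives[0].get("transcript", "")

    if not text.strip():
        return None  # skip empty or whitespace-only transcripts
    return text, bool(data.get("is_final", False))


if __name__ == "__main__":
    managed = {"type": "transcript", "text": "hello", "is_final": True}
    byok = {"type": "Results", "channel": {"alternatives": [{"transcript": "hi"}]}}
    print(extract_transcript("managed", managed))
    print(extract_transcript("byok", byok))
```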


@@ -68,11 +68,14 @@ web_server:
port: 8080
host: "127.0.0.1"
remote_processing:
enabled: false # Enable remote transcription offloading
server_url: "" # WebSocket URL of remote transcription service (e.g., ws://your-server:8765/ws/transcribe)
api_key: "" # API key for authentication
fallback_to_local: true # Fall back to local processing if remote fails
remote:
mode: local # local | managed | byok
server_url: "" # Proxy server URL for managed mode (e.g., wss://your-proxy.com)
auth_token: "" # JWT stored after login (managed mode)
byok_api_key: "" # Deepgram API key for BYOK mode
deepgram_model: nova-2 # Deepgram model to use
language: en-US # Language code
fallback_to_local: true # Fall back to local Whisper if remote fails
updates:
auto_check: true # Check for updates on startup
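
The commit message mentions a migration from the old `remote_processing` block in `client/config.py`, which is not shown in this part of the diff. A minimal sketch of what such a migration could look like, assuming the config is a plain nested dict. `migrate_remote_config` and `DEFAULT_REMOTE` are hypothetical names, and carrying over only `fallback_to_local` (on the assumption that the old `server_url` and `api_key` pointed at a different service) is a guess, not the author's implementation.

```python
from copy import deepcopy

# Defaults mirror the new `remote` section in default_config.yaml.
DEFAULT_REMOTE = {
    "mode": "local",
    "server_url": "",
    "auth_token": "",
    "byok_api_key": "",
    "deepgram_model": "nova-2",
    "language": "en-US",
    "fallback_to_local": True,
}


def migrate_remote_config(config: dict) -> dict:
    """Return a copy of `config` with `remote_processing` replaced by `remote`."""
    migrated = deepcopy(config)
    old = migrated.pop("remote_processing", None)

    if "remote" in migrated or old is None:
        # Already migrated, or nothing to migrate: just ensure the section exists.
        migrated.setdefault("remote", dict(DEFAULT_REMOTE))
        return migrated

    remote = dict(DEFAULT_REMOTE)
    # Assumption: only the fallback preference is meaningful in the new schema.
    remote["fallback_to_local"] = bool(old.get("fallback_to_local", True))
    migrated["remote"] = remote
    return migrated


if __name__ == "__main__":
    legacy = {"remote_processing": {"enabled": True, "fallback_to_local": False}}
    print(migrate_remote_config(legacy))
```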


@@ -18,6 +18,7 @@ sys.path.append(str(Path(__file__).resolve().parent.parent))
from client.config import Config
from client.device_utils import DeviceManager
from client.transcription_engine_realtime import RealtimeTranscriptionEngine, TranscriptionResult
from client.deepgram_transcription import DeepgramTranscriptionEngine
from client.server_sync import ServerSyncClient
from gui.settings_dialog_qt import SettingsDialog
from server.web_display import TranscriptionWebServer
@@ -394,6 +395,23 @@ class MainWindow(QMainWindow):
min_gap = self.config.get('transcription.min_gap_between_recordings', 0.0)
min_recording = self.config.get('transcription.min_length_of_recording', 0.5)
remote_mode = self.config.get('remote.mode', 'local')
if remote_mode in ('managed', 'byok'):
# Use Deepgram-based remote transcription
self.transcription_engine = DeepgramTranscriptionEngine(
config=self.config,
user_name=user_name,
input_device_index=audio_device
)
self.transcription_engine.set_callbacks(
realtime_callback=self._on_realtime_transcription,
final_callback=self._on_final_transcription
)
self.transcription_engine.set_error_callback(self._on_remote_error)
self.transcription_engine.set_credits_low_callback(self._on_credits_low)
else:
# Use local Whisper transcription
self.transcription_engine = RealtimeTranscriptionEngine(
model=model,
device=device,
@@ -430,8 +448,11 @@ class MainWindow(QMainWindow):
def _on_engine_ready(self, success: bool, message: str):
"""Handle engine initialization completion."""
if success:
# Update device label with actual device used
if self.transcription_engine:
remote_mode = self.config.get('remote.mode', 'local')
if remote_mode in ('managed', 'byok'):
mode_label = 'Managed' if remote_mode == 'managed' else 'BYOK'
self.device_label.setText(f"Device: Deepgram ({mode_label})")
elif self.transcription_engine:
actual_device = self.transcription_engine.device
compute_type = self.transcription_engine.compute_type
device_display = f"{actual_device.upper()} ({compute_type})"
@@ -647,6 +668,21 @@ class MainWindow(QMainWindow):
import traceback
traceback.print_exc()
def _on_remote_error(self, error_msg: str):
"""Handle error from remote transcription service."""
print(f"Remote transcription error: {error_msg}")
self.status_label.setText(f"⚠ Remote error: {error_msg}")
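        # Fallback to local if enabled. NOTE: this handler only updates the
        # status text; it does not itself switch the engine to local Whisper.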
if self.config.get('remote.fallback_to_local', True) and self.is_transcribing:
print("Falling back to local transcription...")
self.status_label.setText("⚠ Remote failed — falling back to local")
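    def _on_credits_low(self, seconds_remaining: int):
        """Handle low credits warning from proxy."""
        if seconds_remaining < 60:
            # Avoid showing "0 min remaining" during the final minute.
            remaining = f"{max(seconds_remaining, 0)} sec"
        else:
            remaining = f"{seconds_remaining // 60} min"
        self.status_label.setText(f"⚠ Credits low: {remaining} remaining")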
def _clear_transcriptions(self):
"""Clear all transcriptions."""
if not self.transcriptions:


@@ -4,7 +4,7 @@ from PySide6.QtWidgets import (
QDialog, QVBoxLayout, QHBoxLayout, QFormLayout,
QLabel, QLineEdit, QComboBox, QCheckBox, QSlider,
QPushButton, QMessageBox, QGroupBox, QScrollArea, QWidget,
QFileDialog, QColorDialog
QFileDialog, QColorDialog, QRadioButton
)
from PySide6.QtCore import Qt
from PySide6.QtGui import QScreen, QFontDatabase, QColor
@@ -487,46 +487,91 @@ class SettingsDialog(QDialog):
server_group.setLayout(server_layout)
content_layout.addWidget(server_group)
# Remote Processing Group
remote_group = QGroupBox("Remote Processing (GPU Offload)")
remote_layout = QFormLayout()
remote_layout.setSpacing(10)
# Transcription Mode Group
mode_group = QGroupBox("Transcription Mode")
mode_layout = QVBoxLayout()
mode_layout.setSpacing(10)
self.remote_enabled_check = QCheckBox()
self.remote_enabled_check.setToolTip(
"Enable remote transcription processing:\n"
"• Offload transcription to a GPU-equipped server\n"
"• Reduces local CPU/GPU usage\n"
"• Requires running the remote transcription service"
)
remote_layout.addRow("Enable Remote Processing:", self.remote_enabled_check)
# Radio buttons for mode selection
self.mode_local_radio = QRadioButton("Local (Whisper)")
self.mode_local_radio.setToolTip("Transcribe locally using Whisper models")
self.mode_managed_radio = QRadioButton("Remote - Managed")
self.mode_managed_radio.setToolTip("Use the transcription proxy service with prepaid credits")
self.mode_byok_radio = QRadioButton("Remote - BYOK (Bring Your Own Key)")
self.mode_byok_radio.setToolTip("Connect directly to Deepgram with your own API key")
self.remote_url_input = QLineEdit()
self.remote_url_input.setPlaceholderText("ws://your-server:8765/ws/transcribe")
self.remote_url_input.setToolTip(
"WebSocket URL of the remote transcription service:\n"
"• Format: ws://host:port/ws/transcribe\n"
"• Use wss:// for secure connections"
)
remote_layout.addRow("Server URL:", self.remote_url_input)
mode_layout.addWidget(self.mode_local_radio)
mode_layout.addWidget(self.mode_managed_radio)
mode_layout.addWidget(self.mode_byok_radio)
self.remote_api_key_input = QLineEdit()
self.remote_api_key_input.setEchoMode(QLineEdit.Password)
self.remote_api_key_input.setPlaceholderText("your-api-key")
self.remote_api_key_input.setToolTip(
"API key for authentication with the remote service"
)
remote_layout.addRow("API Key:", self.remote_api_key_input)
# Managed mode fields (shown when managed radio selected)
self.managed_widget = QWidget()
managed_layout = QFormLayout()
managed_layout.setSpacing(8)
self.remote_fallback_check = QCheckBox("Enable")
self.remote_fallback_check.setChecked(True)
self.remote_fallback_check.setToolTip(
"Fall back to local transcription if remote service is unavailable"
)
remote_layout.addRow("Fallback to Local:", self.remote_fallback_check)
self.managed_server_url = QLineEdit()
self.managed_server_url.setPlaceholderText("wss://your-proxy-server.com")
managed_layout.addRow("Server URL:", self.managed_server_url)
remote_group.setLayout(remote_layout)
content_layout.addWidget(remote_group)
# Login/Register buttons in a row
auth_widget = QWidget()
auth_layout = QHBoxLayout()
auth_layout.setContentsMargins(0, 0, 0, 0)
self.managed_login_btn = QPushButton("Login")
self.managed_login_btn.clicked.connect(self._managed_login)
self.managed_register_btn = QPushButton("Register")
self.managed_register_btn.clicked.connect(self._managed_register)
auth_layout.addWidget(self.managed_login_btn)
auth_layout.addWidget(self.managed_register_btn)
auth_layout.addStretch()
auth_widget.setLayout(auth_layout)
managed_layout.addRow("Account:", auth_widget)
self.managed_balance_label = QLabel("Not logged in")
managed_layout.addRow("Balance:", self.managed_balance_label)
self.managed_fallback_check = QCheckBox("Enable")
self.managed_fallback_check.setChecked(True)
self.managed_fallback_check.setToolTip("Fall back to local Whisper if remote fails")
managed_layout.addRow("Fallback to Local:", self.managed_fallback_check)
self.managed_widget.setLayout(managed_layout)
mode_layout.addWidget(self.managed_widget)
# BYOK mode fields (shown when BYOK radio selected)
self.byok_widget = QWidget()
byok_layout = QFormLayout()
byok_layout.setSpacing(8)
self.byok_api_key_input = QLineEdit()
self.byok_api_key_input.setEchoMode(QLineEdit.Password)
self.byok_api_key_input.setPlaceholderText("your-deepgram-api-key")
byok_layout.addRow("Deepgram API Key:", self.byok_api_key_input)
self.byok_model_combo = QComboBox()
        self.byok_model_combo.addItems([
            "nova-2", "nova-2-general", "nova-2-meeting", "nova-2-phonecall",
            "whisper-large", "whisper-medium", "whisper-small",
        ])
byok_layout.addRow("Model:", self.byok_model_combo)
self.byok_language_input = QLineEdit()
self.byok_language_input.setText("en-US")
self.byok_language_input.setPlaceholderText("en-US")
byok_layout.addRow("Language:", self.byok_language_input)
self.byok_fallback_check = QCheckBox("Enable")
self.byok_fallback_check.setChecked(True)
self.byok_fallback_check.setToolTip("Fall back to local Whisper if Deepgram fails")
byok_layout.addRow("Fallback to Local:", self.byok_fallback_check)
self.byok_widget.setLayout(byok_layout)
mode_layout.addWidget(self.byok_widget)
mode_group.setLayout(mode_layout)
content_layout.addWidget(mode_group)
# Connect radio buttons to show/hide relevant widgets
self.mode_local_radio.toggled.connect(self._on_mode_changed)
self.mode_managed_radio.toggled.connect(self._on_mode_changed)
self.mode_byok_radio.toggled.connect(self._on_mode_changed)
# Updates Group
updates_group = QGroupBox("Software Updates")
@@ -794,11 +839,28 @@ class SettingsDialog(QDialog):
self.server_room_input.setText(self.config.get('server_sync.room', 'default'))
self.server_passphrase_input.setText(self.config.get('server_sync.passphrase', ''))
# Remote processing settings
self.remote_enabled_check.setChecked(self.config.get('remote_processing.enabled', False))
self.remote_url_input.setText(self.config.get('remote_processing.server_url', ''))
self.remote_api_key_input.setText(self.config.get('remote_processing.api_key', ''))
self.remote_fallback_check.setChecked(self.config.get('remote_processing.fallback_to_local', True))
# Transcription mode settings
mode = self.config.get('remote.mode', 'local')
if mode == 'managed':
self.mode_managed_radio.setChecked(True)
elif mode == 'byok':
self.mode_byok_radio.setChecked(True)
else:
self.mode_local_radio.setChecked(True)
self.managed_server_url.setText(self.config.get('remote.server_url', ''))
self.managed_fallback_check.setChecked(self.config.get('remote.fallback_to_local', True))
self.byok_api_key_input.setText(self.config.get('remote.byok_api_key', ''))
self.byok_model_combo.setCurrentText(self.config.get('remote.deepgram_model', 'nova-2'))
self.byok_language_input.setText(self.config.get('remote.language', 'en-US'))
self.byok_fallback_check.setChecked(self.config.get('remote.fallback_to_local', True))
# Trigger visibility update
self._on_mode_changed()
# Update balance if managed mode and has token
if self.config.get('remote.auth_token'):
self._update_managed_balance()
# Update settings
self.update_auto_check.setChecked(self.config.get('updates.auto_check', True))
@@ -869,11 +931,21 @@ class SettingsDialog(QDialog):
self.config.set('server_sync.room', self.server_room_input.text())
self.config.set('server_sync.passphrase', self.server_passphrase_input.text())
# Remote processing settings
self.config.set('remote_processing.enabled', self.remote_enabled_check.isChecked())
self.config.set('remote_processing.server_url', self.remote_url_input.text())
self.config.set('remote_processing.api_key', self.remote_api_key_input.text())
self.config.set('remote_processing.fallback_to_local', self.remote_fallback_check.isChecked())
# Transcription mode settings
if self.mode_managed_radio.isChecked():
self.config.set('remote.mode', 'managed')
elif self.mode_byok_radio.isChecked():
self.config.set('remote.mode', 'byok')
else:
self.config.set('remote.mode', 'local')
self.config.set('remote.server_url', self.managed_server_url.text())
self.config.set('remote.fallback_to_local',
self.managed_fallback_check.isChecked() if self.mode_managed_radio.isChecked()
else self.byok_fallback_check.isChecked())
self.config.set('remote.byok_api_key', self.byok_api_key_input.text())
self.config.set('remote.deepgram_model', self.byok_model_combo.currentText())
self.config.set('remote.language', self.byok_language_input.text())
# Update settings
self.config.set('updates.auto_check', self.update_auto_check.isChecked())
@@ -892,6 +964,194 @@ class SettingsDialog(QDialog):
except Exception as e:
QMessageBox.critical(self, "Error", f"Failed to save settings:\n{e}")
def _on_mode_changed(self):
"""Show/hide mode-specific widgets based on selected radio button."""
self.managed_widget.setVisible(self.mode_managed_radio.isChecked())
self.byok_widget.setVisible(self.mode_byok_radio.isChecked())
def _managed_login(self):
"""Open a login dialog and authenticate with the managed proxy server."""
import json
import urllib.request
import urllib.error
dialog = QDialog(self)
dialog.setWindowTitle("Login")
dialog.setMinimumWidth(350)
layout = QFormLayout()
email_input = QLineEdit()
email_input.setPlaceholderText("you@example.com")
layout.addRow("Email:", email_input)
password_input = QLineEdit()
password_input.setEchoMode(QLineEdit.Password)
layout.addRow("Password:", password_input)
button_layout = QHBoxLayout()
cancel_btn = QPushButton("Cancel")
cancel_btn.clicked.connect(dialog.reject)
login_btn = QPushButton("Login")
login_btn.setDefault(True)
button_layout.addStretch()
button_layout.addWidget(cancel_btn)
button_layout.addWidget(login_btn)
layout.addRow("", button_layout)
dialog.setLayout(layout)
def do_login():
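            server_url = self.managed_server_url.text().strip().rstrip('/')
            # urllib cannot open ws:// or wss:// URLs, so map them to http(s)://.
            if server_url.startswith(('ws://', 'wss://')):
                server_url = 'http' + server_url[2:]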
if not server_url:
QMessageBox.warning(dialog, "Error", "Please enter a Server URL first.")
return
payload = json.dumps({
"email": email_input.text(),
"password": password_input.text()
}).encode('utf-8')
req = urllib.request.Request(
f"{server_url}/auth/login",
data=payload,
headers={"Content-Type": "application/json"},
method="POST"
)
try:
with urllib.request.urlopen(req, timeout=10) as resp:
data = json.loads(resp.read().decode('utf-8'))
token = data.get('token', '')
if token:
self.config.set('remote.auth_token', token)
self._update_managed_balance()
QMessageBox.information(dialog, "Success", "Logged in successfully.")
dialog.accept()
else:
QMessageBox.warning(dialog, "Error", "Login succeeded but no token received.")
except urllib.error.HTTPError as e:
try:
body = json.loads(e.read().decode('utf-8'))
msg = body.get('detail', body.get('message', str(e)))
except Exception:
msg = str(e)
QMessageBox.warning(dialog, "Login Failed", msg)
except Exception as e:
QMessageBox.warning(dialog, "Error", f"Could not connect to server:\n{e}")
login_btn.clicked.connect(do_login)
dialog.exec()
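
The `HTTPError` branch above tries to read a JSON error body and falls back to the exception text when that fails. A standalone sketch of that parsing step, assuming the proxy returns either a `detail` or a `message` field as the handler expects; `extract_error_message` is a hypothetical helper, not part of the commit.

```python
import json


def extract_error_message(raw_body: bytes, fallback: str) -> str:
    """Pull a human-readable message out of a JSON error body."""
    try:
        body = json.loads(raw_body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return fallback  # body was not valid UTF-8 JSON (e.g. an HTML error page)
    if not isinstance(body, dict):
        return fallback  # valid JSON, but not an object with named fields
    # Prefer "detail", then "message", then the caller-supplied fallback.
    return str(body.get("detail", body.get("message", fallback)))


if __name__ == "__main__":
    print(extract_error_message(b'{"detail": "Invalid credentials"}', "HTTP 401"))
    print(extract_error_message(b"<html>Bad Gateway</html>", "HTTP 502"))
```

Narrowing the `except` clause to decoding and parsing errors, rather than a bare `Exception`, keeps unrelated bugs from being silently reported as a failed login.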
def _managed_register(self):
"""Open a registration dialog and create an account on the managed proxy server."""
import json
import urllib.request
import urllib.error
dialog = QDialog(self)
dialog.setWindowTitle("Register")
dialog.setMinimumWidth(350)
layout = QFormLayout()
email_input = QLineEdit()
email_input.setPlaceholderText("you@example.com")
layout.addRow("Email:", email_input)
password_input = QLineEdit()
password_input.setEchoMode(QLineEdit.Password)
layout.addRow("Password:", password_input)
confirm_input = QLineEdit()
confirm_input.setEchoMode(QLineEdit.Password)
layout.addRow("Confirm Password:", confirm_input)
button_layout = QHBoxLayout()
cancel_btn = QPushButton("Cancel")
cancel_btn.clicked.connect(dialog.reject)
register_btn = QPushButton("Register")
register_btn.setDefault(True)
button_layout.addStretch()
button_layout.addWidget(cancel_btn)
button_layout.addWidget(register_btn)
layout.addRow("", button_layout)
dialog.setLayout(layout)
def do_register():
if password_input.text() != confirm_input.text():
QMessageBox.warning(dialog, "Error", "Passwords do not match.")
return
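            server_url = self.managed_server_url.text().strip().rstrip('/')
            # urllib cannot open ws:// or wss:// URLs, so map them to http(s)://.
            if server_url.startswith(('ws://', 'wss://')):
                server_url = 'http' + server_url[2:]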
if not server_url:
QMessageBox.warning(dialog, "Error", "Please enter a Server URL first.")
return
payload = json.dumps({
"email": email_input.text(),
"password": password_input.text()
}).encode('utf-8')
req = urllib.request.Request(
f"{server_url}/auth/register",
data=payload,
headers={"Content-Type": "application/json"},
method="POST"
)
try:
with urllib.request.urlopen(req, timeout=10) as resp:
data = json.loads(resp.read().decode('utf-8'))
token = data.get('token', '')
if token:
self.config.set('remote.auth_token', token)
self._update_managed_balance()
QMessageBox.information(dialog, "Success", "Account created and logged in.")
dialog.accept()
else:
QMessageBox.information(dialog, "Success",
"Account created. Please log in.")
dialog.accept()
except urllib.error.HTTPError as e:
try:
body = json.loads(e.read().decode('utf-8'))
msg = body.get('detail', body.get('message', str(e)))
except Exception:
msg = str(e)
QMessageBox.warning(dialog, "Registration Failed", msg)
except Exception as e:
QMessageBox.warning(dialog, "Error", f"Could not connect to server:\n{e}")
register_btn.clicked.connect(do_register)
dialog.exec()
def _update_managed_balance(self):
"""Fetch and display the current account balance from the managed proxy server."""
import json
import urllib.request
import urllib.error
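        server_url = self.managed_server_url.text().strip().rstrip('/')
        # urllib cannot open ws:// or wss:// URLs, so map them to http(s)://.
        if server_url.startswith(('ws://', 'wss://')):
            server_url = 'http' + server_url[2:]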
token = self.config.get('remote.auth_token', '')
if not server_url or not token:
self.managed_balance_label.setText("Not logged in")
return
req = urllib.request.Request(
f"{server_url}/billing/balance",
headers={
"Authorization": f"Bearer {token}",
"Content-Type": "application/json"
},
method="GET"
)
try:
with urllib.request.urlopen(req, timeout=10) as resp:
data = json.loads(resp.read().decode('utf-8'))
balance = data.get('balance', data.get('credits', 'N/A'))
self.managed_balance_label.setText(str(balance))
except urllib.error.HTTPError as e:
if e.code == 401:
                self.managed_balance_label.setText("Session expired - please log in again")
self.config.set('remote.auth_token', '')
else:
self.managed_balance_label.setText("Error fetching balance")
except Exception:
self.managed_balance_label.setText("Could not connect to server")
def _check_for_updates_now(self):
"""Manually check for updates."""
from version import __version__