voice-to-notes

Author	SHA1	Message	Date
Claude	58faa83cb3	Cross-platform distribution, UI improvements, and performance optimizations - PyInstaller frozen sidecar: spec file, build script, and ffmpeg path resolver for self-contained distribution without Python prerequisites - Dual-mode sidecar launcher: frozen binary (production) with dev mode fallback - Parallel transcription + diarization pipeline (~30-40% faster) - GPU auto-detection for diarization (CUDA when available) - Async run_pipeline command for real-time progress event delivery - Web Audio API backend for instant playback and seeking - OpenAI-compatible provider replacing LiteLLM client-side routing - Cross-platform RAM detection (Linux/macOS/Windows) - Settings: speaker count hint, token reveal toggles, dark dropdown styling - Loading splash screen, flexbox layout fix for viewport overflow - Gitea Actions CI/CD pipeline (Linux, Windows, macOS ARM) - Updated README and CLAUDE.md documentation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 21:33:43 -07:00
Claude	c23b9a90dd	Merge perf/diarize-threading: diarization progress via background thread	2026-03-20 13:52:59 -07:00
Claude	03af5a189c	Run pyannote diarization in background thread with progress reporting. Move the blocking pipeline() call to a daemon thread and emit estimated progress messages every 2 seconds from the main thread. The progress estimate uses audio duration to calibrate the expected total time. Also pass audio_duration_sec from PipelineService to DiarizeService. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 13:50:57 -07:00
Josh Knapp	585411f402	Fix speaker diarization: WAV conversion, pyannote 4.0 compat, telemetry bug - Convert non-WAV audio to 16kHz mono WAV before diarization (pyannote v4.0.4 AudioDecoder returns None duration for FLAC, causing crash) - Handle pyannote 4.0 DiarizeOutput return type (unwrap .speaker_diarization) - Disable pyannote telemetry (np.isfinite(None) bug with max_speakers) - Use huggingface_hub.login() to persist token for all sub-downloads - Pre-download sub-models (segmentation-3.0, speaker-diarization-community-1) - Add third required model license link in settings UI - Improve SpeakerManager hints based on settings state - Add word-wrap to transcript text Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-26 19:46:07 -08:00
Josh Knapp	baf820286f	Add HuggingFace token setting for speaker detection - Add "Speakers" tab in Settings with HF token input field - Include step-by-step instructions for obtaining the token - Pass hf_token from settings through Rust → Python pipeline → diarize - Token can also be set via HF_TOKEN environment variable as fallback - Move skip_diarization checkbox to Speakers tab Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-26 18:08:51 -08:00
Josh Knapp	87b3ad94f9	Improve import UX: progress overlay, pyannote fix, debug logging - Enhanced ProgressOverlay with spinner, better styling, and z-index 9999 - Import button shows "Processing..." with pulse animation while transcribing - Fix pyannote API: use token= instead of deprecated use_auth_token= - Read HF_TOKEN from environment for pyannote model download - Add console logging for click-to-seek debugging - Add color-scheme: dark for native form controls Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-26 17:43:49 -08:00
Josh Knapp	44480906a4	Phase 3: Speaker diarization and full transcription pipeline - Implement DiarizeService with pyannote.audio speaker detection - Build PipelineService combining transcribe → diarize → merge with overlap-based speaker assignment per segment - Add pipeline.start and diarize.start IPC handlers - Add run_pipeline Tauri command for full pipeline execution - Wire frontend to use pipeline: speakers auto-created with colors, segments assigned to detected speakers - Build SpeakerManager with rename support (double-click or edit button) - Add speaker color coding throughout transcript display - Add pyannote.audio dependency - Tests: 24 Python (including merge logic), 6 Rust, 0 Svelte errors Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-26 16:09:48 -08:00
Josh Knapp	503cc6c0cf	Phase 1 foundation: Tauri shell, Python sidecar, SQLite database Tauri v2 + Svelte + TypeScript frontend: - App shell with workspace layout (waveform, transcript, speakers, AI chat) - Placeholder components for all major UI areas - Typed stores (project, transcript, playback, AI) - TypeScript interfaces matching the database schema - Tauri bridge service with typed invoke wrappers - svelte-check passes with 0 errors Rust backend: - Tauri v2 app entry point with command registration - SQLite database layer (rusqlite with bundled SQLite) - Full schema: projects, media_files, speakers, segments, words, ai_outputs, annotations (with indexes) - Model structs with serde serialization - CRUD queries for projects, speakers, segments, words - Segment text editing preserves original text - Schema versioning for future migrations - 6 tests passing - Command stubs for project, transcribe, export, AI, settings, system - App state management Python sidecar: - JSON-line IPC protocol (stdin/stdout) - Message types: IPCMessage, progress, error, ready - Handler registry with routing and error handling - Ping/pong handler for connectivity testing - Service stubs: transcribe, diarize, pipeline, AI, export - Provider stubs: local (llama-server), OpenAI, Anthropic, LiteLLM - Hardware detection stubs - 14 tests passing, ruff clean Also adds: - Testing strategy document (docs/TESTING.md) - Validation script (scripts/validate.sh) - Updated .gitignore for Svelte, Rust, Python artifacts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-26 15:16:06 -08:00
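
Commit 503cc6c0cf describes the Python sidecar as a JSON-line IPC protocol over stdin/stdout, with a handler registry and a ping/pong handler. The following is a minimal sketch of that pattern, not the repository's code: the field names ("id", "type", "payload") and the helpers register, dispatch_line, and serve are hypothetical, chosen only for illustration.

```python
import json
from typing import Callable, Dict, TextIO

# Hypothetical registry: maps a message "type" string to its handler.
Handler = Callable[[dict], dict]
HANDLERS: Dict[str, Handler] = {}


def register(msg_type: str) -> Callable[[Handler], Handler]:
    """Decorator that adds a handler to the registry under msg_type."""
    def wrap(fn: Handler) -> Handler:
        HANDLERS[msg_type] = fn
        return fn
    return wrap


@register("ping")
def handle_ping(payload: dict) -> dict:
    # Connectivity check: reply with a pong and an empty payload.
    return {"type": "pong", "payload": {}}


def dispatch_line(line: str) -> str:
    """Parse one JSON line, route it to a handler, return one JSON line."""
    msg_id = None
    try:
        msg = json.loads(line)
        if not isinstance(msg, dict):
            raise ValueError("message must be a JSON object")
        msg_id = msg.get("id")
        handler = HANDLERS.get(msg.get("type"))
        if handler is None:
            raise ValueError(f"unknown message type: {msg.get('type')!r}")
        reply = handler(msg.get("payload", {}))
    except Exception as exc:  # errors become error messages, never crashes
        reply = {"type": "error", "payload": {"message": str(exc)}}
    reply["id"] = msg_id  # echo the request id (an assumed field)
    return json.dumps(reply)


def serve(stdin: TextIO, stdout: TextIO) -> None:
    """Read requests line by line and write one reply line per request."""
    for line in stdin:
        if line.strip():
            stdout.write(dispatch_line(line) + "\n")
            stdout.flush()  # flush so the parent process sees replies promptly
```

A parent process would call serve(sys.stdin, sys.stdout) in the sidecar and write one JSON object per line to its stdin. Keeping each message on a single line lets both sides frame messages with a plain readline.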

8 Commits
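
Commit 44480906a4 mentions merging transcription and diarization output with "overlap-based speaker assignment per segment". One plausible reading is sketched below in Python: each transcript segment gets the speaker whose diarization turns overlap it for the longest total time. The Turn tuple layout and the function names are assumptions for illustration and are not taken from the repository.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical diarization turn: (start_sec, end_sec, speaker_label).
Turn = Tuple[float, float, str]


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speaker(seg_start: float, seg_end: float,
                   turns: List[Turn]) -> Optional[str]:
    """Return the speaker with the largest total overlap, or None if no overlap."""
    totals: Dict[str, float] = {}
    for t_start, t_end, speaker in turns:
        dur = overlap(seg_start, seg_end, t_start, t_end)
        if dur > 0:
            # A speaker may hold several turns inside one segment; sum them.
            totals[speaker] = totals.get(speaker, 0.0) + dur
    if not totals:
        return None
    return max(totals, key=lambda s: totals[s])


turns = [(0.0, 5.0, "SPEAKER_00"), (5.0, 10.0, "SPEAKER_01")]
print(assign_speaker(4.0, 8.0, turns))  # SPEAKER_01 (3 s overlap vs 1 s)
```

Segments that overlap no turn return None here, which leaves the caller free to keep them unassigned or fall back to the previous speaker.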