Phase 1 foundation: Tauri shell, Python sidecar, SQLite database
Tauri v2 + Svelte + TypeScript frontend:
- App shell with workspace layout (waveform, transcript, speakers, AI chat)
- Placeholder components for all major UI areas
- Typed stores (project, transcript, playback, AI)
- TypeScript interfaces matching the database schema
- Tauri bridge service with typed invoke wrappers
- svelte-check passes with 0 errors
Rust backend:
- Tauri v2 app entry point with command registration
- SQLite database layer (rusqlite with bundled SQLite)
- Full schema: projects, media_files, speakers, segments, words,
ai_outputs, annotations (with indexes)
- Model structs with serde serialization
- CRUD queries for projects, speakers, segments, words
- Segment text editing preserves original text
- Schema versioning for future migrations
- 6 tests passing
- Command stubs for project, transcribe, export, AI, settings, system
- App state management
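The real schema lives in the Rust layer (rusqlite with bundled SQLite), but the table relationships can be sketched with Python's sqlite3. Only the table names come from this commit; every column name below is an illustrative assumption, not the actual schema:

```python
import sqlite3

# Illustrative DDL only -- table names are from the commit message,
# column names are assumptions. "original_text" mirrors the note that
# segment text editing preserves the original text.
DDL = """
CREATE TABLE projects    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE media_files (id INTEGER PRIMARY KEY,
                          project_id INTEGER REFERENCES projects(id),
                          path TEXT);
CREATE TABLE speakers    (id INTEGER PRIMARY KEY,
                          project_id INTEGER REFERENCES projects(id),
                          label TEXT);
CREATE TABLE segments    (id INTEGER PRIMARY KEY,
                          speaker_id INTEGER REFERENCES speakers(id),
                          start_ms INTEGER, end_ms INTEGER,
                          text TEXT, original_text TEXT);
CREATE TABLE words       (id INTEGER PRIMARY KEY,
                          segment_id INTEGER REFERENCES segments(id),
                          start_ms INTEGER, end_ms INTEGER, text TEXT);
CREATE INDEX idx_segments_speaker ON segments(speaker_id);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
conn.execute("PRAGMA user_version = 1")  # simple schema-versioning hook
```

`PRAGMA user_version` is one common way to implement the schema versioning mentioned above; whether the Rust layer uses it or a dedicated migrations table is not stated in the commit.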
Python sidecar:
- JSON-line IPC protocol (stdin/stdout)
- Message types: IPCMessage, progress, error, ready
- Handler registry with routing and error handling
- Ping/pong handler for connectivity testing
- Service stubs: transcribe, diarize, pipeline, AI, export
- Provider stubs: local (llama-server), OpenAI, Anthropic, LiteLLM
- Hardware detection stubs
- 14 tests passing, ruff clean
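The IPC pieces above (JSON-line framing, handler registry, ping/pong) can be sketched in a few lines. Field names like "type", "id", and "payload" are assumptions for illustration, not the actual wire format:

```python
import json

# One JSON object per line over stdin/stdout. Handlers are registered by
# message type; unknown types and handler exceptions both come back as
# structured error replies rather than crashing the sidecar.
HANDLERS = {}

def handler(name):
    def register(fn):
        HANDLERS[name] = fn
        return fn
    return register

@handler("ping")
def ping(payload):
    # Connectivity check: the frontend sends ping, we answer pong.
    return {"pong": True}

def dispatch(line: str) -> str:
    msg = json.loads(line)
    fn = HANDLERS.get(msg["type"])
    if fn is None:
        reply = {"id": msg.get("id"), "type": "error",
                 "error": f"unknown message type: {msg['type']}"}
    else:
        try:
            reply = {"id": msg.get("id"), "type": "result",
                     "payload": fn(msg.get("payload"))}
        except Exception as exc:  # errors go back over the wire
            reply = {"id": msg.get("id"), "type": "error", "error": str(exc)}
    return json.dumps(reply)
```

In the real sidecar the main loop would read lines from stdin and write each `dispatch` result to stdout, with progress messages emitted the same way.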
Also adds:
- Testing strategy document (docs/TESTING.md)
- Validation script (scripts/validate.sh)
- Updated .gitignore for Svelte, Rust, Python artifacts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
python/voice_to_notes/services/__init__.py (new file)
@@ -0,0 +1 @@
+"""Service layer — transcription, diarization, AI, and export."""
python/voice_to_notes/services/ai_provider.py (new file)
@@ -0,0 +1,13 @@
+"""AI provider service — routes requests to configured provider."""
+
+from __future__ import annotations
+
+
+class AIProviderService:
+    """Manages AI provider selection and routes chat/summarize requests."""
+
+    # TODO: Implement provider routing
+    # - Select provider based on config (local, openai, anthropic, litellm)
+    # - Forward chat messages
+    # - Handle streaming responses
+    pass
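The provider routing the TODO describes can be sketched as a plain dispatch table. The provider names come from the commit message; the stand-in callables and the config shape are hypothetical:

```python
from typing import Callable

# Stand-ins for real clients (llama-server, OpenAI, Anthropic, LiteLLM)
# that the stub will eventually wrap.
def _local(prompt: str) -> str:
    return f"[local] {prompt}"

def _openai(prompt: str) -> str:
    return f"[openai] {prompt}"

PROVIDERS: dict[str, Callable[[str], str]] = {
    "local": _local,
    "openai": _openai,
}

def chat(config: dict, prompt: str) -> str:
    # Select provider based on config, defaulting to the local backend.
    name = config.get("provider", "local")
    try:
        provider = PROVIDERS[name]
    except KeyError:
        raise ValueError(f"unknown provider: {name}") from None
    return provider(prompt)
```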
python/voice_to_notes/services/diarize.py (new file)
@@ -0,0 +1,13 @@
+"""Diarization service — pyannote.audio speaker identification."""
+
+from __future__ import annotations
+
+
+class DiarizeService:
+    """Handles speaker diarization via pyannote.audio."""
+
+    # TODO: Implement pyannote.audio integration
+    # - Load community-1 model
+    # - Run diarization on audio
+    # - Return speaker segments with timestamps
+    pass
python/voice_to_notes/services/export.py (new file)
@@ -0,0 +1,14 @@
+"""Export service — caption and text document generation."""
+
+from __future__ import annotations
+
+
+class ExportService:
+    """Handles export to SRT, WebVTT, ASS, plain text, and Markdown."""
+
+    # TODO: Implement pysubs2 integration
+    # - SRT with [Speaker]: prefix
+    # - WebVTT with <v Speaker> voice tags
+    # - ASS with named styles per speaker
+    # - Plain text and Markdown with speaker labels
+    pass
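The planned implementation uses pysubs2, but the SRT-with-speaker-prefix format the TODO targets is simple enough to sketch by hand; the segment dict shape (`start_ms`, `end_ms`, `speaker`, `text`) is an assumption:

```python
def _srt_time(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments: list[dict]) -> str:
    """Render numbered SRT cues with a [Speaker]: prefix on each line."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n"
            f"{_srt_time(seg['start_ms'])} --> {_srt_time(seg['end_ms'])}\n"
            f"[{seg['speaker']}]: {seg['text']}\n"
        )
    return "\n".join(blocks)
```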
python/voice_to_notes/services/pipeline.py (new file)
@@ -0,0 +1,14 @@
+"""Combined transcription + diarization pipeline."""
+
+from __future__ import annotations
+
+
+class PipelineService:
+    """Runs the full WhisperX-style pipeline: transcribe -> align -> diarize -> merge."""
+
+    # TODO: Implement combined pipeline
+    # 1. faster-whisper transcription
+    # 2. wav2vec2 word-level alignment
+    # 3. pyannote diarization
+    # 4. Merge words with speaker segments
+    pass
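Step 4 of the pipeline, merging aligned words with diarization turns, can be sketched with a midpoint-overlap rule (a common heuristic, not necessarily what the final pipeline will use); the tuple shapes are assumptions:

```python
def merge_words_with_speakers(words, speaker_turns):
    """Assign each word the speaker whose turn contains the word's midpoint.

    words: (start_s, end_s, text) tuples from word-level alignment.
    speaker_turns: (start_s, end_s, speaker) tuples from diarization.
    Words outside every turn get speaker=None.
    """
    merged = []
    for start, end, text in words:
        mid = (start + end) / 2
        speaker = next(
            (spk for t_start, t_end, spk in speaker_turns
             if t_start <= mid < t_end),
            None,
        )
        merged.append({"start": start, "end": end,
                       "text": text, "speaker": speaker})
    return merged
```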
python/voice_to_notes/services/transcribe.py (new file)
@@ -0,0 +1,13 @@
+"""Transcription service — faster-whisper + wav2vec2 pipeline."""
+
+from __future__ import annotations
+
+
+class TranscribeService:
+    """Handles audio transcription via faster-whisper."""
+
+    # TODO: Implement faster-whisper integration
+    # - Load model based on hardware detection
+    # - Transcribe audio with word-level timestamps
+    # - Report progress via IPC
+    pass