Phase 1 foundation: Tauri shell, Python sidecar, SQLite database
Tauri v2 + Svelte + TypeScript frontend:
- App shell with workspace layout (waveform, transcript, speakers, AI chat)
- Placeholder components for all major UI areas
- Typed stores (project, transcript, playback, AI)
- TypeScript interfaces matching the database schema
- Tauri bridge service with typed invoke wrappers
- svelte-check passes with 0 errors
Rust backend:
- Tauri v2 app entry point with command registration
- SQLite database layer (rusqlite with bundled SQLite)
- Full schema: projects, media_files, speakers, segments, words,
ai_outputs, annotations (with indexes)
- Model structs with serde serialization
- CRUD queries for projects, speakers, segments, words
- Segment text editing preserves original text
- Schema versioning for future migrations
- 6 tests passing
- Command stubs for project, transcribe, export, AI, settings, system
- App state management
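The real schema lives in the Rust layer (rusqlite with bundled SQLite), but the table relationships can be sketched with Python's sqlite3. Only the table names come from this commit; every column name below is an illustrative assumption, not the actual schema:

```python
import sqlite3

# Illustrative DDL only -- table names are from the commit message,
# column names are assumptions. "original_text" mirrors the note that
# segment text editing preserves the original text.
DDL = """
CREATE TABLE projects    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE media_files (id INTEGER PRIMARY KEY,
                          project_id INTEGER REFERENCES projects(id),
                          path TEXT);
CREATE TABLE speakers    (id INTEGER PRIMARY KEY,
                          project_id INTEGER REFERENCES projects(id),
                          label TEXT);
CREATE TABLE segments    (id INTEGER PRIMARY KEY,
                          speaker_id INTEGER REFERENCES speakers(id),
                          start_ms INTEGER, end_ms INTEGER,
                          text TEXT, original_text TEXT);
CREATE TABLE words       (id INTEGER PRIMARY KEY,
                          segment_id INTEGER REFERENCES segments(id),
                          start_ms INTEGER, end_ms INTEGER, text TEXT);
CREATE INDEX idx_segments_speaker ON segments(speaker_id);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
conn.execute("PRAGMA user_version = 1")  # simple schema-versioning hook
```

`PRAGMA user_version` is one common way to implement the schema versioning mentioned above; whether the Rust layer uses it or a dedicated migrations table is not stated in the commit.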
Python sidecar:
- JSON-line IPC protocol (stdin/stdout)
- Message types: IPCMessage, progress, error, ready
- Handler registry with routing and error handling
- Ping/pong handler for connectivity testing
- Service stubs: transcribe, diarize, pipeline, AI, export
- Provider stubs: local (llama-server), OpenAI, Anthropic, LiteLLM
- Hardware detection stubs
- 14 tests passing, ruff clean
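The IPC pieces above (JSON-line framing, handler registry, ping/pong) can be sketched in a few lines. Field names like "type", "id", and "payload" are assumptions for illustration, not the actual wire format:

```python
import json

# One JSON object per line over stdin/stdout. Handlers are registered by
# message type; unknown types and handler exceptions both come back as
# structured error replies rather than crashing the sidecar.
HANDLERS = {}

def handler(name):
    def register(fn):
        HANDLERS[name] = fn
        return fn
    return register

@handler("ping")
def ping(payload):
    # Connectivity check: the frontend sends ping, we answer pong.
    return {"pong": True}

def dispatch(line: str) -> str:
    msg = json.loads(line)
    fn = HANDLERS.get(msg["type"])
    if fn is None:
        reply = {"id": msg.get("id"), "type": "error",
                 "error": f"unknown message type: {msg['type']}"}
    else:
        try:
            reply = {"id": msg.get("id"), "type": "result",
                     "payload": fn(msg.get("payload"))}
        except Exception as exc:  # errors go back over the wire
            reply = {"id": msg.get("id"), "type": "error", "error": str(exc)}
    return json.dumps(reply)
```

In the real sidecar the main loop would read lines from stdin and write each `dispatch` result to stdout, with progress messages emitted the same way.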
Also adds:
- Testing strategy document (docs/TESTING.md)
- Validation script (scripts/validate.sh)
- Updated .gitignore for Svelte, Rust, Python artifacts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
python/voice_to_notes/services/__init__.py (new file)
@@ -0,0 +1 @@
+"""Service layer — transcription, diarization, AI, and export."""
python/voice_to_notes/services/ai_provider.py (new file)
@@ -0,0 +1,13 @@
+"""AI provider service — routes requests to configured provider."""
+
+from __future__ import annotations
+
+
+class AIProviderService:
+    """Manages AI provider selection and routes chat/summarize requests."""
+
+    # TODO: Implement provider routing
+    # - Select provider based on config (local, openai, anthropic, litellm)
+    # - Forward chat messages
+    # - Handle streaming responses
+    pass
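The provider routing the TODO describes can be sketched as a plain dispatch table. The provider names come from the commit message; the stand-in callables and the config shape are hypothetical:

```python
from typing import Callable

# Stand-ins for real clients (llama-server, OpenAI, Anthropic, LiteLLM)
# that the stub will eventually wrap.
def _local(prompt: str) -> str:
    return f"[local] {prompt}"

def _openai(prompt: str) -> str:
    return f"[openai] {prompt}"

PROVIDERS: dict[str, Callable[[str], str]] = {
    "local": _local,
    "openai": _openai,
}

def chat(config: dict, prompt: str) -> str:
    # Select provider based on config, defaulting to the local backend.
    name = config.get("provider", "local")
    try:
        provider = PROVIDERS[name]
    except KeyError:
        raise ValueError(f"unknown provider: {name}") from None
    return provider(prompt)
```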
python/voice_to_notes/services/diarize.py (new file)
@@ -0,0 +1,13 @@
+"""Diarization service — pyannote.audio speaker identification."""
+
+from __future__ import annotations
+
+
+class DiarizeService:
+    """Handles speaker diarization via pyannote.audio."""
+
+    # TODO: Implement pyannote.audio integration
+    # - Load community-1 model
+    # - Run diarization on audio
+    # - Return speaker segments with timestamps
+    pass
python/voice_to_notes/services/export.py (new file)
@@ -0,0 +1,14 @@
+"""Export service — caption and text document generation."""
+
+from __future__ import annotations
+
+
+class ExportService:
+    """Handles export to SRT, WebVTT, ASS, plain text, and Markdown."""
+
+    # TODO: Implement pysubs2 integration
+    # - SRT with [Speaker]: prefix
+    # - WebVTT with <v Speaker> voice tags
+    # - ASS with named styles per speaker
+    # - Plain text and Markdown with speaker labels
+    pass
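The planned implementation uses pysubs2, but the SRT-with-speaker-prefix format the TODO targets is simple enough to sketch by hand; the segment dict shape (`start_ms`, `end_ms`, `speaker`, `text`) is an assumption:

```python
def _srt_time(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments: list[dict]) -> str:
    """Render numbered SRT cues with a [Speaker]: prefix on each line."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n"
            f"{_srt_time(seg['start_ms'])} --> {_srt_time(seg['end_ms'])}\n"
            f"[{seg['speaker']}]: {seg['text']}\n"
        )
    return "\n".join(blocks)
```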
python/voice_to_notes/services/pipeline.py (new file)
@@ -0,0 +1,14 @@
+"""Combined transcription + diarization pipeline."""
+
+from __future__ import annotations
+
+
+class PipelineService:
+    """Runs the full WhisperX-style pipeline: transcribe -> align -> diarize -> merge."""
+
+    # TODO: Implement combined pipeline
+    # 1. faster-whisper transcription
+    # 2. wav2vec2 word-level alignment
+    # 3. pyannote diarization
+    # 4. Merge words with speaker segments
+    pass
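Step 4 of the pipeline, merging aligned words with diarization turns, can be sketched with a midpoint-overlap rule (a common heuristic, not necessarily what the final pipeline will use); the tuple shapes are assumptions:

```python
def merge_words_with_speakers(words, speaker_turns):
    """Assign each word the speaker whose turn contains the word's midpoint.

    words: (start_s, end_s, text) tuples from word-level alignment.
    speaker_turns: (start_s, end_s, speaker) tuples from diarization.
    Words outside every turn get speaker=None.
    """
    merged = []
    for start, end, text in words:
        mid = (start + end) / 2
        speaker = next(
            (spk for t_start, t_end, spk in speaker_turns
             if t_start <= mid < t_end),
            None,
        )
        merged.append({"start": start, "end": end,
                       "text": text, "speaker": speaker})
    return merged
```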
python/voice_to_notes/services/transcribe.py (new file)
@@ -0,0 +1,13 @@
+"""Transcription service — faster-whisper + wav2vec2 pipeline."""
+
+from __future__ import annotations
+
+
+class TranscribeService:
+    """Handles audio transcription via faster-whisper."""
+
+    # TODO: Implement faster-whisper integration
+    # - Load model based on hardware detection
+    # - Transcribe audio with word-level timestamps
+    # - Report progress via IPC
+    pass