# Voice to Notes — Project Guidelines ## Project Overview Desktop app for transcribing audio/video with speaker identification. Runs locally on user's computer. See `docs/ARCHITECTURE.md` for full architecture. ## Tech Stack - **Desktop shell:** Tauri v2 (Rust backend + Svelte/TypeScript frontend) - **ML pipeline:** Python sidecar process (faster-whisper, pyannote.audio, wav2vec2) - **Database:** SQLite (via rusqlite in Rust) - **Local AI:** Bundled llama-server (llama.cpp) — default, no install needed - **Cloud AI providers:** LiteLLM, OpenAI, Anthropic (optional, user-configured) - **Caption export:** pysubs2 (Python) - **Audio UI:** wavesurfer.js - **Transcript editor:** TipTap (ProseMirror) ## Key Architecture Decisions - Python sidecar communicates with Rust via JSON-line IPC (stdin/stdout) - All ML models must work on CPU. GPU (CUDA) is optional acceleration. - AI cloud providers are optional. Bundled llama-server (llama.cpp) is the default local AI — no separate install needed. - Rust backend manages llama-server lifecycle (start/stop/port allocation). - Project is open source (MIT license). - SQLite database is per-project, stored alongside media files. - Word-level timestamps are required for click-to-seek playback sync. ## Directory Structure ``` src/ # Svelte frontend source src-tauri/ # Rust backend source python/ # Python sidecar source voice_to_notes/ # Python package tests/ # Python tests docs/ # Architecture and design documents ``` ## Conventions - Rust: follow standard Rust conventions, use `cargo fmt` and `cargo clippy` - Python: Python 3.11+, use type hints, follow PEP 8, use `ruff` for linting - TypeScript: strict mode, prefer Svelte stores for state management - IPC messages: JSON-line format, each message has `id`, `type`, `payload` fields - Database: UUIDs as primary keys (TEXT type in SQLite) - All timestamps in milliseconds (integer) relative to media file start ## Platform Targets - Linux (primary development target) - Windows (must work, tested before release) - macOS (future, not yet targeted)