31 Commits

Author SHA1 Message Date
Gitea Actions
68ae48a771 chore: bump version to 0.2.5 [skip ci] 2026-03-22 03:23:01 +00:00
Gitea Actions
ef1481b359 chore: bump version to 0.2.4 [skip ci] 2026-03-22 01:37:55 +00:00
Claude
ba3d5c3997 Fix sidecar.log not created: ensure data dir exists
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Release / Build (Linux) (push) Successful in 7m47s
Release / Build (macOS) (push) Successful in 30m20s
Release / Build (Windows) (push) Successful in 36m45s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 18:37:49 -07:00
Gitea Actions
9a6ea84637 chore: bump version to 0.2.3 [skip ci] 2026-03-21 21:46:10 +00:00
Claude
011ff4e178 Fix sidecar crash recovery and Windows console window
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Release / Build (macOS) (push) Successful in 4m40s
Release / Build (Linux) (push) Successful in 7m34s
Release / Build (Windows) (push) Successful in 17m2s
- Fix is_running() to check actual process liveness via try_wait()
  instead of just checking if the handle exists
- Auto-restart sidecar on pipe errors (broken pipe, closed stdout)
  with one retry attempt
- Hide sidecar console window on Windows (CREATE_NO_WINDOW flag)
- Log sidecar stderr to sidecar.log file for crash diagnostics
- Include exit status in error message when sidecar fails to start

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 14:46:01 -07:00
Gitea Actions
d2d111a5c7 chore: bump version to 0.2.2 [skip ci] 2026-03-21 18:56:14 +00:00
Gitea Actions
9033881274 chore: bump version to 0.2.1 [skip ci] 2026-03-21 18:53:23 +00:00
Claude
2d0d4cfc50 Skip AppImage and RPM builds to avoid slow 360MB+ compression
Some checks failed
Build Windows / Build (Windows) (push) Has been cancelled
Build Linux / Build (Linux) (push) Has been cancelled
Build macOS / Build (macOS) (push) Has been cancelled
AppImage bundler compresses the entire sidecar.zip into squashfs,
causing builds to hang/timeout. Limit targets to deb (Linux),
nsis+msi (Windows), and dmg (macOS).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 11:28:55 -07:00
Claude
a0cb034ab5 Bump version to 0.2.0
Some checks failed
Build Windows / Build (Windows) (push) Successful in 11m37s
Build macOS / Build (macOS) (push) Successful in 3m30s
Build Linux / Build (Linux) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 07:34:37 -07:00
Claude
52d7d06d84 Fix sidecar.zip not bundled: move resources config into tauri.conf.json
Some checks failed
Build Linux / Build (Linux) (push) Has been cancelled
Build Windows / Build (Windows) (push) Has started running
Build macOS / Build (macOS) (push) Has been cancelled
The TAURI_CONFIG env var approach for resources wasn't being applied
by the NSIS bundler, so sidecar.zip was never included in the installer.

- Add resources: ["sidecar.zip"] directly to tauri.conf.json
- build.rs creates a minimal placeholder zip for dev builds so
  compilation succeeds even without the real sidecar
- Remove TAURI_CONFIG env var from all CI workflows (no longer needed)
- Add sidecar.zip to .gitignore (generated by CI, not tracked)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 07:33:02 -07:00
Claude
462a4b80f6 Fix Tauri build stack overflow: zip sidecar and extract on first launch
All checks were successful
Build macOS / Build (macOS) (push) Successful in 3m50s
Build Linux / Build (Linux) (push) Successful in 8m3s
Build Windows / Build (Windows) (push) Successful in 9m8s
Tauri's build script overflows the stack when processing resource globs
matching thousands of files from PyInstaller's ML output (torch, pyannote).

Instead of bundling the sidecar directory directly:
- CI zips the sidecar output into a single sidecar.zip
- Tauri bundles just the one zip file (no recursion)
- On first launch, Rust extracts the zip to the app data directory
- Versioned extraction dir (sidecar-{version}) ensures updates re-extract

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 07:12:22 -07:00
Claude
12869e3757 Fix sidecar not found on Windows/macOS/Linux: switch from externalBin to resources
Some checks failed
Build macOS / Build (macOS) (push) Failing after 2m53s
Build Linux / Build (Linux) (push) Failing after 6m30s
Build Windows / Build (Windows) (push) Failing after 8m15s
Tauri's externalBin only bundled the single sidecar executable, but
PyInstaller's onedir output requires companion DLLs and _internal/.
The binary was also renamed with a target triple suffix that
resolve_sidecar_path() didn't look for, causing it to fall back to
dev mode which used a compile-time CI path (CARGO_MANIFEST_DIR).

- Switch from externalBin to bundle.resources to include all sidecar files
- Pass Tauri resource_dir to sidecar manager for platform-aware path resolution
- Remove rename_binary() since externalBin target triple naming is no longer needed
- Remove broken production-to-dev fallback that could never work on user machines

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 06:55:44 -07:00
Claude
27b705b5b6 File-based project save/load, AI chat formatting, text edit fix
Project files (.vtn):
- Save Project: serializes transcript, speakers, audio path to JSON file
- Open Project: loads .vtn file, restores audio/transcript/speakers
- User chooses filename and location via save dialog
- Replaces SQLite-based project persistence (DB commands remain for future use)
- Text edits update in-memory store immediately, persist on explicit save
- Fix Windows path separator in project name extraction

AI chat:
- Markdown rendering in assistant messages (headers, lists, bold, code)
- Better visual distinction with border-left accents
- Styled markdown elements for dark theme

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:17:35 -07:00
Claude
8e7d21d22b Persist segment text edits to database on Enter
- Add update_segment Tauri command (calls existing update_segment_text query)
- Wire onTextEdit handler from TranscriptEditor to invoke update_segment
- Edits are saved to SQLite immediately when user presses Enter

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:08:52 -07:00
Claude
61caa07e4c Add project save/load and improve AI chat formatting
Project persistence:
- save_project_transcript command: persists segments, speakers, words to SQLite
- load_project_transcript command: loads full transcript with nested words
- delete_project command: soft-delete projects
- Auto-save after pipeline completes (named from filename)
- Project dropdown in header to switch between saved transcripts
- Projects load audio, segments, and speakers from database

AI chat improvements:
- Markdown rendering in assistant messages (headers, lists, bold, italic, code)
- Better message spacing and visual distinction (border-left accents)
- Styled markdown elements matching dark theme
- Improved empty state and quick action button sizing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:06:29 -07:00
Claude
58faa83cb3 Cross-platform distribution, UI improvements, and performance optimizations
- PyInstaller frozen sidecar: spec file, build script, and ffmpeg path resolver
  for self-contained distribution without Python prerequisites
- Dual-mode sidecar launcher: frozen binary (production) with dev mode fallback
- Parallel transcription + diarization pipeline (~30-40% faster)
- GPU auto-detection for diarization (CUDA when available)
- Async run_pipeline command for real-time progress event delivery
- Web Audio API backend for instant playback and seeking
- OpenAI-compatible provider replacing LiteLLM client-side routing
- Cross-platform RAM detection (Linux/macOS/Windows)
- Settings: speaker count hint, token reveal toggles, dark dropdown styling
- Loading splash screen, flexbox layout fix for viewport overflow
- Gitea Actions CI/CD pipeline (Linux, Windows, macOS ARM)
- Updated README and CLAUDE.md documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 21:33:43 -07:00
Claude
42ccd3e21d Fix cargo fmt formatting and diarize threading test mock 2026-03-20 13:56:32 -07:00
Claude
c3b6ad38fd Merge perf/stream-segments: streaming partial transcript segments and speaker updates 2026-03-20 13:51:51 -07:00
67ed69df00 Stream transcript segments to frontend as they are transcribed
Send each segment to the frontend immediately after transcription via
a new pipeline.segment IPC message, then send speaker assignments as a
batch pipeline.speaker_update message after diarization completes. This
lets the UI display segments progressively instead of waiting for the
entire pipeline to finish.

Changes:
- Add partial_segment_message and speaker_update_message IPC factories
- Add on_segment callback parameter to TranscribeService.transcribe()
- Emit partial segments and speaker updates from PipelineService.run()
- Add send_and_receive_with_progress to SidecarManager (Rust)
- Route pipeline.segment/speaker_update events in run_pipeline command
- Listen for streaming events in Svelte frontend (+page.svelte)
- Add tests for new message types, callback signature, and update logic

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:47:57 -07:00
a3612c986d Add Test & Download button for diarization model, clickable links
- Add diarize.download IPC handler that downloads the pyannote model
  and returns user-friendly error messages (missing license, bad token)
- Add download_diarize_model Tauri command
- Add "Test & Download Model" button in Speakers settings tab
- Update instructions to list both required model licenses
  (speaker-diarization-3.1 AND segmentation-3.0)
- Make all HuggingFace URLs clickable (opens in system browser)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 18:21:42 -08:00
baf820286f Add HuggingFace token setting for speaker detection
- Add "Speakers" tab in Settings with HF token input field
- Include step-by-step instructions for obtaining the token
- Pass hf_token from settings through Rust → Python pipeline → diarize
- Token can also be set via HF_TOKEN environment variable as fallback
- Move skip_diarization checkbox to Speakers tab

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 18:08:51 -08:00
4d7b9d524f Fix IPC stdout corruption, dark window background, overlay timing
- Redirect sys.stdout to stderr in Python sidecar so library print()
  calls don't corrupt the JSON-line IPC stream
- Save real stdout fd for exclusive IPC use via init_ipc()
- Skip non-JSON lines in Rust reader instead of failing with parse error
- Set Tauri window background color to match dark theme (#0a0a23)
- Add inline dark background on html/body to prevent white flash
- Use Svelte tick() to ensure progress overlay renders before invoke
- Improve ProgressOverlay with spinner, better styling, z-index 9999

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 17:50:55 -08:00
669d88f143 Fix progress feedback, diarization fallback, and dropdown readability
- Stream pipeline progress to frontend via Tauri events so the progress
  overlay updates in real time during transcription/diarization
- Gracefully fall back to transcription-only when diarization fails
  (e.g. pyannote not installed) instead of erroring the whole pipeline
- Add color-scheme: dark to fix native select/option elements rendering
  with unreadable white backgrounds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 17:14:25 -08:00
d00281f0c7 Fix critical integration issues for end-to-end functionality
- Rewrite SidecarManager as singleton with OnceLock, reusing one Python
  process across all commands instead of spawning per call
- Separate stdin/stdout ownership with dedicated BufReader to prevent
  data corruption between wait_for_ready and send_and_receive
- Add ensure_running() for auto-start on first command
- Fix asset protocol URL: use convertFileSrc() instead of manual
  encodeURIComponent which broke file paths with slashes
- Add +layout.svelte with global dark theme, CSS reset, and custom
  scrollbar styling to prevent white flash on startup
- Register AppState with Tauri .manage(), initialize SQLite database
  on app startup at ~/.voicetonotes/voice_to_notes.db
- Wire project commands (create/get/list) to real database queries
  instead of placeholder stubs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:50:14 -08:00
97a1a15755 Phase 6: Llama-server manager, settings UI, packaging, and polish
- Implement LlamaManager in Rust for llama-server lifecycle: spawn with
  port allocation, health check, clean shutdown on Drop, model listing
- Add llama_start/stop/status/list_models Tauri commands
- Add load_settings/save_settings commands with JSON persistence
- Build SettingsModal with tabs for Transcription, AI Provider, Local AI
  settings (model size, device, language, API keys, provider selection)
- Wire settings into pipeline calls (model, device, language, skip diarization)
- Configure Tauri packaging: asset protocol for local audio files,
  CSP policy, bundle metadata, Linux .deb/.AppImage and Windows .msi config
- Add keyboard shortcuts: Space (play/pause), Ctrl+O (import),
  Ctrl+, (settings), Escape (close menus/modals)
- Close export dropdown on outside click
- Tests: 30 Python, 6 Rust, 0 Svelte errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:38:23 -08:00
d67625cd5a Phase 5: AI provider system with local and cloud support
- Implement AIProvider base interface with chat() and is_available()
- Add LocalProvider connecting to bundled llama-server via OpenAI SDK
- Add OpenAIProvider for direct OpenAI API access
- Add AnthropicProvider for Anthropic Claude API
- Add LiteLLMProvider for multi-provider gateway
- Build AIProviderService with provider routing, auto-selection,
  and transcript context injection
- Add ai.chat IPC handler supporting chat, list_providers, set_provider,
  and configure actions
- Add ai_chat, ai_list_providers, ai_configure Tauri commands
- Build interactive AIChatPanel with message history, quick actions
  (Summarize, Action Items), and transcript context awareness
- Tests: 30 Python, 6 Rust, 0 Svelte errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:25:10 -08:00
415a648a2b Phase 4: Export to SRT, WebVTT, ASS, plain text, and Markdown
- Implement ExportService using pysubs2 for caption formats (SRT, VTT, ASS)
  and custom formatters for plain text and Markdown
- SRT exports with [Speaker]: prefix, WebVTT with <v Speaker> voice tags,
  ASS with color-coded speaker styles
- Plain text groups by speaker with labels, Markdown adds timestamps
- Add export.start IPC handler and export_transcript Tauri command
- Add export dropdown menu in header (appears after transcription)
- Uses native save dialog for output file selection
- Add pysubs2 dependency
- Tests: 30 Python (6 export tests), 6 Rust, 0 Svelte errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:18:54 -08:00
44480906a4 Phase 3: Speaker diarization and full transcription pipeline
- Implement DiarizeService with pyannote.audio speaker detection
- Build PipelineService combining transcribe → diarize → merge with
  overlap-based speaker assignment per segment
- Add pipeline.start and diarize.start IPC handlers
- Add run_pipeline Tauri command for full pipeline execution
- Wire frontend to use pipeline: speakers auto-created with colors,
  segments assigned to detected speakers
- Build SpeakerManager with rename support (double-click or edit button)
- Add speaker color coding throughout transcript display
- Add pyannote.audio dependency
- Tests: 24 Python (including merge logic), 6 Rust, 0 Svelte errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:09:48 -08:00
842f8d5f90 Add auto-scroll, file dialog, and transcript editing
- Auto-scroll transcript to active segment during playback with smart
  pause when user manually scrolls (resumes after 3s)
- Replace prompt() with native Tauri file dialog for audio/video import
  with file type filters
- Add inline transcript editing via double-click with Enter to save,
  Esc to cancel, preserving original text for change tracking
- Show "edited" badge on modified segments

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:02:27 -08:00
48fe41b064 Phase 2: Core transcription pipeline and audio playback
- Implement faster-whisper TranscribeService with word-level timestamps,
  progress reporting, and hardware auto-detection
- Wire up Rust SidecarManager for Python process lifecycle (spawn, IPC, shutdown)
- Add transcribe_file Tauri command bridging frontend to Python sidecar
- Integrate wavesurfer.js WaveformPlayer with play/pause, skip, seek controls
- Build TranscriptEditor with word-level click-to-seek and active highlighting
- Connect file import flow: prompt → asset load → transcribe → display
- Add typed tauri-bridge service with TranscriptionResult interface
- Add Python tests for hardware detection and transcription result formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 15:53:09 -08:00
503cc6c0cf Phase 1 foundation: Tauri shell, Python sidecar, SQLite database
Tauri v2 + Svelte + TypeScript frontend:
- App shell with workspace layout (waveform, transcript, speakers, AI chat)
- Placeholder components for all major UI areas
- Typed stores (project, transcript, playback, AI)
- TypeScript interfaces matching the database schema
- Tauri bridge service with typed invoke wrappers
- svelte-check passes with 0 errors

Rust backend:
- Tauri v2 app entry point with command registration
- SQLite database layer (rusqlite with bundled SQLite)
  - Full schema: projects, media_files, speakers, segments, words,
    ai_outputs, annotations (with indexes)
  - Model structs with serde serialization
  - CRUD queries for projects, speakers, segments, words
  - Segment text editing preserves original text
  - Schema versioning for future migrations
  - 6 tests passing
- Command stubs for project, transcribe, export, AI, settings, system
- App state management

Python sidecar:
- JSON-line IPC protocol (stdin/stdout)
- Message types: IPCMessage, progress, error, ready
- Handler registry with routing and error handling
- Ping/pong handler for connectivity testing
- Service stubs: transcribe, diarize, pipeline, AI, export
- Provider stubs: local (llama-server), OpenAI, Anthropic, LiteLLM
- Hardware detection stubs
- 14 tests passing, ruff clean

Also adds:
- Testing strategy document (docs/TESTING.md)
- Validation script (scripts/validate.sh)
- Updated .gitignore for Svelte, Rust, Python artifacts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 15:16:06 -08:00