Commit Graph

148 Commits

Author SHA1 Message Date
cc24511b87 Merge branch 'main' into perf/pipeline-improvements
2026-03-21 05:44:38 +00:00
Claude
23c013bbac Fix macOS CI and remove duplicate GitHub Actions workflow
- Set AGENT_TOOLSDIRECTORY via step-level env on setup-python (not
  GITHUB_ENV which only applies to subsequent steps)
- Use runner.temp for toolcache dir (always writable, no sudo needed)
- Remove .github/workflows/build.yml to prevent duplicate CI runs
- Remove unused Windows env check step

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:42:24 -07:00
34797b295d Merge pull request 'perf/pipeline-improvements' (#3) from perf/pipeline-improvements into main
Reviewed-on: #3
2026-03-21 05:40:03 +00:00
Claude
666e6c5b25 Merge remote-tracking branch 'origin/main' into perf/pipeline-improvements
2026-03-20 22:39:19 -07:00
Claude
c8754076f4 Fix macOS CI: use workspace dir for Python toolcache instead of sudo
Set AGENT_TOOLSDIRECTORY to a workspace-local path so setup-python
doesn't need /Users/runner or sudo access.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:34:11 -07:00
f674b63ca9 Merge pull request 'perf/pipeline-improvements' (#2) from perf/pipeline-improvements into main
Reviewed-on: #2
2026-03-21 05:29:36 +00:00
66a9033a64 Merge branch 'main' into perf/pipeline-improvements
2026-03-21 05:29:29 +00:00
Claude
3a97aa2831 Remove redundant title from app header
The app name is already in the window title bar, so the in-header
"Voice to Notes" heading was redundant and had poor contrast.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:28:49 -07:00
Claude
882aa147c7 Smart word timing redistribution on transcript edits
When editing a segment, word timing is now intelligently redistributed:
- Spelling fixes (same word count): each word keeps its original timing
- Word splits (e.g. "gonna" → "going to"): original word's time range
  is divided proportionally across the new words
- Inserted words: timing interpolated from neighboring words
- Deleted words: remaining words keep their timing, gaps collapse

This preserves click-to-seek accuracy for common edits like fixing
misheard words or splitting concatenated words.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:23:39 -07:00
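The redistribution rules listed in commit 882aa147c7 can be sketched as follows. This is a minimal Python illustration, not the project's frontend code: the function names `redistribute` and `_spread`, the word dictionary shape (`word`, `start`, `end`), the use of `difflib` for alignment, and the choice to divide time by character length are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def _spread(tokens, start, end):
    """Divide [start, end] across tokens in proportion to their length."""
    total = sum(len(t) for t in tokens) or 1
    out, cursor = [], start
    for token in tokens:
        span = (end - start) * len(token) / total
        out.append({"word": token, "start": cursor, "end": cursor + span})
        cursor += span
    return out


def redistribute(old_words, new_text):
    """Return new word timings for an edited segment (hypothetical helper)."""
    new_tokens = new_text.split()
    old_tokens = [w["word"] for w in old_words]

    # Same word count: treat the edit as spelling fixes, keep every timing.
    if len(new_tokens) == len(old_tokens):
        return [
            {"word": t, "start": w["start"], "end": w["end"]}
            for t, w in zip(new_tokens, old_words)
        ]

    result = []
    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged words keep their original timing.
            result.extend(dict(w) for w in old_words[i1:i2])
        elif tag == "delete":
            # Deleted words are dropped; neighbors keep their timing.
            continue
        elif tag == "replace":
            # Split or merge: reuse the replaced words' combined time range.
            start, end = old_words[i1]["start"], old_words[i2 - 1]["end"]
            result.extend(_spread(new_tokens[j1:j2], start, end))
        else:  # "insert": place new words in the gap between neighbors
            if i1 > 0:
                start = old_words[i1 - 1]["end"]
            else:
                start = old_words[0]["start"] if old_words else 0.0
            end = old_words[i1]["start"] if i1 < len(old_words) else start
            result.extend(_spread(new_tokens[j1:j2], start, end))
    return result
```

With this sketch, editing "I'm gonna go" to "I'm going to go" leaves "I'm" and "go" untouched and splits the time range of "gonna" between "going" and "to".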
Claude
67fc23e8aa Preserve word-level timing on spelling edits
When the edited text has the same word count as the original (e.g. fixing
"wisper" to "Whisper"), each word keeps its original start/end timestamps.
Only falls back to segment-level timing when words are added or removed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:21:52 -07:00
Claude
727107323c Fix transcript text edit not showing after Enter
The display renders segment.words (not segment.text), so editing the text
field alone had no visible effect. Now finishEditing() rebuilds the words
array from the edited text so the change is immediately visible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:20:20 -07:00
Claude
27b705b5b6 File-based project save/load, AI chat formatting, text edit fix
Project files (.vtn):
- Save Project: serializes transcript, speakers, audio path to JSON file
- Open Project: loads .vtn file, restores audio/transcript/speakers
- User chooses filename and location via save dialog
- Replaces SQLite-based project persistence (DB commands remain for future use)
- Text edits update in-memory store immediately, persist on explicit save
- Fix Windows path separator in project name extraction

AI chat:
- Markdown rendering in assistant messages (headers, lists, bold, code)
- Better visual distinction with border-left accents
- Styled markdown elements for dark theme

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:17:35 -07:00
Claude
8e7d21d22b Persist segment text edits to database on Enter
- Add update_segment Tauri command (calls existing update_segment_text query)
- Wire onTextEdit handler from TranscriptEditor to invoke update_segment
- Edits are saved to SQLite immediately when user presses Enter

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:08:52 -07:00
Claude
61caa07e4c Add project save/load and improve AI chat formatting
Project persistence:
- save_project_transcript command: persists segments, speakers, words to SQLite
- load_project_transcript command: loads full transcript with nested words
- delete_project command: soft-delete projects
- Auto-save after pipeline completes (named from filename)
- Project dropdown in header to switch between saved transcripts
- Projects load audio, segments, and speakers from database

AI chat improvements:
- Markdown rendering in assistant messages (headers, lists, bold, italic, code)
- Better message spacing and visual distinction (border-left accents)
- Styled markdown elements matching dark theme
- Improved empty state and quick action button sizing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:06:29 -07:00
Claude
331003d1c9 Fix CI: downgrade artifact actions to v3 for Gitea compatibility
upload-artifact@v4 and download-artifact@v4 require GitHub's backend
and are not supported on Gitea. v3 works with Gitea Actions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 21:58:29 -07:00
Claude
4a4402f71a Fix CI: macOS Python toolcache permissions, Windows pip invocation
- Create /Users/runner directory on macOS before setup-python (permission fix)
- Use `python -m pip` everywhere instead of calling pip directly (Windows fix)
- Refactor build_sidecar.py to use pip_install() helper via python -m pip

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 21:56:25 -07:00
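Commit 4a4402f71a mentions a `pip_install()` helper that goes through `python -m pip`. The source does not show its body, so the following is only a guess at its shape; `pip_command` is a hypothetical name added here so the command can be inspected without running pip.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip invocation that runs pip as a module of this interpreter."""
    # Using sys.executable avoids relying on a `pip` script being on PATH,
    # which is the kind of failure the commit describes on Windows.
    return [sys.executable, "-m", "pip", *args]


def pip_install(*packages):
    """Install packages via `python -m pip install` (assumed behavior)."""
    subprocess.check_call(pip_command("install", *packages))
```

Calling `pip_install("wheel")` would then run the current interpreter with `-m pip install wheel`.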
Claude
3e7671453a Ensure required tools are installed on all CI runners
- pip/setuptools/wheel for sidecar build step
- jq/curl for release API calls
- create-dmg for macOS bundling
- Linux system deps (gtk, webkit, patchelf)
- Validation check on release creation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 21:53:58 -07:00
d8ed77b786 Merge pull request 'perf/pipeline-improvements' (#1) from perf/pipeline-improvements into main
Reviewed-on: #1
2026-03-21 04:53:45 +00:00
Claude
09d7c2064f Add release job to Gitea Actions workflow
Creates a pre-release with all platform artifacts on every push to main.
Uses BUILD_TOKEN secret for Gitea API authentication.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 21:52:21 -07:00
Claude
58faa83cb3 Cross-platform distribution, UI improvements, and performance optimizations
- PyInstaller frozen sidecar: spec file, build script, and ffmpeg path resolver
  for self-contained distribution without Python prerequisites
- Dual-mode sidecar launcher: frozen binary (production) with dev mode fallback
- Parallel transcription + diarization pipeline (~30-40% faster)
- GPU auto-detection for diarization (CUDA when available)
- Async run_pipeline command for real-time progress event delivery
- Web Audio API backend for instant playback and seeking
- OpenAI-compatible provider replacing LiteLLM client-side routing
- Cross-platform RAM detection (Linux/macOS/Windows)
- Settings: speaker count hint, token reveal toggles, dark dropdown styling
- Loading splash screen, flexbox layout fix for viewport overflow
- Gitea Actions CI/CD pipeline (Linux, Windows, macOS ARM)
- Updated README and CLAUDE.md documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 21:33:43 -07:00
Claude
42ccd3e21d Fix cargo fmt formatting and diarize threading test mock
2026-03-20 13:56:32 -07:00
Claude
0771508203 Merge perf/chunked-transcription: chunk-based processing for large files
2026-03-20 13:54:14 -07:00
Claude
c23b9a90dd Merge perf/diarize-threading: diarization progress via background thread
2026-03-20 13:52:59 -07:00
Claude
35af6e9e0c Merge perf/progress-every-segment: emit progress for every segment
2026-03-20 13:52:18 -07:00
Claude
c3b6ad38fd Merge perf/stream-segments: streaming partial transcript segments and speaker updates
2026-03-20 13:51:51 -07:00
Claude
03af5a189c Run pyannote diarization in background thread with progress reporting
Move the blocking pipeline() call to a daemon thread and emit estimated
progress messages every 2 seconds from the main thread. The progress
estimate uses audio duration to calibrate the expected total time.
Also pass audio_duration_sec from PipelineService to DiarizeService.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:50:57 -07:00
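The approach in commit 03af5a189c, a blocking call on a daemon thread with estimated progress from the main thread, might look roughly like this. The name `run_with_progress`, its parameters, and the 99% cap are assumptions; how the project derives the expected total time from the audio duration is not shown in the source, so `expected_sec` is simply passed in.

```python
import threading
import time


def run_with_progress(blocking_fn, expected_sec, on_progress, interval_sec=2.0):
    """Run blocking_fn on a daemon thread and report estimated progress."""
    result = {}

    def worker():
        try:
            result["value"] = blocking_fn()
        except Exception as exc:  # re-raised on the calling thread below
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    started = time.monotonic()
    thread.start()

    while True:
        # Wait up to one interval, then check whether the work is done.
        thread.join(timeout=interval_sec)
        if not thread.is_alive():
            break
        elapsed = time.monotonic() - started
        # The estimate is capped at 99 so it never claims completion early.
        percent = int(100 * elapsed / max(expected_sec, 1e-6))
        on_progress(min(99, percent))

    if "error" in result:
        raise result["error"]
    return result["value"]
```

The progress values are estimates only: they track elapsed time against the expected duration, not the actual state of the blocking call.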
Claude
16f4b57771 Add chunked transcription for large audio files (>1 hour)
Split files >1 hour into 5-minute chunks via ffmpeg, transcribe each
chunk independently, then merge results with corrected timestamps.
Also add chunk-level progress markers every 10 segments for all files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:49:20 -07:00
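The timestamp correction described in commit 16f4b57771 can be illustrated with a short sketch. It assumes each chunk is transcribed with times relative to the chunk start, so merging adds the chunk's offset back; the function names and the segment dictionary shape are hypothetical, and the ffmpeg splitting step is not shown.

```python
def chunk_bounds(duration_sec, chunk_sec=300.0):
    """Return (start, end) pairs covering the file in fixed-size chunks."""
    bounds, start = [], 0.0
    while start < duration_sec:
        bounds.append((start, min(start + chunk_sec, duration_sec)))
        start += chunk_sec
    return bounds


def merge_chunks(chunk_results):
    """Merge per-chunk segments into one list with file-relative times.

    chunk_results is a list of (chunk_start_sec, segments) pairs, where
    each segment's start/end is relative to the start of its chunk.
    """
    merged = []
    for offset, segments in chunk_results:
        for seg in segments:
            merged.append(
                {**seg, "start": seg["start"] + offset, "end": seg["end"] + offset}
            )
    return merged
```

A segment that starts 0.5 s into the second 5-minute chunk therefore ends up at 300.5 s in the merged transcript.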
Claude
6eb13bce63 Remove progress throttle so every segment emits a progress update
Previously, progress messages were only sent every 5th segment due to
a `segment_count % 5` guard. This made the UI feel unresponsive for
short recordings with few segments. Now every segment emits a progress
update with a more descriptive message including the segment number
and audio percentage.

Adds a test verifying that all 8 mock segments produce progress
messages, not just every 5th.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:49:14 -07:00
67ed69df00 Stream transcript segments to frontend as they are transcribed
Send each segment to the frontend immediately after transcription via
a new pipeline.segment IPC message, then send speaker assignments as a
batch pipeline.speaker_update message after diarization completes. This
lets the UI display segments progressively instead of waiting for the
entire pipeline to finish.

Changes:
- Add partial_segment_message and speaker_update_message IPC factories
- Add on_segment callback parameter to TranscribeService.transcribe()
- Emit partial segments and speaker updates from PipelineService.run()
- Add send_and_receive_with_progress to SidecarManager (Rust)
- Route pipeline.segment/speaker_update events in run_pipeline command
- Listen for streaming events in Svelte frontend (+page.svelte)
- Add tests for new message types, callback signature, and update logic

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:47:57 -07:00
585411f402 Fix speaker diarization: WAV conversion, pyannote 4.0 compat, telemetry bug
- Convert non-WAV audio to 16kHz mono WAV before diarization (pyannote
  v4.0.4 AudioDecoder returns None duration for FLAC, causing crash)
- Handle pyannote 4.0 DiarizeOutput return type (unwrap .speaker_diarization)
- Disable pyannote telemetry (np.isfinite(None) bug with max_speakers)
- Use huggingface_hub.login() to persist token for all sub-downloads
- Pre-download sub-models (segmentation-3.0, speaker-diarization-community-1)
- Add third required model license link in settings UI
- Improve SpeakerManager hints based on settings state
- Add word-wrap to transcript text

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 19:46:07 -08:00
a3612c986d Add Test & Download button for diarization model, clickable links
- Add diarize.download IPC handler that downloads the pyannote model
  and returns user-friendly error messages (missing license, bad token)
- Add download_diarize_model Tauri command
- Add "Test & Download Model" button in Speakers settings tab
- Update instructions to list both required model licenses
  (speaker-diarization-3.1 AND segmentation-3.0)
- Make all HuggingFace URLs clickable (opens in system browser)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 18:21:42 -08:00
baf820286f Add HuggingFace token setting for speaker detection
- Add "Speakers" tab in Settings with HF token input field
- Include step-by-step instructions for obtaining the token
- Pass hf_token from settings through Rust → Python pipeline → diarize
- Token can also be set via HF_TOKEN environment variable as fallback
- Move skip_diarization checkbox to Speakers tab

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 18:08:51 -08:00
ed626b8ba0 Fix progress overlay, play-from-position, layout cutoff, speaker info
- Replace progress bar with task checklist showing pipeline steps
  (load model, transcribe, load diarization, identify speakers, merge)
- Fix WaveformPlayer: track ready state, disable controls until loaded,
  play from current position instead of resetting to start
- Fix workspace height calc to prevent bottom content cutoff
- Show HF_TOKEN setup hint in SpeakerManager when no speakers detected
- Add console logging for progress events to aid debugging

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 18:02:48 -08:00
4d7b9d524f Fix IPC stdout corruption, dark window background, overlay timing
- Redirect sys.stdout to stderr in Python sidecar so library print()
  calls don't corrupt the JSON-line IPC stream
- Save real stdout fd for exclusive IPC use via init_ipc()
- Skip non-JSON lines in Rust reader instead of failing with parse error
- Set Tauri window background color to match dark theme (#0a0a23)
- Add inline dark background on html/body to prevent white flash
- Use Svelte tick() to ensure progress overlay renders before invoke
- Improve ProgressOverlay with spinner, better styling, z-index 9999

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 17:50:55 -08:00
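The stdout fix in commit 4d7b9d524f can be sketched in a few lines. This version keeps a reference to the original stream object instead of saving the file descriptor as the commit describes, and the `IpcWriter` class and `read_messages` function are hypothetical names. The tolerant reader lives in Rust in the project; it is shown in Python here only to keep one language.

```python
import json
import sys


class IpcWriter:
    """Writes JSON lines to the stream reserved for IPC."""

    def __init__(self, stream):
        self._stream = stream

    def send(self, message):
        self._stream.write(json.dumps(message) + "\n")
        self._stream.flush()


def init_ipc():
    """Reserve the real stdout for IPC and divert print() to stderr."""
    writer = IpcWriter(sys.stdout)
    sys.stdout = sys.stderr  # stray library print() calls now go to stderr
    return writer


def read_messages(lines):
    """Reader side: yield parsed JSON messages, skipping non-JSON lines."""
    for line in lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue
```

The two halves are complementary: the sidecar stops stray output from reaching the IPC stream, and the reader tolerates anything that slips through anyway.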
87b3ad94f9 Improve import UX: progress overlay, pyannote fix, debug logging
- Enhanced ProgressOverlay with spinner, better styling, and z-index 9999
- Import button shows "Processing..." with pulse animation while transcribing
- Fix pyannote API: use token= instead of deprecated use_auth_token=
- Read HF_TOKEN from environment for pyannote model download
- Add console logging for click-to-seek debugging
- Add color-scheme: dark for native form controls

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 17:43:49 -08:00
669d88f143 Fix progress feedback, diarization fallback, and dropdown readability
- Stream pipeline progress to frontend via Tauri events so the progress
  overlay updates in real time during transcription/diarization
- Gracefully fall back to transcription-only when diarization fails
  (e.g. pyannote not installed) instead of erroring the whole pipeline
- Add color-scheme: dark to fix native select/option elements rendering
  with unreadable white backgrounds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 17:14:25 -08:00
d00281f0c7 Fix critical integration issues for end-to-end functionality
- Rewrite SidecarManager as singleton with OnceLock, reusing one Python
  process across all commands instead of spawning per call
- Separate stdin/stdout ownership with dedicated BufReader to prevent
  data corruption between wait_for_ready and send_and_receive
- Add ensure_running() for auto-start on first command
- Fix asset protocol URL: use convertFileSrc() instead of manual
  encodeURIComponent which broke file paths with slashes
- Add +layout.svelte with global dark theme, CSS reset, and custom
  scrollbar styling to prevent white flash on startup
- Register AppState with Tauri .manage(), initialize SQLite database
  on app startup at ~/.voicetonotes/voice_to_notes.db
- Wire project commands (create/get/list) to real database queries
  instead of placeholder stubs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:50:14 -08:00
d3c2954c5e Add STT and diarization research report
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:44:58 -08:00
97a1a15755 Phase 6: Llama-server manager, settings UI, packaging, and polish
- Implement LlamaManager in Rust for llama-server lifecycle: spawn with
  port allocation, health check, clean shutdown on Drop, model listing
- Add llama_start/stop/status/list_models Tauri commands
- Add load_settings/save_settings commands with JSON persistence
- Build SettingsModal with tabs for Transcription, AI Provider, Local AI
  settings (model size, device, language, API keys, provider selection)
- Wire settings into pipeline calls (model, device, language, skip diarization)
- Configure Tauri packaging: asset protocol for local audio files,
  CSP policy, bundle metadata, Linux .deb/.AppImage and Windows .msi config
- Add keyboard shortcuts: Space (play/pause), Ctrl+O (import),
  Ctrl+, (settings), Escape (close menus/modals)
- Close export dropdown on outside click
- Tests: 30 Python, 6 Rust, 0 Svelte errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:38:23 -08:00
d67625cd5a Phase 5: AI provider system with local and cloud support
- Implement AIProvider base interface with chat() and is_available()
- Add LocalProvider connecting to bundled llama-server via OpenAI SDK
- Add OpenAIProvider for direct OpenAI API access
- Add AnthropicProvider for Anthropic Claude API
- Add LiteLLMProvider for multi-provider gateway
- Build AIProviderService with provider routing, auto-selection,
  and transcript context injection
- Add ai.chat IPC handler supporting chat, list_providers, set_provider,
  and configure actions
- Add ai_chat, ai_list_providers, ai_configure Tauri commands
- Build interactive AIChatPanel with message history, quick actions
  (Summarize, Action Items), and transcript context awareness
- Tests: 30 Python, 6 Rust, 0 Svelte errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:25:10 -08:00
415a648a2b Phase 4: Export to SRT, WebVTT, ASS, plain text, and Markdown
- Implement ExportService using pysubs2 for caption formats (SRT, VTT, ASS)
  and custom formatters for plain text and Markdown
- SRT exports with [Speaker]: prefix, WebVTT with <v Speaker> voice tags,
  ASS with color-coded speaker styles
- Plain text groups by speaker with labels, Markdown adds timestamps
- Add export.start IPC handler and export_transcript Tauri command
- Add export dropdown menu in header (appears after transcription)
- Uses native save dialog for output file selection
- Add pysubs2 dependency
- Tests: 30 Python (6 export tests), 6 Rust, 0 Svelte errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:18:54 -08:00
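To show what the SRT output with a `[Speaker]:` prefix from commit 415a648a2b looks like, here is a plain-Python formatter. The project uses pysubs2 for caption formats; this sketch deliberately does not, and the function names and segment fields (`start`, `end`, `speaker`, `text`) are assumptions.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render segments as SRT cues with a [Speaker]: prefix on each line."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"[{seg['speaker']}]: {seg['text']}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Each cue consists of a running index, a time range, and the prefixed text.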
44480906a4 Phase 3: Speaker diarization and full transcription pipeline
- Implement DiarizeService with pyannote.audio speaker detection
- Build PipelineService combining transcribe → diarize → merge with
  overlap-based speaker assignment per segment
- Add pipeline.start and diarize.start IPC handlers
- Add run_pipeline Tauri command for full pipeline execution
- Wire frontend to use pipeline: speakers auto-created with colors,
  segments assigned to detected speakers
- Build SpeakerManager with rename support (double-click or edit button)
- Add speaker color coding throughout transcript display
- Add pyannote.audio dependency
- Tests: 24 Python (including merge logic), 6 Rust, 0 Svelte errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:09:48 -08:00
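The overlap-based speaker assignment mentioned in commit 44480906a4 can be sketched as picking, for each transcript segment, the speaker whose diarization turns overlap it the most. The function name and the data shapes below are assumptions, not the project's `PipelineService` code.

```python
def assign_speakers(segments, turns):
    """Label each segment with the speaker that overlaps it the longest.

    segments: list of dicts with "start" and "end" in seconds.
    turns: list of (start, end, speaker) tuples from diarization.
    """
    labeled = []
    for seg in segments:
        totals = {}
        for t_start, t_end, speaker in turns:
            # Length of the intersection of the two time intervals.
            overlap = min(seg["end"], t_end) - max(seg["start"], t_start)
            if overlap > 0:
                totals[speaker] = totals.get(speaker, 0.0) + overlap
        # Segments with no overlapping turn stay unassigned (None).
        best = max(totals, key=totals.get) if totals else None
        labeled.append({**seg, "speaker": best})
    return labeled
```

Summing overlap per speaker, rather than taking the single longest turn, keeps the result stable when diarization splits one speaker into several short turns.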
842f8d5f90 Add auto-scroll, file dialog, and transcript editing
- Auto-scroll transcript to active segment during playback with smart
  pause when user manually scrolls (resumes after 3s)
- Replace prompt() with native Tauri file dialog for audio/video import
  with file type filters
- Add inline transcript editing via double-click with Enter to save,
  Esc to cancel, preserving original text for change tracking
- Show "edited" badge on modified segments

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:02:27 -08:00
48fe41b064 Phase 2: Core transcription pipeline and audio playback
- Implement faster-whisper TranscribeService with word-level timestamps,
  progress reporting, and hardware auto-detection
- Wire up Rust SidecarManager for Python process lifecycle (spawn, IPC, shutdown)
- Add transcribe_file Tauri command bridging frontend to Python sidecar
- Integrate wavesurfer.js WaveformPlayer with play/pause, skip, seek controls
- Build TranscriptEditor with word-level click-to-seek and active highlighting
- Connect file import flow: prompt → asset load → transcribe → display
- Add typed tauri-bridge service with TranscriptionResult interface
- Add Python tests for hardware detection and transcription result formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 15:53:09 -08:00
503cc6c0cf Phase 1 foundation: Tauri shell, Python sidecar, SQLite database
Tauri v2 + Svelte + TypeScript frontend:
- App shell with workspace layout (waveform, transcript, speakers, AI chat)
- Placeholder components for all major UI areas
- Typed stores (project, transcript, playback, AI)
- TypeScript interfaces matching the database schema
- Tauri bridge service with typed invoke wrappers
- svelte-check passes with 0 errors

Rust backend:
- Tauri v2 app entry point with command registration
- SQLite database layer (rusqlite with bundled SQLite)
  - Full schema: projects, media_files, speakers, segments, words,
    ai_outputs, annotations (with indexes)
  - Model structs with serde serialization
  - CRUD queries for projects, speakers, segments, words
  - Segment text editing preserves original text
  - Schema versioning for future migrations
  - 6 tests passing
- Command stubs for project, transcribe, export, AI, settings, system
- App state management

Python sidecar:
- JSON-line IPC protocol (stdin/stdout)
- Message types: IPCMessage, progress, error, ready
- Handler registry with routing and error handling
- Ping/pong handler for connectivity testing
- Service stubs: transcribe, diarize, pipeline, AI, export
- Provider stubs: local (llama-server), OpenAI, Anthropic, LiteLLM
- Hardware detection stubs
- 14 tests passing, ruff clean

Also adds:
- Testing strategy document (docs/TESTING.md)
- Validation script (scripts/validate.sh)
- Updated .gitignore for Svelte, Rust, Python artifacts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 15:16:06 -08:00
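The sidecar pieces listed in commit 503cc6c0cf (JSON-line protocol, handler registry with routing and error handling, ping/pong) fit together roughly as in this sketch. The message fields `type` and `payload`, the decorator, and the `dispatch` function are assumptions for illustration; the project's actual `IPCMessage` layout is not shown in the source.

```python
import json

HANDLERS = {}


def handler(msg_type):
    """Decorator that registers a function for one message type."""
    def register(fn):
        HANDLERS[msg_type] = fn
        return fn
    return register


@handler("ping")
def handle_ping(payload):
    # Connectivity check: echo the payload back as a pong.
    return {"type": "pong", "payload": payload}


def dispatch(line):
    """Parse one JSON line, route it by type, and return the reply line."""
    try:
        message = json.loads(line)
        fn = HANDLERS[message["type"]]
        reply = fn(message.get("payload", {}))
    except Exception as exc:  # bad JSON, unknown type, or handler failure
        reply = {"type": "error", "payload": {"message": str(exc)}}
    return json.dumps(reply)
```

A main loop would read lines from stdin, pass each to `dispatch`, and write the returned line to stdout, one JSON object per line.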
c450ef3c0c Switch local AI from Ollama to bundled llama-server, add MIT license
- Replace Ollama dependency with bundled llama-server (llama.cpp)
  so users need no separate install for local AI inference
- Rust backend manages llama-server lifecycle (spawn, port, shutdown)
- Add MIT license for open source release
- Update architecture doc, CLAUDE.md, and README accordingly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 09:00:47 -08:00
0edb06a913 Add architecture document and project guidelines
Detailed architecture covering Tauri + Svelte frontend, Rust backend,
Python sidecar for ML (faster-whisper, pyannote.audio), IPC protocol,
SQLite schema, AI provider system, export formats, and phased
implementation plan with agent work breakdown.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 08:37:45 -08:00
d2bdbe3315 Initial project setup with README and gitignore
Establish the voice-to-notes project with documentation covering
goals, platform targets, and planned feature set.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 08:11:57 -08:00