Tauri's externalBin only bundled the single sidecar executable, but
PyInstaller's onedir output requires companion DLLs and _internal/.
The binary was also renamed with a target triple suffix that
resolve_sidecar_path() didn't look for, so it fell back to
dev mode, which used a compile-time CI path (CARGO_MANIFEST_DIR).
- Switch from externalBin to bundle.resources to include all sidecar files
- Pass Tauri resource_dir to sidecar manager for platform-aware path resolution
- Remove rename_binary() since externalBin target triple naming is no longer needed
- Remove broken production-to-dev fallback that could never work on user machines
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Removes the artifact upload/download overhead between sidecar and app
build steps. Each platform now runs as a single job: build sidecar,
copy it into src-tauri/binaries, build Tauri app, upload to release.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use [System.Uri]::EscapeDataString for proper encoding of filenames
containing spaces in the Gitea API URL. Add size logging and error handling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use curl -T (streaming) instead of --data-binary (loads into memory)
to handle large .deb/.AppImage files
- URL-encode spaces in filenames for the Gitea API
- Use IFS= read -r to handle filenames with spaces
- Add HTTP status code logging for upload debugging
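For reference, the filename encoding can be sketched in Python as below. The CI script itself is shell; the helper name, the example host, and the exact asset endpoint shape are assumptions for illustration only.

```python
from urllib.parse import quote


def release_asset_upload_url(base_url: str, owner: str, repo: str,
                             release_id: int, filename: str) -> str:
    """Build an asset upload URL with the filename percent-encoded.

    Hypothetical helper: the endpoint layout is assumed, not copied from
    the workflow. safe="" makes quote() encode "/" as well as spaces.
    """
    encoded = quote(filename, safe="")
    return (f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}"
            f"/releases/{release_id}/assets?name={encoded}")


url = release_asset_upload_url(
    "https://gitea.example.com", "me", "app", 7,
    "Voice to Notes_0.1.0_amd64.deb",
)
print(url)
```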
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each platform (Linux, macOS, Windows) now has its own workflow file
that builds the sidecar, builds the Tauri app, and uploads to a shared
"latest" release independently. A failure on one platform no longer
blocks releases for the others.
- build-linux.yml: bash throughout, apt for deps
- build-macos.yml: bash throughout, brew for deps
- build-windows.yml: powershell throughout, choco for deps
- All use uv for Python, upload to shared "latest" release tag
- Each platform replaces its own artifacts on the release
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Linux: add xdg-utils to system deps (provides xdg-open needed by
Tauri's AppImage bundler)
- Windows: replace dtolnay/rust-toolchain action (uses bash internally)
with direct rustup install via PowerShell
- Unix: install Rust via rustup.rs shell script instead of GitHub action
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
pyannote.audio requires ffmpeg at import time (torchcodec loads
FFmpeg shared libraries). Install via brew (macOS), apt (Linux),
choco (Windows) before building the sidecar.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Windows runner doesn't have bash. Split Python setup and build steps
into Unix (default shell) and Windows (powershell) variants.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Pass the venv directory path to uv pip --python instead of the
python binary path (avoids "No virtual environment found" on Windows)
- Add .exe suffix to Windows python path for non-uv fallback
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- build_sidecar.py: pip_install() now includes 'install' in the command,
callers pass only package names (was doubling up as 'uv pip install install torch')
- CI: set shell: bash on uv steps so Windows doesn't try to use cmd.exe
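A minimal sketch of the corrected helper shape, assuming the command is built as a list and uv is detected on PATH. The function name pip_install_command and its parameters are hypothetical; only the "helper owns the install subcommand" rule comes from this change.

```python
import shutil


def pip_install_command(packages, python, use_uv=None):
    """Return the install command; callers pass package names only.

    `python` is the venv directory (for uv) or the interpreter path
    (for the pip fallback). `use_uv=None` means auto-detect on PATH.
    """
    if use_uv is None:
        use_uv = shutil.which("uv") is not None
    if use_uv:
        return ["uv", "pip", "install", "--python", python, *packages]
    return [python, "-m", "pip", "install", *packages]


cmd = pip_install_command(["torch"], ".venv", use_uv=True)
print(" ".join(cmd))
```

Because the helper adds "install" itself, a caller that still passes it would be easy to spot in review.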
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Unix runners use the bash install script, Windows uses the PowerShell
installer. Both check if uv is already present first.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
astral-sh/setup-uv is not available on Gitea's action registry.
Use the official install script instead, skipping if uv is already present.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CI: install uv via astral-sh/setup-uv, use uv to install Python
and run the build script (replaces setup-python which fails on
self-hosted macOS runners)
- build_sidecar.py: auto-detects uv and uses it for venv creation
and package installation (much faster), falls back to standard
venv + pip when uv is not available
- Remove .github/workflows duplicate
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Set AGENT_TOOLSDIRECTORY via step-level env on setup-python (not
GITHUB_ENV which only applies to subsequent steps)
- Use runner.temp for toolcache dir (always writable, no sudo needed)
- Remove .github/workflows/build.yml to prevent duplicate CI runs
- Remove unused Windows env check step
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Set AGENT_TOOLSDIRECTORY to a workspace-local path so setup-python
doesn't need /Users/runner or sudo access.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The app name is already in the window title bar, so the in-header
"Voice to Notes" heading was redundant and had poor contrast.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When editing a segment, word timing is now intelligently redistributed:
- Spelling fixes (same word count): each word keeps its original timing
- Word splits (e.g. "gonna" → "going to"): original word's time range
is divided proportionally across the new words
- Inserted words: timing interpolated from neighboring words
- Deleted words: remaining words keep their timing, gaps collapse
This preserves click-to-seek accuracy for common edits like fixing
misheard words or splitting concatenated words.
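The four rules above can be sketched as follows. This is an illustrative Python version, not the editor's actual code: it assumes words are dicts with word/start/end keys, aligns old and new words with difflib, and splits time in proportion to character length. Deleted words are simply dropped.

```python
from difflib import SequenceMatcher


def split_span(start, end, texts):
    """Divide [start, end] across texts in proportion to character length."""
    total = sum(len(t) for t in texts) or 1
    out, cursor = [], start
    for t in texts:
        width = (end - start) * len(t) / total
        out.append({"word": t, "start": cursor, "end": cursor + width})
        cursor += width
    return out


def redistribute_timing(old_words, new_text):
    """Assign timings to the edited text based on the original words."""
    new_tokens = new_text.split()
    old_tokens = [w["word"] for w in old_words]
    result = []
    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":
            continue  # remaining words keep their own timing
        if tag == "equal" or (tag == "replace" and i2 - i1 == j2 - j1):
            # Unchanged words and one-to-one spelling fixes keep timing.
            for old, tok in zip(old_words[i1:i2], new_tokens[j1:j2]):
                result.append({"word": tok, "start": old["start"], "end": old["end"]})
        elif tag == "replace":
            # Split or merge: share the replaced words' combined time range.
            result.extend(split_span(old_words[i1]["start"],
                                     old_words[i2 - 1]["end"],
                                     new_tokens[j1:j2]))
        else:
            # Insert: interpolate inside the gap between the neighbors.
            if not old_words:
                left = right = 0.0
            else:
                left = old_words[i1 - 1]["end"] if i1 > 0 else old_words[0]["start"]
                right = old_words[i1]["start"] if i1 < len(old_words) else old_words[-1]["end"]
            result.extend(split_span(left, right, new_tokens[j1:j2]))
    return result


old = [
    {"word": "I'm", "start": 0.0, "end": 0.4},
    {"word": "gonna", "start": 0.5, "end": 1.2},
    {"word": "go", "start": 1.3, "end": 1.6},
]
for w in redistribute_timing(old, "I'm going to go"):
    print(w)
```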
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the edited text has the same word count as the original (e.g. fixing
"wisper" to "Whisper"), each word keeps its original start/end timestamps.
Only falls back to segment-level timing when words are added or removed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The display renders segment.words (not segment.text), so editing the text
field alone had no visible effect. Now finishEditing() rebuilds the words
array from the edited text so the change is immediately visible.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Project files (.vtn):
- Save Project: serializes transcript, speakers, audio path to JSON file
- Open Project: loads .vtn file, restores audio/transcript/speakers
- User chooses filename and location via save dialog
- Replaces SQLite-based project persistence (DB commands remain for future use)
- Text edits update in-memory store immediately, persist on explicit save
- Fix Windows path separator in project name extraction
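A rough sketch of the project file round trip and the separator-safe name extraction. The app does this in its own frontend/Rust code; the JSON keys, the version field, and the function names here are assumptions.

```python
import json
from pathlib import PureWindowsPath


def project_name_from_path(path: str) -> str:
    """Return the file stem; PureWindowsPath splits on both "\\" and "/"."""
    return PureWindowsPath(path).stem


def serialize_project(audio_path, segments, speakers) -> str:
    """Serialize the in-memory project to .vtn JSON text (assumed layout)."""
    return json.dumps(
        {"version": 1, "audio_path": audio_path,
         "segments": segments, "speakers": speakers},
        indent=2,
    )


def deserialize_project(text: str):
    """Inverse of serialize_project."""
    data = json.loads(text)
    return data["audio_path"], data["segments"], data["speakers"]


print(project_name_from_path(r"C:\Users\me\Team Sync.vtn"))
print(project_name_from_path("/home/me/Team Sync.vtn"))
```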
AI chat:
- Markdown rendering in assistant messages (headers, lists, bold, code)
- Better visual distinction with border-left accents
- Styled markdown elements for dark theme
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add update_segment Tauri command (calls existing update_segment_text query)
- Wire onTextEdit handler from TranscriptEditor to invoke update_segment
- Edits are saved to SQLite immediately when user presses Enter
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Project persistence:
- save_project_transcript command: persists segments, speakers, words to SQLite
- load_project_transcript command: loads full transcript with nested words
- delete_project command: soft-delete projects
- Auto-save after pipeline completes (named from filename)
- Project dropdown in header to switch between saved transcripts
- Projects load audio, segments, and speakers from database
AI chat improvements:
- Markdown rendering in assistant messages (headers, lists, bold, italic, code)
- Better message spacing and visual distinction (border-left accents)
- Styled markdown elements matching dark theme
- Improved empty state and quick action button sizing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
upload-artifact@v4 and download-artifact@v4 require GitHub's backend
and are not supported on Gitea. v3 works with Gitea Actions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create /Users/runner directory on macOS before setup-python (permission fix)
- Use `python -m pip` everywhere instead of calling pip directly (Windows fix)
- Refactor build_sidecar.py to use pip_install() helper via python -m pip
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- pip/setuptools/wheel for sidecar build step
- jq/curl for release API calls
- create-dmg for macOS bundling
- Linux system deps (gtk, webkit, patchelf)
- Validation check on release creation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Creates a pre-release with all platform artifacts on every push to main.
Uses BUILD_TOKEN secret for Gitea API authentication.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move the blocking pipeline() call to a daemon thread and emit estimated
progress messages every 2 seconds from the main thread. The progress
estimate uses audio duration to calibrate the expected total time.
Also pass audio_duration_sec from PipelineService to DiarizeService.
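A minimal sketch of the pattern, assuming the expected total time is a fixed multiple of the audio duration. The realtime_factor value, the 95% cap, and the function name are illustrative assumptions, not the values used in DiarizeService.

```python
import threading
import time


def run_with_estimated_progress(blocking_call, audio_duration_sec, emit,
                                realtime_factor=0.5, interval_sec=2.0):
    """Run blocking_call on a daemon thread and emit estimated percentages."""
    result = {}

    def worker():
        try:
            result["value"] = blocking_call()
        except Exception as exc:  # re-raised on the main thread below
            result["error"] = exc

    expected_total = max(audio_duration_sec * realtime_factor, interval_sec)
    thread = threading.Thread(target=worker, daemon=True)
    started = time.monotonic()
    thread.start()
    while True:
        thread.join(timeout=interval_sec)
        if not thread.is_alive():
            break
        elapsed = time.monotonic() - started
        # Cap below 100 so the estimate never claims completion early.
        emit(min(95, int(100 * elapsed / expected_total)))
    if "error" in result:
        raise result["error"]
    return result["value"]


def fake_pipeline():
    time.sleep(0.25)
    return "done"


seen = []
outcome = run_with_estimated_progress(fake_pipeline, 1.0, seen.append,
                                      interval_sec=0.05)
print(outcome, seen)
```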
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split files >1 hour into 5-minute chunks via ffmpeg, transcribe each
chunk independently, then merge results with corrected timestamps.
Also add chunk-level progress markers every 10 segments for all files.
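The timestamp correction on merge can be sketched like this, assuming each chunk's segments carry start/end times relative to the chunk. Function names and the segment dict shape are hypothetical; the ffmpeg splitting step is omitted.

```python
def chunk_offsets(duration_sec, chunk_sec=300.0):
    """Start offsets (seconds) of consecutive chunks covering the file."""
    offsets, start = [], 0.0
    while start < duration_sec:
        offsets.append(start)
        start += chunk_sec
    return offsets


def merge_chunks(chunk_results):
    """Shift each chunk's segments by the chunk offset and concatenate.

    chunk_results is a list of (offset_sec, segments) pairs in file order.
    """
    merged = []
    for offset, segments in chunk_results:
        for seg in segments:
            merged.append({**seg,
                           "start": seg["start"] + offset,
                           "end": seg["end"] + offset})
    return merged


merged = merge_chunks([
    (0.0, [{"text": "a", "start": 1.0, "end": 2.0}]),
    (300.0, [{"text": "b", "start": 0.5, "end": 1.5}]),
])
print(merged)
```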
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously, progress messages were only sent every 5th segment due to
a `segment_count % 5` guard. This made the UI feel unresponsive for
short recordings with few segments. Now every segment emits a progress
update with a more descriptive message including the segment number
and audio percentage.
Adds a test verifying that all 8 mock segments produce progress
messages, not just every 5th.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Send each segment to the frontend immediately after transcription via
a new pipeline.segment IPC message, then send speaker assignments as a
batch pipeline.speaker_update message after diarization completes. This
lets the UI display segments progressively instead of waiting for the
entire pipeline to finish.
Changes:
- Add partial_segment_message and speaker_update_message IPC factories
- Add on_segment callback parameter to TranscribeService.transcribe()
- Emit partial segments and speaker updates from PipelineService.run()
- Add send_and_receive_with_progress to SidecarManager (Rust)
- Route pipeline.segment/speaker_update events in run_pipeline command
- Listen for streaming events in Svelte frontend (+page.svelte)
- Add tests for new message types, callback signature, and update logic
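The two message factories might look roughly like this. Only the type strings pipeline.segment and pipeline.speaker_update and the factory names come from this change; the id/payload envelope and field names are assumptions.

```python
import json


def partial_segment_message(request_id, index, segment):
    """One transcribed segment, sent as soon as it is available."""
    return {"id": request_id, "type": "pipeline.segment",
            "payload": {"index": index, **segment}}


def speaker_update_message(request_id, assignments):
    """Batch of speaker assignments, sent once after diarization."""
    return {"id": request_id, "type": "pipeline.speaker_update",
            "payload": {"updates": assignments}}


def to_json_line(message):
    """Encode a message as a single newline-terminated JSON line."""
    return json.dumps(message, ensure_ascii=False) + "\n"


line = to_json_line(partial_segment_message(
    "req-1", 0, {"text": "Hello", "start": 0.0, "end": 1.2}))
print(line, end="")
```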
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Convert non-WAV audio to 16kHz mono WAV before diarization (pyannote
v4.0.4 AudioDecoder returns None duration for FLAC, causing crash)
- Handle pyannote 4.0 DiarizeOutput return type (unwrap .speaker_diarization)
- Disable pyannote telemetry (np.isfinite(None) bug with max_speakers)
- Use huggingface_hub.login() to persist token for all sub-downloads
- Pre-download sub-models (segmentation-3.0, speaker-diarization-community-1)
- Add third required model license link in settings UI
- Improve SpeakerManager hints based on settings state
- Add word-wrap to transcript text
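For the first item above, a sketch of the conversion step, assuming ffmpeg is on PATH and that the decision is made purely on file extension. The helper names are hypothetical; -ar sets the sample rate and -ac the channel count.

```python
from pathlib import Path


def needs_conversion(path) -> bool:
    """True for anything that is not a .wav file (extension check only)."""
    return Path(path).suffix.lower() != ".wav"


def wav_conversion_command(src, dst):
    """ffmpeg arguments for a 16 kHz mono WAV copy of src (-y overwrites dst)."""
    return ["ffmpeg", "-y", "-i", str(src),
            "-ar", "16000", "-ac", "1", str(dst)]


print(wav_conversion_command("talk.flac", "talk.16k.wav"))
```

The resulting list can be passed to subprocess.run(..., check=True) before the file is handed to the diarization step.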
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add diarize.download IPC handler that downloads the pyannote model
and returns user-friendly error messages (missing license, bad token)
- Add download_diarize_model Tauri command
- Add "Test & Download Model" button in Speakers settings tab
- Update instructions to list both required model licenses
(speaker-diarization-3.1 AND segmentation-3.0)
- Make all HuggingFace URLs clickable (opens in system browser)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add "Speakers" tab in Settings with HF token input field
- Include step-by-step instructions for obtaining the token
- Pass hf_token from settings through Rust → Python pipeline → diarize
- Token can also be set via HF_TOKEN environment variable as fallback
- Move skip_diarization checkbox to Speakers tab
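The token lookup order can be sketched as below, assuming the settings value wins and HF_TOKEN is consulted only when it is empty. The function name and the env parameter are hypothetical.

```python
import os


def resolve_hf_token(settings_token, env=None):
    """Prefer the token from settings; fall back to HF_TOKEN; else None."""
    env = os.environ if env is None else env
    token = (settings_token or "").strip()
    return token or env.get("HF_TOKEN") or None


print(resolve_hf_token("", {"HF_TOKEN": "hf_from_env"}))
```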
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace progress bar with task checklist showing pipeline steps
(load model, transcribe, load diarization, identify speakers, merge)
- Fix WaveformPlayer: track ready state, disable controls until loaded,
play from current position instead of resetting to start
- Fix workspace height calc to prevent bottom content cutoff
- Show HF_TOKEN setup hint in SpeakerManager when no speakers detected
- Add console logging for progress events to aid debugging
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Redirect sys.stdout to stderr in Python sidecar so library print()
calls don't corrupt the JSON-line IPC stream
- Save real stdout fd for exclusive IPC use via init_ipc()
- Skip non-JSON lines in Rust reader instead of failing with parse error
- Set Tauri window background color to match dark theme (#0a0a23)
- Add inline dark background on html/body to prevent white flash
- Use Svelte tick() to ensure progress overlay renders before invoke
- Improve ProgressOverlay with spinner, better styling, z-index 9999
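A minimal sketch of the stdout split on the Python side. init_ipc() is named in this change, but its signature, the send_message helper, and the module-level handle are assumptions. Note that reassigning sys.stdout only catches Python-level print() calls, not native code writing to fd 1 directly.

```python
import json
import os
import sys

_ipc_out = None  # private handle used only for JSON-line IPC


def init_ipc(stream=None):
    """Keep a private duplicate of the real stdout, then point
    sys.stdout at stderr so stray print() calls cannot reach IPC."""
    global _ipc_out
    stream = sys.__stdout__ if stream is None else stream
    _ipc_out = os.fdopen(os.dup(stream.fileno()), "w", encoding="utf-8")
    sys.stdout = sys.stderr
    return _ipc_out


def send_message(message):
    """Write one JSON object per line to the IPC handle and flush."""
    _ipc_out.write(json.dumps(message) + "\n")
    _ipc_out.flush()
```

After init_ipc(), a library calling print("loading model...") lands on stderr, while send_message({"type": "ready"}) is the only thing written to the real stdout.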
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Enhanced ProgressOverlay with spinner, better styling, and z-index 9999
- Import button shows "Processing..." with pulse animation while transcribing
- Fix pyannote API: use token= instead of deprecated use_auth_token=
- Read HF_TOKEN from environment for pyannote model download
- Add console logging for click-to-seek debugging
- Add color-scheme: dark for native form controls
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Stream pipeline progress to frontend via Tauri events so the progress
overlay updates in real time during transcription/diarization
- Gracefully fall back to transcription-only when diarization fails
(e.g. pyannote not installed) instead of erroring the whole pipeline
- Add color-scheme: dark to fix native select/option elements rendering
with unreadable white backgrounds
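The transcription-only fallback can be sketched as follows, assuming the transcribe and diarize steps are passed in as callables. The function name, the result keys, and the warning text are hypothetical.

```python
def run_pipeline(audio_path, transcribe, diarize, warn=print):
    """Always return the transcript; attach speakers only if diarization works."""
    segments = transcribe(audio_path)
    try:
        speakers = diarize(audio_path)
    except Exception as exc:  # e.g. ImportError when pyannote is missing
        warn(f"diarization skipped: {exc}")
        return {"segments": segments, "speakers": [], "diarized": False}
    return {"segments": segments, "speakers": speakers, "diarized": True}


def broken_diarize(path):
    raise ImportError("pyannote.audio is not installed")


warnings = []
result = run_pipeline("talk.wav", lambda p: [{"text": "hi"}],
                      broken_diarize, warn=warnings.append)
print(result)
```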
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>