CSP: Add blob: to connect-src/img-src/media-src for wavesurfer.js audio
playback. Add http://tauri.localhost to default-src for devtools.
pyannote: sys.modules block didn't work — pyannote still uses AudioDecoder
unconditionally. New approach: monkey-patch Audio.__call__ in diarize.py
to use torchaudio.load() directly, bypassing the broken torchcodec path.
Patch runs once before pipeline loading.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CSP: Add connect-src for ipc.localhost and asset.localhost so Tauri IPC
commands and local file loading (waveform, audio playback) work.
pyannote: Block torchcodec in sys.modules at startup so pyannote.audio
falls back to torchaudio for audio decoding. pyannote has a bug where
it uses AudioDecoder unconditionally even when torchcodec import fails.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
torchcodec is partially bundled but non-functional (missing FFmpeg DLLs),
causing pyannote.audio to try AudioDecoder which fails with NameError.
Excluding it forces pyannote to fall back to torchaudio for audio loading.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sidecar now has its own version (1.0.0) and release lifecycle:
- Sidecar tags: sidecar-v1.0.0, sidecar-v1.0.1, etc.
- App tags: v0.2.x (unchanged)
- Sidecar workflow triggers only on python/** changes or manual dispatch
- App release no longer bumps python/pyproject.toml
Sidecar version tracked via sidecar-version.txt in app data dir:
- resolve_sidecar_path() reads version from file instead of CARGO_PKG_VERSION
- download_sidecar() fetches latest sidecar-v* release from Gitea API
- check_sidecar_update() compares local vs remote sidecar versions
- Version file written after successful download
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Default torch on PyPI is CPU-only on Windows. Must use PyTorch's own
package index (cu126) to get CUDA-enabled wheels. This also pins the
CUDA version on Linux for deterministic builds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- transcribe: catch model load failures on CUDA and retry with CPU
- hardware detect: test CUDA runtime actually works (torch.zeros on cuda)
before recommending GPU, since CPU-only builds detect CUDA via driver
but lack cublas/cuDNN libraries
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Exclude ctranslate2.converters from PyInstaller bundle — these modules
import torch at module level causing circular import crashes, and are
only needed for model conversion (never used at runtime)
- Defer all heavy ML imports to first handler call instead of startup,
so the sidecar can send its ready message without loading torch/whisper
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- New release.yml: bumps patch version, commits with skip-ci marker, tags, creates Gitea release
- Build workflows now trigger on v* tags only (not branch push)
- Simplified upload steps: use tag directly, retry loop for release lookup
- Fix macOS: install jq if missing
- Sync python/pyproject.toml version to 0.2.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tauri's externalBin only bundled the single sidecar executable, but
PyInstaller's onedir output requires companion DLLs and _internal/.
The binary was also renamed with a target triple suffix that
resolve_sidecar_path() didn't look for, causing it to fall back to
dev mode which used a compile-time CI path (CARGO_MANIFEST_DIR).
- Switch from externalBin to bundle.resources to include all sidecar files
- Pass Tauri resource_dir to sidecar manager for platform-aware path resolution
- Remove rename_binary() since externalBin target triple naming is no longer needed
- Remove broken production-to-dev fallback that could never work on user machines
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- uv pip --python works better with the venv directory path than the
python binary path (avoids "No virtual environment found" on Windows)
- Add .exe suffix to Windows python path for non-uv fallback
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- build_sidecar.py: pip_install() now includes 'install' in the command,
callers pass only package names (was doubling up as 'uv pip install install torch')
- CI: set shell: bash on uv steps so Windows doesn't try to use cmd.exe
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CI: install uv via astral-sh/setup-uv, use uv to install Python
and run the build script (replaces setup-python which fails on
self-hosted macOS runners)
- build_sidecar.py: auto-detects uv and uses it for venv creation
and package installation (much faster), falls back to standard
venv + pip when uv is not available
- Remove .github/workflows duplicate
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create /Users/runner directory on macOS before setup-python (permission fix)
- Use `python -m pip` everywhere instead of calling pip directly (Windows fix)
- Refactor build_sidecar.py to use pip_install() helper via python -m pip
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move the blocking pipeline() call to a daemon thread and emit estimated
progress messages every 2 seconds from the main thread. The progress
estimate uses audio duration to calibrate the expected total time.
Also pass audio_duration_sec from PipelineService to DiarizeService.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split files >1 hour into 5-minute chunks via ffmpeg, transcribe each
chunk independently, then merge results with corrected timestamps.
Also add chunk-level progress markers every 10 segments for all files.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously, progress messages were only sent every 5th segment due to
a `segment_count % 5` guard. This made the UI feel unresponsive for
short recordings with few segments. Now every segment emits a progress
update with a more descriptive message including the segment number
and audio percentage.
Adds a test verifying that all 8 mock segments produce progress
messages, not just every 5th.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Send each segment to the frontend immediately after transcription via
a new pipeline.segment IPC message, then send speaker assignments as a
batch pipeline.speaker_update message after diarization completes. This
lets the UI display segments progressively instead of waiting for the
entire pipeline to finish.
Changes:
- Add partial_segment_message and speaker_update_message IPC factories
- Add on_segment callback parameter to TranscribeService.transcribe()
- Emit partial segments and speaker updates from PipelineService.run()
- Add send_and_receive_with_progress to SidecarManager (Rust)
- Route pipeline.segment/speaker_update events in run_pipeline command
- Listen for streaming events in Svelte frontend (+page.svelte)
- Add tests for new message types, callback signature, and update logic
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Convert non-WAV audio to 16kHz mono WAV before diarization (pyannote
v4.0.4 AudioDecoder returns None duration for FLAC, causing crash)
- Handle pyannote 4.0 DiarizeOutput return type (unwrap .speaker_diarization)
- Disable pyannote telemetry (np.isfinite(None) bug with max_speakers)
- Use huggingface_hub.login() to persist token for all sub-downloads
- Pre-download sub-models (segmentation-3.0, speaker-diarization-community-1)
- Add third required model license link in settings UI
- Improve SpeakerManager hints based on settings state
- Add word-wrap to transcript text
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add diarize.download IPC handler that downloads the pyannote model
and returns user-friendly error messages (missing license, bad token)
- Add download_diarize_model Tauri command
- Add "Test & Download Model" button in Speakers settings tab
- Update instructions to list both required model licenses
(speaker-diarization-3.1 AND segmentation-3.0)
- Make all HuggingFace URLs clickable (opens in system browser)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add "Speakers" tab in Settings with HF token input field
- Include step-by-step instructions for obtaining the token
- Pass hf_token from settings through Rust → Python pipeline → diarize
- Token can also be set via HF_TOKEN environment variable as fallback
- Move skip_diarization checkbox to Speakers tab
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace progress bar with task checklist showing pipeline steps
(load model, transcribe, load diarization, identify speakers, merge)
- Fix WaveformPlayer: track ready state, disable controls until loaded,
play from current position instead of resetting to start
- Fix workspace height calc to prevent bottom content cutoff
- Show HF_TOKEN setup hint in SpeakerManager when no speakers detected
- Add console logging for progress events to aid debugging
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Redirect sys.stdout to stderr in Python sidecar so library print()
calls don't corrupt the JSON-line IPC stream
- Save real stdout fd for exclusive IPC use via init_ipc()
- Skip non-JSON lines in Rust reader instead of failing with parse error
- Set Tauri window background color to match dark theme (#0a0a23)
- Add inline dark background on html/body to prevent white flash
- Use Svelte tick() to ensure progress overlay renders before invoke
- Improve ProgressOverlay with spinner, better styling, z-index 9999
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Enhanced ProgressOverlay with spinner, better styling, and z-index 9999
- Import button shows "Processing..." with pulse animation while transcribing
- Fix pyannote API: use token= instead of deprecated use_auth_token=
- Read HF_TOKEN from environment for pyannote model download
- Add console logging for click-to-seek debugging
- Add color-scheme: dark for native form controls
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Stream pipeline progress to frontend via Tauri events so the progress
overlay updates in real time during transcription/diarization
- Gracefully fall back to transcription-only when diarization fails
(e.g. pyannote not installed) instead of erroring the whole pipeline
- Add color-scheme: dark to fix native select/option elements rendering
with unreadable white backgrounds
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Implement ExportService using pysubs2 for caption formats (SRT, VTT, ASS)
and custom formatters for plain text and Markdown
- SRT exports with [Speaker]: prefix, WebVTT with <v Speaker> voice tags,
ASS with color-coded speaker styles
- Plain text groups by speaker with labels, Markdown adds timestamps
- Add export.start IPC handler and export_transcript Tauri command
- Add export dropdown menu in header (appears after transcription)
- Uses native save dialog for output file selection
- Add pysubs2 dependency
- Tests: 30 Python (6 export tests), 6 Rust, 0 Svelte errors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>