perf/pipeline-improvements #1

2026-03-21T04:53:36Z

jknapp commented

2026-03-21 04:53:36 +00:00

No description provided.

jknapp added 18 commits 2026-03-21 04:53:36 +00:00

Fix progress feedback, diarization fallback, and dropdown readability 669d88f143

- Stream pipeline progress to frontend via Tauri events so the progress
  overlay updates in real time during transcription/diarization
- Gracefully fall back to transcription-only when diarization fails
  (e.g. pyannote not installed) instead of erroring the whole pipeline
- Add color-scheme: dark to fix native select/option elements rendering
  with unreadable white backgrounds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
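
The fallback described in this commit could look roughly like the sketch below. It is an illustration only: `run_with_fallback`, its parameters, and the result shape are hypothetical names, not the project's actual code.

```python
from typing import Any, Callable, Dict, List, Optional


def run_with_fallback(
    audio_path: str,
    transcribe: Callable[[str], List[dict]],
    diarize: Callable[[str], Any],
    log: Callable[[str], None] = print,
) -> Dict[str, Optional[Any]]:
    """Transcribe first, then attempt diarization without letting it abort the run.

    All names here are hypothetical; only the behavior (fall back to
    transcription-only when diarization fails) comes from the commit message.
    """
    segments = transcribe(audio_path)
    try:
        speakers = diarize(audio_path)
    except Exception as exc:  # e.g. ImportError when pyannote is not installed
        log(f"diarization unavailable, continuing without speakers: {exc}")
        speakers = None
    return {"segments": segments, "speakers": speakers}
```

A caller can then treat `speakers is None` as "transcription only" rather than as a pipeline failure.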

Improve import UX: progress overlay, pyannote fix, debug logging 87b3ad94f9

- Enhanced ProgressOverlay with spinner, better styling, and z-index 9999
- Import button shows "Processing..." with pulse animation while transcribing
- Fix pyannote API: use token= instead of deprecated use_auth_token=
- Read HF_TOKEN from environment for pyannote model download
- Add console logging for click-to-seek debugging
- Add color-scheme: dark for native form controls

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix IPC stdout corruption, dark window background, overlay timing 4d7b9d524f

- Redirect sys.stdout to stderr in Python sidecar so library print()
  calls don't corrupt the JSON-line IPC stream
- Save real stdout fd for exclusive IPC use via init_ipc()
- Skip non-JSON lines in Rust reader instead of failing with parse error
- Set Tauri window background color to match dark theme (#0a0a23)
- Add inline dark background on html/body to prevent white flash
- Use Svelte tick() to ensure progress overlay renders before invoke
- Improve ProgressOverlay with spinner, better styling, z-index 9999

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
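
A minimal sketch of the stdout handling described above, under some assumptions: `init_ipc()` is named in the commit, but its body, the `IpcChannel` class, and `iter_json_messages` are hypothetical. The last function mirrors in Python what the commit says the Rust reader does (skip lines that are not JSON); it is not the Rust code itself.

```python
import json
import os
import sys
from typing import Iterable, Iterator, TextIO


class IpcChannel:
    """Writes one JSON object per line to a dedicated stream (hypothetical helper)."""

    def __init__(self, stream: TextIO) -> None:
        self._stream = stream

    def send(self, message: dict) -> None:
        self._stream.write(json.dumps(message) + "\n")
        self._stream.flush()


def init_ipc() -> IpcChannel:
    """Keep a private copy of the real stdout fd, then point sys.stdout at stderr.

    After this call, print() from any library lands on stderr and can no
    longer interleave with the JSON-line stream.
    """
    real_stdout = os.fdopen(os.dup(sys.stdout.fileno()), "w", encoding="utf-8")
    sys.stdout = sys.stderr
    return IpcChannel(real_stdout)


def iter_json_messages(lines: Iterable[str]) -> Iterator[dict]:
    """Reader-side tolerance: yield JSON objects, silently skip everything else."""
    for line in lines:
        try:
            parsed = json.loads(line)
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            continue
        if isinstance(parsed, dict):
            yield parsed
```

Combining both sides is deliberate: the redirect removes most stray output at the source, and the tolerant reader covers anything that still slips through.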

Fix progress overlay, play-from-position, layout cutoff, speaker info ed626b8ba0

- Replace progress bar with task checklist showing pipeline steps
  (load model, transcribe, load diarization, identify speakers, merge)
- Fix WaveformPlayer: track ready state, disable controls until loaded,
  play from current position instead of resetting to start
- Fix workspace height calc to prevent bottom content cutoff
- Show HF_TOKEN setup hint in SpeakerManager when no speakers detected
- Add console logging for progress events to aid debugging

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add HuggingFace token setting for speaker detection baf820286f

- Add "Speakers" tab in Settings with HF token input field
- Include step-by-step instructions for obtaining the token
- Pass hf_token from settings through Rust → Python pipeline → diarize
- Token can also be set via HF_TOKEN environment variable as fallback
- Move skip_diarization checkbox to Speakers tab

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
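
The token precedence described here (settings value first, `HF_TOKEN` environment variable as fallback) can be sketched as follows. `resolve_hf_token` and its parameters are hypothetical names chosen for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(
    settings_token: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the token from settings, else from HF_TOKEN, else None.

    Hypothetical helper: blank or whitespace-only values count as unset.
    """
    env = os.environ if env is None else env
    token = (settings_token or "").strip()
    if token:
        return token
    return (env.get("HF_TOKEN") or "").strip() or None
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.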

Add Test & Download button for diarization model, clickable links a3612c986d

- Add diarize.download IPC handler that downloads the pyannote model
  and returns user-friendly error messages (missing license, bad token)
- Add download_diarize_model Tauri command
- Add "Test & Download Model" button in Speakers settings tab
- Update instructions to list both required model licenses
  (speaker-diarization-3.1 AND segmentation-3.0)
- Make all HuggingFace URLs clickable (opens in system browser)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix speaker diarization: WAV conversion, pyannote 4.0 compat, telemetry bug 585411f402

- Convert non-WAV audio to 16kHz mono WAV before diarization (pyannote
  v4.0.4 AudioDecoder returns None duration for FLAC, causing crash)
- Handle pyannote 4.0 DiarizeOutput return type (unwrap .speaker_diarization)
- Disable pyannote telemetry (np.isfinite(None) bug with max_speakers)
- Use huggingface_hub.login() to persist token for all sub-downloads
- Pre-download sub-models (segmentation-3.0, speaker-diarization-community-1)
- Add third required model license link in settings UI
- Improve SpeakerManager hints based on settings state
- Add word-wrap to transcript text

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
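
One plausible way to do the WAV conversion mentioned above is to shell out to ffmpeg. The sketch below only illustrates that idea: the function names are hypothetical, and the commit does not say which ffmpeg options the project actually uses.

```python
import subprocess
from pathlib import Path
from typing import List


def needs_wav_conversion(path: str) -> bool:
    """Per the commit, only non-WAV inputs are converted (checked by extension here)."""
    return Path(path).suffix.lower() != ".wav"


def ffmpeg_wav_command(src: str, dst: str, ffmpeg: str = "ffmpeg") -> List[str]:
    """Build an ffmpeg command producing 16 kHz mono 16-bit PCM WAV."""
    return [
        ffmpeg,
        "-y",                 # overwrite dst if it already exists
        "-i", src,            # input file
        "-ac", "1",           # one audio channel (mono)
        "-ar", "16000",       # 16 kHz sample rate
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        dst,
    ]


def ensure_wav(src: str, dst: str) -> str:
    """Return a path that is safe to diarize, converting when needed."""
    if not needs_wav_conversion(src):
        return src
    subprocess.run(ffmpeg_wav_command(src, dst), check=True, capture_output=True)
    return dst
```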

Stream transcript segments to frontend as they are transcribed 67ed69df00

Send each segment to the frontend immediately after transcription via
a new pipeline.segment IPC message, then send speaker assignments as a
batch pipeline.speaker_update message after diarization completes. This
lets the UI display segments progressively instead of waiting for the
entire pipeline to finish.

Changes:
- Add partial_segment_message and speaker_update_message IPC factories
- Add on_segment callback parameter to TranscribeService.transcribe()
- Emit partial segments and speaker updates from PipelineService.run()
- Add send_and_receive_with_progress to SidecarManager (Rust)
- Route pipeline.segment/speaker_update events in run_pipeline command
- Listen for streaming events in Svelte frontend (+page.svelte)
- Add tests for new message types, callback signature, and update logic

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
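
The message flow in this commit can be illustrated with the sketch below. The factory names `partial_segment_message` and `speaker_update_message` and the message types come from the commit; their signatures, the payload fields, and `run_streaming` are assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List


def partial_segment_message(request_id: str, index: int, segment: dict) -> dict:
    """One message per transcribed segment (payload shape is assumed)."""
    return {
        "id": request_id,
        "type": "pipeline.segment",
        "payload": {"index": index, **segment},
    }


def speaker_update_message(request_id: str, assignments: Dict[int, str]) -> dict:
    """One batch message mapping segment index to speaker label (shape assumed)."""
    updates = [{"index": i, "speaker": s} for i, s in sorted(assignments.items())]
    return {
        "id": request_id,
        "type": "pipeline.speaker_update",
        "payload": {"updates": updates},
    }


def run_streaming(
    request_id: str,
    segments: Iterable[dict],
    diarize: Callable[[List[dict]], Dict[int, str]],
    emit: Callable[[dict], None],
) -> None:
    """Emit each segment as soon as it exists, then speakers as one batch."""
    collected: List[dict] = []
    for index, segment in enumerate(segments):
        collected.append(segment)
        emit(partial_segment_message(request_id, index, segment))
    emit(speaker_update_message(request_id, diarize(collected)))
```

Because segments carry an index, the frontend can patch speaker labels onto rows it has already rendered instead of re-rendering the whole transcript.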

Remove progress throttle so every segment emits a progress update 6eb13bce63

Previously, progress messages were only sent every 5th segment due to
a `segment_count % 5` guard. This made the UI feel unresponsive for
short recordings with few segments. Now every segment emits a progress
update with a more descriptive message including the segment number
and audio percentage.

Adds a test verifying that all 8 mock segments produce progress
messages, not just every 5th.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
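
As a rough illustration of the unthrottled behavior, the sketch below produces one progress message per segment, including the segment number and the share of audio covered. The function name and the message wording are hypothetical; the commit does not quote the actual text.

```python
from typing import Iterable, List


def progress_messages(segments: Iterable[dict], audio_duration_sec: float) -> List[str]:
    """Build a progress message for every segment, with no `% 5` guard."""
    messages = []
    for number, segment in enumerate(segments, start=1):
        if audio_duration_sec > 0:
            # Percentage of the audio covered by the end of this segment.
            percent = min(100, int(100 * segment["end"] / audio_duration_sec))
        else:
            percent = 0
        messages.append(f"Transcribed segment {number} ({percent}% of audio)")
    return messages
```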

Add chunked transcription for large audio files (>1 hour) 16f4b57771

Split files >1 hour into 5-minute chunks via ffmpeg, transcribe each
chunk independently, then merge results with corrected timestamps.
Also add chunk-level progress markers every 10 segments for all files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
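
The chunking arithmetic can be sketched as below: plan 5-minute windows for files longer than one hour, then shift each chunk's segment timestamps by the chunk's start offset when merging. `plan_chunks` and `merge_chunk_segments` are hypothetical names, and the ffmpeg splitting step itself is left out.

```python
from typing import List, Sequence, Tuple

CHUNK_SEC = 5 * 60        # chunk length stated in the commit
THRESHOLD_SEC = 60 * 60   # only files longer than this are chunked


def plan_chunks(
    duration_sec: float,
    chunk_sec: float = CHUNK_SEC,
    threshold_sec: float = THRESHOLD_SEC,
) -> List[Tuple[float, float]]:
    """Return (start, end) windows; a single window when the file is short enough."""
    if duration_sec <= threshold_sec:
        return [(0.0, duration_sec)]
    chunks = []
    start = 0.0
    while start < duration_sec:
        end = min(start + chunk_sec, duration_sec)
        chunks.append((start, end))
        start = end
    return chunks


def merge_chunk_segments(
    chunks: Sequence[Tuple[float, float]],
    per_chunk_segments: Sequence[Sequence[dict]],
) -> List[dict]:
    """Shift chunk-relative timestamps back onto the original file's timeline."""
    merged = []
    for (offset, _end), segments in zip(chunks, per_chunk_segments):
        for segment in segments:
            merged.append({
                **segment,
                "start": segment["start"] + offset,
                "end": segment["end"] + offset,
            })
    return merged
```

For example, a segment at 0.5 s inside the second chunk ends up at 300.5 s in the merged transcript.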

Run pyannote diarization in background thread with progress reporting 03af5a189c

Move the blocking pipeline() call to a daemon thread and emit estimated
progress messages every 2 seconds from the main thread. The progress
estimate uses audio duration to calibrate the expected total time.
Also pass audio_duration_sec from PipelineService to DiarizeService.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
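
A minimal sketch of the threading pattern described above: the blocking call runs in a daemon thread while the calling thread emits an estimated percentage at a fixed interval. `run_with_progress`, `speed_factor`, and the 99% cap are assumptions for illustration; the commit does not state how the estimate is calibrated beyond using the audio duration.

```python
import threading
import time
from typing import Any, Callable, Dict


def run_with_progress(
    blocking_call: Callable[[], Any],
    audio_duration_sec: float,
    emit: Callable[[int], None],
    interval_sec: float = 2.0,
    speed_factor: float = 0.5,  # assumed ratio of processing time to audio length
) -> Any:
    """Run blocking_call in a daemon thread and emit estimated progress meanwhile."""
    outcome: Dict[str, Any] = {}

    def worker() -> None:
        try:
            outcome["value"] = blocking_call()
        except Exception as exc:  # hand the failure back to the calling thread
            outcome["error"] = exc

    expected_sec = max(audio_duration_sec * speed_factor, 1.0)
    started = time.monotonic()
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()

    while True:
        thread.join(timeout=interval_sec)
        if not thread.is_alive():
            break
        elapsed = time.monotonic() - started
        # Cap at 99 so the estimate never claims completion before the call returns.
        emit(min(99, int(100 * elapsed / expected_sec)))

    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

Using `join(timeout=...)` as the timer means the loop exits as soon as the work finishes instead of sleeping through a full interval.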

Merge perf/stream-segments: streaming partial transcript segments and speaker updates c3b6ad38fd

Merge perf/progress-every-segment: emit progress for every segment 35af6e9e0c

Merge perf/diarize-threading: diarization progress via background thread c23b9a90dd

Merge perf/chunked-transcription: chunk-based processing for large files 0771508203

Fix cargo fmt formatting and diarize threading test mock 42ccd3e21d

Cross-platform distribution, UI improvements, and performance optimizations 58faa83cb3

- PyInstaller frozen sidecar: spec file, build script, and ffmpeg path resolver
  for self-contained distribution without Python prerequisites
- Dual-mode sidecar launcher: frozen binary (production) with dev mode fallback
- Parallel transcription + diarization pipeline (~30-40% faster)
- GPU auto-detection for diarization (CUDA when available)
- Async run_pipeline command for real-time progress event delivery
- Web Audio API backend for instant playback and seeking
- OpenAI-compatible provider replacing LiteLLM client-side routing
- Cross-platform RAM detection (Linux/macOS/Windows)
- Settings: speaker count hint, token reveal toggles, dark dropdown styling
- Loading splash screen, flexbox layout fix for viewport overflow
- Gitea Actions CI/CD pipeline (Linux, Windows, macOS ARM)
- Updated README and CLAUDE.md documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
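
The commit does not say how transcription and diarization are run in parallel. One common approach, shown here purely as an assumption, is to submit both to a two-worker thread pool and wait for both results; `run_parallel` and its parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Optional, Tuple


def run_parallel(
    audio_path: str,
    transcribe: Callable[[str], List[dict]],
    diarize: Callable[[str], Any],
) -> Tuple[List[dict], Optional[Any]]:
    """Run both stages concurrently; a diarization failure yields None speakers."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        transcript_future = pool.submit(transcribe, audio_path)
        speakers_future = pool.submit(diarize, audio_path)
        segments = transcript_future.result()  # transcription errors still propagate
        try:
            speakers = speakers_future.result()
        except Exception:
            speakers = None
    return segments, speakers
```

Threads only help here if the heavy work releases the GIL (native inference code typically does); otherwise separate processes would be the alternative.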

Add release job to Gitea Actions workflow 09d7c2064f

Creates a pre-release with all platform artifacts on every push to main.
Uses BUILD_TOKEN secret for Gitea API authentication.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

CI status for this commit:
- Build & Release / Build sidecar (aarch64-apple-darwin) (pull_request) Failing after 38s
- Build & Release / Build sidecar (x86_64-pc-windows-msvc) (pull_request) Failing after 1m1s
- Build & Release / Build sidecar (x86_64-unknown-linux-gnu) (pull_request) Failing after 3m18s
- Build & Release / Build app (x86_64-unknown-linux-gnu) (pull_request) Has been skipped
- Build & Release / Build app (aarch64-apple-darwin) (pull_request) Has been skipped
- Build & Release / Build app (x86_64-pc-windows-msvc) (pull_request) Has been skipped
- Build & Release / Create Release (pull_request) Has been skipped

jknapp merged commit d8ed77b786 into main

2026-03-21 04:53:45 +00:00

jknapp referenced this issue from a commit

2026-03-21 04:53:47 +00:00

Merge pull request 'perf/pipeline-improvements' (#1) from perf/pipeline-improvements into main


Reference: MacroPad/voice-to-notes#1