Fix speaker diarization: WAV conversion, pyannote 4.0 compat, telemetry bug

- Convert non-WAV audio to 16kHz mono WAV before diarization (pyannote v4.0.4 AudioDecoder returns None duration for FLAC, causing crash)
- Handle pyannote 4.0 DiarizeOutput return type (unwrap .speaker_diarization)
- Disable pyannote telemetry (np.isfinite(None) bug with max_speakers)
- Use huggingface_hub.login() to persist token for all sub-downloads
- Pre-download sub-models (segmentation-3.0, speaker-diarization-community-1)
- Add third required model license link in settings UI
- Improve SpeakerManager hints based on settings state
- Add word-wrap to transcript text

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
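
For illustration only, the DiarizeOutput handling mentioned above could look roughly like the sketch below. It is not the code from this commit: the helper name unwrap_diarization is hypothetical, and the only assumption taken from the message is that a pyannote 4.0 result carries the annotation in a .speaker_diarization attribute, while older versions return the annotation directly.

```python
from types import SimpleNamespace
from typing import Any


def unwrap_diarization(result: Any) -> Any:
    """Return the diarization annotation whether `result` is wrapped or bare.

    Hypothetical helper, not part of pyannote or of this repository.
    """
    # Assumption: a pyannote 4.0 result exposes `.speaker_diarization`;
    # anything without that attribute is treated as the annotation itself.
    return getattr(result, "speaker_diarization", result)


# Stand-ins for the two return shapes (plain objects, not real pyannote types).
legacy_result = ["SPEAKER_00", "SPEAKER_01"]
wrapped_result = SimpleNamespace(speaker_diarization=legacy_result)

annotation_from_wrapped = unwrap_diarization(wrapped_result)
annotation_from_legacy = unwrap_diarization(legacy_result)
```

Using getattr with a default keeps a single code path for both return shapes, so callers do not need to check the installed pyannote version.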
2026-02-26 19:46:07 -08:00
parent a3612c986d
commit 585411f402
6 changed files with 133 additions and 25 deletions
--- a/python/voice_to_notes/services/pipeline.py
+++ b/python/voice_to_notes/services/pipeline.py
@@ -127,15 +127,17 @@ class PipelineService:
                hf_token=hf_token,
            )
        except Exception as e:
+            import traceback
            print(
                f"[sidecar] Diarization failed, falling back to transcription-only: {e}",
                file=sys.stderr,
                flush=True,
            )
+            traceback.print_exc(file=sys.stderr)
            write_message(
                progress_message(
                    request_id, 80, "pipeline",
-                    "Diarization unavailable, using transcription only..."
+                    f"Diarization failed ({e}), using transcription only..."
                )
            )