35 Commits

Author SHA1 Message Date
Gitea Actions
e0e1638327 chore: bump version to 0.2.45 [skip ci] 2026-03-23 20:53:55 +00:00
Claude
c4fffad027 Fix permissions on demand instead of every launch
Some checks failed
Release / Bump version and tag (push) Successful in 3s
Release / Build App (macOS) (push) Successful in 1m17s
Release / Build App (Windows) (push) Failing after 1m56s
Release / Build App (Linux) (push) Successful in 3m39s
Instead of chmod on every app start, catch EACCES (error 13) when
spawning sidecar or ffmpeg, fix permissions, then retry once:
- sidecar spawn: catches permission denied, runs set_executable_permissions
  on the sidecar dir, retries spawn
- ffmpeg: catches permission denied, chmod +x ffmpeg and ffprobe, retries

Zero overhead on normal launches. Only fixes permissions when actually needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 13:53:47 -07:00
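The on-demand fix in c4fffad027 has a try, repair, retry-once shape. Here is a minimal sketch of that shape in Python, assuming hypothetical `spawn` and `fix_permissions` callables; the actual implementation is the Rust code in the diff further down, which checks `raw_os_error() == Some(13)`.

```python
import errno


def spawn_with_permission_retry(spawn, fix_permissions):
    """Call spawn(); on EACCES, repair permissions and retry exactly once.

    `spawn` and `fix_permissions` are hypothetical zero-argument callables
    used for illustration only.
    """
    try:
        return spawn()
    except OSError as exc:
        if exc.errno != errno.EACCES:
            raise  # unrelated failure: do not mask it
        fix_permissions()
        return spawn()  # a second failure propagates to the caller
```

Retrying only once keeps a binary that cannot be repaired from looping, and the normal launch path pays no extra cost because the repair runs only inside the error branch.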
Gitea Actions
618edf65ab chore: bump version to 0.2.44 [skip ci] 2026-03-23 20:45:32 +00:00
Claude
c5b8eb06c6 Fix permissions on already-extracted sidecar dirs
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Release / Build App (macOS) (push) Successful in 1m20s
Release / Build App (Windows) (push) Successful in 2m59s
Release / Build App (Linux) (push) Successful in 3m35s
The chmod fix only ran after fresh extraction, but existing sidecar
dirs extracted by older versions still lacked execute permissions.
Now set_executable_permissions() runs on EVERY app launch (both the
early-return path for existing dirs and after fresh extraction).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 13:45:26 -07:00
Gitea Actions
4f44bdd037 chore: bump version to 0.2.43 [skip ci] 2026-03-23 20:30:33 +00:00
Claude
32bfbd3791 Set execute permissions on ALL files in sidecar dir on Unix
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Release / Build App (macOS) (push) Successful in 1m43s
Release / Build App (Windows) (push) Successful in 3m20s
Release / Build App (Linux) (push) Successful in 3m36s
Previously only the main sidecar binary got chmod 755. Now all files
in the extraction directory get execute permissions — covers ffmpeg,
ffprobe, and any other bundled binaries. Applied in three places:
- sidecar/mod.rs: after local extraction
- commands/sidecar.rs: after download extraction
- commands/media.rs: removed single-file fix (now handled globally)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 13:30:26 -07:00
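The directory-wide chmod described in 32bfbd3791 can be sketched as follows. This is a Python illustration of the idea, not the project's code: the function name mirrors the Rust helper, while the return value is an assumption added to make the effect visible.

```python
import os
from pathlib import Path


def set_executable_permissions(directory: str) -> list[str]:
    """chmod 755 every regular file directly inside `directory` (Unix).

    Returns the names of the files that were changed; subdirectories are
    skipped, and individual failures are ignored (best effort).
    """
    changed = []
    for entry in sorted(Path(directory).iterdir()):
        if entry.is_file():
            try:
                os.chmod(entry, 0o755)
                changed.append(entry.name)
            except OSError:
                pass  # ignore files we are not allowed to modify
    return changed
```

Covering every file avoids maintaining a list of bundled binaries, at the cost of also marking any non-binary files in the directory as executable.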
Gitea Actions
2bfb1b276e chore: bump version to 0.2.42 [skip ci] 2026-03-23 20:18:57 +00:00
Claude
908762073f Fix ffmpeg permission denied on Linux
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Release / Build App (macOS) (push) Successful in 1m31s
Release / Build App (Windows) (push) Successful in 3m25s
Release / Build App (Linux) (push) Successful in 3m28s
The bundled ffmpeg in the sidecar extract dir lacked execute permissions.
Now sets chmod 755 on Unix when find_ffmpeg locates the bundled binary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 13:18:51 -07:00
Gitea Actions
2011015c9a chore: bump version to 0.2.41 [skip ci] 2026-03-23 17:25:07 +00:00
Claude
fc5cfc4374 Save As: use save dialog so user can type a new project name
All checks were successful
Release / Bump version and tag (push) Successful in 4s
Release / Build App (macOS) (push) Successful in 1m20s
Release / Build App (Windows) (push) Successful in 3m5s
Release / Build App (Linux) (push) Successful in 3m44s
Changed from folder picker (can only select existing folders) to save
dialog where the user can type a new name. The typed name becomes the
project folder, created automatically if it doesn't exist. Any file
extension the user types is stripped (e.g. "My Project.vtn" becomes
the folder "My Project/").

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 10:25:00 -07:00
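The name handling in fc5cfc4374 (strip any typed extension, then use the result as the folder name) can be illustrated with a short Python sketch. `project_paths` is a hypothetical helper, and the `Folder/Folder.vtn` plus `audio.wav` layout is taken from the earlier folder-picker commit further down this list.

```python
from pathlib import PurePosixPath


def project_paths(dialog_path: str) -> dict[str, str]:
    """Map the path returned by a save dialog to the project folder layout.

    Any extension the user typed is dropped; the remaining name becomes
    the folder name and the base name of the .vtn file inside it.
    """
    folder = PurePosixPath(dialog_path).with_suffix("")
    return {
        "folder": str(folder),
        "project_file": str(folder / f"{folder.name}.vtn"),
        "audio": str(folder / "audio.wav"),
    }
```

For example, `project_paths("/home/me/My Project.vtn")["folder"]` gives `/home/me/My Project`. One side effect of stripping "any extension" is that a name containing a dot, such as `notes v1.2`, also loses everything after the last dot.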
Gitea Actions
ac0fe3b4c7 chore: bump version to 0.2.40 [skip ci] 2026-03-23 16:56:19 +00:00
Claude
e05f9afaff Add Save As, auto-migrate v1 projects to folder structure
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Release / Build App (macOS) (push) Successful in 1m17s
Release / Build App (Windows) (push) Successful in 3m5s
Release / Build App (Linux) (push) Successful in 3m22s
Save behavior:
- Save on v2 project: saves in place (no dialog)
- Save on v1 project: auto-migrates to folder structure next to the
  original .vtn (creates ProjectName/ folder with .vtn + audio.wav)
- Save on unsaved project: opens folder picker (Save As)
- Save As: always opens folder picker for a new location

Added projectIsV2 state to track project format version.
Split "Save Project" button into "Save" + "Save As".
Extracted saveToFolder() helper for shared save logic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 09:56:13 -07:00
Gitea Actions
548d260061 chore: bump version to 0.2.39 [skip ci] 2026-03-23 16:51:24 +00:00
Claude
168a43e0e1 Save project: pick folder instead of file
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Release / Build App (macOS) (push) Successful in 1m16s
Release / Build App (Windows) (push) Successful in 3m7s
Release / Build App (Linux) (push) Successful in 3m22s
Changed save dialog from file picker (.vtn) to folder picker. The
project name is derived from the folder name. Files are created
inside the chosen folder:
  Folder/
    Folder.vtn
    audio.wav

Also: save-in-place for already-saved projects (Ctrl+S just saves,
no dialog). Extracted buildProjectData() helper for reuse.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 09:51:14 -07:00
Gitea Actions
543decd769 chore: bump version to 0.2.38 [skip ci] 2026-03-23 16:48:36 +00:00
Claude
e05f88eecf Make ProjectFile struct support both v1 and v2 formats
Some checks failed
Release / Bump version and tag (push) Successful in 3s
Release / Build App (macOS) (push) Successful in 1m20s
Release / Build App (Linux) (push) Has been cancelled
Release / Build App (Windows) (push) Has been cancelled
audio_file, source_file, audio_wav are all optional with serde defaults.
v1 projects have audio_file, v2 projects have source_file + audio_wav.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 09:48:29 -07:00
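The effect of making all three fields optional in e05f88eecf can be shown with a trimmed Python model of the project file. This is a sketch, not the Rust struct: only a subset of fields is modeled, and the `version` values 1 and 2 in the usage note are an assumption.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProjectFile:
    version: int
    name: str
    audio_file: Optional[str] = None   # present in v1 projects
    source_file: Optional[str] = None  # present in v2 projects
    audio_wav: Optional[str] = None    # present in v2 projects


def parse_project(text: str) -> ProjectFile:
    """Parse either format; fields missing from the JSON default to None."""
    raw = json.loads(text)
    return ProjectFile(
        version=raw["version"],
        name=raw["name"],
        audio_file=raw.get("audio_file"),
        source_file=raw.get("source_file"),
        audio_wav=raw.get("audio_wav"),
    )
```

A v1 document such as `{"version": 1, "name": "Talk", "audio_file": "/a/talk.mp3"}` and a v2 document carrying `source_file` and `audio_wav` both parse with the same type, which is what lets one loader serve old and new projects.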
Gitea Actions
fee1255cac chore: bump version to 0.2.37 [skip ci] 2026-03-23 15:47:16 +00:00
Claude
2e9f2519b1 Project folders, always-extract audio, re-link support
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Release / Build App (macOS) (push) Successful in 1m17s
Release / Build App (Windows) (push) Successful in 3m6s
Release / Build App (Linux) (push) Successful in 3m25s
Projects now save as folders containing .vtn + audio.wav:
  My Transcript/
    My Transcript.vtn
    audio.wav

Audio handling:
- Always extract to 22kHz mono WAV on import (all formats, not just video)
- Prevents WebAudio crash from decoding large MP3/FLAC/OGG to PCM in memory
- WAV saved alongside .vtn on project save (moved from temp)
- Sidecar still uses original file (does its own conversion)

Project format v2:
- source_file: original import path (for re-extraction)
- audio_wav: relative path to extracted WAV (portable)

Re-link on open:
- If audio.wav exists → load directly
- If missing but source exists → re-extract automatically
- If both missing → dialog to locate file via file picker
- V1 project migration: extracts WAV on first open

New Rust commands: check_file_exists, copy_file, create_dir
extract_audio: now accepts optional output_path, uses 22kHz sample rate

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 08:47:08 -07:00
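The re-link rules listed in 2e9f2519b1 amount to a three-way decision. Here is a minimal Python sketch, assuming a hypothetical `plan_audio_load` helper and an injected `exists` predicate (in the app this role is played by the `check_file_exists` command); the returned labels are illustrative.

```python
from typing import Callable, Optional


def plan_audio_load(
    audio_wav: Optional[str],
    source_file: Optional[str],
    exists: Callable[[str], bool],
) -> str:
    """Decide how to obtain playable audio when a project is opened."""
    if audio_wav and exists(audio_wav):
        return "load"        # extracted WAV is present: load it directly
    if source_file and exists(source_file):
        return "re-extract"  # WAV missing but the original is still there
    return "ask-user"        # both missing: prompt for the file location
```

Checking the WAV first keeps the common case cheap, since re-extraction means running ffmpeg again.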
Gitea Actions
82bfcfb793 chore: bump version to 0.2.36 [skip ci] 2026-03-23 14:58:10 +00:00
Gitea Actions
73eab2e80c chore: bump sidecar version to 1.0.13 [skip ci] 2026-03-23 14:58:07 +00:00
Claude
33ca3e4a28 Show chunk context in transcription progress for large files
All checks were successful
Build Sidecars / Bump sidecar version and tag (push) Successful in 3s
Release / Bump version and tag (push) Successful in 3s
Build Sidecars / Build Sidecar (macOS) (push) Successful in 8m30s
Release / Build App (macOS) (push) Successful in 1m19s
Build Sidecars / Build Sidecar (Linux) (push) Successful in 12m9s
Release / Build App (Linux) (push) Successful in 3m36s
Build Sidecars / Build Sidecar (Windows) (push) Successful in 29m36s
Release / Build App (Windows) (push) Successful in 3m13s
Files >1 hour are split into 5-minute chunks. Previously each chunk
showed "Starting transcription..." making it look like a restart.
Now shows "Chunk 3/12: Starting transcription..." and
"Chunk 3/12: Transcribing segment 5 (42% of audio)..."

Also skips the "Loading model..." message for chunks after the first
since the model is already loaded.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 07:57:59 -07:00
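The chunk prefix from 33ca3e4a28 can be sketched in a few lines of Python. The prefix rule matches the `chunk_label` handling visible in the sidecar diff further down. `chunk_labels` is a hypothetical helper, and computing the chunk count with a ceiling division over 5-minute chunks is an assumption.

```python
import math
from typing import Optional


def chunk_labels(duration_sec: float, chunk_sec: int = 300) -> list[str]:
    """Labels such as "Chunk 3/12" for a file split into fixed-size chunks."""
    total = math.ceil(duration_sec / chunk_sec)
    return [f"Chunk {i + 1}/{total}" for i in range(total)]


def progress_text(text: str, chunk_label: Optional[str] = None) -> str:
    """Prefix a progress message with the chunk label when one is given."""
    prefix = f"{chunk_label}: " if chunk_label else ""
    return f"{prefix}{text}"
```

With a one-hour file, `progress_text("Starting transcription...", chunk_labels(3600)[2])` yields `Chunk 3/12: Starting transcription...`, while a call without a label leaves the message unchanged for files that are not chunked.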
Gitea Actions
e65d8b0510 chore: bump version to 0.2.35 [skip ci] 2026-03-23 14:31:13 +00:00
Claude
a7364f2e50 Fix 's is not defined' in AIChatPanel
All checks were successful
Release / Bump version and tag (push) Successful in 4s
Release / Build App (macOS) (push) Successful in 1m18s
Release / Build App (Linux) (push) Successful in 3m37s
Release / Build App (Windows) (push) Successful in 3m53s
Leftover reference to removed 's' variable — changed to $settings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 07:31:07 -07:00
Gitea Actions
809acfc781 chore: bump version to 0.2.34 [skip ci] 2026-03-23 13:42:26 +00:00
Claude
96e9a6d38b Fix Ollama: remove duplicate stale configMap in AIChatPanel
All checks were successful
Release / Bump version and tag (push) Successful in 6s
Release / Build App (macOS) (push) Successful in 1m17s
Release / Build App (Linux) (push) Successful in 4m49s
Release / Build App (Windows) (push) Successful in 3m8s
AIChatPanel had its own hardcoded configMap with the old llama-server
URL (localhost:8080) and field names (local_model_path). Every chat
message reconfigured the provider with these wrong values, overriding
the correct settings applied at startup.

Fix: replace the duplicate with a call to the shared configureAIProvider().
Also strip trailing slashes from ollama_url before appending /v1 to
prevent double-slash URLs (http://localhost:11434//v1).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 06:33:03 -07:00
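The double-slash problem fixed in 96e9a6d38b comes from appending `/v1` to a URL that already ends in `/`. Here is a minimal Python sketch of the normalization, assuming a hypothetical `ollama_api_base` helper (the app's own code is TypeScript and is not fully shown in this diff):

```python
def ollama_api_base(ollama_url: str) -> str:
    """Strip trailing slashes before appending the /v1 path segment."""
    return ollama_url.rstrip("/") + "/v1"
```

Both `http://localhost:11434` and `http://localhost:11434/` then map to `http://localhost:11434/v1`.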
Gitea Actions
ddfbd65478 chore: bump version to 0.2.33 [skip ci] 2026-03-23 13:24:46 +00:00
Gitea Actions
e80ee3a18f chore: bump sidecar version to 1.0.12 [skip ci] 2026-03-23 13:24:34 +00:00
Claude
806586ae3d Fix diarization performance for long files + better progress
Some checks failed
Build Sidecars / Bump sidecar version and tag (push) Successful in 11s
Release / Bump version and tag (push) Successful in 10s
Build Sidecars / Build Sidecar (macOS) (push) Successful in 4m0s
Release / Build App (macOS) (push) Successful in 1m16s
Release / Build App (Linux) (push) Has been cancelled
Release / Build App (Windows) (push) Has been cancelled
Build Sidecars / Build Sidecar (Linux) (push) Successful in 17m34s
Build Sidecars / Build Sidecar (Windows) (push) Successful in 28m9s
- Cache loaded audio in _sf_load() — previously the entire WAV file was
  re-read from disk for every 10s crop call. For a 3-hour file with
  1000+ chunks, this meant ~345GB of disk reads. Now read once, cached.
- Better progress messages for long files: show elapsed time in m:ss
  format, warn "(180min audio, this may take a while)" for files >10min
- Increased progress poll interval from 2s to 5s (less noise)
- Better time estimate: use 0.8x audio duration (was 0.5x)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 06:24:21 -07:00
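The caching change in 806586ae3d replaces one full file read per crop with a single read per file. A generic Python sketch of that pattern, assuming a hypothetical `make_cached_loader` wrapper around any loader function (the sidecar itself uses a module-level dict inside `_sf_load`, as the diff further down shows):

```python
from typing import Any, Callable


def make_cached_loader(load: Callable[[str], Any]) -> Callable[[str], Any]:
    """Wrap `load` so each distinct path is read at most once."""
    cache: dict[str, Any] = {}

    def cached(path: str) -> Any:
        key = str(path)
        if key not in cache:
            cache[key] = load(key)  # the only place the file is read
        return cache[key]

    return cached
```

A thousand crop calls on the same path then trigger one underlying read. The trade-off is memory: the decoded audio stays resident for as long as the cache lives, and neither this sketch nor the hunk shown in the diff evicts entries.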
Gitea Actions
999bdaa671 chore: bump version to 0.2.32 [skip ci] 2026-03-23 12:38:47 +00:00
Claude
b1d46fd42e Add cancel button to processing overlay with confirmation
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Release / Build App (macOS) (push) Successful in 1m21s
Release / Build App (Windows) (push) Successful in 3m8s
Release / Build App (Linux) (push) Successful in 3m40s
- Cancel button on the progress overlay during transcription
- Clicking Cancel shows confirmation: "Processing is incomplete. If you
  cancel now, the transcription will need to be started over."
- "Continue Processing" dismisses the dialog, "Cancel Processing" stops
- Cancel clears partial results (segments, speakers) and resets UI
- Pipeline results are discarded if cancelled during processing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 05:38:40 -07:00
Gitea Actions
818cbfa69c chore: bump version to 0.2.31 [skip ci] 2026-03-23 12:30:19 +00:00
Claude
aa319eb823 Fix Ollama settings on startup + video extraction UX
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Release / Build App (macOS) (push) Successful in 1m18s
Release / Build App (Linux) (push) Successful in 3m44s
Release / Build App (Windows) (push) Successful in 3m57s
AI provider:
- Extract configureAIProvider() from saveSettings for reuse
- Call it on app startup after sidecar is ready (was only called on Save)
- Call it after first-time sidecar download completes
- Sidecar now receives correct Ollama URL/model immediately

Video extraction:
- Hide ffmpeg console window on Windows (CREATE_NO_WINDOW flag)
- Show "Extracting audio from video..." overlay with spinner during extraction
- UI stays responsive while ffmpeg runs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 05:30:14 -07:00
Gitea Actions
8faa336cbc chore: bump version to 0.2.30 [skip ci] 2026-03-23 03:12:25 +00:00
Claude
02c70f90c8 Extract audio from video files before loading
All checks were successful
Release / Bump version and tag (push) Successful in 3s
Release / Build App (macOS) (push) Successful in 1m17s
Release / Build App (Linux) (push) Successful in 4m53s
Release / Build App (Windows) (push) Successful in 3m45s
Video files (MP4, MKV, etc.) are now processed with ffmpeg to extract
audio to a temp WAV file before loading into wavesurfer. This prevents
the WebView crash caused by trying to fetch multi-GB files into memory.

- New extract_audio Tauri command uses ffmpeg (sidecar-bundled or system)
- Frontend detects video extensions and extracts audio automatically
- User-friendly error if ffmpeg is not installed with install instructions
- Reverted wavesurfer MediaElement approach in favor of clean extraction
- Added FFmpeg install guide to USER_GUIDE.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 20:04:10 -07:00
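The extraction step introduced in 02c70f90c8 boils down to one ffmpeg invocation. The Python sketch below builds the same argument list that the final `extract_audio` command in this diff passes to ffmpeg; `ffmpeg_extract_args` is a hypothetical helper, and the 22050 Hz default reflects the later commit in this range rather than this one.

```python
def ffmpeg_extract_args(
    input_path: str, output_path: str, sample_rate: int = 22050
) -> list[str]:
    """Arguments for extracting a mono 16-bit PCM WAV from a media file."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", input_path,
        "-vn",                    # drop the video stream
        "-acodec", "pcm_s16le",   # 16-bit little-endian PCM
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-ac", "1",               # downmix to mono
        output_path,
    ]
```

The list can be handed to `subprocess.run(args, check=True)`; passing arguments as a list rather than a shell string avoids quoting problems with paths that contain spaces.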
Gitea Actions
66db827f17 chore: bump version to 0.2.29 [skip ci] 2026-03-23 02:55:23 +00:00
18 changed files with 712 additions and 85 deletions


@@ -26,10 +26,13 @@ The sidecar only needs to be downloaded once. Updates are detected automatically
## Basic Workflow
### 1. Import Audio
### 1. Import Audio or Video
- Click **Import Audio** or press **Ctrl+O** (Cmd+O on Mac)
- Supported formats: MP3, WAV, FLAC, OGG, M4A, AAC, WMA, MP4, MKV, AVI, MOV, WebM
- **Audio formats:** MP3, WAV, FLAC, OGG, M4A, AAC, WMA
- **Video formats:** MP4, MKV, AVI, MOV, WebM — audio is automatically extracted
> **Note:** Video file import requires [FFmpeg](#installing-ffmpeg) to be installed on your system.
### 2. Transcribe
@@ -181,8 +184,42 @@ If you prefer cloud-based AI:
---
## Installing FFmpeg
FFmpeg is required for importing video files (MP4, MKV, AVI, etc.). It's used to extract the audio track before transcription.
**Windows:**
```
winget install ffmpeg
```
Or download from [ffmpeg.org/download.html](https://ffmpeg.org/download.html) and add to your PATH.
**macOS:**
```
brew install ffmpeg
```
**Linux (Debian/Ubuntu):**
```
sudo apt install ffmpeg
```
**Linux (Fedora/RHEL):**
```
sudo dnf install ffmpeg
```
After installing, restart Voice to Notes. FFmpeg is not needed for audio-only files (MP3, WAV, FLAC, etc.).
---
## Troubleshooting
### Video import fails / "FFmpeg not found"
- Install FFmpeg using the instructions above
- Make sure `ffmpeg` is in your system PATH
- Restart Voice to Notes after installing
### Transcription is slow
- Use a smaller model (tiny or base)
- If you have an NVIDIA GPU, select CUDA in Settings > Transcription > Device


@@ -1,6 +1,6 @@
{
"name": "voice-to-notes",
"version": "0.2.28",
"version": "0.2.45",
"description": "Desktop app for transcribing audio/video with speaker identification",
"type": "module",
"scripts": {


@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "voice-to-notes"
version = "1.0.11"
version = "1.0.13"
description = "Python sidecar for Voice to Notes — transcription, diarization, and AI services"
requires-python = ">=3.11"
license = "MIT"


@@ -41,14 +41,23 @@ def _patch_pyannote_audio() -> None:
import torch
from pyannote.audio.core.io import Audio
# Cache loaded audio to avoid re-reading the entire file for every crop call.
# For a 3-hour file, crop is called 1000+ times — without caching, each call
# reads ~345MB from disk.
_audio_cache: dict[str, tuple] = {}
def _sf_load(audio_path: str) -> tuple:
"""Load audio via soundfile, return (channels, samples) tensor + sample_rate."""
data, sample_rate = sf.read(str(audio_path), dtype="float32")
"""Load audio via soundfile with caching."""
key = str(audio_path)
if key in _audio_cache:
return _audio_cache[key]
data, sample_rate = sf.read(key, dtype="float32")
waveform = torch.from_numpy(np.array(data))
if waveform.ndim == 1:
waveform = waveform.unsqueeze(0)
else:
waveform = waveform.T
_audio_cache[key] = (waveform, sample_rate)
return waveform, sample_rate
def _soundfile_call(self, file: dict) -> tuple:
@@ -56,7 +65,7 @@ def _patch_pyannote_audio() -> None:
return _sf_load(file["audio"])
def _soundfile_crop(self, file: dict, segment, **kwargs) -> tuple:
"""Replacement for Audio.crop — load full file then slice.
"""Replacement for Audio.crop — load file once (cached) then slice.
Pads short segments with zeros to match the expected duration,
which pyannote requires for batched embedding extraction.
@@ -279,13 +288,20 @@ class DiarizeService:
thread.start()
elapsed = 0.0
estimated_total = max(audio_duration_sec * 0.5, 30.0) if audio_duration_sec else 120.0
while not done_event.wait(timeout=2.0):
elapsed += 2.0
estimated_total = max(audio_duration_sec * 0.8, 30.0) if audio_duration_sec else 120.0
duration_str = ""
if audio_duration_sec and audio_duration_sec > 600:
mins = int(audio_duration_sec / 60)
duration_str = f" ({mins}min audio, this may take a while)"
while not done_event.wait(timeout=5.0):
elapsed += 5.0
pct = min(20 + int((elapsed / estimated_total) * 65), 85)
elapsed_min = int(elapsed / 60)
elapsed_sec = int(elapsed % 60)
time_str = f"{elapsed_min}m{elapsed_sec:02d}s" if elapsed_min > 0 else f"{int(elapsed)}s"
write_message(progress_message(
request_id, pct, "diarizing",
f"Analyzing speakers ({int(elapsed)}s elapsed)..."))
f"Analyzing speakers ({time_str} elapsed){duration_str}"))
thread.join()


@@ -113,17 +113,22 @@ class TranscribeService:
compute_type: str = "int8",
language: str | None = None,
on_segment: Callable[[SegmentResult, int], None] | None = None,
chunk_label: str | None = None,
) -> TranscriptionResult:
"""Transcribe an audio file with word-level timestamps.
Sends progress messages via IPC during processing.
If chunk_label is set (e.g. "Chunk 3/12"), messages are prefixed with it.
"""
# Stage: loading model
write_message(progress_message(request_id, 0, "loading_model", f"Loading {model_name}..."))
prefix = f"{chunk_label}: " if chunk_label else ""
# Stage: loading model (skip for chunks after the first — model already loaded)
if not chunk_label:
write_message(progress_message(request_id, 0, "loading_model", f"Loading {model_name}..."))
model = self._ensure_model(model_name, device, compute_type)
# Stage: transcribing
write_message(progress_message(request_id, 10, "transcribing", "Starting transcription..."))
write_message(progress_message(request_id, 10, "transcribing", f"{prefix}Starting transcription..."))
start_time = time.time()
segments_iter, info = model.transcribe(
@@ -176,7 +181,7 @@ class TranscribeService:
request_id,
progress_pct,
"transcribing",
f"Transcribing segment {segment_count} ({progress_pct}% of audio)...",
f"{prefix}Transcribing segment {segment_count} ({progress_pct}% of audio)...",
)
)
@@ -271,6 +276,7 @@ class TranscribeService:
chunk_result = self.transcribe(
request_id, tmp.name, model_name, device,
compute_type, language, on_segment=chunk_on_segment,
chunk_label=f"Chunk {chunk_idx + 1}/{num_chunks}",
)
# Offset timestamps and merge


@@ -1,6 +1,6 @@
[package]
name = "voice-to-notes"
version = "0.2.28"
version = "0.2.45"
description = "Voice to Notes — desktop transcription with speaker identification"
authors = ["Voice to Notes Contributors"]
license = "MIT"


@@ -0,0 +1,152 @@
use std::path::PathBuf;
use std::process::Command;
#[cfg(target_os = "windows")]
use std::os::windows::process::CommandExt;
/// Extract the audio track of a media file (video or audio) to a mono WAV file using ffmpeg.
/// Returns the path to the extracted audio file.
#[tauri::command]
pub fn extract_audio(file_path: String, output_path: Option<String>) -> Result<String, String> {
let input = PathBuf::from(&file_path);
if !input.exists() {
return Err(format!("File not found: {}", file_path));
}
// Use provided output path, or fall back to a temp WAV file
let stem = input.file_stem().unwrap_or_default().to_string_lossy();
let output = match output_path {
Some(ref p) => PathBuf::from(p),
None => std::env::temp_dir().join(format!("{stem}_audio.wav")),
};
eprintln!(
"[media] Extracting audio: {} -> {}",
input.display(),
output.display()
);
// Find ffmpeg — check sidecar extract dir first, then system PATH
let ffmpeg = find_ffmpeg().ok_or("ffmpeg not found. Install ffmpeg or ensure it's in PATH.")?;
let mut cmd = Command::new(&ffmpeg);
cmd.args([
"-y", // Overwrite output
"-i",
&file_path,
"-vn", // No video
"-acodec",
"pcm_s16le", // WAV PCM 16-bit
"-ar",
"22050", // 22.05 kHz sample rate for better playback quality
"-ac",
"1", // Mono
])
.arg(output.to_str().unwrap())
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::piped());
// Hide the console window on Windows (CREATE_NO_WINDOW = 0x08000000)
#[cfg(target_os = "windows")]
cmd.creation_flags(0x08000000);
let status = match cmd.status() {
Ok(s) => s,
Err(e) if e.raw_os_error() == Some(13) => {
// Permission denied — fix permissions and retry
eprintln!("[media] Permission denied on ffmpeg, fixing permissions and retrying...");
#[cfg(unix)]
{
use std::os::unix::fs::PermissionsExt;
if let Ok(meta) = std::fs::metadata(&ffmpeg) {
let mut perms = meta.permissions();
perms.set_mode(0o755);
let _ = std::fs::set_permissions(&ffmpeg, perms);
}
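// Also fix ffprobe if it exists. Swap only the file name so that a
// directory containing "ffmpeg" in its name is left untouched.
let ffprobe = std::path::Path::new(&ffmpeg).with_file_name("ffprobe");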
if let Ok(meta) = std::fs::metadata(&ffprobe) {
let mut perms = meta.permissions();
perms.set_mode(0o755);
let _ = std::fs::set_permissions(&ffprobe, perms);
}
}
Command::new(&ffmpeg)
.args(["-y", "-i", &file_path, "-vn", "-acodec", "pcm_s16le", "-ar", "22050", "-ac", "1"])
.arg(output.to_str().unwrap())
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::piped())
.status()
.map_err(|e| format!("Failed to run ffmpeg after chmod: {e}"))?
}
Err(e) => return Err(format!("Failed to run ffmpeg: {e}")),
};
if !status.success() {
return Err(format!("ffmpeg exited with status {status}"));
}
if !output.exists() {
return Err("ffmpeg completed but output file not found".to_string());
}
eprintln!("[media] Audio extracted successfully");
Ok(output.to_string_lossy().to_string())
}
#[tauri::command]
pub fn check_file_exists(path: String) -> bool {
std::path::Path::new(&path).exists()
}
#[tauri::command]
pub fn copy_file(src: String, dst: String) -> Result<(), String> {
std::fs::copy(&src, &dst).map_err(|e| format!("Failed to copy file: {e}"))?;
Ok(())
}
#[tauri::command]
pub fn create_dir(path: String) -> Result<(), String> {
std::fs::create_dir_all(&path).map_err(|e| format!("Failed to create directory: {e}"))?;
Ok(())
}
/// Find ffmpeg binary — check sidecar directory first, then system PATH.
fn find_ffmpeg() -> Option<String> {
// Check sidecar extract dir (ffmpeg is bundled with the sidecar)
if let Some(data_dir) = crate::sidecar::DATA_DIR.get() {
// Read sidecar version to find the right directory
let version_file = data_dir.join("sidecar-version.txt");
if let Ok(version) = std::fs::read_to_string(&version_file) {
let version = version.trim();
let sidecar_dir = data_dir.join(format!("sidecar-{version}"));
let ffmpeg_name = if cfg!(target_os = "windows") {
"ffmpeg.exe"
} else {
"ffmpeg"
};
let ffmpeg_path = sidecar_dir.join(ffmpeg_name);
if ffmpeg_path.exists() {
return Some(ffmpeg_path.to_string_lossy().to_string());
}
}
}
// Fall back to system PATH
let ffmpeg_name = if cfg!(target_os = "windows") {
"ffmpeg.exe"
} else {
"ffmpeg"
};
if Command::new(ffmpeg_name)
.arg("-version")
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status()
.is_ok()
{
return Some(ffmpeg_name.to_string());
}
None
}


@@ -1,5 +1,6 @@
pub mod ai;
pub mod export;
pub mod media;
pub mod project;
pub mod settings;
pub mod sidecar;


@@ -12,7 +12,12 @@ use crate::state::AppState;
pub struct ProjectFile {
pub version: u32,
pub name: String,
pub audio_file: String,
#[serde(default)]
pub audio_file: Option<String>,
#[serde(default)]
pub source_file: Option<String>,
#[serde(default)]
pub audio_wav: Option<String>,
pub created_at: String,
pub segments: Vec<ProjectFileSegment>,
pub speakers: Vec<ProjectFileSpeaker>,


@@ -197,15 +197,21 @@ pub async fn download_sidecar(app: AppHandle, variant: String) -> Result<(), Str
let extract_dir = data_dir.join(format!("sidecar-{}", sidecar_version));
SidecarManager::extract_zip(&zip_path, &extract_dir)?;
// Make the binary executable on Unix
// Make all binaries executable on Unix (sidecar, ffmpeg, ffprobe, etc.)
#[cfg(unix)]
{
use std::os::unix::fs::PermissionsExt;
let binary_path = extract_dir.join("voice-to-notes-sidecar");
if let Ok(meta) = std::fs::metadata(&binary_path) {
let mut perms = meta.permissions();
perms.set_mode(0o755);
let _ = std::fs::set_permissions(&binary_path, perms);
if let Ok(entries) = std::fs::read_dir(&extract_dir) {
for entry in entries.flatten() {
let path = entry.path();
if path.is_file() {
if let Ok(meta) = std::fs::metadata(&path) {
let mut perms = meta.permissions();
perms.set_mode(0o755);
let _ = std::fs::set_permissions(&path, perms);
}
}
}
}
}


@@ -9,6 +9,7 @@ use tauri::Manager;
use commands::ai::{ai_chat, ai_configure, ai_list_providers};
use commands::export::export_transcript;
use commands::media::{check_file_exists, copy_file, create_dir, extract_audio};
use commands::project::{
create_project, delete_project, get_project, list_projects, load_project_file,
load_project_transcript, save_project_file, save_project_transcript, update_segment,
@@ -73,6 +74,10 @@ pub fn run() {
check_sidecar_update,
log_frontend,
toggle_devtools,
extract_audio,
check_file_exists,
copy_file,
create_dir,
])
.run(tauri::generate_context!())
.expect("error while running tauri application");


@@ -113,16 +113,8 @@ impl SidecarManager {
));
}
// Make executable on Unix
#[cfg(unix)]
{
use std::os::unix::fs::PermissionsExt;
if let Ok(meta) = std::fs::metadata(&binary_path) {
let mut perms = meta.permissions();
perms.set_mode(0o755);
let _ = std::fs::set_permissions(&binary_path, perms);
}
}
Self::set_executable_permissions(&extract_dir);
Self::cleanup_old_sidecars(data_dir, &current_version);
Ok(binary_path)
@@ -207,6 +199,24 @@ impl SidecarManager {
/// Set execute permissions on all files in a directory (Unix only).
#[cfg(unix)]
fn set_executable_permissions(dir: &Path) {
use std::os::unix::fs::PermissionsExt;
if let Ok(entries) = std::fs::read_dir(dir) {
for entry in entries.flatten() {
let path = entry.path();
if path.is_file() {
if let Ok(meta) = std::fs::metadata(&path) {
let mut perms = meta.permissions();
perms.set_mode(0o755);
let _ = std::fs::set_permissions(&path, perms);
}
}
}
}
}
/// Remove old sidecar-* directories that don't match the current version.
/// Called after the current version's sidecar is confirmed ready.
pub(crate) fn cleanup_old_sidecars(data_dir: &Path, current_version: &str) {
let current_dir_name = format!("sidecar-{}", current_version);
@@ -321,12 +331,39 @@ impl SidecarManager {
#[cfg(target_os = "windows")]
cmd.creation_flags(0x08000000);
let child = cmd
.spawn()
.map_err(|e| format!("Failed to start sidecar binary: {e}"))?;
self.attach(child)?;
self.wait_for_ready()
match cmd.spawn() {
Ok(child) => {
self.attach(child)?;
self.wait_for_ready()
}
Err(e) if e.raw_os_error() == Some(13) => {
// Permission denied — fix permissions and retry once
eprintln!("[sidecar-rs] Permission denied, fixing permissions and retrying...");
if let Some(dir) = path.parent() {
Self::set_executable_permissions(dir);
}
let mut retry_cmd = Command::new(path);
retry_cmd
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.stderr(if let Some(data_dir) = DATA_DIR.get() {
let log_path = data_dir.join("sidecar.log");
std::fs::File::create(&log_path)
.map(Stdio::from)
.unwrap_or_else(|_| Stdio::inherit())
} else {
Stdio::inherit()
});
#[cfg(target_os = "windows")]
retry_cmd.creation_flags(0x08000000);
let child = retry_cmd
.spawn()
.map_err(|e| format!("Failed to start sidecar binary after chmod: {e}"))?;
self.attach(child)?;
self.wait_for_ready()
}
Err(e) => Err(format!("Failed to start sidecar binary: {e}")),
}
}
/// Spawn the Python sidecar in dev mode (system Python).


@@ -1,7 +1,7 @@
{
"$schema": "https://schema.tauri.app/config/2",
"productName": "Voice to Notes",
"version": "0.2.28",
"version": "0.2.45",
"identifier": "com.voicetonotes.app",
"build": {
"beforeDevCommand": "npm run dev",


@@ -1,7 +1,7 @@
<script lang="ts">
import { invoke } from '@tauri-apps/api/core';
import { segments, speakers } from '$lib/stores/transcript';
import { settings } from '$lib/stores/settings';
import { settings, configureAIProvider } from '$lib/stores/settings';
interface ChatMessage {
role: 'user' | 'assistant';
@@ -45,22 +45,12 @@
}));
// Ensure the provider is configured with current credentials before chatting
const s = $settings;
const configMap: Record<string, Record<string, string>> = {
openai: { api_key: s.openai_api_key, model: s.openai_model },
anthropic: { api_key: s.anthropic_api_key, model: s.anthropic_model },
litellm: { api_key: s.litellm_api_key, api_base: s.litellm_api_base, model: s.litellm_model },
local: { model: s.local_model_path, base_url: 'http://localhost:8080' },
};
const config = configMap[s.ai_provider];
if (config) {
await invoke('ai_configure', { provider: s.ai_provider, config });
}
await configureAIProvider($settings);
const result = await invoke<{ response: string }>('ai_chat', {
messages: chatMessages,
transcriptContext: getTranscriptContext(),
provider: s.ai_provider,
provider: $settings.ai_provider,
});
messages = [...messages, { role: 'assistant', content: result.response }];


@@ -4,9 +4,25 @@
percent?: number;
stage?: string;
message?: string;
onCancel?: () => void;
}
let { visible = false, percent = 0, stage = '', message = '' }: Props = $props();
let { visible = false, percent = 0, stage = '', message = '', onCancel }: Props = $props();
let showConfirm = $state(false);
function handleCancelClick() {
showConfirm = true;
}
function confirmCancel() {
showConfirm = false;
onCancel?.();
}
function dismissCancel() {
showConfirm = false;
}
// Pipeline steps in order
const pipelineSteps = [
@@ -89,6 +105,20 @@
<p class="status-text">{message || 'Please wait...'}</p>
<p class="hint-text">This may take several minutes for large files</p>
{#if onCancel && !showConfirm}
<button class="cancel-btn" onclick={handleCancelClick}>Cancel</button>
{/if}
{#if showConfirm}
<div class="confirm-box">
<p class="confirm-text">Processing is incomplete. If you cancel now, the transcription will need to be started over.</p>
<div class="confirm-actions">
<button class="confirm-keep" onclick={dismissCancel}>Continue Processing</button>
<button class="confirm-cancel" onclick={confirmCancel}>Cancel Processing</button>
</div>
</div>
{/if}
</div>
</div>
{/if}
@@ -174,4 +204,62 @@
font-size: 0.75rem;
color: #555;
}
.cancel-btn {
margin-top: 1.25rem;
width: 100%;
padding: 0.5rem;
background: none;
border: 1px solid #4a5568;
color: #999;
border-radius: 6px;
cursor: pointer;
font-size: 0.85rem;
}
.cancel-btn:hover {
color: #e0e0e0;
border-color: #e94560;
}
.confirm-box {
margin-top: 1.25rem;
padding: 0.75rem;
background: rgba(233, 69, 96, 0.08);
border: 1px solid #e94560;
border-radius: 6px;
}
.confirm-text {
margin: 0 0 0.75rem;
font-size: 0.8rem;
color: #e0e0e0;
line-height: 1.4;
}
.confirm-actions {
display: flex;
gap: 0.5rem;
}
.confirm-keep {
flex: 1;
padding: 0.4rem;
background: #0f3460;
border: 1px solid #4a5568;
color: #e0e0e0;
border-radius: 4px;
cursor: pointer;
font-size: 0.8rem;
}
.confirm-keep:hover {
background: #1a4a7a;
}
.confirm-cancel {
flex: 1;
padding: 0.4rem;
background: #e94560;
border: none;
color: white;
border-radius: 4px;
cursor: pointer;
font-size: 0.8rem;
}
.confirm-cancel:hover {
background: #d63851;
}
</style>
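
Review note: the overlay only reports the user's intent through `onCancel`; the page component then sets `transcriptionCancelled` and drops the pipeline result once the pending `invoke` resolves. Nothing in this diff stops the sidecar work itself. Below is a minimal sketch of that discard-on-completion pattern. `CancellableRun` is a hypothetical name used for illustration and does not appear in the codebase.

```typescript
// Hypothetical illustration of a flag-based "discard on completion" cancel.
// It does not interrupt the running work; it only ignores the result.
class CancellableRun {
  private cancelled = false;

  /** Mark the current run as cancelled. */
  cancel(): void {
    this.cancelled = true;
  }

  /** Run the work and return null if cancel() was called before it finished. */
  async run<T>(work: () => Promise<T>): Promise<T | null> {
    this.cancelled = false; // reset at the start of each run
    const result = await work();
    return this.cancelled ? null : result;
  }
}

const runner = new CancellableRun();
const pending = runner.run(async () => 'transcript');
runner.cancel(); // user confirms "Cancel Processing" while work is in flight
pending.then((value) => console.log(value === null ? 'discarded' : value));
```

If the long-running work needs to be stopped rather than ignored, the backend would need its own cancel command; that is outside what this diff shows.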

@@ -57,6 +57,12 @@
isReady = false;
});
wavesurfer.on('error', (err: Error) => {
console.error('[voice-to-notes] WaveSurfer error:', err);
isLoading = false;
loadError = 'Failed to load audio';
});
if (audioUrl) {
loadAudio(audioUrl);
}

@@ -52,23 +52,27 @@ export async function loadSettings(): Promise<void> {
}
}
export async function saveSettings(s: AppSettings): Promise<void> {
settings.set(s);
await invoke('save_settings', { settings: s });
// Configure the AI provider in the Python sidecar
export async function configureAIProvider(s: AppSettings): Promise<void> {
const configMap: Record<string, Record<string, string>> = {
openai: { api_key: s.openai_api_key, model: s.openai_model },
anthropic: { api_key: s.anthropic_api_key, model: s.anthropic_model },
litellm: { api_key: s.litellm_api_key, api_base: s.litellm_api_base, model: s.litellm_model },
local: { model: s.ollama_model, base_url: s.ollama_url + '/v1' },
local: { model: s.ollama_model, base_url: s.ollama_url.replace(/\/+$/, '') + '/v1' },
};
const config = configMap[s.ai_provider];
if (config) {
try {
await invoke('ai_configure', { provider: s.ai_provider, config });
} catch {
// Sidecar may not be running yet — provider will be configured on first use
// Sidecar may not be running yet
}
}
}
export async function saveSettings(s: AppSettings): Promise<void> {
settings.set(s);
await invoke('save_settings', { settings: s });
// Configure the AI provider in the Python sidecar
await configureAIProvider(s);
}
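
Review note: the `local` provider change strips trailing slashes from `ollama_url` before appending `/v1`, so a URL saved with a trailing slash no longer produces `//v1`. A small sketch of that normalization follows. The helper name `buildLocalBaseUrl` and the sample URL are assumptions for illustration, not part of the diff.

```typescript
// Hypothetical helper mirroring the expression used for the `local` provider.
function buildLocalBaseUrl(ollamaUrl: string): string {
  // Remove one or more trailing slashes, then append the API prefix.
  return ollamaUrl.replace(/\/+$/, '') + '/v1';
}

// Both forms resolve to the same base URL.
console.log(buildLocalBaseUrl('http://localhost:11434'));
console.log(buildLocalBaseUrl('http://localhost:11434/'));
```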

@@ -10,7 +10,7 @@
import SettingsModal from '$lib/components/SettingsModal.svelte';
import SidecarSetup from '$lib/components/SidecarSetup.svelte';
import { segments, speakers } from '$lib/stores/transcript';
import { settings, loadSettings } from '$lib/stores/settings';
import { settings, loadSettings, configureAIProvider } from '$lib/stores/settings';
import type { Segment, Speaker } from '$lib/types/transcript';
import { onMount, tick } from 'svelte';
@@ -31,7 +31,9 @@
// Project management state
let currentProjectPath = $state<string | null>(null);
let currentProjectName = $state('');
let projectIsV2 = $state(false);
let audioFilePath = $state('');
let audioWavPath = $state('');
async function checkSidecar() {
try {
@@ -54,6 +56,7 @@
function handleSidecarSetupComplete() {
sidecarReady = true;
configureAIProvider($settings);
checkSidecarUpdate();
}
@@ -71,6 +74,7 @@
});
checkSidecar().then(() => {
if (sidecarReady) {
configureAIProvider($settings);
checkSidecarUpdate();
}
});
@@ -117,25 +121,32 @@
};
});
let isTranscribing = $state(false);
let transcriptionCancelled = $state(false);
let transcriptionProgress = $state(0);
let transcriptionStage = $state('');
let transcriptionMessage = $state('');
let extractingAudio = $state(false);
function handleCancelProcessing() {
transcriptionCancelled = true;
isTranscribing = false;
transcriptionProgress = 0;
transcriptionStage = '';
transcriptionMessage = '';
// Clear any partial results
segments.set([]);
speakers.set([]);
}
// Speaker color palette for auto-assignment
const speakerColors = ['#e94560', '#4ecdc4', '#ffe66d', '#a8e6cf', '#ff8b94', '#c7ceea', '#ffd93d', '#6bcb77'];
async function saveProject() {
const defaultName = currentProjectName || 'Untitled';
const outputPath = await save({
defaultPath: `${defaultName}.vtn`,
filters: [{ name: 'Voice to Notes Project', extensions: ['vtn'] }],
});
if (!outputPath) return;
const projectData = {
version: 1,
name: outputPath.split(/[\\/]/).pop()?.replace('.vtn', '') || defaultName,
audio_file: audioFilePath,
function buildProjectData(projectName: string) {
return {
version: 2,
name: projectName,
source_file: audioFilePath,
audio_wav: 'audio.wav',
created_at: new Date().toISOString(),
segments: $segments.map(seg => {
const speaker = $speakers.find(s => s.id === seg.speaker_id);
@@ -159,17 +170,75 @@
color: s.color || '#e94560',
})),
};
}
/** Save to a specific folder — creates .vtn + audio.wav inside it. */
async function saveToFolder(folderPath: string): Promise<boolean> {
const projectName = folderPath.split(/[\\/]/).pop() || currentProjectName || 'Untitled';
const vtnPath = `${folderPath}/${projectName}.vtn`;
const wavPath = `${folderPath}/audio.wav`;
const projectData = buildProjectData(projectName);
try {
await invoke('save_project_file', { path: outputPath, project: projectData });
currentProjectPath = outputPath;
currentProjectName = projectData.name;
await invoke('create_dir', { path: folderPath });
if (audioWavPath && audioWavPath !== wavPath) {
await invoke('copy_file', { src: audioWavPath, dst: wavPath });
audioWavPath = wavPath;
}
await invoke('save_project_file', { path: vtnPath, project: projectData });
currentProjectPath = vtnPath;
currentProjectName = projectName;
projectIsV2 = true;
return true;
} catch (err) {
console.error('Failed to save project:', err);
alert(`Failed to save: ${err}`);
return false;
}
}
async function saveProject() {
// Already saved as v2 folder — save in place
if (currentProjectPath && projectIsV2) {
const folderPath = currentProjectPath.replace(/[\\/][^\\/]+$/, '');
await saveToFolder(folderPath);
return;
}
// V1 project opened — migrate to folder structure
if (currentProjectPath && !projectIsV2) {
const oldVtnDir = currentProjectPath.replace(/[\\/][^\\/]+$/, '');
const projectName = currentProjectPath.split(/[\\/]/).pop()?.replace(/\.vtn$/i, '') || 'Untitled';
const folderPath = `${oldVtnDir}/${projectName}`;
const success = await saveToFolder(folderPath);
if (success) {
// Optionally remove the old .vtn file
try {
// Leave old file — user can delete manually
} catch {}
}
return;
}
// Never saved — pick a folder
await saveProjectAs();
}
async function saveProjectAs() {
// Use save dialog so the user can type a new project name.
// The chosen path is treated as the project folder (created if needed).
const defaultName = currentProjectName || 'Untitled';
const chosenPath = await save({
defaultPath: defaultName,
title: 'Save Project — enter a project name',
});
if (!chosenPath) return;
// Strip any file extension the user may have typed (e.g. ".vtn")
const folderPath = chosenPath.replace(/\.[^.\\/]+$/, '');
await saveToFolder(folderPath);
}
async function openProject() {
const filePath = await open({
filters: [{ name: 'Voice to Notes Project', extensions: ['vtn'] }],
@@ -179,9 +248,11 @@
try {
const project = await invoke<{
version: number;
version?: number;
name: string;
audio_file: string;
audio_file?: string;
source_file?: string;
audio_wav?: string;
segments: Array<{
text: string;
start_ms: number;
@@ -231,10 +302,135 @@
}));
segments.set(newSegments);
// Load audio
audioFilePath = project.audio_file;
audioUrl = convertFileSrc(project.audio_file);
waveformPlayer?.loadAudio(audioUrl);
// Determine the directory the .vtn file is in
const vtnDir = (filePath as string).replace(/[\\/][^\\/]+$/, '');
const version = project.version ?? 1;
projectIsV2 = version >= 2;
// Resolve audio for wavesurfer playback
if (version >= 2) {
// Version 2: audio_wav is relative to the .vtn directory, source_file is the original import path
audioFilePath = project.source_file || '';
const wavRelative = project.audio_wav || 'audio.wav';
const resolvedWav = `${vtnDir}/${wavRelative}`;
const wavExists = await invoke<boolean>('check_file_exists', { path: resolvedWav });
if (wavExists) {
audioWavPath = resolvedWav;
audioUrl = convertFileSrc(resolvedWav);
waveformPlayer?.loadAudio(audioUrl);
} else {
// WAV missing — try re-extracting from the original source file
const sourceExists = audioFilePath ? await invoke<boolean>('check_file_exists', { path: audioFilePath }) : false;
if (sourceExists) {
extractingAudio = true;
await tick();
try {
const outputPath = `${vtnDir}/${wavRelative}`;
const wavPath = await invoke<string>('extract_audio', { filePath: audioFilePath, outputPath });
audioWavPath = wavPath;
audioUrl = convertFileSrc(wavPath);
waveformPlayer?.loadAudio(audioUrl);
} catch (err) {
console.error('Failed to re-extract audio:', err);
alert(`Failed to re-extract audio: ${err}`);
} finally {
extractingAudio = false;
}
} else {
// Both missing — ask user to locate the file
const shouldRelink = confirm(
'The audio file for this project could not be found.\n\n' +
`Original source: ${audioFilePath || '(unknown)'}\n\n` +
'Would you like to locate the file?'
);
if (shouldRelink) {
const newPath = await open({
multiple: false,
filters: [{
name: 'Audio/Video',
extensions: ['mp3', 'wav', 'flac', 'ogg', 'm4a', 'aac', 'wma',
'mp4', 'mkv', 'avi', 'mov', 'webm'],
}],
});
if (newPath) {
audioFilePath = newPath;
extractingAudio = true;
await tick();
try {
const outputPath = `${vtnDir}/${wavRelative}`;
const wavPath = await invoke<string>('extract_audio', { filePath: newPath, outputPath });
audioWavPath = wavPath;
audioUrl = convertFileSrc(wavPath);
waveformPlayer?.loadAudio(audioUrl);
} catch (err) {
console.error('Failed to extract audio from re-linked file:', err);
alert(`Failed to extract audio: ${err}`);
} finally {
extractingAudio = false;
}
}
}
}
}
} else {
// Version 1 (legacy): audio_file is the source path
const sourceFile = project.audio_file || '';
audioFilePath = sourceFile;
const sourceExists = sourceFile ? await invoke<boolean>('check_file_exists', { path: sourceFile }) : false;
if (sourceExists) {
// Extract WAV next to the .vtn file for playback
extractingAudio = true;
await tick();
try {
const outputPath = `${vtnDir}/audio.wav`;
const wavPath = await invoke<string>('extract_audio', { filePath: sourceFile, outputPath });
audioWavPath = wavPath;
audioUrl = convertFileSrc(wavPath);
waveformPlayer?.loadAudio(audioUrl);
} catch (err) {
console.error('Failed to extract audio:', err);
alert(`Failed to extract audio: ${err}`);
} finally {
extractingAudio = false;
}
} else {
// Source missing — ask user to locate the file
const shouldRelink = confirm(
'The audio file for this project could not be found.\n\n' +
`Original path: ${sourceFile || '(unknown)'}\n\n` +
'Would you like to locate the file?'
);
if (shouldRelink) {
const newPath = await open({
multiple: false,
filters: [{
name: 'Audio/Video',
extensions: ['mp3', 'wav', 'flac', 'ogg', 'm4a', 'aac', 'wma',
'mp4', 'mkv', 'avi', 'mov', 'webm'],
}],
});
if (newPath) {
audioFilePath = newPath;
extractingAudio = true;
await tick();
try {
const outputPath = `${vtnDir}/audio.wav`;
const wavPath = await invoke<string>('extract_audio', { filePath: newPath, outputPath });
audioWavPath = wavPath;
audioUrl = convertFileSrc(wavPath);
waveformPlayer?.loadAudio(audioUrl);
} catch (err) {
console.error('Failed to extract audio from re-linked file:', err);
alert(`Failed to extract audio: ${err}`);
} finally {
extractingAudio = false;
}
}
}
}
}
currentProjectPath = filePath as string;
currentProjectName = project.name;
@@ -265,9 +461,35 @@
});
if (!filePath) return;
// Track the original file path and convert to asset URL for wavesurfer
// Always extract audio to WAV for wavesurfer playback
extractingAudio = true;
await tick();
try {
const wavPath = await invoke<string>('extract_audio', { filePath });
audioWavPath = wavPath;
} catch (err) {
console.error('[voice-to-notes] Failed to extract audio:', err);
const msg = String(err);
if (msg.includes('ffmpeg not found')) {
alert(
'FFmpeg is required to extract audio.\n\n' +
'Install FFmpeg:\n' +
' Windows: winget install ffmpeg\n' +
' macOS: brew install ffmpeg\n' +
' Linux: sudo apt install ffmpeg\n\n' +
'Then restart Voice to Notes and try again.'
);
} else {
alert(`Failed to extract audio: ${msg}`);
}
return;
} finally {
extractingAudio = false;
}
// Track the original file path for the sidecar (it does its own conversion)
audioFilePath = filePath;
audioUrl = convertFileSrc(filePath);
audioUrl = convertFileSrc(audioWavPath);
waveformPlayer?.loadAudio(audioUrl);
// Clear previous results
@@ -276,6 +498,7 @@
// Start pipeline (transcription + diarization)
isTranscribing = true;
transcriptionCancelled = false;
transcriptionProgress = 0;
transcriptionStage = 'Starting...';
transcriptionMessage = 'Initializing pipeline...';
@@ -386,6 +609,9 @@
numSpeakers: $settings.num_speakers && $settings.num_speakers > 0 ? $settings.num_speakers : undefined,
});
// If cancelled while processing, discard results
if (transcriptionCancelled) return;
// Create speaker entries from pipeline result
const newSpeakers: Speaker[] = (result.speakers || []).map((label, idx) => ({
id: `speaker-${idx}`,
@@ -524,7 +750,10 @@
</button>
{#if $segments.length > 0}
<button class="settings-btn" onclick={saveProject}>
Save Project
Save
</button>
<button class="settings-btn" onclick={saveProjectAs}>
Save As
</button>
{/if}
<button class="import-btn" onclick={handleFileImport} disabled={isTranscribing}>
@@ -573,8 +802,18 @@
percent={transcriptionProgress}
stage={transcriptionStage}
message={transcriptionMessage}
onCancel={handleCancelProcessing}
/>
{#if extractingAudio}
<div class="extraction-overlay">
<div class="extraction-card">
<div class="extraction-spinner"></div>
<p>Extracting audio...</p>
</div>
</div>
{/if}
<SettingsModal
visible={showSettings}
onClose={() => showSettings = false}
@@ -781,4 +1020,39 @@
.update-dismiss:hover {
color: #e0e0e0;
}
/* Audio extraction overlay */
.extraction-overlay {
position: fixed;
inset: 0;
background: rgba(0, 0, 0, 0.8);
display: flex;
align-items: center;
justify-content: center;
z-index: 9999;
}
.extraction-card {
background: #16213e;
padding: 2rem 2.5rem;
border-radius: 12px;
color: #e0e0e0;
border: 1px solid #2a3a5e;
box-shadow: 0 8px 32px rgba(0, 0, 0, 0.5);
display: flex;
flex-direction: column;
align-items: center;
gap: 1rem;
}
.extraction-card p {
margin: 0;
font-size: 1rem;
}
.extraction-spinner {
width: 32px;
height: 32px;
border: 3px solid #2a3a5e;
border-top-color: #e94560;
border-radius: 50%;
animation: spin 0.8s linear infinite;
}
</style>
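
Review note: the save and open logic in the page component derives folder and project names with three inline regular expressions. The sketch below pulls them into named helpers to make their behavior easier to check. The function names are hypothetical; only the regular expressions are taken from the diff.

```typescript
// Hypothetical helpers wrapping the regular expressions used in the diff.

/** Parent directory of a path; accepts both "/" and "\" separators. */
function parentDir(path: string): string {
  return path.replace(/[\\/][^\\/]+$/, '');
}

/** Last path component with a trailing ".vtn" (any case) removed. */
function projectNameFromVtn(path: string): string {
  return path.split(/[\\/]/).pop()?.replace(/\.vtn$/i, '') || 'Untitled';
}

/** Strip one trailing extension, as saveProjectAs does with the dialog result. */
function stripExtension(path: string): string {
  return path.replace(/\.[^.\\/]+$/, '');
}

console.log(parentDir('/home/me/Talk/Talk.vtn'));        // "/home/me/Talk"
console.log(projectNameFromVtn('C:\\proj\\Talk.VTN'));   // "Talk"
console.log(stripExtension('/home/me/Interview.vtn'));   // "/home/me/Interview"
console.log(stripExtension('/home/me/Meeting v1.2'));    // "/home/me/Meeting v1"
```

The last call shows an edge case worth knowing about: a project name that contains a dot loses everything after the final dot, because the extension pattern cannot tell a version suffix from a file extension.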