Update README, add User Guide and Contributing docs
- README: Updated to reflect current architecture (decoupled app/sidecar), Ollama as local AI, CUDA support, split CI workflows
- USER_GUIDE.md: Complete how-to including first-time setup, transcription workflow, speaker detection setup, Ollama configuration, export formats, keyboard shortcuts, and troubleshooting
- CONTRIBUTING.md: Dev setup, project structure, conventions, CI/CD overview

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CONTRIBUTING.md (new file, 140 lines)
# Contributing to Voice to Notes

Thank you for your interest in contributing! This guide covers how to set up the project for development and submit changes.

## Development Setup

### Prerequisites

- **Node.js 20+** and npm
- **Rust** (stable toolchain)
- **Python 3.11+** with [uv](https://docs.astral.sh/uv/) (recommended) or pip
- **System libraries (Linux only):**

  ```bash
  sudo apt install libgtk-3-dev libwebkit2gtk-4.1-dev libappindicator3-dev librsvg2-dev patchelf xdg-utils
  ```
### Clone and Install

```bash
git clone https://repo.anhonesthost.net/MacroPad/voice-to-notes.git
cd voice-to-notes

# Frontend
npm install

# Python sidecar
cd python && pip install -e ".[dev]" && cd ..
```
### Running in Dev Mode

```bash
npm run tauri:dev
```

This runs the Svelte dev server + Tauri with hot-reload. The Python sidecar runs from your system Python (no PyInstaller needed in dev mode).
### Building

```bash
# Build the Python sidecar (frozen binary)
cd python && python build_sidecar.py --cpu-only && cd ..

# Build the full app
npm run tauri build
```
## Project Structure

```
src/                    # Svelte 5 frontend
  lib/components/       # Reusable UI components
  lib/stores/           # Svelte stores (app state)
  routes/               # SvelteKit pages
src-tauri/              # Rust backend (Tauri v2)
  src/sidecar/          # Python sidecar lifecycle (download, extract, IPC)
  src/commands/         # Tauri command handlers
  src/db/               # SQLite database layer
python/                 # Python ML sidecar
  voice_to_notes/       # Main package
    services/           # Transcription, diarization, AI, export
    ipc/                # JSON-line IPC protocol
    hardware/           # GPU/CPU detection
.gitea/workflows/       # CI/CD pipelines
docs/                   # Documentation
```
## How It Works

The app has three layers:

1. **Frontend (Svelte)** — UI, audio playback (wavesurfer.js), transcript editing (TipTap)
2. **Backend (Rust/Tauri)** — Desktop integration, file access, SQLite, sidecar process management
3. **Sidecar (Python)** — ML inference (faster-whisper, pyannote.audio), AI chat, export

Rust and Python communicate via **JSON-line IPC** over stdin/stdout pipes. Each request has an `id`, `type`, and `payload`. The Python sidecar runs as a child process managed by `SidecarManager` in Rust.
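The protocol is small enough to sketch in a few lines of Python. This is illustrative only — the real dispatch lives in `python/voice_to_notes/ipc/`, and the handler names here are hypothetical:

```python
import json
import sys


def handle_line(handlers: dict, line: str) -> str:
    """Dispatch one JSON-line request and return the JSON-line response."""
    request = json.loads(line)
    handler = handlers.get(request["type"])
    if handler is None:
        payload = {"error": f"unknown request type: {request['type']}"}
    else:
        payload = handler(request.get("payload", {}))
    # Echo the request id so the Rust side can match responses to callers.
    return json.dumps({"id": request["id"], "type": request["type"], "payload": payload})


def serve(handlers: dict) -> None:
    """Read one JSON request per line from stdin, write one response per line."""
    for line in sys.stdin:
        if line.strip():
            sys.stdout.write(handle_line(handlers, line) + "\n")
            sys.stdout.flush()  # flush per message so the Rust side isn't left waiting
```

Keeping each message on a single line makes framing trivial on the Rust side: read until `\n`, parse, match on `id`.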
## Conventions

### Rust

- Follow standard Rust conventions
- Run `cargo fmt` and `cargo clippy` before committing
- Tauri commands go in `src-tauri/src/commands/`

### Python

- Python 3.11+, type hints everywhere
- Use `ruff` for linting: `ruff check python/`
- Tests with pytest: `cd python && pytest`
- IPC messages: JSON-line format with `id`, `type`, `payload` fields

### TypeScript / Svelte

- Svelte 5 runes (`$state`, `$derived`, `$effect`)
- Strict TypeScript
- Components in `src/lib/components/`
- State in `src/lib/stores/`

### General

- All timestamps in milliseconds (integer)
- UUIDs as primary keys in the database
- Don't bundle API keys or secrets — those are user-configured
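As a worked example of the timestamp convention (the helper names here are illustrative, not the app's actual API): faster-whisper reports float seconds, which get converted to integer milliseconds at the boundary and formatted only for display.

```python
def to_ms(seconds: float) -> int:
    """Convert a float-seconds timestamp (as reported by faster-whisper)
    to the integer-millisecond form used throughout the app."""
    return round(seconds * 1000)


def format_ms(ms: int) -> str:
    """Render an integer-millisecond timestamp as H:MM:SS.mmm for display."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours}:{minutes:02}:{seconds:02}.{millis:03}"
```

Storing integers avoids float-drift when comparing or sorting word timings in SQLite.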
## Submitting Changes

1. Fork the repository
2. Create a feature branch: `git checkout -b my-feature`
3. Make your changes
4. Test locally with `npm run tauri:dev`
5. Run linters: `cargo fmt && cargo clippy`, `ruff check python/`
6. Commit with a clear message describing the change
7. Open a Pull Request against `main`
## CI/CD

Pushes to `main` automatically:

- Bump the app version and create a release (`release.yml`)
- Build app installers for all platforms

Changes to `python/` also trigger sidecar builds (`build-sidecar.yml`).
## Areas for Contribution

- UI/UX improvements
- New export formats
- Additional AI provider integrations
- Performance optimizations
- Accessibility improvements
- Documentation and translations
- Bug reports and testing on different platforms
## Reporting Issues

Open an issue on the [repository](https://repo.anhonesthost.net/MacroPad/voice-to-notes/issues) with:

- Steps to reproduce
- Expected vs actual behavior
- Platform and version info
- Sidecar logs (`%LOCALAPPDATA%\com.voicetonotes.app\sidecar.log` on Windows)
## License

By contributing, you agree that your contributions will be licensed under the [MIT License](LICENSE).
README.md (111 lines changed)
# Voice to Notes

A desktop application that transcribes audio and video recordings with speaker identification, synchronized playback, and AI-powered analysis. Export to SRT, WebVTT, ASS captions, plain text, or Markdown.

## Features

- **Speech-to-Text** — Accurate transcription via faster-whisper with word-level timestamps. Supports 99 languages.
- **Speaker Identification** — Detect and label speakers using pyannote.audio. Rename speakers for clean exports.
- **GPU Acceleration** — CUDA support for NVIDIA GPUs (Windows/Linux). Falls back to CPU automatically.
- **Synchronized Playback** — Click any word to seek. Waveform visualization via wavesurfer.js.
- **AI Chat** — Ask questions about your transcript. Works with Ollama (local), OpenAI, Anthropic, or any OpenAI-compatible API.
- **Export** — SRT, WebVTT, ASS, plain text, Markdown — all with speaker labels.
- **Cross-Platform** — Linux, Windows, macOS (Apple Silicon).
## Quick Start

1. Download the installer from [Releases](https://repo.anhonesthost.net/MacroPad/voice-to-notes/releases)
2. On first launch, choose the **CPU** or **CUDA** sidecar (the AI engine downloads separately, ~500MB–2GB)
3. Import an audio/video file and click **Transcribe**

See the full [User Guide](docs/USER_GUIDE.md) for detailed setup and usage instructions.
## Platform Support

| Platform | Architecture | Installers |
|----------|--------------|------------|
| Linux | x86_64 | .deb, .rpm |
| Windows | x86_64 | .msi, .exe (NSIS) |
| macOS | ARM (Apple Silicon) | .dmg |
## Architecture

The app is split into two independently versioned components:

- **App** (v0.2.x) — Tauri desktop shell with Svelte frontend. Small installer (~50MB).
- **Sidecar** (v1.x) — Python ML engine (faster-whisper, pyannote.audio). Downloaded on first launch. CPU (~500MB) or CUDA (~2GB) variants.

This separation means app UI updates don't require re-downloading the sidecar, and sidecar updates don't require reinstalling the app.
## Tech Stack

| Component | Technology |
|-----------|------------|
| Desktop shell | Tauri v2 (Rust + Svelte 5 / TypeScript) |
| Transcription | faster-whisper (CTranslate2) |
| Speaker ID | pyannote.audio 3.1 |
| Audio UI | wavesurfer.js |
| Transcript editor | TipTap (ProseMirror) |
| AI (local) | Ollama (any model) |
| AI (cloud) | OpenAI, Anthropic, OpenAI-compatible |
| Caption export | pysubs2 |
| Database | SQLite (rusqlite) |

## Development
### Prerequisites

- Node.js 20+
- Rust (stable)
- Python 3.11+ with uv or pip
- Linux: `libgtk-3-dev`, `libwebkit2gtk-4.1-dev`, `libappindicator3-dev`, `librsvg2-dev`
### Getting Started

```bash
npm install

# Install Python sidecar dependencies
cd python && pip install -e ".[dev]" && cd ..

# Run in dev mode (uses system Python for the sidecar)
npm run tauri:dev
```
### Building

```bash
# Build the frozen Python sidecar (CPU-only)
cd python && python build_sidecar.py --cpu-only && cd ..

# Build with CUDA support
cd python && python build_sidecar.py --with-cuda && cd ..

# Build the Tauri app
npm run tauri build
```
### CI/CD

Two Gitea Actions workflows in `.gitea/workflows/`:

**`release.yml`** — Triggers on push to `main`:

1. Bumps the app version (patch), creates a git tag and Gitea release
2. Builds lightweight app installers for all platforms (no sidecar bundled)

**`build-sidecar.yml`** — Triggers on changes to `python/` or manual dispatch:

1. Bumps the sidecar version, creates a `sidecar-v*` tag and release
2. Builds CPU + CUDA variants for Linux/Windows, CPU for macOS
3. Uploads them as separate release assets
#### Required Secrets

| Secret | Purpose |
|--------|---------|
| `BUILD_TOKEN` | Gitea API token for creating releases and pushing tags |
### Project Structure

```
src/                    # Svelte 5 frontend
  lib/components/       # UI components (waveform, transcript editor, settings, etc.)
  lib/stores/           # Svelte stores (settings, transcript state)
  routes/               # SvelteKit pages
src-tauri/              # Rust backend
  src/sidecar/          # Sidecar process manager (download, extract, IPC)
  src/commands/         # Tauri command handlers
  nsis-hooks.nsh        # Windows uninstall cleanup
python/                 # Python sidecar
  voice_to_notes/       # Python package (transcription, diarization, AI, export)
  build_sidecar.py      # PyInstaller build script
  voice_to_notes.spec   # PyInstaller spec
.gitea/workflows/       # CI/CD (release.yml, build-sidecar.yml)
docs/                   # Documentation
```

## License
docs/USER_GUIDE.md (new file, 203 lines)
# Voice to Notes — User Guide

## Getting Started

### Installation

Download the installer for your platform from the [Releases](https://repo.anhonesthost.net/MacroPad/voice-to-notes/releases) page:

- **Windows:** `.msi` or `-setup.exe`
- **Linux:** `.deb` or `.rpm`
- **macOS:** `.dmg`

### First-Time Setup

On first launch, Voice to Notes will prompt you to download its AI engine (the "sidecar"):

1. Choose **Standard (CPU)** (~500 MB) or **GPU Accelerated (CUDA)** (~2 GB)
   - Choose CUDA if you have an NVIDIA GPU for significantly faster transcription
   - CPU works on all computers
2. Click **Download & Install** and wait for the download to complete
3. The app will proceed to the main interface once the sidecar is ready

The sidecar only needs to be downloaded once. Updates are detected automatically on launch.

---
## Basic Workflow

### 1. Import Audio

- Click **Import Audio** or press **Ctrl+O** (Cmd+O on Mac)
- Supported formats: MP3, WAV, FLAC, OGG, M4A, AAC, WMA, MP4, MKV, AVI, MOV, WebM

### 2. Transcribe

After importing, click **Transcribe** to start the transcription pipeline:

- **Transcription:** Converts speech to text with word-level timestamps
- **Speaker Detection:** Identifies different speakers (if configured — see [Speaker Detection](#speaker-detection))
- A progress bar shows the current stage and percentage

### 3. Review and Edit

- The **waveform** displays at the top — click anywhere to seek
- The **transcript** shows below with speaker labels and timestamps
- **Click any word** in the transcript to jump to that point in the audio
- The current word highlights during playback
- **Edit text** directly in the transcript — word timings are preserved
### 4. Export

Click **Export** and choose a format:

| Format | Extension | Best For |
|--------|-----------|----------|
| SRT | `.srt` | Video subtitles (most compatible) |
| WebVTT | `.vtt` | Web video players, HTML5 |
| ASS/SSA | `.ass` | Styled subtitles with speaker colors |
| Plain Text | `.txt` | Reading, sharing, pasting |
| Markdown | `.md` | Documentation, notes |

All formats include speaker labels when speaker detection is enabled.
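As a rough illustration, an SRT export with speaker detection enabled looks like this (the timings and text are made up; exact label formatting may differ):

```
1
00:00:00,000 --> 00:00:03,200
Speaker 1: Thanks everyone for joining today.

2
00:00:03,200 --> 00:00:05,800
Speaker 2: Glad to be here.
```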
### 5. Save Project

- **Ctrl+S** (Cmd+S) saves the current project as a `.vtn` file
- This preserves the full transcript, speaker assignments, and edits
- Reopen later to continue editing or re-export

---

## Playback Controls

| Action | Shortcut |
|--------|----------|
| Play / Pause | **Space** |
| Skip back 5s | **Left Arrow** |
| Skip forward 5s | **Right Arrow** |
| Seek to word | Click any word in the transcript |
| Import audio | **Ctrl+O** / **Cmd+O** |
| Open settings | **Ctrl+,** / **Cmd+,** |

---
## Speaker Detection

Speaker detection (diarization) identifies who is speaking at each point in the audio. It requires a one-time setup:

### Setup

1. Go to **Settings > Speakers**
2. Create a free account at [huggingface.co](https://huggingface.co/join)
3. Accept the license on **all three** model pages:
   - [pyannote/speaker-diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1)
   - [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0)
   - [pyannote/speaker-diarization-community-1](https://huggingface.co/pyannote/speaker-diarization-community-1)
4. Create a token at [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens) (read access is sufficient)
5. Paste the token in Settings and click **Test & Download Model**

### Speaker Options

- **Number of speakers:** Set to auto-detect or specify a fixed number for faster results
- **Skip speaker detection:** Check this to only transcribe without identifying speakers

### Managing Speakers

After transcription, speakers appear as "Speaker 1", "Speaker 2", etc. in the left sidebar. Double-click a speaker name to rename it — the new name appears throughout the transcript and in exports.

---
## AI Chat

The AI chat panel lets you ask questions about your transcript. The AI sees the full transcript with speaker labels as context.

Example prompts:

- "Summarize this conversation"
- "What were the key action items?"
- "What did Speaker 1 say about the budget?"

### Setting Up Ollama (Local AI)

[Ollama](https://ollama.com) runs AI models locally on your computer — no API keys or internet required.

1. **Install Ollama:**
   - Download from [ollama.com](https://ollama.com)
   - Or on Linux: `curl -fsSL https://ollama.com/install.sh | sh`

2. **Pull a model:**
   ```bash
   ollama pull llama3.2
   ```
   Other good options: `mistral`, `gemma2`, `phi3`

3. **Configure in Voice to Notes:**
   - Go to **Settings > AI Provider**
   - Select **Ollama**
   - URL: `http://localhost:11434` (default, usually no change needed)
   - Model: `llama3.2` (or whichever model you pulled)

4. **Use:** Open the AI chat panel (right sidebar) and start asking questions
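If the chat panel can't connect, first confirm Ollama is actually running (`ollama list` in a terminal should print your models). As a further check outside the app, this small Python snippet (illustrative, not part of Voice to Notes) asks the server which models it has via Ollama's `/api/tags` endpoint:

```python
import json
import urllib.error
import urllib.request


def list_ollama_models(base_url: str = "http://localhost:11434"):
    """Return installed Ollama model names, or None if no server
    is reachable at base_url."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None


if __name__ == "__main__":
    models = list_ollama_models()
    print("Ollama not reachable" if models is None else models)
```

If this prints `Ollama not reachable`, start the server (`ollama serve` or the desktop app) before retrying in Voice to Notes.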
### Cloud AI Providers

If you prefer cloud-based AI:

**OpenAI:**

- Select **OpenAI** in Settings > AI Provider
- Enter your API key from [platform.openai.com/api-keys](https://platform.openai.com/api-keys)
- Default model: `gpt-4o-mini`

**Anthropic:**

- Select **Anthropic** in Settings > AI Provider
- Enter your API key from [console.anthropic.com](https://console.anthropic.com)
- Default model: `claude-sonnet-4-6`

**OpenAI Compatible:**

- For any provider with an OpenAI-compatible API (vLLM, LiteLLM, etc.)
- Enter the API base URL, key, and model name

---
## Settings Reference

### Transcription

| Setting | Options | Default |
|---------|---------|---------|
| Whisper Model | tiny, base, small, medium, large-v3 | base |
| Device | CPU, CUDA | CPU |
| Language | Auto-detect, or specify (en, es, fr, etc.) | Auto-detect |

**Model recommendations:**

- **tiny/base:** Fast, good for clear audio with one speaker
- **small:** Best balance of speed and accuracy
- **medium:** Better accuracy, noticeably slower
- **large-v3:** Best accuracy, requires 8GB+ VRAM (GPU) or 16GB+ RAM (CPU)

### Debug

- **Enable Developer Tools:** Opens the browser inspector for debugging

---
## Troubleshooting

### Transcription is slow

- Use a smaller model (tiny or base)
- If you have an NVIDIA GPU, select CUDA in Settings > Transcription > Device
- Ensure you downloaded the CUDA sidecar during setup

### Speaker detection not working

- Verify your HuggingFace token in Settings > Speakers
- Click **Test & Download Model** to re-download
- Make sure you accepted the license on all three model pages

### Audio won't play / No waveform

- Check that the audio file still exists at its original location
- Try re-importing the file
- Supported playback formats: MP3, WAV, FLAC, OGG, M4A, AAC, WMA

### App shows "Setting up Voice to Notes"

- This is the first-launch sidecar download — it only happens once
- If it fails, check your internet connection and click **Retry**