local-transcription/config/default_config.yaml
Josh Knapp 0ba84e6ddd Improve transcription accuracy with overlapping audio chunks
Changes:
1. Changed UI text from "Recording" to "Transcribing" for clarity
2. Implemented overlapping audio chunks to prevent word cutoff

Audio Overlap Feature:
- Added overlap_duration parameter (default: 0.5 seconds)
- Audio chunks now overlap by 0.5s to capture words at boundaries
- Prevents missed words when chunks are processed separately
- Configurable via audio.overlap_duration in config.yaml

How it works:
- Each 3-second chunk begins with the last 0.5s of the previous chunk
- Buffer advances by (chunk_size - overlap_size) instead of a full chunk
- Ensures words at chunk boundaries are captured in at least one chunk
- Relies on Whisper's context handling to avoid duplicate transcription of the overlapped audio

Example with 3s chunks and 0.5s overlap:
  Chunk 1: [0.0s - 3.0s]
  Chunk 2: [2.5s - 5.5s]  <- 0.5s overlap
  Chunk 3: [5.0s - 8.0s]  <- 0.5s overlap
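
The windowing described above can be sketched in a few lines of Python. This is a minimal illustration and not the code in client/audio_capture.py: the function name `overlapping_windows` and its parameters are hypothetical, and it works on sample indices only, assuming the 16000 Hz sample rate from the default config.

```python
from typing import Iterator, Tuple


def overlapping_windows(
    total_samples: int, chunk_size: int, overlap_size: int
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) sample indices for full chunks that overlap.

    Hypothetical helper for illustration; not taken from the repository.
    """
    if not 0 <= overlap_size < chunk_size:
        raise ValueError("overlap_size must be >= 0 and smaller than chunk_size")
    # The buffer advances by less than a full chunk, so the tail of one
    # chunk is repeated at the head of the next.
    step = chunk_size - overlap_size
    start = 0
    while start + chunk_size <= total_samples:
        yield start, start + chunk_size
        start += step


SAMPLE_RATE = 16000                    # assumed, matches audio.sample_rate
chunk_size = int(3.0 * SAMPLE_RATE)    # 3-second chunks
overlap_size = int(0.5 * SAMPLE_RATE)  # 0.5-second overlap

# Windows over 8 seconds of audio, converted back to seconds.
windows = list(overlapping_windows(8 * SAMPLE_RATE, chunk_size, overlap_size))
windows_seconds = [(s / SAMPLE_RATE, e / SAMPLE_RATE) for s, e in windows]
print(windows_seconds)  # [(0.0, 3.0), (2.5, 5.5), (5.0, 8.0)]
```

The printed windows match the three chunks in the example. A real capture loop would likely keep the last overlap_size samples in its buffer after emitting a chunk, which gives the same boundaries.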

Files modified:
- client/audio_capture.py: Implemented overlapping buffer logic
- config/default_config.yaml: Added overlap_duration setting
- gui/main_window_qt.py: Updated UI text, passed overlap param
- main_cli.py: Passed overlap param

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-26 08:47:19 -08:00


user:
  name: "User"
  id: ""
audio:
  input_device: "default"
  sample_rate: 16000
  chunk_duration: 3.0
  overlap_duration: 0.5  # Overlap between chunks to prevent word cutoff (seconds)
noise_suppression:
  enabled: true
  strength: 0.7
  method: "noisereduce"
transcription:
  model: "base"
  device: "auto"
  language: "en"
  task: "transcribe"
processing:
  use_vad: true
  min_confidence: 0.5
server_sync:
  enabled: false
  url: "ws://localhost:8000"
  api_key: ""
display:
  show_timestamps: true
  max_lines: 100
  font_family: "Courier"
  font_size: 12
  theme: "dark"
  fade_after_seconds: 10  # Time before transcriptions fade out (0 = never fade)
web_server:
  port: 8080
  host: "127.0.0.1"
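
As a rough sketch of how the audio settings above might be turned into buffer sizes, the snippet below converts the durations to sample counts and rejects an overlap that is not shorter than the chunk. The helper `audio_frame_sizes` is hypothetical, and the dict literal stands in for the parsed `audio` section. How the application actually loads and validates its config is not shown in this file.

```python
def audio_frame_sizes(audio_cfg: dict) -> dict:
    """Convert chunk/overlap durations (seconds) into sample counts.

    Hypothetical helper for illustration; not taken from the repository.
    """
    sample_rate = int(audio_cfg["sample_rate"])
    chunk = float(audio_cfg["chunk_duration"])
    # Treat a missing overlap_duration as "no overlap" (an assumption).
    overlap = float(audio_cfg.get("overlap_duration", 0.0))

    if chunk <= 0:
        raise ValueError("chunk_duration must be positive")
    if not 0 <= overlap < chunk:
        # An overlap >= chunk length would stop the buffer from advancing.
        raise ValueError("overlap_duration must be >= 0 and less than chunk_duration")

    chunk_samples = round(chunk * sample_rate)
    overlap_samples = round(overlap * sample_rate)
    return {
        "chunk_samples": chunk_samples,
        "overlap_samples": overlap_samples,
        "step_samples": chunk_samples - overlap_samples,
    }


# Values copied from the audio section of the default config.
sizes = audio_frame_sizes(
    {"sample_rate": 16000, "chunk_duration": 3.0, "overlap_duration": 0.5}
)
print(sizes)
# {'chunk_samples': 48000, 'overlap_samples': 8000, 'step_samples': 40000}
```

With the defaults, each chunk holds 48000 samples and the buffer advances by 40000 samples, or 2.5 seconds, per chunk.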