Building Local Transcription
This guide explains how to build standalone executables for Linux and Windows.
Prerequisites
- Python 3.8+ installed on your system
- uv package manager (install from https://docs.astral.sh/uv/)
- All project dependencies installed (uv sync)
Building for Linux
Standard Build (CPU-only):
# Make the build script executable (first time only)
chmod +x build.sh
# Run the build script
./build.sh
CUDA Build (GPU Support):
Build with CUDA support even without NVIDIA hardware:
# Make the build script executable (first time only)
chmod +x build-cuda.sh
# Run the CUDA build script
./build-cuda.sh
This will:
- Install PyTorch with CUDA 12.1 support
- Bundle CUDA runtime libraries (~600MB extra)
- Create an executable that works on both GPU and CPU systems
- Automatically fall back to CPU if no CUDA GPU is available
The executable will be created in dist/LocalTranscription/LocalTranscription
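To confirm that the CUDA-enabled PyTorch build was installed into the project environment before packaging, a quick check like the following works (a sketch; run it inside the environment, e.g. with uv run python):
# torch.version.cuda is None for CPU-only wheels and e.g. "12.1" for CUDA wheels;
# torch.cuda.is_available() is True only when an NVIDIA GPU and driver are present.
import torch
print("CUDA runtime bundled:", torch.version.cuda)
print("CUDA GPU detected:", torch.cuda.is_available())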
Manual build:
# Clean previous builds
rm -rf build dist
# Build with PyInstaller
uv run pyinstaller local-transcription.spec
Distribution:
cd dist
tar -czf LocalTranscription-Linux.tar.gz LocalTranscription/
Building for Windows
Standard Build (CPU-only):
# Run the build script
build.bat
CUDA Build (GPU Support):
Build with CUDA support even without NVIDIA hardware:
# Run the CUDA build script
build-cuda.bat
This will:
- Install PyTorch with CUDA 12.1 support
- Bundle CUDA runtime libraries (~600MB extra)
- Create an executable that works on both GPU and CPU systems
- Automatically fall back to CPU if no CUDA GPU is available
The executable will be created in dist\LocalTranscription\LocalTranscription.exe
Manual build:
# Clean previous builds
rmdir /s /q build
rmdir /s /q dist
# Build with PyInstaller
uv run pyinstaller local-transcription.spec
Distribution:
- Compress the dist\LocalTranscription folder to a ZIP file
- Or use an installer creator like NSIS or Inno Setup
Important Notes
Cross-Platform Building
You cannot cross-compile!
- Linux executables must be built on Linux
- Windows executables must be built on Windows
- Mac executables must be built on macOS
First Run
On the first run, the application will:
- Create a config directory at ~/.local-transcription/ (Linux) or %USERPROFILE%\.local-transcription\ (Windows)
- Download the Whisper model (if not already present)
- The model will be cached in ~/.cache/huggingface/ by default
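For reference, the config location resolves under the user's home directory on both platforms; a rough Python sketch (the application's actual code may differ):
# Illustrative only: resolves to ~/.local-transcription/ on Linux
# and %USERPROFILE%\.local-transcription\ on Windows.
from pathlib import Path
CONFIG_DIR = Path.home() / ".local-transcription"
CONFIG_DIR.mkdir(parents=True, exist_ok=True)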
Executable Size
The built executable will be large (300MB - 2GB+) because it includes:
- Python runtime
- PySide6 (Qt framework)
- PyTorch/faster-whisper
- NumPy, SciPy, and other dependencies
Console Window
By default, the console window is visible (for debugging). To hide it:
- Edit local-transcription.spec
- Change console=True to console=False in the EXE section
- Rebuild
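For example, after the change the EXE section of the spec looks roughly like this (an excerpt; leave the other arguments as they already are in your spec):
# Excerpt from local-transcription.spec -- only the console flag changes:
exe = EXE(
    pyz,
    a.scripts,
    # ... other arguments unchanged ...
    console=False,  # was console=True; False hides the console window
)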
GPU Support
Building with CUDA (Recommended for Distribution)
Yes, you CAN build with CUDA support on systems without NVIDIA GPUs!
PyTorch provides CUDA-enabled builds that bundle the CUDA runtime libraries. This means:
- You don't need NVIDIA hardware to create CUDA-enabled builds
- The executable will work everywhere - on systems with or without NVIDIA GPUs
- Automatic fallback - the app detects available hardware and uses GPU if available, CPU otherwise
- Larger file size - adds ~600MB-1GB to the executable size
How it works:
# Linux
./build-cuda.sh
# Windows
build-cuda.bat
The build script will:
- Install PyTorch with bundled CUDA 12.1 runtime
- Package all CUDA libraries into the executable
- Create a universal build that runs on any system
When users run the executable:
- If they have an NVIDIA GPU with drivers: Uses GPU acceleration
- If they don't have NVIDIA GPU: Automatically uses CPU
- No configuration needed - it just works!
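A minimal sketch of this fallback, assuming a PyTorch availability check in front of the faster-whisper loader (the application's actual detection code may differ):
# Pick the GPU when an NVIDIA card and driver are present, otherwise use the CPU.
import torch
from faster_whisper import WhisperModel

def load_model(model_size: str = "base") -> WhisperModel:
    if torch.cuda.is_available():
        # CUDA GPU found: GPU acceleration with half-precision weights
        return WhisperModel(model_size, device="cuda", compute_type="float16")
    # No CUDA GPU: fall back to CPU with int8 quantization
    return WhisperModel(model_size, device="cpu", compute_type="int8")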
Alternative: CPU-Only Builds
If you only want CPU support (smaller file size):
# Linux
./build.sh
# Windows
build.bat
AMD GPU Support
- ROCm: Requires special PyTorch builds from AMD
- Not recommended for general distribution
- Better to use CUDA build (works on all systems) or CPU build
Optimizations
To reduce size:
- Remove unused model sizes: The app downloads models on-demand, so you don't need to bundle them
- Use UPX compression: Already enabled in the spec file
- Exclude dev dependencies: Only the dependencies the application needs at runtime have to be bundled
Testing the Build
After building, test the executable:
Linux:
cd dist/LocalTranscription
./LocalTranscription
Windows:
cd dist\LocalTranscription
LocalTranscription.exe
Troubleshooting
Missing modules error
If you get "No module named X" errors, add the module to the hiddenimports list in local-transcription.spec
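For example (an excerpt; the entry script name is a placeholder, keep whatever your spec already lists):
# Excerpt from local-transcription.spec -- add the missing module here:
a = Analysis(
    ['main.py'],  # placeholder entry point
    hiddenimports=[
        'missing_module_name',  # replace with the module from the "No module named X" error
    ],
    # ... remaining Analysis options unchanged ...
)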
DLL errors (Windows)
Make sure Visual C++ Redistributable is installed on the target system: https://aka.ms/vs/17/release/vc_redist.x64.exe
Audio device errors
The application needs access to audio devices. Ensure:
- Microphone permissions are granted
- Audio drivers are installed
- PulseAudio (Linux) or Windows Audio is running
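A quick way to check what the app can see is to list devices with the same sounddevice library it uses for capture:
# Print all audio devices and the current defaults as sounddevice sees them.
import sounddevice as sd
print(sd.query_devices())   # all input/output devices with their indices
print(sd.default.device)    # default (input, output) device indices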
Model download fails
Ensure internet connection on first run. Models are downloaded from: https://huggingface.co/guillaumekln/faster-whisper-base
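If the target machine is offline, you can pre-populate the cache on a connected machine and copy ~/.cache/huggingface/ across; instantiating the model once is enough to trigger the download (a sketch, using the base model as an example):
# Downloads the "base" faster-whisper model into the Hugging Face cache
# so later runs can load it without network access.
from faster_whisper import WhisperModel
WhisperModel("base", device="cpu", compute_type="int8")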
Advanced: Adding an Icon
- Create or obtain an .ico file (Windows) or .png file (Linux)
- Edit local-transcription.spec
- Change icon=None to icon='path/to/your/icon.ico'
- Rebuild
Advanced: Creating an Installer
Windows (using Inno Setup):
- Install Inno Setup: https://jrsoftware.org/isinfo.php
- Create an .iss script file
- Build the installer
Linux (using AppImage):
# Install appimagetool
wget https://github.com/AppImage/AppImageKit/releases/download/continuous/appimagetool-x86_64.AppImage
chmod +x appimagetool-x86_64.AppImage
# Create AppDir structure
mkdir -p LocalTranscription.AppDir/usr/bin
cp -r dist/LocalTranscription/* LocalTranscription.AppDir/usr/bin/
# Create desktop file and icon
# (Create .desktop file and icon as needed)
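# Example of the minimal files appimagetool expects at the AppDir root
# (the desktop entry values and icon path below are placeholders -- adjust them):
cat > LocalTranscription.AppDir/LocalTranscription.desktop << 'EOF'
[Desktop Entry]
Type=Application
Name=LocalTranscription
Exec=LocalTranscription
Icon=localtranscription
Categories=AudioVideo;Utility;
EOF
cp path/to/icon.png LocalTranscription.AppDir/localtranscription.png
ln -sf usr/bin/LocalTranscription LocalTranscription.AppDir/AppRun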
# Build AppImage
./appimagetool-x86_64.AppImage LocalTranscription.AppDir
Support
For build issues, check:
- PyInstaller documentation: https://pyinstaller.org/
- Project issues: https://github.com/anthropics/claude-code/issues