local-transcription/BUILD.md
Josh Knapp 472233aec4 Initial commit: Local Transcription App v1.0
Phase 1 Complete - Standalone Desktop Application

Features:
- Real-time speech-to-text with Whisper (faster-whisper)
- PySide6 desktop GUI with settings dialog
- Web server for OBS browser source integration
- Audio capture with automatic sample rate detection and resampling
- Noise suppression with Voice Activity Detection (VAD)
- Configurable display settings (font, timestamps, fade duration)
- Settings apply without restart (with automatic model reloading)
- Auto-fade for web display transcriptions
- CPU/GPU support with automatic device detection
- Standalone executable builds (PyInstaller)
- CUDA build support (works on systems without CUDA hardware)

Components:
- Audio capture with sounddevice
- Noise reduction with noisereduce + webrtcvad
- Transcription with faster-whisper
- GUI with PySide6
- Web server with FastAPI + WebSocket
- Configuration system with YAML

Build System:
- Standard builds (CPU-only): build.sh / build.bat
- CUDA builds (universal): build-cuda.sh / build-cuda.bat
- Comprehensive BUILD.md documentation
- Cross-platform support (Linux, Windows)

Documentation:
- README.md with project overview and quick start
- BUILD.md with detailed build instructions
- NEXT_STEPS.md with future enhancement roadmap
- INSTALL.md with setup instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-25 18:48:23 -08:00


# Building Local Transcription
This guide explains how to build standalone executables for Linux and Windows.
## Prerequisites
1. **Python 3.8+** installed on your system
2. **uv** package manager (install from https://docs.astral.sh/uv/)
3. All project dependencies installed (`uv sync`)
## Building for Linux
### Standard Build (CPU-only):
```bash
# Make the build script executable (first time only)
chmod +x build.sh
# Run the build script
./build.sh
```
### CUDA Build (GPU Support):
Build with CUDA support even without NVIDIA hardware:
```bash
# Make the build script executable (first time only)
chmod +x build-cuda.sh
# Run the CUDA build script
./build-cuda.sh
```
This will:
- Install PyTorch with CUDA 12.1 support
- Bundle CUDA runtime libraries (~600MB extra)
- Create an executable that works on both GPU and CPU systems
- Automatically fall back to CPU if no CUDA GPU is available
The executable will be created at `dist/LocalTranscription/LocalTranscription`.
### Manual build:
```bash
# Clean previous builds
rm -rf build dist
# Build with PyInstaller
uv run pyinstaller local-transcription.spec
```
### Distribution:
```bash
cd dist
tar -czf LocalTranscription-Linux.tar.gz LocalTranscription/
```
## Building for Windows
### Standard Build (CPU-only):
```cmd
:: Run the build script
build.bat
```
### CUDA Build (GPU Support):
Build with CUDA support even without NVIDIA hardware:
```cmd
:: Run the CUDA build script
build-cuda.bat
```
This will:
- Install PyTorch with CUDA 12.1 support
- Bundle CUDA runtime libraries (~600MB extra)
- Create an executable that works on both GPU and CPU systems
- Automatically fall back to CPU if no CUDA GPU is available
The executable will be created at `dist\LocalTranscription\LocalTranscription.exe`.
### Manual build:
```cmd
:: Clean previous builds
rmdir /s /q build
rmdir /s /q dist
:: Build with PyInstaller
uv run pyinstaller local-transcription.spec
```
### Distribution:
- Compress the `dist\LocalTranscription` folder to a ZIP file
- Or use an installer creator like NSIS or Inno Setup
## Important Notes
### Cross-Platform Building
**You cannot cross-compile!**
- Linux executables must be built on Linux
- Windows executables must be built on Windows
- Mac executables must be built on macOS
### First Run
On the first run, the application will:
1. Create a config directory at `~/.local-transcription/` (Linux) or `%USERPROFILE%\.local-transcription\` (Windows)
2. Download the Whisper model (if not already present)
3. Cache the model in `~/.cache/huggingface/` (the default Hugging Face cache location)
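The config-directory step can be sketched in a few lines of Python; the directory name is taken from the paths above, while the helper name `ensure_config_dir` is purely illustrative:

```python
from pathlib import Path

def ensure_config_dir() -> Path:
    """Create the per-user config directory if it does not exist yet.

    Mirrors the first-run behavior described above; subsequent runs
    find the directory already present and do nothing.
    """
    config_dir = Path.home() / ".local-transcription"
    config_dir.mkdir(exist_ok=True)  # no-op when the directory already exists
    return config_dir

print(ensure_config_dir())  # path under the user's home directory
```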
### Executable Size
The built executable will be large (300MB - 2GB+) because it includes:
- Python runtime
- PySide6 (Qt framework)
- PyTorch/faster-whisper
- NumPy, SciPy, and other dependencies
### Console Window
By default, the console window is visible (for debugging). To hide it:
1. Edit `local-transcription.spec`
2. Change `console=True` to `console=False` in the `EXE` section
3. Rebuild
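The relevant part of the spec file looks roughly like this (an illustrative excerpt, not the project's exact spec; the argument names are standard PyInstaller `EXE` options):

```python
# Excerpt from local-transcription.spec (illustrative)
exe = EXE(
    pyz,
    a.scripts,
    name='LocalTranscription',
    upx=True,        # UPX compression, already enabled (see Optimizations)
    console=False,   # False hides the console window; True shows it for debugging
)
```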
### GPU Support
#### Building with CUDA (Recommended for Distribution)
**Yes, you CAN build with CUDA support on systems without NVIDIA GPUs!**
PyTorch provides CUDA-enabled builds that bundle the CUDA runtime libraries. This means:
1. **You don't need NVIDIA hardware** to create CUDA-enabled builds
2. **The executable will work everywhere** - on systems with or without NVIDIA GPUs
3. **Automatic fallback** - the app detects available hardware and uses GPU if available, CPU otherwise
4. **Larger file size** - adds ~600MB-1GB to the executable size
**How it works:**
```bash
# Linux
./build-cuda.sh
# Windows
build-cuda.bat
```
The build script will:
- Install PyTorch with bundled CUDA 12.1 runtime
- Package all CUDA libraries into the executable
- Create a universal build that runs on any system
**When users run the executable:**
- If they have an NVIDIA GPU with drivers: Uses GPU acceleration
- If they don't have NVIDIA GPU: Automatically uses CPU
- No configuration needed - it just works!
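The fallback described above amounts to only a few lines. A minimal sketch (the helper name `pick_device` is hypothetical; the import guard keeps it working even on CPU-only builds where `torch` is absent):

```python
def pick_device() -> str:
    """Return "cuda" when a usable NVIDIA GPU is present, otherwise "cpu"."""
    try:
        import torch  # bundled in CUDA builds; may be missing in CPU-only builds
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"

device = pick_device()
print(device)
# The result would then be passed to the transcriber, e.g.:
# model = WhisperModel("base", device=device)
```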
#### Alternative: CPU-Only Builds
If you only want CPU support (smaller file size):
```bash
# Linux
./build.sh
# Windows
build.bat
```
#### AMD GPU Support
- **ROCm**: Requires special PyTorch builds from AMD
- Not recommended for general distribution
- Better to use CUDA build (works on all systems) or CPU build
### Optimizations
To reduce size:
1. **Remove unused model sizes**: The app downloads models on-demand, so you don't need to bundle them
2. **Use UPX compression**: Already enabled in the spec file
3. **Exclude dev dependencies**: Only build dependencies are needed
## Testing the Build
After building, test the executable:
### Linux:
```bash
cd dist/LocalTranscription
./LocalTranscription
```
### Windows:
```cmd
cd dist\LocalTranscription
LocalTranscription.exe
```
## Troubleshooting
### Missing modules error
If you get "No module named X" errors, add the missing module to the `hiddenimports` list in `local-transcription.spec`.
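For example, if the build failed with `No module named scipy.special._cdflib` (a hypothetical case), the `Analysis` section of the spec file would gain an entry like this (entry-script name is an assumption):

```python
# Excerpt from local-transcription.spec (illustrative)
a = Analysis(
    ['main.py'],                   # entry script name is an assumption
    hiddenimports=[
        'scipy.special._cdflib',   # module PyInstaller failed to detect
    ],
)
```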
### DLL errors (Windows)
Make sure Visual C++ Redistributable is installed on the target system:
https://aka.ms/vs/17/release/vc_redist.x64.exe
### Audio device errors
The application needs access to audio devices. Ensure:
- Microphone permissions are granted
- Audio drivers are installed
- PulseAudio (Linux) or Windows Audio is running
### Model download fails
Ensure an internet connection is available on the first run. Models are downloaded from:
https://huggingface.co/guillaumekln/faster-whisper-base
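To check whether the model is already cached (e.g. before going offline), the default Hugging Face cache location can be computed with the standard library. A sketch; note that the `HF_HOME` and `HUGGINGFACE_HUB_CACHE` environment variables can override this location, which the sketch ignores for brevity:

```python
from pathlib import Path

def hf_cache_dir() -> Path:
    """Return the default Hugging Face cache directory used for model files."""
    return Path.home() / ".cache" / "huggingface"

cache = hf_cache_dir()
print(cache, "exists" if cache.exists() else "missing")
```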
## Advanced: Adding an Icon
1. Create or obtain an `.ico` file (Windows) or `.png` file (Linux)
2. Edit `local-transcription.spec`
3. Change `icon=None` to `icon='path/to/your/icon.ico'`
4. Rebuild
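The edit is a single argument in the `EXE` section of the spec file (the icon path below is a placeholder):

```python
# Excerpt from local-transcription.spec (illustrative)
exe = EXE(
    ...,
    icon='assets/app-icon.ico',  # placeholder path; point this at your icon file
)
```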
## Advanced: Creating an Installer
### Windows (using Inno Setup):
1. Install Inno Setup: https://jrsoftware.org/isinfo.php
2. Create an `.iss` script file
3. Build the installer
### Linux (using AppImage):
```bash
# Install appimagetool
wget https://github.com/AppImage/AppImageKit/releases/download/continuous/appimagetool-x86_64.AppImage
chmod +x appimagetool-x86_64.AppImage
# Create AppDir structure
mkdir -p LocalTranscription.AppDir/usr/bin
cp -r dist/LocalTranscription/* LocalTranscription.AppDir/usr/bin/
# Create desktop file and icon
# (Create .desktop file and icon as needed)
# Build AppImage
./appimagetool-x86_64.AppImage LocalTranscription.AppDir
```
## Support
For build issues, check:
1. PyInstaller documentation: https://pyinstaller.org/
2. Project issues: https://github.com/anthropics/claude-code/issues