# Building Local Transcription

This guide explains how to build standalone executables for Linux and Windows.

## Prerequisites

1. **Python 3.8+** installed on your system
2. **uv** package manager (install from https://docs.astral.sh/uv/)
3. All project dependencies installed (`uv sync`)
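A quick way to confirm the interpreter meets the 3.8+ requirement before building:

```python
import sys

# Fails loudly if the interpreter is older than the required 3.8
assert sys.version_info >= (3, 8), "Python 3.8+ is required"
print("Python", ".".join(map(str, sys.version_info[:3])))
```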
## Building for Linux

### Standard Build (CPU-only):

```bash
# Make the build script executable (first time only)
chmod +x build.sh

# Run the build script
./build.sh
```
### CUDA Build (GPU Support):

Build with CUDA support even without NVIDIA hardware:

```bash
# Make the build script executable (first time only)
chmod +x build-cuda.sh

# Run the CUDA build script
./build-cuda.sh
```

This will:

- Install PyTorch with CUDA 12.1 support
- Bundle CUDA runtime libraries (~600MB extra)
- Create an executable that works on both GPU and CPU systems
- Automatically fall back to CPU if no CUDA GPU is available

The executable will be created at `dist/LocalTranscription/LocalTranscription`.
### Manual build:

```bash
# Clean previous builds
rm -rf build dist

# Build with PyInstaller
uv run pyinstaller local-transcription.spec
```

### Distribution:

```bash
cd dist
tar -czf LocalTranscription-Linux.tar.gz LocalTranscription/
```
## Building for Windows

### Standard Build (CPU-only):

```cmd
REM Run the build script
build.bat
```

### CUDA Build (GPU Support):

Build with CUDA support even without NVIDIA hardware:

```cmd
REM Run the CUDA build script
build-cuda.bat
```

This will:

- Install PyTorch with CUDA 12.1 support
- Bundle CUDA runtime libraries (~600MB extra)
- Create an executable that works on both GPU and CPU systems
- Automatically fall back to CPU if no CUDA GPU is available

The executable will be created at `dist\LocalTranscription\LocalTranscription.exe`.
### Manual build:

```cmd
REM Clean previous builds
rmdir /s /q build
rmdir /s /q dist

REM Build with PyInstaller
uv run pyinstaller local-transcription.spec
```

### Distribution:

- Compress the `dist\LocalTranscription` folder to a ZIP file
- Or use an installer creator like NSIS or Inno Setup
## Important Notes

### Cross-Platform Building

**You cannot cross-compile!**

- Linux executables must be built on Linux
- Windows executables must be built on Windows
- macOS executables must be built on macOS

### First Run

On the first run, the application will:

1. Create a config directory at `~/.local-transcription/` (Linux) or `%USERPROFILE%\.local-transcription\` (Windows)
2. Download the Whisper model if it is not already present
3. Cache the model in `~/.cache/huggingface/` by default
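A minimal sketch of how that per-user config path can be resolved in Python; `config_dir` is an illustrative helper, not the application's actual code:

```python
from pathlib import Path

def config_dir() -> Path:
    """Illustrative only: Path.home() resolves to $HOME on Linux and
    %USERPROFILE% on Windows, so one expression covers both platforms."""
    return Path.home() / ".local-transcription"

print(config_dir())  # e.g. /home/user/.local-transcription
```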
### Executable Size

The built executable will be large (300MB to 2GB+) because it includes:

- The Python runtime
- PySide6 (Qt framework)
- PyTorch/faster-whisper
- NumPy, SciPy, and other dependencies

### Console Window

By default, the console window is visible (for debugging). To hide it:

1. Edit `local-transcription.spec`
2. Change `console=True` to `console=False` in the `EXE` section
3. Rebuild
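For orientation, the relevant part of a PyInstaller spec's `EXE` section looks roughly like this (a fragment, not the project's actual spec; `pyz` and `a` are the names PyInstaller generates):

```python
exe = EXE(
    pyz,
    a.scripts,
    # ... other arguments generated by PyInstaller ...
    name='LocalTranscription',
    console=False,  # set to False to hide the console window
)
```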
### GPU Support

#### Building with CUDA (Recommended for Distribution)

**Yes, you CAN build with CUDA support on systems without NVIDIA GPUs!**

PyTorch provides CUDA-enabled builds that bundle the CUDA runtime libraries. This means:

1. **You don't need NVIDIA hardware** to create CUDA-enabled builds
2. **The executable will work everywhere** - on systems with or without NVIDIA GPUs
3. **Automatic fallback** - the app detects available hardware and uses the GPU if available, the CPU otherwise
4. **Larger file size** - adds ~600MB-1GB to the executable

**How it works:**

```bash
# Linux
./build-cuda.sh

# Windows
build-cuda.bat
```

The build script will:

- Install PyTorch with the bundled CUDA 12.1 runtime
- Package all CUDA libraries into the executable
- Create a universal build that runs on any system

**When users run the executable:**

- With an NVIDIA GPU and drivers installed: GPU acceleration is used
- Without an NVIDIA GPU: the app automatically falls back to CPU
- No configuration needed - it just works
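The fallback described above can be sketched like this (an illustration of the pattern, not the application's actual code; the `pick_device` helper is hypothetical):

```python
def pick_device() -> str:
    """Return "cuda" when a usable NVIDIA GPU is visible, else "cpu"."""
    try:
        import torch  # bundled in CUDA builds; may be absent or CPU-only elsewhere
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass  # no torch at all: fall back to CPU
    return "cpu"

print(pick_device())
```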
#### Alternative: CPU-Only Builds

If you only want CPU support (smaller file size):

```bash
# Linux
./build.sh

# Windows
build.bat
```

#### AMD GPU Support

- **ROCm**: requires special PyTorch builds from AMD
- Not recommended for general distribution
- Better to use the CUDA build (works on all systems) or the CPU build
### Optimizations

To reduce size:

1. **Don't bundle models**: the app downloads models on demand, so you don't need to ship them
2. **Use UPX compression**: already enabled in the spec file
3. **Exclude dev dependencies**: only runtime dependencies are needed
## Testing the Build

After building, test the executable:

### Linux:

```bash
cd dist/LocalTranscription
./LocalTranscription
```

### Windows:

```cmd
cd dist\LocalTranscription
LocalTranscription.exe
```
## Troubleshooting

### Missing modules error

If you get "No module named X" errors, add the module to the `hiddenimports` list in `local-transcription.spec`.
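For example, the list in the spec file looks like this (the module names below are placeholders; add whichever module PyInstaller failed to find):

```python
# Fragment of local-transcription.spec (illustrative module names only)
hiddenimports = [
    'sounddevice',  # example: a module PyInstaller's analysis missed
    'webrtcvad',    # example: native extensions are often missed
]
```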
### DLL errors (Windows)

Make sure the Visual C++ Redistributable is installed on the target system:
https://aka.ms/vs/17/release/vc_redist.x64.exe

### Audio device errors

The application needs access to audio devices. Ensure:

- Microphone permissions are granted
- Audio drivers are installed
- PulseAudio (Linux) or the Windows Audio service is running

### Model download fails

Ensure an internet connection on the first run. Models are downloaded from:
https://huggingface.co/guillaumekln/faster-whisper-base
## Advanced: Adding an Icon

1. Create or obtain an `.ico` file (Windows) or `.png` file (Linux)
2. Edit `local-transcription.spec`
3. Change `icon=None` to `icon='path/to/your/icon.ico'`
4. Rebuild

## Advanced: Creating an Installer

### Windows (using Inno Setup):

1. Install Inno Setup: https://jrsoftware.org/isinfo.php
2. Create an `.iss` script file
3. Build the installer
### Linux (using AppImage):

```bash
# Install appimagetool
wget https://github.com/AppImage/AppImageKit/releases/download/continuous/appimagetool-x86_64.AppImage
chmod +x appimagetool-x86_64.AppImage

# Create the AppDir structure
mkdir -p LocalTranscription.AppDir/usr/bin
cp -r dist/LocalTranscription/* LocalTranscription.AppDir/usr/bin/

# Create a .desktop file and icon as needed

# Build the AppImage
./appimagetool-x86_64.AppImage LocalTranscription.AppDir
```

## Support

For build issues, check:

1. PyInstaller documentation: https://pyinstaller.org/
2. Project issues: https://github.com/anthropics/claude-code/issues