Initial commit: Local Transcription App v1.0
Phase 1 Complete - Standalone Desktop Application

Features:
- Real-time speech-to-text with Whisper (faster-whisper)
- PySide6 desktop GUI with settings dialog
- Web server for OBS browser source integration
- Audio capture with automatic sample rate detection and resampling
- Noise suppression with Voice Activity Detection (VAD)
- Configurable display settings (font, timestamps, fade duration)
- Settings apply without restart (with automatic model reloading)
- Auto-fade for web display transcriptions
- CPU/GPU support with automatic device detection
- Standalone executable builds (PyInstaller)
- CUDA build support (works on systems without CUDA hardware)

Components:
- Audio capture with sounddevice
- Noise reduction with noisereduce + webrtcvad
- Transcription with faster-whisper
- GUI with PySide6
- Web server with FastAPI + WebSocket
- Configuration system with YAML

Build System:
- Standard builds (CPU-only): build.sh / build.bat
- CUDA builds (universal): build-cuda.sh / build-cuda.bat
- Comprehensive BUILD.md documentation
- Cross-platform support (Linux, Windows)

Documentation:
- README.md with project overview and quick start
- BUILD.md with detailed build instructions
- NEXT_STEPS.md with future enhancement roadmap
- INSTALL.md with setup instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
New file: BUILD.md (259 lines)
# Building Local Transcription

This guide explains how to build standalone executables for Linux and Windows.

## Prerequisites

1. **Python 3.8+** installed on your system
2. **uv** package manager (install from https://docs.astral.sh/uv/)
3. All project dependencies installed (`uv sync`)

## Building for Linux

### Standard Build (CPU-only):

```bash
# Make the build script executable (first time only)
chmod +x build.sh

# Run the build script
./build.sh
```

### CUDA Build (GPU Support):

Build with CUDA support even without NVIDIA hardware:

```bash
# Make the build script executable (first time only)
chmod +x build-cuda.sh

# Run the CUDA build script
./build-cuda.sh
```

This will:
- Install PyTorch with CUDA 12.1 support
- Bundle the CUDA runtime libraries (~600MB extra)
- Create an executable that works on both GPU and CPU systems
- Automatically fall back to CPU if no CUDA GPU is available

The executable will be created at `dist/LocalTranscription/LocalTranscription`.

### Manual build:

```bash
# Clean previous builds
rm -rf build dist

# Build with PyInstaller
uv run pyinstaller local-transcription.spec
```

### Distribution:

```bash
cd dist
tar -czf LocalTranscription-Linux.tar.gz LocalTranscription/
```
## Building for Windows

### Standard Build (CPU-only):

```cmd
REM Run the build script
build.bat
```

### CUDA Build (GPU Support):

Build with CUDA support even without NVIDIA hardware:

```cmd
REM Run the CUDA build script
build-cuda.bat
```

This will:
- Install PyTorch with CUDA 12.1 support
- Bundle the CUDA runtime libraries (~600MB extra)
- Create an executable that works on both GPU and CPU systems
- Automatically fall back to CPU if no CUDA GPU is available

The executable will be created at `dist\LocalTranscription\LocalTranscription.exe`.

### Manual build:

```cmd
REM Clean previous builds
rmdir /s /q build
rmdir /s /q dist

REM Build with PyInstaller
uv run pyinstaller local-transcription.spec
```

### Distribution:

- Compress the `dist\LocalTranscription` folder to a ZIP file
- Or use an installer creator like NSIS or Inno Setup
## Important Notes

### Cross-Platform Building

**You cannot cross-compile!**
- Linux executables must be built on Linux
- Windows executables must be built on Windows
- macOS executables must be built on macOS

### First Run

On the first run, the application will:
1. Create a config directory at `~/.local-transcription/` (Linux) or `%USERPROFILE%\.local-transcription\` (Windows)
2. Download the Whisper model (if not already present)
3. Cache the model in `~/.cache/huggingface/` by default

### Executable Size

The built executable will be large (300MB - 2GB+) because it includes:
- The Python runtime
- PySide6 (Qt framework)
- PyTorch/faster-whisper
- NumPy, SciPy, and other dependencies

### Console Window

By default, the console window is visible (for debugging). To hide it:

1. Edit `local-transcription.spec`
2. Change `console=True` to `console=False` in the `EXE` section (see the sketch below)
3. Rebuild
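For orientation, the relevant part of the spec file looks roughly like this (a minimal sketch; the other arguments in the generated `local-transcription.spec` will differ):

```python
# Excerpt of the EXE section in a PyInstaller spec file (spec files are Python).
# Only the console flag matters here; the remaining arguments are illustrative.
exe = EXE(
    pyz,
    a.scripts,
    name='LocalTranscription',
    console=False,  # False hides the console window; True keeps it for debugging
    icon=None,
)
```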
### GPU Support

#### Building with CUDA (Recommended for Distribution)

**Yes, you CAN build with CUDA support on systems without NVIDIA GPUs!**

PyTorch provides CUDA-enabled builds that bundle the CUDA runtime libraries. This means:

1. **You don't need NVIDIA hardware** to create CUDA-enabled builds
2. **The executable will work everywhere** - on systems with or without NVIDIA GPUs
3. **Automatic fallback** - the app detects the available hardware and uses the GPU if available, the CPU otherwise
4. **Larger file size** - adds ~600MB-1GB to the executable

**How it works:**

```bash
# Linux
./build-cuda.sh
```

```cmd
REM Windows
build-cuda.bat
```

The build script will:
- Install PyTorch with the bundled CUDA 12.1 runtime
- Package all CUDA libraries into the executable
- Create a universal build that runs on any system

**When users run the executable:**
- If they have an NVIDIA GPU with drivers installed: GPU acceleration is used
- If they don't have an NVIDIA GPU: the CPU is used automatically
- No configuration needed - it just works (see the device-detection sketch below)
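Internally, that fallback amounts to a device check at startup. A minimal sketch, assuming detection via `torch.cuda.is_available()` and faster-whisper's `WhisperModel` (the model size and compute types are illustrative, not the exact code in this repo):

```python
import torch
from faster_whisper import WhisperModel

def load_model(model_size: str = "base") -> WhisperModel:
    """Load the Whisper model on the best available device."""
    if torch.cuda.is_available():
        # NVIDIA GPU with a working driver: use CUDA with fp16 for speed.
        return WhisperModel(model_size, device="cuda", compute_type="float16")
    # No usable GPU: fall back to CPU with int8 quantization.
    return WhisperModel(model_size, device="cpu", compute_type="int8")
```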
#### Alternative: CPU-Only Builds

If you only want CPU support (smaller file size):

```bash
# Linux
./build.sh
```

```cmd
REM Windows
build.bat
```

#### AMD GPU Support

- **ROCm**: Requires special PyTorch builds from AMD
- Not recommended for general distribution
- Better to use the CUDA build (works on all systems) or the CPU build

### Optimizations

To reduce size:

1. **Remove unused model sizes**: The app downloads models on demand, so you don't need to bundle them
2. **Use UPX compression**: Already enabled in the spec file
3. **Exclude dev dependencies**: Only build dependencies are needed (see the spec excerpt below)
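As a rough illustration of point 3, PyInstaller's `Analysis` accepts an `excludes` list, so dev-only packages can be kept out of the bundle via `local-transcription.spec` (the entry script and module names below are examples, not a vetted list for this project):

```python
# Excerpt of the Analysis section in a PyInstaller spec file.
a = Analysis(
    ['main.py'],                           # entry script (illustrative)
    excludes=['pytest', 'black', 'mypy'],  # dev-only packages to leave out of the build
)
```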
## Testing the Build

After building, test the executable:

### Linux:

```bash
cd dist/LocalTranscription
./LocalTranscription
```

### Windows:

```cmd
cd dist\LocalTranscription
LocalTranscription.exe
```
## Troubleshooting

### Missing modules error

If you get "No module named X" errors, add the missing module to the `hiddenimports` list in `local-transcription.spec` and rebuild.
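For example, if the error is `No module named some_missing_module`, the fix looks roughly like this (`some_missing_module` is a stand-in for whatever module PyInstaller failed to detect):

```python
# Excerpt of the Analysis section in local-transcription.spec.
a = Analysis(
    ['main.py'],  # entry script (illustrative)
    hiddenimports=['some_missing_module'],  # modules PyInstaller cannot detect automatically
)
```

Then rebuild with `uv run pyinstaller local-transcription.spec`.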
### DLL errors (Windows)

Make sure the Visual C++ Redistributable is installed on the target system:
https://aka.ms/vs/17/release/vc_redist.x64.exe

### Audio device errors

The application needs access to audio devices. Ensure that:
- Microphone permissions are granted
- Audio drivers are installed
- PulseAudio (Linux) or Windows Audio is running
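To see exactly which devices are visible to the app, the `sounddevice` library (already used for audio capture) can list them; a quick check from a Python shell:

```python
# List the audio devices visible to sounddevice / PortAudio.
import sounddevice as sd

print(sd.query_devices())   # all input/output devices with their indices
print(sd.default.device)    # indices of the current default (input, output) devices
```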
### Model download fails

Ensure you have an internet connection on the first run. Models are downloaded from:
https://huggingface.co/guillaumekln/faster-whisper-base
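If the target machine is offline, one workaround is to warm the model cache on a connected machine and copy `~/.cache/huggingface/` across. A sketch using faster-whisper (assuming the default `base` model; instantiating the model triggers the download if it is not already cached):

```python
# Pre-download and cache the Whisper model, then copy the Hugging Face
# cache directory (~/.cache/huggingface/) to the offline machine.
from faster_whisper import WhisperModel

WhisperModel("base", device="cpu", compute_type="int8")
```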
## Advanced: Adding an Icon

1. Create or obtain an `.ico` file (Windows) or `.png` file (Linux)
2. Edit `local-transcription.spec`
3. Change `icon=None` to `icon='path/to/your/icon.ico'`
4. Rebuild

## Advanced: Creating an Installer

### Windows (using Inno Setup):

1. Install Inno Setup: https://jrsoftware.org/isinfo.php
2. Create an `.iss` script file
3. Build the installer

### Linux (using AppImage):

```bash
# Install appimagetool
wget https://github.com/AppImage/AppImageKit/releases/download/continuous/appimagetool-x86_64.AppImage
chmod +x appimagetool-x86_64.AppImage

# Create the AppDir structure
mkdir -p LocalTranscription.AppDir/usr/bin
cp -r dist/LocalTranscription/* LocalTranscription.AppDir/usr/bin/

# Create a .desktop file and icon as needed

# Build the AppImage
./appimagetool-x86_64.AppImage LocalTranscription.AppDir
```

## Support

For build issues, check:
1. PyInstaller documentation: https://pyinstaller.org/
2. Project issues: https://github.com/anthropics/claude-code/issues