Building Local Transcription

This guide explains how to build standalone executables for Linux and Windows.

Prerequisites

  1. Python 3.8+ installed on your system
  2. uv package manager (install from https://docs.astral.sh/uv/)
  3. All project dependencies installed (uv sync)
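A quick way to confirm the prerequisites above before building (assumes a POSIX shell; the version floor matches item 1):

```shell
# Verify the Python version and that uv is on PATH before running the build scripts.
python3 -c 'import sys; assert sys.version_info >= (3, 8), sys.version'
command -v uv >/dev/null 2>&1 && uv --version || echo "uv not found - install it first"
```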

Building for Linux

Standard Build (CPU-only):

# Make the build script executable (first time only)
chmod +x build.sh

# Run the build script
./build.sh

CUDA Build (GPU Support):

Build with CUDA support even without NVIDIA hardware:

# Make the build script executable (first time only)
chmod +x build-cuda.sh

# Run the CUDA build script
./build-cuda.sh

This will:

  • Install PyTorch with CUDA 12.1 support
  • Bundle CUDA runtime libraries (~600MB extra)
  • Create an executable that works on both GPU and CPU systems
  • Automatically fall back to CPU if no CUDA GPU is available

The executable will be created in dist/LocalTranscription/LocalTranscription

Manual build:

# Clean previous builds
rm -rf build dist

# Build with PyInstaller
uv run pyinstaller local-transcription.spec

Distribution:

cd dist
tar -czf LocalTranscription-Linux.tar.gz LocalTranscription/

Building for Windows

Standard Build (CPU-only):

# Run the build script
build.bat

CUDA Build (GPU Support):

Build with CUDA support even without NVIDIA hardware:

# Run the CUDA build script
build-cuda.bat

This will:

  • Install PyTorch with CUDA 12.1 support
  • Bundle CUDA runtime libraries (~600MB extra)
  • Create an executable that works on both GPU and CPU systems
  • Automatically fall back to CPU if no CUDA GPU is available

The executable will be created in dist\LocalTranscription\LocalTranscription.exe

Manual build:

# Clean previous builds
rmdir /s /q build
rmdir /s /q dist

# Build with PyInstaller
uv run pyinstaller local-transcription.spec

Distribution:

  • Compress the dist\LocalTranscription folder to a ZIP file
  • Or use an installer creator like NSIS or Inno Setup

Important Notes

Cross-Platform Building

You cannot cross-compile!

  • Linux executables must be built on Linux
  • Windows executables must be built on Windows
  • Mac executables must be built on macOS

First Run

On the first run, the application will:

  1. Create a config directory at ~/.local-transcription/ (Linux) or %USERPROFILE%\.local-transcription\ (Windows)
  2. Download the Whisper model if it is not already present
  3. Cache the model in ~/.cache/huggingface/ by default
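If you prefer to keep models next to the app (for example, for a portable install), the cache can be relocated. This is a sketch relying on the HF_HOME environment variable, which huggingface_hub (used by faster-whisper for downloads) honors as its cache root:

```shell
# Redirect the Hugging Face cache to a local folder, then launch the app
# from this same shell so it inherits the variable.
export HF_HOME="$PWD/models-cache"
mkdir -p "$HF_HOME"
```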

Executable Size

The built executable will be large (roughly 300 MB for CPU-only builds, up to 2 GB+ for CUDA builds) because it includes:

  • Python runtime
  • PySide6 (Qt framework)
  • PyTorch/faster-whisper
  • NumPy, SciPy, and other dependencies

Console Window

By default, the console window is visible (for debugging). To hide it:

  1. Edit local-transcription.spec
  2. Change console=True to console=False in the EXE section
  3. Rebuild
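The edit in step 2 can also be scripted. This sed one-liner assumes the flag appears literally as console=True in the spec file; it is demonstrated here on a scratch copy, and the same substitution applies to local-transcription.spec:

```shell
# Demo on a scratch file; run the identical sed against local-transcription.spec.
printf 'exe = EXE(..., console=True)\n' > spec-demo.txt
# Flip the PyInstaller console flag in place, keeping a .bak backup.
sed -i.bak 's/console=True/console=False/' spec-demo.txt
```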

GPU Support

Yes, you CAN build with CUDA support on systems without NVIDIA GPUs!

PyTorch provides CUDA-enabled builds that bundle the CUDA runtime libraries. This means:

  1. You don't need NVIDIA hardware to create CUDA-enabled builds
  2. The executable will work everywhere - on systems with or without NVIDIA GPUs
  3. Automatic fallback - the app detects available hardware and uses GPU if available, CPU otherwise
  4. Larger file size - adds ~600MB-1GB to the executable size

How it works:

# Linux
./build-cuda.sh

# Windows
build-cuda.bat

The build script will:

  • Install PyTorch with bundled CUDA 12.1 runtime
  • Package all CUDA libraries into the executable
  • Create a universal build that runs on any system

When users run the executable:

  • If they have an NVIDIA GPU with drivers: Uses GPU acceleration
  • If they don't have NVIDIA GPU: Automatically uses CPU
  • No configuration needed - it just works!
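The runtime behavior above can be sketched in shell terms (the app itself detects hardware internally via PyTorch; nvidia-smi is just a convenient proxy for "working NVIDIA driver"):

```shell
# Pick cuda when a working NVIDIA driver is present, otherwise fall back to cpu.
if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
  DEVICE=cuda
else
  DEVICE=cpu
fi
echo "Selected device: $DEVICE"
```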

Alternative: CPU-Only Builds

If you only want CPU support (smaller file size):

# Linux
./build.sh

# Windows
build.bat

AMD GPU Support

  • ROCm: Requires separate ROCm-enabled PyTorch builds (Linux only)
  • Not recommended for general distribution
  • Better to use CUDA build (works on all systems) or CPU build

Optimizations

To reduce size:

  1. Don't bundle models: The app downloads models on demand, so none need to be shipped inside the build
  2. Use UPX compression: Already enabled in the spec file
  3. Exclude dev dependencies: Only build dependencies are needed

Testing the Build

After building, test the executable:

Linux:

cd dist/LocalTranscription
./LocalTranscription

Windows:

cd dist\LocalTranscription
LocalTranscription.exe

Troubleshooting

Missing modules error

If you get "No module named X" errors, add the missing module to the hiddenimports list in local-transcription.spec and rebuild.

DLL errors (Windows)

Make sure the Microsoft Visual C++ Redistributable is installed on the target system: https://aka.ms/vs/17/release/vc_redist.x64.exe

Audio device errors

The application needs access to audio devices. Ensure:

  • Microphone permissions are granted
  • Audio drivers are installed
  • PulseAudio (Linux) or Windows Audio is running

Model download fails

Ensure internet connection on first run. Models are downloaded from: https://huggingface.co/guillaumekln/faster-whisper-base
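For machines that can never reach the internet, one workaround (a sketch; paths are the defaults noted in First Run above) is to warm the cache on a connected machine and carry it over:

```shell
# On a connected machine: archive the model cache after a successful first run.
mkdir -p "$HOME/.cache/huggingface"    # no-op if the cache already exists
tar -czf hf-cache.tar.gz -C "$HOME/.cache" huggingface

# On the offline machine: restore it to the same default location.
mkdir -p "$HOME/.cache"
tar -xzf hf-cache.tar.gz -C "$HOME/.cache"
```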

Advanced: Adding an Icon

  1. Create or obtain an .ico file (Windows) or .png file (Linux)
  2. Edit local-transcription.spec
  3. Change icon=None to icon='path/to/your/icon.ico'
  4. Rebuild

Advanced: Creating an Installer

Windows (using Inno Setup):

  1. Install Inno Setup: https://jrsoftware.org/isinfo.php
  2. Create an .iss script file
  3. Build the installer

Linux (using AppImage):

# Install appimagetool
wget https://github.com/AppImage/AppImageKit/releases/download/continuous/appimagetool-x86_64.AppImage
chmod +x appimagetool-x86_64.AppImage

# Create AppDir structure
mkdir -p LocalTranscription.AppDir/usr/bin
cp -r dist/LocalTranscription/* LocalTranscription.AppDir/usr/bin/

# Create desktop file and icon
# (Create .desktop file and icon as needed)

# Build AppImage
./appimagetool-x86_64.AppImage LocalTranscription.AppDir
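The "(Create .desktop file and icon as needed)" step can be sketched as follows. Names and categories here are illustrative; note that appimagetool also expects an AppRun entry point at the AppDir root and an icon file matching the Icon= name:

```shell
# Minimal desktop entry plus the AppRun entry point appimagetool expects.
mkdir -p LocalTranscription.AppDir/usr/bin
cat > LocalTranscription.AppDir/LocalTranscription.desktop <<'EOF'
[Desktop Entry]
Type=Application
Name=LocalTranscription
Exec=LocalTranscription
Icon=localtranscription
Categories=AudioVideo;Utility;
EOF
ln -sf usr/bin/LocalTranscription LocalTranscription.AppDir/AppRun
```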

Support

For build issues, check:

  1. PyInstaller documentation: https://pyinstaller.org/
  2. Project issues: https://github.com/anthropics/claude-code/issues