Building Local Transcription
This guide explains how to build standalone executables for Linux and Windows.
Prerequisites
- Python 3.8+ installed on your system
- uv package manager (install from https://docs.astral.sh/uv/)
- All project dependencies installed (uv sync)
Building for Linux
Standard Build (CPU-only):
# Make the build script executable (first time only)
chmod +x build.sh
# Run the build script
./build.sh
CUDA Build (GPU Support):
Build with CUDA support even without NVIDIA hardware:
# Make the build script executable (first time only)
chmod +x build-cuda.sh
# Run the CUDA build script
./build-cuda.sh
This will:
- Install PyTorch with CUDA 12.1 support
- Bundle CUDA runtime libraries (~600MB extra)
- Create an executable that works on both GPU and CPU systems
- Automatically fall back to CPU if no CUDA GPU is available
The executable will be created in dist/LocalTranscription/LocalTranscription
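To confirm that the CUDA-enabled PyTorch build was installed into the project environment before packaging, a quick check like the following works (a sketch; run it inside the environment, e.g. with uv run python):
# torch.version.cuda is None for CPU-only wheels and e.g. "12.1" for CUDA wheels;
# torch.cuda.is_available() is True only when an NVIDIA GPU and driver are present.
import torch
print("CUDA runtime bundled:", torch.version.cuda)
print("CUDA GPU detected:", torch.cuda.is_available())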
Manual build:
# Clean previous builds
rm -rf build dist
# Build with PyInstaller
uv run pyinstaller local-transcription.spec
Distribution:
cd dist
tar -czf LocalTranscription-Linux.tar.gz LocalTranscription/
Building for Windows
Standard Build (CPU-only):
# Run the build script
build.bat
CUDA Build (GPU Support):
Build with CUDA support even without NVIDIA hardware:
# Run the CUDA build script
build-cuda.bat
This will:
- Install PyTorch with CUDA 12.1 support
- Bundle CUDA runtime libraries (~600MB extra)
- Create an executable that works on both GPU and CPU systems
- Automatically fall back to CPU if no CUDA GPU is available
The executable will be created in dist\LocalTranscription\LocalTranscription.exe
Manual build:
# Clean previous builds
rmdir /s /q build
rmdir /s /q dist
# Build with PyInstaller
uv run pyinstaller local-transcription.spec
Distribution:
- Compress the dist\LocalTranscription folder to a ZIP file
- Or use an installer creator like NSIS or Inno Setup
Important Notes
Cross-Platform Building
You cannot cross-compile!
- Linux executables must be built on Linux
- Windows executables must be built on Windows
- Mac executables must be built on macOS
First Run
On the first run, the application will:
- Create a config directory at ~/.local-transcription/ (Linux) or %USERPROFILE%\.local-transcription\ (Windows)
- Download the Whisper model (if not already present)
- The model will be cached in ~/.cache/huggingface/ by default
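For reference, the config location resolves under the user's home directory on both platforms; a rough Python sketch (the application's actual code may differ):
# Illustrative only: resolves to ~/.local-transcription/ on Linux
# and %USERPROFILE%\.local-transcription\ on Windows.
from pathlib import Path
CONFIG_DIR = Path.home() / ".local-transcription"
CONFIG_DIR.mkdir(parents=True, exist_ok=True)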
Executable Size
The built executable will be large (300MB - 2GB+) because it includes:
- Python runtime
- PySide6 (Qt framework)
- PyTorch/faster-whisper
- NumPy, SciPy, and other dependencies
Console Window
By default, the console window is visible (for debugging). To hide it:
- Edit local-transcription.spec
- Change console=True to console=False in the EXE section
- Rebuild
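For example, after the change the EXE section of the spec looks roughly like this (an excerpt; leave the other arguments as they already are in your spec):
# Excerpt from local-transcription.spec -- only the console flag changes:
exe = EXE(
    pyz,
    a.scripts,
    # ... other arguments unchanged ...
    console=False,  # was console=True; False hides the console window
)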
GPU Support
Building with CUDA (Recommended for Distribution)
Yes, you CAN build with CUDA support on systems without NVIDIA GPUs!
PyTorch provides CUDA-enabled builds that bundle the CUDA runtime libraries. This means:
- You don't need NVIDIA hardware to create CUDA-enabled builds
- The executable will work everywhere - on systems with or without NVIDIA GPUs
- Automatic fallback - the app detects available hardware and uses GPU if available, CPU otherwise
- Larger file size - adds ~600MB-1GB to the executable size
How it works:
# Linux
./build-cuda.sh
# Windows
build-cuda.bat
The build script will:
- Install PyTorch with bundled CUDA 12.1 runtime
- Package all CUDA libraries into the executable
- Create a universal build that runs on any system
When users run the executable:
- If they have an NVIDIA GPU with drivers: Uses GPU acceleration
- If they don't have NVIDIA GPU: Automatically uses CPU
- No configuration needed - it just works!
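A minimal sketch of this fallback, assuming a PyTorch availability check in front of the faster-whisper loader (the application's actual detection code may differ):
# Pick the GPU when an NVIDIA card and driver are present, otherwise use the CPU.
import torch
from faster_whisper import WhisperModel

def load_model(model_size: str = "base") -> WhisperModel:
    if torch.cuda.is_available():
        # CUDA GPU found: GPU acceleration with half-precision weights
        return WhisperModel(model_size, device="cuda", compute_type="float16")
    # No CUDA GPU: fall back to CPU with int8 quantization
    return WhisperModel(model_size, device="cpu", compute_type="int8")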
Alternative: CPU-Only Builds
If you only want CPU support (smaller file size):
# Linux
./build.sh
# Windows
build.bat
AMD GPU Support
- ROCm: Requires special PyTorch builds from AMD
- Not recommended for general distribution
- Better to use CUDA build (works on all systems) or CPU build
Optimizations
To reduce size:
- Remove unused model sizes: The app downloads models on-demand, so you don't need to bundle them
- Use UPX compression: Already enabled in the spec file
- Exclude dev dependencies: Only the dependencies the application needs at runtime have to be bundled
Testing the Build
After building, test the executable:
Linux:
cd dist/LocalTranscription
./LocalTranscription
Windows:
cd dist\LocalTranscription
LocalTranscription.exe
Troubleshooting
Missing modules error
If you get "No module named X" errors, add the module to the hiddenimports list in local-transcription.spec
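For example (an excerpt; the entry script name is a placeholder, keep whatever your spec already lists):
# Excerpt from local-transcription.spec -- add the missing module here:
a = Analysis(
    ['main.py'],  # placeholder entry point
    hiddenimports=[
        'missing_module_name',  # replace with the module from the "No module named X" error
    ],
    # ... remaining Analysis options unchanged ...
)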
DLL errors (Windows)
Make sure Visual C++ Redistributable is installed on the target system: https://aka.ms/vs/17/release/vc_redist.x64.exe
Audio device errors
The application needs access to audio devices. Ensure:
- Microphone permissions are granted
- Audio drivers are installed
- PulseAudio (Linux) or Windows Audio is running
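A quick way to check what the app can see is to list devices with the same sounddevice library it uses for capture:
# Print all audio devices and the current defaults as sounddevice sees them.
import sounddevice as sd
print(sd.query_devices())   # all input/output devices with their indices
print(sd.default.device)    # default (input, output) device indices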
Model download fails
Ensure internet connection on first run. Models are downloaded from: https://huggingface.co/guillaumekln/faster-whisper-base
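If the target machine is offline, you can pre-populate the cache on a connected machine and copy ~/.cache/huggingface/ across; instantiating the model once is enough to trigger the download (a sketch, using the base model as an example):
# Downloads the "base" faster-whisper model into the Hugging Face cache
# so later runs can load it without network access.
from faster_whisper import WhisperModel
WhisperModel("base", device="cpu", compute_type="int8")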
Advanced: Adding an Icon
- Create or obtain an .ico file (Windows) or .png file (Linux)
- Edit local-transcription.spec
- Change icon=None to icon='path/to/your/icon.ico'
- Rebuild
Advanced: Creating an Installer
Windows (using Inno Setup):
- Install Inno Setup: https://jrsoftware.org/isinfo.php
- Create an .iss script file
- Build the installer
Linux (using AppImage):
# Install appimagetool
wget https://github.com/AppImage/AppImageKit/releases/download/continuous/appimagetool-x86_64.AppImage
chmod +x appimagetool-x86_64.AppImage
# Create AppDir structure
mkdir -p LocalTranscription.AppDir/usr/bin
cp -r dist/LocalTranscription/* LocalTranscription.AppDir/usr/bin/
# Create desktop file and icon
# (Create .desktop file and icon as needed)
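# Example of the minimal files appimagetool expects at the AppDir root
# (the desktop entry values and icon path below are placeholders -- adjust them):
cat > LocalTranscription.AppDir/LocalTranscription.desktop << 'EOF'
[Desktop Entry]
Type=Application
Name=LocalTranscription
Exec=LocalTranscription
Icon=localtranscription
Categories=AudioVideo;Utility;
EOF
cp path/to/icon.png LocalTranscription.AppDir/localtranscription.png
ln -sf usr/bin/LocalTranscription LocalTranscription.AppDir/AppRun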
# Build AppImage
./appimagetool-x86_64.AppImage LocalTranscription.AppDir
Support
For build issues, check:
- PyInstaller documentation: https://pyinstaller.org/
- Project issues: https://github.com/anthropics/claude-code/issues