Initial commit: Local Transcription App v1.0
Phase 1 Complete - Standalone Desktop Application

Features:
- Real-time speech-to-text with Whisper (faster-whisper)
- PySide6 desktop GUI with settings dialog
- Web server for OBS browser source integration
- Audio capture with automatic sample rate detection and resampling
- Noise suppression with Voice Activity Detection (VAD)
- Configurable display settings (font, timestamps, fade duration)
- Settings apply without restart (with automatic model reloading)
- Auto-fade for web display transcriptions
- CPU/GPU support with automatic device detection
- Standalone executable builds (PyInstaller)
- CUDA build support (works on systems without CUDA hardware)

Components:
- Audio capture with sounddevice
- Noise reduction with noisereduce + webrtcvad
- Transcription with faster-whisper
- GUI with PySide6
- Web server with FastAPI + WebSocket
- Configuration system with YAML

Build System:
- Standard builds (CPU-only): build.sh / build.bat
- CUDA builds (universal): build-cuda.sh / build-cuda.bat
- Comprehensive BUILD.md documentation
- Cross-platform support (Linux, Windows)

Documentation:
- README.md with project overview and quick start
- BUILD.md with detailed build instructions
- NEXT_STEPS.md with future enhancement roadmap
- INSTALL.md with setup instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
New file: BUILD.md (259 lines)
# Building Local Transcription

This guide explains how to build standalone executables for Linux and Windows.

## Prerequisites

1. **Python 3.8+** installed on your system
2. **uv** package manager (install from https://docs.astral.sh/uv/)
3. All project dependencies installed (`uv sync`)

## Building for Linux

### Standard Build (CPU-only):

```bash
# Make the build script executable (first time only)
chmod +x build.sh

# Run the build script
./build.sh
```

### CUDA Build (GPU Support):

Build with CUDA support even without NVIDIA hardware:

```bash
# Make the build script executable (first time only)
chmod +x build-cuda.sh

# Run the CUDA build script
./build-cuda.sh
```

This will:
- Install PyTorch with CUDA 12.1 support
- Bundle the CUDA runtime libraries (~600MB extra)
- Create an executable that works on both GPU and CPU systems
- Automatically fall back to CPU if no CUDA GPU is available

The executable will be created at `dist/LocalTranscription/LocalTranscription`.

### Manual build:

```bash
# Clean previous builds
rm -rf build dist

# Build with PyInstaller
uv run pyinstaller local-transcription.spec
```

### Distribution:

```bash
cd dist
tar -czf LocalTranscription-Linux.tar.gz LocalTranscription/
```
## Building for Windows

### Standard Build (CPU-only):

```cmd
REM Run the build script
build.bat
```

### CUDA Build (GPU Support):

Build with CUDA support even without NVIDIA hardware:

```cmd
REM Run the CUDA build script
build-cuda.bat
```

This will:
- Install PyTorch with CUDA 12.1 support
- Bundle the CUDA runtime libraries (~600MB extra)
- Create an executable that works on both GPU and CPU systems
- Automatically fall back to CPU if no CUDA GPU is available

The executable will be created at `dist\LocalTranscription\LocalTranscription.exe`.

### Manual build:

```cmd
REM Clean previous builds
rmdir /s /q build
rmdir /s /q dist

REM Build with PyInstaller
uv run pyinstaller local-transcription.spec
```

### Distribution:

- Compress the `dist\LocalTranscription` folder to a ZIP file
- Or use an installer creator like NSIS or Inno Setup
## Important Notes

### Cross-Platform Building

**You cannot cross-compile!**
- Linux executables must be built on Linux
- Windows executables must be built on Windows
- macOS executables must be built on macOS

### First Run

On the first run, the application will:
1. Create a config directory at `~/.local-transcription/` (Linux) or `%USERPROFILE%\.local-transcription\` (Windows)
2. Download the Whisper model (if not already present)
3. Cache the model in `~/.cache/huggingface/` by default

### Executable Size

The built executable will be large (300MB - 2GB+) because it includes:
- The Python runtime
- PySide6 (Qt framework)
- PyTorch/faster-whisper
- NumPy, SciPy, and other dependencies

### Console Window

By default, the console window is visible (for debugging). To hide it:

1. Edit `local-transcription.spec`
2. Change `console=True` to `console=False` in the `EXE` section (see the sketch below)
3. Rebuild
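For orientation, the relevant part of the spec file looks roughly like this (a minimal sketch; the other arguments in the generated `local-transcription.spec` will differ):

```python
# Excerpt of the EXE section in a PyInstaller spec file (spec files are Python).
# Only the console flag matters here; the remaining arguments are illustrative.
exe = EXE(
    pyz,
    a.scripts,
    name='LocalTranscription',
    console=False,  # False hides the console window; True keeps it for debugging
    icon=None,
)
```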
### GPU Support

#### Building with CUDA (Recommended for Distribution)

**Yes, you CAN build with CUDA support on systems without NVIDIA GPUs!**

PyTorch provides CUDA-enabled builds that bundle the CUDA runtime libraries. This means:

1. **You don't need NVIDIA hardware** to create CUDA-enabled builds
2. **The executable will work everywhere** - on systems with or without NVIDIA GPUs
3. **Automatic fallback** - the app detects the available hardware and uses the GPU if available, the CPU otherwise
4. **Larger file size** - adds ~600MB-1GB to the executable

**How it works:**

```bash
# Linux
./build-cuda.sh
```

```cmd
REM Windows
build-cuda.bat
```

The build script will:
- Install PyTorch with the bundled CUDA 12.1 runtime
- Package all CUDA libraries into the executable
- Create a universal build that runs on any system

**When users run the executable:**
- If they have an NVIDIA GPU with drivers installed: GPU acceleration is used
- If they don't have an NVIDIA GPU: the CPU is used automatically
- No configuration needed - it just works (see the device-detection sketch below)
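Internally, that fallback amounts to a device check at startup. A minimal sketch, assuming detection via `torch.cuda.is_available()` and faster-whisper's `WhisperModel` (the model size and compute types are illustrative, not the exact code in this repo):

```python
import torch
from faster_whisper import WhisperModel

def load_model(model_size: str = "base") -> WhisperModel:
    """Load the Whisper model on the best available device."""
    if torch.cuda.is_available():
        # NVIDIA GPU with a working driver: use CUDA with fp16 for speed.
        return WhisperModel(model_size, device="cuda", compute_type="float16")
    # No usable GPU: fall back to CPU with int8 quantization.
    return WhisperModel(model_size, device="cpu", compute_type="int8")
```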
#### Alternative: CPU-Only Builds

If you only want CPU support (smaller file size):

```bash
# Linux
./build.sh
```

```cmd
REM Windows
build.bat
```

#### AMD GPU Support

- **ROCm**: Requires special PyTorch builds from AMD
- Not recommended for general distribution
- Better to use the CUDA build (works on all systems) or the CPU build

### Optimizations

To reduce size:

1. **Remove unused model sizes**: The app downloads models on demand, so you don't need to bundle them
2. **Use UPX compression**: Already enabled in the spec file
3. **Exclude dev dependencies**: Only build dependencies are needed (see the spec excerpt below)
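As a rough illustration of point 3, PyInstaller's `Analysis` accepts an `excludes` list, so dev-only packages can be kept out of the bundle via `local-transcription.spec` (the entry script and module names below are examples, not a vetted list for this project):

```python
# Excerpt of the Analysis section in a PyInstaller spec file.
a = Analysis(
    ['main.py'],                           # entry script (illustrative)
    excludes=['pytest', 'black', 'mypy'],  # dev-only packages to leave out of the build
)
```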
## Testing the Build

After building, test the executable:

### Linux:

```bash
cd dist/LocalTranscription
./LocalTranscription
```

### Windows:

```cmd
cd dist\LocalTranscription
LocalTranscription.exe
```
## Troubleshooting

### Missing modules error

If you get "No module named X" errors, add the missing module to the `hiddenimports` list in `local-transcription.spec` and rebuild.
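For example, if the error is `No module named some_missing_module`, the fix looks roughly like this (`some_missing_module` is a stand-in for whatever module PyInstaller failed to detect):

```python
# Excerpt of the Analysis section in local-transcription.spec.
a = Analysis(
    ['main.py'],  # entry script (illustrative)
    hiddenimports=['some_missing_module'],  # modules PyInstaller cannot detect automatically
)
```

Then rebuild with `uv run pyinstaller local-transcription.spec`.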
### DLL errors (Windows)

Make sure the Visual C++ Redistributable is installed on the target system:
https://aka.ms/vs/17/release/vc_redist.x64.exe

### Audio device errors

The application needs access to audio devices. Ensure that:
- Microphone permissions are granted
- Audio drivers are installed
- PulseAudio (Linux) or Windows Audio is running
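To see exactly which devices are visible to the app, the `sounddevice` library (already used for audio capture) can list them; a quick check from a Python shell:

```python
# List the audio devices visible to sounddevice / PortAudio.
import sounddevice as sd

print(sd.query_devices())   # all input/output devices with their indices
print(sd.default.device)    # indices of the current default (input, output) devices
```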
### Model download fails

Ensure you have an internet connection on the first run. Models are downloaded from:
https://huggingface.co/guillaumekln/faster-whisper-base
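If the target machine is offline, one workaround is to warm the model cache on a connected machine and copy `~/.cache/huggingface/` across. A sketch using faster-whisper (assuming the default `base` model; instantiating the model triggers the download if it is not already cached):

```python
# Pre-download and cache the Whisper model, then copy the Hugging Face
# cache directory (~/.cache/huggingface/) to the offline machine.
from faster_whisper import WhisperModel

WhisperModel("base", device="cpu", compute_type="int8")
```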
## Advanced: Adding an Icon

1. Create or obtain an `.ico` file (Windows) or `.png` file (Linux)
2. Edit `local-transcription.spec`
3. Change `icon=None` to `icon='path/to/your/icon.ico'`
4. Rebuild

## Advanced: Creating an Installer

### Windows (using Inno Setup):

1. Install Inno Setup: https://jrsoftware.org/isinfo.php
2. Create an `.iss` script file
3. Build the installer

### Linux (using AppImage):

```bash
# Install appimagetool
wget https://github.com/AppImage/AppImageKit/releases/download/continuous/appimagetool-x86_64.AppImage
chmod +x appimagetool-x86_64.AppImage

# Create the AppDir structure
mkdir -p LocalTranscription.AppDir/usr/bin
cp -r dist/LocalTranscription/* LocalTranscription.AppDir/usr/bin/

# Create a .desktop file and icon as needed

# Build the AppImage
./appimagetool-x86_64.AppImage LocalTranscription.AppDir
```

## Support

For build issues, check:
1. PyInstaller documentation: https://pyinstaller.org/
2. Project issues: https://github.com/anthropics/claude-code/issues