Initial commit: Local Transcription App v1.0

Phase 1 Complete - Standalone Desktop Application

Features:
- Real-time speech-to-text with Whisper (faster-whisper)
- PySide6 desktop GUI with settings dialog
- Web server for OBS browser source integration
- Audio capture with automatic sample rate detection and resampling
- Noise suppression with Voice Activity Detection (VAD)
- Configurable display settings (font, timestamps, fade duration)
- Settings apply without restart (with automatic model reloading)
- Auto-fade for web display transcriptions
- CPU/GPU support with automatic device detection
- Standalone executable builds (PyInstaller)
- CUDA build support (works on systems without CUDA hardware)

Components:
- Audio capture with sounddevice
- Noise reduction with noisereduce + webrtcvad
- Transcription with faster-whisper
- GUI with PySide6
- Web server with FastAPI + WebSocket
- Configuration system with YAML

Build System:
- Standard builds (CPU-only): build.sh / build.bat
- CUDA builds (universal): build-cuda.sh / build-cuda.bat
- Comprehensive BUILD.md documentation
- Cross-platform support (Linux, Windows)

Documentation:
- README.md with project overview and quick start
- BUILD.md with detailed build instructions
- NEXT_STEPS.md with future enhancement roadmap
- INSTALL.md with setup instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
INSTALL.md (new file, 194 lines)

# Installation Guide

## Prerequisites

- **Python 3.9 or higher**
- **uv** (Python package installer)
- **FFmpeg** (required by faster-whisper)
- **CUDA-capable GPU** (optional, for GPU acceleration)
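
Before installing, you can sanity-check these prerequisites from any Python 3 interpreter. This helper is not part of the project, just a sketch using only the standard library (it checks PATH lookups, so a tool installed but not on PATH will show as missing):

```python
import shutil
import sys

def check_prerequisites():
    """Return a dict of prerequisite name -> whether it was found.

    Illustrative helper, not part of the project.
    """
    return {
        # The project requires Python 3.9+
        "python>=3.9": sys.version_info >= (3, 9),
        # FFmpeg must be on PATH for faster-whisper's audio decoding
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # uv, used below for dependency management
        "uv": shutil.which("uv") is not None,
    }

if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```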

### Installing uv

If you don't have `uv` installed:

```bash
# On macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# Or with pip
pip install uv
```

### Installing FFmpeg

#### On Ubuntu/Debian:
```bash
sudo apt update
sudo apt install ffmpeg
```

#### On macOS (with Homebrew):
```bash
brew install ffmpeg
```

#### On Windows:
Download from [ffmpeg.org](https://ffmpeg.org/download.html) and add it to your PATH.

## Installation Steps

### 1. Navigate to Project Directory

```bash
cd /home/jknapp/code/local-transcription
```

### 2. Install Dependencies with uv

```bash
# uv will automatically create a virtual environment and install dependencies
uv sync
```

This single command will:
- Create a virtual environment (`.venv/`)
- Install all dependencies from `pyproject.toml`
- Lock dependencies for reproducibility

**Note for CUDA users:** If you have an NVIDIA GPU, install PyTorch with CUDA support:

```bash
# For CUDA 11.8
uv pip install torch --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1
uv pip install torch --index-url https://download.pytorch.org/whl/cu121
```

### 3. Run the Application

```bash
# Option 1: Using uv run (automatically uses the venv)
uv run python main.py

# Option 2: Activate the venv manually
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
python main.py
```

On first run, the application will:
- Download the Whisper model (this may take a few minutes)
- Create a configuration file at `~/.local-transcription/config.yaml`

## Quick Start Commands

```bash
# Install everything
uv sync

# Run the application
uv run python main.py

# Install with server dependencies (for Phase 2+)
uv sync --extra server

# Update dependencies
uv sync --upgrade
```

## Configuration

Settings can be changed through the GUI (Settings button) or by editing:
```
~/.local-transcription/config.yaml
```
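
For reference, a config file in this location might look like the sketch below. The keys shown are illustrative guesses based on the settings this guide mentions (model size, device, VAD, display options); check your generated `config.yaml` for the actual names.

```yaml
# ~/.local-transcription/config.yaml -- illustrative sketch, not the real schema
transcription:
  model: base          # tiny / base / small / medium
  device: auto         # auto / cpu / cuda
audio:
  vad_enabled: true    # skip silent audio
display:
  font_size: 24
  show_timestamps: false
  fade_seconds: 5      # auto-fade for the web display
```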

## Troubleshooting

### Audio Device Issues

If no audio devices are detected:
```bash
uv run python -c "import sounddevice as sd; print(sd.query_devices())"
```

### GPU Not Detected

Check if CUDA is available:
```bash
uv run python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
```
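
Even without CUDA, the app still runs on CPU. A minimal sketch of that fallback logic (the app's actual device-selection code may differ):

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, else "cpu".

    Illustrative only; the application's own selection logic may differ.
    """
    try:
        import torch  # optional dependency; may not be installed at all
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

print(f"Using device: {pick_device()}")
```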

### Model Download Fails

Models are downloaded to `~/.cache/huggingface/`. If the download fails:
- Check your internet connection
- Ensure sufficient disk space (~1-3 GB depending on model size)
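
To rule out the disk-space cause quickly, you can check free space on the filesystem holding the cache with only the standard library (the 3 GB figure is taken from the upper bound above):

```python
import shutil
from pathlib import Path

def free_gb(path):
    """Free disk space in GiB for the filesystem containing path."""
    # Walk up to the nearest existing parent so this works even
    # before ~/.cache/huggingface has been created.
    p = Path(path).expanduser()
    while not p.exists():
        p = p.parent
    return shutil.disk_usage(p).free / (1024 ** 3)

space = free_gb("~/.cache/huggingface")
print(f"Free space: {space:.1f} GiB ({'enough' if space >= 3 else 'may be too low'})")
```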

### uv Command Not Found

Make sure uv is in your PATH:
```bash
# Add to ~/.bashrc or ~/.zshrc
export PATH="$HOME/.cargo/bin:$PATH"
```

## Performance Tips

For best real-time performance:

1. **Use GPU if available** - 5-10x faster than CPU
2. **Start with smaller models**:
   - `tiny`: Fastest, ~39M parameters, 1-2s latency
   - `base`: Good balance, ~74M parameters, 2-3s latency
   - `small`: Better accuracy, ~244M parameters, 3-5s latency
3. **Enable VAD** (Voice Activity Detection) to skip silent audio
4. **Adjust chunk duration**: smaller = lower latency, larger = better accuracy
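
To make the chunk-duration trade-off concrete, here is a minimal sketch of fixed-duration chunking (`chunk_seconds` is a hypothetical parameter name; check the app's settings for the real one):

```python
def chunk_samples(samples, sample_rate, chunk_seconds):
    """Split a mono sample buffer into fixed-duration chunks.

    Each chunk holds sample_rate * chunk_seconds samples;
    the final chunk may be shorter.
    """
    size = int(sample_rate * chunk_seconds)
    return [samples[i:i + size] for i in range(0, len(samples), size)]

# At 16 kHz, 0.5 s chunks hold 8000 samples each: transcription can begin
# after only 0.5 s of audio, but each chunk gives the model less context.
chunks = chunk_samples(list(range(16000)), sample_rate=16000, chunk_seconds=0.5)
print([len(c) for c in chunks])  # [8000, 8000]
```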

## System Requirements

### Minimum:
- CPU: Dual-core 2GHz+
- RAM: 4GB
- Model: tiny or base

### Recommended:
- CPU: Quad-core 3GHz+ or GPU (NVIDIA GTX 1060+)
- RAM: 8GB
- Model: base or small

### For Best Performance:
- GPU: NVIDIA RTX 2060 or better
- RAM: 16GB
- Model: small or medium

## Development

### Install development dependencies:
```bash
uv sync --extra dev
```

### Run tests:
```bash
uv run pytest
```

### Format code:
```bash
uv run black .
uv run ruff check .
```

## Why uv?

`uv` is significantly faster than pip:
- **10-100x faster** dependency resolution
- **Automatic virtual environment** management
- **Reproducible builds** with a lockfile
- **Drop-in replacement** for pip commands

Learn more at [astral.sh/uv](https://astral.sh/uv)