# Voice to Notes

A desktop application that transcribes audio and video recordings, identifies who is speaking, and produces editable transcripts synchronized with audio playback.

## Goals

- **Speech-to-Text Transcription** — Accurately convert spoken audio from recordings into text
- **Speaker Identification (Diarization)** — Detect and distinguish between different speakers in a conversation
- **Speaker Naming** — Assign and persist speaker names/IDs across the transcription
- **Synchronized Playback** — Click any transcribed text segment to play back the corresponding audio for review and correction
- **Export Formats**
  - Closed captioning files (SRT, VTT) for video
  - Plain text documents with speaker labels
- **AI Integration** — Connect to AI providers to ask questions about the conversation and generate condensed notes/summaries

## Platform Support

| Platform | Status |
|----------|--------|
| Linux    | Planned (initial target) |
| Windows  | Planned (initial target) |
| macOS    | Future (pending hardware) |

## Project Status

**Early planning phase** — Architecture and technology decisions in progress.

## License

MIT