Add speech-to-text feature using Faster Whisper container
Some checks failed
Build App / compute-version (pull_request) Successful in 3s
Build App / build-macos (pull_request) Successful in 2m28s
Build STT Container / build-stt-container (pull_request) Successful in 3m18s
Build App / build-windows (pull_request) Successful in 4m40s
Build App / build-linux (pull_request) Failing after 1m46s
Build App / create-tag (pull_request) Has been skipped
Build App / sync-to-github (pull_request) Has been skipped
Adds a mic button to the terminal UI that captures speech, transcribes it via a Faster Whisper sidecar container, and injects the text into the terminal input. Includes settings panel for model selection (tiny/small/medium), port config, and container lifecycle management.

- stt-container/: Dockerfile + FastAPI server for Whisper transcription
- Rust backend: STT container management, transcribe_audio IPC command
- Frontend: useSTT hook, SttButton, SttSettings, WAV encoder
- CI: Gitea Actions workflow for multi-arch STT image builds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
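
The WAV encoder named in the commit message lives in the frontend and is not part of the diff shown on this page. As a rough illustration of what such an encoder has to produce before the audio is handed to the transcription step, here is a minimal sketch in Rust that writes the standard 44-byte RIFF/WAVE header followed by the samples. The function name `encode_wav_mono16` and the choice of mono 16-bit PCM are assumptions made for this example, not details taken from the commit.

```rust
/// Encode mono 16-bit PCM samples as an in-memory WAV (RIFF) file.
///
/// Assumption: mono, 16-bit PCM. The real frontend encoder in this
/// commit is not shown here and may use a different layout.
pub fn encode_wav_mono16(samples: &[i16], sample_rate: u32) -> Vec<u8> {
    const CHANNELS: u16 = 1;
    const BITS_PER_SAMPLE: u16 = 16;

    // Bytes per sample frame and bytes per second, as the fmt chunk requires.
    let block_align: u16 = CHANNELS * (BITS_PER_SAMPLE / 8);
    let byte_rate: u32 = sample_rate * u32::from(block_align);
    let data_len: u32 = (samples.len() * 2) as u32;

    let mut out = Vec::with_capacity(44 + samples.len() * 2);

    // RIFF container header: the size field covers everything after it.
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes());
    out.extend_from_slice(b"WAVE");

    // "fmt " chunk: 16 bytes describing uncompressed PCM audio.
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes());
    out.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = PCM
    out.extend_from_slice(&CHANNELS.to_le_bytes());
    out.extend_from_slice(&sample_rate.to_le_bytes());
    out.extend_from_slice(&byte_rate.to_le_bytes());
    out.extend_from_slice(&block_align.to_le_bytes());
    out.extend_from_slice(&BITS_PER_SAMPLE.to_le_bytes());

    // "data" chunk: little-endian samples.
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());
    for sample in samples {
        out.extend_from_slice(&sample.to_le_bytes());
    }
    out
}
```

Calling `encode_wav_mono16(&pcm, 16_000)` yields a buffer that can be sent as a file upload. Whisper models operate on 16 kHz audio, so that rate is a common choice, but the rate this commit actually records at is not stated here.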
@@ -76,6 +76,48 @@ pub struct AppSettings {
    pub dismissed_image_digest: Option<String>,
    #[serde(default)]
    pub web_terminal: WebTerminalSettings,
    #[serde(default)]
    pub stt: SttSettings,
}

fn default_stt_model() -> String {
    "tiny".to_string()
}

fn default_stt_port() -> u16 {
    9876
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SttSettings {
    #[serde(default)]
    pub enabled: bool,
    #[serde(default = "default_stt_model")]
    pub model: String,
    #[serde(default = "default_stt_port")]
    pub port: u16,
    #[serde(default)]
    pub language: Option<String>,
}

impl Default for SttSettings {
    fn default() -> Self {
        Self {
            enabled: false,
            model: default_stt_model(),
            port: 9876,
            language: None,
        }
    }
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SttStatus {
    pub container_exists: bool,
    pub running: bool,
    pub port: u16,
    pub model: String,
    pub image_exists: bool,
}

fn default_web_terminal_port() -> u16 {
@@ -120,6 +162,7 @@ impl Default for AppSettings {
            default_microphone: None,
            dismissed_image_digest: None,
            web_terminal: WebTerminalSettings::default(),
            stt: SttSettings::default(),
        }
    }
}
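
One detail in the hunk above: `Default for SttSettings` hard-codes `port: 9876` while the serde attribute goes through `default_stt_port()`, so the two defaults could drift apart if only one is changed later. The sketch below mirrors the struct without the serde derives (so it builds with the standard library alone), routes both defaults through the helper, and adds a small validation step. `KNOWN_MODELS` and `validate` are hypothetical names introduced for illustration; the model list is taken from the commit message (tiny/small/medium), and nothing here is claimed to match the backend code in this commit.

```rust
/// Mirror of the `SttSettings` fields from the diff, minus the serde
/// derives so that this sketch has no external dependencies.
#[derive(Debug, Clone, PartialEq)]
pub struct SttSettings {
    pub enabled: bool,
    pub model: String,
    pub port: u16,
    pub language: Option<String>,
}

fn default_stt_model() -> String {
    "tiny".to_string()
}

fn default_stt_port() -> u16 {
    9876
}

impl Default for SttSettings {
    fn default() -> Self {
        Self {
            enabled: false,
            model: default_stt_model(),
            // Reusing the helper keeps this value and the serde default in sync.
            port: default_stt_port(),
            language: None,
        }
    }
}

/// Hypothetical: the model sizes listed in the commit message.
const KNOWN_MODELS: [&str; 3] = ["tiny", "small", "medium"];

/// Hypothetical helper: reject settings the STT container could not use.
pub fn validate(settings: &SttSettings) -> Result<(), String> {
    if !KNOWN_MODELS.contains(&settings.model.as_str()) {
        return Err(format!("unknown STT model: {}", settings.model));
    }
    if settings.port == 0 {
        return Err("STT port must be non-zero".to_string());
    }
    Ok(())
}
```

Because `model` is a free-form `String` in the diff, a check like this (or an enum in place of the string) is one way to catch a typo in the settings file before a container is started with it.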