Three changes to reduce transcription delay:
1. Send loop: the blocking queue.get() call ran directly on the asyncio
event loop, stalling the receive loop and delaying transcription
results. It now runs via run_in_executor(), so the wait happens off the
event loop.
2. Block size: reduced from 4096 (~256ms) to 1024 (~64ms) for more
frequent, smaller audio chunks. Deepgram handles streaming better
with smaller packets.
3. Added punctuate=true and smart_format=true to Deepgram BYOK
params for cleaner transcription output.
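
Illustrative sketch of changes 1 and 2 (not the code in this commit).
It assumes a thread-safe queue.Queue fed by the audio callback, a
16 kHz sample rate (inferred from 4096 -> ~256ms), a hypothetical async
send() standing in for the websocket send, and None as a stop sentinel:

```python
import asyncio
import queue


def block_duration_ms(block_size: int, sample_rate_hz: int = 16_000) -> float:
    """Duration of one audio block in ms (the 16 kHz default is an assumption)."""
    return block_size * 1000 / sample_rate_hz


async def send_loop(audio_queue: queue.Queue, send) -> None:
    """Forward queued chunks to `send` without blocking the event loop.

    `send` is a hypothetical async callable standing in for the websocket
    send; a None item is used here as a stop sentinel (also an assumption).
    """
    loop = asyncio.get_running_loop()
    while True:
        # The blocking queue.get() runs in the default thread pool executor,
        # so other coroutines (e.g. the receive loop) keep running meanwhile.
        chunk = await loop.run_in_executor(None, audio_queue.get)
        if chunk is None:
            break
        await send(chunk)


async def demo() -> list:
    """Push two chunks through send_loop and return what was sent."""
    audio_queue: queue.Queue = queue.Queue()
    sent = []

    async def fake_send(chunk: bytes) -> None:
        sent.append(chunk)

    task = asyncio.create_task(send_loop(audio_queue, fake_send))
    for chunk in (b"a", b"b"):
        audio_queue.put(chunk)
    audio_queue.put(None)  # tell the loop to stop
    await task
    return sent
```

Under the 16 kHz assumption, block_duration_ms(4096) and
block_duration_ms(1024) give the ~256ms and ~64ms figures quoted above.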

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>