Together AI Develops Fastest Speech-to-Text Stack

Together AI Blog | May 29, 2026 | high confidence

Why it matters

→ Together AI's stack transcribes 20 hours of audio in under 10 seconds, sharply cutting turnaround time for audio processing.
→ Optimizing the entire data path, from CPU preprocessing to GPU execution, sets a new bar for automatic speech recognition (ASR) performance.
→ The result matters for applications that need fast, accurate speech-to-text conversion.

Together AI has built a high-throughput speech-to-text stack around NVIDIA's Parakeet-TDT and OpenAI's Whisper models. By optimizing the data path from CPU preprocessing through GPU execution, the stack transcribes 20 hours of audio in under 10 seconds. Key techniques include profile-aware TensorRT execution and GPU-side decoder control. The result is most relevant to applications that need low latency and high throughput in audio transcription.
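
To put the headline figure in perspective, the short sketch below converts "20 hours in under 10 seconds" into a real-time speedup factor. The function name `realtime_speedup` is a hypothetical helper introduced here for illustration, and the calculation treats 10 seconds as an upper bound on wall-clock time, so the result is a lower bound on the speedup rather than a measured benchmark.

```python
def realtime_speedup(audio_seconds: float, wall_seconds: float) -> float:
    """Return seconds of audio transcribed per second of wall-clock time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_seconds / wall_seconds


# Figures taken from the article: 20 hours of audio, under 10 seconds to transcribe.
audio_seconds = 20 * 3600  # 72,000 seconds of audio
wall_seconds = 10          # upper bound on transcription time

speedup = realtime_speedup(audio_seconds, wall_seconds)
print(f"at least {speedup:.0f}x real time")
```

Because the reported time is "under 10 seconds", the actual factor would be somewhat higher than the value this prints.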
