Transcribers

Transcribers (ASR, Automatic Speech Recognition) are the “Ears” of your AI agent. They turn human speech into text in real time.

Key Performance Indicators

  1. WER (Word Error Rate): The share of words the engine gets wrong (substitutions, deletions, and insertions) relative to a reference transcript. Lower is better.
  2. Latency: How quickly it produces text from sound.
  3. Acoustic Robustness: How well it handles background noise, accents, and poor phone connections.
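To make the first metric concrete, here is a minimal sketch of how WER is commonly computed: the word-level edit distance between a reference transcript and the engine's output, divided by the number of reference words. The function name `word_error_rate` and the simple lowercase-and-split normalization are assumptions for illustration; production evaluations usually apply stricter text normalization (punctuation, numerals) before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of five reference words.
    print(word_error_rate("book a table for two", "book the table for two"))
```

Because insertions count as errors, WER can exceed 1.0 when the engine outputs many more words than were actually spoken.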

Supported Providers

Deepgram (The Gold Standard)

The fastest and most accurate ASR for English. Includes features like smart formatting and multi-channel support.

Sarvam AI (Indian Regional Expert)

Optimized for the nuances of Indian accents and regional languages like Hindi, Odia, and Tamil. We recommend Sarvam for any deployment targeting the Indian market.

OpenAI Whisper

Extremely high accuracy for a wide variety of global languages, though often with higher latency than Deepgram.

Configuration in Movoice AI

You can set your preferred transcriber in the Engine Tab of the Agent Studio. Note that some transcribers are only available for specific languages.
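The provider guidance on this page can be summarized as a simple lookup by language. The sketch below is illustrative only and is not part of the Movoice AI API: the function `suggest_transcriber`, the provider labels, and the use of ISO 639-1 codes with an optional region suffix are all assumptions. The actual selection is made in the Engine Tab.

```python
# Hypothetical helper; names and labels are assumptions, not Movoice AI identifiers.
# ISO 639-1 codes for the regional languages named on this page.
INDIAN_LANGUAGES = {"hi", "or", "ta"}  # Hindi, Odia, Tamil


def suggest_transcriber(language_code: str) -> str:
    """Suggest a provider for a code such as 'hi', 'en-US', or 'en_IN'."""
    parts = language_code.strip().lower().replace("_", "-").split("-")
    language = parts[0]
    region = parts[1] if len(parts) > 1 else ""

    # Sarvam is recommended for any deployment targeting the Indian market.
    if language in INDIAN_LANGUAGES or region == "in":
        return "sarvam"
    # Deepgram is the recommended choice for English.
    if language == "en":
        return "deepgram"
    # Whisper covers a wide variety of other languages, usually at higher latency.
    return "whisper"


if __name__ == "__main__":
    for code in ("hi-IN", "en-US", "en-IN", "fr"):
        print(code, suggest_transcriber(code))
```

Treat the result as a starting point only, since the providers actually available for a given language are whatever the Engine Tab lists.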