What is considered a good word error rate?

In the context of dictation, a WER below 5% is considered excellent. In customer service (voicebots), where complex proper nouns, dialects, and background noise are common, rates between 10% and 15% are typical in practice. What matters most here is the 'Semantic WER': whether business-critical data such as customer numbers or amounts is recognized without errors.

How does WER affect a voice bot's ROI?

The WER is directly related to the containment rate. Every misunderstood word increases the risk of failed attempts (fallbacks) or leads to unnecessary handovers to human agents. Reducing the WER therefore directly improves the project’s ROI.

Is a low WER alone enough to make a good voicebot?

No. WER only measures the quality of the transcription (speech-to-text). For a voicebot to be successful, even an accurate transcript must still be processed by an intelligent logic layer (multi-agent orchestration, Knowledge AI/RAG) in order to understand the context and perform the correct action.

Word Error Rate (WER)

June 2, 2026

By Julia Schönau


The Word Error Rate (WER) is the key metric for measuring the quality of speech-to-text systems. It indicates how many word-level errors the recognition system made when transcribing a spoken sentence, expressed as a percentage of the total number of spoken words. A low WER is a prerequisite for reliable voicebots, because every recognition error degrades the downstream steps: classification of the request, entity extraction, and thus end-to-end automation.

How the Word Error Rate is Calculated

The WER sums up three types of errors and compares them to the length of the reference text:

Substitutions (S): A word has been replaced by another.
Insertions (I): An additional word has been inserted.
Deletions (D): A word is missing from the transcription.

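The formula is WER = (S + I + D) / N, where N is the number of words in the reference text (what was actually said). For example, a WER of 5% means that one word was incorrectly recognized in a 20-word sentence. Because insertions count as errors but do not increase N, the WER can exceed 100% for very poor transcripts.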
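The calculation above can be sketched in a few lines of Python. This is a minimal illustration, assuming whitespace tokenization and lowercasing as the only normalization; real evaluations usually also normalize punctuation and number formats. The names `WerResult` and `word_error_rate` are made up for this example and do not belong to any library.

```python
from dataclasses import dataclass


@dataclass
class WerResult:
    substitutions: int
    insertions: int
    deletions: int
    reference_length: int

    @property
    def wer(self) -> float:
        # WER = (S + I + D) / N
        errors = self.substitutions + self.insertions + self.deletions
        return errors / self.reference_length


def word_error_rate(reference: str, hypothesis: str) -> WerResult:
    """Align hypothesis to reference with word-level edit distance."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # cost[i][j] = (total errors, S, I, D) for ref[:i] versus hyp[:j]
    cost = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        cost[i][0] = (i, 0, 0, i)  # hypothesis empty: only deletions
    for j in range(1, len(hyp) + 1):
        cost[0][j] = (j, 0, j, 0)  # reference empty: only insertions

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                cost[i][j] = cost[i - 1][j - 1]  # match, no new error
                continue
            t, s, ins, d = cost[i - 1][j - 1]
            substitution = (t + 1, s + 1, ins, d)
            t, s, ins, d = cost[i][j - 1]
            insertion = (t + 1, s, ins + 1, d)
            t, s, ins, d = cost[i - 1][j]
            deletion = (t + 1, s, ins, d + 1)
            # Tuples compare by total error count first.
            cost[i][j] = min(substitution, insertion, deletion)

    _, s, ins, d = cost[len(ref)][len(hyp)]
    return WerResult(s, ins, d, len(ref))


# The 20-word example from the text: one misrecognized word ("two" heard as "to").
reference = ("please transfer two hundred euros from my savings account to my "
             "checking account before the end of this month thanks")
hypothesis = reference.replace(" two ", " to ")
result = word_error_rate(reference, hypothesis)
print(f"S={result.substitutions} I={result.insertions} "
      f"D={result.deletions} WER={result.wer:.2%}")
# prints: S=1 I=0 D=0 WER=5.00%
```

The total error count is always the minimum edit distance. When several alignments have the same total, the split into S, I, and D can depend on tie-breaking, so different tools may report slightly different breakdowns for the same WER.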

WER and Its Impact on Voice Bots

In the voice channel, the WER directly impacts all subsequent steps. If the system misrecognizes a customer number or a plan name, the entire workflow fails. That is why recognition quality is not just a quality metric, but an input variable for multi-agent orchestration: when confidence is low, the triage agent asks the caller to repeat the information or compares the transcript against stored custom entities.
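The triage behavior described above could look roughly like the following sketch. It is not the product's actual logic: the function `triage_entity`, the `Decision` type, and both thresholds are hypothetical, and it assumes the speech-to-text engine returns a confidence score between 0 and 1. Fuzzy matching uses `difflib.get_close_matches` from the Python standard library.

```python
import difflib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Decision:
    action: str            # "accept", "confirm", or "reprompt"
    value: Optional[str]   # recognized or matched entity, if any


def triage_entity(transcript: str,
                  confidence: float,
                  known_entities: List[str],
                  accept_threshold: float = 0.85,   # assumed value
                  match_cutoff: float = 0.75) -> Decision:  # assumed value
    """Decide what to do with a recognized entity such as a plan name."""
    text = transcript.strip()
    if confidence >= accept_threshold:
        return Decision("accept", text)

    # Low confidence: compare the transcript with stored custom entities.
    lookup = {entity.lower(): entity for entity in known_entities}
    matches = difflib.get_close_matches(
        text.lower(), list(lookup), n=1, cutoff=match_cutoff
    )
    if matches:
        # A close match is only a candidate, so ask the caller to confirm it.
        return Decision("confirm", lookup[matches[0]])

    # Nothing plausible found: ask the caller to repeat.
    return Decision("reprompt", None)


plans = ["Mobile Flex L", "Home Fiber 100"]
print(triage_entity("mobile flecks l", 0.40, plans))
print(triage_entity("something else entirely", 0.30, plans))
```

Returning "confirm" rather than silently accepting the closest match keeps a wrong fuzzy match from propagating into the workflow.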

WER for proper nouns, numbers, and technical terms

The average WER of modern speech-to-text systems for standard conversations is in the low single digits. For proper nouns, addresses, numbers, or industry-specific terminology, it is often significantly higher, which is unfortunately exactly where accuracy matters most for service processes. Custom vocabularies, industry-specific language models, and downstream plausibility checks via phonebots help to close this gap.
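As an illustration of a downstream plausibility check, the sketch below normalizes a spoken customer number and rejects anything that does not fit the expected format. Everything specific here is an assumption made for the example: the 8-digit format, the homophone table, and the function name `normalize_customer_number`. It also assumes the transcript comes from a turn in which the caller was asked only for the number.

```python
import re
from typing import Optional

DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}
# Hypothetical repairs for common misrecognitions in a digit-only context.
HOMOPHONES = {"to": "2", "too": "2", "for": "4", "ate": "8"}

CUSTOMER_NUMBER = re.compile(r"\d{8}")  # assumed format: exactly 8 digits


def normalize_customer_number(transcript: str) -> Optional[str]:
    """Return the customer number as digits, or None if implausible."""
    digits = []
    for token in re.findall(r"[a-z]+|\d", transcript.lower()):
        if token.isdigit():
            digits.append(token)
        elif token in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[token])
        elif token in HOMOPHONES:
            digits.append(HOMOPHONES[token])
        else:
            return None  # unexpected word: treat the whole turn as implausible
    candidate = "".join(digits)
    return candidate if CUSTOMER_NUMBER.fullmatch(candidate) else None


print(normalize_customer_number("for to seven one zero nine three five"))
# prints: 42710935
print(normalize_customer_number("four two seven"))
# prints: None
```

A `None` result would typically trigger a follow-up question instead of passing a malformed number to the backend.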

Frequently Asked Questions (FAQ)

What is considered a good word error rate?

In the context of dictation, a WER below 5% is considered very good. In the service sector, where proper names, addresses, and numbers are involved, realistic target rates vary by industry. The key is to ensure that critical data points (customer number, address, amount) are recognized accurately.

How does WER affect a voice bot's ROI?

Any gap in recognition leads either to follow-up questions or to escalation to employees. Both reduce the automation rate. A low WER is therefore a direct driver of ROI.

Is a low WER alone enough to make a good voicebot?

No. A low WER is a necessary but not sufficient condition. Only the combination of multi-agent orchestration, hybrid intelligence, and knowledge AI turns a good transcript into a robust service process.
