The Usefulness of Speech to Text

Audio to Text Transcription is Convenient, but How Useful is it?

Speech to Text (STT) automatically converts your voicemail into text and delivers it to your email and your cell phone. Instead of listening to a 3-minute voicemail, you can read it in less than 30 seconds.


Once you start receiving your voicemails in text form, it’s hard to go back. Yet transcribing audio to text while accounting for accents, tones, inflections and background noise is not simple.


Amazingly, you can pull out your smartphone right now and ask your phone a question. Over 90% of the time, it will pick up exactly what you are saying. Even better, it will then answer your question.


We’ve simply taken this technology and used it to make the life of a business executive that much more efficient. An average business executive spends 45 minutes a week listening to voicemail. Over a year, that’s almost an entire workweek.


What could you do with that time?


Close more deals? Spend an extra week with your family at the beach? The choice is yours. An innovative speech to text service that creates this convenience for you might sound too good to be true.


Questions? Call Us!

Turning your voicemail message into text automatically so that you can read your message as well as hear it is a huge convenience. And it is no small technological feat.


Even though computers are incredibly fast and capable of astounding calculations, they are still no match for the human brain. When you hear a message, your brain can sort through background noise and differences in inflection, gender, volume, accent, emotion, etc. to understand the message. Your brain will even “understand” words that are left out or unintelligible simply from the context.


Programming a computer to produce a useful interpretation of a recording is very challenging, and the result is not 100% accurate by any means. Several key issues affect accuracy, such as the volume and enunciation of the speaker, background noise and the quality of the recording. However, Speech to Text technology scores very high on what we call the “Usefulness Index”.

  • The Usefulness Index

    The Usefulness Index addresses 3 questions:

  1. Identity - Who is calling?
  2. Purpose - What do they want or why are they calling?
  3. Contact – How do I get back in touch with them?

To evaluate the effectiveness of the Speech to Text (STT) process, I submitted a dozen voicemail messages left for me for transcription. They came from a dozen different speakers, with different accents, genders and recording quality.

Each STT transcribed message got 1 point for each of the 3 Usefulness Index requirements it met. So each message had a potential score of 3 points.

Here are the results:

Caller Points
Christi 3
Cindy 3
Betty 2
Denise 3
Jennifer 2
Mitzi 2
Alex 1
Andy 2
Bruce 3
Joe 2
Kevin 3
Toby 3
Total Score: 29 out of a possible 36

Average message score: 2.4 points

Half of the test messages scored 100% on the Usefulness Index. Five scored 2 out of 3, and only one came close to failing, with a score of 1.
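
The tally above can be double-checked with a few lines of Python. This is only a sketch: the scores are copied from the results list, and the variable names are illustrative rather than part of any actual Speech to Text tool.

```python
from collections import Counter

# Usefulness Index scores copied from the results above (max 3 per message).
scores = {
    "Christi": 3, "Cindy": 3, "Betty": 2, "Denise": 3,
    "Jennifer": 2, "Mitzi": 2, "Alex": 1, "Andy": 2,
    "Bruce": 3, "Joe": 2, "Kevin": 3, "Toby": 3,
}

MAX_POINTS = 3  # one point each for Identity, Purpose and Contact

total = sum(scores.values())
possible = MAX_POINTS * len(scores)
average = total / len(scores)
distribution = Counter(scores.values())  # how many messages got each score

print(f"Total score: {total} of {possible}")
print(f"Average message score: {average:.1f}")
for points in sorted(distribution, reverse=True):
    print(f"{distribution[points]} message(s) scored {points} of {MAX_POINTS}")
```

Running it reproduces the totals reported here: 29 of 36 points, an average of about 2.4, and six, five and one messages scoring 3, 2 and 1 points respectively.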

Remember that Caller ID (ANI) is captured on every call and included with the transcription, so Usefulness Index requirement number 3 (Contact) is always met and a couple of the scores improve.

  • And the Winner is...

    The transcription was perfect only 50% of the time. But that 50% cuts in half the time I have to spend actually listening to messages. And on most of the other 50%, I have enough info to decide if I need to listen to the entire message. The benefits of an audio message, such as inflection and tone of voice, are still there if needed, but now I don't have to listen to every message to know what it's about. And the cost has now fallen to an affordable rate.

If I need a true word-for-word transcription, then a hybrid service is required that combines computer processing with human oversight, at a higher cost.

One day the natural human interface of speech will replace buttons, the mouse and the keyboard, just like on Star Trek. We aren't there yet, but it might be closer than you think. In the meantime, Speech to Text in its current form can make your day more productive.

The Productivity of Speech to Text

A business executive spends an average of 45 minutes per week checking voicemail. That's three hours per month, per person, spent on non-revenue-producing activity.
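
The arithmetic behind that figure is easy to verify. The short Python sketch below starts from the 45 minutes per week quoted above; treating a month as four weeks and a year as 52 weeks are simplifying assumptions made here for illustration.

```python
MINUTES_PER_WEEK = 45   # figure quoted in the article
WEEKS_PER_MONTH = 4     # assumption: a month treated as four weeks
WEEKS_PER_YEAR = 52     # assumption: no weeks off

# Convert weekly minutes into hours per month and per year.
hours_per_month = MINUTES_PER_WEEK * WEEKS_PER_MONTH / 60
hours_per_year = MINUTES_PER_WEEK * WEEKS_PER_YEAR / 60

print(f"Voicemail time per month: {hours_per_month:.0f} hours")
print(f"Voicemail time per year:  {hours_per_year:.0f} hours")
```

Under those assumptions the yearly total comes to 39 hours per person, just short of a 40-hour workweek.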


Speech to Text (STT) is a business productivity tool. The service allows users to spend less time checking voicemail and more time closing deals. It also enhances CRM because names, numbers and calls are all stored in your email client and are searchable. Being able to respond to time-sensitive voicemail as quickly as possible will increase productivity and revenue.


Mobile phones, home phones, work phones, Internet phones: any phone works with the service. Speech to Text (STT) integrates with all major U.S. carriers and networks, including AT&T, Alltel, Cincinnati Bell, Sprint, Skype, T-Mobile, Verizon, Virgin and more.


Call Diamond Voice to add this service today. You will see and feel the time saved immediately. You won’t want to go back to listening to voicemails ever again!

Copyright © 1997-2019 All rights reserved  Diamond Voice Cloud Phone Services   124 Lofton St.   Cedar Hill,  TX  75104