A Brief History of Voice Recognition Technology

by Michael Saba

Programmers and engineers have made great leaps in the science of voice recognition over the past decade, so you’d be forgiven for thinking that this technology is a relatively new development. Much of the reporting and scholarship around voice recognition tech only focuses on the post-2011 Age of Siri, following the release of Apple’s now-ubiquitous personal assistant.

But there’s a rich secret history to voice recognition tech that stretches back to the mid-20th century, to those early days when rudimentary computers needed to fill an entire warehouse with vacuum tubes and diodes just to crunch a simple equation. And this history not only reveals some interesting trivia about the technology we know and love today, it also points the way towards potential future breakthroughs in the field.

Let’s explore the untold story of voice recognition technology, and see how much progress has been made over the years (and how much has stayed the same).

AUDREY and the Shoebox

In the first half of the 20th century, the U.S. research firm Bell Laboratories (named after Alexander Graham Bell, the inventor of the telephone) racked up a string of impressive technological advances: The invention of radio astronomy (1931), solar batteries (1941), and transistors (1947).

Then in 1952, Bell Labs would mark another groundbreaking technological advancement: The AUDREY System, a set of vacuum-tube circuitry housed in a six-foot-high relay rack that could understand numerical digits spoken into its speaker box. When adapted to a specific speaking voice, AUDREY could accurately interpret more than 97% of digits spoken to it. AUDREY is no doubt primitive by today’s standards, but it laid the groundwork for voice-dialing, a technology that was widely used among toll-line operators. (Remember those?)

Ten years later, IBM unveiled its Shoebox machine at the 1962 World’s Fair in Seattle. Like AUDREY, Shoebox recognized the digits 0 through 9, but its vocabulary extended to 16 words in total. And when Shoebox heard a number combined with a command word (like “plus” or “total”), it would then instruct a linked adding machine to calculate and print the answer to simple arithmetic problems.

Just like that, the world’s first calculator powered by voice recognition was born!

HARPY takes wing

Voice recognition began to take off as a field in the 1970s, thanks in large part to interest and funding from the U.S. Department of Defense and its research agency, DARPA. Running from 1971 to 1976, DARPA’s Speech Understanding Research (SUR) program was one of the largest research initiatives ever undertaken in the field of voice recognition.

SUR ultimately helped create Carnegie Mellon’s “HARPY” voice recognition system, which was capable of processing and understanding more than 1,000 words. HARPY was particularly significant due to its use of “beam search,” a method that keeps only the most promising candidate interpretations at each step instead of exploring every possibility. This made it far more efficient for a machine to work out which words had been spoken and to determine the structure of a spoken sentence.

Indeed, advances in voice recognition have always been closely tied to similar strides in search engine tech: look no further than Google’s current dominance in both fields for proof positive of this fact.
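To make the beam search idea from HARPY's story a little more concrete, here is a minimal Python sketch. It is not HARPY's actual implementation: the candidate words, the probabilities, the `bigram` table, and the `beam_search` function are all made up for illustration. At each step, every surviving partial sentence is extended with each candidate word, scored, and then all but the best few are thrown away.

```python
import math

def beam_search(steps, bigram, beam_width=2):
    """Return (words, log_prob) for the best sentence found.

    steps: one dict per spoken word, mapping candidate word -> acoustic probability.
    bigram: dict mapping (previous_word, word) -> probability of that word pair.
    Both are hypothetical toy inputs, not data from any real recognizer.
    """
    beams = [([], 0.0)]  # each hypothesis: (words so far, log-probability)
    for candidates in steps:
        expanded = []
        for words, score in beams:
            prev = words[-1] if words else "<s>"  # "<s>" marks sentence start
            for word, acoustic_p in candidates.items():
                pair_p = bigram.get((prev, word), 0.01)  # floor for unseen pairs
                new_score = score + math.log(acoustic_p) + math.log(pair_p)
                expanded.append((words + [word], new_score))
        # Prune: keep only the `beam_width` most promising hypotheses.
        expanded.sort(key=lambda hyp: hyp[1], reverse=True)
        beams = expanded[:beam_width]
    return beams[0]

# Toy example: three spoken words, each with two sound-alike candidates.
steps = [
    {"call": 0.6, "tall": 0.4},
    {"the": 0.5, "a": 0.5},
    {"office": 0.7, "orifice": 0.3},
]
bigram = {
    ("<s>", "call"): 0.5, ("<s>", "tall"): 0.1,
    ("call", "the"): 0.6, ("call", "a"): 0.3, ("tall", "the"): 0.1,
    ("the", "office"): 0.5,
}

words, log_prob = beam_search(steps, bigram, beam_width=2)
print(" ".join(words))
```

With a beam width of 2, the search only ever tracks two partial sentences at a time, even though the number of possible sentences doubles with every word. That trade of completeness for speed is the core of the technique.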

From recognition to prediction

By the 1980s, voice recognition tech had begun to advance at an exponential rate, going from simple machines that could understand only dozens or hundreds of spoken words to complex networked machines that could comprehend tens of thousands.

These advances were largely powered by the development of the Hidden Markov Model (HMM), a statistical method that allowed computers to better predict whether a sound corresponds to a word, rather than trying to match the sound’s pattern against a rigid template. In this way, HMM enabled voice recognition machines to greatly expand their vocabulary while also comprehending more conversational speech patterns.

Armed with this technology, voice recognition began to be adopted for commercial use and became increasingly common in several specialized industries. The 1980s is also when voice recognition began to make its way into home consumer electronics, as with Worlds of Wonder’s 1987 “Julie” doll, which could understand basic phrases and reply back. (“Finally, the doll that understands you!”)
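For readers curious about how a Hidden Markov Model makes its predictions, here is a small Python sketch of the Viterbi algorithm, the standard way to find the most likely sequence of hidden states behind a series of observed sounds. Everything in it is a simplified assumption for illustration: the two states ("silence" and "speech"), the "quiet"/"loud" observations, and all of the probabilities are invented, and a real recognizer would model phonemes and words with far larger tables.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state sequence, its probability) under a toy HMM."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               the state that path came from)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the previous state that leads most probably into s.
            prev, prob = max(
                ((p, best[-1][p][0] * trans_p[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            column[s] = (prob * emit_p[s][obs], prev)
        best.append(column)

    # Trace back from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][state][0]
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path, prob

# Hypothetical toy model: was each audio frame silence or speech?
states = ("silence", "speech")
start_p = {"silence": 0.8, "speech": 0.2}
trans_p = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.2, "speech": 0.8},
}
emit_p = {
    "silence": {"quiet": 0.9, "loud": 0.1},
    "speech": {"quiet": 0.3, "loud": 0.7},
}
observations = ("quiet", "loud", "loud", "quiet", "quiet")

path, prob = viterbi(observations, states, start_p, trans_p, emit_p)
print(path)
```

Notice that nothing here is matched against a fixed template. The model weighs how likely each sound is in each state against how likely the states are to follow one another, which is what lets HMM-based systems cope with variation in how people actually speak. (Real systems work with log-probabilities to avoid numerical underflow on long recordings; plain probabilities are used here only to keep the sketch short.)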

Voice recognition goes mainstream

In 1990, we saw the release of the very first consumer-grade voice recognition product: Dragon Dictate, priced at $9,000 (roughly $17,000 in 2017 dollars). Its 1997 successor, Dragon NaturallySpeaking, was the first commercial voice recognition program that could understand natural speech at up to 100 words per minute.

1997 also saw the release of BellSouth’s VAL, the very first “voice portal.” VAL was an interactive system that could respond to questions over the phone, laying the groundwork for the same technology powering the voice-activated menus you hear today when calling your bank or ISP.

But after more than 40 years of steady advances in voice recognition technology, developments in the field stalled out from the mid-1990s through the late 2000s. At the time, voice recognition programs had hit a ceiling of about 80% accuracy in recognizing spoken words, a limit attributed to the HMM-based approach underpinning speech technology.

It wasn’t until 2010 that voice technology began to take off again — this time, in a big way.

Google, Siri, and the voice recognition revolution

Apple’s iPhone had already made waves when it came out in 2007, as tech began to re-orient itself towards an increasingly smartphone-centric and mobile-first future. But with the release of Google’s Voice Search app for the iPhone in 2008, voice recognition technology began to once again make major strides.

In many ways, smartphones turned out to be the ideal proving ground for the new wave of voice recognition technology. Voice was simply an easier and more efficient input method on devices with such small screens and keyboards, which incentivized the development of hands-free technology.

But even more significantly, the design principles Google laid down with Voice Search in 2008 continue to define voice recognition technology to this day: The processing power necessary for voice recognition could be offloaded to Google’s cloud data centers, which could store enormous volumes of human speech samples and accurately match spoken words against them.

Google’s approach was then perfected by Apple in 2011 with the release of Siri, an AI-driven personal assistant technology that likewise relies on cloud computing to predict what you’re saying. In many ways, Siri is a prime example of Apple doing what it does best: Taking existing technology and applying a mirror-sheen of polish to it. Siri’s easy-to-use interface combined with her sparkling ‘personality’ and Apple’s expert marketing of the iPhone helped make the program nearly ubiquitous.

Voice recognition: The next generation

As more and more users opt for mobile devices instead of desktop computers, voice recognition is becoming increasingly central to our day-to-day lives. Soon, personal assistant apps powered by artificial intelligence will be available on every single laptop, tablet and mobile phone, and they’ll all be able to hold up their end of a convincing conversation.

Instead of typing in a search request, smartphone owners can now simply say “Siri, where can I get good pizza nearby?” And rather than having an ad served through a channel like Google Ads (formerly AdWords), human-sounding personal assistants like Alexa and Cortana will soon be able to integrate the sales pitch into a natural conversation instead.

And beyond entertainment or personal use, there have also been many exciting developments around how voice recognition is used in business and commerce. Voice recognition technology powered by AI can now be used to transcribe phone calls, and even predict the outcome of a conversation based on its tone and the words used.

We may not be able to predict exactly what the field will look like in another 10 years, but one thing is certain: Voice recognition will continue to be at the forefront of exciting new developments in consumer tech and marketing.


Copyright © 2011-2022 CallRail, Inc. All rights reserved.