A Breakthrough in Human-Computer Interaction

It’s no secret I believe speech is the next input mechanism. We are in the Voice-to-Text Era. I wrote in late 2014 that speech is the fastest user interface, and the newest speech recognition experiments confirm it.

Andrew Ng, a luminary in the world of machine learning, and his teammates at Baidu, Stanford and the University of Washington have developed Deep Speech 2, a neural-network-based speech recognition system. They tested the speed and accuracy of the system and compared it to people typing on their mobile phones.

The results were clear no matter the language. For English, speech recognition was three times faster than typing, and the error rate was 20.4 percent lower. In Mandarin Chinese, speech was 2.8 times faster, with an error rate 63.4 percent lower than typing.

Stanford News

There are many limitations of speech. It’s not always a convenient input mechanism; speaking to your phone in the middle of a meeting is bound to derail it. And speech also faces the uphill battle of changing societal norms. Saying “send email to Tomer hi comma how is the pilot going question mark” will likely arouse the same feelings of judgemental hostility to geekdom in passersby as the Bluetooth headset five years ago.

Despite these frictions, speech is much faster and far more accurate than typing, irrespective of language. This speed advantage will make speech the primary form of input to computers, initially on mobile phones, but ultimately on laptops.

The ramifications are broad. We will redesign offices to contain the persistent murmurs of people speaking to their machines. Natural language understanding, the science of computers deciphering our meaning, will become critically important for major internet and software companies to master. Users and buyers of software will add speech recognition to their buying criteria.

Ultimately, we will all be more productive for it. The best tools are the ones we don’t recognize we’re using because they are extensions of our bodies. Learn to use a fork to eat, and quickly, spearing a morsel on tines becomes as natural as raising a peach to your lips.

The QWERTY keyboard, a relic of an era when the typewriter needed to slow the typist to prevent hammers from criss-crossing and jamming, will soon be a curiosity. Speech will replace it, and transform human-computer interaction in the process.

[Tomasz Tunguz]

October 6, 2016
