November 19, 2010
By JOSHUA BROCKMAN
Instead of talking on your smart phone, think about talking to it. And don’t forget to tell your tablet what’s on your holiday shopping list.
Advances in voice recognition technology are making all of this possible. The technology has migrated to a number of free apps created by Google and Nuance, the company that owns and sells Dragon NaturallySpeaking software for computers.
Nuance’s apps for the iPhone and iPad (Dragon Dictation and Dragon Search) and BlackBerry (Dragon for E-mail), essentially stripped-down versions of Dragon, transform words into action.
Dragon Dictation allows you to dictate text and then use your fingers to paste the words into e-mail, text messages, Twitter or Facebook. And Dragon Search lets you search the Web with your voice.
Don’t Forget To Buy Bananas
I tested out both apps on an iPod Touch and an iPad. The search app works the same on both. But the dictation app for the iPad allows you to save your dictation as a note and then update it whenever you want. This feature is especially handy for things like shopping or to-do lists.
Nuance’s app for the BlackBerry is directly integrated into e-mail so that you can use your voice to address the e-mail, specify the subject and dictate the text into the body of the message. But you still have to use your fingers to send it on its way.
Google’s New Voice Actions
Nuance does face some competition from Google, which has developed its own voice recognition technology. It’s used on Android mobile devices such as the popular Droid family of phones and Samsung’s new tablet, the Galaxy Tab.
I downloaded Google’s Voice Search App onto a Droid X smart phone on Verizon’s network. Then, I tested out the new “voice actions” commands (this requires Android 2.2) while walking the streets of Washington, D.C., during rush hour.
The voice actions allow you to do everything from listening to music to sending a note to yourself (akin to the shopping list feature described above). You simply press and hold the phone’s search button (you can also click on the microphone icon on the screen) to call up the “speak now” command.
So, how were the results? They were spot on most of the time. But accuracy largely depends on how clearly you speak, whether a foreign word or name is involved and whether you hesitate too much (think before speaking).
A Portable Phonebook, Navigation Tool
One of my favorite features is the ability to call businesses or restaurants by simply saying the name of the establishment and the city, such as “call the Metropolitan Opera in New York City.” The phone leaps into action as if you have a personal concierge at your service.
The same goes for navigation: The command “navigate to the Washington Monument” provides turn-by-turn directions for driving or walking using Google Maps.
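For readers curious about what happens after the speech has been turned into text, here is a rough sketch of how a transcribed phrase could be sorted into an action. It is purely illustrative and is not Google’s implementation: the function name `parse_voice_action`, the pattern list and the fallback to a plain Web search are all assumptions made for this example.

```python
import re

# Hypothetical patterns modeled on the two spoken commands quoted above.
# A real voice-action system would recognize many more phrasings.
PATTERNS = [
    ("call", re.compile(r"^call (?P<target>.+)$", re.IGNORECASE)),
    ("navigate", re.compile(r"^navigate to (?P<target>.+)$", re.IGNORECASE)),
]


def parse_voice_action(transcript):
    """Return an (action, target) pair for an already-transcribed phrase."""
    text = transcript.strip()
    for action, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return action, match.group("target")
    # Assumption: anything that is not a known command is treated as a search.
    return "search", text
```

In this sketch, the phrase “navigate to the Washington Monument” would be split into the action `navigate` and the target `the Washington Monument`, while “pictures of Monet” would fall through to a search.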
Monet At A Moment’s Notice
Google has had a basic voice Web search built into Android phones for some time. But the latest version is much improved. Say “pictures of Monet,” and images of the artist’s work pop up within seconds.
Bill Byrne, who heads Google’s voice interface team, says his group is especially interested in developing more voice search features for the Web for people who don’t speak English. The voice search feature is available now for more than a dozen languages. That’s in addition to seven versions of English based on someone’s accent.
Basic voice Web search is built into Android devices: just click the microphone to the right of the Google search window. It’s also available for other smart phones by downloading the Google Mobile App for the iPhone and BlackBerry.
Dragon Software Advances
Nuance’s apps give you a feel for what its software can do on a computer. But they’re just the tip of the iceberg. That’s because the desktop software actually learns from you and you can personalize it.
I tested the software on a PC laptop. The price of the software remains relatively accessible — it costs about $200 for the premium edition for PCs (it’s also available for Macs).
This summer, Nuance released version 11 of Dragon, and there are some noticeable improvements in accuracy over the previous version. It actually watches for corrections entered on the keyboard and then learns from them. That’s in addition to corrections you can make using your voice.
It also will learn your writing style by searching through documents or e-mails you specify. A number of authors are using voice recognition software to write books. Some, like David Henry Sterry, have turned to the technology because of wrist, hand or finger injuries from relentless typing.
Sterry says he prefers it to typing because it creates a “more direct synaptic connection to my subconscious.”
Custom Voice Commands
One of my favorite new features is a pop-up screen where you can easily access tips and commands. There’s also a nifty tool for creating custom voice commands, which enables you to say a keyword or a phrase and have Dragon insert a passage of text you’ve already dictated or written. This is a handy feature for anything you might normally type over and over again, such as a mailing address, a customized e-mail signature or any type of legal disclosure.
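The idea behind a custom voice command can be pictured as a simple lookup from a spoken phrase to a stored passage. The sketch below is a hypothetical illustration only and does not use Dragon’s actual interface: the names `CUSTOM_COMMANDS` and `expand_dictation`, the trigger phrases and the sample text are all invented for this example.

```python
# Hypothetical trigger phrases and stored passages (placeholders, not real data).
CUSTOM_COMMANDS = {
    "insert my address": "123 Example Street\nSpringfield, XX 00000",
    "insert signature": "Best regards,\nA. Writer",
}


def expand_dictation(phrase, commands=None):
    """Return the stored passage if the phrase is a known command.

    Otherwise return the phrase unchanged, as ordinary dictated text.
    """
    if commands is None:
        commands = CUSTOM_COMMANDS
    # Match commands case-insensitively and ignore stray surrounding spaces.
    key = phrase.strip().lower()
    return commands.get(key, phrase)
```

Here, saying “insert signature” would produce the two-line sign-off, while any phrase that is not a registered command would simply be typed out as spoken.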
Traps
In my conversation with NPR host Robert Siegel, we demonstrated how voice recognition technology is still far from perfect — especially if you’re trying to communicate words in foreign languages.
You also have to be connected to the Internet to be able to use voice recognition apps because the heavy lifting of deciphering what you said is done by servers in the cloud — not directly on the smart phone or device.
Using Dragon NaturallySpeaking effectively also requires you to be in a relatively quiet space. I sometimes run a fan at my desk. And if the microphone is on and I’m not speaking, the program thinks I’m saying the word “him” over and over again and will proceed to fill my screen with that single word, akin to what happened to the villain Carl in the movie Ghost.
It’s easy to turn the microphone off and on with your voice by saying “go to sleep” and “wake up!” But saying these commands just results in a lot of sideways glances from my colleagues, who can only assume that I’m sleep deprived.
© NPR