Will Microsoft’s Tellme Bring Us Closer to True Voice Recognition Technology?

“Siri, do you like movies about gladiators?”

Voice recognition, in theory, is a remarkable innovation. In my experience, and probably the experience of frustrated millions across the world, the reality of how it’s been used so far has been (to put it kindly) disappointing. We’re expecting hands-free, cutting-edge interaction between human being and computer, as seen in Star Trek and Buck Rogers in the 25th Century, but what we’re served is usually closer to a concoction of Red Dwarf’s deranged Holly, 2001: A Space Odyssey’s HAL 9000, and a broken electric pencil sharpener; ask, and ye shall not receive exactly what you were expecting. One needn’t go far to witness how inept computers are at discerning the nuances of human speech; being Scottish isn’t necessary to get truly dismal results, but it sure helps (apologies to LockerGnome’s John McKinlay, who probably has his own devious ways to outwit smart-aleck elevators in spite of being a son of Caledonia).

A few problems faced by voice recognition developers go beyond the obvious fluctuations in human dialogue and dialects. For instance, ambient noise that a microphone may pick up isn’t always easy to filter out of the equation. When a human being is trying to pick out a conversation in a crowded room, he or she will usually have the benefit of visual and other sensory cues to aid in the filtering process. Most computers in 2011 can’t claim this advantage, so voice recognition developers rely on mathematical methods to ascertain which incoming signals are legitimately intended for interpretation, and which ones are just noise.
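To make the "signal versus noise" idea a little more concrete, here is a minimal sketch of one very simple mathematical method: flagging chunks of audio whose energy stands well above the background level. This is only an illustration and not how Kinect or Tellme actually works; the function names (`frame_energy`, `detect_speech_frames`), the frame size, and the threshold ratio are all assumptions made up for this example.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of audio samples."""
    return sum(s * s for s in frame) / len(frame)


def detect_speech_frames(samples, frame_size=160, ratio=4.0):
    """Return one boolean per frame: True if the frame looks like speech.

    Assumption (hypothetical, for illustration only): the quietest frame
    contains nothing but background noise, so its energy is used as the
    noise floor. A frame is flagged when its energy exceeds `ratio` times
    that floor.
    """
    # Split the signal into non-overlapping, full-length frames.
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples) - frame_size + 1, frame_size)]
    if not frames:
        return []
    energies = [frame_energy(f) for f in frames]
    noise_floor = min(energies)
    return [e > ratio * noise_floor for e in energies]
```

Feeding it a quiet stretch, a loud stretch, and another quiet stretch should flag only the middle frame. A busy street defeats a scheme this naive almost immediately, since traffic is loud too, which is part of why real systems lean on far more sophisticated filtering.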

Microsoft’s Kinect for its Xbox 360 gaming and media console is probably the most successful contender in this round of voice recognition efforts; it goes further than most other such systems and can integrate visual cues into its repertoire for a more complete, hands-free user interface. Microsoft has announced that, while the original Kinect unit surpassed both the company’s and consumers’ hopes and expectations, new developments in its Tellme technology (which also powers Windows Phone 7 and other devices) will expand dramatically upon current voice recognition potential.

“We are laying a foundation that will transform how people interact with devices,” says Thomas Soemo, Microsoft’s principal program manager lead for the Xbox platform. “We are at that cusp. With Kinect, we’ve put speech into the living room. Now, Microsoft will continue to push the boundaries of NUIs [Natural User Interfaces] to enable seamless experiences that span devices and platforms.”

Adds Keith Herold, a senior Tellme program manager lead: “What are the most amazing experiences with speech we can imagine? Can we create technology that is as natural as talking to a friend? This is where we want to go, and it’s happening in front of our eyes.”

What are your experiences with voice recognition technology? Do you see Microsoft getting closer to perfecting it with Tellme? I own a Kinect and I think it’s an overall cool idea, but I see room for improvement. Then again, I live on a busy street, so I admit that the ambient noise we were talking about earlier makes voice recognition technology a lot more challenging for such a device. What improvements, if any, would you suggest to Microsoft’s Tellme team if you could bend some ears there?

Article Written by

Our resident "Bob" (pictured here through the lens of photographer Jason DeFillippo) is in love with a woman who talks to animals. He has a fondness for belting out songs about seafaring and whiskey (arguably inappropriate in most social situations). He's arm-wrestled robots and won. He was born in a lighthouse on the storm-tossed shores of an island that has since been washed away and forgotten, so he's technically a citizen of nowhere. He's never killed in anger. He once underwent therapy for having an alien in his face, but he assures us that he's now feeling "much better." Fogarty also claims that he was once marooned along a tiny archipelago and survived for months using only his wits and a machete, but we find that a little hard to believe.

  • Bwillwall

    In my opinion it is impressive how Kinect can pick up commands from so far away and with just the cue word “XBOX”. However, Siri is so much more impressive to me, because it will actually listen for any word and try to discern what to do with it. And it does a very good job, and will improve in the future as it is still in beta. Siri nearly always gets my words right, though it could just be my voice, because not everyone seems to think so. So even if it might appear that this is just Microsoft following Apple like they do for everything, this does have very good potential to begin taking advantage of sources other than just the mic for voice recognition.

  • John

    I have a hearing impairment, so my voice and the phone do not always like each other. I always prefer to use the numbers if possible, and normally even if they don’t tell you you can, you can. Usually 1 is the default choice (yes, 2 is no), etc. Speaking of Tellme, does anyone have any alternative phone thing for 1-800-555TELL? It’s going away June 1st. I want something for phones. Thanks…