Google Granted Voice Search Patent - Technical and Other Challenges
Getting a voice-activated search interface to work well enough for people to use is no piece of cake. First, most search queries are short: usually five or six words at most, and often more like two or three, which gives a speech recognizer very little context to work with. That would be fine if you could match those words against a limited “vocabulary”; indeed, that’s why so many phone trees designed to recognize voice input work as well as they do. Second, a search engine has to recognize a huge vocabulary; as Franz and Milch note, “Even a vocabulary of 100,000 words covers only about 80% of the query traffic.”
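To make the coverage figure concrete, here is a minimal Python sketch of how such a number could be measured. It assumes "coverage" means the share of word occurrences in a query log that fall inside the N most frequent words; the function name `vocabulary_coverage` and the toy queries are hypothetical and do not come from Franz and Milch or from Google.

```python
from collections import Counter


def vocabulary_coverage(queries, vocab_size):
    """Return the fraction of word occurrences in `queries` that belong
    to the `vocab_size` most frequent words (assumed definition)."""
    # Count every word occurrence across all queries, case-insensitively.
    counts = Counter(word for q in queries for word in q.lower().split())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum the occurrences of the most frequent words only.
    covered = sum(count for _, count in counts.most_common(vocab_size))
    return covered / total


# Hypothetical toy query log, purely for illustration.
sample_queries = [
    "cheap flights",
    "cheap hotels paris",
    "weather paris",
    "zyxel router firmware",
]

print(vocabulary_coverage(sample_queries, 2))
```

In a real query log the rare words (product names, surnames, misspellings) form a long tail, which is why even a very large vocabulary can leave a sizable share of traffic uncovered.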
Even those two problems wouldn’t be so bad, but remember that a voice-activated search interface needs to deliver results in real time. Web surfers are used to seeing results from a search engine within fractions of a second of hitting “Submit”; indeed, Google’s own website always tells you how quickly it conducted your search, like a badge of pride. Anyone using the interface who is told “One moment please” and has to wait more than a second or two is going to feel like they’re being put on hold. I don’t know about you, but I get put on hold too frequently as it is; I don’t want that kind of frustration from my search engine.
Then of course there’s the obvious challenge of interpreting what the speaker said, even if the speaker has an unusual accent, or mumbles, or is in a noisy environment, or has a speech impediment, or…you get the picture. Text is much easier; even with misspellings, there are a limited number of possibilities (and the “Did you mean…?” clickable sentence that sometimes pops up when you search Google often takes care of that). If people from different parts of the same country have trouble understanding each other, what hope does Google have of doing better?
Leaving aside the technical challenges, it’s worth mentioning that Google is not alone in this field. VoiceSignal is a company that converts voice to text. I’ve already mentioned V-ENABLE; Promptu is another company that is working on voice search for mobile applications. AgileTV makes voice recognition software to search television. Microsoft may have been working on something like this as well. Remember the brouhaha over Kai-Fu Lee leaving the software giant for Google? That might have had less to do with opening China and more to do with the fact that his area of expertise is natural language.
More By Terri Wells