12.20.07
Watch Google And Speech Technology
By Barry Welford
Phonemes wanted - talk to Google...
If you want confirmation that speech technology is the next big technical and economic opportunity, then keep an eye on Google.
This year Google encouraged the formation of the Open Handset Alliance, a move that undermines the walled gardens created by the existing telecom companies. The result is a more level and competitive playing field.
It is interesting to see how Google is now developing its own stake in what will be a highly profitable marketplace. In an interview (Google wants your phonemes), Marissa Mayer, Google's vice president of Search Products & User Experience, revealed one part of the effort:
You may have heard about our [directory assistance] 1-800-GOOG-411 service. The reason we really did it is because we need to build a great speech-to-text model.
The speech recognition experts that we have say: If you want us to build a really robust speech model, we need a lot of phonemes, which is a syllable as spoken by a particular voice with a particular intonation. So we need a lot of people talking, saying things so that we can ultimately train off of that. … So 1-800-GOOG-411 is about that: Getting a bunch of different speech samples so that when you call up, we can (understand) with high accuracy.
This approach is adopted because Google Is All About Large Amounts of Data. Peter Norvig, director of research at Google, believes the following:
The way to get better understanding of text is through statistics rather than through handcrafted grammars and lexicons. The statistical approach is cheaper, faster, more robust, easier to internationalize, and so far more effective.
We wanted speech technology that could serve as an interface for phones and also index audio text. After looking at the existing technology, we decided to build our own. We thought that, having the data and computational resources that we do, we could help advance the field. Currently, we are up to state-of-the-art with what we built on our own, and we have the computational infrastructure to improve further. As we get more data from more interaction with users and from uploaded videos, our systems will improve because the data trains the algorithms over time.
Google is certainly in a privileged position to gain access to large amounts of data that can be used to improve other services. However, it seems somewhat paradoxical to be using number crunching to better understand language and speech.
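To make the number-crunching idea a little more concrete, here is a toy sketch in Python. It is not Google's system and says nothing about how their speech models actually work; it is a minimal bigram counter, with made-up directory-assistance style requests as training data, that guesses the next word purely from how often word pairs have been seen. The function names and the sample phrases are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows another across the training sentences."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical requests, invented for this example (not real GOOG-411 data).
small_corpus = ["pizza in montreal"]
large_corpus = small_corpus + [
    "pizza near me",
    "pizza near downtown",
    "coffee near me",
]

small_model = train_bigrams(small_corpus)
large_model = train_bigrams(large_corpus)

# With little data the model knows almost nothing about "coffee";
# with more data it can make a guess, with no grammar rules written by hand.
print(most_likely_next(small_model, "coffee"))
print(most_likely_next(large_model, "coffee"))
print(most_likely_next(large_model, "pizza"))
```

Nothing in the sketch encodes grammar or meaning. Its guesses change only because the counts change as more requests are added, which is the sense in which more data from more users can improve a statistical system.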
Others take a different view. For example, Powerset is building a consumer search engine based on breakthrough natural language processing technology licensed from PARC and developed internally. The search engine aims to leverage the structure and nuances of natural language to ultimately transform the way humans interact with computers.
It will be interesting to see which approach wins out.
Related: Can You Hear The Future?
About the Author: Barry Welford, President of SMM Strategic Marketing Montreal, works with business owners and senior management on Internet Marketing strategy and action plans to grow their companies. He is a moderator at the Cre8asite Forums and writes on current issues on the Internet and on the Mobile Web in three blogs: BPWrap, StayGoLinks and The Other Bloke's Blog.