Google Falls in Love with Television

America’s love affair with television started decades ago, so it should certainly come as no surprise that Google has fallen in love with it too. The search engine has been testing some prototype technology that, if it works as advertised, could change the experience of watching TV as much as the shift from black and white to color. It might even change it as much as the advent of TiVo.

Google’s love affair can be dated back at least to January 2005, when the company launched Google Video (in beta, of course). The service lets users search for TV programs by indexing the closed-caption text that accompanies them. Like YouTube, it also lets users upload their own video.

A number of observers have criticized the service, saying that better video search is available elsewhere. Google’s video store in particular has come in for heavy criticism for its chaotic organization and somewhat cluttered interface, both unusual for a company known for clean design. To be fair, though, it’s not bad if you know exactly what you’re looking for (just don’t depend on the category drop-downs to show you everything).

Google has some very big plans for the future, though. In January 2006, the company purchased dMarc Broadcasting, a firm that created technology to automate the buying and scheduling of radio advertisements. Obviously, this is one way Google can extend its advertising revenue stream; one can reasonably anticipate that Google will integrate this technology with its own AdWords program to let advertisers bid on slots for radio commercials.

Technologically, Google can go further than that, however. Speaking at a luncheon for New York publishing executives a few months ago, Google CEO Eric Schmidt expressed his belief that radio ads could be more personalized than they are now (i.e., designed to appeal to the demographic that likes that particular music). Schmidt envisioned a system that takes into account the listener’s location via GPS, as well as his or her needs – reminding him, for example, that he needs a pair of pants and should turn left at the upcoming clothing store. We don’t have that yet (and thankfully so; I don’t need my radio arguing with me about my sense of fashion, thank you), but these two ideas – automation and personalization – are what Google may be using to take the next step with TV and video.

{mospagebreak title=Using Video Fingerprints}

It starts with a paper titled “Social- and Interactive-Television Applications Based on Real-Time Ambient-Audio Identification,” a 10-page PDF written by Michael Fink, Michele Covell and Shumeet Baluja that you can check out yourself. The latter two are researchers with Google. The paper itself describes “mass personalization, a framework for combining mass media with a highly personalized Web-based experience.”

Mass personalization? What kind of oxymoron is that? Well, if you want a quick summary, you could do worse than to check out the official Google Research blog entry on the paper. The paper was presented at EuroITV, the interactive television conference held in Athens, where it received the best paper award.

The paper describes a way to present relevant information to TV viewers on their computers. A software program uses the computer’s built-in microphone to pick up the sounds in the room, isolates the audio of the TV program, and reduces a five-second snippet to a four-byte digital “fingerprint.” The fingerprint is then sent to an Internet server, which looks for a matching fingerprint from a pre-recorded show.
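The client-to-server flow described above can be sketched in a few lines. This is only an illustration of the data flow, not the paper’s actual algorithm: the names, the example shows, and the use of a hash are all invented here, and a real system would derive noise-robust acoustic features rather than hashing raw samples.

```python
import hashlib

def fingerprint(samples: bytes) -> bytes:
    # Illustrative stand-in: a real system computes robust acoustic
    # features so room noise and volume changes don't break the match;
    # here we just keep four bytes of a hash to show the compact size.
    return hashlib.sha256(samples).digest()[:4]

# Hypothetical server-side index built from pre-recorded broadcasts:
# fingerprint -> (show, episode, offset into the program in seconds).
broadcast_index = {
    fingerprint(b"snippet-A"): ("Seinfeld", "S04E11", 303.0),
    fingerprint(b"snippet-B"): ("Survivor", "S13E02", 120.0),
}

def identify(snippet: bytes):
    # The client sends only the tiny fingerprint, never the audio itself,
    # which is the basis of the researchers' privacy claim.
    return broadcast_index.get(fingerprint(snippet))
```

Note that the lookup returns nothing for unrecognized audio, such as conversation in the room, since no broadcast fingerprint will match it.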

If it does find a match, the software enables the computer to display relevant content. This could be almost anything. Want to chat with other people watching the same program? It could create a chat room on the fly. Want to buy a dress like the one Nicole Kidman is wearing? Ads could automatically pop up to tell you where you can get it. Want more information about where that “Survivor” episode is taking place? Your computer can automatically display maps to show you. 
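Once a match comes back, choosing what to display reduces to a lookup keyed on the identified program and moment. A minimal sketch, using the article’s own examples (the offsets and content entries are invented):

```python
# Hypothetical table of companion content keyed by (show, offset in seconds).
companion_content = {
    ("Survivor", 120): [("map", "Filming location for this episode")],
    ("Oscars", 45): [("ad", "Where to buy that dress"),
                     ("chat", "Room for viewers watching live")],
}

def content_for(show: str, offset: int):
    # No entry registered for this moment simply means nothing pops up.
    return companion_content.get((show, offset), [])
```

Channel surfing falls out naturally: each new match re-keys the lookup, so the displayed content follows the viewer from program to program.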

Michael Fink, lead researcher for the project, says that this wasn’t done with a commercial service immediately in mind. “We weren’t really pitching an application that we want to do here and now, but rather a concept,” he explained. “We wanted to open people’s minds to the possibility of using ambient audio as a medium for querying web content.”

Indeed, according to the Google Research blog, “all of this would be done without users ever having to type or even know the name of the program or channel being viewed.” It’s hard to imagine, but yes, this system could even keep up with channel surfers. It could, potentially, add a whole new dimension to television viewing.

{mospagebreak title=Possible Obstacles}

It could also raise the specter of privacy violations. After AOL’s release of 20 million uncensored searches from more than 600,000 users, this is clearly a sore spot. But the researchers claim that the fingerprints can’t be reversed – that is, the technology can’t be used to pick up private conversations in the room. That fact might end up being less important than user perception, however. If people believe the service can be used for “spying,” they’re not going to use it.

Another problem with turning this technology into a product is the sheer amount of data. How many hundreds of hours of programming will need to be recorded and analyzed before you can match up fingerprints correctly at least most of the time? When the researchers tested the prototype, they used a database server with fingerprints from about 100 hours of video. The prototype did fairly well in tests, making mistakes in matches (and thus displaying irrelevant data) at most six percent of the time.

Still, 100 hours of video is not going to be enough for a “beta” service with a wide release. Google has historically managed mind-boggling mountains of data with aplomb; still, video is a pretty complicated beast. Wade Roush of Technology Review noted that matching relevant ads to programming “would only work if someone first manually notated what is onscreen at any given moment in a broadcast. With the volume of TV programming broadcast every day, that would be a tedious job.”

One more possible obstacle to this service comes down to a simple question: how many people watch TV and surf the Internet at the same time? TV watching has historically been a passive activity; there’s a reason for the nickname “couch potato,” after all. That seems to be changing with younger generations, who have become accustomed to using technology to multitask. Even so, how many people who do watch TV and surf the web at the same time would be interested in a service that synchronizes what they’re watching with what they’re surfing?

{mospagebreak title=Possible Opportunities}

If it does become a service – and Peter Norvig, director of research at Google, has said that the company’s work on audio and video processes will show up eventually in real products – Google will be betting that a lot of people will want this kind of experience. It will also be betting that a lot of advertisers will be interested in reaching those people. From the consumer’s point of view, it would be quite different from commercials that interrupt the program; instead of zapping the advertisements, as with TiVo, you could simply ignore them. Something less intrusive is usually considered less obnoxious – and if it’s actually relevant, a viewer might be willing to pay attention.

It would also be very different from the advertisers’ point of view. Commercials are usually squeezed into specific points in a program, and, as mentioned, they interrupt the action. There are a limited number of slots. With this system, however, relevant content could crop up on the web at literally any moment in a program. “Say I’m an advertiser, and I would like a link to my website to appear with a specific episode of Seinfeld,” Fink explains. “We could open each moment of audio to a bidding process. The Google model of advertisers bidding for related words on Web pages, which has proved to be very successful online, could be carried over.”
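The auction Fink alludes to might work much like AdWords does for keywords. As a rough sketch – the advertiser names and bids are invented, and real AdWords uses a more elaborate generalized second-price mechanism – each moment of audio could be sold to the highest bidder at the runner-up’s price:

```python
def run_auction(bids):
    """Simplified second-price auction for one moment of broadcast audio.
    bids: dict mapping advertiser name -> bid in cents.
    Returns (winner, price_paid), or (None, 0) if nobody bid."""
    if not bids:
        return (None, 0)
    # Rank bidders from highest to lowest bid.
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, top_bid = ranked[0]
    # Winner pays the second-highest bid (or their own if unopposed).
    price = ranked[1][1] if len(ranked) > 1 else top_bid
    return (winner, price)

# Three hypothetical advertisers competing for one Seinfeld moment:
print(run_auction({"acme": 50, "globex": 75, "initech": 60}))
# -> ('globex', 60)
```

The second-price rule is what keeps such auctions honest: bidding your true value is the safe strategy, since the price you pay is set by the competition rather than by your own bid.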

Again, though, it works best if enough users are interested in getting relevant content in this way. And even though they won’t be giving up their privacy by revealing their living room conversations, they will be giving up some information: their TV viewing habits. Whether Google will do anything with this information beyond serving ads and creating social connections with others watching the same thing (and whether users will be comfortable with it) is another question. We all get to stay tuned in the coming months for the answer.
