To a very limited extent, it’s already happening. But if former Bing product lead Mark Johnson is right, it’s going to be a long time before we see much more progress in that area. “If every user that comes [to Bing or Google] is getting a personalized experience based on Facebook data, based on the web graph, based on the social graph – holy crap, that’s a lot of processes to do,” Johnson notes. He’s quoted by Austin Carr in an article for Fast Company.
To Johnson, it’s a matter of economics. Interpreting all of those social signals would involve huge server costs. Neither Bing nor Google would be willing to pay those costs without seeing “significant increases in quality of search results,” according to Johnson. Server costs would probably need to drop pretty dramatically before the economics of adding social signals in a more personalized way to search started making sense.
Here’s the problem: search engines typically try to precache queries. PC Mag defines precaching as downloading data ahead of time in anticipation of its use. “For example, when a Web page is retrieved, the pages that users typically jump to when they leave that page might be precached in anticipation,” PC Mag explains. When a user sends a query that the search engine has seen before, the engine retrieves at least some precached results instead of computing them again, for the sake of speed and conservation of resources.
But imagine what would happen if search engines had to take both your personalized search profile and your social graph into consideration. Since many people post regularly to Facebook, Twitter, and other social sites, Bing and Google wouldn’t be able to just retrieve the same results they’ve used before. To deliver the best results, they’d have to recalculate everything every time someone searched. As Johnson notes, “if you have to go to the server to calculate every single query, your front-end costs go up. If Bing or Google sees a query that it’s never seen before, and had to actually calculate a result on the fly, we’re talking about using time on hundreds of thousands of servers.”
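To make the caching problem concrete, here is a minimal sketch in Python. It is only an illustration, not how Bing or Google actually cache results: the traffic pattern, the user names, and the `hit_rate` helper are all hypothetical. It compares a cache keyed on the query text alone with one keyed on the user and the query together, which is roughly what fully personalized results would require.

```python
from itertools import product


def hit_rate(requests, key_fn):
    """Fraction of requests answered from a cache keyed by key_fn(request)."""
    seen = set()
    hits = 0
    for request in requests:
        key = key_fn(request)
        if key in seen:
            hits += 1  # result was already computed for this key
        else:
            seen.add(key)  # first sighting: compute the result and store it
    return hits / len(requests)


# Hypothetical traffic: 100 users each issue the same 5 popular queries.
users = [f"user{i}" for i in range(100)]
queries = ["weather", "news", "pizza near me", "flights", "lamb tartare"]
requests = list(product(users, queries))  # (user, query) pairs

# Cache keyed on the query text only: every user shares the same entry.
shared = hit_rate(requests, key_fn=lambda r: r[1])

# Cache keyed on (user, query): each user needs a separate entry.
personalized = hit_rate(requests, key_fn=lambda r: r)

print(f"shared cache hit rate:       {shared:.2f}")
print(f"personalized cache hit rate: {personalized:.2f}")
```

In this toy traffic, the shared cache answers all but the first occurrence of each query, while the per-user cache never gets a hit, because no user repeats a query. Real traffic is messier, but the direction of the effect is the point Johnson is making.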
While Google does see its fair share of queries it has never seen before, the percentage of new queries is not as high as you might think. According to Google’s internal data, “We’ve never seen 16% of the queries we see every day.” Because of the economics involved, it’s fair to assume that Google uses precaching to serve results for the other 84 percent of the queries it receives, and it’s likely Bing does the same thing.
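A back-of-the-envelope calculation shows why that 16/84 split matters. The sketch below is in Python, and its cost figures are pure assumptions chosen for illustration (a fresh computation is taken to cost 100 units and a cached lookup 1 unit); only the 16 percent figure comes from the Google quote above.

```python
def expected_cost(miss_rate, cost_compute, cost_cached):
    """Average cost per query when miss_rate of queries must be computed fresh."""
    return miss_rate * cost_compute + (1 - miss_rate) * cost_cached


# Hypothetical relative costs, not measured figures.
COST_COMPUTE = 100.0  # computing a result on the fly
COST_CACHED = 1.0     # serving a precached result

# 16% of queries are new, the rest can be served from cache.
with_cache = expected_cost(0.16, COST_COMPUTE, COST_CACHED)

# Fully personalized results: every query is effectively new.
no_cache = expected_cost(1.0, COST_COMPUTE, COST_CACHED)

print(f"average cost with caching:    {with_cache:.2f}")
print(f"average cost without caching: {no_cache:.2f}")
print(f"cost multiplier:              {no_cache / with_cache:.1f}x")
```

Under these assumed numbers, losing the cache makes the average query roughly six times as expensive. The real multiplier depends on cost ratios that only the search engines know, but any large gap between computing and retrieving a result pushes it in the same direction.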
So how are Google and Microsoft proceeding? They both think that the search industry must incorporate social factors into results in the future. But so far they can only accomplish this at a basic level, by annotating results. For example, if I search on Bing or Google and one of my friends has shared something relevant to my search, I might see an indication of this next to the link. This is why getting shared on Twitter or “liked” on Facebook is becoming more important.
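Annotation is cheap by comparison because it can be layered on top of results that were already retrieved. The Python sketch below shows the idea under stated assumptions: the `annotate` function, the result dictionaries, and the `friend_shares` lookup are hypothetical names invented for this example, not anything Bing or Google has described.

```python
def annotate(results, friend_shares):
    """Return copies of the results, each tagged with friends who shared its URL.

    The order of the results is left untouched, so the same cached
    ranking can be reused for every user.
    """
    return [
        {**result, "shared_by": sorted(friend_shares.get(result["url"], []))}
        for result in results
    ]


# Hypothetical cached results for a query, identical for all users.
cached_results = [
    {"url": "https://example.com/a", "title": "Page A"},
    {"url": "https://example.com/b", "title": "Page B"},
]

# Hypothetical per-user lookup: which friends shared which URLs.
friend_shares = {"https://example.com/b": {"Dana", "Alex"}}

for result in annotate(cached_results, friend_shares):
    note = ", ".join(result["shared_by"]) or "no friends"
    print(f'{result["title"]}: shared by {note}')
```

The per-user work here is a single lookup per result, and the ranking itself is never recomputed. That is a very different cost profile from recalculating the whole result set for each searcher.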
Bing director Stefan Weitz hopes to bring even more social signals into search. As he told Fast Company back in May, “There are more signals than just ‘Likes.’ There are tweets, check-ins – when I’m at Spur restaurant in Seattle, and I say it’s the best lamb tartare and post that on Yelp, that’s a signal as well. There’s a world where all these social and personal signals – whatever you want to call them – are consumed and indexed and made sense of.”
It’s not getting the social data itself that’s the problem, though; it’s sifting through it, or consuming and indexing and making sense of it, to use Weitz’s words. As Johnson observes, “Having data alone does not give you an analysis of that data. I think that’s a problem with a lot of advertisers – sure, they’re sitting on mounds of data. But they don’t know what to do with it.” And when you consider that whatever you do with it must be cost-effective and beneficial to someone who’s not willing to wait very long for an answer, well, you have an overwhelming problem on your hands.
And while Microsoft and Google are the only companies that have the resources and skills to potentially mine the social data and add the information gleaned from it to search results, Johnson doesn’t think this is going to happen soon. It isn’t simply a matter of tweaking the algorithms, and even if it were, “Tweaking the algorithms is not simple!” Johnson emphasized. “The outside world makes it seem like it’s so easy to just ‘bubble it up to the top.’”
Johnson knows whereof he speaks. Not only is he the former product lead on Bing, but he also has a number of search startups under his belt. These include SideStep (acquired by Kayak for $180 million), Kosmix (bought by Wal-Mart for $300 million) and Powerset, a search engine purchased by Microsoft in 2008 for $100 million. Johnson’s latest venture, Zite, focuses on content delivery on mobile devices.
And yet, back in May, Weitz told Fast Company that personalized social search was “not 10 years, not five years away, it’s a couple years away–tops–where social is literally so imbued into the experience that it’s just another ranking factor like anything else.” Time alone will tell who is right, but given how differently many people use social sites, I’m inclined to think the problem isn’t as easy to solve as Weitz seems to be implying. Of course, if it really does lead to better search results, I wouldn’t mind being proven wrong.