When it comes to his work, Singhal really is “the human search engine.” He’s been working on unpaid search at Google for more than a decade. He dreams of creating a technology similar to the computer on the starship Enterprise, one that can answer questions such as “How did Alfred Nobel make his money?” At the moment, that’s beyond the capability of modern search engines…but we’re getting closer.
Trial and error plays a huge role in the process. Consider this: Google’s search algorithm simultaneously considers more than 200 factors when ranking results. The company launched more than 500 changes to that algorithm last year alone. But according to Singhal, “Concurrently we have approximately 100 ideas floating around that people are testing – we test thousands in a year. Last year we ran around 20,000 experiments. Clearly they don’t all make it out there but we run the process very scientifically.”
Each algorithm change goes through the same process: build, evaluate, launch, learn, improve, repeat. A change is proposed, tested on a small scale, evaluated, scaled up, tested again, evaluated, and so forth. Vanessa Fox at Search Engine Land described a five-step process:
First, a Google engineer comes up with an idea for a new or adjusted signal that might improve the relevance of the search engine’s results.
Second, developers run the algorithm change on test data. If it runs smoothly, human raters enter the picture in a sort of blind A/B test. They examine search results for a wide range of queries, but aren’t told which result sets were produced before the change and which after it. In their reports, the raters record what percentage of results became more relevant and what percentage became less relevant.
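The blind side-by-side evaluation described in this step can be sketched in a few lines. Everything below (the function names, the rater callback, the verdict labels) is a hypothetical illustration of the general technique, not Google's actual tooling:

```python
import random

def blind_side_by_side(queries, baseline_rank, candidate_rank, rate):
    """Blind side-by-side evaluation of two ranking functions.

    `rate(query, left, right)` stands in for a human judgment: it returns
    "left", "right", or "same" without knowing which ranker produced
    which side. Returns the fraction of queries judged better and worse
    under the candidate ranker.
    """
    better = worse = 0
    for q in queries:
        sides = [("baseline", baseline_rank(q)), ("candidate", candidate_rank(q))]
        random.shuffle(sides)  # blind the rater to which side is which
        (left_name, left_results), (right_name, right_results) = sides
        verdict = rate(q, left_results, right_results)
        if verdict == "same":
            continue
        preferred = left_name if verdict == "left" else right_name
        if preferred == "candidate":
            better += 1
        else:
            worse += 1
    n = len(queries)
    return better / n, worse / n
```

Because the sides are shuffled per query, a rater's preference can't be biased toward "the new version"; aggregating the verdicts yields exactly the better/worse percentages the article describes.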
Third, engineers tweak the algorithm and repeat step two several times to reduce the percentage of queries whose results become less relevant after the change. Google doesn’t move the change on to the next step until the ratings from human raters show that it makes results better overall.
Fourth, the algorithm change goes into wider testing. Google deploys it at one of its many data centers and serves the modified results for perhaps one percent of the queries hitting that center. What do searchers think of the new results? Singhal notes that searcher behavior answers that question: if searchers click on higher-ranked pages more often, the top results are likely more relevant.
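Serving an experiment to a fixed slice of traffic is commonly done by hashing a stable identifier, so the same user (or query) consistently falls in or out of the experiment. The sketch below shows that generic technique; the function name and parameters are invented for illustration and say nothing about how Google actually buckets traffic:

```python
import hashlib

def in_experiment(user_id: str, experiment: str, fraction: float = 0.01) -> bool:
    """Deterministically assign roughly `fraction` of users to an experiment.

    Hashing the user id together with the experiment name gives each user
    a stable, uniformly distributed bucket in [0, 1); users below the
    cutoff see the modified results.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < fraction
```

Keying the hash on the experiment name means different experiments draw independent one-percent slices, so one test's population doesn't systematically overlap another's.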
Finally, Google gets a statistical analysis of the results from an independent analyst. These reports are presented once a week at search quality meetings. At these meetings, engineers examine the data, discuss the effects of the changes, and debate whether or not to roll them out more widely. Does the change improve the quality of search results overall? Is it good for the web? Can the internal systems handle it without becoming overly taxed? In these typically hour-long discussions, participants answer these questions for a number of proposed changes; some will roll out, while others will get held back for more research, development, and improvement. Some ideas might even be shelved indefinitely.
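The article doesn't say what analysis the independent analyst performs, but the standard way to decide whether an experiment arm's click behavior differs meaningfully from control is a two-proportion z-test. The sketch below is that textbook test applied to click-through rates, offered purely as an illustration of the kind of statistical check such a meeting might review:

```python
from math import erf, sqrt

def two_proportion_z(clicks_a: int, n_a: int, clicks_b: int, n_b: int):
    """Two-sided two-proportion z-test on click-through rates.

    Arm A is the control, arm B the experiment. Returns the z statistic
    and a two-sided p-value; a small p-value suggests the difference in
    click-through rate is unlikely to be noise.
    """
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Normal CDF via the error function; two-sided tail probability.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value
```

With, say, 100 clicks in 1,000 control queries versus 150 clicks in 1,000 experiment queries, the test reports a z statistic above 3, i.e. a difference very unlikely to arise by chance.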
It’s worth noting, by the way, that never in this process does Google consider how a change to its unpaid search algorithm will affect its revenue. Indeed, Singhal insisted, in response to a question posed after he finished his keynote address, that “no revenue measurement is included in our evaluation of a rankings change.” For Google, relevance trumps everything.