Search Engines and Algorithms: Optimizing for MSN’s RankNet Technology
The latest major change at search engine giant MSN Search has been the incorporation of “neural networks” into its ranking algorithm, a system its researchers call “RankNet.” The change took place in late June of this year. The algorithm is new, and it is becoming an important consideration for many search optimizers. In this article, Jennifer Sullivan continues her reviews of search engines and their algorithms, this time focusing on MSN's RankNet.
MSN RankNet: What Is It?
RankNet is, in essence, a “learning machine” that takes the patterns of human searches into account and learns from them in order to provide more relevant results the next time around. It starts from a baseline of predictions that are fed into its neural net. Chris Burgess of MSN says, “We take a bunch of data, ‘propagate’ it through the network (basically, take a bunch of weighted sums of the inputs and munch them together), and get values out of the network.”
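To make the “weighted sums” description concrete, here is a minimal sketch of propagating one input through a tiny network. It is an illustration only, not MSN's code: the function name `forward`, the two-unit hidden layer, the tanh squashing step, and every weight and feature value are assumptions chosen for the example.

```python
import math

def forward(inputs, hidden_weights, output_weights):
    """Propagate one input vector through a tiny two-layer network.

    Hypothetical toy network: the layer sizes and the tanh squashing
    function are assumptions, not details published by MSN.
    """
    hidden = []
    for weights in hidden_weights:
        # Each hidden unit takes a weighted sum of all the inputs...
        weighted_sum = sum(w * x for w, x in zip(weights, inputs))
        # ...and squashes it, so the units can be combined non-linearly.
        hidden.append(math.tanh(weighted_sum))
    # The output is one more weighted sum, this time over the hidden units.
    return sum(w * h for w, h in zip(output_weights, hidden))

# Made-up document features and weights, for illustration only.
features = [0.5, 1.0, 0.0]
hidden_weights = [[0.2, -0.1, 0.4], [0.7, 0.3, -0.5]]
output_weights = [1.0, -1.0]

score = forward(features, hidden_weights, output_weights)
print(score)
```

The single number that comes out plays the role of the “values out of the network” in the quote. In a ranking setting, a higher value would be read as a more relevant document.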
These predictions are made with supervised learning, which is defined as “…a machine learning technique for creating a function from training data. The training data consist of pairs of input objects (typically vectors), and desired outputs. The output of the function can be a continuous value (called regression), or can predict a class label of the input object (called classification). The task of the supervised learner is to predict the value of the function for any valid input object after having seen only a small number of training examples (i.e. pairs of input and target output). To achieve this, the learner has to generalize from the presented data to unseen situations in a ‘reasonable’ way.”
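The definition above can be illustrated with the simplest possible supervised learner: a straight line fitted to a few (input, desired output) pairs and then applied to an input it has never seen. This is a sketch of the regression case only. Least-squares line fitting is swapped in here for simplicity and is not how RankNet is trained, and the helper name `fit_line` and the training pairs are invented for the example.

```python
def fit_line(pairs):
    """Learn a function y = slope * x + intercept from (input, output) pairs.

    Ordinary least squares on one input variable; a stand-in for
    supervised learning in general, not RankNet's training procedure.
    """
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    variance = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    # Return the learned function so it can be applied to new inputs.
    return lambda x: slope * x + intercept

# A small number of training examples: pairs of input and target output.
training = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
predict = fit_line(training)

# Generalize to an input that was not in the training data.
unseen = predict(5.0)
print(unseen)
```

The learner never saw the input 5.0, yet it produces a sensible value because it generalized from the pattern in the three training pairs. A neural net does the same job with a far more flexible function.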
MSN uses 569 different generalized properties as part of the input objects of its network to predict the relevancy of a document during this supervised learning, or training. This is NOT the same as saying that there are 569 different factors weighed when determining a specific document’s relevancy to a particular query. Rather, the properties describe how certain features of a document might render it relevant, and the network then builds upon that data.