The Emerging Importance of Behavioral Data in Rankings

As SEOs, we watch the search engines carefully for any indication of what factors they consider important when it comes to ranking a website in their results. When those factors change, we change our approach. That was especially true for links. Now Google seems to be using something even more telling than links for ranking websites: user behavior.

Links are a natural part of the web. They tie everything together in a logical and easy to follow manner. They hold the web together and make the Internet usable.

With the release of PageRank, links took on another form. Instead of real votes for websites, links gained search ranking value. The more one has, the higher one ranks. It doesn’t matter much if the link is designed for visitors to follow (the real purpose of the link), as long as search engines count it in the algorithm.

The fact that Google, and later other search engines, started using links as a ranking factor changed the face of the web and the true nature of links.

Now as we move closer to the new decade (already?!), search engines are incorporating another factor into their ranking algorithm – user behavior data. Search giants (Google in particular) have collected, and continue to collect, enough data to start using it as a ranking factor.

In this article I discuss the ways in which Google tracks users, signs of changes in the search engine optimization industry, and the incorporation of the third (or more correctly fourth) ranking indicator.

Google is the biggest spyware you will ever encounter. Webmasters and computer geeks freak out at the mention of spyware, viruses and Trojans, and think building walls of anti-virus software will protect them. Yet Google tracks and records more than any hacker can dream of.

Google AdSense – AdSense tracks millions of users every day, who make billions of clicks all across the net. With AdSense code, Google has the equivalent of analytics installed on every website that participates in the program, but that analytics data is accessible only to Google. It can track anything from screen resolution, Flash version and browser to clicks, links, referrals, and much more. And it can do so without any fear of lawsuits, because it is done to “battle” click fraud. Of course the information is indeed used to battle click fraud, but I am sure Google finds many more uses for the data.

Google Analytics – With AdSense, Google can track informational websites. Analytics lets it track sales-based websites. Without the analytics data, Google has no window into non-informational sites, which is a big loss. The real goal of the analytics software is not to help small webmasters, but to collect data. On top of usage information, Google Analytics also tracks conversion rates, so theoretically they can adjust PPC costs for vendors based on their site conversion rates and revenues.

Google Cookies – Google puts a special cookie on all computers that visit Google’s search engine. This is not a regular cookie, but a super smart one. It is both very beneficial and super evil. On the good side, the Google cookie allows you to save preferences, such as number of search results per page, language, filters and other options. On the evil side, the cookie has a unique identifier (one for you, one for me and one for each of the world’s other six billion people) and tracks all searches, all visited sites, the amount of time you spent on those sites, the keywords you used and more. Just imagine someone standing behind your shoulder and looking at what you do… for years…

Google Checkout – This is Google’s move to identify the real people behind the computer screens. It is a way to connect user cookies and other identifiers to a real name, with a real address and credit card number.

Google Toolbar – Every page you access gets sent to Google. It may also be tracking not just web pages, but your entire browsing behavior.

Google Chrome – Chrome is a very interesting browser. Google’s move into the browser market is not without a reason. They are already spending around $70 million per year on Firefox. Why do they need another browser? That $70 million per year is not much for Google, so they’re not trying to save money. There’s much more to it.

Chrome has a unique identifier, which is created by default. With it, Google knows it is you, no matter where you access the Internet. Another tracking feature is the typing tracker. Whatever you put into the address bar gets transmitted to Google, whether you actually go there or not.

So one of the big reasons for Google Chrome is collecting user data. In fact, Microsoft’s Internet Explorer 8 will transmit user data back to Microsoft, just like Chrome. The big advantage that Microsoft has is their dominant market share, as many more people use IE than Firefox and Chrome.

Gmail – Care to share your email with the big G? Then start using Gmail! :-) They read email in order to show relevant advertisements, but it can get a lot deeper.

Google Desktop – I am personally extremely paranoid about this tool. Google scans your computer and lets you search it with their technology. It also transmits whatever it finds on the hard drive back to Google’s mother lodge in Mountain View.

Google Maps – Have you ever used Google Maps to get directions from your home address to, let’s say, a restaurant or a club? If you use the same computer (thus the same tracking cookie) and look for directions several times, Google can automatically assume that this is your address. Why? Because you ask for directions starting with the same departure place, so it must be your home/work or some place where you hang out a lot.

Search Wiki – Google’s addition of a search wiki that lets you reorganize search results is nothing other than an attempt to track user reactions to search results and collect that feedback en masse.

Now imagine all this data in one place. If you’ve been using Google for several years, it knows an awful lot about you, your interests, habits, friends, political interests, and possibly darkest secrets.

Moving on, the point was not to scare you (though you should be suspicious), but to show you that user data is being collected on an immense scale, and the question is, how does Google use that data? We know it can sell it, but can it be used in search algorithms? The answer is, it already is, and it is going to be used more and more.

The first generation of search engines looked at on-page factors. It worked for some time, but failed in the long run due to abuse by search engine marketers.

The second generation of search engines started looking at links as an indicator of importance and relevancy. It worked, and continues to work, but like on-page factors before it, it is now being heavily abused.

In earlier times on the web there were no link farms, no link buying and no link directories. Links were real votes by real people. But that’s not entirely true nowadays.

As links became an important factor, SEOs came up with, and continue to come up with, ways to get around the system. For instance, link buying is hard to detect, depending on how it is done, and Google is having a lot of trouble with it. There are site networks, farms, etc – all designed for one purpose – links.

Links are becoming a less trustworthy indicator, due to mass abuse. This means that search engines can’t rely on links to the same extent they have in the past if they want to continue to deliver relevant results to their users.

We are moving into another generation of search engines, where behavioral data plays a much more important role. This means that bounce rate, page views, brand search and other indicators can be the deciding factor in taking the top spot.
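To make those indicators concrete, here is a minimal sketch of how bounce rate and pages per session could be computed from visit data. The session log, its format and the function name are made up for illustration; nothing here reflects how Google actually measures or weighs these numbers.

```python
# Hypothetical session log: session id -> ordered list of pages viewed.
# The data and the function name are assumptions for illustration only.
def engagement_metrics(sessions):
    """Return (bounce_rate, pages_per_session) for a dict of session -> pages."""
    total = len(sessions)
    if total == 0:
        return 0.0, 0.0
    # A "bounce" is counted here as a session with exactly one page view.
    bounces = sum(1 for pages in sessions.values() if len(pages) == 1)
    page_views = sum(len(pages) for pages in sessions.values())
    return bounces / total, page_views / total


sessions = {
    "s1": ["/home"],
    "s2": ["/home", "/pricing", "/signup"],
    "s3": ["/blog/post"],
    "s4": ["/home", "/blog/post"],
}

bounce_rate, pages_per_session = engagement_metrics(sessions)
print(f"bounce rate: {bounce_rate:.0%}, pages per session: {pages_per_session:.2f}")
# prints "bounce rate: 50%, pages per session: 1.75"
```

Two of the four sample sessions end after a single page, so half the visits bounce. Getting those visitors to a second page is what moves both numbers.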

In his blog post on brand shift, Aaron Wall mentions that a piece of linkbait with tens of thousands of visitors lifted his website dramatically in the search results, which may be due in part to page views:

Yesterday we launched a well received linkbait, and the same day our rankings for our most valuable keywords were lifted in both Live and Google, part of that may have been the new links, but I would be willing to bet some of it was caused from 10,000′s of users finding their way to our site.

With the release of BrowseRank, Microsoft has made it clear to the search community that it considers user behavioral data an important indicator of quality and relevancy.

"The more visits of the page made by the users and the longer time periods spent by the users on the page, the more likely the page is important. We can leverage hundreds of millions of users’ implicit voting on page importance," the researchers said in BrowseRank: Letting Web Users Vote for Page Importance, a paper from the SIGIR (Special Interest Group on Information Retrieval) conference in Singapore. Authors are Bin Gao, Tie-Yan Liu, and Hang Li from Microsoft Research Asia and Ying Zhang of Nankai University, Zhiming Ma of the Chinese Academy of Sciences, and Shuyuan He of Peking University.
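The intuition in that quote, more visits and more time spent mean more importance, can be shown with a toy calculation. This is not the BrowseRank algorithm from the paper, whose model is considerably more involved. It is only a simplified sketch that scores each page by its share of total time spent, using an invented visit log and an invented function name.

```python
# Toy illustration only: NOT the BrowseRank algorithm from the paper.
# The visit log and the function name are assumptions for this sketch.
def dwell_importance(visits):
    """visits: iterable of (url, seconds_on_page).

    Returns a dict mapping each url to its share of the total time spent,
    so a page gains score from both more visits and longer visits.
    """
    totals = {}
    for url, seconds in visits:
        totals[url] = totals.get(url, 0.0) + seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {url: 0.0 for url in totals}
    return {url: spent / grand_total for url, spent in totals.items()}


visits = [
    ("a.example/", 30),
    ("b.example/", 10),
    ("a.example/", 50),
    ("c.example/", 10),
]

scores = dwell_importance(visits)
for url, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{url}: {score:.2f}")
```

In this sample, a.example/ collects two visits and 80 of the 100 seconds recorded, so it ends up with the highest score. That is the "implicit voting" the researchers describe, reduced to its simplest form.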

The Four Main Indicators

As mentioned above, early search engines used on-page factors to rank pages. Later Google released PageRank, and search engines started to count links plus on-page criteria. Then humans entered the equation, providing behavioral data. So I like to break everything down into four core factors that search engines use:

  1. ON PAGE – This factor will stay and is not going anywhere.

  2. LINKS – This factor will also stay and is not going away.

  3. QUALITY RATERS – Rumors say Google employs over 10,000 raters, thus the human factor is number three. This one may be removed or scaled back as economic conditions change.

  4. BEHAVIORAL DATA – This factor is the new frontier.

So what can you do? Make sure users stay on your website and view, at the very least, more than one page!
