One Year of Not Provided Google Data

About a year ago, Google started to protect the privacy of users logged into its system by not passing their keyword data along to publishers’ websites – unless, of course, the publishers also advertised with Google. The search giant claimed the move would affect less than ten percent of searches. Why is the reality so different?

Make no mistake, it’s vastly different from what Google claimed it would be. Just ask anyone who studies their Google Analytics data. When a signed-in user performs a Google search, the search engine encrypts the data; the search shows up in GA as “not provided” rather than a search for a particular keyword. If the number of “not provided” searches grows too high, SEOs and marketers can no longer tell what effect, if any, their marketing efforts are achieving. For many websites, the data – or lack of it – reached that level months ago.

Barry Schwartz, writing for Search Engine Land, cited a study conducted over eleven months by Optify. The study covered 424 websites, 17,143,603 visits and 7,241,093 referring keywords, and it found that the percentage of “not provided” keywords has risen alarmingly. According to the study, Google withheld 39 percent of search terms, or roughly one out of every 2.5 visits!

That’s not even the worst news. The study noted that about 13 percent of companies see “not provided” rates as high as 60 percent. Can you imagine if all you knew about more than half of your visitors was that they found you in Google, but you had no idea for which terms? Sadly, I know a number of you don’t need to imagine it, because it’s your current reality.

Why are the numbers so much greater than Google originally said they would be? Danny Sullivan offered some suggestions. He notes that “as Google has continued to grow its Google+ social network, it has encouraged people sign-in as much as it can…All those signed-in searches have keywords withheld.” Also, in July, Firefox started using Google SSL Search by default – whether or not users were actually signed in to a Google service. “Overnight, a huge chunk of search terms got withheld,” Sullivan observed. Two months later, Apple copied Firefox’s move for searchers using Safari in iOS 6.

Perhaps one of the more interesting points about the “not provided” data is that the percentages aren’t consistent. Some websites are seeing a very small percentage of visits where Google is withholding keyword information, while others are seeing nearly all of their keyword information withheld. What is responsible for this vast difference?

Matt O’Toole at Analytics SEO offers up a tantalizing theory. He studied data for several hundred websites being monitored by his company over the past year, and found an overall “not provided” average of more than 20 percent by September 2012. This reflected steady growth from an initial three percent “not provided” during the first couple of weeks after Google began encrypting data. But not all websites came close to that percentage. One website showed consistently low “not provided” traffic all year long, at around one tenth of one percent. Another website showed a peak of “not provided” traffic around 98.4 percent!

Could these outliers provide clues as to what kinds of sites might naturally see low or high “not provided” percentages? O’Toole thought the demographics data might be telling. He found that the website with the low “not provided” rate featured a user base of males aged 45 to 64 years. The one with the high “not provided” rate, on the other hand, appealed to females aged 25 to 44 years. This led him to ask “whether there were certain demographics more likely to be logged into Google+/GMail and therefore more likely to be contributing to your site’s Not Provided traffic?” Sure enough, looking at the demographic data for Gmail, he found that a high percentage of its users, male and female alike, were between the ages of 18 and 34.

O’Toole suggested that marketers might be able to use this knowledge in their efforts going forward. “For instance, were your site to be below the norm in terms of Not Provided averages, you could assume that your userbase might be less likely to have Google+/Gmail accounts and therefore less likely to take any notice of the work you planned to do on your client’s Google+ page,” he observed.

What else can you do if you’re facing this kind of data blackout? You need to use the information that Google hasn’t encrypted. Look at the landing pages for your “not provided” traffic, and consider what’s making them so attractive. Use the keyword data that you DO have. And try to avoid making the wrong assumptions. Sadly, it looks like “Not Provided” isn’t going away any time soon, so you need to make the best of the situation. Good luck!
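If you want to try the landing-page approach on your own numbers, here is a rough sketch in Python. It assumes you have exported your organic search visits as rows of keyword, landing page and visit count, and that withheld keywords carry the label "(not provided)". The function name, the row layout and the sample figures are all made up for illustration; adjust them to match whatever your analytics export actually looks like.

```python
from collections import defaultdict

# Assumed label for withheld keywords; change it to match your own export.
NOT_PROVIDED = "(not provided)"


def not_provided_report(rows):
    """Summarize withheld-keyword traffic.

    rows: iterable of (keyword, landing_page, visits) tuples
          (a hypothetical export layout, not an official format).
    Returns (share, ranked), where share is the fraction of all visits
    with a withheld keyword and ranked lists landing pages by withheld
    visits, highest first.
    """
    total = 0
    withheld = 0
    by_page = defaultdict(int)
    for keyword, landing_page, visits in rows:
        total += visits
        if keyword == NOT_PROVIDED:
            withheld += visits
            by_page[landing_page] += visits
    share = withheld / total if total else 0.0
    ranked = sorted(by_page.items(), key=lambda item: item[1], reverse=True)
    return share, ranked


# Invented sample data, purely for illustration.
sample_rows = [
    ("(not provided)", "/pricing", 120),
    ("blue widgets", "/pricing", 80),
    ("(not provided)", "/blog/widget-guide", 60),
    ("widget reviews", "/blog/widget-guide", 40),
]

share, ranked = not_provided_report(sample_rows)
print(f"not provided share: {share:.1%}")
for page, visits in ranked:
    print(page, visits)
```

The pages at the top of the list are the ones attracting the most traffic you can't see keywords for. Comparing them against the keywords you do have for the same pages is one way to make an educated guess, though it remains only a guess.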
