There are a lot of things to like about Google, but if you’re a webmaster or a site owner, there are a whole lot of things to dislike, too. One of the biggest complaints that site owners and anyone who has to perform search engine optimization have with the company is its lack of transparency. Yes, Google has a set of webmaster guidelines, divided into design and technical guidelines (http://www.google.com/support/webmasters/bin/answer.py?answer=35770) and quality guidelines (http://www.google.com/support/webmasters/bin/answer.py?answer=35772). It even has a set of guidelines specifically covering search engine optimization (http://www.google.com/support/webmasters/bin/
Somehow, though, that’s not enough for many webmasters and site owners. And given how much a website’s standing in Google’s search engine results pages can affect a company’s business, that’s not too surprising. One common concern, for example, is making sure your site is indexed by Google. If it isn’t indexed, or is indexed but apparently not receiving its full PageRank, webmasters have very little clue as to what is going on…and that lack of knowledge can be very expensive. Is the site in the infamous Google sandbox? Has it been temporarily banned? Why was it banned? What needs to be fixed?
Is it any wonder, then, that it’s not unusual to hear webmasters say, with varying degrees of frustration and bewilderment, “I’d gladly comply with Google’s guidelines, if only I knew what they were?!” Granted, some things are obvious, but others are much more subtle. When it comes to being penalized, there’s no question in my mind that search engine spammers get what they deserve. But what about companies trying to do an honest job who accidentally run afoul of one of Google’s guidelines? It can happen and has happened, and that’s a case where a little personal attention might be helpful. Google hasn’t been deaf to these cries for transparency, and according to a recent entry in Matt Cutts’s blog, the company is finally doing something about it.
The first talk of some kind of program to inform webmasters when they were violating Google’s guidelines came in Matt’s blog in late September of 2005. There he mentioned that “We’ve started a pilot program to alert sites that we consider to be outside our quality guidelines.” Matt went on to explain that they weren’t sending email to every site that receives a spam penalty. “This is not targeted to sites like buy-my-cheap-viagra-here-while-consolidating-your-debt-and-buy-some-posters-about-online-casinos.com, but more for sites that have good content, but may not be as savvy about what their SEO was doing…”
It’s clearly not in Google’s best interest to penalize sites that have made honest mistakes. Remember, Google’s goal is to give its users the most relevant results. When a site is penalized for a simple mistake, but otherwise has good content, that means that Google can’t deliver that site as a search result, even if it would otherwise be the most relevant one. So, in order to serve its users better, Google started emailing certain webmasters who clearly weren’t spamming intentionally, to notify them of the problems with their site, give some idea of what they needed to do, and let them know how to submit their site for reinclusion in Google’s index once they fixed the problem.
In this blog entry, Matt included a sample email to give webmasters an idea of what to expect. It began with “While we were indexing your webpages, we detected that some of your pages were using techniques that were outside our quality guidelines…In order to preserve the quality of our search engine, we have temporarily removed some webpages from our results…Currently pages from [your site] are scheduled to be removed for at least 30 days.” The sample email goes on to pinpoint the specific problem, which in this sample was hidden text on a particular page, and even cites the page and the text itself. Finally, it goes on to explain how to resubmit the page for indexing.
This alone would be enough to get some webmasters excited; it certainly excited Matt. “I’m glad we’re trying to proactively contact webmasters and site owners when there’s an issue with their site in Google. I’m so excited that I split an infinitive in that sentence, didn’t I?” While this initial move meant that Google was boldly going where no search engine had gone before, more was to come, as Matt informed us in his blog late in April.
On April 26, Matt stated in his blog that “Google’s webspam team is working with our Sitemaps team to alert some (but not all) site owners of penalties for their site.” He emphasized that it was still an experiment, but it was moving Google more in line with its “don’t be evil” motto. As Matt explained, “I think the ideal search engine would also tell legitimate site owners when they risk not doing well in Google.”
Later on in the same post, Matt explains why Google has taken this extra step. Some sites are hard to contact by email because they don’t give any email information, or they don’t receive/read/respond to the email that Google has sent. So, with this new approach, “we are now alerting some sites that they have penalties via the webmaster console in Sitemaps. For example, if you verify your site in Sitemaps and then are penalized by the webspam team for hidden text on your pages, we may explicitly confirm a penalty and offer you a reinclusion request specifically for that site.”
Note Matt’s use of the word “may,” please. He included a couple of examples to illustrate this point. His first example was a nice little hotel in Bath, England. It was clearly a legitimate business, but it had hidden text on the site. Needless to say, this was the kind of website that Google would want to inform of the potential violation of Google’s webmaster guidelines, so they could fix it.
The second example was the kind of site that makes your eyes bleed and your brain go “Wtf?!” Just from the picture Matt provided in his blog, it was easy to see that the site practiced keyword stuffing; deliberately included misspellings; used nonsense or gibberish text, probably auto-generated; and apparently had tons of doorway pages. Matt went on to describe a number of other violations that the site committed which weren’t apparent from the screen shot.
This was an example of a site that had clearly violated Google’s guidelines in such a way that it deserved to be banned. As Matt explained, “Needless to say, I’d rather not tip off spammers like this when we find their pages.” Or, put another way, good riddance to bad rubbish!
Before I go into the reactions from webmasters, SEOs, and site owners, let me explain very quickly how to check whether you have a spam penalty, assuming you own one of the honest websites that Google wants to inform. In Sitemaps, go to your webmaster console, verify a site, then click on the tab labeled “Diagnostic.” You will see a page section called “Indexing summary.” Matt explained that the specific text would read:
“No pages from your site are currently included in Google’s index due to violations of the webmaster guidelines. Please review our webmaster guidelines and modify your site so that it meets those guidelines. Once your site meets our guidelines, you can request reinclusion and we’ll evaluate your site. [?] Submit a reinclusion request.”
The words “webmaster guidelines” and “Submit a reinclusion request” are linked to the appropriate pages. The “[?]” is linked to a new help page that answers the question “The summary page says that my site is currently not indexed due to violations to the webmaster guidelines. What does this mean?”
I believe it is very helpful that Google is doing this. However, the way the search engine informs webmasters via Sitemaps is a little less helpful than the way it had been informing them via email. Assuming the sample email Matt included in his blog in September is any indication, those letters spell out very specifically what and where the problem is, which means it can be fixed quickly. With Sitemaps, you’ll know that you violated the guidelines and were penalized for it, but you won’t necessarily know exactly what you did.
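Since the Sitemaps notice may not say what the violation was, a site owner may want to audit their own pages for the problem that both the sample email and the hotel example involved: hidden text. The Python below is a minimal sketch of such a self-check, not a description of how Google detects hidden text. The names `HiddenTextFinder` and `find_hidden_text` are hypothetical, and the check assumes the text is hidden with an inline `style` attribute using `display: none` or `visibility: hidden`. It will miss text hidden through external stylesheets, matching text and background colors, or off-screen positioning.

```python
import re
from html.parser import HTMLParser

# Assumption: only these two inline-style patterns are treated as "hidden".
HIDDEN_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    """Collects text that sits inside an element hidden by inline style."""

    def __init__(self):
        super().__init__()
        self.stack = []        # (tag, is_hidden) for each currently open element
        self.hidden_text = []  # text fragments found inside hidden elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        self.stack.append((tag, bool(HIDDEN_STYLE.search(style))))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, tolerating sloppy HTML.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and any(hidden for _, hidden in self.stack):
            self.hidden_text.append(text)


def find_hidden_text(html):
    """Return the text fragments found inside inline-hidden elements."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text
```

Calling `find_hidden_text()` on the HTML source of each page gives a list of suspicious fragments to review by hand. An empty list only means this narrow check found nothing, not that the page meets the guidelines.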
Still, this is progress, as many readers of Matt’s blog noted in their comments. Some speculated as to the reason for this change. Craig Wilcox called it great news and noted that “I was just talking last night about how Google might have to disclose a little more about how it ranks sites due to the recent lawsuit from the link farmer.” That’s KinderStart, in case you hadn’t heard. While that’s possible, it seems unlikely, given how long this particular change has been in the works.
Another poster heralded the change, explaining that “So often it seems (or feels) like optimizing for Google is a ‘webmaster vs. Google’ battle. I’m excited to see Google reaching out to webmasters with tools that help us achieve a common goal.”
Yet another poster seemed to think that Google hasn’t gone far enough. He suggested that Google publish a list of domains that have been banned from its index. He explained that he discovered some time ago that a used domain he had purchased had been banned by Google. “Or maybe Google could re-evaluate old domains to see if the issues on them have been cleared up by new owners?”
Not everyone was pleased by the change. One poster claimed that it was simply an admission “that Google search algorithms are broke!” He went on to state that the problem was Google’s AdSense, which he believed encouraged the creation of websites strictly to run AdSense ads. “If Google wants to improve the quality of content on the net, Google should review sites running Adsense and ban crap sites.”
Most of the reaction to Google’s new openness, however little or much it is, has been positive. Paul Salber probably summed up the sentiments of a lot of people when he called it the first compelling reason to use Google Sitemaps and observed that “This will save hundreds of man-hours and thousands of dollars of lost revenue for legitimate websites.” We’ll see how well the search engine giant follows through with this.