Removed From Google Index, and Wondering Why?

One of the worst things that could hurt the publicity for your website has happened: it has been removed from Google’s index. Why did this happen? And more importantly, how can you get back into Google’s good graces? Read on to find out.

Before we get to the meat of the matter, let’s review the statement you probably received when Google informed you that it was removing your website from its index:

“Your page was manually removed from our index, because it did not conform with the quality standards necessary to assign accurate PageRank. We will not comment on the individual reasons a page was removed and we do not offer an exhaustive list of practices that can cause removal. However, certain actions such as cloaking, writing text that can be seen by search engines but not by users, or setting up pages/links with the sole purpose of fooling search engines may result in permanent removal from our index. If you think your site may fall into this category, you might try ‘cleaning up’ the page and sending a re-inclusion request to help@google.com. We do not make any guarantees about if or when we will re-include your site.”

So the first thing to keep in mind is that no one other than Google could tell you precisely why your site was removed. Second, Google doesn’t disclose that sort of thing. The best thing you can do, then, is perform an analysis designed to reveal the red flags that could have led to your site’s removal.

It’s true that no SEO can guarantee inclusion in the search engine indexes after performing an analysis. That is partly because the search engines themselves are not terribly specific about what practices will cause them to remove a website from their indexes. For example, what exactly is search engine spam? Google offers a short list of practices that fall under that heading, and therefore should be avoided:

  • Hidden text or hidden links.

  • Cloaking or sneaky redirects.

  • Automated queries to Google.

  • Pages loaded with irrelevant words.

  • Multiple pages, subdomains, or domains with
    substantially duplicate content.

  • “Doorway” pages created just for search engines, or other
    “cookie cutter” approaches such as affiliate programs with little
    or no original content.

Here are some other rules to keep in mind, from Google’s site:

“Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you’d feel comfortable explaining what you’ve done to a website that competes with you. Another useful test is to ask, ‘Does this help my users? Would I do this if search engines didn’t exist?’”

“Don’t participate in link schemes designed to increase your site’s ranking or PageRank. In particular, avoid links to web spammers or ‘bad neighborhoods’ on the web as your own ranking may be affected adversely by those links.”

“Google may respond negatively to other misleading practices not listed here, (e.g. tricking users by registering misspellings of well-known web sites). It’s not safe to assume that just because a specific deceptive technique isn’t included on this page, Google approves of it.”

For more information, check Google’s Webmaster Guidelines at 
http://www.google.com/webmasters/guidelines.html.

Yahoo Search is even more restrictive in its definitions of spam and undesirables, although it has put less focus on detection and removal. It is important to be aware of and conform to Yahoo’s restrictions, since Yahoo has no reacceptance policy. Banishment has been, in every case I’ve heard of, permanent. For details, see http://help.yahoo.com/help/us/ysearch/deletions/deletions-05.html


 
The latest practice that can apparently get your site added to the list of undesirables is crosslinking or interlinking. With this practice, made-for-SE sites are linked together in an attempt to artificially inflate PageRank. RLROUSE Directory offers the following explanation of cross-linking:

“Cross-linking – If your entire site is sitting at PR0, one possibility is a cross-linking penalty. Sometimes a webmaster who controls two or more websites will place links from every page of one website to every page of the other sites to increase the PageRank of all the sites. If detected, this will quickly incur a penalty if not an outright ban from the Google index.”

For more information, you may want to point your browser to 
http://www.rlrouse.com/pagerank-penalty.html

How do search engines discover cross-linking? Factors that might prompt discovery include, among others:

  • Same content verbatim

  • Same cookie structure

  • JavaScript function names

  • Linked CSS and JS files

  • CSS class names

  • Same contact information posted on websites

  • Common name servers

  • Same/similar images and/or graphics theme

  • Site hosted on same IP/block

  • Whois information matching

  • Alexa contact information matching

  • Interlinking of domains

  • Common backlinks (indirect crosslinking)

  • Same credit card used for anything

  • Login from same IP to separate accounts

  • Residual cookies from past logins

  • Similar file names or linking/directory structures

  • Code Comments
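
To get a feel for how visible these footprints can be, you could tabulate a few of the indicators above for each of your own sites and look for overlap. The following is only a minimal sketch: the function name, the field names, and all of the sample values are hypothetical, and nothing here reflects how any search engine actually performs its detection.

```python
def shared_footprints(site_a, site_b):
    """Return the footprint fields whose values overlap between two sites."""
    shared = []
    # Only compare fields that were recorded for both sites.
    for field in sorted(set(site_a) & set(site_b)):
        if set(site_a[field]) & set(site_b[field]):
            shared.append(field)
    return shared


# Hypothetical, hand-collected footprints (all values are made up).
site_one = {
    "name_servers": {"ns1.example-host.net", "ns2.example-host.net"},
    "ip_block": {"192.0.2"},
    "whois_email": {"owner@example.com"},
    "css_classes": {"nav-main", "promo-box"},
}
site_two = {
    "name_servers": {"ns1.example-host.net", "ns2.example-host.net"},
    "ip_block": {"198.51.100"},
    "whois_email": {"owner@example.com"},
    "css_classes": {"sidebar", "footer-links"},
}

overlap = shared_footprints(site_one, site_two)
print(overlap)  # ['name_servers', 'whois_email']
```

The more fields two sites share, the easier it presumably is to tie them to a single owner, so a table like this can help you decide what to separate first.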


It used to be considered relatively safe to have as many inbound links as possible, regardless of their source. Over the course of the past year, that assumption spawned link purchasing and hidden crosslinking. Now, sites must also be very careful about inbound links. The search engines have devalued networks of crosslinked and purchased links. Sites linking to those types of networks have reported decreasing traffic, and finally, over the past month or so, a number of such sites have been completely dropped from the index.

Whether this was a manual removal or an algorithm shift can’t be determined without proprietary information from inside Google, which we already know isn’t available. Remember Google’s SE spam fighting philosophy: “Google prefers developing scalable and automated solutions to problems, so we attempt to minimize hand-to-hand spam fighting. The spam reports we receive are used to create scalable algorithms that recognize and block future spam attempts.”

The crosslinking tactics used alone are consistent with those of other sites that are considered spam and may have been reported as such. If Google should target those characteristics based on spam reports for other sites, then it is not surprising that, for instance, homeboundmortgage.com would be caught by the same adjustments to the algorithm or filters, and be dropped from the index as well.



This summer, a new term emerged: over optimization penalty. This refers to the tweaks most SEOs make to pages to “fine tune” them to the top of their keyword categories. Page length, keyword density, bold, underline, italics, H1 formulas, link text, and various other small elements are manipulated until the perfect balance is struck, and the SEOed site contains just a small bit more of these tweaks than the other sites in the top 10. It can be a full time job keeping a site at that level with these tiny changes.

Google has raised the bar, however: it is now effectively saying that high keyword densities, and many of the other SEO tweaks, are evidence of too much SEO. Filters are created, and such sites drop in the rankings. Sites that have been playing too close to the edge are penalized.
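
If you want to see how heavily a page leans on a phrase, a rough keyword density figure is easy to compute. This is a minimal sketch under my own assumptions: the function name and the sample text are hypothetical, density is defined here as the share of words that belong to occurrences of the phrase, and no search engine has published a threshold to compare the result against.

```python
import re


def keyword_density(text, phrase):
    """Share of the words in text that belong to occurrences of phrase."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase appears as consecutive words.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return hits * n / len(words)


# Made-up sample copy that repeats one phrase three times in 13 words.
sample = ("Cheap mortgage rates. Compare mortgage rates today "
          "and lock in low mortgage rates.")
density = keyword_density(sample, "mortgage rates")
print(f"{density:.1%}")  # 46.2%
```

Comparing the figure for your page with the figures for the other pages in the top 10 at least shows whether you are the outlier.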

ALT Text

This tweak involves putting your keywords in ALT text. Although this contributes toward your ideal word count, it may also look like keyword stuffing in the ALT attributes. To clean this up, switch to clear text navigation.
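
One way to find the images in question is to list every ALT text on a page that contains a given keyword. The sketch below uses Python's standard html.parser module; the class name, the helper name, and the sample markup are hypothetical, and how many repetitions count as "stuffing" is a judgment call, not a known rule.

```python
from html.parser import HTMLParser


class AltCollector(HTMLParser):
    """Collect the alt text of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.alts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            # A missing or empty alt attribute is recorded as "".
            self.alts.append(dict(attrs).get("alt") or "")


def alts_containing(html, keyword):
    """Return the alt texts that mention keyword (case-insensitive)."""
    parser = AltCollector()
    parser.feed(html)
    return [alt for alt in parser.alts if keyword.lower() in alt.lower()]


# Made-up navigation markup for illustration.
page = (
    '<a href="/"><img src="home.gif" alt="mortgage rates home"></a>'
    '<a href="/apply"><img src="apply.gif" alt="apply for mortgage rates"></a>'
    '<img src="logo.gif" alt="Company logo">'
)
flagged = alts_containing(page, "mortgage rates")
print(flagged)  # ['mortgage rates home', 'apply for mortgage rates']
```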
 

Old Link Exchanges

Pages stored in the Internet Archive may indicate the site was once involved in some questionable link exchanges.

Duplicate Content

All duplicate pages should be eliminated. Link to just one page consistently. 
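
To find candidates for elimination, you can compare the text of two pages and see how much of it overlaps. This sketch uses word shingles and Jaccard similarity, which is a common general technique and not a description of Google's own duplicate detection; the function names, the shingle size, and the sample sentences are my assumptions.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Two made-up sentences that differ only in the last word.
page_a = "We offer low rate home loans in every state"
page_b = "We offer low rate home loans in every city"
score = similarity(page_a, page_b)
print(score)  # 0.75
```

Pages that score close to 1.0 against each other are the ones to merge or remove.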

Once all of these changes have been made, what do you do? You can, of course, try writing help@google.com. According to its Webmaster Guidelines, however, “We do not make any guarantees about if or when we will re-include your site.” (www.google.com/webmasters/2.html)

I personally know of only four sites that have been re-included after manual removal. In each case, the site was crawled regularly, but was not included in the index for over six months. I don’t know the specific reason for this, of course, but I would imagine it might be some sort of testing period. How strong is a webmaster’s resolve to walk the straight and narrow, after all, despite lack of indexing?

So, assuming the best case scenario, you might be looking at six months or more before your sites are re-included in the index. Once delisted, I also imagine such sites must stay squeaky clean. A site might be forgiven once, but seldom (if ever) twice.

The very worst case scenario, of course, is permanent exclusion from organic results on Google. Recovery basically means starting over, nearly from scratch. Plan a 12-18 month Overture or AdWords campaign, originally targeting the current website. Pick more specifically targeted keyword phrases initially, to keep costs down. I realize this is a high-priced keyword neighborhood. You may need to create new, perfectly targeted landing pages to lower acquisition costs.

Select a new domain name. Without reinclusion within a short period of time, the current name will continue to lose value daily.

Build a new, clean site under the new domain name. Text must be fresh, not a duplicate of the current domain. Do not duplicate site structure, filenames, or other elements that could link it to the banned name.

Gradually add organic links. Expect to take 6-12 months to acquire 1,000 related links. Continue to link to related sites over the next 18 months. Grow the site, adding one new page (250-500 words) each day.

As the new site rises in the SERPs, gradually switch the PPC traffic to the new site, and retire the current site completely. At any point in the process, if the current (old) site should reappear, it shouldn’t be an undue amount of work to gradually retarget the newly acquired links to the older site. Encourage natural link text by those linking to the site. 

 If you’re going to optimize at all, test it first with a safely isolated site. You’re not going to be able to push the optimization envelope for quite some time. The key to long-term survival and growth will need to be the “content is king” model.

The only bright news in the picture is that the site isn’t being meta-hijacked, at least not under any of the keywords I’ve tried (your homepage META keywords list). Google itself wouldn’t encourage such a practice, and in fact will likely be glad to deal with any such offenders under the DMCA.

If your sites were dropped due to algorithm changes, then it is possible that they could come back after a clean up. However, their supporting network of links has, at the very least, been devalued. That alone will cause a change in PR, even if cleanup makes them eligible for reinclusion. Site owners report that in these cases, cleaned up sites have come back into the results in two to six months. Getting back to the top would take additional time, and new, unblemished linking.

There are two schools of thought here: after cleanup, either resubmit the site or allow it to be found via links. Though Google states that there’s no oversubmission penalty, most webmasters tend to be leery of such submission, and prefer to allow robots to find the site via links.

I’ve never experienced a problem with resubmitting URLs that have dropped out for whatever reason. I take Google at their word that it’s harmless, as long as it’s a manual resubmission for good reason (for example, page down when the crawler came through), not submission through a service or promotion program. One submission is enough. Either way, watch the site logs. If robots don’t show up, it could be a case of manual deletion and permanent ban.
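
Watching the logs can be as simple as counting crawler requests per day. The sketch below assumes an access log in the common or combined log format and assumes the crawler identifies itself with "Googlebot" in the request line's user agent; the function name and the sample lines are made up. Keep in mind that a user agent string can be faked, so this is only a first look.

```python
from collections import Counter
import re

# Date part of a log timestamp such as [12/Sep/2004:06:25:14 +0000]
DATE_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def crawler_hits_per_day(log_lines, marker="Googlebot"):
    """Count log lines per day that mention marker (case-insensitive)."""
    counts = Counter()
    for line in log_lines:
        if marker.lower() not in line.lower():
            continue
        match = DATE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines; in practice, pass an open log file instead.
sample_log = [
    '192.0.2.10 - - [12/Sep/2004:06:25:14 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [12/Sep/2004:06:25:20 +0000] "GET /rates.html HTTP/1.1" 200 2048 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '198.51.100.7 - - [12/Sep/2004:07:01:02 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.10 - - [13/Sep/2004:05:10:00 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]
hits = crawler_hits_per_day(sample_log)
for day, count in hits.items():
    print(day, count)
```

A run of days with no entries at all is the signal to worry about.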

The site needs to be squeaky clean for the near term. The days of easy SEOing are on hiatus. The old-fashioned methods (content for users, natural linking) are back. Many sites may find that they have quite a backlog of goodwill links left. Those must be preserved and strengthened, while getting rid of harmful inbound linking.

Again, there’s no guarantee or certainty of getting back into the Google index. You need to use the same procedures already discussed…and sit through the same wait and see period.

Sources:

Crosslink Detection
www.webmasterworld.com/forum3/25568.htm

Crosslinking Penalty
www.webmasterworld.com/forum3/23890-2-10.htm

Sandbox Effect
www.promodo.com/web-site-promotion-articles-en/about-google-search-engine-promotion-tips_page1_seo60.html

Innocent Interlinking of Sites
www.webmasterworld.com/forum3/25564.htm

A Statistical and Experimental Analysis of Google’s Florida Update
www.linksecrets.com/pub/florida-report.html

Speculation About August Changes
www.webmasterworld.com/forum3/25251.htm

The August Chronicles
www.webmasterworld.com/forum3/25553.htm

Denial of Google Over Optimization Penalty
www.markcarey.com/googleguy-says/archives/discuss-denial-of-google-over-optimization-penalty.html

Future of SEO
http://list.audettemedia.com/SCRIPTS/WA.EXE?A2=ind0408&L=led&D=1&T=0&H=1&O=D&F=&S=&P=266
