Every SEO seems to confront this frustrating situation sooner or later. Fortunately, there are forums devoted to SEO. Many heads with lots of experience, often with very different kinds of sites, can come up with things to check that one person beating his or her head against a bunch of pixels might not have considered. Developers call the practice of trying to figure out why a particular program isn’t working the way it should “troubleshooting.” This article will cover some of the things you’ll want to consider when troubleshooting your web site’s SEO.
It was inspired by a thread in our own SEO Chat forums. The original poster maintains a site that has fewer than half of its pages listed in Google’s main index. He made some major changes to its internal linking structure to fix mistakes he’d made earlier, and Google now isn’t listing his internal links correctly either. He’s looking for some kind of checklist so he doesn’t feel like he’s just stabbing in the dark.
The first check he can perform, of course, is to make sure that his site is set up in accordance with Google’s own guidelines. The search engine just recently updated these guidelines to be clearer and include more information. If you scroll down to “Quality guidelines – specific guidelines,” you’ll see that many of the bulleted points now contain hyperlinks that take you to more information about specific issues, such as hidden text. Take your time with these to make sure you fully understand the guidelines.
Once you’ve done that, a lot of the things you need to check fall under proper site maintenance and making sure that Google can see everything you want it to see. This won’t solve all your problems, you understand, but enough items can be accounted for in this way that it’s worth going down the list.
If you keep in mind how search engine spiders work, all of the things I’m listing to check in this section will seem obvious. But they can also be missed inadvertently, which is why it’s always good to check them. Think of it as part of good site maintenance, or even "site hygiene," to coin a phrase.
A robots.txt file is a good thing; it tells the search engine spiders which pages or directories they should not crawl. That can be important if you have certain content set up to be seen by subscribers only. But if your robots.txt file is set up wrong, for instance with a Disallow rule that covers more of the site than you intended, the spiders could be avoiding web pages you actually want them to see, thus preventing the pages from being indexed.
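One way to check this is to run the URLs you care about through the same kind of parser a spider uses. The sketch below relies on Python's standard urllib.robotparser module; the robots.txt contents and the page URLs are made-up examples, so substitute your own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the second rule is the kind of overly broad
# line that can hide pages you actually want indexed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /subscribers/
Disallow: /articles/
"""

def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that robots_txt forbids user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]

# Example pages (assumed names, not from any real site).
pages = [
    "http://www.yoursite.com/",
    "http://www.yoursite.com/articles/seo-checklist.html",
    "http://www.yoursite.com/subscribers/archive.html",
]
blocked = blocked_urls(ROBOTS_TXT, pages)
print(blocked)
```

Any public page that shows up in the result is a page the spiders are being told to skip.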
Likewise, a search engine spider can’t follow a broken link on your site. Neither can a human visitor. Make sure all of your links work perfectly.
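A link check can be automated along these lines. This is only a sketch: it pulls the links out of one page with Python's standard HTML parser and asks a caller-supplied function, here called get_status, for each link's HTTP status code. The function name, the sample markup, and the status table are all assumptions for illustration; in practice get_status would make a real request.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

def find_broken_links(page_url, html, get_status):
    """Return absolute link URLs whose status code is 400 or higher.

    get_status is supplied by the caller and maps a URL to an HTTP
    status code (for example, a thin wrapper around urllib.request).
    """
    collector = LinkCollector()
    collector.feed(html)
    absolute = [urljoin(page_url, href) for href in collector.hrefs]
    return [url for url in absolute if get_status(url) >= 400]

# Demo with a stubbed status lookup, so no network access is needed.
known_statuses = {
    "http://www.yoursite.com/about.html": 200,
    "http://www.yoursite.com/old-page.html": 404,
}
sample_html = '<a href="/about.html">About</a> <a href="old-page.html">Old</a>'
broken = find_broken_links(
    "http://www.yoursite.com/",
    sample_html,
    lambda url: known_statuses.get(url, 404),
)
print(broken)
```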
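Do you have nofollow set on any of your internal links? Strictly speaking it is not a tag of its own but a value of the link’s rel attribute. Google honors it, which means it doesn’t follow a link marked that way at all – not for awarding “link juice,” and not for discovering the page behind the link. A page that can only be reached through nofollowed links may never get crawled. Keep that in mind when you set up your site’s architecture and linking scheme.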
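To audit a page for this, you can scan its markup for links that carry nofollow in their rel attribute and point back at your own host. The class name and sample markup below are hypothetical; the sketch uses only Python's standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class NofollowFinder(HTMLParser):
    """Record internal links whose rel attribute includes nofollow."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.host = urlparse(page_url).netloc
        self.flagged = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        # rel can hold several space-separated values, or be missing.
        rel_values = (attributes.get("rel") or "").lower().split()
        if not href or "nofollow" not in rel_values:
            return
        target = urljoin(self.page_url, href)
        if urlparse(target).netloc == self.host:
            self.flagged.append(target)

# Sample markup: one internal nofollow link, one external, one normal.
sample_html = (
    '<a href="/archive.html" rel="nofollow">Archive</a>'
    '<a href="http://www.example.org/" rel="nofollow">Elsewhere</a>'
    '<a href="/contact.html">Contact</a>'
)
finder = NofollowFinder("http://www.yoursite.com/index.html")
finder.feed(sample_html)
print(finder.flagged)
```

Only the internal link is reported; a nofollow on an outbound link is a separate decision.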
Do you have any pages without content? While a site is always “under construction,” you never know when the search engine spiders will be paying a visit to index your site. You want to show them your best face. Keep those oddball pages to a minimum.
How good is your site navigation? It doesn’t have to be fancy, but for the sake of both your human visitors and the search engines, it should be consistent, with general categories leading to more specific topics within the categories. SEO Chat, for example, has a long list of navigation links on the left side. You can visit a variety of topics which we cover in our articles, such as “link trading,” “search optimization,” and others. Click on the category, and you get a list of articles; clicking on an article title will take you to the first page of that article. It’s very predictable.
Speaking of content, you’ll also want to check for duplicate content. You need to find out whether someone else is duplicating your site’s content (in which case Google, who can’t really tell which site was there first, might be penalizing you by mistake). You also need to find out whether you have duplicate content on your own site – whether you’re duplicating someone else’s content, and whether some of your pages are so similar that Google sees the two pages as identical, and chooses to index just one. There are a variety of tools you can use to check this; just Google “duplicate content check.” Or you can just Google some key phrases from the content that you think has been copied, and see what comes up.
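For pages on your own site, a rough similarity score can tell you which pairs deserve a closer look. The sketch below compares overlapping three-word sequences ("shingles") and reports the share the two texts have in common. This is a generic text-comparison technique, not a description of how Google decides; the shingle size and the sample sentences are arbitrary choices.

```python
import re

def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# Two near-identical product blurbs (made-up examples).
page_a = "Our blue widgets ship free to any address in the country."
page_b = "Our red widgets ship free to any address in the country."
score = similarity(page_a, page_b)
print(f"similarity: {score:.2f}")
```

A score near 1.0 suggests the two pages say almost the same thing and may be treated as duplicates; where to draw the line is a judgment call.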
You can make Google’s work even easier by submitting a sitemap. Do it in XML, and keep it up to date. You can find the details for how to do this here; the page lists the protocol and explains how to submit your sitemap. Keep two different limits straight. An XML sitemap file can list up to 50,000 URLs; if you have more than that, break the list into several files and reference them all from a sitemap index file. An HTML site map page, the kind your visitors can browse, is a different matter: it should not be larger than about 100 links. If it is larger than that, break it up into more than one page, and include a top-level page that links to each of those pages, so Google can keep crawling. Incidentally, if possible, you really should have fewer than 100 links on each page of your web site as well.
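Generating the XML by script is the easiest way to keep it current. The sketch below builds sitemap files from a list of URLs and, if the list has to be split across several files, adds a sitemap index that points at each one. The per-file cap is a parameter (the demo uses a tiny value so the split is visible), and the file names and URLs are assumptions.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file limit in the sitemap protocol

def build_urlset(urls):
    """Return one sitemap document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

def build_sitemap_files(urls, base_url, max_urls=MAX_URLS_PER_FILE):
    """Return {file name: XML text}, adding an index if a split was needed."""
    chunks = [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]
    files = {}
    for number, chunk in enumerate(chunks, start=1):
        files[f"sitemap{number}.xml"] = build_urlset(chunk)
    if len(chunks) > 1:
        index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
        for name in list(files):
            entry = ET.SubElement(index, "sitemap")
            ET.SubElement(entry, "loc").text = base_url + name
        files["sitemap_index.xml"] = ET.tostring(index, encoding="unicode")
    return files

# Demo: five hypothetical pages, at most two per file.
pages = [f"http://www.yoursite.com/page{n}.html" for n in range(1, 6)]
files = build_sitemap_files(pages, "http://www.yoursite.com/", max_urls=2)
for name in sorted(files):
    print(name)
```

The strings returned here carry no XML declaration; when writing real files you would normally add one.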
Speaking of your site’s structure, you might want to take another look at it, especially if you’re not using a template. Cookie cutter templates may look boring, but GaryTheScubaGuy (aka Gary Beal) cited one possible reason to at least create and use your own unique template: it may help keep Google from seeing very little or duplicate content on your site and then backing out. He notes that this is rare, but he has seen it when a novice builds a site one page at a time with no template. “In a correctly built site most robots will parse the template and crawl the content and see unique content. This allows them to crawl deeper and faster,” he explained.
Sometimes it’s a matter of patience. If the site is fairly new, Google simply will not have gotten around to indexing all of it. If you’ve recently purchased a site that has been around for a while, you’ll want to consider what was on it before. If the site has undergone a serious redesign, has the domain or subject of the site changed, or remained the same?
Is your site an affiliate site? Or does it link to an affiliate site? Those have their own special issues, which are beyond the scope of this article. I can say that avoiding duplicate content is particularly tricky for affiliates.
There are a number of things you can do to fix your site after you complete your troubleshooting. Some of them will be obvious depending on the problems you find: adjust your link structure, create appropriate sitemaps, and so forth. Other items will be less obvious. Also, it’s worth keeping in mind that my list is by no means exhaustive; it’s a general list, not specific to any particular site or type of site.
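One thing you can do is assign a crawling priority to each page listed in your XML sitemap, using the priority tag. Values run from 0.0 to 1.0, and a page with no priority tag is treated as 0.5. The value is only a hint about which of your own pages matter most to you; it doesn’t guarantee crawl order, and it has no bearing on how your pages compare to anyone else’s. Still, it lets you point the spiders toward the pages most important to your position in the SERPs.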
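In the sitemap protocol, the priority is a priority element inside each url entry. Here is a minimal sketch of writing those entries from a list of (URL, priority) pairs; the function name, the page URLs, and the priority values are made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_prioritized_sitemap(pages):
    """Build sitemap XML from (url, priority) pairs, highest priority first."""
    for url, priority in pages:
        if not 0.0 <= priority <= 1.0:
            raise ValueError(f"priority for {url} must be between 0.0 and 1.0")
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    # Listing order is for human readers; the priority value carries the hint.
    for url, priority in sorted(pages, key=lambda page: page[1], reverse=True):
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical pages and priorities.
pages = [
    ("http://www.yoursite.com/about.html", 0.3),
    ("http://www.yoursite.com/", 1.0),
    ("http://www.yoursite.com/products.html", 0.8),
]
sitemap_xml = build_prioritized_sitemap(pages)
print(sitemap_xml)
```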
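Another thing you’ll want to do is take care of canonical issues. Your home page can usually be reached at several addresses, such as with and without the www, or with index.html tacked on the end, and a search engine may treat each one as a separate page. Make sure all variations of your home page’s URL send a permanent (301) redirect to the URL that you’ve decided is your “official” one, generally http://www.yoursite.com/ (with the www at the beginning and the slash at the end). Your official home page URL should be whichever one you’ve focused most of your past link building efforts upon.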
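The redirect itself belongs in your web server's configuration, but the rule behind it is easy to state in code. The sketch below takes a requested URL and returns the official URL it should be permanently redirected to, or None if it is already the official one. The host name and the list of home page aliases are assumptions; adjust them to match your own site.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.yoursite.com"   # assumed "official" host
BARE_HOST = "yoursite.com"            # the same site without the www
HOME_PAGE_ALIASES = {"", "/", "/index.html", "/index.htm", "/index.php"}

def redirect_target(url):
    """Return the URL to 301-redirect to, or None if url is canonical."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = CANONICAL_HOST
    # Every home page alias collapses to the bare trailing slash.
    path = "/" if parts.path in HOME_PAGE_ALIASES else parts.path
    canonical = urlunsplit((parts.scheme, host, path, parts.query, ""))
    return None if canonical == url else canonical

for requested in (
    "http://yoursite.com",
    "http://www.yoursite.com/index.html",
    "http://www.yoursite.com/",
):
    print(requested, "->", redirect_target(requested))
```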
Use file compression to optimize the sizes of your images and other files. This means that your site as a whole will be smaller and will load faster. Search engine spiders will be able to crawl your site more quickly and index more pages.
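To get a feel for the savings, you can compress a file yourself and compare sizes. The sketch below uses gzip from Python's standard library on a made-up block of repetitive HTML. Note that this kind of compression pays off for text files such as HTML, CSS, and JavaScript; images in formats like JPEG or PNG are already compressed, so they are usually shrunk by re-saving them at a smaller size or lower quality instead.

```python
import gzip

def compression_report(data):
    """Return (original size, gzip-compressed size) in bytes."""
    return len(data), len(gzip.compress(data))

# Made-up, repetitive markup of the sort found in navigation menus.
sample_html = b"<li><a href='/articles/'>Articles</a></li>\n" * 200
original_size, compressed_size = compression_report(sample_html)
print(f"{original_size} bytes before, {compressed_size} bytes after gzip")
```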
As a related issue, you’ll also want to check your server’s response time. Does it deliver good performance? Or is it slower than normal? This is another factor that can seriously affect how quickly the search engine spiders can crawl your site – and thus how much of your site they can index.
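A simple timing script gives you a number to watch. The sketch below measures how long it takes to request a URL and read the whole response. The opener parameter is an assumption added so the demo can run against a stand-in instead of a live server; leave it at its default to time a real page.

```python
import io
import time
import urllib.request

def response_time(url, opener=urllib.request.urlopen, timeout=10):
    """Return the seconds taken to fetch url and read the full body."""
    start = time.perf_counter()
    with opener(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start

def fake_opener(url, timeout):
    """Stand-in for a server that takes about 50 ms to answer."""
    time.sleep(0.05)
    return io.BytesIO(b"<html></html>")

elapsed = response_time("http://www.yoursite.com/", opener=fake_opener)
print(f"{elapsed:.3f} seconds")

# Against a live site, drop the opener argument:
# elapsed = response_time("http://www.yoursite.com/")
```

One measurement means little by itself; run it several times, at different times of day, before deciding your server is slow.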
There are other factors you can check as well. One of the nice things about forums is that many of them let you search for previous threads, so you can see the advice given to others in your position. But checking the items I described in this article should help you get off to a good start if and when you need to troubleshoot your web site. Good luck!