By way of review, in my previous article I explained that an accessible website is one that delivers its content successfully as often as possible. I discussed the main issues that would cause a website’s content to be inaccessible, either to visitors or to search engine spiders, and what can be done to solve these issues. Under the issue of URLs, titles, and meta data, I explained the benefits of a well-written URL, and the problems with dynamic URLs. I also explained how well-written titles and meta data can help you attract more visitors to your website.
The three components I will discuss in this article are text, information architecture, and the “canonical issues” surrounding duplicate content. Text, in this case, specifically means search engine friendly text. Text can appear in certain formats that search engine spiders cannot read. If a spider can’t read it, the search engine can’t index it.
Your site’s information architecture – how content is organized – will affect its readability to human visitors. It may also have an effect on your standing in the search engine results pages (SERPs). Finally, the issue of duplicate content, if not properly addressed, can hurt the standing of some of your website’s pages in the SERPs.
Now that I’ve explained each of these points briefly, it’s time to give fuller descriptions, and explain how to tackle each in turn.
You probably assume that search engine spiders can access all of the text on your pages, but can they? There are certain text styles that stop spiders cold. If you have any of the following on your website, you need to do a little rethinking.
- Text embedded in a Java application or Macromedia Flash file.
- Text within an image file, such as a JPG, GIF, or PNG.
- Text that can be accessed only via a form submission or other on-page action.
With the technology getting better every day, this might change in the future, and spiders might learn how to read these styles of text. But for now, assume that any text in one of these formats cannot be read by spiders, and therefore is not getting indexed by the search engines. This means that search users are not finding that page.
Search engine spiders, by and large, read text in HTML format very well. If you want to rank well in the SERPs, you need to put your text in that format. If you have web pages on which you must use a format that the search engines won’t be able to index, there are ways you can make up for this. Try to use the right keywords and phrases in your headlines, title tags, URLs and image/file names on the page. Be careful, though, because it is possible to go overboard.
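As a sketch of that fallback, here is what keyword placement might look like on a page whose main content is locked inside a Flash file. The page title, file names, and keyword phrase are all hypothetical, chosen just to illustrate the idea:

```html
<!-- Hypothetical page: the phrase "handmade oak furniture" appears in
     the title tag, the headline, and the image file name and alt text,
     so spiders have something to index even though the Flash movie
     itself is unreadable to them. -->
<html>
<head>
  <title>Handmade Oak Furniture - Custom Tables and Chairs</title>
</head>
<body>
  <h1>Handmade Oak Furniture</h1>
  <!-- Text inside the Flash movie is invisible to spiders; the
       fallback content inside the object tag carries the keywords. -->
  <object data="handmade-oak-furniture-tour.swf"
          type="application/x-shockwave-flash">
    <p>Browse our handmade oak furniture: custom tables and chairs.</p>
  </object>
  <img src="oak-dining-table.jpg" alt="Handmade oak dining table">
</body>
</html>
```

Note that the fallback text describes what is actually in the Flash file; stuffing unrelated keywords here shades into the “going overboard” territory mentioned above.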
In particular, do not try to hide any text. Some site builders and SEOs do this with CSS tricks or the simple tactic of making the text the same color as the background. These days, that’s considered to be a “black hat” SEO tactic. Even if the search engines can’t detect it automatically, your competition can certainly spot it. They would be happy to report your site to the search engines for keyword spamming, and that can get you banned entirely.
Now you know what not to do. But you’re probably thinking that isn’t much help. How do you get search engines to see what you want them to? Well, you need to write the right kind of text on your page. And that brings us to our next topic.
Search engines examine the terms and phrases in a web page, and from that they can extract a great deal of information about the page – and, from a group of pages, about a site as a whole. Sure, they learn something from the frequency of certain terms, but that’s not the only thing they weigh. Writing well for search engines is part art, part science; there are books on the subject. Since search engines jealously guard their algorithms, SEOs can only make very educated guesses. In general, though, writing well for the search engines is very similar to writing well for your site visitors.
If you want to optimize your on-page text to score well in the search engine rankings, here are some rules to keep in mind:
- Make sure the main keyword/phrase for which you wish to rank well is featured prominently on your page. Don’t worry too much about measuring your keyword density; its importance is arguable at best. But the general frequency of the term can help your rankings.
- Keep all of the text on your page on-topic and of high quality. Yes, search engines really do look for high quality writing; they are capable of performing some pretty sophisticated analysis of the words on your web pages. It’s not just the artificial intelligences you have to please, but the real ones as well; the large search engines have teams of researchers who work on identifying and describing the common elements in high quality writing. You don’t have to be Shakespeare, of course – but keep in mind that good writing is one of those things that both the search engines and your visitors will appreciate.
- Structure your document so that it flows from broad to narrow topics. You’ll want to start with a description of the content, of course, so that both the spiders and your human visitors know what to expect. It makes the entire page more readable. There are situations in which this would not be an appropriate way to structure the page; in such cases, of course, you can disregard this advice.
- Try to keep the text of your document together. Many SEO experts say that it is better to use cascading style sheets rather than table layouts for this reason. CSS allows you to keep the text flow of the document together and prevent the text from being broken up by coding. You can achieve this with tables, too; just make sure that text sections (i.e. content, ads, navigation, and so forth) flow together inside one table or row. You should also avoid having too many “nested” tables that make for broken sentences and paragraphs.
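As a minimal sketch of the CSS approach (the IDs and measurements are illustrative, not a recommended template), the article text can come first and unbroken in the source order, with the navigation and ads positioned around it by the stylesheet:

```html
<!-- The article text is one uninterrupted block, first in the source;
     the sidebar is moved into place by CSS rather than table cells. -->
<html>
<head>
<style>
  #content { margin-right: 200px; }  /* main text, first in the source */
  #sidebar {                         /* nav and ads, positioned by CSS */
    position: absolute;
    top: 0;
    right: 0;
    width: 180px;
  }
</style>
</head>
<body>
  <div id="content">
    <p>The full article text flows here as one continuous block,
       with no table cells splitting sentences or paragraphs.</p>
  </div>
  <div id="sidebar">Navigation and ads here</div>
</body>
</html>
```

A table layout can approximate the same effect, as noted above, as long as the content stays together in one cell rather than being scattered across nested tables.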
There was a time when text layout and keyword usage in a document were very important, but that is no longer true. Do they still make a difference? Yes, to some degree; but there is no reason to obsess over keyword placement or text layout any longer.
The way your site’s links are set up can help you in the search engine rankings. Think about the sites you have visited. Which ones were easiest for you to use? What did they have in common? If you want your website to have an effective information architecture, you want to consider the factors that make it most usable to your human visitors. These features will help the search engine spiders find their way as well.
This is one reason it makes sense to create and use a sitemap on your website. You want your sitemap page linked to from every other page on the site. If your site is large, you at least want important high-level category pages and the home page to link to the sitemap. A web surfer looking at your sitemap should see links to all of your site’s internal pages; also, by the way you set up the sitemap, you can give a visitor a clear conceptual idea of your site’s structure.
If your site is very large (more than 100-150 pages), you may not want to link to every page on your sitemap. You can link to all of the category level pages instead, which in turn link to all of the pages in that category. In this way, no page on the site is more than two clicks away from the home page. If your site is exceptionally large, you may need to set it up so that no page is more than three clicks away from the home page.
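A sitemap page built on that category principle might be sketched as a nested list; the category and page names below are hypothetical:

```html
<!-- Hypothetical sitemap: categories at the top level, their pages
     nested beneath, mirroring the site's broad-to-narrow structure. -->
<ul>
  <li><a href="/widgets/">Widgets</a>
    <ul>
      <li><a href="/widgets/blue-widgets.html">Blue Widgets</a></li>
      <li><a href="/widgets/red-widgets.html">Red Widgets</a></li>
    </ul>
  </li>
  <li><a href="/gadgets/">Gadgets</a>
    <ul>
      <li><a href="/gadgets/pocket-gadgets.html">Pocket Gadgets</a></li>
    </ul>
  </li>
  <li><a href="/about/">About Us</a></li>
</ul>
```

For a very large site, the nested page lists would be dropped and only the category links kept, with each category page carrying its own list.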
Speaking of your site’s category structure, it should be set up so that it flows from broad topics to more narrow, specific ones. This tells the search engines that your site covers a topic in depth. It is much more likely, then, that they will consider your site to be highly relevant to your keywords and phrases.
Finally, let me address the issue of duplicate content. This comes up most often for larger, dynamic websites powered by databases. Search engines want to index unique content; when they find several pages that contain the same content, they will probably choose one as “canonical” to display in the search results, and quite possibly ignore the rest.
This could easily happen to your site if your content management system serves the same content at more than one URL through separate navigation paths. That reduces the chances of those pages ranking well in the SERPs. Multiple versions of the same content also dilute the value you get from anchor text and link weight, through both internal and external links to the page.
To solve this problem, you first need to find all your duplicate pages. Once you have that information, set up a 301 redirect for each of them, pointing to the “canonical” version of the appropriate content. By all means, don’t ignore your home page when you do this! Many sites have the same content on http://www.mywebsite.com, http://mywebsite.com and http://www.mywebsite.com/index.html. That damages the rankings for your own home page, a problem you can fix with a 301 redirect.
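On an Apache server, for example, those home page redirects might be handled in an .htaccess file roughly like this; mod_rewrite must be enabled, and mywebsite.com stands in for your own domain:

```apache
RewriteEngine On

# Send the bare domain to the www version with a permanent (301) redirect
RewriteCond %{HTTP_HOST} ^mywebsite\.com$ [NC]
RewriteRule ^(.*)$ http://www.mywebsite.com/$1 [R=301,L]

# Collapse /index.html onto the root URL, so both resolve to one address
RewriteRule ^index\.html$ http://www.mywebsite.com/ [R=301,L]
```

The same R=301 flag works for any other duplicate URLs you uncover; the key point is that the redirect is permanent, which tells the search engines to transfer the page’s standing to the canonical address.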