Duplicate content is one of the leading causes of search engine penalties. There are two kinds of duplicate content: internal and external. Internal duplicate content is the easier of the two to fix. If one or more of your web pages is identical to others on your site, either rewrite the page or reconsider whether your site needs it at all. I’ll cover external duplicate content in more detail when I discuss content theft. It’s up to you to make sure your site is cleanly organized and your content is original and not duplicated anywhere. It also pays to search the web regularly to confirm that your content remains your own and that no one is stealing it.
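One way to catch internal duplicates is to fingerprint the text of each page and look for fingerprints that repeat. Here is a minimal sketch in Python, assuming you have already extracted each page's body text into a dictionary keyed by URL. The function names and the normalization rules (lowercasing, collapsing whitespace) are my own illustrative choices, not something any search engine prescribes.

```python
from __future__ import annotations

import hashlib
import re
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_duplicate_pages(pages: dict[str, str]) -> list[list[str]]:
    """Group page URLs whose normalized text is identical.

    `pages` maps a URL (or path) to that page's extracted body text.
    This is a hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Keep only fingerprints shared by more than one page.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

Any group this returns is a set of pages to rewrite or merge. Note that it only catches pages that are identical after normalization; pages that differ by even a sentence will not be flagged.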
Content relevancy is becoming more and more important to search engines. For example, if you run an online dating site, search engines do not expect you to have content about mortgages. Search engines often employ humans to spot-check their results. If your content is not relevant to your topic, it will count against you not only with the search engines but also with your visitors, who came to your site expecting one thing and wound up finding something else.
How old is your content? If you want to gain and keep a good position on the search engine results pages, you need the search engines to index your site frequently. The most reliable way to do that is to update your site regularly with fresh, new content. Search engines keep track of how often a site updates its content, and if you rarely add anything new, they will send their spiders to crawl your site less often.
Content theft is a big problem, and it can lead to duplicate content issues. Typically, a thief sends a “scraper bot” to steal other people’s content and then builds a site made up entirely of scraped material. The thief then puts up lots of AdSense ads and makes money from the site that way. Google cannot always tell who had the content first, so it may well penalize both of you for duplicate content. When we find that our content has been stolen, we go through a number of steps to get it removed. We start by emailing the perpetrators directly, then look up the site’s WHOIS record and contact their ISP. If they have forums, we also post a cease-and-desist notice there. Remember, your intellectual property is yours, and it needs to be protected.
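Scraped copies are rarely byte-for-byte identical, so when you suspect a page has lifted your content it helps to measure how much of the wording overlaps. The sketch below compares two texts by their overlapping five-word sequences ("shingles") and reports the Jaccard similarity, a score from 0.0 to 1.0. The shingle size and function names are illustrative assumptions on my part, and this is not a description of how any search engine detects duplicates.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping `size`-word sequences in `text`."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Too short to shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a: str, b: str, size: int = 5) -> float:
    """Shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

A high score between your article and a page on someone else's site is a reasonable signal that it deserves a closer look before you start sending emails. What counts as "high" is a judgment call you will need to tune against your own content.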
Finally, consider the length of your content. Search engines will process only so much information per page. To keep all of your content indexed, split long material into smaller pieces across several HTML pages so that both the search engines and your visitors have an easier time digesting it.
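Splitting a long article can be automated. The sketch below groups paragraphs into pages under a word budget, breaking only between paragraphs so nothing is cut mid-thought. The 500-word default is purely an assumption for illustration; I am not aware of a published per-page limit, so pick a figure that suits your own content.

```python
def split_into_pages(paragraphs, max_words=500):
    """Group paragraphs into pages of at most `max_words` words each.

    A single paragraph longer than `max_words` gets a page of its own
    rather than being cut mid-paragraph. `max_words` is an assumed,
    tunable budget, not a known search engine limit.
    """
    pages, current, count = [], [], 0
    for para in paragraphs:
        n = len(para.split())
        # Start a new page if adding this paragraph would exceed the budget.
        if current and count + n > max_words:
            pages.append(current)
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        pages.append(current)
    return pages
```

Each inner list is the set of paragraphs for one HTML page; you would still need to render them and link the pages together with your own templates.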