I, Robots.txt - Commenting and Loosely Supported Extensions
There are several directives you can use that may or may not be supported by the different search engines. They are listed below:
Crawl-delay
If you want to set the number of seconds a bot should wait between successive requests to the same server, you can do so by using Crawl-delay. Here it is in action:
User-agent: *
Crawl-delay: 60
Or
User-agent: FatBot
Crawl-delay: 120
The first example asks all bots to wait 60 seconds between requests. The second one asks FatBot to wait two minutes before making another request. Keep in mind that only bots that support Crawl-delay will honor it.
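If you are curious how this looks from the bot's side, here is a minimal sketch using Python's standard urllib.robotparser module (crawl_delay() needs Python 3.6 or later). Merging the two examples into one file, and the bot name OtherBot, are assumptions made purely for illustration:

```python
from urllib.robotparser import RobotFileParser

# Both Crawl-delay examples combined into one file (an assumption for this sketch).
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 60

User-agent: FatBot
Crawl-delay: 120
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# FatBot has its own record, so it gets its own delay.
fatbot_delay = parser.crawl_delay("FatBot")
# "OtherBot" is a made-up name; it falls back to the * record.
otherbot_delay = parser.crawl_delay("OtherBot")

print(fatbot_delay, otherbot_delay)
```

A polite bot would then sleep for that many seconds between requests to the same server.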
Using Sitemaps Auto-Discovery
This handy-dandy little guy allows you to tell the bot where your list of URLs is. You can add it anywhere in your file, like so:
Sitemap: http://www.sample.com/sitemap.xml
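To see the "anywhere in your file" part in practice, here is a small sketch that reads the Sitemap line back out with Python's standard urllib.robotparser module (site_maps() needs Python 3.8 or later). The sample URL is written with ordinary dots, and the surrounding Disallow record is only there as filler for illustration:

```python
from urllib.robotparser import RobotFileParser

# The Sitemap line is deliberately placed last to show that position does not matter.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/

Sitemap: http://www.sample.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns a list of sitemap URLs, or None if the file lists none.
sitemaps = parser.site_maps()
print(sitemaps)
```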
Allow
Allow is a nifty directive that lets you specify that a bot can look at certain files within a disallowed directory. Let's say that you have disallowed an image directory, but you decide later on that there is one file in that directory you would like to have indexed. Instead of having to block every other file in the directory one by one, you can simply do this:
User-agent: *
Disallow: /images/
Allow: /images/mefeedingorphans.jpg
Now your /images/ directory stays off limits to all agents, except for the file(s) you explicitly tell them they can look at.
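How a bot settles the clash between Disallow and Allow depends on its parser. Here is a minimal sketch, assuming the bot picks the most specific (longest) matching rule, which is the behavior Google documents; a parser that stops at the first matching rule would still block the file unless the Allow line comes first. The is_allowed helper and the rules list are hypothetical names for illustration, and wildcards are not handled:

```python
def is_allowed(rules, path):
    """Return True if `path` may be fetched under longest-match precedence.

    `rules` is a list of (directive, prefix) pairs for one user-agent record,
    e.g. ("Disallow", "/images/").
    """
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for directive, prefix in rules:
        # An empty prefix matches nothing; skip rules that do not apply.
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        # The longer prefix wins; on a tie, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# The record from the example above.
rules = [
    ("Disallow", "/images/"),
    ("Allow", "/images/mefeedingorphans.jpg"),
]

print(is_allowed(rules, "/images/mefeedingorphans.jpg"))
print(is_allowed(rules, "/images/other.jpg"))  # other.jpg is a made-up file name
```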
Commenting
You can leave comments in your robots.txt file by preceding them with a pound (#) symbol, like so:
# Here is a comment
User-agent: * # all bots should follow the disallow
Disallow: /images/ # no bot should access the images directory
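For the curious, here is a rough sketch of how a parser might throw those comments away before reading the directives. The strip_comment helper is a hypothetical name, and the sketch assumes everything from the first # to the end of the line is a comment:

```python
def strip_comment(line):
    """Drop everything from the first # onward and trim surrounding whitespace."""
    return line.split("#", 1)[0].strip()


lines = [
    "# Here is a comment",
    "User-agent: * # all bots should follow the disallow",
    "Disallow: /images/ # no bot should access the images directory",
]

# Keep only the lines that still have content once the comment is gone.
directives = [text for text in (strip_comment(line) for line in lines) if text]
print(directives)
```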
Conclusion
Well, that's it for this article. There are still more topics around robots.txt to discuss, like the robots meta tag, and more issues to speak of, like NoFollow and ACAP, all of which we will cover in a future article.
Till then...