I, Robots.txt
By: Jamesp
    2008-03-26

    Table of Contents:
  • I, Robots.txt
  • A Few Pointers
  • Creating Your First Robots.txt
  • Limiting Bots
  • Commenting and Loosely Supported Extensions



    I, Robots.txt - Commenting and Loosely Supported Extensions


    (Page 5 of 5)

    Several directives are extensions to the original robots.txt standard, so any given search engine may or may not support them. They are listed below:

    Crawl-delay

    If you want to set the number of seconds a bot should wait between successive requests to the same server, you can do so by using Crawl-delay. Here it is in action:


    User-agent: *
    Crawl-delay: 60

    Or

    User-agent: FatBot
    Crawl-delay: 120

    The first example asks all bots to wait 60 seconds between requests. The second asks FatBot to wait two minutes between requests. Bots that do not support Crawl-delay will simply ignore it.
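
    If you write your own crawler, here is a minimal Python sketch of how a bot might read these values, assuming the two records above are combined into one file. It uses the standard library's urllib.robotparser module; the agent name SomeOtherBot is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two Crawl-delay records from above, combined into one file.
# There is no blank line inside a record; blank lines separate records.
ROBOTS_TXT = """\
User-agent: FatBot
Crawl-delay: 120

User-agent: *
Crawl-delay: 60
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# FatBot has its own record; any other bot falls back to the * record.
fatbot_delay = parser.crawl_delay("FatBot")
other_delay = parser.crawl_delay("SomeOtherBot")  # hypothetical agent name

print(fatbot_delay, other_delay)
```

    A polite crawler would then sleep for that many seconds between requests to the same server.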

    Using Sitemaps Auto-Discovery

    This handy dandy little guy allows you to tell the bot where your Sitemap (your list of URLs) is. You can add it anywhere in your file, like so:

    Sitemap: http://www.sample.com/sitemap.xml
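
    To see the directive from the bot's side, here is a small Python sketch that pulls the Sitemap URL out of a robots.txt file. It assumes Python 3.8 or later, where urllib.robotparser provides site_maps(); the Disallow record is only there to show that the Sitemap line sits outside any User-agent record.

```python
from urllib.robotparser import RobotFileParser

# A sample file: one ordinary record plus the Sitemap line from above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/

Sitemap: http://www.sample.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns a list of Sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```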

    Allow

    Allow is a nifty directive that lets you specify that a bot can look at certain files within a disallowed directory. Let's say that you have disallowed an image directory, but you later decide that there is one file in that directory you would like to have indexed. Instead of having to disallow every other file in the directory one by one, you can simply do this:


    User-agent: *
    Disallow: /images/
    Allow: /images/mefeedingorphans.jpg

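    Now agents that support Allow will still stay out of the rest of your /images/ directory, but they will be able to look at the file(s) you explicitly allow. Some parsers apply the first rule that matches rather than the most specific one, so listing the Allow line before the Disallow line is the safer order.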
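
    Here is a minimal Python sketch of how a bot might check these rules with the standard library's urllib.robotparser. The Allow line is placed first on purpose, so the result is the same whether a parser uses the first matching rule or the most specific one; the agent name AnyBot and the file other.jpg are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Same rules as above, with Allow listed before Disallow.
ROBOTS_TXT = """\
User-agent: *
Allow: /images/mefeedingorphans.jpg
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The one allowed file inside the disallowed directory.
allowed_file = parser.can_fetch("AnyBot", "/images/mefeedingorphans.jpg")
# Any other file in /images/ (hypothetical name) stays blocked.
blocked_file = parser.can_fetch("AnyBot", "/images/other.jpg")
# Paths outside /images/ are not affected by these rules.
home_page = parser.can_fetch("AnyBot", "/index.html")

print(allowed_file, blocked_file, home_page)
```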

    Commenting

    You can leave comments in your robots.txt files by preceding them with a pound (#) symbol, like so:


    # Here is a comment
    User-agent: * # all bots should follow the disallow
    Disallow: /images/ # no bot should access the images directory
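
    To illustrate what a bot does with those comments, here is a rough Python sketch of comment stripping. The helper strip_comment is hypothetical and not taken from any particular crawler; it simply throws away everything from the first # to the end of the line.

```python
def strip_comment(line: str) -> str:
    """Drop everything from the first '#' to the end of the line."""
    return line.split("#", 1)[0].strip()


# The commented file from above.
ROBOTS_TXT = """\
# Here is a comment
User-agent: * # all bots should follow the disallow
Disallow: /images/ # no bot should access the images directory
"""

# Keep only the lines that still have content once comments are removed.
directives = [s for s in map(strip_comment, ROBOTS_TXT.splitlines()) if s]
print(directives)
```

    Only the two directives survive; the comments are there for human readers.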

    Conclusion

    Well, that's it for this article. There are still more features related to robots.txt to discuss, like the Robots meta tag, and more issues to speak of, like NoFollow and ACAP, all of which we will cover in a future article.

    Till then...







    © 2003-2009 by Developer Shed. All rights reserved.