Website Submission

Polite Bots
By: Akinola Akintomide
    2007-05-15

    Table of Contents:
  • Polite Bots
  • The Basics
  • Meta Tags and Content Values
  • Is There Any Need to Trigger Google's Bot?



    If you've ever wondered how to get a little better control over what parts of your web site get crawled by the search engines, how they crawl your pages, and how to encourage them to visit, keep reading. This article will explain the various protocols that the search engine robots (particularly Google's) follow. It will also touch upon ways to help you guard against scraper bots.

    Polite Bots

    Quite a few articles have already served as robots.txt primers, and all of them explain the basics of the robots exclusion protocol. Recently, while working on removing some pages from Google's archives, I browsed through Google's Webmaster Central Blog over at blogspot and saw some posts by Dan Crow and Vanessa Fox. These posts explained in detail how Googlebot works.

    In addition to documenting the robots exclusion protocol in detail, Google now offers tools that allow the removal of cached pages through the Webmaster Dashboard. We will cover those only briefly in this piece, since I go into detail about them in a different article. This article looks at how the robots exclusion protocol applies to Googlebot specifically, quoting Dan Crow, Google product manager. Googlebot is incredibly polite when it indexes pages; we will compare its behavior to that of some malicious scraper bots.

    Googlebot has its quirks, as all bots do. We will look at a few of them before we discuss the basics of search engine bots. For example, if your web site is down temporarily and you want Googlebot to come back later, you can return an HTTP 503 (Service Unavailable) status code to tell the bot (and your users) that the site is temporarily unavailable. Without that status code, Googlebot will probably index your "this website is down for maintenance" page. You can get more information on the HTTP 503 status code at askapache.com.
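    To make the 503 idea concrete, here is a minimal sketch of a stand-in maintenance server written with Python's standard library. It is only an illustration, not how any particular site does it: the handler name, the port, and the one-hour Retry-After value are assumptions chosen for the example. Retry-After is a standard HTTP header that may accompany a 503 to hint at when a client could try again; how any given crawler treats it is up to that crawler.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed length of the maintenance window, in seconds (illustrative value).
RETRY_AFTER_SECONDS = 3600


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answers every request with 503 so the maintenance page is not
    treated as the site's real content."""

    def _send_unavailable(self, include_body):
        body = b"<html><body><h1>Down for maintenance</h1></body></html>"
        self.send_response(503)  # Service Unavailable
        self.send_header("Retry-After", str(RETRY_AFTER_SECONDS))
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if include_body:
            self.wfile.write(body)

    def do_GET(self):
        self._send_unavailable(include_body=True)

    def do_HEAD(self):
        self._send_unavailable(include_body=False)

    def log_message(self, format, *args):
        pass  # keep the example quiet; remove this to see request logs


def make_server(port=8080):
    """Build the maintenance server on localhost (the port is an assumption)."""
    return HTTPServer(("127.0.0.1", port), MaintenanceHandler)


if __name__ == "__main__":
    make_server().serve_forever()
```

    Visitors still see a friendly maintenance message in the response body, while the status code tells bots that the condition is temporary.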

    Also note that if Googlebot is crawling your site too frequently (and hence eating up your bandwidth), you can contact Google Support; they should work with you to ensure that the bot doesn't overload your servers. According to Vanessa Fox, a tool that lets you adjust Googlebot's crawl rate on your site will probably become available.

    Googlebot is Google's primary agent for crawling and indexing pages on the web. The web is incredibly large, truly living up to the name World Wide Web; as Dan Crow puts it, it's "really, really big." Not everyone on the public web wants particular pages crawled, though. Some pages contain client information or inflammatory material. Some site owners don't mind the crawling but, for whatever reason, don't want their pages cached in Google's database.
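    As a rough illustration of keeping particular pages away from crawlers, the sketch below feeds a made-up robots.txt to Python's standard urllib.robotparser module and asks which paths a given bot may fetch. The paths (/clients/, /drafts/) and the bot name "OtherBot" are hypothetical and exist only for this example. Keep in mind that robots.txt is a request that polite bots honor; a malicious scraper can simply ignore it. It also governs crawling only, not whether an already crawled page is cached.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot is kept out of /clients/ only,
# while every other bot is kept out of /clients/ and /drafts/.
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /clients/

User-agent: *
Disallow: /clients/
Disallow: /drafts/
"""


def build_parser(robots_txt):
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

if __name__ == "__main__":
    checks = [
        ("Googlebot", "/clients/acme.html"),
        ("Googlebot", "/drafts/post.html"),
        ("OtherBot", "/drafts/post.html"),
        ("OtherBot", "/index.html"),
    ]
    for agent, path in checks:
        verdict = "allowed" if parser.can_fetch(agent, path) else "blocked"
        print(f"{agent:10} {path:22} {verdict}")
```

    Note that a bot with its own User-agent group follows that group alone, which is why the hypothetical Googlebot rules above leave /drafts/ open to it.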

    © 2003-2008 by Developer Shed. All rights reserved.