Search Engine News

Google's Latest Moves in Information Indexing
By: Terri Wells
    2008-04-28

    Table of Contents:
  • Google's Latest Moves in Information Indexing
  • Webmasters Unprepared
  • The Second Search Box
  • Getting Your Timeline Straight


    (Page 1 of 4)

    Sometimes Google does something with very little fanfare that stirs considerable interest. In this article, I'm going to discuss several of its recent moves. If you're curious about the company's attempts to index more of the web or to make its indexing more useful for searchers, keep reading; you've come to the right place.

    SEOs have long known that HTML forms are potentially problematic. Any content that a user can reach only by filling out a form will trip up search engine spiders and remain unindexed. That's perfectly fine if that's what you want. Not all online content is for sharing. If your content is valuable enough that subscribers pay good money for it, as happens with certain medical and legal indexes, you may not want general search engines rooting around in it and turning it up free for the asking.

    Google wants to change that. In a recent post to the Google Webmaster Central Blog, the search engine revealed that "we have been exploring some HTML forms to try to discover new web pages and URLs that we otherwise couldn't find and index for users who search on Google." Google makes certain automated entries into the form, based in part on content from the site. The post continues: "If we ascertain that the web page resulting from our query is valid, interesting, and includes content not in our index, we may include it in our index much as we would include any other web page."
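
Google's post does not say how it picks entries "based in part on content from the site," so the following is only a rough sketch of one plausible approach: take the most frequent non-trivial words on a page and treat them as candidate values for a search box. The function name `candidate_terms`, the stopword list, and the three-letter minimum are all assumptions made for illustration, not anything Google has described.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; a real crawler would need a far larger one.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "from", "are", "our", "you", "your"}


def candidate_terms(page_text, count=5):
    """Return the most frequent words of three or more letters found in page_text."""
    words = re.findall(r"[a-z]{3,}", page_text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return [word for word, _ in counts.most_common(count)]


if __name__ == "__main__":
    sample = "Used cars for sale. Browse cars by make: Honda cars, Toyota trucks, Honda trucks."
    print(candidate_terms(sample, count=3))
```

Each returned term could then be submitted through the site's form, with the resulting page kept only if it turns out to contain content worth indexing.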

    Googlebot's new abilities stem from Google's purchase of Transformic in 2005. Transformic was working on exactly this problem. Anand Rajaraman, writing for Datawocky, mentioned working with one of Transformic's major researchers (Alon Halevy, who also made the recent Google blog post) back in 1995. He noted that Transformic was attempting to solve two problems with its technology. First, it needed to determine which web forms were worth crawling behind. Second, as Rajaraman asked, "If we decide to crawl behind a form, how do we fill in values in the form to get at the data behind it?" Check boxes and radio buttons were no big deal, but with "free-text inputs, the problem is quite challenging - we need to understand the semantics of the input box to guess possible valid inputs."
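
To see why inputs with a fixed set of choices are the easy case, here is a minimal sketch that reads the drop-down menus of a GET form and lists the URLs a crawler could request by trying each combination. It is not Transformic's or Google's method; the class `FormOptionParser`, the function `candidate_urls`, and the `limit` cap are hypothetical names chosen for this example, and free-text inputs are ignored entirely.

```python
from html.parser import HTMLParser
from itertools import product
from urllib.parse import urlencode, urljoin


class FormOptionParser(HTMLParser):
    """Collect the action URL and <select> options of the first GET form."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}       # field name -> list of non-empty option values
        self._in_form = False
        self._current = None   # name of the <select> currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            # Only GET forms are considered; their results have plain URLs.
            if (attrs.get("method") or "get").lower() == "get":
                self.action = attrs.get("action") or ""
                self._in_form = True
        elif self._in_form and tag == "select" and attrs.get("name"):
            self._current = attrs["name"]
            self.fields[self._current] = []
        elif self._in_form and tag == "option" and self._current:
            value = attrs.get("value")
            if value:  # skip placeholder options such as "Any"
                self.fields[self._current].append(value)

    def handle_endtag(self, tag):
        if tag == "select":
            self._current = None
        elif tag == "form":
            self._in_form = False


def candidate_urls(page_url, html, limit=50):
    """Return up to `limit` result URLs, one per combination of option values."""
    parser = FormOptionParser()
    parser.feed(html)
    if parser.action is None or not parser.fields:
        return []
    names = sorted(parser.fields)
    target = urljoin(page_url, parser.action)
    urls = []
    for combo in product(*(parser.fields[name] for name in names)):
        urls.append(target + "?" + urlencode(list(zip(names, combo))))
        if len(urls) >= limit:
            break
    return urls
```

The number of combinations grows multiplicatively with each menu, which is why the sketch caps its output; deciding which of those URLs are actually worth fetching is the harder first problem Rajaraman describes.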

    This latest move is Google's way of crawling what has often been referred to as the Hidden, Deep, or Invisible Web. Google insists that it will continue to respect robots.txt files. But the move is not without its problems, and a number of observers have expressed concerns. I'll be covering those issues in the next section.
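
For webmasters who want to keep form results out of a crawler's reach, the practical question is which URLs a given robots.txt actually blocks. The sketch below uses Python's standard `urllib.robotparser` module to filter a list of URLs against a set of rules; the helper name `allowed_urls` and the example.com addresses are made up for illustration, and the check shows how a well-behaved crawler would read the file, not how Googlebot is implemented.

```python
from urllib.robotparser import RobotFileParser


def allowed_urls(robots_txt, user_agent, urls):
    """Return the URLs that the given robots.txt text permits user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /search\n"
    urls = [
        "http://example.com/search?state=CA",
        "http://example.com/about",
    ]
    # Only the /about page is allowed under these rules.
    print(allowed_urls(rules, "Googlebot", urls))
```

Disallowing the path that a form submits to covers every query string generated from that form, since the rule matches on the start of the path.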

    © 2003-2010 by Developer Shed. All rights reserved.