Learning to Crawl: an Investigation of the Personal Web Crawler
After all this time, shouldn't searching for something on the Internet be an easier process? Sure, it has vastly improved over the years, but there is still a significant element of hit-and-miss involved. Read on to learn about a different approach to finding what you're looking for online.
Despite the Web having been part of our lives for well over a decade now, the fundamental task of searching it for information remains something of a lottery. Locating general information is easy, but finding something specific can present a significant challenge, even to the most experienced Internet researcher. Most searches still involve typing arcane expressions into a search box, applying quotes to limit the terms, and gradually refining the keywords in an effort to narrow the results down to those most relevant to the required material. Sometimes this approach is effective, but at other times it can seem like searching for a needle in a planet-sized haystack.
Over the years, a large number of search engines have attempted to improve the accuracy of the search experience. From Google, which revolutionized the process back in the early days, right up to the newest contenders such as Hakia, with its Query Reprocessing technology, and Searchmash, which provides segregated multimedia results, companies have tried to find new ways of more accurately locating and delivering the information their customers need. Many of them have helped make our lives easier.
But despite all that, every one of these search engines and mechanisms is based on the same basic technology: web spiders and crawlers of one kind or another that travel the Web, locating and indexing information that is then returned in response to the search queries of end users. The task faced by these crawlers is monumental. According to the latest Netcraft survey, the Web as of mid-2008 consists of over 175 million sites that together contain billions of pages.
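To make that crawl-index-query cycle concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular search engine works: the three pages, their `example.test` URLs, and the names `PAGES`, `crawl`, and `search` are all invented for this example, and the "Web" is a dictionary held in memory rather than anything fetched over the network.

```python
from collections import deque
from html.parser import HTMLParser
import re

# Hypothetical in-memory "Web": URL -> HTML. A real crawler would fetch
# these pages over HTTP; the URLs and contents are made up for illustration.
PAGES = {
    "http://example.test/": '<p>Welcome to the demo</p><a href="http://example.test/a">next</a>',
    "http://example.test/a": '<p>Spiders index pages</p><a href="http://example.test/b">more</a>',
    "http://example.test/b": '<p>Search queries return pages</p><a href="http://example.test/">home</a>',
}


class LinkAndTextParser(HTMLParser):
    """Collects outgoing links and lowercase words from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def crawl(start, pages):
    """Breadth-first crawl from `start`; returns an index of word -> set of URLs."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # unknown page: nothing to index
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        for word in parser.words:
            index.setdefault(word, set()).add(url)
        for link in parser.links:
            if link not in seen:  # visit each page only once
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """Returns the URLs that contain every word in `query`."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


index = crawl("http://example.test/", PAGES)
print(sorted(search(index, "pages")))
```

Even at this toy scale the division of labor is visible: the crawler follows links and records which words appear where, and a query is answered from that index rather than from the live pages. Scaling the same idea to billions of pages is where the difficulty lies.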
Between them, these pages cover a range of information too vast for a single individual to conceive of. So just how well can a multi-purpose search engine with its widely targeted crawlers be expected to meet the specialized requirements of its individual users? The answer, sadly, is probably not very well at all, as is suggested by the ongoing frustration expressed by typical Web users at the difficulty of finding exactly what they’re looking for.
By Bruce Coker