Search Engine News

Learning to Crawl: an Investigation of the Personal Web Crawler
By: Bruce Coker
    2008-10-21

    Table of Contents:
  • Learning to Crawl: an Investigation of the Personal Web Crawler
  • A better way?
  • Examples
  • Copernic Agent


    Learning to Crawl: an Investigation of the Personal Web Crawler - Examples


    (Page 3 of 4)

    YaCy

    YaCy is probably the best-known personal web crawler, and a capable one. A single YaCy installation can index and store over 10 million documents, and in a multi-peer configuration there is no upper limit to its capacity. Among its claimed advantages are:

    • The ability to locate information that other search portals hide.

    • The ability to share indexes and create distributed search networks in a community of independent YaCy users.

    • The ability to search within different file formats and different types of media, including common audio and video file types.

    • A peer-to-peer web index exchange design with no central servers, which means that searches are anonymous and no central search logs are kept.
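To make the idea of a shared index with no central server concrete, here is a minimal sketch. It is not YaCy's implementation or API: the `Peer` class, the `search_network` function, and the example URLs are all hypothetical, and a real network would exchange index data between machines rather than query objects in one process.

```python
from collections import defaultdict


class Peer:
    """A hypothetical peer holding its own small inverted index."""

    def __init__(self, name):
        self.name = name
        self.index = defaultdict(set)  # term -> set of URLs

    def add_document(self, url, text):
        # Index each lowercase word of the document under its URL.
        for term in text.lower().split():
            self.index[term].add(url)

    def search_local(self, term):
        # Return a copy so callers cannot modify this peer's index.
        return set(self.index.get(term.lower(), set()))


def search_network(peers, term):
    """Ask every peer directly and merge the answers; no central server."""
    results = set()
    for peer in peers:
        results |= peer.search_local(term)
    return sorted(results)


alice = Peer("alice")
bob = Peer("bob")
alice.add_document("http://example.org/a", "personal web crawler basics")
bob.add_document("http://example.org/b", "distributed crawler networks")

print(search_network([alice, bob], "crawler"))
```

Because each query goes straight to the peers and the merged result exists only on the searcher's side, there is no single place where a log of all searches could accumulate.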

    For more information on YaCy see:

    http://webscripts.softpedia.com/script/Search-Engines/YaCy-45386.html


    Subject Search Spider

    Subject Search Spider (SSS) from Kryloff Technologies is a commercial personal web crawler designed to save time and increase productivity by automating much of the search process. Among other things, it claims the ability to:

    • Communicate with an almost unlimited number of search portals.

    • Visit, scan and quote web pages, storing the content in libraries for later use.

    • Identify material in which the search terms have been altered or misspelled.

    • Create browser-viewable reports of its findings.

    • Support queries in multiple languages.
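How SSS detects altered or misspelled search terms is not documented here, but the general idea can be illustrated with approximate string matching. The sketch below uses Python's standard `difflib` module; the `find_term_variants` function and its `cutoff` default are assumptions made for illustration, not anything taken from the product.

```python
import difflib
import re


def find_term_variants(term, text, cutoff=0.8):
    """Return words in `text` that equal or closely resemble `term`."""
    # Collect the distinct lowercase words on the page.
    words = sorted(set(re.findall(r"[a-z]+", text.lower())))
    # Keep up to 10 words whose similarity ratio is at least `cutoff`.
    return difflib.get_close_matches(term.lower(), words, n=10, cutoff=cutoff)


page = "A web crawlr visits pages, and a crawler indexes them."
print(find_term_variants("crawler", page))
```

Lowering `cutoff` catches more heavily mangled spellings at the cost of more false matches, so a value would need tuning against real pages.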

    One of the great benefits of SSS is its customizability. Its ability to create precisely targeted searches makes it, according to its developers, “the only true personal meta-search engine that is fully configurable by the final-end user.” It is also designed to use disk space efficiently: rather than storing entire documents locally, SSS retains only the information needed to judge a document's relevance, and it links to the full document in its original location if more extensive viewing is required.
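The space-saving approach described above, keeping just enough of a page to judge its relevance plus a link back to the original, might look something like the following. This is only a sketch under stated assumptions: the `PageSummary` record, the `summarize` function, and the `context` parameter are hypothetical and do not reflect how SSS stores its libraries.

```python
from dataclasses import dataclass


@dataclass
class PageSummary:
    url: str      # link back to the full document
    title: str
    snippet: str  # short excerpt around the first hit


def summarize(url, title, text, term, context=40):
    """Keep a short excerpt around the first hit instead of the full text."""
    pos = text.lower().find(term.lower())
    if pos == -1:
        return None  # term absent: nothing worth storing
    start = max(0, pos - context)
    end = min(len(text), pos + len(term) + context)
    return PageSummary(url=url, title=title, snippet=text[start:end].strip())


body = "a" * 100 + " crawler " + "b" * 100
summary = summarize("http://example.org/x", "Example", body, "crawler", context=10)
print(summary)
```

The stored record stays a few dozen characters long however large the page is, and the `url` field lets the reader open the full document when the excerpt is not enough.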

    For more information on Subject Search Spider see:

    http://www.kryltech.com/spider.htm

    © 2003-2009 by Developer Shed. All rights reserved.