Learning to Crawl: an Investigation of the Personal Web Crawler - Examples
YaCy
YaCy is probably the best-known personal web crawler. A single YaCy installation can index and store more than 10 million documents, and in a multi-peer configuration there is no upper limit to its capacity. Among its claimed advantages are:
The ability to locate information that other search portals hide.
The ability to share indexes and create distributed search networks in a community of independent YaCy users.
The ability to search within different file formats and different types of media, including common audio and video file types.
A peer-to-peer web index exchange interface with no central servers, which means that searches are anonymous and no central search logs are kept.
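To make the idea of a server-less, shared index more concrete, here is a minimal toy sketch of an inverted index split across peers by hashing each search term. It is an illustration only, not YaCy's actual protocol or data format; the peer names and the functions peer_for, index_document and search are all hypothetical.

```python
import hashlib
from collections import defaultdict

# Hypothetical peer names; a real network would discover peers dynamically.
PEERS = ["peer-a", "peer-b", "peer-c"]


def peer_for(term: str) -> str:
    """Pick the peer responsible for a term by hashing the term."""
    digest = hashlib.sha256(term.lower().encode("utf-8")).digest()
    return PEERS[int.from_bytes(digest[:4], "big") % len(PEERS)]


# Each peer holds only its own slice of the index: term -> set of URLs.
peer_indexes = {peer: defaultdict(set) for peer in PEERS}


def index_document(url: str, text: str) -> None:
    """Send each distinct term of a document to the peer that owns it."""
    for term in set(text.lower().split()):
        peer_indexes[peer_for(term)][term].add(url)


def search(term: str) -> set:
    """Ask only the owning peer, so no single node sees every query."""
    owner = peer_indexes[peer_for(term)]
    return set(owner.get(term.lower(), set()))
```

In this sketch every peer stores only part of the index and a query goes straight to the peer that owns the term, which is one way a network can avoid keeping a central log of searches.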
For more information on YaCy see:
http://webscripts.softpedia.com/script/Search-Engines/YaCy-45386.html
Subject Search Spider
Subject Search Spider (SSS) from Kryloff Technologies is a commercial personal web crawler designed to save time and increase productivity by automating much of the search process. Among other things, it claims the ability to:
Communicate with an almost unlimited number of search portals.
Visit, scan and quote web pages, storing the content in libraries for later use.
Identify material in which the search terms have been altered or misspelled.
Create browser-viewable reports of its findings.
Support queries in multiple languages.
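One of the capabilities listed above, spotting altered or misspelled search terms, can be approximated with approximate string matching. The sketch below uses Python's standard difflib module; it is not how SSS itself works (the article does not say), and the function name find_variants and the 0.8 similarity cutoff are assumptions chosen for illustration.

```python
import difflib


def find_variants(search_term, page_text, cutoff=0.8):
    """Return the words in page_text that closely resemble search_term."""
    # Normalize: split on whitespace, strip common punctuation, lowercase.
    words = {w.strip(".,;:!?\"'()").lower() for w in page_text.split()}
    words.discard("")
    # get_close_matches keeps candidates whose similarity ratio >= cutoff.
    matches = difflib.get_close_matches(
        search_term.lower(), sorted(words), n=10, cutoff=cutoff
    )
    return sorted(matches)
```

Calling find_variants("crawler", page_text) would return both exact occurrences and near misses such as a dropped letter; lowering the cutoff finds more variants at the cost of more false positives.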
One of the great benefits of SSS is its customizability. The ability to create precisely targeted searches makes it, according to its developers, “the only true personal meta-search engine that is fully configurable by the final-end user.” It is also designed to use disk space efficiently: rather than storing entire documents locally, SSS retains only the information needed to judge a document's relevance, and it links to the full document in its original location when more extensive viewing is required.
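The space-saving approach described here, keeping a small record plus a link instead of the whole page, can be sketched as follows. This is a hypothetical illustration, not the SSS storage format: the PageSummary fields, the summarize function and the 40-character context window are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class PageSummary:
    url: str      # link back to the full document in its original location
    title: str
    snippet: str  # short excerpt around the first matching term
    hits: int     # total occurrences of the search terms


def summarize(url, title, text, terms, context=40):
    """Reduce a full page to a compact record for judging relevance."""
    lowered = text.lower()
    terms = [t.lower() for t in terms if t]  # ignore empty terms
    hits = sum(lowered.count(t) for t in terms)
    positions = [p for p in (lowered.find(t) for t in terms) if p >= 0]
    if positions:
        first = min(positions)
        start = max(0, first - context)
        # Keep only a small window of text around the first match.
        snippet = text[start:first + context].strip()
    else:
        snippet = ""
    return PageSummary(url=url, title=title, snippet=snippet, hits=hits)
```

However long the page is, the stored record stays at roughly the size of the snippet plus the URL and title, and the URL lets the reader open the original when the excerpt is not enough.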
For more information on Subject Search Spider see:
http://www.kryltech.com/spider.htm
More By Bruce Coker