Majestic SEO is a link analysis tool with an index of 52 billion pages, a record of 350 billion unique URLs and 2.6 trillion mapping relationships (one URL pointing to another). It maintains its own web crawler, “MJ12,” and keeps its own database of the web, as the major search engines do. The project belongs to Majestic-12 Ltd and is run by volunteers, with the goal of establishing a quality search engine that can compete on the worldwide market. Majestic SEO is a side project of the Majestic 12 search engine, designed to raise funds for it. In this article we take a detailed look at the Majestic SEO tool and its use in search engine optimization.
Technology
Majestic 12 adopted distributed computing, the concept used by SETI@home and distributed.net. The idea is to have private computers (like yours) each work on part of a task and send the results back to a central source for analysis. If you want to participate in Majestic 12's project, you can download MajesticNOD, which runs a web crawler on your computer whenever it is idle. As the crawler gathers pages, it sends them to the MJ12 servers for indexing. Because the crawler runs only while the computer is idle, it should not noticeably affect performance or Internet speed.
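To make the node's job more concrete, here is a minimal Python sketch of what one crawl step might produce for a single page: the outbound links with their anchor text, bundled into a record ready to send back. The names LinkExtractor and crawl_result and the layout of the record are assumptions made for illustration. They are not taken from Majestic 12's client, which also handles fetching, scheduling and uploading.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects (target URL, anchor text) pairs from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []      # finished (url, anchor_text) pairs
        self._href = None    # absolute URL of the <a> tag currently open
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the anchor text.
            anchor = " ".join("".join(self._text).split())
            self.links.append((self._href, anchor))
            self._href = None


def crawl_result(url, html):
    """Build the (hypothetical) record a node would send back for one page."""
    parser = LinkExtractor(url)
    parser.feed(html)
    parser.close()
    return {"url": url, "bytes": len(html.encode("utf-8")), "links": parser.links}
```

Calling crawl_result with a page URL and its HTML returns a dictionary holding the page size and a list of (URL, anchor text) pairs, which is the raw material the link analysis works from.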
Once crawled pages reach the server, they undergo indexing and link and anchor text analysis, in a manner similar to large commercial search engines like Google, Yahoo and MSN. Once the content is indexed (converted into numeric form), the information is merged into one large index that can be searched by keyword.
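The indexing and merging steps can be illustrated with a small Python sketch. It assumes each crawled link arrives as a (source URL, target URL, anchor text) triple, indexes each batch separately, merges the batches into one master index, and answers keyword queries from anchor text alone. The function names and the ranking rule (most distinct linking URLs first) are illustrative assumptions, not a description of how Majestic 12 or any commercial engine ranks results.

```python
from collections import defaultdict


def index_batch(links):
    """Index one batch of (source_url, target_url, anchor_text) records."""
    backlinks = defaultdict(set)   # target URL -> URLs that link to it
    keywords = defaultdict(set)    # anchor-text word -> target URLs
    for source, target, anchor in links:
        backlinks[target].add(source)
        for word in anchor.lower().split():
            keywords[word].add(target)
    return backlinks, keywords


def merge_into(master, batch):
    """Fold one batch's mapping into the master mapping, in place."""
    for key, values in batch.items():
        master[key] |= values


def search(keywords, backlinks, query):
    """Return URLs whose inbound anchor text contains every query word,
    ordered by number of distinct linking URLs (most linked first)."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(keywords.get(w, set()) for w in words))
    return sorted(hits, key=lambda url: (-len(backlinks[url]), url))
```

In use, each batch returned by index_batch is folded into two master dictionaries with merge_into, after which search can be run against the combined index. Keeping a backlink map alongside the keyword map is what lets the same index answer both "who links to this URL" and "which URLs are described by this word".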
Crawl Champions
On the Stats page you can find the crawl leaders, some of whom have crawled over 11,000,000,000 URLs, amounting to over 272,000,000 MB of data. You can also join crawl teams and compete with others for the title of best crawling team.
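For a sense of scale, the two figures above can be turned into a rough average download size per URL. The short Python sketch below assumes decimal units (1 MB = 1,000 KB) and treats both figures as exact, although the Stats page gives them only as lower bounds, so the result is an estimate.

```python
URLS_CRAWLED = 11_000_000_000   # "over 11,000,000,000 URLs"
DATA_MB = 272_000_000           # "over 272,000,000 MB of data"


def average_kb_per_url(total_mb, url_count, kb_per_mb=1000):
    """Average download size per URL in KB (decimal units assumed)."""
    return total_mb * kb_per_mb / url_count


# Roughly 24.7 KB per URL, from about 272 TB of crawled data.
print(round(average_kb_per_url(DATA_MB, URLS_CRAWLED), 1))
```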
So why is Majestic doing this? It can be presented in terms of a problem and a solution.
The Problem
The problem with large commercial search engines is that they don’t share their data with anyone, least of all with people interested in link graphs and algorithms. The reasoning is clear: that data is a competitive advantage, one on which Google alone spends hundreds of millions of dollars.
Many commands, such as “link:”, “inanchor:” and “intitle:”, return purposefully distorted results, to confuse search engine optimizers and reduce manipulation of search results. Without this data, it’s hard to gather competitive SEO intelligence and to develop counter-strategies.
The Solution
As the Majestic 12 search engine project grew, its founders saw early on that the index had value for search engine optimization. Hence Majestic SEO was created. All proceeds from Majestic SEO go to the Majestic 12 community fund, and the founders are also looking to strike deals with leading search engine optimization companies for SEO use of the index.