Index of 30 Billion Pages for Link Analysis Data from SEO Moz
I am very hyped about a new link analysis tool from SEO Moz – Linkscape.
Announced October 6th, 2008, Linkscape is truly set to revolutionize competitive link intelligence and the search engine optimization game. It has an index of 30 billion pages and includes link metrics such as page rank, trust rank, anchor text analysis, embedded image links, links from the same IP C-block, external link data, internal link data, domain level rank, domain level trust and TONs of other sexy SEO metrics.
Reverse Engineering the Search Engines
The problem with current link analysis is that it’s very limited. Google’s "link:" command is distorted on purpose to keep Google’s index clean of spam and Yahoo’s Site Explorer does not show any link metrics.
SEO Moz set out to fix this. Using its own web crawler, Linkscape copies the pages it crawls into an index, where various algorithms analyze the collected data and compute metrics from it, just as the major search engines do.
Algorithms used in Linkscape are basically copies of those used by Google, Yahoo and MSN:
mozRank (Google Page Rank, Yahoo! WebRank, Live StaticRank)
mozTrust (Google Trust Rank)
External Domain Juice
I give a more detailed explanation of each attribute in the next section.
Currently Linkscape has an index of 30 billion web pages. Rand explains that the company is focused on getting every domain possible as opposed to crawling every URL on the web. Over time, he says, "we hope to do both."
Rand also shared that, according to their estimates, their index size is around 1/3 to 1/5 that of major search engines, which is to be expected. "Fortunately, it appears that nearly universally, the SEO Moz index contains the more important, well-linked-to pages and sites, so the missing portions in a comparison are unlikely to be popular, valuable resources."
mozRank is, as stated above, a copy of Google’s PageRank, Yahoo! WebRank and Live’s StaticRank. It computes the link popularity of pages throughout the web and makes that data available for everyone to see!
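SEO Moz has not published the formula behind this metric, so here is only a minimal sketch of textbook PageRank by power iteration, to show what "link popularity" means in practice. The damping factor of 0.85, the function name and the four-page toy web are all my assumptions for illustration, not anything taken from Linkscape.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web of four pages.
toy_links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}
scores = pagerank(toy_links)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy web "home" comes out on top because every other page links to it, while "orphan" finishes last because nothing links to it.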
mozRank counts link juice from internal and external links equally, but it’s also important to know how much link juice flows in from external websites alone. This is what "external mozRank" measures.
Domain Juice (DJ)
Domain Juice is the sum of mozRanks for all URLs in a domain. It uses a 10-point scale. Domain Juice does not include external links in calculations.
External Domain Juice
This measure calculates the sum of all mozRanks from all external links flowing to a particular domain. External Domain Juice measures ranking power for an entire domain as opposed to separate web pages.
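The two domain-level sums can be sketched roughly as follows. This is only an illustration under my own assumptions: the hostname stands in for the domain, the per-URL scores and link tuples are made-up numbers, and the raw sums are left unscaled because the post does not say how Linkscape maps them onto its 10-point scale.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def domain_of(url):
    """Return the hostname of a URL, used here as a stand-in for its domain."""
    return urlsplit(url).hostname


def domain_juice(url_scores):
    """Sum per-URL scores for each domain."""
    totals = defaultdict(float)
    for url, score in url_scores.items():
        totals[domain_of(url)] += score
    return dict(totals)


def external_domain_juice(links):
    """Sum the score passed by links whose source is on another domain.

    links is a list of (source_url, target_url, passed_score) tuples.
    """
    totals = defaultdict(float)
    for source, target, passed in links:
        if domain_of(source) != domain_of(target):
            totals[domain_of(target)] += passed
    return dict(totals)


# Hypothetical per-URL scores and links.
url_scores = {
    "http://example.com/": 0.5,
    "http://example.com/a": 0.25,
    "http://other.org/": 0.125,
}
links = [
    ("http://other.org/", "http://example.com/", 0.25),
    ("http://example.com/a", "http://example.com/", 0.5),   # internal, ignored
    ("http://third.net/x", "http://example.com/a", 0.125),
]
print(domain_juice(url_scores))
print(external_domain_juice(links))
```

The internal link from example.com to itself adds nothing to the external total, which is the whole point of keeping the two measures apart.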
mozTrust is a reverse-engineered version of Google’s Trust Rank algorithm. Google Trust Rank works on the principle of "seeds": humans manually identify trusted seed pages, and links from those seed pages pass "trust rank" to other pages. The further a page is from a seed, the less trust is passed.
It’s not clear if Linkscape uses humans to identify seed pages. There’s also a chance that a page considered a seed at SEO Moz may not be such in Google’s eyes.
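To make the seed idea concrete, here is a deliberately simplified sketch: trust starts at 1.0 on the seed pages and is cut by a fixed factor with every link hop. This is not how Google or Linkscape compute trust (neither is documented in this post); the decay of 0.5, the function name and the toy pages are assumptions chosen only to show trust shrinking with distance from a seed.

```python
from collections import deque


def propagate_trust(links, seeds, decay=0.5):
    """Assign trust that shrinks with link distance from the nearest seed."""
    trust = {seed: 1.0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in trust:
                # First visit in a breadth-first walk is the shortest path.
                trust[target] = trust[page] * decay
                queue.append(target)
    return trust


# Hypothetical toy web: one hand-picked seed and one untrusted page.
toy_links = {
    "seed.edu": ["a.com"],
    "a.com": ["b.com"],
    "b.com": ["c.com"],
    "spam.biz": ["c.com"],
}
trust = propagate_trust(toy_links, ["seed.edu"])
for page, value in trust.items():
    print(f"{page}: {value}")
```

Note that "spam.biz" gets no trust at all: it links out to a trusted page, but no chain of links leads from the seed to it.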
Domain-Level mozTrust (DmT)
This is exactly the same measurement as above, but on a domain level. This kind of algorithm is used by Google, since new pages that have 0 links of their own but sit on authoritative and trusted domains rank high in search results for relatively competitive terms of 3 to 5 words.
The URL report shows data for the top 2,500 URLs with links to the domain and 500 pages with links to a specific URL. The data represents the most important links, according to SEO Moz algorithms.
Anchor Text Distribution
This checks the anchor text of each of the links in a report.
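A rough idea of what an anchor text distribution looks like, assuming you already have a list of inbound links with their anchor text. The function name, the input shape and the sample links are hypothetical; Linkscape’s own report format is not described here.

```python
from collections import Counter


def anchor_text_distribution(links):
    """Return (anchor text, count, share) rows, most common first.

    links is a list of (source_url, anchor_text) pairs.
    """
    # Normalize case and whitespace so near-identical anchors are counted together.
    counts = Counter(anchor.strip().lower() for _, anchor in links)
    total = sum(counts.values())
    return [(anchor, count, count / total) for anchor, count in counts.most_common()]


# Hypothetical inbound links.
sample_links = [
    ("http://site-a.example/", "Local Search Optimization"),
    ("http://site-b.example/", "local search optimization "),
    ("http://site-c.example/", "click here"),
    ("http://site-d.example/", "SEO blog"),
]
rows = anchor_text_distribution(sample_links)
for anchor, count, share in rows:
    print(f"{anchor}: {count} links ({share:.0%})")
```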
Link & URL Attributes:
Here’s the link data offered with Linkscape:
Same IP Address
Same IP C-Block
All of this information is very, very yummy.
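For anyone unsure what "same IP C-block" means: two IPv4 addresses are in the same C-block when their first three octets match, i.e. they fall in the same /24 network. A small check using Python’s standard `ipaddress` module could look like this (the function name and sample addresses are mine, not Linkscape’s):

```python
import ipaddress


def same_c_block(ip_a, ip_b):
    """True when two IPv4 addresses share their first three octets (a /24)."""
    # strict=False lets us build the /24 network from any host address in it.
    net_a = ipaddress.ip_network(f"{ip_a}/24", strict=False)
    return ipaddress.ip_address(ip_b) in net_a


print(same_c_block("192.0.2.10", "192.0.2.200"))
print(same_c_block("192.0.2.10", "192.0.3.10"))
```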
You can take Linkscape for a test run by viewing sample reports for:
Samples are limited in data, but you can get a taste of what Linkscape provides. There are six tabs:
Dashboard | Link to URL | Link to Domain | URL Anchor Text | Domain Anchor Text | Custom Reports
Dashboard is the summary of all data. It shows the URL Data summary with metrics like mozRank, mozTrust, External Links and Internal Links. On the right side you’ll find Domain Data, which includes Domain mozRank, Domain mozTrust, Domain Juice and External Links.
Below on the dashboard page, there’s a snippet of the hottest links a site has, as well as some anchor text samples.
The "Link to URL" tab features links that point to the URL, the title of the page from which the link comes and the link’s anchor text. There are also SEO Moz measurements for each of the links, which are, as mentioned, deliberately reverse engineered from search engines.
Anchor Text: Local Search Optimization
mozRank: 5.11 | mozTrust: 0.44 | Passing: 21.63% of mR
mozRank: 3.67 | mozTrust: 2.46 | Domain Juice: 6.49
The Link to Domain tab features exactly the same information as the Link to URL tab, but this time the information focuses on the domain level, not separate pages. There’s also an indicator of whether the site uses 301 or 302 redirects.
The URL Anchor Text tab shows anchor text information gathered from links pointing to a website, in our case "www.martijnbeijk.com". There are also metrics such as the following: "Unique Links using that exact anchor text," "Unique Domains using that exact anchor text," and "mR Passed by URLs With This Anchor Text," which is essentially PageRank.
The Domain Anchor Text tab features the exact same measurements and information as the URL Anchor Text tab, but this time the information is for domain-level links.
Here’s a snippet:
The Custom Reports tab invites you to purchase custom Linkscape reports, starting at $10,000. Reports include:
Advanced Full Link Data
Full Site Data
There’s also API licensing starting at $5,000.
In the announcement, Rand shared some very interesting statistics he found with his new technology:
58% of all links on the web are internal and 42% are external.
1.83% of all links use nofollow. 61% of nofollow links are external and 39% are internal (in other words, they are partly used for page rank sculpting).
0.08% of the pages use 301 redirects and 0.12% use 302.
1.5% of all pages use the meta noindex tag.
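These numbers can be combined with a little arithmetic to see how common nofollow is among external links versus internal links. The sketch below assumes all of the percentages above describe the same set of links, which the announcement does not spell out; the variable names are mine.

```python
# Figures quoted in the list above, as fractions of all links.
external_links = 0.42
internal_links = 0.58
nofollow_links = 0.0183
nofollow_external_part = 0.61  # share of nofollow links that are external
nofollow_internal_part = 0.39  # share of nofollow links that are internal

# Share of ALL links that are nofollow and external / internal.
external_nofollow = nofollow_links * nofollow_external_part
internal_nofollow = nofollow_links * nofollow_internal_part

# Nofollow rate within each group of links.
external_rate = external_nofollow / external_links
internal_rate = internal_nofollow / internal_links

print(f"External links that are nofollow: {external_rate:.2%}")
print(f"Internal links that are nofollow: {internal_rate:.2%}")
```

Under that assumption, roughly 2.7% of external links and 1.2% of internal links carry nofollow, so an external link is a bit more than twice as likely to be nofollowed as an internal one.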
With 30 billion pages, SEO Moz has an immense, unfair advantage over other search engine optimization firms. It now has a close approximation of the information that search engines have. Rand can also run tests, explore theories, apply new algorithms and monitor all competitive link building strategies around the web. I have to say Rand got a nuclear bomb…
Here’s the cost structure:
Small Business SEOs
Monthly Billing – $129/month
Annual Billing – $1299/year
Big Business SEOs
Monthly Billing – $299/month
Annual Billing – $2999/year
On top of a monthly subscription to SEO Moz, you must buy credits to use Linkscape. I am not aware of the pricing at the moment.
SEO Moz attempted what no other company has attempted before – reverse engineering search engines in order to manipulate them better.
They are thinking big.
I can’t wait to finish writing this and get to analysis. This is a killer tool that all SEOs have been waiting for.
SEO Moz will continue pushing Linkscape aggressively, incorporating more metrics, data and algorithms.
I would not be surprised if Rand attempts to reverse engineer search results (if he has not already done so) and keeps tweaking them with various algorithms until they match the major search engines. I would definitely pay several hundred bucks for a service like that. If SEO Moz takes this further and builds search results, it will be able to provide estimates that answer questions like:
"If I get a link from this page, how will it affect my rankings?"
Simulations like this will definitely gain widespread appeal among SEOs who spend thousands on link buying.
Linkscape was a brilliant idea. SEO Moz will most likely get more competitors in the future, some cheaper, some better. I suspect many "low use" search engines that have their own indexes may see a great opportunity here and enter into competition with SEO Moz’s Linkscape.
I think October 6th, 2008, marked a new era for search engine optimization – reverse engineering search engines on a MASSIVE scale in order to optimize better. I think link analysis is the first step. We may see entire search results emulated in the future.
With Linkscape we can look under the hood and, instead of guessing, have solid estimates on what works and what doesn’t work. We can answer the question "Why" a lot better, and plan to execute the "How" using solid data.
There will definitely be more from SEO Moz and I suspect many competitors will emerge with similar offers.
Rand, you don’t know me, but I’m jealous.