Statement of the Problem
As mentioned in the introduction, the trickiest thing to do is avoid duplicate content and successfully transfer the link juices you have earned from your Blogspot address to your new domain. Suppose your Blogspot front page URL is: http://webdevelopmentexperts.blogspot.com/ and has 200 backlinks pointing to it from other domains.
Note: Domain URLs used in this article are hypothetical examples only for illustration purposes.
Now let’s also say that one of your popular Blogspot post URLs is: http://webdevelopmentexperts.blogspot.com/2008/05/php-include-tips and it has 30 backlinks from other domains. Now say that you’ve decided to register a new domain, namely http://www.php-developer.org/ and pay a new web host.
You will then transfer all of your Blogger posts to your new domain, which is hosted by your new web host. Now this is where things start to become complicated, for three reasons.
First, rather annoyingly, you cannot execute 301 redirection from Blogger to a different domain with a different web host.
Second, if you put the link rel canonical tag in your Blogger template by editing the template’s HTML source code, the same tag will appear on ALL of your Blogger post URLs. For example, <link href='http://www.php-developer.org/' rel='canonical'/> will appear in the <head> section of every Blogger post URL.
This is not correct, because http://www.php-developer.org/ is NOT the canonical URL of http://webdevelopmentexperts.blogspot.com/2008/05/php-include-tips. In fact, http://www.php-developer.org/ is the canonical URL of only the Blogger front page URL, namely http://webdevelopmentexperts.blogspot.com/. If you do this, you will lose a lot of the long tail traffic that your Blogger posts receive from search engines.
So the ideal solution, as far as the link rel canonical tag is concerned, works like this. If the Blogger-requested URL is the front page, http://webdevelopmentexperts.blogspot.com/, then the Blogger source code in the <head> section will return:
<link href='http://www.php-developer.org/' rel='canonical'/>
Otherwise, if the Blogger-requested URL is a post URL, for example http://webdevelopmentexperts.blogspot.com/2008/05/php-include-tips, then the Blogger source code in the <head> section will return the equivalent, correct target for the link rel canonical tag:
<link href='http://www.php-developer.org/php-include-tips' rel='canonical'/>
The strategy is that, when Googlebot crawls any of the Blogspot URLs (the front page URL and the inner post URLs), Google sees a canonical tag pointing to the equivalent URL on the new domain. Google will then start to credit your new domain with the link juice and other signals that are vital for Google ranking.
To put this strategy in place, you need to do the same thing for every remaining Blogger post URL, making each one point to its equivalent target URL on the new domain.
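The front page and post URL rules described above can be sketched in a few lines of Python. This is only an illustration of the mapping, not the Blogger template script itself. It assumes the hypothetical domains used in this article, and it assumes that each post on the new domain keeps only the final part of the Blogspot path (as in the php-include-tips example). The function names are made up for this sketch.

```python
from urllib.parse import urlsplit

# Hypothetical hosts taken from the examples in this article.
OLD_HOST = "webdevelopmentexperts.blogspot.com"
NEW_BASE = "http://www.php-developer.org"


def canonical_target(blogspot_url: str) -> str:
    """Return the new-domain URL that a Blogspot URL should name as canonical."""
    parts = urlsplit(blogspot_url)
    if parts.hostname != OLD_HOST:
        raise ValueError(f"not a URL on {OLD_HOST}: {blogspot_url}")
    # Assumption: the new site keeps only the last path segment of each post.
    segments = [segment for segment in parts.path.split("/") if segment]
    if not segments:
        # Front page maps to the new front page.
        return NEW_BASE + "/"
    return f"{NEW_BASE}/{segments[-1]}"


def canonical_tag(blogspot_url: str) -> str:
    """Build the link rel canonical tag for the <head> of a Blogspot URL."""
    return f"<link href='{canonical_target(blogspot_url)}' rel='canonical'/>"


if __name__ == "__main__":
    print(canonical_tag("http://webdevelopmentexperts.blogspot.com/"))
    print(canonical_tag(
        "http://webdevelopmentexperts.blogspot.com/2008/05/php-include-tips"))
```

If your new domain uses a different URL structure, only the body of canonical_target needs to change. The point is that every old URL gets its own target.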
Now the question is: “How do we implement this solution?” This is the main purpose of this tutorial series — to be able to solve this problem and implement the link rel canonical tag to transfer link juices from an old Blogspot address to your new domain/new web host.
Bloggers got lucky when Google added a rule for the link rel canonical element: the search engine now allows the link rel canonical tag to be used to solve cross domain duplicate content issues.
Migrating from Blogger to another web host and transferring your content is an example of a cross domain duplicate content issue, because both domains remain accessible to users and to Googlebot while serving the same content.
Using the link rel canonical tag in the Blogger template source code will solve this canonical issue and hence transfer all the link juice earned from Blogspot to the new domain under the new web host. See analogy below:
In effect, the links earned by the previous Blogspot URLs are transferred to the equivalent URLs in the new domain, helping it to rank better in the search engines.
Localhost development software is the same as the platform you are using for your new web host. For example, if you decide to use WordPress for your domain in your new web host, your localhost development software is WordPress. This also means that you are developing your website in your localhost environment using WordPress, in preparation for it to be uploaded to your new web host.
The first step of the migration process is to export your Blogger post content into your localhost development software. Most webmasters use WordPress, so you can refer to the following tutorials:
For importing Blogger posts into a WordPress localhost using Windows XP XAMPP: http://www.aspfree.com/c/a/BrainDump/Import-Blogger-Posts-into-WordPress-Using-Windows-XP-XAMPP/
For more on moving the content of your Blogger blog to WordPress:
How to upload files from localhost to your new web host:
How to build the default WordPress installation at your new web host:
The objective of the first step is to ensure four things:
- All Blogger posts and content are transferred completely to your new web host under your new domain name.
- Every aspect of your new domain in your new web host is working (all URLs, images and templates are fully functional).
- That search engine bots like Googlebot are NOT permitted to crawl and index your new domain in your new web host. This is because the link rel canonical tag is NOT yet added on your Blogger template pointing to your new domain.
In this case, upload a robots.txt file with the following content to the root directory of your new domain:
User-agent: *
Disallow: /
Important: Please remove this robots.txt after the link rel canonical tag has been implemented.
- That the server header status of your completed web site is 200 OK. It is extremely important that you double check the server header status to make sure it returns this status. You can use a tool on SEO Chat to do so: http://www.seochat.com/seo-tools/check-server-headers/
Even though you cannot check all URLs, at least you can check a sampling of the important URLs such as the home page, your most popular post and your sitemap.
The purpose of doing this is to check for server misconfiguration.
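Two of the checks in this step, that crawlers are blocked and that pages return 200 OK, can also be scripted. The sketch below is a minimal illustration using only the Python standard library. The function names are made up for this example, and the URL in the usage section is the hypothetical domain from this article. Note that urlopen follows redirects, so header_status reports the status of the final URL, not of any redirect along the way.

```python
import urllib.error
import urllib.request
from urllib.robotparser import RobotFileParser


def blocks_all_crawlers(robots_txt: str, sample_url: str) -> bool:
    """Return True if the robots.txt text forbids crawling sample_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return (not parser.can_fetch("Googlebot", sample_url)
            and not parser.can_fetch("*", sample_url))


def header_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a HEAD request to url."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions; report their code.
        return err.code


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /"
    print(blocks_all_crawlers(rules, "http://www.php-developer.org/"))
    # Needs network access; replace with your own important URLs.
    # print(header_status("http://www.php-developer.org/"))
```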
The ideal link rel canonical integration is a one-to-one correspondence between your old Blogspot URLs and your new domain URLs, such as:
http://youroldblogspoturlfrontpage.blogspot.com/ TO http://www.thisismynewdoman.com
http://youroldblogspoturlfrontpage.blogspot.com/posturl1 TO http://www.thisismynewdoman.com/posturl1
http://youroldblogspoturlfrontpage.blogspot.com/posturl2 TO http://www.thisismynewdoman.com/posturl2
Therefore, you need to come up with a COMPLETE LIST of your Blogspot URLs and your new domain URLs. Then you need to pair them up in a spreadsheet application such as MS Excel. Example:
Of course, it would be impractical to open every URL by hand and paste it into MS Excel. You need a tool that crawls your website and collects those URLs automatically. This tool is called “Xenu Link Sleuth.” Detailed steps are outside the scope of this tutorial; however, you can refer to this Xenu Link Sleuth tutorial.
The objective of the second step is twofold. First, it is to ensure that you have a complete list of Blogspot URLs and their equivalent new URLs in the new domain. Second, you want to ensure that all indexed Blogspot URLs are also included in that list.
How are you going to pair the URLs systematically, especially if you have a pretty large blog?
First, get Xenu crawl data for both your Blogspot-hosted blog and your new website.
Second, export each of those reports in .csv/Excel format.
Finally, using the MS Excel VLOOKUP function, match up equivalent URLs using the Title Tag column of the Xenu results. Since the Blogger posts and configuration are exported to your new domain unchanged, the title tag of each post in Blogger SHOULD be the same as the one found on your new domain.
You can use this property to match up URLs with your spreadsheet software’s VLOOKUP function. It is important to clean up the spreadsheets first: filter the unimportant columns and rows out of each Xenu report, and combine the two Xenu results (one for your Blogspot blog and the other for your new domain) into one spreadsheet for easier processing.
After all of the data has been finalized, you can then apply the VLOOKUP function: http://office.microsoft.com/en-us/excel/HP052093351033.aspx . Below is a sample spreadsheet containing data fully processed with the above procedure, with the URLs paired using the VLOOKUP function:
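If you prefer a script to a spreadsheet, the same title-based matching can be sketched in Python. This is an illustration of the VLOOKUP idea, not a replacement for checking the results by hand. The column names "Address" and "Title" and the function names are assumptions made for this example; adjust them to match the headers in your own exported reports.

```python
import csv


def load_rows(path: str, delimiter: str = ",") -> list:
    """Read an exported crawl report into a list of dicts, one per URL."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


def pair_by_title(old_rows: list, new_rows: list) -> tuple:
    """Pair old and new URLs whose title tags match exactly.

    Returns (pairs, unmatched): pairs is a list of (old_url, new_url)
    tuples, unmatched is a list of old URLs with no title match.
    """
    new_by_title = {}
    for row in new_rows:
        # Like VLOOKUP, keep the first URL found for a given title.
        new_by_title.setdefault(row["Title"].strip(), row["Address"])

    pairs, unmatched = [], []
    for row in old_rows:
        target = new_by_title.get(row["Title"].strip())
        if target is None:
            unmatched.append(row["Address"])
        else:
            pairs.append((row["Address"], target))
    return pairs, unmatched


if __name__ == "__main__":
    old = [{"Address": "http://webdevelopmentexperts.blogspot.com/2008/05/php-include-tips",
            "Title": "PHP Include Tips"}]
    new = [{"Address": "http://www.php-developer.org/php-include-tips",
            "Title": "PHP Include Tips"}]
    print(pair_by_title(old, new))
```

The unmatched list is worth reviewing manually, since a post whose title changed during the import will not be paired automatically.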
Important note: There are times when Xenu Link Sleuth will not be able to pick up all URLs. This might be due to a timeout or congestion problem on the Internet.
In relation to this, it is highly important that you also gather all of the indexed URLs of your Blogspot in Google using the site command. Example:
If you are having trouble extracting all of your indexed URLs in Google to an MS Excel spreadsheet, you can refer to this tutorial.
Merge the data gathered from the Google index with the set crawled by Xenu, and then filter the unique URLs using Excel. The resulting list of unique URLs will now be a complete list in case there are a lot of missing entries in the Xenu results.
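The merge-and-filter step can also be done outside Excel. The following is a minimal sketch that assumes you already have the two URL lists in memory, one from the Google index and one from Xenu; the function name and the sample URLs are only for illustration.

```python
def merge_unique(*url_lists):
    """Merge several URL lists, keeping the first occurrence of each URL."""
    seen = set()
    merged = []
    for urls in url_lists:
        for url in urls:
            cleaned = url.strip()
            # Skip blank lines and URLs already collected.
            if cleaned and cleaned not in seen:
                seen.add(cleaned)
                merged.append(cleaned)
    return merged


if __name__ == "__main__":
    from_google = ["http://webdevelopmentexperts.blogspot.com/",
                   "http://webdevelopmentexperts.blogspot.com/2008/05/php-include-tips"]
    from_xenu = ["http://webdevelopmentexperts.blogspot.com/ ", ""]
    for url in merge_unique(from_google, from_xenu):
        print(url)
```

This compares URLs as exact strings after trimming whitespace, so variants such as a missing trailing slash are treated as different URLs and should be reviewed by hand.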
In the second part of this tutorial, you will learn how to write a Blogger script that automatically adds the link rel canonical tag to each of your Blogger posts, pointing to the equivalent canonical URL on your new domain at your new web host.