Proper Domain, Sub-domain and Folder URL Structure Management in SEO

One of the most important planning decisions a developer makes is choosing the proper domain, sub-domain and folder structure for search engine optimization. Getting this wrong can adversely affect a website’s ability to make sales. This article explains why, and shows you how to set up your site’s structure properly.

Building a proper site structure is particularly important at the web development stage, before the website has been launched on the production server (at the web hosting company). Even though the website is still in development, the developer should be able to foresee the website’s structure, that is, how it will fit together when the files are uploaded to the web hosting server.

Failing to take account of proper structure can be a costly long-term mistake if the website has already been launched, marketed and optimized for the search engines. This is true for the following reasons:

1. Changes to the file structure, particularly to the domain, sub-domains and folders, require permanent redirects (HTTP status 301). These redirects can affect the existing rankings of indexed pages in Google. This can translate to losses in traffic, which in turn can affect the website’s targeted sales.

2. For really big websites, the stabilization of the website file structure (from 301 redirections) can take a long time, perhaps three to six months.

3. A single mistake in redirection can seriously affect existing customer experiences. It could cause visitors to be unable to see proper web pages or check out their orders.

This article has been written for developers to help them prepare their websites to use the optimal structure for search engine optimization. This is best done in the web development stage to prevent the three issues stated above from happening.

You might ask: What is a “domain structure”? It is simply the organization of files, sub-domains and folders within your domain. However, this organization can become problematic, especially if you do not know how to arrange your files and directories for the best results.

A root domain structure can take only two forms:

1. A www version
2. A non-www version

These two forms define your “root” domain structure. In SEO terminology, the one you choose becomes the canonical URL structure of your root domain, and every canonical URL on the domain should use it. Note that the root domain URL structure is DIFFERENT from the sub-domain URL structure.

For best SEO results, the following is recommended:

1. Before uploading the files to the web server and configuring the name servers and DNS settings of your website, make sure you’ve already decided which root domain URL structure to use (www or non-www).

For example, if your domain name is seochat.com, then you might decide to use the WWW version as your canonical root domain structure, so the home page will be http://www.seochat.com/.

2. In your Apache and PHP configurations that manage your file structure, make sure you maintain a consistent root domain URL structure throughout your files and folders.

For example, if you have the following PHP files and folders for your website:

myphpfile1.php
myphpfile2.php
phpfolder1
phpfolder2

They should ONLY be accessible via the canonical root domain URL structure of your website. That is the WWW version. For example:

www.seochat.com/myphpfile1.php
www.seochat.com/myphpfile2.php
www.seochat.com/phpfolder1
www.seochat.com/phpfolder2

3. All of this means that the other root domain structure (the non-www version) should EITHER be 301 redirected to the WWW version or use link rel canonical tags that point to the WWW version.

For example, if someone (e.g. search engine bots, users or other user agents) requests the following non-www version URLs:

seochat.com/myphpfile1.php
seochat.com/myphpfile2.php

The server will 301 redirect those URLs to use the canonical root domain URL structure (WWW version):

www.seochat.com/myphpfile1.php
www.seochat.com/myphpfile2.php

The same principle applies when selecting the non-www version of the website as the canonical root domain URL structure.
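The redirect rule described above can be sketched in a few lines of Python. This is only an illustration of the decision the server has to make, not a replacement for your Apache configuration. The names CANONICAL_HOST, ALTERNATE_HOSTS and canonical_redirect are hypothetical, and the sketch assumes the WWW version has been chosen as canonical.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: the WWW version is the canonical root domain.
CANONICAL_HOST = "www.seochat.com"
ALTERNATE_HOSTS = {"seochat.com"}  # hosts that must redirect to the canonical one


def canonical_redirect(url):
    """Return (301, target_url) for a non-canonical host, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ALTERNATE_HOSTS:
        return None
    # Keep the scheme, path and query; only the host changes.
    # Any explicit port in the original URL is dropped in this simplified version.
    target = urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, parts.fragment)
    )
    return 301, target


if __name__ == "__main__":
    for requested in (
        "http://seochat.com/myphpfile1.php",
        "http://www.seochat.com/myphpfile1.php",
    ):
        print(requested, "->", canonical_redirect(requested))
```

If the non-www version is chosen instead, swapping the two host values gives the opposite behavior.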

So how does a developer set up this structure? First, use relative URLs. They let you get the most out of your root domain URL structure and make it easy to move files from the local host to production servers. So if your files are structured like this on your local host:

/files1.php
/files2.htm
/folder1

and if these files use a relative URL structure, then uploading them to your web hosting root directory can automatically give you the following URLs:

http://www.thisisyourdomain.com/files1.php
http://www.thisisyourdomain.com/files2.htm
http://www.thisisyourdomain.com/folder1
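To see why relative URLs make the move from local host to production painless, here is a small Python sketch using the standard urljoin function. The link list and the local base URL http://localhost/ are assumptions made for illustration; the point is that the same relative links resolve against whichever base they are served from.

```python
from urllib.parse import urljoin

# Relative links as they might appear inside the site's pages (illustrative values).
RELATIVE_LINKS = ["files1.php", "files2.htm", "folder1/"]


def resolve_links(base_url, links):
    """Resolve each relative link against the base URL the page is served from."""
    return [urljoin(base_url, link) for link in links]


if __name__ == "__main__":
    # Same links, two different hosts: nothing inside the files has to change.
    for base in ("http://localhost/", "http://www.thisisyourdomain.com/"):
        for absolute in resolve_links(base, RELATIVE_LINKS):
            print(absolute)
```

Had the files used absolute URLs that pointed at the local host, every link would need to be edited before launch.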

Second, you must fully test 301 redirections and other DNS configuration to correctly set your canonical root URL domain structure (either as WWW or non-WWW) before fully launching your website.
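One way to test redirects before launch is a small script that requests each non-canonical URL and confirms that it answers with a 301 pointing at the expected target. The sketch below uses Python’s standard http.client module, which does not follow redirects on its own. The function names fetch_head and failed_redirects are hypothetical, and the URL map you pass in would be your own.

```python
import http.client
from urllib.parse import urlsplit


def fetch_head(url):
    """Send a HEAD request without following redirects; return (status, Location header)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        connection = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        # Simplification: the query string is ignored in this sketch.
        connection.request("HEAD", parts.path or "/")
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()


def failed_redirects(expected, fetch=fetch_head):
    """Return the source URLs that do NOT answer with a 301 to their expected target."""
    failures = []
    for source, target in expected.items():
        status, location = fetch(source)
        if status != 301 or location != target:
            failures.append(source)
    return failures
```

Calling failed_redirects with a dictionary that maps each non-www URL to its WWW counterpart should return an empty list once the server is configured correctly. The fetch parameter can be replaced with a stub so the check can be exercised without network access.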

While the root domain contains the website’s canonical URLs (important content, services, products, etc.), sub-domains are also widely implemented on modern websites, particularly medium-to-large websites.

Google treats a sub-domain as a “different” entity from the root domain (http://www.mattcutts.com/blog/subdomains-and-subdirectories/). For best SEO results when using a sub-domain, keep the following points in mind.

First, the sub-domain should serve a specific purpose that is different from that of the root domain. For example, forums.seochat.com is a sub-domain of SEO Chat, where the forum content is hosted, while the root domain URL, http://www.seochat.com/, serves the main SEO content.

Google offers another glimpse of a good implementation of sub-domains. Their root domain URL structure is dedicated to search (their main product): http://www.google.com/. But to separate other core products, with purposes different from “search,” Google uses the sub-domain structure. For example:

http://video.google.com/
http://maps.google.com/
http://news.google.com/
http://mail.google.com/

Second, you should use a separate robots.txt file to control the search engine bots’ behavior when they’re crawling content on your sub-domain. This robots.txt is NOT the same as the root domain’s robots.txt.

For example:

http://thisisyoursubdomain.example.org/robots.txt – This is the robots.txt you use to control crawling and indexing behavior of bots in the “thisisyoursubdomain” subdomain.

http://www.example.org/robots.txt – This is the robots.txt you use to control the bots’ crawling behavior in your root domain URL structure.

If you have a domain with lots of sub-domains under it, you can upload an individual robots.txt to further optimize the bots’ behavior and to prevent duplicate content.
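The effect of keeping one robots.txt per host can be illustrated with Python’s standard urllib.robotparser module. The rules below (blocking /checkout/ on the root domain and /drafts/ on the sub-domain) are invented for this sketch, as are the names ROBOTS_BY_HOST, build_parsers and may_crawl. In practice each set of rules would live in the robots.txt file at the root of its own host.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, one set of rules per host.
ROBOTS_BY_HOST = {
    "www.example.org": ["User-agent: *", "Disallow: /checkout/"],
    "thisisyoursubdomain.example.org": ["User-agent: *", "Disallow: /drafts/"],
}


def build_parsers(robots_by_host):
    """Create one RobotFileParser per host from its robots.txt lines."""
    parsers = {}
    for host, lines in robots_by_host.items():
        parser = RobotFileParser()
        parser.parse(lines)
        parsers[host] = parser
    return parsers


def may_crawl(parsers, url, user_agent="*"):
    """Apply the rules of the host that the URL belongs to, and only those rules."""
    host = urlsplit(url).hostname
    return parsers[host].can_fetch(user_agent, url)


if __name__ == "__main__":
    parsers = build_parsers(ROBOTS_BY_HOST)
    for url in (
        "http://www.example.org/drafts/post.html",
        "http://thisisyoursubdomain.example.org/drafts/post.html",
    ):
        print(url, may_crawl(parsers, url))
```

The same path, /drafts/post.html, is allowed on one host and blocked on the other, because each host is judged only by its own file.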

Not only can you use robots.txt, but you can also use different templates, .htaccess directives and other features that are completely different from those on your root domain URLs.

Third, do not excessively use sub-domains as part of your website structure. Doing this can dilute the importance and authority of your root canonical domain structure in Google.

Instead of using sub-domains, you can implement sub-directories or folder directories when presenting a wide variety of content within your root domain. This is a more SEO-friendly approach, because the more internal pages you have in your root domain, the greater will be the benefits in terms of internal link strength and long tail traffic.

So how should you as a developer approach this? First, create a new folder in your hosting root directory and put all of your sub-domain content into it. Then, using the DNS configuration in your domain control panel, point a sub-domain URL (e.g. thisisyoursubdomain.yourdomainname.com) to that folder.

For example, in GoDaddy hosting: http://help.godaddy.com/article/4652

Second, you must prevent the mistake of having your sub-domain accessible by search engine bots in two ways, as shown by the following sample URLs:

http://thisisyoursubdomain.yourdomainname.com/
http://www.yourdomainname.com/thisisyoursubdomain/

Web developers are often unaware of this. To correct this problem, you can either 301 redirect http://www.yourdomainname.com/thisisyoursubdomain/ to http://thisisyoursubdomain.yourdomainname.com/, including the files and folders under it, or use the link rel canonical tag.
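The 301 option can be sketched as a mapping from the folder path on the root domain to the matching sub-domain URL. This Python version is an illustration only; the names ROOT_HOST, SUBDOMAIN_FOLDERS and subdomain_redirect are hypothetical, and on a real server the same rule would normally be written as a redirect directive.

```python
from urllib.parse import urlsplit, urlunsplit

ROOT_HOST = "www.yourdomainname.com"
# Folder in the hosting root directory -> sub-domain that serves its content.
SUBDOMAIN_FOLDERS = {"thisisyoursubdomain": "thisisyoursubdomain.yourdomainname.com"}


def subdomain_redirect(url):
    """Return (301, target_url) when a sub-domain's folder is requested via the root domain."""
    parts = urlsplit(url)
    if (parts.hostname or "").lower() != ROOT_HOST:
        return None
    segments = parts.path.lstrip("/").split("/", 1)
    folder = segments[0]
    if folder not in SUBDOMAIN_FOLDERS:
        return None
    # Everything below the folder keeps its path on the sub-domain.
    rest = "/" + (segments[1] if len(segments) > 1 else "")
    target = urlunsplit((parts.scheme, SUBDOMAIN_FOLDERS[folder], rest, parts.query, ""))
    return 301, target


if __name__ == "__main__":
    print(subdomain_redirect("http://www.yourdomainname.com/thisisyoursubdomain/"))
    print(subdomain_redirect("http://www.yourdomainname.com/thisisyoursubdomain/page.php?x=1"))
```

Files and folders under the sub-domain folder are covered by the same rule, since the remainder of the path is carried over unchanged.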

For simple websites, sometimes a subdomain is inappropriate, because you do not need to drastically separate content within your root domain. In this case, something as simple as implementing a subdirectory or folder is enough.

But again, even something as easy as setting up a sub-directory can lead to some SEO-related issues. Below you’ll find the recommendations for setting up sub-directories that can provide the best SEO results.

First, shorten and simplify the URL as much as possible. For example, in WordPress, it is much better to use a shorter URL, like this: http://www.rootdomainname.com/thisisyourposturl/, rather than a longer one, such as http://www.rootdomainname.com/2010/02/01/thisisyourposturl/. The same concept applies to folder naming in a website.

This is best done during the web development stage, when you can rewrite URLs and edit the permalink structure to get the shortest URL possible without affecting the website’s daily operation.

Second, you will need to prevent canonical issues from trailing slashes by either using 301 redirection or the link rel canonical tag.

For example, it is not optimal for SEO if both of these URLs return a 200 OK header status because they will split link juice between them:

http://www.rootdomainname.com/thisisyoursubdirectory/
http://www.rootdomainname.com/thisisyoursubdirectory

If you use a trailing slash for sub-directories as your canonical URLs, then you should 301 redirect the second URL listed above (the one without the trailing slash) to the first URL. 
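As a rough sketch, the trailing-slash rule could look like this in Python. It assumes the trailing-slash form is canonical, and it treats any last path segment containing a dot as a file (such as files1.php) that should be left alone. That heuristic and the function name trailing_slash_redirect are assumptions made for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def trailing_slash_redirect(url):
    """Return (301, target_url) for a directory URL that lacks its trailing slash, else None."""
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    # Already canonical, or it looks like a file rather than a sub-directory.
    if path.endswith("/") or "." in last_segment:
        return None
    target = urlunsplit((parts.scheme, parts.netloc, path + "/", parts.query, parts.fragment))
    return 301, target


if __name__ == "__main__":
    print(trailing_slash_redirect("http://www.rootdomainname.com/thisisyoursubdirectory"))
    print(trailing_slash_redirect("http://www.rootdomainname.com/thisisyoursubdirectory/"))
```

With this rule in place, only the trailing-slash URL returns 200 OK, so links to either form end up counting toward a single URL.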

In WordPress, this is implemented by default. And again, this type of issue can be prevented by using link rel canonical tags.

Third, do not over-extend your vertical website structure by increasing folder depth. A deep URL structure is not recommended, as the URLs can become quite long.

For example, say you are creating a website that sells movie downloads. Of course, you need to categorize those movies in terms of genres (horror, action, animation, etc) using a subdirectory structure. But what about the directors, film studios, film rating, and so forth? Does that need to be included in the folder structure? Let’s look at two possibilities.

Here is a link typical of a deep URL structure, with details. It is not recommended: 

http://thisisyourmoviewebsite.com/action/universalstudios/christopher_nolan/inception

Here is an SEO-friendly URL:

http://thisisyourmoviewebsite.com/action/inception
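If you want to check folder depth across a list of planned URLs, a few lines of Python are enough. The limit of two folder levels is an assumption taken from the SEO-friendly example above, not a fixed rule, and the names MAX_DEPTH, folder_depth and too_deep are hypothetical.

```python
from urllib.parse import urlsplit

MAX_DEPTH = 2  # assumed limit for this sketch: genre/title


def folder_depth(url):
    """Count the non-empty path segments of a URL."""
    return len([segment for segment in urlsplit(url).path.split("/") if segment])


def too_deep(urls, max_depth=MAX_DEPTH):
    """Return the URLs whose path has more segments than max_depth."""
    return [url for url in urls if folder_depth(url) > max_depth]


if __name__ == "__main__":
    planned = [
        "http://thisisyourmoviewebsite.com/action/universalstudios/christopher_nolan/inception",
        "http://thisisyourmoviewebsite.com/action/inception",
    ]
    print(too_deep(planned))
```

Running a check like this during development flags deep URLs while they are still cheap to change.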

So how would potential customers search for detailed information like director’s name, rating, film studios, and so forth? The recommended approach is to design a search box that will let customers search for detailed information, and then arrange the results under the sub-directory logically in terms of alphabetical title releases, etc.

In this way, your folder structure can be kept very simple and short, while letting your customers search for detailed information.

Summary: Combining the Entire Structure

When all of the above structures are implemented in your website, it should look like this:

In the diagram, the gray, yellow and violet areas are the important SEO website structures that you need to consider. The yellow area covers the root domain canonical URLs, while the gray and violet areas are the sub-domains.

When you register a domain, the domain name is the one on the top level; it is above everything else. Then, as a web developer, you will develop a plan to structure your website and group it according to the following parts:

1. Root domain URLs (selecting either www or the non-www version)
2. Folders and Subdirectories
3. Sub-domains.

You will create this plan while taking into consideration the different SEO elements that can affect the structuring of your website, as discussed in this article.
