htaccess SEO and Security Tips

This is a beginner’s guide to .htaccess. It introduces this configuration file and its most useful applications in website management. .htaccess is a per-directory server configuration file commonly found on Apache, Zeus and Sun Java System web servers. Its directives have direct implications for website security and search engine optimization. You cannot use .htaccess on Microsoft IIS or on other servers that do not support it.

To create and upload an htaccess file:

1. Open any text editor (such as gedit in Ubuntu or Notepad in Windows).

2. Enter the .htaccess directives you would like to implement on your website.

3. Save it as .htaccess

Note: There is a “.” before the filename, and there is no file extension to it; just .htaccess

4. Upload it to the directory where you would like the directives to apply. An .htaccess file uploaded to the root directory of your website covers all of the website’s directories and files (from the root inward).

If you upload the .htaccess file only to a specific inner directory, its directives apply only to that directory and the subdirectories beneath it.

IMPORTANT: If you have an existing .htaccess in your website and you plan to edit it, it is extremely important to secure a backup of that .htaccess first, before doing anything else.

301 redirecting old/dead URL to new URL

Suppose you delete some URLs because they are no longer needed and replace them with new URLs (by re-publishing content, for example, or by shortening your URLs).

From an SEO standpoint, this can cost you a lot of traffic and link juice, because the old URLs may be contributing a significant share of the traffic and links to your website.

What is the best solution? Instead of letting those URLs return a 404 status (not found, or no longer exists), 301 redirect them to their new, permanent location.

With this method, if the dead URLs are still indexed by Google, users coming from search engines or other websites still reach the new URLs because the old ones have been “301 redirected.”

You can do these 301 redirects in .htaccess. For example:

HTACCESS SYNTAX:

redirect 301 /2009/04/how-to-make-blogger-post-title-unique.html http://www.php-developer.org/how-to-make-blogger-post-title-unique/
redirect 301 /about/ http://www.php-developer.org/about-codexm/

There are two 301 redirection commands above; the first is to 301 redirect:

http://www.php-developer.org/2009/04/how-to-make-blogger-post-title-unique.html

TO:

http://www.php-developer.org/how-to-make-blogger-post-title-unique/

The second redirection is to 301 redirect:

http://www.php-developer.org/about/

TO:

http://www.php-developer.org/about-codexm/

What if you have a URL with spaces? For example: http://www.php-developer.org/wp-content/uploads/tutorials/Excel database functions sample sheets.xls

Then you will need to enclose the old path in double quotes in the 301 redirection line, e.g.:

HTACCESS SYNTAX:

redirect 301 "/wp-content/uploads/tutorials/Excel database functions sample sheets.xls"  http://www.php-developer.org/excel-database-sample-sheets/

The above line will 301 redirect that URL with spaces to this new location: http://www.php-developer.org/excel-database-sample-sheets/

Note: An .htaccess file containing 301 redirects should be uploaded to the root directory of your website. If you already have an .htaccess file there, simply edit it and add the redirection lines; you do not need to create a brand-new .htaccess file.
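
When many old URLs follow the same pattern, a single pattern-based rule can replace a long list of individual redirect lines. The sketch below assumes that every old post URL has the form /YYYY/MM/post-name.html and that each new URL is simply /post-name/, as in the first example above. That assumption may not hold for every URL on your site, so test it before relying on it. It uses RedirectMatch, which belongs to the same Apache module (mod_alias) as redirect.

HTACCESS SYNTAX:

# Assumption: old posts live at /YYYY/MM/post-name.html and move to /post-name/
# $1 is the post name captured by the parentheses in the pattern
RedirectMatch 301 ^/\d{4}/\d{2}/([^/]+)\.html$ http://www.php-developer.org/$1/

With this line, a request for /2009/04/how-to-make-blogger-post-title-unique.html would be redirected to http://www.php-developer.org/how-to-make-blogger-post-title-unique/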

Disable hot linking to save website bandwidth

If you have a website in which you are hosting fairly large files like MP3s, video files and audio wave files, then someone outside your domain might be tempted to take advantage of your website’s resources by directly streaming/linking to it, without paying a single penny for bandwidth use.

Why is this not good? Your website’s bandwidth (which you pay for in your hosting bills) is being substantially consumed by unauthorized parties. This is known as “bandwidth theft,” and it includes taking pictures from your website by embedding them directly in another website.

Such abuse can slow down your site and weaken your website security. You can use .htaccess to prevent it.

For example, say you want to prevent “hot linking” of mp3, jpg and wav files. You want to allow only your own domain (e.g. yourdomain.org) to access them; other domains are restricted.

HTACCESS SYNTAX:

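## DISABLE HOTLINKING
RewriteEngine on
# Options +FollowSymlinks
# Allow requests that send no referer (direct visits, some proxies)
RewriteCond %{HTTP_REFERER} !^$
# Allow requests referred from your own domain (http or https, with or without www)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yourdomain\.org/ [NC]
# Every other request for these file types gets 403 Forbidden
RewriteRule \.(mp3|jpg|wav)$ - [F]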

Again, if you upload the .htaccess file to the root directory of your website, all of your website directories and files will be protected against hot linking, as stated in your .htaccess file.

Allowing more than one domain to hot link to your site

There are times when you own several websites and need them to hotlink to the same images, perhaps for convenience and to save disk space. In this case, if you implemented the above .htaccess syntax, your other websites will not be permitted to access those images.

You can grant permission to any website to access and “hot link” your web content (images, etc). For example:

HTACCESS SYNTAX:

## DISABLE HOTLINKING
RewriteEngine on
# Options +FollowSymlinks
# Allow requests that send no referer
RewriteCond %{HTTP_REFERER} !^$
# Allow your own domain
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yourdomain\.org/ [NC]
# Allow the additional trusted domain
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?seochat\.com/ [NC]
RewriteRule \.(mp3|jpg|wav)$ - [F]

The .htaccess lines above allow seochat.com to hot link to the yourdomain.org website (in addition to yourdomain.org itself, of course).
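
If the list of trusted domains grows, the referer conditions can be merged into one line with regex alternation. This is only a sketch of the same idea, using the two example domains from above. The escaped dots (\.) make the pattern match literal dots, and https? matches both http and https referers.

HTACCESS SYNTAX:

## DISABLE HOTLINKING (several trusted domains in one condition)
RewriteEngine on
# Let requests with an empty referer through
RewriteCond %{HTTP_REFERER} !^$
# Let requests referred from any of the listed domains through
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?(yourdomain\.org|seochat\.com)/ [NC]
# Every other request for these file types gets 403 Forbidden
RewriteRule \.(mp3|jpg|wav)$ - [F]

To trust another domain, add it inside the second pair of parentheses, separated by |.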

Deny public access to administrative folders

This is one of the most important .htaccess security techniques. You can use it, for example, if you have an administrative folder on your website that only you should be able to access (like the /wp-admin/ folder in WordPress). If you leave your administrative folders publicly accessible, you increase the risk of hacking attacks, such as brute forcing of login pages.

Thus, it is important to secure these types of folders. Using .htaccess, you can execute something like this:

HTACCESS SYNTAX:

Order allow,deny
Allow from 76.161.33.242

Here, 76.161.33.242 is your IP address. You can find your IP address by visiting this URL: http://www.whatismyip.com

So if you visit your administrative folders, you are allowed in because your IP address is allowed in .htaccess. All other users on the Internet will be unable to access the content, because they are not authorized.

If you need “complete” isolation of private folders and files from search engine robots, crawlers and public users, this is the best approach. It is more effective than relying on a robots.txt file, as search engines can still list reference links to directories and URLs they are not allowed to crawl.

What if the allowed IP address is your “home” IP address, and you plan to access your administrative folders from your “office” IP address? Then you need to get your office IP address and add it to the .htaccess file. For example:

HTACCESS SYNTAX:

Order allow,deny
Allow from 76.161.33.242
Allow from 84.154.236.3

The IP address 84.154.236.3 is also allowed to access your website administrative folders. Remember to upload this .htaccess file only to specific, protected folders. So, if you need to protect your /administrator/ folder, then upload an .htaccess file containing the syntax above to the administrator folder only, NOT to the root directory!
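
The Order and Allow lines above use the access-control syntax of Apache 2.2. On Apache 2.4 and later, that syntax only works if the compatibility module mod_access_compat is loaded; the native equivalent uses the Require directive. A minimal sketch, assuming your host runs Apache 2.4 (ask your hosting provider if you are unsure) and using the same two example IP addresses:

HTACCESS SYNTAX:

# Apache 2.4+ only: allow these two IP addresses, deny everyone else
Require ip 76.161.33.242
Require ip 84.154.236.3

It is best not to mix this with the Order/Allow lines in the same .htaccess file; use one style or the other.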

How to check that your “deny public access” htaccess syntax is working

It is important to check that your deny public access syntax in .htaccess is working. The quickest way to do this without using another computer with another IP address is to:

Step 1. Launch your web browser.

Step 2. Visit any of these proxy browsing services (depending on which are available, fast and convenient):

a. http://www.proxybrowsing.com/

b. http://www.eatmybrowser.com/

c. http://www.ninjacloak.com/

Step 3. Go to http://www.whatismyip.com using those proxy browsing services.

Step 4. Write down the IP address that you are using. Remember that if you are using a proxy browsing service, it will hide your true IP address and you will be using a proxy IP address instead.

Step 5. Now go to the restricted URL on your website, e.g. http://www.example.com/thisisrestricted/

Step 6. The server should reject your visit (for example, with a 403 Forbidden status) because the IP address you are using doesn’t match the ones listed in your .htaccess.

Step 7. Now try adding the IP address you are using with your proxy browsing service to the .htaccess. Clear the browser cache and history, and then reload the restricted directory in your website.

You should now be able to visit your administrative folders. Just delete that .htaccess entry after doing this test.

Forcing downloads of a specific file type

Suppose you have a page on your website that links to a PDF file stored on your server. For user convenience, you would like the browser to display the download dialog box when a user clicks the link, instead of displaying the PDF contents directly in the browser.

Another example could be forcing the download of an MP3 file instead of streaming it with a browser. To implement a forced download with .htaccess:

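HTACCESS SYNTAX:

<Files "*.pdf">
# Serve matching files as generic binary data so the browser does not render them
ForceType application/octet-stream
# Ask the browser to save the file instead of displaying it
Header set Content-Disposition attachment
</Files>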

The above example will force the user to download a PDF file. You can change the extension to anything you like, depending on your application (e.g. .mp3, .wav, etc.).

[Screenshot: a forced download dialog in the browser]

Note:

1. Upload the .htaccess file containing the force download syntax only to the directories that contain those file types (e.g. /pdf/).

2. The Header directive requires Apache’s mod_headers module, which not all hosting servers enable, so consult your web hosting provider about it.
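
To handle several file types at once and to guard against the situation in note 2, the directives can be written more defensively. The sketch below assumes you want to force downloads for PDF, MP3 and WAV files together, so it uses FilesMatch with a regular expression instead of one section per extension. It also wraps everything in IfModule, because the Header directive needs Apache’s mod_headers module; on a server without that module the whole block is skipped rather than causing a 500 error.

HTACCESS SYNTAX:

# Only apply the rules when mod_headers is available
<IfModule mod_headers.c>
# Match any file name ending in .pdf, .mp3 or .wav
<FilesMatch "\.(pdf|mp3|wav)$">
ForceType application/octet-stream
Header set Content-Disposition attachment
</FilesMatch>
</IfModule>

If downloads are still not forced after you upload this file, the module is probably not enabled on your server.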

You can view the final list of htaccess syntax discussed in this tutorial here: http://www.php-developer.org/htaccess_syntax.txt
