PHP Search Engine Optimization

You want to convince search engine spiders to crawl your pages frequently. You need to put in keyword-rich, dynamic content to help convince the spiders to check your pages regularly. Unfortunately, many spiders trip over dynamic pages. How do you fix this problem? Use PHP to give your dynamic pages static-looking URLs. Roger Stringer explains how to do it.

PHP is a useful language, used by many all over the Web. But it has one failing. By its nature, it is not search engine friendly. In fact, it’s the exact opposite. But with some clever tweaking, we can make PHP a powerful tool in the quest for search engine dominance.

Proper search engine optimization can even make a PHP script look like something else entirely. You could make it look like an HTML file, or like several HTML files, one for each article on your site.

Speed

Speed is a major factor for websites. If a search engine spider follows a link on your site and is forced to wait too long for the server to process the PHP code behind that page, it may ignore your page and move on.

The biggest slowdowns in a PHP script are typically the database and loop code. Try to avoid making any SELECT * calls; instead, name all the columns you want to retrieve. Using SELECT * on a table that contains 10 fields when you only want to use one or two fields is a waste of valuable resources.
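As a rough illustration of naming columns instead of using SELECT *, here is a minimal sketch using PDO. The “articles” table, its columns, the in-memory SQLite database, and the function name fetch_article_titles are all assumptions made up for this example; they do not come from any particular script.

```php
<?php
// Hypothetical schema for illustration only: an "articles" table with five columns.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT, author TEXT, created TEXT)');
$db->exec("INSERT INTO articles (title, body, author, created) VALUES ('PHP SEO', 'Long body text', 'Roger', '2005-01-01')");

// Only the two columns the listing page needs are requested,
// so the (potentially large) body column is never fetched.
function fetch_article_titles(PDO $db): array
{
    $stmt = $db->query('SELECT id, title FROM articles ORDER BY id');
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$titles = fetch_article_titles($db);
```

Each row in $titles then carries just the id and title keys, which is all a list of links needs.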

To optimize loops, consider writing the code out by hand instead of looping when the loop would only repeat a few times. Additionally, compute values that do not change between iterations, such as count($array), once before the loop starts and reuse the stored result inside it.
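The count($array) advice can be sketched as follows. The function name label_items and its sample data are hypothetical, chosen only to show the value being computed once before the loop rather than on every pass.

```php
<?php
// Hypothetical helper: labels each item with its position in the list.
function label_items(array $titles): array
{
    $labels = [];
    $total = count($titles); // computed once, before the loop starts
    for ($i = 0; $i < $total; $i++) {
        // $total is reused here instead of calling count() again
        $labels[] = ($i + 1) . ' of ' . $total . ': ' . $titles[$i];
    }
    return $labels;
}

$labels = label_items(['Home', 'About']);
```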



The first issue to deal with is PHP’s tendency to add session ID numbers to links.

This is caused by having PHP’s transparent session ID feature turned on (the “session.use_trans_sid” setting, enabled at build time in older PHP versions with “--enable-trans-sid”), and it creates links with an additional, long, nonsense-looking GET variable. In addition to making the links clunky, it gives spiders different URLs with the same content, which makes them less likely to treat each page as distinct content. They might not even index them at all.

An example of this can be seen here:

http://www.mysite.com/index.php?PHPSESSID=b5dbe844a2c69fadc58614eb9a94b0ce

Kind of ugly, right? Now imagine being a search engine spider and seeing that long number change every time you look at the page. Not a pretty picture.

A quick way to fix this is to disable the trans-sid feature, if you have access, in php.ini by setting “session.use_trans_sid” to 0 (off). If you don’t have access to change php.ini, you can add this line to the .htaccess file in your root directory:

php_flag session.use_trans_sid off

This will make the Session IDs disappear from the URL.
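If neither php.ini nor .htaccess can be edited, the same setting can usually be changed from the script itself. The following is a minimal sketch, assuming it runs before session_start() and before any output has been sent; adding session.use_only_cookies is an extra step not mentioned above.

```php
<?php
// Must run before session_start() and before any output is sent.
ini_set('session.use_trans_sid', '0');    // do not append the session ID to links
ini_set('session.use_only_cookies', '1'); // assumption: also ignore session IDs passed in the URL

$transSid = ini_get('session.use_trans_sid');
```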

Friendly URLs

A major goal in optimizing your PHP pages for search engines is to make them look and act like static pages. If you have a large site you can use Apache to fake static-looking URLs, or, with a smaller site, you can simply keep your GET variables to a useful minimum. In either case, however, never allow a spider to see links with different URLs to the same content. If the URL is different, the page should be, too.

One of the major problems most webmasters have with getting their dynamic pages to index is URL cleanliness. Since many dynamic pages are created with GET variables, lots of pages have URLs that look like:

Page.php?var=abc&var2=def&var3=ghi

Most of the search engines will be able to follow this link, because it has three or fewer GET variables (a good rule of thumb is to keep the number of GET variables passed in the URL to three or fewer), but any more than three variables will cause you to run into problems. Try using fewer GET variables, and make them more relevant. Rather than useless ID numbers, use titles and other keyword-rich bits of text. This is an example of a better URL:

Page.php?var=category&var2=topic

If the page requires more variables, you may want to consider combining the variables by delimiting them with a hyphen or another unused character, and then splitting the variable in the target page.
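A minimal sketch of combining and splitting such a variable might look like this. The function names combine_vars and split_vars are hypothetical, and the approach assumes that none of the individual values contains the delimiter.

```php
<?php
// Hypothetical helper: joins several values into one GET variable.
function combine_vars(array $values, string $delimiter = '-'): string
{
    return implode($delimiter, $values);
}

// Hypothetical helper: splits the combined value again in the target page.
function split_vars(string $combined, string $delimiter = '-'): array
{
    return explode($delimiter, $combined);
}

// Building the link on the linking page:
$link = 'Page.php?var=' . combine_vars(['category', 'topic', '2']);

// In the target page the value would come from $_GET['var']:
$parts = split_vars('category-topic-2');
```

Here $link contains a single GET variable, and $parts holds the three original values in order.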

Next, we’ll cover two methods of making your dynamic URLs more friendly. The first is mod_rewrite:



It’s very important that you plan how to rewrite your URLs, because you don’t want to go back and change the links over and over again. Once you have a good, steady map of how to rewrite your links, you can go ahead and modify a test script to see if everything works. If your links work properly, then you’ll be able to move on and modify all existing links. Let’s take a look at the example below:

http://example.com/index.php?act=articles&id=21&page=0

That can turn into:

mod_rewrite URL: http://example.com/articles/21/0.html

Doesn’t that look much friendlier?

Once you have planned your URL rewriting, you are ready to set up your .htaccess file, which maps out what to do with each URL. Here’s an example of what to put in your .htaccess file. I’ll explain how this works below.

RewriteEngine On
RewriteRule ^([^/]+)/([^/]+)/([^/]+)\.html$ /index.php?act=$1&id=$2&page=$3 [L]

In the example above we have two lines. The first line, RewriteEngine On, starts the mod_rewrite engine.

The second line does all the work. It has three parts.

RewriteRule

Here you are starting a RewriteRule.

^([^/]+)/([^/]+)/([^/]+)\.html$

This is a regex that describes the friendly URL. Each ([^/]+) acts as a wildcard that matches one path segment (any run of characters other than a slash) and captures it, so we can use whatever we want in those positions and refer to the captured values in the last part of the rule. The \. matches a literal dot, and ^ and $ anchor the pattern to the whole path.

/index.php?act=$1&id=$2&page=$3 [L]

This is the final part of the rewrite rule. It tells mod_rewrite how to map the friendly URL matched by the pattern to the actual URL that our script was written for. mod_rewrite translates the pattern into this target automatically, and the [L] flag stops it from applying any further rules to the request.

This works by using regex capture groups. The first ([^/]+) becomes $1, the second becomes $2, and the third becomes $3 in the translation. You can use up to nine such groups ($1 through $9) in a single rule.

Say, for example, we had this URL: articles/21/0.html.

mod_rewrite will match it and serve /index.php?act=articles&id=21&page=0 behind the scenes!
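To see how the capture groups line up with the GET variables, the same mapping can be imitated in plain PHP with preg_match. This is only a sketch for experimenting, not how mod_rewrite itself is implemented; the function name map_friendly_url is hypothetical, and the pattern is a slightly strict form (escaped dot, no slashes inside a segment).

```php
<?php
// Hypothetical helper that mimics the rewrite rule for testing purposes.
function map_friendly_url(string $path): ?string
{
    // Three capture groups, one per path segment, followed by a literal ".html".
    if (preg_match('#^([^/]+)/([^/]+)/([^/]+)\.html$#', $path, $m) !== 1) {
        return null; // the path does not look like a friendly URL
    }
    // $m[1], $m[2] and $m[3] play the role of $1, $2 and $3 in the RewriteRule.
    return '/index.php?act=' . $m[1] . '&id=' . $m[2] . '&page=' . $m[3];
}

$target = map_friendly_url('articles/21/0.html');
```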

Believe it or not, that’s really all there is to mod_rewrite!

You can really get carried away with this by adding category titles to some of your rewritten URLs to make them even more search engine friendly. All you have to do is add another capture group to the pattern and simply leave that variable out of the translation.
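One way to picture this is a pair of helpers: one that builds links with a keyword-rich title in front, and one that reads them back while ignoring that title. This is a sketch under assumptions; the function names, the URL layout, and the slug rules are all hypothetical and would need to match whatever rewrite pattern you actually use.

```php
<?php
// Hypothetical helper: turns a title into a lowercase, hyphen-separated slug.
function slugify(string $title): string
{
    $slug = strtolower($title);
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    return trim($slug, '-');
}

// Hypothetical helper: builds a friendly link with the category title in front.
function friendly_article_url(string $categoryTitle, int $id, int $page): string
{
    return '/' . slugify($categoryTitle) . '/articles/' . $id . '/' . $page . '.html';
}

// Hypothetical helper: reads the link back. Group 1 (the category title)
// is matched but deliberately not used in the result.
function parse_article_url(string $path): ?array
{
    if (preg_match('#^/([^/]+)/([^/]+)/([^/]+)/([^/]+)\.html$#', $path, $m) !== 1) {
        return null;
    }
    return ['act' => $m[2], 'id' => $m[3], 'page' => $m[4]];
}

$url = friendly_article_url('PHP & Search Engines', 21, 0);
$vars = parse_article_url($url);
```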



Another way to make dynamic URLs appear static is by using Apache’s ForceType directive in combination with a little PHP code to interpret URLs like www.example.com/articles/21/0.html as referring to a file called “articles”, which is executed as a PHP script. This can be accomplished by inserting lines like these into the .htaccess file in the root of your web documents directory:

<Files articles>
ForceType application/x-httpd-php
</Files>

Now create a file called “articles”:

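<?php
// Strip the script path and the .html suffix from the requested URI,
// so that /articles/21/0.html becomes /21/0
$nav = $_SERVER["REQUEST_URI"];
$script = $_SERVER["SCRIPT_NAME"];
$nav = preg_replace("#^" . preg_quote($script, "#") . "#", "", $nav);
$nav = str_replace(".html", "", $nav);
$vars = explode("/", $nav);
// $vars[0] is the empty string before the leading slash
$article = $vars[1];
$page = $vars[2];
// require cannot take a query string, so pass the values through $_GET
$_GET["act"] = "articles";
$_GET["id"] = $article;
$_GET["page"] = $page;
require $_SERVER["DOCUMENT_ROOT"] . "/index.php";
?>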

It works in a similar way to the mod_rewrite example, but with a little more work.

Advantages

One big advantage to using dynamic pages as opposed to static pages is the ability to create content that is constantly changing and updated in real time. You can use the tricks above, combined with RSS feeds from other sites and other automatically “fresh” content to boost your ranks in Google and many other search engines.

Another advantage to using PHP is that you can make simple modifications to many scripts to create relevant page titles. Since the title is one of the most important on-page factors in SEO, special attention should be given to creating title tags that accurately reflect the page’s current content. Any HTML templates that are used in PHP pages can be altered to contain this line:

<title><?= htmlspecialchars($page_title) ?></title>

With this, $page_title can be set to keyword-rich text describing the page. Title text is also important in improving the click-through from SERPs, so be sure that the title tag doesn’t read like spam, but more like a title written by a human.
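As one possible way to fill $page_title, here is a minimal sketch that combines the current article’s title with the site name and escapes the result for use inside the title tag. The function name build_page_title and the “article title - site name” layout are assumptions for illustration.

```php
<?php
// Hypothetical helper: builds an escaped title from the article and site names.
function build_page_title(string $articleTitle, string $siteName): string
{
    $title = trim($articleTitle) . ' - ' . $siteName;
    // Escape special characters so the text is safe inside <title>.
    return htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
}

$page_title = build_page_title('PHP & SEO Tips', 'My Site');
```

Because this helper already escapes its output, a template using it would print $page_title directly rather than escaping it a second time.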

Conclusion
 
Mod_rewrite and ForceType can be powerful tools for any webmaster to add to their arsenal when dealing with PHP. Which method you want to use is up to you. I’ve employed both in my development for different purposes, and they both serve me well.
