More CrawlTrack Tips and Techniques for Webmasters

In part one, you learned the important concepts and steps for installing and integrating CrawlTrack into your website. In this part you will learn how to maximize the use of CrawlTrack in dealing with website statistics. You will also learn how to maximize the security of your website.

Before you read this tutorial, make sure you have read the first part and that you already have CrawlTrack fully integrated into your website. This is required for you to understand this part.

Interpreting and Gathering Website Statistics Data

First, log on to your CrawlTrack account. Because CrawlTrack is installed under the root directory of your website (for example, http://www.thisisyourwebsite.com/crawltrack3-1-1/), type that URL into the browser and press Enter. You will then need to enter your username and password. After a successful login, you will see the CrawlTrack Dashboard.

In the Dashboard, you will see eight major sections of web statistics data. The three most important ones are covered here.

The first part is Visitors. This shows the number of visitors to the website in a certain time interval (the default setting is the current day). You can customize the period of data collection. Just go to “Period Choice”; a drop-down appears with the following elements:

1. Day

2. Last 8 days

3. Week

4. Month

5. Year

6. Everything

Setting the period affects not only the Visitors data but other parts as well (like the “server load,” “crawlers” and “hacking attempts” sections).

The data presented in the Dashboard → Visitors tab is self-explanatory and similar to Google Analytics. To see the details of the “Visitors” web analytics data, click the “Visitors” link. You can then view the detailed statistics, for example “browsers used by visitors,” “origin visits,” “website and other search engines,” etc. The details page of the “Visitors” section should look like the screen shot below:

 

To browse another section of data, click the “Home” icon. This will take you back to the dashboard panel.

The second important section of CrawlTrack covers crawlers. This feature is not available in popular products like Google Analytics. One of the most important pieces of information you can get from the Crawlers section is when Googlebot or other search engine bots visit your website. This section also reveals other bots that CrawlTrack can detect, and you can decide whether you want to block them using .htaccess if, for example, they are consuming too much bandwidth.
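As a rough illustration of the .htaccess blocking mentioned above, the Python sketch below generates mod_rewrite rules that answer 403 Forbidden to a list of crawler names. It assumes your server runs Apache with mod_rewrite enabled and that .htaccess overrides are allowed. The function name `build_user_agent_block` and the bot names are hypothetical; substitute the crawler names you actually see in the Crawlers section.

```python
import re


def build_user_agent_block(agents):
    """Return .htaccess lines that send 403 Forbidden to the given user agents."""
    if not agents:
        return ""
    lines = ["RewriteEngine On"]
    for i, agent in enumerate(agents):
        # Escape regex metacharacters so the crawler name is matched literally.
        pattern = re.escape(agent)
        # Every condition except the last is chained with OR.
        flags = "[NC]" if i == len(agents) - 1 else "[NC,OR]"
        lines.append(f"RewriteCond %{{HTTP_USER_AGENT}} {pattern} {flags}")
    # "-" means no substitution; F returns 403 Forbidden, L stops rule processing.
    lines.append("RewriteRule .* - [F,L]")
    return "\n".join(lines) + "\n"


# Hypothetical crawler names, used only for illustration.
print(build_user_agent_block(["ExampleBot", "HypotheticalScraper"]))
```

Paste the printed lines into the .htaccess file in your website root, then confirm in CrawlTrack over the following days that the blocked crawlers no longer register visits. Keep in mind that a crawler can change its user agent string, so this is a convenience rather than a guarantee.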

The summary page for the “Crawler” section gives you the total number of crawlers visiting the website, as well as the names of the main crawlers. To see details, click the “Crawler” link; a table shows the “Visits detail” of the web crawlers. For example, if you need to analyze the activity of Googlebot on your website, just click the “Googlebot” link in the “Visits detail” table.

The above data shows that Googlebot has made 473 visits to the website since CrawlTrack was installed (“Everything” is set in the period choice). It also shows that Googlebot has visited around 67.4% of the website, so not all of its pages are being crawled. This is because some pages have been blocked with robots.txt.
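To see how robots.txt rules limit the share of pages a crawler can reach, the following sketch uses Python's standard `urllib.robotparser` module. The robots.txt content, the page paths and the helper name `crawlable_share` are all made up for illustration; they are not taken from the website discussed in this article.

```python
from urllib.robotparser import RobotFileParser


def crawlable_share(robots_txt, paths, user_agent="Googlebot"):
    """Return (allowed_paths, percent_allowed) for the given robots.txt text."""
    if not paths:
        return [], 0.0
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # can_fetch() applies the robots.txt rules that match this user agent.
    allowed = [p for p in paths if parser.can_fetch(user_agent, p)]
    return allowed, 100.0 * len(allowed) / len(paths)


# Hypothetical robots.txt and page list.
ROBOTS = """User-agent: *
Disallow: /admin/
Disallow: /private/
"""
PAGES = ["/", "/about", "/admin/login", "/private/report"]

allowed, percent = crawlable_share(ROBOTS, PAGES)
print(allowed, percent)
```

Comparing a figure like this with the percentage CrawlTrack reports can help you judge whether the uncrawled pages are simply the ones you blocked on purpose.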

If you need to know whether a certain page or URL has been crawled by Googlebot, you can find this out in the “visits detail” section. For example, since CrawlTrack was installed, “/about” has been crawled 9 times.

The third important part of CrawlTrack reveals statistics for hacking attempts. This feature is also missing from most other web analytics packages. The data you see in the dashboard reveals two types of attacks: code injection and MySQL injection. By default, CrawlTrack will only record the attacks, but will NOT block them.

To block hacking attempts, go to the “Administration” page (with a “wrench” icon). Under “Tools,” find the section labeled “Hacking protection parameters.” Check “Record it and block it,” and then click OK.

Once this is set, you can view the details of hacking attempts by going to the dashboard and clicking the “Hacking Attempts” link. On the next page, click the number of hacking attempts; it is a hyperlink (see the screen shot below, inside the red circle):

You can then see the attack details after clicking the link. The IP address of the attacker, the number of attacks and the script injected can be seen in the table. If you see a massive number of attacks, you can then decide to report these IP addresses to the authorities or block the IP addresses in your .htaccess.
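If you decide to block attacking IP addresses in .htaccess, a small script can turn the addresses from the table into deny rules. The sketch below is only one possible approach. It assumes Apache 2.4 syntax (`Require not ip` inside a `<RequireAll>` block) and that your host allows these directives in .htaccess. The function name `build_ip_block`, the threshold of 10 attacks and the sample addresses (taken from ranges reserved for documentation) are hypothetical.

```python
import ipaddress


def build_ip_block(attack_counts, min_attacks=10):
    """Return Apache 2.4 .htaccess lines denying IPs with at least min_attacks attacks."""
    blocked = []
    for raw_ip, count in attack_counts.items():
        if count < min_attacks:
            continue
        try:
            # Validate and normalise the address; malformed entries are skipped.
            ip = str(ipaddress.ip_address(raw_ip.strip()))
        except ValueError:
            continue
        if ip not in blocked:
            blocked.append(ip)
    if not blocked:
        return ""
    lines = ["<RequireAll>", "    Require all granted"]
    lines += [f"    Require not ip {ip}" for ip in blocked]
    lines.append("</RequireAll>")
    return "\n".join(lines) + "\n"


# Hypothetical attack counts copied by hand from the attack details table.
counts = {"203.0.113.5": 42, "198.51.100.7": 3, "not-an-ip": 99}
print(build_ip_block(counts))
```

With the sample data, only 203.0.113.5 ends up in the generated block, because the second address falls below the threshold and the third entry is not a valid IP address. Review the list before pasting it into .htaccess so that you do not lock out legitimate visitors.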

Integrating CrawlTrack into Your Other Domains

A single, complete installation of CrawlTrack at your main domain allows you to integrate CrawlTrack into your other domains without uploading the files and going through the installation process again. However, this requires that the servers for those domains, as well as their scripting language and databases, are fully compatible with CrawlTrack (read part one for the requirements).

To integrate CrawlTrack into your other domains, follow the step-by-step procedure below:

Step 1. At your main domain, go to the CrawlTrack administration page.

Step 2. Find “Add a website” and click that link.

Step 3. Enter the website’s name and domain URL (refer to part one for details) and click OK.

Step 4. When you see the text “The website has been added to the database,” click OK.

Step 5. Go back to the CrawlTrack administration page.

Step 6. Click the link that says “Create tags to insert in your website.”

Step 7. Under “Choose a Website,” check the website (the domain, other than your main domain, that you need to integrate with CrawlTrack). Then click OK.

Step 8. Select the tags under “Tag to be used if the site is hosted on a different server than CrawlTrack (you also need to have the fsockopen and fputs functions activated).”

Step 9. Follow the same insertion procedure for the PHP embed tag that was discussed in part one. Note that the embed tags for the other domain are DIFFERENT from the standard tags. Below is what the code should look like when pasted into your PHP template (the last part of the Google Analytics code is shown at the top):

 

Step 10. Go back again to the CrawlTrack administration page.

Step 11. In the tools section, click “Create a test crawler.”

Step 12. Click the “Create a test crawler” button.

Step 13. When you see the message “The crawler has been added to the database,” click OK.

Step 14. Open your other domain (the one that needs to be integrated with CrawlTrack), using the same browser and computer.

Step 15. After the website has fully loaded, check to see if the test crawler has been registered under the details of the “Crawler” section.

When you review your CrawlTrack logs after one to three months, especially the “hacking attempts” log, you may be surprised by how many hackers are trying to inject malicious code into your website.

Without CrawlTrack blocking those attempts, it is highly possible that the website could be compromised.

Repairing a hacked website is a pain, because you need to reinstall/clean the database as well as the CMS scripts. Not only that, but you have to make sure there are no doors open to exploits.

With respect to web analytics, CrawlTrack performs well at reporting statistics, because it provides data that cannot be found in other web analytics products. A good example of this is the “Crawler” reports.

There are a lot of other advanced features offered by CrawlTrack which are beyond the scope of this article. We will focus on those in the future.
