Hiding Your Sensitive Data From Google and the World - Protect Your Data from Search Engines Continued
(Page 4 of 4)
Remove Content From Google's Index
Although it is often no use crying over spilled milk, contacting Google and following its instructions does help. Most of the work is done through your robots.txt file and the removal page at http://www.google.com/remove.html. That page explains what you can exclude from the index: entire sites, parts of sites, snippets, cached pages, dead links, and images.
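To make the robots.txt part concrete, the following is a minimal sketch, not Google's own tooling, that uses Python's standard urllib.robotparser module to check which URLs a well-behaved crawler would skip. The robots.txt content, the directory names, and the www.example.com host are all hypothetical placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are placeholders for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /backup/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if a compliant crawler may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/private/report.pdf", "/backup/site.sql"):
        url = "http://www.example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

A check like this only tells you what a crawler that honors robots.txt will do. It does not remove pages that are already indexed, and it does not stop crawlers that ignore the file.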
Secure Your Public Servers and Operating System
If your public servers and the operating systems they run on are not secured, the robots.txt file is of little help. Google will follow the instructions in robots.txt and will not index the files you specify, but not every crawler on the web obeys them. Some will take advantage of whatever files they find on your server even though you have told them to keep away. For especially sensitive data it is wise to use password-protected folders. This protects your privacy whether Google or another search engine is involved as an intermediary or someone goes directly to your server to hunt for data.
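Password protection is normally switched on in the web server's own configuration, so the code below is only an illustrative sketch of the underlying mechanism (HTTP Basic authentication) written as a small Python WSGI application. The user name, password, and the /private/ prefix are hypothetical, and a real setup would store a password hash and serve the pages over HTTPS.

```python
import base64
import hmac

# Hypothetical credentials and path prefix, for illustration only.
USERNAME = "editor"
PASSWORD = "change-me"
PROTECTED_PREFIX = "/private/"


def credentials_ok(auth_header):
    """Check an HTTP Basic 'Authorization' header against the expected login."""
    scheme, _, encoded = auth_header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # compare_digest avoids leaking information through comparison timing.
    user_match = hmac.compare_digest(user.encode(), USERNAME.encode())
    pass_match = hmac.compare_digest(password.encode(), PASSWORD.encode())
    return user_match and pass_match


def app(environ, start_response):
    """Minimal WSGI app: paths under PROTECTED_PREFIX require a login."""
    path = environ.get("PATH_INFO", "/")
    if path.startswith(PROTECTED_PREFIX):
        if not credentials_ok(environ.get("HTTP_AUTHORIZATION", "")):
            start_response("401 Unauthorized", [
                ("WWW-Authenticate", 'Basic realm="private"'),
                ("Content-Type", "text/plain"),
            ])
            return [b"Authentication required\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"OK\n"]
```

Unlike robots.txt, this kind of check is enforced by the server, so it applies equally to search engine crawlers, misbehaving bots, and people who request the URL directly.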
Database Security
If your site has a database as a back end, you need to protect it as well. Many techniques exploit vulnerabilities in databases and use SQL injection to get access to sensitive data. The measures you need to take and the exact steps to perform vary with the database you are running; in any case, applying the latest patches is a must. If possible, also talk to your web developer about keeping sensitive data in a second database that is not accessible from the net, so that only authorized individuals can retrieve it when necessary. Your web developer is also the person to ask about how to hide columns with sensitive data so that they are excluded from possible searches.
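As a small illustration of the SQL injection risk mentioned above, the sketch below contrasts a query built by string concatenation with a parameterized one. It assumes Python's built-in sqlite3 module and a hypothetical users table purely for demonstration; the same idea applies to other databases, though the placeholder syntax differs.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany(
        "INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn


def find_user_unsafe(conn, username):
    """For contrast only: user input is pasted into the SQL text."""
    query = "SELECT id, username FROM users WHERE username = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user(conn, username):
    """Parameterized query: the input is passed as data, never as SQL."""
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "' OR '1'='1"  # classic injection string
    print("unsafe:", find_user_unsafe(conn, payload))  # matches every row
    print("safe:  ", find_user(conn, payload))         # matches nothing
```

Parameterized queries close this particular hole, but they do not replace patching the database or keeping the most sensitive data off the public-facing server.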
Whose Job Is It?
At first sight, most of the tasks needed to keep a site from disclosing sensitive information seem to be a job for the system administrator rather than for the web marketer or the SEO expert. It is true that they require some knowledge of and skill in system administration, but most of these tasks are not that difficult and can be performed by the SEO expert, alone or together with the system administrator. And if, as happens very often, you are both webmaster and SEO expert, or you are optimizing your own site, then the exclamation “But it is not my job!” becomes absolutely pointless.
Many of the techniques used to check the security of sites (or to take advantage of a security oversight) are often called “Google hacking.” They can be used both by potential hackers and by you. Needless to say, it is much better for you to use them first and discover any potential holes before hackers do. What is more, the measures needed to secure your web server and the pages on it are very often neither difficult nor time-consuming, especially if you use automated tools to do the checks. There are several tools for performing automated tests for Google hacking: SiteDigger, Gooscan, WebInspect, and AppDetective. You may want to try several of them to see if your site is vulnerable.
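To give a sense of what such checks look for, here is a minimal sketch that builds a few search queries restricted to your own domain using the common operators site:, intitle:, inurl:, and filetype:. It is not how any of the tools named above work internally. The query templates and the example.com domain are illustrative assumptions, and the output is meant to be pasted into a search box by hand.

```python
# Hypothetical query templates; adjust them to the kinds of files your site holds.
TEMPLATES = [
    'site:{d} intitle:"index of"',  # open directory listings
    "site:{d} filetype:sql",        # database dumps
    "site:{d} filetype:log",        # server or application logs
    "site:{d} filetype:xls",        # spreadsheets
    "site:{d} inurl:admin",         # administration pages
]


def audit_queries(domain):
    """Return search queries limited to `domain` for a manual self-check."""
    return [template.format(d=domain) for template in TEMPLATES]


if __name__ == "__main__":
    for query in audit_queries("example.com"):
        print(query)
```

If any of these searches returns pages from your site, that content is already indexed, and it should be both protected on the server and removed from the index.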