Google’s Search Across Computers, a Privacy Faux Pas?

Since Google released their Desktop Search 3, nearly everyone has become a security expert and criticized the software feature allowing you to search across your computers. I’ve seen little actual exploration of the feature, and it’s about time people take a real look into it. Search Across Computers (SAC) is not the security problem that it has been made out to be. This article demonstrates the software and evaluates the risk.

It’s amazing how many people can be outraged by Google’s Desktop 3 compromising their privacy without having used it. Speculation is flying around the net about how a search company wants to steal all your information, but far fewer people have really evaluated the feature. When searching for other people’s experiences with SAC, I found one thread on a newsgroup explaining how to install it and literally dozens of people asking how to disable it. The most informative articles are ones that copy and paste Google’s press statements and help pages.

Don’t get me wrong, there is a loss of some privacy—which Google even admits to—but most people’s concerns are overstated and based on hype. A great example of zealots enraged by SAC is the Electronic Frontier Foundation (EFF). They called for a complete boycott of the software:

If you use the Search Across Computers feature and don’t configure Google Desktop very carefully—and most people won’t—Google will have copies of your tax returns, love letters, business records, financial and medical files, and whatever other text-based documents the Desktop software can index. (Source: EFF)

As we’ll see once I start showing how Google Desktop works, this state of alarm is simply ridiculous and uninformed. They do make a good point eventually, which I’ll address later. Another critic, one that is a little less silly, is Gartner. They say the software is a security risk for businesses:

The risk to enterprises, according to Gartner, lies in how this shared information is pooled by Google. The data is transferred to a remote server, where it is stored and can then be shared between users for up to 30 days.

Gartner said in a report on Thursday that the “mere transport (of data) outside the enterprise will represent an unacceptable security risk to many enterprises,” as intellectual property could be transported out of the business. (Source: CNet)

Again, the risk is severely overrated. The article says the information can be shared between users, when in fact it cannot. It can only be accessed by one Google user. Equally damaging internet technology already exists and is in common use. Read on to see why.

{mospagebreak title=Installation and Setup of SAC}

After downloading and running the installation file, Google asks us to do some initial configuration. Keep in mind, the EFF says that most users have to be super-careful to not have it turned on.

Well look at that. The Search Across Computers option is turned off by default. In fact, you have to activate two (2) check boxes and also enter your Google account information to make it work. I don’t think this is very cryptic either. In case you can’t read the screenshot, here’s Google’s text under the checkbox:

(This feature stores your indexed files on Google Desktop servers for copying to your other computers. Learn more about this feature or our Privacy Policy.)

That’s clear as far as I’m concerned, and it links to the SAC information page, which explains it in even more detail. No, you do not have to be careful at all to have the feature turned off. You really have to go out of your way to turn it on and associate it with your Google account. I’m not sure if the EFF has an agenda against Google, but they just lost credibility to me. So, I avoided activating it right away, then finished the installation and went into the Desktop preferences.

These are the general indexing preferences. It’s very simple to remove file types from being indexed (for example, just uncheck Excel). You can also exclude areas of your computer or the internet from the index. I excluded SEO Chat from the web history index as an example. However, excluding folders on my computer did not work for me, which may be a bug. I wasn’t very worried about it, and it presumably works for other users.

Since I didn’t turn on Search Across Computers before, I decided to do it in its settings tab. Once again, you have to go out of your way to activate it, and you’ll see Google’s warnings.

Users must define what gets uploaded. If you select Web history, it uploads a browsing cache from Internet Explorer, Firefox, Netscape, and Mozilla. It doesn’t look like they are Opera friendly. If you select Documents, it uploads Word, Excel, PowerPoint, PDF and text files in your My Documents folder.

So I entered my Google account. This is a necessary step, because Google doesn’t devour your information and share it with just anyone. The company uploads the information on your computer to your own account, so that way only you can access the files through your login.

Your files are only accessible to whoever has your login and password, and of course, anyone in control of Google’s servers. The files are not widely available, and its threat to a company’s security is similar to other services. However, I’ll have to talk more about this later, since I want to show SAC in action first.

{mospagebreak title=SAC Data Collection and Removal}

Google does not slurp up everything on a computer. It will not steal all your financial records and love letters, as the EFF is afraid. Their warning is overblown to the point of inaccuracy.

Documents that existed before Search Across Computers was activated will not be uploaded. Old files will be indexed locally but will never be sent to Google’s servers. Only files created and edited after you activate the feature will be sent online, and even then only those of a correct file format which are also in an SAC-index-enabled folder. Additionally, Search Across Computers will never upload HTTPS web history (secure web pages). This means it will not share sessions on your email or banking websites. Insecure webmail and logins can still be indexed if you use IE, Firefox, or Netscape to access them, but reputable pages do use HTTPS to secure the connections.
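To make those rules concrete, here is a minimal Python sketch of the upload conditions as I observed them. It is only a model of the behavior described above, not Google Desktop’s actual code: every name in it (`SacSettings`, `file_is_uploaded`, `page_is_uploaded`) is hypothetical, and the extension list is my assumption about which file suffixes correspond to Word, Excel, PowerPoint, PDF and text files.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import PureWindowsPath
from urllib.parse import urlparse

# Assumed mapping of the shared document types to file extensions.
UPLOADABLE_EXTENSIONS = {".doc", ".xls", ".ppt", ".pdf", ".txt"}


@dataclass
class SacSettings:
    """Hypothetical stand-in for the Search Across Computers preferences."""
    activated_at: datetime      # when the feature was switched on
    shared_folders: tuple       # folders enabled for sharing
    share_documents: bool = True
    share_web_history: bool = True


def file_is_uploaded(path: str, modified_at: datetime, settings: SacSettings) -> bool:
    """Return True only if a file passes every condition described above."""
    if not settings.share_documents:
        return False
    file_path = PureWindowsPath(path)
    # Condition 1: the file must be one of the shared document types.
    if file_path.suffix.lower() not in UPLOADABLE_EXTENSIONS:
        return False
    # Condition 2: it must have been created or edited after activation.
    if modified_at <= settings.activated_at:
        return False
    # Condition 3: it must live somewhere inside a shared folder.
    return any(PureWindowsPath(folder) in file_path.parents
               for folder in settings.shared_folders)


def page_is_uploaded(url: str, settings: SacSettings) -> bool:
    """Plain HTTP history can be shared; HTTPS pages never are."""
    if not settings.share_web_history:
        return False
    return urlparse(url).scheme.lower() == "http"
```

The point of the sketch is that the conditions are combined with AND, not OR. A tax return saved last year fails the date check, and a banking session fails the HTTPS check, no matter what else is switched on.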

Normally, data uploaded to Google’s servers is deleted after 30 days. Honestly, this is not a very long time. It’s long enough for you to continue working on projects between computers, but it isn’t long enough for anyone to hack your account and look up a long-term history (or even a good short-term history) of your PC usage. There is also a way to manually wipe out what is stored.
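The 30-day window is easy to picture as a simple expiry check. The Python sketch below is purely illustrative: the function names and the dictionary used as a store are hypothetical, and I am assuming the clock starts at upload time, which Google does not spell out.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # the retention period quoted above


def is_expired(uploaded_at: datetime, now: datetime) -> bool:
    """Assumption: an item expires once 30 full days have passed since upload."""
    return now - uploaded_at >= RETENTION


def purge(store: dict, now: datetime) -> dict:
    """Return a copy of a {file name: upload time} store without expired items."""
    return {name: uploaded_at
            for name, uploaded_at in store.items()
            if not is_expired(uploaded_at, now)}
```

Under this model, anyone who breaks into the account sees at most the last 30 days of shared files, never a long-term archive.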

After entering my Google account, SAC displays a new button in the preferences labeled Clear my files from Google. I’d say the purpose of this button speaks for itself.

Honestly, I can’t say how responsive this button is. I used it to clear Google’s cache of my computers, but I can still search the files on the remote computer which were already shared. I believe that Google did remove its cache of my files, but the files are now stored in the Google Desktop cache on this computer.

So, it seems like the actual index is downloaded to all connected computers, and it is not just accessed on Google’s servers. So even though Google deletes its copies of your files after 30 days, your computer’s Google Desktop index may still contain those files (I don’t know since it hasn’t been 30 days yet). To get rid of these files, I can use the Remove from Index tool at the top of the Desktop Search pages.

Click the link there, and then the result page turns into a checklist of items you can remove. Just check them off and click remove, and they are pulled from the local index. Don’t count on them being removed from other computers, though.

That said, you can also protect your desktop search tools from other people using them. You can lock your desktop search, which will require you to enter your Windows password before getting results from the local index. This will restrict anyone from jumping on one of your PCs and seeing what is on them all. The Lock Search option is in the middle of the menu.

All these precautions that let users control Desktop Search and Search Across Computers make me suspect that the EFF never installed the program. If they had, their warning about everyone screwing up configuration and dumping all their personal information online would have sounded ridiculous to them.

Well, I’ve mentioned most of the relevant aspects of Google Desktop, but I haven’t demonstrated the actual Search Across Computers results. Let me show how the result pages come out and how well the service works.

{mospagebreak title=Using Google’s Search Across Computers}

After installing Google Desktop 3 on two computers in the office, I activated Search Across Computers and connected them to my Google account. Less than half an hour later, documents on the second computer started appearing in my desktop searches. It seems like files usually arrive on other computers about 20 minutes after they are saved.

The only thing that identifies a result as coming from SAC is that the name of the computer where the file originated appears at the beginning of the green file name. In this case, that second computer is named WRITER2, and it belongs to Terri Wells. The top two results on the screenshot above came from her computer, and the bottom one came from my own.

To search only on a remote computer, you can add the command “machine:WRITER2” to the search query.
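If you build queries in a script or a bookmarklet, the operator is just another token in the query string. Here is a small Python sketch of that idea. The helper names are my own, and placing the operator at the end of the query is an assumption; I only tested it typed by hand into the search box.

```python
from typing import Optional, Tuple

MACHINE_PREFIX = "machine:"


def restrict_to_machine(query: str, machine: str) -> str:
    """Append a machine: operator so only that computer's files match."""
    return f"{query} {MACHINE_PREFIX}{machine}".strip()


def split_machine_filter(query: str) -> Tuple[str, Optional[str]]:
    """Separate the plain search terms from a machine: operator, if any."""
    machine = None
    terms = []
    for token in query.split():
        is_operator = token.lower().startswith(MACHINE_PREFIX)
        if is_operator and len(token) > len(MACHINE_PREFIX):
            machine = token[len(MACHINE_PREFIX):]
        else:
            terms.append(token)
    return " ".join(terms), machine
```

For example, `restrict_to_machine("quarterly report", "WRITER2")` gives the query `quarterly report machine:WRITER2`, and `split_machine_filter` turns that string back into the search terms and the computer name.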

So, I decided to access that top text file she had been working on. After clicking on it, I come to this cached page.

Word files and other documents look similar. It seems like Google strips a lot of formatting to save file space and bandwidth. Still, it’s good enough for pulling from your second computer for reference or copying into a new file and editing. This is actually very cool, and it makes a writer’s job a lot easier if, for example, he has to finish a story at home.

I also noticed that SAC shared some web history that is private. These pages required a login and password to access them, but they are not secured by HTTPS. For example, the screenshot of the cached page below is from the administration side of one of our websites.

As you can see, it shows that Terri was logged in up there. Though there is no especially sensitive information there, other company websites might be insecure and have information that shouldn’t be shared. Yes, it might end up in Google’s cache, but keep in mind that (1) only the person who owns the Google login and Google’s sys admins can access the info and (2) you can deactivate indexing of all web history or deactivate indexing of specific sites.

Before I finish showing off Search Across Computers, take a look at the timeline log of files.

As you can see, the timeline logs all file changes on all computers that are connected to the Google account. Changes on Terri’s computer are marked with [WRITER2] in the green file path. This is actually really handy when tracking down things you’ve done recently on any PC.

{mospagebreak title=Examining the Actual Privacy Threat}

Having actually seen Google’s new tool, I want to take a look at all the controversy that has come up around it. I already said that the EFF claims are overblown. After saying that it’s near impossible to keep SAC from activating and that it will gobble up all of your computer’s information (both untrue), they go on to say:

The government could then demand these personal files with only a subpoena rather than the search warrant it would need to seize the same things from your home or business, and in many cases you wouldn’t even be notified in time to challenge it. Other litigants—your spouse, your business partners or rivals, whoever—could also try to cut out the middleman (you) and subpoena Google for your files. (Source: EFF)

It’s true that the privacy laws surrounding search engines are less defined than they should be, so the government might get away with only using a subpoena. However, a subpoena is still a court order and is not something a spouse or business rival can just randomly grab when they want to see what you’ve been up to. Keep in mind also that Google is currently fighting a subpoena from the government, which is only requesting statistical information on search queries (which does not even identify users or reveal personal information). The laws surrounding search engine privacy are vague enough that the case can go either way, and actual requests for personal information might even be harder to get past Google. This is not a problem with Google’s software or service, but a problem with outdated legal documents that need to be cleaned up.

The risk of subpoena is not a sensible reason for a home user to avoid trying out the software (unless you are into some really illegal stuff). You must opt in to everything Google sends to its servers, and the files are held for only 30 days. You have a lot of control over what is sent away, and Google makes it obvious. Nobody else sees your files, barring the slim chance of a court order or your Google account being hacked. This is a possibility with storing information on any third-party service.

For enterprises, however, the threat is not so much Google as careless employees. Since the employees are not handling their personal information, they may not care that they are logged into a corporate page that is not secured by HTTPS while SAC’s Web history is activated. Even then, the information only spreads to employees’ personal Google accounts for a few weeks. No harm there, and they can even do work from home. Most employers like that idea. Of course, it is remotely possible a court order could turn up any data on the Google Desktop server, if the employee’s Google account is discovered to be related to a company lawsuit. Then again, an employee could intentionally try to hurt the company simply by giving out their Google login and password (though an email attachment full of sensitive data can be just as damaging). That is just a risk an enterprise has to determine on its own.

There are certainly some great perks to using the software, because it really beats burning CDs or emailing text files. Everything stays up to date with very little effort too. But like most things, it comes down to your personal comfort level. When the service was first started, Google admitted:

“With everything, you trade privacy for a value-add,” says Mayer at Google. “You want a driver’s license? First, you have to tell the state what color your eyes and hair are. I view this the same way.”

Says Weiner: “There will be confusion and concern about this initially, but people will be willing to accept it. Linking multiple devices will be greatly welcomed.” (Source: USA Today)

Google isn’t deceiving anyone. Personally, I don’t feel this is much different from some things I’m doing online already. I have Gmail and Hotmail accounts which I use to send personal and work files. These face the same risks of being hacked or having a court order pull information from them. But neither has ever happened. These services are not the same, but the biggest difference between them is that the Gmail and Hotmail emails stay on the server indefinitely (thanks to high storage), which can be more damaging than my Google Desktop files that are only online temporarily.

It seems that more than anything, the trouble people are voicing about SAC comes down to needless Google bashing. This happened when Gmail came out, but now everyone has a Gmail account. Yes, there is some privacy trade-off here, but I’m saying that a lot of this controversy is exaggerated. And maybe this feature is useful enough to be worth the trade-off for many of us. After all, how many people will really try to break into my account to see what article is coming up on SEO Chat?
