Google Sees Flash. So What?

Late in June Google and Adobe announced an arrangement whereby the search engine could now read and index Shockwave Flash files. The move has been hailed as a huge step forward, since Flash has up until now been invisible to the search engines – and therefore invisible to searchers. But this isn’t quite the panacea it appears to be.

Let’s begin our discussion with the official announcements from Google and Adobe. According to Adobe’s web site, the software maker’s contribution to this deal includes “optimized Adobe Flash Player technology” which it is providing to both Google and Yahoo. The technology will help the search engines index the many rich Internet applications (RIAs) that use the SWF file format. Basically, using Adobe’s technology, the search engine bots can navigate through a live SWF application as if they were virtual users. Best of all, RIA producers won’t need to change anything about their content to make it searchable.

Granted, this is fairly exciting news. Google says the new technology covers SWF files of all kinds. “This includes Flash ‘gadgets’ such as buttons and menus, self-contained Flash websites, and everything in between,” the company explains in a post about the arrangement. Since TechCrunch estimated that there are about 73 million Flash files on the web, that’s a lot of stuff that was invisible suddenly getting indexed.

For those who keep track of these things, it’s interesting to note that no direct mention of Microsoft was made in either the Adobe or the Google announcements. A number of observers have speculated that this apparent snub has something to do with Silverlight, the technology Microsoft created to compete with Flash. As to whether Adobe is planning to provide its Flash indexing technology to other search vendors, its press release said only that “Adobe wants to help make all SWF content more easily searchable. As we roll out the solution with Google and Yahoo!, we are also exploring ways to make the technology more broadly available.”

While most of this is good news, there’s a lot that remains unsaid – and a lot that will remain invisible. Not everything in a Flash file will be indexed. And not all Flash files will be indexed. In short, Google and Yahoo may now be able to see these files, but they still need glasses – of a strength as yet unprovided by Adobe or anyone else – to make out certain details. Keep reading to see what I mean.

{mospagebreak title=What Can’t Be Seen}

The technology from Adobe lets Google and Yahoo index the textual content and links in Flash files. What does this mean as far as images? “If your Flash files only include images, we will not recognize or index any text that may appear in those images,” Google explained. “Similarly, we do not generate any anchor text for Flash buttons which target some URL, but which have no associated text.”

If you guessed that Google and Yahoo also can’t index video with Adobe’s technology, because they can’t index images, give yourself a gold star. Google explicitly states that it does not index FLV files “such as the videos that play on YouTube,” because these files don’t contain any text elements. While I’m reassured as a writer and an editor to know that text won’t be going obsolete on the web any time soon, it does mean that any ideas of doing an all-video web site need a serious rethink (so much for my dreams of an “Editors Gone Wild” site).

You’re probably aware that Google and other search engines have problems when confronted with JavaScript. If your web page loads a Flash file via JavaScript, it’s quite possible that Google won’t see it – in which case, it won’t be indexed. So you’re back to square one there.
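To make the JavaScript problem concrete, here is a rough Python sketch of what a crawler that does not run scripts would find on a page. It is only an illustration, not a description of how Google’s crawler works: the `StaticSwfFinder` class, the `find_static_swf` helper and both sample pages are hypothetical, and the second page assumes the Flash file is loaded with a script call in the style of the SWFObject library.

```python
from html.parser import HTMLParser


class StaticSwfFinder(HTMLParser):
    """Collects .swf URLs that appear directly in the HTML markup."""

    def __init__(self):
        super().__init__()
        self.swf_urls = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        candidates = []
        if tag == "embed":
            candidates.append(attr.get("src"))
        elif tag == "object":
            candidates.append(attr.get("data"))
        elif tag == "param" and (attr.get("name") or "").lower() == "movie":
            candidates.append(attr.get("value"))
        for url in candidates:
            # Ignore any query string when checking the file extension.
            if url and url.lower().split("?")[0].endswith(".swf"):
                if url not in self.swf_urls:
                    self.swf_urls.append(url)


def find_static_swf(html):
    """Return the .swf URLs visible without executing any JavaScript."""
    finder = StaticSwfFinder()
    finder.feed(html)
    finder.close()
    return finder.swf_urls


# Hypothetical page that references the Flash file in plain markup.
STATIC_PAGE = (
    '<object data="intro.swf" type="application/x-shockwave-flash">'
    '<param name="movie" value="intro.swf"></object>'
)

# Hypothetical page that only loads the same file from a script.
SCRIPT_PAGE = (
    '<div id="player"></div>'
    '<script>swfobject.embedSWF("intro.swf", "player", "400", "300", "9.0.0");</script>'
)

print(find_static_swf(STATIC_PAGE))
print(find_static_swf(SCRIPT_PAGE))
```

The first page exposes the file in its markup; on the second, the file name exists only inside script text, which this parser treats as opaque data. A check along these lines is a quick way to spot pages where the Flash file may never be discovered.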

Okay, let’s assume that the JavaScript issue is not a problem for you. But you have your Flash set up so that it loads files from external resources. That’s not a problem, right? After all, Google can read HTML, XML, and certain other kinds of files, so it shouldn’t have any problem reading these files – or so you’d think. Yes, Google can index those files, but it does so separately. It doesn’t yet know how to relate them to the Flash file that loads them, so they won’t be considered part of the same content. That’s really important if those files contain keywords or other content that you’re trying to use to optimize that particular Flash file!

There’s one more issue of particular concern to those who feature content on their site that is not in English. I’m going to quote directly from Google’s press release, because this is something I wouldn’t want to see anyone misconstrue. “While we are able to index Flash in almost all of the languages found on the web, currently there are difficulties with Flash content written in bidirectional languages. Until this is fixed, we will be unable to index Hebrew language or Arabic language content from Flash files.”

{mospagebreak title=The Good and the Bad}

Even with the downsides, this is forward progress. Flash developers have been dealing with the “invisible content” issue for more than a decade, and this move is better than nothing at all. On the other hand, it does mean that you might have to do things a little differently.

Why is this necessary? Didn’t I just say earlier that you wouldn’t have to do anything special to make your SWF files visible? Yes, but you might now have to do something to make them invisible. You may unintentionally start showing Google some less than informative content, now that the search engine can see the text your visitors see. “If you prefer Google to ignore your less informative content, such as a ‘copyright’ or ‘loading’ message, consider replacing the text with an image, which will make it effectively invisible to us,” the search engine helpfully suggests.

There is something else you should keep in mind. Getting indexed is not the same thing as getting ranked. Google helpfully illustrates the difference between its old and new way of indexing Flash files by showing an interesting pair of screen shots in its blog entry on the topic. The search performed was for “nasa deep impact animation,” without quotes. The “before” screen shot showed only the link and page title. The “after” screen shot included a reasonable blurb from the site. But here’s something Google didn’t show you:

The link that takes you to the actual animation is the fourth one down – and it’s beaten by links to HTML pages. That’s not a bad rank at all, but there’s a lesson to be learned here: even when a searcher is specifically looking for an animation, you’re likely to be beaten by an HTML page. So you’d better make sure that the HTML page you’re beaten by is your own, as NASA did here. That first link takes you to a page that describes the animation, and features two links to it.

By the way, this isn’t necessarily a bad thing. As a searcher, even if I’m looking for an animation or a video, I strongly dislike it when the item in question just starts loading right away. You can expect that your other visitors aren’t thrilled about that either. So the things that you used to do to make sure your pages with animation will rank on the search engine results pages still apply.

{mospagebreak title=More Evolution Than Revolution}

This move sounds like less of a giant leap and more of a natural step forward when you consider that the search engines (especially Google) have been trying to index rich content for some time. Vanessa Fox, writing for Search Engine Land, noted that Google has been able to extract some text and links from Flash files for a while now. Adobe’s technological help makes the process “less error prone.”

Perhaps the most significant change for Flash files won’t be in becoming visible so much as becoming a little more understandable to searchers. Fox observed that improvements in the snippets displayed under search results describing the files might be the most noticeable difference. “Before, Google often couldn’t extract any content from a Flash file, so the description for a Flash page was often either empty or would consist of the only text available for the file, such as the Flash version or the word ‘loading,’” she stated.

So how does this change SEO? Fox advises that you implement Flash in such a way that a unique URL is provided for each set of content. Since Google can follow interactions with Flash (to a limited degree), it might load a particular URL as relevant to a certain query, when the actual item that is relevant is deeper in the SWF file – so when the searcher clicks that link, “the content won’t be found on the page. The searcher will have to interact with the application until that content is loaded. Searchers may instead feel frustrated and abandon the page,” Fox explained.

The other way this changes SEO is that you may now have to fight designers and clients who insist that, since Google can now read Flash, it’s reasonable to build lots of Flash into a website, or even go whole hog with an all-Flash video site. They will probably be dismayed when they learn the reality; in some ways, things have gotten more difficult, since you can’t tell quite as easily what is and is not indexed. Google won’t show you the file it crawled.

John Andrews, writing in his blog Competitive Webmastering & SEO, notes that “SEO for Flash just got more expensive, because it got more sophisticated…Where SEO for Flash used to be limited to a reasonable set of success metrics, we now have an opportunity to help Google much more as it seeks to understand what the Flash content means for the user.” Interestingly, his suggestion is that “The first thing smart SEOs need to do now is block Google from indexing Flash, simply because we don’t control Google’s interpretation of the meaning of Flash content. I don’t think that is what Adobe intended.” Indeed. 
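Andrews doesn’t spell out how to do the blocking. One common approach, sketched here as an assumption rather than as his method, is a robots.txt rule that keeps Googlebot out of the directory holding your SWF files. The `/flash/` directory, the example.com URLs and the `googlebot_may_fetch` helper are all hypothetical; the Python below simply uses the standard library’s robots.txt parser to confirm that such a rule behaves as intended.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of a /flash/ directory.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /flash/
"""


def googlebot_may_fetch(url, robots_txt=ROBOTS_TXT):
    """Return True if the given rules allow Googlebot to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# The SWF file inside /flash/ is blocked; the HTML page about it is not.
print(googlebot_may_fetch("http://www.example.com/flash/intro.swf"))
print(googlebot_may_fetch("http://www.example.com/deep-impact.html"))
```

Keeping the Flash files in one directory means a single rule covers all of them, while the HTML pages that describe them stay open to the crawler.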
