A Different Way to Search

Google became the leader in Internet search in a relatively short period of time, supplanting the search engines that came before it. It gave us a new approach to search. But the web has changed. Is Google’s approach running out of steam? Could we see another search engine come along and supplant Google with a different approach to search? This article considers Google’s rise, the limitations of its current search philosophy, and how other contenders are trying to solve the same problem.

As far as many people are concerned, there is only one search engine. The name Google is almost synonymous with the web. So powerful is Google’s position, as it sits on just about every browser toolbar and handles around 70% of US Internet searches, that anyone under thirty will probably find it hard to remember that it wasn’t always this way.

Before Google, there was real competition in the search engine market. Yahoo, AltaVista, Excite, Lycos, WebCrawler, HotBot and Infoseek all had, at one time or another, genuine aspirations to be leader of the pack. Yet most of these engines have since been reduced to occupying niche markets, becoming nostalgic memories of the way things used to be on the web before 1998. That was the year when the Google phenomenon really began to take off, forever changing the way people search for information online.

Google moves the goalposts

When Google took the world by storm, it was to an unprecedented fanfare of hype about its search technology. The search engines of the time were either directory based, like Yahoo, or used a search algorithm based on analyzing page content to determine relevancy, like AltaVista. Both of these approaches were deeply unsatisfactory, the first because directories tended to be so incomplete, and the second because so many of the results were wide of the mark.

This state of affairs often required surfers to waste time making multiple searches on different engines to find what they were looking for. Even then, they very often had the uneasy sense that they were somehow missing it, however convinced they were that it must be out there somewhere.

The market was more than ready for Google, with its new search technology based on analyzing the links between sites to determine relevance to search terms. The core idea that underpinned this approach was that the more incoming or back links a site had, the higher its perceived value by other members of the web community and therefore the higher its deserved ranking in the search results. This, inevitably, still neglected many sites with strong content but few inbound links; nonetheless, it was a revelation from the perspective of searchers who subjectively felt as though Google spoke directly to them and understood their requirements like no search engine before.
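The core idea can be illustrated with a toy example. The sketch below simply counts inbound links and sorts pages by that count; it is a deliberately naive stand-in for the principle described above, not Google's actual algorithm, and the function name and sample sites are invented for illustration.

```python
from collections import Counter

def rank_by_backlinks(links):
    """Rank pages by their number of distinct inbound links.

    `links` maps each page to the list of pages it links to.
    This is a naive illustration, not a real ranking algorithm.
    """
    # Start every known page at zero so pages with no backlinks still appear.
    inbound = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):   # count each linking page only once
            if target != source:      # ignore self-links
                inbound[target] += 1
    # Highest count first; ties broken alphabetically for stable output.
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical toy web of four sites.
toy_web = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
ranking = rank_by_backlinks(toy_web)
```

Here c.example comes out on top because three other sites link to it, while d.example, which nobody links to, lands at the bottom regardless of how good its content might be.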

Almost overnight, much of the lottery aspect went out of Internet searching. By and large, the results returned by Google worked. For the first time it was possible to use the Internet with a reasonable degree of efficiency. From a distance it can be difficult to understand the impact of this, but at the time it was the online equivalent of exchanging your Ford for a Ferrari.

More Google

Ten years on, however, Google is facing its own problems. Growing concerns over privacy, copyright and censorship issues grab the headlines, but they serve as a disguise for a deeper malaise: the increasing corporatization and predictability of Google’s search results. In many ways the market seems to be turning full circle, arriving back at a position where even Google’s results no longer reflect the true needs of its users.

Ironically, this is the direct outcome of the very technology that transformed searching in the first place. On the essentially level playing field that was the Web in 1998, Google’s technology served to reinforce a fundamental democracy, where content and its inherent quality wielded significantly more power than either scale or financial muscle. But as corporations have tightened their control over the Web, Google’s strategy has started to look dated, serving the needs of those corporations, suppliers of goods and services, at the expense of what many users are really looking for. Not everybody who uses the Web is trying to buy something, but in the Google model, non-consumers are becoming increasingly disenfranchised under the weight of commercialization.

This is due to a number of factors. One is that the site optimizations necessary to obtain a high Google search ranking tend to require specialized knowledge or to cost a lot of money, which favors wealthier corporations whose sites therefore tend to be promoted towards the top of the listings. Another is the way in which the quantification of back links as an arbiter of quality tends to reinforce the high ranking of already popular sites, effectively creating a barrier that prevents new sites from improving their ranking. Given these factors, it is easy to see the extent to which Google’s once-radical ranking methodology now serves to sustain the status quo.

Of course it’s all very well to criticize Google, but this is a pointless exercise without being able to offer meaningful alternatives, either in terms of new search technologies that are less susceptible to corporate influence, or alternative search engines that more frequently meet user requirements than corporate ones.

Query reprocessing

One search enhancement technology that has been attracting interest recently is query reprocessing. This approach, implemented by sites such as Hakia and Powerset (recently purchased by Microsoft), attempts to subject search queries to logical analysis in order to better determine the intentions of the searcher.

Hakia, which describes itself as a semantic search engine, claims that this leads to significant improvements over Google. Where Google tends to return the most popular results, Hakia says that its concept matching technology improves the quality of the results, reducing both wasted search time and the use of misleading information. According to Hakia, popularity and quality are not the same thing: the quality of a search result is determined by the credibility of the source, the newness of the material and its relevance to the query. Hakia has implemented a number of specific enhancements that it claims lead to improvements in these areas, including:

  • Categorization
    This is the subdivision and display of short – often single-word – queries in categorized form to help the searcher distinguish between various interpretations of the search.

  • Parallelism
    This feature enables certain terms in the search query to be dynamically replaced with others that have the same meaning in order to expand the result set. Hakia provides the example of "cure" replacing "treat" in a health query to enable a more representative set of results to be obtained.

  • Generalization
    A typical search engine query contains general terms that, if treated literally, artificially limit the result set. Hakia’s generalization function will show results, for example, that contain the names of specific car manufacturers in response to a query that contains the word "car."
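To make parallelism and generalization more concrete, here is a minimal sketch of query expansion. It is not Hakia's implementation: the two lookup tables, their contents and the function name are assumptions made purely for illustration, and a real semantic engine would draw on a far larger lexicon.

```python
from itertools import product

# Hypothetical lookup tables, invented for this example.
SYNONYMS = {"treat": ["cure"], "cure": ["treat"]}   # parallelism
SPECIFICS = {"car": ["ford", "ferrari"]}            # generalization

def expand_query(query):
    """Return the original query followed by variants with substituted terms."""
    options = []
    for word in query.lower().split():
        # Each word may stay as-is or be swapped for a synonym or a more specific term.
        alternatives = [word] + SYNONYMS.get(word, []) + SPECIFICS.get(word, [])
        options.append(alternatives)
    # Combine the alternatives for every word position.
    return [" ".join(combo) for combo in product(*options)]

variants = expand_query("treat headache")
```

A search for "treat headache" would then also be run as "cure headache", and a query containing "car" would be widened to include the listed manufacturers. The search engine can merge the results of all the variants into a single, broader result set.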

Hakia has also made improvements to the way its search results are presented in order to make it easier for users to identify relevant material. These include the use of highlighting, the display of extended text extracts, and a so-called dialog mode in which Hakia will address you directly to point out good answers. This ties in with Hakia’s philosophy that in the longer term, search technology must fully embrace user interaction. According to this view, search will only reach fulfillment when it fully supports something approaching natural language queries and extended conversations between the searcher and the search engine. “Eventually,” says the Hakia web site, “people would love to talk back and forth to a search engine pretending to be Mr. Spock.”

Data typing is another area in which enhancements have been applied to search models to improve the user experience. Google itself has taken certain steps towards evolving from its firmly text-based roots to embrace media such as music and video. However, the seamless integration of different media into a single search remains a distant dream for traditional search engines whose main search interfaces tend to deal strictly in words. Searchmash is one site that has set out to overturn this convention. A Searchmash search for, say, Coldplay, will reveal text results in the main window, with image, blog, video and Wikipedia results below drop-down menus in a block on the right-hand side of the screen.

Ask.com offers an essentially similar presentation, though with music tracks instead of video. This categorization is brilliantly useful, especially to younger and newer users for whom the web is inherently a rich-media experience. It can help eliminate extensive searching through text results to work out whether they contain the images or videos you’re really after, saving time and enhancing the end-user experience.

Better still, Searchmash allows you to play a video from its own site just by clicking its thumbnail, while Ask allows you to do the same with music tracks. As rich media increasingly dominates the web, this kind of integration can only suit users’ needs better than traditional text-based searches.

A personal agent

No matter how sophisticated the technology that drives them, Google and all other traditional search engines suffer from the same fundamental limitation – they only find information for you when you type a query into the interface. In other words, you have to tell them what you want and when you want it. Search agents such as Copernic are making an interesting attempt to overturn this paradigm and bring greater efficiency to the searching process by seeking information on an ongoing basis.

The concept is simple enough: so simple in fact that, like all great ideas, you wonder why nobody thought of it before. Essentially you tell the Copernic agent what you’re interested in – let’s say new books about architecture. The agent will then scour 90 different search engines, tracking new search results and reporting back to you with sets of summarized, collated results, like your own personal web crawler. It’s up to you to pick out what to follow up, but theoretically this could save you hours of scanning the web yourself, seeking out similar data. There is an entire industry dedicated to seeking out media reports on celebrities and corporations that already benefits from technology of this kind. Copernic makes it available to the regular user.
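The agent idea boils down to polling several sources and reporting only what has not been seen before. The sketch below shows that loop in miniature. It makes no claim about how Copernic actually works: the class, the stub engines and the example URLs are all hypothetical, and a real agent would call live search services on a schedule.

```python
class SearchAgent:
    """Polls several search backends and reports only results not seen before."""

    def __init__(self, engines):
        # `engines` maps an engine name to a callable that takes a query
        # string and returns a list of result URLs (hypothetical interface).
        self.engines = engines
        self.seen = set()

    def poll(self, query):
        fresh = []
        for name, search in self.engines.items():
            for url in search(query):
                if url not in self.seen:   # skip anything already reported
                    self.seen.add(url)
                    fresh.append((name, url))
        return fresh

# Stub engines backed by in-memory lists, standing in for real search services.
index_a = ["http://example.com/book-1"]
index_b = ["http://example.com/book-1", "http://example.com/book-2"]
agent = SearchAgent({
    "engine_a": lambda query: list(index_a),
    "engine_b": lambda query: list(index_b),
})

first = agent.poll("new books about architecture")
index_a.append("http://example.com/book-3")   # something new is published
second = agent.poll("new books about architecture")
```

The first poll reports every distinct result once, even when two engines return the same page. The second poll reports only the newly published item, which is exactly the kind of collated digest the user is left to review.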

Time never stands still on the web, and this is just as true of search technology as anything else. No company, not even Google, can afford to rest on its laurels in an environment where ever-growing numbers of users must sift through ever-increasing volumes of data to find what they need. The old model of typing arcane search strings into little text boxes probably still has some mileage in it, but the days of this approach are almost certainly numbered.

Whether it’s created by Google or someone else, the field is wide open for the next killer search application. The search industry appears set to undergo a fragmentation, with a range of specialized engines and new technologies providing capabilities that suit the many ways in which different people look for information, and the wide variety of information they are seeking. This fragmentation could bring with it an erosion of Google’s natural core audience as they disperse among a variety of new players.

It’s equally possible, of course, that a single visionary company could supplant Google’s position, rapidly becoming very popular and having to face in turn the same issues that Google is struggling to deal with now. Whatever the future holds, one thing is certain: the way you search for information online is once again set to change.
