Search Engines and Algorithms: Optimizing for Yahoo! Search and AltaVista

Yahoo is considered the number one search engine. Yahoo search queries make up approximately 28% of all search engine traffic, and in raw traffic as reported by Alexa rankings, Yahoo! demolishes competitors such as Google and MSN. The Y! web portal has continued to offer its visitors comprehensive searching capabilities through its evolving algorithm. Search optimizers would be wise to consider its attributes and quirks, which Jennifer Sullivan reviews in the second part of this series.

Yahoo has drastically changed its search engine several times. It is striving to deliver relevant results in areas where computer algorithms simply fall short, such as opinions and personalized results. For example, “What is the best online electronics site?” or “Who has the best Italian cuisine in town?” are questions of opinion: a searcher wants an answer, but a search engine algorithm has difficulty supplying one. Personalized results reflect a user’s own preferences rather than solely the opinion of the majority. Serendipitous results, in turn, depend on familiarity with the searcher’s own tastes: what is personally relevant to a user is often something family and friends would be a better source for than a machine. Searching for the term “apple” doesn’t always mean that a user is looking for Macintosh computers.

Yahoo! Social Search

With this in mind, Yahoo introduced Social Search, called My Web 2.0. It is a new kind of search engine – a social search engine – that complements web search by enabling users to search the knowledge and expertise of their friends and community in addition to the web.

The technology powering the social search is called MyRank. According to Yahoo’s My Web 2.0 FAQ, “MyRank leverages all the advances in algorithmic search and combines these advances with a very simple idea – your definition of a ‘better answer’ may be very different than somebody else’s definition. The MyRank technology powering My Web 2.0 enables you to tap into the knowledge of the people you know, and leverage this knowledge to find better answers that are more relevant to you. Friends, colleagues, and other contacts in your community are invaluable sources of information and advice in the offline world. These are the people that share your interests, work in your industry, live in your neighborhood, and have potentially searched for many of the same topics as you – along with topics that you never thought of in relation to the people you know. By fusing the power of algorithmic search with the ability to tap into your community, MyRank technology enables you to find better, more relevant answers for you.”

Yahoo accomplishes this by adding the ability to tag, save, and share information with other people, as well as to receive the information that other people wish to share with you. Instead of simply bookmarking a page, you can tag the bookmarked page with keywords you assign to it. This allows users to comment on pages they find useful and pass that extra information on to others in the community they build. Then, when a web search is performed, the engine uses not only its algorithmic findings but also the information in the tags saved by the user and the user’s community, which provides personalized search results based on the shared knowledge of people the user trusts.
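Yahoo has not published how MyRank actually combines these signals, so the following is only a minimal sketch of the general idea: start from an ordinary relevance score and add a bonus for each trusted contact who tagged the page with a word from the query. The `Page` class, the `social_rank` function, the `boost` value, and the example URLs and scores are all hypothetical, and tags are assumed to be stored in lowercase.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    url: str
    algo_score: float  # hypothetical relevance score from ordinary web search
    # Tags saved by community members, as {person: {tag, ...}} (assumed lowercase)
    community_tags: dict[str, set[str]] = field(default_factory=dict)


def social_rank(pages, query, trusted, boost=0.5):
    """Order pages by algorithmic score plus a bonus for each trusted
    contact who tagged the page with at least one query word."""
    terms = set(query.lower().split())

    def score(page):
        endorsements = sum(
            1
            for person, tags in page.community_tags.items()
            if person in trusted and terms & tags
        )
        return page.algo_score + boost * endorsements

    return sorted(pages, key=score, reverse=True)


# Illustrative data only: two pages competing for the query "apple".
pages = [
    Page("http://example.com/apple-computers", 0.9),
    Page(
        "http://example.com/apple-pie-recipes",
        0.6,
        {"alice": {"apple", "baking"}, "bob": {"apple", "dessert"}},
    ),
]

ranked = social_rank(pages, "apple", trusted={"alice", "bob"})
print([p.url for p in ranked])
```

With two trusted contacts having tagged the recipe page, it moves ahead of the page that scores higher on the algorithmic measure alone; with no trusted contacts, the ordering falls back to the plain algorithmic score.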

Social search complements web search, which is driven by publishers and web sites, by providing a better search experience that is powered by people and communities. We’ll talk more about other social searches in a later article in this series.

Yahoo! Concept Search

Recent patent filings make it clear that Yahoo is attempting to become a better concept engine as well. What is the theory behind a concept engine?

Yahoo says, “What human beings think in terms of are natural concepts. For example, ‘hawaii’ and ‘new york city’ are vastly different queries in terms of length as measured by number of words but for a human being they share one important characteristic: they are each made up of one concept. In contrast, a person regards the query ‘new york city law enforcement’ as fundamentally different because it is made up of two distinct concepts: ‘new york city’ and ‘law enforcement’.

“Human beings also think in terms of logical relationships between concepts. For example, ‘law enforcement’ and ‘police’ are related concepts since the police are an important agency of law enforcement; a user who types in one of these concepts may be interested in sites related to the other concept even if those sites do not contain the particular word or phrase the user happened to type…” Yahoo has not said when it will incorporate this or what technology it has up its sleeve, but the company has been very good about announcing its changes before they happen.

It is also unclear whether Yahoo plans to offer concept search as a separate search engine from its new social search. What is very apparent is Yahoo’s ambition and dedication to bringing searchers exactly what they are looking for, and to personalizing results according to each individual’s own needs.
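To make the notion of a query “made up of concepts” concrete, here is a small sketch that splits a query into concepts by greedy longest match against a list of known phrases, then looks up related concepts. This is not Yahoo’s method, which the patent filings quoted above do not spell out here; the `KNOWN_CONCEPTS` and `RELATED` tables and both function names are assumptions made purely for illustration.

```python
# Hypothetical lookup tables; a real engine would derive these from data.
KNOWN_CONCEPTS = {"hawaii", "new york", "new york city", "law enforcement", "police"}
RELATED = {"law enforcement": {"police"}, "police": {"law enforcement"}}


def split_into_concepts(query, concepts=KNOWN_CONCEPTS):
    """Split a query into concepts, preferring the longest known phrase
    that starts at each position."""
    words = query.lower().split()
    result = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for j in range(len(words), i, -1):
            phrase = " ".join(words[i:j])
            if phrase in concepts:
                result.append(phrase)
                i = j
                break
        else:
            # No known phrase starts here: treat the word as its own concept.
            result.append(words[i])
            i += 1
    return result


def related_concepts(concepts, related=RELATED):
    """Return concepts related to those in the query but not typed by the user."""
    found = set()
    for concept in concepts:
        found |= related.get(concept, set())
    return found - set(concepts)


parts = split_into_concepts("new york city law enforcement")
print(parts)                    # ['new york city', 'law enforcement']
print(related_concepts(parts))  # {'police'}
```

Under these assumptions, “hawaii” and “new york city” each come back as a single concept despite their different lengths, while “new york city law enforcement” comes back as two, matching the distinction drawn in the quoted passage.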

Crawling Behavior & Optimization Strategies

Yahoo places top priority on high keyword density. Some have suggested that keywords in the meta title, or high title density, receive about 10% weight in the algorithm; keywords in alt tags, meta tags, and the meta description receive about 3%; and header tags, outbound anchor link text, and content keywords receive another 3%.

At first glance, Yahoo’s algorithm seems to be the complete opposite of the priorities in Google’s algorithms, but this is not really the case. In fact, some liken the current Yahoo algorithm to the Google algorithm of two years ago. Since the integration of the Inktomi algorithm, Yahoo’s search engine has been placing a higher weight on backlinks; however, our observations still show that backlinks are not a driving force of the algorithm as they are in Google’s.

When optimizing for Yahoo, it’s important to remember that Yahoo continues to be very much about on-page factors like content, keyword usage and density, bold text, and header tags, and not nearly as much about off-page factors such as inbound links and anchor text. Yahoo also likes to see keywords in the actual URL of the site or page, but shows a stronger preference for bold and <h1> text.

While Google’s preferred keyword and tag density is roughly 1.5%, Yahoo’s optimum is twice that, or 3%. The higher-ranked sites in Yahoo have a keyword density between 2.7% and 3.3%. Another keyword factor to consider is the title tag, where the suggested density is 15 to 20%. The title tag is one of the more important on-page elements for the new Yahoo algorithm.
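The density figures above can be checked with a few lines of code. The sketch below assumes one common definition of keyword density, namely the number of words accounted for by the keyword phrase divided by the total word count, expressed as a percentage; other tools define it differently. The function names and sample text are illustrative, and the default range simply restates the 2.7% to 3.3% figure quoted above.

```python
import re

WORD = re.compile(r"[a-z0-9']+")


def keyword_density(text, keyword):
    """Percentage of the words in `text` accounted for by `keyword`
    (which may be a multi-word phrase)."""
    words = WORD.findall(text.lower())
    phrase = WORD.findall(keyword.lower())
    if not words or not phrase:
        return 0.0
    n = len(phrase)
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase)
    return 100.0 * hits * n / len(words)


def in_range(density, low=2.7, high=3.3):
    """True if a density falls inside the range reported for top Yahoo results."""
    return low <= density <= high


# A 100-word body in which the keyword appears three times.
body = "widgets " * 3 + "filler " * 97
body_density = keyword_density(body, "widgets")
print(body_density, in_range(body_density))  # 3.0 True

# A six-word title containing the keyword once.
title_density = keyword_density("Blue Widgets for Sale Online Today", "widgets")
print(round(title_density, 1))  # 16.7
```

Because titles are short, a single occurrence of the keyword in a five- to seven-word title already lands in or near the 15 to 20% range.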

Yahoo is one of the few search engines that still looks at keyword meta tags, although we have certainly seen the priority shift away from them recently, as this element continues to be abused by spammers.

Yahoo does apply a duplicate content filter to its search results. However, we hear far less about it than we do about Google’s, because Yahoo uses technology to discern the source of the original content, whereas Google seems unable to do so. We’ll talk more about that at another time.

While these strategies currently work for Yahoo, the introduction of Social Search and concept searching means these techniques will inevitably change soon.

What’s New & Hot with Yahoo!

Like any of the major search engines, Yahoo is constantly adding new features to its search services. Using these extra features helps enhance search for the user.

Blog search

Yahoo Inc. said in mid-October 2005 that it would begin featuring the work of self-published Web bloggers side by side with the work of professional journalists, leveling distinctions between the two. While I’ve reserved most of my information in this area for another article, suffice it to say that dedicated blog searches supplementing news searches come as great news to blogger and SEO alike.

Podcast search

Yahoo! Podcasts offers a comprehensive directory of podcast series and individual shows from across the Web, complete with detailed search results, most popular and highest rated lists, editorial picks, and a full collection of tools you can use. These community tools include ratings, reviews, and the ability to tag audio content, or view the tags that other people add.

Video search

The Video Search service allows you to search the Internet for video clips. While Video Search has been around for a little while, it now incorporates RSS, so a user can create a feed URL of Yahoo video content based on a search and subscribe to it in iTunes 6, automatically downloading free movies every day to his or her media collection.

Desktop search

In late September 2005, Yahoo introduced its Desktop search utility. Yahoo’s Desktop search offers a simple and convenient way to search your desktop files, email, documents, and instant messages, and to search the Web for more information about any topic, word, or phrase in them. You can even find information related to your desktop content. Some of the features of this new utility are:

  • The ability to search as fast as you can type

  • The ability to preview files

  • The ability to link content you are viewing with web results

  • The ability to search email attachments regardless of file type

  • Selective indexing of only the content you choose, to ensure privacy

In The Works

Some of the new things that are coming from Yahoo include Travel Trip Planner (currently in beta), which saves all of your trip ideas into a personalized and printable travel guide. Other new things to watch for from Yahoo include Instant Search, Mobile Shopping, Audio Search, Search Subscriptions, and more. You can preview all the new projects Yahoo has in the works at Yahoo Next (next.yahoo.com).

Yahoo News, one of the world’s most popular Internet media destinations, has begun testing an expanded news search system that includes not only news stories and blogs but also user-contributed photos and related Web links. Yahoo said its move to “…combine professionally edited news alongside the work of grassroots commentators promises to enrich the sources of information on breaking news events.”

I think we can expect great things from Yahoo in the near future with the social and concept search engines, and even in the areas of semantic search, as research continues in the area of latent semantic analysis. We’ll look more at semantic search in another article in this series.

AltaVista

One of the oldest search engines, AltaVista has changed much over the past decade. AltaVista’s search technology is now powered by Yahoo, which is why it is covered in this particular article.

AltaVista means “a view from above.” It was one of the first major search engines to appear on the web. Unfortunately, after its peak years at the end of the 1990s, it lost significant market share to engines such as MSN and Google, so that AltaVista is now only a minor search engine that uses the search index results from Yahoo.

When AltaVista started in 1995, engineers devised a method to store every word of every page on the entire Internet in a fast, searchable index. By December, less than six months after the start of the project, AltaVista opened to the public with an index of 16 million documents. With more than 300,000 searchers using the engine on its first day, it was an instant success. By the end of 1996, AltaVista was handling 19 million search queries per day. AltaVista was a favorite of novice searchers and information professionals alike, but its popularity began to decline when Google was born.

After many changes of hands, including Compaq and a company that owned part of Lycos, Overture purchased AltaVista in February 2003 for a price of $140 million, a fraction of its valuation of $2.3 billion in 2000. Consequently, when Yahoo purchased Overture at the end of 2003, AltaVista was just part of the deal. It is now, unfortunately, just a mirror of Yahoo, using the same search index and basic user interface. So to optimize for AltaVista, you should actually optimize for Yahoo.

Wrapping Up

Hopefully, we’ll get a clearer picture of where Yahoo is going with search as the My Web 2.0 beta and the latent semantic indexing technology currently being researched by Yahoo take off. The evolution that Yahoo has undergone since the beginning is almost mind-boggling, and it doesn’t look like it’s stopping anytime soon. So we will keep plugging along and go with the flow.
