What Lies Ahead for Local Search Engine Technology?

In this interview, Andy Beal of KeywordRanking.com asks Arnaud Fischer, who leads search product planning for InfoSpace’s Search & Directory division, about privacy, search features on cell phones, desktop search, and many other topics.

No topic has received as much coverage recently as that of “local search” – the ability to find search results that are targeted to a user’s geographical preference. Google, Yahoo! and Ask Jeeves are all making impressive advancements with local search, but there is another company that is vying for the local search crown.

InfoSpace is best known for its search engine brands like Dogpile.com and WebCrawler.com, but the company is building a reputation for itself as a provider of local search results, while at the same time building useful applications for the mobile user. As part of his continuing look at “the future of search technology”, Andy Beal had a chance to ask Arnaud Fischer, an AltaVista product manager from 1999 to 2001 who currently leads search product planning for InfoSpace’s Search & Directory division, some questions about how local search will develop in the future.

[Andy Beal] InfoSpace recently re-aligned itself to serve online yellow pages and white pages customers. Can you tell us what most excites you about this space?

[Arnaud Fischer] I am most excited about the “local search” opportunity. Inktomi, Google, and others already serve country-specific search results today and geo-targeting at a more granular level will unlock a tremendous amount of value for local advertisers, in addition to serving more relevant content to end-users. The traditional yellow pages market is roughly a $25 billion a year global industry. Many small businesses are awakening to the efficiency and predictability of online marketing, increasingly shifting marketing budgets to Web search and Internet yellow pages. Unlocking that opportunity is no easy task, though.

Internet yellow pages sites such as InfoSpace.com and Switchboard.com are working hard to deliver an end-user experience that will bring more of the billions of annual print yellow pages (YP) look-ups online. With broadband, always-on Internet connections gaining penetration and ‘data-friendly’ mobile handsets seeing growing adoption and use, the print yellow pages appear to be on the verge of becoming obsolete.

[AB] What are some of the challenges search companies face with local search?

[AF] Search engines are developing ways to disambiguate and adequately address location-specific queries. Geo-targeting Web search content, both organic and paid, requires search engines to better understand users and queries, inferring local intent by extracting geo-signals and leveraging implicit and explicit user profiles. Taking local search marketing services to market is also very different from selling paid listings to online businesses. The vast majority of local businesses still don’t have a Web site, nor the time and expertise to invest in managing sophisticated auction-type listing campaigns.
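The geo-signal extraction Fischer mentions can be illustrated with a toy sketch. The city list and function below are invented for illustration; production systems match against full gazetteers and also lean on IP location and user profiles.

```python
# Toy sketch of extracting a geo-signal from a query string.
# KNOWN_CITIES is a stand-in for a real gazetteer.

KNOWN_CITIES = {"seattle", "boston", "new york", "chicago"}

def split_geo_query(query: str):
    """Return (topic, location) if the query ends with a known city,
    else (query, None)."""
    q = query.lower().strip()
    # Try longer city names first so "new york" beats a shorter match.
    for city in sorted(KNOWN_CITIES, key=len, reverse=True):
        if q.endswith(city):
            return q[: -len(city)].strip(), city
    return q, None

print(split_geo_query("pizza delivery seattle"))  # ('pizza delivery', 'seattle')
print(split_geo_query("pizza delivery"))          # ('pizza delivery', None)
```

A real disambiguator would also have to decide whether “seattle” in a query names a place at all (consider “seattle grunge bands”), which is why implicit profile signals matter.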

Paid Inclusion Services, Advancements, and Desktop Search

[AB] There’s been a lot of discussion recently about paid inclusion services; where do you see advancements coming in this area?

[AF] Search marketing should keep evolving very quickly this year. Pay-per-click platforms have expanded match-type flexibility, and campaign targeting is growing beyond keyword analysis to include geo-targeting and day-parting. Search engines are leveraging smarter linguistic technology, concept extraction, and contextual categorization to optimize the targeting of paid content, improving relevancy and conversion rates and increasing advertisers’ return on investment (ROI). While advertisers might lose control over guaranteed placement over time, paid search has made budgeting for traffic-generation programs increasingly predictable. Effectiveness metrics are evolving from impression counts and click-through conversion rates to more sophisticated ROI methodologies. Some engines already provide advertisers with tools to calculate conversion rates from impressions to orders, along with ROI metrics.

Overture and Google go one step further, suggesting forecasted traffic levels and cost estimates for specific keyword combinations, match types and bid amounts. In a yield-driven context, where content targeting gets more sophisticated and matching more scientific, Paid Inclusion and Paid Listing programs will eventually merge into more automated bid-for-traffic models. Ultimately, advertisers will target impressions by dictating an ROI level acceptable to them such as “8% over advertising spend”. To meet these requirements, search engine marketers will increasingly rely on automation tools to target the right content to the right users at the right location at the right time.
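The ROI-driven bidding Fischer describes can be sketched with some simple arithmetic. The function and numbers below are hypothetical, not any engine’s actual API: given a conversion rate and average order value, an automated bidder can back out the highest per-click bid that still meets a target like “8% over advertising spend”.

```python
# Hypothetical sketch: deriving a maximum cost-per-click from an ROI target.

def max_cpc(conversion_rate: float, avg_order_value: float, roi_target: float) -> float:
    """Highest per-click bid that still meets the ROI target.

    roi_target is expressed as return over spend, e.g. 0.08 means
    revenue must exceed ad spend by 8%.
    """
    revenue_per_click = conversion_rate * avg_order_value
    return revenue_per_click / (1.0 + roi_target)

# A 2% conversion rate on a $50 average order yields $1.00 of revenue per
# click; meeting "8% over advertising spend" caps the bid at about $0.93.
bid = max_cpc(conversion_rate=0.02, avg_order_value=50.0, roi_target=0.08)
print(round(bid, 2))  # 0.93
```

An automation tool would recompute this cap continuously as observed conversion rates drift, which is the “dictating an ROI level” model in miniature.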

[AB] Let’s look beyond the next few months — what advancements do you see in the coming years?

[AF] One of the most significant developments currently underway in web search is the integration of search capabilities into a broad range of other services. Increasingly, this trend is creating a new competitive arena in web search, one that is forcing established providers to adopt new strategies and creating new market opportunities.

As the #1 web application, search is becoming more ubiquitous as technology and business models mature. We are seeing more ISPs adding search capability to their portals; we are seeing more newspapers and community-type portals integrating local search and Yellow Page offerings as well, in order to retain users on their properties and leverage what has become a very profitable business model.

InfoSpace has long offered its web search and online directory capabilities on a private-label basis that allows our distribution partners such as Verizon, ABCNews, FoxNews, and Cablevision to deliver these services under their own brand. The increasing level of search activity occurring at popular destination sites like these has been a key component of InfoSpace’s growth over the past year. In January, we announced that distribution revenue accounted for over half of InfoSpace’s search-related revenue in the fourth quarter of 2003.

[AB] We hear in the news that desktop search is going to be the next “big thing”. Who do you see as being the key contributors to this area of search?

[AF] Both Microsoft Longhorn and IBM WebFountain will eventually make search far more transparent and integrated into end-users’ broader task-centric activities.

The Microsoft Longhorn operating system will have a significant impact on the overall information retrieval discipline and on how users search. Microsoft is building a centralized storage architecture around the next version of Windows that will make it much easier for end-users to retrieve locally stored information, no matter which application was originally used to author it. The subjective nature of users’ intent when formulating queries is complex, and a better understanding of the task surrounding a search could go a long way toward serving more relevant results. The desktop and its associated applications add a level of understanding of the user’s context that a browser cannot match. You could envision a world where users working on a document in Microsoft Word or PowerPoint are presented with relevant related content, with text-analytic technologies extracting the concepts and themes of the document in real time. This is query-less search: relevant, in your face, all the time, without user interaction.
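The “query-less search” idea can be sketched in a few lines: derive an implicit query from the document a user is editing. Real systems use far richer concept extraction; this invented example just picks the most frequent non-stopword terms.

```python
# Illustrative sketch of query-less search: build an implicit query from
# the text being worked on, with no user interaction.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}

def implicit_query(document_text: str, n_terms: int = 3) -> str:
    """Return the top recurring terms of a draft as a background query."""
    words = re.findall(r"[a-z]+", document_text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return " ".join(term for term, _ in counts.most_common(n_terms))

draft = ("Local search lets advertisers target users by city. "
         "Local advertisers value search traffic that converts locally.")
print(implicit_query(draft))  # local search advertisers
```

A desktop search layer would then run such a query in the background and surface related content alongside the document.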

IBM has also been quietly working on the next generation of search technology, focusing more on text-analytic solutions and leveraging what some call the “Semantic Web”, including natural language processing, statistics, probability, machine learning, pattern recognition, and artificial intelligence. IBM’s WebFountain technology goes beyond crawling and indexing the Web for the mere purpose of returning relevant links for a given query. The technology actually tries to make sense of massive amounts of structured and unstructured content, extracting knowledge from the Web, intranets, chat rooms, message boards, and blogs to isolate insightful and timely information that is not readily perceptible or available today. Applications could include identifying trends, monitoring brand perception and competitive activity, and tracking other concept-specific “buzz”.

Categorizing Queries and Assessing Privacy Concerns

[AB] Let’s look at commercial searches and informational searches; do you see the two becoming distinct categories?

[AF] No. A central theme behind classical information retrieval theories is that users are driven by an information need. More granular search-log analyses over the past few years have attempted to categorize queries as “transactional” (commercial), “informational”, and “navigational”. The immediate intent behind “navigational” queries is to reach a particular site; “informational” queries aim at acquiring information assumed to be present on web pages; and “transactional” queries usually result in some activity, such as an online purchase. Andrei Broder, while chief scientist at AltaVista in the late ’90s, demonstrated that queries at the time were split roughly equally among the three categories.
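A toy heuristic in the spirit of Broder’s taxonomy might look like the sketch below. Real engines use far richer signals (click logs, matched URLs, session context); the cue lists here are invented for illustration.

```python
# Toy query classifier: navigational, transactional, or informational.
# The cue sets are made up and would be learned from logs in practice.

NAVIGATIONAL_CUES = {"homepage", "www", "login", ".com"}
TRANSACTIONAL_CUES = {"buy", "download", "order", "cheap", "price", "coupon"}

def classify_query(query: str) -> str:
    q = query.lower()
    if any(cue in q for cue in NAVIGATIONAL_CUES):
        return "navigational"
    if any(cue in q for cue in TRANSACTIONAL_CUES):
        return "transactional"
    return "informational"   # default: assume an information need

print(classify_query("delta airlines www"))       # navigational
print(classify_query("buy flowers seattle"))      # transactional
print(classify_query("history of yellow pages"))  # informational
```

The brittleness of rules like these is exactly why, as the next answer argues, the categories behave more like a spectrum than bins.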

We don’t live in a binary world where queries (or content) are either inherently commercial or purely informational. The commercial-informational dichotomy looks more like a spectrum to me, where understanding user intent and the psychology of the purchasing cycle is critical. The definitions of commercial and informational content are fuzzy and personal; content perceived as purely commercial by some might be informational to others, and vice versa. The query “1819 treaty manuscript” could be considered “informational” in nature, yet it could lead to a book purchase at Amazon about the United States-Spain treaty of 1819, or even to planning a trip to Spain or Florida.

[AB] So what’s the answer?

[AF] In focus groups, users have told us unequivocally that they would much prefer a search engine display an array of content types that may be relevant to their query, rather than try to guess what their intent was. Users also appreciate having tools available to help them narrow their results. Based in part on this feedback, InfoSpace worked with Vivisimo last year to deploy a ‘Refine Your Results’ feature on our three owned and operated search properties: Dogpile, WebCrawler and Metacrawler. The feature automatically organizes and groups results by category for every search, providing a comprehensive view of web search results and allowing users to more rapidly get to the information most relevant to them. For example, a search on “flowers” groups results into subcategories such as delivery, gardening, arts and crafts, and more.
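A minimal sketch of grouping results into labeled clusters, in the spirit of a ‘Refine Your Results’ feature, is shown below. Real clustering (such as Vivisimo’s) derives category labels from the result text itself; the keyword sets here are assumed for illustration.

```python
# Toy result grouping: bucket result titles into labeled categories.
# CATEGORY_KEYWORDS is invented; real systems learn labels from the results.

CATEGORY_KEYWORDS = {
    "delivery": {"delivery", "send", "order"},
    "gardening": {"garden", "grow", "seeds"},
    "arts and crafts": {"craft", "pressed", "paper"},
}

def group_results(results: list[str]) -> dict[str, list[str]]:
    groups: dict[str, list[str]] = {"other": []}
    for title in results:
        words = set(title.lower().split())
        for category, keywords in CATEGORY_KEYWORDS.items():
            if words & keywords:  # any keyword overlap claims the result
                groups.setdefault(category, []).append(title)
                break
        else:
            groups["other"].append(title)
    return groups

results = ["Order flower delivery online", "How to grow roses in your garden"]
grouped = group_results(results)
print(grouped["delivery"], grouped["gardening"])
```

Presenting such groups alongside the flat list lets the user, rather than the engine, resolve ambiguous intent.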

[AB] If search engine users gave up a little of their privacy and allowed their search habits to be monitored, would this allow the search engines to provide better, customized results?

[AF] There is no doubt that sharing personal data with search engines would result in better individual search experiences. The quality of search results is a function of two sets of variables: i) the user query and ii) the content indexed. Search engines are constantly crawling and indexing more web pages, more often, leveraging better entity extraction and concept recognition techniques, inferring document relationships in smarter ways. An enhanced understanding of user intent would certainly unlock more value from this semantic understanding of Web content.

Link analysis and other “off-the-page” ranking criteria have played an increasing role in relevancy algorithms over the past few years. Monitoring navigation behavior at the user level could conceivably be the basis for developing an understanding of users’ individual interests over time, in essence personalizing the equivalent of Google’s PageRank scores. If you consistently browse music-related content, search engines should become smart enough to understand that your query “Prince” more probably relates to the singer than to the royal family. Personalizing search relevancy algorithms presents some major scalability and performance challenges, though. It takes days, if not weeks, to process link analysis and compute authority scores for individual Web sites after a crawl.
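The “Prince” example can be sketched as a re-ranking step that blends a per-user interest profile into base relevance scores. The scores, topics, and weights below are invented for illustration, not any engine’s actual algorithm.

```python
# Hedged sketch: bias ranking with a per-user interest profile, so "prince"
# ranks the musician higher for a user with a music-heavy browsing history.

def personalized_rank(results, user_profile):
    """Re-rank (title, base_score, topic) tuples by adding the user's
    affinity for each result's topic to its base relevance score."""
    def score(result):
        title, base_score, topic = result
        return base_score + user_profile.get(topic, 0.0)
    return sorted(results, key=score, reverse=True)

results = [
    ("Prince of Wales official site", 0.80, "royalty"),
    ("Prince discography and tour dates", 0.75, "music"),
]
music_fan = {"music": 0.30, "royalty": 0.05}
top = personalized_rank(results, music_fan)[0][0]
print(top)  # Prince discography and tour dates
```

The scalability challenge Fischer raises is that maintaining and applying such profiles for every user, at query time, is far costlier than computing one global authority score per page.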

[AB] Do you think search engine users will balk due to privacy fears?

[AF] Privacy concerns are certainly legitimate to some extent. I actually see a parallel between users’ reluctance to use their credit cards online in the early e-commerce days and their reluctance to give up personal information to search engines today. It’s a constant trade-off between privacy concerns and the added value extracted from that data.

In the meantime, IP-sniffing technology might take search engines a step closer to personalizing search results without requiring users to give up very personal information. IP-analytic software associates Internet-connected devices with geographic areas, domains (.com, .edu, and .gov), ISPs, connection speeds, and browser types with some level of confidence. Click popularity analyzed at an aggregate level along IP-associated parameters could be leveraged to extrapolate personalized rankings for clusters of users exhibiting similar behaviors. This technique would not be unlike Amazon’s implementation of collaborative filtering, and in essence it reaches goals similar to those of social search networks such as Eurekster.
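The aggregate approach can be illustrated with a small sketch: tally click counts along an IP-derived attribute (here, a made-up region) and use each cluster’s click popularity to boost results for users who share that attribute. The log entries and domain names are invented.

```python
# Illustrative sketch of cluster-level click popularity keyed on an
# IP-derived region attribute.
from collections import defaultdict

def cluster_click_popularity(click_log):
    """click_log: iterable of (region, url) pairs from past searches.
    Returns popularity[region][url] = click count."""
    popularity = defaultdict(lambda: defaultdict(int))
    for region, url in click_log:
        popularity[region][url] += 1
    return popularity

log = [("seattle", "icecream-parlor.example"),
       ("seattle", "icecream-parlor.example"),
       ("seattle", "gelato.example"),
       ("boston", "gelato.example")]
pop = cluster_click_popularity(log)
best_for_seattle = max(pop["seattle"], key=pop["seattle"].get)
print(best_for_seattle)  # icecream-parlor.example
```

Because the aggregation happens per cluster rather than per user, no individual’s history needs to be stored, which is the privacy trade-off being made here.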

Wireless Applications

[AB] InfoSpace also offers wireless data applications. Do you think that search has a future on a cell phone?

[AF] Sending local content such as yellow pages listings, directions, maps, and business ratings to mobile devices just makes sense. A couple of years ago, I remember looking up the nearest ice-cream parlor from the park on my cellular phone with my kid. It worked! The experience was far from optimal, though: I had to scroll through 10 to 15 screens I could barely read. Personalization features, geo-based services, faster networks, and better handset resolution and color displays should significantly improve the experience over time. The navigation schema, whether search or browse mode, will be critical to making cellular phones a viable platform for both end-users and IYP advertisers. About 90% of mobile phones will be Web-enabled by 2006, making the platform more attractive for content providers, developers, and information architects to invest time in.

The opportunity to deliver Web search and online directory information to mobile devices is something InfoSpace is well positioned to capitalize on. InfoSpace was a wireless data pioneer in the US and our mobile division today powers wireless data applications for every major US provider with the exception of Nextel. Going forward, we see a significant opportunity to increasingly combine our mobile and search and directory assets to accelerate the adoption of these services on wireless devices.

[AB] Thanks Arnaud for taking the time to share with us your thoughts on the future of search! 

For more information on Andy Beal and KeywordRanking.com, visit KeywordRanking.com.
