What Lies Ahead for Local Search Engine Technology? - Categorizing Queries and Assessing Privacy Concerns
[AB] Let's look at commercial searches and informational searches; do you see the two becoming distinct categories?
[AF] No. A central theme of classical information retrieval theory is that users are driven by an information need. More granular search log analyses in recent years have attempted to categorize queries as "transactional" (commercial), "informational", or "navigational". The immediate intent behind a "navigational" query is to reach a particular site; "informational" queries aim to acquire information assumed to be present on web pages; and "transactional" queries usually result in some activity such as an online purchase. Andrei Broder, while chief scientist at AltaVista in the late 1990s, demonstrated that queries at the time were split roughly equally among the three categories.
We don't live in a binary world where queries (or content) are either inherently commercial or purely informational. The commercial-informational dichotomy looks more like a spectrum to me, where understanding user intent and the psychology of the purchasing cycle is critical. The definitions of commercial and informational content are fuzzy and personal; content perceived as purely commercial by some might be informational to others, and vice versa. The query "1819 treaty manuscript", for example, could be considered "informational" in nature, yet it could lead to the purchase at Amazon of a book about the United States-Spain treaty of 1819, or even to the scheduling of a trip to Spain or Florida.
[AB] So what's the answer?
[AF] In focus groups, users have told us unequivocally that they would much prefer that a search engine display an array of content types that may be relevant to their query, rather than try to guess what their intent was. Users also appreciate having tools available to help them narrow their results. Based in part on this feedback, InfoSpace worked with Vivisimo last year to deploy a 'Refine Your Results' feature on our three owned and operated search properties: Dogpile, WebCrawler, and MetaCrawler. The feature automatically organizes and groups results by category for every search, providing a comprehensive view of web search results and allowing users to get more rapidly to the information most relevant to them. For example, a search on "flowers" groups results into subcategories such as delivery, gardening, arts and crafts, and more.
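To make the idea of grouping results by category concrete, here is a minimal Python sketch. It is not Vivisimo's clustering method, which the interview does not describe; it swaps in a simple keyword lookup instead. The category names come from the "flowers" example above, while the keyword sets, the function name `group_results`, and the sample titles are all hypothetical.

```python
from collections import defaultdict

# Hypothetical keyword sets, chosen only to illustrate the "flowers" example.
CATEGORY_KEYWORDS = {
    "delivery": {"delivery", "deliver", "order", "send"},
    "gardening": {"garden", "gardening", "planting", "seeds"},
    "arts and crafts": {"craft", "crafts", "paper", "silk"},
}


def group_results(titles, category_keywords=CATEGORY_KEYWORDS, fallback="other"):
    """Group result titles under every category whose keywords they mention."""
    groups = defaultdict(list)
    for title in titles:
        words = set(title.lower().split())
        matched = [name for name, keywords in category_keywords.items()
                   if words & keywords]
        # A title that matches no category goes into the fallback group.
        for name in matched or [fallback]:
            groups[name].append(title)
    return dict(groups)


results = [
    "Same-day flowers delivery",
    "Planting flowers in your garden",
    "Paper flowers craft ideas",
    "History of flowers in art",
]
for category, titles in group_results(results).items():
    print(category, "->", titles)
```

A production system would more likely derive the groups from the result text itself rather than from a fixed keyword list; the sketch only shows the shape of the output a user could refine by.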
[AB] If search engine users gave up a little of their privacy and allowed their search habits to be monitored, would this allow the search engines to provide better, customized results?
[AF] There is no doubt that sharing personal data with search engines would result in better individual search experiences. The quality of search results is a function of two sets of variables: i) the user query and ii) the content indexed. Search engines are constantly crawling and indexing more web pages, more often, leveraging better entity extraction and concept recognition techniques, and inferring document relationships in smarter ways. An enhanced understanding of user intent would certainly unlock more value from this semantic understanding of Web content.
Link analysis and other "off-the-page" ranking criteria have played an increasing role in relevancy algorithms over the past years. Monitoring navigation behavior at the user level could conceivably be the basis for developing an understanding of users' individual interests over time, in essence personalizing the equivalent of Google's PageRank scores. If you consistently browse music-related content, search engines should become smart enough to understand that your query "Prince" more probably relates to the singer than to the royal family. Personalizing search relevancy algorithms presents some major scalability and performance challenges, though. It takes days, if not weeks, to process link analysis and compute authority scores for individual Web sites after a crawl.
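One common way to personalize a PageRank-style score is to bias the random surfer's "teleport" step toward pages a user already favors. The Python sketch below illustrates that idea with plain power iteration. It is an illustration under stated assumptions, not a description of how any search engine does it: the five-page link graph, the page names, and the preference weights are all invented for the "Prince" example.

```python
def personalized_pagerank(links, preference, damping=0.85, iterations=50):
    """Power-iteration PageRank whose teleport step follows `preference`.

    links: dict mapping each page to the list of pages it links to.
    preference: dict mapping pages to non-negative interest weights.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    total = sum(preference.get(p, 0.0) for p in pages)
    if total <= 0:
        raise ValueError("preference must give positive weight to some page")
    teleport = {p: preference.get(p, 0.0) / total for p in pages}

    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {p: (1 - damping) * teleport[p] for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new_rank[q] += share
            else:
                # Dangling page: hand its score back to the teleport distribution.
                for q in pages:
                    new_rank[q] += damping * rank[p] * teleport[q]
        rank = new_rank
    return rank


# Hypothetical link graph for the ambiguous query "Prince".
links = {
    "hub": ["prince_singer", "prince_royal"],
    "prince_singer": ["music_blog"],
    "music_blog": ["prince_singer"],
    "prince_royal": ["royal_news"],
    "royal_news": ["prince_royal"],
}

music_fan = personalized_pagerank(links, {"music_blog": 1.0})
royal_watcher = personalized_pagerank(links, {"royal_news": 1.0})
print(music_fan["prince_singer"] > music_fan["prince_royal"])
print(royal_watcher["prince_royal"] > royal_watcher["prince_singer"])
```

The sketch also hints at the scalability concern raised above: each user's preference vector requires its own iteration over the whole graph, which is trivial for five pages and very expensive at Web scale.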
[AB] Do you think search engine users will balk due to privacy fears?
[AF] Privacy concerns are certainly legitimate to some extent. I actually see a parallel between users' reluctance to use their credit cards online in the early e-commerce days and their reluctance to give up personal information to search engines today. It's a constant trade-off between privacy concerns and the added value extracted from that data.
In the meantime, IP-sniffing technology might take search engines a step closer to personalizing search results without requiring users to give up very personal information. IP-analytic software associates Internet-connected devices with geographic areas, domains (.com, .edu, and .gov), ISPs, connection speeds, and browser types with some level of confidence. Click popularity analyzed at an aggregate level along IP-associated parameters could be used to extrapolate personalized rankings for clusters of users exhibiting similar behaviors. This technique would not be unlike Amazon's implementation of collaborative filtering technology, and in essence it would reach goals similar to those of social networks such as Eurekster.
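As a rough illustration of ranking by aggregate click popularity within a cluster, the Python sketch below counts clicks per cluster and orders results by those counts. Everything concrete in it is an assumption made for the example: the cluster key (a region paired with a domain type), the log format, the function name `cluster_click_ranking`, and the sample URLs. No individual user is identified; only cluster-level totals are used.

```python
from collections import Counter, defaultdict


def cluster_click_ranking(click_log, query):
    """Rank clicked URLs for `query` separately within each user cluster.

    click_log: iterable of (cluster_key, query, clicked_url) tuples, where
    cluster_key is derived from IP-associated parameters, not from a user ID.
    """
    counts = defaultdict(Counter)
    for cluster, logged_query, url in click_log:
        if logged_query == query:
            counts[cluster][url] += 1
    # Most-clicked URLs first within each cluster.
    return {cluster: [url for url, _ in counter.most_common()]
            for cluster, counter in counts.items()}


# Hypothetical aggregate log: (region, domain type) clusters for one query.
log = [
    (("seattle", ".edu"), "jaguar", "bigcats.example/jaguar"),
    (("seattle", ".edu"), "jaguar", "bigcats.example/jaguar"),
    (("seattle", ".edu"), "jaguar", "cars.example/jaguar"),
    (("detroit", ".com"), "jaguar", "cars.example/jaguar"),
    (("detroit", ".com"), "jaguar", "cars.example/jaguar"),
    (("detroit", ".com"), "jaguar", "bigcats.example/jaguar"),
]

for cluster, urls in cluster_click_ranking(log, "jaguar").items():
    print(cluster, "->", urls)
```

A new visitor whose IP address maps to one of these clusters could then be shown that cluster's ordering as a starting point, which is the sense in which the approach resembles collaborative filtering.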
More By Andy Beal