Powerset Launches as Wikipedia Tool

Powerset says it can deliver on a promise originally made by Ask Jeeves: namely, if you ask it a simple question, it will answer. This trick calls for software with the ability to understand the meaning of words, something even Google’s engineers find challenging to create. With Powerset’s technology in open beta, though, we can see for ourselves how well it works.

I first wrote about Powerset back in July 2007. I reported that Powerset got its natural language technology from a licensing deal with Xerox’s Palo Alto Research Center (PARC). Powerset hired Google- and Yahoo-caliber engineers to work on the technology, and put it into private beta for many months before letting the whole world play with it.

Even now, Powerset’s tool doesn’t search the whole web; it’s limited to Wikipedia. That might be an excellent move on the company’s part. Wikipedia makes a good showcase. And Powerset may be pursuing a business model that is very different from Google’s or Yahoo’s. Rather than setting up a free-standing search engine that makes its money from advertisers, why not license the technology to sites that produce content? After all, if Powerset’s technology can help Wikipedia users answer questions, think what it can do for university and business sites, which have information scattered far and wide, organized (or not) in many different ways.

The truth is, Powerset is also pursuing the free-standing, advertising-supported search engine model. But if the technology works as advertised, there is no reason the company couldn’t license it to others. Imagine being able to visit a government web site, type in a simple question, and get a simple answer! If you type in “How do I buy a foreclosed home?” for example, you might get in reply something like this: “You can find a list of foreclosed homes in your area at this web site. You will need to put in your area code. Auctions are usually held once a month. You will need the following information to prequalify for the auction…”

Or…you could just put “foreclosed home auctions” into Google and find this web site near the top of the listings. Which approach makes more sense? That depends on a number of factors, including the nature of the question and the experience of the searcher. Even experienced searchers can become frustrated when trying to find really obscure, tricky, or specific information. How well does Powerset answer this need?

{mospagebreak title=How it Works}

Powerset is actually hosting a copy of Wikipedia’s 2.5 million English language entries on its own computers. It also searches Freebase, a database created by Metaweb. Michael Arrington, who tested Powerset for several weeks before the open beta, noted that it is an effective way to gather information quickly. “For someone doing research, Powerset effectively removes a number of steps towards getting to the final information. It is particularly effective when the information needed is on many different web pages.”

For example, what happens if you ask Powerset when hurricanes have hit New York City?


I know that doesn’t look terribly impressive, but the eighth entry, which didn’t make it into this screen shot, was for an actual list of New York hurricanes. I tried the search on Google – specifically in the question format – and the search engine didn’t do nearly as well. Anyway, when I clicked through to the link I mentioned, Powerset included a helpful outline of the article on the right:

The article itself looked as if it came straight from Wikipedia (no surprise there). The clickable outline shown above took me to specific sections. It also followed me as I scrolled up and down the page, but the button on the far right at the top lets the user “pin” the outline in place so it stays put (very useful if you find that sort of thing annoying).

If this outline doesn’t give you enough detail to click to the points in the article you want to see, you can always go to the Factz view. That view seems to highlight the parts that Powerset’s technology thinks you might want to jump to for more detail. Here is the Factz view of the outline for the same article:

{mospagebreak title=Better With People?}

While I liked Powerset’s results, I didn’t see this as that much of an improvement over what I’d find in Wikipedia itself. But I’ve read several reviews of the technology, which led me to believe it might do better with people. In a fit of whimsy, I asked a celebrity-related question. Powerset could be the next great gossip columnist:

But what if I’m not exactly sure of what I’m looking for? Or what if I’m doing general research and want to discover what’s out there for certain important figures? Danny Sullivan wrote an excellent article that compares what could be found at the three major search engines and Powerset for Henry VIII. I had to see what Powerset could do with Abraham Lincoln. Here’s the screen shot:

The main listing shows the president, but there are tabs that lead to a film, book, and Pullman car by that name. Next to the beginning of the Wikipedia article is structured information from the Freebase database. Below that you can find Factz from Wikipedia. These are items that show that Powerset’s technology understood, on some level, that Abraham Lincoln did certain things: he took office, won elections, made speeches, and so forth. You can click on the button labeled “more” to find out what else he did.

You can drill down for more information on any of these points, and you might be surprised as to where you’re led. As a technical note, Powerset uses AJAX to smooth out the drill-down process. Here’s an example:

Well, I wouldn’t have expected the phrase “Abraham Lincoln wrote decision” to lead me to a “Saving Private Ryan” reference. Danny Sullivan did somewhat better when he researched Henry VIII with Powerset. He found that Henry VIII built Pendennis Castle, which is mentioned in an article on Falmouth, Cornwall – but not in the main Henry VIII page. To be fair, though, the “Saving Private Ryan” reference also wasn’t in the main body of the Abraham Lincoln article – and since I didn’t watch the movie, I now know something about it I didn’t know before.
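
To make the Factz idea a little more concrete, here is a minimal sketch of how facts like “Abraham Lincoln wrote decision” could be stored as subject-verb-object entries and traced back to the article they came from. This is not Powerset’s implementation, which the company has not published. The `Fact` type, the function names, and the sample data are all hypothetical. Only the Henry VIII entry takes its source article from the text above; the source articles given for the two Lincoln entries are assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Fact(NamedTuple):
    """One hypothetical subject-verb-object fact and the article it came from."""
    subject: str
    verb: str
    obj: str
    source: str


# Illustrative data only. The source articles for the Lincoln entries are assumed.
FACTS = [
    Fact("Abraham Lincoln", "wrote", "decision", "Saving Private Ryan"),
    Fact("Henry VIII", "built", "Pendennis Castle", "Falmouth, Cornwall"),
    Fact("Abraham Lincoln", "won", "election", "Abraham Lincoln"),
]


def facts_about(subject, facts):
    """Group a subject's facts by verb, keeping each object and source article."""
    grouped = defaultdict(list)
    for fact in facts:
        if fact.subject.lower() == subject.lower():
            grouped[fact.verb].append((fact.obj, fact.source))
    return dict(grouped)


def found_outside_main_article(subject, facts):
    """Return facts about a subject whose source is some other article."""
    return [f for f in facts if f.subject == subject and f.source != subject]
```

Calling `found_outside_main_article("Henry VIII", FACTS)` returns the Pendennis Castle entry, which mirrors the kind of connection Sullivan found: a fact about a person that lives in another article entirely.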

{mospagebreak title=Where Powerset Fits}

For some time, I’ve seen a debate going on about the future of search, and whether it’s going to be search as we know it now, or look more like “discovery” going forward. When you search for something on the Internet, you usually have a target of some sort in mind. You may be looking for an easy-to-use digital camera, for example, or what’s playing at the local movie theaters (remember when we used to call the theater for that information?).

Discovery is different. Your query is your starting point, but a simple answer is not your end point. If you’re looking for information on multiple sclerosis, for example, you may be satisfied with an overview, or you may want to dig more deeply into the topic. Todd Leyba wrote about this topic last year, noting that a discovery engine, as opposed to a mere search engine, “provides various facets of the result set in the form of navigational links. These links represent different dimensions of the result set and allow you to drill down or sideways depending on the facet.”
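
The facets Leyba describes can be sketched in a few lines. The example below is only an illustration of the general idea, not code from any real discovery engine: the result set, the facet names (`type`, `topic`), and both functions are made up for this article.

```python
from collections import Counter

# Hypothetical result set for a query such as "multiple sclerosis".
RESULTS = [
    {"title": "Multiple sclerosis", "type": "overview", "topic": "disease"},
    {"title": "MS treatment options", "type": "detail", "topic": "treatment"},
    {"title": "MS symptoms", "type": "detail", "topic": "symptoms"},
    {"title": "Interferon beta-1a", "type": "detail", "topic": "treatment"},
]


def facet_counts(results, facet):
    """Count how many results fall under each value of one facet.

    Each (value, count) pair could be rendered as a navigational link.
    """
    return Counter(r[facet] for r in results)


def drill_down(results, facet, value):
    """Narrow the result set to the items matching one facet value."""
    return [r for r in results if r[facet] == value]
```

Drilling down means calling `drill_down(RESULTS, "topic", "treatment")` to narrow the set. Moving sideways means keeping the same results but recomputing `facet_counts` on a different facet, such as `type` instead of `topic`.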

Powerset, then, is clearly a discovery engine. That doesn’t mean it can’t be used as a search engine, or even in combination with one. In fact, Michael Arrington, writing for TechCrunch, makes an interesting observation: Powerset’s technology would work well with some of the other search engines out there. Indeed, the company seems prepared either to go it alone or to sell itself. Arrington notes that “They hired Dave Wehner, a Managing Director at investment bank Allen & Co. (he’s the guy who sold Bebo for $850 million to AOL, and is working on LinkedIn’s huge financing), to represent them in a possible sale or financing.”

So who might be interested in buying Powerset? CNet reported rumors that Microsoft might be nosing around, though both Powerset and Microsoft refuse to comment. The software giant certainly has some cash to spend after walking away from a purchase of Yahoo. And they may be more interested than either Google or Yahoo, since they have more to gain – this would give them a technology that Google doesn’t have, and a possible edge if they can get it to work appropriately.

Even if nobody buys Powerset, the company has real potential going forward. Content-rich sites are a challenge for conventional search engines, whose technology doesn’t really understand the meaning behind words (check out Danny Sullivan’s article, mentioned earlier, for a good explanation of how search engines use keywords to figure out relevance). Such sites would do well to license Powerset’s technology.

Indeed, that might work out better for the company than searching the web as a whole. Sullivan noted that “it takes Powerset about a month to comprehend Wikipedia’s 2.5 million topic pages. In that time, many of those pages will have changed – thus needing to be reread again. Powerset’s impressive, but with the web having in excess of 20 BILLION [emphasis in original] constantly changing pages, this is no overnight secret weapon that Microsoft might buy and employ to take the search lead.” In short, Powerset might not be a Google killer, but for those of us who like to draw interesting connections in our research, or have lots of content to organize, it could be a good friend.
