Powerset, Leveraging Open Source Hadoop, Powers Microsoft's Bing

by Sam Dean - Jun. 05, 2009

Last summer, we reported on Microsoft's acquisition (reportedly for $100 million) of Powerset, which specializes in semantic search based on the open source, cluster-based software framework Hadoop. This acquisition of an open source-centric search company was more strategic than many people realize. Hadoop, with its ability to search large data sets quickly, also underlies Yahoo!'s search engine, and the acquisition of Powerset may have played a key part in Microsoft's decision to give up its effort to acquire Yahoo!

Of course, Microsoft's big search engine news of the week is Bing, which I've found to have both strengths and weaknesses. Surprisingly, as The Register reports, Powerset's technology plays only a small part in how Bing works, but the part it does play is open source-driven and interesting.

As this blog post from Powerset describes, "the Powerset division has contributed to Bing in both subtle and more conspicuous ways." Most notably, Powerset's technology provides a corollary engine to Bing's main engine, designed to search Wikipedia. For example, at Bing.com, type in a search for "squirrel monkey." On the left rail of the search results that come back, you'll find a "Reference" link. Click on it, and you'll get a formatted version of the Wikipedia entry for squirrel monkey, with extras such as an outline of the article that links to its key parts.

What's less apparent, though, is that Bing includes the Hadoop-driven semantic wikisearch technology that is really Powerset's specialty. I wrote about how Powerset goes about this with Hadoop, and clusters, here. You can quickly get a sense for it by going to Bing.com and typing natural language queries in. Try these at the site:

Was Einstein married?

What did Benjamin Franklin invent?

What is the top selling album of all time?

Powerset's technology in Bing delivers easily scannable answers to questions like these, and also links back to the reference source for the answers. This technology in Bing is actually pretty good, though semantic search has never been perfect, and I'm surprised more people aren't talking about it as Bing rolls out. It's also an example of Microsoft leveraging open source technology in a big way.
 






7 Comments
 

Bing is actually quite decent. The suggestions and the way in which it displays them are much better than Google's. The images for a search show up well, and the way in which you can see previews, or play videos without opening up another window, is quite good. I can see them picking up a ton of share as they make this the 'default' home page on IE, instead of that MSN.com crap.



I wonder which cloud the Hadoop implementation uses! Is it Azure? I doubt it. At Bing's scale, it will be very interesting to see where they deploy Hadoop.



I still cannot get over the name! Bing?!



Bing, for a search engine? I find its name silly too. Where I live, "bing" is the sound you make when you make fun of someone. MS should have thought of something better, like Find.com, Ask.com, etc.



Anonymous - good point, but look at what those names did for those products! Who even uses Find.com or even Ask.com for that matter? Surprisingly they remain in business...



So, Bing is a silly name, but Google isn't?



I don't get you guys. I tried all 3 searches suggested, and each time Google gives me a correct answer in the first row, whereas Bing points to stuff that is not immediately relevant, at least for the first two queries.

