Facebook Open Sources New Tool That Can Speed Queries

by Ostatic Staff - Mar. 20, 2015

Facebook, like Google, has shown itself to be a strong contributor to the open source community. Only a few months ago, the company open sourced Haxl, a library that eases access to remote data. Haxl can automatically batch multiple requests to the same data source, request data from multiple data sources concurrently, and cache previous requests.
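To make the batching and caching ideas concrete, here is a minimal Python sketch. It is not Haxl's API (Haxl is a Haskell library), and the names `BatchingFetcher`, `get_many`, and `fake_backend` are hypothetical, invented only for illustration. It shows two of the three behaviors described above: collapsing several requests to one data source into a single round trip, and answering repeat requests from a cache. Concurrent fetching across multiple data sources is left out.

```python
from typing import Callable, Dict, Iterable, List


class BatchingFetcher:
    """Toy wrapper around one data source: caches results and batches misses."""

    def __init__(self, batch_fetch: Callable[[List[str]], Dict[str, str]]):
        # Hypothetical backend call: takes many keys, answers in one round trip.
        self._batch_fetch = batch_fetch
        self._cache: Dict[str, str] = {}
        self.round_trips = 0  # counts how often the backend is actually called

    def get_many(self, keys: Iterable[str]) -> Dict[str, str]:
        keys = list(keys)
        # Drop duplicate keys (keeping order) and keys already in the cache.
        missing = [k for k in dict.fromkeys(keys) if k not in self._cache]
        if missing:
            # One backend call covers every uncached key in this request.
            self._cache.update(self._batch_fetch(missing))
            self.round_trips += 1
        return {k: self._cache[k] for k in keys}


def fake_backend(keys: List[str]) -> Dict[str, str]:
    """Stand-in for a remote service; simply upper-cases each key."""
    return {k: k.upper() for k in keys}


fetcher = BatchingFetcher(fake_backend)
first = fetcher.get_many(["alice", "bob", "alice"])  # one round trip for both keys
second = fetcher.get_many(["bob", "carol"])          # "bob" comes from the cache
```

In this sketch the second call only asks the backend for "carol", since "bob" was cached by the first call.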

Now, with an eye toward optimizing the performance of Presto, the open source distributed SQL query engine, Facebook has designed a new Optimized Row Columnar (ORC) file format reader for Presto and has open sourced it.

As noted in a blog post:

"A few months ago, a few of us started looking at the performance of Hive file formats in Presto. As you might be aware, Presto is a SQL engine optimized for low-latency interactive analysis against data sources of all sizes, ranging from gigabytes to petabytes. Presto allows you to query data where it lives, whether it's in Hive, Cassandra, Kafka, relational databases, or even a proprietary data store."

"We are always pushing the envelope in terms of scale and performance. We have a large number of internal users at Facebook who use Presto on a continuous basis for data analysis. Improving query performance directly improves their productivity, so we thought through ways to make Presto even faster. We ended up focusing on a few elements that could help deliver optimal performance in the Presto query engine."

The post also summarizes (graphically) the remarkable performance enhancements that the new file format reader brings to Presto.

According to a general update from Facebook midway through last year, its open source activities of note include:

Facebook's open source projects have seen 13,000 total commits, an increase of 45 percent from the second half of 2013.

Launched 63 new projects since January 2014

Total active GitHub portfolio stands at exactly 200 projects spread across Facebook, Instagram and Parse

Projects collectively have netted 20,000 forks and 95,000 followers.

Facebook's total open source library stands at approximately 9.9 million lines of code.

That's nothing to sneeze at. It's great to see Facebook contributing meaningful projects to the open source community, and it's also worth noting that the company can benefit from community participation in its projects.