Facebook Open Sources a Storage Engine Focused on MySQL

by Ostatic Staff - Sep. 02, 2016

In recent months, Facebook, Microsoft and other tech titans have shown themselves to be strong contributors to the open source community. Only a few months ago, Facebook open sourced Haxl, a library that eases access to remote data. Haxl can automatically batch multiple requests to the same data source, request data from multiple data sources concurrently, and cache previous requests.
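For readers unfamiliar with the batching and caching idea, here is a minimal Python sketch of it. It is not Haxl's API (Haxl is a Haskell library); the class `BatchingCache` and the stand-in data source `fake_user_service` are hypothetical names invented for illustration, and the sketch leaves out the concurrent access to multiple data sources that Haxl also provides.

```python
from typing import Callable, Dict, Iterable, List


class BatchingCache:
    """Toy wrapper around one data source (hypothetical, not Haxl's API).

    It removes duplicate keys, fetches all uncached keys in a single
    batched call, and remembers the results for later requests.
    """

    def __init__(self, batch_fetch: Callable[[List[str]], Dict[str, str]]) -> None:
        self._batch_fetch = batch_fetch
        self._cache: Dict[str, str] = {}
        self.round_trips = 0  # number of batched calls made to the data source

    def get_many(self, keys: Iterable[str]) -> List[str]:
        keys = list(keys)
        # Unique keys that are not cached yet, in first-seen order.
        missing = [k for k in dict.fromkeys(keys) if k not in self._cache]
        if missing:
            self.round_trips += 1
            self._cache.update(self._batch_fetch(missing))
        return [self._cache[k] for k in keys]


def fake_user_service(keys: List[str]) -> Dict[str, str]:
    """Stand-in for a remote data source; returns one value per key."""
    return {k: k.upper() for k in keys}


store = BatchingCache(fake_user_service)
first = store.get_many(["ann", "bob", "ann"])  # one batched call for two unique keys
second = store.get_many(["bob", "cy"])         # only "cy" has to be fetched
```

In this sketch, three requested keys in the first call cost a single round trip, and the second call fetches only the key that was not seen before.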

Now, Facebook's RocksDB key-value store is opening up and gaining integration with MySQL. "A few years ago, we built RocksDB, an embeddable, persistent key-value store for fast storage that has several advantages compared with InnoDB for space efficiency," write Facebook engineers. "Despite its advantages, RocksDB does not support replication or an SQL layer, and we wanted to continue using those features from MySQL. This led us to build MyRocks, a new open source project that integrates RocksDB as a new MySQL storage engine. With MyRocks, we can use RocksDB as backend storage and still benefit from all the features in MySQL."

The post from Facebook adds:

"While there are some databases at Facebook that will still use InnoDB, we're in the process of migrating to MyRocks on our user database (UDB) tier. After deploying MyRocks to this database tier in one of our data center regions, we were able to use 50 percent less storage for the same amount of data compared with compressed InnoDB. Ultimately this will allow us to use half as many UDB servers. By sharing MyRocks with the community through open source, we hope that others can take advantage of these efficiencies.

MyRocks integrates RocksDB as a new MySQL storage engine, and its architecture provides added features that weren't previously available with InnoDB, including:

Faster replication: MyRocks replication is faster than InnoDB for a couple of reasons. MyRocks doesn't need random reads for updating secondary keys, unless the index is unique. Also, with the row-based binary logging format, MyRocks does not need random reads for updating primary keys, and does not even need to check uniqueness. The second feature is called read-free replication, which is an option that can be enabled in MyRocks.

Faster data loading: With faster data loading enabled for a session, MyRocks writes data directly onto the bottommost level, which avoids all compaction overheads. Since compactions use both CPU and I/O for decompressing, compressing, and writing data, avoiding compactions by bulk loading is optimal."
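The session-level bulk loading described above might look roughly like the following Python sketch, which only builds the SQL statements for one load and does not connect to a server. It assumes the `rocksdb_bulk_load` session variable described in the MyRocks documentation and assumes rows should be supplied in primary-key order; the function name `bulk_load_statements`, the table name and the `id`/`payload` columns are hypothetical.

```python
from typing import Iterable, List, Tuple


def bulk_load_statements(table: str, rows: Iterable[Tuple[int, str]]) -> List[str]:
    """Return, in order, the SQL statements for one bulk-load session.

    Assumptions (not taken from the article): the target table uses the
    MyRocks engine, has columns (id, payload) with id as primary key, and
    bulk loading is toggled with the rocksdb_bulk_load session variable.
    """
    # Sort by primary key, since bulk loading is assumed to expect ordered input.
    ordered = sorted(rows)
    values = ", ".join(
        "({}, '{}')".format(pk, payload.replace("'", "''"))  # escape single quotes
        for pk, payload in ordered
    )
    return [
        "SET SESSION rocksdb_bulk_load = 1",
        "INSERT INTO {} (id, payload) VALUES {}".format(table, values),
        "SET SESSION rocksdb_bulk_load = 0",
    ]


statements = bulk_load_statements("events", [(2, "b"), (1, "a")])
```

A real loader would send these statements over one database connection with a MySQL client library and would use that library's parameter binding rather than string formatting.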

Facebook uses MyRocks in-house. "Our work on MyRocks has been centered on making storage savings efficient," engineers report. "In looking ahead, we'll be focused on building out more complete features to support MyRocks."

You can find out much more here, and the GitHub repository is here.