Cloud-Smart Apache BookKeeper Graduates to Top Level Project

by Ostatic Staff - Jan. 27, 2015

One of the Achilles' heels for administrators of Big Data and cloud computing deployments is that disk and server failures hit up to 10 percent of systems annually. A failure rate that high calls for data replication strategies and for services that handle the replication.

One of the emerging projects focused on that task is Apache BookKeeper, an open source project that was established in 2011 as a sub-project of Apache ZooKeeper (an open source API for reliable distributed coordination) to reliably log streams of records. BookKeeper serves as a building block for reliable system consistency and recovery, and can be used to turn any standalone service into a highly available replicated service. Now, the Apache Software Foundation (ASF) has fast-tracked BookKeeper by making it a Top Level Project.

According to the ASF:

"One way to build a replicated service is to ensure that all write operations to the service are copied to all replicas; Apache BookKeeper's replicated logging service is well suited for this purpose. A database may have two replicas to ensure availability: if one crashes, the other can continue to serve traffic. However, ensuring that the data in these two replicas is consistent is not an easy problem to solve. Unlike naive solutions that run into problems like deadlock and inconsistency when one or both of the replicas fail, BookKeeper uses a combination of quorum writes, fencing, and, when necessary, outsourcing of consensus to ZooKeeper to ensure no state will be lost in the case of a replica failure. BookKeeper can similarly be applied to different classes of systems, such as messaging systems, filesystems and transaction processing systems."
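To make the quorum write and fencing ideas in that statement more concrete, here is a toy, in-memory sketch in Python. It is not BookKeeper's implementation or API: the `Replica`, `ReplicatedLog`, `QuorumError` and `fence_log` names are hypothetical, there is no networking or ZooKeeper involved, and the only assumption is that a write counts as durable once a configurable number of replicas acknowledge it.

```python
class QuorumError(Exception):
    """Raised when too few replicas acknowledge a write (hypothetical name)."""


class Replica:
    """Toy stand-in for one storage node; not a real BookKeeper bookie."""

    def __init__(self, name):
        self.name = name
        self.up = True        # False simulates a crashed node
        self.fenced = False   # True means new writes are refused
        self.entries = {}

    def write(self, entry_id, data):
        # A crashed or fenced replica never acknowledges a write.
        if not self.up or self.fenced:
            return False
        self.entries[entry_id] = data
        return True

    def fence(self):
        # Only a reachable replica can be fenced.
        if not self.up:
            return False
        self.fenced = True
        return True


class ReplicatedLog:
    """Appends succeed only when at least ack_quorum replicas acknowledge."""

    def __init__(self, replicas, ack_quorum):
        self.replicas = replicas
        self.ack_quorum = ack_quorum
        self.next_id = 0

    def append(self, data):
        entry_id = self.next_id
        acks = sum(r.write(entry_id, data) for r in self.replicas)
        if acks < self.ack_quorum:
            # The entry may sit on a minority of replicas, but it is
            # never reported to the caller as durable.
            raise QuorumError(f"only {acks} of {self.ack_quorum} acks")
        self.next_id += 1
        return entry_id


def fence_log(replicas, ack_quorum):
    """Fence every reachable replica.

    Returns True when fewer than ack_quorum replicas remain unfenced,
    so a stale writer can no longer complete an append.
    """
    fenced = sum(r.fence() for r in replicas)
    return len(replicas) - fenced < ack_quorum
```

With three replicas and an acknowledgement quorum of two, an append still succeeds after one replica crashes, and fails loudly rather than silently once a second one is lost. Calling `fence_log` before taking over a log is the toy equivalent of making sure an old writer that is still running cannot keep adding entries behind the new owner's back.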

Apache BookKeeper is highly available, with no single point of failure, and scales horizontally as more storage nodes are added. It is already in use in many cloud deployments. BookKeeper is used in production at Yahoo as the persistence layer for its cloud messaging infrastructure, and is also used at Twitter as the replicated persistence backend for different messaging use cases. Huawei also uses BookKeeper for shared storage in its solution for HDFS Namenode High Availability.

"We're very proud to have BookKeeper become a Top-Level Project. It is a testament to the hard work that my fellow committers have put in over the years that the ASF would give us their stamp of approval," said Ivan Kelly, Vice President of Apache BookKeeper, in a statement. "We hope that the increased exposure will bring even more contributions and use cases to the community."

If you are interested in joining the BookKeeper community, visit http://bookkeeper.apache.org and https://twitter.com/asfbookkeeper
