DRBD: For When it Absolutely, Positively, Has to be in Sync

by Jon Buys - Aug. 25, 2010

DRBD stands for Distributed Replicated Block Device and, as the name implies, it replicates a block device between two servers. DRBD was designed for “High Availability” (HA) clusters, and is conceptually similar to RAID 1 (mirroring), with the mirror stretched across the network.

Let’s say you have two servers and one service, say a MySQL database, that you want to keep up even if the server it runs on crashes. Without some form of HA, if the server hosting MySQL goes away, so does the MySQL database. To provide HA, DRBD inserts itself into the I/O stack and proxies all block-level operations, writing the data both to the local disk and to the disk on the second server. When the time comes to fail over to the standby server, a script moves the IP address over from the primary, mounts the DRBD filesystem, and starts MySQL. Since every write has already been replicated, no data is lost during the failover.
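To make the idea concrete, here is a toy sketch in Python of a mirrored write. It is not how DRBD is implemented (DRBD lives in the kernel and talks to its peer over the network); the class names `BlockDevice` and `MirroredDevice` are made up for illustration, and the sketch assumes a fully synchronous mode in which a write only counts as done once both disks have it.

```python
class BlockDevice:
    """A toy in-memory disk: maps block numbers to bytes."""

    def __init__(self):
        self.blocks = {}

    def write(self, block_no, data):
        self.blocks[block_no] = data
        return True


class MirroredDevice:
    """Hypothetical stand-in for a replicated device.

    A write is acknowledged only after both the local disk and the
    peer disk have stored it (assumed synchronous replication).
    """

    def __init__(self, local, peer):
        self.local = local
        self.peer = peer

    def write(self, block_no, data):
        ok_local = self.local.write(block_no, data)
        ok_peer = self.peer.write(block_no, data)  # in real life, over the network
        return ok_local and ok_peer


primary_disk = BlockDevice()
standby_disk = BlockDevice()
drbd0 = MirroredDevice(primary_disk, standby_disk)

drbd0.write(0, b"mysql page")
print(primary_disk.blocks == standby_disk.blocks)  # True
```

Because the application only ever writes through the mirrored device, the standby disk always holds the same blocks as the primary, which is what makes the failover described above possible.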

DRBD Overview Diagram

I honestly think that DRBD is ingenious. The replication happens in real time, and in my experience causes little to no performance degradation. DRBD, when combined with a floating IP address and any available heartbeat monitoring script, can make nearly any application highly available, as long as the application can recover cleanly from a crash.

DRBD is a loadable kernel module, and as of 2.6.33 it is included in the mainline kernel. If your Linux server runs kernel 2.6.33 or later, you already have DRBD available, and at most will need to install the management tools. If your server is older than that, you will need to download both the tools and the loadable module to get it running.
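If you want to check this from a script, a minimal sketch might look like the following. The function name `has_mainline_drbd` is my own, and the check only compares version numbers: it assumes the kernel release string starts with the usual dotted version, and it cannot tell you whether your distribution actually built or packaged the module.

```python
import platform
import re


def has_mainline_drbd(release):
    """Return True if the kernel release string is 2.6.33 or newer."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # Missing components (e.g. "3.0") are treated as zero.
    version = tuple(int(part or 0) for part in match.groups())
    return version >= (2, 6, 33)


if __name__ == "__main__":
    # platform.release() returns the running kernel's release string on Linux.
    print(has_mainline_drbd(platform.release()))
```

For example, `has_mainline_drbd("2.6.32-5-amd64")` is false, while `has_mainline_drbd("2.6.33")` is true.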

There are a few important things that DRBD will not help with. If there is corruption in the filesystem, DRBD will happily replicate the corruption between nodes. This is because DRBD has no knowledge of what is happening farther up the I/O stack, and therefore has no way of detecting that such corruption is occurring. Further, DRBD cannot provide instant failover between nodes. Failover is fast, but it’s not on the same level as MySQL Cluster. If you are using tools like Heartbeat and Mon, or Pacemaker, these systems must first detect that a failure has occurred, then add the IP address, take over the primary DRBD role, mount the filesystem, and start the service. These steps take time. It is not a lot of time, but it might be noticeable, depending on the sensitivity of your environment.
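The takeover steps above can be sketched as an ordered list of commands. This is only an illustration that prints the plan rather than running it; the resource name `r0`, the device `/dev/drbd0`, the mount point, the address, the interface, and the service name are all placeholders I picked, and a real cluster manager would do considerably more (failure detection, fencing, error handling).

```python
def failover_plan(resource, device, mount_point, ip_cidr, iface, service):
    """Return the ordered commands a standby node would run to take over.

    The order follows the steps described above: add the IP address,
    take over the primary DRBD role, mount the filesystem, start the service.
    """
    return [
        ["ip", "addr", "add", ip_cidr, "dev", iface],
        ["drbdadm", "primary", resource],
        ["mount", device, mount_point],
        ["service", service, "start"],
    ]


# All values below are hypothetical placeholders.
plan = failover_plan(
    resource="r0",
    device="/dev/drbd0",
    mount_point="/var/lib/mysql",
    ip_cidr="192.0.2.10/24",
    iface="eth0",
    service="mysql",
)

for command in plan:
    print(" ".join(command))
```

Each of these commands takes a moment, and none of them can start until the failure has been detected, which is where the failover delay comes from.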

If you are responsible for a service that is important enough to warrant some form of high availability, DRBD may well be worth a look. DRBD is open source, licensed under the GPL, and commercial support is provided by Linbit. Do you have any experience with DRBD, good or bad? How has it changed how you look at high availability and disaster recovery? I’d love to hear your stories in the comments!



2 Comments
 

Good article. I think you did a great job giving a bird's-eye overview of what DRBD is, and more importantly what it is not. Personally, I have been a longtime user of it, and the company I work for (Logicworks) uses it as part of our business.


While many people may want to incorporate some of the features and protection of Heartbeat, I like to use either ucarp or keepalived. Both of these really just bring VRRP-style address failover to Linux, and ucarp is a port of BSD's CARP utility, for those familiar with that.


I do disagree with you about failover time, though. We run clusters here that have sub-5-second failover times. This does take some extreme engineering precision to accomplish, but it is possible.


-- Kyle, Sr. Engineer @ Logicworks


Good overview. When DRBD is coupled with MySQL Replication, you have a very nice scale-out solution: you simply add MySQL slaves for more read scalability.


MySQL Cluster is a very different solution. As a shared-nothing architecture, there are no lock managers to worry about and all nodes are active, so failover times are typically sub-second. Also, data partitioning is automatic, so coupled with the distributed, parallel architecture, users get very high write scalability without application modifications.


Unlike MySQL with the InnoDB storage engine, MySQL Cluster distributes data across nodes, so complex joins involving many tables and returning large result sets (i.e. >1k rows) will not perform well.


A good overview of MySQL Cluster is here (requires registration):

http://www.mysql.com/why-mysql/white-papers/mysql_wp_Cluster_For_OnlineA...

