Cloudy Saturday: Puppet, for Automating Virtual Machines

by Ostatic Staff - Jun. 07, 2008

This is the first of our weekly "Cloudy Saturday" columns that will delve into using cloud computing techniques to manage your own server farms, both large and small. This week we'll talk about the Puppet automation system from Reductive Labs. Puppet is built upon the legacy of the venerable Cfengine system, but takes things to a whole new level.

The open source world is rife with projects that try to automate the management of systems. Until now, the most notable and most mature of them has been Cfengine. Luke Kanies was a member of the team that helped build Cfengine and he saw an opportunity, along with partner Andrew Schafer, to create a new framework that was easier to use and more functional. Kanies explains it like this: "I didn't create Puppet just to be different, I saw some clear holes in Cfengine and saw an opportunity to fix them."

So what can Puppet do for you? It allows system administrators to write "recipes" that define machine functions and maintenance tasks that automate their routine work. Thinking along the lines of cloud or utility computing, Puppet allows you to manage a large number of systems or virtual machines without doing manual labor or writing small one-off scripts.

This level of automation is critically important when you are trying to run any number of systems from 1 to 10,000. "The Puppet project was conceived when clouds were on the far horizon, but Puppet solves configuration problems that virtualization potentially multiplies," said Schafer.

Puppet adheres to the concept of idempotency. Put simply, this means that any task it runs can be run repeatedly with no adverse effects on the system. The framework only makes changes to systems that don't match the state they are supposed to be in. Ultimately the team intends to add full transaction support to these tasks, so that you can roll back changes that fail or commit many related changes in one monolithic block.
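To make the idea of idempotency concrete, here is a minimal sketch in Python. It is not Puppet's code or its recipe syntax; the function name `ensure_file_content` and the example file are hypothetical, chosen only to illustrate the "check the current state, change it only if it differs" pattern described above.

```python
import os
import tempfile


def ensure_file_content(path: str, desired: str) -> bool:
    """Bring the file at `path` to the desired state.

    Returns True if a change was made, False if the file already matched.
    (Hypothetical helper for illustration; not part of Puppet.)
    """
    try:
        with open(path, "r", encoding="utf-8") as fh:
            current = fh.read()
    except FileNotFoundError:
        current = None  # a missing file never matches the desired state

    if current == desired:
        return False  # already in the desired state, so do nothing

    with open(path, "w", encoding="utf-8") as fh:
        fh.write(desired)
    return True


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "motd")
    # The first run creates the file; the second run finds nothing to change.
    print(ensure_file_content(target, "managed by automation\n"))
    print(ensure_file_content(target, "managed by automation\n"))
```

Running the task a second time leaves the system untouched, which is what makes it safe to apply the same description to a machine over and over.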

Puppet goes a long way toward the goal of the virtual datacenter, where compute resources can be allocated on demand and in the place that best suits the business' needs. Schafer said, "Manage your services, not your servers. Things get more interesting when adding 'machines' is an API, as opposed to a purchase order, then a 3 week wait, before plugging in...a lot more interesting."

Puppet can manage any server anywhere. The servers are not required to live in your colo space or datacenter; Puppet can also automate systems on Amazon EC2 or any number of other commercial utility computing clouds. This capability gives you the chance to launch services on EC2 and move them to your own machines without changes to the overall structure of your systems. Additionally, you can use Puppet as the intra-VM framework, alongside an extra-VM package such as Enomalism or Eucalyptus, to add capacity to your internal systems in times of need.

If you're building your own datacenter or utility computing cloud, Puppet is definitely one piece of the puzzle and I encourage you to check it out. Next week we'll look into Enomalism, another step toward automation and the use of cloud computing techniques in your own systems.