Evolving as a Systems Administrator
I returned from a five-month absence to find that my beloved Nagios, Puppet, and Webmin had been replaced. While I was gone, a younger generation had deployed tools that were less command-line driven and more web, GUI, and pointy-clicky. While I am impressed with the capabilities of the tools, I cannot help but wonder whether my preference for the command line marks me as the expert I aspire to be, or as a relic.
I do not know of many other folks in the IT industry today who can claim production experience with vacuum tube technology and paper tape. I can, not because of my age, but because of my service in the US Navy. We had transmitters and receivers that needed to warm up for a half-hour before we could start using them, and a gigantic machine with reels on the front that accepted input from a punched paper tape containing a database of ships. The Navy used this old equipment at least through the mid-nineties, not because it had some ideological aversion to new technology, but because it worked. The equipment was there because it served its purpose and served it well, and this experience early in my career has shaped my views of the tools I use in the datacenter today.
Admittedly, after a time some things just need to go. Webmin, in particular, needed to go. It is a handy tool, especially in an environment without a central user store, or one where the root or local admin password needs to be changed on a regular basis, but Webmin has a handful of security risks that are, frankly, unacceptable. Webmin has a cluster management tool that lets admins do things like change the root password on all servers, or run arbitrary commands as root. Lots of power, but as Spidey would say, also lots of responsibility. Webmin’s dirty secret is that it stored the root password as plain text in /etc/webmin/servers, once for each server in the cluster. Auditors tend to frown on such things. The Webmin client would also show up on security scans. It had to go.
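For anyone who wants to check their own hosts for this kind of problem, here is a minimal Python sketch of an audit. It assumes the files in the directory are simple `key=value` text files and that the password sits under a key named `pass`; both are assumptions for illustration, not a documented Webmin format, and `find_plaintext_passwords` is a hypothetical helper name. It reports where a non-empty value was found without ever printing the value itself.

```python
from pathlib import Path


def find_plaintext_passwords(directory, key="pass"):
    """Return (file name, line number) pairs where `key=` has a non-empty value.

    Assumes simple key=value text files; the key name is an assumption.
    The password value itself is never returned or printed.
    """
    hits = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        lines = path.read_text(errors="replace").splitlines()
        for lineno, line in enumerate(lines, start=1):
            name, sep, value = line.partition("=")
            # Only flag lines of the form "<key>=<something non-empty>".
            if sep and name.strip() == key and value.strip():
                hits.append((path.name, lineno))
    return hits
```

Run as root against a directory such as /etc/webmin/servers, an empty result is what an auditor would want to see.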
Puppet is a different matter entirely. Like Chef, Puppet allows you to program your datacenter rather than administer it. After my return I found that the remaining admins had replaced both Puppet and Webmin with Spacewalk. After two weeks with Spacewalk, I like it so far. It has an API that I was able to tap into with a Python library, and while I’m looking forward to exploring it a bit more, I miss the simplicity of editing plain text files. Spacewalk’s web interface also seems a bit crowded, but after the shell and vi, I suppose anything does. I’m aware of Puppet’s web interface, but I never had reason to load it. Perhaps if I had, we would still be using it.
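To give a feel for what tapping into that API can look like, here is a small sketch using only Python's standard `xmlrpc.client` module. It is not necessarily the library mentioned above. It assumes the Spacewalk XML-RPC methods `auth.login`, `system.listSystems`, and `auth.logout`, and that each system record carries a `name` field; `list_system_names` is a hypothetical helper written for this example.

```python
import xmlrpc.client


def connect(url):
    """Build an XML-RPC proxy; no network traffic happens until a call is made."""
    return xmlrpc.client.ServerProxy(url)


def list_system_names(client, user, password):
    """Log in, return the sorted names of registered systems, and always log out."""
    key = client.auth.login(user, password)  # session key for later calls
    try:
        systems = client.system.listSystems(key)
        return sorted(system["name"] for system in systems)
    finally:
        # Release the session even if the listing call fails.
        client.auth.logout(key)
```

A call might look like `list_system_names(connect("https://spacewalk.example.com/rpc/api"), "admin", "secret")`, where the host name and credentials are placeholders. Passing the client in as an argument keeps the function easy to exercise with a stand-in object instead of a live server.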
Then there is Nagios. I loved it and I hated it; it really was a beast of a system. However, I felt it was a beast I had tamed. The new generation had replaced Nagios with Zenoss, and rolled out a comprehensive monitoring solution that included a few features that were still on the planning board for Nagios. Choosing SNMP monitoring over the Nagios NRPE daemon, the new team had been able to convince management that the slick graphs and reporting tools were a better solution than the Frankenstein that I had been building for the past five years. And you know what? They were right. While I had done my best to tame Nagios and put the config files in a manageable scheme, the truth is that some things that needed to change were not being changed because of how complicated the NRPE setup was. Zenoss is completely centralized and scalable, and clearly the right choice for where our team is right now.
Watching the progression of technology in fast-forward while I was in the Navy was fantastic, but I also lived through another transitional period. The second major shift was away from the mainframe to smaller, cheaper x86 boxes running Linux. During this period there were a few technicians who did not upgrade their skill set along with the march of technology, and, unfortunately, they were left behind. Part of being a modern sysadmin is knowing which technologies to fight for because they keep working, and which ones are becoming an impediment to the real job you are supposed to be doing. Spacewalk and Zenoss are both fantastic in their own right, even if they are a bit too graphical for my taste. They are both well supported, open source, and constantly updated, which makes them good choices for enterprise adoption. If there is anything I have learned this summer, it is that you can never settle in the tech field; you are never just done learning. You have to keep moving, keep growing, and keep getting a little better every day. Part of the job is knowing which technologies to hold on to, and which ones need to go.