On Call Scheduling With Nagios
by Jon Buys - Jul. 30, 2013
Nagios continues to impress me with its power and capability. For the past seven years Nagios has been my enterprise monitoring solution of choice, and as our environment has grown, Nagios has grown right along with it. Luckily, since we have a sane method of managing our configuration files, growing Nagios has not been an issue. Recently though, we were comparing ZenOSS with Nagios, and one item discussed was how ZenOSS deals with the on-call rotation between sysadmins. The way we were doing it with ZenOSS required the sysadmin who was on call that week to log in and make a change. I was certain that Nagios had something built into it, and sure enough, I was right.

On-call scheduling with Nagios is done using timeperiods. Normally, timeperiods are defined in the timeperiods.cfg file, and to be honest, the only one we normally use is 24x7. Nagios comes preconfigured with "24x7" (all the time), "workhours" for the work day, "none" for no time at all, the US Holidays, and the reverse of the US Holidays. Timeperiods are one of the things Nagios checks when deciding whether to send out an alert. If there is an alert that needs to be sent, and the current time falls within the timeperiod defined for the user, the alert goes out.

To set up an on-call rotation, we create a new file named "oncall.cfg" and define a few new timeperiods for Nagios. For example, here is a portion of ours:

    define timeperiod{
        alias               buys-oncall
        timeperiod_name     buys-oncall
        2013-03-11 / 21     07:00-24:00
        2013-03-12 / 21     00:00-24:00
        2013-03-13 / 21     00:00-24:00
        2013-03-14 / 21     00:00-24:00
        2013-03-15 / 21     00:00-24:00
        2013-03-16 / 21     00:00-24:00
        2013-03-17 / 21     00:00-24:00
        2013-03-18 / 21     00:00-07:00
        exclude             buys-out-of-office
        use                 smith-out-of-office
    }

If we walk through this a bit, the first two lines name the timeperiod "buys-oncall". The next section defines the starting dates for the rotation. 2013-03-11 is March 11, 2013.
The next part, "/ 21", tells Nagios to repeat this timeperiod every twenty-one days, and the last part of the line tells Nagios to start at seven in the morning and end that day at midnight. I then repeat the definition for the remainder of the week, ending at seven in the morning of the following Monday. The last two lines reference two other timeperiods, one to exclude and one to use. The first, named buys-out-of-office, looks like this:

    define timeperiod{
        name                buys-out-of-office
        alias               buys-out-of-office
        timeperiod_name     buys-out-of-office
        2013-01-22 - 2013-01-23     07:00-16:00     ; Test Vacation Definition
    }

This section defines when I am going to be unavailable to be on call. According to the dates above, I would be on vacation from January twenty-second at 7:00 AM to January twenty-third at 4:00 PM, or two days. Nagios uses these dates as exclusion times when deciding whether to send you an alert.

But if I am not available, who is going to cover for me? That question is answered by the last line of the on-call definition, "use". If there are three sysadmins, each covering a week, then each should cover for one other, which is why the third sysadmin's timeperiod definition has a use line that says use buys-out-of-office. So, for times that I tell Nagios not to send something to me, I am also telling it to send it to my backup. Each of us has the two definitions in Nagios, and they automatically rotate every 21 days. If any of us has vacation or time off, we enter it as a line item in the config file, maybe with a friendly comment about where we will be.

Nagios can be hard to wrap your head around at first, but once you do, it becomes very easy to maintain. In the past few years there has been a lot of emphasis on big, heavy, GUI-driven interfaces for systems management, but I'm personally happy to keep everything in text files and do my management with vi.
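
For anyone who wants to sanity-check the "/ 21" arithmetic before trusting it with a pager, here is a minimal Python sketch of the rotation described above. It is not how Nagios evaluates timeperiods internally. It only models the pattern in the buys-oncall definition: one week on call, from 07:00 on the starting Monday to 07:00 the following Monday, repeating every 21 days. The function name on_call and the constants are made up for illustration, and the exclude and use lines are deliberately ignored.

```python
from datetime import datetime, timedelta

# Assumptions taken from the buys-oncall example, not from Nagios itself:
ROTATION = timedelta(days=21)  # the "/ 21" skip interval
SHIFT = timedelta(days=7)      # one week on call per rotation
HANDOVER_HOUR = 7              # shifts start and end at 07:00


def on_call(moment: datetime, first_monday: datetime) -> bool:
    """Return True if `moment` falls inside a repeating on-call week.

    `first_monday` is the first date line of the timeperiod (for example
    2013-03-11); only its calendar date is used. Out-of-office exclusions
    are not modelled here.
    """
    start = first_monday.replace(
        hour=HANDOVER_HOUR, minute=0, second=0, microsecond=0
    )
    if moment < start:
        # The rotation has not begun yet.
        return False
    # Position of `moment` within the current 21-day cycle.
    offset = (moment - start) % ROTATION
    return offset < SHIFT
```

With a starting date of datetime(2013, 3, 11), the function returns True for any time from March 11 at 07:00 up to, but not including, March 18 at 07:00, and True again starting April 1 at 07:00, which is 21 days after the first handover.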