This post describes a generic approach to monitoring automation that
should work for anyone who uses AWS together with Nagios for
monitoring.
Features:
[1]: Automatically adds a new host to monitoring whenever a new instance is added to the AWS account.
[2]: Automatically removes a host from monitoring when its instance is terminated in AWS.
[3]: Reads hostgroup information from custom tags.
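Features [1] and [2] both come from regenerating the whole host file from the current instance list on every run, so nothing is added or removed by hand. A minimal sketch of that idea follows. The Instance tuple and render_hosts are hypothetical stand-ins, not boto objects, and the state check is an assumption, added because EC2 may keep listing an instance for a short while after it is terminated.

```python
from collections import namedtuple

# Hypothetical stand-in for an EC2 instance; only the fields used here.
Instance = namedtuple("Instance", "name private_ip state")


def render_hosts(instances):
    """Build the full host config text from the instances that are running."""
    blocks = []
    for inst in instances:
        if inst.state != "running":  # assumption: skip stopped/terminated hosts
            continue
        blocks.append(
            "define host{\n"
            "use generic-host\n"
            "host_name %s\n"
            "address %s\n"
            "}\n" % (inst.name, inst.private_ip)
        )
    return "\n".join(blocks)


fleet = [
    Instance("web1", "10.0.0.5", "running"),
    Instance("old-db", None, "terminated"),
]
config = render_hosts(fleet)
print(config)
```

Because the file is rebuilt from scratch, a host that no longer appears in the list simply has no block in the next version of the file.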
NOTE: Because monitoring is automated, a few rules have to be followed, otherwise the monitoring will fail.
Rule1: Each AWS instance may have only two tags: [1. the default: Name, 2. groups]. NOTE: the tag names are case sensitive, so use them exactly as written.
Rule2:
As of now, the following keywords can be used in the groups custom
tag. [Note: the values are case sensitive. If you need a new one, let
me know before using it, or add the new hostgroup to the Nagios
hostgroup config file first and only then put it into the groups
custom tag.]
hostgroup_name hadoop
hostgroup_name db
hostgroup_name http
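Since an unknown or wrongly cased group name breaks the monitoring, it may help to check a groups tag value before it is used. This is only a sketch: it assumes the tag holds a comma-separated list (the form the Nagios hostgroups directive takes), and ALLOWED_HOSTGROUPS and unknown_groups are hypothetical names.

```python
# Hostgroups listed above; extend this set when the Nagios hostgroup file grows.
ALLOWED_HOSTGROUPS = {"hadoop", "db", "http"}


def unknown_groups(tag_value, allowed=ALLOWED_HOSTGROUPS):
    """Return the names in a comma-separated groups tag that are not allowed."""
    names = [name.strip() for name in tag_value.split(",") if name.strip()]
    # The comparison is deliberately case sensitive, matching the rule above.
    return [name for name in names if name not in allowed]


print(unknown_groups("db,http"))    # []
print(unknown_groups("db,Hadoop"))  # ['Hadoop']
```

An empty result means every name in the tag matches a known hostgroup.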
The following is the Python boto script:
#!/usr/bin/env python
# Python 2 script (print statements) using the boto EC2 API.
import boto.ec2

conn = boto.ec2.connect_to_region('us-east-1')
reservations = conn.get_all_instances()
for res in reservations:
    for inst in res.instances:
        print ("define host{")
        print "%s \t %s" % ("use", "generic-host")  # \t for tab
        print "%s %s" % ("host_name", inst.tags['Name'])
        if inst.tags['Name'] == 'qa1':
            # Different check for qa1, as it is a Fedora system.
            print "%s \t%s" % ("check_command", "check_ssh")
        print "%s \t %s: %s" % ("alias", inst.tags['Name'], inst.public_dns_name)
        print "%s %s" % ("address", inst.private_ip_address)
        # The alias and address values are swapped (private IP as the address)
        # because that is more cost effective.

        ## The following block checks for a custom tag known as groups.
        ## If it finds one, the host becomes part of those hostgroups.
        alltags = inst.tags        # all tags of the instance
        alltagsC = str(alltags)    # convert the tags to a string
        isgroup = alltagsC.find('groups')
        if isgroup > 0:
            # Skip past "groups': u'" (11 characters) to the start of the value.
            sp = isgroup + 11
            # Drop the trailing "'}"; this assumes groups is the last tag printed.
            otherGroups = alltagsC[sp:-2]
            #print "%s %s %s" % ("hostgroups", inst.instance_type, otherGroups)
            print "%s %s" % ("hostgroups", otherGroups)
        #else:
            #print "%s %s" % ("hostgroups", inst.instance_type)
        print ("}\n")
NOTE: As of now I don't know how to read the value of a custom tag directly, so the script uses a string-slicing hack.
NOTE: The instance type is no longer used as a hostgroup, because monitoring will fail if a group is defined for an instance type and no host is part of that group.
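The string slicing in the script above can probably be avoided. The script already reads inst.tags['Name'], which suggests the tags object supports dictionary-style lookups, so a lookup of the groups key with a fallback should work too. The sketch below assumes exactly that. It uses plain dicts in place of inst.tags, and hostgroups_line is a hypothetical helper name.

```python
def hostgroups_line(tags):
    """Return the 'hostgroups ...' line for a host, or None without a groups tag."""
    groups = tags.get("groups")  # assumption: tags behaves like a dict
    if not groups:
        return None
    return "hostgroups %s" % groups


# Plain dicts stand in for inst.tags here.
print(hostgroups_line({"Name": "qa1", "groups": "db,http"}))  # hostgroups db,http
print(hostgroups_line({"Name": "qa1"}))                       # None
```

A lookup like this does not depend on how many tags an instance has or on the order in which they are printed.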
Put the following script into a file and add that file to root's crontab:
#!/bin/bash
sudo /path/to/getInstanceDetails.py > /path/to/all_hosts.cfg
sleep 2
sudo service nagios3 restart
## The script above is added to cron as the root user: sudo crontab -e
## */15 * * * * sudo /path/to/aboveScriptName.sh
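The cron job above restarts Nagios every 15 minutes even when no instance has changed. One possible refinement, sketched here under assumptions, is to rewrite all_hosts.cfg only when the generated text differs and to restart only in that case. The helper name write_if_changed and the 0644 file mode are illustrative assumptions, not part of the setup above.

```python
import os
import tempfile


def write_if_changed(path, content):
    """Replace path with content atomically; return True only if it changed."""
    try:
        with open(path) as handle:
            if handle.read() == content:
                return False  # nothing new, so no restart is needed
    except IOError:
        pass  # file missing or unreadable: treat it as changed
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    # mkstemp creates the file with mode 0600; open it up so that the
    # Nagios user can read it (assumed mode, adjust as needed).
    os.chmod(tmp_path, 0o644)
    os.rename(tmp_path, path)  # atomic on POSIX within one filesystem
    return True
```

A caller would restart Nagios only when the function returns True, so an unchanged fleet causes no restart at all.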
## Where to define which checks run on which hosts ##
define service{
        hostgroup_name          db                            ; NOTE: only the hostgroup name goes here
        service_description     MYSQL
        check_command           check_nrpe_1arg!check_mysql
        use                     generic-service-after-15      ; Name of service template to use
        notification_interval   0                             ; set > 0 if you want to be renotified
        }
NOTE: You can create your own generic-service-xxx templates, each with its own properties, and reference them here.