Performing actions on host-up-down using nagios

From Notes_Wiki
Revision as of 11:20, 15 December 2015 by Saurabh

To perform actions when a host goes up or down using Nagios, use the following steps:

  1. yum -y install nagios* --skip-broken
  2. edit /etc/nagios/objects/contacts.cfg and set the correct email address
  3. edit /etc/nagios/objects/commands.cfg and change the check-host-alive command definition to:
    define command{
    command_name check-host-alive
    # command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
    command_line $USER1$/check_host_alive.sh $HOSTADDRESS$
    }
  4. edit /etc/nagios/objects/templates.cfg and add the following template after the linux-server definition:
    define host{
    name isp  ; The name of this host template
    use linux-server  ; This template inherits other values from the linux-server template
    check_period 24x7  ; Hosts using this template are checked round the clock
    check_interval 1  ; Actively check the host every 1 minute
    retry_interval 1  ; Schedule host check retries at 1 minute intervals
    max_check_attempts 3  ; Check each host 3 times (max)
    check_command check-host-alive ; Default command to check hosts
    notification_period 24x7  ; Send notifications 24x7
    notification_interval 10  ; Resend notifications every 10 minutes
    notification_options d,u,r,f,s  ; Only send notifications for specific host states
    contact_groups admins  ; Notifications get sent to the admins by default
    register 0  ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
    }
  5. edit /etc/nagios/nagios.cfg and
    1. comment out cfg_file=/etc/nagios/objects/localhost.cfg
    2. uncomment cfg_dir=/etc/nagios/servers
  6. create a host and service definition for monitoring by creating the file /etc/nagios/servers/test.cfg with the following contents:
    define host{
    use isp  ; Name of template
    host_name 192.168.122.102  ; Short name of server, will be used in nagios configuration
    alias 192.168.122.102 ; Long name of server, will be used in reporting
    address 192.168.122.102 ; FQDN or IP address of host, will be used for checks
    }
    define service{
    use generic-service  ; Name of service template to use
    host_name 192.168.122.102
    service_description PING
    check_command check_ping!100.0,20%!500.0,60%
    notifications_enabled 1
    }
  7. create /usr/lib64/nagios/plugins/check_host_alive.sh with the following contents:
    #!/bin/bash
    # Ping the host given as $1 for at most 5 seconds; -q prints only the summary
    OUTPUT2=$(ping -q -w 5 "$1")
    STATUS2=$?
    #Do not use internal plugin. It hangs in few cases. Ping used above is more reliable
    #OUTPUT=$(/usr/lib64/nagios/plugins/check_ping -H $1 -w 3000.0,80% -c 5000.0,100% -p 5 -t 5)
    #STATUS=$?
    if [[ "$1" == "14.139.5.5" && "$STATUS2" != "0" ]]; then
        GATEWAY=$(/sbin/ip route | awk '/default/ { print $3 }')
        echo "$(date) NKN is down, Gateway is $GATEWAY" >> /var/log/nagios/nkn-link-status.txt
        /usr/bin/ssh root@proxy1.sbarjatiya.com "echo $(date) NKN is down, should shift to BEAM if required >> /root/nkn-link-status.txt" >> /tmp/nagios.txt 2>&1
        #Shift from NKN to BEAM if current gateway is not beam gateway
        if [[ "$GATEWAY" != "183.82.96.1" ]]; then
            sudo /sbin/route -v del default gw 10.4.8.2 >> /tmp/nagios.txt 2>&1
            sudo /sbin/route -v add default gw 183.82.96.1 >> /tmp/nagios.txt 2>&1
            #Not changing DNS from 10.4.20.204 to BEAM in hope that the same DNS might work
        fi
    elif [[ "$1" == "14.139.5.5" && "$STATUS2" == "0" ]]; then
        GATEWAY=$(/sbin/ip route | awk '/default/ { print $3 }')
        echo "$(date) NKN is up, Gateway is $GATEWAY" >> /var/log/nagios/nkn-link-status.txt
        /usr/bin/ssh root@proxy1.sbarjatiya.com "echo $(date) NKN is up, should shift to NKN if required. >> /root/nkn-link-status.txt" >> /tmp/nagios.txt 2>&1
        #Shift from BEAM to NKN, if current gateway is not NKN gateway
        if [[ "$GATEWAY" != "10.4.8.2" ]]; then
            sudo /sbin/route -v del default gw 183.82.96.1 >> /tmp/nagios.txt 2>&1
            sudo /sbin/route -v add default gw 10.4.8.2 >> /tmp/nagios.txt 2>&1
            #Not changing DNS back to 10.4.20.204, as the DNS are not changed while migrating to BEAM
        fi
    elif [[ "$1" == "183.82.96.1" && "$STATUS2" == "0" ]]; then
        echo "$(date) BEAM is up" >> /var/log/nagios/beam-link-status.txt
        /usr/bin/ssh root@proxy1.sbarjatiya.com "echo $(date) BEAM is up, we can use it if NKN is down. >> /root/beam-link-status.txt" >> /tmp/nagios.txt 2>&1
    elif [[ "$1" == "183.82.96.1" && "$STATUS2" != "0" ]]; then
        echo "$(date) BEAM is down" >> /var/log/nagios/beam-link-status.txt
        /usr/bin/ssh root@proxy1.sbarjatiya.com "echo $(date) BEAM is down >> /root/beam-link-status.txt" >> /tmp/nagios.txt 2>&1
    else
        #Any other host: only record debugging information
        echo "Else case" >> /tmp/nagios.txt
        echo "1 is $1" >> /tmp/nagios.txt
        echo "STATUS is $STATUS2" >> /tmp/nagios.txt
    fi
    #Left unquoted on purpose so that the multi-line ping output collapses into one line
    echo $OUTPUT2
    #Exit 0 when the ping succeeded, 2 (critical) otherwise
    if [[ "$STATUS2" == "0" ]]; then
        exit 0
    else
        exit 2
    fi
  8. chmod +x /usr/lib64/nagios/plugins/check_host_alive.sh
  9. usermod -s /bin/bash nagios
  10. su - nagios
  11. ssh-keygen
  12. ssh-copy-id root@192.168.122.103 #root@proxy1.sbarjatiya.com
  13. ssh root@192.168.122.103
  14. exit
  15. visudo
    1. Give nagios full root access or at least access to run /sbin/route
    2. Disable the Defaults requiretty option by commenting it out
  16. nagios -v /etc/nagios/nagios.cfg
  17. chkconfig nagios on
  18. service nagios start
  19. htpasswd -c /etc/nagios/passwd nagios
  20. chown -R nagios:apache /etc/nagios
  21. chmod -R 755 /etc/nagios
  22. chmod -R 777 /var/log/nagios
  23. chown -R nagios:nagios /var/log/nagios
  24. chmod 700 /var/log/nagios/.ssh/
  25. chmod 600 /var/log/nagios/.ssh/*
  26. chown -R nagios:apache /var/spool/nagios/
  27. chmod -R 775 /var/spool/nagios/
  28. Look at /tmp/nagios.txt and verify that lines are getting appended
  29. Look at /root/status.txt on 192.168.122.103 and verify that it matches the status of the 192.168.122.102 host
  30. Verify emails are being received at admin email ID.
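
Before waiting for Nagios to schedule the check, it can help to run the plugin by hand and look at its exit code. The following is a minimal sketch, assuming the usual Nagios plugin convention (exit 0 = OK, 1 = WARNING, 2 = CRITICAL, anything else = UNKNOWN). The helper names state_name and run_check are hypothetical and are not part of Nagios.

```shell
#!/bin/bash
# Hypothetical helpers for testing a Nagios plugin by hand.

# Map a plugin exit code to its conventional Nagios state name.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Run a plugin with its arguments, print the state name and the first
# line of its output, and return the plugin's own exit code.
run_check() {
    local output status first_line
    output=$("$@" 2>&1)
    status=$?
    first_line=$(printf '%s\n' "$output" | head -n 1)
    echo "$(state_name "$status"): $first_line"
    return "$status"
}

# Example (run as the nagios user so that the ssh key and sudo rules are exercised):
#   run_check /usr/lib64/nagios/plugins/check_host_alive.sh 192.168.122.102
```

Running the check as the nagios user rather than as root matters here, because the script depends on that user's ssh key and sudo permissions.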


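For the visudo step, a narrower alternative to full root access is a rule limited to /sbin/route. The sketch below only prints the rule text so that it can be reviewed before it is installed. The helper name sudoers_rule_for and the drop-in path /etc/sudoers.d/nagios-route are assumptions made for illustration, and the example assumes that /etc/sudoers includes the /etc/sudoers.d directory.

```shell
#!/bin/bash
# Hypothetical helper: print a sudoers fragment that lets one user run one
# command as root without a password and without requiring a tty.
sudoers_rule_for() {
    user=$1
    cmd=$2
    printf 'Defaults:%s !requiretty\n' "$user"
    printf '%s ALL=(root) NOPASSWD: %s\n' "$user" "$cmd"
}

# Example (as root); visudo -cf only checks the syntax of the given file:
#   sudoers_rule_for nagios /sbin/route > /etc/sudoers.d/nagios-route
#   chmod 440 /etc/sudoers.d/nagios-route
#   visudo -cf /etc/sudoers.d/nagios-route
```

The per-user Defaults line turns off requiretty for the nagios user only, instead of commenting the option out for every user.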