Nagios is a tool for system monitoring. It constantly checks whether other machines are working properly and verifies that the services on those machines are running as expected. In addition, Nagios accepts status reports from other processes or machines; for example, a web server can report directly to Nagios whether it is overloaded. The main purpose of system monitoring is to detect any system that is not working properly as soon as possible, so that you find out about the issue before the users of that system report it to you.

Adding new Hostgroups to Nagios is a common task. It entails setting up the Hostgroup, its hosts and the associated services. You may also want to give the Hostgroup Administrator access to the Nagios Web Interface, which involves creating a new contact. This Administrator should only have access to the hosts that they are responsible for, so I will cover all of these themes today.

Describe Directory Structure

Before starting out, I will list my /usr/local/etc/nagios directory structure so as to visualize the connections between the configuration files.

/usr/local/etc/nagios # tree
.
|-- cgi.cfg
|-- config
|   |-- commands
|   |   |-- default-commands.cfg
|   |   `-- extra-commands.cfg
|   |-- contactgroups
|   |-- contacts
|   |   |-- default-contacts.cfg
|   |   `-- example.com-contacts.cfg
|   |-- hostgroups
|   |   |-- servers-example.com-hostgroups.cfg
|   |   `-- servers-freebsd-hostgroups.cfg
|   |-- hosts
|   |   |-- alfa.pbdigital.org.cfg
|   |   |-- bravo.pbdigital.org.cfg
|   |   |-- charlie.pbdigital.org.cfg
|   |   |-- xray.example.com.cfg
|   |   |-- yankee.example.com.cfg
|   |   `-- zulu.example.com.cfg
|   |-- servicegroups
|   |-- services
|   |   |-- alfa.pbdigital.org-services.cfg
|   |   |-- bravo.pbdigital.org-services.cfg
|   |   |-- charlie.pbdigital.org-services.cfg
|   |   |-- servers-freebsd-services.cfg
|   |   |-- xray.example.com-services.cfg
|   |   |-- yankee.example.com-services.cfg
|   |   `-- zulu.example.com-services.cfg
|   |-- templates
|   |   |-- default-templates.cfg
|   |   `-- extra-templates.cfg
|   `-- timeperiods
|       `-- default-timeperiods.cfg
|-- nagios.cfg
`-- resource.cfg

As you can see I have all the configuration files that do not need to be in the root nagios directory in a directory named config. This is set in nagios.cfg as follows:

# You can also tell Nagios to process all config files (with a .cfg
# extension) in a particular directory by using the cfg_dir 
# directive as shown below:

cfg_dir=/usr/local/etc/nagios/config
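
Nagios reads every file with a .cfg extension under the cfg_dir directory, including those in subdirectories. As a quick sanity check, a small shell helper can list the files that would be picked up. This is only a sketch: the function name list_cfg_files is my own, not part of Nagios.

```shell
# Hypothetical helper: list the .cfg files under a cfg_dir directory,
# including subdirectories, in sorted order.
list_cfg_files() {
    find "$1" -type f -name '*.cfg' | sort
}
```

For the layout in this article it would be called as list_cfg_files /usr/local/etc/nagios/config. A file that is missing from the output, for example because it was saved without the .cfg extension, will not be loaded.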

Define a New Hostgroup

In the file servers-example.com-hostgroups.cfg I define the new Hostgroup.

define hostgroup{
  hostgroup_name    servers-example.com
  alias             example.com servers 
  members           xray.example.com,yankee.example.com,zulu.example.com
}
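
Listing hosts under members is one way to populate a Hostgroup. As an alternative sketch, Nagios also lets a host join a group from its own definition with the hostgroups directive, in which case the members line can be dropped. This is not how my files are set up; it is shown only for comparison.

```
define hostgroup{
  hostgroup_name    servers-example.com
  alias             example.com servers
}

define host {
    use                     server-template-example.com
    host_name               xray.example.com
    alias                   xray
    address                 104.27.185.152
    hostgroups              servers-example.com     ; join the group from the host side
}
```

The same hostgroups line could be placed in a host template instead, so that every host using the template joins the group automatically.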

Define New Hosts

In the config/hosts directory, I add 3 new files, one for each of the hosts I will be monitoring. The following file defines xray.example.com:

define host {
    use                     server-template-example.com
    host_name               xray.example.com
    alias                   xray
    address                 104.27.185.152
}
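
The other two host files follow the same pattern. As a sketch, yankee.example.com.cfg might look like the following. The address is a placeholder from the documentation range 192.0.2.0/24, not the real address of the host.

```
define host {
    use                     server-template-example.com
    host_name               yankee.example.com
    alias                   yankee
    address                 192.0.2.10              ; placeholder, replace with the real address
}
```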

The use directive determines which template serves as the base configuration for the host. I have defined a new template in config/templates/extra-templates.cfg as follows:

define host {

    name                      server-template-example.com   ; The name of this host template
    use                       generic-host                  ; This template inherits other values from the generic-host template
    check_period              24x7                          ; By default, example.com hosts are checked round the clock
    check_interval            5                             ; Actively check the host every 5 minutes 
    retry_interval            1                             ; Schedule host check retries at 1 minute intervals
    max_check_attempts        10                            ; Check each example.com host 10 times (max)
    check_command             check-host-alive-4            ; Default command to check example.com hosts
    notification_period       workhours                     ; example.com admins hate to be woken up, so we only notify during the day 
                                                            ; Note that the notification_period variable is being overridden from
                                                            ; the value that is inherited from the generic-host template!
    notification_interval     120                           ; Resend notifications every 2 hours 
    notification_options      d,u,r                         ; Only send notifications for specific host states
    contact_groups            example.com-admins            ; Notifications get sent to the admins by default 
    register                  0                             ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
}
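
The template refers to a command named check-host-alive-4, which has to be defined in one of the command files. Its definition is not reproduced in this article, so the following is only a guess at its shape: it assumes the command is the stock check-host-alive ping check with the -4 option of check_ping added to force IPv4. The thresholds are the ones from the Nagios sample configuration.

```
define command {
    command_name    check-host-alive-4
    ; Assumed definition: ping the host over IPv4 only
    command_line    $USER1$/check_ping -4 -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
```

$USER1$ is the plugin directory macro, which is normally set in resource.cfg.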

Add Services

In the config/services directory, I add 3 new files, one for each of the hosts I will be monitoring. The following is a sample of the xray.example.com-services.cfg file, which defines the services for the host xray.example.com.

define service{
        use                     generic-service
        service_description     HTTPS Certificate
        check_command           check_http_cert_sni
        host_name               xray.example.com
        }

define service{
        use                     generic-service
        service_description     Web Server HTTPS
        check_command           check_http_ssl_sni
        host_name               xray.example.com
        }

define service{
        use                     passive_service
        service_description     Disk Usage
        host_name               xray.example.com
        }

define service{
        use                     passive_service
        service_description     Swap Usage
        host_name               xray.example.com
        }
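
The Disk Usage and Swap Usage services use a passive_service template, which is not shown here. A minimal sketch of such a template is given below. The freshness threshold is a hypothetical value, and the check_command line assumes that a command named check_dummy, wrapping the check_dummy plugin, has been defined elsewhere.

```
define service {
    name                      passive_service
    use                       generic-service
    active_checks_enabled     0                 ; Nagios does not poll this service itself
    passive_checks_enabled    1                 ; results are submitted by the host
    check_freshness           1                 ; react if results stop arriving
    freshness_threshold       1800              ; seconds, hypothetical value
    check_command             check_dummy!3     ; assumed command, run only when results are stale
    register                  0                 ; template only, not a real service
}
```

With a template along these lines, the monitored host reports its own disk and swap status, and Nagios only steps in when no result has arrived within the freshness threshold.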

Add Contacts

In the server-template-example.com template, I defined the contact_groups as example.com-admins. This is a group of contacts that needs to contain individual members. Below I will define a single administrator, and then define the contact group and add this administrator as a member. If you would like to add more members to the group you will need to define a contact for each individual member.

define contact {

    contact_name            example.com-admin       ; Short name of user 
    use                     generic-contact         ; Inherit default values from generic-contact template (defined above)
    alias                   admin-example.com       ; Full name of user 
    email                   admin@example.com       ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
}

define contactgroup {

    contactgroup_name       example.com-admins
    alias                   example.com Administrators
    members                 example.com-admin
}
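
To illustrate adding a further member, here is a sketch with a second, hypothetical administrator named example.com-admin2. The contact group definition below would replace the one above rather than sit alongside it, since a group should only be defined once.

```
define contact {
    contact_name            example.com-admin2      ; hypothetical second administrator
    use                     generic-contact
    alias                   admin2-example.com
    email                   admin2@example.com      ; hypothetical address
}

define contactgroup {
    contactgroup_name       example.com-admins
    alias                   example.com Administrators
    members                 example.com-admin,example.com-admin2
}
```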

This is a good time to create an htpasswd password for the administrator. Because the login name matches the contact name, the administrator will only have access to the hosts that have been configured with their contact details.

Use the htpasswd command for the user example.com-admin as follows:

htpasswd /usr/local/etc/nagios/.htpasswd.users example.com-admin
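
The command above assumes that the password file already exists. If it does not, htpasswd needs the -c flag to create it, but -c overwrites an existing file and would discard the other users. A small wrapper can choose between the two. This is a sketch, and the function name add_nagios_user is my own.

```shell
# Hypothetical helper: add or update a user in an htpasswd file,
# passing -c only when the file does not exist yet.
add_nagios_user() {
    file="$1"
    user="$2"
    if [ -f "$file" ]; then
        htpasswd "$file" "$user"
    else
        htpasswd -c "$file" "$user"
    fi
}
```

For this article it would be called as add_nagios_user /usr/local/etc/nagios/.htpasswd.users example.com-admin.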

Test Configuration & Restart

With all the configuration taken care of, I can check the files for any mistakes that may have been made before attempting to restart Nagios.

Use nagios -v nagios.cfg to verify everything is correct:

nagios -v nagios.cfg

Now that I see everything is in order, I restart Nagios.

service nagios restart
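
The verify and restart steps can be chained so that Nagios is only restarted when the configuration check passes. The following is a sketch of that idea. The function name verify_then_restart is my own, and it relies on nagios -v returning a non-zero exit status when the check fails.

```shell
# Hypothetical helper: restart Nagios only if the configuration verifies.
verify_then_restart() {
    cfg="$1"
    if nagios -v "$cfg"; then
        service nagios restart
    else
        echo "Configuration check failed; Nagios was not restarted." >&2
        return 1
    fi
}
```

It would be called as verify_then_restart /usr/local/etc/nagios/nagios.cfg, leaving the running instance untouched if a mistake slipped into one of the new files.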

Verify Nagios Web Interface

Logging into the Nagios Web Interface with the new user, the new Hostgroup is visible. Of note, this will also be visible to the nagiosadmin user.
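
The difference between the two users comes from cgi.cfg. With authentication enabled, a logged-in contact sees only the hosts and services they are a contact for, while users listed in the authorized_for_all_* directives see everything. The excerpt below is a sketch of the relevant lines based on the defaults in the sample cgi.cfg; I have not reproduced my own file here.

```
# cgi.cfg excerpt (values assumed from the sample configuration)
use_authentication=1
authorized_for_all_hosts=nagiosadmin
authorized_for_all_services=nagiosadmin
```

Since example.com-admin is not listed in these directives, that user is limited to the hosts in the new Hostgroup.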

Navigating to the Host Groups view, you will see the new Host Group listed:

[Screenshot: Nagios Hostgroup]

Wrapping Up

That’s all for today! I highly recommend Learning Nagios, Third Edition by Wojciech Kocjan if you are interested in learning the full workings of Nagios.