@demitri wrote:
Introduction
Let’s say your Nagios service monitors groups of services and hosts that are under the care of different teams of individuals. In a basic integration, all alerts for all services and hosts would go through the same service and thus the same escalation policy, alerting the same personnel. However, it is possible to route alerts for distinct hosts and services to different integrations and thus different services and escalation policies.
Furthermore, routing to multiple services offers another way to perform automatic alert triage: set the urgency of incidents to low on services designated for low-priority systems, while keeping it high on those that are more mission-critical.
Disclaimer
Nagios is highly configurable. How it is configured out-of-the-box when it is first installed may vary dramatically between implementations, for example depending on whether it is installed from source or through a software package built by the maintainers of a Linux distribution. With features like templating, groups and inheritance, the design of Nagios’ configuration language permits a virtual infinitude of different ways in which one can achieve any given goal in configuration.
That being said, the configuration examples given here are not the only way to achieve the goal of integrating Nagios with multiple PagerDuty services, and an actual working path to implementing this goal may vary based on your preexisting Nagios configuration. If you paste them into your configuration as-is and substitute your own object names, not only will they probably not work, but they might also break or otherwise disrupt your existing Nagios configuration.
You must understand what you are doing. If you feel comfortable with the configuration language of Nagios, you should be able to adapt these rudimentary examples to best fit your implementation.
Prerequisites
It is thus strongly advised to have some familiarity with the following topics:
- Main Configuration File Options
- Object Definitions
- Object Inheritance
- The preexisting configuration of your Nagios installation.
To learn more about your Nagios installation, you can start by finding out where your configuration resides, which is what you’ll be editing. Here’s one way to do it:
    root@ip-172-30-0-109:~# ps aux | grep nagios
    nagios 15006 0.0 0.4 42576 4844 ? SNsl 01:01 0:00 /usr/sbin/nagios3 -d /etc/nagios3/nagios.cfg
The -d option starts Nagios as a daemon; the path that follows it (here, /etc/nagios3/nagios.cfg) is the main configuration file.
Next, note any instances of cfg_file or cfg_dir in the main config file. These indicate secondary configuration files that are loaded, parsed and applied.
How It Works
First, note how the integration described in the PagerDuty integration guides works. Whether you use the agent-based or the agentless integration, the configuration templates and configuration changes shown in the guides set things up as follows:
- Two notification commands are defined, one for service notifications and one for host notifications. Each calls an external integration script (supplied by PagerDuty) with the arguments needed to submit alert data to PagerDuty.
- A contact is defined, whose pager property is the integration key, and which is configured to use the special notification commands for event delivery to PagerDuty.
- The contact is added to all hosts and services by way of adding it to a contact group.
- The contact group, to which the contact is added, is used on all hosts and all services by way of templates that specify that contact group for alerts. This structure applies to many out-of-box implementations of Nagios, particularly for Debian and RHEL-based installations, which is why it is advised.
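The last two points can be pictured with a rough sketch of the single-service wiring. The object names here (pagerduty, admins, root) and the abridged generic-service template are assumptions for illustration only; your packaged configuration will likely name and structure these objects differently, and the fragment is not meant to be pasted in.

```cfg
# Illustrative sketch only; all object names below are assumptions.

# The contact group that the stock templates notify.
define contactgroup {
    contactgroup_name   admins                  ; assumed default group name
    alias               Nagios Administrators
    members             root,pagerduty          ; "pagerduty" is the single PagerDuty pseudo-contact
}

# Abridged service template: every service that says "use generic-service"
# inherits this contact group, and therefore alerts the pagerduty contact.
define service {
    name                generic-service
    contact_groups      admins
    register            0                       ; template only, not a real service
}
```

Because every host and service reaches PagerDuty through the same contact, all alerts end up on the one PagerDuty service whose integration key that contact carries.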
The FAQ of the Nagios integration guides touches upon the fact that, by re-using the same notification commands on a different contact with a different pager property (set to a distinct integration key), that contact can be used for sending Nagios alerts to a different service.
Sample Configuration
First, let’s define two distinct contacts based on the original configuration template, which have distinct values for contact_name and pager.
    define contact {
        contact_name                    pagerduty_foo
        alias                           PagerDuty Pseudo-Contact (FOO)
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,r
        service_notification_commands   notify-service-by-pagerduty
        host_notification_commands      notify-host-by-pagerduty
        pager                           291a2412ebec493b9e4cd0f92aceb8eb
    }
    define contact {
        contact_name                    pagerduty_bar
        alias                           PagerDuty Pseudo-Contact (BAR)
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,r
        service_notification_commands   notify-service-by-pagerduty
        host_notification_commands      notify-host-by-pagerduty
        pager                           66ac7e430b924344b4dff67353171722
    }
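The two contacts differ only in contact_name, alias and pager. If you expect to add more PagerDuty services later, one possible way to avoid repeating the shared settings is a contact template, using the object inheritance mentioned in the prerequisites. This is a sketch, not part of the original integration guide, and the template name pagerduty-contact is hypothetical.

```cfg
# Hypothetical template holding the settings shared by every PagerDuty pseudo-contact.
define contact {
    name                            pagerduty-contact   ; template name (an assumption)
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,r
    service_notification_commands   notify-service-by-pagerduty
    host_notification_commands      notify-host-by-pagerduty
    register                        0                   ; template only, never notified itself
}

# Each real contact then only states what is unique to it.
define contact {
    use             pagerduty-contact
    contact_name    pagerduty_foo
    alias           PagerDuty Pseudo-Contact (FOO)
    pager           291a2412ebec493b9e4cd0f92aceb8eb
}

define contact {
    use             pagerduty-contact
    contact_name    pagerduty_bar
    alias           PagerDuty Pseudo-Contact (BAR)
    pager           66ac7e430b924344b4dff67353171722
}
```

Use either this form or the fully spelled-out contacts, not both, since Nagios rejects duplicate contact names.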
Next, let’s say we have a host foohost and want the PagerDuty service associated with pagerduty_bar to open an incident when the host is unreachable. Note: this example assumes a host template named generic-host, which defines some reasonable default values for various options. This template is defined in the pre-built Nagios software package for Debian / RHEL-based implementations. We add:
    define host {
        use         generic-host
        host_name   foohost
        alias       foohost
        address     192.168.7.6
        contacts    pagerduty_bar   ; Required for PagerDuty integration
    }
On this host, we are running HTTP and SSH services. To send alerts to pagerduty_foo's service when the HTTP service is inaccessible, and to pagerduty_bar's service when SSH is inaccessible (note also here the template generic-service):
    define service {
        host_name               foohost
        service_description     HTTP
        check_command           check_http
        use                     generic-service
        notification_interval   0
        contacts                pagerduty_foo   ; Required for PagerDuty integration
    }
    define service {
        host_name               foohost
        service_description     SSH
        check_command           check_ssh
        use                     generic-service
        notification_interval   2
        check_interval          2
        contacts                pagerduty_bar   ; Required for PagerDuty integration
    }
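Setting contacts on every individual host and service can become tedious once a team owns many objects. One alternative, sketched here under the assumption that each team gets its own contact group and service template, is to let inheritance do the routing. The names team_bar_pagerduty and team-bar-service are hypothetical.

```cfg
# Hypothetical per-team contact group wrapping the PagerDuty pseudo-contact.
define contactgroup {
    contactgroup_name   team_bar_pagerduty      ; name is an assumption
    alias               Team BAR via PagerDuty
    members             pagerduty_bar
}

# Hypothetical per-team service template layered on top of generic-service.
define service {
    name                team-bar-service        ; name is an assumption
    use                 generic-service
    contact_groups      team_bar_pagerduty      ; overrides any contact_groups inherited from generic-service
    register            0                       ; template only
}

# Services owned by the team only need to reference the template.
define service {
    use                     team-bar-service
    host_name               foohost
    service_description     SSH
    check_command           check_ssh
}
```

Whichever approach you adapt, it is worth verifying the result before reloading, by running the Nagios binary with the -v option followed by the path to the main configuration file.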