Nagios 4.3.4 Monitoring
Database monitoring is key to understanding how a database performs over time. It can help you uncover hidden usage problems and bottlenecks happening in your database. Implementing database monitoring systems can quickly turn out to be a long-term advantage, which will positively influence your infrastructure management process. You’ll be able to swiftly react to status changes of your database and will quickly be notified when monitored services return to normal functioning.
Nagios Core is a popular monitoring system that you can use to monitor your managed database. The benefits of using Nagios for this task are its versatility (it is easy to configure and use), a large repository of available plugins, and, most importantly, integrated alerting.
In this tutorial, you will set up PostgreSQL database monitoring in Nagios Core using the check_postgres Nagios plugin and set up Slack-based alerting. In the end, you’ll have a monitoring system in place for your managed PostgreSQL database, and will be notified of status changes of various functionality immediately.
An Ubuntu 18.04 server with root privileges, and a secondary, non-root account. You can set this up by following this initial server setup guide. For this tutorial the non-root user is sammy.
Nagios Core installed on your server. To achieve this, complete the first five steps of the How To Install Nagios 4 and Monitor Your Servers on Ubuntu 18.04 tutorial.
A DigitalOcean account and a PostgreSQL managed database provisioned from DigitalOcean with connection information available. Make sure that your server’s IP address is on the whitelist. To learn more about DigitalOcean Managed Databases, visit the product docs.
A Slack account with full access, added to a workspace where you’ll want to receive status updates.
In this section, you’ll download the latest version of the check_postgres plugin from GitHub and make it available to Nagios Core. You’ll also install the PostgreSQL client (psql), so that check_postgres will be able to connect to your managed database.
Start off by installing the PostgreSQL client by running the following command:
Next, you’ll download check_postgres to your home directory. First, navigate to it:
Head over to the GitHub releases page and copy the link of the latest version of the plugin. At the time of writing, the latest version of check_postgres was 2.24.0; newer releases will appear over time, and where possible it’s best practice to use the latest version.
Now download it using curl:
curl -LO https://github.com/bucardo/check_postgres/releases/download/2.24.0/check_postgres-2.24.0.tar.gz
Extract it using the following command:
This will create a directory named after the archive you downloaded, without the .tar.gz extension. That folder contains the check_postgres executable, which you’ll need to copy to the directory where Nagios stores its plugins (usually /usr/local/nagios/libexec/). Copy it by running the following command:
Next, you’ll need to give the nagios user ownership of it, so that it can be run from Nagios:
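The shell steps described so far can be gathered into one helper function. The sketch below is a summary under stated assumptions rather than the exact commands of this tutorial: it assumes the `postgresql-client` package name on Ubuntu, that the archive unpacks to `check_postgres-2.24.0/` and contains a script named `check_postgres.pl`, and that the Nagios user and group are both called `nagios`. The function name `install_check_postgres` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper; nothing runs until you call install_check_postgres.
install_check_postgres() {
    version="2.24.0"                          # release named in the text
    plugin_dir="/usr/local/nagios/libexec"    # usual Nagios plugin directory

    # PostgreSQL client (psql), needed so check_postgres can connect.
    sudo apt install -y postgresql-client || return 1

    cd "$HOME" || return 1
    curl -LO "https://github.com/bucardo/check_postgres/releases/download/${version}/check_postgres-${version}.tar.gz" || return 1
    tar xzf "check_postgres-${version}.tar.gz" || return 1

    # Assumed script name inside the archive: check_postgres.pl
    sudo cp "check_postgres-${version}/check_postgres.pl" "$plugin_dir/" || return 1

    # Assumed user and group: nagios
    sudo chown nagios:nagios "$plugin_dir/check_postgres.pl"
}
```

Defining the function has no side effects; run `install_check_postgres` only after checking the version and paths against your own server.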
check_postgres is now available to Nagios and can be used from it. However, it provides a lot of commands pertaining to different aspects of PostgreSQL, and for better service maintainability, it’s better to break them up so that they can be called separately. You’ll achieve this by creating a symlink to every check_postgres command in the plugin directory.
Navigate to the directory where Nagios stores plugins by running the following command:
Then, create the symlinks with:
The output will look like this:
Output
Created "check_postgres_archive_ready"
Created "check_postgres_autovac_freeze"
Created "check_postgres_backends"
Created "check_postgres_bloat"
Created "check_postgres_checkpoint"
Created "check_postgres_cluster_id"
Created "check_postgres_commitratio"
Created "check_postgres_connection"
Created "check_postgres_custom_query"
Created "check_postgres_database_size"
Created "check_postgres_dbstats"
Created "check_postgres_disabled_triggers"
Created "check_postgres_disk_space"
Created "check_postgres_fsm_pages"
Created "check_postgres_fsm_relations"
Created "check_postgres_hitratio"
Created "check_postgres_hot_standby_delay"
Created "check_postgres_index_size"
Created "check_postgres_indexes_size"
Created "check_postgres_last_analyze"
Created "check_postgres_last_autoanalyze"
Created "check_postgres_last_autovacuum"
Created "check_postgres_last_vacuum"
Created "check_postgres_listener"
Created "check_postgres_locks"
Created "check_postgres_logfile"
Created "check_postgres_new_version_bc"
Created "check_postgres_new_version_box"
Created "check_postgres_new_version_cp"
Created "check_postgres_new_version_pg"
Created "check_postgres_new_version_tnm"
Created "check_postgres_pgagent_jobs"
Created "check_postgres_pgb_pool_cl_active"
Created "check_postgres_pgb_pool_cl_waiting"
Created "check_postgres_pgb_pool_maxwait"
Created "check_postgres_pgb_pool_sv_active"
Created "check_postgres_pgb_pool_sv_idle"
Created "check_postgres_pgb_pool_sv_login"
Created "check_postgres_pgb_pool_sv_tested"
Created "check_postgres_pgb_pool_sv_used"
Created "check_postgres_pgbouncer_backends"
Created "check_postgres_pgbouncer_checksum"
Created "check_postgres_prepared_txns"
Created "check_postgres_query_runtime"
Created "check_postgres_query_time"
Created "check_postgres_relation_size"
Created "check_postgres_replicate_row"
Created "check_postgres_replication_slots"
Created "check_postgres_same_schema"
Created "check_postgres_sequence"
Created "check_postgres_settings_checksum"
Created "check_postgres_slony_status"
Created "check_postgres_table_size"
Created "check_postgres_timesync"
Created "check_postgres_total_relation_size"
Created "check_postgres_txn_idle"
Created "check_postgres_txn_time"
Created "check_postgres_txn_wraparound"
Created "check_postgres_version"
Created "check_postgres_wal_files"
Perl listed all the functions it created a symlink for. These can now be executed from the command line as usual.
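One symlink per action is enough because a script can inspect the name it was invoked under, which is how links of the form check_postgres_<action> are able to select an action. The sketch below imitates that idea with a made-up stand-in script, `check_demo`, in a temporary directory. It is not check_postgres itself, only an illustration of name-based dispatch.

```shell
#!/bin/sh
# Illustration only: check_demo is a made-up stand-in, not the real plugin.
tmp=$(mktemp -d)

cat > "$tmp/check_demo" <<'EOF'
#!/bin/sh
# Derive the action from the name this script was called as.
name=$(basename "$0")
action=${name#check_demo_}
echo "action=$action"
EOF
chmod +x "$tmp/check_demo"

# One symlink per action, all pointing at the same script.
ln -s check_demo "$tmp/check_demo_connection"
ln -s check_demo "$tmp/check_demo_locks"

# Invoked through sh so the demo also works where the temp dir is noexec.
via_connection=$(sh "$tmp/check_demo_connection")
via_locks=$(sh "$tmp/check_demo_locks")
echo "$via_connection"
echo "$via_locks"

rm -rf "$tmp"
```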
You’ve downloaded and installed the check_postgres plugin. You have also created symlinks to all the commands of the plugin, so that they can be used individually from Nagios. In the next step, you’ll create a connection service file, which check_postgres will use to connect to your managed database.
In this section, you will create a PostgreSQL connection service file containing the connection information of your database. Then, you will test the connection data by invoking check_postgres on it.
The connection service file is by convention called pg_service.conf, and must be located under /etc/postgresql-common/. Create it for editing with your favorite editor (for example, nano):
Add the following lines, replacing the placeholder values for host, port, user, and password with the actual values shown in your Managed Database Control Panel under the section Connection Details:
[managed-db]
host=host
port=port
user=username
password=password
dbname=defaultdb
sslmode=require
The connection service file can house multiple database connection info groups. The beginning of a group is signaled by putting its name in square brackets. After that come the connection parameters (host, port, user, password, and so on), one per line, each of which must be given a value.
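To illustrate the group layout, the following sketch writes a service file with two groups to a temporary path and lists the group names, which are the values you can pass to --dbservice. Every host, port, user, and the second group name `staging-db` are placeholders invented for the example, and the password lines are left out on purpose.

```shell
#!/bin/sh
# Sketch: a service file with two groups, written to a temporary path.
# All hosts, ports, users, and the staging-db group are placeholders.
service_file=$(mktemp)

cat > "$service_file" <<'EOF'
[managed-db]
host=db.example.com
port=25060
user=doadmin
dbname=defaultdb
sslmode=require

[staging-db]
host=staging.example.com
port=5432
user=monitor
dbname=app
EOF

# Extract the names between square brackets, one per line.
service_names=$(sed -n 's/^\[\(.*\)\]$/\1/p' "$service_file")
echo "$service_names"

rm -f "$service_file"
```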
Save and close the file when you are finished.
You’ll now test the validity of the configuration by connecting to the database via check_postgres with the following command:
Here, you tell check_postgres which database connection info group to use with the parameter --dbservice, and also specify that it should only try to connect to it by specifying connection as the action.
Your output will look similar to this:
Output
POSTGRES_CONNECTION OK: service=managed-db version 11.4 | time=0.10s
This means that check_postgres succeeded in connecting to the database, according to the parameters from pg_service.conf. If you get an error, double check what you have just entered in that config file.
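Besides the text line, Nagios plugins report their result through the process exit status, following the standard plugin convention of 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN. The helper below, with the hypothetical name `nagios_state`, is a small sketch for translating a status code when you run a check by hand. The commented invocation assumes the script is called `check_postgres.pl` and sits in the current directory.

```shell
#!/bin/sh
# Map a plugin exit status to the Nagios state name.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "INVALID" ;;
    esac
}

# Example by hand (assumed file name, not run here):
#   ./check_postgres.pl --dbservice=managed-db --action=connection
#   nagios_state "$?"
state_ok=$(nagios_state 0)
state_crit=$(nagios_state 2)
echo "$state_ok $state_crit"
```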
You’ve created and filled out a PostgreSQL connection service file, which works as a connection string. You have also tested the connection data by running check_postgres on it and observing the output. In the next step, you will configure Nagios to monitor various parts of your database.
Now you will configure Nagios to watch over various metrics of your database by defining a host and multiple services, which will call the check_postgres plugin and its symlinks.
Nagios stores your custom configuration files under /usr/local/nagios/etc/objects. New files you add there must be manually enabled in the central Nagios config file, located at /usr/local/nagios/etc/nagios.cfg. You’ll now define commands, a host, and multiple services, which you’ll use to monitor your managed database in Nagios.
First, create a folder under /usr/local/nagios/etc/objects to store your PostgreSQL related configuration by running the following command:
You’ll store Nagios commands for check_postgres in a file named commands.cfg. Create it for editing:
Add the following lines:
define command {
command_name check_postgres_connection
command_line /usr/local/nagios/libexec/check_postgres_connection --dbservice=$ARG1$
}
define command {
command_name check_postgres_database_size
command_line /usr/local/nagios/libexec/check_postgres_database_size --dbservice=$ARG1$ --critical='$ARG2$'
}
define command {
command_name check_postgres_locks
command_line /usr/local/nagios/libexec/check_postgres_locks --dbservice=$ARG1$
}
define command {
command_name check_postgres_backends
command_line /usr/local/nagios/libexec/check_postgres_backends --dbservice=$ARG1$
}
Save and close the file.
In this file, you define four Nagios commands that call different parts of the check_postgres plugin (checking connectivity, getting the number of locks and connections, and the size of the whole database). They all accept an argument that is passed to the --dbservice parameter and specifies which of the databases defined in pg_service.conf to connect to.
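In Nagios object definitions, arguments are appended to the command name with `!` separators and substituted for `$ARG1$`, `$ARG2$`, and so on in the command line. The sketch below mimics that substitution in plain shell to show what Nagios ends up executing. The `9 GB` threshold is an arbitrary example value, and the splitting logic is an illustration, not Nagios's own code.

```shell
#!/bin/sh
# Sketch of how a check_command value maps onto $ARG1$ and $ARG2$.
# The '9 GB' threshold is an arbitrary example value.
check_command='check_postgres_database_size!managed-db!9 GB'

# Split on '!': the first field is the command name, the rest are arguments.
IFS='!' read -r command_name arg1 arg2 <<EOF
$check_command
EOF

expanded="/usr/local/nagios/libexec/${command_name} --dbservice=${arg1} --critical='${arg2}'"
echo "$expanded"
```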
The check_postgres_database_size command accepts a second argument that gets passed to the --critical parameter, which specifies the point at which the database storage is becoming full. Accepted values include 1 KB for a kilobyte, 1 MB for a megabyte, and so on, up to exabytes (EB). A number without a capacity unit is treated as being expressed in bytes.
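As an illustration of choosing a --critical value, the following sketch computes a threshold at 90 percent of the allocated storage and formats it with a unit. The 25 GB figure is hypothetical; substitute the size of your own database cluster.

```shell
#!/bin/sh
# Hypothetical figure: 25 GB of allocated database storage.
allocated_gb=25
percent=90

# Integer arithmetic in megabytes avoids fractional gigabytes.
critical_mb=$(( allocated_gb * 1024 * percent / 100 ))
critical_value="${critical_mb} MB"
echo "$critical_value"
```

Working in megabytes keeps the result a whole number, since shell arithmetic has no fractions.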
Now that the necessary commands are defined, you’ll define the host (essentially, the database) and its monitoring services in a file named services.cfg. Create it using your favorite editor:
Add the following lines, replacing db_max_storage_size with a value pertaining to the available storage of your database. It is recommended to set it to 90 percent of the storage size you have allocated to it:
define host {
use linux-server
host_name postgres
check_command check_postgres_connection!managed-db
}
define service {
use generic-service
host_name postgres
service_description PostgreSQL Connection
check_command check_postgres_connection!managed-db
notification_options w,u,c,r,f,s
}
define service {
use generic-service
host_name postgres
service_description PostgreSQL Database Size
check_command check_postgres_database_size!managed-db!db_max_storage_size
notification_options w,u,c,r,f,s
}
define service {
use generic-service
host_name postgres
service_description PostgreSQL Locks
check_command check_postgres_locks!managed-db
notification_options w,u,c,r,f,s
}
define service {
use generic-service
host_name postgres
service_description PostgreSQL Backends
check_command check_postgres_backends!managed-db
notification_options w,u,c,r,f,s
}
You first define a host, so that Nagios will know what entity the services relate to. Then, you create four services, which call the commands you just defined. Each one passes managed-db as the argument, detailing that the managed-db you defined in Step 2 should be monitored.
Regarding notification options, each service specifies that notifications should be sent out when the service state becomes WARNING, UNKNOWN, CRITICAL, OK (when it recovers from downtime), when the service starts flapping, or when scheduled downtime starts or ends. Without explicitly giving this option a value, no notifications would be sent out (to available contacts) at all, except if triggered manually.
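The single-letter codes in a service's notification_options can be hard to remember. The sketch below spells out each letter of the `w,u,c,r,f,s` list described above. The function name `describe_service_option` is made up for the example.

```shell
#!/bin/sh
# Translate one service notification option letter into plain words.
describe_service_option() {
    case "$1" in
        w) echo "WARNING state" ;;
        u) echo "UNKNOWN state" ;;
        c) echo "CRITICAL state" ;;
        r) echo "recovery (back to OK)" ;;
        f) echo "flapping start/stop" ;;
        s) echo "scheduled downtime start/end" ;;
        *) echo "unrecognised option" ;;
    esac
}

options="w,u,c,r,f,s"

# Print one line per letter in the comma-separated list.
old_ifs=$IFS
IFS=','
for letter in $options; do
    printf '%s = %s\n' "$letter" "$(describe_service_option "$letter")"
done
IFS=$old_ifs
```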
Save and close the file.
Next, you’ll need to explicitly tell Nagios to read config files from this new directory, by editing the general Nagios config file. Open it for editing by running the following command:
Find the cfg_dir=/usr/local/nagios/etc/servers line in the file:
...
# directive as shown below:
cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
...
Above it, add the cfg_dir line for the postgresql directory:
...
cfg_dir=/usr/local/nagios/etc/objects/postgresql
cfg_dir=/usr/local/nagios/etc/servers
...
Save and close the file. This line tells Nagios to load all config files from the /usr/local/nagios/etc/objects/postgresql directory, where your configuration files are located.
Before restarting Nagios, check the validity of the configuration by running the following command:
The end of the output will look similar to this:
Output
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
This means that Nagios found no errors in the configuration. If it shows you an error, you’ll also see a hint as to what went wrong, so you’ll be able to fix the error more easily.
To make Nagios reload its configuration, restart its service by running the following command:
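A cautious way to combine the configuration check and the restart is to restart only when the check passes. The helper below is a sketch under assumptions: it assumes the source-install binary path `/usr/local/nagios/bin/nagios`, a systemd unit named `nagios`, and uses the hypothetical function name `reload_nagios`.

```shell
#!/bin/sh
# Hypothetical helper; assumes the source-install paths used in this tutorial
# and a systemd unit named "nagios". Nothing runs until you call it.
reload_nagios() {
    # -v runs the pre-flight check without starting the daemon.
    if sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg; then
        sudo systemctl restart nagios
    else
        echo "Configuration check failed; Nagios was not restarted." >&2
        return 1
    fi
}
```

Call `reload_nagios` after each configuration change so that a broken file never takes monitoring down.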
You can now navigate to Nagios in your browser. Once it loads, press on the Services option from the left-hand menu. You’ll see the postgres host and a list of services, along with their current statuses.
Once the first checks have run, they should all turn green and show an OK status. You’ll see the command output under the Status Information column. You can click on the service name and see detailed information about its status and availability.
You’ve added check_postgres commands, a host, and multiple services to your Nagios installation to monitor your database. You’ve also checked that the services are working properly by examining them via the Nagios web interface. In the next step, you will configure Slack-based alerting.
In this section, you will configure Nagios to alert you about events via Slack, by posting them into desired channels in your workspace.
Before you start, log in to your desired workspace on Slack and create two channels where you’ll want to receive status messages from Nagios: one for host, and the other one for service notifications. If you wish, you can create only one channel where you’ll receive both kinds of alerts.
Then, head over to the Nagios app in the Slack App Directory and press on Add Configuration. You’ll see a page for adding the Nagios Integration.
Press on Add Nagios Integration. When the page loads, scroll down and take note of the token, because you’ll need it further on.
You’ll now install and configure the Slack plugin (written in Perl) for Nagios on your server. First, install the required Perl prerequisites by running the following command:
Then, download the plugin to your Nagios plugin directory:
Make it executable by running the following command:
Now, you’ll need to edit it to connect to your workspace using the token you got from Slack. Open it for editing:
Find the following lines in the file:
...
my $opt_domain = "foo.slack.com"; # Your team's domain
my $opt_token = "your_token"; # The token from your Nagios services page
...
Replace foo.slack.com with your workspace domain and your_token with your Nagios app integration token, then save and close the file. The script will now be able to send proper requests to Slack, which you’ll now test by running the following command:
./slack.pl -field slack_channel=#your_channel_name -field HOSTALIAS="Test Host" -field HOSTSTATE="UP" -field HOSTOUTPUT="Host is UP" -field NOTIFICATIONTYPE="RECOVERY"
Replace your_channel_name with the name of the channel where you’ll want to receive status alerts. The script will output information about the HTTP request it made to Slack, and if everything went through correctly, the last line of the output will be ok. If you get an error, double check if the Slack channel you specified exists in the workspace.
You can now head over to your Slack workspace and select the channel you specified. You’ll see a test message coming from Nagios.
This confirms that you have properly configured the Slack script. You’ll now move on to configuring Nagios to alert you via Slack using this script.
You’ll need to create a contact for Slack and two commands that will send messages to it. You’ll store this config in a file named slack.cfg, in the same folder as the previous config files. Create it for editing by running the following command:
Add the following lines:
define contact {
contact_name slack
alias Slack
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,f,s,r
host_notification_options d,u,r,f,s
service_notification_commands notify-service-by-slack
host_notification_commands notify-host-by-slack
}
define command {
command_name notify-service-by-slack
command_line /usr/local/nagios/libexec/slack.pl -field slack_channel=#service_alerts_channel
}
define command {
command_name notify-host-by-slack
command_line /usr/local/nagios/libexec/slack.pl -field slack_channel=#host_alerts_channel
}
Here you define a contact named slack, state that it can be contacted anytime and specify which commands to use for notifying service and host related events. Those two commands are defined after it and call the script you have just configured. You’ll need to replace service_alerts_channel and host_alerts_channel with the names of the channels where you want to receive service and host messages, respectively. If preferred, you can use the same channel names.
Similarly to the service creation in the last step, setting service and host notification options on the contact is crucial, because it governs what kind of alerts the contact will receive. Omitting those options would result in sending out notifications only when manually triggered from the web interface.
When you are done with editing, save and close the file.
To enable alerting via the slack contact you just defined, you’ll need to add it to the admins contact group, defined in the contacts.cfg config file, located under /usr/local/nagios/etc/objects/. Open it for editing by running the following command:
Find the config block that looks like this:
define contactgroup {
contactgroup_name admins
alias Nagios Administrators
members nagiosadmin
}
Add slack to the list of members, like so:
define contactgroup {
contactgroup_name admins
alias Nagios Administrators
members nagiosadmin,slack
}
Save and close the file.
By default when running scripts, Nagios does not make host and service information available via environment variables, which is what the Slack script requires in order to send meaningful messages. To remedy this, you’ll need to set the enable_environment_macros setting in nagios.cfg to 1. Open it for editing by running the following command:
Find the line that looks like this:
enable_environment_macros=0
Change the value to 1, like so:
enable_environment_macros=1
Save and close the file.
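The same change can also be scripted instead of made in an editor. The sketch below demonstrates the substitution on a temporary file holding two sample settings. It does not touch your real configuration; pointing it at /usr/local/nagios/etc/nagios.cfg (with sudo) is left to you once you are satisfied with the result.

```shell
#!/bin/sh
# Demonstration on a temporary file with sample content, not the real nagios.cfg.
cfg=$(mktemp)
printf '%s\n' 'use_syslog=1' 'enable_environment_macros=0' > "$cfg"

# Rewrite the setting, keeping every other line untouched.
sed 's/^enable_environment_macros=.*/enable_environment_macros=1/' "$cfg" > "$cfg.new"
mv "$cfg.new" "$cfg"

setting=$(grep '^enable_environment_macros=' "$cfg")
other=$(grep '^use_syslog=' "$cfg")
echo "$setting"

rm -f "$cfg"
```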
Test the validity of the Nagios configuration by running the following command:
The end of the output will look like:
Output
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
Proceed to restart Nagios by running the following command:
To test the Slack integration, you’ll send out a custom notification via the web interface. Reload the Nagios Services status page in your browser. Press on the PostgreSQL Backends service and press on Send custom service notification on the right when the page loads.
Type in a comment of your choice and press on Commit, and then press on Done. You’ll immediately receive a new message in Slack.
You have now integrated Slack with Nagios, so you’ll receive messages about critical events and status changes immediately. You’ve also tested the integration by manually triggering an event from within Nagios.
You now have Nagios Core configured to watch over your managed PostgreSQL database and report any status changes and events to Slack, so you’ll always be in the loop about what is happening to your database. This will allow you to swiftly react in case of an emergency, because you’ll be getting the status feed in real time.
If you’d like to learn more about the features of check_postgres, check out its docs, where you’ll find a lot more commands that you can possibly use.
For more information about what you can do with your PostgreSQL Managed Database, visit the product docs.