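I. Installing the Nagios server
1. Installing Nagios
1. Create the nagios user
/usr/sbin/useradd nagios
2. Download Nagios 4.0.4
http://jaist.dl.sourceforge.net/project/nagios/nagios-4.x/nagios-4.0.4/nagios-4.0.4.tar.gz
3. Extract the archive
tar -zxf nagios-4.0.4.tar.gz
4. Change into the nagios-4.0.4 directory and run the following commands in order.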
./configure
make all #compile the main program and CGIs
make install #this installs the main program, CGIs, and HTML files
make install-init #this installs the init script in /etc/rc.d/init.d
make install-commandmode #this installs and configures permissions on the directory for holding the external command file
make install-config #this installs SAMPLE config files in /usr/local/nagios/etc/; you'll have to modify these sample files before you can use Nagios
The sample configuration files are installed in /usr/local/nagios/etc by default. They are enough to get Nagios running; only one small change is needed.
Open /usr/local/nagios/etc/objects/contacts.cfg in your preferred editor and change the email address in the nagiosadmin contact definition to your own address, so that you receive the alerts.
vi /usr/local/nagios/etc/objects/contacts.cfg
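# Configure the web interface
make install-webconf # this installs the Apache config file for the Nagios web interface
Run the following command to create a nagiosadmin user for logging in to the Nagios web interface.
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Apache so that the change takes effect.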
service httpd restart
make install-exfoliation # this installs the exfoliation theme for the Nagios web interface
make install-classicui # this installs the classic theme for the Nagios web interface
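The configure and make targets above have to run in a fixed order, and a failure in one step should stop the rest. A minimal shell sketch of that sequence follows. The function name build_nagios and its optional dry-run argument are illustrative assumptions, not part of Nagios, and the two theme targets are left out for brevity.

```shell
#!/bin/bash
# Targets in the order the text lists them. Assumption: the function is
# called from inside the extracted nagios-4.0.4 source directory.
NAGIOS_TARGETS="all install install-init install-commandmode install-config install-webconf"

# build_nagios [runner]
#   runner is a hypothetical hook: pass "echo" to print the commands
#   instead of running them (dry run); leave it empty to really build.
build_nagios() {
    local runner="${1:-}" target
    $runner ./configure || return 1
    for target in $NAGIOS_TARGETS; do
        # Stop at the first target that fails.
        $runner make "$target" || return 1
    done
}
```

Calling build_nagios echo prints the seven commands without touching the system; build_nagios with no argument runs them and stops at the first failure.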
2. Installing the Nagios plugins
Compile and install the plugins.
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install
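3. Installing the NRPE plugin
The installation itself is described in detail in the Linux host section below.
Once NRPE is installed, it needs to be configured:
add the NRPE command to the command definition file objects/commands.cfg.
[root@KCentOS5C ~]# vi /usr/local/nagios/etc/objects/commands.cfg
-------------------------------------------------------
# NRPE Command
# Add the NRPE check command.
define command{
    command_name check_nrpe
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}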
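To see what the command_line above turns into at check time, the macro substitution can be imitated in a few lines of shell. This is only a sketch: Nagios does the expansion internally, the function name expand_nrpe_command is made up for illustration, and the value of $USER1$ is assumed to be the plugin directory of a source install under /usr/local/nagios. $HOSTADDRESS$ is the address of the host being checked, and $ARG1$ is the first argument after the "!" in a service's check_command.

```shell
#!/bin/bash
# Assumption: $USER1$ points at the plugin directory, as in the sample
# resource.cfg of a source install under /usr/local/nagios.
USER1="/usr/local/nagios/libexec"

# expand_nrpe_command <host address> <remote command name>
# Prints the command line that results from substituting the macros in
#   $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
expand_nrpe_command() {
    local hostaddress="$1" arg1="$2"
    printf '%s/check_nrpe -H %s -c %s\n' "$USER1" "$hostaddress" "$arg1"
}
```

For example, expand_nrpe_command 192.168.1.9 check_users prints /usr/local/nagios/libexec/check_nrpe -H 192.168.1.9 -c check_users, which is the command the server would run against a monitored host at that address.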
II. Running Nagios
Add Nagios to the service list so that it starts automatically at boot.
chkconfig --add nagios
chkconfig nagios on
Verify the Nagios configuration files; if no errors are reported, Nagios can be started.
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
1. Init script: the simplest way to start the Nagios daemon is with the init script.
/etc/rc.d/init.d/nagios start
2. Manually: start the Nagios daemon by hand with the -d option.
/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
3. Using service
service nagios start
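Verification and startup are easy to combine so that a broken configuration never reaches the running daemon. The sketch below assumes the paths used in this text; the function names verify_and_start and start_nagios are illustrative, and start_nagios can be swapped for either of the other two startup methods.

```shell
#!/bin/bash
# Paths as used in this text; both can be overridden from the environment.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"

# How the daemon is started; any of the three methods above would do.
start_nagios() {
    service nagios start
}

# Start Nagios only when the configuration check (-v) succeeds.
verify_and_start() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        start_nagios
    else
        echo "configuration check failed; Nagios not started" >&2
        return 1
    fi
}
```

When the check fails, running the -v command by hand shows the full error report.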
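III. Adding monitored hosts
1. Windows hosts
Install NSClient++.
During installation, only add the Nagios server address; leave the password blank.
Accept the defaults for every other step.
2. Open windows.cfg for editing.
vi /usr/local/nagios/etc/objects/windows.cfg (this file is configured on the Nagios server host)
It is enough to modify the object definitions in windows.cfg: change the host_name, alias, and address fields to match the Windows machine being added.
Host definition
define host{
    use windows-server ; Inherit default values from a Windows server template (make sure you keep this line!)
    host_name winserver
    alias My Windows Server
    address 192.168.1.2 ; address of the Windows host
}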
Service definitions
define service{
    use generic-service
    host_name winserver
    service_description NSClient++ Version
    check_command check_nt!CLIENTVERSION
}
# Create a service for monitoring the uptime of the server
# Change the host_name to match the name of the host you defined above
define service{
    use generic-service
    host_name winserver
    service_description Uptime
    check_command check_nt!UPTIME
}
# Create a service for monitoring CPU load
# Change the host_name to match the name of the host you defined above
define service{
    use generic-service
    host_name winserver
    service_description CPU Load
    check_command check_nt!CPULOAD!-l 5,80,90
}
# Create a service for monitoring memory usage
# Change the host_name to match the name of the host you defined above
define service{
    use generic-service
    host_name winserver
    service_description Memory Usage
    check_command check_nt!MEMUSE!-w 80 -c 90
}
# Create a service for monitoring C:\ disk usage
# Change the host_name to match the name of the host you defined above
define service{
    use generic-service
    host_name winserver
    service_description C:\ Drive Space
    check_command check_nt!USEDDISKSPACE!-l c -w 80 -c 90
}
# Create a service for monitoring the W3SVC service
# Change the host_name to match the name of the host you defined above
define service{
    use generic-service
    host_name winserver
    service_description W3SVC
    check_command check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
}
# Create a service for monitoring the Explorer.exe process
# Change the host_name to match the name of the host you defined above
define service{
    use generic-service
    host_name winserver
    service_description Explorer
    check_command check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
}
After making the changes, add windows.cfg to nagios.cfg:
cfg_file=/usr/local/nagios/etc/objects/windows.cfg
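The service definitions in windows.cfg all share one shape and differ only in the description and the check_nt arguments, so further ones can be generated rather than typed. A small sketch, where the function name nt_service is hypothetical and the output format simply mirrors the definitions above:

```shell
#!/bin/bash
# nt_service <host_name> <service_description> <check_nt arguments>
# Prints one service definition in the same shape as those in windows.cfg.
# The host_name must match a host that is defined in the configuration.
nt_service() {
    printf 'define service{\n'
    printf '    use                  generic-service\n'
    printf '    host_name            %s\n' "$1"
    printf '    service_description  %s\n' "$2"
    printf '    check_command        check_nt!%s\n' "$3"
    printf '}\n'
}
```

For instance, nt_service winserver 'CPU Load' 'CPULOAD!-l 5,80,90' prints a CPU load service for the host winserver; the output can be appended to windows.cfg and checked with nagios -v before restarting.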
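2. Linux hosts
Install the Nagios plugins and NRPE on the monitored host.
1. Add the nagios user
[root@KCentOS5A ~]# useradd nagios
2. Extract the Nagios-plugins archive
[root@KCentOS5A ~]# tar -zxvf nagios-plugins-1.4.10.tar.gz
4. Change into the Nagios-plugins directory
[root@KCentOS5A ~]# cd nagios-plugins-1.4.10
5. Configure the Nagios-plugins installation path
[root@KCentOS5A nagios-plugins-1.4.10]# ./configure --prefix=/usr/local/nagios
6. When configuration finishes, it prints a summary and generates the Makefile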
-------------------------------------------------------
config.status: creating po/Makefile
--with-apt-get-command:
--with-ping6-command: /bin/ping6 -n -U -w %d -c %d %s
--with-ping-command: /bin/ping -n -U -w %d -c %d %s
--with-ipv6: yes
--with-mysql: no
--with-openssl: yes
--with-gnutls: no
--with-perl: /usr/bin/perl
--enable-perl-modules: no
--with-cgiurl: /nagios/cgi-bin
--with-trusted-path: /bin:/sbin:/usr/bin:/usr/sbin
-------------------------------------------------------
7. Compile the Nagios-plugins programs
[root@KCentOS5A nagios-plugins-1.4.10]# make
8. Install the Nagios-plugins programs
[root@KCentOS5A nagios-plugins-1.4.10]# make install
9. Check the Nagios-plugins installation
[root@KCentOS5A nagios-plugins-1.4.10]# ll /usr/local/nagios/
total 8
drwxr-xr-x 2 root root 4096 Oct 7 01:02 libexec
drwxr-xr-x 3 root root 4096 Oct 7 01:02 share
10. Recursively change the owner of the Nagios base directory
[root@KCentOS5A ~]# chown -R nagios:nagios /usr/local/nagios/
11. Check the ownership of the Nagios base directory
[root@KCentOS5A ~]# ll /usr/local/ | grep nagios
drwxr-xr-x 4 nagios nagios 4096 Oct 7 01:02 nagios
[root@KCentOS5A ~]# ll /usr/local/nagios/
total 8
drwxr-xr-x 2 nagios nagios 4096 Oct 7 01:02 libexec
drwxr-xr-x 3 nagios nagios 4096 Oct 7 01:02 share
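Next, install the NRPE add-on on the monitored host:
1. Extract the NRPE archive
[root@KCentOS5A ~]# tar -zxvf nrpe-2.9.tar.gz
2. Change into the NRPE directory
[root@KCentOS5A ~]# cd nrpe-2.9
3. Configure the NRPE build
[root@KCentOS5A nrpe-2.9]# ./configure
4. When configuration finishes, it prints a summary and creates the Makefile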
-------------------------------------------------------
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/Makefile
config.status: creating subst
config.status: creating include/config.h
*** Configuration summary for nrpe 2.9 08-13-2007 ***:
General Options:
-------------------------
NRPE port: 5666
NRPE user: nagios
NRPE group: nagios
Nagios user: nagios
Nagios group: nagios
Review the options above for accuracy. If they look okay,
type 'make all' to compile the NRPE daemon and client.
The final message says to continue with "make all" to compile the NRPE daemon and the client program.
-------------------------------------------------------
5. Compile NRPE
[root@KCentOS5A nrpe-2.9]# make all
6. A successful build prints the following message
-------------------------------------------------------
*** Compile finished ***
If the NRPE daemon and client compiled without any errors, you
can continue with the installation or upgrade process.
Read the PDF documentation (NRPE.pdf) for information on the next
steps you should take to complete the installation or upgrade.
The NRPE source directory contains the NRPE.pdf manual, which can be followed for the remaining steps.
-------------------------------------------------------
Install the NRPE plugin (for testing), the daemon, and the sample daemon config file.
7. Install the NRPE plugin
[root@KCentOS5A nrpe-2.9]# make install-plugin
8. Install the NRPE daemon
[root@KCentOS5A nrpe-2.9]# make install-daemon
9. Install the NRPE daemon configuration file
[root@KCentOS5A nrpe-2.9]# make install-daemon-config
10. Check the NRPE installation
[root@KCentOS5A nrpe-2.9]# ll /usr/local/nagios/
-------------------------------------------------------
total 16
drwxrwxr-x 2 nagios nagios 4096 Oct 7 01:16 bin
drwxrwxr-x 2 nagios nagios 4096 Oct 7 01:16 etc
drwxr-xr-x 2 nagios nagios 4096 Oct 7 01:15 libexec
drwxr-xr-x 3 nagios nagios 4096 Oct 7 01:02 share
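Configuring and running NRPE on the monitored host:
1. Edit the main NRPE configuration file
[root@KCentOS5A ~]# vi /usr/local/nagios/etc/nrpe.cfg
Only the key settings are covered here.
-------------------------------------------------------
pid_file=/var/run/nrpe.pid
The PID file of the running NRPE daemon. The default is fine and does not need to be changed.
server_port=5666
The port the NRPE daemon listens on.
#server_address=0.0.0.0
server_address=192.168.1.9
The network interface NRPE listens on; here it is the IP address of the monitored host. Normally a specific IP address is set; for several network cards, separate the IP addresses with commas. To listen on all network interfaces, use the special address "0.0.0.0"; the wildcard "*" is not accepted.
nrpe_user=nagios
nrpe_group=nagios
The user and group the NRPE daemon runs as.
allowed_hosts=192.168.1.12
The IP addresses of the hosts allowed to talk to NRPE on this machine, that is, the address of the Nagios monitoring server (192.168.1.12 here). To allow several Nagios servers, separate the IP addresses with commas.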
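A rejected connection from the Nagios server is often just a missing entry in allowed_hosts, so it can help to check the list before restarting NRPE. The sketch below only does an exact string match on plain, comma-separated IP addresses, which is a simplification; the function name is_allowed_host is hypothetical.

```shell
#!/bin/bash
# is_allowed_host <nrpe.cfg path> <ip>
# Succeeds when <ip> appears in the comma-separated allowed_hosts list.
# Assumption: entries are plain IP addresses with no surrounding spaces.
is_allowed_host() {
    local cfg="$1" ip="$2" list entry
    # Take the last allowed_hosts line in case the file has more than one.
    list=$(sed -n 's/^allowed_hosts=//p' "$cfg" | tail -n 1)
    local IFS=','
    for entry in $list; do
        [ "$entry" = "$ip" ] && return 0
    done
    return 1
}
```

With the settings above, is_allowed_host /usr/local/nagios/etc/nrpe.cfg 192.168.1.12 succeeds, and any other address fails.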
Add the commands to be monitored to nrpe.cfg:
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
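The name inside the square brackets is what the Nagios server passes with -c, so a quick way to see which checks a host offers is to pull those names out of nrpe.cfg. A minimal sketch with two hypothetical helper functions; it reads the file only and does not contact the NRPE daemon.

```shell
#!/bin/bash
# list_nrpe_commands <nrpe.cfg path>
# Prints the name of every command[...] entry, one per line.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# nrpe_command_line <nrpe.cfg path> <command name>
# Prints the local command line that the named entry runs.
# Assumption: the name contains only letters, digits, and underscores.
nrpe_command_line() {
    sed -n "s/^command\[$2]=//p" "$1"
}
```

Running a printed command line by hand on the monitored host is a simple way to test a plugin before Nagios calls it through NRPE.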
Start the NRPE daemon
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
Add NRPE to the system startup script
[root@KCentOS5A ~]# echo "/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d" >> /etc/rc.d/rc.local
On the Nagios server, add a host definition for this machine and include its file in nagios.cfg:
cfg_file=/usr/local/nagios/etc/objects/host160.cfg
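Each new object file needs exactly one cfg_file line in nagios.cfg, and it is easy to add the same line twice when hosts are added repeatedly. A sketch of an idempotent helper follows; the function name register_cfg_file is an assumption made for illustration.

```shell
#!/bin/bash
# register_cfg_file <nagios.cfg path> <object file path>
# Appends "cfg_file=<object file path>" unless that exact line exists.
register_cfg_file() {
    local main_cfg="$1" object_cfg="$2" line
    line="cfg_file=$object_cfg"
    if grep -qxF -- "$line" "$main_cfg"; then
        return 0    # already registered, nothing to do
    fi
    printf '%s\n' "$line" >> "$main_cfg"
}
```

For example, register_cfg_file /usr/local/nagios/etc/nagios.cfg /usr/local/nagios/etc/objects/host160.cfg can be run any number of times and leaves a single entry; verify with nagios -v afterwards.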
IV. Using plugins
Write the plugin and place it in the /usr/local/nagios/libexec/ directory.
For example:
1. Put check_cpu.sh in the libexec directory
and make it executable.
2. Add the following service to the configuration of the monitored host; in my case it goes into localhost.cfg:
define service{
    host_name 153
    use generic-service
    check_command check_nrpe!check_cpu_233
    service_description check_cpu_233
    notifications_enabled 1
    event_handler_enabled 1
    max_check_attempts 3
    check_interval 5
    retry_check_interval 2
    notification_options w,u,c
}
Add to nrpe.cfg:
command[check_cpu_233]=/usr/local/nagios/libexec/check_cpu.sh -w 60 -c 80
check_cpu.sh must be present in that directory on every host that is to be checked.
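The text does not show check_cpu.sh itself, so the following is a hypothetical stand-in, not the original script. It illustrates what any Nagios plugin has to do: accept warning and critical thresholds (-w and -c, as in the nrpe.cfg entry), print one status line, and exit with 0 for OK, 1 for WARNING, 2 for CRITICAL, or 3 for UNKNOWN. The CPU measurement reads /proc/stat twice, one second apart, which assumes a Linux host.

```shell
#!/bin/bash
# Hypothetical check_cpu.sh sketch; the original script is not shown here.

# Print "<total jiffies> <idle jiffies>" from the aggregate cpu line.
cpu_sample() {
    local user nice system idle iowait irq softirq
    read -r _ user nice system idle iowait irq softirq _ < /proc/stat
    echo "$(( user + nice + system + idle + iowait + irq + softirq )) $idle"
}

# Print the CPU usage in percent, measured over one second.
cpu_usage() {
    local s1 s2 total busy
    s1=$(cpu_sample); sleep 1; s2=$(cpu_sample)
    total=$(( ${s2% *} - ${s1% *} ))
    busy=$(( total - (${s2#* } - ${s1#* }) ))
    if [ "$total" -eq 0 ]; then echo 0; return; fi
    echo $(( 100 * busy / total ))
}

# evaluate_cpu <usage> <warn> <crit>
# Prints the status line and returns the plugin exit code.
evaluate_cpu() {
    local usage="$1" warn="$2" crit="$3"
    if [ "$usage" -ge "$crit" ]; then
        echo "CRITICAL - CPU usage ${usage}%"; return 2
    elif [ "$usage" -ge "$warn" ]; then
        echo "WARNING - CPU usage ${usage}%"; return 1
    fi
    echo "OK - CPU usage ${usage}%"; return 0
}

# Run the check only when options are given, e.g. -w 60 -c 80.
if [ "$#" -gt 0 ]; then
    warn=60; crit=80
    while getopts "w:c:" opt; do
        case "$opt" in
            w) warn="$OPTARG" ;;
            c) crit="$OPTARG" ;;
            *) echo "UNKNOWN - usage: $0 -w <warn> -c <crit>"; exit 3 ;;
        esac
    done
    evaluate_cpu "$(cpu_usage)" "$warn" "$crit"
    exit $?
fi
```

Keeping the threshold comparison in its own function makes it possible to test the exit codes without generating real CPU load.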
V. Custom notifications
The goal is to notify a specific contact, by a specific method, about a specific service.
To do this, first add the time period for sending notifications to timeperiods.cfg:
define timeperiod{
    timeperiod_name notify_at_8
    alias once_a_day
    sunday 08:00-09:00
    monday 08:00-09:00
    tuesday 08:00-09:00
    wednesday 08:00-09:00
    thursday 09:00-10:00
    friday 08:00-09:00
    saturday 08:00-09:00
}
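A time period is a set of daily windows, and a check or notification tied to it only happens while the current time falls inside that day's window. The sketch below imitates that test for a single HH:MM-HH:MM range. It is a simplification of what Nagios does internally: the function name in_window is made up, and treating the end of the range as exclusive is an assumption.

```shell
#!/bin/bash
# in_window <HH:MM> <HH:MM-HH:MM>
# Succeeds when the time lies inside the range (start inclusive, end
# exclusive). Times must be zero-padded, 24-hour values.
in_window() {
    local now start end
    now=$(echo "$1" | tr -d ':')
    start=$(echo "${2%-*}" | tr -d ':')
    end=$(echo "${2#*-}" | tr -d ':')
    [ "$now" -ge "$start" ] && [ "$now" -lt "$end" ]
}
```

With the definition above, in_window 08:30 08:00-09:00 succeeds on most days, while the same time fails against Thursday's 09:00-10:00 window.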
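Define the contact to be notified:
define contact {
    contact_name notifyadmin
    alias wu
    use generic-contact
    service_notification_commands notify-service-by-qqemail ; notification command
    email [email protected]
}
Then add the service to the chosen host:
define service{
    host_name localhost
    use local-service
    service_description notify_at_800
    contacts notifyadmin ; the contact to notify
    check_command notify_at_800 ; a service check command defined to trigger the notification
    check_period notify_at_8 ; when the check runs
    notification_period notify_at_8 ; when notifications may be sent
}
After Nagios is restarted, messages are sent during the specified time period.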