CentOS 6.2 + Nginx + Nagios, with SMS and QQ Mail alerts


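Note: 192.168.0.21 is the server (the monitoring host)

192.168.0.22 is the client (the monitored host)

Environment: two CentOS 6.0 64-bit machines, each already running an LNMP stack built from source.

Links to the required software packages are given at the end.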


1. Installing Nagios (Chinese edition)

tar xvf nagios-cn-3.2.3.tar.bz2
cd nagios-cn-3.2.3
useradd -m -s /bin/bash nagios
groupadd nagcmd     # the group must exist before usermod can add nagios to it
usermod -a -G nagcmd nagios
./configure --prefix=/usr/local/nagios --with-command-group=nagcmd
make all
make install
make install-init     # install the init script
make install-config     # install the sample configuration files
make install-commandmode     # set permissions on the external command directory
chmod o+rwx /usr/local/nagios/var/rw

2. Installing nagios-plugins

wget http://prdownloads.sourceforge.net/sourceforge/nagiosplug/nagios-plugins
tar zxvf nagios-plugins-1.4.16.tar.gz
cd nagios-plugins-1.4.16
yum install make apr* autoconf automake curl curl-devel gcc gcc-c++ zlib-devel \
openssl openssl-devel pcre-devel gd gd-devel kernel keyutils patch perl perl-devel \
kernel keyutils kernel-headers compat* mpfr cpp glibc libgomp libstdc++-devel ppl \
cloog-ppl keyutils-libs-devel libcom_err-devel libsepol-devel libselinux-devel \
krb5-devel zlib-devel libXpm* freetype libjpeg* libpng* php-common php-gd ncurses* libtool* libxml2 libxml2-devel patch -y
./configure --prefix=/usr/local/nagios --with-mysql=/home/mysql/
make
make install


3. Installing NRPE

tar xzvf nrpe-2.12.tar.gz
cd nrpe-2.12
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config
\cp src/check_nrpe /usr/local/nagios/libexec/
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
echo '/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d' >> /etc/rc.local
To restart NRPE, kill the running process first and then start it again:
kill `ps aux |grep nrpe |grep -v grep |awk '{print $2}'`
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
Test it locally:
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_users
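The kill pipeline above can be shortened with pkill, which matches on the process name directly. The function below is only a sketch, not a full init script: restart_nrpe is a hypothetical name, the default paths are the ones used in this article, and pkill (from procps) is assumed to be installed.

```shell
#!/bin/bash
# Hypothetical helper: stop any running nrpe daemon and start a fresh one.
restart_nrpe() {
    local bin="${1:-/usr/local/nagios/bin/nrpe}"
    local cfg="${2:-/usr/local/nagios/etc/nrpe.cfg}"
    # -x matches the exact process name, so no "grep -v grep" filter is needed.
    # pkill returns non-zero when nothing was running; that is not an error here.
    pkill -x nrpe || true
    sleep 1   # give the old daemon a moment to release port 5666
    "$bin" -c "$cfg" -d
}

# Example: restart_nrpe
```

Called without arguments it uses the install paths from this article; other paths can be passed as the first and second argument.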


Adding Nagios as a system service

Register it as a system service and enable it at boot:
chkconfig --add nagios
chkconfig nagios on
chown nagios:nagios /usr/local/nagios/var/rw
# Verify that the configuration file is valid
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg


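Adding an alias to make configuration checks easier

vi ~/.bashrc
Define an alias for the verification command; I call it check:
alias check='/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg'
source ~/.bashrc
The check command can now be used to validate the configuration.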


Changing the contact email to the address that should receive alerts

vi /usr/local/nagios/etc/objects/contacts.cfg
###############################################################################
# CONTACTS.CFG - SAMPLE CONTACT/CONTACTGROUP DEFINITIONS
#
# Last Modified: 05-31-2007
#
# NOTES: This config file provides you with some example contact and contact
#        group definitions that you can reference in host and service
#        definitions.
#
#        You don't need to keep these definitions in a separate file from your
#        other object definitions.  This has been done just to make things
#        easier to understand.
#
###############################################################################
###############################################################################
###############################################################################
#
# CONTACTS
#
###############################################################################
###############################################################################
# Just one contact defined by default - the Nagios admin (that's you)
# This contact definition inherits a lot of default values from the 'generic-contact'
# template which is defined elsewhere.
define contact{
        contact_name                    nagiosadmin             ; Short name of user
        use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias                           Nagios Admin            ; Full name of user
        email                           nagios@localhost        ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }
###############################################################################
###############################################################################
#
# CONTACT GROUPS
#
###############################################################################
###############################################################################
# We only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 nagiosadmin
        }
Define the check_nrpe command:
vi /usr/local/nagios/etc/objects/commands.cfg
define command{
        command_name    check_nrpe
        command_line    /usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

Check the configuration for errors

check

Nginx configuration: FastCGI support for Perl (.pl, .cgi)
Install the FCGI module:
cd
tar zxvf FCGI-0.70.tar.gz
cd FCGI-0.70
perl Makefile.PL
make
make install
cd
Install the IO and IO::All modules:
tar zxvf IO-1.25.tar.gz
cd IO-1.25
perl Makefile.PL
make
make install
cd
tar zxvf IO-All-0.41.tar.gz
cd IO-All-0.41
perl Makefile.PL
make
make install
cd
unzip perl-fcgi.zip
cp perl-fcgi.pl /usr/local/nginx/
chmod 755 /usr/local/nginx/perl-fcgi.pl


vi /usr/local/nginx/start_perl_cgi.sh
#!/bin/bash
# Start/stop wrapper for the perl-fcgi.pl FastCGI daemon used by nginx.
#set -x
dir=/usr/local/nginx

stop ()
{
    # Stop the daemon through its pid file, then remove the stale pid file and socket.
    #pkill  -f  $dir/perl-fcgi.pl
    kill "$(cat "$dir/logs/perl-fcgi.pid")"
    rm "$dir/logs/perl-fcgi.pid" 2>/dev/null
    rm "$dir/logs/perl-fcgi.sock" 2>/dev/null
    echo "stop perl-fcgi done"
}

start ()
{
    # Regenerate a one-line launcher script and run it as the unprivileged user.
    rm "$dir/now_start_perl_fcgi.sh" 2>/dev/null
    chown nobody:root "$dir/logs"
    echo "$dir/perl-fcgi.pl -l $dir/logs/perl-fcgi.log -pid $dir/logs/perl-fcgi.pid -S $dir/logs/perl-fcgi.sock" >"$dir/now_start_perl_fcgi.sh"
    chown nobody:nobody "$dir/now_start_perl_fcgi.sh"
    chmod u+x "$dir/now_start_perl_fcgi.sh"
    sudo -u nobody "$dir/now_start_perl_fcgi.sh"
    echo "start perl-fcgi done"
}

case $1 in
    stop)
        stop
        ;;
    start)
        start
        ;;
    restart)
        stop
        start
        ;;
    *)
        echo "Usage: $0 {start|stop|restart}"
        exit 1
        ;;
esac


Replace every occurrence of nobody in start_perl_cgi.sh with nagios (the user that owns the nginx directory):

sed -i 's@nobody@nagios@g' /usr/local/nginx/start_perl_cgi.sh
chmod 755 /usr/local/nginx/start_perl_cgi.sh
/usr/local/nginx/start_perl_cgi.sh start

# Disable user authentication (convenient while debugging)
vi /usr/local/nagios/etc/cgi.cfg
Find use_authentication=1 and change the value to 0.
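The same change can be made without opening an editor. This is a minimal sketch, assuming GNU sed and that the use_authentication key sits at the start of a line as in the stock cgi.cfg; disable_cgi_auth is a hypothetical helper name, not something Nagios ships.

```shell
#!/bin/bash
# Hypothetical helper: set use_authentication=0 in a Nagios cgi.cfg.
# Assumes GNU sed (for -i) and that the key appears at the start of a line.
disable_cgi_auth() {
    local cfg="$1"
    # Keep a backup so authentication can be restored after debugging.
    cp -p "$cfg" "$cfg.bak" || return 1
    sed -i 's/^use_authentication=.*/use_authentication=0/' "$cfg"
}

# Example (path taken from this article):
# disable_cgi_auth /usr/local/nagios/etc/cgi.cfg
```

Copying cgi.cfg.bak back over cgi.cfg turns authentication on again once debugging is finished.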

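At this point everything is working as expected.

Next comes the nginx configuration.

I changed the listen port to 80.

Then start the service.

Nagios can now be reached in a browser. Next we install the client.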

service nagios start


Installing the monitored host (client)

yum install openssl-devel -y
1. Install nagios-plugins
groupadd nagios
useradd nagios -M -s /sbin/nologin -g nagios
tar xvf nagios-plugins-1.4.16.tar.gz
cd nagios-plugins-1.4.16
./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios --with-mysql=/usr/local/mysql && make && make install
cd
2. Install NRPE
tar zxvf nrpe-2.13.tar.gz
cd nrpe-2.13
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config

Start NRPE
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
echo '/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d' >> /etc/rc.local


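Monitoring the server itself: a host that monitors itself does not need NRPE configured; NRPE on the server side is used only to fetch the data returned by the NRPE daemons on the clients. The Chinese edition of Nagios already ships with some default configuration, so we will adjust it slightly and use it as is.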

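Monitoring the client: the monitored services are MySQL, Nginx, memory, IP connection count, zombie processes, disk space, disk I/O, logged-in users, total processes, CPU load, PING, and SSH.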

unzip libexec.zip
\cp libexec/* /usr/local/nagios/libexec
chmod -R +x /usr/local/nagios/libexec

Installing the plugins

Create an empty database named nagios, grant the nagios user access to it from any host, reload the privileges, and check that the nagios user was created:
create database nagios;
grant select on nagios.* to 'nagios'@'%' identified by '123456';
flush privileges;
select User,Password,Host from mysql.user;


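Add the MySQL library directory to the system library search path (append the path below to /etc/ld.so.conf, then run ldconfig):
vim /etc/ld.so.conf
/usr/local/mysql/lib
ldconfig
Monitoring disk I/O also requires the sysstat package:
yum install sysstat -y
Configure NRPE on the client:
vim /usr/local/nagios/etc/nrpe.cfg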

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_cpu.sh -w 80% -c 90%
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
command[check_sda2]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda2
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
command[check_iostat]=/usr/local/nagios/libexec/check_iostat.sh -d sda -w 6 -c 10
command[check_mysql]=/usr/local/nagios/libexec/check_mysql -H 192.168.0.22 -u nagios -p 123456 -d nagios
command[check_nginx]=/usr/local/nagios/libexec/check_nginx.sh -u 192.168.0.22 -p /status -w 4000 -c 5000
command[check_mem]=/usr/local/nagios/libexec/check_memory.pl -f -w 20 -c 10
command[check_ip_conn]=/usr/local/nagios/libexec/ip_conn.sh 200 250
command[check_ssh]=/usr/local/nagios/libexec/check_tcp -H 127.0.0.1 -p 22 -w 1.0 -c 10.0
Once the configuration is done, restart NRPE:
kill `ps aux |grep nrpe |grep -v grep |awk '{print $2}'`
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
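When NRPE runs as a daemon it only answers hosts listed in the allowed_hosts directive of nrpe.cfg, so the client also has to admit the server (192.168.0.21 in this article). The sketch below sets that directive; set_allowed_hosts is a hypothetical helper name and GNU sed is assumed.

```shell
#!/bin/bash
# Hypothetical helper: point allowed_hosts in nrpe.cfg at the monitoring server.
# Assumes GNU sed; appends the directive if the file does not contain it yet.
set_allowed_hosts() {
    local cfg="$1" hosts="$2"
    if grep -q '^allowed_hosts=' "$cfg"; then
        # "|" is used as the delimiter so CIDR values such as 10.0.0.0/8 are safe.
        sed -i "s|^allowed_hosts=.*|allowed_hosts=$hosts|" "$cfg"
    else
        printf 'allowed_hosts=%s\n' "$hosts" >> "$cfg"
    fi
}

# Example, using the addresses from this article (restart nrpe afterwards):
# set_allowed_hosts /usr/local/nagios/etc/nrpe.cfg 127.0.0.1,192.168.0.21
```

After a restart of NRPE, running /usr/local/nagios/libexec/check_nrpe -H 192.168.0.22 -c check_users on the server should return the user count rather than a connection error.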
Server-side configuration:
Configuration for monitoring the server itself:
vim /usr/local/nagios/etc/objects/localhost.cfg
Edit the file; the finished configuration looks like this:
define host{
        use                     linux-server
        host_name               localhost
        alias                   localhost
        address                 127.0.0.1
        icon_image              server.gif
        statusmap_image         server.gd2
        2d_coords               500,200
        3d_coords               500,200,100
        }
define hostgroup{
        hostgroup_name  linux-servers ; The name of the hostgroup
        alias           Linux Servers ; Long name of the group
        members         *     ; Comma separated list of hosts that belong to this group
        }
define servicegroup{
        servicegroup_name       All connectivity checks
        alias                   Connectivity checks
        members                 localhost,PING,nagios-client,PING
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       *
        service_description             PING
           check_command            check_ping!100.0,20%!500.0,60%
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Root partition
           check_command            check_local_disk!20%!10%!/
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Logged-in users
           check_command            check_local_users!20!50
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Total processes
           check_command            check_local_procs!250!400!RSZDT
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             System load
           check_command            check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Swap usage
           check_command            check_local_swap!20!10
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             SSH
           check_command            check_tcp!22!1.0!10.0
           notifications_enabled        0
        }
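Server-side configuration for monitoring the client:
Save and exit, then copy this file to use as the monitoring configuration for nagios-client:
cp /usr/local/nagios/etc/objects/localhost.cfg /usr/local/nagios/etc/objects/nagios-client.cfg
vim /usr/local/nagios/etc/objects/nagios-client.cfg
The finished configuration looks like this: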
define host{
        use                     linux-server
        host_name               nagios-client
        alias                   nagios-client
        address                 192.168.0.22
        icon_image              server.gif
        statusmap_image         server.gd2
        2d_coords       500,200
        3d_coords       500,200,100
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       *
        service_description             PING
        check_command           check_ping!100.0,20%!500.0,60%
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             boot partition
        check_command           check_nrpe!check_sda1
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             Root partition
        check_command           check_nrpe!check_sda2
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             Logged-in users
        check_command           check_nrpe!check_users
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             Total processes
        check_command           check_nrpe!check_total_procs
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             CPU load average
        check_command           check_nrpe!check_load
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             Virtual memory
        check_command           check_nrpe!check_swap
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             SSH
        check_command           check_nrpe!check_ssh
        notifications_enabled       0
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             Zombie processes
        check_command           check_nrpe!check_zombie_procs
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             iostat
        check_command           check_nrpe!check_iostat
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             mysql
        check_command           check_nrpe!check_mysql
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             nginx
        check_command           check_nrpe!check_nginx
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             memory
        check_command           check_nrpe!check_mem
        }
define service{
        use                             local-service         ; Name of service template to use
        host_name                       nagios-client
        service_description             IP connections
        check_command           check_nrpe!check_ip_conn
        }
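Nagios reads an object file only if nagios.cfg references it through a cfg_file (or cfg_dir) directive, so the new nagios-client.cfg has to be registered before the configuration check will see it. A minimal sketch follows, with register_cfg_file as a hypothetical helper name; it assumes nagios.cfg ends with a newline.

```shell
#!/bin/bash
# Hypothetical helper: add a cfg_file= line to nagios.cfg unless it is already there.
register_cfg_file() {
    local main_cfg="$1" object_cfg="$2"
    local line="cfg_file=$object_cfg"
    # -x matches the whole line, -F treats the path as a literal string.
    grep -qxF "$line" "$main_cfg" || printf '%s\n' "$line" >> "$main_cfg"
}

# Example with the paths used in this article:
# register_cfg_file /usr/local/nagios/etc/nagios.cfg \
#     /usr/local/nagios/etc/objects/nagios-client.cfg
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
```

Because the helper checks for the exact line first, running it twice does not create a duplicate entry.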

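In the two original email notification commands (notify-host-by-email and notify-service-by-email in commands.cfg), simply change /bin/mail to /usr/bin/mutt.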
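That substitution can also be scripted. This is a sketch under assumptions: GNU sed, and notification commands that pipe into "/bin/mail -s ..." as in the stock commands.cfg; use_mutt_for_mail is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: switch the notification commands from /bin/mail to /usr/bin/mutt.
# Assumes GNU sed and the stock "| /bin/mail -s ..." command lines in commands.cfg.
use_mutt_for_mail() {
    local cfg="$1"
    # Match "/bin/mail" only when it follows a pipe, so unrelated paths stay untouched.
    sed -i -E 's#(\|[[:space:]]*)/bin/mail([[:space:]])#\1/usr/bin/mutt\2#g' "$cfg"
}

# Example (path taken from this article):
# use_mutt_for_mail /usr/local/nagios/etc/objects/commands.cfg
```

Running the check command afterwards confirms that commands.cfg still parses.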
                                                                           
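Speeding up Nagios alerting:
1. Edit the template file:
vim /usr/local/nagios/etc/objects/templates.cfg
Set every normal_check_interval to 1, so that a fault is noticed within about one minute.
Set every check_interval to 1, so that checks run once a minute under normal conditions.
Set every notification_interval to 20 (minutes); this is how long Nagios waits before notifying the contacts again while a problem remains unresolved.
service nagios restart    # restart Nagios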
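The three interval changes can be applied in one pass with sed. The sketch below assumes GNU sed and the usual "directive value" layout of templates.cfg; tune_intervals is a hypothetical helper name, and the rewrite collapses the padding between directive and value to a single space.

```shell
#!/bin/bash
# Hypothetical helper: apply the interval values described above to templates.cfg.
# Assumes GNU sed (-i, -E). The patterns are anchored to the start of the line so
# that "check_interval" does not also match "retry_check_interval".
tune_intervals() {
    local cfg="$1"
    sed -i -E \
        -e 's/^([[:space:]]*)(normal_check_interval|check_interval)[[:space:]]+[0-9]+/\1\2 1/' \
        -e 's/^([[:space:]]*)notification_interval[[:space:]]+[0-9]+/\1notification_interval 20/' \
        "$cfg"
}

# Example (path taken from this article):
# tune_intervals /usr/local/nagios/etc/objects/templates.cfg
```

Trailing comments on the edited lines are left in place, and retry intervals are not touched.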

Testing the alerts:

Experiment complete!

Download link for the required software packages

If anything is missing, just ask me for it!

http://down.51cto.com/data/1007210

