Installing and Configuring Nagios on Linux

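Nagios overview:

Nagios is open-source software that monitors network devices, network traffic, and the state of Linux/Windows hosts; it can even monitor printers.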

The Nagios server itself runs on Linux (and other Unix-like systems); Windows machines are monitored as clients.

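A browser-based web interface lets operations staff see the state of every monitored item.

Configuration and management operations are supported through the web interface.

SMS and email notifications are supported.

Custom scripts can be used to implement custom checks.

Nagios official site: http://www.nagios.org/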

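Nagios server installation:

The default CentOS 6 yum repositories do not include the Nagios RPM packages, but they are available from the EPEL repository, which can be installed first:

rpm -ivh http://www.lishiming.net/data/p_w_upload/forum/month_1211/epel-release-6-7.noarch.rpm

yum install -y httpd nagios nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe

# I ran this experiment on an existing LNMP machine where httpd and EPEL were already installed, so I skipped those steps.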

[root@LNMP ~]# yum install -y nagios nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe

[root@LNMP ~]# htpasswd -c /etc/nagios/passwd nagiosadmin    # set the user and password for the Nagios web interface
New password:
Re-type new password:    (I used zaq12wsx)
Adding password for user nagiosadmin

[root@LNMP ~]# nagios -v /etc/nagios/nagios.cfg    # check the configuration; both totals at the bottom should be 0
Total Warnings: 0
Total Errors: 0

[root@LNMP ~]# service httpd restart    # restart httpd
Stopping httpd: [ OK ]
Starting httpd: [ OK ]

[root@LNMP ~]# service nagios restart    # restart nagios
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.
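The check-then-restart sequence above can be wrapped in a small helper so that a broken configuration never reaches a restart. This is only a sketch: the function name safe_restart and the variables NAGIOS_BIN, NAGIOS_CFG and RESTART_CMD are my own, not part of Nagios. The init script already runs its own configuration check, so this mainly helps when you restart Nagios some other way.

```shell
# Restart Nagios only when "nagios -v" accepts the configuration.
# NAGIOS_BIN, NAGIOS_CFG and RESTART_CMD are hypothetical overrides
# introduced for this sketch; the defaults match the commands used above.
safe_restart() {
    local nagios_bin="${NAGIOS_BIN:-nagios}"
    local cfg="${NAGIOS_CFG:-/etc/nagios/nagios.cfg}"
    if "$nagios_bin" -v "$cfg" >/dev/null; then
        # Unquoted on purpose so the command string splits into words.
        ${RESTART_CMD:-service nagios restart}
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}
```

Run it as root on the server with a plain safe_restart; it returns non-zero and leaves the running Nagios untouched when the check fails.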

Open http://10.72.4.38/nagios/ in a browser (I used IE) and log in as nagiosadmin.

--------------------------------------------------------------------------

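Nagios client installation and configuration: (my client is a LAMP machine left over from an earlier experiment)

Add the EPEL yum repository on the client:

rpm -ivh http://www.lishiming.net/data/p_w_upload/forum/month_1211/epel-release-6-7.noarch.rpm

[root@OBird ~]# yum install -y nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe

[root@OBird ~]# vim /etc/nagios/nrpe.cfg    # edit the configuration file

allowed_hosts=127.0.0.1    # change to allowed_hosts=10.72.4.38, the Nagios server's IP
dont_blame_nrpe=0    # change to dont_blame_nrpe=1, which lets the server pass arguments to NRPE commands

[root@OBird ~]# /etc/init.d/nrpe start    # NRPE is the intermediary between the client and the server
Starting nrpe: [ OK ]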
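If you would rather not edit nrpe.cfg by hand, the two changes above can be scripted. A minimal sketch, assuming both settings already exist as uncommented lines in the file; configure_nrpe is a name I made up for illustration.

```shell
# Apply the two nrpe.cfg edits described above.
# $1: path to nrpe.cfg, $2: IP address of the Nagios server.
# A backup of the original file is kept next to it with a .bak suffix.
configure_nrpe() {
    local cfg="$1" server_ip="$2"
    sed -i.bak \
        -e "s/^allowed_hosts=.*/allowed_hosts=${server_ip}/" \
        -e "s/^dont_blame_nrpe=.*/dont_blame_nrpe=1/" \
        "$cfg"
}
```

On the client this would be called as configure_nrpe /etc/nagios/nrpe.cfg 10.72.4.38, followed by a restart of nrpe.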

[root@LNMP ~]# cd /etc/nagios/conf.d/    # configuration on the server

[root@LNMP conf.d]# vim 10.72.4.43.cfg    # add the following

define host{
    use                     linux-server    ; Name of host template to use
                                            ; This host definition will inherit all variables that are defined
                                            ; in (or inherited by) the linux-server host template definition.
    host_name               10.72.4.43
    alias                   4.43
    address                 10.72.4.43
}

define service{
    use                     generic-service
    host_name               10.72.4.43
    service_description     check_ping
    check_command           check_ping!100.0,20%!200.0,50%
    max_check_attempts      5
    normal_check_interval   1
}

define service{
    use                     generic-service
    host_name               10.72.4.43
    service_description     check_ssh
    check_command           check_ssh
    max_check_attempts      5
    normal_check_interval   1
}

define service{
    use                     generic-service
    host_name               10.72.4.43
    service_description     check_http
    check_command           check_http
    max_check_attempts      5
    normal_check_interval   1
}
---------------------------------------------

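This file monitors three services: ssh, ping, and http. These three checks run the Nagios plugins locally on the server and connect to the remote machine over the network, so they work even if the client has neither nagios-plugins nor nrpe installed. Other checks, such as load and disk usage, require the server to connect to the remote host through NRPE to collect the data, so the remote host needs the nrpe service and the corresponding check scripts (nagios-plugins).

max_check_attempts 5    # when Nagios detects a problem it rechecks, and alerts only after 5 consecutive failed checks; with a value of 1 it alerts on the first failure

normal_check_interval 1    # interval between regularly scheduled checks, in minutes (rechecks after a failure use the separate retry interval inherited from the template); the default, as far as I know, is 3 minutes

notification_interval 60    # if the problem is still unresolved, how long Nagios waits before notifying the contacts again, in minutes. If one notification per event is enough, set this to 0.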
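To get a feel for how max_check_attempts delays an alert, here is a rough calculation. It assumes the usual Nagios behaviour: after the first failed check the service is rechecked at the retry interval until max_check_attempts is reached. The services above do not set a retry interval themselves (it is inherited from the generic-service template), so the value of 1 minute used here is only an assumption.

```shell
# Minutes between the first failed check and the check that triggers the alert.
# $1: max_check_attempts, $2: retry interval in minutes (assumed, see above).
minutes_to_hard_state() {
    local attempts="$1" retry="$2"
    echo $(( (attempts - 1) * retry ))
}

minutes_to_hard_state 5 1   # prints 4
```

With max_check_attempts 1 the result is 0, which matches the "alert on the first failure" case described above.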

--------------------------------------------

Check the configuration:

[root@LNMP conf.d]# nagios -v /etc/nagios/nagios.cfg    # both totals should be 0
Total Warnings: 0
Total Errors: 0

[root@LNMP ~]# service nagios restart    # restart Nagios, then look for the monitored client 4.43 in the web interface

--------------------------------------------------------------------------

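Nagios client installation and configuration, part 2:

[root@LNMP ~]# vim /etc/nagios/objects/commands.cfg    # on the server: add a check_nrpe command, which talks to NRPE on the client and runs a check there to get its status

define command{
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}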
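Before adding services that use this command, it helps to run check_nrpe by hand from the server and confirm that the client answers. The sketch below builds the same command line that Nagios produces from the definition above. The plugin directory is an assumption (it is the path used in the client's nrpe.cfg), and nrpe_check and PLUGIN_DIR are names made up for this example.

```shell
# Run one NRPE command against a client, the way the check_nrpe
# command definition does: check_nrpe -H <address> -c <command>.
# PLUGIN_DIR is a hypothetical override; adjust it to wherever
# check_nrpe is installed on the server.
nrpe_check() {
    local host="$1" cmd="$2"
    "${PLUGIN_DIR:-/usr/lib64/nagios/plugins}/check_nrpe" -H "$host" -c "$cmd"
}
```

For example, nrpe_check 10.72.4.43 check_load should print the load check result from the client once nrpe is running there and allowed_hosts includes the server.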

[root@LNMP ~]# vim /etc/nagios/conf.d/10.72.4.43.cfg    # add these services

define service{
    use                     generic-service
    host_name               10.72.4.43
    service_description     check_load
    check_command           check_nrpe!check_load    ; runs on the client through NRPE; same for the ones below
    max_check_attempts      5
    normal_check_interval   1
}

define service{
    use                     generic-service
    host_name               10.72.4.43
    service_description     check_disk_sda1
    check_command           check_nrpe!check_hda1
    max_check_attempts      5
    normal_check_interval   1
}

define service{
    use                     generic-service
    host_name               10.72.4.43
    service_description     check_disk_sdb5
    check_command           check_nrpe!check_hda5
    max_check_attempts      5
    normal_check_interval   1
}

[root@OBird ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  4.0G   13G  24% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       194M   29M  155M  16% /boot
/dev/sdb5       2.0G   68M  1.9G   4% /mnt
/dev/sdb1       2.0G   74M  1.9G   4% /home/liven/123

--------------------------------------------------------

[root@OBird ~]# vim /etc/nagios/nrpe.cfg    # the commands the server is allowed to call on the client; you can define your own scripts here too

# The following examples use hardcoded command arguments...

command[check_users]=/usr/lib64/nagios/plugins/check_users -w 5 -c 10
command[check_load]=/usr/lib64/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
# check_hda1: the stock line points at /dev/hda1; my machine has sda1, so change it to match the output of df -h
command[check_hda1]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda1
# added line for sdb5
command[check_hda5]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sdb5
command[check_zombie_procs]=/usr/lib64/nagios/plugins/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/lib64/nagios/plugins/check_procs -w 150 -c 200

[root@OBird ~]# /etc/init.d/nrpe restart    # restart the nrpe service on the client
Shutting down nrpe: [ OK ]
Starting nrpe: [ OK ]
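A device passed to check_disk with -p has to exist on the client, otherwise the check reports it as not accessible. The following sketch compares the -p devices in an nrpe.cfg-style file with the first column of saved df output and prints the ones that are missing. missing_disk_devices is a helper written for this post, not something shipped with NRPE.

```shell
# Print each device given to check_disk via -p in the config file ($1)
# that does not appear as a filesystem in the df output file ($2).
missing_disk_devices() {
    local cfg="$1" df_out="$2" dev
    grep -E '^command\[[^]]+\]=.*check_disk' "$cfg" |
        sed -n 's/.* -p \([^ ]*\).*/\1/p' |
        while read -r dev; do
            # awk exits 0 only when the device is found in column 1.
            awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$df_out" ||
                echo "$dev"
        done
}
```

On the client, something like df -P > /tmp/df.out followed by missing_disk_devices /etc/nagios/nrpe.cfg /tmp/df.out lists the entries that still need fixing; no output means every device was found.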

Check the configuration:

[root@LNMP conf.d]# nagios -v /etc/nagios/nagios.cfg    # both totals should be 0
Total Warnings: 0
Total Errors: 0

[root@LNMP ~]# service nagios restart    # restart the nagios service on the server
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.

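The HTTP check for the LNMP server stayed in WARNING (403). Fix: put an index page in the web root (note the spaces around >; without them the shell reads 111> as a file-descriptor redirect and writes an empty file):

echo 111 > /var/www/html/index.html

[root@LNMP ~]# tail /var/log/nagios/nagios.log    # view the Nagios log

[1482895207] SERVICE ALERT: 10.72.4.43;check_disk_sda5;CRITICAL;SOFT;3;DISK CRITICAL - /dev/sda5 is not accessible: No such file or directory

An alert like this one means the device passed to check_disk with -p does not exist on the client; compare it with the output of df -h.

--------------------------------------------------------------------------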

Configuring email alerts:

Define the contacts and contact groups (on this layout, typically in /etc/nagios/objects/contacts.cfg):

define contactgroup{
    contactgroup_name       admins
    alias                   Nagios Administrators
    members                 nagiosadmin,222,333    ; users
}

define contact{
    contact_name            222    ; user 1
    use                     generic-contact
    alias                   linux
    email                   [email protected]    ; address that receives the alerts
}

define contact{
    contact_name            333
    use                     generic-contact
    alias                   linux
    email                   [email protected]
}

# define a group
define contactgroup{
    contactgroup_name       common    ; group name
    alias                   common    ; alias
    members                 222,333   ; group members
}

[root@LNMP ~]# vim /etc/nagios/conf.d/10.72.4.43.cfg    # turn on alerting for testing

define service{
    use                     generic-service
    host_name               10.72.4.43
    service_description     check_disk_sda1
    check_command           check_nrpe!check_hda1
    max_check_attempts      5
    normal_check_interval   1
    contact_groups          common    ; who gets the alerts
    notification_period     24x7      ; alert around the clock
    notification_options    c,r       ; alert on critical and on recovery
}

define service{
    use                     generic-service
    host_name               10.72.4.43
    service_description     check_http
    check_command           check_http
    max_check_attempts      5
    normal_check_interval   1
    contact_groups          common
    notifications_enabled   1         ; turn notifications on
    notification_period     24x7
    notification_options    w,c,r     ; alert on warning, critical, and recovery
}

[root@OBird ~]# /usr/lib64/nagios/plugins/check_disk -w 99% -c 99% -p /dev/sda1

DISK CRITICAL - free space: /boot 154 MB (84% inode=99%);| /boot=28MB;1;1;0;193
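Nagios takes a service's state from the plugin's exit status rather than from the text it prints: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. A small helper makes that visible when running a plugin by hand; plugin_state is just an illustrative name.

```shell
# Translate a Nagios plugin exit status into its state name.
# Anything outside 0-2 is reported as UNKNOWN in this sketch.
plugin_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Typical use right after running a plugin on the client:
#   /usr/lib64/nagios/plugins/check_disk -w 99% -c 99% -p /dev/sda1
#   plugin_state $?
```

Keep in mind that running a plugin by hand with different thresholds only shows what it would report; what Nagios sees is the result of the command line configured in nrpe.cfg.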

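I triggered an alert on the client by running the plugin by hand, but no email was sent, so I went off to search Baidu...

http://dl528888.blog.51cto.com/2382721/763079 (I followed this blog post; the relevant details are quoted at the end)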

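What I did:

[root@LNMP ~]# vim /etc/hosts    # add this machine's IP and hostname

10.72.4.38 LNMP

Then make the http service on the client (10.72.4.43) raise an alert:

[root@LNMP ~]# tail /var/log/nagios/nagios.log    # check the server's events; a line containing SERVICE NOTIFICATION means Nagios sent an email alert

I tested this myself and received the email. I used two virtual machines with bridged networking, and I have the screenshots to prove it.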
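Instead of scanning the tail of the log by eye, the notification lines can be filtered out directly. A minimal sketch, assuming the usual log line layout in which the host name appears between semicolons after the contact name; service_notifications is a name I chose for this example.

```shell
# Print SERVICE NOTIFICATION lines from a Nagios log.
# $1: log file (defaults to the path used in this post)
# $2: optional host name to filter on
service_notifications() {
    local log="${1:-/var/log/nagios/nagios.log}" host="${2:-}"
    if [ -n "$host" ]; then
        grep -F "SERVICE NOTIFICATION:" "$log" | grep -F ";${host};"
    else
        grep -F "SERVICE NOTIFICATION:" "$log"
    fi
}
```

On the server, service_notifications /var/log/nagios/nagios.log 10.72.4.43 prints only the notifications sent for the client; no output means nothing has been sent yet.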

--------------------------------------------

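The following is excerpted from the blog post below:

In testing, the virtual machine could not send mail directly while an Alibaba Cloud host could, because the VM has no public IP. If you run into problems, go through the following three steps: http://dl528888.blog.51cto.com/2382721/763079 (the blog post referred to)

(1) Configuration in hosts

[root@nagios ~]# cat /etc/hosts
172.16.4.3 nagios.com nagios    ## this machine's IP must be mapped here

(2) The output of hostname must match the HOSTNAME in /etc/sysconfig/network, and both must match the entry in hosts.

(3) In the Nagios web interface, select the alerting service (httpd, for example) and click the button next to http. If the notification status shows disabled, choose "enable notifications for this service" on the right. After confirming the dialog, check the log with tail /var/log/nagios/nagios.log.

If a line contains SERVICE NOTIFICATION, Nagios has sent an email alert.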
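Steps (1) and (2) above can be checked from the shell. This is a sketch only: both function names are made up for this post, and the default paths are the CentOS 6 locations mentioned above.

```shell
# Succeed if NAME appears as a hostname or alias (any field after the IP)
# on a non-comment line of the hosts file.
hostname_in_hosts() {
    local name="$1" hosts_file="${2:-/etc/hosts}"
    awk -v n="$name" '
        /^[[:space:]]*#/ { next }
        {
            sub(/#.*/, "")          # drop trailing comments
            for (i = 2; i <= NF; i++) if ($i == n) found = 1
        }
        END { exit !found }
    ' "$hosts_file"
}

# Print the HOSTNAME value from a sysconfig-style network file.
configured_hostname() {
    sed -n 's/^HOSTNAME=//p' "${1:-/etc/sysconfig/network}"
}
```

On the server, hostname_in_hosts "$(hostname)" && [ "$(hostname)" = "$(configured_hostname)" ] && echo consistent prints consistent only when all three places agree.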
