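139 Mail is quite handy. I used to run a shell-based monitor that sent mail to an @139.com address, and 139 really did push an SMS alert to my phone. People online say Nagios is a powerful monitoring tool, so I installed it on Ubuntu without any trouble. Apart from the email address I changed no configuration, and the monitoring page at http://xx.xx.xx.xx/nagios opened fine in the browser. I then took the apache service down. After a while the "Current Status" - "Services" view on the Nagios page showed the HTTP service as CRITICAL, but no SMS alert arrived and nothing showed up in the mailbox either. It turned out that Notifications for the HTTP service were Disabled, so I enabled them directly from the web interface.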
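After sudo /etc/init.d/nagios restart, the interface showed Notifications as "Enabled". I stopped apache again and waited; the web interface showed HTTP as CRITICAL, but still no mail was sent. Sending a test message with mail -s "test" [email protected] < test.conf worked fine. Some posts online blame the path to the mail binary, but the notification command in Nagios uses /usr/bin/mail, and that path was correct.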
Looking at /usr/local/nagios/objects/localhost.cfg, I found that the service definitions do not enable Notifications:
define service{
    use local-service ; Name of service template to use
    host_name localhost
    service_description SSH
    check_command check_ssh
    notifications_enabled 0
    }
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
    use local-service ; Name of service template to use
    host_name localhost
    service_description HTTP
    check_command check_http
    notifications_enabled 0
    }
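Strange: the configuration file never enabled notifications, so why did the web interface show them as Enabled? I changed both entries to "1" and restarted Nagios. Still no mail went out.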
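To avoid checking each definition by eye, a small script can list the services whose definitions turn notifications off. This is only a minimal sketch: parse_services and services_with_notifications_off are hypothetical helper names, the parser assumes flat define service{ ... } blocks without nested braces, and it ignores values inherited from templates such as local-service.

```python
import re

# Matches one "define service{ ... }" block; assumes no nested braces.
SERVICE_BLOCK = re.compile(r"define\s+service\s*\{(.*?)\}", re.DOTALL)


def parse_services(text):
    """Return one dict of directive -> value per service definition."""
    services = []
    for block in SERVICE_BLOCK.findall(text):
        directives = {}
        for line in block.splitlines():
            line = line.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        services.append(directives)
    return services


def services_with_notifications_off(text):
    """List service_description values where notifications_enabled is 0."""
    return [
        s.get("service_description", "?")
        for s in parse_services(text)
        if s.get("notifications_enabled") == "0"
    ]
```

Feeding it the contents of localhost.cfg as shown above (read with open(...).read()) should report both SSH and HTTP before the change, and an empty list afterwards.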
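Looking around further, I found that the main configuration file nagios.cfg, in Nagios's etc directory, has a setting for a precached object file, which is said to exist so that Nagios can start up faster: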
precached_object_file=/usr/local/nagios/var/objects.precache
Sure enough, objects.precache contains the HTTP service entry. Note that notifications_enabled below is still set to 0:
define service {
    host_name localhost
    service_description HTTP
    check_period 24x7
    check_command check_http
    contact_groups admins
    notification_period 24x7
    initial_state o
    check_interval 5.000000
    retry_interval 1.000000
    max_check_attempts 4
    is_volatile 0
    parallelize_check 1
    active_checks_enabled 1
    passive_checks_enabled 1
    obsess_over_service 1
    event_handler_enabled 1
    low_flap_threshold 0.000000
    high_flap_threshold 0.000000
    flap_detection_enabled 1
    flap_detection_options o,w,u,c
    freshness_threshold 0
    check_freshness 0
    notification_options u,w,c,r
    notifications_enabled 0
    notification_interval 60.000000
    first_notification_delay 0.000000
    stalking_options n
    process_perf_data 1
    failure_prediction_enabled 1
    retain_status_information 1
    retain_nonstatus_information 1
    }
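A quick way to tell whether a cached file like this lags behind the configuration it was generated from is to compare modification times. The following is a rough sketch under that assumption only; precache_is_stale is a hypothetical helper, not part of Nagios, and it says nothing about whether Nagios actually reads the precache file at startup.

```python
import os


def precache_is_stale(precache_path, source_paths):
    """Return True if any source config was modified after the precache file."""
    cache_mtime = os.path.getmtime(precache_path)
    return any(os.path.getmtime(p) > cache_mtime for p in source_paths)
```

With the paths from this post, the call would be precache_is_stale("/usr/local/nagios/var/objects.precache", ["/usr/local/nagios/objects/localhost.cfg"]).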
Clearly, objects.precache was not refreshed after localhost.cfg was modified. I commented out "precached_object_file" in nagios.cfg and restarted Nagios. :) Everything goes well. The CRITICAL, RECOVERY and other notification mails now arrive.
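The edit to nagios.cfg can also be scripted. Here is a minimal sketch that works on the file's text; comment_out_directive is a hypothetical helper, and it assumes the directive is written as name=value on a single line, as in the line quoted earlier.

```python
def comment_out_directive(config_text, name="precached_object_file"):
    """Prefix active 'name=value' lines with '#'; leave all other lines untouched."""
    out = []
    for line in config_text.splitlines(keepends=True):
        stripped = line.lstrip()
        # Only match the exact directive name followed by '='.
        if stripped.startswith(name) and stripped[len(name):].lstrip().startswith("="):
            out.append("#" + line)
        else:
            out.append(line)
    return "".join(out)
```

Lines that are already commented out do not match, so running it twice gives the same result. Keep a backup of nagios.cfg before writing the output back, and restart Nagios afterwards.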