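Configuration files explained
This post describes the Icinga 2 configuration files. Compared with Nagios the underlying logic is the same: you define hosts (and host groups), services (and service groups), check commands, templates, check intervals and so on. The actual syntax is different, though, because Icinga 2 defines its own set of keywords. Details follow below. There are parts I have not fully figured out myself, and I hope readers will join the discussion.
Everything here assumes Icinga 2 was installed with yum.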
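1 The two configuration directories on the master server:
/etc/icinga2/: most of the configuration sits under ./conf.d, which is mainly where your own custom configuration goes.
File names are up to you, as long as you can tell what each file is for; there is no need to split things strictly into user, service and so on.
/usr/share/icinga2/include/: this mainly holds predefined commands that can be used directly. You can also use these definitions as a reference when wiring up your own script plugins.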
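As a rough sketch of what "modelling your own command on the shipped ones" can look like, here is a CheckCommand for a home-grown script. The script name check_my_app.sh, its -w and -c options, and the my_app_* variable names are all invented for this example; only the overall shape follows the shipped definitions.

```
object CheckCommand "my-app" {
  import "plugin-check-command"

  // Hypothetical script dropped into the plugin directory.
  // Array form, which the Icinga 2 documentation asks for when "arguments" is set.
  command = [ PluginDir + "/check_my_app.sh" ]

  arguments = {
    "-w" = "$my_app_warn$"   // option of the assumed script, value taken from a custom variable
    "-c" = "$my_app_crit$"
  }

  // Defaults; a Host or Service can override them via vars.my_app_warn and vars.my_app_crit
  vars.my_app_warn = 80
  vars.my_app_crit = 90
}
```

A file like this would normally live under /etc/icinga2/conf.d/ rather than in /usr/share/icinga2/include/, so that package updates do not overwrite it.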
2 The individual configuration files
commands.conf and command-plugins.conf: command definitions
    object CheckCommand "ssh" {
      import "plugin-check-command"

      command = PluginDir + "/check_ssh"   # PluginDir is defined in constants.conf

      arguments = {
        "-p" = "$ssh_port$"
        "host" = {
          value = "$ssh_address$"
          skip_key = true
          order = -1
        }
      }

      vars.ssh_address = "$address$"
    }
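object CheckCommand: the fixed keyword for defining a command
import: pulls in a template, here one defined in command.conf
command: the executable to run; PluginDir is defined in constants.conf
arguments: the command's parameters; for a custom script they do not necessarily have to be defined here
"-p" = "$ssh_port$": -p is the plugin's own option, and ssh_port is a custom variable name, written in the form $.....$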
templates.conf: template definitions
Check template for hosts:
    template Host "generic-host" {
      max_check_attempts = 3
      check_interval = 5m
      retry_interval = 30s
      check_command = "hostalive"
    }
Check template for services:
    template Service "generic-service" {
      max_check_attempts = 2
      check_interval = 5m
      retry_interval = 20s
    }
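template Host and template Service are fixed syntax; the name in quotes after them is up to you
max_check_attempts: the maximum number of check attempts once a problem is detected
check_interval: how often the check runs
retry_interval: how often the check is repeated once a problem has been detected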
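To show how such a template is used, here is a minimal sketch of a host that imports generic-host and overrides one value. The host name web01 and the address are placeholders made up for this example.

```
object Host "web01" {
  import "generic-host"      // inherits max_check_attempts, check_interval, retry_interval and check_command

  address = "192.0.2.10"     // placeholder address from the documentation range
  max_check_attempts = 5     // overrides the template's value of 3 for this host only
}
```

Attributes set in the object after the import win over the ones coming from the template, so a template only needs to carry the common defaults.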
Notification template:
    template Notification "30mins-notification" {
      interval = 30m
      command = "mail-service-notification"
      states = [ Critical ]
      types = [ Problem, Recovery ]
      period = "24x7"
    }
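command: defined in commands.conf
states: the states that should trigger an alert mail; I only set Critical to cut down on mail volume
types: the notification types to send (Problem, Recovery and so on); there are many of them
period: the time period during which alerts are sent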
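For comparison, a template with wider filters might look like the sketch below. The template name and the choice of filters are only an illustration; "9to5" is assumed to exist as a TimePeriod object (it ships in the sample timeperiods.conf, but check your own installation).

```
template Notification "work-hours-notification" {
  interval = 1h                                     // re-send every hour while the problem persists
  command = "mail-service-notification"

  states = [ Warning, Critical, Unknown ]           // service states that may trigger a mail
  types = [ Problem, Acknowledgement, Recovery ]    // notification types that are sent

  period = "9to5"                                   // assumed TimePeriod object
}
```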
If you want to delay the first alert, you can do it like this:
    apply Notification "mail" to Service {
      import "generic-notification"
      command = "mail-notification"
      users = [ "icingaadmin" ]
      times.begin = 15m   // delay first notification
      assign where service.name == "ping4"
    }
Tips:
When detecting a problem with a host/service Icinga re-checks the object a number of times (based on the max_check_attempts and retry_interval settings) before sending notifications. This ensures that no unnecessary notifications are sent for transient failures. During this time the object is in a SOFT state. After all re-checks have been executed and the object is still in a non-OK state the host/service switches to a HARD state and notifications are sent.
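The SOFT/HARD behaviour is easier to see with concrete numbers. The sketch below uses a made-up service name, demo-ping, on the placeholder host "xx"; the timeline in the comments is approximate and ignores how long each check takes to run.

```
object Service "demo-ping" {
  import "generic-service"
  host_name = "xx"
  check_command = "ping4"

  max_check_attempts = 3
  retry_interval = 30s
  check_interval = 5m
}

// Approximate timeline once the service starts failing:
//   t = 0s    regular check fails  -> attempt 1/3, SOFT state, no notification
//   t = 30s   re-check fails       -> attempt 2/3, SOFT state, no notification
//   t = 60s   re-check fails       -> attempt 3/3, HARD state, notifications are sent
// If any re-check succeeds, the service returns to OK without a notification.
```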
users.conf: defines the alert recipients and the hosts
    object User "icingaadmin" {
      import "generic-user"
      display_name = "Icinga 2 Admin"
      groups = [ "icingaadmins" ]
      email = "[email protected]"
    }

    object Host "xx" {
      display_name = "xx"
      address = "xx"
      groups = [ "cs" ]
      check_command = "hostalive"
    }
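object User and object Host are fixed syntax; what follows them is user-defined.
Notes on Host:
import: pulls in a template from templates.conf
display_name: free-form
groups: user-defined; separate multiple groups with commas (whether every one of them takes effect is something I have yet to confirm)
address: either a domain name or an IP address
check_command: the command used to check the host; hostalive, used here, is a ping check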
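A small sketch of a host that belongs to more than one group. The group names web and prod, the host name app01 and the address are invented for this example; as far as I know, each group named in groups has to exist as a HostGroup object, otherwise config validation complains.

```
object HostGroup "web" {
  display_name = "Web servers"
}

object HostGroup "prod" {
  display_name = "Production"
}

object Host "app01" {
  import "generic-host"
  address = "app01.example.com"   // a domain name works as well as an IP
  groups = [ "web", "prod" ]      // comma-separated list, the host joins both groups
}
```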
services.conf: service definitions (you can also give a special service its own xxx.conf)
    object Service "ssh" {
      import "generic-service"
      check_command = "ssh"
      host_name = "hk"
      vars.ssh_port = "22221"
    }
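For a service on a single host, use object Service.
vars.ssh_port shows how a custom parameter is set. vars. is fixed syntax and is followed by the parameter name, which is the one defined in command-plugins.conf; the value after the equals sign is the custom port.
When one service applies to many hosts, define it with apply Service instead: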
    apply Service "total_procs" {
      import "generic-service"
      check_command = "nrpe"                    # use nrpe command to check
      vars.nrpe_command = "check_total_procs"   # command on client server
      assign where "es" in host.groups
      ignore where host.address == ""
    }

    apply Service "http 80" {
      import "generic-service"
      check_command = "http"   # command on monitor server which has argument "-H"
      assign where "vu" in host.groups
      ignore where host.address == ""
    }
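An apply rule always needs assign where; ignore where may be left out. Several ignore where lines are allowed (I did not manage to put them on a single line).
The two service definitions work the same way: each relies on a plugin, check_nrpe or check_http, and the command names nrpe and http used here are already defined in command-plugins.conf.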
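Several conditions can normally be combined on one line with the DSL's logical operators || and &&. The sketch below is an assumption-laden variant of the rules above: the service name load and the excluded host name hk are picked only for illustration.

```
apply Service "load" {
  import "generic-service"
  check_command = "nrpe"
  vars.nrpe_command = "check_load"

  // Either group qualifies; parentheses are only there for readability
  assign where ("es" in host.groups) || ("vu" in host.groups)

  // One line instead of two separate "ignore where" lines
  ignore where host.address == "" || host.name == "hk"
}
```

Separate assign where lines behave like a logical OR, and a host is skipped as soon as any ignore where expression matches, so both spellings should give the same result.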
groups.conf: service groups and host groups
    object ServiceGroup "load" {
      display_name = "Load Checks"
      assign where service.vars.nrpe_command == "check_load"
    }

    object ServiceGroup "ssh" {
      display_name = "Ssh Checks"
      assign where service.check_command == "ssh"
    }

    object HostGroup "es" {
      display_name = "es server"
    }
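Host groups can use assign where too, so that hosts join a group automatically instead of listing it in their groups attribute. In this sketch the group name and the custom variable host.vars.os are assumptions; the variable has to be set on the hosts for the rule to match anything.

```
object HostGroup "linux-servers" {
  display_name = "Linux Servers"

  // Every host that sets vars.os = "Linux" becomes a member
  assign where host.vars.os == "Linux"
}
```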
notifications.conf: applying notifications (the templates were created earlier; here they are applied)
    apply Notification "mail-icingaadmin" to Host {
      import "mail-host-notification"
      user_groups = [ "icingaadmins" ]
      assign where host.vars.sla == "24x7"
    }

    apply Notification "mail-icingaadmin-5" to Service {
      import "5mins-notification"
      user_groups = [ "icingaadmins" ]
      assign where service.name == "ssh"
      assign where service.name == "check_system_5"
      assign where service.name == "zombie_procs"
      assign where service.name == "http 80"
    }
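A long run of assign where lines on the same attribute can also be written with the in operator and an array. The sketch below is meant as a replacement for the service rule above, not something to load next to it, since two apply rules with the same name would clash.

```
apply Notification "mail-icingaadmin-5" to Service {
  import "5mins-notification"
  user_groups = [ "icingaadmins" ]

  // Matches when the service name is any one of the listed values
  assign where service.name in [ "ssh", "check_system_5", "zombie_procs", "http 80" ]
}
```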
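The Icinga 2 configuration syntax goes far beyond what I have written above: it supports regular expressions and all kinds of macro variables, and it is very flexible.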
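As a small taste of that flexibility, the sketch below selects hosts by name pattern and passes a runtime macro to the check. The rule name, the web* naming scheme and the use of http_vhost are assumptions for illustration; match() does wildcard matching and regex() takes a regular expression.

```
apply Service "http-by-name" {
  import "generic-service"
  check_command = "http"

  // Runtime macro: resolved per host when the check runs
  vars.http_vhost = "$host.name$"

  assign where match("web*", host.name)                // wildcard match on the host name
  ignore where regex("^web-test[0-9]+$", host.name)    // regular expression, skips test machines
}
```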
Validating and loading the configuration
/etc/init.d/icinga2 reload (this checks the configuration automatically)