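Use case: one of a server's NICs on the internal network fails frequently while its public NIC works normally, or you want a healthy server to monitor an unstable one. Using this script as a base, you can adapt it on a multi-line gateway or *** node to detect the network state automatically and switch routes, or to restart the NIC when severe packet loss is detected.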
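How it works: ping is run with options that control its output, the packet loss rate is extracted from the summary and compared against a threshold, and notifications are sent with the sendmail command. The system mail service must therefore be running; it is usually enabled by default.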
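The loss-rate extraction can be tried on its own, without touching the network. The following is a minimal sketch, not the script's exact code: the helper name `loss_percent` and the sample summary line are assumptions for illustration, modeled on the summary that Linux `ping -q` typically prints.

```shell
#!/bin/bash
# loss_percent: hypothetical helper (not part of the original script).
# Prints the integer packet-loss percentage found in a ping summary line,
# or 100 when the line contains no "packet loss" field (e.g. ping failed).
loss_percent() {
    local summary=$1 pct
    pct=$(printf '%s\n' "$summary" | awk -F, '{
        for (k = 1; k <= NF; k++)
            if ($k ~ /packet loss/) {   # e.g. " 40% packet loss"
                sub(/%.*/, "", $k)      # keep only the number
                print int($k)           # drop spaces and any decimals
            }
    }')
    echo "${pct:-100}"
}

# Sample summary line (assumed format); prints 40
loss_percent "5 packets transmitted, 3 received, 40% packet loss, time 4006ms"
```

Searching for the field that contains "packet loss" rather than counting fields from the end keeps the sketch working when ping adds an extra "+N errors" field, and the fallback to 100 treats a missing summary as a total failure.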
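Mail alerting behavior: an alert is sent when a problem is detected; if the problem persists, the alert is repeated after an interval that can be set in a variable; a notification is also sent when the NIC recovers.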
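These three rules amount to a small state machine driven by two values: how many consecutive checks have failed, and whether an alert is currently outstanding. The sketch below isolates that decision so it can be read and tested without sending mail. The function name `alarm_action` and the action labels are assumptions for illustration, and the script itself evaluates the repeat rule in a separate `if` rather than in one chain.

```shell
#!/bin/bash
# alarm_action: hypothetical helper illustrating the three notification rules.
#   $1 = consecutive failed checks
#   $2 = 1 if an alert is outstanding, otherwise 0
#   $3 = number of failed checks after which the alert is repeated
alarm_action() {
    local fails=$1 alarmed=$2 repeat_after=$3
    if [ "$fails" -eq 0 ] && [ "$alarmed" -eq 1 ]; then
        echo recovered        # healthy again: send the "ok" mail
    elif [ "$fails" -eq 1 ] && [ "$alarmed" -eq 0 ]; then
        echo first_alert      # first failed check: send the alert mail
    elif [ "$fails" -ge "$repeat_after" ] && [ "$alarmed" -eq 1 ]; then
        echo repeat_alert     # problem persists: send the reminder mail
    else
        echo none             # nothing to report on this cycle
    fi
}

alarm_action 1 0 75    # prints first_alert
alarm_action 0 1 75    # prints recovered
```

Keeping the decision separate from the mail commands makes it easy to check every combination of counter and flag before wiring in sendmail.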
Usage: edit the variables to suit your environment, save the script as /root/sh/mon-eth.sh, and make it executable:
chmod u+x /root/sh/mon-eth.sh
Run: nohup /bin/bash /root/sh/mon-eth.sh >> /var/log/mon-eth.log 2>&1 &
#!/bin/bash
#############################################
# author zhao yanan
# date 2012/09/14
# update 2012/09/22 Improve function
# update 2012/09/24 Improve function
# update 2012/12/06 Increased variable settings and log output
#############################################
# Execution:
# nohup /bin/bash /root/sh/mon-eth.sh >> /var/log/mon-eth.log 2>&1 &

## env ############
localip=192.168.0.2
remoteip=192.168.0.1
servername="dbserver"
eth=eth0
packet_loss_percentage=60 # Packet loss percentage, alarm threshold
repeat_alarm_time=75 # Repeat alarm interval, in loop cycles (about 24 seconds each; 75 is about half an hour)
interval=20 # Detection interval (seconds)
mailfromadd='server1<[email protected]>'
mailtoadd='user1<[email protected]>'
mailccadd='user2<[email protected]>'

export LANG=C
export LC_ALL=C
export PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

###### check #################################
echo "Start monitoring."

# First alert: packet loss reached the threshold
function mailto() {
/usr/sbin/sendmail -t <<EOF
From: $mailfromadd
To: $mailtoadd
Cc: $mailccadd
Subject: $servername $eth $localip packet loss

----------------------------------
$eth $localip packet loss,
`date`
$i
----------------------------------
EOF
}

# Repeat alert: the problem has persisted for repeat_alarm_time cycles
function mailto2() {
/usr/sbin/sendmail -t <<EOF
From: $mailfromadd
To: $mailtoadd
Cc: $mailccadd
Subject: $servername $eth $localip packet loss (too many times)

----------------------------------
$eth $localip packet loss (too many times)
`date`
$i
----------------------------------
EOF
}

# Recovery notice
function mailto3() {
/usr/sbin/sendmail -t <<EOF
From: $mailfromadd
To: $mailtoadd
Cc: $mailccadd
Subject: $servername $eth $localip ok

----------------------------------
$eth $localip ok
`date`
$i
----------------------------------
EOF
}

m=0 # consecutive checks at or above the loss threshold
n=0 # 1 while an alert is outstanding, 0 otherwise
echo "$m" > /tmp/mon-"$eth"-m
echo "$n" > /tmp/mon-"$eth"-n

while true
do
    # i: ping summary line; j: packet loss percentage taken from it
    i=`ping $remoteip -I $eth -i 1 -c 5 -W 1 -w 5 -q | grep "packet loss"`
    j=`echo "$i" | awk -F, '{print $(NF-1);}' | awk -F% '{print $1}'`
    j=${j:-100} # no summary line (ping failed outright): count as 100% loss

    if [ $j -ge $packet_loss_percentage ]; then
        echo `date` "$i"
        m=$(($m+1))
        echo "$m" > /tmp/mon-"$eth"-m
    else
        m=0
        echo "$m" > /tmp/mon-"$eth"-m
        echo "`date` $eth ok."
    fi

    if [ "$m" -eq 0 ] && [ "$n" -eq 1 ]; then
        # Recovered after an alert
        echo `date` "$i"
        mailto3
        echo "$eth ok, mail notification has been sent."
        n=0
        echo "$n" > /tmp/mon-"$eth"-n
    elif [ "$m" -eq 1 ] && [ "$n" -eq 0 ]; then
        # First failed check
        mailto
        echo "$eth packet loss, mail notification has been sent."
        n=1
        echo "$n" > /tmp/mon-"$eth"-n
    fi

    if [ "$m" -ge $repeat_alarm_time ] && [ "$n" -eq 1 ]; then
        # Problem persists: remind, then restart the count
        mailto2
        echo "$eth packet loss, mail notification is sent again."
        m=2
        echo "$m" > /tmp/mon-"$eth"-m
    fi

    sleep $interval
done
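The comment on repeat_alarm_time gives one loop cycle as about 24 seconds. A rough check of that figure and of the resulting reminder delay is sketched below. The function name `repeat_delay_minutes` and the 4-second ping duration are assumptions: five packets sent one second apart take roughly 4 seconds when replies arrive, and the `-w 5` deadline caps the run at 5 seconds.

```shell
#!/bin/bash
# repeat_delay_minutes: hypothetical helper for estimating the reminder delay.
#   $1 = sleep interval (s), $2 = assumed ping run time (s), $3 = cycles to wait
repeat_delay_minutes() {
    local interval=$1 ping_seconds=$2 cycles=$3
    local cycle=$((interval + ping_seconds))   # length of one loop pass
    echo $((cycles * cycle / 60))              # whole minutes, rounded down
}

# With the script's defaults: (20 + 4) s * 75 cycles = 1800 s; prints 30
repeat_delay_minutes 20 4 75
```

Because the script sets the counter back to 2 after each reminder, later reminders arrive after 73 cycles instead of 75, which is slightly under half an hour with the same assumptions.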