後臺開發順手運維的筆記

寫在前頭

github地址,歡迎大家來點贊,會持續更新…

https://github.com/emaste-r/backend_note

章節目錄

正在運行的業務進程CPU調優

Java調優

先找到CPU異常的PID:

[root@iZ9458z0ss9Z ~]# top -c

然後查看該進程下的線程:

[root@iZ9458z0ss9Z ~]# top -Hp YOUR_PID

然後查看異常的線程:

[root@iZ9458z0ss9Z ~]# jstack YOUR_THREAD_PID

如此,便能查看該線程到底在執行到哪一段代碼會導致CPU異常:

[root@iZ94won0vbvZ ~]# printf '%x\n' 15589
3ce5
[root@iZ94won0vbvZ ~]# jstack 15589 | grep "3ce" -C 5

"qtp540159270-15" #15 prio=5 os_prio=0 tid=0x00007f312878e000 nid=0x3cfd waiting on condition [0x00007f311115f000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000e002b348> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
--
        at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:531)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.access$700(QueuedThreadPool.java:47)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:590)
        at java.lang.Thread.run(Thread.java:745)

可能會出現如下錯誤:

Unable to open socket file: target process not responding or HotSpot VM not loaded

原因:

jstack找不到pid文件。

解決方案:

1、要麼是因爲pid文件放在/tmp文件夾下,然而tmpwatch策略導致系統定時清理掉/tmp文件夾下的長時間無訪問的文件...
   只能重啓才能重新生成pid;
2、也有可能pid文件被自定義設置到別處,那麼jstack /data/you/path/your.pid即可。

Python調優

本人版本:

centos7+python2.7.5   

安裝gdb:

[root@iZ9458z0ss9Z ~]# sudo yum install gdb

出現如下信息說明安裝成功:

[root@ouyang ~]# gdb
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7_4.1
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
(gdb) 

安裝python包(才能使用py-list、py-bt等操作):

wget http://debuginfo.centos.org/6/x86_64/python27-python-debuginfo-2.7.5-10.el6.centos.alt.x86_64.rpm
rpm -ivh python27-python-debuginfo-2.7.5-10.el6.centos.alt.x86_64.rpm

出現如下信息說明python-debuginfo包安裝成功:

[root@iZ94won0vbvZ ~]# gdb -p 31618 # 31618 是某個python進程的PID

(gdb) py
py-bt               py-down             py-list             py-locals           py-print            py-up               python              python-interactive  

開始調試,查看當前線程運行的代碼:

(gdb) py-list
 858                        # clear alarm so it doesn't fire while poll is waiting for
 859                        # events.
 860                        signal.setitimer(signal.ITIMER_REAL, 0, 0)
 861    
 862                    try:
>863                        event_pairs = self._impl.poll(poll_timeout)
 864                    except Exception as e:
 865                        # Depending on python version and IOLoop implementation,
 866                        # different exception types may be thrown and there are
 867                        # two ways EINTR might be signaled:
 868                        # * e.errno == errno.EINTR
(gdb) 

Mysql日常優化

Mysql查看慢查詢日誌

首先要判斷mysql是否開啓慢查詢、慢查詢日誌存放位置:

[root@iZ9458z0ss9Z ~]#  cat /etc/my.cnf| grep slow           
slow_launch_time=2  # 表示如果建立線程花費了比這個值更長的時間,slow_launch_threads 計數器將增加
slow_query_log=on   # 開啓 慢查詢
slow_query_log_file=/data/log/your_slow_query.log  # log文件存放位置

萬一沒開啓,怎麼開啓呢?

mysql> set global slow_query_log='ON'; 
mysql> set global slow_query_log_file='/usr/local/mysql/data/slow.log';
mysql> set global long_query_time=1;  # 超過一秒的查詢會被記錄

查看慢查詢日誌,主要關心參數:
Query_time = 查詢耗費多少秒 = 3.088468
Rows_examined = 影響的行數 = 43817

[root@iZ9458z0ss9Z ~]# tail -f /data/log/your_slow_query.log
# Time: 180301 16:17:20
# User@Host: root[root] @ iZ9458z0ss9Z [10.24.245.83]
# Thread_id: 9601563  Schema: miaoyan  QC_hit: No
# Query_time: 3.088468  Lock_time: 0.000084  Rows_sent: 1  Rows_examined: 43817
# Rows_affected: 0
SET timestamp=1519892240;
select count(id) as cnt from `your_table` where com_id = 1867 and ptype = 1;

Mysql查看正在執行的sql

查看當前的sql執行情況:

mysql> show processlist;
+----------+---------+----------------------+----------------------+---------+-------+----------+------------------+
| Id       | User    | Host                 | db                   | Command | Time  | State    | Info             |
+----------+---------+----------------------+----------------------+---------+-------+----------+------------------+
| 15490667 | seafile | localhost:45682      | ccnet-db             | Sleep   |    23 |          | NULL             |
| 15490668 | seafile | localhost:45684      | seafile-db           | Sleep   |     3 |          | NULL             |
| 15496875 | seafile | localhost:60158      | ccnet-db             | Sleep   |  1530 |          | NULL             |
| 15498263 | seafile | localhost:35184      | seafile-db           | Sleep   |   630 |          | NULL             |
| 9604402  | root    | iZ9458z0ss9Z:22193   | your_db              | Query   |     1 | Sending data                                                                  | select count(id) as cnt from `your_table` where com_id = 1769 and ptype=1 |

Sleep狀態的項不需要管,如果看到一個Query狀態的sql,Time(秒)還很大,可能卡住了,個人感覺select的sql如果卡住了,可以先kill掉:

kill 9604402

再慢慢分析是否有索引:

explain select count(id) as cnt from `your_table` where com_id = 1769 and ptype=1;

如果是Update的sql,那就還是慢慢等執行完畢吧,一般不會耗費太多時間,除非你全表update。

Nginx查看訪問最頻繁的IP並禁止某個異常IP

查看排名前10個訪問最頻繁的IP:

[root@iZ9458z0ss9Z ~]# awk '{a[$1]+=1;}END{for(i in a){print a[i] " " i;}}' /var/log/nginx/access.log |sort -gr | head -10                          
11929 123.44.55.66
9727 119.137.52.231
8132 111.18.73.48
2926 115.191.176.18
2407 183.198.212.108
2322 218.11.141.142
2257 183.225.60.104
2229 183.40.130.198
1990 115.207.217.131
1965 221.180.236.146
[root@iZ9458z0ss9Z ~]# 

firewall-cmd

假設我們感覺這個123.44.55.66太可疑了,如何用firewall禁用:
先查看firewall是否開啓:

[root@iZ9458z0ss9Z ~]# firewall-cmd --state
running

禁用123.44.55.66:

[root@iZ9458z0ss9Z ~]# firewall-cmd --permanent --add-rich-rule="rule family='ipv4' source address='123.44.55.66' reject"
success

重啓生效:

[root@iZ9458z0ss9Z ~]# firewall-cmd --reload
success
[root@iZ9458z0ss9Z ~]# iptables -L | grep '123'
REJECT     all  --  123.44.55.66         anywhere             reject-with icmp-port-unreachable

解封IP,把–add-rich-rule改成–remove-rich-rule:

[root@iZ9458z0ss9Z ~]# firewall-cmd --permanent --remove-rich-rule="rule family='ipv4' source address='123.44.55.66' reject"

iptables

如果未安裝firewall-cmd,那麼可以試試iptables,先看iptables服務跑起來沒:

[root@iZ94won0vbvZ ~]# service iptables status

如果沒有跑起來:

重啓後生效
    開啓: chkconfig iptables on
    關閉: chkconfig iptables off
即時生效,重啓後失效
    開啓: service iptables start
    關閉: service iptables stop

添加拒絕ip的規則:

[root@iZ9458z0ss9Z ~]# cp /etc/sysconfig/iptables-config  /var/tmp   //先保存原先配置
[root@iZ9458z0ss9Z ~]# iptables -I INPUT -s 123.44.55.66 -j DROP //添加拒絕的ip
[root@iZ9458z0ss9Z ~]# service iptables save  //保存配置
[root@iZ9458z0ss9Z ~]# service iptables restart //重啓防火牆

查看下添加成功否:

[root@iZ94won0vbvZ ~]# iptables -L  //查看iptables
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
DROP       all  --  123.44.55.66         anywhere            
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED

如果發現是個誤會,想解封123.44.55.66:

[root@iZ94won0vbvZ ~]# iptables -D INPUT -s 123.44.55.66 -j DROP
[root@iZ94won0vbvZ ~]# iptables -L | grep 123.44.55   //發現找不到和123.44.55.66相關的規則了,解封成功!
[root@iZ94won0vbvZ ~]# 

監控硬盤並郵件報警

我們查看硬盤使用率 df -h:

[root@iZ9458z0ss9Z script]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        40G   14G   24G  36% /
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.9G   28K  3.9G   1% /dev/shm
tmpfs           3.9G  408M  3.5G  11% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/vdb        197G  124G   64G  66% /data
tmpfs           783M     0  783M   0% /run/user/0

如果從未配置過mail:

[root@001 ~]# vim /etc/mail.rc #添加如下內容
set from=xxxx@126.com  # from:對方收到郵件時顯示的發件人
set smtp=smtp.126.com  # smtp:指定第三方發郵件的smtp服務器地址
set smtp-auth-user=xx@126.com  # set smtp-auth-user:第三方發郵件的用戶名
set smtp-auth-password=xxx  # set smtp-auth-password:用戶名對應的密碼,有些郵箱填的是授權碼
set smtp-auth=login # smtp-auth:SMTP的認證方式,默認是login,也可以改成CRAM-MD5或PLAIN方式

測試下:

[root@iZ9458z0ss9Z script]# echo "測試郵件" | mail -s "測試主題" xihuanjianguo@163.com

寫發送郵件的腳本:

[root@iZ9458z0ss9Z script]# cat sendmail.sh 
#!/bin/bash
# by kin
source /etc/profile
content=$1
address=$2
echo ${content} | mail -s ${content} ${address}

寫監控硬盤腳本:

[root@iZ9458z0ss9Z ~]# cat /data/your_script_path/hdd_alarm.sh 
#!/bin/sh
# by kin
source /etc/profile;
runpath='/data/your_script_path'

# 定義send函數來調用我們的sendmail.sh腳本
send()
{
  ./sendmail.sh $1 your_email_1@163.com
  ./sendmail.sh $1 your_email_2@163.com
}

# 如果不存在hdd_alarm文件夾則創建之
if [ ! -d  $runpath/hdd_alarm ]
then
        mkdir $runpath/hdd_alarm
fi

# 獲取本機的IP地址
Host_IP=`ifconfig |grep "inet "|grep -v 127.0.0.1|grep -v 'inet 10.'|grep -v '192.168'|awk '{print $2}'`


# 獲取df硬盤容量參數的後兩列使用率a和路徑b,遍歷a和b
df -h | awk '{print $(NF-1),$NF}'|awk '{if($NF!=$1){print $0}}'|sed 's/\%//g'|sed '1d'|while read a b;
do

# 如果a和b有一個是空字符串則跳過
if [ -z $a ] || [ -z $b ] 
then      
        :
else
        # 從日誌中獲取上一次報警的使用率
        old_used=`cat $runpath/hdd_alarm/hdd_tmp.log 2>/dev/null|grep "$b$"|awk '{print $(NF-1)}'|sed 's/\%//g'`

        # 如果使用率不超過80 或者 相比上一次沒有增加,則跳過
        if [ ! -z $old_used ] && [ $old_used -eq $a ] &&  [ $a -le 80 ]
        then
                continue;
        fi

        # 如果使用率確實>=80,則調用send()發送郵件...
        if  [ $a -gt 80  ] ; 
        then
                msg="Hdd_alarm:$Host_IP-$b-Reach-$a-precent"
                echo "$msg" #>>./Hdd_alarm.log
                send "$msg"

                # 如果調用send()的返回值不是0,說明有錯誤
                if [ $? -ne 0 ]; 
                then
                        echo -e "############Send Fail#############"
                fi
        fi


fi
done

# 把df信息存入到日誌中,方便下一次對比
df -h >$runpath/hdd_alarm/hdd_tmp.log

放到定時任務,每15分鐘跑一次:

[root@iZ9458z0ss9Z script]# crontab  -l
*/15 * * * *  cd /data/your_script_path/;./hdd_alarm.sh >/dev/null 2>&1
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章