A routine log check showed that the company's web server was repeatedly logging the following error:
Jul 5 15:40:37 mail kernel: printk: 272 messages suppressed.
Jul 5 15:40:37 mail kernel: TCP: time wait bucket table overflow
Jul 5 15:40:37 mail kernel: TCP: time wait bucket table overflow
Jul 5 15:40:43 mail kernel: printk: 92 messages suppressed.
Jul 5 15:40:43 mail kernel: TCP: time wait bucket table overflow
(That is, the kernel's table of TIME_WAIT sockets has overflowed.)
Troubleshooting steps:
1. Check the server's network connections, counted by TCP state
[root@mail ~]# netstat -pant |awk '/^tcp/ {++state[$6]} END {for(key in state) printf("%-10s\t%d\n",key,state[key]) }'
TIME_WAIT 4944
CLOSE_WAIT 1
FIN_WAIT1 93
FIN_WAIT2 66
ESTABLISHED 292
SYN_RECV 29
CLOSING 32
LAST_ACK 9
LISTEN 14
[root@mail ~]#
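To make the comparison in this step explicit, here is a minimal sketch that sets the TIME_WAIT count against the configured limit. `tw_usage` is a hypothetical helper name, not a standard tool. It assumes `netstat -ant` style input with the state in the sixth column, as in the output above. The count it derives from netstat may differ slightly from the kernel's own internal counter.

```shell
# tw_usage (hypothetical helper): reads `netstat -ant` style output on stdin
# and reports how many sockets are in TIME_WAIT relative to the given limit.
tw_usage() {
    limit=$1
    awk -v limit="$limit" '
        /^tcp/ && $6 == "TIME_WAIT" { n++ }    # state is column 6
        END {
            pct = (limit > 0 ? n * 100 / limit : 0)
            printf "TIME_WAIT %d of %d (%d%%)\n", n, limit, pct
        }
    '
}

# Live check, only run when netstat and the /proc entry are available.
limit_file=/proc/sys/net/ipv4/tcp_max_tw_buckets
if command -v netstat >/dev/null 2>&1 && [ -r "$limit_file" ]; then
    netstat -ant | tw_usage "$(cat "$limit_file")"
fi
```

With the figures above (4944 sockets in TIME_WAIT against a limit of 5000), the server was already running very close to the limit.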
2. Check the kernel parameters
vi /etc/sysctl.conf
Change: net.ipv4.tcp_max_tw_buckets = 5000
to:     net.ipv4.tcp_max_tw_buckets = 10000
3. Apply the changed kernel parameters
sysctl -p
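After reloading, it is worth confirming that the running kernel picked up the new value. The sketch below compares the value written in the file with the value the kernel reports. `conf_value` is a hypothetical helper name. It assumes plain `key = value` lines and takes the last assignment in the file, since later lines override earlier ones when the file is loaded.

```shell
# conf_value (hypothetical helper): print the last value assigned to a key
# in a sysctl.conf-style file, ignoring comment lines.
conf_value() {
    key=$1
    file=$2
    awk -F= -v k="$key" '
        /^[[:space:]]*[#;]/ { next }               # skip comments
        {
            name = $1
            gsub(/[[:space:]]/, "", name)
            if (name == k) {
                val = $2
                gsub(/[[:space:]]/, "", val)
                found = val                         # last assignment wins
            }
        }
        END { if (found != "") print found }
    ' "$file"
}

# Live comparison, only run when the file and the sysctl command exist.
if [ -r /etc/sysctl.conf ] && command -v sysctl >/dev/null 2>&1; then
    echo "configured: $(conf_value net.ipv4.tcp_max_tw_buckets /etc/sysctl.conf)"
    echo "running:    $(sysctl -n net.ipv4.tcp_max_tw_buckets 2>/dev/null)"
fi
```

If the two values differ, the file was probably edited without being reloaded, or another configuration file sets the same key.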
4. Check the server's network connections again
[root@mail ~]# netstat -pant |awk '/^tcp/ {++state[$6]} END {for(key in state) printf("%-10s\t%d\n",key,state[key]) }'
TIME_WAIT 6644
CLOSE_WAIT 1
FIN_WAIT1 93
FIN_WAIT2 66
ESTABLISHED 292
SYN_RECV 29
CLOSING 32
LAST_ACK 9
LISTEN 14
5. Check /var/log/messages and dmesg again
The error no longer appears, so net.ipv4.tcp_max_tw_buckets = 10000 seems to be enough for now.
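The log check in this step can be scripted so it is easy to repeat later. This is a minimal sketch; `count_tw_overflow` is a hypothetical helper name, and the pattern is taken from the log lines quoted at the top of this note.

```shell
# count_tw_overflow (hypothetical helper): count kernel log lines on stdin
# that report a TIME_WAIT bucket table overflow.
count_tw_overflow() {
    awk '/TCP: time wait bucket table overflow/ { n++ } END { print n + 0 }'
}

# Live check, only run when the log file is readable.
if [ -r /var/log/messages ]; then
    count_tw_overflow < /var/log/messages
fi
```

The same function works on the kernel ring buffer with `dmesg | count_tw_overflow`. Because printk suppresses repeated messages (see the "messages suppressed" lines in the log above), the count is a lower bound on how often the overflow actually happened.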
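6. Cause
The number of sockets in the TIME_WAIT state reached the kernel's configured maximum, net.ipv4.tcp_max_tw_buckets (5000 on this server). Once that limit is hit, the kernel destroys additional TIME_WAIT sockets immediately and logs this warning.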