Xen VM hangs and the host appears frozen: tracing the problem, the full thought process

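The affected host runs in a XenServer 6.5 cluster. One day I found that one of the VMs could not be reached. I figured I would simply restart the VM from XenServer, but even a forced reboot failed, so I logged in to the host and checked its disk space: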

[root@VIP-XS-08 cron.d]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              20G  20G   0  100% /
none                  7.8G  2.0M  7.8G   1% /dev/shm

The host's root filesystem was full. OK, time to free up some space. But running the following commands showed this:

[root@VIP-XS-08 /]# cd /
[root@VIP-XS-08 /]# du -sh *
5.7M    bin
24M     boot
2.1M    cli-rt
3.3M    dev
7.4M    etc
28K     EULA
4.0K    home
118M    lib
20M     lib64
16K     lost+found
4.0K    media
4.0K    mnt
554M    opt
du: cannot read directory `proc/7020': No such file or directory
du: cannot read directory `proc/7021': No such file or directory
0       proc
12K     Read_Me_First.html
102M    root
24M     sbin
4.0K    selinux
4.0K    srv
0       sys
1.6M    tftpboot
68K     tmp
542M    usr
2.6G    var
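
Adding up the du lines by eye works, but the comparison against df can also be scripted. The following is a minimal sketch and not part of the original session; the function name hidden_space_kb is made up for illustration, and it assumes a df that accepts -P and -k and a du that accepts -x (as GNU coreutils does).

```shell
# Compare what df reports as used with what du can actually reach.
# Pass a mount point; defaults to /. All figures are in KiB.
hidden_space_kb() {
    local mount=${1:-/} df_used du_used
    # -P forces one line per filesystem, so the third column is always "Used"
    df_used=$(df -Pk "$mount" | awk 'NR==2 {print $3}')
    # -x keeps du on this one filesystem, so /proc, /sys and other mounts are skipped
    du_used=$(du -skx "$mount" 2>/dev/null | awk '{print $1}')
    echo "df=${df_used} du=${du_used} hidden=$((df_used - du_used))"
}
```

Run as `hidden_space_kb /`. A large positive "hidden" value suggests space held by files that were deleted while still open, or by files hidden underneath a mount point.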

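Well, the files that du can see add up to only about 4 GB, nowhere near the 20 GB that df reports as used. So where did the rest of the space go? Most likely some files were deleted while a process still had them open, so their space was never released. The next command shows which files are deleted but still in use: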

[root@VIP-XS-08 cron.d]#  ls -l /proc/[0-9]*/fd/* |grep delete 
ls: /proc/29018/fd/255: No such file or directory
ls: /proc/29018/fd/3: No such file or directory
l-wx------ 1 root   root   64 Nov 14 13:14 /proc/22020/fd/2 -> /tmp/stunnelbd3855.log (deleted)
l-wx------ 1 root   root   64 Nov 14 13:27 /proc/24758/fd/2 -> /tmp/stunnel1bc930.log (deleted)
lrwx------ 1 root   root   64 Nov 14 11:03 /proc/4555/fd/6 -> /tmp/tmpfLfGwGG (deleted)
lrwx------ 1 root   root   64 Nov 14 11:03 /proc/4556/fd/6 -> /tmp/tmpfLfGwGG (deleted)
l-wx------ 1 root   root   64 Nov 14 11:03 /proc/4587/fd/5 -> /var/run/openvswitch/ovs-xapi-sync.pid.tmp4587 (deleted)
l-wx------ 1 root   root   64 Nov 14 11:03 /proc/4587/fd/12 ->  /var/log/blktap/tapdisk.2345.log (deleted)
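
The ls | grep pipeline shows which files are deleted but still open, though not how much space each one holds. Below is a sketch that adds the size, assuming a Linux /proc and GNU stat; the function name list_deleted_open_files is hypothetical and not from the original session.

```shell
# Print "size_in_bytes <TAB> pid <TAB> path (deleted)" for every deleted file
# that some process still holds open, largest first.
list_deleted_open_files() {
    local fd target pid size
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails if the process or descriptor vanished in the meantime
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *' (deleted)') ;;
            *) continue ;;
        esac
        # Extract the PID from /proc/<pid>/fd/<n>
        pid=${fd#/proc/}
        pid=${pid%%/*}
        # stat -L follows the /proc link to the inode that is still open
        size=$(stat -L -c %s "$fd" 2>/dev/null) || continue
        printf '%s\t%s\t%s\n' "$size" "$pid" "$target"
    done | sort -rn
}
```

Run as root so that the descriptors of every process are readable. The first line of output is the largest deleted file still held open.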

After going through them one by one, the most likely culprit was /var/log/blktap/tapdisk.2345.log (deleted).

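The name tapdisk.2345.log tells us this is the log file of the tapdisk process with PID 2345. It mainly records what tapdisk observes while monitoring the disk image, with entries like the following: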

Aug 21 17:55:06: [17:55:06.597] tapdisk_vbd_check_progress: vhd:/dev/VG_XenStorage-39d05ede-4cd6-6dd0-4263-f8dbe2949580/VHD-2e957900-09c5-4e8d-9ba1-c9e17f78f519: watchdog timeout: pending requests idle for 60 seconds
Aug 21 17:55:06: [17:55:06.597] tapdisk_vbd_check_progress: vhd:/dev/VG_XenStorage-39d05ede-4cd6-6dd0-4263-f8dbe2949580/VHD-2e957900-09c5-4e8d-9ba1-c9e17f78f519: watchdog timeout: pending requests idle for 60 seconds
Aug 21 17:55:06: [17:55:06.921] tapdisk_vbd_check_progress: vhd:/dev/VG_XenStorage-39d05ede-4cd6-6dd0-4263-f8dbe2949580/VHD-2e957900-09c5-4e8d-9ba1-c9e17f78f519: watchdog timeout: pending requests idle for 60 seconds
Aug 21 17:55:06: [17:55:06.925] tapdisk_vbd_check_progress: vhd:/dev/VG_XenStorage-39d05ede-4cd6-6dd0-4263-f8dbe2949580/VHD-2e957900-09c5-4e8d-9ba1-c9e17f78f519: watchdog timeout: pending requests idle for 60 seconds
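
To get a feel for how fast such entries pile up, they can be counted per disk image. This is a rough sketch that assumes every line has exactly the layout shown above (the image name is the sixth whitespace-separated field); the function name count_watchdog_timeouts is hypothetical.

```shell
# Read tapdisk log lines on stdin and print "<count> <image>" for each image
# that logged a watchdog timeout.
count_watchdog_timeouts() {
    awk '/tapdisk_vbd_check_progress: .*watchdog timeout/ {
             sub(/:$/, "", $6)   # strip the trailing colon after the image name
             n[$6]++
         }
         END { for (img in n) print n[img], img }'
}
```

It reads from standard input, so it can be fed any copy of the log, for example `count_watchdog_timeouts < some-tapdisk.log`, where the file name is a placeholder.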


So why would a hung Xen VM cause the problems we started with: the VM cannot be restarted, the host's disk fills up, and the log file ends up deleted?

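The answer: once the VM hung, the tapdisk process serving it on the host kept writing log entries until the disk was full, and with the host's disk full the VM could not be restarted even when asked to. Meanwhile, once the log grew past the size that triggers log rotation, it was rotated, and when the number of rotated copies exceeded the configured retention limit, the file was deleted. Because tapdisk still had the deleted file open, its space was never released.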

[root@VIP-XS-08 /]# rpm -vV  elasticsyslog
........  c /etc/cron.d/logrotate.cron
........  c /etc/logrotate-xenserver.conf
........    /etc/sysconfig/syslog.elastic
........    /etc/sysconfig/syslog.patch
........    /opt/xensource/bin/delete_old_logs_by_space
........    /opt/xensource/bin/elasticsyslog
........    /opt/xensource/bin/logrotate-xenserver
........    /opt/xensource/bin/rotate_logs_by_size
[root@VIP-XS-08 /]# cat /etc/logrotate.conf
# see "man logrotate" for details
# rotate log files weekly
weekly

# keep 4 weeks worth of backlogs
rotate 4

# create new (empty) log files after rotating old ones
create

# uncomment this if you want your log files compressed
#compress

# RPM packages drop log rotation information into this directory
include /etc/logrotate.d

# no packages own wtmp -- we'll rotate them here
/var/log/wtmp {
    monthly
    minsize 1M
    create 0664 root utmp
    rotate 1
}

/var/log/btmp {
    missingok
    monthly
    minsize 1M
    create 0600 root utmp
    rotate 1
}

# system-specific logs may be also be configured here.


After all that, the fix is simple: stop the process that is holding the deleted file. See /var/log/blktap/tapdisk.2345.log (deleted) above? The PID is 2345. Kill it:

[root@VIP-XS-08 /]# ps -ef |grep 2345
root     18165 15432  0 14:22 pts/37   00:00:00 grep 2345
root     2345     1  0 Jun01 ?        03:10:55 tapdisk
[root@VIP-XS-08 /]# kill 2345
[root@VIP-XS-08 /]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              20G  4.1G   15G  22% /
none                  7.8G  2.0M  7.8G   1% /dev/shm

There, the space is back. At this point the host is healthy again because it has free disk space, and the VM that had hung is now shut down.
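
Where killing the process is not acceptable, a commonly used alternative on Linux is to truncate the deleted file in place through its /proc descriptor link, which frees the space while the process keeps running. This is not what was done here and I have not verified it against tapdisk on XenServer, so treat it as a sketch; the function name truncate_open_fd is hypothetical.

```shell
# Truncate a deleted-but-still-open file via /proc/<pid>/fd/<n>.
# Refuses to touch anything that is not marked "(deleted)".
truncate_open_fd() {
    local pid=$1 fd=$2 link="/proc/$1/fd/$2"
    case "$(readlink "$link" 2>/dev/null)" in
        *' (deleted)')
            # Opening the link for writing with truncation empties the open inode
            : > "$link"
            ;;
        *)
            echo "refusing: $link is not a deleted file" >&2
            return 1
            ;;
    esac
}
```

Usage would be `truncate_open_fd <pid> <fd>`, with the PID and descriptor number read from a /proc/<pid>/fd/<n> line in the ls output shown earlier.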

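Next, start the VM. If the VM belongs to a cluster, the simplest option is to start it on another host. If it is a standalone VM, or you want to start it on the original host, you first need to start tapdisk. That requires a number which you should note down before killing the process. I know of no better way than this: run the command below and save its output, run it again after the kill, and compare the two outputs to find the tapdisk worker process that belongs to the VM.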

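 # List all tapdisk processes
 #ps -ef |grep tap
 # Start the VM's own tapdisk process. Note: the 18 here is what I got by comparing the output of ps -ef |grep tap before and after the kill; it is not a fixed value
 #tapback -d -x 18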
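
The before/after comparison can be made less error-prone by saving both snapshots to files and diffing them. A minimal sketch follows; the function names are hypothetical, and it only automates the comparison, it does not determine which number to pass to tapback. Since the root filesystem is full at this point, the usage example writes to /dev/shm, which df showed as nearly empty.

```shell
# Save a sorted snapshot of processes whose command line mentions "tap".
# The [t] bracket keeps grep from matching its own command line.
snapshot_tap() {
    ps -eo pid,args | grep '[t]ap' | sort > "$1"
}

# Print the lines present in the first snapshot but missing from the second,
# i.e. the processes that disappeared in between.
vanished_lines() {
    comm -23 "$1" "$2"
}

# Intended use (file names are placeholders):
#   snapshot_tap /dev/shm/tap.before
#   kill 2345
#   snapshot_tap /dev/shm/tap.after
#   vanished_lines /dev/shm/tap.before /dev/shm/tap.after
```

Both snapshots are sorted by the same sort invocation, which is what comm needs in order to compare them line by line.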

Once the VM's tapdisk process has been started, you can start the VM normally.


What follows is supplementary material explaining what tapdisk is, for anyone who needs it. It is quoted as-is from the Xen wiki:

url : https://wiki.xen.org/wiki/Blktap

tapdisk, each tapdisk process in userspace is backed by one or several image files

When xend is started the userspace daemon blktapctrl is started, too. When booting the Guest VM the XenBus is initialized as described in XenSplitDrivers. The request for a new virtual disk is propagated to blktapctrl, which creates a new character device and two named pipes for communication with a newly forked tapdisk process. 

After opening the character device the shared memory is mapped to the fe_ring using the mmap system call. The tapdisk process opens the image file and sends information about the image's size back to blktapctrl, which stores it. After this initialization tapdisk executes a select system call on the two named pipes. On an event it checks if the tap-fd is set and if it is, tries to read a request from the frontend ring. 

The XenBus connection between DomU and Dom0 is used by XenStore to negotiate the backend/frontend connection. After the setup of both backend and frontend a shared ring page and an event channel are negotiated. These are used for any further communication between backend and frontend. I/O requests issued in the Guest VM are handled in the Guest OS and forwarded using these two communication channels.


There is a trade-off between delay and throughput which is controlled by modifying the number of requests until the blktap driver is notified. 

The blktap driver notifies the appropriate blktapctrl or tapdisk process depending on the event type by returning the poll and waking up the tapdisk process respectively. The shared frontend ring works as described in the ring.h. 

tapdisk reads the request from the frontend ring and in case of synchronous I/O reads and immediately returns the request. In case of asynchronous I/O a batch of requests is submitted to Linux AIO subsystem. Both mechanisms read from the image file. In the asynchronous case it is checked using the non-blocking system call io_getevents if the I/O requests were completed. 

The information about completed requests is propagated in the frontend ring. The blktap driver is notified by the tapdisk process with the ioctl system call. 
Using the same XenSplitDevices mechanism the data is returned to the frontend of the Guest VM.




 

[Figure: blktap diagram]






