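Background:
(1) On our platform, docker mounts with MountFlags=slave by default. One property of this mode is that once a container has been started with such a mount, later changes on the host node are no longer synced into the container.
(2) node-exporter, the tool that collects node monitoring data, needs to mount the host node's root directory. Mounted with MountFlags=slave, changes to the mount information on the node are therefore not synced into node-exporter.
(3) One consequence: when a statefulset migrates from node-1 to node-2, its rbd device is first unmounted on node-1 and then mounted on node-2, but the result is an error: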
Warning FailedMount 25m (x34 over 1h) kubelet, node-2 Unable to mount volumes for pod "fluentd-0_openstack(b4179bc4-3776-11ea-862b-246e965469a8)": timeout expired waiting for volumes to attach/mount for pod "openstack"/"fluentd-0". list of unattached/unmounted volumes=[storage]
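Our initial diagnosis is that the MountFlags=slave mount mode is at fault: the host has unmounted the rbd device, but the mount entry still exists inside the node-exporter container. So it should be enough to unmount the rbd device inside that container; if the statefulset then comes up, docker's default mount mode needs to be changed.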
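To see which propagation mode each mount actually has, the optional fields of /proc/<pid>/mountinfo can be inspected. The following is only a sketch and not part of the original investigation: the function name mount_propagation is made up for illustration, and it relies solely on the documented mountinfo layout, where optional fields such as shared:N and master:N sit between field 6 and the "-" separator.

```shell
#!/bin/sh
# mount_propagation MOUNTINFO_FILE   (hypothetical helper, for illustration)
# Prints "<mountpoint> <propagation>" for every line of a mountinfo file.
# shared:N  -> the mount is in a shared peer group
# master:N  -> the mount is a slave of peer group N
# neither   -> the mount is private
mount_propagation() {
    awk '{
        shared = 0; slave = 0
        # Optional fields start at field 7 and end at the "-" separator.
        for (i = 7; i <= NF && $i != "-"; i++) {
            if ($i ~ /^shared:/) shared = 1
            if ($i ~ /^master:/) slave = 1
        }
        if (shared && slave) prop = "shared,slave"
        else if (shared)     prop = "shared"
        else if (slave)      prop = "slave"
        else                 prop = "private"
        print $5, prop
    }' "$1"
}
```

Assuming the exporter's PID is known (40567 in the example further down), it could be called as `mount_propagation /proc/40567/mountinfo | grep rbd`.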
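Procedure:
In the example below, the host root directory is mounted into prometheus-polling-exporter, the statefulset pod is fluentd, and it migrates from node-1 to node-2.
(1) On node-1, check the rbd device: rbd14 has no mount point, i.e. it is already unmounted on the host.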
[root@node-1 ~]# lsblk | grep rbd14
rbd14 237:0 0 500G 0 disk
(2) Enter the prometheus-polling-exporter pod and check rbd14: it is still mounted there, i.e. host and container are out of sync.
()[root@node-1 /]# mount | grep rbd14
/dev/rbd14 on /host/var/lib/kubelet/plugins/kubernetes.io/rbd/rbd/rbd-image-kubernetes-dynamic-pvc-144d567d-1aaa-11ea-a40d-0a580ae80250 type ext4 (rw,relatime,stripe=1024,data=ordered)
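The comparison in steps (1) and (2) can also be done from the host without entering the pod, because /proc/<pid>/mounts shows the mount table of that process's mount namespace. A minimal sketch, assuming the container's PID is already known; the function name stale_rbd_mounts is hypothetical and not something the original notes used.

```shell
#!/bin/sh
# stale_rbd_mounts HOST_MOUNTS CONTAINER_MOUNTS   (hypothetical helper)
# Prints "<device> <mountpoint>" for every /dev/rbd* entry found in the
# container's mount table whose device is not mounted on the host any more.
stale_rbd_mounts() {
    awk '
        # First file: remember every rbd device the host still has mounted.
        FILENAME == ARGV[1] {
            if ($1 ~ /^\/dev\/rbd/) host[$1] = 1
            next
        }
        # Second file: report rbd devices the host no longer knows about.
        $1 ~ /^\/dev\/rbd/ && !($1 in host) { print $1, $2 }
    ' "$1" "$2"
}
```

Run as root on the node, something like `stale_rbd_mounts /proc/self/mounts /proc/<pid>/mounts` would list the leftover entries, with <pid> being the exporter process.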
(3) The messages log on node-1 reports this error:
Jan 15 17:13:05 node-1 kubelet: E0115 17:13:05.330855 18003 nestedpendingoperations.go:263] Operation for "\"kubernetes.io/rbd/rbd:kubernetes-dynamic-pvc-144d567d-1aaa-11ea-a40d-0a580ae80250\"" failed. No retries permitted until 2020-01-15 17:15:07.330793288 +0800 CST m=+196440.172890846 (durationBeforeRetry 2m2s). Error: "UnmountDevice failed for volume \"pvc-b54ea292-1aa5-11ea-ae82-246e96549cb8\" (UniqueName: \"kubernetes.io/rbd/rbd:kubernetes-dynamic-pvc-144d567d-1aaa-11ea-a40d-0a580ae80250\") on node \"node-1\" : rbd: failed to unmap device /dev/rbd14, error exit status 16, rbd output: rbd: sysfs write failed\nrbd: unmap failed: (16) Device or resource busy\n"
(4) Find the PID of prometheus-polling-exporter on node-1; here it is 40567.
[root@node-1 ~]# ps -ef | grep prometheus-polling-exporter
root 40567 40485 0 17:04 ? 00:00:01 prometheus-polling-exporter
root 46498 8212 0 18:28 pts/44 00:00:00 grep --color=auto prometheus-polling-exporter
(5) Use nsenter to enter that process's mount and PID namespaces.
[root@node-1 ~]# nsenter -t 40567 -m -p
()[root@node-1 /]#
(6) Check the mount entry for rbd14, unmount it, then exit.
()[root@node-1 /]# mount | grep rbd14
/dev/rbd14 on /host/var/lib/kubelet/plugins/kubernetes.io/rbd/rbd/rbd-image-kubernetes-dynamic-pvc-144d567d-1aaa-11ea-a40d-0a580ae80250 type ext4 (rw,relatime,stripe=1024,data=ordered)
()[root@node-1 /]# umount /host/var/lib/kubelet/plugins/kubernetes.io/rbd/rbd/rbd-image-kubernetes-dynamic-pvc-144d567d-1aaa-11ea-a40d-0a580ae80250
()[root@node-1 /]# exit
logout
(7) Run rbd unmap on node-1.
[root@node-1 ~]# rbd unmap /dev/rbd14
fluentd can then umount/mount its pvc normally and start up.
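Steps (5) to (7) could be wrapped into a small helper so they can be repeated for other devices. This is a sketch under assumptions, not the procedure as originally run: the names run and cleanup_stale_rbd and the DRY_RUN variable are invented here, and it assumes an umount binary is available inside the container's mount namespace (it was in the session above).

```shell
#!/bin/sh
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# cleanup_stale_rbd PID MOUNTPOINT DEVICE   (hypothetical helper)
# 1. umount MOUNTPOINT inside the mount namespace of PID
# 2. unmap DEVICE on the host, only if the umount succeeded
cleanup_stale_rbd() {
    pid=$1
    mountpoint=$2
    device=$3
    run nsenter -t "$pid" -m umount "$mountpoint" &&
        run rbd unmap "$device"
}
```

Setting DRY_RUN=1 first only prints the two commands, which makes it easy to review the PID, mount point, and device before anything is actually unmounted.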