Kubernetes Pods or Namespaces Stuck in the Terminating State

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Anyone who has worked with Kubernetes has probably run into this. I will cover Pods and Namespaces separately.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Straight to the point: why does a Pod get stuck in Terminating?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In one sentence: the API Server has marked the object for deletion, but the kubelet, the component that does the actual cleanup, cannot shut down the Pod or its related resources, so it never tells the API Server to remove the object.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Why does that happen? To answer that, first look at the basic Pod termination flow:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"A client (for example kubectl) submits a delete request to the API Server","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":1,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Optionally passing the --grace-period flag","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"On receiving the request, the API Server performs a graceful-deletion check","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":1,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"If graceful deletion is required, it updates the object's metadata.deletionGracePeriodSeconds and metadata.deletionTimestamp fields. If you describe the object at this point, it already shows as Terminating","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"When the kubelet on the Pod's node sees that the Pod is Terminating, it starts the actual shutdown of the Pod","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":1,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"If any container in the Pod defines a preStop hook, the kubelet runs those hooks first","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":1,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The kubelet then triggers the container runtime to send a ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"TERM","attrs":{}}],"attrs":{}},{"type":"text","text":" signal to each container in the Pod","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"While the kubelet starts the graceful shutdown, the control plane removes the Pod from the Endpoints of the target Service. ReplicaSets and other workload resources stop counting the Pod as a valid replica. kube-proxy also removes the Pod's endpoint, so no traffic is routed to it even if it shuts down slowly.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"If the containers exit normally, good. If any container is still running when the grace period expires, the kubelet forces the shutdown: the container runtime sends ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"SIGKILL","attrs":{}}],"attrs":{}},{"type":"text","text":" to every process still running in the Pod","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":6,"align":null,"origin":null},"content":[{"type":"text","text":"Note that while the Pod is being deleted, other parts of the kubelet also clean up the Pod's other resources, such as Volumes. ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Only once everything has been cleaned up","attrs":{}},{"type":"text","text":" does the kubelet tell the API Server to force-delete the Pod object, by setting the Pod's grace period to 0.","attrs":{}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Reference: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Pod's API object is only really deleted after step 6 completes. So what counts as \"everything has been cleaned up\"? Let's look at the source:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"go"},"content":[{"type":"text","text":"// PodResourcesAreReclaimed returns true if all required node-level resources that a pod was consuming have\n// been reclaimed by the kubelet. Reclaiming resources is a prerequisite to deleting a pod from the API server.\nfunc (kl *Kubelet) PodResourcesAreReclaimed(pod *v1.Pod, status v1.PodStatus) bool {\n\tif kl.podWorkers.CouldHaveRunningContainers(pod.UID) {\n\t\t// We shouldn't delete pods that still have running containers\n\t\tklog.V(3).InfoS(\"Pod is terminated, but some containers are still running\", \"pod\", klog.KObj(pod))\n\t\treturn false\n\t}\n\tif count := countRunningContainerStatus(status); count > 0 {\n\t\t// We shouldn't delete pods until the reported pod status contains no more running containers (the previous\n\t\t// check ensures no more status can be generated, this check verifies we have seen enough of the status)\n\t\tklog.V(3).InfoS(\"Pod is terminated, but some container status has not yet been reported\", \"pod\", klog.KObj(pod), \"running\", count)\n\t\treturn false\n\t}\n\tif kl.podVolumesExist(pod.UID) && !kl.keepTerminatedPodVolumes {\n\t\t// We shouldn't delete pods whose volumes have not been cleaned up if we are not keeping terminated pod volumes\n\t\tklog.V(3).InfoS(\"Pod is terminated, but some volumes have not been cleaned up\", \"pod\", klog.KObj(pod))\n\t\treturn false\n\t}\n\tif kl.kubeletConfiguration.CgroupsPerQOS {\n\t\tpcm := kl.containerManager.NewPodContainerManager()\n\t\tif pcm.Exists(pod) {\n\t\t\tklog.V(3).InfoS(\"Pod is terminated, but pod cgroup sandbox has not been cleaned up\", \"pod\", klog.KObj(pod))\n\t\t\treturn false\n\t\t}\n\t}\n\n\t// Note: we leave pod containers to be reclaimed in the background since dockershim requires the\n\t// container for retrieving logs and we want to make sure logs are available until the pod is\n\t// physically deleted.\n\n\tklog.V(3).InfoS(\"Pod is terminated and all resources are reclaimed\", \"pod\", klog.KObj(pod))\n\treturn true\n}\n","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Source: https://github.com/kubernetes/kubernetes/blob/1f2813368eb0eb17140caa354ccbb0e72dcd6a69/pkg/kubelet/kubelet_pods.go#L923","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Clear enough. It boils down to three conditions that must all hold:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"No container in the Pod is still Running","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The Pod's Volumes have been cleaned up","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"The Pod's cgroup is gone","attrs":{}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That is all.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The failure scenarios are simply the inverse of each condition. Let's go through them:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Containers cannot be stopped - This falls under the CRI, and Docker is the container runtime most commonly involved. I once hit a case where ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"docker ps","attrs":{}}],"attrs":{}},{"type":"text","text":" showed the target container as ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"Up","attrs":{}}],"attrs":{}},{"type":"text","text":", but running ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"docker stop or rm","attrs":{}}],"attrs":{}},{"type":"text","text":" did nothing, and ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"docker exec","attrs":{}}],"attrs":{}},{"type":"text","text":" failed with a ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"no such container","attrs":{}}],"attrs":{}},{"type":"text","text":" error. In other words, the container's state was inconsistent and Docker itself could not clean it up, so the kubelet was powerless. The workaround was simple: I restarted Docker, the container disappeared, and the stuck Pod recovered shortly afterwards. To find the root cause you would still need to investigate on the Docker side why the container's state became inconsistent.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":1,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A more common case is a zombie process that prevents the container from being cleaned up, so the Pod stays in Terminating. Recovery may then require rebooting the machine.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Volumes cannot be cleaned up - In the \"two-phase handling\" of a PV, Attach and Detach are handled by the volume controller, while Mount and Unmount involve the kubelet. I have seen cases where an incomplete custom CSI driver left the kubelet unable to unmount a Volume, which kept the Pod stuck. Watch out for this when developing and testing a custom CSI driver.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"cgroups not removed - When QoS is enabled to manage Pod quality of service, the kubelet has to set up a suitable cgroup level for the Pod, which means writing configuration in the corresponding location. That configuration also has to be removed when the Pod is deleted. I have not run into a case where cgroup cleanup failed, so I will leave it there.","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There may be many more specific scenarios in practice, but in most cases the kubelet logs pinpoint the cause quickly, and from there the fix is usually not hard.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There are also system-level or infrastructure-level failures, such as the kubelet being down, the node losing access to the API Server, or the node itself going down. These are beyond what the kubelet can handle and are out of scope here.","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"One more note: if the kubelet logs contain little useful information, check whether the log verbosity is set too low. Judging from the source, many of the more specific messages are only emitted at verbosity 3 or higher.","attrs":{}}]}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"So why does a Namespace get stuck in Terminating?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Deleting a Namespace means deleting every resource in it, so if a Pod in it is stuck being deleted, the Namespace will be stuck in Terminating as well.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Beyond that, in my day-to-day experience CRD resources fail to delete fairly often. Why? To answer that, we have to talk about the Finalizers mechanism.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There is an official blog post dedicated to this, with a nice experiment: add a finalizers field to any ConfigMap, then delete it with ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"kubectl delete","attrs":{}}],"attrs":{}},{"type":"text","text":" and the command simply hangs. Kubernetes on its own will never delete the object.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Reference: https://kubernetes.io/blog/2021/05/14/using-finalizers-to-control-deletion/#understanding-finalizers","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Why?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Finalizers are by design a pre-delete hook that gives the relevant controller a chance to do custom cleanup. Normally the controller clears its entry from the object's finalizers field once cleanup is done, and only then can Kubernetes go on to delete the object. In the experiment above, no controller handles the finalizer we added arbitrarily, so the object stays in Terminating indefinitely.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Home-grown CRDs and controllers are naturally more likely to have problems, given their maturity. Webhooks (mutatingwebhookconfigurations/validatingwebhookconfigurations) are another frequent source of trouble and deserve attention.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Overall, when a Namespace is stuck being deleted, I suggest troubleshooting along these lines:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"codeinline","content":[{"type":"text","text":"kubectl get ns $NAMESPACE -o yaml","attrs":{}}],"attrs":{}},{"type":"text","text":": check the ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"conditions","attrs":{}}],"attrs":{}},{"type":"text","text":" field for relevant information","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"If that is not conclusive, find out which resources are still left in the namespace and handle them specifically","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":1,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Example command: ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -n $NAMESPACE ","attrs":{}}],"attrs":{}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once the root cause is found and fixed, Kubernetes will clean up the namespace object on its own. Clearing the namespace's finalizers field to force the deletion is not recommended, because it introduces uncontrolled risk.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Reference: https://github.com/kubernetes/kubernetes/issues/60807#issuecomment-524772920","attrs":{}}]}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Related reading","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A former colleague has written several very good articles on Kubernetes resource deletion, which I recommend:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https://zhuanlan.zhihu.com/p/164601470","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https://zhuanlan.zhihu.com/p/161072336","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"For more discussion of engineering productivity, test development, and cloud native topics, follow: BigCarlJi","attrs":{}}]}]}
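The finalizer behaviour discussed in the Namespace section (a delete request only marks the object, and the object goes away once its finalizers list is empty) can be sketched as a toy model. This is a minimal sketch in plain Go, not Kubernetes code: the `object` and `store` types, their methods, and the `example.com/cleanup` finalizer name are all hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// object is a toy stand-in for an API object's metadata.
// Hypothetical type; it is not the real metav1.ObjectMeta.
type object struct {
	Name              string
	Finalizers        []string
	DeletionTimestamp *time.Time
}

// store is a toy stand-in for the API server's storage (hypothetical).
type store map[string]*object

// requestDelete models a delete request: an object without finalizers is
// removed at once, otherwise it is only marked with a deletion timestamp.
func (s store) requestDelete(name string) {
	obj, ok := s[name]
	if !ok {
		return
	}
	if len(obj.Finalizers) == 0 {
		delete(s, name)
		return
	}
	if obj.DeletionTimestamp == nil {
		now := time.Now()
		obj.DeletionTimestamp = &now
	}
}

// removeFinalizer models a controller finishing its cleanup and dropping
// its key. A marked object is removed once no finalizers remain.
func (s store) removeFinalizer(name, key string) {
	obj, ok := s[name]
	if !ok {
		return
	}
	kept := obj.Finalizers[:0]
	for _, f := range obj.Finalizers {
		if f != key {
			kept = append(kept, f)
		}
	}
	obj.Finalizers = kept
	if obj.DeletionTimestamp != nil && len(obj.Finalizers) == 0 {
		delete(s, name)
	}
}

// phase reports how the object would appear to a user of this toy model.
func (s store) phase(name string) string {
	obj, ok := s[name]
	switch {
	case !ok:
		return "Deleted"
	case obj.DeletionTimestamp != nil:
		return "Terminating"
	default:
		return "Active"
	}
}

func main() {
	s := store{"demo": &object{Name: "demo", Finalizers: []string{"example.com/cleanup"}}}
	fmt.Println(s.phase("demo"))
	s.requestDelete("demo") // only marks the object: a finalizer is present
	fmt.Println(s.phase("demo"))
	s.removeFinalizer("demo", "example.com/cleanup") // the "controller" is done
	fmt.Println(s.phase("demo"))
}
```

In this model, if nothing ever calls `removeFinalizer`, the object stays in `Terminating` forever, which mirrors the ConfigMap experiment: the missing piece is a controller that owns the finalizer, not the delete request itself.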