CentOS 7: Pods stuck in ContainerCreating on a yum-installed k8s cluster (problem and fix)


Contents
Problem description
What this article covers
Analysis and fix
Root cause and alternative approach
References
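Problem description
I installed a Kubernetes cluster with the CentOS 7 yum package manager. Creating a service with kubectl succeeded, but kubectl get pods showed the Pod's AGE climbing while its status never moved past ContainerCreating.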

What this article covers
Analysis of the cause
A direct fix for the problem (imperfect)
An alternative approach
Here is the full story.

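Analysis and fix
kubectl provides the describe subcommand, which prints detailed information about one or more resources.

Run kubectl describe pod mytomcat-9lcq5 to inspect the state of the problem Pod. The output is as follows: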

[root@kube-master app]# kubectl describe pod mytomcat-9lcq5
Name: mytomcat-9lcq5
Namespace: default
Node: kube-node-2/192.168.87.145
Start Time: Fri, 17 Apr 2020 15:53:50 +0800
Labels: app=mytomcat
Status: Pending
IP:
Controllers: ReplicationController/mytomcat
Containers:
mytomcat:

Container ID:        
Image:            tomcat:9-jre8-alpine
Image ID:            
Port:            8080/TCP
State:            Waiting
  Reason:            ContainerCreating
Ready:            False
Restart Count:        0
Volume Mounts:        <none>
Environment Variables:    <none>

Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
No volumes.
QoS Class: BestEffort
Tolerations:
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
5m 5m 1 {default-scheduler } Normal Scheduled Successfully assigned mytomcat-9lcq5 to kube-node-2
4m 4m 1 {kubelet kube-node-2} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request. details: (Get https://registry.access.redhat.com/v1/_ping: net/http: TLS handshake timeout)"

3m 3m 1 {kubelet kube-node-2} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request. details: (Network timed out while trying to connect to https://registry.access.redhat.com/v1/repositories/rhel7/pod-infrastructure/images. You may want to check your internet connection or if you are behind a proxy.)"

2m 2m 1 {kubelet kube-node-2} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request. details: (Error: image rhel7/pod-infrastructure:latest not found)"

3m 1m 3 {kubelet kube-node-2} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image "registry.access.redhat.com/rhel7/pod-infrastructure:latest""

In the Events at the bottom of the output, Successfully assigned mytomcat-9lcq5 to kube-node-2 shows that the Pod was scheduled onto the host kube-node-2, where creating the Pod then failed.

The reason given is: image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request.
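When the Events list is long, it can help to pull out just the image names that failed. The following is a minimal sketch, assuming the "image pull failed for <image>," wording shown in the output above; failed_images is a hypothetical helper name, not a kubectl feature.

```shell
#!/usr/bin/env bash
# Sketch: list the images a Pod failed to pull, based on the
# "image pull failed for <image>," wording seen in the Events above.
# failed_images is a hypothetical helper, not part of kubectl.
failed_images() {
  # Read `kubectl describe pod` output on stdin and print each distinct
  # image named in an "image pull failed for ..." message.
  grep -o 'image pull failed for [^, ]*' | awk '{print $5}' | sort -u
}

# Demo on one event line copied from the output above.
sample='4m 4m 1 {kubelet kube-node-2} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request."'
printf '%s\n' "$sample" | failed_images
```

On a live cluster the same helper could be fed directly, for example kubectl describe pod mytomcat-9lcq5 | failed_images.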

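From this message I concluded that pulling images from Red Hat's own docker registry requires a CA certificate for authentication before the pull can succeed.

docker keeps its certificates under /etc/docker/certs.d. The domain in the error is registry.access.redhat.com, so the relevant certificate lives in /etc/docker/certs.d/registry.access.redhat.com.

Listing that directory with ll shows that /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt is a symbolic link pointing to /etc/rhsm/ca/redhat-uep.pem.

The link target is displayed in flashing red, which means the target does not exist, so the certificate file /etc/rhsm/ca/redhat-uep.pem has to be generated.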
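Terminal colours are not always available (for example over a pipe or in a script), so the same check can be done with the shell's file tests. This is a small sketch; is_dangling_symlink is a hypothetical helper name, and the demo uses a throwaway directory rather than the real path under /etc.

```shell
#!/usr/bin/env bash
# is_dangling_symlink is a hypothetical helper name.
is_dangling_symlink() {
  # -L: the path itself is a symlink; ! -e: its target does not exist.
  [ -L "$1" ] && [ ! -e "$1" ]
}

# Demo in a temporary directory so nothing under /etc is touched.
demo_dir=$(mktemp -d)
ln -s "$demo_dir/missing.pem" "$demo_dir/redhat-ca.crt"
if is_dangling_symlink "$demo_dir/redhat-ca.crt"; then
  echo "dangling: $(readlink "$demo_dir/redhat-ca.crt")"
fi
rm -rf "$demo_dir"
```

On the worker node the call would be is_dangling_symlink /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt.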

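Generate the certificate:

openssl s_client -showcerts -servername registry.access.redhat.com -connect registry.access.redhat.com:443 </dev/null 2>/dev/null | openssl x509 -text > /etc/rhsm/ca/redhat-uep.pem

This command sometimes fails with unable to load certificate 139930742028176:error:0906D06C:PEM routines:PEM_read_bio:no start line:pem_lib.c:707:Expecting: TRUSTED CERTIFICATE. This typically means the connection to the registry failed, so openssl x509 received no certificate on standard input. Running the command again is enough.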
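Since the fix for the intermittent failure is simply to run the command again, the retry can be scripted. The sketch below is one way to do it under stated assumptions: retry and fetch_cert are hypothetical helper names, RETRY_DELAY is an invented knob for the pause between attempts, and the extra openssl x509 -noout call is added here only to confirm that the written file parses as a certificate.

```shell
#!/usr/bin/env bash
# retry is a hypothetical helper: run a command up to N times until it succeeds.
# Usage: retry <attempts> <command> [args...]
retry() {
  attempts=$1; shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    echo "attempt $i of $attempts failed" >&2
    i=$((i + 1))
    sleep "${RETRY_DELAY:-1}"   # RETRY_DELAY is an assumed, optional override
  done
  return 1
}

# fetch_cert wraps the pipeline from the article. It writes to the path in $1
# and succeeds only if the result parses as a certificate.
fetch_cert() {
  openssl s_client -showcerts -servername registry.access.redhat.com \
    -connect registry.access.redhat.com:443 </dev/null 2>/dev/null \
    | openssl x509 -text > "$1" \
    && openssl x509 -in "$1" -noout
}
```

A possible invocation on the worker node is retry 5 fetch_cert /etc/rhsm/ca/redhat-uep.pem. Nothing is executed when the block is merely sourced, so it can be tried out safely first.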

Once the command finishes, check the certificate file that the symlink points to:

[root@kube-node-2 registry.access.redhat.com]# ll /etc/rhsm/ca/redhat-uep.pem
-rw-r--r-- 1 root root 9233 Apr 17 16:55 /etc/rhsm/ca/redhat-uep.pem
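The certificate file now exists. On the k8s master node kube-master, delete the Pods from before and wait for them to be recreated successfully. (The second node failed to pull the image because of network problems...)

That completes the Pod creation.

Some problems remain, though. From mainland China, access to outside networks is intermittently unreliable, so Pod creation can still fail. describe then reports the same messages as before, even though the certificate file exists and has content.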

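Root cause and alternative approach
The k8s master schedules the Pod onto a worker node. Once there, the node pulls the Pod base image pod-infrastructure:latest from Red Hat's docker registry. That registry uses https and its certificate has to be verified; with the certificate missing, the pull fails.

In addition, because the image is hosted in Red Hat's docker registry, the TLS handshake fails in the mainland China network environment and the image cannot be downloaded.

So the question becomes how to fix the failed pull of the k8s pod-infrastructure image. Here is one approach, step by step: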

Pull a pod-infrastructure image that someone else uploaded to the official docker registry: docker pull tianyebj/pod-infrastructure
Tag it with your private registry address, for example: docker tag tianyebj/pod-infrastructure 10.2.7.70:5000/dev/pod-infrastructure
Push the image to the private registry, for example: docker push 10.2.7.70:5000/dev/pod-infrastructure

On every worker node, edit /etc/kubernetes/kubelet and replace registry.access.redhat.com/rhel7/pod-infrastructure with the tag you just set:
sed -i "s#registry.access.redhat.com/rhel7/pod-infrastructure#<private registry pod-infrastructure image tag>#" /etc/kubernetes/kubelet

Restart kubelet on every worker node with systemctl restart kubelet, and you are done.
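Notes:

The uploaded image must be public, otherwise kubelet has no permission to pull it on its own. Alternatively, ssh into each worker node, log in to the registry there, and run docker pull <private registry pod-infrastructure image tag>.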
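The steps above can be sketched as two small shell functions. This is an illustration, not a definitive script: mirror_infra_image and rewrite_infra_image are hypothetical names, GNU sed is assumed (as on CentOS 7), and the KUBELET_POD_INFRA_CONTAINER line in the demo is an assumption about what the packaged /etc/kubernetes/kubelet contains. The demo edits a temporary file only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy the community image into a private registry.
# $1 = target image reference, e.g. 10.2.7.70:5000/dev/pod-infrastructure
mirror_infra_image() {
  docker pull tianyebj/pod-infrastructure &&
    docker tag tianyebj/pod-infrastructure "$1" &&
    docker push "$1"
}

# Hypothetical helper: point a kubelet config file at the mirrored image.
# $1 = kubelet config file, $2 = new image reference (any tag suffix is kept).
rewrite_infra_image() {
  sed -i "s#registry.access.redhat.com/rhel7/pod-infrastructure#$2#" "$1"
}

# Demo against a temporary file. The line written here is an assumption about
# the contents of /etc/kubernetes/kubelet on a yum-installed node.
conf=$(mktemp)
echo 'KUBELET_POD_INFRA_CONTAINER="--pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest"' > "$conf"
rewrite_infra_image "$conf" "10.2.7.70:5000/dev/pod-infrastructure"
cat "$conf"
rm -f "$conf"
```

On a real worker node the sequence would be rewrite_infra_image /etc/kubernetes/kubelet <private registry pod-infrastructure image tag> followed by systemctl restart kubelet. Keeping the rewrite in its own function makes it easy to test against a copy of the file before touching the live configuration.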

References
https://github.com/CentOS/sig-atomic-buildscripts/issues/329
https://cloud.tencent.com/developer/article/1156329

Original article: https://www.cnblogs.com/hellxz/p/k8s-pod-always-container-creating-status-problem.html
