Kubernetes node startup error: No valid private key

Troubleshooting a Kubernetes node startup error

Error scenario:

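While installing a Kubernetes cluster and deploying the kubelet service on a node, running systemctl start kubelet and then tailf /var/log/messages showed a large number of certificate verification errors.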

Error output:

May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.583305    5336 feature_gate.go:206] feature gates: &{map[]}
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589637    5336 mount_linux.go:180] Detected OS with systemd
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589680    5336 server.go:407] Version: v1.13.4
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589732    5336 feature_gate.go:206] feature gates: &{map[]}
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589825    5336 feature_gate.go:206] feature gates: &{map[]}
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589899    5336 plugins.go:103] No cloud provider specified.
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589916    5336 server.go:523] No cloud provider specified: "" from the config file: ""
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589938    5336 bootstrap.go:65] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.593022    5336 bootstrap.go:96] No valid private key and/or certificate found, reusing existing private key or creating a new one
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.612493    5336 bootstrap.go:239] Failed to connect to apiserver: Get https://172.20.101.157:6443/healthz?timeout=1s: x509: certificate signed by unknown authority
May  5 22:23:42 kubnode-01 kubelet: I0505 22:23:42.909358    5336 bootstrap.go:239] Failed to connect to apiserver: Get https://172.20.101.157:6443/healthz?timeout=1s: x509: certificate signed by unknown authority
May  5 22:23:45 kubnode-01 kubelet: I0505 22:23:45.036663    5336 bootstrap.go:239] Failed to connect to apiserver: Get https://172.20.101.157:6443/healthz?timeout=1s: x509: certificate signed by unknown authority
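
When /var/log/messages is busy, it can help to isolate the kubelet bootstrap failures from the other entries. The following is a minimal sketch, assuming a syslog-style file like the excerpt above; the function names bootstrap_failures and failed_endpoints are hypothetical helpers and not part of kubelet or kubectl.

```shell
#!/usr/bin/env bash

# Print only the kubelet lines that report a failed connection to the apiserver.
# Usage: bootstrap_failures /var/log/messages
bootstrap_failures() {
    grep -F 'kubelet:' "$1" | grep -F 'Failed to connect to apiserver'
}

# Print the distinct apiserver endpoints that kubelet could not reach.
# Usage: failed_endpoints /var/log/messages
failed_endpoints() {
    bootstrap_failures "$1" | grep -oE 'https://[^/]+' | sort -u
}
```

Running failed_endpoints /var/log/messages on the node shows which apiserver address the kubelet is trying to reach, which is worth checking against the address configured for the cluster.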

Solution:

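On the master node, create the kubelet-bootstrap cluster role binding, which grants the kubelet-bootstrap user the system:node-bootstrapper role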

[root@k8s-node01 ~]# kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
clusterrolebinding "kubelet-bootstrap" created

Start the service on the node

[root@k8s-node01 ~]# systemctl start kubelet

After kubelet starts on the node, it submits a certificate signing request (CSR) to the master. The request must be approved on the master.

On the master, list the CSRs; the new request shows as Pending

[root@kubm-01 ~]# kubectl get csr
NAME                                                   AGE     REQUESTOR           CONDITION
node-csr-mgZK4Cqvb7kZA7tDqVmszNQYLq27Yydia5LCqKJnnEI   4m11s   kubelet-bootstrap   Pending

On the master, approve the certificate request

[root@kubm-01 ~]# kubectl certificate approve node-csr-mgZK4Cqvb7kZA7tDqVmszNQYLq27Yydia5LCqKJnnEI

Once the request has been approved on the master, its status changes to Approved,Issued

[root@kubm-01 ~]# kubectl get csr
NAME                                                   AGE     REQUESTOR           CONDITION
node-csr-mgZK4Cqvb7kZA7tDqVmszNQYLq27Yydia5LCqKJnnEI   5m39s   kubelet-bootstrap   Approved,Issued
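
When several nodes join at once, copying each CSR name by hand gets tedious. The approval step above can be sketched as a small script, assuming kubectl is already configured on the master and that CONDITION is the last column of kubectl get csr, as in the output shown here. The function names pending_csr_names and approve_pending_csrs are hypothetical helpers, not kubectl subcommands.

```shell
#!/usr/bin/env bash

# Read `kubectl get csr` output on stdin and print the names of the requests
# whose last column (CONDITION) is exactly "Pending". The header row is skipped.
pending_csr_names() {
    awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Approve every pending CSR, one at a time.
# Review the list first: approving a CSR issues a client certificate to the requester.
approve_pending_csrs() {
    local names name
    names=$(kubectl get csr | pending_csr_names)
    if [ -z "$names" ]; then
        echo "no pending CSRs"
        return 0
    fi
    # Word splitting is intentional here: CSR names contain no whitespace.
    for name in $names; do
        kubectl certificate approve "$name"
    done
}
```

To preview what would be approved without changing anything, run kubectl get csr | pending_csr_names on its own before calling approve_pending_csrs.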

Verify on the node

The ssl directory on the node now contains four new kubelet certificate files

[root@kubnode-02 kubernetes]# ls /kubernetes/ssl/kubelet*
/kubernetes/ssl/kubelet-client-2019-05-05-22-15-53.pem  /kubernetes/ssl/kubelet-client-current.pem  /kubernetes/ssl/kubelet.crt  /kubernetes/ssl/kubelet.key

Delete the CSR (optional)

[root@kubm-01 ~]# kubectl delete csr node-csr-mgZK4Cqvb7kZA7tDqVmszNQYLq27Yydia5LCqKJnnEI
certificatesigningrequest.certificates.k8s.io "node-csr-mgZK4Cqvb7kZA7tDqVmszNQYLq27Yydia5LCqKJnnEI" deleted

Verify the deletion:

kubectl get csr

The command returns no entries.

Tracking this down was trickier than expected.

Reference:

https://www.liuyalei.top/1433.html
