1 Problem
On node1, after installing the kubelet, check the log:
root@node1:~# tailf /var/log/syslog
Jan 15 15:01:07 node1 kubelet[17646]: I0115 15:01:07.080370 17646 kubelet_node_status.go:273] Setting node annotation to enable volume controller attach/detach
Jan 15 15:01:07 node1 kubelet[17646]: I0115 15:01:07.084014 17646 kubelet_node_status.go:431] Recording NodeHasSufficientDisk event message for node 192.168.122.3
Jan 15 15:01:07 node1 kubelet[17646]: I0115 15:01:07.084061 17646 kubelet_node_status.go:431] Recording NodeHasSufficientMemory event message for node 192.168.122.3
Jan 15 15:01:07 node1 kubelet[17646]: I0115 15:01:07.084084 17646 kubelet_node_status.go:431] Recording NodeHasNoDiskPressure event message for node 192.168.122.3
Jan 15 15:01:07 node1 kubelet[17646]: I0115 15:01:07.084106 17646 kubelet_node_status.go:82] Attempting to register node 192.168.122.3
Jan 15 15:01:07 node1 kubelet[17646]: E0115 15:01:07.086790 17646 kubelet_node_status.go:106] Unable to register node "192.168.122.3" with API server: nodes is forbidden: User "system:node:192.168.122.3" cannot create nodes at the cluster scope
Jan 15 15:01:07 node1 kubelet[17646]: E0115 15:01:07.309503 17646 eviction_manager.go:238] eviction manager: unexpected err: failed to get node info: node "192.168.122.3" not found
Checking the logs on the master node, the apiserver reports errors: a large number of requests are being denied by RBAC:
root@master:~# tailf /var/log/syslog
Jan 18 14:39:39 master kube-apiserver[2638]: I0118 14:39:39.885925 2638 rbac.go:116] RBAC DENY: user "system:node:192.168.122.3" groups ["system:nodes" "system:authenticated"] cannot "list" resource "services" cluster-wide
Jan 18 14:39:39 master kube-apiserver[2638]: I0118 14:39:39.886076 2638 wrap.go:42] GET /api/v1/services?limit=500&resourceVersion=0: (490.18µs) 403 [[kubelet/v1.9.0 (linux/amd64) kubernetes/925c127] 192.168.122.3:33354]
Jan 18 14:39:40 master kube-apiserver[2638]: I0118 14:39:40.097660 2638 rbac.go:116] RBAC DENY: user "system:node:192.168.122.3" groups ["system:nodes" "system:authenticated"] cannot "list" resource "pods" cluster-wide
Jan 18 14:39:40 master kube-apiserver[2638]: I0118 14:39:40.098047 2638 wrap.go:42] GET /api/v1/pods?fieldSelector=spec.nodeName%3D192.168.122.3&limit=500&resourceVersion=0: (930.577µs) 403 [[kubelet/v1.9.0 (linux/amd64) kubernetes/925c127] 192.168.122.3:33354]
Jan 18 14:39:40 master kube-apiserver[2638]: I0118 14:39:40.359008 2638 wrap.go:42] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (16.594085ms) 200 [[kube-scheduler/v1.9.0 (linux/amd64) kubernetes/925c127/leader-election] 192.168.122.2:57472]
Jan 18 14:39:40 master kube-apiserver[2638]: I0118 14:39:40.367652 2638 wrap.go:42] PUT /api/v1/namespaces/kube-system/endpoints/kube-scheduler: (7.997847ms) 200 [[kube-scheduler/v1.9.0 (linux/amd64) kubernetes/925c127/leader-election] 192.168.122.2:57472]
Jan 18 14:39:40 master kube-apiserver[2638]: I0118 14:39:40.749509 2638 rbac.go:116] RBAC DENY: user "system:node:192.168.122.3" groups ["system:nodes" "system:authenticated"] cannot "list" resource "nodes" cluster-wide
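Each RBAC DENY line names the subject, verb, and resource being denied. A quick way to pull these fields out of such a line (the sample is copied from the excerpt above; the sed patterns assume this exact log format):

```shell
# Extract subject, verb, and resource from an apiserver "RBAC DENY" log line.
line='I0118 14:39:39.885925 2638 rbac.go:116] RBAC DENY: user "system:node:192.168.122.3" groups ["system:nodes" "system:authenticated"] cannot "list" resource "services" cluster-wide'

user=$(echo "$line" | sed -n 's/.*user "\([^"]*\)".*/\1/p')
verb=$(echo "$line" | sed -n 's/.*cannot "\([^"]*\)".*/\1/p')
res=$(echo "$line" | sed -n 's/.*resource "\([^"]*\)".*/\1/p')

echo "user=$user verb=$verb resource=$res"
```

This confirms the denied subject is the kubelet's own user, not some misconfigured client.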
2 Cause
According to the official Kubernetes documentation (https://kubernetes.io/docs/admin/authorization/rbac/#service-account-permissions), a ClusterRoleBinding like the one shown below exists.
Before v1.8, with RBAC enabled, the apiserver bound the system:nodes group to the system:node ClusterRole by default. From v1.8 onward this binding is left without subjects by default (upstream moved node authorization toward the dedicated Node authorizer), so it must be created manually. Otherwise the kubelet reports authorization errors after starting, and the node never reaches the Ready state in kubectl get nodes.
Default roles and default role bindings
The API server creates a set of default ClusterRole and ClusterRoleBinding objects. Many of them carry the system: prefix, indicating that they are "owned" by core Kubernetes components. Modifying these resources can leave the cluster non-functional. One example is the system:node ClusterRole, which defines the kubelet's permissions; if that role is modified, kubelets may stop working correctly. All default ClusterRole and ClusterRoleBinding objects are labeled kubernetes.io/bootstrapping=rbac-defaults.
List the roles and role bindings in the cluster with kubectl get clusterrole and kubectl get clusterrolebinding.
Inspect the system:node binding with kubectl get clusterrolebindings system:node -o yaml or kubectl describe clusterrolebindings system:node:
root@master:~# kubectl describe clusterrolebindings system:node
Name: system:node
Labels: kubernetes.io/bootstrapping=rbac-defaults
Annotations: rbac.authorization.kubernetes.io/autoupdate=true
Role:
Kind: ClusterRole
Name: system:node
Subjects:
Kind Name Namespace
---- ---- ---------
The system:node ClusterRoleBinding has no subjects by default.
Creating the role binding
A ClusterRoleBinding grants a ClusterRole across the entire cluster, in all namespaces.
The subject to bind can be read from the error logs: user "system:node:192.168.122.3" groups ["system:nodes" "system:authenticated"]. Alternatively, check the user configured in kubelet.kubeconfig.
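The kubelet's identity follows a fixed convention: a per-node user name plus a shared group. A small sketch of how the subject names are derived (the node name here is the IP this cluster registers nodes under):

```shell
# The kubelet authenticates as the per-node user "system:node:<nodeName>"
# and as a member of the shared "system:nodes" group.
NODE_NAME="192.168.122.3"                 # this cluster registers nodes by IP
KUBELET_USER="system:node:${NODE_NAME}"   # unique per node
KUBELET_GROUP="system:nodes"              # shared by all kubelets
echo "user=${KUBELET_USER} group=${KUBELET_GROUP}"
```

Binding the ClusterRole to the group covers every node at once; binding it to the user covers only that single node.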
Grant the system:node ClusterRole cluster-wide to the user "system:node:192.168.122.3" or to the group "system:nodes":
root@master:~# kubectl create clusterrolebinding kubelet-node-clusterbinding --clusterrole=system:node --user=system:node:192.168.122.3
clusterrolebinding "kubelet-node-clusterbinding" created
root@master:~# kubectl describe clusterrolebindings kubelet-node-clusterbinding
Name: kubelet-node-clusterbinding
Labels: <none>
Annotations: <none>
Role:
Kind: ClusterRole
Name: system:node
Subjects:
Kind Name Namespace
---- ---- ---------
User system:node:192.168.122.3
root@master:~#
root@master:~# kubectl delete clusterrolebindings kubelet-node-clusterbinding
clusterrolebinding "kubelet-node-clusterbinding" deleted
root@master:~#
root@master:~# kubectl create clusterrolebinding kubelet-node-clusterbinding --clusterrole=system:node --group=system:nodes
clusterrolebinding "kubelet-node-clusterbinding" created
root@master:~#
root@master:~# kubectl describe clusterrolebindings kubelet-node-clusterbinding
Name: kubelet-node-clusterbinding
Labels: <none>
Annotations: <none>
Role:
Kind: ClusterRole
Name: system:node
Subjects:
Kind Name Namespace
---- ---- ---------
Group system:nodes
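The same group binding can also be kept under version control as a manifest and applied with kubectl apply -f. A minimal sketch equivalent to the imperative command above (the binding name is the one chosen in this walkthrough; any name works):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubelet-node-clusterbinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:node
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes
```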
The node now transitions to Ready:
root@master:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.122.3 Ready <none> 1h v1.9.0
The logs are back to normal as well.