After setting up a K8s cluster (one Master + 3 Nodes) on CentOS 7, installing kubernetes-dashboard left its Pod in CrashLoopBackOff / Error state:
kubernetes-dashboard   dashboard-metrics-scraper-6f669b9c9b-btkzj   1/1   Running            0             2m3s
kubernetes-dashboard   kubernetes-dashboard-758765f476-x8rc7        0/1   CrashLoopBackOff   2 (12s ago)
Check the Pod's logs:
[root@master1 ~]# kubectl logs -f -n kubernetes-dashboard kubernetes-dashboard-758765f476-x8rc7
2023/01/09 07:53:16 Using namespace: kubernetes-dashboard
2023/01/09 07:53:16 Using in-cluster config to connect to apiserver
2023/01/09 07:53:16 Starting overwatch
2023/01/09 07:53:16 Using secret token for csrf signing
2023/01/09 07:53:16 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get "https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf": dial tcp 10.96.0.1:443: connect: no route to host

goroutine 1 [running]:
github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc0004dfae8)
        /home/runner/work/dashboard/dashboard/src/app/backend/client/csrf/manager.go:41 +0x30e
github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...)
        /home/runner/work/dashboard/dashboard/src/app/backend/client/csrf/manager.go:66
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc000096b80)
        /home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:527 +0x94
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0x19aba3a?)
        /home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:495 +0x32
github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...)
        /home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:594
main.main()
        /home/runner/work/dashboard/dashboard/src/app/backend/dashboard.go:96 +0x1cf
At first I suspected the relevant ports (such as 443) were not open in the firewall, so I opened them on every machine and removed kubernetes-dashboard completely with the following commands:
sudo kubectl delete deployment kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete service kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete service dashboard-metrics-scraper --namespace=kubernetes-dashboard
sudo kubectl delete role.rbac.authorization.k8s.io kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete clusterrole.rbac.authorization.k8s.io kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete rolebinding.rbac.authorization.k8s.io kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete clusterrolebinding.rbac.authorization.k8s.io kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete deployment.apps kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete deployment.apps dashboard-metrics-scraper --namespace=kubernetes-dashboard
sudo kubectl delete sa kubernetes-dashboard --namespace=kubernetes-dashboard
sudo kubectl delete secret kubernetes-dashboard-certs --namespace=kubernetes-dashboard
sudo kubectl delete secret kubernetes-dashboard-csrf --namespace=kubernetes-dashboard
sudo kubectl delete secret kubernetes-dashboard-key-holder --namespace=kubernetes-dashboard
sudo kubectl delete configmap kubernetes-dashboard-settings --namespace=kubernetes-dashboard
sudo kubectl delete namespace kubernetes-dashboard
After reinstalling, it was still the same old story:
[root@master1 ~]# kubectl apply -f kubernetes-dashboard.yaml
So I simply stopped the firewall on all machines temporarily:
[root@master1 ~]# systemctl stop firewalld
Then I removed kubernetes-dashboard cleanly again and reinstalled it. It still failed, but the error was slightly different:
[root@master1 ~]# kubectl logs -f -n kubernetes-dashboard kubernetes-dashboard-758765f476-g7lqs
2023/01/09 08:03:05 Starting overwatch
2023/01/09 08:03:05 Using namespace: kubernetes-dashboard
2023/01/09 08:03:05 Using in-cluster config to connect to apiserver
2023/01/09 08:03:05 Using secret token for csrf signing
2023/01/09 08:03:05 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get "https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf": dial tcp 10.96.0.1:443: i/o timeout

goroutine 1 [running]:
github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc00049fae8)
        /home/runner/work/dashboard/dashboard/src/app/backend/client/csrf/manager.go:41 +0x30e
github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...)
        /home/runner/work/dashboard/dashboard/src/app/backend/client/csrf/manager.go:66
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc0001c4100)
        /home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:527 +0x94
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0x19aba3a?)
        /home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:495 +0x32
github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...)
        /home/runner/work/dashboard/dashboard/src/app/backend/client/manager.go:594
main.main()
        /home/runner/work/dashboard/dashboard/src/app/backend/dashboard.go:96 +0x1cf
At this point one thing was certain: this was not a firewall or port problem, because firewalld was now stopped on every machine. That also rules out the suggestion from other posts online to run iptables -L -n --line-numbers | grep dashboard and blame iptables rules. Matching REJECT records did show up here, but those "has no endpoints" rules are added by kube-proxy for any Service that has no ready endpoints; they are a symptom of the crashing Pod, not the cause:
[root@master1 ~]# iptables -L -n --line-numbers | grep dashboard
1    REJECT     tcp  --  0.0.0.0/0    0.0.0.0/0     /* kubernetes-dashboard/kubernetes-dashboard has no endpoints */ ADDRTYPE match dst-type LOCAL tcp dpt:32646 reject-with icmp-port-unreachable
1    REJECT     tcp  --  0.0.0.0/0    10.103.89.4   /* kubernetes-dashboard/kubernetes-dashboard has no endpoints */ tcp dpt:443 reject-with icmp-port-unreachable
After a lot of digging, the root cause turned out to be the very first kubeadm init of the Master node: the podSubnet: 192.168.0.0/16 setting in the init config file (or the equivalent --pod-network-cidr=192.168.0.0/16 flag) overlaps with the subnet of the Master node's own LAN IP (192.168.17.3). With the Pod network and the host network sharing the same address range, routing breaks down, the worker Nodes and the Master cannot reach each other, and the kubernetes-dashboard installation fails.
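The overlap is easy to confirm. A minimal sketch using Python's standard ipaddress module, with the addresses from this cluster:

```python
import ipaddress

# Pod CIDR handed to kubeadm init (podSubnet / --pod-network-cidr)
pod_cidr = ipaddress.ip_network("192.168.0.0/16")

# Physical (host) IP of the Master node
node_ip = ipaddress.ip_address("192.168.17.3")

# If the node's own address falls inside the Pod CIDR, Pod routes and
# host routes collide, and cross-node traffic can no longer be routed.
print(node_ip in pod_cidr)  # True -> the ranges overlap
```

The same check is worth running against the Service CIDR (10.96.0.0/12 by default) before any kubeadm init.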
The fix: reinstall the K8s cluster with a non-overlapping Pod CIDR.
The steps:
1. Remove kubernetes-dashboard completely (see the commands above)
2. Drain and delete the worker Nodes
[root@master1 ~]# kubectl get nodes
[root@master1 ~]# kubectl drain node1.localk8s --delete-emptydir-data --force --ignore-daemonsets
[root@master1 ~]# kubectl drain node2.localk8s --delete-emptydir-data --force --ignore-daemonsets
[root@master1 ~]# kubectl drain node3.localk8s --delete-emptydir-data --force --ignore-daemonsets
[root@master1 ~]# kubectl delete nodes node1.localk8s node2.localk8s node3.localk8s
3. Reset the K8s cluster (run this on each Node that will rejoin the cluster as well)
[root@master1 ~]# kubeadm reset
4. Delete leftover files
[root@master1 ~]# rm -rf /etc/kubernetes
[root@master1 ~]# rm -rf /var/lib/etcd/
[root@master1 ~]# rm -rf $HOME/.kube
5. Change podSubnet in the kubeadm init configuration file and save it
For example, change podSubnet: 192.168.0.0/16 to podSubnet: 192.169.0.0/16
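A quick sanity check with Python's standard ipaddress module (same addresses as above) confirms the replacement range no longer contains the Master's host IP. One aside: 192.169.0.0/16 is not an RFC 1918 private range, so a range such as 10.244.0.0/16 (a common CNI default) would be a more conventional choice, as long as it does not collide with the host or Service networks either.

```python
import ipaddress

node_ip = ipaddress.ip_address("192.168.17.3")  # Master node host IP

old_subnet = ipaddress.ip_network("192.168.0.0/16")  # original podSubnet
new_subnet = ipaddress.ip_network("192.169.0.0/16")  # replacement podSubnet

print(node_ip in old_subnet)  # True  -> overlap, broken routing
print(node_ip in new_subnet)  # False -> no overlap, safe to use
```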
6. Re-initialize the K8s Master node
[root@master1 ~]# kubeadm init --config=/etc/kubeadm/init.default.yaml
After that, just redo the rest of the cluster setup (rejoin the worker Nodes with kubeadm join), then reinstall kubernetes-dashboard.
Finally, verify the result:
[root@master1 ~]# kubectl get svc,pods -n kubernetes-dashboard
NAME                                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
service/dashboard-metrics-scraper   ClusterIP   10.96.254.169   <none>        8000/TCP        67s
service/kubernetes-dashboard        NodePort    10.96.146.223   <none>        443:31529/TCP   67s

NAME                                             READY   STATUS    RESTARTS   AGE
pod/dashboard-metrics-scraper-6f669b9c9b-lcdnv   1/1     Running   0          67s
pod/kubernetes-dashboard-758765f476-vfjx5        1/1     Running   0          67s
Now kubernetes-dashboard can be reached through any Node's IP using the NodePort shown in the output (here 31529), e.g. https://<Node-IP>:31529.
Command to fetch the token of the kubernetes-dashboard ServiceAccount (note: on Kubernetes 1.24+ token Secrets are no longer auto-created for ServiceAccounts, so use kubectl -n kubernetes-dashboard create token kubernetes-dashboard there instead):
kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep kubernetes-dashboard | awk '{print $1}')
One last recommendation: pin the dashboard's external port in the kubernetes-dashboard YAML right from the start.
Add a nodePort: 31529 entry to expose the Service on a fixed port:
spec:
type: NodePort
ports:
- port: 443
targetPort: 8443
nodePort: 31529
The following commands are useful for monitoring the status of Pods across the cluster:
---- live monitoring ----
watch -n 3 kubectl get pods -A
---- one-shot view (add -n <namespace> to narrow the scope) ----
kubectl get pods -A -o wide
Attached below is the kubernetes-dashboard v2.5.1 manifest (the custom additions are the NodePort settings described above).
| Kubernetes version | 1.20 | 1.21 | 1.22 | 1.23 |
|---|---|---|---|---|
| Compatibility | ? | ? | ? | ✓ |
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v1
kind: Namespace
metadata:
  name: kubernetes-dashboard

---

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 31529
  selector:
    k8s-app: kubernetes-dashboard

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kubernetes-dashboard
type: Opaque

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-csrf
  namespace: kubernetes-dashboard
type: Opaque
data:
  csrf: ""

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-key-holder
  namespace: kubernetes-dashboard
type: Opaque

---

kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-settings
  namespace: kubernetes-dashboard

---

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
rules:
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs", "kubernetes-dashboard-csrf"]
    verbs: ["get", "update", "delete"]
  # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["kubernetes-dashboard-settings"]
    verbs: ["get", "update"]
  # Allow Dashboard to get metrics.
  - apiGroups: [""]
    resources: ["services"]
    resourceNames: ["heapster", "dashboard-metrics-scraper"]
    verbs: ["proxy"]
  - apiGroups: [""]
    resources: ["services/proxy"]
    resourceNames: ["heapster", "http:heapster:", "https:heapster:", "dashboard-metrics-scraper", "http:dashboard-metrics-scraper"]
    verbs: ["get"]

---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
rules:
  # Allow Metrics Scraper to get metrics from the Metrics server
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods", "nodes"]
    verbs: ["get", "list", "watch"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: kubernetes-dashboard
          image: kubernetesui/dashboard:v2.5.1
          imagePullPolicy: Always
          ports:
            - containerPort: 8443
              protocol: TCP
          args:
            - --auto-generate-certificates
            - --namespace=kubernetes-dashboard
            # Uncomment the following line to manually specify Kubernetes API server Host
            # If not specified, Dashboard will attempt to auto discover the API server and connect
            # to it. Uncomment only if the default does not work.
            # - --apiserver-host=http://my-address:port
          volumeMounts:
            - name: kubernetes-dashboard-certs
              mountPath: /certs
            # Create on-disk volume to store exec logs
            - mountPath: /tmp
              name: tmp-volume
          livenessProbe:
            httpGet:
              scheme: HTTPS
              path: /
              port: 8443
            initialDelaySeconds: 30
            timeoutSeconds: 30
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      volumes:
        - name: kubernetes-dashboard-certs
          secret:
            secretName: kubernetes-dashboard-certs
        - name: tmp-volume
          emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 8000
      targetPort: 8000
  selector:
    k8s-app: dashboard-metrics-scraper

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: dashboard-metrics-scraper
  template:
    metadata:
      labels:
        k8s-app: dashboard-metrics-scraper
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: dashboard-metrics-scraper
          image: kubernetesui/metrics-scraper:v1.0.7
          ports:
            - containerPort: 8000
              protocol: TCP
          livenessProbe:
            httpGet:
              scheme: HTTP
              path: /
              port: 8000
            initialDelaySeconds: 30
            timeoutSeconds: 30
          volumeMounts:
            - mountPath: /tmp
              name: tmp-volume
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      volumes:
        - name: tmp-volume
          emptyDir: {}