Deploying the latest Kubernetes cluster v1.18.0 with kubeadm

Installation/deployment date: March 26, 2020
Document updated: March 27, 2020
Reason for update: the Chinese yum mirrors now carry the 1.18.0 versions of kubeadm, kubelet, and kubectl

Official documentation:

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

Official documentation for deploying a highly-available cluster with kubeadm:

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/

Changelog:

(The release is brand new and there is no Chinese changelog yet, so some of the explanations below may be slightly inaccurate.)

https://kubernetes.io/blog/
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md

	We're pleased to announce the delivery of Kubernetes 1.18, our first release of 2020! Kubernetes 1.18 consists of 38 enhancements: 15 enhancements are moving to stable, 11 are in beta, and 12 are in alpha.

Kubernetes 1.18 is a "fit and finish" release. Significant work has gone into improving beta and stable features so that users have a better experience, and an equal effort has gone into adding new developments and exciting new features that promise to enhance the user experience even more. Having almost as many enhancements in alpha, beta, and stable is a great achievement; it shows the community's effort in improving the reliability of Kubernetes as well as continuing to expand its existing functionality.

Major Themes
Kubernetes Topology Manager moves to beta - align up!
	The Topology Manager feature is a beta feature in Kubernetes 1.18. It enables NUMA alignment of CPUs and devices (such as SR-IOV VFs), which lets workloads run in an environment optimized for low latency. Before the Topology Manager was introduced, the CPU Manager and Device Manager made their resource allocation decisions independently of each other, which could result in undesirable allocations on multi-socket systems and degraded performance for latency-critical applications.

Server-side Apply introduces beta 2
	Server-side Apply was promoted to beta in 1.16, and 1.18 introduces a second beta. This new version tracks and manages changes to the fields of all new Kubernetes objects, letting you know what changed a resource and when.

Extending Ingress with IngressClass and replacing the deprecated annotation
	Kubernetes 1.18 brings two significant additions to Ingress: a new pathType field and a new IngressClass resource. The pathType field specifies how paths should be matched; in addition to the default ImplementationSpecific type, there are new Exact and Prefix path types.

	The IngressClass resource is used to describe a type of Ingress within a Kubernetes cluster. An Ingress can specify which class it is associated with through the new ingressClassName field. The new resource and field replace the deprecated kubernetes.io/ingress.class annotation.

SIG-CLI introduces kubectl debug
	SIG-CLI had long debated the need for a debug utility. With the development of ephemeral containers it became possible to build such tooling on top of kubectl exec. The new kubectl debug command (alpha, and feedback is welcome) lets developers easily debug their Pods inside the cluster: it creates a temporary container that runs next to the Pod being examined and attaches a console for interactive troubleshooting. We think this addition is invaluable.

Introducing Windows CSI support alpha for Kubernetes
	The alpha version of the CSI Proxy for Windows ships with Kubernetes 1.18. The CSI Proxy enables non-privileged (pre-approved) containers to perform privileged storage operations on Windows, so CSI drivers can now be supported on Windows by leveraging the CSI Proxy.

Other Updates
	Graduated to Stable:
	Taint-based eviction
	kubectl diff
	CSI block storage support
	API Server dry run
	Pass Pod information in CSI calls
	Support for the out-of-tree vSphere cloud provider
	GMSA support for Windows workloads
	Skip attach for non-attachable CSI volumes
	PVC cloning
	Moving the kubectl package code to staging
	RunAsUserName for Windows
	AppProtocol for Services and Endpoints
	Extending the hugepage feature
	client-go signature refactor to standardize options and context handling
	Node-local DNS cache
Major Changes
	EndpointSlice API
	Moving the kubectl package code to staging
	CertificateSigningRequest API
	Extending the hugepage feature
	client-go signature refactor to standardize options and context handling

Compatibility between components

Note: originally, due to network restrictions, kubeadm/kubectl/kubelet could not be updated to v1.18.0, so the latest v1.17.4 was used instead (which is compatible). Once the Chinese yum mirrors carry 1.18.0, you can upgrade by following the official document below. (This has since been resolved; the note can be ignored.)

Upgrading a kubeadm cluster:
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/

Version skew policy between Kubernetes components:
https://kubernetes.io/zh/docs/setup/release/version-skew-policy/


kubelet
The kubelet version must not be newer than kube-apiserver, and may be up to two minor versions older.

For example:
if kube-apiserver is at version 1.13,
kubelet may be 1.13, 1.12, or 1.11.


kubectl
kubectl may be one minor version newer or one minor version older than kube-apiserver.

For example:
if kube-apiserver is at version 1.13,
kubectl may be 1.14, 1.13, or 1.12.

List the images required for a given Kubernetes version:
[root@k8s-master ~]# kubeadm config images list --kubernetes-version=v1.18.0

k8s.gcr.io/kube-apiserver:v1.18.0
k8s.gcr.io/kube-controller-manager:v1.18.0
k8s.gcr.io/kube-scheduler:v1.18.0
k8s.gcr.io/kube-proxy:v1.18.0
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.7

Installation

Prepare three machines

vi /etc/hosts
192.168.100.10   kub-k8s-master
192.168.100.20   kub-k8s-node1
192.168.100.30   kub-k8s-node2

System configuration on all machines

1. Disable the firewall:
# systemctl stop firewalld
# systemctl disable firewalld
2. Disable SELinux for the current boot:
# setenforce 0
3. Edit /etc/selinux/config and set SELINUX to disabled, for example:
# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
SELINUX=disabled
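
To confirm the change took effect, a quick check (optional):
# getenforce          # reports Permissive now, Disabled after the next reboot
# grep ^SELINUX= /etc/selinux/config
SELINUX=disabled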

Disable system swap: required since Kubernetes 1.8

Kubernetes 1.8 began requiring that system swap be turned off; with the default configuration, kubelet will not start otherwise. Option 1: pass the kubelet startup flag --fail-swap-on=false to lift the restriction. Option 2: disable system swap.

1. Turn swap off
# swapoff -a
Edit /etc/fstab to comment out the swap mount, then confirm with free -m that swap is off.
2. Comment out the swap entry:
# sed -i 's/.*swap.*/#&/' /etc/fstab
# free -m
              total        used        free      shared  buff/cache   available
Mem:           1980         123        1310           9         546        1693
Swap:             0           0           0

# Note: do both steps: the first disables swap immediately, the second makes it permanent.

Install Docker - on all three machines

# yum remove docker \
             docker-client \
             docker-client-latest \
             docker-common \
             docker-latest \
             docker-latest-logrotate \
             docker-logrotate \
             docker-selinux \
             docker-engine-selinux \
             docker-engine
# yum install -y yum-utils device-mapper-persistent-data lvm2 git
# yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

# yum install docker-ce -y

Start Docker and enable it at boot
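
For example, with the usual systemctl commands:
# systemctl start docker
# systemctl enable docker
# docker --version          # verify the installation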

Pull the images from the Aliyun registries

[root@k8s-master ~]# docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-controller-manager:v1.18.0
[root@k8s-master ~]# docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-proxy:v1.18.0
[root@k8s-master ~]# docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-apiserver:v1.18.0
[root@k8s-master ~]# docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-scheduler:v1.18.0
[root@k8s-master ~]# docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/coredns:1.6.7
[root@k8s-master ~]# docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0
[root@k8s-master ~]# docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/pause:3.2

Re-tag the images with the k8s.gcr.io names

[root@k8s-master ~]# docker tag registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-controller-manager:v1.18.0 k8s.gcr.io/kube-controller-manager:v1.18.0
[root@k8s-master ~]# docker tag registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-proxy:v1.18.0 k8s.gcr.io/kube-proxy:v1.18.0
[root@k8s-master ~]# docker tag registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-apiserver:v1.18.0 k8s.gcr.io/kube-apiserver:v1.18.0
[root@k8s-master ~]# docker tag registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-scheduler:v1.18.0 k8s.gcr.io/kube-scheduler:v1.18.0
[root@k8s-master ~]# docker tag registry.cn-qingdao.aliyuncs.com/kubernetes-image/coredns:1.6.7 k8s.gcr.io/coredns:1.6.7
[root@k8s-master ~]# docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0 k8s.gcr.io/etcd:3.4.3-0
[root@k8s-master ~]# docker tag registry.cn-qingdao.aliyuncs.com/kubernetes-image/pause:3.2 k8s.gcr.io/pause:3.2
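
If you prefer, the pull-and-retag steps can be scripted in one pass; a minimal sketch assuming the same Aliyun repositories used above:
for img in kube-apiserver:v1.18.0 kube-controller-manager:v1.18.0 kube-scheduler:v1.18.0 \
           kube-proxy:v1.18.0 coredns:1.6.7 pause:3.2; do
    docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/${img}                      # pull from the mirror
    docker tag  registry.cn-qingdao.aliyuncs.com/kubernetes-image/${img} k8s.gcr.io/${img}    # retag for kubeadm
done
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0
docker tag  registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0 k8s.gcr.io/etcd:3.4.3-0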

Deploying Kubernetes with kubeadm:

Install kubeadm and kubelet on all nodes:

Configure the yum repository
# cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
On all nodes:
1. Install the packages
# yum makecache fast
# yum install -y kubelet kubeadm kubectl ipvsadm

Note: ipvsadm is optional here unless the kernel is very old; loading the required kernel modules (below) is enough.

2. Load the IPVS kernel modules
These must be reloaded after a reboot (they can be added to /etc/rc.local so they load automatically at boot)
# modprobe ip_vs
# modprobe ip_vs_rr
# modprobe ip_vs_wrr
# modprobe ip_vs_sh
# modprobe nf_conntrack_ipv4
3. Edit the file so the modules are loaded at boot
# vim /etc/rc.local 
# chmod +x /etc/rc.local
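
What gets appended to /etc/rc.local is simply the same modprobe lines; for example:
cat >> /etc/rc.local <<EOF
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack_ipv4
EOF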

Reboot the server

4. Configure kernel parameters:
Make bridged IPv4 traffic pass through the iptables chains
# cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
net.ipv4.ip_forward = 1
EOF

Note: without net.ipv4.ip_forward = 1, nodes will later be unable to reach Pod IPs on other nodes.
5. Apply the settings
# sysctl --system

6. If net.bridge.bridge-nf-call-iptables reports an error, load the br_netfilter module
# modprobe br_netfilter
# sysctl -p /etc/sysctl.d/k8s.conf


7. Verify that the modules are loaded
# lsmod | grep ip_vs
ip_vs_sh               12688  0 
ip_vs_wrr              12697  0 
ip_vs_rr               12600  0 
ip_vs                 141092  6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack          133387  2 ip_vs,nf_conntrack_ipv4
libcrc32c              12644  3 xfs,ip_vs,nf_conntrack

Configure and start kubelet (all nodes)

1. Configure kubelet to use the pause image
Get Docker's cgroup driver
# DOCKER_CGROUPS=$(docker info | grep 'Cgroup' | cut -d' ' -f4)
# echo $DOCKER_CGROUPS
=================================
Set the variable:
[root@k8s-master ~]# DOCKER_CGROUPS=`docker info |grep 'Cgroup' | awk '{print $3}'`
[root@k8s-master ~]# echo $DOCKER_CGROUPS
cgroupfs

2. Configure kubelet's cgroup driver
# cat >/etc/sysconfig/kubelet<<EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=$DOCKER_CGROUPS --pod-infra-container-image=k8s.gcr.io/pause:3.2"
EOF
Start kubelet
# systemctl daemon-reload
# systemctl enable kubelet && systemctl restart kubelet
At this point # systemctl status kubelet will show errors:

10月 11 00:26:43 node1 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
10月 11 00:26:43 node1 systemd[1]: Unit kubelet.service entered failed state.
10月 11 00:26:43 node1 systemd[1]: kubelet.service failed.

Running # journalctl -xefu kubelet to look at the systemd logs reveals the real error:
    unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory
#This error resolves itself once kubeadm init has generated the CA certificate, so it can be ignored for now.
#In short, kubelet keeps restarting until kubeadm init is run.

Configure the master node

Run the initialization as follows:
[root@master ~]# kubeadm init --kubernetes-version=v1.18.0 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.100.10 --ignore-preflight-errors=Swap
Notes:
--apiserver-advertise-address=192.168.100.10    --- the master's IP address.
--kubernetes-version=v1.18.0   --- adjust to the exact version being installed.
Also double-check that the swap partition is disabled.

Note: the Kubernetes project recommends the systemd cgroup driver.
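
If you want to follow that recommendation, Docker's cgroup driver can be switched to systemd before running init; a sketch (remember to change --cgroup-driver in /etc/sysconfig/kubelet to systemd as well):
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl daemon-reload && systemctl restart docker
docker info | grep -i cgroup    # should now report systemd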

If the command reports a version hint, it simply means a newer release is available. The sample output below comes from an earlier run:
[init] Using Kubernetes version: v1.16.1
[preflight] Running pre-flight checks
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.03.0-ce. Latest validated version: 18.09
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kub-k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.246.166]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kub-k8s-master localhost] and IPs [192.168.246.166 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kub-k8s-master localhost] and IPs [192.168.246.166 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 24.575209 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.16" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node kub-k8s-master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kub-k8s-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 93erio.hbn2ti6z50he0lqs
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.100.10:6443 --token 93erio.hbn2ti6z50he0lqs \
    --discovery-token-ca-cert-hash sha256:3bc60f06a19bd09f38f3e05e5cff4299011b7110ca3281796668f4edb29a56d9  # make a note of this

=======================================================================================
  
The above is the complete output of the initialization; from it you can see the key steps needed to set up a Kubernetes cluster by hand.
The key items are:
    [kubelet] generates the kubelet configuration file "/var/lib/kubelet/config.yaml"
    [certificates] generates the various certificates
    [kubeconfig] generates the kubeconfig files
    [bootstraptoken] generates the bootstrap token: record it, as it is needed later when adding nodes with kubeadm join
  
Configure kubectl
Run the following on the master node
[root@kub-k8s-master ~]# rm -rf $HOME/.kube
[root@kub-k8s-master ~]# mkdir -p $HOME/.kube
[root@kub-k8s-master ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@kub-k8s-master ~]# chown $(id -u):$(id -g) $HOME/.kube/config

Check the nodes
[root@k8s-master ~]# kubectl get nodes
NAME         STATUS     ROLES    AGE     VERSION
k8s-master   NotReady   master   2m41s   v1.18.0

Configure the network plugin (flannel)

Run on the master node
Download the manifest
# cd ~ && mkdir flannel && cd flannel
# curl -O https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Edit kube-flannel.yml:
The network here must match the --pod-network-cidr passed to kubeadm above; it already does by default, so no change is needed
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
# Note that kube-flannel.yml references the flannel image quay.io/coreos/flannel:v0.11.0-amd64.
# The default image is quay.io/coreos/flannel:v0.11.0-amd64; it should be pulled in advance.
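# For example (pull it on every node, directly or via whichever mirror is reachable from your network):
# docker pull quay.io/coreos/flannel:v0.11.0-amd64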


# If a node has more than one network interface, see flannel issue 39701
# (https://github.com/kubernetes/kubernetes/issues/39701):
# currently you need to add the --iface argument in kube-flannel.yml to name the host's internal NIC,
# otherwise DNS resolution may fail and containers may be unable to communicate. Download kube-flannel.yml
# locally and add --iface=<iface-name> to the flanneld startup arguments:
    containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=ens33
        - --iface=eth0
        
⚠️ The value of --iface=ens33 must be your actual NIC name; multiple --iface flags can be listed to cover multiple NICs.

# Since kubeadm 1.12, nodes are given an extra taint: node.kubernetes.io/not-ready:NoSchedule.
# The idea is simple: a node accepts no scheduling until it is Ready, but a node will not become Ready
# until the network plugin is deployed. Therefore edit kube-flannel.yml and add a toleration for the
# node.kubernetes.io/not-ready:NoSchedule taint:
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - arm64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      - key: node.kubernetes.io/not-ready  # add these three lines (around line 261 of kube-flannel.yml)
        operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel

Apply it:
# kubectl apply -f ~/flannel/kube-flannel.yml  # it takes a little while after applying
# kubectl get pods -n kube-system              # the output below is from this check
NAME                                     READY   STATUS    RESTARTS   AGE
coredns-5644d7b6d9-sm8hs                 1/1     Running   0          9m18s
coredns-5644d7b6d9-vddll                 1/1     Running   0          9m18s
etcd-kub-k8s-master                      1/1     Running   0          8m14s
kube-apiserver-kub-k8s-master            1/1     Running   0          8m17s
kube-controller-manager-kub-k8s-master   1/1     Running   0          8m20s
kube-flannel-ds-amd64-9wgd8              1/1     Running   0          8m42s
kube-proxy-sgphs                         1/1     Running   0          9m18s
kube-scheduler-kub-k8s-master            1/1     Running   0          8m10s

Check:
# kubectl get pods --namespace kube-system
# kubectl get service
# kubectl get svc --namespace kube-system
Nodes only show a Ready status once the network plugin has been installed and configured.
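
To wait for that, you can simply watch the node list until the status flips, for example:
# kubectl get nodes -w    # Ctrl-C once every node reports Ready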

On all node machines

Join the worker nodes to the cluster:

Run the following on every node; it is the join command returned by the successful master initialization
# kubeadm join 192.168.100.10:6443 --token 93erio.hbn2ti6z50he0lqs \
    --discovery-token-ca-cert-hash sha256:3bc60f06a19bd09f38f3e05e5cff4299011b7110ca3281796668f4edb29a56d9

On the master:

Various checks:
1. List the pods:
[root@kub-k8s-master ~]# kubectl get pods -n kube-system
NAME                                     READY   STATUS    RESTARTS   AGE
coredns-5644d7b6d9-sm8hs                 1/1     Running   0          39m
coredns-5644d7b6d9-vddll                 1/1     Running   0          39m
etcd-kub-k8s-master                      1/1     Running   0          37m
kube-apiserver-kub-k8s-master            1/1     Running   0          38m
kube-controller-manager-kub-k8s-master   1/1     Running   0          38m
kube-flannel-ds-amd64-9wgd8              1/1     Running   0          38m
kube-flannel-ds-amd64-lffc8              1/1     Running   0          2m11s
kube-flannel-ds-amd64-m8kk2              1/1     Running   0          2m2s
kube-proxy-dwq9l                         1/1     Running   0          2m2s
kube-proxy-l77lz                         1/1     Running   0          2m11s
kube-proxy-sgphs                         1/1     Running   0          39m
kube-scheduler-kub-k8s-master            1/1     Running   0          37m

2. Inspect a misbehaving pod:
[root@kub-k8s-master ~]# kubectl  describe pods kube-flannel-ds-sr6tq -n  kube-system
Name:               kube-flannel-ds-sr6tq
Namespace:          kube-system
Priority:           0
PriorityClassName:  <none>
。。。。。
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Pulling    12m                  kubelet, node2     pulling image "registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64"
  Normal   Pulled     11m                  kubelet, node2     Successfully pulled image "registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64"
  Normal   Created    11m                  kubelet, node2     Created container
  Normal   Started    11m                  kubelet, node2     Started container
  Normal   Created    11m (x4 over 11m)    kubelet, node2     Created container
  Normal   Started    11m (x4 over 11m)    kubelet, node2     Started container
  Normal   Pulled     10m (x5 over 11m)    kubelet, node2     Container image "registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64" already present on machine
  Normal   Scheduled  7m15s                default-scheduler  Successfully assigned kube-system/kube-flannel-ds-sr6tq to node2
  Warning  BackOff    7m6s (x23 over 11m)  kubelet, node2     Back-off restarting failed container

3. In this situation, simply delete the faulty pod (the flannel DaemonSet recreates it):
[root@kub-k8s-master ~]# kubectl delete pod kube-flannel-ds-sr6tq -n kube-system
pod "kube-flannel-ds-sr6tq" deleted

4. List the pods again:
[root@kub-k8s-master ~]# kubectl get pods -n kube-system
NAME                                     READY   STATUS    RESTARTS   AGE
coredns-5644d7b6d9-sm8hs                 1/1     Running   0          44m
coredns-5644d7b6d9-vddll                 1/1     Running   0          44m
etcd-kub-k8s-master                      1/1     Running   0          42m
kube-apiserver-kub-k8s-master            1/1     Running   0          43m
kube-controller-manager-kub-k8s-master   1/1     Running   0          43m
kube-flannel-ds-amd64-9wgd8              1/1     Running   0          43m
kube-flannel-ds-amd64-lffc8              1/1     Running   0          7m10s
kube-flannel-ds-amd64-m8kk2              1/1     Running   0          7m1s
kube-proxy-dwq9l                         1/1     Running   0          7m1s
kube-proxy-l77lz                         1/1     Running   0          7m10s
kube-proxy-sgphs                         1/1     Running   0          44m
kube-scheduler-kub-k8s-master            1/1     Running   0          42m

5. List the nodes:
[root@kub-k8s-master ~]# kubectl get nodes
NAME             STATUS   ROLES    AGE     VERSION
kub-k8s-master   Ready    master   43m     v1.18.0
kub-k8s-node1    Ready    <none>   6m46s   v1.18.0
kub-k8s-node2    Ready    <none>   6m37s   v1.18.0
At this point the cluster setup is complete.

Troubleshooting

Errors
Problem 1: inconsistent clocks between the servers cause errors
Check the server time
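For example (chrony is an assumption here; use whatever time-sync tooling your distribution provides):
# date                                            # compare the output on every node
# yum install -y chrony && systemctl enable --now chronyd
# chronyc sources                                 # confirm the clock is synchronizing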
=====================================
Problem 2: kubeadm init does not succeed; the following message appears and then it times out with an error
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s

Checking the kubelet status shows two errors: node "master" not found, and a failed image pull. kubelet was trying to pull the pause image from an aliyuncs registry, even though the official pause image had already been downloaded. Re-tagging the local pause image with the Aliyun name shown in the error, then resetting the kubeadm environment and re-running the initialization, resolved the error.
[root@master manifests]# systemctl  status kubelet -l
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since 四 2019-01-31 15:20:32 CST; 5min ago
     Docs: https://kubernetes.io/docs/
 Main PID: 23908 (kubelet)
    Tasks: 19
   Memory: 30.8M
   CGroup: /system.slice/kubelet.service
           └─23908 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1 --cgroup-driver=cgroupfs --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1

1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.432357   23908 kubelet.go:2266] node "master" not found
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.532928   23908 kubelet.go:2266] node "master" not found
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.633192   23908 kubelet.go:2266] node "master" not found
1月 31 15:25:41 master kubelet[23908]: I0131 15:25:41.729296   23908 kubelet_node_status.go:278] Setting node annotation to enable volume controller attach/detach
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.733396   23908 kubelet.go:2266] node "master" not found
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.740110   23908 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1": Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/: dial tcp 0.0.0.80:443: connect: invalid argument
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.740153   23908 kuberuntime_sandbox.go:68] CreatePodSandbox for pod "kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)" failed: rpc error: code = Unknown desc = failed pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1": Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/: dial tcp 0.0.0.80:443: connect: invalid argument
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.740166   23908 kuberuntime_manager.go:662] createPodSandbox for pod "kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)" failed: rpc error: code = Unknown desc = failed pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1": Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/: dial tcp 0.0.0.80:443: connect: invalid argument
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.740207   23908 pod_workers.go:190] Error syncing pod e8f43404e60ae844e375d50b1e39d91e ("kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)"), skipping: failed to "CreatePodSandbox" for "kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)" with CreatePodSandboxError: "CreatePodSandbox for pod \"kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)\" failed: rpc error: code = Unknown desc = failed pulling image \"registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1\": Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/: dial tcp 0.0.0.80:443: connect: invalid argument"
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.833981   23908 kubelet.go:2266] node "master" not found

Solution

Reset the kubeadm environment
Reset/remove every node in the cluster (including the master)
1. Drain the pods from the kub-k8s-node1 node (on the master)
[root@kub-k8s-master ~]# kubectl drain kub-k8s-node1 --delete-local-data --force --ignore-daemonsets

2. Delete the node (on the master)
[root@kub-k8s-master ~]# kubectl delete node kub-k8s-node1

3. Reset the node (on the node, i.e. on the machine that was just deleted)
[root@kub-k8s-node1 ~]# kubeadm reset

Note 1: the master must also be drained, deleted, and reset. This was a painful lesson: the first time, the master was not drained and deleted, everything afterwards looked normal, but coredns simply would not work, and it cost a whole day. Do not skip it.

Note 2: after the reset, the following must also be removed on the master
# rm -rf /var/lib/cni/ $HOME/.kube/config

### Note: if the whole cluster had already been built, reset it with all of the steps above. If only the initialization failed, step 3 alone is enough.
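
kubeadm reset itself reminds you that it does not clean up iptables/IPVS rules or the CNI configuration; if a node will rejoin the cluster, a cleanup along these lines helps (a sketch):
# iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
# ipvsadm --clear
# rm -rf /etc/cni/net.d /var/lib/cni/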

Regenerating the token

Adding nodes to the cluster after the kubeadm-generated token has expired

Every kubeadm initialization prints a join token for the nodes:
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 192.168.100.10:6443 --token n38l80.y2icehgzsyuzkthi \
    --discovery-token-ca-cert-hash sha256:5fb6576ef82b5655dee285e0c93432aee54d38779bc8488c32f5cbbb90874bac
The default token is valid for 24 hours; once it expires, it can no longer be used.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Solution:
1. Generate a new token:
[root@node1 flannel]# kubeadm  token create
kiyfhw.xiacqbch8o8fa8qj
[root@node1 flannel]# kubeadm  token list
TOKEN                     TTL         EXPIRES                     USAGES                   DESCRIPTION   EXTRA GROUPS
gvvqwk.hn56nlsgsv11mik6   <invalid>   2018-10-25T14:16:06+08:00   authentication,signing   <none>        system:bootstrappers:kubeadm:default-node-token
kiyfhw.xiacqbch8o8fa8qj   23h         2018-10-27T06:39:24+08:00   authentication,signing   <none>        system:bootstrappers:kubeadm:default-node-token

2. Get the sha256 hash of the CA certificate:
[root@node1 flannel]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
5417eb1b68bd4e7a4c82aded83abc55ec91bd601e45734d6aba85de8b1ebb057

3. Join the node to the cluster:
  kubeadm join 18.16.202.35:6443 --token kiyfhw.xiacqbch8o8fa8qj --discovery-token-ca-cert-hash sha256:5417eb1b68bd4e7a4c82aded83abc55ec91bd601e45734d6aba85de8b1ebb057
A few seconds later, the node should appear in the output of kubectl get nodes on the master.

The steps above are rather tedious; this does it in one shot:
kubeadm token create --print-join-command

A second method (with --ttl=0 the token never expires):
token=$(kubeadm token generate)
kubeadm token create $token --print-join-command --ttl=0