Deploying the latest Kubernetes cluster (v1.18.0) with kubeadm

Installation date: March 26, 2020
Document updated: March 27, 2020
Reason for update: the domestic (China) yum mirrors now carry version 1.18.0 of kubeadm, kubelet and kubectl

Official documentation:

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

Official documentation for deploying a highly available Kubernetes cluster with kubeadm:

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/

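Release notes:

(The release is brand new and no Chinese release notes exist yet, so some of the explanations below may be imprecise.)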

https://kubernetes.io/blog/
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md

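	We are pleased to announce the delivery of Kubernetes 1.18, our first release of 2020! Kubernetes 1.18 consists of 38 enhancements: 15 are moving to stable, 11 are in beta, and 12 are in alpha.

Kubernetes 1.18 is a "fit and finish" release. Significant work has gone into improving beta and stable features to give users a better experience. An equal effort has gone into adding new developments and exciting new features that promise to enhance the user experience even more. Having almost as many enhancements in alpha, beta and stable is a great achievement. It shows the tremendous effort the community has made to improve the reliability of Kubernetes while continuing to expand its existing functionality.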

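Major themes
Kubernetes Topology Manager moves to beta: Align Up!
	The Topology Manager, a beta feature in Kubernetes 1.18, enables NUMA alignment of CPUs and devices (such as SR-IOV VFs), which lets your workloads run in an environment optimized for low latency. Before the Topology Manager was introduced, the CPU Manager and Device Manager made resource allocation decisions independently of each other. On multi-socket systems this could produce undesirable allocations and degrade the performance of latency-critical applications.

Server-side Apply introduces Beta 2
	Server-side Apply was promoted to beta in 1.16 and now gets a second beta in 1.18. The new version tracks and manages changes to the fields of all new Kubernetes objects, so you can tell what changed your resources and when.

Extending Ingress with IngressClass, which replaces a deprecated annotation
	Kubernetes 1.18 brings two significant additions to Ingress: a new pathType field and a new IngressClass resource. The pathType field specifies how paths should be matched. In addition to the default ImplementationSpecific type, there are new Exact and Prefix path types.

	The IngressClass resource describes a type of Ingress within a Kubernetes cluster. An Ingress can name the class it belongs to through the new ingressClassName field. This new resource and field replace the deprecated kubernetes.io/ingress.class annotation.

SIG-CLI introduces kubectl debug
	SIG-CLI had been debating the need for a debug utility for quite some time. With the development of ephemeral containers, it became clearer how developers could be supported with tooling built on top of kubectl exec. The new kubectl debug command (alpha in 1.18, where it is invoked as kubectl alpha debug; feedback is welcome) lets developers easily debug their Pods inside the cluster. We think this addition is invaluable. The command creates a temporary container that runs next to the Pod being examined and attaches to the console for interactive troubleshooting.

Introducing alpha Windows CSI support for Kubernetes
	The alpha version of CSI Proxy for Windows is released together with Kubernetes 1.18. CSI Proxy lets non-privileged (pre-approved) containers perform privileged storage operations on Windows, which makes it possible to support CSI drivers on Windows.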

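Other updates
Graduated to stable
	Taint Based Eviction
	kubectl diff
	CSI block storage support
	API Server dry run
	Pass Pod information in CSI calls
	Support for the out-of-tree vSphere Cloud Provider
	GMSA support for Windows workloads
	Skip attach for non-attachable CSI volumes
	PVC cloning
	Moving kubectl package code to staging
	RunAsUserName for Windows
	AppProtocol for Services and Endpoints
	Extending the Hugepage feature
	client-go signature refactor to standardize options and context handling
	Node-local DNS cache
Major changes
	EndpointSlice API
	Moving kubectl package code to staging
	CertificateSigningRequest API
	Extending the Hugepage feature
	client-go signature refactor to standardize options and context handling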

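Version compatibility between components

Note: at first, for network reasons, kubeadm, kubectl and kubelet could not be updated to v1.18.0, so the then-latest compatible v1.17.4 was used instead, to be upgraded per the official documentation below once the domestic yum mirrors carried the new version. (The mirrors have since been updated; ignore this note.)

Upgrading a kubeadm cluster:
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/

Version skew policy between Kubernetes components:
https://kubernetes.io/zh/docs/setup/release/version-skew-policy/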


kubelet
kubelet must not be newer than kube-apiserver and may be at most two minor versions older.

Example:
if kube-apiserver is at 1.13
kubelet may only be 1.13, 1.12 or 1.11


kubectl
kubectl may be one minor version newer or one minor version older than kube-apiserver.

Example:
if kube-apiserver is currently at 1.13
kubectl supports 1.14, 1.13 and 1.12
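
The two skew rules above can be checked mechanically before installing packages. The following is a minimal sketch, not an official tool: the function names are made up for illustration, and it assumes both versions share the same major number (1) so that only the minor numbers need comparing.

```shell
# Extract the minor number from a version string such as "v1.18.0" or "1.17".
minor_of() {
    echo "$1" | sed -e 's/^v//' | cut -d. -f2
}

# kubelet must not be newer than kube-apiserver and may trail it by at most
# two minor versions. $1: kube-apiserver version, $2: kubelet version.
kubelet_skew_ok() {
    api_minor=$(minor_of "$1")
    kubelet_minor=$(minor_of "$2")
    diff=$((api_minor - kubelet_minor))
    [ "$diff" -ge 0 ] && [ "$diff" -le 2 ]
}

# kubectl may be one minor version newer or older than kube-apiserver.
# $1: kube-apiserver version, $2: kubectl version.
kubectl_skew_ok() {
    api_minor=$(minor_of "$1")
    ctl_minor=$(minor_of "$2")
    diff=$((api_minor - ctl_minor))
    [ "$diff" -ge -1 ] && [ "$diff" -le 1 ]
}
```

For example, kubelet_skew_ok v1.18.0 v1.17.4 succeeds, which matches the note above that 1.17.4 client packages are compatible with a 1.18.0 control plane.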

List the image versions that a given Kubernetes version requires:
[root@k8s-master ~]# kubeadm config images list --kubernetes-version=v1.18.0

k8s.gcr.io/kube-apiserver:v1.18.0
k8s.gcr.io/kube-controller-manager:v1.18.0
k8s.gcr.io/kube-scheduler:v1.18.0
k8s.gcr.io/kube-proxy:v1.18.0
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.7

Installation

Prepare three machines

vi /etc/hosts
192.168.100.10   kub-k8s-master
192.168.100.20   kub-k8s-node1
192.168.100.30   kub-k8s-node2

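System configuration on all machines

1. Stop and disable the firewall:
# systemctl stop firewalld
# systemctl disable firewalld
2. Put SELinux into permissive mode for the current boot:
# setenforce 0
3. Edit /etc/selinux/config and set SELINUX to disabled so the change survives a reboot. The pattern matches whatever the current value is (enforcing or permissive), and the real file is edited rather than the /etc/sysconfig/selinux symlink, which sed -i would replace with a regular file:
# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
SELINUX=disabled

Disable system swap (required since Kubernetes 1.8)

Since Kubernetes 1.8, swap must be turned off; with the default configuration kubelet will not start otherwise. Option one: lift the restriction with the kubelet flag --fail-swap-on=false. Option two: turn off swap on the system.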

1. Turn swap off on the running system:
# swapoff -a
2. Comment out the swap entry in /etc/fstab so that it is not mounted at boot, then confirm with free -m that swap is off:
# sed -i 's/.*swap.*/#&/' /etc/fstab
# free -m
              total        used        free      shared  buff/cache   available
Mem:           1980         123        1310           9         546        1693
Swap:             0           0           0

# Note: do both steps. The first turns swap off temporarily; the second makes it permanent.
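
One caveat with the sed one-liner above is that every rerun prepends another "#" to the swap line. A minimal sketch of an idempotent variant follows; the function name is hypothetical, it assumes GNU sed (as shipped with CentOS), and it keeps a .bak copy of the file it edits.

```shell
# Comment out every active swap entry in an fstab-style file.
# Lines that already start with '#' are skipped, so reruns change nothing.
# $1: path to the file (defaults to /etc/fstab)
disable_swap_entries() {
    fstab_file=${1:-/etc/fstab}
    sed -i.bak -E '/^[[:space:]]*#/!{/[[:space:]]swap[[:space:]]/s/^/#/;}' "$fstab_file"
}
```

Running disable_swap_entries with no argument edits /etc/fstab and leaves the previous content in /etc/fstab.bak.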

Install Docker (on all three machines)

# yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine
# yum install -y yum-utils device-mapper-persistent-data lvm2 git
# yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

# yum install docker-ce -y

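Start Docker and enable it at boot
# systemctl start docker
# systemctl enable docker

Pull the images from the Aliyun mirror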

[root@k8s-master ~]# docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-controller-manager:v1.18.0
[root@k8s-master ~]# docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-proxy:v1.18.0
[root@k8s-master ~]# docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-apiserver:v1.18.0
[root@k8s-master ~]# docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-scheduler:v1.18.0
[root@k8s-master ~]# docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/coredns:1.6.7
[root@k8s-master ~]# docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0
[root@k8s-master ~]# docker pull registry.cn-qingdao.aliyuncs.com/kubernetes-image/pause:3.2

Tag the images with the names kubeadm expects

[root@k8s-master ~]# docker tag registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-controller-manager:v1.18.0 k8s.gcr.io/kube-controller-manager:v1.18.0
[root@k8s-master ~]# docker tag registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-proxy:v1.18.0 k8s.gcr.io/kube-proxy:v1.18.0
[root@k8s-master ~]# docker tag registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-apiserver:v1.18.0 k8s.gcr.io/kube-apiserver:v1.18.0
[root@k8s-master ~]# docker tag registry.cn-qingdao.aliyuncs.com/kubernetes-image/kube-scheduler:v1.18.0 k8s.gcr.io/kube-scheduler:v1.18.0
[root@k8s-master ~]# docker tag registry.cn-qingdao.aliyuncs.com/kubernetes-image/coredns:1.6.7 k8s.gcr.io/coredns:1.6.7
[root@k8s-master ~]# docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0 k8s.gcr.io/etcd:3.4.3-0
[root@k8s-master ~]# docker tag registry.cn-qingdao.aliyuncs.com/kubernetes-image/pause:3.2 k8s.gcr.io/pause:3.2
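
The pull-and-tag pairs above follow one pattern, so they can be generated from the image list instead of typed by hand. This is a sketch under a stated assumption: the mirror hosts every image under the same name:tag as k8s.gcr.io, which was not quite true here (etcd came from a different mirror). The function name is hypothetical, and it only prints commands rather than running them.

```shell
# Read image references (one per line, e.g. k8s.gcr.io/pause:3.2) on stdin
# and print the docker commands that pull each one from a mirror and retag
# it under its original name. Nothing is executed here.
# $1: mirror prefix, e.g. registry.example.com/some-namespace
print_mirror_commands() {
    mirror=$1
    while read -r image; do
        [ -n "$image" ] || continue
        name_tag=${image##*/}    # strip the registry part: pause:3.2
        echo "docker pull $mirror/$name_tag"
        echo "docker tag $mirror/$name_tag $image"
    done
}
```

Piping the output of kubeadm config images list --kubernetes-version=v1.18.0 into print_mirror_commands with a mirror prefix gives a script that can be reviewed and then piped to sh. kubeadm init also accepts an --image-repository flag, which may remove the need for retagging altogether.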

Deploy Kubernetes with kubeadm:

Install kubeadm and kubelet on all nodes:

Configure the yum repository
# cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
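On all nodes:
1. Install
# yum makecache fast
# yum install -y kubelet-1.18.0 kubeadm-1.18.0 kubectl-1.18.0 ipvsadm

Note: the packages are pinned to 1.18.0 so that a later run does not pull in a newer release. ipvsadm is optional: unless the kernel is very old, enabling the required kernel modules is sufficient.

2. Load the IPVS kernel modules
The modules must be loaded again after every reboot (the commands can be added to /etc/rc.local to load them automatically at boot)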
# modprobe ip_vs
# modprobe ip_vs_rr
# modprobe ip_vs_wrr
# modprobe ip_vs_sh
# modprobe nf_conntrack_ipv4
3. Add the modprobe commands above to /etc/rc.local and make the file executable so that they run at boot
# vim /etc/rc.local
# chmod +x /etc/rc.local
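
As an alternative to /etc/rc.local, systemd can load modules at boot from files under /etc/modules-load.d/. The following is a sketch of that approach, not what the original setup used: the function name and the file name ipvs.conf are illustrative. On kernels 4.19 and later nf_conntrack_ipv4 no longer exists and nf_conntrack should be listed instead; the stock CentOS 7 kernel still uses the name shown.

```shell
# Write a modules-load.d file listing the IPVS-related kernel modules.
# $1: output path (on a real host: /etc/modules-load.d/ipvs.conf)
write_ipvs_modules_conf() {
    out_file=$1
    cat > "$out_file" <<'EOF'
# Loaded at boot by systemd-modules-load.service
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
EOF
}
```

After writing the file, systemctl restart systemd-modules-load.service loads the modules without a reboot.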

Reboot the server

4. Configuration:
Set kernel parameters so that bridged IPv4 traffic is passed to the iptables chains
# cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
net.ipv4.ip_forward = 1
EOF

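Note: without net.ipv4.ip_forward = 1, worker nodes will later be unable to reach pod IPs on other nodes
5. Apply the settings
# sysctl --system

6. If net.bridge.bridge-nf-call-iptables reports an error, load the br_netfilter module
# modprobe br_netfilter
# sysctl -p /etc/sysctl.d/k8s.conf


7. Check that the modules are loaded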
# lsmod | grep ip_vs
ip_vs_sh               12688  0 
ip_vs_wrr              12697  0 
ip_vs_rr               12600  0 
ip_vs                 141092  6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack          133387  2 ip_vs,nf_conntrack_ipv4
libcrc32c              12644  3 xfs,ip_vs,nf_conntrack

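Configure and start kubelet (all nodes)

1. Configure the pause image used by kubelet
Get Docker's cgroup driver (the cut field number assumes docker info indents the line with one space; the awk variant further down works either way)
# DOCKER_CGROUPS=$(docker info | grep 'Cgroup Driver' | cut -d' ' -f4)
# echo $DOCKER_CGROUPS
=================================
Set the variable:
[root@k8s-master ~]# DOCKER_CGROUPS=`docker info |grep 'Cgroup Driver' | awk '{print $3}'`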
[root@k8s-master ~]# echo $DOCKER_CGROUPS
cgroupfs
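
Both one-liners above depend on the exact layout of the docker info text. A small sketch of a parser that splits on the colon instead of counting fields follows; the function name is hypothetical. It matches only the "Cgroup Driver" line, so an extra "Cgroup Version" line in newer Docker releases does not interfere.

```shell
# Read `docker info` text on stdin and print only the cgroup driver name,
# regardless of how far the line is indented.
parse_cgroup_driver() {
    awk -F': *' '/Cgroup Driver/ { print $2; exit }'
}
```

It would be used as DOCKER_CGROUPS=$(docker info 2>/dev/null | parse_cgroup_driver). Docker can also print the field directly with docker info --format '{{.CgroupDriver}}'.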

2. Configure the kubelet cgroup driver
# cat >/etc/sysconfig/kubelet<<EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=$DOCKER_CGROUPS --pod-infra-container-image=k8s.gcr.io/pause:3.2"
EOF
Start kubelet
# systemctl daemon-reload
# systemctl enable kubelet && systemctl restart kubelet
At this point systemctl status kubelet reports errors:

10月 11 00:26:43 node1 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
10月 11 00:26:43 node1 systemd[1]: Unit kubelet.service entered failed state.
10月 11 00:26:43 node1 systemd[1]: kubelet.service failed.

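Only journalctl -xefu kubelet, which shows the systemd journal, reveals the real error:
    unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory
# This error resolves itself once kubeadm init has generated the CA certificate, so it can be ignored for now.
# In short, kubelet keeps restarting until kubeadm init has run.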

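Configure the master node

Run the initialization as follows:
[root@master ~]# kubeadm init --kubernetes-version=v1.18.0 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.100.10 --ignore-preflight-errors=Swap
Notes:
--apiserver-advertise-address=192.168.100.10    the IP address of the master.
--kubernetes-version=v1.18.0   change this to match the version you are installing.
Double-check that swap is off before running the command; --ignore-preflight-errors=Swap only downgrades the swap check to a warning.

Note: Kubernetes officially recommends the systemd cgroup driver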
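
The same settings can be kept in a configuration file rather than on the command line, which makes a reinstall easier to repeat. Below is a minimal sketch that writes such a file using the kubeadm.k8s.io/v1beta2 API that ships with 1.18; the function name and output path are illustrative, and only the three values from the command above are covered.

```shell
# Write a kubeadm configuration file equivalent to the init flags used above.
# $1: output path  $2: advertise address  $3: Kubernetes version  $4: pod CIDR
write_kubeadm_config() {
    out_file=$1
    advertise_ip=$2
    k8s_version=$3
    pod_cidr=$4
    cat > "$out_file" <<EOF
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: ${advertise_ip}
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: ${k8s_version}
networking:
  podSubnet: ${pod_cidr}
EOF
}
```

For this guide's values: write_kubeadm_config kubeadm-config.yaml 192.168.100.10 v1.18.0 10.244.0.0/16, followed by kubeadm init --config kubeadm-config.yaml. When --config is used, the equivalent command-line flags should be left out.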

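If init fails with a message about the version, a newer release is available. The sample output below was captured from an earlier v1.16.1 installation, so the version numbers and IP addresses in it differ from this guide: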
[init] Using Kubernetes version: v1.16.1
[preflight] Running pre-flight checks
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.03.0-ce. Latest validated version: 18.09
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kub-k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.246.166]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kub-k8s-master localhost] and IPs [192.168.246.166 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kub-k8s-master localhost] and IPs [192.168.246.166 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 24.575209 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.16" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node kub-k8s-master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kub-k8s-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 93erio.hbn2ti6z50he0lqs
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.100.10:6443 --token 93erio.hbn2ti6z50he0lqs \
    --discovery-token-ca-cert-hash sha256:3bc60f06a19bd09f38f3e05e5cff4299011b7110ca3281796668f4edb29a56d9  # keep this command for later

=======================================================================================
  
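The above is the complete output of the initialization. From it you can see the key steps that a manual installation of a Kubernetes cluster would require.
The key items are:
    [kubelet-start] writes the kubelet configuration file "/var/lib/kubelet/config.yaml"
    [certs] generates the various certificates
    [kubeconfig] generates the kubeconfig files
    [bootstrap-token] generates the token; record it, because kubeadm join needs it later when adding nodes to the cluster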
  
Configure kubectl
Run the following on the master node
[root@kub-k8s-master ~]# rm -rf $HOME/.kube
[root@kub-k8s-master ~]# mkdir -p $HOME/.kube
[root@kub-k8s-master ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@kub-k8s-master ~]# chown $(id -u):$(id -g) $HOME/.kube/config

List the nodes
[root@k8s-master ~]# kubectl get nodes
NAME         STATUS     ROLES    AGE     VERSION
k8s-master   NotReady   master   2m41s   v1.18.0

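Configure the network plugin

Run on the master node
Download the manifest
# cd ~ && mkdir flannel && cd flannel
# curl -O https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Edit kube-flannel.yml:
The network here must match the --pod-network-cidr passed to kubeadm init; it already does by default, so no change is needed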
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
# Note that kube-flannel.yml uses flannel image version 0.11.0 by default:
# quay.io/coreos/flannel:v0.11.0-amd64. Pull it in advance.


# If a node has more than one NIC, see issue 39701:
# https://github.com/kubernetes/kubernetes/issues/39701
# For now, use the --iface argument in kube-flannel.yml to name the NIC on the cluster's internal network;
# otherwise DNS resolution may fail and containers may be unable to communicate. Download kube-flannel.yml locally
# and add --iface=<iface-name> to the flanneld arguments:
    containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=ens33
        - --iface=eth0
        
Note: set --iface to the name of your own NIC (ens33 here). The flag can be given more than once to list several candidate NICs.
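
To find the right value for --iface on a given node, one option is to look up which NIC carries the node's cluster IP. The sketch below parses the one-line-per-address output of ip -o -4 addr show; the function name is hypothetical and the parsing assumes that command's usual column layout (index, interface, "inet", address/prefix).

```shell
# Read `ip -o -4 addr show` output on stdin and print the name of the
# interface that holds the given IPv4 address.
# $1: the node's IP address, e.g. 192.168.100.10
iface_for_ip() {
    awk -v ip="$1" '$3 == "inet" { split($4, a, "/"); if (a[1] == ip) { print $2; exit } }'
}
```

On the master in this guide, ip -o -4 addr show | iface_for_ip 192.168.100.10 would print the NIC name to put after --iface=.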

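# kubeadm 1.12 additionally sets a taint on node1: node.kubernetes.io/not-ready:NoSchedule.
# The idea is simple: a node that is not ready does not accept workloads. But a node cannot become ready until the Kubernetes network plugin is deployed.
# So edit kube-flannel.yml to add a toleration for the node.kubernetes.io/not-ready:NoSchedule taint
# (the existing "operator: Exists, effect: NoSchedule" toleration already matches every NoSchedule taint, so the extra entry only makes the intent explicit):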
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - arm64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      - key: node.kubernetes.io/not-ready  # add these three lines (around line 261 of the file)
        operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel

Deploy:
# kubectl apply -f ~/flannel/kube-flannel.yml  # wait a while after applying
# kubectl get pods -n kube-system
NAME                                     READY   STATUS    RESTARTS   AGE
coredns-5644d7b6d9-sm8hs                 1/1     Running   0          9m18s
coredns-5644d7b6d9-vddll                 1/1     Running   0          9m18s
etcd-kub-k8s-master                      1/1     Running   0          8m14s
kube-apiserver-kub-k8s-master            1/1     Running   0          8m17s
kube-controller-manager-kub-k8s-master   1/1     Running   0          8m20s
kube-flannel-ds-amd64-9wgd8              1/1     Running   0          8m42s
kube-proxy-sgphs                         1/1     Running   0          9m18s
kube-scheduler-kub-k8s-master            1/1     Running   0          8m10s

Check:
# kubectl get pods --namespace kube-system
# kubectl get service
# kubectl get svc --namespace kube-system
The node shows Ready only after the network plugin has been installed and configured

On all worker nodes

Join the worker nodes to the cluster:

Run on every worker node; this is the command printed by a successful kubeadm init on the master
# kubeadm join 192.168.100.10:6443 --token 93erio.hbn2ti6z50he0lqs \
    --discovery-token-ca-cert-hash sha256:3bc60f06a19bd09f38f3e05e5cff4299011b7110ca3281796668f4edb29a56d9

On the master:

Checks:
1. List the pods:
[root@kub-k8s-master ~]# kubectl get pods -n kube-system
NAME                                     READY   STATUS    RESTARTS   AGE
coredns-5644d7b6d9-sm8hs                 1/1     Running   0          39m
coredns-5644d7b6d9-vddll                 1/1     Running   0          39m
etcd-kub-k8s-master                      1/1     Running   0          37m
kube-apiserver-kub-k8s-master            1/1     Running   0          38m
kube-controller-manager-kub-k8s-master   1/1     Running   0          38m
kube-flannel-ds-amd64-9wgd8              1/1     Running   0          38m
kube-flannel-ds-amd64-lffc8              1/1     Running   0          2m11s
kube-flannel-ds-amd64-m8kk2              1/1     Running   0          2m2s
kube-proxy-dwq9l                         1/1     Running   0          2m2s
kube-proxy-l77lz                         1/1     Running   0          2m11s
kube-proxy-sgphs                         1/1     Running   0          39m
kube-scheduler-kub-k8s-master            1/1     Running   0          37m

2. Inspect a failing pod:
[root@kub-k8s-master ~]# kubectl  describe pods kube-flannel-ds-sr6tq -n  kube-system
Name:               kube-flannel-ds-sr6tq
Namespace:          kube-system
Priority:           0
PriorityClassName:  <none>
.....
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Pulling    12m                  kubelet, node2     pulling image "registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64"
  Normal   Pulled     11m                  kubelet, node2     Successfully pulled image "registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64"
  Normal   Created    11m                  kubelet, node2     Created container
  Normal   Started    11m                  kubelet, node2     Started container
  Normal   Created    11m (x4 over 11m)    kubelet, node2     Created container
  Normal   Started    11m (x4 over 11m)    kubelet, node2     Started container
  Normal   Pulled     10m (x5 over 11m)    kubelet, node2     Container image "registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64" already present on machine
  Normal   Scheduled  7m15s                default-scheduler  Successfully assigned kube-system/kube-flannel-ds-sr6tq to node2
  Warning  BackOff    7m6s (x23 over 11m)  kubelet, node2     Back-off restarting failed container

3. If this happens, simply delete the failing pod; its DaemonSet recreates it:
[root@kub-k8s-master ~]# kubectl delete pod kube-flannel-ds-sr6tq -n kube-system
pod "kube-flannel-ds-sr6tq" deleted
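
With more than a handful of pods it helps to filter the listing down to the ones that need attention before deleting anything. A minimal sketch follows; the function name is hypothetical, and it assumes the default kubectl get pods column order (NAME, READY, STATUS, ...).

```shell
# Read `kubectl get pods` output on stdin and print the names of pods whose
# STATUS column is neither Running nor Completed. The header row is skipped.
not_running_pods() {
    awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1 }'
}
```

For example, kubectl get pods -n kube-system | not_running_pods. Deleting a pod only restarts it, so if the same pod keeps failing it is worth reading its kubectl logs and kubectl describe output first.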

4. List the pods again:
[root@kub-k8s-master ~]# kubectl get pods -n kube-system
NAME                                     READY   STATUS    RESTARTS   AGE
coredns-5644d7b6d9-sm8hs                 1/1     Running   0          44m
coredns-5644d7b6d9-vddll                 1/1     Running   0          44m
etcd-kub-k8s-master                      1/1     Running   0          42m
kube-apiserver-kub-k8s-master            1/1     Running   0          43m
kube-controller-manager-kub-k8s-master   1/1     Running   0          43m
kube-flannel-ds-amd64-9wgd8              1/1     Running   0          43m
kube-flannel-ds-amd64-lffc8              1/1     Running   0          7m10s
kube-flannel-ds-amd64-m8kk2              1/1     Running   0          7m1s
kube-proxy-dwq9l                         1/1     Running   0          7m1s
kube-proxy-l77lz                         1/1     Running   0          7m10s
kube-proxy-sgphs                         1/1     Running   0          44m
kube-scheduler-kub-k8s-master            1/1     Running   0          42m

5. List the nodes:
[root@kub-k8s-master ~]# kubectl get nodes
NAME             STATUS   ROLES    AGE     VERSION
kub-k8s-master   Ready    master   43m     v1.18.0
kub-k8s-node1    Ready    <none>   6m46s   v1.18.0
kub-k8s-node2    Ready    <none>   6m37s   v1.18.0
The cluster setup is now complete

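Troubleshooting

Problem 1: errors occur when the server clocks are out of sync
Check the time on each server:
# date
=====================================
Problem 2: kubeadm init fails; it prints the following message and then times out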
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s

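Checking the kubelet status showed the errors below: the node "master" could not be found and an image pull failed. kubelet was trying to pull the pause image from aliyuncs, although I had already downloaded the official pause image. I retagged the local pause image with the Aliyun name shown in the error, reset the kubeadm environment and ran init again, which resolved the error.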
[root@master manifests]# systemctl  status kubelet -l
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since 四 2019-01-31 15:20:32 CST; 5min ago
     Docs: https://kubernetes.io/docs/
 Main PID: 23908 (kubelet)
    Tasks: 19
   Memory: 30.8M
   CGroup: /system.slice/kubelet.service
           └─23908 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1 --cgroup-driver=cgroupfs --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1

1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.432357   23908 kubelet.go:2266] node "master" not found
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.532928   23908 kubelet.go:2266] node "master" not found
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.633192   23908 kubelet.go:2266] node "master" not found
1月 31 15:25:41 master kubelet[23908]: I0131 15:25:41.729296   23908 kubelet_node_status.go:278] Setting node annotation to enable volume controller attach/detach
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.733396   23908 kubelet.go:2266] node "master" not found
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.740110   23908 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1": Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/: dial tcp 0.0.0.80:443: connect: invalid argument
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.740153   23908 kuberuntime_sandbox.go:68] CreatePodSandbox for pod "kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)" failed: rpc error: code = Unknown desc = failed pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1": Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/: dial tcp 0.0.0.80:443: connect: invalid argument
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.740166   23908 kuberuntime_manager.go:662] createPodSandbox for pod "kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)" failed: rpc error: code = Unknown desc = failed pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1": Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/: dial tcp 0.0.0.80:443: connect: invalid argument
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.740207   23908 pod_workers.go:190] Error syncing pod e8f43404e60ae844e375d50b1e39d91e ("kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)"), skipping: failed to "CreatePodSandbox" for "kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)" with CreatePodSandboxError: "CreatePodSandbox for pod \"kube-controller-manager-master_kube-system(e8f43404e60ae844e375d50b1e39d91e)\" failed: rpc error: code = Unknown desc = failed pulling image \"registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1\": Error response from daemon: Get https://registry.cn-hangzhou.aliyuncs.com/v2/: dial tcp 0.0.0.80:443: connect: invalid argument"
1月 31 15:25:41 master kubelet[23908]: E0131 15:25:41.833981   23908 kubelet.go:2266] node "master" not found

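Solution

Reset the kubeadm environment
Reset or remove every node in the cluster (including the master)
1. Drain the pods from kub-k8s-node1 (run on the master)
[root@kub-k8s-master ~]# kubectl drain kub-k8s-node1 --delete-local-data --force --ignore-daemonsets

2. Delete the node (run on the master)
[root@kub-k8s-master ~]# kubectl delete node kub-k8s-node1

3. Reset the node (run on the node that was just deleted)
[root@kub-k8s-node1 ~]# kubeadm reset

Note 1: the master must also be drained, deleted and reset. This cost me dearly: the first time I did not drain and delete the master, everything looked normal afterwards but CoreDNS simply would not work, and it took a full day to track down. Do not skip it.

Note 2: after the reset, delete the following on the master
# rm -rf /var/lib/cni/ $HOME/.kube/config

### Note: if the whole cluster has already been built and needs resetting, follow all the steps above. If only the initialization failed, step 3 alone is enough.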

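Regenerating the token

Adding nodes to the cluster after the token generated by kubeadm has expired

After initialization, kubeadm prints the token that nodes use to join: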
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 192.168.100.10:6443 --token n38l80.y2icehgzsyuzkthi \
    --discovery-token-ca-cert-hash sha256:5fb6576ef82b5655dee285e0c93432aee54d38779bc8488c32f5cbbb90874bac
By default the token is valid for 24 hours; once it expires it can no longer be used.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Solution:
1. Generate a new token:
[root@node1 flannel]# kubeadm  token create
kiyfhw.xiacqbch8o8fa8qj
[root@node1 flannel]# kubeadm  token list
TOKEN                     TTL         EXPIRES                     USAGES                   DESCRIPTION   EXTRA GROUPS
gvvqwk.hn56nlsgsv11mik6   <invalid>   2018-10-25T14:16:06+08:00   authentication,signing   <none>        system:bootstrappers:kubeadm:default-node-token
kiyfhw.xiacqbch8o8fa8qj   23h         2018-10-27T06:39:24+08:00   authentication,signing   <none>        system:bootstrappers:kubeadm:default-node-token

2. Get the SHA-256 hash of the CA certificate's public key:
[root@node1 flannel]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
5417eb1b68bd4e7a4c82aded83abc55ec91bd601e45734d6aba85de8b1ebb057

3. Join the node to the cluster:
  kubeadm join 18.16.202.35:6443 --token kiyfhw.xiacqbch8o8fa8qj --discovery-token-ca-cert-hash sha256:5417eb1b68bd4e7a4c82aded83abc55ec91bd601e45734d6aba85de8b1ebb057
After a few seconds the node should appear in the output of kubectl get nodes run on the master.

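The steps above are tedious; a single command does it all:
kubeadm token create --print-join-command

Second method (--ttl=0 creates a token that never expires, so use it with care):
token=$(kubeadm token generate)
kubeadm token create $token --print-join-command --ttl=0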
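
When the join command is assembled by hand from a token and a CA hash, a quick format check catches copy-and-paste mistakes before they reach a worker node. The sketch below is illustrative only: the function names are made up, and the checks cover the shape of the values (a 6-character ID, a dot and a 16-character secret for the token; 64 hex digits for the hash), not whether they are valid for a particular cluster.

```shell
# Succeed only if $1 has the bootstrap token shape: 6 characters, a dot,
# 16 characters, all lowercase letters or digits.
is_bootstrap_token() {
    echo "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Print a kubeadm join command built from its three parts.
# $1: API server endpoint (host:port)  $2: token  $3: CA hash (64 hex digits)
build_join_command() {
    is_bootstrap_token "$2" || { echo "invalid token: $2" >&2; return 1; }
    echo "$3" | grep -Eq '^[0-9a-f]{64}$' || { echo "invalid CA hash" >&2; return 1; }
    echo "kubeadm join $1 --token $2 --discovery-token-ca-cert-hash sha256:$3"
}
```

With the values from steps 1 and 2 above, build_join_command 192.168.100.10:6443 followed by the token and hash prints the full command to run on the new node.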