Setting up a k8s cluster the easy way with kubeadm (updated to v1.13.0)

1. Basic system environment

1.1 System kernel

Check the current kernel version (mine is 5.0.5-1.el7.elrepo.x86_64):

uname -a

The kernel version must be 3.10 or later; otherwise, upgrade it:

# Add the ELRepo repository (check /etc/yum.repos.d/ first to see whether it is already configured)
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm

# List available kernels
yum --disablerepo="*" --enablerepo="elrepo-kernel" list available

# Install the latest mainline kernel
yum --enablerepo=elrepo-kernel install kernel-ml

# List the grub menu entries; the kernel installed above is usually entry 0
awk -F\' '$1=="menuentry " {print i++ " : " $2}' /etc/grub2.cfg

# Make the newest kernel the default
grub2-set-default 0

# Regenerate the grub configuration file
grub2-mkconfig -o /boot/grub2/grub.cfg

# Reboot
reboot

# Verify
uname -a

1.2 Disable SELinux and the firewall

setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
# Or edit the file and change the line to SELINUX=disabled
vim /etc/selinux/config

systemctl stop firewalld
systemctl disable firewalld
# Verify
systemctl status firewalld

1.3 Ensure the hostname, MAC address, and product UUID are unique

# Check the MAC address
cat /sys/class/net/ens33/address

# Check the product UUID
cat /sys/class/dmi/id/product_uuid

1.4 Disable swap

Run swapoff -a and comment out the swap entry in /etc/fstab:

/dev/mapper/centos-swap swap                    swap    defaults        0 0

Confirm that swap is off (the swap line should show 0):

free -m
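
If you prefer a single command, the following sketch disables swap and comments out the entry in one go (it assumes the /etc/fstab entry contains the word "swap" and keeps a .bak backup of the file):

swapoff -a && sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab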

1.5 Configure bridged traffic to avoid errors

Load the br_netfilter module:

# Option 1: check whether the module is already loaded
lsmod | grep br_netfilter

# Option 2: load the module
modprobe br_netfilter

Configure the kernel parameters on every node so that traffic crossing the bridge also passes through the iptables/netfilter framework. This can go into /etc/sysctl.conf, or into a configuration file created under any of /usr/lib/sysctl.d/, /run/sysctl.d/, or /etc/sysctl.d/, for example:

cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

Load the configuration:

# Load the settings from /etc/sysctl.d/k8s.conf
sysctl -p /etc/sysctl.d/k8s.conf
# Load all configuration files from the standard directories
sysctl --system

1.6 Node hosts file

The nodes are laid out as follows (adjust the IPs to match your own VMs); add these entries to the hosts file on every machine:

10.4.37.24 k8smaster
10.4.37.69 k8snode1
10.4.37.72 k8snode2

Alternatively, after cloning the machines, change each machine's hostname directly (the other nodes are configured in the same way):

# Change the hostname
hostname k8smaster
# Edit the hosts file and append
vim /etc/hosts
127.0.0.1 k8smaster
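
Note that the hostname command above only changes the name for the current boot; to make it persistent you can use hostnamectl instead (run it on each node with that node's own name):

hostnamectl set-hostname k8smaster
# verify
hostnamectl status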

2. Install Docker

See External Dependencies for the detailed version compatibility matrix.

Add the Docker yum repository (yum repo configuration lives under /etc/yum.repos.d/):

yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

[Note] If you do not have a proxy, switch the repo above to a domestic mirror; Aliyun is used here:

yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

List the available docker-ce versions:

yum list docker-ce --showduplicates | sort -r

Install a specific Docker version (the current latest Kubernetes release, 1.14.0, supports Docker 1.13.1, 17.03, 17.06, 17.09, 18.06, and 18.09):

yum install docker-ce-17.09.1.ce-1.el7.centos

Create the Docker configuration directory and the daemon configuration file:

mkdir /etc/docker
vim /etc/docker/daemon.json

Write the following (the cgroup driver can be changed back to the default cgroupfs if you prefer; a registry mirror/accelerator address can be obtained from Aliyun):

{
    "registry-mirrors": ["https://xxxx.mirror.aliyuncs.com"],
    "exec-opts": ["native.cgroupdriver=systemd"]
}

Configure the forwarding policy:

iptables -P FORWARD ACCEPT

Enable Docker on boot and restart it:

systemctl daemon-reload && systemctl enable docker && systemctl restart docker
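
To confirm that the daemon.json settings took effect after the restart, a quick check (the output field names may vary slightly between Docker versions):

docker info | grep -i 'cgroup driver'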

[Appendix]

If the installed Docker version is not suitable, upgrade it, or remove it and redo the installation above:

yum -y remove docker*
rm -rf /var/lib/docker

3. Install kubelet, kubectl, and kubeadm (v1.14.0)

  • kubeadm: the command-line tool that bootstraps the cluster;
  • kubelet: the core component that starts Pods and containers; it must run on every node in the cluster;
  • kubectl: the command-line client used to talk to the cluster;

3.1 Configure the yum repository

If you can reach Google directly (e.g. through a proxy such as Shadowsocks), use the official Google repository:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kube*
EOF

Import the Kubernetes package signing keys. On a yum-based system the gpgkey entries in the repo file above already take care of this; the equivalent step on a Debian/Ubuntu system would be:

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -

[Note] If you do not have a proxy, switch the repo above to a domestic mirror; Aliyun is used here:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Install the latest Kubernetes tools and configure them to start on boot:

# Install the latest kubelet, kubeadm, and kubectl
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
# Enable kubelet to start on boot
systemctl daemon-reload && systemctl enable kubelet && systemctl start kubelet

When Docker is used, kubeadm automatically detects the cgroup driver for the kubelet and writes it to /var/lib/kubelet/kubeadm-flags.env at runtime. If you use a different CRI (container runtime), you need to edit /etc/default/kubelet (the file has to be created manually) and specify the cgroup-driver value there:

KUBELET_EXTRA_ARGS=--cgroup-driver=systemd

Since the kubelet's default driver is cgroupfs, this only needs to be set when the CRI's cgroup driver is not cgroupfs (Kubernetes recommends configuring Docker's cgroup driver as systemd).
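
To compare the two sides later, you can check the flags kubeadm detected against what Docker reports (note that /var/lib/kubelet/kubeadm-flags.env is only written once kubeadm init or kubeadm join has run):

# driver the kubelet was started with
cat /var/lib/kubelet/kubeadm-flags.env
# driver Docker is using
docker info | grep -i 'cgroup driver'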

Pull the core component images (if your proxy is unstable you may need to retry a few times):

kubeadm config images pull

Note:

If the images cannot be pulled, run kubeadm config images list --kubernetes-version=v1.13.0 to see exactly which images are required:

k8s.gcr.io/kube-apiserver:v1.13.0
k8s.gcr.io/kube-controller-manager:v1.13.0
k8s.gcr.io/kube-scheduler:v1.13.0
k8s.gcr.io/kube-proxy:v1.13.0
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.2.24
k8s.gcr.io/coredns:1.2.6

Then pull the corresponding mirrored Google images from Docker Hub and retag them afterwards (a loop version of these commands is sketched after them):

# Pull the images above from docker.io
docker pull mirrorgooglecontainers/kube-apiserver:v1.13.0
docker pull mirrorgooglecontainers/kube-controller-manager:v1.13.0
docker pull mirrorgooglecontainers/kube-scheduler:v1.13.0
docker pull mirrorgooglecontainers/kube-proxy:v1.13.0
docker pull mirrorgooglecontainers/pause:3.1
docker pull mirrorgooglecontainers/etcd:3.2.24
docker pull coredns/coredns:1.2.6

# Retag with the k8s.gcr.io name
docker tag mirrorgooglecontainers/kube-apiserver:v1.13.0 k8s.gcr.io/kube-apiserver:v1.13.0
# Remove the old image
docker rmi mirrorgooglecontainers/kube-apiserver:v1.13.0

# Retag
docker tag mirrorgooglecontainers/kube-controller-manager:v1.13.0 k8s.gcr.io/kube-controller-manager:v1.13.0
# Remove the old image
docker rmi mirrorgooglecontainers/kube-controller-manager:v1.13.0

# Retag
docker tag mirrorgooglecontainers/kube-scheduler:v1.13.0 k8s.gcr.io/kube-scheduler:v1.13.0
# Remove the old image
docker rmi mirrorgooglecontainers/kube-scheduler:v1.13.0

# Retag
docker tag mirrorgooglecontainers/kube-proxy:v1.13.0 k8s.gcr.io/kube-proxy:v1.13.0
# Remove the old image
docker rmi mirrorgooglecontainers/kube-proxy:v1.13.0

# Retag
docker tag mirrorgooglecontainers/pause:3.1 k8s.gcr.io/pause:3.1
# Remove the old image
docker rmi mirrorgooglecontainers/pause:3.1

# Retag
docker tag mirrorgooglecontainers/etcd:3.2.24 k8s.gcr.io/etcd:3.2.24
# Remove the old image
docker rmi mirrorgooglecontainers/etcd:3.2.24

# Retag
docker tag coredns/coredns:1.2.6 k8s.gcr.io/coredns:1.2.6
# Remove the old image
docker rmi coredns/coredns:1.2.6
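
The same pull/tag/rmi sequence can be written as a short loop; this is only a convenience sketch and assumes the exact image list and versions shown above:

images="kube-apiserver:v1.13.0 kube-controller-manager:v1.13.0 kube-scheduler:v1.13.0 kube-proxy:v1.13.0 pause:3.1 etcd:3.2.24"
for img in $images; do
    docker pull mirrorgooglecontainers/$img
    docker tag mirrorgooglecontainers/$img k8s.gcr.io/$img
    docker rmi mirrorgooglecontainers/$img
done
# coredns lives under a different namespace on Docker Hub
docker pull coredns/coredns:1.2.6
docker tag coredns/coredns:1.2.6 k8s.gcr.io/coredns:1.2.6
docker rmi coredns/coredns:1.2.6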

3.2 Clone the machines

 Once the basic steps above are done, clone this machine twice for convenience to serve as a test cluster, and change the node IPs in the hosts file to the IPs of the two cloned machines.

4. Master node

4.1 Cluster initialization

The master needs at least 2 CPU cores and 4 GB of RAM.

Add flannel:

# Pull the flannel image
docker pull quay.io/coreos/flannel:v0.10.0-amd64

mkdir -p /etc/cni/net.d/

cat <<EOF> /etc/cni/net.d/10-flannel.conf
{"name":"cbr0","type":"flannel","delegate": {"isDefaultGateway": true}}
EOF

mkdir /usr/share/oci-umount/oci-umount.d -p
mkdir /run/flannel/

cat <<EOF> /run/flannel/subnet.env
FLANNEL_NETWORK=172.100.0.0/16
FLANNEL_SUBNET=172.100.1.0/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
EOF

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml

Use kubeadm init to set up the master node in one step:

kubeadm init --pod-network-cidr 10.244.0.0/16 --kubernetes-version stable

A specific Kubernetes version can be pinned (e.g. --kubernetes-version=v1.9.1); here the latest stable release (1.14.0) is used.
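
If the nodes only have pre-pulled images for one specific version (for example the v1.13.0 list above), it can be safer to pin that version explicitly instead of relying on "stable"; a sketch, with the version adjusted to whatever you actually pulled:

kubeadm init --pod-network-cidr 10.244.0.0/16 --kubernetes-version v1.13.0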

Then just wait. If anything goes wrong, check /var/log/messages and the kubelet logs:

journalctl -f -u kubelet

The kubeadm drop-in configuration file for the kubelet is /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf.

After modifying the configuration file, run:

# Reload the configuration and restart
systemctl daemon-reload
systemctl restart kubelet
systemctl enable kubelet && systemctl start kubelet

If an error occurs along the way, reset and re-initialize:

kubeadm reset
kubeadm init --pod-network-cidr 10.244.0.0/16 --kubernetes-version stable

kubeadm init prints output like the following:

[init] Using Kubernetes version: v1.14.0
[preflight] Running pre-flight checks
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8smaster localhost] and IPs [10.4.37.24 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8smaster localhost] and IPs [10.4.37.24 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8smaster kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.4.37.24]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 30.503797 seconds
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --experimental-upload-certs
[mark-control-plane] Marking the node k8smaster as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8smaster as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: pl1vir.d7e5xy3xy3uuymou
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.4.37.24:6443 --token pl1vir.d7e5xy3xy3uuymou \
    --discovery-token-ca-cert-hash sha256:b6ecd6ad73e072f2290a14213e32b681cd41c9010a9403bb32e1e213f7c167d2

Be sure to record the kubeadm join ... command printed by kubeadm init; it is needed later to join the other nodes to the cluster, so copy it somewhere safe first. It really is that important (no kidding...).

kubeadm join 10.4.37.24:6443 --token pl1vir.d7e5xy3xy3uuymou \
    --discovery-token-ca-cert-hash sha256:b6ecd6ad73e072f2290a14213e32b681cd41c9010a9403bb32e1e213f7c167d2

The token above (valid for 24 hours) is used for mutual authentication between the master and the joining nodes; anyone in possession of it can join authenticated nodes to this cluster. To create, delete, or list tokens, use the kubeadm token command; see the kubeadm token documentation for details (a few common invocations are sketched below).
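
For reference, the most common token operations look like this (a sketch; the token value is simply the one from the output above):

# list tokens and their expiry
kubeadm token list
# create an additional token
kubeadm token create
# revoke a token
kubeadm token delete pl1vir.d7e5xy3xy3uuymou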

Following the prompt, to use kubectl as a non-root user, run:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

If everything was installed as root (my local VMs all run as root), you can simply run:

export KUBECONFIG=/etc/kubernetes/admin.conf

Check the cluster status:

# Check component status
kubectl get cs

# Output
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok                  
scheduler            Healthy   ok                  
etcd-0               Healthy   {"health":"true"}

[Note]

 While checking the kubelet logs afterwards, some issues still showed up, for example:

4月 01 13:41:52 k8smaster kubelet[21177]: W0401 13:41:52.869421   21177 cni.go:213] Unable to update cni config: No networks found in /etc/cni/net.d
4月 01 13:41:56 k8smaster kubelet[21177]: E0401 13:41:56.656635   21177 kubelet.go:2170] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

4.2 Pod network add-on (important)

 The Pod network add-on is important: its main job is to let Pods communicate with each other. It must be deployed before any application, and CoreDNS will only start properly once the network add-on is installed. kubeadm only supports CNI-based (Container Network Interface) networks and does not support kubenet. Several projects provide Kubernetes Pod networks using CNI, some of which also support Network Policy.

 The Pod network must not overlap with any host network, or problems are likely. If the network plugin's preferred Pod network conflicts with one of your host networks, pick a suitable replacement CIDR, pass it to kubeadm init via --pod-network-cidr, and use it as the replacement value in the network plugin's YAML file. As prompted in the cluster initialization output above, install the Pod network add-on with:

kubectl apply -f [podnetwork].yaml

Only one Pod network add-on can be installed per cluster. The options include Calico, Canal, Cilium, Flannel, Kube-router, and so on; Flannel is the one used here. Bridged IPv4 traffic must be passed to the iptables chains (important) for CNI to work correctly, which simply requires setting /proc/sys/net/bridge/bridge-nf-call-iptables to 1 (already done during the preparation steps):

sysctl net.bridge.bridge-nf-call-iptables=1

Flannel relies on kube-controller-manager to allocate a CIDR (Classless Inter-Domain Routing) range to each node, so kubeadm init has to be run with the --pod-network-cidr flag. The initialization above already included that flag; verify:

kubectl get pods --all-namespaces

# Result
NAMESPACE     NAME                                READY   STATUS    RESTARTS   AGE
kube-system   coredns-fb8b8dccf-gzs2k             1/1     Running   0          27h
kube-system   coredns-fb8b8dccf-hs56b             1/1     Running   0          27h
kube-system   etcd-k8smaster                      1/1     Running   1          27h
kube-system   kube-apiserver-k8smaster            1/1     Running   1          27h
kube-system   kube-controller-manager-k8smaster   1/1     Running   1          27h
kube-system   kube-flannel-ds-z7r6t               1/1     Running   0          5h31m
kube-system   kube-proxy-75w9l                    1/1     Running   1          27h
kube-system   kube-scheduler-k8smaster            1/1     Running   1          27h

kubectl get pods --all-namespaces tells you whether the network add-on was installed correctly by showing whether CoreDNS is running.
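
To narrow that check down to CoreDNS only, you can filter by label (kubeadm deploys CoreDNS with the legacy k8s-app=kube-dns label):

kubectl -n kube-system get pods -l k8s-app=kube-dns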

4.3 Control-plane node isolation

 By default, for security reasons, the cluster does not schedule Pods onto the master. If you do want Pods scheduled on the master, for example on a single-node Kubernetes cluster, run:

kubectl taint nodes --all node-role.kubernetes.io/master-

The command above removes the node-role.kubernetes.io/master taint from every node that carries it (including the master), after which the scheduler can place Pods on any node. Since I am not building a single-node cluster, I skip this step.
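
If you remove the taint on a single-node cluster and later want to restore the default behaviour, the taint can be re-added (a sketch, using the master node name from this setup):

kubectl taint nodes k8smaster node-role.kubernetes.io/master=:NoSchedule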

5. Worker nodes

5.1 Adding worker nodes to the cluster

Worker nodes need at least 4 CPU cores and 16 GB of RAM.

To allow the other nodes to use kubectl commands as well, copy the /etc/kubernetes/admin.conf file generated on the master node to the corresponding location on the worker nodes.

With the master node set up, move on to the ordinary worker nodes, i.e. the nodes that run the actual workloads. Before starting, configure passwordless SSH login between the nodes, then join them with the command copied from the initialization output above:

kubeadm join 10.4.37.24:6443 --token pl1vir.d7e5xy3xy3uuymou --discovery-token-ca-cert-hash sha256:b6ecd6ad73e072f2290a14213e32b681cd41c9010a9403bb32e1e213f7c167d2

If a long time has passed since the cluster was initialized, check whether the token has expired:

# List tokens
kubeadm token list
# Result
TOKEN                     TTL       EXPIRES                     USAGES                   DESCRIPTION                                                EXTRA GROUPS
pl1vir.d7e5xy3xy3uuymou   17h       2019-04-02T11:01:09+08:00   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:bootstrappers:kubeadm:default-node-token

As shown, the token generated during initialization expires at 2019-04-02T11:01:09 (24 hours after creation). If it has expired, create a new token manually:

kubeadm token create

This generates a new token (similar to pl1vir.d7e5xy3xy3uuymou). You also need the discovery-token-ca-cert-hash digest, which can be obtained with the following command:

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
   openssl dgst -sha256 -hex | sed 's/^.* //'

My second join command was:

kubeadm join 10.4.37.24:6443 --token 4lqozr.xyawinzvspo4zha7 --discovery-token-ca-cert-hash sha256:b6ecd6ad73e072f2290a14213e32b681cd41c9010a9403bb32e1e213f7c167d2

Alternatively, kubeadm token create --print-join-command prints the complete join command directly, which can then be run on a node to join it to the cluster. In practice, joining a worker node kept hanging for an unknown reason; checking the logs with journalctl -xeu kubelet revealed the following error:

4月 02 09:15:25 k8snode1 kubelet[75413]: E0402 09:15:25.661220   75413 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)

This problem blocked me for quite a while and I never found out how to solve it.

Reset the environment and try joining again:

# Reset kubeadm
kubeadm reset
# Flush iptables
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

Run kubectl get nodes on the master node to check whether the nodes have joined; the node status was:

[root@bogon yum.repos.d]# kubectl get nodes
NAME         STATUS     ROLES    AGE   VERSION
k8s-master   Ready      master   15h   v1.13.0
k8s-node-2   NotReady   <none>   15h   v1.13.0
localhost    NotReady   <none>   15h   v1.13.0

Check the kubelet logs on the two nodes with journalctl -xeu kubelet:

4月 04 09:14:01 k8s-node-1 kubelet[23444]: W0404 09:14:01.200847   23444 cni.go:203] Unable to update cni config: No networks found in /etc/cni/net.d
4月 04 09:14:01 k8s-node-1 kubelet[23444]: E0404 09:14:01.203078   23444 kubelet.go:2192] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

Could it be that flannel also needs to be set up on the worker nodes? After doing so on both nodes, the problem disappeared.

[Note]

Later, while checking the cluster nodes with kubectl get nodes, the error The connection to the server localhost:8080 was refused - did you specify the right host or port? suddenly appeared. The fix:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

5.2 Controlling the cluster from an ordinary worker node (broadly, any non-master node) (optional)

 To use kubectl on other machines to communicate with the cluster, copy the configuration file from the master node to every node that needs kubectl access. On the worker node, copy it remotely:

# Copy the configuration file from the master node to this machine
scp root@<master ip>:/etc/kubernetes/admin.conf $HOME/.kube/config
# Get the cluster nodes
kubectl --kubeconfig $HOME/.kube/config get nodes

In practice, once the file has been copied, a plain kubectl get nodes already returns the cluster node information, since kubectl reads $HOME/.kube/config by default.

[Note]

Later in the process, the master node suddenly reported: The connection to the server localhost:8080 was refused - did you specify the right host or port?. Copying the configuration file into place fixes it: cp /etc/kubernetes/admin.conf $HOME/.kube/config

5.3 Proxying the API server locally (optional)

 If you want to connect to the API server from outside the cluster, you can use kubectl proxy:

scp root@<master ip>:/etc/kubernetes/admin.conf $HOME/.kube/admin.conf
kubectl --kubeconfig $HOME/.kube/admin.conf proxy

You can then access http://localhost:8001/api/v1 locally.
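
While the proxy is running, the API can be queried with plain HTTP from that machine, for example:

curl http://localhost:8001/api/v1/namespaces/kube-system/pods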

5.4 Teardown

 This process mainly undoes what kubeadm has set up. Before uninstalling, make sure the node has been drained, talking to the master with proper credentials:

# Drain the node
kubectl drain <node name> --delete-local-data --force --ignore-daemonsets

# Delete the node
kubectl delete node <node name>

Then reset the environment installed by kubeadm on the node being removed:

kubeadm reset

The reset above does not clean up iptables rules or IPVS tables; if needed, do that manually:

iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

To reset the IPVS tables, run:

ipvsadm -C

After that, you can simply run kubeadm init or kubeadm join again whenever needed.

[Note]

Note: during later use, after a system upgrade, the kubectl command stopped working:

[root@k8s-master ~]# kubectl get pods
The connection to the server 10.4.37.17:6443 was refused - did you specify the right host or port?
[root@k8s-master ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: activating (auto-restart) (Result: exit-code) since 日 2019-05-05 14:27:27 CST; 10s ago
     Docs: https://kubernetes.io/docs/
  Process: 22295 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
 Main PID: 22295 (code=exited, status=255)

5月 05 14:27:38 k8s-master kubelet[22335]: I0505 14:27:38.036646   22335 client.go:75] Connecting to docker on unix:///var/run/docker.sock
5月 05 14:27:38 k8s-master kubelet[22335]: I0505 14:27:38.036659   22335 client.go:104] Start docker client with request timeout=2m0s
5月 05 14:27:38 k8s-master kubelet[22335]: W0505 14:27:38.037961   22335 docker_service.go:561] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
5月 05 14:27:38 k8s-master kubelet[22335]: I0505 14:27:38.037992   22335 docker_service.go:238] Hairpin mode set to "hairpin-veth"
5月 05 14:27:38 k8s-master kubelet[22335]: I0505 14:27:38.040956   22335 docker_service.go:253] Docker cri networking managed by cni
5月 05 14:27:38 k8s-master kubelet[22335]: I0505 14:27:38.058198   22335 docker_service.go:258] Docker Info: &{ID:CCPW:VM3Q:D47E:JZ5T:HAU2:4627:YCPL:NRPI:WSX2:NH6Y:VDSJ:5XUD Containers:30 ContainersRunning:0 ContainersPaused:0 ContainersStopped:3...ports d_type true]] 
5月 05 14:27:38 k8s-master kubelet[22335]: F0505 14:27:38.058260   22335 server.go:265] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
5月 05 14:27:38 k8s-master systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
5月 05 14:27:38 k8s-master systemd[1]: Unit kubelet.service entered failed state.
5月 05 14:27:38 k8s-master systemd[1]: kubelet.service failed.
Hint: Some lines were ellipsized, use -l to show in full.

So the kubelet service had died. Restarting it with systemctl restart kubelet did not help, so check the logs:

journalctl -xefu kubelet

# Result
5月 05 14:39:01 k8s-master systemd[1]: kubelet.service holdoff time over, scheduling restart.
5月 05 14:39:01 k8s-master systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kubelet.service has finished shutting down.
5月 05 14:39:01 k8s-master systemd[1]: Started kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kubelet.service has finished starting up.
-- 
-- The start-up result is done.
5月 05 14:39:02 k8s-master kubelet[24995]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
5月 05 14:39:02 k8s-master kubelet[24995]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.028334   24995 server.go:417] Version: v1.14.1
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.028511   24995 plugins.go:103] No cloud provider specified.
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.028531   24995 server.go:754] Client rotation is on, will bootstrap in background
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.030730   24995 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.096889   24995 server.go:625] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.097133   24995 container_manager_linux.go:261] container manager verified user specified cgroup-root exists: []
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.097145   24995 container_manager_linux.go:266] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerReconcilePeriod:10s ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms}
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.097216   24995 container_manager_linux.go:286] Creating device plugin manager: true
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.097263   24995 state_mem.go:36] [cpumanager] initializing new in-memory state store
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.097333   24995 state_mem.go:84] [cpumanager] updated default cpuset: ""
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.097340   24995 state_mem.go:92] [cpumanager] updated cpuset assignments: "map[]"
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.097388   24995 kubelet.go:279] Adding pod path: /etc/kubernetes/manifests
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.097406   24995 kubelet.go:304] Watching apiserver
5月 05 14:39:02 k8s-master kubelet[24995]: E0505 14:39:02.098718   24995 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:442: Failed to list *v1.Service: Get https://10.4.37.17:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.4.37.17:6443: connect: connection refused
5月 05 14:39:02 k8s-master kubelet[24995]: E0505 14:39:02.098881   24995 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.4.37.17:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dk8s-master&limit=500&resourceVersion=0: dial tcp 10.4.37.17:6443: connect: connection refused
5月 05 14:39:02 k8s-master kubelet[24995]: E0505 14:39:02.098919   24995 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Node: Get https://10.4.37.17:6443/api/v1/nodes?fieldSelector=metadata.name%3Dk8s-master&limit=500&resourceVersion=0: dial tcp 10.4.37.17:6443: connect: connection refused
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.099995   24995 client.go:75] Connecting to docker on unix:///var/run/docker.sock
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.100008   24995 client.go:104] Start docker client with request timeout=2m0s
5月 05 14:39:02 k8s-master kubelet[24995]: W0505 14:39:02.101221   24995 docker_service.go:561] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.101251   24995 docker_service.go:238] Hairpin mode set to "hairpin-veth"
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.105065   24995 docker_service.go:253] Docker cri networking managed by cni
5月 05 14:39:02 k8s-master kubelet[24995]: I0505 14:39:02.127753   24995 docker_service.go:258] Docker Info: &{ID:CCPW:VM3Q:D47E:JZ5T:HAU2:4627:YCPL:NRPI:WSX2:NH6Y:VDSJ:5XUD Containers:30 ContainersRunning:0 ContainersPaused:0 ContainersStopped:30 Images:41 Driver:overlay DriverStatus:[[Backing Filesystem xfs] [Supports d_type true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:19 OomKillDisable:true NGoroutines:28 SystemTime:2019-05-05T14:39:02.114019459+08:00 LoggingDriver:json-file CgroupDriver:cgroupfs NEventsListener:0 KernelVersion:3.10.0-957.10.1.el7.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc0005c2070 NCPU:4 MemTotal:3954184192 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:k8s-master Labels:[] ExperimentalBuild:false ServerVersion:17.09.1-ce ClusterStore: ClusterAdvertise: Runtimes:map[runc:{Path:docker-runc Args:[]}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil>} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:06b9cb35161009dcb7123345749fef02f7cea8e0 Expected:06b9cb35161009dcb7123345749fef02f7cea8e0} RuncCommit:{ID:3f2f8b84a77f73d38244dd690525642a72156c64 Expected:3f2f8b84a77f73d38244dd690525642a72156c64} InitCommit:{ID:949e6fa Expected:949e6fa} SecurityOptions:[name=seccomp,profile=default]}
5月 05 14:39:02 k8s-master kubelet[24995]: F0505 14:39:02.127894   24995 server.go:265] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
5月 05 14:39:02 k8s-master systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
5月 05 14:39:02 k8s-master systemd[1]: Unit kubelet.service entered failed state.
5月 05 14:39:02 k8s-master systemd[1]: kubelet.service failed.

The cgroup drivers were mismatched again, to my despair (I could have sworn everything had been changed to cgroupfs). Changing the driver fixes it:

vim /var/lib/kubelet/kubeadm-flags.env

# Change the --cgroup-driver value inside KUBELET_KUBEADM_ARGS to cgroupfs and leave the other flags untouched, for example:
KUBELET_KUBEADM_ARGS=--cgroup-driver=cgroupfs --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1
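
After editing the file, restart the kubelet and confirm that the two drivers now agree:

systemctl daemon-reload && systemctl restart kubelet
docker info | grep -i 'cgroup driver'
systemctl status kubelet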