Problems hit while installing a k8s cluster, and how I fixed them

Preface

Resumable-transfer mode ~

Notes

I'm on Ubuntu 16.04, so the first thing to do is configure an apt source. I recommend Aliyun's mirror: the GPG key is at https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg, and the CentOS repo is at https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64. If you have a proxy or are overseas, Google's own repo is the most convenient and saves a lot of trouble later.
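For Ubuntu, wiring the mirror into apt looks roughly like this. The key URL is the one above; the `kubernetes-xenial` suite name is what the mirror uses for 16.04, but treat the exact paths as assumptions to check against the mirror's own docs:

```shell
# Add Aliyun's Kubernetes apt mirror (Ubuntu 16.04 / xenial).
curl -fsSL https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main" \
    | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
```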

Installing docker plus kubeadm, kubectl, and kubelet is basically painless; just follow the docs. Once the tools are in, the next step is pulling the images.
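That install step is roughly the following on Ubuntu. The `1.11.3-00` package pins match the image versions used below, but they are an assumption; check `apt-cache madison kubeadm` for what the repo actually carries:

```shell
# Install the container runtime and the three k8s tools, pinned to one version
# so kubelet and the control-plane images stay in sync.
sudo apt-get install -y docker.io
sudo apt-get install -y kubelet=1.11.3-00 kubeadm=1.11.3-00 kubectl=1.11.3-00
```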

Since the images live in Google's registry (k8s.gcr.io), `kubeadm config images pull` fails with:

[preflight/images] You can also perform this action in beforehand using 'kubeadm config images pull'
[preflight] Some fatal errors occurred:
    [ERROR ImagePull]: failed to pull image [k8s.gcr.io/kube-apiserver-amd64:v1.11.3]: exit status

This is obviously the Great Firewall at work, so pull the images manually from the Docker Hub mirrors:

docker pull mirrorgooglecontainers/kube-apiserver-amd64:v1.11.3
docker pull mirrorgooglecontainers/kube-controller-manager-amd64:v1.11.3
docker pull mirrorgooglecontainers/kube-scheduler-amd64:v1.11.3
docker pull mirrorgooglecontainers/kube-proxy-amd64:v1.11.3
docker pull mirrorgooglecontainers/pause:3.1
docker pull mirrorgooglecontainers/etcd-amd64:3.2.18
docker pull coredns/coredns:1.1.3

Those are all the images needed, but even with them pulled, `kubeadm init` still fails and goes off to Google's registry again. It took me a while to realize that images pulled from Docker Hub carry different names than the ones in Google's registry, so `kubeadm init` treats them as missing and tries to pull from Google anyway. `docker images` shows the names they have now, so they need to be retagged:

docker tag docker.io/mirrorgooglecontainers/kube-proxy-amd64:v1.11.3 k8s.gcr.io/kube-proxy-amd64:v1.11.3
docker tag docker.io/mirrorgooglecontainers/kube-scheduler-amd64:v1.11.3 k8s.gcr.io/kube-scheduler-amd64:v1.11.3
docker tag docker.io/mirrorgooglecontainers/kube-apiserver-amd64:v1.11.3 k8s.gcr.io/kube-apiserver-amd64:v1.11.3
docker tag docker.io/mirrorgooglecontainers/kube-controller-manager-amd64:v1.11.3 k8s.gcr.io/kube-controller-manager-amd64:v1.11.3
docker tag docker.io/mirrorgooglecontainers/etcd-amd64:3.2.18 k8s.gcr.io/etcd-amd64:3.2.18
docker tag docker.io/mirrorgooglecontainers/pause:3.1 k8s.gcr.io/pause:3.1
docker tag docker.io/coredns/coredns:1.1.3 k8s.gcr.io/coredns:1.1.3
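The pull-and-retag sequence above is mechanical, so it can be scripted. A sketch that just prints the commands (pipe the output to `sh` to actually run them), assuming the same mirror namespace and versions:

```shell
# Generate the docker pull/tag commands that map the Docker Hub mirror names
# back to the k8s.gcr.io names kubeadm expects. Versions match v1.11.3 above.
images="kube-apiserver-amd64:v1.11.3 kube-controller-manager-amd64:v1.11.3 \
kube-scheduler-amd64:v1.11.3 kube-proxy-amd64:v1.11.3 pause:3.1 etcd-amd64:3.2.18"
for img in $images; do
  echo "docker pull mirrorgooglecontainers/$img"
  echo "docker tag mirrorgooglecontainers/$img k8s.gcr.io/$img"
done
# coredns lives under its own namespace on Docker Hub:
echo "docker pull coredns/coredns:1.1.3"
echo "docker tag coredns/coredns:1.1.3 k8s.gcr.io/coredns:1.1.3"
```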

Next came two preflight failures: the VM had fewer than 2 CPUs, and swap was enabled. Bumping the CPU count and disabling swap fixed both. I looked up why swap has to be disabled: roughly, the kubelet avoids virtual memory so performance stays predictable, packing instances as close to 100% utilization as it can. Running `kubeadm init` again then produced:
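Disabling swap means both turning it off for the current boot and keeping it off across reboots. A sketch, demonstrated on a scratch copy of fstab so the script is safe to run anywhere; on the real node you would run `swapoff` and point the `sed` at `/etc/fstab` as root:

```shell
# On the real node, for the current boot:
#   sudo swapoff -a
# And to keep swap off after reboot, comment out the swap entry in /etc/fstab.
# Demonstrated here on a scratch file:
fstab=$(mktemp)
printf '%s\n' \
  'UUID=abcd-1234 / ext4 errors=remount-ro 0 1' \
  '/swapfile none swap sw 0 0' > "$fstab"
# Comment out any line whose filesystem type is swap:
sed -i '/[[:space:]]swap[[:space:]]/s/^/#/' "$fstab"
cat "$fstab"
```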

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
	timed out waiting for the condition

This error is likely caused by:
	- The kubelet is not running
	- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
	- 'systemctl status kubelet'
	- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.

At first I suspected a version mismatch between the kubelet and the apiserver, so I reinstalled the tools, only to find a port already in use: the earlier master attempt was still holding 6443. `kubeadm reset` cleared that, but in between I had also, rather foolishly, changed the machine's IP -_-! The same error kept coming back, and the logs showed this repeating:
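If you hit the same thing, checking who holds 6443 and wiping the half-initialized state looks roughly like this:

```shell
# See what is holding the apiserver port before re-running init:
sudo ss -lntp | grep 6443 || echo "port 6443 free"
# Tear down the half-initialized control plane so init can start clean:
sudo kubeadm reset
```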

1006 02:44:41.050125   19805 reflector.go:123] k8s.io/client-go/informers/factor
1006 02:44:41.129531   19805 kubelet.go:2267] node "ubuntu" not found
1006 02:44:41.230347   19805 kubelet.go:2267] node "ubuntu" not found
1006 02:44:41.331174   19805 kubelet.go:2267] node "ubuntu" not found
1006 02:44:41.431984   19805 kubelet.go:2267] node "ubuntu" not found
1006 02:44:41.532748   19805 kubelet.go:2267] node "ubuntu" not found
1006 02:44:41.596687   19805 controller.go:135] failed to ensure node lease exist
1006 02:44:41.633573   19805 kubelet.go:2267] node "ubuntu" not found
1006 02:44:41.734381   19805 kubelet.go:2267] node "ubuntu" not found
1006 02:44:41.745881   19805 kubelet_node_status.go:94] Unable to register node with apiserver ~~
1006 02:44:41.835220   19805 kubelet.go:2267] node "ubuntu" not found
1006 02:44:41.936016   19805 kubelet.go:2267] node "ubuntu" not found

The address change meant the old IP no longer matched. The simplest fix is again `kubeadm reset`, but I'd recommend fixing the conf files instead: replace the old address with the new one in the conf files and restart the kubelet service. Problem solved.
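A sketch of that conf fix. The IPs are placeholders, and the edit is demonstrated on a scratch file; on the real node the same `sed` would run over `/etc/kubernetes/*.conf` (and `$HOME/.kube/config` if it already exists) before restarting kubelet:

```shell
OLD_IP=192.168.1.10   # placeholder: the address before the change
NEW_IP=192.168.1.20   # placeholder: the address after the change
conf=$(mktemp)
echo "server: https://$OLD_IP:6443" > "$conf"   # stand-in for a kubeconfig line
sed -i "s/$OLD_IP/$NEW_IP/g" "$conf"
cat "$conf"
# On the real machine, roughly:
#   sudo sed -i "s/$OLD_IP/$NEW_IP/g" /etc/kubernetes/*.conf
#   sudo systemctl restart kubelet
```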

Since the whole install ran as root, the config file still needs to be copied into the home directory with its ownership fixed; with that done, Kubernetes on the master node is up. Be sure to save the token printed on success, since it's needed later when adding nodes:
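The copy-and-chown step is the one `kubeadm init` itself prints at the end; the paths below are the kubeadm defaults:

```shell
# Make kubectl usable for the current user after kubeadm init:
mkdir -p "$HOME/.kube"
sudo cp -i /etc/kubernetes/admin.conf "$HOME/.kube/config"
sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"
# If the join token scrolls away, it can be listed or regenerated later:
#   kubeadm token list
#   kubeadm token create --print-join-command
```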

Bus bandwidth limit reached; the rest resumes in the next installment.

 
