Deploying a Highly Available Kubernetes Cluster with kubeadm on CentOS 7.8

Original author: Zhangguanzhang

Original link: http://zhangguanzhang.github.io/2019/11/24/kubeadm-base-use/

 

Part 1: Base system configuration

We assume your system is a fully updated, minimal installation.

1. Keep time in sync

yum install chrony -y && systemctl enable chronyd && systemctl restart chronyd
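To confirm that time is actually syncing, a quick optional check:

chronyc sources    # the source currently in use is marked with ^*
timedatectl        # should show "NTP synchronized: yes" on CentOS 7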

2. Disable swap
swapoff -a && sysctl -w vm.swappiness=0
sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab
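A quick way to verify that swap is off now and will stay off after a reboot:

free -h | grep -i swap           # total should be 0
grep -E '^[^#]*swap' /etc/fstab  # should print nothing after the sed above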

3. Disable the firewall and SELinux
systemctl stop firewalld && systemctl disable firewalld
setenforce 0
sed -ri '/^[^#]*SELINUX=/s#=.+$#=disabled#' /etc/selinux/config

4. Disable NetworkManager. If your IPs are not managed through NetworkManager, it is recommended to disable it and use the network service instead; that is what we do here.
systemctl disable NetworkManager && systemctl stop NetworkManager
systemctl restart network

5. Install the EPEL repository and replace it with the Aliyun EPEL mirror
yum install epel-release wget -y
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo

6. Install dependencies
yum install -y \
curl \
git \
conntrack-tools \
psmisc \
nfs-utils \
jq \
socat \
bash-completion \
ipset \
ipvsadm \
conntrack \
libseccomp \
net-tools \
crontabs \
sysstat \
unzip \
iftop \
nload \
strace \
bind-utils \
tcpdump \
telnet \
lsof \
htop
 

Part 2: Kernel modules that must be loaded at boot for kube-proxy in IPVS mode

Following good practice, we load them with systemd-modules-load instead of putting modprobe calls in /etc/rc.local.

vim /etc/modules-load.d/ipvs.conf

ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
br_netfilter

systemctl daemon-reload && systemctl enable --now systemd-modules-load.service

Confirm the kernel has loaded the modules

[root@k8s-m1 ~]# lsmod | grep ip_v
ip_vs_sh               12688  0 
ip_vs_wrr              12697  0 
ip_vs_rr               12600  0 
ip_vs                 145497  6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack          139264  1 ip_vs
libcrc32c              12644  3 xfs,ip_vs,nf_conntrack

Part 3: Set kernel parameters

All machines need the kernel parameters below in /etc/sysctl.d/k8s.conf. IPv6 support is still not great, so IPv6 is disabled here as well.

cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv4.neigh.default.gc_stale_time = 120
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_announce = 2
net.ipv4.ip_forward = 1
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_synack_retries = 2
# Make bridged traffic traverse the iptables/ip6tables/arptables chains (required by kube-proxy)
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
net.netfilter.nf_conntrack_max = 2310720
fs.inotify.max_user_watches=89100
fs.may_detach_mounts = 1
fs.file-max = 52706963
fs.nr_open = 52706963
vm.overcommit_memory=1
vm.panic_on_oom=0
EOF

If kube-proxy uses IPVS, set the following TCP keepalive parameters to avoid connection timeouts:

cat <<EOF >> /etc/sysctl.d/k8s.conf
# https://github.com/moby/moby/issues/31208 
# ipvsadm -l --timeout
# fixes long-lived connection timeouts in IPVS mode; any value below 900 works
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 10
EOF
sysctl --system
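To confirm the parameters took effect (the net.bridge.* keys only exist once br_netfilter is loaded, which we did in Part 2):

sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward net.ipv4.tcp_keepalive_time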

Tune journald/rsyslog so logs are not collected twice and system resources are not wasted; raise the default open-file limits for services started by systemd; and disable reverse DNS lookups for SSH.

# The next two lines do not exist on apt-based systems; running them there is harmless
sed -ri 's/^\$ModLoad imjournal/#&/' /etc/rsyslog.conf
sed -ri 's/^\$IMJournalStateFile/#&/' /etc/rsyslog.conf

sed -ri 's/^#(DefaultLimitCORE)=/\1=100000/' /etc/systemd/system.conf
sed -ri 's/^#(DefaultLimitNOFILE)=/\1=100000/' /etc/systemd/system.conf

sed -ri 's/^#(UseDNS )yes/\1no/' /etc/ssh/sshd_config

Maximum open files: following good practice, set it in a drop-in config file

cat>/etc/security/limits.d/kubernetes.conf<<EOF
*       soft    nproc   131072
*       hard    nproc   131072
*       soft    nofile  131072
*       hard    nofile  131072
root    soft    nproc   131072
root    hard    nproc   131072
root    soft    nofile  131072
root    hard    nofile  131072
EOF

Docker's official kernel check script recommends: RHEL7/CentOS7: User namespaces disabled; add 'user_namespace.enable=1' to boot command line. On yum-based systems, enable it with the following command:

grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
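The argument only takes effect after a reboot; you can verify it before and after like this:

grubby --info="$(grubby --default-kernel)" | grep args   # grub entry, before reboot
grep -o user_namespace.enable=1 /proc/cmdline            # running kernel, after reboot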

Part 4: Install Docker

Check whether the kernel and its modules are suitable for running Docker (Linux only). The script may fail to download because of the GFW; drop the redirection first to see whether the script is reachable at all.

curl -s https://raw.githubusercontent.com/docker/docker/master/contrib/check-config.sh > check-config.sh
bash ./check-config.sh

These days the Docker storage driver should be overlay2 (do not use devicemapper, it has far too many pitfalls); pay particular attention to whether the overlay2-related items in the output are green.

Here we use the year-versioned docker-ce. Suppose we want to install k8s v1.18.5: go to https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG, open CHANGELOG-1.18.md for that version and search for "The list of validated docker versions remain" to find the Docker versions validated upstream. Your Docker version does not strictly have to be on that list; 19.03 has been tested and works (19.03+ also fixes a runc performance bug). Here we install Docker with the official convenience script, which supports both CentOS and Ubuntu.

export VERSION=19.03
curl -fsSL "https://get.docker.com/" | bash -s -- --mirror Aliyun

On all machines, configure registry mirrors and make Docker use the systemd cgroup driver; systemd is the official recommendation, see https://kubernetes.io/docs/setup/cri/

mkdir -p /etc/docker/
cat>/etc/docker/daemon.json<<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "bip": "169.254.123.1/24",
  "oom-score-adjust": -1000,
  "registry-mirrors": [
      "https://fz5yth0r.mirror.aliyuncs.com",
      "https://dockerhub.mirrors.nwafu.edu.cn/",
      "https://mirror.ccs.tencentyun.com",
      "https://docker.mirrors.ustc.edu.cn/",
      "https://reg-mirror.qiniu.com",
      "http://hub-mirror.c.163.com/",
      "https://registry.docker-cn.com"
  ],
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
EOF

Never enable Live Restore: in some corner cases, containers stuck in the Dead state can only be recovered by restarting the Docker daemon, and with Live Restore enabled the only fix left is rebooting the machine.

Copy the bash completion script

cp /usr/share/bash-completion/completions/docker /etc/bash_completion.d/

Start Docker and check that the info looks sane

systemctl enable --now docker
docker info
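The two things worth double-checking in the output against our daemon.json are the storage driver and the cgroup driver:

docker info 2>/dev/null | grep -E 'Storage Driver|Cgroup Driver'
# Storage Driver: overlay2
# Cgroup Driver: systemd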

Part 5: Deploy kube-nginx

Here we use nginx as a local proxy. Because the local proxy runs on every machine, we need neither an SLB nor a VIP (which cannot be used inside a cloud VPC anyway); the tradeoff is that nginx must run on every machine.
Configure hosts on every machine

[root@k8s-m1 src]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.0.1 apiserver.k8s.local
192.168.50.101 apiserver01.k8s.local
192.168.50.102 apiserver02.k8s.local
192.168.50.103 apiserver03.k8s.local
192.168.50.101 k8s-m1
192.168.50.102 k8s-m2
192.168.50.103 k8s-m3
192.168.50.104 k8s-node1
192.168.50.105 k8s-node2
192.168.50.106 k8s-node3

Generate the nginx configuration file on every machine. The three apiserver host entries above can be skipped if you put IPs instead of domain names in the config below, but then changing an IP means editing the config and reloading nginx. This is where I differ from the original author: I compile nginx by hand.

mkdir -p /etc/kubernetes
[root@k8s-m1 src]# cat /etc/kubernetes/nginx.conf 
user nginx nginx;
worker_processes auto;
events {
    worker_connections  20240;
    use epoll;
}
error_log /var/log/kube_nginx_error.log info;

stream {
    upstream kube-servers {
        hash  consistent;
        server apiserver01.k8s.local:6443 weight=5 max_fails=1 fail_timeout=3s;
        server apiserver02.k8s.local:6443 weight=5 max_fails=1 fail_timeout=3s;
        server apiserver03.k8s.local:6443 weight=5 max_fails=1 fail_timeout=3s;
    }

    server {
        listen 8443 reuseport;
        proxy_connect_timeout 3s;
        # increase the timeout
        proxy_timeout 3000s;
        proxy_pass kube-servers;
    }
}

Because the local proxy runs on every machine, we avoid the need for an SLB and the VPC restriction on VIPs. Compile and install kube-nginx here; it must be installed on all machines.

yum install gcc gcc-c++ -y
groupadd nginx
useradd -r -g nginx nginx
wget http://nginx.org/download/nginx-1.16.1.tar.gz -P /usr/local/src/
cd /usr/local/src/
tar zxvf nginx-1.16.1.tar.gz
cd nginx-1.16.1/
./configure --with-stream --without-http --prefix=/usr/local/kube-nginx --without-http_uwsgi_module --without-http_scgi_module --without-http_fastcgi_module
make && make install

# Write the systemd unit
[root@k8s-m1 src]# cat /usr/lib/systemd/system/kube-nginx.service 
[Unit]
Description=kube-apiserver nginx proxy
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=forking
ExecStartPre=/usr/local/kube-nginx/sbin/nginx -c /etc/kubernetes/nginx.conf -p /usr/local/kube-nginx -t
ExecStart=/usr/local/kube-nginx/sbin/nginx -c /etc/kubernetes/nginx.conf -p /usr/local/kube-nginx
ExecReload=/usr/local/kube-nginx/sbin/nginx -c /etc/kubernetes/nginx.conf -p /usr/local/kube-nginx -s reload
PrivateTmp=true
Restart=always
RestartSec=5
StartLimitInterval=0
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

systemctl daemon-reload && systemctl enable kube-nginx && systemctl restart kube-nginx
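A quick check on every machine that the local proxy is up and listening on 8443 (a real TLS check against /healthz only makes sense once the apiservers exist):

systemctl is-active kube-nginx
ss -tlnp | grep 8443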

 

Part 6: kubeadm deployment

1. Configure the Aliyun Kubernetes yum repository

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
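If you want to confirm which versions the repository actually provides before installing, list them first:

yum list --showduplicates kubeadm --disableexcludes=kubernetes | tail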

2. Master nodes

A k8s node is essentially kubelet + a CRI (usually docker); kubectl is a client that reads a kubeconfig and talks to kube-apiserver to operate the cluster; kubeadm does the deployment. So the master nodes need all three, while worker nodes generally do not need kubectl.

Install the packages

yum install -y \
    kubeadm-1.18.5 \
    kubectl-1.18.5 \
    kubelet-1.18.5 \
    --disableexcludes=kubernetes && \
    systemctl enable kubelet

Install the packages on the worker nodes

 yum install -y \
    kubeadm-1.18.5 \
    kubelet-1.18.5 \
    --disableexcludes=kubernetes && \
    systemctl enable kubelet

Configure the cluster (on the first master only)

Print the default init configuration

kubeadm config print init-defaults > initconfig.yaml

# Let's look at the default init cluster parameters

apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 1.2.3.4
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-m1
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.16.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: {}

We mainly care about, and only keep, the ClusterConfiguration section, then modify it. Refer to the v1beta2 docs below; older releases may use v1beta1, where some fields differ from the new API, so look them up on godoc yourself.
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#hdr-Basics
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#pkg-constants
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#ClusterConfiguration
Change the IPs and similar values to match your own environment; if you do not know how to compute CIDRs, do not change them blindly. controlPlaneEndpoint should be a domain name (without internal DNS, hosts entries on every machine also work), an SLB, or a VIP; for the reasoning and caveats see https://zhangguanzhang.github.io/2019/03/11/k8s-ha/ where the HA setup is explained in detail. The final yaml is below.

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
imageRepository: registry.aliyuncs.com/k8sxio
kubernetesVersion: v1.18.5 # if the image list shows the wrong version, put the correct version number here
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
networking: #https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#Networking
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
controlPlaneEndpoint: apiserver.k8s.local:8443 # with a single master, use the master's IP or leave this out
apiServer: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#APIServer
  timeoutForControlPlane: 4m0s
  extraArgs:
    authorization-mode: "Node,RBAC"
    enable-admission-plugins: "NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeClaimResize,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,Priority,PodPreset"
    runtime-config: api/all=true,settings.k8s.io/v1alpha1=true
    storage-backend: etcd3
    etcd-servers: https://192.168.50.101:2379,https://192.168.50.102:2379,https://192.168.50.103:2379
  certSANs:
  - 10.96.0.1 # first IP of the service CIDR
  - 127.0.0.1 # with multiple masters, lets you fall back to localhost for debugging if the load balancer breaks
  - localhost
  - apiserver.k8s.local # load balancer domain name or VIP
  - 192.168.50.101
  - 192.168.50.102
  - 192.168.50.103
  - apiserver01.k8s.local
  - apiserver02.k8s.local
  - apiserver03.k8s.local
  - master
  - kubernetes
  - kubernetes.default 
  - kubernetes.default.svc 
  - kubernetes.default.svc.cluster.local
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
controllerManager: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#ControlPlaneComponent
  extraArgs:
    bind-address: "0.0.0.0"
    experimental-cluster-signing-duration: 867000h
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
scheduler: 
  extraArgs:
    bind-address: "0.0.0.0"
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
dns: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#DNS
  type: CoreDNS # or kube-dns
  imageRepository: coredns # azk8s.cn is gone; use the official coredns image on Docker Hub
  imageTag: 1.6.7  # the Aliyun registry currently only has 1.6.7; see Docker Hub for the latest
etcd: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#Etcd
  local:
    imageRepository: quay.io/coreos
    imageTag: v3.4.7
    dataDir: /var/lib/etcd
    serverCertSANs: # localhost, 127.0.0.1 and ::1 are included by default for both server and peer certs; no need to list them
    - master
    - 192.168.50.101
    - 192.168.50.102
    - 192.168.50.103
    - etcd01.k8s.local
    - etcd02.k8s.local
    - etcd03.k8s.local
    peerCertSANs:
    - master
    - 192.168.50.101
    - 192.168.50.102
    - 192.168.50.103
    - etcd01.k8s.local
    - etcd02.k8s.local
    - etcd03.k8s.local
    extraArgs: # no extraVolumes here for now
      auto-compaction-retention: "1h"
      max-request-bytes: "33554432"
      quota-backend-bytes: "8589934592"
      enable-v2: "false" # disable etcd v2 api
  # external: # configure like this when using an external etcd, see https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#Etcd
    # endpoints:
    # - "https://172.19.0.2:2379"
    # - "https://172.19.0.3:2379"
    # - "https://172.19.0.4:2379"
    # caFile: "/etc/kubernetes/pki/etcd/ca.crt"
    # certFile: "/etc/kubernetes/pki/etcd/etcd.crt"
    # keyFile: "/etc/kubernetes/pki/etcd/etcd.key"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration # https://godoc.org/k8s.io/kube-proxy/config/v1alpha1#KubeProxyConfiguration
mode: ipvs # or iptables
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: "rr" # scheduling algorithm
  syncPeriod: 15s
iptables:
  masqueradeAll: true
  masqueradeBit: 14
  minSyncPeriod: 0s
  syncPeriod: 30s
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration # https://godoc.org/k8s.io/kubelet/config/v1beta1#KubeletConfiguration
cgroupDriver: systemd
failSwapOn: true # set this to false if swap is enabled

Check the file for mistakes; warnings can be ignored, real problems throw an error. If everything is fine, the output ends with something containing the string kubeadm join xxx.

 

kubeadm init --config initconfig.yaml --dry-run

Check that the image list is correct; if the version number is wrong, set kubernetesVersion in the yaml to your actual version

kubeadm config images list --config initconfig.yaml

Pre-pull the images (the sample output below is from the original author's environment and shows a different imageRepository)

kubeadm config images pull --config initconfig.yaml # output below
[config/images] Pulled gcr.azk8s.cn/google_containers/kube-apiserver:v1.18.5
[config/images] Pulled gcr.azk8s.cn/google_containers/kube-controller-manager:v1.18.5
[config/images] Pulled gcr.azk8s.cn/google_containers/kube-scheduler:v1.18.5
[config/images] Pulled gcr.azk8s.cn/google_containers/kube-proxy:v1.18.5
[config/images] Pulled gcr.azk8s.cn/google_containers/pause:3.1
[config/images] Pulled quay.azk8s.cn/coreos/etcd:v3.4.7
[config/images] Pulled coredns/coredns:1.6.3

Part 7: kubeadm init

Run the init below on the first master only

# --upload-certs uploads the control-plane certificates into the cluster so we do not have to distribute them to the other masters by hand
# Note: since v1.15 this is a stable flag; on earlier versions use --experimental-upload-certs instead

kubeadm init --config initconfig.yaml --upload-certs

If it times out, check whether kubelet failed to start; for debugging see https://github.com/zhangguanzhang/Kubernetes-ansible/wiki/systemctl-running-debug

Save the token printed at the end of init, then copy the kubeconfig for kubectl; its default path is ~/.kube/config

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
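kubectl should now reach the apiserver through the local proxy; the master will stay NotReady and coredns Pending until the network addon is deployed later:

kubectl get node
kubectl -n kube-system get pod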

The init yaml is actually stored in a ConfigMap in the cluster, so we can inspect it at any time; it is used when other nodes and masters join.

kubectl -n kube-system get cm kubeadm-config -o yaml

With a single master and no other nodes, remove the taint from the master node; skip this if you are doing the multi-master steps below.

kubectl taint nodes --all node-role.kubernetes.io/master-

Set up RBAC for the health-check endpoint

The kube-apiserver health-check routes require authorization; we open them up for monitoring or for SLB health checks. The yaml file: https://github.com/zhangguanzhang/Kubernetes-ansible-base/blob/roles/master/files/healthz-rbac.yml

kubectl apply -f https://raw.githubusercontent.com/zhangguanzhang/Kubernetes-ansible-base/roles/master/files/healthz-rbac.yml
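Assuming the RBAC above opens the health-check paths to unauthenticated requests, an SLB or monitoring system can then probe the apiserver like this (-k skips certificate verification):

curl -k https://apiserver.k8s.local:8443/healthz
# ok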

Configure the control-plane components on the other masters

Manual copy (only needed on older versions without certificate upload; skip this step if kubeadm init was run with the upload-certs option above)

From the first master, copy the CA certificates to the other masters. To avoid typing the password interactively we install sshpass; zhangguanzhang is the root password here.

yum install sshpass -y
alias ssh='sshpass -p zhangguanzhang ssh -o StrictHostKeyChecking=no'
alias scp='sshpass -p zhangguanzhang scp -o StrictHostKeyChecking=no'

Copy the CA certificates to the other master nodes

for node in 192.168.50.102 192.168.50.103;do
    ssh $node 'mkdir -p /etc/kubernetes/pki/etcd'
    scp -r /etc/kubernetes/pki/ca.* $node:/etc/kubernetes/pki/
    scp -r /etc/kubernetes/pki/sa.* $node:/etc/kubernetes/pki/
    scp -r /etc/kubernetes/pki/front-proxy-ca.* $node:/etc/kubernetes/pki/
    scp -r /etc/kubernetes/pki/etcd/ca.* $node:/etc/kubernetes/pki/etcd/
done
Join the other masters
kubeadm join apiserver.k8s.local:8443 --token vo6qyo.4cm47w561q9p830v \
    --discovery-token-ca-cert-hash sha256:46e177c317037a4815c6deaab8089da4340663efeeead40810d4f53239256671 \
    --control-plane --certificate-key ba869da2d611e5afba5f9959a5f18891c20fb56d90592225765c0b965e3d8783

If you forget the token you can list it with kubeadm token list, or create a new one with kubeadm token create.
The sha256 value can be obtained with the following command

openssl x509 -pubkey -in \
    /etc/kubernetes/pki/ca.crt | \
    openssl rsa -pubin -outform der 2>/dev/null | \
    openssl dgst -sha256 -hex | sed 's/^.* //'
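Alternatively, kubeadm can print a ready-made worker join command, and if the certificate key has expired (it is only kept for two hours) a new one can be uploaded for control-plane joins:

kubeadm token create --print-join-command
kubeadm init phase upload-certs --upload-certs   # prints a fresh --certificate-key for extra masters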

Set up the kubectl completion script

kubectl completion bash > /etc/bash_completion.d/kubectl

Configure etcdctl on all masters

Copy etcdctl out of the container

docker cp `docker ps -a | awk '/k8s_etcd/{print $1}'`:/usr/local/bin/etcdctl /usr/local/bin/etcdctl

Since roughly v1.13 (I forget the exact version) k8s talks to etcd over the v3 API by default; let's configure the etcdctl parameters here.

cat >/etc/profile.d/etcd.sh<<'EOF'
ETCD_CERT_DIR=/etc/kubernetes/pki/etcd/
ETCD_CA_FILE=ca.crt
ETCD_KEY_FILE=healthcheck-client.key
ETCD_CERT_FILE=healthcheck-client.crt
ETCD_EP=https://192.168.50.101:2379,https://192.168.50.102:2379,https://192.168.50.103:2379

alias etcd_v2="etcdctl --cert-file ${ETCD_CERT_DIR}/${ETCD_CERT_FILE} \
              --key-file ${ETCD_CERT_DIR}/${ETCD_KEY_FILE}  \
              --ca-file ${ETCD_CERT_DIR}/${ETCD_CA_FILE}  \
              --endpoints $ETCD_EP"

alias etcd_v3="ETCDCTL_API=3 \
    etcdctl   \
   --cert ${ETCD_CERT_DIR}/${ETCD_CERT_FILE} \
   --key ${ETCD_CERT_DIR}/${ETCD_KEY_FILE} \
   --cacert ${ETCD_CERT_DIR}/${ETCD_CA_FILE} \
    --endpoints $ETCD_EP"
EOF

Re-open your ssh session or load the environment manually: . /etc/profile.d/etcd.sh

[root@k8s-m1 ~]# etcd_v3 endpoint status --write-out=table
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|          ENDPOINT           |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.50.101:2379 | 9fdaf6a25119065e |   3.4.7 |  3.1 MB |     false |      false |         5 |     305511 |             305511 |        |
| https://192.168.50.102:2379 | a3d9d41cf6d05e08 |   3.4.7 |  3.1 MB |      true |      false |         5 |     305511 |             305511 |        |
| https://192.168.50.103:2379 | 3b34476e501895d4 |   3.4.7 |  3.0 MB |     false |      false |         5 |     305511 |             305511 |        |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

Configure the etcd backup script

mkdir -p /opt/etcd
cat>/opt/etcd/etcd_cron.sh<<'EOF'
#!/bin/bash
set -e

export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin

:  ${bak_dir:=/root/} # default backup directory; change it to an existing directory if you prefer
:  ${cert_dir:=/etc/kubernetes/pki/etcd/}
:  ${endpoints:=https://192.168.50.101:2379,https://192.168.50.102:2379,https://192.168.50.103:2379}

bak_prefix='etcd-'
cmd_suffix='date +%Y-%m-%d-%H:%M'
bak_suffix='.db'

# assign the normalized command-line options to the positional parameters ($1, $2, ...)
temp=`getopt -n $0 -o c:d: -u -- "$@"`

[ $? != 0 ] && {
    echo '
Examples:
  # just save once
  bash $0 /tmp/etcd.db
  # run from crontab and keep 5 copies
  bash $0 -c 5
    '
    exit 1
    }
set -- $temp


# -c  number of backup copies to keep
# -d  directory to store the backups in
while true;do
    case "$1" in
        -c)
            [ -z "$bak_count" ] && bak_count=$2
            printf -v null %d "$bak_count" &>/dev/null || \
                { echo 'the value of the -c must be number';exit 1; }
            shift 2
            ;;
        -d)
            [ ! -d "$2" ] && mkdir -p $2
            bak_dir=$2
            shift 2
            ;;
         *)
            [[ -z "$1" || "$1" == '--' ]] && { shift;break; }
            echo "Internal error!"
            exit 1
            ;;
    esac
done


function etcd_v2(){

    etcdctl --cert-file $cert_dir/healthcheck-client.crt \
            --key-file  $cert_dir/healthcheck-client.key \
            --ca-file   $cert_dir/ca.crt \
        --endpoints $endpoints $@
}

function etcd_v3(){

    ETCDCTL_API=3 etcdctl   \
       --cert $cert_dir/healthcheck-client.crt \
       --key  $cert_dir/healthcheck-client.key \
       --cacert $cert_dir/ca.crt \
       --endpoints $endpoints $@
}

etcd::cron::save(){
    cd $bak_dir/
    etcd_v3 snapshot save  $bak_prefix$($cmd_suffix)$bak_suffix
    rm_files=`ls -t $bak_prefix*$bak_suffix | tail -n +$[bak_count+1]`
    if [ -n "$rm_files" ];then
        rm -f $rm_files
    fi
}

main(){
    [ -n "$bak_count" ] && etcd::cron::save || etcd_v3 snapshot save $@
}

main $@
EOF

Add an entry like the following to crontab -e to keep four backup copies automatically (the schedule here, daily at 02:00, is only an example):

0 2 * * * bash /opt/etcd/etcd_cron.sh  -c 4 -d /opt/etcd/ &>/dev/null

Worker nodes

Do the same as before:

  • Configure the system settings
  • Set the hostname
  • Install docker-ce
  • Set up hosts and nginx
  • Configure the yum repo and install kubeadm and kubelet

Same as the master join: prepare the environment and Docker in advance, then join without --control-plane. If there is only one master, the join address is the controlPlaneEndpoint value.

kubeadm join apiserver.k8s.local:8443 --token vo6qyo.4cm47w561q9p830v \
    --discovery-token-ca-cert-hash sha256:46e177c317037a4815c6deaab8089da4340663efeeead40810d4f53239256671
[root@k8s-m1 ~]# kubectl get node
NAME        STATUS   ROLES    AGE    VERSION
k8s-m1      Ready    master   23h    v1.18.5
k8s-m2      Ready    master   23h    v1.18.5
k8s-m3      Ready    master   23h    v1.18.5
k8s-node1   Ready    node     23h    v1.18.5
k8s-node2   Ready    node     121m   v1.18.5
k8s-node3   Ready    node     82m    v1.18.5
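Note that workers normally show <none> in the ROLES column; the output above presumably had the role label added by hand, for example:

kubectl label node k8s-node1 k8s-node2 k8s-node3 node-role.kubernetes.io/node=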

Addons (from this chapter to the end, run everything on any one master)

The container network is not set up yet, so coredns cannot get an IP and stays Pending. Here I deploy flannel; if you understand BGP you can use calico instead.
The yaml file comes from the official flannel GitHub: https://github.com/coreos/flannel/tree/master/Documentation
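Download the manifest first (this is the path on the master branch at the time of writing; adjust if it has moved):

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml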

Modifications

  • If you run a version before 1.16 and use PSP, policy/v1beta1 has to be changed to extensions/v1beta1; no change is needed here

apiVersion: policy/v1beta1
kind: PodSecurityPolicy

- Change the rbac apiVersion as below; do not use v1beta1 any more. Use the following command:

sed -ri '/apiVersion: rbac/s#v1.+#v1#' kube-flannel.yml

- The official yaml ships DaemonSets for four architectures; we delete everything except amd64, roughly line 227 to the end

sed -ri '227,$d' kube-flannel.yml

- If you changed the pod CIDR, change it here as well. If all nodes sit on the same layer-2 network, you can switch vxlan to the better-performing host-gw mode; with vxlan the security group must allow UDP port 8472.

net-conf.json: |
  {
    "Network": "10.244.0.0/16",
    "Backend": {
      "Type": "vxlan"
    }
  }

- Adjust the limits; they must be greater than the requests

limits:
  cpu: "200m"
  memory: "100Mi"

 

 

Deploy flannel

I did not seem to run into the error below

Since 1.15 a node's CIDR is an array rather than a single value; deploying flannel 0.11 or earlier produces the following error, see:
https://github.com/kubernetes/kubernetes/blob/v1.15.0/staging/src/k8s.io/api/core/v1/types.go#L3890-L3893
https://github.com/kubernetes/kubernetes/blob/v1.18.2/staging/src/k8s.io/api/core/v1/types.go#L4206-L4216

 

Error registering network: failed to acquire lease: node "xxx" pod cidr not assigned

 

Patch it manually, and remember to patch any nodes added later as well

nodes=`kubectl get node --no-headers | awk '{print $1}'`
for node in $nodes;do
    cidr=`kubectl get node "$node" -o jsonpath='{.spec.podCIDRs[0]}'`
    [ -z "$(kubectl get node $node -o jsonpath='{.spec.podCIDR}')" ] && {
        kubectl patch node "$node" -p '{"spec":{"podCIDR":"'"$cidr"'"}}' 
    }
done

The final kube-flannel.yml:

[root@k8s-m1 ~]# cat kube-flannel.yml 
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
    - configMap
    - secret
    - emptyDir
    - hostPath
  allowedHostPaths:
    - pathPrefix: "/etc/cni/net.d"
    - pathPrefix: "/etc/kube-flannel"
    - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
  - apiGroups: ['extensions']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames: ['psp.flannel.unprivileged']
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "host-gw"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-amd64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.12.0-amd64
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.12.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "200m"
            memory: "100Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg

host-gw mode is used here because of a kernel bug hit with vxlan/UDP; for details see https://zhangguanzhang.github.io/2020/05/23/k8s-vxlan-63-timeout/

 

kubectl apply -f kube-flannel.yml

Verify that the cluster works

 

kubectl -n kube-system get pod -o wide

Once all pods in the kube-system namespace are Running, let's test cluster availability

cat<<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:alpine
        name: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
---
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: zhangguanzhang/centos
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
EOF

Wait for the pods to be Running

Verify cluster DNS

$ kubectl exec -ti busybox -- nslookup kubernetes
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local
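You can also check Service connectivity from the busybox pod (assuming the zhangguanzhang/centos image ships curl):

kubectl get svc nginx
kubectl exec -ti busybox -- curl -sI nginx   # should return an HTTP 200 from the nginx Service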

For more on the kubeadm workflow and its detailed options, see the original author's articles (the original post is linked at the top).
