OpenShift Container Platform 4.3 Deployment Walkthrough

This article follows the official Red Hat documentation for installing OpenShift 4.3 on bare metal. Since only a single PC with 64 GB of RAM was available, running the free edition of VMware vSphere 6.7, I attempted the installation with half the minimum memory required by the official OCP documentation. The process is recorded below.

1. The OCP Installation Process

The installation process, as described in the official Red Hat documentation:

  1. The bootstrap machine boots and serves the resources the masters need
  2. The masters fetch the required resources from the bootstrap machine and finish booting
  3. The masters build an etcd cluster through the bootstrap machine
  4. The bootstrap machine starts a temporary Kubernetes control plane backed by the newly built etcd cluster
  5. The temporary control plane starts the production control plane on the master nodes
  6. The temporary control plane shuts down and hands control to the production control plane
  7. The bootstrap machine injects the OCP components into the production control plane
  8. The installer shuts down the bootstrap machine
  9. The control plane deploys the compute nodes
  10. The control plane installs the remaining services as Operators

2. Preparing Server Resources

The servers are planned as follows:

  • 3 control plane nodes, running etcd, the control plane components, and the infra components; because resources are tight, no DNS server is deployed and domain names are resolved via hosts files
  • 2 compute nodes, running the actual workloads
  • 1 bootstrap node, which drives the installation
  • 1 misc/lb node, used to stage the installation resources, boot the bootstrap node, and act as the load balancer
Hostname   vCPU  RAM  HDD   IP               FQDN
misc/lb    4     8g   120g  192.168.128.30   misc.ocptest.ipincloud.com / lb.ocptest.ipincloud.com
bootstrap  4     8g   120g  192.168.128.31   bootstrap.ocptest.ipincloud.com
master1    4     8g   120g  192.168.128.32   master1.ocptest.ipincloud.com
master2    4     8g   120g  192.168.128.33   master2.ocptest.ipincloud.com
master3    4     8g   120g  192.168.128.34   master3.ocptest.ipincloud.com
worker1    2     4g   120g  192.168.128.35   worker1.ocptest.ipincloud.com
worker2    2     4g   120g  192.168.128.36   worker2.ocptest.ipincloud.com
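Since this plan resolves names through hosts files rather than a dedicated DNS server, the table above can be turned into /etc/hosts entries with a short helper. A minimal sketch (the /tmp output path is illustrative; in real use, append the output to /etc/hosts on each machine):

```shell
# Sketch: emit /etc/hosts entries for the nodes in the table above.
# Cluster name (ocptest) and base domain (ipincloud.com) follow the plan.
hosts_entries() {
  local domain="ocptest.ipincloud.com"
  cat <<EOF
192.168.128.30 misc.${domain} lb.${domain}
192.168.128.31 bootstrap.${domain}
192.168.128.32 master1.${domain}
192.168.128.33 master2.${domain}
192.168.128.34 master3.${domain}
192.168.128.35 worker1.${domain}
192.168.128.36 worker2.${domain}
EOF
}
hosts_entries > /tmp/ocp-hosts
cat /tmp/ocp-hosts
```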

3. Preparing Network Resources

The API server and ingress share a single load balancer, the misc/lb node.
The DNS records below use ocptest as the cluster name and ipincloud.com as the base domain. These settings must be reflected in the corresponding templates under tasks/ in the ansible playbook files.
See
https://github.com/scwang18/ocp4-upi-helpernode.git

  • DNS records
Component      | DNS record                                  | Description
Kubernetes API | api.ocptest.ipincloud.com                   | Points to the load balancer for the control plane nodes. Must be resolvable both from outside the cluster and from every node in it.
Kubernetes API | api-int.ocptest.ipincloud.com               | Points to the load balancer for the control plane nodes. Must be resolvable from every node in the cluster.
Routes         | *.apps.ocptest.ipincloud.com                | Wildcard record pointing to the ingress load balancer. Must be resolvable both from outside the cluster and from every node in it.
etcd           | etcd-<index>.ocptest.ipincloud.com          | Points to each etcd node; must be resolvable from every node in the cluster.
etcd           | _etcd-server-ssl._tcp.ocptest.ipincloud.com | Because etcd serves on port 2380, each etcd node needs a corresponding SRV record with priority 0, weight 10 and port 2380, as in the table below.
  • etcd SRV DNS records

# The following records are required; bootstrap uses them on the etcd servers it creates to auto-configure etcd service discovery

#_service._proto.name. TTL class SRV priority weight port target.
_etcd-server-ssl._tcp.<cluster_name>.<base_domain> 86400 IN SRV 0 10 2380 etcd-0.<cluster_name>.<base_domain>.
_etcd-server-ssl._tcp.<cluster_name>.<base_domain> 86400 IN SRV 0 10 2380 etcd-1.<cluster_name>.<base_domain>.
_etcd-server-ssl._tcp.<cluster_name>.<base_domain> 86400 IN SRV 0 10 2380 etcd-2.<cluster_name>.<base_domain>.
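The three SRV lines above can be generated for a zone file instead of typed by hand. A small sketch, using this deployment's cluster name and base domain (the /tmp output path is illustrative):

```shell
# Sketch: generate the etcd SRV records for a BIND zone file
# from the cluster name and base domain used in this deployment.
cluster_name=ocptest
base_domain=ipincloud.com
srv_records() {
  local i
  for i in 0 1 2; do
    printf '_etcd-server-ssl._tcp.%s.%s 86400 IN SRV 0 10 2380 etcd-%d.%s.%s.\n' \
      "${cluster_name}" "${base_domain}" "$i" "${cluster_name}" "${base_domain}"
  done
}
srv_records > /tmp/etcd-srv.zone
cat /tmp/etcd-srv.zone
```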
  • Create an SSH private key and add it to the ssh-agent

With a passwordless SSH private key, you can log in to the master nodes as the core user to debug the installation or perform disaster recovery on the cluster.

(1) Run the following command on the misc node to create the SSH key:

ssh-keygen -t rsa -b 4096 -N '' 

The command above creates two files, id_rsa and id_rsa.pub, under ~/.ssh/.

(2) Start the ssh-agent process and add the passwordless private key to it:

eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa

In the next step of the OCP installation, the SSH public key must be supplied to the installer configuration file.

Because we prepare the resources manually, the public key must also be placed on every cluster node so this machine can log in to them without a password.

# Copy the id_rsa.pub file just generated in ~/.ssh to the ~/.ssh directory of the cluster node you want to log in to
scp ~/.ssh/id_rsa.pub [email protected]:~/.ssh/
# Then, on the cluster node, run the following to append the public key to ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
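Rather than repeating those two steps per node, the distribution can be wrapped in a loop. A sketch shown as a dry run (it only echoes the commands; remove DRYRUN to execute them, and note ssh-copy-id combines the scp and append steps):

```shell
# Sketch: push the public key to every node in one loop (dry run).
# The node IPs are taken from the server plan above.
nodes="192.168.128.31 192.168.128.32 192.168.128.33 192.168.128.34 192.168.128.35 192.168.128.36"
DRYRUN=echo   # remove this (set DRYRUN=) to actually copy the key
for ip in $nodes; do
  $DRYRUN ssh-copy-id -i ~/.ssh/id_rsa.pub "core@${ip}"
done > /tmp/push-keys.log
cat /tmp/push-keys.log
```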

4. Obtaining the Installer

A Red Hat account is required to download the installer from the link below; the detailed download steps are omitted.
https://cloud.redhat.com/openshift/install/metal/user-provisioned

  • Download the installers
rm -rf /data/pkg
mkdir -p /data/pkg
cd /data/pkg

# OCP installer
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-install-linux-4.3.0.tar.gz

# OCP client
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-client-linux-4.3.0.tar.gz

# RHCOS installer ISO
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer.iso

# RHCOS BIOS raw image
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-metal.raw.gz

# If installing from the ISO, the next two files are not needed

# RHCOS installer kernel, for iPXE installs
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer-kernel

# RHCOS initramfs image, for iPXE installs
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer-initramfs.img
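Because these downloads are large, it is worth recording checksums so a truncated file is caught before installation. A minimal sketch, demonstrated on a stand-in file rather than the real multi-gigabyte artifacts:

```shell
# Sketch: record and verify SHA-256 checksums of the downloaded artifacts.
# A scratch directory and a fake payload stand in for the real downloads.
cd "$(mktemp -d)"
echo "fake rhcos payload" > rhcos-4.3.0-x86_64-metal.raw.gz   # stand-in file
sha256sum rhcos-*.gz > SHA256SUMS        # record checksums after downloading
sha256sum -c SHA256SUMS > /tmp/sumcheck.out   # re-verify before installing
cat /tmp/sumcheck.out
```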

5. Preparing the misc Helper Machine

This helper-machine preparation tool, adapted from Wang Zheng's scripts, conveniently brings up the LB, DHCP, PXE, DNS and HTTP services on the helper machine.
(1) Install ansible and git

yum -y install ansible git

(2) Clone the playbook from GitHub

cd /data/pkg
git clone https://github.com/scwang18/ocp4-upi-helpernode.git

(3) Edit the playbook's variables file
Adjust the variables to match your own network plan.

[root@misc pkg]# cd /data/pkg/ocp4-upi-helpernode/
[root@misc ocp4-upi-helpernode]# cat vars-static.yaml
---
staticips: true
named: true
helper:
  name: "helper"
  ipaddr: "192.168.128.30"
  networkifacename: "ens192"
dns:
  domain: "ipincloud.com"
  clusterid: "ocptest"
  forwarder1: "192.168.128.30"
  forwarder2: "192.168.128.30"
  registry:
    name: "registry"
    ipaddr: "192.168.128.30"
  yum:
    name: "yum"
    ipaddr: "192.168.128.30"
bootstrap:
  name: "bootstrap"
  ipaddr: "192.168.128.31"
masters:
  - name: "master1"
    ipaddr: "192.168.128.32"
  - name: "master2"
    ipaddr: "192.168.128.33"
  - name: "master3"
    ipaddr: "192.168.128.34"
workers:
  - name: "worker1"
    ipaddr: "192.168.128.35"
  - name: "worker2"
    ipaddr: "192.168.128.36"
force_ocp_download: false

ocp_bios: "file:///data/pkg/rhcos-4.3.0-x86_64-metal.raw.gz"
ocp_initramfs: "file:///data/pkg/rhcos-4.3.0-x86_64-installer-initramfs.img"
ocp_install_kernel: "file:///data/pkg/rhcos-4.3.0-x86_64-installer-kernel"
ocp_client: "file:///data/pkg/openshift-client-linux-4.3.0.tar.gz"
ocp_installer: "file:///data/pkg/openshift-install-linux-4.3.0.tar.gz"
ocp_filetranspiler: "file:///data/pkg/filetranspiler-master.zip"
registry_server: "registry.ipincloud.com:8443"
[root@misc pkg]#

(4) Run the ansible playbook

ansible-playbook -e @vars-static.yaml tasks/main.yml

6. Preparing the Docker Environment

# On a machine with unrestricted internet access, package the required image files

#rm -rf /data/ocp4
mkdir -p /data/ocp4
cd /data/ocp4

# The upstream script is not ideal; skip this download and use my modified version below
# wget https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.3/scripts/build.dist.sh

yum -y install podman docker-distribution pigz skopeo docker buildah jq python3-pip 

pip3 install yq

# https://blog.csdn.net/ffzhihua/article/details/85237411
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
rpm2cpio python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm | cpio -iv --to-stdout ./etc/rhsm/ca/redhat-uep.pem | tee /etc/rhsm/ca/redhat-uep.pem

systemctl start docker

docker login -u wuliangye2019 -p Red@123! registry.redhat.io
docker login -u wuliangye2019 -p Red@123! registry.access.redhat.com
docker login -u wuliangye2019 -p Red@123! registry.connect.redhat.com

podman login -u wuliangye2019 -p Red@123! registry.redhat.io
podman login -u wuliangye2019 -p Red@123! registry.access.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.connect.redhat.com

# to download the pull-secret.json, open following link
# https://cloud.redhat.com/openshift/install/metal/user-provisioned
cat << 'EOF' > /data/pull-secret.json
{"auths":{"cloud.openshift.com":{"auth":"xxxxxxxxxxx}}}
EOF
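Before mirroring, it is worth sanity-checking that the pull secret is valid JSON with a non-empty auths map; oc adm release mirror fails confusingly otherwise. A sketch using a scratch copy (point it at /data/pull-secret.json in real use; assumes jq, which is installed above):

```shell
# Sketch: validate the pull secret's structure with jq before mirroring.
# A scratch file with a placeholder auth stands in for the real secret.
cat > /tmp/pull-secret.json <<'EOF'
{"auths":{"cloud.openshift.com":{"auth":"placeholder"}}}
EOF
# jq -e sets a non-zero exit status when the expression is false or null
jq -e '.auths | keys | length > 0' /tmp/pull-secret.json > /dev/null \
  && echo "pull secret looks valid" | tee /tmp/psec.out
```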

Create the build.dist.sh file:

#!/usr/bin/env bash

set -e
set -x

var_date=$(date '+%Y-%m-%d')
echo $var_date
# The following does not need to run every time
#cat << EOF >>  /etc/hosts
#127.0.0.1 registry.ipincloud.com
#EOF


#mkdir -p /etc/crts/
#cd /etc/crts
#openssl req \
#   -newkey rsa:2048 -nodes -keyout ipincloud.com.key \
#   -x509 -days 3650 -out ipincloud.com.crt -subj \
#   "/C=CN/ST=GD/L=SZ/O=Global Security/OU=IT Department/CN=*.ipincloud.com"

#cp /etc/crts/ipincloud.com.crt /etc/pki/ca-trust/source/anchors/
#update-ca-trust extract

systemctl stop docker-distribution

rm -rf /data/registry
mkdir -p /data/registry
cat << EOF > /etc/docker-distribution/registry/config.yml
version: 0.1
log:
  fields:
    service: registry
storage:
    cache:
        layerinfo: inmemory
    filesystem:
        rootdirectory: /data/registry
    delete:
        enabled: true
http:
    addr: :8443
    tls:
       certificate: /etc/crts/ipincloud.com.crt
       key: /etc/crts/ipincloud.com.key
EOF
systemctl restart docker
systemctl enable docker-distribution

systemctl restart docker-distribution

build_number_list=$(cat << EOF
4.3.0
EOF
)
mkdir -p /data/ocp4
cd /data/ocp4

install_build() {
    BUILDNUMBER=$1
    echo ${BUILDNUMBER}
    
    mkdir -p /data/ocp4/${BUILDNUMBER}
    cd /data/ocp4/${BUILDNUMBER}

    # Download and install the OpenShift client and installer. Needed only on the first run; the helper-machine ansible setup has already done this
    #wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/release.txt

    #wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-client-linux-${BUILDNUMBER}.tar.gz
    #wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-install-linux-${BUILDNUMBER}.tar.gz

    # Extract the installer and client into the executable directory. Needed only on the first run
    #tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
    #tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
    
    export OCP_RELEASE=${BUILDNUMBER}
    export LOCAL_REG='registry.ipincloud.com:8443'
    export LOCAL_REPO='ocp4/openshift4'
    export UPSTREAM_REPO='openshift-release-dev'
    export LOCAL_SECRET_JSON="/data/pull-secret.json"
    export OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE=${LOCAL_REG}/${LOCAL_REPO}:${OCP_RELEASE}
    export RELEASE_NAME="ocp-release"

    oc adm release mirror -a ${LOCAL_SECRET_JSON} \
    --from=quay.io/${UPSTREAM_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-x86_64 \
    --to-release-image=${LOCAL_REG}/${LOCAL_REPO}:${OCP_RELEASE} \
    --to=${LOCAL_REG}/${LOCAL_REPO}

}

while read -r line; do
    install_build $line
done <<< "$build_number_list"

cd /data/ocp4

#wget -O ocp4-upi-helpernode-master.zip https://github.com/wangzheng422/ocp4-upi-helpernode/archive/master.zip

# Commented out: the registry in the quay.io/wangzheng422 repo is a v1 registry and cannot coexist with v2
#podman pull quay.io/wangzheng422/filetranspiler
#podman save quay.io/wangzheng422/filetranspiler | pigz -c > filetranspiler.tgz

#podman pull docker.io/library/registry:2
#podman save docker.io/library/registry:2 | pigz -c > registry.tgz

systemctl start docker

docker login -u wuliangye2019 -p Red@123! registry.redhat.io
docker login -u wuliangye2019 -p Red@123! registry.access.redhat.com
docker login -u wuliangye2019 -p Red@123! registry.connect.redhat.com

podman login -u wuliangye2019 -p Red@123! registry.redhat.io
podman login -u wuliangye2019 -p Red@123! registry.access.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.connect.redhat.com

# The following commands run for 2 to 3 hours; be patient...

# build operator catalog
podman login registry.ipincloud.com:8443 -u root -p Scwang18
oc adm catalog build \
    --appregistry-endpoint https://quay.io/cnr \
    --appregistry-org redhat-operators \
    --to=${LOCAL_REG}/ocp4-operator/redhat-operators:v1
    
oc adm catalog mirror \
    ${LOCAL_REG}/ocp4-operator/redhat-operators:v1 \
    ${LOCAL_REG}/operator

#cd /data
#tar cf - registry/ | pigz -c > registry.tgz

#cd /data
#tar cf - ocp4/ | pigz -c > ocp4.tgz

Run the build.dist.sh script

A major pitfall here: pulling the release images from quay.io to the local registry involves more than 5 GB of data, which usually cannot be fetched in one attempt and fails with an error. After every failure, re-running build.dist.sh deletes the existing registry contents and starts from scratch, wasting a great deal of time. In fact there is no need to delete anything: oc adm release mirror automatically skips images that already exist. A lesson learned the hard way.

bash build.dist.sh
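Given the pitfall above, one practical pattern is to wrap the mirror step in a retry loop and simply re-run it until it succeeds, relying on oc adm release mirror to skip blobs that already exist. A sketch, demonstrated here with a stand-in command that fails once before succeeding:

```shell
# Sketch: retry wrapper for the mirror step.
retry() {
  local attempts=$1; shift
  local n=0
  until "$@"; do
    n=$((n + 1))
    [ "$n" -ge "$attempts" ] && return 1
    echo "attempt $n failed, retrying..."
    sleep 1
  done
}

# Demonstration: a stand-in command that fails on the first call only.
rm -f /tmp/retry.flag
flaky() { [ -f /tmp/retry.flag ] && return 0; touch /tmp/retry.flag; return 1; }
retry 5 flaky && echo "mirror step completed" > /tmp/retry.out
```

In real use, replace flaky with the actual mirror invocation, e.g. retry 10 oc adm release mirror ... with the same arguments as in build.dist.sh.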

After oc adm release mirror completes, the local mirror registry has been populated from the official registry. Record the information it prints, especially the imageContentSources section, which must later be copied into the install-config.yaml file.


Success
Update image:  registry.ipincloud.com:8443/ocp4/openshift4:4.3.0
Mirror prefix: registry.ipincloud.com:8443/ocp4/openshift4

To use the new mirrored repository to install, add the following section to the install-config.yaml:

imageContentSources:
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev


To use the new mirrored repository for upgrades, use the following to create an ImageContentSourcePolicy:

apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: example
spec:
  repositoryDigestMirrors:
  - mirrors:
    - registry.ipincloud.com:8443/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
    - registry.ipincloud.com:8443/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev

The following commands do not need to be executed; build.dist.sh has already run them:

oc adm release mirror -a /data/pull-secret.json --from=quay.io/openshift-release-dev/ocp-release:4.3.0-x86_64 --to-release-image=registry.ipincloud.com:8443/ocp4/openshift4:4.3.0 --to=registry.ipincloud.com:8443/ocp4/openshift4    

podman login registry.ipincloud.com:8443 -u root -p Scwang18
oc adm catalog build \
    --appregistry-endpoint https://quay.io/cnr \
    --appregistry-org redhat-operators \
    --to=registry.ipincloud.com:8443/ocp4-operator/redhat-operators:v1
    
oc adm catalog mirror \
    registry.ipincloud.com:8443/ocp4-operator/redhat-operators:v1 \
    registry.ipincloud.com:8443/operator

# If oc adm catalog mirror fails, it generates a mapping.txt file; based on that file, remove the lines that did not succeed, then run it as follows
oc image mirror -a /data/pull-secret.json -f /data/mapping-ok.txt


oc image mirror quay.io/external_storage/nfs-client-provisioner:latest registry.ipincloud.com:8443/ocp4/openshift4/nfs-client-provisioner:latest

oc image mirror quay.io/external_storage/nfs-client-provisioner:latest registry.ipincloud.com:8443/quay.io/external_storage/nfs-client-provisioner:latest

# Look up an image's digest
curl -v --silent -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X GET  https://registry.ipincloud.com:8443/v2/ocp4/openshift4/nfs-client-provisioner/manifests/latest 2>&1 | grep Docker-Content-Digest | awk '{print ($3)}'

# Delete an image manifest by digest
curl -v --silent -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X DELETE https://registry.ipincloud.com:8443/v2/ocp4/openshift4/nfs-client-provisioner/manifests/sha256:022ea0b0d69834b652a4c53655d78642ae23f0324309097be874fb58d09d2919

# Reclaim registry storage space
podman exec -it  mirror-registry /bin/registry garbage-collect  /etc/docker/registry/config.yml

7. Creating the Installer Configuration File

(1) Create the install directory

rm -rf /data/install
mkdir -p /data/install
cd /data/install

(2) Customize the install-config.yaml file

  • Fill in pullSecret
[root@misc data]# cat /data/pull-secret.json
{"auths":{"cloud.openshift.com":{"auth":"omitted"}}}
  • Add sshKey (the content of the public key file created in section 3)
cat ~/.ssh/id_rsa.pub
  • additionalTrustBundle (the certificate generated when the mirror registry was created)
[root@misc crts]# cat /etc/crts/ipincloud.com.crt
-----BEGIN CERTIFICATE-----
xxx omitted
-----END CERTIFICATE-----
  • Configure the proxy

In production, the cluster need not have direct internet access; a proxy can be configured for the cluster in install-config.yaml.

For this test, to speed up downloads, I set up a v2ray server on AWS beforehand, with the misc server acting as the v2ray client; that setup is described in a separate article.

  • When retrying the installation, if install-config.yaml lives in, say, a config directory, you must rm -rf install rather than rm -rf install/*; the latter leaves behind the hidden file .openshift_install_state.json, which can cause: x509: certificate has expired or is not yet valid.

  • In the documentation and blog examples, the cidr in install-config.yaml is a 10.x network. Having only skimmed the docs, I mistook it for the node network, which caused the most baffling error of the whole process: no matches for kind MachineConfig.

  • The final file contents:

[root@centos75 install]# vi install-config.yaml
apiVersion: v1
baseDomain: ipincloud.com
proxy:
  httpProxy: http://192.168.128.30:8001
  httpsProxy: http://192.168.128.30:8001
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: ocptest
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: '{"auths":{"omitted'
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  omitted; note the certificate body here must be indented by two spaces
  -----END CERTIFICATE-----
imageContentSources:
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev

(3) Back up the customized install-config.yaml so it can be reused later

cd /data/install
cp install-config.yaml  ../install-config.yaml.20200205

8. Creating the Kubernetes Manifests and Ignition Config Files

(1) Generate the Kubernetes manifests

openshift-install create manifests --dir=/data/install

Note: when specifying the directory containing install-config.yaml, you must use an absolute path.

(2) Edit manifests/cluster-scheduler-02-config.yml to prevent pods from being scheduled onto the control plane nodes

The official Red Hat installation docs note that Kubernetes does not support routing ingress load balancer traffic to pods on the control plane nodes.

a. Open manifests/cluster-scheduler-02-config.yml
b. Locate the mastersSchedulable parameter and set it to false
c. Save and exit.

vi /data/install/manifests/cluster-scheduler-02-config.yml

(3) Create the Ignition config files

Note: creating the Ignition configs deletes install-config.yaml, so be sure to back up that file first.

openshift-install create ignition-configs --dir=/data/install

(4) Copy the Ignition files into the HTTP server directory, for use during installation

cd /data/install
\cp -f bootstrap.ign /var/www/html/ignition/bootstrap.ign
\cp -f master.ign /var/www/html/ignition/master1.ign
\cp -f master.ign /var/www/html/ignition/master2.ign
\cp -f master.ign /var/www/html/ignition/master3.ign
\cp -f worker.ign /var/www/html/ignition/worker1.ign
\cp -f worker.ign /var/www/html/ignition/worker2.ign

cd /var/www/html/ignition/
chmod 755 *.ign
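Before booting any node, it is worth confirming that each served .ign file is valid JSON and carries an ignition version, since a truncated or mis-copied file produces opaque boot failures. A sketch on a scratch file (loop over /var/www/html/ignition/*.ign in real use; assumes jq; the sample spec version is the one used by this RHCOS release):

```shell
# Sketch: check every .ign file parses and reports its ignition version.
# A scratch directory with a minimal sample stands in for the real files.
mkdir -p /tmp/ignition && cat > /tmp/ignition/bootstrap.ign <<'EOF'
{"ignition":{"version":"2.2.0"},"storage":{}}
EOF
for f in /tmp/ignition/*.ign; do
  ver=$(jq -r '.ignition.version // empty' "$f")   # empty if missing/invalid
  [ -n "$ver" ] && echo "$f: ignition version $ver"
done > /tmp/ign-check.out
cat /tmp/ign-check.out
```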

With all the required configuration files in place, the next step is to create the nodes.

9. Customizing the RHCOS ISOs

Boot parameters must be modified at install time and can only be typed in by hand; editing them on every machine is tedious and error-prone, so we use genisoimage to build a customized installation ISO for each machine.

# Install the ISO build tools
yum -y install genisoimage libguestfs-tools
systemctl start libvirtd

# Set environment variables
export NGINX_DIRECTORY=/data/pkg
export RHCOSVERSION=4.3.0
export VOLID=$(isoinfo -d -i ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso | awk '/Volume id/ { print $3 }')
# Create a temporary directory for intermediate files
TEMPDIR=$(mktemp -d)
echo $VOLID
echo $TEMPDIR


cd ${TEMPDIR}
# Extract the ISO contents using guestfish (avoids needing sudo mount)
guestfish -a ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso \
  -m /dev/sda tar-out / - | tar xvf -

# Define the function that modifies the config files
modify_cfg(){
  for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
    # Inject the appropriate image and ignition URLs
    sed -e '/coreos.inst=yes/s|$| coreos.inst.install_dev=sda coreos.inst.image_url='"${URL}"'\/install\/'"${BIOSMODE}"'.raw.gz coreos.inst.ignition_url='"${URL}"'\/ignition\/'"${NODE}"'.ign ip='"${IP}"'::'"${GATEWAY}"':'"${NETMASK}"':'"${FQDN}"':'"${NET_INTERFACE}"':none:'"${DNS}"' nameserver='"${DNS}"'|' ${file} > $(pwd)/${NODE}_${file##*/}
    # Shorten the boot menu default and timeout
    sed -i -e 's/default vesamenu.c32/default linux/g' -e 's/timeout 600/timeout 10/g' $(pwd)/${NODE}_${file##*/}
  done
}
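To see what modify_cfg actually produces, here is the same sed applied to a sample isolinux append line, using the bootstrap node's values from below (the sample input line and /tmp paths are illustrative):

```shell
# Sketch: apply modify_cfg's first sed to one sample kernel-argument line.
URL="http://192.168.128.30:8080"; NODE="bootstrap"; BIOSMODE="bios"
IP="192.168.128.31"; GATEWAY="192.168.128.254"; NETMASK="255.255.255.0"
FQDN="bootstrap"; NET_INTERFACE="ens192"; DNS="192.168.128.30"

# A stand-in for the stock isolinux.cfg append line
echo '  append initrd=/images/initramfs.img nomodeset rd.neednet=1 coreos.inst=yes' > /tmp/isolinux.cfg

# Append install device, image URL, ignition URL and static-IP parameters
sed -e '/coreos.inst=yes/s|$| coreos.inst.install_dev=sda coreos.inst.image_url='"${URL}"'/install/'"${BIOSMODE}"'.raw.gz coreos.inst.ignition_url='"${URL}"'/ignition/'"${NODE}"'.ign ip='"${IP}"'::'"${GATEWAY}"':'"${NETMASK}"':'"${FQDN}"':'"${NET_INTERFACE}"':none:'"${DNS}"' nameserver='"${DNS}"'|' \
  /tmp/isolinux.cfg > /tmp/bootstrap_isolinux.cfg
cat /tmp/bootstrap_isolinux.cfg
```

The full function additionally rewrites EFI/redhat/grub.cfg and shortens the boot menu timeout with its second sed.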

# Common ISO boot parameters: HTTP URL, gateway, netmask, DNS
URL="http://192.168.128.30:8080"
GATEWAY="192.168.128.254"
NETMASK="255.255.255.0"
DNS="192.168.128.30"

# Variables for the bootstrap node
NODE="bootstrap"
IP="192.168.128.31"
FQDN="bootstrap"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# Variables for the master1 node
NODE="master1"
IP="192.168.128.32"
FQDN="master1"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# Variables for the master2 node
NODE="master2"
IP="192.168.128.33"
FQDN="master2"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# Variables for the master3 node
NODE="master3"
IP="192.168.128.34"
FQDN="master3"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# Variables for the worker1 node
NODE="worker1"
IP="192.168.128.35"
FQDN="worker1"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# Variables for the worker2 node
NODE="worker2"
IP="192.168.128.36"
FQDN="worker2"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg


# Build a separate installation ISO for each node
# https://github.com/coreos/coreos-assembler/blob/master/src/cmd-buildextend-installer#L97-L103
for node in bootstrap master1 master2 master3 worker1 worker2; do
  # Create per-node grub.cfg and isolinux.cfg files
  for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
    /bin/cp -f $(pwd)/${node}_${file##*/} ${file}
  done
  # Build the ISO image
  genisoimage -verbose -rock -J -joliet-long -volset ${VOLID} \
    -eltorito-boot isolinux/isolinux.bin -eltorito-catalog isolinux/boot.cat \
    -no-emul-boot -boot-load-size 4 -boot-info-table \
    -eltorito-alt-boot -efi-boot images/efiboot.img -no-emul-boot \
    -o ${NGINX_DIRECTORY}/${node}.iso .
done

# Clean up intermediate files
cd
rm -Rf ${TEMPDIR}

cd ${NGINX_DIRECTORY}

10. Installing RHCOS on the Nodes

(1) Copy the customized ISO files to the VMware ESXi host, ready for node installation

[root@misc pkg]# scp bootstrap.iso [email protected]:/vmfs/volumes/hdd/iso
[root@misc pkg]# scp m*.iso [email protected]:/vmfs/volumes/hdd/iso
[root@misc pkg]# scp w*.iso [email protected]:/vmfs/volumes/hdd/iso

(2) Create the master VMs as planned and boot them from the ISOs

  • Once the boot menu appears, simply start the install; the node automatically downloads the BIOS image and config files and completes the installation
  • After installation, eject the ISO so the node does not boot into the installer again
  • Install in order: bootstrap, master1, master2, master3; only after the masters are installed and running should the workers be installed
  • Progress can be watched through the proxy at http://registry.ipincloud.com:9000/
  • Detailed bootstrap progress can be followed on the misc node:
openshift-install --dir=/data/install wait-for bootstrap-complete --log-level debug

Notes:

  • Make sure each ISO is paired with the correct ignition file
  • During my installation, master1 reported etcdmain: member ab84b6a6e4a3cc9a has already been bootstrapped, and I spent a long time analyzing and fixing it. When master1 first finished installing, its etcd component was installed and registered itself as a cluster member automatically. After I reinstalled master1 from the ISO, the automatic etcd registration detected that this member already existed in the cluster and could not register again, so etcd on that node never started. The fix:

Manually edit the etcd yaml file on the master1 node, adding the --initial-cluster-state=existing parameter to the end of the exec etcd command, then delete the problem pod; the system automatically reinstalls the etcd pod and the node recovers.
After it comes up normally, revert the change, otherwise machine-config will never complete.

#
[root@master1 /]# vi /etc/kubernetes/manifests/etcd-member.yaml

      exec etcd \
        --initial-advertise-peer-urls=https://${ETCD_IPV4_ADDRESS}:2380 \
        --cert-file=/etc/ssl/etcd/system:etcd-server:${ETCD_DNS_NAME}.crt \
        --key-file=/etc/ssl/etcd/system:etcd-server:${ETCD_DNS_NAME}.key \
        --trusted-ca-file=/etc/ssl/etcd/ca.crt \
        --client-cert-auth=true \
        --peer-cert-file=/etc/ssl/etcd/system:etcd-peer:${ETCD_DNS_NAME}.crt \
        --peer-key-file=/etc/ssl/etcd/system:etcd-peer:${ETCD_DNS_NAME}.key \
        --peer-trusted-ca-file=/etc/ssl/etcd/ca.crt \
        --peer-client-cert-auth=true \
        --advertise-client-urls=https://${ETCD_IPV4_ADDRESS}:2379 \
        --listen-client-urls=https://0.0.0.0:2379 \
        --listen-peer-urls=https://0.0.0.0:2380 \
        --listen-metrics-urls=https://0.0.0.0:9978 \
        --initial-cluster-state=existing
        
[root@master1 /]# crictl pods
POD ID              CREATED             STATE               NAME                                                     NAMESPACE                                ATTEMPT
c4686dc3e5f4f       38 minutes ago      Ready               etcd-member-master1.ocptest.ipincloud.com                openshift-etcd                           5        
[root@master1 /]# crictl rmp xxx

  • Check whether installation is complete:
    When INFO It is now safe to remove the bootstrap resources appears, the master nodes are installed and the control plane has been handed over to the master cluster.

[root@misc install]# openshift-install --dir=/data/install wait-for bootstrap-complete --log-level debug
DEBUG OpenShift Installer v4.3.0
DEBUG Built from commit 2055609f95b19322ee6cfdd0bea73399297c4a3e
INFO Waiting up to 30m0s for the Kubernetes API at https://api.ocptest.ipincloud.com:6443...
INFO API v1.16.2 up
INFO Waiting up to 30m0s for bootstrapping to complete...
DEBUG Bootstrap status: complete
INFO It is now safe to remove the bootstrap resources
[root@misc install]#

(3) Installing the workers

  • Once the boot menu appears, simply start the install; the node automatically downloads the BIOS image and config files and completes the installation
  • After installation, eject the ISO so the node does not boot into the installer again
  • The workers are installed only after bootstrap, master1, master2 and master3 are installed and running
  • Progress can be watched through the proxy at http://registry.ipincloud.com:9000/
  • Detailed installation progress can also be followed on the misc node:
[root@misc redhat-operators-manifests]#  openshift-install --dir=/data/install wait-for install-complete --log-level debug
DEBUG OpenShift Installer v4.3.0
DEBUG Built from commit 2055609f95b19322ee6cfdd0bea73399297c4a3e
INFO Waiting up to 30m0s for the cluster at https://api.ocptest.ipincloud.com:6443 to initialize...
DEBUG Cluster is initialized
INFO Waiting up to 10m0s for the openshift-console route to be created...
DEBUG Route found in openshift-console namespace: console
DEBUG Route found in openshift-console namespace: downloads
DEBUG OpenShift console route is created
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
INFO Access the OpenShift web-console here:
https://console-openshift-console.apps.ocptest.ipincloud.com
INFO Login to the console with user: kubeadmin, password: pubmD-8Baaq-IX36r-WIWWf


  • The CSRs from the joining worker nodes must be approved

List the CSRs awaiting approval:

[root@misc ~]# oc get csr
NAME        AGE   REQUESTOR                                                                   CONDITION
csr-7lln5   70m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-d48xk   69m   system:node:master1.ocptest.ipincloud.com                                   Approved,Issued
csr-f2g7r   69m   system:node:master2.ocptest.ipincloud.com                                   Approved,Issued
csr-gbn2n   69m   system:node:master3.ocptest.ipincloud.com                                   Approved,Issued
csr-hwxwx   13m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-ppgxx   13m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-wg874   70m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-zkp79   70m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
[root@misc ~]#

Approve them:

oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
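The jq filter above selects CSRs whose .status object is empty, i.e. still pending. A quick demonstration on canned oc get csr -ojson output (the CSR names are samples):

```shell
# Sketch: show which CSRs the filter would pass to oc adm certificate approve.
cat > /tmp/csr.json <<'EOF'
{"items":[
  {"metadata":{"name":"csr-hwxwx"},"status":{}},
  {"metadata":{"name":"csr-d48xk"},"status":{"certificate":"..."}}
]}
EOF
# Only the entry with an empty status (not yet approved/issued) is selected
jq -r '.items[] | select(.status == {}) | .metadata.name' /tmp/csr.json > /tmp/pending.txt
cat /tmp/pending.txt
```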

(4) Start NFS on misc


bash /data/pkg/ocp4-upi-helpernode/files/nfs-provisioner-setup.sh
# Check the status
oc get pods -n nfs-provisioner

(5) Use NFS as storage for the internal OCP registry

oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"storage":{"pvc":{"claim":""}}}}' --type=merge

oc get clusteroperator image-registry

11. Configuring Login

(1) Create a regular administrator account

# Create the admin credentials on the misc machine
mkdir -p ~/auth
htpasswd -bBc ~/auth/admin-passwd admin scwang18
# Copy the file to the local machine
mkdir -p ~/auth
scp -P 20030 [email protected]:/root/auth/admin-passwd  ~/auth/
# On the OAuth Details page of the web console, add an HTPasswd-type Identity Provider and upload the admin-passwd file.
https://console-openshift-console.apps.ocptest.ipincloud.com
# Grant the new admin user cluster-admin privileges
oc adm policy add-cluster-role-to-user cluster-admin admin