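This article follows Red Hat's official documentation for installing OpenShift 4.3 on bare metal. Because I only have a single PC with 64 GB of RAM, I installed the free edition of VMware vSphere 6.7 for this test and tried to install with half the minimum memory required by the official OCP documentation. My notes follow.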
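1. The OCP installation process
According to Red Hat's official documentation, the installation proceeds as follows:
- The bootstrap machine boots and prepares the resources the masters need
- The masters fetch the resources they need from the bootstrap machine and finish booting
- The masters use the bootstrap machine to form an etcd cluster
- The bootstrap machine starts a temporary Kubernetes control plane using the newly formed etcd cluster
- The temporary control plane starts the production control plane on the master nodes
- The temporary control plane shuts down and hands control over to the production control plane
- The bootstrap machine injects the OCP components into the production control plane
- The installer shuts down the bootstrap machine
- The control plane deploys the compute nodes
- The control plane installs the remaining services as operators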
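2. Preparing server resources
The servers are planned as follows:
- 3 control plane nodes, running etcd, the control plane components, and the infra components. Because resources are tight, no DNS server is deployed and names are resolved through hosts files;
- 2 compute nodes, running the actual workloads;
- 1 bootstrap node, which carries out the installation;
- 1 misc/lb node, used to prepare installation resources, boot the bootstrap node, and act as the load balancer (lb).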
Hostname | vcpu | ram | hdd | ip | fqdn |
---|---|---|---|---|---|
misc/lb | 4 | 8g | 120g | 192.168.128.30 | misc.ocptest.ipincloud.com/lb.ocptest.ipincloud.com |
bootstrap | 4 | 8g | 120g | 192.168.128.31 | bootstrap.ocptest.ipincloud.com |
master1 | 4 | 8g | 120g | 192.168.128.32 | master1.ocptest.ipincloud.com |
master2 | 4 | 8g | 120g | 192.168.128.33 | master2.ocptest.ipincloud.com |
master3 | 4 | 8g | 120g | 192.168.128.34 | master3.ocptest.ipincloud.com |
worker1 | 2 | 4g | 120g | 192.168.128.35 | worker1.ocptest.ipincloud.com |
worker2 | 2 | 4g | 120g | 192.168.128.36 | worker2.ocptest.ipincloud.com |
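Since the plan resolves names through hosts files, the table above can be turned into /etc/hosts lines mechanically. The following is a minimal sketch: the function name hosts_lines is hypothetical, and it assumes every name follows the pattern host.cluster.base_domain shown in the table. It only prints the lines so they can be reviewed before being appended anywhere.

```shell
#!/usr/bin/env bash
# Print /etc/hosts-style lines for the planned nodes.
# Assumption: names follow <host>.<cluster>.<base_domain>, as in the table.
hosts_lines() {
  local cluster=$1 base=$2 host ip
  while read -r host ip; do
    [ -n "$host" ] || continue
    printf '%s %s.%s.%s %s\n' "$ip" "$host" "$cluster" "$base" "$host"
  done <<EOF
misc 192.168.128.30
lb 192.168.128.30
bootstrap 192.168.128.31
master1 192.168.128.32
master2 192.168.128.33
master3 192.168.128.34
worker1 192.168.128.35
worker2 192.168.128.36
EOF
}

hosts_lines ocptest ipincloud.com
```

Redirecting the output with `>> /etc/hosts` on each machine would apply it; that step is left out on purpose.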
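3. Preparing network resources
The API server and ingress share one load balancer, namely misc/lb.
The DNS records are listed below; ocptest is the cluster name and ipincloud.com is the base domain. To apply these settings, edit the corresponding templates under tasks/ in the Ansible playbook.
See
https://github.com/scwang18/ocp4-upi-helpernode.git
- DNS configuration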
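Component | DNS record | Description |
---|---|---|
Kubernetes API | api.ocptest.ipincloud.com | This DNS record points to the load balancer for the control plane nodes. It must be resolvable both from outside the cluster and from every node in the cluster. |
Kubernetes API | api-int.ocptest.ipincloud.com | This DNS record points to the load balancer for the control plane nodes. It must be resolvable from every node in the cluster. |
Routes | *.apps.ocptest.ipincloud.com | A wildcard DNS record that points to the ingress load balancer. It must be resolvable both from outside the cluster and from every node in the cluster. |
etcd | etcd-<index>.ocptest.ipincloud.com | DNS records that point to the etcd nodes. They must be resolvable from every node in the cluster. |
etcd | _etcd-server-ssl._tcp.ocptest.ipincloud.com | Because etcd serves on port 2380, an SRV DNS record is needed for each etcd node, with priority 0, weight 10, and port 2380, as shown in the table below. |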
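To make the table above easier to check, the names can be generated and tested from a node. This is a minimal sketch under a few assumptions: the function names are hypothetical, test.apps stands in for any name under the wildcard, three etcd members (0 to 2) are assumed, and getent is assumed to be available (it is on glibc-based Linux systems).

```shell
#!/usr/bin/env bash
# List the DNS names from the table that must resolve for a cluster.
# "test.apps" stands in for any name under the *.apps wildcard.
required_dns_names() {
  local cluster=$1 base=$2 i
  printf '%s.%s.%s\n' api "$cluster" "$base"
  printf '%s.%s.%s\n' api-int "$cluster" "$base"
  printf '%s.%s.%s\n' test.apps "$cluster" "$base"
  for i in 0 1 2; do
    printf 'etcd-%s.%s.%s\n' "$i" "$cluster" "$base"
  done
}

# Report which names resolve on this machine (getent consults hosts and DNS).
check_dns_names() {
  local name
  required_dns_names "$1" "$2" | while read -r name; do
    if getent hosts "$name" >/dev/null 2>&1; then
      echo "ok   $name"
    else
      echo "FAIL $name"
    fi
  done
}

required_dns_names ocptest ipincloud.com
```

Running `check_dns_names ocptest ipincloud.com` on the misc node and on a cluster node shows at a glance which records are still missing.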
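- etcd SRV DNS records
The following records are required. When the bootstrap machine creates the etcd servers, it uses them to configure etcd service discovery automatically.
_service._proto.name. | TTL | class | SRV | priority | weight | port | target. |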
---|---|---|---|---|---|---|---|
_etcd-server-ssl._tcp.<cluster_name>.<base_domain> | 86400 | IN | SRV | 0 | 10 | 2380 | etcd-0.<cluster_name>.<base_domain>. |
_etcd-server-ssl._tcp.<cluster_name>.<base_domain> | 86400 | IN | SRV | 0 | 10 | 2380 | etcd-1.<cluster_name>.<base_domain>. |
_etcd-server-ssl._tcp.<cluster_name>.<base_domain> | 86400 | IN | SRV | 0 | 10 | 2380 | etcd-2.<cluster_name>.<base_domain>. |
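The rows of the SRV table differ only in the member index, so they can be generated. Below is a minimal sketch that prints BIND-style zone lines; the function name etcd_srv_records is hypothetical, and the TTL, priority, weight, and port are the values from the table.

```shell
#!/usr/bin/env bash
# Print BIND-style SRV records for the etcd members, one per master.
etcd_srv_records() {
  local cluster=$1 base=$2 count=$3 i=0
  while [ "$i" -lt "$count" ]; do
    printf '_etcd-server-ssl._tcp.%s.%s. 86400 IN SRV 0 10 2380 etcd-%s.%s.%s.\n' \
      "$cluster" "$base" "$i" "$cluster" "$base"
    i=$((i + 1))
  done
}

etcd_srv_records ocptest ipincloud.com 3
```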
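- Create an SSH private key and add it to the SSH agent
With a passwordless SSH key you can log in to the master nodes as the core user for installation debugging and disaster recovery on the cluster.
(1) On the misc node, run the following command to create the SSH key
ssh-keygen -t rsa -b 4096 -N ''
This command creates two files, id_rsa and id_rsa.pub, under ~/.ssh/.
(2) Start the ssh-agent process and add the passwordless private key to it
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
In the next step, when installing OCP, the SSH public key must be supplied in the installer's configuration file.
Because we provision the resources manually ourselves, the SSH public key also has to be placed on each cluster node so that this machine can log in to the nodes without a password.
#Copy the id_rsa.pub file just generated in ~/.ssh to the ~/.ssh directory of the cluster node you want to log in to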
scp ~/.ssh/id_rsa.pub [email protected]:~/.ssh/
#Then run the following command on the cluster node to import the public key into ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
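Appending with `>>` adds a duplicate entry every time the command is repeated. A minimal sketch of an idempotent variant follows; add_authorized_key is a hypothetical helper, and the key material in the demonstration is made up. The standard ssh-copy-id tool does the same job over SSH when it is available.

```shell
#!/usr/bin/env bash
# Append a public key to an authorized_keys file only if it is not there yet.
add_authorized_key() {
  local pubkey_file=$1 auth_file=$2
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  chmod 600 "$auth_file"
  # -x: whole-line match, -F: fixed string, -q: quiet
  if ! grep -qxF -- "$(cat "$pubkey_file")" "$auth_file"; then
    cat "$pubkey_file" >> "$auth_file"
  fi
}

# Demonstration on throwaway files (made-up key material).
demo_dir=$(mktemp -d)
echo "ssh-rsa AAAAexample core@misc" > "$demo_dir/id_rsa.pub"
add_authorized_key "$demo_dir/id_rsa.pub" "$demo_dir/authorized_keys"
add_authorized_key "$demo_dir/id_rsa.pub" "$demo_dir/authorized_keys"
grep -c . "$demo_dir/authorized_keys"
```

Calling the helper twice leaves a single entry in the file, which is the point of the check.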
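4. Obtaining the installer
You need to register a Red Hat account to download the evaluation installer; the details of the download process are omitted here.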
https://cloud.redhat.com/openshift/install/metal/user-provisioned
- Download the installer
rm -rf /data/pkg
mkdir -p /data/pkg
cd /data/pkg
#OCP installer
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-install-linux-4.3.0.tar.gz
#OCP client
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-client-linux-4.3.0.tar.gz
#RHCOS installer ISO
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer.iso
#RHCOS BIOS raw image
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-metal.raw.gz
#If you install from the ISO, the following two files do not need to be downloaded
#RHCOS installer kernel, used for iPXE installs
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer-kernel
#RHCOS initramfs image, used for iPXE installs
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer-initramfs.img
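The downloads above are large, and a truncated file only shows up much later as a failed install, so it can be worth checking each file against a known digest. A minimal sketch, assuming GNU sha256sum is installed and that the expected digest comes from a source you trust; verify_sha256 is a hypothetical helper and the demonstration uses a throwaway file rather than a real image.

```shell
#!/usr/bin/env bash
# Return success only if the file's SHA-256 digest equals the expected value.
verify_sha256() {
  local file=$1 expected=$2 actual
  actual=$(sha256sum "$file" | awk '{print $1}')
  [ "$actual" = "$expected" ]
}

# Demonstration with a throwaway file; for the real images, take the expected
# digest from a trusted source instead of computing it locally.
demo_file=$(mktemp)
printf 'hello\n' > "$demo_file"
demo_sum=$(sha256sum "$demo_file" | awk '{print $1}')
if verify_sha256 "$demo_file" "$demo_sum"; then
  echo "checksum ok"
fi
```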
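5. Preparing the misc utility machine
This setup tool, adapted from Wang Zheng's scripts, makes it easy to bring up LB, DHCP, PXE, DNS, and HTTP services on the utility machine.
(1) Install ansible and git
yum -y install ansible git
(2) Clone the playbook from GitHub
cd /data/pkg
git clone https://github.com/scwang18/ocp4-upi-helpernode.git
(3) Edit the playbook's variables file
Adjust the variables file to match your own network plan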
[root@centos75 pkg]# cd /data/pkg/ocp4-upi-helpernode/
[root@centos75 ocp4-upi-helpernode]# cat vars-static.yaml
[root@misc pkg]# cat vars-static.yaml
---
staticips: true
named: true
helper:
  name: "helper"
  ipaddr: "192.168.128.30"
  networkifacename: "ens192"
dns:
  domain: "ipincloud.com"
  clusterid: "ocptest"
  forwarder1: "192.168.128.30"
  forwarder2: "192.168.128.30"
registry:
  name: "registry"
  ipaddr: "192.168.128.30"
yum:
  name: "yum"
  ipaddr: "192.168.128.30"
bootstrap:
  name: "bootstrap"
  ipaddr: "192.168.128.31"
masters:
  - name: "master1"
    ipaddr: "192.168.128.32"
  - name: "master2"
    ipaddr: "192.168.128.33"
  - name: "master3"
    ipaddr: "192.168.128.34"
workers:
  - name: "worker1"
    ipaddr: "192.168.128.35"
  - name: "worker2"
    ipaddr: "192.168.128.36"
force_ocp_download: false
ocp_bios: "file:///data/pkg/rhcos-4.3.0-x86_64-metal.raw.gz"
ocp_initramfs: "file:///data/pkg/rhcos-4.3.0-x86_64-installer-initramfs.img"
ocp_install_kernel: "file:///data/pkg/rhcos-4.3.0-x86_64-installer-kernel"
ocp_client: "file:///data/pkg/openshift-client-linux-4.3.0.tar.gz"
ocp_installer: "file:///data/pkg/openshift-install-linux-4.3.0.tar.gz"
ocp_filetranspiler: "file:///data/pkg/filetranspiler-master.zip"
registry_server: "registry.ipincloud.com:8443"
[root@misc pkg]#
(4) Run the Ansible installation
ansible-playbook -e @vars-static.yaml tasks/main.yml
6. Preparing the docker env
# On a machine with unrestricted Internet access, package the required image files
#rm -rf /data/ocp4
mkdir -p /data/ocp4
cd /data/ocp4
# This script does not work well, so do not download it; use the modified version below
# wget https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.3/scripts/build.dist.sh
yum -y install podman docker-distribution pigz skopeo docker buildah jq python3-pip
pip3 install yq
# https://blog.csdn.net/ffzhihua/article/details/85237411
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
rpm2cpio python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm | cpio -iv --to-stdout ./etc/rhsm/ca/redhat-uep.pem | tee /etc/rhsm/ca/redhat-uep.pem
systemctl start docker
docker login -u wuliangye2019 -p Red@123! registry.redhat.io
docker login -u wuliangye2019 -p Red@123! registry.access.redhat.com
docker login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.redhat.io
podman login -u wuliangye2019 -p Red@123! registry.access.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
# to download the pull-secret.json, open following link
# https://cloud.redhat.com/openshift/install/metal/user-provisioned
cat << 'EOF' > /data/pull-secret.json
{"auths":{"cloud.openshift.com":{"auth":"xxxxxxxxxxx"}}}
EOF
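Each auth value in a pull secret is the Base64 encoding of user:password. If the credentials of the local mirror registry ever need to be added to the pull secret, the value can be produced as below. This is a minimal sketch: registry_auth and auth_entry are hypothetical helpers, the credentials are placeholders, and whether your setup needs the extra entry is an assumption to verify.

```shell
#!/usr/bin/env bash
# Encode "user:password" the way the "auth" field of a pull secret expects.
registry_auth() {
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
  echo
}

# Print a JSON fragment for one registry (placeholder credentials).
auth_entry() {
  local registry=$1 user=$2 password=$3
  printf '"%s":{"auth":"%s"}\n' "$registry" "$(registry_auth "$user" "$password")"
}

auth_entry registry.ipincloud.com:8443 myuser mypassword
```

The printed fragment would go inside the "auths" object of /data/pull-secret.json, next to the existing entries.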
Create the build.dist.sh file
#!/usr/bin/env bash
set -e
set -x
var_date=$(date '+%Y-%m-%d')
echo $var_date
#The following does not need to run every time
#cat << EOF >> /etc/hosts
#127.0.0.1 registry.ipincloud.com
#EOF
#mkdir -p /etc/crts/
#cd /etc/crts
#openssl req \
# -newkey rsa:2048 -nodes -keyout ipincloud.com.key \
# -x509 -days 3650 -out ipincloud.com.crt -subj \
# "/C=CN/ST=GD/L=SZ/O=Global Security/OU=IT Department/CN=*.ipincloud.com"
#cp /etc/crts/ipincloud.com.crt /etc/pki/ca-trust/source/anchors/
#update-ca-trust extract
systemctl stop docker-distribution
rm -rf /data/registry
mkdir -p /data/registry
cat << EOF > /etc/docker-distribution/registry/config.yml
version: 0.1
log:
  fields:
    service: registry
storage:
  cache:
    layerinfo: inmemory
  filesystem:
    rootdirectory: /data/registry
  delete:
    enabled: true
http:
  addr: :8443
  tls:
    certificate: /etc/crts/ipincloud.com.crt
    key: /etc/crts/ipincloud.com.key
EOF
systemctl restart docker
systemctl enable docker-distribution
systemctl restart docker-distribution
build_number_list=$(cat << EOF
4.3.0
EOF
)
mkdir -p /data/ocp4
cd /data/ocp4
install_build() {
BUILDNUMBER=$1
echo ${BUILDNUMBER}
mkdir -p /data/ocp4/${BUILDNUMBER}
cd /data/ocp4/${BUILDNUMBER}
#Download and install the OpenShift client and installer. Only needed the first time; the Ansible setup of the utility machine has already done this
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/release.txt
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-client-linux-${BUILDNUMBER}.tar.gz
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-install-linux-${BUILDNUMBER}.tar.gz
#Unpack the installer and client into a directory on the user's PATH. Only needed the first time
#tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
#tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
export OCP_RELEASE=${BUILDNUMBER}
export LOCAL_REG='registry.ipincloud.com:8443'
export LOCAL_REPO='ocp4/openshift4'
export UPSTREAM_REPO='openshift-release-dev'
export LOCAL_SECRET_JSON="/data/pull-secret.json"
export OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE=${LOCAL_REG}/${LOCAL_REPO}:${OCP_RELEASE}
export RELEASE_NAME="ocp-release"
oc adm release mirror -a ${LOCAL_SECRET_JSON} \
--from=quay.io/${UPSTREAM_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-x86_64 \
--to-release-image=${LOCAL_REG}/${LOCAL_REPO}:${OCP_RELEASE} \
--to=${LOCAL_REG}/${LOCAL_REPO}
}
while read -r line; do
install_build $line
done <<< "$build_number_list"
cd /data/ocp4
#wget -O ocp4-upi-helpernode-master.zip https://github.com/wangzheng422/ocp4-upi-helpernode/archive/master.zip
#The following is commented out because the registry at quay.io/wangzheng422 is a v1 registry and cannot coexist with v2
#podman pull quay.io/wangzheng422/filetranspiler
#podman save quay.io/wangzheng422/filetranspiler | pigz -c > filetranspiler.tgz
#podman pull docker.io/library/registry:2
#podman save docker.io/library/registry:2 | pigz -c > registry.tgz
systemctl start docker
docker login -u wuliangye2019 -p Red@123! registry.redhat.io
docker login -u wuliangye2019 -p Red@123! registry.access.redhat.com
docker login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.redhat.io
podman login -u wuliangye2019 -p Red@123! registry.access.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
# The following commands take 2-3 hours to run; be patient.
# build operator catalog
podman login registry.ipincloud.com:8443 -u root -p Scwang18
oc adm catalog build \
--appregistry-endpoint https://quay.io/cnr \
--appregistry-org redhat-operators \
--to=${LOCAL_REG}/ocp4-operator/redhat-operators:v1
oc adm catalog mirror \
${LOCAL_REG}/ocp4-operator/redhat-operators:v1 \
${LOCAL_REG}/operator
#cd /data
#tar cf - registry/ | pigz -c > registry.tgz
#cd /data
#tar cf - ocp4/ | pigz -c > ocp4.tgz
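Run the build.dist.sh script
There is a huge pitfall here. Pulling the images from quay.io to the local registry transfers more than 5 GB, which usually does not finish in one attempt and fails partway. Each time build.dist.sh is rerun after a failure, it deletes the existing registry and starts from scratch, wasting a lot of time. The deletion is actually unnecessary: oc adm release mirror automatically skips images that already exist. A lesson learned the hard way.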
bash build.dist.sh
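Because the mirror skips images that are already present, a failed run can simply be repeated without wiping /data/registry. The following is a minimal sketch of a retry wrapper; retry is a hypothetical helper, and the attempt count and delay are arbitrary. The real mirror command is shown only as a comment, using the same arguments as the script.

```shell
#!/usr/bin/env bash
# Run a command up to max_tries times, pausing between attempts.
retry() {
  local max_tries=$1 delay=$2 try=1
  shift 2
  until "$@"; do
    if [ "$try" -ge "$max_tries" ]; then
      echo "giving up after $try attempts: $*" >&2
      return 1
    fi
    try=$((try + 1))
    sleep "$delay"
  done
}

# Intended use (not run here): keep /data/registry in place and retry the mirror.
# retry 10 30 oc adm release mirror -a /data/pull-secret.json \
#   --from=quay.io/openshift-release-dev/ocp-release:4.3.0-x86_64 \
#   --to-release-image=registry.ipincloud.com:8443/ocp4/openshift4:4.3.0 \
#   --to=registry.ipincloud.com:8443/ocp4/openshift4
```

For this to help, the `rm -rf /data/registry` line in build.dist.sh has to be skipped on reruns.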
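When oc adm release mirror finishes, it has populated the local mirror registry from the official image repository. Record the information it returns, especially the imageContentSources section, which goes into install-config.yaml later.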
Success
Update image: registry.ipincloud.com:8443/ocp4/openshift4:4.3.0
Mirror prefix: registry.ipincloud.com:8443/ocp4/openshift4
To use the new mirrored repository to install, add the following section to the install-config.yaml:
imageContentSources:
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
To use the new mirrored repository for upgrades, use the following to create an ImageContentSourcePolicy:
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: example
spec:
  repositoryDigestMirrors:
  - mirrors:
    - registry.ipincloud.com:8443/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
    - registry.ipincloud.com:8443/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
The following commands do not need to be run; build.dist.sh has already run them
oc adm release mirror -a /data/pull-secret.json --from=quay.io/openshift-release-dev/ocp-release:4.3.0-x86_64 --to-release-image=registry.ipincloud.com:8443/ocp4/openshift4:4.3.0 --to=registry.ipincloud.com:8443/ocp4/openshift4
podman login registry.ipincloud.com:8443 -u root -p Scwang18
oc adm catalog build \
--appregistry-endpoint https://quay.io/cnr \
--appregistry-org redhat-operators \
--to=registry.ipincloud.com:8443/ocp4-operator/redhat-operators:v1
oc adm catalog mirror \
registry.ipincloud.com:8443/ocp4-operator/redhat-operators:v1 \
registry.ipincloud.com:8443/operator
#If oc adm catalog mirror does not succeed, it generates a mapping.txt file. Delete the lines that failed from that file, then run the remainder as follows
oc image mirror -a /data/pull-secret.json -f /data/mapping-ok.txt
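Removing the failed lines from mapping.txt by hand is error-prone when the file is long. A minimal sketch of doing it with grep follows; filter_mapping is a hypothetical helper, the file failed.txt (one fixed string per failed image) is an assumption, and the sample mapping lines are made up.

```shell
#!/usr/bin/env bash
# Write to stdout every mapping line that does not mention a failed image.
# failed_file holds one fixed string (for example an image reference) per line;
# it must not contain empty lines, because an empty pattern matches everything.
filter_mapping() {
  local mapping_file=$1 failed_file=$2
  grep -v -F -f "$failed_file" "$mapping_file"
}

# Demonstration on throwaway files (made-up image references).
demo=$(mktemp -d)
printf '%s\n' \
  'src.example/a@sha256:1=dst.example/a:1' \
  'src.example/b@sha256:2=dst.example/b:2' > "$demo/mapping.txt"
echo 'src.example/b@sha256:2' > "$demo/failed.txt"
filter_mapping "$demo/mapping.txt" "$demo/failed.txt" > "$demo/mapping-ok.txt"
cat "$demo/mapping-ok.txt"
```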
oc image mirror quay.io/external_storage/nfs-client-provisioner:latest registry.ipincloud.com:8443/ocp4/openshift4/nfs-client-provisioner:latest
oc image mirror quay.io/external_storage/nfs-client-provisioner:latest registry.ipincloud.com:8443/quay.io/external_storage/nfs-client-provisioner:latest
#Look up the image's SHA digest
curl -v --silent -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X GET https://registry.ipincloud.com:8443/v2/ocp4/openshift4/nfs-client-provisioner/manifests/latest 2>&1 | grep Docker-Content-Digest | awk '{print ($3)}'
#Delete the image manifest by its digest
curl -v --silent -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X DELETE https://registry.ipincloud.com:8443/v2/ocp4/openshift4/nfs-client-provisioner/manifests/sha256:022ea0b0d69834b652a4c53655d78642ae23f0324309097be874fb58d09d2919
#Reclaim the space used by the image
podman exec -it mirror-registry /bin/registry garbage-collect /etc/docker/registry/config.yml
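The digest lookup above depends on the exact layout of curl's verbose output. A minimal sketch of a more tolerant parser follows: it reads plain response headers on stdin and matches the header name case-insensitively. extract_digest is a hypothetical helper, and the sample headers are made up; feeding it from `curl -sI` with the same Accept header is an assumption about your registry accepting HEAD requests.

```shell
#!/usr/bin/env bash
# Read HTTP response headers on stdin and print the Docker-Content-Digest value.
extract_digest() {
  tr -d '\r' | awk 'tolower($1) == "docker-content-digest:" { print $2 }'
}

# Sample header text (made up) standing in for real registry output.
printf 'HTTP/1.1 200 OK\r\nDocker-Content-Digest: sha256:abc123\r\n\r\n' | extract_digest
```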
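7. Creating the installer configuration file
(1) Create the installer directory
rm -rf /data/install
mkdir -p /data/install
cd /data/install
(2) Customize the install-config.yaml file
- Fill in pullSecret
[root@misc data]# cat /data/pull-secret.json
{"auths":{"cloud.openshift.com":{"auth":"omitted"}}}
- Add sshKey (the contents of the public key file created in step (1) of section 3)
cat ~/.ssh/id_rsa.pub
- additionalTrustBundle (the certificate generated when the mirror registry was created)
[root@misc crts]# cat /etc/crts/ipincloud.com.crt
-----BEGIN CERTIFICATE-----
xxx omitted
-----END CERTIFICATE-----
- Add a proxy
In production the cluster does not need a direct connection to the Internet; a proxy can be configured for the cluster in install-config.yaml.
For this test, to speed up downloads from the Internet, I set up a v2ray server on AWS beforehand, with the misc server acting as the v2ray client. The setup is described in a separate article.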
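- When repeating the experiment, remove the whole installation directory. For example, if install-config.yaml lives in the directory install, run rm -rf install rather than rm -rf install/*. The latter leaves the hidden file .openshift_install_state.json behind, which can cause: x509: certificate has expired or is not yet valid.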
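- In the documentation and blog examples, the cidr in install-config.yaml is set to a 10.x network. Because I had not read the documentation carefully, I took it to be the node network, when it is in fact the cluster (pod) network. That misunderstanding caused the most baffling error of the whole process: no matches for kind MachineConfig.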
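The hidden-file pitfall in the notes above is easy to avoid by always recreating the directory instead of emptying it. A minimal sketch follows; reset_dir is a hypothetical helper, and the demonstration runs on a throwaway directory rather than /data/install.

```shell
#!/usr/bin/env bash
# Recreate a directory from scratch so no hidden state files survive.
reset_dir() {
  local dir=$1
  # Refuse obviously dangerous targets.
  case "$dir" in
    ""|"/") echo "refusing to remove '$dir'" >&2; return 1 ;;
  esac
  rm -rf -- "$dir"
  mkdir -p -- "$dir"
}

# Demonstration on a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/install"
touch "$demo/install/.openshift_install_state.json" "$demo/install/bootstrap.ign"
reset_dir "$demo/install"
ls -A "$demo/install"
```

After the call the directory exists and is empty, including hidden files, which `rm -rf install/*` would not guarantee.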
The final file is as follows:
[root@centos75 install]# vi install-config.yaml
apiVersion: v1
baseDomain: ipincloud.com
proxy:
  httpProxy: http://192.168.128.30:8001
  httpsProxy: http://192.168.128.30:8001
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: ocptest
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: '{"auths":{"omitted'
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  omitted; note that every line here must be indented by two spaces
  -----END CERTIFICATE-----
imageContentSources:
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
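The clusterNetwork cidr and serviceNetwork in the file above are internal to the cluster and should not contain the node addresses. A minimal sketch of a check for that follows, using plain shell arithmetic; ip_to_int and ip_in_cidr are hypothetical helpers, and the sketch assumes IPv4 addresses without leading zeros.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

# Succeed if the address lies inside the CIDR block (for example 10.128.0.0/14).
ip_in_cidr() {
  local ip=$1 net=${2%/*} bits=${2#*/} mask
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Report any planned node address that falls inside a cluster-internal range.
for node_ip in 192.168.128.31 192.168.128.32 192.168.128.35; do
  for cidr in 10.128.0.0/14 172.30.0.0/16; do
    if ip_in_cidr "$node_ip" "$cidr"; then
      echo "overlap: $node_ip is inside $cidr"
    fi
  done
done
```

With the addresses planned in this article the loop reports nothing, which is the desired result.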
(3) Back up the customized install-config.yaml so it can be reused later
cd /data/install
cp install-config.yaml ../install-config.yaml.20200205
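8. Creating the Kubernetes manifests and Ignition configuration files
(1) Generate the Kubernetes manifests
openshift-install create manifests --dir=/data/install
Note: use an absolute path when specifying the directory that contains install-config.yaml.
(2) Edit manifests/cluster-scheduler-02-config.yml to prevent pods from being scheduled on the control plane nodes
According to Red Hat's official installation documentation, Kubernetes does not support an ingress load balancer reaching pods on control plane nodes.
a. Open manifests/cluster-scheduler-02-config.yml
b. Find the mastersSchedulable parameter and set it to False
c. Save and exit.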
vi /data/install/manifests/cluster-scheduler-02-config.yml
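The manual edit in steps a to c can also be done without an editor, which helps when the installation is repeated. A minimal sketch follows; disable_master_scheduling is a hypothetical helper, and the demonstration uses a made-up fragment rather than the real manifest.

```shell
#!/usr/bin/env bash
# Set mastersSchedulable to false in the scheduler manifest, without an editor.
disable_master_scheduling() {
  local file=$1 tmp
  tmp=$(mktemp)
  sed 's/mastersSchedulable: *[Tt]rue/mastersSchedulable: false/' "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Demonstration on a made-up fragment of the manifest.
demo=$(mktemp)
printf 'spec:\n  mastersSchedulable: true\n' > "$demo"
disable_master_scheduling "$demo"
grep mastersSchedulable "$demo"
```

Against the real file the call would be `disable_master_scheduling /data/install/manifests/cluster-scheduler-02-config.yml`.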
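(3) Create the Ignition configuration files
Note: the installer consumes install-config.yaml, so once the Ignition configuration files have been created the file is gone. Be sure to back it up first.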
openshift-install create ignition-configs --dir=/data/install
(4) Copy the Ignition configuration files to the HTTP server directory for use during installation
cd /data/install
\cp -f bootstrap.ign /var/www/html/ignition/bootstrap.ign
\cp -f master.ign /var/www/html/ignition/master1.ign
\cp -f master.ign /var/www/html/ignition/master2.ign
\cp -f master.ign /var/www/html/ignition/master3.ign
\cp -f worker.ign /var/www/html/ignition/worker1.ign
\cp -f worker.ign /var/www/html/ignition/worker2.ign
cd /var/www/html/ignition/
chmod 755 *.ign
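The copy commands above follow one rule: each node gets the Ignition file of its role under its own name. A minimal sketch of the same step as a loop follows; publish_ignition is a hypothetical helper, the node list is the one planned in this article, and mode 644 is used on the assumption that the HTTP server only needs read access.

```shell
#!/usr/bin/env bash
# Copy role-level Ignition files to per-node names in the web server directory.
publish_ignition() {
  local src=$1 dst=$2 node role
  mkdir -p "$dst"
  cp -f "$src/bootstrap.ign" "$dst/bootstrap.ign"
  for node in master1 master2 master3 worker1 worker2; do
    role=${node%[0-9]}          # master1 -> master, worker2 -> worker
    cp -f "$src/$role.ign" "$dst/$node.ign"
  done
  chmod 644 "$dst"/*.ign        # readable by the HTTP server; no execute bit needed
}

# Intended use (not run here):
# publish_ignition /data/install /var/www/html/ignition
```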
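That completes the required configuration files; the next step is to create the nodes.
9. Customizing the RHCOS ISO
The boot parameters must be modified at install time, and they can only be typed in by hand. Doing that on every machine is tedious and error-prone, so we use genisoimage to build a customized installation image for each machine.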
#Install the image-building tools
yum -y install genisoimage libguestfs-tools
systemctl start libvirtd
#Set environment variables
export NGINX_DIRECTORY=/data/pkg
export RHCOSVERSION=4.3.0
export VOLID=$(isoinfo -d -i ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso | awk '/Volume id/ { print $3 }')
#Create a temporary directory for intermediate files
TEMPDIR=$(mktemp -d)
echo $VOLID
echo $TEMPDIR
cd ${TEMPDIR}
# Extract the ISO content using guestfish (to avoid sudo mount)
guestfish -a ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso \
-m /dev/sda tar-out / - | tar xvf -
#Define the function that modifies the boot configuration files
modify_cfg(){
for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
# Add the appropriate image and ignition URLs
sed -e '/coreos.inst=yes/s|$| coreos.inst.install_dev=sda coreos.inst.image_url='"${URL}"'\/install\/'"${BIOSMODE}"'.raw.gz coreos.inst.ignition_url='"${URL}"'\/ignition\/'"${NODE}"'.ign ip='"${IP}"'::'"${GATEWAY}"':'"${NETMASK}"':'"${FQDN}"':'"${NET_INTERFACE}"':none:'"${DNS}"' nameserver='"${DNS}"'|' ${file} > $(pwd)/${NODE}_${file##*/}
# Shorten the boot wait time in the configuration
sed -i -e 's/default vesamenu.c32/default linux/g' -e 's/timeout 600/timeout 10/g' $(pwd)/${NODE}_${file##*/}
done
}
#Set the URL, gateway, DNS, and other boot parameters common to all ISOs
URL="http://192.168.128.30:8080"
GATEWAY="192.168.128.254"
NETMASK="255.255.255.0"
DNS="192.168.128.30"
#Set the bootstrap node variables
NODE="bootstrap"
IP="192.168.128.31"
FQDN="bootstrap"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg
#Set the master1 node variables
NODE="master1"
IP="192.168.128.32"
FQDN="master1"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg
#Set the master2 node variables
NODE="master2"
IP="192.168.128.33"
FQDN="master2"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg
#Set the master3 node variables
NODE="master3"
IP="192.168.128.34"
FQDN="master3"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg
#Set the worker1 node variables
NODE="worker1"
IP="192.168.128.35"
FQDN="worker1"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg
#Set the worker2 node variables
NODE="worker2"
IP="192.168.128.36"
FQDN="worker2"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg
# Create a separate installation image for each node
# https://github.com/coreos/coreos-assembler/blob/master/src/cmd-buildextend-installer#L97-L103
for node in bootstrap master1 master2 master3 worker1 worker2; do
# Put each node's own grub.cfg and isolinux.cfg in place
for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
/bin/cp -f $(pwd)/${node}_${file##*/} ${file}
done
# Build the ISO image
genisoimage -verbose -rock -J -joliet-long -volset ${VOLID} \
-eltorito-boot isolinux/isolinux.bin -eltorito-catalog isolinux/boot.cat \
-no-emul-boot -boot-load-size 4 -boot-info-table \
-eltorito-alt-boot -efi-boot images/efiboot.img -no-emul-boot \
-o ${NGINX_DIRECTORY}/${node}.iso .
done
# Clean up the intermediate files
cd
rm -Rf ${TEMPDIR}
cd ${NGINX_DIRECTORY}
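The six per-node variable blocks in the script above differ only in node name and IP address, so they can be driven from a small table. The following is a minimal sketch of that idea: kernel_args is a hypothetical helper that builds the same kernel arguments modify_cfg appends, using the shared values from the script. It only prints the arguments and does not touch any ISO.

```shell
#!/usr/bin/env bash
# Shared settings, taken from the script above.
URL="http://192.168.128.30:8080"
GATEWAY="192.168.128.254"
NETMASK="255.255.255.0"
DNS="192.168.128.30"
NET_INTERFACE="ens192"
BIOSMODE="bios"

# Build the kernel arguments that modify_cfg appends for one node.
kernel_args() {
  local node=$1 ip=$2
  printf '%s ' \
    "coreos.inst.install_dev=sda" \
    "coreos.inst.image_url=${URL}/install/${BIOSMODE}.raw.gz" \
    "coreos.inst.ignition_url=${URL}/ignition/${node}.ign" \
    "ip=${ip}::${GATEWAY}:${NETMASK}:${node}:${NET_INTERFACE}:none:${DNS}"
  printf '%s\n' "nameserver=${DNS}"
}

# One "node ip" pair per line replaces the six copy-pasted blocks.
while read -r node ip; do
  echo "${node}: $(kernel_args "$node" "$ip")"
done <<EOF
bootstrap 192.168.128.31
master1 192.168.128.32
master2 192.168.128.33
master3 192.168.128.34
worker1 192.168.128.35
worker2 192.168.128.36
EOF
```

In the real script the loop body would set NODE, IP, and FQDN from the pair and call modify_cfg, which also removes the risk of mislabelled blocks.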
10. Installing RHCOS on the node machines
(1) Copy the customized ISO files to the VMware ESXi host in preparation for installing the nodes
[root@misc pkg]# scp bootstrap.iso [email protected]:/vmfs/volumes/hdd/iso
[root@misc pkg]# scp m*.iso [email protected]:/vmfs/volumes/hdd/iso
[root@misc pkg]# scp w*.iso [email protected]:/vmfs/volumes/hdd/iso
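(2) Create the masters as planned and set them to boot from the ISO
- At the boot screen, just select install; the system automatically downloads the BIOS image and the configuration files and completes the installation
- After the installation finishes, eject the ISO so the machine does not boot into the installer again
- The installation order is bootstrap, master1, master2, master3; install the workers only after the masters are installed and up
- During installation you can watch progress through the proxy at http://registry.ipincloud.com:9000/
- During installation you can follow detailed bootstrap progress on the misc node.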
openshift-install --dir=/data/install wait-for bootstrap-complete --log-level debug
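Notes:
- Make sure each node's Ignition file and ISO file match
- During my installation, master1 reported etcdmain: member ab84b6a6e4a3cc9a has already been bootstrapped, and it took me a long time to analyze and fix. After master1 is installed, its etcd component is installed automatically and registered as a member. When I reinstalled master1 from the ISO, the automatic etcd registration found that the cluster already had this member and could not register again, so etcd on this node never started properly. The fix:
Manually edit the etcd YAML file on the master1 node and append --initial-cluster-state=existing to the end of the exec etcd command, then delete the failing pod. The system reinstalls the etcd pod automatically and it returns to normal.
Once it has started normally, revert this change; otherwise machine-config will never finish.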
[root@master1 /]# vi /etc/kubernetes/manifests/etcd-member.yaml
exec etcd \
--initial-advertise-peer-urls=https://${ETCD_IPV4_ADDRESS}:2380 \
--cert-file=/etc/ssl/etcd/system:etcd-server:${ETCD_DNS_NAME}.crt \
--key-file=/etc/ssl/etcd/system:etcd-server:${ETCD_DNS_NAME}.key \
--trusted-ca-file=/etc/ssl/etcd/ca.crt \
--client-cert-auth=true \
--peer-cert-file=/etc/ssl/etcd/system:etcd-peer:${ETCD_DNS_NAME}.crt \
--peer-key-file=/etc/ssl/etcd/system:etcd-peer:${ETCD_DNS_NAME}.key \
--peer-trusted-ca-file=/etc/ssl/etcd/ca.crt \
--peer-client-cert-auth=true \
--advertise-client-urls=https://${ETCD_IPV4_ADDRESS}:2379 \
--listen-client-urls=https://0.0.0.0:2379 \
--listen-peer-urls=https://0.0.0.0:2380 \
--listen-metrics-urls=https://0.0.0.0:9978 \
--initial-cluster-state=existing
[root@master1 /]# crictl pods
POD ID CREATED STATE NAME NAMESPACE ATTEMPT
c4686dc3e5f4f 38 minutes ago Ready etcd-member-master1.ocptest.ipincloud.com openshift-etcd 5
[root@master1 /]# crictl rmp xxx
- Check whether the installation has completed
If INFO It is now safe to remove the bootstrap resources appears, the master nodes are installed and the control plane has moved to the master cluster.
[root@misc install]# openshift-install --dir=/data/install wait-for bootstrap-complete --log-level debug
DEBUG OpenShift Installer v4.3.0
DEBUG Built from commit 2055609f95b19322ee6cfdd0bea73399297c4a3e
INFO Waiting up to 30m0s for the Kubernetes API at https://api.ocptest.ipincloud.com:6443...
INFO API v1.16.2 up
INFO Waiting up to 30m0s for bootstrapping to complete...
DEBUG Bootstrap status: complete
INFO It is now safe to remove the bootstrap resources
[root@misc install]#
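(3) Install the workers
- At the boot screen, just select install; the system automatically downloads the BIOS image and the configuration files and completes the installation
- After the installation finishes, eject the ISO so the machine does not boot into the installer again
- The installation order is bootstrap, master1, master2, master3; install the workers only after the masters are installed and up
- During installation you can watch progress through the proxy at http://registry.ipincloud.com:9000/
- You can also follow detailed installation progress on the misc node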
[root@misc redhat-operators-manifests]# openshift-install --dir=/data/install wait-for install-complete --log-level debug
DEBUG OpenShift Installer v4.3.0
DEBUG Built from commit 2055609f95b19322ee6cfdd0bea73399297c4a3e
INFO Waiting up to 30m0s for the cluster at https://api.ocptest.ipincloud.com:6443 to initialize...
DEBUG Cluster is initialized
INFO Waiting up to 10m0s for the openshift-console route to be created...
DEBUG Route found in openshift-console namespace: console
DEBUG Route found in openshift-console namespace: downloads
DEBUG OpenShift console route is created
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
INFO Access the OpenShift web-console here:
https://console-openshift-console.apps.ocptest.ipincloud.com
INFO Login to the console with user: kubeadmin, password: pubmD-8Baaq-IX36r-WIWWf
- The workers' requests to join the cluster must be approved
List the CSRs awaiting approval
[root@misc ~]# oc get csr
NAME AGE REQUESTOR CONDITION
csr-7lln5 70m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-d48xk 69m system:node:master1.ocptest.ipincloud.com Approved,Issued
csr-f2g7r 69m system:node:master2.ocptest.ipincloud.com Approved,Issued
csr-gbn2n 69m system:node:master3.ocptest.ipincloud.com Approved,Issued
csr-hwxwx 13m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-ppgxx 13m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-wg874 70m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-zkp79 70m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
[root@misc ~]#
Approve them
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
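If jq is not at hand, the pending requests can also be picked out of the plain `oc get csr` table. A minimal sketch follows; pending_csrs is a hypothetical helper, the sample rows are taken from the listing above, and `xargs -r` in the commented usage is a GNU extension. Each worker typically raises a second request after the first one is approved, so the approval may need to be repeated.

```shell
#!/usr/bin/env bash
# Read `oc get csr` table output on stdin and print the names still Pending.
pending_csrs() {
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Sample rows copied from the listing above.
pending_csrs <<'EOF'
NAME AGE REQUESTOR CONDITION
csr-gbn2n 69m system:node:master3.ocptest.ipincloud.com Approved,Issued
csr-hwxwx 13m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-ppgxx 13m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
EOF

# Intended use (not run here):
# oc get csr | pending_csrs | xargs -r oc adm certificate approve
```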
(4) Start NFS on misc
bash /data/pkg/ocp4-upi-helpernode/files/nfs-provisioner-setup.sh
#Check the status
oc get pods -n nfs-provisioner
(5) Use NFS as storage for the OCP internal registry
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"storage":{"pvc":{"claim":""}}}}' --type=merge
oc get clusteroperator image-registry
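11. Configuring login
(1) Configure a regular administrator account
#Create the admin credentials file on the misc machine
mkdir -p ~/auth
htpasswd -bBc ~/auth/admin-passwd admin scwang18
#Copy it to the local machine
mkdir -p ~/auth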
scp -P 20030 [email protected]:/root/auth/admin-passwd ~/auth/
#On the OAuth Details page, add an Identity Provider of type HTPasswd and upload the admin-passwd file.
https://console-openshift-console.apps.ocptest.ipincloud.com
#Grant the new admin user cluster administrator rights
oc adm policy add-cluster-role-to-user cluster-admin admin
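The console steps above can also be expressed on the command line, which is handy when the cluster is rebuilt often. The following is a minimal sketch: render_oauth_cr is a hypothetical helper that prints an OAuth custom resource for an HTPasswd identity provider, and the secret name htpass-secret and provider name local_htpasswd are illustrative choices, not values from this setup.

```shell
#!/usr/bin/env bash
# Print an OAuth custom resource that enables an HTPasswd identity provider.
# Assumption: the secret name and provider name passed in are illustrative.
render_oauth_cr() {
  local secret_name=$1 provider_name=$2
  cat <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: ${provider_name}
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: ${secret_name}
EOF
}

render_oauth_cr htpass-secret local_htpasswd

# Intended use (not run here):
# oc create secret generic htpass-secret \
#   --from-file=htpasswd="$HOME/auth/admin-passwd" -n openshift-config
# render_oauth_cr htpass-secret local_htpasswd | oc apply -f -
```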