Quick Installation and Configuration of OpenShift 3.11 on AWS RHEL/CentOS 7

Introduction to OpenShift

Microservice architectures are increasingly widely used, and Docker and Kubernetes are indispensable technologies for them. Red Hat OpenShift 3 is a container application platform built on top of Docker and Kubernetes for developing and deploying enterprise applications.

OpenShift Editions

OpenShift Dedicated (Enterprise)

  • Private, high-availability OpenShift clusters hosted on Amazon Web Services or Google Cloud Platform
  • Delivered as a hosted service and supported by Red Hat

OpenShift Container Platform (Enterprise)

  • Across cloud and on-premise infrastructure
  • Customizable, with full administrative control

OKD
The open source community edition of OpenShift (Origin Community Distribution of Kubernetes)

OpenShift Architecture


  • Master Node components: the API Server (handles client requests from nodes, users, administrators, and other infrastructure systems), the Controller Manager Server (which includes the scheduler and the replication controller), and the OpenShift client tool (oc)
  • Compute Node (Application Node): runs applications
  • Infra Node: runs the router, the image registry, and other infrastructure services (system applications installed by the administrator)
  • etcd: can be deployed on the Master Nodes or on separate hosts; stores shared data such as master state and image, build, and deployment metadata
  • Pod: the smallest Kubernetes object; runs one or more containers

Installation Plan

Software Environment

  • AWS RHEL 7.5/CentOS 7.6
  • OKD 3.11
  • Ansible 2.7
  • Docker 1.13.1
  • Kubernetes 1.11

OpenShift is installed with Ansible; only some node information and parameters need to be configured to complete the cluster installation, which greatly speeds up the process.

This document also applies to CentOS 7.
On CentOS 7, NetworkManager must be installed:

# yum -y install NetworkManager
# systemctl start NetworkManager

On CentOS 7 you must also edit /etc/sysconfig/network-scripts/ifcfg-eth0 and add NM_CONTROLLED=yes, otherwise the ServiceMonitor cannot be installed successfully (note that this parameter is lost after launching an instance from an image and must be configured again).
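
For example, the parameter can be appended to the interface configuration like this (assuming the primary interface is eth0, as on the EC2 instances used here):

# echo 'NM_CONTROLLED=yes' >> /etc/sysconfig/network-scripts/ifcfg-eth0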

After OpenShift is installed, the yum repository CentOS-OpenShift-Origin311.repo is automatically added on every node, with the following content:

[centos-openshift-origin311]
name=CentOS OpenShift Origin
baseurl=http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin311/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-SIG-PaaS

[centos-openshift-origin311-testing]
name=CentOS OpenShift Origin Testing
baseurl=http://buildlogs.centos.org/centos/7/paas/x86_64/openshift-origin311/
enabled=0
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-SIG-PaaS

[centos-openshift-origin311-debuginfo]
name=CentOS OpenShift Origin DebugInfo
baseurl=http://debuginfo.centos.org/centos/7/paas/x86_64/
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-SIG-PaaS

[centos-openshift-origin311-source]
name=CentOS OpenShift Origin Source
baseurl=http://vault.centos.org/centos/7/paas/Source/openshift-origin311/
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-SIG-PaaS

To speed up the installation and reduce the chance of failures, it is recommended to set up a private yum repository and a private docker registry and fetch the required resources in advance. CentOS 7 needs the base, updates, and extras yum repositories; on RHEL you must enable rhui-REGION-rhel-server-extras in redhat-rhui.repo.
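
On RHEL, for example, the extras repository can be enabled with yum-config-manager from yum-utils (a minimal sketch):

# yum-config-manager --enable rhui-REGION-rhel-server-extras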

Docker images required for the base installation:

docker.io/ansibleplaybookbundle/origin-ansible-service-broker   latest              530 MB
docker.io/cockpit/kubernetes                         latest              336 MB
docker.io/openshift/origin-node                      v3.11.0             1.17 GB
docker.io/openshift/origin-control-plane             v3.11.0             826 MB
docker.io/openshift/origin-deployer                  v3.11.0             381 MB
docker.io/openshift/origin-template-service-broker   v3.11.0             332 MB
docker.io/openshift/origin-pod                       v3.11.0             258 MB
docker.io/openshift/origin-console                   v3.11.0             264 MB
docker.io/openshift/origin-service-catalog           v3.11.0             330 MB
docker.io/openshift/origin-web-console               v3.11.0             339 MB
docker.io/openshift/origin-haproxy-router            v3.11.0             407 MB
docker.io/openshift/origin-docker-registry           v3.11.0             310 MB
docker.io/openshift/prometheus                       v2.3.2              316 MB
docker.io/openshift/prometheus-alertmanager          v0.15.2             233 MB
docker.io/openshift/prometheus-node-exporter         v0.16.0             216 MB
docker.io/grafana/grafana                            5.2.1               245 MB
quay.io/coreos/cluster-monitoring-operator           v0.1.1              510 MB
quay.io/coreos/configmap-reload                      v0.0.1              4.79 MB
quay.io/coreos/etcd                                  v3.2.22             37.3 MB
quay.io/coreos/kube-rbac-proxy                       v0.3.1              40.2 MB
quay.io/coreos/prometheus-config-reloader            v0.23.2             12.2 MB
quay.io/coreos/prometheus-operator                   v0.23.2             47 MB

GlusterFS:

docker.io/gluster/gluster-centos                      latest              395 MB
docker.io/gluster/glusterblock-provisioner            latest              230 MB
docker.io/heketi/heketi                               latest              386 MB

Metrics:

docker.io/openshift/origin-metrics-schema-installer   v3.11.0           551 MB
docker.io/openshift/origin-metrics-hawkular-metrics   v3.11.0           860 MB
docker.io/openshift/origin-metrics-heapster    v3.11.0             710 MB
docker.io/openshift/origin-metrics-cassandra   v3.11.0             590 MB
quay.io/coreos/kube-state-metrics              v1.3.1              22.2 MB

Logging:

docker.io/openshift/origin-logging-elasticsearch5   v3.11.0             450 MB
docker.io/openshift/origin-logging-fluentd          v3.11.0             486 MB
docker.io/openshift/origin-logging-kibana5          v3.11.0             475 MB
docker.io/openshift/origin-logging-curator5         v3.11.0             272 MB
docker.io/openshift/oauth-proxy                     v1.1.0              235 MB

For how to create a private yum repository and a private docker registry, see the articles Yum Repository詳解 and Docker學習筆記--CLI和Registry.

AWS Linux is currently not supported by OpenShift.

Hardware Requirements

Masters

  • Minimum 4 vCPU
  • Minimum 16 GB RAM
  • Minimum 40 GB of disk space for /var/
  • Minimum 1 GB of disk space for /usr/local/bin/
  • Minimum 1 GB of disk space for the temporary directory

Nodes

  • 1 vCPU
  • Minimum 8 GB RAM
  • Minimum 15 GB of disk space for /var/
  • Minimum 1 GB of disk space for /usr/local/bin/
  • Minimum 1 GB of disk space for the temporary directory

Storage

  • /var/lib/openshift Less than 10GB
  • /var/lib/etcd Less than 20 GB
  • /var/lib/docker 50GB
  • /var/lib/containers 50GB
  • /var/log 10 to 30 GB

Installation Types

                     RPM-based Installations            System Container Installations
Delivery Mechanism   RPM packages using yum             System container images using docker
Service Management   systemd                            docker and systemd units
Operating System     Red Hat Enterprise Linux (RHEL)    RHEL Atomic Host

RPM installations install and configure services through the package manager, while system container installations use system container images, with each service running in its own container.
Starting with OKD 3.10, the RPM method is used to install the OKD components on Red Hat Enterprise Linux (RHEL), and the system container method is used on RHEL Atomic Host. Both installation types provide the same functionality; the choice depends on the operating system and on how you want to manage services and upgrade the system.

This article uses the RPM installation method.

Node ConfigMaps

ConfigMaps define the node configuration; since OKD 3.10 the openshift_node_labels value is ignored. The following ConfigMaps are created by default:

  • node-config-master
  • node-config-infra
  • node-config-compute
  • node-config-all-in-one
  • node-config-master-infra
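
After the cluster is installed, these ConfigMaps can be listed to confirm that they exist (a sketch; in OKD 3.10/3.11 they are created in the openshift-node project):

$ oc get configmaps -n openshift-node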

The default configuration is as follows (see openshift-ansible/roles/openshift_facts/defaults/main.yml):

openshift_node_groups:
  - name: node-config-master
    labels:
      - 'node-role.kubernetes.io/master=true'
    edits: []
  - name: node-config-master-crio
    labels:
      - 'node-role.kubernetes.io/master=true'
      - "{{ openshift_crio_docker_gc_node_selector | lib_utils_oo_dict_to_keqv_list | join(',') }}"
    edits: "{{ openshift_node_group_edits_crio }}"
  - name: node-config-infra
    labels:
      - 'node-role.kubernetes.io/infra=true'
    edits: []
  - name: node-config-infra-crio
    labels:
      - 'node-role.kubernetes.io/infra=true'
      - "{{ openshift_crio_docker_gc_node_selector | lib_utils_oo_dict_to_keqv_list | join(',') }}"
    edits: "{{ openshift_node_group_edits_crio }}"
  - name: node-config-compute
    labels:
      - 'node-role.kubernetes.io/compute=true'
    edits: []
  - name: node-config-compute-crio
    labels:
      - 'node-role.kubernetes.io/compute=true'
      - "{{ openshift_crio_docker_gc_node_selector | lib_utils_oo_dict_to_keqv_list | join(',') }}"
    edits: "{{ openshift_node_group_edits_crio }}"
  - name: node-config-master-infra
    labels:
      - 'node-role.kubernetes.io/master=true'
      - 'node-role.kubernetes.io/infra=true'
    edits: []
  - name: node-config-master-infra-crio
    labels:
      - 'node-role.kubernetes.io/master=true'
      - 'node-role.kubernetes.io/infra=true'
      - "{{ openshift_crio_docker_gc_node_selector | lib_utils_oo_dict_to_keqv_list | join(',') }}"
    edits: "{{ openshift_node_group_edits_crio }}"
  - name: node-config-all-in-one
    labels:
      - 'node-role.kubernetes.io/master=true'
      - 'node-role.kubernetes.io/infra=true'
      - 'node-role.kubernetes.io/compute=true'
    edits: []
  - name: node-config-all-in-one-crio
    labels:
      - 'node-role.kubernetes.io/master=true'
      - 'node-role.kubernetes.io/infra=true'
      - 'node-role.kubernetes.io/compute=true'
      - "{{ openshift_crio_docker_gc_node_selector | lib_utils_oo_dict_to_keqv_list | join(',') }}"
    edits: "{{ openshift_node_group_edits_crio }}"

For the cluster installation we use node-config-master, node-config-infra, and node-config-compute.

Environment Scenarios

  • One Master, one Compute, and one Infra Node, with etcd deployed on the master
  • Three Masters, three Compute, and three Infra Nodes, with etcd deployed on the masters

To get familiar with the OpenShift installation quickly, we start with the first scenario and install the second one after it succeeds. Ansible usually runs on a separate machine, so the two scenarios require 4 and 10 EC2 instances respectively.

Preparation

Update the system

# yum update

Check SELinux

Check /etc/selinux/config and make sure it contains:

SELINUX=enforcing
SELINUXTYPE=targeted

Configure DNS

To use cleaner names, create an additional DNS server and configure proper domain names for the EC2 instances, for example:

master1.itrunner.org    A   10.64.33.100
master2.itrunner.org    A   10.64.33.103
node1.itrunner.org      A   10.64.33.101
node2.itrunner.org      A   10.64.33.102

The EC2 instances need to be configured to use this DNS server; create the dhclient.conf file:

# vi /etc/dhcp/dhclient.conf

Add the following content:

supersede domain-name-servers 10.164.18.18;

A reboot is required for the change to take effect; after rebooting, /etc/resolv.conf looks like this:

# Generated by NetworkManager
search cn-north-1.compute.internal
nameserver 10.164.18.18

OKD uses dnsmasq. After a successful installation all nodes are configured automatically: /etc/resolv.conf is modified and the nameserver becomes the node's own IP. Pods use their node as the DNS server, and the node forwards the requests.

# nameserver updated by /etc/NetworkManager/dispatcher.d/99-origin-dns.sh
# Generated by NetworkManager
search cluster.local cn-north-1.compute.internal itrunner.org
nameserver 10.64.33.100

Configure the hostname

hostnamectl set-hostname --static master1.itrunner.org

Edit /etc/cloud/cloud.cfg and add the following line at the bottom:

preserve_hostname: true

Install base packages

These must be installed on all nodes. The official documentation specifies:

# yum install wget git net-tools bind-utils yum-utils iptables-services bridge-utils bash-completion kexec-tools sos psacct

The openshift-ansible source (roles/openshift_node/defaults/main.yml -> default_r_openshift_node_image_prep_packages) lists the RPM packages installed on nodes by default; merged together they are:

# yum install wget git net-tools bind-utils yum-utils iptables-services bridge-utils bash-completion kexec-tools sos psacct \
dnsmasq ntp logrotate httpd-tools firewalld libselinux-python conntrack-tools openssl iproute python-dbus PyYAML \
glusterfs-fuse device-mapper-multipath nfs-utils iscsi-initiator-utils ceph-common atomic python-docker-py

Install Docker

Docker must be installed on all nodes. The version must be 1.13.1; the official Docker packages cannot be used.
overlay2 is the recommended Docker storage driver because it performs better. If you use Device Mapper, use Device Mapper Thin Provisioning; do not use Device Mapper loop-lvm, which causes performance problems.
To keep log sizes under control, set the logging options.

overlay2
On RHEL/CentOS 7 the default Docker storage driver is overlay2.

Installation script:

#!/bin/bash

# Remove any previously installed docker
#yum -y remove docker docker-client docker-common container-selinux
#rm -rf /var/lib/docker/*

# Install docker
yum -y install docker

# Configure logging
sed -i "4c OPTIONS='--selinux-enabled --signature-verification=false --log-opt max-size=1M --log-opt max-file=3'" /etc/sysconfig/docker

# Configure registry-mirrors
cat <<EOF > /etc/docker/daemon.json
{
  "registry-mirrors": ["https://registry.itrunner.org"]
}
EOF

# Start docker
systemctl enable docker
systemctl start docker
systemctl is-active docker

Verify the installation:

# docker info | egrep -i 'storage|pool|space|filesystem'
Storage Driver: overlay2 
 Backing Filesystem: xfs

Device Mapper Thin Provisioning
It is recommended to attach a dedicated disk of at least 40 GB for docker (on AWS simply attach the volume; no other action is needed).
Installation script:

#!/bin/bash

# Remove any previously installed docker and its configuration
#lvremove -f /dev/docker-vg/docker-pool
#vgremove docker-vg
#pvremove /dev/xvdf1
#wipefs -af /dev/xvdf

#yum -y remove docker docker-client docker-common container-selinux
#rm -rf /var/lib/docker/*

# Install docker
yum -y install docker

# Configure logging
sed -i "4c OPTIONS='--selinux-enabled --signature-verification=false --log-opt max-size=1M --log-opt max-file=3'" /etc/sysconfig/docker

# Configure registry-mirrors
cat <<EOF > /etc/docker/daemon.json
{
  "registry-mirrors": ["https://registry.itrunner.org"]
}
EOF

# Configure docker storage
cat <<EOF > /etc/sysconfig/docker-storage-setup
DEVS=/dev/xvdf
VG=docker-vg
EOF

docker-storage-setup

# Start docker
systemctl enable docker
systemctl start docker
systemctl is-active docker

On success the output looks like this:

...
Complete!
INFO: Volume group backing root filesystem could not be determined
INFO: Writing zeros to first 4MB of device /dev/xvdf
4+0 records in
4+0 records out
4194304 bytes (4.2 MB) copied, 0.0124923 s, 336 MB/s
INFO: Device node /dev/xvdf1 exists.
  Physical volume "/dev/xvdf1" successfully created.
  Volume group "docker-vg" successfully created
  Rounding up size to full physical extent 24.00 MiB
  Thin pool volume with chunk size 512.00 KiB can address at most 126.50 TiB of data.
  Logical volume "docker-pool" created.
  Logical volume docker-vg/docker-pool changed.
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
active

Verify the installation:

# cat /etc/sysconfig/docker-storage
DOCKER_STORAGE_OPTIONS="--storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/docker--vg-docker--pool --storage-opt dm.use_deferred_removal=true --storage-opt dm.use_deferred_deletion=true "
# lsblk
NAME                              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda                              202:0    0  100G  0 disk
└─xvda1                           202:1    0  100G  0 part /
xvdf                              202:80   0   20G  0 disk
└─xvdf1                           202:81   0   20G  0 part
  ├─docker--vg-docker--pool_tmeta 253:0    0   24M  0 lvm
  │ └─docker--vg-docker--pool     253:2    0    8G  0 lvm
  └─docker--vg-docker--pool_tdata 253:1    0    8G  0 lvm
    └─docker--vg-docker--pool     253:2    0    8G  0 lvm
# docker info | egrep -i 'storage|pool|space|filesystem'
Storage Driver: devicemapper 
 Pool Name: docker_vg-docker--pool 
 Pool Blocksize: 524.3 kB
 Backing Filesystem: xfs
 Data Space Used: 62.39 MB
 Data Space Total: 6.434 GB
 Data Space Available: 6.372 GB
 Metadata Space Used: 40.96 kB
 Metadata Space Total: 16.78 MB
 Metadata Space Available: 16.74 MB

By default the thin pool is configured to use 40% of the disk capacity and is automatically extended up to 100% as it fills.
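
To check how full the thin pool is at any time, lvs can report the data and metadata usage of the pool created above (a minimal sketch):

# lvs -o lv_name,vg_name,lv_size,data_percent,metadata_percent docker-vg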

Install Ansible

Ansible must be installed not only on the Ansible host but on all nodes as well. Install it from the EPEL repository:

# yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
# sed -i -e "s/^enabled=1/enabled=0/" /etc/yum.repos.d/epel.repo
# yum -y --enablerepo=epel install ansible pyOpenSSL

Ansible must be able to reach every other machine to complete the installation, so passwordless SSH login has to be configured. You can generate a new key pair with ssh-keygen, or, if you use the ec2-user key, export an OpenSSH key with PuTTYgen, copy the private key to the ec2-user ~/.ssh directory, rename it to the default name id_rsa, and set the permissions:

$ cd .ssh/
$ chmod 600 *

After the configuration, test the connection to each host:

ssh master1.itrunner.org

If you log in with a password or with a passphrase-protected key, use keychain.
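
A minimal sketch of using keychain, assuming the key is ~/.ssh/id_rsa and keychain is installed from EPEL:

# yum -y --enablerepo=epel install keychain
$ keychain ~/.ssh/id_rsa                 # prompts for the passphrase once and starts ssh-agent
$ source ~/.keychain/$(hostname)-sh      # export the agent variables into the current shell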

Configure Security Groups

Security Group          Port
All OKD Hosts           tcp/22 from host running the installer/Ansible
etcd Security Group     tcp/2379 from masters, tcp/2380 from etcd hosts
Master Security Group   tcp/8443 from 0.0.0.0/0, tcp/53 from all OKD hosts, udp/53 from all OKD hosts, tcp/8053 from all OKD hosts, udp/8053 from all OKD hosts
Node Security Group     tcp/10250 from masters, udp/4789 from nodes, tcp/8444 from nodes, tcp/1936 from nodes
Infrastructure Nodes    tcp/443 from 0.0.0.0/0, tcp/80 from 0.0.0.0/0

tcp/8444: Port that the controller service listens on. Required to be open for the /metrics and /healthz endpoints.

kube-service-catalog: tcp/6443, 39930, 41382, 45536
HAProxy: tcp/9000
NFS: tcp/udp 2049

GlusterFS; the official documentation recommends:

For the Gluster to communicate within a cluster either the firewalls have to be turned off or enable communication for each server.
    iptables -I INPUT -p all -s `<ip-address>` -j ACCEPT

Or open specific ports:
tcp/3260
tcp/2222 - sshd
tcp/111 - portmapper
tcp/24007 - Gluster Daemon
tcp/24008 - Management
tcp/24010 - gluster-blockd
tcp/49152 and greater - glusterfsd; each brick needs its own port, allocated incrementally from 49152, so configure a sufficiently large range.

openshift-logging: tcp/9200 and tcp/9300 must be open between the Infra Nodes

openshift-metrics: tcp/7000, tcp/7001, tcp/7575, tcp/9042, tcp/9160
openshift-monitoring: tcp/3000, tcp/6783, tcp/9090, tcp/9091, tcp/9093, tcp/9094, tcp/9100, tcp/9443
Docker Registry: tcp/5000, tcp/9000

Configure ELBs

ELBs need to be configured for the second scenario.
When external ELBs are used, the inventory file does not define an lb group; instead the three parameters openshift_master_cluster_hostname, openshift_master_cluster_public_hostname, and openshift_master_default_subdomain must be specified (see the later sections).
openshift_master_cluster_hostname and openshift_master_cluster_public_hostname load balance the masters; their ELBs point at the Master Nodes. openshift_master_cluster_hostname is for internal use, while openshift_master_cluster_public_hostname is for external access (the Web Console). The two may be set to the same domain name, but the ELB used by openshift_master_cluster_hostname must be configured as passthrough.
For security, in production openshift_master_cluster_hostname and openshift_master_cluster_public_hostname should be set to two different domain names.
openshift_master_default_subdomain defines the domain under which applications deployed on OpenShift are exposed; its ELB points at the Infra Nodes.
Three ELBs therefore need to be created, configured with the default openshift-ansible ports:

  • openshift_master_cluster_hostname: a Network Load Balancer must be created, protocol TCP, port 8443, with IP-type targets.
  • openshift_master_cluster_public_hostname: an ALB, protocol HTTPS on port 8443 and HTTP on port 80.
  • openshift_master_default_subdomain: an ALB, protocol HTTPS on port 443 and HTTP on ports 80 and 8080.

For convenience, openshift_master_cluster_public_hostname and openshift_master_default_subdomain are usually set to the company's own domain names rather than to the DNS names of the AWS ELBs directly.
Note: use ALBs here. A Classic Load Balancer does not support the wss protocol, so logs could not be viewed and the terminal could not be used in the web console.
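
As an illustration, the internal master NLB with IP-type targets could be created roughly as follows with the AWS CLI (a sketch only: the names, VPC and subnet IDs, and ARNs are placeholders, target registration is omitted, and the two ALBs are created analogously):

$ aws elbv2 create-target-group --name openshift-master-internal --protocol TCP --port 8443 \
      --target-type ip --vpc-id vpc-0123456789abcdef0
$ aws elbv2 create-load-balancer --name openshift-master-internal --type network --scheme internal \
      --subnets subnet-aaa111 subnet-bbb222 subnet-ccc333
$ aws elbv2 create-listener --load-balancer-arn <nlb-arn> --protocol TCP --port 8443 \
      --default-actions Type=forward,TargetGroupArn=<target-group-arn>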

Install OpenShift

Download openshift-ansible

$ cd ~
$ git clone https://github.com/openshift/openshift-ansible
$ cd openshift-ansible
$ git checkout release-3.11

To use a custom CentOS-OpenShift-Origin repository, edit ~/openshift-ansible/roles/openshift_repos/templates/CentOS-OpenShift-Origin311.repo.j2 and replace the baseurl of centos-openshift-origin311, for example:

[centos-openshift-origin311]
name=CentOS OpenShift Origin
baseurl=http://10.188.12.119/centos/7/paas/x86_64/openshift-origin311/

Installing openshift-ansible with yum on CentOS is not recommended: the code is not quite the same, it has old dependencies and syntax, and it is inconvenient to update when bugs appear. If you do use it, install Ansible 2.6.
To install openshift-ansible with yum:

# yum -y install centos-release-openshift-origin311
# yum -y install openshift-ansible

On CentOS you must also edit /usr/share/ansible/openshift-ansible/playbooks/init/base_packages.yml and replace python-docker with python-docker-py.

Create the initial users

We use password authentication to log in to OpenShift and create two initial users, admin and developer:

# yum install -y httpd-tools
# htpasswd -c /home/ec2-user/htpasswd admin
# htpasswd /home/ec2-user/htpasswd developer

In the inventory file in the next section, the initial users can be configured in either of two ways, with openshift_master_htpasswd_users or openshift_master_htpasswd_file, as follows:

# uncomment the following to enable htpasswd authentication; defaults to DenyAllPasswordIdentityProvider
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
# Defining htpasswd users
#openshift_master_htpasswd_users={'admin': '$apr1$qriH3ihA$LLxkL.EAH5Ntv3a4036nl/', 'developer': '$apr1$SkmCPrCP$Yn1JMxDwHzPOdYl9iPax80'}
# or
#openshift_master_htpasswd_file=/home/ec2-user/htpasswd

After OpenShift is installed, the passwords are stored in /etc/origin/master/htpasswd on the masters.
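
To add another user or change a password after the installation, you can therefore run htpasswd directly against that file on each master (a sketch; newuser is a placeholder):

# htpasswd /etc/origin/master/htpasswd newuser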

Configure the inventory file

The inventory file defines the hosts and configuration; the default file is /etc/ansible/hosts.
Scenario 1
One master, one compute, and one infra node, with etcd deployed on the master.

# Create an OSEv3 group that contains the masters, nodes, and etcd groups
[OSEv3:children]
masters
nodes
etcd

# Set variables common for all OSEv3 hosts
[OSEv3:vars]
# SSH user, this user should allow ssh based auth without requiring a password
ansible_ssh_user=ec2-user

# If ansible_ssh_user is not root, ansible_become must be set to true
ansible_become=true

openshift_deployment_type=origin
openshift_disable_check=disk_availability,docker_storage,memory_availability,docker_image_availability

# uncomment the following to enable htpasswd authentication; defaults to DenyAllPasswordIdentityProvider
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
# Defining htpasswd users
#openshift_master_htpasswd_users={'user1': '<pre-hashed password>', 'user2': '<pre-hashed password>'}
# or
#openshift_master_htpasswd_file=<path to local pre-generated htpasswd file>

# host group for masters
[masters]
master1.itrunner.org

# host group for etcd
[etcd]
master1.itrunner.org

# host group for nodes, includes region info
[nodes]
master1.itrunner.org openshift_node_group_name='node-config-master'
compute1.itrunner.org openshift_node_group_name='node-config-compute'
infra1.itrunner.org openshift_node_group_name='node-config-infra'

Scenario 2
Three masters, three compute, and three infra nodes. In non-production environments the load balancer can be HAProxy instead of an external ELB, and etcd can be deployed separately or co-located with the masters.

  1. Multiple Masters Using Native HA with External Clustered etcd
# Create an OSEv3 group that contains the master, nodes, etcd, and lb groups.
# The lb group lets Ansible configure HAProxy as the load balancing solution.
# Comment lb out if your load balancer is pre-configured.
[OSEv3:children]
masters
nodes
etcd
lb

# Set variables common for all OSEv3 hosts
[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=origin

# uncomment the following to enable htpasswd authentication; defaults to DenyAllPasswordIdentityProvider
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
# Defining htpasswd users
#openshift_master_htpasswd_users={'user1': '<pre-hashed password>', 'user2': '<pre-hashed password>'}
# or
#openshift_master_htpasswd_file=<path to local pre-generated htpasswd file>

# Native high availbility cluster method with optional load balancer.
# If no lb group is defined installer assumes that a load balancer has
# been preconfigured. For installation the value of
# openshift_master_cluster_hostname must resolve to the load balancer
# or to one or all of the masters defined in the inventory if no load
# balancer is present.
openshift_master_cluster_method=native
openshift_master_cluster_hostname=openshift-internal.example.com
openshift_master_cluster_public_hostname=openshift-cluster.example.com

# apply updated node defaults
openshift_node_kubelet_args={'pods-per-core': ['10'], 'max-pods': ['250'], 'image-gc-high-threshold': ['90'], 'image-gc-low-threshold': ['80']}

# enable ntp on masters to ensure proper failover
openshift_clock_enabled=true

# host group for masters
[masters]
master[1:3].example.com

# host group for etcd
[etcd]
etcd1.example.com
etcd2.example.com
etcd3.example.com

# Specify load balancer host
[lb]
lb.example.com

# host group for nodes, includes region info
[nodes]
master[1:3].example.com openshift_node_group_name='node-config-master'
node[1:3].example.com openshift_node_group_name='node-config-compute'
infra-node[1:3].example.com openshift_node_group_name='node-config-infra'

  2. Multiple Masters Using Native HA with Co-located Clustered etcd
# Create an OSEv3 group that contains the master, nodes, etcd, and lb groups.
# The lb group lets Ansible configure HAProxy as the load balancing solution.
# Comment lb out if your load balancer is pre-configured.
[OSEv3:children]
masters
nodes
etcd
lb

# Set variables common for all OSEv3 hosts
[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=origin

# uncomment the following to enable htpasswd authentication; defaults to DenyAllPasswordIdentityProvider
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
# Defining htpasswd users
#openshift_master_htpasswd_users={'user1': '<pre-hashed password>', 'user2': '<pre-hashed password>'}
# or
#openshift_master_htpasswd_file=<path to local pre-generated htpasswd file>

# Native high availability cluster method with optional load balancer.
# If no lb group is defined installer assumes that a load balancer has
# been preconfigured. For installation the value of
# openshift_master_cluster_hostname must resolve to the load balancer
# or to one or all of the masters defined in the inventory if no load
# balancer is present.
openshift_master_cluster_method=native
openshift_master_cluster_hostname=openshift-internal.example.com
openshift_master_cluster_public_hostname=openshift-cluster.example.com

# host group for masters
[masters]
master[1:3].example.com

# host group for etcd
[etcd]
master1.example.com
master2.example.com
master3.example.com

# Specify load balancer host
[lb]
lb.example.com

# host group for nodes, includes region info
[nodes]
master[1:3].example.com openshift_node_group_name='node-config-master'
node[1:3].example.com openshift_node_group_name='node-config-compute'
infra-node[1:3].example.com openshift_node_group_name='node-config-infra'

  3. ELB Load Balancer

With an external ELB, openshift_master_cluster_hostname, openshift_master_cluster_public_hostname, and openshift_master_default_subdomain must be specified and no lb group is defined.

# Create an OSEv3 group that contains the master, nodes, etcd, and lb groups.
# The lb group lets Ansible configure HAProxy as the load balancing solution.
# Comment lb out if your load balancer is pre-configured.
[OSEv3:children]
masters
nodes
etcd
# Since we are providing a pre-configured LB VIP, no need for this group
#lb

# Set variables common for all OSEv3 hosts
[OSEv3:vars]
# SSH user, this user should allow ssh based auth without requiring a password
ansible_ssh_user=ec2-user

# If ansible_ssh_user is not root, ansible_become must be set to true
ansible_become=true

openshift_deployment_type=origin
openshift_disable_check=disk_availability,docker_storage,memory_availability,docker_image_availability

# uncomment the following to enable htpasswd authentication; defaults to DenyAllPasswordIdentityProvider
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
# Defining htpasswd users
#openshift_master_htpasswd_users={'user1': '<pre-hashed password>', 'user2': '<pre-hashed password>'}
# or
#openshift_master_htpasswd_file=<path to local pre-generated htpasswd file>

# Native high availability cluster method with optional load balancer.
# If no lb group is defined installer assumes that a load balancer has
# been preconfigured. For installation the value of
# openshift_master_cluster_hostname must resolve to the load balancer
# or to one or all of the masters defined in the inventory if no load
# balancer is present.
openshift_master_cluster_method=native
openshift_master_cluster_hostname=openshift-master-internal-123456b57ac7be6c.elb.cn-north-1.amazonaws.com.cn
openshift_master_cluster_public_hostname=openshift.itrunner.org
openshift_master_default_subdomain=apps.itrunner.org
#openshift_master_api_port=443
#openshift_master_console_port=443

# host group for masters
[masters]
master[1:3].itrunner.org

# host group for etcd
[etcd]
master1.itrunner.org
master2.itrunner.org
master3.itrunner.org

# Since we are providing a pre-configured LB VIP, no need for this group
#[lb]
#lb.itrunner.org

# host group for nodes, includes region info
[nodes]
master[1:3].itrunner.org openshift_node_group_name='node-config-master'
app[1:3].itrunner.org openshift_node_group_name='node-config-compute'
infra[1:3].itrunner.org openshift_node_group_name='node-config-infra'

Install and Uninstall OpenShift

Install OpenShift
With everything ready, installing OpenShift with Ansible is very simple; just run the two playbooks prerequisites.yml and deploy_cluster.yml.

$ ansible-playbook ~/openshift-ansible/playbooks/prerequisites.yml
$ ansible-playbook ~/openshift-ansible/playbooks/deploy_cluster.yml

If you do not use the default inventory file, specify its location with -i:

$ ansible-playbook [-i /path/to/inventory] ~/openshift-ansible/playbooks/prerequisites.yml
$ ansible-playbook [-i /path/to/inventory] ~/openshift-ansible/playbooks/deploy_cluster.yml

Both steps can be run repeatedly.

When the deployment fails, besides reading the error log you can also run the following command to find the problematic pods:

oc get pod --all-namespaces -o wide
NAMESPACE     NAME                                      READY STATUS             RESTARTS   AGE  IP              NODE                   NOMINATED NODE
kube-system   master-api-master1.itrunner.org           0/1   CrashLoopBackOff   1          24m  10.188.21.101   master1.itrunner.org   <none>
kube-system   master-api-master2.itrunner.org           1/1   Running            0          3h   10.188.21.102   master2.itrunner.org   <none>
kube-system   master-api-master3.itrunner.org           1/1   Running            0          3h   10.188.21.103   master3.itrunner.org   <none>
kube-system   master-controllers-master1.itrunner.org   0/1   Error              1          24m  10.188.21.101   master1.itrunner.org

After fixing the problem according to the error message, first try the retry file:

$ ansible-playbook --limit @/home/centos/openshift-ansible/playbooks/deploy_cluster.retry ~/openshift-ansible/playbooks/prerequisites.yml

Then run the playbook mentioned in the error message, and finally re-run deploy_cluster.yml.

In addition, you can try clearing the contents of the following directories on the nodes: /root/.ansible_async/, /root/.ansible, /root/openshift_bootstrap/, /home/centos/.ansible/, and modifying or deleting the files .kube/config and /etc/ansible/facts.d/openshift.fact.

If the deployment seems to hang for a long time, it is usually because no local yum or docker repository is being used and the installer is downloading RPMs or images; run journalctl -f on each node to check the logs and find the cause. The logs also show that in this situation the installation processes on the nodes may still be running even after the deploy_cluster.yml process has stopped.
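
For example, following the docker and origin-node units on a node shows whether RPMs or images are still being pulled (a sketch; the unit names assume the RPM-based install used in this article):

# journalctl -f -u docker -u origin-node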

A successful prerequisites.yml run ends with output like:

PLAY RECAP
*******************************************************************************************
localhost                  : ok=11   changed=0    unreachable=0    failed=0
app1.itrunner.org : ok=59   changed=12   unreachable=0    failed=0
app2.itrunner.org : ok=59   changed=12   unreachable=0    failed=0
app3.itrunner.org : ok=59   changed=12   unreachable=0    failed=0
infra1.itrunner.org : ok=59   changed=12   unreachable=0    failed=0
infra2.itrunner.org : ok=59   changed=12   unreachable=0    failed=0
infra3.itrunner.org : ok=59   changed=12   unreachable=0    failed=0
master1.itrunner.org : ok=79   changed=12   unreachable=0    failed=0
master2.itrunner.org : ok=64   changed=12   unreachable=0    failed=0
master3.itrunner.org : ok=64   changed=12   unreachable=0    failed=0

INSTALLER STATUS
**********************************************************************************************
Initialization  : Complete (0:01:07)

A successful deploy_cluster.yml run ends with output like:

PLAY RECAP
**********************************************************************************************
localhost                  : ok=11   changed=0    unreachable=0    failed=0
app1.itrunner.org : ok=114  changed=16   unreachable=0    failed=0
app2.itrunner.org : ok=114  changed=16   unreachable=0    failed=0
app3.itrunner.org : ok=114  changed=16   unreachable=0    failed=0
infra1.itrunner.org : ok=114  changed=16   unreachable=0    failed=0
infra2.itrunner.org : ok=114  changed=16   unreachable=0    failed=0
infra3.itrunner.org : ok=114  changed=16   unreachable=0    failed=0
master1.itrunner.org : ok=685  changed=162  unreachable=0    failed=0
master2.itrunner.org : ok=267  changed=45   unreachable=0    failed=0
master3.itrunner.org : ok=267  changed=45   unreachable=0    failed=0

INSTALLER STATUS 
***********************************************************************************************
Initialization               : Complete (0:01:06)
Health Check                 : Complete (0:00:30)
Node Bootstrap Preparation   : Complete (0:03:23)
etcd Install                 : Complete (0:00:42)
Master Install               : Complete (0:03:28)
Master Additional Install    : Complete (0:00:34)
Node Join                    : Complete (0:00:47)
Hosted Install               : Complete (0:00:43)
Cluster Monitoring Operator  : Complete (0:00:12)
Web Console Install          : Complete (0:00:40)
Console Install              : Complete (0:00:35)
metrics-server Install       : Complete (0:00:00)
Service Catalog Install      : Complete (0:03:20)

Uninstall OpenShift
If the installation fails, you can try uninstalling all or part of OpenShift and reinstalling.

  • Uninstall all nodes

Use the same inventory file that was used for the installation; the example below uses the default file:

$ ansible-playbook ~/openshift-ansible/playbooks/adhoc/uninstall.yml

  • Uninstall some nodes

Create a new inventory file listing the nodes to uninstall:

[OSEv3:children]
nodes 

[OSEv3:vars]
ansible_ssh_user=ec2-user
openshift_deployment_type=origin

[nodes]
node3.example.com openshift_node_group_name='node-config-infra'

Run the uninstall.yml playbook with that inventory file:

$ ansible-playbook -i /path/to/new/file ~/openshift-ansible/playbooks/adhoc/uninstall.yml

Install and uninstall components
After OKD is installed, to add or remove a single component, or to retry after an error during installation, just run the component-specific playbook under openshift-ansible/playbooks/, for example:
playbooks/openshift-glusterfs/config.yml deploys GlusterFS
playbooks/openshift-glusterfs/registry.yml deploys GlusterFS and the OpenShift Container Registry (using glusterfs_registry)
playbooks/openshift-glusterfs/uninstall.yml uninstalls GlusterFS

Some components do not provide an uninstall.yml; in that case set the install variable to false and run config.yml again, for example:

openshift_metrics_server_install=false

Verify the Installation

  1. Check that all nodes were installed successfully; on a master run:
$ oc get nodes
NAME                   STATUS    ROLES     AGE       VERSION
app1.itrunner.org      Ready     compute   6m        v1.11.0+d4cacc0
app2.itrunner.org      Ready     compute   6m        v1.11.0+d4cacc0
app3.itrunner.org      Ready     compute   6m        v1.11.0+d4cacc0
infra1.itrunner.org    Ready     infra     6m        v1.11.0+d4cacc0
infra2.itrunner.org    Ready     infra     6m        v1.11.0+d4cacc0
infra3.itrunner.org    Ready     infra     6m        v1.11.0+d4cacc0
master1.itrunner.org   Ready     master    6m        v1.11.0+d4cacc0
master2.itrunner.org   Ready     master    6m        v1.11.0+d4cacc0
master3.itrunner.org   Ready     master    6m        v1.11.0+d4cacc0

  2. Check the node usage statistics (CPU and memory):
$ oc adm top nodes
NAME                   CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%
app1.itrunner.org      90m          2%        818Mi           5%
app2.itrunner.org      219m         5%        4242Mi          26%
app3.itrunner.org      104m         2%        1122Mi          7%
infra1.itrunner.org    303m         7%        3042Mi          19%
infra2.itrunner.org    558m         13%       3589Mi          22%
infra3.itrunner.org    192m         4%        1404Mi          8%
master1.itrunner.org   271m         6%        2426Mi          15%
master2.itrunner.org   378m         9%        2717Mi          17%
master3.itrunner.org   250m         6%        2278Mi          14%

  3. Log in to the Web Console

First run the following command on a master to grant the cluster-admin role to the admin user, so that it has permission to see the whole cluster from the Web Console:

$ oc adm policy add-cluster-role-to-user cluster-admin admin

If the user has never logged in before, the command prints the message below; see the Permission Management section for the reason. It does not affect usage:

Warning: User 'admin' not found

In scenario 1, access the Web Console using the master hostname: https://master1.itrunner.org:8443/console
In scenario 2, access the Web Console using the domain name: https://openshift.itrunner.org:8443/console

OC Client Tool

The oc client tool is installed on all nodes. On the masters the system user system:admin is logged in by default and the configuration file ~/.kube/config has been created; run the following command to see the current user:

$ oc whoami
system:admin

On other nodes you must log in before running oc commands:

$ oc login https://openshift.itrunner.org:8443 -u developer
Authentication required for https://openshift.itrunner.org:8443 (openshift)
Username: developer
Password:
Login successful.

If the user has no project authorizations yet, the output looks like this:

You don't have any projects. You can try to create a new project, by running

    oc new-project <projectname>

Welcome! See 'oc help' to get started.

After a successful login, the ~/.kube/config file is created or updated automatically.

Install the OC Client Tool
The oc client tool can also be downloaded and installed separately: log in to the Web Console and click Help -> Command Line Tools.
This opens the Command Line Tools page.
Click the Download oc link, choose the version for your operating system at the end of that page, and download and install it.
Log in to OpenShift
After installing, click the Copy to Clipboard button next to oc login, paste the command into the CLI, and log in with the token:

$ oc login https://openshift.itrunner.org:8443 --token=xxxx

You can also log in with a username and password:

$ oc login https://openshift.itrunner.org:8443 -u developer --certificate-authority=/path/to/cert.crt

To log out:

$ oc logout

List the OpenShift resource types

$ oc api-resources
NAME                             SHORTNAMES     APIGROUP     NAMESPACED   KIND
bindings                                                     true         Binding
componentstatuses                cs                          false        ComponentStatus
configmaps                       cm                          true         ConfigMap
endpoints                        ep                          true         Endpoints
events                           ev                          true         Event
limitranges                      limits                      true         LimitRange
namespaces                       ns                          false        Namespace
nodes                            no                          false        Node
persistentvolumeclaims           pvc                         true         PersistentVolumeClaim
persistentvolumes                pv                          false        PersistentVolume
pods                             po                          true         Pod
podtemplates                                                 true         PodTemplate
replicationcontrollers           rc                          true         ReplicationController
resourcequotas                   quota                       true         ResourceQuota
secrets                                                      true         Secret
...

List resources of a given type
The resource type can be given as NAME, SHORTNAMES, or KIND, case-insensitively; the following three commands are equivalent:

$ oc get pods
$ oc get po
$ oc get pod

Show detailed information about resources

$ oc describe pods

For more oc commands, see the official documentation.

Configure Components and Storage

In a production cluster it is necessary to install metrics to monitor pod memory, CPU, and network usage, and cluster logging to aggregate logs. The registry, metrics, and logging all need storage. By default the OpenShift Ansible broker (OAB) is installed; OAB deploys its own etcd instance, independent of the OKD cluster's etcd, and needs its own separate storage. If no OAB storage is configured, it goes into a "CrashLoop" state until its etcd instance becomes available.

Cluster logging uses the EFK stack (Elasticsearch, Fluentd, Kibana) to aggregate logs. Fluentd collects logs from nodes, projects, and pods and stores them in Elasticsearch; Kibana is the web UI for viewing them, and Curator removes old logs from Elasticsearch. Cluster administrators can view all logs, while developers can only view the logs of projects they are authorized for.

Elasticsearch requires the memory map area limit vm.max_map_count to be at least 262144, so the Infra Nodes must be reconfigured in advance by running:

# sysctl -w vm.max_map_count=262144

To make the change permanent, edit /etc/sysctl.conf and append:

vm.max_map_count=262144

It is recommended to install the metrics and logging components separately after the base components, so that you can monitor the installation from the console or the command line and reduce the chance of errors.

$ ansible-playbook ~/openshift-ansible/playbooks/openshift-metrics/config.yml
$ ansible-playbook ~/openshift-ansible/playbooks/openshift-logging/config.yml

To uninstall the metrics and logging components, change the install variables and run config.yml again:

openshift_metrics_install_metrics=false

openshift_logging_install_logging=false
openshift_logging_purge_logging=true

The detailed metrics and logging configuration is covered in later sections.

NFS

NFS does not guarantee consistency, so using NFS for core OKD components is not recommended. In testing, RHEL NFS storage had problems with the registry, metrics, and logging, so it is not recommended for production. Elasticsearch in particular relies on file system behavior that NFS does not provide, and data corruption and other problems can occur.

The advantage of NFS is that it is simple to configure and quick to install. With the configuration below, NFS volumes are created automatically on the [nfs] host during the cluster installation, at the path nfs_directory/volume_name. To allow core infrastructure components to use NFS, openshift_enable_unsupported_configurations=True must be set.

[OSEv3:children]
masters
nodes
etcd
nfs

[OSEv3:vars]
openshift_enable_unsupported_configurations=True

openshift_hosted_registry_selector='node-role.kubernetes.io/infra=true'
openshift_hosted_registry_storage_kind=nfs
openshift_hosted_registry_storage_access_modes=['ReadWriteMany']
openshift_hosted_registry_storage_nfs_directory=/exports
openshift_hosted_registry_storage_nfs_options='*(rw,root_squash)'
openshift_hosted_registry_storage_volume_name=registry
openshift_hosted_registry_storage_volume_size=10Gi

openshift_metrics_server_install=true
openshift_metrics_install_metrics=true
#openshift_metrics_hawkular_hostname=hawkular-metrics.apps.itrunner.org
openshift_metrics_hawkular_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_cassandra_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_heapster_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_storage_kind=nfs
openshift_metrics_storage_access_modes=['ReadWriteOnce']
openshift_metrics_storage_nfs_directory=/exports
openshift_metrics_storage_nfs_options='*(rw,root_squash)'
openshift_metrics_storage_volume_name=metrics
openshift_metrics_storage_volume_size=10Gi

openshift_logging_install_logging=true
openshift_logging_purge_logging=false
openshift_logging_use_ops=false
openshift_logging_es_cluster_size=1
openshift_logging_es_number_of_replicas=1
openshift_logging_kibana_hostname=kibana.apps.itrunner.org
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_kibana_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_curator_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_storage_kind=nfs
openshift_logging_storage_access_modes=['ReadWriteOnce']
openshift_logging_storage_nfs_directory=/exports
openshift_logging_storage_nfs_options='*(rw,root_squash)'
openshift_logging_storage_volume_name=logging
openshift_logging_storage_volume_size=10Gi
openshift_logging_kibana_memory_limit=512Mi
openshift_logging_fluentd_memory_limit=512Mi
openshift_logging_es_memory_limit=10Gi

ansible_service_broker_install=true
openshift_hosted_etcd_storage_kind=nfs
openshift_hosted_etcd_storage_access_modes=["ReadWriteOnce"]
openshift_hosted_etcd_storage_nfs_directory=/exports
openshift_hosted_etcd_storage_nfs_options="*(rw,root_squash,sync,no_wdelay)"
openshift_hosted_etcd_storage_volume_name=etcd
openshift_hosted_etcd_storage_volume_size=1G
openshift_hosted_etcd_storage_labels={'storage': 'etcd'}

[nfs]
master1.itrunner.org

After installation, the file /etc/exports.d/openshift-ansible.exports is created with the following content:

"/exports/registry" *(rw,root_squash)
"/exports/metrics" *(rw,root_squash)
"/exports/logging" *(rw,root_squash)
"/exports/logging-es-ops" *(rw,root_squash)
"/exports/etcd" *(rw,root_squash,sync,no_wdelay)

Run oc get pv to see the persistent volumes that were created:

NAME              CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                                 STORAGECLASS   REASON    AGE
etcd-volume       1G         RWO            Retain           Available                                                                  15h
logging-volume    10Gi       RWO            Retain           Bound       openshift-infra/metrics-cassandra-1                            15h
metrics-volume    10Gi       RWO            Retain           Bound       openshift-logging/logging-es-0                                 15h
registry-volume   10Gi       RWX            Retain           Bound       default/registry-claim                                         15h

GlusterFS

To avoid potential performance impact, it is recommended to plan two GlusterFS clusters: one dedicated to storage for infrastructure applications and one for general application storage. Each cluster needs at least 3 nodes, so 6 nodes in total.

Storage nodes need at least 8 GB RAM and must have at least one raw block device to use as GlusterFS storage (on AWS simply attach the volume; no other action is needed). Each GlusterFS volume consumes roughly 30 MB of RAM, so adjust the total RAM according to the number of volumes.
Check the device (/dev/xvdf):

# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda    202:0    0    8G  0 disk
└─xvda1  202:1    0    8G  0 part /
xvdf    202:80   0   10G  0 disk
# file -s /dev/xvdf
/dev/xvdf: data

You can use containerized GlusterFS running inside the OKD nodes or an external GlusterFS.

Configure and install GlusterFS after the base components are installed; the GlusterFS installation takes quite a while. Install metrics and logging after GlusterFS has been installed successfully.

By default SELinux does not allow writing from a pod to a remote GlusterFS server. To write to GlusterFS volumes with SELinux enabled, run the following command on every node running GlusterFS:

# setsebool -P virt_sandbox_use_fusefs=on virt_use_fusefs=on

Basic GlusterFS configuration

  1. Containerized GlusterFS
[OSEv3:children]
...
glusterfs
glusterfs_registry

[OSEv3:vars]
...
openshift_storage_glusterfs_namespace=app-storage
openshift_storage_glusterfs_storageclass=true
openshift_storage_glusterfs_storageclass_default=false
openshift_storage_glusterfs_block_deploy=true
openshift_storage_glusterfs_block_host_vol_size=100
openshift_storage_glusterfs_block_storageclass=true
openshift_storage_glusterfs_block_storageclass_default=false
# openshift_storage_glusterfs_heketi_fstab="/var/lib/heketi/fstab"
openshift_storage_glusterfs_wipe=true
openshift_storage_glusterfs_heketi_wipe=true

openshift_storage_glusterfs_registry_namespace=infra-storage
openshift_storage_glusterfs_registry_storageclass=false
openshift_storage_glusterfs_registry_storageclass_default=false
openshift_storage_glusterfs_registry_block_deploy=true
openshift_storage_glusterfs_registry_block_host_vol_size=100
openshift_storage_glusterfs_registry_block_storageclass=true
openshift_storage_glusterfs_registry_block_storageclass_default=false
openshift_storage_glusterfs_registry_wipe=true
openshift_storage_glusterfs_registry_heketi_wipe=true

[glusterfs]
app[1:3].itrunner.org glusterfs_devices='[ "/dev/xvdf", "/dev/xvdg" ]'

[glusterfs_registry]
infra[1:3].itrunner.org glusterfs_devices='[ "/dev/xvdf", "/dev/xvdg" ]'

glusterfs: the general storage cluster, for ordinary applications
glusterfs_registry: the dedicated storage cluster, for infrastructure applications such as the OpenShift Container Registry

The glusterfs and glusterfs_registry variables are prefixed with openshift_storage_glusterfs_ and openshift_storage_glusterfs_registry_ respectively. For the full list of variables, see the GlusterFS role README.

GlusterFS supports two volume types: GlusterFS volumes and gluster-block volumes. OpenShift Logging and OpenShift Metrics are recommended to use gluster-block volumes, with "glusterfs-registry-block" as the storage_class_name.

$ oc get storageclass
NAME                       PROVISIONER                AGE
glusterfs-registry         kubernetes.io/glusterfs    2d
glusterfs-registry-block   gluster.org/glusterblock   2d
glusterfs-storage          kubernetes.io/glusterfs    2d
glusterfs-storage-block    gluster.org/glusterblock   2d

  2. External GlusterFS

The variable definitions need additional heketi settings, and the node definitions must specify the IP addresses.

[OSEv3:children]
...
glusterfs
glusterfs_registry

[OSEv3:vars]
...
openshift_storage_glusterfs_namespace=app-storage
openshift_storage_glusterfs_storageclass=true
openshift_storage_glusterfs_storageclass_default=false
openshift_storage_glusterfs_block_deploy=true
openshift_storage_glusterfs_block_host_vol_size=100
openshift_storage_glusterfs_block_storageclass=true
openshift_storage_glusterfs_block_storageclass_default=false
openshift_storage_glusterfs_is_native=false
openshift_storage_glusterfs_heketi_is_native=true
openshift_storage_glusterfs_heketi_executor=ssh
openshift_storage_glusterfs_heketi_ssh_port=22
openshift_storage_glusterfs_heketi_ssh_user=root
openshift_storage_glusterfs_heketi_ssh_sudo=false
openshift_storage_glusterfs_heketi_ssh_keyfile="/root/.ssh/id_rsa"

openshift_storage_glusterfs_registry_namespace=infra-storage
openshift_storage_glusterfs_registry_storageclass=false
openshift_storage_glusterfs_registry_storageclass_default=false
openshift_storage_glusterfs_registry_block_deploy=true
openshift_storage_glusterfs_registry_block_host_vol_size=100
openshift_storage_glusterfs_registry_block_storageclass=true
openshift_storage_glusterfs_registry_block_storageclass_default=false
openshift_storage_glusterfs_registry_is_native=false
openshift_storage_glusterfs_registry_heketi_is_native=true
openshift_storage_glusterfs_registry_heketi_executor=ssh
openshift_storage_glusterfs_registry_heketi_ssh_port=22
openshift_storage_glusterfs_registry_heketi_ssh_user=root
openshift_storage_glusterfs_registry_heketi_ssh_sudo=false
openshift_storage_glusterfs_registry_heketi_ssh_keyfile="/root/.ssh/id_rsa"

[glusterfs]
gluster1.example.com glusterfs_ip=192.168.10.11 glusterfs_devices='[ "/dev/xvdc", "/dev/xvdd" ]'
gluster2.example.com glusterfs_ip=192.168.10.12 glusterfs_devices='[ "/dev/xvdc", "/dev/xvdd" ]'
gluster3.example.com glusterfs_ip=192.168.10.13 glusterfs_devices='[ "/dev/xvdc", "/dev/xvdd" ]'

[glusterfs_registry]
gluster4.example.com glusterfs_ip=192.168.10.14 glusterfs_devices='[ "/dev/xvdc", "/dev/xvdd" ]'
gluster5.example.com glusterfs_ip=192.168.10.15 glusterfs_devices='[ "/dev/xvdc", "/dev/xvdd" ]'
gluster6.example.com glusterfs_ip=192.168.10.16 glusterfs_devices='[ "/dev/xvdc", "/dev/xvdd" ]'

Containerized GlusterFS, metrics, and logging configuration
When the storage kind is dynamic, openshift_master_dynamic_provisioning_enabled=True must be set.

[OSEv3:children]
...
glusterfs
glusterfs_registry

[OSEv3:vars]
...
openshift_master_dynamic_provisioning_enabled=True

openshift_storage_glusterfs_namespace=app-storage
openshift_storage_glusterfs_storageclass=true
openshift_storage_glusterfs_storageclass_default=false
openshift_storage_glusterfs_block_deploy=true
openshift_storage_glusterfs_block_host_vol_size=100
openshift_storage_glusterfs_block_storageclass=true
openshift_storage_glusterfs_block_storageclass_default=false
openshift_storage_glusterfs_wipe=true
openshift_storage_glusterfs_heketi_wipe=true

openshift_storage_glusterfs_registry_namespace=infra-storage
openshift_storage_glusterfs_registry_storageclass=false
openshift_storage_glusterfs_registry_storageclass_default=false
openshift_storage_glusterfs_registry_block_deploy=true
openshift_storage_glusterfs_registry_block_host_vol_size=100
openshift_storage_glusterfs_registry_block_storageclass=true
openshift_storage_glusterfs_registry_block_storageclass_default=false
openshift_storage_glusterfs_registry_wipe=true
openshift_storage_glusterfs_registry_heketi_wipe=true

openshift_hosted_registry_storage_kind=glusterfs
openshift_hosted_registry_storage_volume_size=5Gi
openshift_hosted_registry_selector='node-role.kubernetes.io/infra=true'

openshift_metrics_install_metrics=true
openshift_metrics_hawkular_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_cassandra_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_heapster_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_storage_kind=dynamic
openshift_metrics_storage_volume_size=10Gi
openshift_metrics_cassandra_pvc_storage_class_name="glusterfs-registry-block"

openshift_logging_install_logging=true
openshift_logging_purge_logging=false
openshift_logging_use_ops=false
openshift_logging_es_cluster_size=1
openshift_logging_es_number_of_replicas=1
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_kibana_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_curator_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_storage_kind=dynamic
openshift_logging_elasticsearch_storage_type=pvc
openshift_logging_es_pvc_storage_class_name=glusterfs-registry-block
openshift_logging_es_pvc_size=10Gi
#openshift_logging_kibana_proxy_debug=true
openshift_logging_kibana_hostname=kibana.apps.itrunner.org
openshift_logging_kibana_memory_limit=512Mi
openshift_logging_fluentd_memory_limit=512Mi
openshift_logging_es_memory_limit=10Gi
openshift_logging_curator_default_days=10

[glusterfs]
app[1:3].itrunner.org glusterfs_devices='[ "/dev/xvdf", "/dev/xvdg" ]'

[glusterfs_registry]
infra[1:3].itrunner.org glusterfs_devices='[ "/dev/xvdf", "/dev/xvdg" ]'

Install and uninstall GlusterFS

$ ansible-playbook ~/openshift-ansible/playbooks/openshift-glusterfs/registry.yml
$ ansible-playbook ~/openshift-ansible/playbooks/openshift-glusterfs/uninstall.yml

The following two steps are slow during installation (they have become faster in the latest 3.11); be patient. If they fail, do not uninstall; just run the installation again:

TASK [openshift_storage_glusterfs : Wait for GlusterFS pods]
TASK [openshift_storage_glusterfs : Load heketi topology]

During the Load heketi topology step, check the pods; the deploy-heketi-storage pod performs the heketi storage deployment and is deleted automatically on success.

$ oc projects
$ oc project app-storage
$ oc get pods
NAME                            READY     STATUS    RESTARTS   AGE
deploy-heketi-storage-1-vtxgh   1/1       Running   0          2m
glusterfs-storage-jl9m6         1/1       Running   0          6m
glusterfs-storage-mq2rk         1/1       Running   0          6m
glusterfs-storage-tb5bj         1/1       Running   0          6m

During the installation, check the pods at any time; if a pod is unhealthy, view its logs with oc logs -f pod_name:

$ oc get pods -n app-storage
$ oc get pods -n infra-storage

If something goes wrong, uninstall and reinstall; with the following two parameters set to true, the data is wiped during the uninstall:

openshift_storage_glusterfs_wipe=true
openshift_storage_glusterfs_heketi_wipe=true

After a successful installation the following pods exist:

NAME                                          READY     STATUS    RESTARTS   AGE
glusterblock-storage-provisioner-dc-1-m4555   1/1       Running   0          18s
glusterfs-storage-26v4l                       1/1       Running   0          19m
glusterfs-storage-ft4bn                       1/1       Running   0          19m
glusterfs-storage-rxglx                       1/1       Running   0          19m
heketi-storage-1-mql5g                        1/1       Running   0          49s

NAME                                           READY     STATUS    RESTARTS   AGE
glusterblock-registry-provisioner-dc-1-k5l4z   1/1       Running   0          6m
glusterfs-registry-2f9vt                       1/1       Running   0          39m
glusterfs-registry-j78c6                       1/1       Running   0          39m
glusterfs-registry-xkl6p                       1/1       Running   0          39m
heketi-registry-1-655dm                        1/1       Running   0          6m

Output after a successful installation:

PLAY RECAP
***********************************************************************************************
localhost                  : ok=12   changed=0    unreachable=0    failed=0                  
app1.itrunner.org : ok=27   changed=3    unreachable=0    failed=0             
app2.itrunner.org : ok=27   changed=3    unreachable=0    failed=0             
app3.itrunner.org : ok=27   changed=3    unreachable=0    failed=0             
infra1.itrunner.org : ok=28   changed=3    unreachable=0    failed=0           
infra2.itrunner.org : ok=27   changed=3    unreachable=0    failed=0           
infra3.itrunner.org : ok=27   changed=3    unreachable=0    failed=0           
master1.itrunner.org : ok=199  changed=53   unreachable=0    failed=0          
master2.itrunner.org : ok=33   changed=0    unreachable=0    failed=0          
master3.itrunner.org : ok=33   changed=0    unreachable=0    failed=0          

INSTALLER STATUS
************************************************************************************************
Initialization  : Complete (0:01:41)

Metrics

The default metrics URL is https://hawkular-metrics.{{openshift_master_default_subdomain}}; it can be changed with the openshift_metrics_hawkular_hostname variable, but the openshift_master_default_subdomain part cannot be altered.

Metrics is installed in the openshift-infra project; a successful installation ends with output like:

PLAY RECAP **************************************************************************
localhost                  : ok=13   changed=0    unreachable=0    failed=0
app1.itrunner.org : ok=3    changed=0    unreachable=0    failed=0
app2.itrunner.org : ok=0    changed=0    unreachable=0    failed=0
app3.itrunner.org : ok=0    changed=0    unreachable=0    failed=0
infra1.itrunner.org : ok=0    changed=0    unreachable=0    failed=0
infra2.itrunner.org : ok=0    changed=0    unreachable=0    failed=0
infra3.itrunner.org : ok=0    changed=0    unreachable=0    failed=0
master1.itrunner.org : ok=224  changed=48   unreachable=0    failed=0
master2.itrunner.org : ok=25   changed=0    unreachable=0    failed=0
master3.itrunner.org : ok=25   changed=0    unreachable=0    failed=0

INSTALLER STATUS *******************************************************************
Initialization   : Complete (0:00:41)
Metrics Install  : Complete (0:01:25)

After the installation completes, wait until all pods are healthy before accessing metrics:

$ oc get -n openshift-infra pod
NAME                            READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-zlgbt      1/1       Running     0          2m
hawkular-metrics-schema-hfcv7   0/1       Completed   0          2m
hawkular-metrics-xz9nx          1/1       Running     0          2m
heapster-m4bhl                  1/1       Running     0          1m


Logging

Note: when installing Logging, make sure to pull the latest openshift-ansible source; the initial release-3.11 has bugs.

In production, each Elasticsearch shard should have at least 1 replica; for high availability, at least 2 replicas and at least three Elasticsearch nodes are needed (openshift_logging_es_cluster_size defaults to 1 and openshift_logging_es_number_of_replicas defaults to 0):

openshift_logging_es_cluster_size=3
openshift_logging_es_number_of_replicas=2

By default, logs are kept for 30 days and Curator deletes old logs every day at 3:30:

openshift_logging_curator_default_days=30
openshift_logging_curator_run_hour=3
openshift_logging_curator_run_minute=30

Logging is installed in the openshift-logging project; a successful installation ends with output like:

PLAY RECAP ***********************************************************************************
localhost                  : ok=13   changed=0    unreachable=0    failed=0
app1.itrunner.org : ok=2    changed=1    unreachable=0    failed=0
app2.itrunner.org : ok=2    changed=1    unreachable=0    failed=0
app3.itrunner.org : ok=2    changed=1    unreachable=0    failed=0
infra1.itrunner.org : ok=2    changed=1    unreachable=0    failed=0
infra2.itrunner.org : ok=2    changed=1    unreachable=0    failed=0
infra3.itrunner.org : ok=2    changed=1    unreachable=0    failed=0
master1.itrunner.org : ok=268  changed=61   unreachable=0    failed=0
master2.itrunner.org : ok=29   changed=1    unreachable=0    failed=0
master3.itrunner.org : ok=29   changed=1    unreachable=0    failed=0

INSTALLER STATUS ****************************************************************************
Initialization   : Complete (0:00:52)
Logging Install  : Complete (0:02:50)

The installation may print an error like the following:

RUNNING HANDLER [openshift_logging_elasticsearch : Check if there is a rollout in progress for {{ _es_node }}] *********************************************************
fatal: [master1.itrunner.org]: FAILED! => {"changed": true, "cmd": ["oc", "--config=/etc/origin/master/admin.kubeconfig", "rollout", "status", "--watch=false", "dc/logging-es-data-master-9mtypbi7", "-n", "openshift-logging"], "delta": "0:00:00.241347", "end": "2019-03-01 12:37:14.287914", "msg": "non-zero return code", "rc": 1, "start": "2019-03-01 12:37:14.046567", "stderr": "error: Deployment config \"logging-es-data-master-9mtypbi7\" waiting on manual update (use 'oc rollout latest logging-es-data-master-9mtypbi7')", "stderr_lines": ["error: Deployment config \"logging-es-data-master-9mtypbi7\" waiting on manual update (use 'oc rollout latest logging-es-data-master-9mtypbi7')"], "stdout": "", "stdout_lines": []}
...ignoring

In that case run the following command on a master:

$ oc rollout latest logging-es-data-master-9mtypbi7

成功安裝後的pod如下:

$ oc get -o wide -n openshift-logging pod
NAME                                      READY     STATUS      RESTARTS   AGE       IP            NODE                   NOMINATED NODE
logging-curator-1551583800-n78m5          0/1       Completed   0          23h       10.128.2.13   app1.itrunner.org      <none>
logging-es-data-master-9mtypbi7-2-gkzpq   2/2       Running     0          2d        10.128.4.9    app3.itrunner.org      <none>
logging-fluentd-69hhv                     1/1       Running     0          2d        10.128.4.7    app3.itrunner.org      <none>
logging-fluentd-7w7cq                     1/1       Running     0          2d        10.131.0.9    infra1.itrunner.org    <none>
logging-fluentd-bp4jm                     1/1       Running     0          2d        10.130.2.7    app2.itrunner.org      <none>
logging-fluentd-dn7tk                     1/1       Running     3          2d        10.128.0.58   master3.itrunner.org   <none>
logging-fluentd-jwrpn                     1/1       Running     0          2d        10.129.0.9    master1.itrunner.org   <none>
logging-fluentd-lbh5t                     1/1       Running     0          2d        10.128.2.10   app1.itrunner.org      <none>
logging-fluentd-rfdgv                     1/1       Running     0          2d        10.129.2.11   infra3.itrunner.org    <none>
logging-fluentd-vzr84                     1/1       Running     0          2d        10.130.0.7    master2.itrunner.org   <none>
logging-fluentd-z2fbd                     1/1       Running     0          2d        10.131.2.12   infra2.itrunner.org    <none>
logging-kibana-1-zqjx2                    2/2       Running     0          2d        10.128.2.9    app1.itrunner.org      <none>

openshift_logging_fluentd_nodeselector默認值爲logging-infra-fluentd: 'true',默認所有Node都會安裝fluentd,會給node添加logging-infra-fluentd: 'true'標籤,可以通過openshift_logging_fluentd_hosts=['host1.example.com', 'host2.example.com']指定要安裝fluentd的Node。

es和kibana pod包含兩個docker container,分別爲(elasticsearch、proxy)、(kibana、kibana-proxy),其中proxy爲OAuth代理,查看日誌時要注意選擇container。登錄kibana時,如出現錯誤請先查看kibana-proxy日誌。比如,證書錯誤日誌如下:

$ oc get -n openshift-logging pod
$ oc logs -f logging-kibana-1-zqjx2 -c kibana-proxy
2019/03/03 01:09:22 oauthproxy.go:635: error redeeming code (client:10.131.0.1:52544): Post https://openshift.itrunner.org:8443/oauth/token:  x509: certificate is valid for www.itrunner.org, itrunner.org, not openshift.itrunner.org
2019/03/03 01:09:22 oauthproxy.go:434: ErrorPage 500 Internal Error Internal Error

安裝後可以修改curator配置:

$ oc edit cronjob/logging-curator
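
Individual fields can also be changed without opening an editor by patching the CronJob directly. A sketch that sets the curator schedule to 03:30 daily (standard cron syntax, matching the inventory defaults above):

$ oc patch cronjob/logging-curator -n openshift-logging -p '{"spec":{"schedule":"30 3 * * *"}}'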

[Screenshot: Logging UI (Kibana)]

權限管理

OpenShift有三種類型的用戶:

  • 普通用戶(User) 常用用戶(Web Console登錄、oc login等),在首次登錄時自動創建或通過API創建
  • 系統用戶 在infrastructure定義時自動創建,主要用於infrastructure與API安全地交互,例如:system:admin、system:openshift-registry
  • 服務賬戶(ServiceAccount) 與項目關聯的特殊系統用戶,在項目創建時自動創建,或由項目管理員創建。用戶名由四部分組成:system:serviceaccount:[project]:[name],例如:system:serviceaccount:default:deployer、system:serviceaccount:foo:builder。

可以使用OC命令管理用戶、組、角色、角色綁定,也可以在Cluster Console -> Home -> Search中選擇相應對象進行管理。在Cluster Console -> Administration中可以管理Service Account、角色、角色綁定,在Application Console -> Resources -> Other Resources中可以管理項目的Service Account、角色、角色綁定。下面主要介紹使用OC命令管理用戶權限。

User管理

初始安裝我們創建了兩個用戶admin和developer。在未曾登錄過系統的情況下,執行以下命令:

$ oc get users

這時不能查詢到用戶信息。從web console分別使用這兩個用戶登錄,再次查詢:

$ oc get users
NAME        UID                                    FULL NAME    IDENTITIES
admin       da115cc1-3c11-11e9-90ee-027e1f8419da                htpasswd_auth:admin
developer   022387d9-4168-11e9-8f82-027e1f8419da                htpasswd_auth:developer

初始安裝時我們只是定義了identity provider,使用htpasswd創建了用戶名、密碼,實際上用戶並未創建,在首次登錄時會自動創建用戶。

上面用戶的FULL NAME是空的,如何修改呢?執行以下命令:

$ oc edit user/admin

在打開的文件中添加fullName,如下:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: user.openshift.io/v1
fullName: Administrator
groups: null
identities:
- htpasswd_auth:admin
kind: User
metadata:
  creationTimestamp: 2019-03-01T11:04:58Z
  name: admin
  resourceVersion: "1750648"
  selfLink: /apis/user.openshift.io/v1/users/admin
  uid: da115cc1-3c11-11e9-90ee-027e1f8419da
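
The same change can be applied non-interactively with oc patch instead of an editor session, for example (a sketch):

$ oc patch user admin --type=merge -p '{"fullName":"Administrator"}'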

除用戶信息外,還有與之對應的identity信息,Identity保存了用戶和IDP的映射關係。查詢identity:

$  oc get identities
NAME                      IDP NAME        IDP USER NAME   USER NAME   USER UID
htpasswd_auth:admin       htpasswd_auth   admin           admin       da115cc1-3c11-11e9-90ee-027e1f8419da
htpasswd_auth:developer   htpasswd_auth   developer       developer   022387d9-4168-11e9-8f82-027e1f8419da

編輯identity:

$ oc edit identity/htpasswd_auth:admin
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: user.openshift.io/v1
kind: Identity
metadata:
  creationTimestamp: 2019-03-01T11:04:58Z
  name: htpasswd_auth:admin
  resourceVersion: "12472"
  selfLink: /apis/user.openshift.io/v1/identities/htpasswd_auth%3Aadmin
  uid: da11e717-3c11-11e9-90ee-027e1f8419da
providerName: htpasswd_auth
providerUserName: admin
user:
  name: admin
  uid: da115cc1-3c11-11e9-90ee-027e1f8419da

手工創建User
先設置用戶密碼:

# htpasswd /etc/origin/master/htpasswd jason

然後依次執行以下命令:

$ oc create user jason --full-name "Sun Jingchuan"
$ oc create identity htpasswd_auth:jason
$ oc create useridentitymapping htpasswd_auth:jason jason

刪除User

$ oc delete user developer
$ oc delete identity htpasswd_auth:developer
$ sudo htpasswd -D /etc/origin/master/htpasswd developer

組管理

爲了方便用戶管理,例如授權策略, 或一次向多個用戶授予權限,可以將用戶加到組中。
新建組
語法:

oc adm groups new <group_name> <user1> <user2>

例如:

$ oc adm groups new hello jason coco

查詢組

$ oc get groups
NAME      USERS
hello      jason, coco

添加組員

$ oc adm groups add-users hello test

刪除組員

$ oc adm groups remove-users hello test
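
A group that is no longer needed can be deleted like any other object (a sketch, using the group created above):

$ oc delete group hello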

角色與權限

默認,未授權的用戶登錄系統後只有創建項目和管理自己項目的權限。OpenShift權限管理是基於角色的(Role-based Access Control (RBAC) ),每個角色擁有一系列規則(Rule),規則定義了允許的操作。角色可以授予用戶或組。角色分爲兩種類型:Cluster Role和Local Role,兩者的區別在於Cluster Role定義在集羣級別(所有項目),Local Role限定在項目範圍。

創建角色

  • 創建Cluster Role

語法:

$ oc create clusterrole <name> --verb=<verb> --resource=<resource>

例如:

$ oc create clusterrole podviewonly --verb=get --resource=pod
  • 創建Local Role

語法:

$ oc create role <name> --verb=<verb> --resource=<resource> -n <project>

例如:

$ oc create role podview --verb=get --resource=pod -n blue

查看角色及其關聯的規則集

$ oc describe clusterrole.rbac
$ oc describe clusterrole.rbac cluster-admin
$ oc describe clusterrole.rbac self-provisioner
$ oc describe role.rbac --all-namespaces

默認角色

  • admin: A project manager. If used in a local binding, an admin user will have rights to view any resource in the project and modify any resource in the project except for quota.
  • basic-user: A user that can get basic information about projects and users.
  • cluster-admin: A super-user that can perform any action in any project. When bound to a user with a local binding, they have full control over quota and every action on every resource in the project.
  • cluster-status: A user that can get basic cluster status information.
  • edit: A user that can modify most objects in a project, but does not have the power to view or modify roles or bindings.
  • self-provisioner: A user that can create their own projects.
  • view: A user who cannot make any modifications, but can see most objects in a project. They cannot view or modify roles or bindings.
  • cluster-reader: A user who can read, but not view, objects in the cluster.

角色綁定
角色可以授予用戶或組,Cluster Role可以綁定在集羣或項目級別。

  1. Local Role綁定操作
  • $ oc adm policy who-can [verb] [resource]
    Indicates which users can perform an action on a resource
  • $ oc adm policy add-role-to-user [role] [username]
    Binds a given role to specified users in the current project
  • $ oc adm policy remove-role-from-user [role] [username]
    Removes a given role from specified users in the current project
  • $ oc adm policy remove-user [username]
    Removes specified users and all of their roles in the current project
  • $ oc adm policy add-role-to-group [role] [groupname]
    Binds a given role to specified groups in the current project
  • $ oc adm policy remove-role-from-group [role] [groupname]
    Removes a given role from specified groups in the current project
  • $ oc adm policy remove-group [groupname]
    Removes specified groups and all of their roles in the current project

Local Role綁定操作也可以使用命令oc policy。
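
For example, granting the default edit role to the user jason in a single project (the project name my-project is only an example):

$ oc policy add-role-to-user edit jason -n my-project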

  2. Cluster Role綁定操作
  • $ oc adm policy add-cluster-role-to-user [role] [username]
    Binds a given role to specified users for all projects in the cluster
  • $ oc adm policy remove-cluster-role-from-user [role] [username]
    Removes a given role from specified users for all projects in the cluster
  • $ oc adm policy add-cluster-role-to-group [role] [groupname]
    Binds a given role to specified groups for all projects in the cluster
  • $ oc adm policy remove-cluster-role-from-group [role] [groupname]
    Removes a given role from specified groups for all projects in the cluster

例如:

$ oc adm policy add-role-to-user admin jason -n my-project

若未使用-n指定project則爲當前project。

$ oc adm policy add-cluster-role-to-user cluster-admin admin

上例,未使用--rolebinding-name指定rolebinding名稱,則使用默認名稱,首次執行時名稱爲cluster-admin-0,若繼續執行下面命令,名稱則爲cluster-admin-1:

$ oc adm policy add-cluster-role-to-user cluster-admin jason

指定rolebinding名稱,可以創建/修改rolebinding,向已創建的rolebinding中添加用戶。
可以一次將角色授予多個用戶:

$ oc adm policy add-cluster-role-to-user cluster-admin jason coco --rolebinding-name=cluster-admin-0

也可以使用create clusterrolebinding創建rolebinding:

$ oc create clusterrolebinding cluster-admins --clusterrole=cluster-admin --user=admin

查詢角色綁定的用戶

$ oc describe rolebinding.rbac -n my-project
...
$ oc describe clusterrolebinding.rbac cluster-admin cluster-admins cluster-admin-0
Name:         cluster-admin
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate=false
Role:
  Kind:  ClusterRole
  Name:  cluster-admin
Subjects:
  Kind   Name            Namespace
  ----   ----            ---------
  Group  system:masters

Name:         cluster-admins
Labels:       <none>
Annotations:  rbac.authorization.kubernetes.io/autoupdate=false
Role:
  Kind:  ClusterRole
  Name:  cluster-admin
Subjects:
  Kind   Name                   Namespace
  ----   ----                   ---------
  Group  system:cluster-admins
  User   system:admin

Name:         cluster-admin-0
Labels:       <none>
Annotations:  <none>
Role:
  Kind:  ClusterRole
  Name:  cluster-admin
Subjects:
  Kind  Name   Namespace
  ----  ----   ---------
  User  admin

刪除rolebinding

$ oc delete clusterrolebinding cluster-admin-1

移除其中一個用戶:

$ oc annotate clusterrolebinding.rbac cluster-admins 'rbac.authorization.kubernetes.io/autoupdate=false' --overwrite
$ oc adm policy remove-cluster-role-from-user cluster-admin test

Service Account

查詢Service Account

$ oc get sa
NAME       SECRETS   AGE
builder    3         1d
default    2         1d
deployer   2         1d
$ oc describe sa builder
...

默認Service Account
每個項目創建時都會創建builder、deployer、default三個服務賬戶。

  • builder 構建pod時使用,擁有system:image-builder角色,允許使用內部Docker registry將image push到image stream
  • deployer 部署pod時使用,擁有system:deployer角色,允許查看和修改replication controller和pod
  • default 運行pod時使用

所有service account都擁有system:image-puller角色,允許從內部registry獲取image。
創建Service Account

$ oc create sa robot
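
The token of a service account, used for example to log in to the API or the internal registry as that account, can be read with oc sa get-token (a sketch). The output can be passed to oc login --token=... or sent as a Bearer token in API requests:

$ oc sa get-token robot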

Service Account組
Every service account is a member of two groups: system:serviceaccounts, which contains all service accounts in the cluster, and system:serviceaccounts:[project], which contains all service accounts in that project.
將角色授予Service Account

$ oc policy add-role-to-user view system:serviceaccount:top-secret:robot
$ oc policy add-role-to-group view system:serviceaccounts -n top-secret

Secret

Secret提供了一種機制來保存敏感信息,如密碼、OKD客戶端配置文件、dockercfg文件等。可通過OC命令或Application Console -> Resources -> Secrets管理Secret。

默認,新建項目後,爲builder、default、deployer三個Service Account各創建了三個Secret,其中一個類型爲dockercfg,另兩個類型爲service-account-token:

$ oc get secrets
NAME                       TYPE                                  DATA      AGE
builder-dockercfg-pvb27    kubernetes.io/dockercfg               1         6d
builder-token-dl69q        kubernetes.io/service-account-token   4         6d
builder-token-knkwg        kubernetes.io/service-account-token   4         6d
default-dockercfg-sb9gw    kubernetes.io/dockercfg               1         6d
default-token-s4qg4        kubernetes.io/service-account-token   4         6d
default-token-zpjj8        kubernetes.io/service-account-token   4         6d
deployer-dockercfg-f6g5x   kubernetes.io/dockercfg               1         6d
deployer-token-brvhh       kubernetes.io/service-account-token   4         6d
deployer-token-wvvdb       kubernetes.io/service-account-token   4         6d

兩個service-account-token中之一供dockercfg內部使用,每個Service Account綁定了一個dockercfg和一個token。
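
When a service account token is needed outside the cluster it can be read directly from its secret, for example (the secret names are the ones listed above and will differ per project):

$ oc describe secret default-token-s4qg4
$ oc get secret default-token-s4qg4 -o jsonpath='{.data.token}' | base64 -d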

Secret Type

  • Generic Secret
  • Image Secret
  • Source Secret
  • Webhook Secret
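
A Generic Secret such as the test-secret referenced in the pod examples below can be created from literal values or files with oc create secret (a sketch; the username/password values are placeholders):

$ oc create secret generic test-secret --from-literal=username=myuser --from-literal=password=mypassword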

Docker registry
如在同一項目內訪問OpenShift內部Docker registry,則已有了正確的權限,不需要其他操作;如跨項目訪問,則需授權。
允許project-a內的pod訪問project-b的image:

$ oc policy add-role-to-user system:image-puller system:serviceaccount:project-a:default --namespace=project-b

或:

$ oc policy add-role-to-group system:image-puller system:serviceaccounts:project-a --namespace=project-b

Secret實現了敏感內容與pod的解耦。在pod內有三種方式使用Secret:

  • to populate environment variables for containers
apiVersion: v1
kind: Pod
metadata:
  name: secret-example-pod
spec:
  containers:
    - name: secret-test-container
      image: busybox
      command: [ "/bin/sh", "-c", "export" ]
      env:
        - name: TEST_SECRET_USERNAME_ENV_VAR
          valueFrom:
            secretKeyRef:
              name: test-secret
              key: username
  restartPolicy: Never
  • as files in a volume mounted on one or more of its containers
apiVersion: v1
kind: Pod
metadata:
  name: secret-example-pod
spec:
  containers:
    - name: secret-test-container
      image: busybox
      command: [ "/bin/sh", "-c", "cat /etc/secret-volume/*" ]
      volumeMounts:
          # name must match the volume name below
          - name: secret-volume
            mountPath: /etc/secret-volume
            readOnly: true
  volumes:
    - name: secret-volume
      secret:
        secretName: test-secret
  restartPolicy: Never
  • by kubelet when pulling images for the pod (see the sketch below)
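
For the last case, the registry credentials are stored in a docker-registry type secret and linked to the service account that runs the pod, here the default service account (a sketch; the registry address and credentials are placeholders):

$ oc create secret docker-registry external-registry --docker-server=registry.example.com --docker-username=myuser --docker-password=mypassword --docker-email=me@example.com
$ oc secrets link default external-registry --for=pull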


參考資料

OpenShift
OpenShift Github
OpenShift Documentation
OKD
OKD Latest Documentation
Ansible Documentation
External Load Balancer Integrations with OpenShift Enterprise 3
Red Hat OpenShift on AWS
Docker Documentation
Kubernetes Documentation
Kubernetes中文社區
SSL For Free
