Deploying a Redis Cluster on Kubernetes
Introduction
Membership in a Redis Cluster is not tied to IP addresses; it is tied to each node's unique cluster ID. IPs are only used to reach the other nodes when the cluster is first formed. Afterwards the ID, not the IP, identifies a member. In other words: as long as a node keeps its ID, its IP can change across container restarts without affecting cluster membership.
So where is the ID stored? Once the cluster is up, every master and replica writes a metadata file describing the cluster's nodes. To ensure that cluster roles and configuration survive pod restarts in Kubernetes, this file must be persisted, which makes a StatefulSet the natural fit for deployment.
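To make the ID-over-IP idea concrete, here is a sketch that pulls a node's own ID out of a nodes.conf file. The sample file contents below are made up for illustration (the real file is written and maintained by redis-server itself); on a live node you could also just run `redis-cli cluster myid`.

```shell
# Sample nodes.conf contents (illustrative only; redis-server
# generates and rewrites this file itself).
cat > /tmp/nodes.conf <<'EOF'
561f70b2c94c6e46f7b9588705f2cb48861b1e89 172.36.4.43:6379@16379 myself,master - 0 0 1 connected 0-5460
bdcda6ce963add5bc9706e912b6fcf4b355e2add 172.36.3.134:6379@16379 slave 561f70b2c94c6e46f7b9588705f2cb48861b1e89 0 0 6 connected
EOF

# Column 1 is the stable node id; the flag "myself" in column 3
# marks this node's own entry.
awk '$3 ~ /myself/ {print $1}' /tmp/nodes.conf
```

The ID in column 1 stays the same even when the IP in column 2 changes, which is exactly why this file must survive pod restarts.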
Deployment
The full deployment YAML:
apiVersion: v1
data:
  redis.conf: |2
    appendonly no
    save 900 1
    save 300 10
    save 60 300
    maxmemory 4GB
    maxmemory-policy allkeys-lru
    cluster-enabled yes
    # the cluster node metadata file
    cluster-config-file /var/lib/redis/nodes.conf
    # node timeout used for failure detection; normally something like 5000ms,
    # temporarily lowered to 500ms here for the failover test below
    cluster-node-timeout 500
    dir /var/lib/redis
    port 6379
kind: ConfigMap
metadata:
  name: app-rds-cluster
  namespace: default
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-rds-cluster
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
  storageClassName: cephfs
---
# The Service that clients use to reach the Redis cluster
apiVersion: v1
kind: Service
metadata:
  name: app-rds-cluster
  labels:
    app: redis
spec:
  ports:
  - name: redis-port
    port: 6379
  selector:
    app: app-rds-cluster
    appCluster: redis-cluster
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app-rds-cluster
spec:
  serviceName: "app-rds-cluster"
  replicas: 6
  selector:
    matchLabels:
      app: app-rds-cluster
      appCluster: redis-cluster
  template:
    metadata:
      labels:
        app: app-rds-cluster
        appCluster: redis-cluster
    spec:
      terminationGracePeriodSeconds: 5
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - app-rds-cluster
              topologyKey: kubernetes.io/hostname
      containers:
      - name: redis
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        image: redis:4.0.14
        command:
        - "redis-server"          # redis startup command
        args:
        - "/etc/redis/redis.conf" # the config file
        resources:
          requests:               # requested resources
            cpu: "100m"           # m is milli-CPU, so 100m = 0.1 CPU
            memory: "100Mi"       # 100Mi of memory
          limits:
            cpu: "1"              # one full CPU core
            memory: "4096Mi"      # 4Gi memory limit
        ports:
        - name: redis
          containerPort: 6379
          protocol: "TCP"
        - name: cluster
          containerPort: 16379
          protocol: "TCP"
        volumeMounts:
        - name: "redis-conf"      # mount the file generated from the ConfigMap
          mountPath: "/etc/redis"
        - name: pvc
          mountPath: "/var/lib/redis"
          subPathExpr: $(POD_NAME)/data/
      nodeSelector:
        RDSDB: ''
      volumes:
      - name: "redis-conf"
        configMap:
          name: app-rds-cluster
      - name: pvc
        persistentVolumeClaim:
          claimName: app-rds-cluster
Deployment notes:
- The Redis config file is mounted from a ConfigMap.
- Six nodes in total, 3 masters and 3 replicas, which is the minimum cluster size recommended by the official documentation.
- The cluster node metadata file is critical (details below) and must be persisted, e.g. via a PVC or hostPath.
- The redis-trib tool is used to create the cluster; once the cluster is formed, it is no longer needed for membership maintenance.
- Many other articles deploy two Services: a headless Service for discovery while forming the cluster, plus a regular Service for clients. The headless Service is unnecessary here. Just look up the pod IPs once when creating the cluster instead of keeping an extra, unused Service around.
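For reference, if you did want DNS-based discovery anyway, the headless Service those articles add would look roughly like the fragment below (a sketch only; the name `app-rds-cluster-headless` is made up, and this deployment does not use it):

```yaml
# Optional headless Service (clusterIP: None) that gives each pod a stable
# DNS name like app-rds-cluster-0.app-rds-cluster-headless.default.svc
apiVersion: v1
kind: Service
metadata:
  name: app-rds-cluster-headless
spec:
  clusterIP: None
  ports:
  - name: redis-port
    port: 6379
  selector:
    app: app-rds-cluster
    appCluster: redis-cluster
```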
Assuming your persistence setup is in order, you should now have six healthy pods:
[root@008019 redis-cluster]# kubectl get pods -o wide --all-namespaces | grep app-rds-cluster
default app-rds-cluster-0 1/1 Running 0 5m1s 172.36.4.43 008031 <none> <none>
default app-rds-cluster-1 1/1 Running 0 4m58s 172.36.1.30 008020 <none> <none>
default app-rds-cluster-2 1/1 Running 0 4m56s 172.36.6.21 020203 <none> <none>
default app-rds-cluster-3 1/1 Running 0 4m53s 172.36.5.25 008032 <none> <none>
default app-rds-cluster-4 1/1 Running 0 4m51s 172.36.0.198 008019 <none> <none>
default app-rds-cluster-5 1/1 Running 0 4m48s 172.36.3.134 020204 <none> <none>
# Collect the pod IPs; they are needed below
[root@008019 redis-cluster]# echo `kubectl get pods -o wide --all-namespaces | grep app-rds-cluster | awk '{print $7":6379"}'`
172.36.4.43:6379 172.36.1.30:6379 172.36.6.21:6379 172.36.5.25:6379 172.36.0.198:6379 172.36.3.134:6379
Creating the cluster
Cluster creation only has to be run once; afterwards the Redis nodes maintain their membership themselves, based on the node metadata files. So we deploy a dedicated pod just for this one-off initialization:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: redis-cluster-manager
  name: redis-cluster-manager
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-cluster-manager
  template:
    metadata:
      labels:
        app: redis-cluster-manager
    spec:
      containers:
      - command:
        - tail
        args:
        - -f
        - /dev/null
        env:
        - name: DB_NAME
          value: redis-cluster-manager
        image: centos:centos7
        imagePullPolicy: Always
        name: redis-cluster-manager
        resources:
          limits:
            cpu: 500m
            memory: 300Mi
          requests:
            cpu: 100m
            memory: 100Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
[root@008019 redis-cluster]# kubectl get pods -o wide --all-namespaces | grep redis-cluster-manager
default redis-cluster-manager-5468b99f7f-lxpw7 1/1 Running 0 87m 172.36.4.42 008031 <none> <none>
[root@008019 redis-cluster]# kubectl exec -it redis-cluster-manager-5468b99f7f-lxpw7 bash
[root@redis-cluster-manager-5468b99f7f-lxpw7 /]#
Install the cluster setup tool inside the manager pod:
cat >> /etc/yum.repos.d/epel.repo<<'EOF'
[epel]
name=Extra Packages for Enterprise Linux 7 - $basearch
baseurl=https://mirrors.tuna.tsinghua.edu.cn/epel/7/$basearch
#mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=$basearch
failovermethod=priority
enabled=1
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
EOF
yum -y install redis-trib.noarch
Initialize the cluster:
# Run the command with the pod IPs collected above; type yes at the prompt
# and, barring surprises, the cluster initializes successfully.
# With these 6 nodes and --replicas 1, the first 3 become masters and the last 3 become their replicas.
redis-trib create --replicas 1 \
172.36.4.43:6379 172.36.1.30:6379 172.36.6.21:6379 172.36.5.25:6379 172.36.0.198:6379 172.36.3.134:6379
...
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
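After initialization it is worth confirming the cluster state; on a live pod that is `redis-cli cluster info`. The sketch below checks the two fields that matter against sample output (the sample text here is assumed for illustration, not captured from this cluster):

```shell
# Sample output of "redis-cli cluster info" (assumed for illustration)
info='cluster_state:ok
cluster_slots_assigned:16384
cluster_known_nodes:6
cluster_size:3'

# A healthy cluster reports state ok with all 16384 slots assigned
echo "$info" | grep -q '^cluster_state:ok' &&
echo "$info" | grep -q '^cluster_slots_assigned:16384' &&
echo "cluster healthy"
```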
Connecting
Exec into one of the Redis pods:
[root@008019 redis-cluster]# kubectl exec -it app-rds-cluster-0 bash
# remember to pass -c when connecting to a cluster with redis-cli
root@app-rds-cluster-0:/var/lib/redis# redis-cli -c
# read and set a key
127.0.0.1:6379> get a
-> Redirected to slot [15495] located at 172.36.6.21:6379
(nil)
172.36.6.21:6379> set a 1
OK
172.36.6.21:6379> get a
"1"
# check this node's role
172.36.6.21:6379> role
1) "master"
2) (integer) 1954
3) 1) 1) "172.36.3.134"
2) "6379"
3) "1954"
172.36.6.21:6379> quit
# Inspect the cluster node metadata file: the first column is the node id, followed by the ip and role; "myself" marks this node's own entry
root@app-rds-cluster-0:/var/lib/redis# cat nodes.conf
bdcda6ce963add5bc9706e912b6fcf4b355e2add 172.36.3.134:6379@16379 slave d24da6b03aa3b614a1429847b3a47836f89dbc07 0 1578042761171 6 connected
f85ce96851823832a9bda4233cc6e3066e97c050 172.36.5.25:6379@16379 slave 561f70b2c94c6e46f7b9588705f2cb48861b1e89 0 1578042760571 4 connected
561f70b2c94c6e46f7b9588705f2cb48861b1e89 172.36.4.43:6379@16379 myself,master - 0 1578042760000 1 connected 0-5460
e5c64aa60dcc73746e196b6c1630019ec0c10ad5 172.36.0.198:6379@16379 slave 7e2a3fa95ef9402bfba7b2e08a27812272640179 0 1578042760000 5 connected
7e2a3fa95ef9402bfba7b2e08a27812272640179 172.36.1.30:6379@16379 master - 0 1578042760000 2 connected 5461-10922
d24da6b03aa3b614a1429847b3a47836f89dbc07 172.36.6.21:6379@16379 master - 0 1578042759569 3 connected 10923-16383
vars currentEpoch 6 lastVoteEpoch 0
root@app-rds-cluster-0:/var/lib/redis#
root@app-rds-cluster-0:/var/lib/redis#
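The same file can be queried with ordinary text tools. As a sketch, this extracts each master's address and slot range from a copy of the entries shown above:

```shell
# Entries copied from the nodes.conf listing above
cat > /tmp/nodes-sample.conf <<'EOF'
bdcda6ce963add5bc9706e912b6fcf4b355e2add 172.36.3.134:6379@16379 slave d24da6b03aa3b614a1429847b3a47836f89dbc07 0 1578042761171 6 connected
f85ce96851823832a9bda4233cc6e3066e97c050 172.36.5.25:6379@16379 slave 561f70b2c94c6e46f7b9588705f2cb48861b1e89 0 1578042760571 4 connected
561f70b2c94c6e46f7b9588705f2cb48861b1e89 172.36.4.43:6379@16379 myself,master - 0 1578042760000 1 connected 0-5460
e5c64aa60dcc73746e196b6c1630019ec0c10ad5 172.36.0.198:6379@16379 slave 7e2a3fa95ef9402bfba7b2e08a27812272640179 0 1578042760000 5 connected
7e2a3fa95ef9402bfba7b2e08a27812272640179 172.36.1.30:6379@16379 master - 0 1578042760000 2 connected 5461-10922
d24da6b03aa3b614a1429847b3a47836f89dbc07 172.36.6.21:6379@16379 master - 0 1578042759569 3 connected 10923-16383
EOF

# Master lines carry a slot range in the last column; slave lines do not.
# ($3 ~ /master/ also matches the "myself,master" entry.)
awk '$3 ~ /master/ {print $2, $NF}' /tmp/nodes-sample.conf
```

This prints the three masters with their slot ranges 0-5460, 5461-10922 and 10923-16383, i.e. the full 16384-slot keyspace split three ways.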
Connecting through the Service's stable IP:
[root@008019 ~]# kubectl get service --all-namespaces | grep app-rds-cluster
default app-rds-cluster ClusterIP 10.123.80.163 <none> 6379/TCP 48m
[root@008019 ~]# kubectl exec -it redis-cluster-manager-5468b99f7f-lxpw7 bash
[root@redis-cluster-manager-5468b99f7f-lxpw7 /]# redis-cli -h 10.123.80.163 -c
10.123.80.163:6379> get a
-> Redirected to slot [15495] located at 172.36.6.22:6379
"1"
Master-replica failover
For a high-availability cluster, failover is a basic requirement; let's see what happens during a switch:
# restart a master by deleting its pod
[root@008019 ~]# kubectl delete pod app-rds-cluster-0
pod "app-rds-cluster-0" deleted
[root@008019 ~]#
[root@008019 ~]# kubectl get pods -o wide --all-namespaces | grep app-rds-cl
default app-rds-cluster-0 1/1 Running 0 20s 172.36.4.44 008031 <none> <none>
default app-rds-cluster-1 1/1 Running 0 31m 172.36.1.30 008020 <none> <none>
default app-rds-cluster-2 1/1 Running 0 31m 172.36.6.21 020203 <none> <none>
default app-rds-cluster-3 1/1 Running 0 31m 172.36.5.25 008032 <none> <none>
default app-rds-cluster-4 1/1 Running 0 31m 172.36.0.198 008019 <none> <none>
default app-rds-cluster-5 1/1 Running 0 31m 172.36.3.134 020204 <none> <none>
# Compared with the listing above, app-rds-cluster-0's IP changed from 172.36.4.43 to 172.36.4.44; connect and check the data and role.
# The former master has become a replica. Note that the IP recorded for this node in nodes.conf was not updated to the new container IP; only the master/replica relationships were rewritten. And no data was lost.
root@app-rds-cluster-0:/var/lib/redis# cat nodes.conf
7e2a3fa95ef9402bfba7b2e08a27812272640179 172.36.1.31:6379@16379 master - 1578044978612 1578044978607 2 connected 5461-10922
e5c64aa60dcc73746e196b6c1630019ec0c10ad5 172.36.0.199:6379@16379 slave 7e2a3fa95ef9402bfba7b2e08a27812272640179 0 1578044978614 5 connected
561f70b2c94c6e46f7b9588705f2cb48861b1e89 172.36.4.43:6379@16379 myself,slave f85ce96851823832a9bda4233cc6e3066e97c050 0 1578044978608 9 connected
bdcda6ce963add5bc9706e912b6fcf4b355e2add 172.36.3.135:6379@16379 slave d24da6b03aa3b614a1429847b3a47836f89dbc07 1578044978612 1578044978608 10 connected
f85ce96851823832a9bda4233cc6e3066e97c050 172.36.5.26:6379@16379 master - 0 1578044978614 11 connected 0-5460
d24da6b03aa3b614a1429847b3a47836f89dbc07 172.36.6.22:6379@16379 master - 1578044978612 1578044978608 10 connected 10923-16383
vars currentEpoch 11 lastVoteEpoch 10
root@app-rds-cluster-0:/var/lib/redis# redis-cli -c
127.0.0.1:6379> get a
-> Redirected to slot [15495] located at 172.36.6.22:6379
"1"
If you are curious, try deleting several pods at once. With 6 nodes and 3 masters, a failover is decided by a Raft-like majority vote among the masters, and each master has a replica standing by to take over. A single node failure, master or replica, never interrupts service; the cluster only stops serving if a master and its replica (covering the same slots) fail together, or if enough masters fail at once that no voting majority remains.
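The majority arithmetic behind that claim can be sketched; this is just the standard floor(n/2)+1 quorum over the masters:

```shell
masters=3
# A replica can only be promoted if a majority of masters vote for it
majority=$((masters / 2 + 1))
echo "masters: $masters, votes needed for failover: $majority"
# With 3 masters, 2 must still be reachable for any failover to succeed,
# which is why losing 2 masters simultaneously takes the cluster down.
```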