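The Sentinel process monitors the state of the Master server in a Redis cluster. When the Master fails, Sentinel switches the roles of the Master and a Slave so that the system stays highly available. Sentinel has shipped with Redis since version 2.6, but it only became stable in 2.8, so a release later than 2.8 is recommended for production.
Sentinel is a distributed system: several Sentinel processes can run in one architecture. They use a gossip protocol to exchange information about whether the Master is offline, and an agreement (voting) protocol to decide whether to perform an automatic failover and which Slave to promote to the new Master.
Each Sentinel process periodically sends messages to the other Sentinels, the Master, and the Slaves to confirm that they are still alive. If a peer does not respond within the configured time, that Sentinel provisionally treats it as offline. This is the "subjectively down" state (Subjectively Down, SDOWN). It is subjective because it is the private view of one Sentinel, which may or may not match the view of the others.
Where there is a subjective state there is also an objective one. When the number of Sentinel processes that have marked the Master SDOWN reaches the configured quorum, and they have exchanged that judgment with the SENTINEL is-master-down-by-addr command, the Master is considered "objectively down" (Objectively Down, ODOWN), that is, down as a fact that no longer depends on the view of a single Sentinel. A voting algorithm then picks one of the remaining slave nodes and promotes it to Master, the related configuration is updated automatically, and the failover is started.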
The Sentinel mechanism solves the problem of switching the master and slave roles.
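The difference between SDOWN and ODOWN can be sketched in a few lines of Python. This is only an illustration, not Sentinel's implementation: the function names `is_odown` and `can_authorize_failover` and the list of reports are hypothetical. It assumes that ODOWN is reached once the number of Sentinels reporting SDOWN reaches the configured quorum, and that the Sentinel leading the failover additionally needs votes from a majority of all Sentinels (and at least quorum votes).

```python
def is_odown(sdown_reports, quorum):
    """Return True when enough Sentinels report the master as SDOWN."""
    return sum(1 for down in sdown_reports if down) >= quorum


def can_authorize_failover(votes_for_leader, total_sentinels, quorum):
    """A leader needs a majority of all Sentinels and at least `quorum` votes."""
    majority = total_sentinels // 2 + 1
    return votes_for_leader >= max(majority, quorum)


# Three Sentinels with a quorum of 2.
reports = [True, True, False]  # two Sentinels see the master as down
print(is_odown(reports, quorum=2))                             # True
print(can_authorize_failover(2, total_sentinels=3, quorum=2))  # True
print(can_authorize_failover(1, total_sentinels=3, quorum=2))  # False
```

With three Sentinels and a quorum of 2, one Sentinel can be lost and a failover is still possible; with only two Sentinels left reachable out of five, the majority condition would fail even if the quorum were met.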
Environment
Hostname | IP address |
---|---|
Master | 192.168.36.110 |
Slave-1 | 192.168.36.111 |
Slave-2 | 192.168.36.112 |
Before starting, make sure the Redis service is running on all three hosts
[root@Master ~]#ss -ntl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:6379 *:*
[root@Slave-1 ~]#ss -ntl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:6379 *:*
[root@Slave-2 ~]#ss -ntl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:6379 *:*
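Manually configuring the master
A Redis server is a master by default. Once the master has been chosen, the other servers are configured as slaves of that master, because Sentinel presupposes a Redis master-slave environment that has already been set up by hand.
Configuring Slave-1 as a slave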
[root@Slave-1 ~]#vim /apps/redis/etc/redis.conf
....
slaveof 192.168.36.110 6379 # slaveof points to the master
masterauth 123456
....
[root@Slave-1 ~]#ps -ef | grep redis
root 7397 1 0 10:45 ? 00:00:01 redis-server 0.0.0.0:6379
root 7484 7349 0 10:54 pts/0 00:00:00 grep --color=auto redis
[root@Slave-1 ~]#kill -9 7397 # terminate the process
[root@Slave-1 ~]#redis-server /apps/redis/etc/redis.conf # restart with the updated configuration file
Configuring Slave-2 as a slave
[root@Slave-2 ~]#vim /apps/redis/etc/redis.conf
....
slaveof 192.168.36.110 6379
masterauth 123456
....
[root@Slave-2 ~]#ps -ef | grep redis
root 8017 1 0 10:44 ? 00:00:01 redis-server 0.0.0.0:6379
root 8173 7926 0 10:56 pts/0 00:00:00 grep --color=auto redis
[root@Slave-2 ~]#kill 8017
[root@Slave-2 ~]#redis-server /apps/redis/etc/redis.conf # restart with the updated configuration file
Checking status
# Slave-1 status
[root@Slave-1 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> INFO replication
# Replication
role:slave # now a slave
master_host:192.168.36.110
master_port:6379
master_link_status:up # the replication link to the master is up
master_last_io_seconds_ago:8
master_sync_in_progress:0
slave_repl_offset:84
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:99a1dcabb930a97bbdea90450b2f891778c83e37
master_replid2:0000000000000000000000000000000000000000 # holds the previous master_replid; after a failover it contains the replication ID that was in use before the switch
master_repl_offset:84
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:84
# Slave-2 status
[root@Slave-2 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> INFO replication
# Replication
role:slave # now a slave
master_host:192.168.36.110
master_port:6379
master_link_status:up # the replication link to the master is up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:224
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:99a1dcabb930a97bbdea90450b2f891778c83e37
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:224
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:155
repl_backlog_histlen:70
# Master status
[root@Master ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> INFO replication
# Replication
role:master
connected_slaves:2 # two slaves: Slave-1 and Slave-2 have joined
slave0:ip=192.168.36.111,port=6379,state=online,offset=336,lag=1
slave1:ip=192.168.36.112,port=6379,state=online,offset=336,lag=1
master_replid:99a1dcabb930a97bbdea90450b2f891778c83e37
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:336
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:336
# Both slaves now replicate the master's data; they can be read from but not written to
127.0.0.1:6379> KEYS *
1) "key3"
2) "key2"
3) "key1"
127.0.0.1:6379> SET key5 value5
(error) READONLY You can't write against a read only slave.
127.0.0.1:6379> GET key3
"value4"
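The manual check of `INFO replication` above can also be done programmatically. The following is a minimal sketch that works on the text of the reply only; `parse_info` and `replica_is_healthy` are hypothetical helper names, and the sample is a shortened copy of the output shown above without the annotations.

```python
def parse_info(text):
    """Turn INFO output into a dict, skipping blank lines and '# Section' headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


def replica_is_healthy(info, expected_master):
    """True if the node is a slave of the expected master and the link is up."""
    return (
        info.get("role") == "slave"
        and info.get("master_host") == expected_master
        and info.get("master_link_status") == "up"
    )


sample = """# Replication
role:slave
master_host:192.168.36.110
master_port:6379
master_link_status:up
"""
info = parse_info(sample)
print(replica_is_healthy(info, "192.168.36.110"))  # True
```

In a real script the text would come from `redis-cli INFO replication` or a client library; that part is left out here so the sketch stays independent of a running server.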
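Editing the Sentinel configuration file on all three servers
# Redis was compiled from source here, so the Sentinel configuration file has to be copied from the source tree with cp
# With a yum installation the Sentinel configuration file already exists and nothing needs to be copied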
[root@Master ~]#cp /root/redis-4.0.14/sentinel.conf /apps/redis/etc/
Master configuration
[root@Master ~]#vim /apps/redis/etc/sentinel.conf
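# The trailing comments below are annotations for this article; keep them out of the real file, because Redis only treats lines that start with # as comments
[root@Master ~]#grep "^[a-zA-Z]" /apps/redis/etc/sentinel.conf
bind 0.0.0.0
port 26379
daemonize yes
#pidfile "redis-sentinel.pid"
logfile "sentinel_26379.log"
dir "/apps/redis/logs"
sentinel monitor mymaster 192.168.36.110 6379 2 # quorum: how many Sentinels must agree that the master is down before a failover is triggered
sentinel auth-pass mymaster 123456
sentinel down-after-milliseconds mymaster 10000 # time without a valid reply before the master is marked subjectively down (SDOWN), in milliseconds
sentinel parallel-syncs mymaster 1 # number of slaves that sync from the new master at the same time during a failover; the smaller the number, the longer the total sync takes
sentinel failover-timeout mymaster 180000 # timeout for pointing all slaves at the new master, in milliseconds
sentinel deny-scripts-reconfig yes # forbid changing the script settings at runtime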
# Copy the configuration file to the two slave nodes with scp
[root@Master redis-4.0.14]#scp /apps/redis/etc/sentinel.conf 192.168.36.111:/apps/redis/etc/
root@192.168.36.111's password:
sentinel.conf 100% 282 214.2KB/s 00:00
[root@Master redis-4.0.14]#scp /apps/redis/etc/sentinel.conf 192.168.36.112:/apps/redis/etc/
root@192.168.36.112's password:
sentinel.conf 100% 282 267.0KB/s 00:00
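Because the same file is copied to every node, a mistake in the `sentinel monitor` line spreads to all Sentinels. The sketch below shows one way to sanity-check that directive before distributing the file. The helper names `parse_monitor` and `check_quorum` are hypothetical, and the checks are deliberately minimal: they only catch a quorum that can never be reached.

```python
def parse_monitor(line):
    """Parse 'sentinel monitor <name> <ip> <port> <quorum>' into a dict."""
    parts = line.split("#", 1)[0].split()  # drop any trailing annotation
    if len(parts) != 6 or parts[:2] != ["sentinel", "monitor"]:
        raise ValueError("not a sentinel monitor directive: %r" % line)
    name, ip, port, quorum = parts[2:]
    return {"name": name, "ip": ip, "port": int(port), "quorum": int(quorum)}


def check_quorum(quorum, sentinel_count):
    """Return a list of problems; an empty list means the value looks sane."""
    problems = []
    if quorum < 1:
        problems.append("quorum must be at least 1")
    if quorum > sentinel_count:
        problems.append("quorum exceeds the number of Sentinels, ODOWN can never be reached")
    return problems


monitor = parse_monitor("sentinel monitor mymaster 192.168.36.110 6379 2")
print(monitor["ip"], monitor["quorum"])                   # 192.168.36.110 2
print(check_quorum(monitor["quorum"], sentinel_count=3))  # []
```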
Starting Sentinel
[root@Master ~]#redis-sentinel /apps/redis/etc/sentinel.conf
[root@Master ~]#ss -ntl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 511 *:26379 *:*
[root@Slave-1 ~]#redis-sentinel /apps/redis/etc/sentinel.conf
[root@Slave-2 ~]#redis-sentinel /apps/redis/etc/sentinel.conf
Sentinel log
[root@Master ~]#tail -f /apps/redis/logs/sentinel_26379.log
14129:X 14 Jun 16:23:34.697 # Sentinel is now ready to exit, bye bye...
14134:X 14 Jun 16:23:40.985 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
14134:X 14 Jun 16:23:40.985 # Redis version=4.0.14, bits=64, commit=00000000, modified=0, pid=14134, just started
14134:X 14 Jun 16:23:40.985 # Configuration loaded
14134:X 14 Jun 16:23:40.986 * Increased maximum number of open files to 10032 (it was originally set to 1024).
14134:X 14 Jun 16:23:40.987 * Running mode=sentinel, port=26379.
14134:X 14 Jun 16:23:40.987 # Sentinel ID is 69d6647e2c6236b5b72d8e943b5d5707db47b9a4
14134:X 14 Jun 16:23:40.987 # +monitor master mymaster 192.168.36.110 6379 quorum 2
14134:X 14 Jun 16:23:43.015 * +sentinel sentinel abeb0c89a25c690b5cbe09491de6ab822deee15e 192.168.36.112 26379 @ mymaster 192.168.36.110 6379
14134:X 14 Jun 16:23:43.050 * +sentinel sentinel 4d3b7eb172aaef1a58b35c1a567534c67f3977ef 192.168.36.111 26379 @ mymaster 192.168.36.110 6379
Checking status
[root@Master ~]#redis-cli -h 192.168.36.110 -p 26379 # query through Sentinel's port 26379
192.168.36.110:26379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=192.168.36.110:6379,slaves=2,sentinels=3
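The `master0:` line packs several values into one comma-separated string. If that line is to be checked by a monitoring script, it can be split as in the sketch below; `parse_master_line` is a hypothetical helper that only handles the fields visible in the output above.

```python
def parse_master_line(line):
    """Parse 'master0:name=...,status=...,address=ip:port,slaves=N,sentinels=N'."""
    _, _, fields = line.partition(":")  # drop the 'master0' prefix
    result = dict(item.split("=", 1) for item in fields.split(","))
    host, _, port = result["address"].rpartition(":")
    result["address"] = (host, int(port))
    result["slaves"] = int(result["slaves"])
    result["sentinels"] = int(result["sentinels"])
    return result


line = "master0:name=mymaster,status=ok,address=192.168.36.110:6379,slaves=2,sentinels=3"
state = parse_master_line(line)
print(state["status"], state["address"], state["sentinels"])
```

A check such as `state["sentinels"] == 3` is a quick way to confirm that all three Sentinels have discovered each other.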
Sentinel log changes when the Redis service on the master node is stopped
[root@Master ~]#tail -f /apps/redis/logs/sentinel_26379.log
14232:X 14 Jun 16:29:58.189 # +sdown master mymaster 192.168.36.110 6379
14232:X 14 Jun 16:29:58.218 # +new-epoch 1
14232:X 14 Jun 16:29:58.219 # +vote-for-leader 4d3b7eb172aaef1a58b35c1a567534c67f3977ef 1
14232:X 14 Jun 16:29:58.266 # +odown master mymaster 192.168.36.110 6379 #quorum 3/2
14232:X 14 Jun 16:29:58.266 # Next failover delay: I will not start a failover before Fri Jun 14 16:35:58 2019
14232:X 14 Jun 16:29:59.468 # +config-update-from sentinel 4d3b7eb172aaef1a58b35c1a567534c67f3977ef 192.168.36.111 26379 @ mymaster 192.168.36.110 6379
14232:X 14 Jun 16:29:59.468 # +switch-master mymaster 192.168.36.110 6379 192.168.36.111 6379
14232:X 14 Jun 16:29:59.469 * +slave slave 192.168.36.112:6379 192.168.36.112 6379 @ mymaster 192.168.36.111 6379
14232:X 14 Jun 16:29:59.469 * +slave slave 192.168.36.110:6379 192.168.36.110 6379 @ mymaster 192.168.36.111 6379
14232:X 14 Jun 16:30:29.507 # +sdown slave 192.168.36.110:6379 192.168.36.110 6379 @ mymaster 192.168.36.111 6379
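The event that matters most in this log is `+switch-master`, which carries the old and the new master address. A minimal sketch for extracting it from log text follows; the function name `find_switches` is hypothetical, and the regular expression is written against the log format shown above only.

```python
import re

# +switch-master <name> <old-ip> <old-port> <new-ip> <new-port>
SWITCH_RE = re.compile(
    r"\+switch-master (?P<name>\S+) (?P<old_ip>\S+) (?P<old_port>\d+) "
    r"(?P<new_ip>\S+) (?P<new_port>\d+)"
)


def find_switches(log_text):
    """Return (name, old_address, new_address) for every +switch-master event."""
    events = []
    for match in SWITCH_RE.finditer(log_text):
        events.append((
            match["name"],
            (match["old_ip"], int(match["old_port"])),
            (match["new_ip"], int(match["new_port"])),
        ))
    return events


log = """14232:X 14 Jun 16:29:58.189 # +sdown master mymaster 192.168.36.110 6379
14232:X 14 Jun 16:29:59.468 # +switch-master mymaster 192.168.36.110 6379 192.168.36.111 6379
"""
for name, old, new in find_switches(log):
    print(name, old, "->", new)
```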
Checking the Sentinel information
[root@Master ~]#redis-cli -h 192.168.36.110 -p 26379
192.168.36.110:26379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=192.168.36.111:6379,slaves=2,sentinels=3
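Since the master address changes after a failover, clients usually ask a Sentinel for the current master instead of hard-coding an IP. Sentinel answers the command `SENTINEL get-master-addr-by-name <name>` with the IP and port of the current master. The sketch below only shows the wire format (RESP) of that request and how a two-element reply could be decoded; `encode_command` and `decode_addr_reply` are hypothetical helpers, no network connection is made, and the sample reply is constructed by hand for illustration.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def decode_addr_reply(reply):
    """Decode a RESP array of two bulk strings (ip, port) into (ip, port)."""
    lines = reply.split(b"\r\n")
    if lines[0] != b"*2":
        raise ValueError("unexpected reply: %r" % reply)
    # lines is: *2, $<len>, ip, $<len>, port, ''
    return lines[2].decode(), int(lines[4])


request = encode_command("SENTINEL", "get-master-addr-by-name", "mymaster")
print(request)

# Hand-written reply in the shape Sentinel would send after the failover above.
sample_reply = b"*2\r\n$14\r\n192.168.36.111\r\n$4\r\n6379\r\n"
print(decode_addr_reply(sample_reply))  # ('192.168.36.111', 6379)
```

In practice a client library with Sentinel support would handle this exchange; the point here is only that the lookup is a single request against port 26379.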
Changes to the Redis configuration files after failover
# After failover, the master IP in the slaveof line of redis.conf is rewritten, and so is the IP in the sentinel monitor line of sentinel.conf
[root@Slave-1 ~]#cat /apps/redis/etc/sentinel.conf
bind 0.0.0.0
port 26379
logfile "sentinel_26379.log"
dir "/apps/redis/logs"
sentinel myid 4d3b7eb172aaef1a58b35c1a567534c67f3977ef
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 192.168.36.111 6379 2
sentinel auth-pass mymaster 123456
sentinel config-epoch mymaster 1
# Generated by CONFIG REWRITE
sentinel leader-epoch mymaster 1
sentinel known-slave mymaster 192.168.36.110 6379
sentinel known-slave mymaster 192.168.36.112 6379
sentinel known-sentinel mymaster 192.168.36.110 26379 69d6647e2c6236b5b72d8e943b5d5707db47b9a4
sentinel known-sentinel mymaster 192.168.36.112 26379 abeb0c89a25c690b5cbe09491de6ab822deee15e
sentinel current-epoch 1
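The rewritten file now describes the whole topology as this Sentinel sees it: the monitored master, the known slaves, and the other Sentinels. The following sketch collects those three pieces from the file text. `read_topology` is a hypothetical helper, and it assumes the Redis 4.0 directive names shown above (`known-slave`, `known-sentinel`).

```python
def read_topology(conf_text, master_name):
    """Collect master, known slaves and known Sentinels from sentinel.conf text."""
    topo = {"master": None, "slaves": [], "sentinels": []}
    for raw in conf_text.splitlines():
        parts = raw.split()
        # Only 'sentinel <kind> <master-name> <host> <port> ...' lines are relevant.
        if len(parts) < 5 or parts[0] != "sentinel" or parts[2] != master_name:
            continue
        kind, host, port = parts[1], parts[3], int(parts[4])
        if kind == "monitor":
            topo["master"] = (host, port)
        elif kind == "known-slave":
            topo["slaves"].append((host, port))
        elif kind == "known-sentinel":
            topo["sentinels"].append((host, port))
    return topo


conf = """sentinel monitor mymaster 192.168.36.111 6379 2
sentinel auth-pass mymaster 123456
sentinel known-slave mymaster 192.168.36.110 6379
sentinel known-slave mymaster 192.168.36.112 6379
sentinel known-sentinel mymaster 192.168.36.110 26379 69d6647e2c6236b5b72d8e943b5d5707db47b9a4
"""
print(read_topology(conf, "mymaster"))
```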
Current Redis status
[root@Slave-1 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> INFO replication
# Replication
role:master # Slave-1 has become the master node
connected_slaves:1
slave0:ip=192.168.36.112,port=6379,state=online,offset=162954,lag=1
master_replid:e95e0241596bd1073ca558fc7cb892a7a6b4dbe6 # current master_replid after failover
master_replid2:305f29a1bce5172f4c7e263de0d346fd33362d4d # master_replid before failover
master_repl_offset:163240
second_repl_offset:72111
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:163240
[root@Slave-2 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> INFO replication
# Replication
role:slave
master_host:192.168.36.111 # IP address of the new master after failover
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:187718
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:e95e0241596bd1073ca558fc7cb892a7a6b4dbe6
master_replid2:305f29a1bce5172f4c7e263de0d346fd33362d4d
master_repl_offset:187718
second_repl_offset:72111
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:71
repl_backlog_histlen:187648
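The two replication IDs and `second_repl_offset` explain why Slave-2 could follow the new master without copying the whole data set again. The sketch below is a simplified reading of that rule, not Redis source code: a replica may continue with a partial resync if it presents the master's current ID, or the previous ID with an offset no later than `second_repl_offset`, and if the requested offset is still inside the backlog. `can_partial_resync` is a hypothetical name, and the numbers are taken from the new master's `INFO replication` output above.

```python
def can_partial_resync(master, replica_replid, replica_offset):
    """Simplified check of whether a replica can resume instead of doing a full sync."""
    same_history = replica_replid == master["master_replid"]
    previous_history = (
        replica_replid == master["master_replid2"]
        and replica_offset <= master["second_repl_offset"]
    )
    if not (same_history or previous_history):
        return False
    # The requested offset must still be available in the replication backlog.
    first = master["repl_backlog_first_byte_offset"]
    last = first + master["repl_backlog_histlen"]
    return first <= replica_offset <= last


new_master = {
    "master_replid": "e95e0241596bd1073ca558fc7cb892a7a6b4dbe6",
    "master_replid2": "305f29a1bce5172f4c7e263de0d346fd33362d4d",
    "second_repl_offset": 72111,
    "repl_backlog_first_byte_offset": 1,
    "repl_backlog_histlen": 163240,
}
old_id = "305f29a1bce5172f4c7e263de0d346fd33362d4d"
print(can_partial_resync(new_master, old_id, 72000))  # True
print(can_partial_resync(new_master, old_id, 80000))  # False
```

The second call fails because offset 80000 lies beyond the point where the old replication history ended, so data at that offset under the old ID never existed on the new master.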