Pitfalls of LeaderLatch, the leader election recipe in Curator (the ZooKeeper framework)

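The business code below follows the LeaderLatch sample code found online. The serverId it uses is read from configuration (it is generated automatically for each newly deployed instance and never changes afterwards).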

@PostConstruct
public void setUp() throws Exception {
    RetryPolicy retryPolicy = new ExponentialBackoffRetry(1000, 3);
    client = CuratorFrameworkFactory.builder()
            .connectString(zkConnectString)
            .retryPolicy(retryPolicy)
            .sessionTimeoutMs(60000)
            .connectionTimeoutMs(3000)
            .namespace(Constants.GLOBAL_NAME_SPACE)
            .build();
    client.start();

    leaderLatch = new LeaderLatch(client, QUORUM_PATH, serverId);
    leaderLatch.addListener(new LeaderLatchListener() {
        @Override
        public void isLeader() {
            log.info("Currently run as leader");
        }

        @Override
        public void notLeader() {
            log.info("Currently run as slave");
        }
    });
    leaderLatch.start();
}

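The expected behavior of this code is that the log prints "Currently run as leader" while the instance is running as leader, and "Currently run as slave" when it loses leadership.
With multiple instances running, the initial election works fine: exactly one instance becomes leader. However, an instance that loses leadership does not switch to running as a slave. I tested losing leadership in two ways: first, by cutting the connection between the instance and zk; second, by deleting the data that the instance holds on zk (for example, if the LeaderLatch path is /test, each instance creates an ephemeral node under /test, and I deleted that ephemeral node manually).
Written this way, it is also hard to add a buffer period around leader/slave transitions to keep the role from flapping back and forth.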

So I removed the listener from the code and switched to active polling:

@PostConstruct
public void setUp() throws Exception {
    RetryPolicy retryPolicy = new ExponentialBackoffRetry(1000, 3);
    client = CuratorFrameworkFactory.builder()
            .connectString(zkConnectString)
            .retryPolicy(retryPolicy)
            .sessionTimeoutMs(60000)
            .connectionTimeoutMs(3000)
            .namespace(Constants.GLOBAL_NAME_SPACE)
            .build();
    client.start();

    leaderLatch = new LeaderLatch(client, QUORUM_PATH, serverId);
    leaderLatch.start();
}

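// Debounce state. Assumed to be instance fields of the enclosing bean;
// the original snippet does not show their declarations.
private int isLeaderCount = 0;
private int isSlaveCount = 0;
private boolean isLeader = false;

@Scheduled(fixedDelay = 3000)
public void checkLeader() throws Exception {
    // First use serverId to check whether this instance is still among the
    // LeaderLatch participants. This covers network stalls, zk data being
    // lost unexpectedly, and similar cases.
    boolean isExist = false;
    Collection<Participant> participants = leaderLatch.getParticipants();
    for (Participant participant : participants) {
        if (serverId.equals(participant.getId())) {
            isExist = true;
            break;
        }
    }
    // If it is missing, rejoin the election.
    if (!isExist) {
        log.info("Current server does not exist on zk, reset leaderlatch");
        leaderLatch.close();
        leaderLatch = new LeaderLatch(client, QUORUM_PATH, serverId);
        leaderLatch.start();
        log.info("Successfully reset leaderlatch");
    }

    // Check whether the current leader is this instance.
    // Note: leaderLatch.hasLeadership() is not used here because zk data
    // may be lost; compare serverId with the leader's id instead.
    Participant leader = leaderLatch.getLeader();
    boolean hasLeadership = serverId.equals(leader.getId());

    if (log.isInfoEnabled()) {
        log.info("Current Participant: {}", JSON.toJSONString(participants));
        log.info("Current Leader: {}", leader);
    }

    // Leader/slave switch buffer: count consecutive identical observations.
    if (hasLeadership) {
        isLeaderCount++;
        isSlaveCount = 0;
    } else {
        isLeaderCount = 0;
        isSlaveCount++;
    }

    // Switch roles only after more than 3 consecutive observations, and
    // record the new role so each transition is logged once.
    if (isLeaderCount > 3 && !isLeader) {
        isLeader = true;
        log.info("Currently run as leader");
    }

    if (isSlaveCount > 3 && isLeader) {
        isLeader = false;
        log.info("Currently run as slave");
    }
}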
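
The switch buffer at the end of checkLeader() can also be looked at on its own, without zk. Below is a minimal sketch of that counting logic pulled into a small class so it can be tested in isolation. The class name LeadershipDebouncer and its methods are hypothetical names made up for this illustration; they are not part of Curator, and the threshold of 3 simply mirrors the "> 3" comparison in the code above.

```java
/**
 * Hypothetical helper (not part of Curator): debounces raw leadership
 * observations so the reported role only flips after the same observation
 * has been seen more than `threshold` times in a row.
 */
final class LeadershipDebouncer {
    private final int threshold;
    private int leaderCount = 0;   // consecutive "I am leader" observations
    private int slaveCount = 0;    // consecutive "I am not leader" observations
    private boolean leader = false; // current stable role; starts as slave

    LeadershipDebouncer(int threshold) {
        if (threshold < 0) {
            throw new IllegalArgumentException("threshold must be >= 0");
        }
        this.threshold = threshold;
    }

    /**
     * Feed one polling result.
     *
     * @param observedLeader whether this poll saw the current instance as leader
     * @return true if the stable role changed as a result of this observation
     */
    boolean observe(boolean observedLeader) {
        if (observedLeader) {
            leaderCount++;
            slaveCount = 0;
        } else {
            leaderCount = 0;
            slaveCount++;
        }
        if (leaderCount > threshold && !leader) {
            leader = true;
            return true;
        }
        if (slaveCount > threshold && leader) {
            leader = false;
            return true;
        }
        return false;
    }

    /** Current stable (debounced) role. */
    boolean isLeader() {
        return leader;
    }
}
```

In checkLeader(), the result of serverId.equals(leader.getId()) would be passed to observe(), and the "Currently run as leader" / "Currently run as slave" lines would be logged only when observe() returns true. With a threshold of 3 and polling every 3 seconds, a role change takes four consecutive identical polls, so roughly ten seconds, and a single inconsistent poll in between resets the count.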