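ZooKeeper is widely used in distributed services, for example for leader election. This post is a brief introduction for beginners.
Basic concepts
As we know, ZooKeeper stores its data as a tree of nodes. A basic election and service registry can be built on that tree as follows: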
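1. After startup, every host in the cluster watches for the deletion event of the /master node.
2. Each host tries to create /master as an ephemeral node. On success it stores its own information in the node; on failure it reads the current master's information from the node.
3. Service registration: each host creates a child node under /serverList and stores its own information in it.
4. Service discovery: a service consumer lists the children of /serverList to obtain the information of every provider host.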
This is the basic idea behind leader election with ZooKeeper.
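However, this scheme has a problem: the herd effect. When the service runs on many machines and an election is triggered, all of them are notified at once, and the resulting burst of requests puts considerable pressure on ZooKeeper.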
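To make the herd effect concrete, here is a minimal in-memory sketch. It does not talk to ZooKeeper: the class HerdEffectSketch and its methods are hypothetical stand-ins, with an AtomicReference playing the role of the /master node. It only counts how many create attempts a single deletion of /master triggers.

```java
import java.util.concurrent.atomic.AtomicReference;

// In-memory stand-in for the /master node; this is NOT a ZooKeeper client.
class HerdEffectSketch {
    // Holds the current master's name, or null when /master does not exist.
    static final AtomicReference<String> master = new AtomicReference<>();

    // Mimics "create /master": succeeds only if the node is currently absent.
    static boolean tryCreateMaster(String host) {
        return master.compareAndSet(null, host);
    }

    // Simulates the deletion of /master while `hosts` hosts are watching it.
    // Returns how many create requests reach the (simulated) server.
    static int electAfterMasterDeleted(int hosts) {
        master.set(null);                  // the old master's node disappears
        int requests = 0;
        for (int i = 0; i < hosts; i++) {  // every watcher fires ...
            requests++;                    // ... and every host sends a create request
            tryCreateMaster("host" + i);   // only the first one succeeds
        }
        return requests;
    }

    public static void main(String[] args) {
        int requests = electAfterMasterDeleted(100);
        System.out.println("requests=" + requests + ", master=" + master.get());
    }
}
```

With 100 watching hosts, one deletion causes 100 create attempts, 99 of which fail. That wasted burst is the load the scheme above puts on the server.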
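The improvement still relies on ZooKeeper, this time on ephemeral sequential nodes. Every machine creates an ephemeral sequential node under a common root path, and each machine that did not become leader watches only the deletion of the node immediately before its own. When that node is deleted, the machine re-checks whether it is now the leader. This way usually only one machine takes part in each re-election, and even if several machines fail at the same time, the number of machines involved stays small.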
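The "watch only your predecessor" rule can be sketched in plain Java, without a ZooKeeper connection. The class PredecessorWatchSketch and the node names below are hypothetical and only illustrate the ordering logic; they assume all names share one prefix and end in a fixed-width sequence number.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PredecessorWatchSketch {
    // Returns the node that `self` should watch, or null if `self` has the
    // smallest sequence number and is therefore the leader.
    static String nodeToWatch(List<String> children, String self) {
        List<String> sorted = new ArrayList<>(children);
        // Same prefix plus fixed-width suffix, so lexicographic order equals sequence order.
        Collections.sort(sorted);
        int index = sorted.indexOf(self);
        if (index < 0) {
            throw new IllegalArgumentException("unknown node: " + self);
        }
        return index == 0 ? null : sorted.get(index - 1);
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList(
                "node-0000000002", "node-0000000000", "node-0000000001");
        for (String child : children) {
            String watched = nodeToWatch(children, child);
            System.out.println(child + (watched == null ? " is leader" : " watches " + watched));
        }
    }
}
```

Because each node has exactly one watcher, deleting any single node wakes up at most one machine.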
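recipes implementation
Based on this idea, the Curator recipes provide two ways to implement election: LeaderLatch and LeaderSelector. Both invoke a corresponding listener callback when the instance becomes leader.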
LeaderLatch
LeaderLatch is used together with LeaderLatchListener.
Example:
import java.util.ArrayList;
import java.util.List;

import org.apache.curator.RetryPolicy;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.framework.recipes.leader.LeaderLatchListener;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class LeadLatchMain {
    public static void main(String[] args) throws Exception {
        List<LeaderLatch> leaders = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            RetryPolicy retryPolicy = new ExponentialBackoffRetry(1000, 3);
            CuratorFramework client = CuratorFrameworkFactory.builder()
                    .connectString("172.16.59.154:2181")
                    .sessionTimeoutMs(5000)
                    .connectionTimeoutMs(5000)
                    .retryPolicy(retryPolicy)
                    .namespace("base")
                    .build();
            String name = "client" + i;
            // Specify the client, the election path and a participant id
            // (the id is what getId() returns in checkLeader below)
            LeaderLatch leader = new LeaderLatch(client, "/master", name);
            leader.addListener(new LeaderLatchListener() {
                @Override
                public void isLeader() {
                    // Called back when this instance becomes leader
                    System.out.println("isLeader " + name + ", Id :" + leader.getOurPath());
                    try {
                        List<String> children = client.getChildren().forPath("/master");
                        children.forEach(c -> System.out.println(c));
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }

                @Override
                public void notLeader() {
                    // Called back when this instance loses leadership
                    System.out.println("notLeader " + name);
                }
            });
            leaders.add(leader);
            client.start(); // connect to the ZooKeeper server
            leader.start(); // start taking part in the election
        }
        // checkLeader(leaders);
        Thread.sleep(Integer.MAX_VALUE);
    }

    private static void checkLeader(List<LeaderLatch> leaderLatchList) throws Exception {
        // Leader election takes time, so wait 10 seconds
        Thread.sleep(10000);
        for (int i = 0; i < leaderLatchList.size(); i++) {
            LeaderLatch leaderLatch = leaderLatchList.get(i);
            // hasLeadership() tells whether this instance is currently the leader
            if (leaderLatch.hasLeadership()) {
                System.out.println("Current leader: " + leaderLatch.getId());
                // Give up leadership so that the others compete again
                leaderLatch.close();
                checkLeader(leaderLatchList);
            }
        }
    }
}
Next, let's look at the source code: open the start method of LeaderLatch and step into it.
The core of the implementation is this function:
private void checkLeadership(List<String> children) throws Exception {
    if ( debugCheckLeaderShipLatch != null ) {
        debugCheckLeaderShipLatch.await();
    }
    final String localOurPath = ourPath.get();
    // Sort the children by sequence number
    List<String> sortedChildren = LockInternals.getSortedChildren(LOCK_NAME, sorter, children);
    // Find the position of our own node in the sorted list
    int ourIndex = (localOurPath != null) ? sortedChildren.indexOf(ZKPaths.getNodeFromPath(localOurPath)) : -1;
    if ( ourIndex < 0 ) {
        log.error("Can't find our node. Resetting. Index: " + ourIndex);
        reset(); // our node is not among the children: reset and re-enter the election
    }
    else if ( ourIndex == 0 ) {
        // Our node has the smallest sequence number, so we become leader
        setLeadership(true);
    }
    else {
        // Not the leader: get the path of the node just before ours and watch it
        String watchPath = sortedChildren.get(ourIndex - 1);
        Watcher watcher = new Watcher() {
            @Override
            public void process(WatchedEvent event)
            {
                // Only when our state is STARTED and the watched predecessor was deleted
                // do we call getChildren() again to re-run the election
                if ( (state.get() == State.STARTED) && (event.getType() == Event.EventType.NodeDeleted) && (localOurPath != null) ) {
                    try {
                        getChildren();
                    } catch ( Exception ex ) {
                        ThreadUtils.checkInterrupted(ex);
                        log.error("An error occurred checking the leadership.", ex);
                    }
                }
            }
        };
        BackgroundCallback callback = new BackgroundCallback() {
            @Override
            public void processResult(CuratorFramework client, CuratorEvent event) throws Exception {
                // The watch could not be set because the predecessor no longer exists: re-enter the election
                if ( event.getResultCode() == KeeperException.Code.NONODE.intValue() ) {
                    // previous node is gone - reset
                    reset();
                }
            }
        };
        // use getData() instead of exists() to avoid leaving unneeded watchers which is a type of resource leak
        client.getData().usingWatcher(watcher).inBackground(callback).forPath(ZKPaths.makePath(latchPath, watchPath));
    }
}
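The sorting step matters because the children cannot simply be compared by their full names. The sketch below illustrates this under an assumption: child names are taken to look like a random prefix, then the lock name "latch-", then the sequence number. The class SequenceSortSketch, its methods and the sample names are hypothetical and only approximate what LockInternals.getSortedChildren does; this is not Curator's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SequenceSortSketch {
    // Extracts the part after the last occurrence of lockName (the sequence number).
    static String sequencePart(String nodeName, String lockName) {
        int index = nodeName.lastIndexOf(lockName);
        return index >= 0 ? nodeName.substring(index + lockName.length()) : nodeName;
    }

    // Sorts children by sequence number only, ignoring any prefix before lockName.
    static List<String> sortBySequence(List<String> children, String lockName) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparing((String name) -> sequencePart(name, lockName)));
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up names: the prefixes are in the opposite order of the sequence numbers.
        List<String> children = Arrays.asList(
                "_c_b2a1-latch-0000000002",
                "_c_a9f0-latch-0000000003",
                "_c_f3c7-latch-0000000001");
        System.out.println(sortBySequence(children, "latch-"));
    }
}
```

Sorting the full names would put the node with sequence number 3 first, so the wrong instance would consider itself leader. Sorting on the suffix keeps the creation order.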
LeaderSelector
LeaderSelector is used together with LeaderSelectorListener or LeaderSelectorListenerAdapter.
import org.apache.curator.RetryPolicy;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderSelector;
import org.apache.curator.framework.recipes.leader.LeaderSelectorListenerAdapter;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class LeadSelectorMain {
    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 10; i++) {
            RetryPolicy retryPolicy = new ExponentialBackoffRetry(1000, 3);
            CuratorFramework client = CuratorFrameworkFactory.builder()
                    .connectString("172.16.59.154:2181")
                    .sessionTimeoutMs(5000)
                    .connectionTimeoutMs(5000)
                    .retryPolicy(retryPolicy)
                    .namespace("base")
                    .build();
            client.start(); // connect to the ZooKeeper server
            String name = "client" + i;
            // 0. The LeaderSelectorListener passed to the LeaderSelector constructor is always wrapped in a WrappedListener
            // 1. LeaderSelectorListenerAdapter is used here; its stateChanged implementation throws
            //    CancelLeadershipException when the client loses its connection to ZooKeeper
            // 2. WrappedListener catches that exception and automatically cancels leadership
            LeaderSelector leaderSelector = new LeaderSelector(client, "/master", new LeaderSelectorListenerAdapter() {
                @Override
                public void takeLeadership(CuratorFramework client) throws Exception {
                    // Official documentation: http://curator.apache.org/getting-started.html
                    // this callback will get called when you are the leader
                    // do whatever leader work you need to and only exit
                    // this method when you want to relinquish leadership
                    System.out.println(name + " became leader"); // i.e. this method is called back once the client is leader
                    // sleep for 10 seconds
                    Thread.sleep(10000);
                    System.out.println(name + " gives up leadership"); // returning from takeLeadership relinquishes leadership
                }
            });
            // After giving up leadership, automatically rejoin the election
            leaderSelector.autoRequeue(); // not required, but this is behavior that you will probably expect
            leaderSelector.start();
        }
        System.out.println("log end----------");
        Thread.sleep(Integer.MAX_VALUE);
    }
}
Developers only need to implement their own logic in the takeLeadership function.
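The effect of autoRequeue on who leads next can be sketched with a plain queue. This is a simplified model with no ZooKeeper involved: the class RequeueSketch and its simulate method are hypothetical. It assumes that participants lead in the order they queued, and that a participant which requeues joins at the back.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class RequeueSketch {
    // Returns the order in which participants hold leadership for up to `rounds` terms.
    static List<String> simulate(List<String> participants, int rounds, boolean autoRequeue) {
        Deque<String> queue = new ArrayDeque<>(participants);
        List<String> leaders = new ArrayList<>();
        for (int i = 0; i < rounds && !queue.isEmpty(); i++) {
            String leader = queue.pollFirst(); // the head of the queue becomes leader
            leaders.add(leader);               // its takeLeadership runs and then returns
            if (autoRequeue) {
                queue.addLast(leader);         // rejoin the election at the back
            }
        }
        return leaders;
    }

    public static void main(String[] args) {
        List<String> clients = Arrays.asList("client0", "client1", "client2");
        System.out.println(simulate(clients, 5, true));
        System.out.println(simulate(clients, 5, false));
    }
}
```

In this model, with requeueing the three clients take turns for as long as the simulation runs. Without it, each client leads once and the election then runs out of candidates.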