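ZooKeeper Watcher Flow Analysis (with Source Code)

Overview

ZooKeeper (ZK) provides a publish/subscribe facility for distributed data. A typical publish/subscribe system defines a one-to-many subscription relationship: multiple subscribers listen to the same topic object, and when that object's state changes, all subscribers are notified. ZK implements this kind of distributed notification through its Watcher mechanism.

ZK lets a client register a Watcher with the server. When a specified event on the server triggers that Watcher, the server sends an event notification to the client that registered it.

Roughly, the client registers a Watcher with ZK and, if the registration succeeds, stores the Watcher locally in its ClientWatchManager. When the server later triggers the Watcher event, it sends a notification to the client, and the client takes the matching Watcher out of the ClientWatchManager and invokes its callback.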

The Watcher Interface

So what exactly is a Watcher, and what is it for?

/**
 * This interface specifies the public interface an event handler class must
 * implement. A ZooKeeper client will get various events from the ZooKeeper
 * server it connects to. An application using such a client handles these
 * events by registering a callback object with the client. The callback object
 * is expected to be an instance of a class that implements Watcher interface.
 */
@InterfaceAudience.Public
public interface Watcher {
    void process(WatchedEvent event);
}

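Once you register with the ZK server through an object that implements this interface, the client invokes its process method whenever the server delivers a matching event notification.

WatchedEvent

So what does a WatchedEvent contain?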

public class WatchedEvent {
    /**
     * Enumeration of states the ZooKeeper may be at the event
     */
    private final KeeperState keeperState;
    /**
     * Enumeration of types of events that may occur on the ZooKeeper
     */
    private final EventType eventType;
    private String path;
}

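KeeperState and EventType are two enums that represent the notification state and the event type, respectively. path is the node path the client is watching.

Common KeeperState and EventType combinations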

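| KeeperState | EventType | Trigger condition | Description |
|---|---|---|---|
| SyncConnected | None(-1) | The client successfully establishes a session with the server | Client and server are connected |
| SyncConnected | NodeCreated(1) | The data node the Watcher is watching is created | Client and server are connected |
| SyncConnected | NodeDeleted(2) | The data node the Watcher is watching is deleted | Client and server are connected |
| SyncConnected | NodeDataChanged(3) | The content of the watched data node changes (data content or data version) | Client and server are connected |
| SyncConnected | NodeChildrenChanged(4) | The child list of the watched data node changes | Client and server are connected |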

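For NodeDataChanged, a "change" covers both the node's data content and its data version (dataVersion). Whenever a client calls the data update API, dataVersion changes even if the content written is identical, so the corresponding Watcher still fires. Because every write is visible as a version bump, this avoids the classic ABA problem of optimistic locking.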
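The version-bump behavior described above can be illustrated with a small self-contained sketch. VersionedNode and its methods are hypothetical stand-ins written for this article, not ZooKeeper classes; the only assumption carried over from the text is that every update increments the version and fires the watchers, whether or not the bytes changed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a data node; this is not ZooKeeper's DataNode.
class VersionedNode {
    private byte[] data = new byte[0];
    private int dataVersion = 0;
    private final List<Runnable> dataWatchers = new ArrayList<>();

    void watchData(Runnable watcher) {
        dataWatchers.add(watcher);
    }

    // Every update bumps dataVersion and fires the registered watchers,
    // whether or not the new bytes differ from the previous content.
    int setData(byte[] newData) {
        data = Arrays.copyOf(newData, newData.length);
        dataVersion++;
        List<Runnable> toFire = new ArrayList<>(dataWatchers);
        dataWatchers.clear(); // one-shot: drop the watchers before notifying
        for (Runnable w : toFire) {
            w.run();
        }
        return dataVersion;
    }

    int getDataVersion() {
        return dataVersion;
    }
}

class VersionedNodeDemo {
    public static void main(String[] args) {
        VersionedNode node = new VersionedNode();
        int[] fired = {0};
        node.watchData(() -> fired[0]++);
        node.setData("a".getBytes());
        node.watchData(() -> fired[0]++);
        node.setData("a".getBytes()); // same content, still notifies
        System.out.println("version=" + node.getDataVersion() + " fired=" + fired[0]);
    }
}
```

Running the demo writes the same content twice and still reports two notifications and a version of 2, which is the property a version-checking client relies on.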

WatcherEvent

WatchedEvent has the following method:

 /**
     *  Convert WatchedEvent to type that can be sent over network
     */
    public WatcherEvent getWrapper() {
        return new WatcherEvent(eventType.getIntValue(), keeperState.getIntValue(), path);
    }

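Broadly speaking, WatcherEvent and WatchedEvent represent the same thing: both wrap a server-side event. WatchedEvent is the object used for logic processing, while WatcherEvent is the entity used for network transfer. As the code above shows, a WatcherEvent is built from the values of the WatchedEvent's fields.

The Javadoc at http://people.apache.org/~larsgeorge/zookeeper-1215258/build/docs/dev-api/org/apache/zookeeper/proto/WatcherEvent.html shows that it implements the Record interface: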

public class WatcherEvent
extends Object
implements org.apache.jute.Record

The Record interface defines the serialization and deserialization methods:

@InterfaceAudience.Public
public interface Record {
    void serialize(OutputArchive archive, String tag) throws IOException;
    void deserialize(InputArchive archive, String tag) throws IOException;
}
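
To make the "logical object versus wire object" split concrete, here is a minimal sketch of a record that writes itself to a byte stream and reads itself back. It uses plain java.io streams instead of Jute's OutputArchive and InputArchive, and WireEvent is a hypothetical class modeled loosely on WatcherEvent (two ints and a path), so the byte layout here is an assumption rather than ZooKeeper's actual format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical wire object modeled on WatcherEvent: two ints and a path.
class WireEvent {
    int type;
    int state;
    String path;

    WireEvent() { }

    WireEvent(int type, int state, String path) {
        this.type = type;
        this.state = state;
        this.path = path;
    }

    void serialize(DataOutputStream out) throws IOException {
        out.writeInt(type);
        out.writeInt(state);
        byte[] p = path.getBytes(StandardCharsets.UTF_8);
        out.writeInt(p.length); // length-prefixed string (assumed layout)
        out.write(p);
    }

    void deserialize(DataInputStream in) throws IOException {
        type = in.readInt();
        state = in.readInt();
        byte[] p = new byte[in.readInt()];
        in.readFully(p);
        path = new String(p, StandardCharsets.UTF_8);
    }
}

class WireEventDemo {
    // Serializes the event to bytes and rebuilds a fresh copy from them.
    static WireEvent roundTrip(WireEvent e) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            e.serialize(new DataOutputStream(buf));
            WireEvent copy = new WireEvent();
            copy.deserialize(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
            return copy;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        // Sample int codes; the table earlier lists NodeDataChanged as 3.
        WireEvent copy = roundTrip(new WireEvent(3, 3, "/app/config"));
        System.out.println(copy.type + " " + copy.state + " " + copy.path);
    }
}
```

The receiving side only needs the ints and the path to rebuild a logical event, which is why getWrapper passes the enum int values rather than the enums themselves.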

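Related components

Related flow

The flow can be divided into three phases:

  • The client registers a Watcher
  • The server handles the Watcher
  • The client invokes the Watcher callback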

Registering a Watcher on the client

When creating a ZK client instance, we can pass a default Watcher to the constructor:

 public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher) 

This Watcher is stored in ZKWatchManager and serves as the default Watcher for the whole session:

watchManager.defaultWatcher = watcher;

In addition, a ZK client can register a Watcher with the server through three APIs: getData, getChildren, and exists.

We use getData as the example:

public byte[] getData(final String path, Watcher watcher, Stat stat){
  .....
}
public byte[] getData(String path, boolean watch, Stat stat) throws KeeperException, InterruptedException {
        return getData(path, getDefaultWatcher(watch), stat);
}

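If the watch argument is true, getDefaultWatcher returns the default Watcher that was passed in when the ZooKeeper instance was created. If no default Watcher was set, it throws an IllegalStateException: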

 private Watcher getDefaultWatcher(boolean required) {
        if (required) {
            if (watchManager.defaultWatcher != null) {
                return watchManager.defaultWatcher;
            } else {
                throw new IllegalStateException("Default watcher is required, but it is null.");
            }
        }
        return null;
    }

Here is the full getData code:

 public byte[] getData(final String path, Watcher watcher, Stat stat) throws KeeperException, InterruptedException {
        final String clientPath = path;
        PathUtils.validatePath(clientPath);

        // the watch contains the un-chroot path
        // Create a watch registration of the data type
        WatchRegistration wcb = null;
        if (watcher != null) {
            wcb = new DataWatchRegistration(watcher, clientPath);
        }

        // Prepend the client's chroot (change root directory) so the path becomes the real server-side path
        final String serverPath = prependChroot(clientPath);

        RequestHeader h = new RequestHeader();
        h.setType(ZooDefs.OpCode.getData);
        GetDataRequest request = new GetDataRequest();
        request.setPath(serverPath);
        // Flag whether a watcher is attached
        request.setWatch(watcher != null);
        GetDataResponse response = new GetDataResponse();
        
        ReplyHeader r = cnxn.submitRequest(h, request, response, wcb);
        if (r.getErr() != 0) {
            throw KeeperException.create(KeeperException.Code.get(r.getErr()), clientPath);
        }
        if (stat != null) {
            DataTree.copyStat(response.getStat(), stat);
        }
        return response.getData();
    }

  1. Create a DataWatchRegistration.
  2. Convert the path (the client may be using a chroot, so the path must be converted to the server-side path before the request is sent).
  3. Submit the request through ClientCnxn.

public ReplyHeader submitRequest(
        RequestHeader h,
        Record request,
        Record response,
        WatchRegistration watchRegistration,
        WatchDeregistration watchDeregistration) throws InterruptedException {
        ReplyHeader r = new ReplyHeader();
        Packet packet = queuePacket(
            h,
            r,
            request,
            response,
            null,
            null,
            null,
            null,
            watchRegistration,
            watchDeregistration);
       	....
       	....
        return r;
    }
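
The chroot conversion in step 2, and the reverse conversion that the client applies to notification paths in SendThread#readResponse, can be sketched as a pair of pure functions. ChrootPaths and both method names are hypothetical; the forward direction is an assumption about what prependChroot does (its body is not shown in this article), while the reverse direction follows the branches of the NOTIFICATION_XID case shown later.

```java
// Hypothetical helper; it only illustrates the two path conversions.
class ChrootPaths {
    // Client path -> server path (assumed behavior of prependChroot).
    static String toServerPath(String chroot, String clientPath) {
        if (chroot == null) {
            return clientPath;
        }
        // The chroot itself is addressed by the client as "/".
        return clientPath.length() == 1 ? chroot : chroot + clientPath;
    }

    // Server path -> client path, same branches as the notification handling.
    static String toClientPath(String chroot, String serverPath) {
        if (chroot == null) {
            return serverPath;
        }
        if (serverPath.equals(chroot)) {
            return "/";
        }
        if (serverPath.length() > chroot.length()) {
            return serverPath.substring(chroot.length());
        }
        return serverPath; // too short for the chroot: left unchanged
    }
}

class ChrootPathsDemo {
    public static void main(String[] args) {
        String server = ChrootPaths.toServerPath("/app", "/config");
        String client = ChrootPaths.toClientPath("/app", server);
        System.out.println(server + " <-> " + client);
    }
}
```

Keeping the Watcher keyed by the client path, as the comment "the watch contains the un-chroot path" in getData indicates, is what makes the reverse conversion necessary before a notification can be matched to a registered Watcher.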


The request ends up in outgoingQueue:

public Packet queuePacket(
        RequestHeader h,
        ReplyHeader r,
        Record request,
        Record response,
        AsyncCallback cb,
        String clientPath,
        String serverPath,
        Object ctx,
        WatchRegistration watchRegistration,
        WatchDeregistration watchDeregistration) {
        Packet packet = null;

        packet = new Packet(h, r, request, response, watchRegistration);
         
        synchronized (state) {
   					...
              ....
                outgoingQueue.add(packet);
            }
        }

The request is eventually sent to the server, and the response is handled in SendThread#readResponse:

void readResponse(ByteBuffer incomingBuffer) throws IOException {
            ByteBufferInputStream bbis = new ByteBufferInputStream(incomingBuffer);
            BinaryInputArchive bbia = BinaryInputArchive.getArchive(bbis);
            ReplyHeader replyHdr = new ReplyHeader();

            replyHdr.deserialize(bbia, "header");
            switch (replyHdr.getXid()) {
            case PING_XID:
               ....
               ....
                return;
              case AUTHPACKET_XID:
                ...
                ...
              return;
                // Handle a notification from the server
            case NOTIFICATION_XID:
                LOG.debug("Got notification session id: 0x{}",
                    Long.toHexString(sessionId));
                WatcherEvent event = new WatcherEvent();
                event.deserialize(bbia, "response");

                // convert from a server path to a client path
                if (chrootPath != null) {
                    String serverPath = event.getPath();
                    if (serverPath.compareTo(chrootPath) == 0) {
                        event.setPath("/");
                    } else if (serverPath.length() > chrootPath.length()) {
                        event.setPath(serverPath.substring(chrootPath.length()));
                     } else {
                         LOG.warn("Got server path {} which is too short for chroot path {}.",
                             event.getPath(), chrootPath);
                     }
                }

                WatchedEvent we = new WatchedEvent(event);
                LOG.debug("Got {} for session id 0x{}", we, Long.toHexString(sessionId));
                // Add to the event queue; EventThread will process it
                eventThread.queueEvent(we);
                return;
            default:
                break;
            }

            // Remove the Packet from pendingQueue
            Packet packet;
            synchronized (pendingQueue) {
                if (pendingQueue.size() == 0) {
                    throw new IOException("Nothing in the queue, but got " + replyHdr.getXid());
                }
                packet = pendingQueue.remove();
            }
            /*
             * Since requests are processed in order, we better get a response
             * to the first request!
             */
            try {
               	....
               	.....
            } finally {
                // Store the Watcher in ClientWatchManager
                finishPacket(packet);
            }
        }

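What this method mainly does:

  1. Deserializes the reply header and checks its XID to see whether the message is a server notification. If it is, the event is added to the event queue for EventThread to process.
  2. Removes the Packet from pendingQueue.
  3. Calls finishPacket for follow-up processing.
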
 protected void finishPacket(Packet p) {
        int err = p.replyHeader.getErr();
        if (p.watchRegistration != null) {
            p.watchRegistration.register(err);
        }
        ...
        ...			
 }

Finally, control returns to WatchRegistration, which registers the Watcher in the corresponding Map<String, Set<Watcher>>.
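
The key point of this last step is that the Watcher is stored on the client only after the reply arrives, and only if the reply reports success. A minimal sketch, assuming a watch is kept only when the error code is 0 (a simplification) and using hypothetical names throughout (ClientWatchTable, Listener, register, materialize):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the client-side watch bookkeeping.
class ClientWatchTable {
    interface Listener {
        void onEvent(String path);
    }

    private final Map<String, Set<Listener>> dataWatches = new HashMap<>();

    // Called when the reply arrives; err is the reply header's error code.
    void register(String clientPath, Listener listener, int err) {
        if (err != 0) {
            return; // failed request: nothing is registered in this sketch
        }
        synchronized (dataWatches) {
            dataWatches.computeIfAbsent(clientPath, k -> new HashSet<>()).add(listener);
        }
    }

    // Removes and returns the listeners for a path (one-shot semantics).
    Set<Listener> materialize(String clientPath) {
        synchronized (dataWatches) {
            Set<Listener> found = dataWatches.remove(clientPath);
            return found == null ? new HashSet<>() : found;
        }
    }
}

class ClientWatchTableDemo {
    public static void main(String[] args) {
        ClientWatchTable table = new ClientWatchTable();
        table.register("/ok", p -> System.out.println("fired for " + p), 0);
        table.register("/failed", p -> System.out.println("never stored"), -101);
        for (ClientWatchTable.Listener l : table.materialize("/ok")) {
            l.onEvent("/ok");
        }
        System.out.println("failed path has " + table.materialize("/failed").size() + " listeners");
    }
}
```

Deferring the registration until the reply is processed means the client never holds a Watcher that the server refused to set.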

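Handling the Watcher on the server

First, a look at the main component classes.

WatchManager manages Watchers on the ZK server. Internally it maintains two storage structures, watchTable and watch2Paths, which index the Watchers along two dimensions.

  • watchTable indexes by data node path: for each path, the Watchers registered on it.
  • watch2Paths indexes by Watcher: for each Watcher, the data node paths it is registered on.

ServerCnxn represents a connection between a ZooKeeper client and the server. Its default implementation is NIOServerCnxn; a Netty-based implementation, NettyServerCnxn, was introduced in 3.4.0.

ServerCnxn also implements the Watcher interface, so it can be treated as a Watcher object.

Both the data node path and the ServerCnxn are stored in WatchManager.

After receiving a client request, the server decides in FinalRequestProcessor#processRequest whether the request needs to register a Watcher.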

case OpCode.getData: {
                lastOp = "GETD";
                GetDataRequest getDataRequest = new GetDataRequest();
                ByteBufferInputStream.byteBuffer2Record(request.request, getDataRequest);
                path = getDataRequest.getPath();
                // Call the method that handles the getData request
                rsp = handleGetDataRequest(getDataRequest, cnxn, request.authInfo);
                requestPathMetricsCollector.registerRequest(request.type, path);
                break;
            }
private Record handleGetDataRequest(Record request, ServerCnxn cnxn, List<Id> authInfo) throws KeeperException, IOException {
      	....
        ....
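        // Note: the request only carries a boolean flag saying whether the client wants a Watcher.
        // When the flag is set, the connection object (cnxn) itself is registered as the Watcher.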
        byte[] b = zks.getZKDatabase().getData(path, stat, getDataRequest.getWatch() ? cnxn : null);
        return new GetDataResponse(b, stat);
    }
public byte[] getData(String path, Stat stat, Watcher watcher)  {
        return dataTree.getData(path, stat, watcher);
    }

public byte[] getData(String path, Stat stat, Watcher watcher)  {
         
        synchronized (n) {
            n.copyStat(stat);
            if (watcher != null) {
              // dataWatches here is the instance that implements the IWatchManager interface
                dataWatches.addWatch(path, watcher);
            }
            data = n.data;
        }
        updateReadStat(path, data == null ? 0 : data.length);
        return data;
    }

In the end, the Watcher is stored in both watchTable and watch2Paths:

 @Override
    public boolean addWatch(String path, Watcher watcher) {
        return addWatch(path, watcher, WatcherMode.DEFAULT_WATCHER_MODE);
    }

    @Override
    public synchronized boolean addWatch(String path, Watcher watcher, WatcherMode watcherMode) {
        if (isDeadWatcher(watcher)) {
            return false;
        }
        // Look up the set of watchers already registered on this path
        Set<Watcher> list = watchTable.get(path);
        if (list == null) {
            list = new HashSet<>(4);
            watchTable.put(path, list);
        }
        list.add(watcher);
        // Also record the reverse mapping: watcher -> paths
        Set<String> paths = watch2Paths.get(watcher);
        if (paths == null) {
            paths = new HashSet<>();
            watch2Paths.put(watcher, paths);
        }

        watcherModeManager.setWatcherMode(watcher, path, watcherMode);
        return paths.add(path);
    }
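
Why keep two maps? A simplified sketch of the idea follows: the path-keyed map answers "who must be notified when this path changes", while the watcher-keyed map lets all registrations of one watcher be removed without scanning every path, for example when the connection it represents goes away. TwoWayWatchIndex and its methods are hypothetical names used for illustration only, and the sketch leaves out watcher modes and dead-watcher checks.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified two-way index; W stands in for a watcher such as a connection.
class TwoWayWatchIndex<W> {
    private final Map<String, Set<W>> watchTable = new HashMap<>();
    private final Map<W, Set<String>> watch2Paths = new HashMap<>();

    // Returns false if this watcher was already registered on the path.
    synchronized boolean addWatch(String path, W watcher) {
        watchTable.computeIfAbsent(path, k -> new HashSet<>()).add(watcher);
        return watch2Paths.computeIfAbsent(watcher, k -> new HashSet<>()).add(path);
    }

    // Uses the reverse index so only this watcher's own paths are touched.
    synchronized void removeWatcher(W watcher) {
        Set<String> paths = watch2Paths.remove(watcher);
        if (paths == null) {
            return;
        }
        for (String path : paths) {
            Set<W> watchers = watchTable.get(path);
            if (watchers != null) {
                watchers.remove(watcher);
                if (watchers.isEmpty()) {
                    watchTable.remove(path);
                }
            }
        }
    }

    synchronized int watcherCount(String path) {
        Set<W> watchers = watchTable.get(path);
        return watchers == null ? 0 : watchers.size();
    }
}

class TwoWayWatchIndexDemo {
    public static void main(String[] args) {
        TwoWayWatchIndex<String> index = new TwoWayWatchIndex<>();
        index.addWatch("/a", "conn-1");
        index.addWatch("/b", "conn-1");
        index.addWatch("/a", "conn-2");
        index.removeWatcher("conn-1"); // e.g. conn-1 disconnected
        System.out.println("/a=" + index.watcherCount("/a") + " /b=" + index.watcherCount("/b"));
    }
}
```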

Triggering a Watcher

A NodeDataChanged event is triggered when a node's data content or its dataVersion changes.

So let's look at the org.apache.zookeeper.server.DataTree#setData method:

public Stat setData(String path, byte[] data, int version, long zxid, long time) throws KeeperException.NoNodeException {
        Stat s = new Stat();
        DataNode n = nodes.get(path);
        if (n == null) {
            throw new KeeperException.NoNodeException();
        }
        byte[] lastdata = null;
        synchronized (n) {
            lastdata = n.data;
            nodes.preChange(path, n);
            n.data = data;
            n.stat.setMtime(time);
            n.stat.setMzxid(zxid);
            n.stat.setVersion(version);
            n.copyStat(s);
            nodes.postChange(path, n);
        }
      
				....
        ....
        updateWriteStat(path, dataBytes);
        // Call the IWatchManager method to trigger the data watches on this path
        dataWatches.triggerWatch(path, EventType.NodeDataChanged);
        return s;
    }
 @Override
    public WatcherOrBitSet triggerWatch(String path, EventType type) {
        return triggerWatch(path, type, null);
    }

    @Override
    public WatcherOrBitSet triggerWatch(String path, EventType type, WatcherOrBitSet supress) {
        // Wrap the change in a WatchedEvent
        WatchedEvent e = new WatchedEvent(type, KeeperState.SyncConnected, path);
        Set<Watcher> watchers = new HashSet<>();
        PathParentIterator pathParentIterator = getPathParentIterator(path);
        synchronized (this) {
            for (String localPath : pathParentIterator.asIterable()) {
                Set<Watcher> thisWatchers = watchTable.get(localPath);
                // No watchers registered on this path
                if (thisWatchers == null || thisWatchers.isEmpty()) {
                    continue;
                }
                Iterator<Watcher> iterator = thisWatchers.iterator();
                while (iterator.hasNext()) {
                    Watcher watcher = iterator.next();
                    WatcherMode watcherMode = watcherModeManager.getWatcherMode(watcher, localPath);
                    if (watcherMode.isRecursive()) {
                         
                    } else if (!pathParentIterator.atParentPath()) {
                        watchers.add(watcher);
                        if (!watcherMode.isPersistent()) {
                            // Non-persistent watcher: remove it from watchTable
                            iterator.remove();
                            Set<String> paths = watch2Paths.get(watcher);
                            if (paths != null) {
                                // Remove the path from watch2Paths as well
                                paths.remove(localPath);
                            }
                        }
                    }
                }
               
            }
        }
        for (Watcher w : watchers) {
            if (supress != null && supress.contains(w)) {
                continue;
            }
            // Invoke the watcher's process method
            w.process(e);
        }
				.....
        .....
        return new WatcherOrBitSet(watchers);
    }

As mentioned above, ServerCnxn implements the Watcher interface. Let's look at org.apache.zookeeper.server.NIOServerCnxn#process:

@Override
    public void process(WatchedEvent event) {
        // The XID in the reply header is set to NOTIFICATION_XID (-1), as seen in the SendThread.readResponse analysis above
        ReplyHeader h = new ReplyHeader(ClientCnxn.NOTIFICATION_XID, -1L, 0);
     
        // Convert the WatchedEvent into a WatcherEvent
        WatcherEvent e = event.getWrapper();
        // Send the notification to the client
        sendResponse(h, e, "notification", null, null, ZooDefs.OpCode.error);
    }

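Basic flow

  • Wrap the change in a WatchedEvent.
  • Find the matching Watchers in watchTable, and remove the corresponding Watchers and paths from watchTable and watch2Paths (a non-persistent Watcher fires only once).
  • Call the process method.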
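
The one-shot behavior in this flow can be sketched in isolation. In the code below, OneShotTrigger is a hypothetical class and the boolean persistent flag is a simplified stand-in for the watcher mode check in triggerWatch. As in the original, the watchers are collected and removed under the lock, and the callbacks run after the lock is released.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of one-shot versus persistent triggering.
class OneShotTrigger {
    // path -> (callback -> persistent flag)
    private final Map<String, Map<Consumer<String>, Boolean>> table = new HashMap<>();

    synchronized void addWatch(String path, Consumer<String> callback, boolean persistent) {
        table.computeIfAbsent(path, k -> new LinkedHashMap<>()).put(callback, persistent);
    }

    void trigger(String path) {
        List<Consumer<String>> toNotify = new ArrayList<>();
        synchronized (this) {
            Map<Consumer<String>, Boolean> watchers = table.get(path);
            if (watchers == null) {
                return;
            }
            toNotify.addAll(watchers.keySet());
            // Drop every non-persistent entry so it cannot fire again.
            watchers.values().removeIf(persistent -> !persistent);
        }
        // Callbacks run outside the lock.
        for (Consumer<String> callback : toNotify) {
            callback.accept(path);
        }
    }
}

class OneShotTriggerDemo {
    public static void main(String[] args) {
        OneShotTrigger trigger = new OneShotTrigger();
        int[] oneShot = {0};
        int[] persistent = {0};
        trigger.addWatch("/node", p -> oneShot[0]++, false);
        trigger.addWatch("/node", p -> persistent[0]++, true);
        trigger.trigger("/node");
        trigger.trigger("/node");
        System.out.println("oneShot=" + oneShot[0] + " persistent=" + persistent[0]);
    }
}
```

After two triggers the one-shot callback has run once and the persistent one twice, so a client that wants continued notifications from a one-shot watch has to register again after each event.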

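Invoking the Watcher callback on the client

First, a look at the EventThread class.

It extends Thread and uses a LinkedBlockingQueue<Object> named waitingEvents to hold pending events. Its run method keeps taking events from the queue and processing them.

As we already saw, notifications on the client are handled in SendThread#readResponse (this is the same code shown in the client registration section above):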

case NOTIFICATION_XID:
                LOG.debug("Got notification session id: 0x{}",
                    Long.toHexString(sessionId));
                WatcherEvent event = new WatcherEvent();
                event.deserialize(bbia, "response");

                // convert from a server path to a client path
                if (chrootPath != null) {
                    String serverPath = event.getPath();
                    if (serverPath.compareTo(chrootPath) == 0) {
                        event.setPath("/");
                    } else if (serverPath.length() > chrootPath.length()) {
                        event.setPath(serverPath.substring(chrootPath.length()));
                     } else {
                         LOG.warn("Got server path {} which is too short for chroot path {}.",
                             event.getPath(), chrootPath);
                     }
                }

                WatchedEvent we = new WatchedEvent(event);
                LOG.debug("Got {} for session id 0x{}", we, Long.toHexString(sessionId));
                // Add to the event queue; EventThread will process it
                eventThread.queueEvent(we);
                return;

The event is added to the waitingEvents queue:

public void queueEvent(WatchedEvent event) {
            queueEvent(event, null);
        }

        private void queueEvent(WatchedEvent event, Set<Watcher> materializedWatchers) {
            if (event.getType() == EventType.None && sessionState == event.getState()) {
                return;
            }
            sessionState = event.getState();
            final Set<Watcher> watchers;
            if (materializedWatchers == null) {
                // Fetch the matching Watchers from the ClientWatchManager; this also removes
                // them from the corresponding Map, so they are one-shot on the client as well
                watchers = watcher.materialize(event.getState(), event.getType(), event.getPath());
            } else {
                watchers = new HashSet<Watcher>();
                watchers.addAll(materializedWatchers);
            }
            WatcherSetEventPair pair = new WatcherSetEventPair(watchers, event);
            // Add to waitingEvents; the run method will take it out and process it
            waitingEvents.add(pair);
        }

The run method

 public void run() {
            try {
                isRunning = true;
                while (true) {
                    Object event = waitingEvents.take();
                    if (event == eventOfDeath) {
                        wasKilled = true;
                    } else {
                        processEvent(event);
                    }
                  ......
                  ......
                }}
        }

        private void processEvent(Object event) {
            try {
                if (event instanceof WatcherSetEventPair) {
                    // each watcher will process the event
                    WatcherSetEventPair pair = (WatcherSetEventPair) event;
                    for (Watcher watcher : pair.watchers) {
                        try {
                          // Call process; callbacks run serially and synchronously
                            watcher.process(pair.event);
                        } catch (Throwable t) {
                            LOG.error("Error while calling watcher.", t);
                        }
                    }
                } }
          .......
          .......

    }
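
The pattern used by EventThread (a single thread draining a blocking queue, a sentinel object to stop it, and a try/catch around each callback) can be reproduced in a few lines. SerialDispatcher is a hypothetical class written for this article; it keeps only the parts of the pattern visible in the excerpts above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical single-threaded dispatcher in the style of EventThread.
class SerialDispatcher extends Thread {
    private static final Object EVENT_OF_DEATH = new Object();
    private final LinkedBlockingQueue<Object> waitingEvents = new LinkedBlockingQueue<>();
    final List<String> handled = new ArrayList<>(); // written only by this thread

    void queueEvent(Runnable callback) {
        waitingEvents.add(callback);
    }

    void shutdown() {
        waitingEvents.add(EVENT_OF_DEATH);
    }

    @Override
    public void run() {
        try {
            while (true) {
                Object event = waitingEvents.take();
                if (event == EVENT_OF_DEATH) {
                    return;
                }
                try {
                    ((Runnable) event).run(); // callbacks run one at a time
                } catch (Throwable t) {
                    // A failing callback must not kill the dispatch loop.
                    handled.add("error: " + t.getMessage());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

class SerialDispatcherDemo {
    static List<String> runDemo() {
        SerialDispatcher d = new SerialDispatcher();
        d.queueEvent(() -> d.handled.add("first"));
        d.queueEvent(() -> { throw new IllegalStateException("boom"); });
        d.queueEvent(() -> d.handled.add("third"));
        d.shutdown();
        d.start();
        try {
            d.join(); // after join, reading handled from this thread is safe
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return d.handled;
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Because there is only one dispatch thread, events are delivered in queue order, and a callback that blocks delays every event queued behind it.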

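Summary

Characteristics of Watchers

  • One-shot: on both the client and the server, a Watcher is removed from storage once it fires (the triggerWatch code above skips the removal only for watchers in a persistent mode).
  • Serial execution on the client: callbacks run synchronously, one after another, on the EventThread, so never let one slow Watcher hold up all the other callbacks.
  • Lightweight: WatchedEvent is the smallest unit of notification and carries only three things: the notification state, the event type, and the node path. It does not carry the node's content; after receiving a notification, the client must fetch the data from the server itself.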
