Serialized Event Handling in the ZooKeeper Client

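To improve performance and push throughput further, many systems at our company have recently been reworked to run asynchronously. Such a rework inevitably surfaces more multithreading problems than before. Last week we ran into a deadlock while moving a ZooKeeper client to the asynchronous API, and this post walks through it.

ZooKeeper usually offers both a synchronous and an asynchronous variant of the same API call.
The synchronous interface is easy to understand and is used like this: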

ZooKeeper zk = new ZooKeeper(...);
List<String> children = zk.getChildren( path, true );

The asynchronous interface is slightly more involved and is used like this:

ZooKeeper zk = new ZooKeeper(...);
zk.getChildren( path, true, new AsyncCallback.Children2Callback() {
    @Override
    public void processResult( int rc, String path, Object ctx, List<String> children, Stat stat ) {
        System.out.println( "Receive the response." );
    }
}, null );

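As you can see, an asynchronous call has to register a Children2Callback and implement its callback method, processResult.

The problem we hit last week was this: the application registers a watch on changes to the child list of a znode, and whenever it receives a child-list change notification from the ZooKeeper server (EventType.NodeChildrenChanged) it fetches the child list again. With the synchronous interface the whole application ran fine, but after the asynchronous rework something odd happened: the change notification still arrived, yet the child list could no longer be re-fetched.

First, here is a simple demo that reproduces the application's original logic using the synchronous interface: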

package book.chapter05;

import java.io.IOException;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.EventType;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.Watcher.Event.KeeperState;

/**
 * ZooKeeper API: fetch the list of child znodes using the synchronous (sync) interface.
 * @author 銀時
 */
public class ZooKeeper_GetChildren_API_Sync_Usage implements Watcher {

    private CountDownLatch connectedSemaphore = new CountDownLatch( 1 );
    private static CountDownLatch _semaphore = new CountDownLatch( 1 );
    private ZooKeeper zk;

    ZooKeeper createSession( String connectString, int sessionTimeout, Watcher watcher ) throws IOException {
        ZooKeeper zookeeper = new ZooKeeper( connectString, sessionTimeout, watcher );
        try {
            // Block until the SyncConnected event arrives in process().
            connectedSemaphore.await();
        } catch ( InterruptedException e ) {
            Thread.currentThread().interrupt();
        }
        return zookeeper;
    }

    /** create path by sync */
    void createPath_sync( String path, String data, CreateMode createMode ) throws IOException, KeeperException, InterruptedException {
        if ( zk == null ) {
            zk = this.createSession( "domain1.book.zookeeper:2181", 5000, this );
        }
        zk.create( path, data.getBytes(), Ids.OPEN_ACL_UNSAFE, createMode );
    }

    /** Get children znodes of path and set watches */
    List<String> getChildren( String path ) throws KeeperException, InterruptedException, IOException {
        System.out.println( "===Start to get children znodes.===" );
        if ( zk == null ) {
            zk = this.createSession( "domain1.book.zookeeper:2181", 5000, this );
        }
        return zk.getChildren( path, true );
    }

    public static void main( String[] args ) throws IOException, InterruptedException {
        ZooKeeper_GetChildren_API_Sync_Usage sample = new ZooKeeper_GetChildren_API_Sync_Usage();
        String path = "/get_children_test";
        try {
            sample.createPath_sync( path, "", CreateMode.PERSISTENT );
            sample.createPath_sync( path + "/c1", "", CreateMode.PERSISTENT );
            List<String> childrenList = sample.getChildren( path );
            System.out.println( childrenList );
            //Add a new child znode to test watches event notify.
            sample.createPath_sync( path + "/c2", "", CreateMode.PERSISTENT );
            _semaphore.await();
        } catch ( KeeperException e ) {
            System.err.println( "error: " + e.getMessage() );
            e.printStackTrace();
        }
    }

    /**
     * Process when receive watched event
     */
    @Override
    public void process( WatchedEvent event ) {
        System.out.println( "Receive watched event:" + event );
        if ( KeeperState.SyncConnected == event.getState() ) {
            if ( EventType.None == event.getType() && null == event.getPath() ) {
                connectedSemaphore.countDown();
            } else if ( event.getType() == EventType.NodeChildrenChanged ) {
                //children list changed
                try {
                    System.out.println( this.getChildren( event.getPath() ) );
                    _semaphore.countDown();
                } catch ( Exception e ) {
                    e.printStackTrace();
                }
            }
        }
    }
}

The output is:

Receive watched event:WatchedEvent state:SyncConnected type:None path:null
===Start to get children znodes.===
[c1]
Receive watched event:WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/get_children_test
===Start to get children znodes.===
[c1, c2]

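In this program we first create a parent node, /get_children_test, and a child node, /get_children_test/c1. We then call the synchronous getChildren interface to fetch all children of /get_children_test, registering a watch in the same call. Next we create another child under /get_children_test, namely /get_children_test/c2. Because a watch was registered earlier, the ZooKeeper server sends the client a "children changed" notification as soon as the new child is created, and the client calls getChildren again to fetch the updated child list.

This example runs correctly, as expected. Now let's rework it to use the asynchronous interface: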

package book.chapter05;

import java.io.IOException;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.AsyncCallback;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.EventType;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.data.Stat;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.Watcher.Event.KeeperState;

/**
 * ZooKeeper API: fetch the list of child znodes using the asynchronous (ASync) interface.
 * @author 銀時
 */
public class ZooKeeper_GetChildren_API_ASync_Usage_Deadlock implements Watcher {

    private CountDownLatch connectedSemaphore = new CountDownLatch( 1 );
    private static CountDownLatch _semaphore = new CountDownLatch( 1 );
    private ZooKeeper zk;

    ZooKeeper createSession( String connectString, int sessionTimeout, Watcher watcher ) throws IOException {
        ZooKeeper zookeeper = new ZooKeeper( connectString, sessionTimeout, watcher );
        try {
            // Block until the SyncConnected event arrives in process().
            connectedSemaphore.await();
        } catch ( InterruptedException e ) {
            Thread.currentThread().interrupt();
        }
        return zookeeper;
    }

    /** create path by sync */
    void createPath_sync( String path, String data, CreateMode createMode ) throws IOException, KeeperException, InterruptedException {
        if ( zk == null ) {
            zk = this.createSession( "domain1.book.zookeeper:2181", 5000, this );
        }
        zk.create( path, data.getBytes(), Ids.OPEN_ACL_UNSAFE, createMode );
    }

    /** Get children znodes of path and set watches */
    void getChildren( String path ) throws KeeperException, InterruptedException, IOException {
        System.out.println( "===Start to get children znodes.===" );
        if ( zk == null ) {
            zk = this.createSession( "domain1.book.zookeeper:2181", 5000, this );
        }
        final CountDownLatch _semaphore_get_children = new CountDownLatch( 1 );
        zk.getChildren( path, true, new AsyncCallback.Children2Callback() {
            @Override
            public void processResult( int rc, String path, Object ctx, List<String> children, Stat stat ) {
                System.out.println( "Get Children znode result: [response code: " + rc + ", param path: " + path + ", ctx: " + ctx + ", children list: "
                        + children + ", stat: " + stat );
                _semaphore_get_children.countDown();
            }
        }, null );
        // Wait for the asynchronous response. This is the call that hangs when invoked from process().
        _semaphore_get_children.await();
    }

    public static void main( String[] args ) throws IOException, InterruptedException {
        ZooKeeper_GetChildren_API_ASync_Usage_Deadlock sample = new ZooKeeper_GetChildren_API_ASync_Usage_Deadlock();
        String path = "/get_children_test";
        try {
            sample.createPath_sync( path, "", CreateMode.PERSISTENT );
            sample.createPath_sync( path + "/c1", "", CreateMode.PERSISTENT );
            //Get children and register watches.
            sample.getChildren( path );
            //Add a new child znode to test watches event notify.
            sample.createPath_sync( path + "/c2", "", CreateMode.PERSISTENT );
            _semaphore.await();
        } catch ( KeeperException e ) {
            System.err.println( "error: " + e.getMessage() );
            e.printStackTrace();
        }
    }

    /**
     * Process when receive watched event
     */
    @Override
    public void process( WatchedEvent event ) {
        System.out.println( "Receive watched event:" + event );
        if ( KeeperState.SyncConnected == event.getState() ) {
            if ( EventType.None == event.getType() && null == event.getPath() ) {
                connectedSemaphore.countDown();
            } else if ( event.getType() == EventType.NodeChildrenChanged ) {
                //children list changed
                try {
                    this.getChildren( event.getPath() );
                    _semaphore.countDown();
                } catch ( Exception e ) {
                    e.printStackTrace();
                }
            }
        }
    }
}

The output is:

Receive watched event:WatchedEvent state:SyncConnected type:None path:null
===Start to get children znodes.===
Get Children znode result: [response code: 0, param path: /get_children_test, ctx: null, children list: [c1], stat: 555,555,1373931727380,1373931727380,0,1,0,0,0,1,556
Receive watched event:WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/get_children_test
===Start to get children znodes.===

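The logic of this demo is essentially the same as the synchronous version; the only difference is that fetching the child list is now asynchronous. With that single change the problem appears: the program hangs on the second attempt to fetch the child list. The application team confirmed that the synchronous version had never behaved this way, so we started looking for where the asynchronous version blocks.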
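One way to narrow down where a hang like this sits is to take a thread dump (for example with jstack) and look for threads parked in a wait. The sketch below does the same thing in-process with the standard ThreadMXBean. The class name BlockedThreadFinder, the helper startParkedThread, and the thread name demo-EventThread are made up for illustration; the parked thread only stands in for a stuck client thread and does not reproduce the ZooKeeper case itself.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class BlockedThreadFinder {

    /** Returns the names of live threads currently in WAITING or TIMED_WAITING state. */
    static List<String> waitingThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> names = new ArrayList<>();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // defensive: skip threads that ended while the dump was taken
            }
            Thread.State state = info.getThreadState();
            if (state == Thread.State.WAITING || state == Thread.State.TIMED_WAITING) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    /** Hypothetical helper: starts a daemon thread that parks on a latch nobody counts down. */
    static Thread startParkedThread(String name) {
        CountDownLatch never = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                never.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(true);
        t.start();
        // Spin until the thread has actually parked inside await().
        while (t.getState() != Thread.State.WAITING) {
            Thread.yield();
        }
        return t;
    }

    public static void main(String[] args) {
        Thread stuck = startParkedThread("demo-EventThread");
        System.out.println("waiting threads: " + waitingThreadNames());
        stuck.interrupt();
    }
}
```

In a real dump the interesting part is the stack of the waiting thread: a frame in CountDownLatch.await underneath a watcher's process method would point straight at the blocking call.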

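The key point is that the ZooKeeper client has to handle two kinds of notifications from the server: watch event notifications, and responses to asynchronous API calls. Notably, in the ZooKeeper client threading model both kinds are handled by the same thread, strictly one after another. In the asynchronous demo, process() runs on that thread and blocks in _semaphore_get_children.await(), while the processResult callback that would count the latch down is queued behind it on the same thread, so neither can ever proceed. For the details, see the core event-handling class, org.apache.zookeeper.ClientCnxn.EventThread.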
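The effect of serial dispatch can be reproduced without a ZooKeeper server. The following is a minimal sketch that uses only the JDK: a single-thread executor stands in for the event thread, one task plays the watcher, and a second task queued from inside it plays the asynchronous callback. SerialDispatchDemo and its method names are hypothetical, and a timeout replaces the unbounded await() so the program terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class SerialDispatchDemo {

    /**
     * The "watcher" queues the "callback" on the same single thread and then waits for it.
     * Returns true if the callback ran while the watcher was still waiting.
     */
    static boolean blockingWatcher(long timeoutMillis) {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        CountDownLatch callbackRan = new CountDownLatch(1);
        CountDownLatch watcherDone = new CountDownLatch(1);
        AtomicBoolean sawCallback = new AtomicBoolean(false);
        eventThread.execute(() -> {
            // The async response is queued behind the task that is running right now.
            eventThread.execute(callbackRan::countDown);
            try {
                sawCallback.set(callbackRan.await(timeoutMillis, TimeUnit.MILLISECONDS));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            watcherDone.countDown();
        });
        awaitQuietly(watcherDone, timeoutMillis * 10);
        eventThread.shutdown(); // already queued tasks still run before the thread exits
        return sawCallback.get();
    }

    /**
     * The "watcher" queues the "callback" and returns at once; the result is consumed
     * inside the callback. Returns true if the callback ran within the timeout.
     */
    static boolean nonBlockingWatcher(long timeoutMillis) {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        CountDownLatch callbackRan = new CountDownLatch(1);
        eventThread.execute(() -> eventThread.execute(callbackRan::countDown));
        boolean ran = awaitQuietly(callbackRan, timeoutMillis);
        eventThread.shutdown();
        return ran;
    }

    private static boolean awaitQuietly(CountDownLatch latch, long millis) {
        try {
            return latch.await(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking watcher saw callback:     " + blockingWatcher(200));
        System.out.println("non-blocking watcher saw callback: " + nonBlockingWatcher(2000));
    }
}
```

Applied to the demo above, the same idea would mean dropping _semaphore_get_children.await() from the path that process() takes and doing any follow-up work inside processResult, so the event thread is free to deliver the response.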
