Curator Notes: Implementation and Principles of Distributed Locks

1. Distributed locks

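When we build a single-machine application and need to synchronize concurrent code, we usually rely on synchronized or Lock to coordinate threads. Once the application is deployed across multiple nodes, however, these JVM-local mechanisms are no longer enough, and we need a higher-level lock that synchronizes code across processes. Several distributed lock implementations are in common use today, as shown in the figure below:

Figure: Common distributed lock implementations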

Each of these approaches has its own strengths and weaknesses, compared below:

Figure: Comparison of strengths and weaknesses


2. Distributed locks in Curator

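This post focuses on the ZooKeeper-based distributed lock provided by Curator. As you get to know Curator, you will be pleasantly surprised to find that it offers far more than distributed locks. I will not go into that here and plan to cover it in separate posts; interested readers can download the project and read the source themselves: apache/curator
For now, let's look at the lock recipes Curator offers:

Figure: The four lock recipes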

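  • InterProcessMutex: distributed re-entrant exclusive lock
  • InterProcessSemaphoreMutex: distributed exclusive lock (not re-entrant)
  • InterProcessReadWriteLock: distributed read/write lock
  • InterProcessMultiLock: a container that manages multiple locks as a single entity

Next, taking InterProcessMutex as the example, we walk through how this distributed re-entrant exclusive lock is implemented.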


3. Tracing the InterProcessMutex code

I. Acquiring the lock

1) Instantiating InterProcessMutex:

// Source: InterProcessMutex.java
    /**
     * @param client client
     * @param path   the path to lock
     */
    public InterProcessMutex(CuratorFramework client, String path)
    {
        this(client, path, new StandardLockInternalsDriver());
    }
    /**
     * @param client client
     * @param path   the path to lock
     * @param driver lock driver
     */
    public InterProcessMutex(CuratorFramework client, String path, LockInternalsDriver driver)
    {
        this(client, path, LOCK_NAME, 1, driver);
    }

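Parameters shared by both constructors:

  • client: the ZooKeeper client implemented by Curator
  • path: the ZooKeeper path to lock, i.e. the parent of the ephemeral nodes created later

As the two constructors above show, the first simply delegates to the second, passing in a default StandardLockInternalsDriver, the standard lock driver class (its role is explained later). In other words, InterProcessMutex also lets you pass in a custom lock driver to extend its behavior.

// Source: InterProcessMutex.java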
InterProcessMutex(CuratorFramework client, String path, String lockName, int maxLeases, LockInternalsDriver driver)
    {
        basePath = PathUtils.validatePath(path);
        internals = new LockInternals(client, driver, path, lockName, maxLeases);
    }
// Source: LockInternals.java
LockInternals(CuratorFramework client, LockInternalsDriver driver, String path, String lockName, int maxLeases)
    {
        this.driver = driver;
        this.lockName = lockName;
        this.maxLeases = maxLeases;

        this.client = client.newWatcherRemoveCuratorFramework();
        this.basePath = PathUtils.validatePath(path);
        this.path = ZKPaths.makePath(path, lockName);
    }

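Following the constructor to the end, it does two things: it validates the path argument and instantiates a LockInternals object.

2) The acquire method:
With the InterProcessMutex instantiated, we call acquire() to try to take the lock:

// Source: InterProcessMutex.java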
   /**
     * Acquire the mutex - blocking until it's available. Note: the same thread
     * can call acquire re-entrantly. Each call to acquire must be balanced by a call
     * to {@link #release()}
     *
     * @throws Exception ZK errors, connection interruptions
     */
    @Override
    public void acquire() throws Exception
    {
        if ( !internalLock(-1, null) )
        {
            throw new IOException("Lost connection while trying to acquire lock: " + basePath);
        }
    }

    /**
     * Acquire the mutex - blocks until it's available or the given time expires. Note: the same thread
     * can call acquire re-entrantly. Each call to acquire that returns true must be balanced by a call
     * to {@link #release()}
     *
     * @param time time to wait
     * @param unit time unit
     * @return true if the mutex was acquired, false if not
     * @throws Exception ZK errors, connection interruptions
     */
    @Override
    public boolean acquire(long time, TimeUnit unit) throws Exception
    {
        return internalLock(time, unit);
    }
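  • acquire(): takes no arguments. The call blocks until the lock is obtained, or throws an exception if the ZooKeeper connection is lost.
  • acquire(long time, TimeUnit unit): takes a timeout and its unit. If the lock is still unavailable when the timeout expires, the call returns false.

Choose whichever of the two fits your business logic. In general, though, I recommend the latter: passing a timeout avoids a build-up of ephemeral nodes and blocked threads.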
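
The timed pattern can be sketched as follows. To stay runnable without a ZooKeeper ensemble, the sketch uses a hypothetical TimedLock interface whose two methods only mirror the acquire(long, TimeUnit) and release() signatures shown above, backed by a JVM-local ReentrantLock. The names TimedLock, LocalTimedLock and runWithLock are made up for illustration and are not part of Curator.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical interface: mirrors the two InterProcessMutex methods used here.
interface TimedLock {
    boolean acquire(long time, TimeUnit unit) throws Exception;
    void release() throws Exception;
}

// JVM-local stand-in for demonstration only; it is NOT a distributed lock.
class LocalTimedLock implements TimedLock {
    private final ReentrantLock delegate = new ReentrantLock();

    @Override
    public boolean acquire(long time, TimeUnit unit) throws InterruptedException {
        return delegate.tryLock(time, unit);
    }

    @Override
    public void release() {
        delegate.unlock();
    }
}

class TimedAcquireDemo {
    // Runs the task only if the lock is obtained within the timeout.
    // Returns true if the task ran, false if the lock request timed out.
    static boolean runWithLock(TimedLock lock, long time, TimeUnit unit, Runnable task) throws Exception {
        if (!lock.acquire(time, unit)) {
            return false; // timed out: give up instead of blocking indefinitely
        }
        try {
            task.run();
            return true;
        } finally {
            lock.release(); // each successful acquire is balanced by one release
        }
    }

    public static void main(String[] args) throws Exception {
        TimedLock lock = new LocalTimedLock();
        boolean ran = runWithLock(lock, 3, TimeUnit.SECONDS, () -> System.out.println("working under lock"));
        System.out.println("task ran: " + ran);
    }
}
```

The point of the try/finally is that release() is reached only when acquire returned true, which matches the Javadoc above: each call to acquire that returns true must be balanced by a call to release().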

3) Lock re-entrancy:

// Source: InterProcessMutex.java
private boolean internalLock(long time, TimeUnit unit) throws Exception
    {
        /*
           Note on concurrency: a given lockData instance
           can be only acted on by a single thread so locking isn't necessary
        */

        Thread currentThread = Thread.currentThread();

        LockData lockData = threadData.get(currentThread);
        if ( lockData != null )
        {
            // re-entering
            lockData.lockCount.incrementAndGet();
            return true;
        }
        String lockPath = internals.attemptLock(time, unit, getLockNodeBytes());
        if ( lockPath != null )
        {
            LockData newLockData = new LockData(currentThread, lockPath);
            threadData.put(currentThread, newLockData);
            return true;
        }
        return false;
    }

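This code implements re-entrancy. Each InterProcessMutex instance holds a threadData object of type ConcurrentMap, keyed by Thread with LockData as the value. It checks whether threadData already has an entry for the current thread: if so, the thread is re-entering the lock, and the lockCount of its lockData is incremented; if not, the thread competes for the lock.
When internals.attemptLock returns a non-null lockPath, the thread has successfully acquired the lock, so a new LockData object is created and stored in threadData.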
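
To see the bookkeeping in isolation, here is a minimal sketch with ZooKeeper taken out of the picture. The class ReentrantBookkeeping and its attemptLock stub are assumptions made for illustration: the stub always succeeds and just counts how many "nodes" would have been created, so we can observe that re-entering does not create another one.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified model of the per-thread bookkeeping done by internalLock.
class ReentrantBookkeeping {
    // Simplified counterpart of LockData: the owned node path and the hold count.
    static final class LockData {
        final String lockPath;
        final AtomicInteger lockCount = new AtomicInteger(1);

        LockData(String lockPath) {
            this.lockPath = lockPath;
        }
    }

    private final ConcurrentMap<Thread, LockData> threadData = new ConcurrentHashMap<>();
    private final AtomicInteger nodesCreated = new AtomicInteger();

    // Hypothetical stub for internals.attemptLock: always succeeds in this sketch.
    private String attemptLock() {
        return String.format("/locks/demo/lock-%010d", nodesCreated.getAndIncrement());
    }

    boolean acquire() {
        Thread current = Thread.currentThread();
        LockData data = threadData.get(current);
        if (data != null) {
            data.lockCount.incrementAndGet(); // re-entering: no new node is created
            return true;
        }
        String lockPath = attemptLock();
        if (lockPath != null) {
            threadData.put(current, new LockData(lockPath));
            return true;
        }
        return false;
    }

    // Number of times the calling thread currently holds the lock.
    int holdCount() {
        LockData data = threadData.get(Thread.currentThread());
        return data == null ? 0 : data.lockCount.get();
    }

    // Number of times the (stubbed) real lock attempt was made.
    int nodesCreated() {
        return nodesCreated.get();
    }
}
```

Calling acquire() twice from the same thread leaves holdCount() at 2 while nodesCreated() stays at 1, which is exactly the re-entrant behavior described above.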

4) Competing for the lock:
Now for the main event. The attemptLock method is the core, so let's go straight to the code:

// Source: LockInternals.java
String attemptLock(long time, TimeUnit unit, byte[] lockNodeBytes) throws Exception
    {
        final long      startMillis = System.currentTimeMillis();
        final Long      millisToWait = (unit != null) ? unit.toMillis(time) : null;
        final byte[]    localLockNodeBytes = (revocable.get() != null) ? new byte[0] : lockNodeBytes;
        int             retryCount = 0;

        String          ourPath = null;
        boolean         hasTheLock = false;
        boolean         isDone = false;
        while ( !isDone )
        {
            isDone = true;

            try
            {
                ourPath = driver.createsTheLock(client, path, localLockNodeBytes);
                hasTheLock = internalLockLoop(startMillis, millisToWait, ourPath);
            }
            catch ( KeeperException.NoNodeException e )
            {
                // gets thrown by StandardLockInternalsDriver when it can't find the lock node
                // this can happen when the session expires, etc. So, if the retry allows, just try it all again
                if ( client.getZookeeperClient().getRetryPolicy().allowRetry(retryCount++, System.currentTimeMillis() - startMillis, RetryLoop.getDefaultRetrySleeper()) )
                {
                    isDone = false;
                }
                else
                {
                    throw e;
                }
            }
        }

        if ( hasTheLock )
        {
            return ourPath;
        }
        return null;
    }
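
The isDone loop in the code above is a compact retry idiom that is easy to misread, so here is a stripped-down sketch of the same shape. Everything in it is a simplification for illustration: NodeVanishedException is a hypothetical unchecked stand-in for KeeperException.NoNodeException, and a plain maxRetries counter stands in for the client's retry policy.

```java
import java.util.function.Supplier;

class RetryLoopSketch {
    // Hypothetical stand-in for KeeperException.NoNodeException.
    static class NodeVanishedException extends RuntimeException {
    }

    // Same shape as the loop in attemptLock: assume one pass is enough, and
    // reopen the loop only if the node vanished and retries remain.
    static <T> T attempt(Supplier<T> createAndWait, int maxRetries) {
        int retryCount = 0;
        T result = null;
        boolean isDone = false;
        while (!isDone) {
            isDone = true;
            try {
                result = createAndWait.get();
            } catch (NodeVanishedException e) {
                if (retryCount++ < maxRetries) {
                    isDone = false; // run the whole create-and-wait sequence again
                } else {
                    throw e; // retries exhausted: surface the failure
                }
            }
        }
        return result;
    }
}
```

Any other exception leaves the loop immediately, and a normal return ends it after one pass because isDone was set to true at the top of the iteration.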

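Three points are worth noting here:

  • 1. The while loop
    Normally the loop exits after a single pass. If a NoNodeException is thrown, however, the lock attempt is retried a limited number of times, as allowed by the ZooKeeper client's retry policy.
  • 2. driver.createsTheLock
    As the name suggests, the driver's createsTheLock method creates the lock node: an ephemeral sequential node under the given ZooKeeper path. Note that this only creates a node; it does not mean the thread holds the lock yet.
// Source: StandardLockInternalsDriver.java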
    @Override
    public String createsTheLock(CuratorFramework client, String path, byte[] lockNodeBytes) throws Exception
    {
        String ourPath;
        if ( lockNodeBytes != null )
        {
            ourPath = client.create().creatingParentContainersIfNeeded().withProtection().withMode(CreateMode.EPHEMERAL_SEQUENTIAL).forPath(path, lockNodeBytes);
        }
        else
        {
            ourPath = client.create().creatingParentContainersIfNeeded().withProtection().withMode(CreateMode.EPHEMERAL_SEQUENTIAL).forPath(path);
        }
        return ourPath;
    }
  • 3. internalLockLoop
    Determines whether this client can hold the lock. If not, the thread waits until it is woken up.
// Source: LockInternals.java
private boolean internalLockLoop(long startMillis, Long millisToWait, String ourPath) throws Exception
    {
        boolean     haveTheLock = false;
        boolean     doDelete = false;
        try
        {
            if ( revocable.get() != null )
            {
                client.getData().usingWatcher(revocableWatcher).forPath(ourPath);
            }

            while ( (client.getState() == CuratorFrameworkState.STARTED) && !haveTheLock )
            {
                List<String>        children = getSortedChildren();
                String              sequenceNodeName = ourPath.substring(basePath.length() + 1); // +1 to include the slash

                PredicateResults    predicateResults = driver.getsTheLock(client, children, sequenceNodeName, maxLeases);
                if ( predicateResults.getsTheLock() )
                {
                    haveTheLock = true;
                }
                else
                {
                    String  previousSequencePath = basePath + "/" + predicateResults.getPathToWatch();

                    synchronized(this)
                    {
                        try 
                        {
                            // use getData() instead of exists() to avoid leaving unneeded watchers which is a type of resource leak
                            client.getData().usingWatcher(watcher).forPath(previousSequencePath);
                            if ( millisToWait != null )
                            {
                                millisToWait -= (System.currentTimeMillis() - startMillis);
                                startMillis = System.currentTimeMillis();
                                if ( millisToWait <= 0 )
                                {
                                    doDelete = true;    // timed out - delete our node
                                    break;
                                }

                                wait(millisToWait);
                            }
                            else
                            {
                                wait();
                            }
                        }
                        catch ( KeeperException.NoNodeException e ) 
                        {
                            // it has been deleted (i.e. lock released). Try to acquire again
                        }
                    }
                }
            }
        }
        catch ( Exception e )
        {
            ThreadUtils.checkInterrupted(e);
            doDelete = true;
            throw e;
        }
        finally
        {
            if ( doDelete )
            {
                deleteOurPath(ourPath);
            }
        }
        return haveTheLock;
    }

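Whew, another big block of code. Let's go through just the important parts.

  • The while loop

If you called the no-argument acquire(), this loop can in effect run forever. A new iteration starts as long as the Curator client is in the STARTED state and the current thread has not yet obtained the lock.

  • getSortedChildren

This method is fairly simple: it fetches the list of all child nodes and sorts it in ascending order by the 10-digit sequence number at the end of each node name. As mentioned above, the nodes created are sequential nodes. An example of the generated nodes:

Figure: ZooKeeper sequential nodes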
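
The ordering step can be sketched in plain Java. This is not Curator's implementation, just a small model of the idea: strip everything up to and including the lock name, then order by what is left. The node names used with it are made up, and the sketch assumes every name ends with the lock name followed by a zero-padded, fixed-width sequence number.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SequenceSortSketch {
    // Keeps only the part after the last occurrence of the lock name,
    // i.e. the sequence digits. Names without the lock name are returned as-is.
    static String sequenceOf(String nodeName, String lockName) {
        int index = nodeName.lastIndexOf(lockName);
        return index >= 0 ? nodeName.substring(index + lockName.length()) : nodeName;
    }

    // Returns a new list ordered by sequence number, smallest first.
    static List<String> sortBySequence(List<String> children, String lockName) {
        List<String> sorted = new ArrayList<>(children);
        // Zero-padded digits of equal width sort correctly as plain strings.
        sorted.sort(Comparator.comparing((String name) -> sequenceOf(name, lockName)));
        return sorted;
    }
}
```

Sorting by the suffix rather than the whole name matters because the random prefix in front of the lock name would otherwise dominate the comparison.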

  • driver.getsTheLock
// Source: StandardLockInternalsDriver.java
@Override
    public PredicateResults getsTheLock(CuratorFramework client, List<String> children, String sequenceNodeName, int maxLeases) throws Exception
    {
        int             ourIndex = children.indexOf(sequenceNodeName);
        validateOurIndex(sequenceNodeName, ourIndex);

        boolean         getsTheLock = ourIndex < maxLeases;
        String          pathToWatch = getsTheLock ? null : children.get(ourIndex - maxLeases);

        return new PredicateResults(pathToWatch, getsTheLock);
    }

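This decides whether the lock can be held. The rule: our node's index in the sorted child list from the previous step must be less than maxLeases, which for InterProcessMutex (maxLeases = 1) means our node must be first in the list.
If it is, the lock can be held: getsTheLock = true is wrapped in a PredicateResults and returned.
If it is not, another thread already holds the lock: getsTheLock = false, and the name of the node just ahead of ours is returned as pathToWatch. (Note this pathToWatch; it plays a key role later.)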
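
The rule is small enough to try out directly. The sketch below reproduces the index arithmetic from getsTheLock without any ZooKeeper types; the Result class is a simplified stand-in for PredicateResults, and IllegalStateException is used here in place of whatever validateOurIndex throws in the real driver.

```java
import java.util.List;

class LeasePredicateSketch {
    // Simplified stand-in for PredicateResults.
    static final class Result {
        final boolean getsTheLock;
        final String pathToWatch; // null when the lock is granted

        Result(boolean getsTheLock, String pathToWatch) {
            this.getsTheLock = getsTheLock;
            this.pathToWatch = pathToWatch;
        }
    }

    // sortedChildren must already be ordered by sequence number, smallest first.
    static Result evaluate(List<String> sortedChildren, String ourNode, int maxLeases) {
        int ourIndex = sortedChildren.indexOf(ourNode);
        if (ourIndex < 0) {
            throw new IllegalStateException("Our node is missing: " + ourNode);
        }
        boolean getsTheLock = ourIndex < maxLeases;
        // With maxLeases = 1 this is the node immediately ahead of ours.
        String pathToWatch = getsTheLock ? null : sortedChildren.get(ourIndex - maxLeases);
        return new Result(getsTheLock, pathToWatch);
    }
}
```

With three children and maxLeases = 1, only the first node is granted the lock and each later node watches its direct predecessor. With maxLeases = 2 the first two nodes are granted and the third watches the first, since that is the node whose removal would free a lease for it.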

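  • synchronized(this)

This block sits in the path taken after losing the competition for the lock. So what should the thread do now?
First it registers a watcher on the path formed by prepending basePath + "/" to the pathToWatch returned in the previous step. In other words, the thread watches only the node just ahead of its own, not every node under the parent. Then it simply calls wait(millisToWait), or wait() when no timeout was given: the thread gives up the CPU and waits to be woken.
The rest follows naturally: when the watched node changes, the thread is woken from its wait and starts a new round of competing for the lock.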
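
The wait-and-wake handshake can be modeled with a plain Java monitor. This is a sketch under stated assumptions rather than Curator's code: onPredecessorChanged plays the part of the watcher callback, and a boolean flag is added to guard against spurious wake-ups, whereas the loop shown above simply re-reads the child list after every wake-up.

```java
class WatchAndWaitSketch {
    private boolean predecessorChanged = false;

    // Stands in for the watcher callback fired when the watched node changes.
    synchronized void onPredecessorChanged() {
        predecessorChanged = true;
        notifyAll(); // wake the thread parked in awaitPredecessor
    }

    // Waits until the watcher fires or the timeout elapses.
    // Returns true if woken by the watcher, false on timeout.
    synchronized boolean awaitPredecessor(long millisToWait) throws InterruptedException {
        long deadline = System.currentTimeMillis() + millisToWait;
        while (!predecessorChanged) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false; // timed out; the real code deletes its own node here
            }
            wait(remaining); // releases the monitor while waiting
        }
        return true;
    }
}
```

Because each waiter watches a single predecessor, releasing a lock wakes one waiter rather than all of them, which avoids the thundering herd you would get by watching the parent node.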

This completes the lock acquisition process.

II. Releasing the lock

Compared with the lengthy acquisition path above, the release logic is simple.

// Source: InterProcessMutex.java
/**
     * Perform one release of the mutex if the calling thread is the same thread that acquired it. If the
     * thread had made multiple calls to acquire, the mutex will still be held when this method returns.
     *
     * @throws Exception ZK errors, interruptions, current thread does not own the lock
     */
    @Override
    public void release() throws Exception
    {
        /*
            Note on concurrency: a given lockData instance
            can be only acted on by a single thread so locking isn't necessary
         */

        Thread currentThread = Thread.currentThread();
        LockData lockData = threadData.get(currentThread);
        if ( lockData == null )
        {
            throw new IllegalMonitorStateException("You do not own the lock: " + basePath);
        }

        int newLockCount = lockData.lockCount.decrementAndGet();
        if ( newLockCount > 0 )
        {
            return;
        }
        if ( newLockCount < 0 )
        {
            throw new IllegalMonitorStateException("Lock count has gone negative for lock: " + basePath);
        }
        try
        {
            internals.releaseLock(lockData.lockPath);
        }
        finally
        {
            threadData.remove(currentThread);
        }
    }
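  • Decrement the re-entrancy count; nothing more happens until it reaches 0.
  • Release the lock, i.e. remove the watchers and delete the node that was created.
  • Remove the current thread's entry from threadData.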
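
The three steps above can be exercised with a small in-memory model. As before this is only a sketch: ReleaseSketch is a made-up class, its acquire() is a shortcut that just bumps a per-thread counter, and nodesDeleted stands in for the call to internals.releaseLock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class ReleaseSketch {
    private final ConcurrentMap<Thread, AtomicInteger> holds = new ConcurrentHashMap<>();
    private int nodesDeleted = 0;

    // Shortcut for the acquire path: only tracks the per-thread hold count.
    void acquire() {
        holds.computeIfAbsent(Thread.currentThread(), t -> new AtomicInteger()).incrementAndGet();
    }

    void release() {
        Thread current = Thread.currentThread();
        AtomicInteger count = holds.get(current);
        if (count == null) {
            throw new IllegalMonitorStateException("You do not own the lock");
        }
        int newCount = count.decrementAndGet();
        if (newCount > 0) {
            return; // still held re-entrantly: keep the node
        }
        if (newCount < 0) {
            throw new IllegalMonitorStateException("Lock count has gone negative");
        }
        try {
            nodesDeleted++; // stand-in for internals.releaseLock(lockPath)
        } finally {
            holds.remove(current); // forget this thread even if the delete failed
        }
    }

    int nodesDeleted() {
        return nodesDeleted;
    }
}
```

After two acquire() calls, the first release() leaves the node in place and only the second one deletes it; a third release() fails with IllegalMonitorStateException because the thread no longer owns the lock.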

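III. The lock driver class

Earlier we mentioned StandardLockInternalsDriver, the standard lock driver class, and noted that a custom driver can be passed in to extend the behavior.
So let's look at the interface methods it provides:

// Source: LockInternalsDriver.java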
public PredicateResults getsTheLock(CuratorFramework client, List<String> children, String sequenceNodeName, int maxLeases) throws Exception;

public String createsTheLock(CuratorFramework client,  String path, byte[] lockNodeBytes) throws Exception;

// Source: LockInternalsSorter.java
public String           fixForSorting(String str, String lockName);
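  • getsTheLock: determines whether the lock has been obtained.
  • createsTheLock: creates an ephemeral sequential node under the given ZooKeeper path.
  • fixForSorting: normalizes a node name for sorting; in StandardLockInternalsDriver it extracts the trailing sequence number of the ephemeral node, which is then used for ordering.

With this class we can try to implement our own locking behavior, for example by changing the policy that decides who obtains the lock, or by customizing how the child node list is sorted.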


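4. Summary of how InterProcessMutex works

InterProcessMutex implements a distributed lock by creating ephemeral sequential nodes under a given ZooKeeper path. Before acquiring the same lock, every thread (across processes) must create a node under the same path, with a name made up of a UUID plus an incrementing sequence number. Each thread then checks whether its own sequence number is first among all the children to decide whether it has obtained the lock. If it has not, it adds a watcher on the node just ahead of it and enters a waiting state, until either a watcher event wakes it or the timeout expires and the call returns.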
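
To tie the pieces together, here is a tiny single-threaded model of the queue that forms under the lock path. It is a thought experiment in code, not a client: LockQueueModel is a made-up class, a TreeMap stands in for the children of the lock path, and deleting an entry models a release, a timeout, or a lost session.

```java
import java.util.TreeMap;

class LockQueueModel {
    // sequence number -> client name; the TreeMap keeps entries in sequence order
    private final TreeMap<Integer, String> nodes = new TreeMap<>();
    private int nextSequence = 0;

    // Models creating an ephemeral sequential node; returns its sequence number.
    int createNode(String client) {
        int sequence = nextSequence++;
        nodes.put(sequence, client);
        return sequence;
    }

    // The client whose node has the smallest sequence number holds the lock.
    String holder() {
        return nodes.isEmpty() ? null : nodes.firstEntry().getValue();
    }

    // The node a waiter would watch: its nearest predecessor, or null if it is first.
    Integer nodeToWatch(int sequence) {
        return nodes.lowerKey(sequence);
    }

    // Models release, timeout, or session loss: the node disappears.
    void deleteNode(int sequence) {
        nodes.remove(sequence);
    }

    public static void main(String[] args) {
        LockQueueModel model = new LockQueueModel();
        int a = model.createNode("A");
        int b = model.createNode("B");
        System.out.println("holder: " + model.holder());
        System.out.println("B watches node: " + model.nodeToWatch(b));
        model.deleteNode(a); // A releases the lock
        System.out.println("holder after release: " + model.holder());
    }
}
```

One case worth trying is deleting a node in the middle of the queue: the waiter behind it is woken, finds it is still not first, and ends up watching the next node ahead, which is why the real loop re-reads the children after every wake-up.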


