Doug Lea Paper Reading Notes: JUC Series

3.3 Queues

The heart of the framework is maintenance of queues of blocked threads, which are restricted here to FIFO queues. Thus, the framework does not support priority-based synchronization.

The core is a first-in, first-out queue of blocked threads. That is the heart of it!

These days, there is little controversy that the most appropriate choices for synchronization queues are non-blocking data structures that do not themselves need to be constructed using lower-level locks. And of these, there are two main candidates: variants of Mellor-Crummey and Scott (MCS) locks [9], and variants of Craig, Landin, and Hagersten (CLH) locks [5][8][10]. Historically, CLH locks have been used only in spinlocks. However, they appeared more amenable than MCS for use in the synchronizer framework because they are more easily adapted to handle cancellation and timeouts, so were chosen as a basis. The resulting design is far enough removed from the original CLH structure to require explanation.

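these days: nowadays, at present

little controversy: hardly any dispute

most appropriate choices: the most suitable choices

candidate: an option under consideration

variant: a variation on a form

amenable: compliant; here, readily adaptable

adapted to: modified to suit

The paper makes a point of introducing MCS and CLH queues as the two main candidates for the synchronizer queue's data structure. Although CLH has historically been used only in spinlocks, it is easier to adapt to cancellation and timeouts, so it was chosen as the basis. The final implementation, of course, differs considerably from the original CLH queue.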

A CLH queue is not very queue-like, because its enqueuing and dequeuing operations are intimately tied to its uses as a lock. It is a linked queue accessed via two atomically updatable fields, head and tail, both initially pointing to a dummy node.

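intimately tied to: closely bound up with

atomically: as a single indivisible step

The linked queue is accessed through two atomically updatable fields, head and tail. Atomicity is crucial here, since this is a concurrent setting. The linked structure also preserves ordering.

[Figure: queue structure diagram]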

A new node, node, is enqueued using an atomic operation:

do { 
  pred = tail; 
} while(!tail.compareAndSet(pred, node)); 

The release status for each node is kept in its predecessor node. So, the "spin" of a spinlock looks like:

while (pred.status != RELEASED) ; // spin

A dequeue operation after this spin simply entails setting the head field to the node that just got the lock:

head = node;

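predecessor: the previous one

Yes, the paper goes straight to code. It first stresses that enqueuing is an atomic operation, and that atomic operation is clearly backed by CAS. It then shows where the spin happens: the thread repeatedly checks the status of the node in front of it. Finally comes dequeuing, which only requires setting the head field to the node that just got the lock.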
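The three fragments above can be put together into a small runnable spinlock. This is only a sketch under my own assumptions: the class names `ClhSpinLock` and `ClhSpinLockDemo`, the `released` flag, and the use of `Thread.yield()` inside the spin are not from the paper, and this is not how AQS itself is written.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative CLH-style spinlock; all names here are assumptions, not from the paper.
final class ClhSpinLock {
    private static final class Node {
        volatile boolean released; // read by the successor while it spins
    }

    private final AtomicReference<Node> tail;
    private volatile Node head; // node of the current lock holder (or the dummy)

    ClhSpinLock() {
        Node dummy = new Node();
        dummy.released = true;          // the dummy counts as already released
        head = dummy;
        tail = new AtomicReference<>(dummy);
    }

    void lock() {
        Node node = new Node();
        Node pred;
        do {                            // enqueue: retry until our CAS on tail wins
            pred = tail.get();
        } while (!tail.compareAndSet(pred, node));
        while (!pred.released) {        // spin on the predecessor's status
            Thread.yield();             // assumption: yield so the sketch behaves on few cores
        }
        head = node;                    // dequeue: the new holder becomes head
    }

    void unlock() {
        head.released = true;           // only the holder calls this, so head is its node
    }
}

final class ClhSpinLockDemo {
    // Increments a plain int under the lock from several threads and returns the total.
    static int countWithLock(int threads, int incrementsPerThread) {
        ClhSpinLock lock = new ClhSpinLock();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }
}
```

Each thread spins only on its own predecessor's flag, which is the decentralized release status the paper goes on to describe.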

Among the advantages of CLH locks are that enqueuing and dequeuing are fast, lock-free, and obstruction free (even under contention, one thread will always win an insertion race so will make progress); that detecting whether any threads are waiting is also fast (just check if head is the same as tail); and that release status is decentralized, avoiding some memory contention.

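obstruction: blockage, hindrance

contention: competition, here for a shared resource

decentralized: spread out rather than centrally managed

This passage explains the benefits of the design. With CLH locks, enqueuing and dequeuing are fast and lock-free: even under heavy contention one thread always wins the insertion race, so progress is guaranteed. Checking whether any thread is waiting is also fast, since it only requires comparing head and tail. Finally, the status that each thread spins on is decentralized, which avoids some memory contention.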

In the original versions of CLH locks, there were not even links connecting nodes. In a spinlock, the pred variable can be held as a local. However, Scott and Scherer[10] showed that by explicitly maintaining predecessor fields within nodes, CLH locks can deal with timeouts and other forms of cancellation: If a node's predecessor cancels, the node can slide up to use the previous node's status field.

The main additional modification needed to use CLH queues for blocking synchronizers is to provide an efficient way for one node to locate its successor. In spinlocks, a node need only change its status, which will be noticed on next spin by its successor, so links are unnecessary. But in a blocking synchronizer, a node needs to explicitly wake up (unpark) its successor.

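explicitly: clearly and directly stated

Here the paper starts leading up to its own modifications of the original CLH. The original CLH did not need explicit links between nodes. However, the paper cites Scott and Scherer to show that explicitly maintaining predecessor links lets the lock handle timeouts and cancellation: if the node in front cancels, the node behind it slides up and uses the previous node's status.

The main modification is to give a node an efficient way to locate its successor. The reason is that in a spinlock a node only has to change its status, which is as good as notifying the next node, whereas in a blocking synchronizer a node must explicitly wake up its successor. At last the key word appears: unpark.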

An AbstractQueuedSynchronizer queue node contains a next link to its successor. But because there are no applicable techniques for lock-free atomic insertion of double-linked list nodes using compareAndSet, this link is not atomically set as part of insertion; it is simply assigned:

pred.next = node; 

after the insertion. This is reflected in all usages. The next link is treated only as an optimized path. If a node's successor does not appear to exist (or appears to be cancelled) via its next field, it is always possible to start at the tail of the list and traverse backwards using the pred field to accurately check if there really is one.

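applicable: suitable, relevant

traverse: to walk through a structure element by element

After all that groundwork, the paper can finally say that the queue in AbstractQueuedSynchronizer is explicitly linked, and doubly linked at that. There is no applicable technique for inserting a node into a doubly linked list atomically with CAS, so the next link is not part of the atomic insertion; it is simply assigned afterwards. When the next field looks empty or points to a cancelled node, the real successor can always be found by walking backwards from the tail through the pred links.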
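To make the "next is only an optimized path" idea concrete, here is a minimal sketch. The class `LinkedWaitQueue`, its fields, and the `linkNext` switch (which simulates the window where `pred.next` has not been assigned yet) are all hypothetical names of mine; the real AQS code is more involved.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative doubly linked wait queue; names are assumptions, not the AQS source.
final class LinkedWaitQueue {
    static final class Node {
        final String name;
        volatile Node prev;        // set before the CAS on tail, so always reliable
        volatile Node next;        // assigned after the CAS, so it may lag behind
        volatile boolean cancelled;
        Node(String name) { this.name = name; }
    }

    private final Node head = new Node("dummy");
    private final AtomicReference<Node> tail = new AtomicReference<>(head);

    Node head() { return head; }

    // linkNext=false simulates a thread that won the CAS but has not yet run pred.next = node.
    Node enqueue(String name, boolean linkNext) {
        Node node = new Node(name);
        Node pred;
        do {
            pred = tail.get();
            node.prev = pred;
        } while (!tail.compareAndSet(pred, node));
        if (linkNext) {
            pred.next = node;      // plain assignment, not part of the atomic insertion
        }
        return node;
    }

    // Fast path through next; otherwise walk backwards from tail using prev.
    Node successorOf(Node node) {
        Node s = node.next;
        if (s != null && !s.cancelled) {
            return s;
        }
        Node found = null;
        for (Node p = tail.get(); p != null && p != node; p = p.prev) {
            if (!p.cancelled) {
                found = p;         // keep the live node closest to 'node'
            }
        }
        return found;
    }
}
```

The backward walk is correct because `prev` is written before the node becomes reachable from `tail`, so every node visible from the tail has a valid `prev`.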

A second set of modifications is to use the status field kept in each node for purposes of controlling blocking, not spinning. In the synchronizer framework, a queued thread can only return from an acquire operation if it passes the tryAcquire method defined in a concrete subclass; a single "released" bit does not suffice. But control is still needed to ensure that an active thread is only allowed to invoke tryAcquire when it is at the head of the queue; in which case it may fail to acquire, and (re)block. This does not require a per-node status flag because permission can be determined by checking that the current node's predecessor is the head. And unlike the case of spinlocks, there is not enough memory contention reading head to warrant replication. However, cancellation status must still be present in the status field.

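concrete: specific; here, non-abstract

suffice: to be enough

The second major modification is to use the status field kept in each node to control blocking rather than spinning. In the source code this status field is Node#waitStatus.

This is where the concrete implementation comes in. AQS follows the template method pattern: it leaves hook methods such as tryAcquire for subclasses to implement, and it tells implementers explicitly that the synchronization state is a volatile int. A node decides whether to attempt an acquire by checking whether its predecessor is the head. So no per-node "released" flag is needed; checking against the head is enough, which is already quite different from a spinlock. What, then, are the values stored in each node's Node#waitStatus used for?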

/** waitStatus value to indicate thread has cancelled */
static final int CANCELLED =  1;
/** waitStatus value to indicate successor's thread needs unparking */
static final int SIGNAL    = -1;
/** waitStatus value to indicate thread is waiting on condition */
static final int CONDITION = -2;
/**
 * waitStatus value to indicate the next acquireShared should
 * unconditionally propagate
 */
static final int PROPAGATE = -3;

The queue node status field is also used to avoid needless calls to park and unpark. While these methods are relatively fast as blocking primitives go, they encounter avoidable overhead in the boundary crossing between Java and the JVM runtime and/or OS. Before invoking park, a thread sets a "signal me" bit, and then rechecks synchronization and node status once more before invoking park. A releasing thread clears status. This saves threads from needlessly attempting to block often enough to be worthwhile, especially for lock classes in which lost time waiting for the next eligible thread to acquire a lock accentuates other contention effects. This also avoids requiring a releasing thread to determine its successor unless the successor has set the signal bit, which in turn eliminates those cases where it must traverse multiple nodes to cope with an apparently null next field unless signalling occurs in conjunction with cancellation.

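relatively: comparatively

accentuate: to make more prominent, to emphasize

Checking the node status before calling park lets a thread avoid needless park calls. The related comment in the source code:

Non-negative values mean that a node doesn't need to signal.

The releasing thread clears the node status, so there is no need to inspect the waiting thread's state; checking the node status is enough.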
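The "set the signal-me bit, recheck, then park" protocol can be tried out in isolation. What follows is a single-waiter sketch of my own, not AQS code: `SignalBitGate`, `signalMe`, and `unparkCalls` are hypothetical names, and only `LockSupport.park` and `LockSupport.unpark` are real library calls.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

// Illustrative one-waiter gate; names are assumptions, not the AQS source.
final class SignalBitGate {
    private volatile boolean open;                          // stands in for the synchronization state
    private final AtomicBoolean signalMe = new AtomicBoolean();
    private volatile Thread waiter;
    final AtomicInteger unparkCalls = new AtomicInteger();  // counts real unpark calls

    void awaitOpen() {
        waiter = Thread.currentThread();
        while (!open) {
            signalMe.set(true);          // announce that a wake-up is needed
            if (open) {
                break;                   // recheck once more before blocking
            }
            LockSupport.park(this);      // may wake spuriously; the loop rechecks
        }
    }

    void open() {
        open = true;
        // Only pay for unpark if a waiter actually asked to be signalled.
        if (signalMe.compareAndSet(true, false)) {
            unparkCalls.incrementAndGet();
            LockSupport.unpark(waiter);
        }
    }

    // Runs one waiter against one opener; returns true if the waiter finished.
    static boolean demo() {
        SignalBitGate gate = new SignalBitGate();
        Thread t = new Thread(gate::awaitOpen);
        t.start();
        gate.open();
        try {
            t.join(5000);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return !t.isAlive();
    }
}
```

Because the waiter writes `signalMe` before rereading `open`, and the opener writes `open` before reading `signalMe`, at least one side always notices the other, so the wake-up cannot be lost.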

Perhaps the main difference between the variant of CLH locks used in the synchronizer framework and those employed in other languages is that garbage collection is relied on for managing storage reclamation of nodes, which avoids complexity and overhead. However, reliance on GC does still entail nulling of link fields when they are sure never to be needed. This can normally be done when dequeuing. Otherwise, unused nodes would still be reachable, causing them to be uncollectable.

Some further minor tunings, including lazy initialization of the initial dummy node required by CLH queues upon first contention, are described in the source code documentation in the J2SE1.5 release.

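reclamation: recovery, taking back for reuse

Java comes with garbage collection, so the implementation is simpler than in languages without it. Even so, link fields are set to null on dequeue; otherwise the remaining references would keep the nodes from being collected.

Another optimization: the initial dummy node that CLH queues require is created lazily, on first contention.

The old master says there are many more differences and that you should go read his code. He did indeed write a huge amount of commentary in the source, and it reads like a paper.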

Omitting such details, the general form of the resulting implementation of the basic acquire operation (exclusive, noninterruptible, untimed case only) is:

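// Before enqueuing, try to acquire once; enqueue only if that fails
if (!tryAcquire(arg)) {
  // Create a new node and append it at the tail
  node = create and enqueue new node;
  // The predecessor: initially the node that was tail before this one
  pred = node's effective predecessor;
  // Only a node whose predecessor is head may retry tryAcquire;
  // keep looping while it is not first in line or the retry fails
  while (pred is not head node || !tryAcquire(arg)) {
    // Signal bit already set: pred will unpark us on release, so it is safe to block
    if (pred's signal bit is set)
      park();
    else
      // Otherwise ask pred to signal us, then loop to recheck before parking
      compareAndSet pred's signal bit to true;
    // Recompute the predecessor, skipping any cancelled nodes
    pred = node's effective predecessor;
  }
  // Acquired: this node becomes the new head (the dequeue step)
  head = node;
}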

And the release operation is:

if (tryRelease(arg) && head node's signal bit is set) {
  compareAndSet head's signal bit to false;
  // Wake the node right after head
  unpark head's successor, if one exists
}
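From the subclass side, all of the queueing above is hidden behind acquire and release; a concrete synchronizer only supplies tryAcquire and tryRelease. The sketch below is a non-reentrant mutex in the spirit of the example in the AbstractQueuedSynchronizer Javadoc. The names `SimpleMutex` and `SimpleMutexDemo` are mine, and the state encoding (0 = free, 1 = held) is an assumption of this sketch.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Illustrative non-reentrant mutex built on AQS; state 0 = free, 1 = held (assumed encoding).
final class SimpleMutex {
    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int ignored) {
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;                 // AQS will enqueue and park the caller
        }

        @Override
        protected boolean tryRelease(int ignored) {
            if (getState() == 0) {
                throw new IllegalMonitorStateException();
            }
            setExclusiveOwnerThread(null);
            setState(0);
            return true;                  // tells AQS to unpark head's successor if needed
        }
    }

    private final Sync sync = new Sync();

    void lock() { sync.acquire(1); }
    boolean tryLock() { return sync.tryAcquire(1); }
    void unlock() { sync.release(1); }
}

final class SimpleMutexDemo {
    // Increments a plain int under the mutex from several threads and returns the total.
    static int count(int threads, int incrementsPerThread) {
        SimpleMutex mutex = new SimpleMutex();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    mutex.lock();
                    try {
                        counter[0]++;
                    } finally {
                        mutex.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }
}
```

Note that `lock()` maps to the acquire pseudocode (try once, then enqueue and loop) and `unlock()` maps to the release pseudocode.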

The number of iterations of the main acquire loop depends, of course, on the nature of tryAcquire. Otherwise, in the absence of cancellation, each component of acquire and release is a constant-time O(1) operation, amortized across threads, disregarding any OS thread scheduling occurring within park.

Cancellation support mainly entails checking for interrupt or timeout upon each return from park inside the acquire loop. A cancelled thread due to timeout or interrupt sets its node status and unparks its successor so it may reset links. With cancellation, determining predecessors and successors and resetting status may include O(n) traversals (where n is the length of the queue). Because a thread never again blocks for a cancelled operation, links and status fields tend to restabilize quickly.

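omit: to leave out

the nature of: the essential character of

amortized cost: cost averaged over a sequence of operations

The point worth dwelling on is why the paper calls the acquire pseudocode O(1). My reading is that you have to picture a concurrent scenario: suppose 100 threads contend for the lock at once. Each round of the enqueue loop guarantees that exactly one node gets in, so in the worst case the total number of attempts is 1+2+3+...+100, and the per-thread average is (1+2+3+...+n)/n. That average works out to (n+1)/2, which still grows with n, so this count alone does not yield a constant. The paper's O(1) reads more naturally as a statement about each individual step (one CAS, one field write, one unpark), with the number of loop iterations explicitly set aside as depending on tryAcquire.

As for cancellation, the worst case requires traversing the whole queue, for example when every node in the queue is cancelled, so the complexity is O(n).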

3.4 Condition Queues

The synchronizer framework provides a ConditionObject class for use by synchronizers that maintain exclusive synchronization and conform to the Lock interface. Any number of condition objects may be attached to a lock object, providing classic monitor-style await, signal, and signalAll operations, including those with timeouts, along with some inspection and monitoring methods.

The ConditionObject class enables conditions to be efficiently integrated with other synchronization operations, again by fixing some design decisions. This class supports only Java-style monitor access rules in which condition operations are legal only when the lock owning the condition is held by the current thread (See [4] for discussion of alternatives). Thus, a ConditionObject attached to a ReentrantLock acts in the same way as do built-in monitors (via Object.wait etc), differing only in method names, extra functionality, and the fact that users can declare multiple conditions per lock.

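inspection: examination, checking

attached to: belonging to, associated with

integrate with: to combine with

exclusive: held by only one owner

The paper moves on to Condition. The framework provides a ConditionObject for synchronizers to use. Every condition must belong to a lock, and, just as with Object's wait, notify, and notifyAll, the lock must be held before calling await, signal, or signalAll. A single lock can have multiple conditions associated with it.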
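The "multiple conditions per lock" point is easiest to see with a bounded buffer, modelled on the well-known example in the Condition Javadoc. The class names `BoundedBuffer` and `BoundedBufferDemo` and the helper `sumThroughBuffer` are my own; this is a sketch, not code from the paper.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative bounded buffer with two conditions on one lock.
final class BoundedBuffer<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();   // producers wait here
    private final Condition notEmpty = lock.newCondition();  // consumers wait here
    private final Object[] items;
    private int putIndex, takeIndex, count;

    BoundedBuffer(int capacity) { items = new Object[capacity]; }

    void put(T item) throws InterruptedException {
        lock.lock();
        try {
            while (count == items.length) {
                notFull.await();          // releases the lock while waiting
            }
            items[putIndex] = item;
            putIndex = (putIndex + 1) % items.length;
            count++;
            notEmpty.signal();            // wake one consumer only
        } finally {
            lock.unlock();
        }
    }

    @SuppressWarnings("unchecked")
    T take() throws InterruptedException {
        lock.lock();
        try {
            while (count == 0) {
                notEmpty.await();
            }
            T item = (T) items[takeIndex];
            items[takeIndex] = null;
            takeIndex = (takeIndex + 1) % items.length;
            count--;
            notFull.signal();             // wake one producer only
            return item;
        } finally {
            lock.unlock();
        }
    }
}

final class BoundedBufferDemo {
    // Sends 1..n through a capacity-2 buffer and returns the sum seen by the consumer.
    static long sumThroughBuffer(int n) {
        BoundedBuffer<Integer> buffer = new BoundedBuffer<>(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) {
                    buffer.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        long sum = 0;
        try {
            for (int i = 0; i < n; i++) {
                sum += buffer.take();
            }
            producer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return sum;
    }
}
```

With built-in monitors both kinds of waiters would share one wait set and need notifyAll; two conditions let each side signal only the threads that can actually make progress.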

A ConditionObject uses the same internal queue nodes as synchronizers, but maintains them on a separate condition queue. The signal operation is implemented as a queue transfer from the condition queue to the lock queue, without necessarily waking up the signalled thread before it has re-acquired its lock.

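ConditionObject is built from the same internal queue nodes described earlier, but keeps them on a separate queue. The signal operation is a transfer from the condition queue to the lock queue. This is the key mechanism behind the Condition implementation.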

The basic await operation is:

create and add new node to condition queue; 
release lock;
block until node is on lock queue; 
re-acquire lock;

And the signal operation is:

transfer the first node from condition queue to lock queue;

Because these operations are performed only when the lock is held, they can use sequential linked queue operations (using a nextWaiter field in nodes) to maintain the condition queue. The transfer operation simply unlinks the first node from the condition queue, and then uses CLH insertion to attach it to the lock queue.

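sequential: in order, one after another

The above is the pseudocode for await and signal. Because condition queue operations are performed only while the thread holds the lock, the nextWaiter field that links the nodes does not need to be declared volatile.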
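The queue transfer can be observed from the outside with ReentrantLock's monitoring methods. In this sketch (the class `SignalTransferDemo` and method `observe` are my own names) the main thread holds the lock, signals, and then checks that the waiter has left the condition queue and is now queued on the lock, even though it cannot have re-acquired the lock yet.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.LockSupport;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative probe of signal as a condition-queue to lock-queue transfer.
final class SignalTransferDemo {
    // Returns {condition waiters before signal, condition waiters after signal,
    //          1 if the waiter is queued on the lock after signal, else 0}.
    static int[] observe() {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        Thread waiter = new Thread(() -> {
            lock.lock();
            try {
                cond.awaitUninterruptibly();
            } finally {
                lock.unlock();
            }
        });
        waiter.start();

        int[] result = new int[3];
        boolean done = false;
        while (!done) {
            lock.lock();
            try {
                if (lock.hasWaiters(cond)) {          // waiter is on the condition queue
                    result[0] = lock.getWaitQueueLength(cond);
                    cond.signal();                    // transfer happens while we hold the lock
                    result[1] = lock.getWaitQueueLength(cond);
                    result[2] = lock.hasQueuedThread(waiter) ? 1 : 0;
                    done = true;
                }
            } finally {
                lock.unlock();
            }
            if (!done) {
                LockSupport.parkNanos(1_000_000L);    // give the waiter time to start awaiting
            }
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }
}
```

The waiter only returns from its await after the main thread releases the lock, which matches the "re-acquire lock" step in the await pseudocode.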

The main complication in implementing these operations is dealing with cancellation of condition waits due to timeouts or Thread.interrupt. A cancellation and signal occurring at approximately the same time encounter a race whose outcome conforms to the specifications for built-in monitors. As revised in JSR133, these require that if an interrupt occurs before a signal, then the await method must, after reacquiring the lock, throw InterruptedException. But if it is interrupted after a signal, then the method must return without throwing an exception, but with its thread interrupt status set.

To maintain proper ordering, a bit in the queue node status records whether the node has been (or is in the process of being) transferred. Both the signalling code and the cancelling code try to compareAndSet this status. If a signal operation loses this race, it instead transfers the next node on the queue, if one exists. If a cancellation loses, it must abort the transfer, and then await lock re-acquisition. This latter case introduces a potentially unbounded spin. A cancelled wait cannot commence lock reacquisition until the node has been successfully inserted on the lock queue, so must spin waiting for the CLH queue insertion compareAndSet being performed by the signalling thread to succeed. The need to spin here is rare, and employs a Thread.yield to provide a scheduling hint that some other thread, ideally the one doing the signal, should instead run. While it would be possible to implement here a helping strategy for the cancellation to insert the node, the case is much too rare to justify the added overhead that this would entail. In all other cases, the basic mechanics here and elsewhere use no spins or yields, which maintains reasonable performance on uniprocessors.

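complication: a complicating difficulty

approximately: roughly

entail: to require, to make necessary

reasonable: acceptable, sensible

Restated: under the JSR133 revision, an interrupt that arrives before a signal makes await throw InterruptedException once it has reacquired the lock, while an interrupt that arrives after a signal makes await return normally with the thread's interrupt status set.

The old master explains that the hard part of implementing conditions is handling signal and cancellation happening concurrently. This is worth revisiting carefully when analyzing the source code.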
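Both orderings can be reproduced with a ReentrantLock condition. This is a sketch with names of my own (`AwaitInterruptDemo`, `run`, `signalFirst`); the controlling thread does the signal and the interrupt while holding the lock, so their order is fixed before the waiter can react.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.LockSupport;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative check of the interrupt-before-signal and interrupt-after-signal rules.
final class AwaitInterruptDemo {
    // signalFirst=false: interrupt only. signalFirst=true: signal, then interrupt.
    static String run(boolean signalFirst) {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        String[] outcome = new String[1];
        Thread waiter = new Thread(() -> {
            lock.lock();
            try {
                cond.await();
                outcome[0] = "returned, interrupted=" + Thread.currentThread().isInterrupted();
            } catch (InterruptedException e) {
                outcome[0] = "threw InterruptedException";
            } finally {
                lock.unlock();
            }
        });
        waiter.start();

        boolean done = false;
        while (!done) {
            lock.lock();
            try {
                if (lock.hasWaiters(cond)) {      // waiter is parked on the condition
                    if (signalFirst) {
                        cond.signal();            // node is transferred before the interrupt
                    }
                    waiter.interrupt();
                    done = true;
                }
            } finally {
                lock.unlock();
            }
            if (!done) {
                LockSupport.parkNanos(1_000_000L);
            }
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return outcome[0];
    }
}
```

Calling `run(false)` exercises the interrupt-before-signal rule and `run(true)` the interrupt-after-signal rule; in both cases the waiter only reports its outcome after it has reacquired the lock.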
