Source Code Analysis: What is ConcurrentHashMap

A hash table supporting full concurrency of retrievals and high expected concurrency for updates. This class obeys the same functional specification as {@link java.util.Hashtable}, and includes versions of methods corresponding to each method of {@code Hashtable}. However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access. This class is fully interoperable with {@code Hashtable} in programs that rely on its thread safety but not on its synchronization details.

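Put plainly: this is a hash table that fully supports concurrent retrievals and is designed for high concurrency among updates. The class follows the same functional specification as {@link java.util.Hashtable} and provides a counterpart for each {@code Hashtable} method. Although every operation is thread-safe, retrievals do not acquire locks, and there is no way to lock the entire table so as to block all access. Programs that rely on the thread safety of {@code Hashtable}, but not on the details of its synchronization, can use this class interchangeably.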


Retrieval operations (including {@code get}) generally do not block, so may overlap with update operations (including {@code put} and {@code remove}). Retrievals reflect the results of the most recently completed update operations holding upon their onset. (More formally, an update operation for a given key bears a happens-before relation with any (non-null) retrieval for that key reporting the updated value.) For aggregate operations such as {@code putAll} and {@code clear}, concurrent retrievals may reflect insertion or removal of only some entries. Similarly, Iterators, Spliterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw {@link java.util.ConcurrentModificationException ConcurrentModificationException}. However, iterators are designed to be used by only one thread at a time. Bear in mind that the results of aggregate status methods including {@code size}, {@code isEmpty}, and {@code containsValue} are typically useful only when a map is not undergoing concurrent updates in other threads. Otherwise the results of these methods reflect transient states that may be adequate for monitoring or estimation purposes, but not for program control.
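Put plainly: retrievals (including {@code get}) generally do not block, so they may overlap with updates (including {@code put} and {@code remove}). A retrieval sees the results of the most recently completed updates as of the moment it starts; formally, an update for a given key happens-before any non-null retrieval of that key that reports the updated value. During aggregate operations such as {@code putAll} and {@code clear}, a concurrent retrieval may observe only some of the insertions or removals. Likewise, Iterators, Spliterators and Enumerations return elements reflecting the state of the table at some point at or after their creation, and they never throw {@link java.util.ConcurrentModificationException ConcurrentModificationException}. Each iterator is nonetheless meant to be used by one thread at a time. The aggregate status methods {@code size}, {@code isEmpty} and {@code containsValue} are typically useful only while no other thread is updating the map; otherwise they report transient states that are adequate for monitoring or estimation but not for program control.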
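The weakly consistent iteration described above can be illustrated with a minimal single-threaded sketch. The class and method names here are hypothetical, chosen only for illustration; the point is that the map may be modified while its key set is being iterated, and no ConcurrentModificationException is thrown.

```java
import java.util.concurrent.ConcurrentHashMap;

final class WeaklyConsistentIterationDemo {
    /** Removes every entry while iterating over the key set; returns the final size. */
    static int removeWhileIterating() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        for (int i = 0; i < 10; i++) {
            map.put("k" + i, i);
        }
        // Modifying the map during iteration is permitted. A plain java.util.HashMap
        // would typically fail with ConcurrentModificationException in this loop.
        for (String key : map.keySet()) {
            map.remove(key);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("size after removal: " + removeWhileIterating());
    }
}
```

With several threads updating the map, the iterator would still not throw, but which of the concurrent changes it observes is not specified.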


The table is dynamically expanded when there are too many collisions (i.e., keys that have distinct hash codes but fall into the same slot modulo the table size), with the expected average effect of maintaining roughly two bins per mapping (corresponding to a 0.75 load factor threshold for resizing). There may be much variance around this average as mappings are added and removed, but overall, this maintains a commonly accepted time/space tradeoff for hash tables. However, resizing this or any other kind of hash table may be a relatively slow operation. When possible, it is a good idea to provide a size estimate as an optional {@code initialCapacity} constructor argument. An additional optional {@code loadFactor} constructor argument provides a further means of customizing initial table capacity by specifying the table density to be used in calculating the amount of space to allocate for the given number of elements. Also, for compatibility with previous versions of this class, constructors may optionally specify an expected {@code concurrencyLevel} as an additional hint for internal sizing. Note that using many keys with exactly the same {@code hashCode()} is a sure way to slow down performance of any hash table. To ameliorate impact, when keys are {@link Comparable}, this class may use comparison order among keys to help break ties.

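Put plainly: the table grows dynamically when there are too many collisions, that is, keys with distinct hash codes that land in the same slot modulo the table size. The expected average effect is roughly two bins per mapping, which corresponds to a 0.75 load factor threshold for resizing. The actual figure may vary considerably around this average as mappings are added and removed, but overall this keeps the commonly accepted time/space tradeoff for hash tables. Resizing this or any other hash table can be relatively slow, so when possible it is best to supply a size estimate through the optional {@code initialCapacity} constructor argument. A further optional {@code loadFactor} constructor argument customizes the initial table capacity by specifying the table density used to calculate how much space to allocate for the given number of elements. For compatibility with earlier versions of this class, constructors may also accept an expected {@code concurrencyLevel} as an additional hint for internal sizing. Using many keys with exactly the same {@code hashCode()} is a sure way to slow down any hash table. To soften the impact, when keys are {@link Comparable} this class may use the comparison order among keys to break ties.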
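The sizing advice above might be applied as in the following sketch. The class and method names are hypothetical, and the concrete numbers (a 0.75 load factor and a concurrency level of 4) are arbitrary values picked for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

final class SizingDemo {
    /** Builds a map sized up front for the expected number of entries, then fills it. */
    static ConcurrentHashMap<Integer, String> presized(int expectedEntries) {
        // initialCapacity only: a size estimate given at construction time,
        // intended to reduce the amount of resizing while the map is filled.
        ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>(expectedEntries);
        for (int i = 0; i < expectedEntries; i++) {
            map.put(i, "v" + i);
        }
        return map;
    }

    /** Uses the three-argument constructor: initialCapacity, loadFactor, concurrencyLevel. */
    static ConcurrentHashMap<Integer, String> tuned(int expectedEntries) {
        // loadFactor is the table density used for the initial sizing;
        // concurrencyLevel is an extra sizing hint kept for compatibility.
        return new ConcurrentHashMap<>(expectedEntries, 0.75f, 4);
    }

    public static void main(String[] args) {
        System.out.println(presized(1_000).size());
        System.out.println(tuned(1_000).isEmpty());
    }
}
```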


A {@link Set} projection of a ConcurrentHashMap may be created (using {@link #newKeySet()} or {@link #newKeySet(int)}), or viewed (using {@link #keySet(Object)}) when only keys are of interest, and the mapped values are (perhaps transiently) not used or all take the same mapping value.

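Put plainly: a {@link Set} projection of a ConcurrentHashMap can be created with {@link #newKeySet()} or {@link #newKeySet(int)}, or obtained as a view with {@link #keySet(Object)}, when only the keys matter and the mapped values are unused (perhaps temporarily) or all share the same value.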
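A small sketch of both forms of key set follows. The class and method names are hypothetical; `newKeySet()` and `keySet(mappedValue)` are the methods named in the paragraph above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class KeySetDemo {
    /** Returns true if adding the same element twice is rejected the second time. */
    static boolean standaloneSetRejectsDuplicate() {
        // A concurrent Set backed by a ConcurrentHashMap created for this purpose.
        Set<String> seen = ConcurrentHashMap.newKeySet();
        boolean first = seen.add("a");
        boolean second = seen.add("a");
        return first && !second;
    }

    /** Adds a key through the view and returns the value stored in the backing map. */
    static Boolean addThroughView() {
        ConcurrentHashMap<String, Boolean> backing = new ConcurrentHashMap<>();
        // A view of the keys; keys added through it are mapped to Boolean.TRUE.
        Set<String> view = backing.keySet(Boolean.TRUE);
        view.add("x");
        return backing.get("x");
    }

    public static void main(String[] args) {
        System.out.println(standaloneSetRejectsDuplicate());
        System.out.println(addThroughView());
    }
}
```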


A ConcurrentHashMap can be used as a scalable frequency map (a form of histogram or multiset) by using {@link java.util.concurrent.atomic.LongAdder} values and initializing via {@link #computeIfAbsent computeIfAbsent}. For example, to add a count to a {@code ConcurrentHashMap<String,LongAdder> freqs}, you can use {@code freqs.computeIfAbsent(key, k -> new LongAdder()).increment();}

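Put plainly: with {@link java.util.concurrent.atomic.LongAdder} values that are initialized through {@link #computeIfAbsent computeIfAbsent}, a ConcurrentHashMap can serve as a scalable frequency map (a form of histogram or multiset). For example, to add a count to a {@code ConcurrentHashMap<String,LongAdder> freqs}, use {@code freqs.computeIfAbsent(key, k -> new LongAdder()).increment();}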
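The frequency-map idiom can be sketched as a complete program. The class name, the `count` helper and the use of a parallel stream to generate concurrent updates are assumptions made for illustration; only `computeIfAbsent` and `LongAdder` come from the text above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class FrequencyMapDemo {
    /** Counts occurrences of each word, updating the map from several threads. */
    static ConcurrentHashMap<String, LongAdder> count(List<String> words) {
        ConcurrentHashMap<String, LongAdder> freqs = new ConcurrentHashMap<>();
        // computeIfAbsent installs a LongAdder the first time a key is seen;
        // increment() is then safe under contention.
        words.parallelStream()
             .forEach(w -> freqs.computeIfAbsent(w, k -> new LongAdder()).increment());
        return freqs;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, LongAdder> freqs = count(Arrays.asList("a", "b", "a", "c", "a"));
        System.out.println("a=" + freqs.get("a").sum());
    }
}
```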


This class and its views and iterators implement all of the optional methods of the {@link Map} and {@link Iterator} interfaces.

Put plainly: this class, together with its views and iterators, implements every optional method of the {@link Map} and {@link Iterator} interfaces.


Like {@link Hashtable} but unlike {@link HashMap}, this class does not allow {@code null} to be used as a key or value.

Put plainly: as with {@link Hashtable}, and unlike {@link HashMap}, neither keys nor values may be {@code null}.
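The null restriction is easy to confirm with a short sketch (class and method names are hypothetical). Both a null key and a null value are rejected with NullPointerException.

```java
import java.util.concurrent.ConcurrentHashMap;

final class NullRejectionDemo {
    /** Returns true if putting a null value throws NullPointerException. */
    static boolean rejectsNullValue() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        try {
            map.put("key", null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    /** Returns true if putting a null key throws NullPointerException. */
    static boolean rejectsNullKey() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        try {
            map.put(null, "value");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsNullValue() + " " + rejectsNullKey());
    }
}
```

Because null can never be stored, a null returned by `get` always means that the key is currently absent.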


ConcurrentHashMaps support a set of sequential and parallel bulk operations that, unlike most {@link Stream} methods, are designed to be safely, and often sensibly, applied even with maps that are being concurrently updated by other threads; for example, when computing a snapshot summary of the values in a shared registry. There are three kinds of operation, each with four forms, accepting functions with Keys, Values, Entries, and (Key, Value) arguments and/or return values. Because the elements of a ConcurrentHashMap are not ordered in any particular way, and may be processed in different orders in different parallel executions, the correctness of supplied functions should not depend on any ordering, or on any other objects or values that may transiently change while computation is in progress; and except for forEach actions, should ideally be side-effect-free. Bulk operations on {@link java.util.Map.Entry} objects do not support method {@code setValue}.

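Put plainly: ConcurrentHashMaps support a set of sequential and parallel bulk operations that, unlike most {@link Stream} methods, are designed to be applied safely, and often sensibly, even to maps that other threads are updating at the same time; computing a snapshot summary of the values in a shared registry is one example. There are three kinds of operation, each in four forms that accept functions taking keys, values, entries, or (key, value) pairs as arguments and/or return values. Because the elements of a ConcurrentHashMap have no particular order and may be processed in different orders in different parallel executions, the supplied functions must not depend for their correctness on any ordering, nor on any other objects or values that may change transiently while the computation is in progress. Except for forEach actions, they should ideally be free of side effects. Bulk operations on {@link java.util.Map.Entry} objects do not support the {@code setValue} method.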

  • forEach: Perform a given action on each element. A variant form applies a given transformation on each element before performing the action.
  • search: Return the first available non-null result of applying a given function on each element; skipping further search when a result is found.
  • reduce: Accumulate each element. The supplied reduction function cannot rely on ordering (more formally, it should be both associative and commutative). There are five variants:
    • Plain reductions. (There is not a form of this method for (key, value) function arguments since there is no corresponding return type.)
    • Mapped reductions that accumulate the results of a given function applied to each element.
    • Reductions to scalar doubles, longs, and ints, using a given basis value.

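  • forEach: performs a given action on each element. A variant form applies a given transformation to each element before performing the action.
  • search: returns the first available non-null result of applying a given function to each element, and skips further searching once a result is found.
  • reduce: accumulates each element. The supplied reduction function must not rely on ordering (more formally, it should be both associative and commutative). There are five variants:
    • Plain reductions. (There is no form of this method for (key, value) function arguments because there is no corresponding return type.)
    • Mapped reductions, which accumulate the results of a given function applied to each element.
    • Reductions to scalar doubles, longs, and ints, using a given basis value.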
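The three kinds of bulk operation listed above can be sketched on a small map. The class name, method names and sample data are hypothetical. The threshold `Long.MAX_VALUE` makes every call run sequentially so that the example stays simple.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

final class BulkOpsDemo {
    static ConcurrentHashMap<String, Integer> sampleStock() {
        ConcurrentHashMap<String, Integer> stock = new ConcurrentHashMap<>();
        stock.put("apple", 3);
        stock.put("pear", 10);
        stock.put("fig", 7);
        return stock;
    }

    /** forEach with a transformation: formats each pair, then collects the text. */
    static Queue<String> describe(ConcurrentHashMap<String, Integer> stock) {
        Queue<String> lines = new ConcurrentLinkedQueue<>();
        stock.forEach(Long.MAX_VALUE, (k, v) -> k + "=" + v, line -> lines.add(line));
        return lines;
    }

    /** search: returns some key whose value is below the limit, or null if none. */
    static String findLow(ConcurrentHashMap<String, Integer> stock, int limit) {
        return stock.search(Long.MAX_VALUE, (k, v) -> v < limit ? k : null);
    }

    /** reduce (plain form over values): sums all values; null for an empty map. */
    static Integer total(ConcurrentHashMap<String, Integer> stock) {
        return stock.reduceValues(Long.MAX_VALUE, Integer::sum);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> stock = sampleStock();
        System.out.println(describe(stock));
        System.out.println(findLow(stock, 5));
        System.out.println(total(stock));
    }
}
```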


These bulk operations accept a {@code parallelismThreshold} argument. Methods proceed sequentially if the current map size is estimated to be less than the given threshold. Using a value of {@code Long.MAX_VALUE} suppresses all parallelism. Using a value of {@code 1} results in maximal parallelism by partitioning into enough subtasks to fully utilize the {@link ForkJoinPool#commonPool()} that is used for all parallel computations. Normally, you would initially choose one of these extreme values, and then measure performance of using in-between values that trade off overhead versus throughput.

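Put plainly: these bulk operations accept a {@code parallelismThreshold} argument. A method proceeds sequentially if the current map size is estimated to be below the given threshold. A value of {@code Long.MAX_VALUE} suppresses all parallelism. A value of {@code 1} gives maximal parallelism by partitioning the work into enough subtasks to fully utilize the {@link ForkJoinPool#commonPool()}, which is used for all parallel computations. Normally you would start with one of these extreme values and then measure the performance of in-between values that trade off overhead against throughput.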
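As a rough sketch of the two extreme thresholds, the following runs the same reduction once sequentially and once with maximal parallelism. The class and method names are hypothetical. Both calls should produce the same sum, since only the execution strategy differs.

```java
import java.util.concurrent.ConcurrentHashMap;

final class ThresholdDemo {
    /** Builds a map holding the values 0 .. n-1. */
    static ConcurrentHashMap<Integer, Long> numbers(int n) {
        ConcurrentHashMap<Integer, Long> map = new ConcurrentHashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(i, (long) i);
        }
        return map;
    }

    /** Sums all values using the given parallelism threshold. */
    static long sum(ConcurrentHashMap<Integer, Long> map, long parallelismThreshold) {
        return map.reduceValuesToLong(parallelismThreshold, Long::longValue, 0L, Long::sum);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<Integer, Long> map = numbers(10_000);
        // Long.MAX_VALUE: never parallel. 1: split into as many subtasks as is useful.
        System.out.println(sum(map, Long.MAX_VALUE));
        System.out.println(sum(map, 1L));
    }
}
```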

 

The concurrency properties of bulk operations follow from those of ConcurrentHashMap: Any non-null result returned from {@code get(key)} and related access methods bears a happens-before relation with the associated insertion or update. The result of any bulk operation reflects the composition of these per-element relations (but is not necessarily atomic with respect to the map as a whole unless it is somehow known to be quiescent). Conversely, because keys and values in the map are never null, null serves as a reliable atomic indicator of the current lack of any result. To maintain this property, null serves as an implicit basis for all non-scalar reduction operations. For the double, long, and int versions, the basis should be one that, when combined with any other value, returns that other value (more formally, it should be the identity element for the reduction). Most common reductions have these properties; for example, computing a sum with basis 0 or a minimum with basis MAX_VALUE.

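Put plainly: the concurrency properties of bulk operations follow from those of ConcurrentHashMap itself. Any non-null result returned by {@code get(key)} and related access methods bears a happens-before relation with the associated insertion or update. The result of a bulk operation reflects the composition of these per-element relations, but it is not necessarily atomic with respect to the map as a whole unless the map is somehow known to be quiescent. Conversely, because keys and values in the map are never null, null serves as a reliable atomic indicator that there is currently no result. To preserve this property, null acts as the implicit basis for all non-scalar reduction operations. For the double, long, and int versions, the basis should be a value that, when combined with any other value, returns that other value (more formally, the identity element of the reduction). Most common reductions have these properties, for example computing a sum with basis 0 or a minimum with basis MAX_VALUE. (To be honest, this paragraph left me a little dizzy.)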
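The role of the basis may be easier to see in code. In this sketch (hypothetical class and method names), the scalar reductions return their basis on an empty map, while the non-scalar reduction returns null.

```java
import java.util.concurrent.ConcurrentHashMap;

final class BasisDemo {
    /** Sum with basis 0, the identity element for addition. */
    static long sum(ConcurrentHashMap<String, Long> map) {
        return map.reduceValuesToLong(Long.MAX_VALUE, Long::longValue, 0L, Long::sum);
    }

    /** Minimum with basis Long.MAX_VALUE, the identity element for min. */
    static long min(ConcurrentHashMap<String, Long> map) {
        return map.reduceValuesToLong(Long.MAX_VALUE, Long::longValue, Long.MAX_VALUE, Long::min);
    }

    /** Non-scalar reduction: the implicit basis is null, returned for an empty map. */
    static Long max(ConcurrentHashMap<String, Long> map) {
        return map.reduceValues(Long.MAX_VALUE, Long::max);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Long> map = new ConcurrentHashMap<>();
        System.out.println(sum(map) + " " + min(map) + " " + max(map));
        map.put("a", 5L);
        map.put("b", 2L);
        map.put("c", 9L);
        System.out.println(sum(map) + " " + min(map) + " " + max(map));
    }
}
```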

 

Search and transformation functions provided as arguments should similarly return null to indicate the lack of any result (in which case it is not used). In the case of mapped reductions, this also enables transformations to serve as filters, returning null (or, in the case of primitive specializations, the identity basis) if the element should not be combined. You can create compound transformations and filterings by composing them yourself under this "null means there is nothing there now" rule before using them in search or reduce operations.

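Put plainly: search and transformation functions passed as arguments should likewise return null to signal that there is no result, in which case the result is not used. For mapped reductions this also lets a transformation act as a filter: it returns null (or, for the primitive specializations, the identity basis) when the element should not be combined. You can build compound transformations and filters by composing them yourself under this "null means there is nothing there now" rule before using them in search or reduce operations.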
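A transformation that doubles as a filter could look like the sketch below. The class name, method names and the "even values only" criterion are made up for illustration. The object form filters by returning null, and the primitive form filters by returning the identity basis.

```java
import java.util.concurrent.ConcurrentHashMap;

final class FilterByNullDemo {
    /** Mapped reduction: sums only the even values; null if no value is even. */
    static Integer sumOfEvens(ConcurrentHashMap<String, Integer> map) {
        return map.reduceValues(
                Long.MAX_VALUE,
                v -> v % 2 == 0 ? v : null,   // null means "skip this element"
                Integer::sum);
    }

    /** Primitive form: skipped elements contribute the identity basis 0. */
    static int sumOfEvensAsInt(ConcurrentHashMap<String, Integer> map) {
        return map.reduceValuesToInt(
                Long.MAX_VALUE,
                v -> v % 2 == 0 ? v : 0,
                0,
                Integer::sum);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 4);
        System.out.println(sumOfEvens(map) + " " + sumOfEvensAsInt(map));
    }
}
```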

 

Methods accepting and/or returning Entry arguments maintain key-value associations. They may be useful for example when finding the key for the greatest value. Note that "plain" Entry arguments can be supplied using {@code new AbstractMap.SimpleEntry<K,V>(k,v)}.

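Put plainly: methods that accept and/or return Entry arguments preserve the association between key and value. They are useful, for example, for finding the key that holds the greatest value. "Plain" Entry arguments can be supplied with {@code new AbstractMap.SimpleEntry<K,V>(k,v)}.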
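Finding the key for the greatest value, as mentioned above, might be sketched with `reduceEntries`. The class name, method name and sample scores are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class MaxEntryDemo {
    /** Returns the entry holding the greatest value, or null for an empty map. */
    static Map.Entry<String, Integer> greatest(ConcurrentHashMap<String, Integer> map) {
        // The reducer keeps whichever entry has the larger value, so the key
        // stays attached to its value throughout the reduction.
        return map.reduceEntries(
                Long.MAX_VALUE,
                (a, b) -> a.getValue() >= b.getValue() ? a : b);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> scores = new ConcurrentHashMap<>();
        scores.put("alice", 72);
        scores.put("bob", 91);
        scores.put("carol", 85);
        Map.Entry<String, Integer> best = greatest(scores);
        System.out.println(best.getKey() + " -> " + best.getValue());
    }
}
```

If two entries tie for the greatest value, which one is returned depends on processing order, so callers should not rely on it.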

 

Bulk operations may complete abruptly, throwing an exception encountered in the application of a supplied function. Bear in mind when handling such exceptions that other concurrently executing functions could also have thrown exceptions, or would have done so if the first exception had not occurred.

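Put plainly: a bulk operation may complete abruptly by throwing an exception raised while applying a supplied function. When handling such an exception, bear in mind that other concurrently executing functions may also have thrown exceptions, or would have done so had the first exception not occurred.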
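A hedged sketch of abrupt completion follows. The class name, the method name and the "multiples of 100 are invalid" rule are invented for the example. Several elements would trigger the failure, yet the caller sees a single exception, and it is caught here as RuntimeException because this sketch does not assume anything more specific about how the exception is propagated.

```java
import java.util.concurrent.ConcurrentHashMap;

final class BulkExceptionDemo {
    /** Runs a parallel forEachValue whose action fails for some elements. */
    static String run() {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        for (int i = 0; i < 1_000; i++) {
            map.put(i, i);
        }
        try {
            map.forEachValue(1L, v -> {
                if (v % 100 == 0) {
                    throw new IllegalStateException("invalid value " + v);
                }
            });
            return "completed";
        } catch (RuntimeException e) {
            // Other subtasks may have failed too, or may never have run.
            return "failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```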

 

Speedups for parallel compared to sequential forms are common but not guaranteed. Parallel operations involving brief functions on small maps may execute more slowly than sequential forms if the underlying work to parallelize the computation is more expensive than the computation itself. Similarly, parallelization may not lead to much actual parallelism if all processors are busy performing unrelated tasks.

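Put plainly: parallel forms are commonly faster than sequential ones, but a speedup is not guaranteed. Parallel operations that apply brief functions to small maps may run more slowly than the sequential forms if the underlying work of parallelizing the computation costs more than the computation itself. Likewise, parallelization may yield little actual parallelism if all processors are busy with unrelated tasks.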

 

All arguments to all task methods must be non-null.

Put plainly: every argument to every task method must be non-null.

 

References:

JDK 1.8 source code

JDK 1.8 ConcurrentHashMap official documentation
