Source Code Analysis: What is ConcurrentHashMap

A hash table supporting full concurrency of retrievals and high expected concurrency for updates. This class obeys the same functional specification as {@link java.util.Hashtable}, and includes versions of methods corresponding to each method of {@code Hashtable}. However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access. This class is fully interoperable with {@code Hashtable} in programs that rely on its thread safety but not on its synchronization details.

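In other words: this is a hash table whose retrievals are fully concurrent and whose updates are highly concurrent. It follows the same functional specification as {@code Hashtable} and offers a counterpart for each {@code Hashtable} method. Every operation is thread-safe, yet retrievals take no lock, and there is no way to lock the whole table against all access. Code that relies on the thread safety of {@code Hashtable}, but not on its synchronization details, can use either class interchangeably.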


Retrieval operations (including {@code get}) generally do not block, so may overlap with update operations (including {@code put} and {@code remove}). Retrievals reflect the results of the most recently completed update operations holding upon their onset. (More formally, an update operation for a given key bears a happens-before relation with any (non-null) retrieval for that key reporting the updated value.) For aggregate operations such as {@code putAll} and {@code clear}, concurrent retrievals may reflect insertion or removal of only some entries. Similarly, Iterators, Spliterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw {@link java.util.ConcurrentModificationException ConcurrentModificationException}. However, iterators are designed to be used by only one thread at a time. Bear in mind that the results of aggregate status methods including {@code size}, {@code isEmpty}, and {@code containsValue} are typically useful only when a map is not undergoing concurrent updates in other threads. Otherwise the results of these methods reflect transient states that may be adequate for monitoring or estimation purposes, but not for program control.
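Put differently: retrievals such as {@code get} normally do not block, so they can overlap with updates such as {@code put} and {@code remove}. A retrieval sees the result of the most recently completed update (formally, an update to a key happens-before any non-null retrieval of that key that reports the updated value). For aggregate operations such as {@code putAll} and {@code clear}, a concurrent retrieval may observe only some of the inserted or removed entries. Iterators, Spliterators and Enumerations reflect the state of the table at or after their creation and never throw ConcurrentModificationException, but each one is meant to be used by a single thread at a time. The results of {@code size}, {@code isEmpty} and {@code containsValue} are dependable only while no other thread is updating the map; under concurrent updates they describe a transient state that is adequate for monitoring or estimation, not for program control.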
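The weakly consistent iteration described above can be illustrated with a small sketch. The class and method names (`WeakIterationDemo`, `iterateWhileMutating`) are made up for this example; the point is only that modifying the map while iterating does not throw ConcurrentModificationException, and that entries added during iteration may or may not be seen.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WeakIterationDemo {
    // Mutates the map while iterating over it and returns how many entries
    // the iterator handed out. No ConcurrentModificationException is thrown.
    static int iterateWhileMutating() {
        ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
        for (int i = 0; i < 100; i++) {
            map.put(i, "v" + i);
        }
        int visited = 0;
        for (Map.Entry<Integer, String> e : map.entrySet()) {
            int key = e.getKey();
            if (key < 100) {
                map.remove(key);            // remove the entry just visited
                map.put(1000 + key, "new"); // the iterator may or may not see this one
            }
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println("visited " + iterateWhileMutating() + " entries without an exception");
    }
}
```

The exact count depends on where the new keys land relative to the iterator's position, so the sketch only relies on every original entry being visited once.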


The table is dynamically expanded when there are too many collisions (i.e., keys that have distinct hash codes but fall into the same slot modulo the table size), with the expected average effect of maintaining roughly two bins per mapping (corresponding to a 0.75 load factor threshold for resizing). There may be much variance around this average as mappings are added and removed, but overall, this maintains a commonly accepted time/space tradeoff for hash tables. However, resizing this or any other kind of hash table may be a relatively slow operation. When possible, it is a good idea to provide a size estimate as an optional {@code initialCapacity} constructor argument. An additional optional {@code loadFactor} constructor argument provides a further means of customizing initial table capacity by specifying the table density to be used in calculating the amount of space to allocate for the given number of elements. Also, for compatibility with previous versions of this class, constructors may optionally specify an expected {@code concurrencyLevel} as an additional hint for internal sizing. Note that using many keys with exactly the same {@code hashCode()} is a sure way to slow down performance of any hash table. To ameliorate impact, when keys are {@link Comparable}, this class may use comparison order among keys to help break ties.

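In other words: the table grows dynamically when there are too many collisions (keys with distinct hash codes that fall into the same slot modulo the table size), which corresponds to a 0.75 load factor threshold for resizing. The actual density varies as mappings are added and removed, but on the whole this keeps the commonly accepted time/space tradeoff for hash tables. Resizing any hash table can be relatively slow, so when possible pass a size estimate as the optional {@code initialCapacity} constructor argument. The optional {@code loadFactor} argument further adjusts the initial capacity by stating the table density to assume when computing how much space to allocate for the given number of elements. For compatibility with earlier versions of the class, constructors also accept an expected {@code concurrencyLevel} as an extra hint for internal sizing. Using many keys with exactly the same {@code hashCode()} is a sure way to slow down any hash table; to soften the impact, when keys are {@link Comparable} this class may use their comparison order to help break ties.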
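As a minimal sketch of the sizing advice above, the three sizing constructors can be used as follows. The class name `SizingDemo`, the helper `presized`, and the figures 1024, 0.75f and 16 are illustrative assumptions, not recommendations.

```java
import java.util.concurrent.ConcurrentHashMap;

public class SizingDemo {
    // Builds a map pre-sized for the expected number of entries, so that
    // filling it should not trigger intermediate resizes.
    static ConcurrentHashMap<Integer, String> presized(int expectedEntries) {
        ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>(expectedEntries);
        for (int i = 0; i < expectedEntries; i++) {
            map.put(i, "v" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        // initialCapacity only
        ConcurrentHashMap<String, Integer> a = new ConcurrentHashMap<>(1024);
        // initialCapacity plus the table density to assume for initial sizing
        ConcurrentHashMap<String, Integer> b = new ConcurrentHashMap<>(1024, 0.75f);
        // concurrencyLevel is accepted for compatibility and used as a sizing hint
        ConcurrentHashMap<String, Integer> c = new ConcurrentHashMap<>(1024, 0.75f, 16);
        System.out.println(a.size() + b.size() + c.size() + presized(1000).size());
    }
}
```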


A {@link Set} projection of a ConcurrentHashMap may be created (using {@link #newKeySet()} or {@link #newKeySet(int)}), or viewed (using {@link #keySet(Object)}) when only keys are of interest, and the mapped values are (perhaps transiently) not used or all take the same mapping value.

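In other words: when only the keys matter, and the values are unused (perhaps temporarily) or are all the same, you can create a concurrent {@link Set} backed by a ConcurrentHashMap with {@link #newKeySet()} or {@link #newKeySet(int)}, or obtain a Set view of an existing map with {@link #keySet(Object)}.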
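A short sketch of both approaches follows. The class name `KeySetDemo` and the sample keys are assumptions made for the example; `newKeySet()` and `keySet(V mappedValue)` are the methods the paragraph refers to.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class KeySetDemo {
    // A standalone concurrent Set backed by a new ConcurrentHashMap.
    static Set<String> standaloneSet() {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        seen.add("alice");
        seen.add("bob");
        seen.add("alice"); // duplicate, ignored
        return seen;
    }

    // A Set view over an existing map: adding a key through the view
    // inserts it into the map with the given default value.
    static ConcurrentHashMap<String, Boolean> viaView() {
        ConcurrentHashMap<String, Boolean> map = new ConcurrentHashMap<>();
        Set<String> view = map.keySet(Boolean.TRUE);
        view.add("carol");
        return map;
    }

    public static void main(String[] args) {
        System.out.println(standaloneSet().size());
        System.out.println(viaView());
    }
}
```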


A ConcurrentHashMap can be used as a scalable frequency map (a form of histogram or multiset) by using {@link java.util.concurrent.atomic.LongAdder} values and initializing via {@link #computeIfAbsent computeIfAbsent}. For example, to add a count to a {@code ConcurrentHashMap<String,LongAdder> freqs}, you can use {@code freqs.computeIfAbsent(key, k -> new LongAdder()).increment();}

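In other words: with {@link java.util.concurrent.atomic.LongAdder} values that are created on demand through {@code computeIfAbsent}, a ConcurrentHashMap serves as a scalable frequency map (a kind of histogram or multiset). To add one count for {@code key} in a {@code ConcurrentHashMap<String,LongAdder> freqs}, use {@code freqs.computeIfAbsent(key, k -> new LongAdder()).increment();}. Note that {@code computeIfAbsent} takes the key as its first argument.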
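A minimal runnable sketch of the frequency-map idiom is given below. The class name `FrequencyDemo`, the `count` helper and the use of a parallel stream to generate concurrent updates are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class FrequencyDemo {
    // Counts word occurrences; the parallel stream stands in for several
    // threads updating the same map at once.
    static ConcurrentHashMap<String, LongAdder> count(List<String> words) {
        ConcurrentHashMap<String, LongAdder> freqs = new ConcurrentHashMap<>();
        words.parallelStream().forEach(
            w -> freqs.computeIfAbsent(w, k -> new LongAdder()).increment());
        return freqs;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, LongAdder> freqs =
            count(Arrays.asList("a", "b", "a", "c", "a"));
        freqs.forEach((word, n) -> System.out.println(word + " -> " + n.sum()));
    }
}
```

`computeIfAbsent` installs at most one `LongAdder` per key, and `LongAdder.increment()` is itself safe under contention, so no extra locking is needed.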


This class and its views and iterators implement all of the optional methods of the {@link Map} and {@link Iterator} interfaces.

That is, nothing optional in {@link Map} or {@link Iterator} is left unimplemented by this class, its views, or its iterators.


Like {@link Hashtable} but unlike {@link HashMap}, this class does not allow {@code null} to be used as a key or value.

So a {@code null} key or a {@code null} value is rejected, as in {@link Hashtable} and unlike {@link HashMap}.
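The null restriction can be checked with a short sketch (the class and method names are made up for the example). Both a null key and a null value are rejected with a NullPointerException.

```java
import java.util.concurrent.ConcurrentHashMap;

public class NullDemo {
    // Returns true if putting a null key is rejected.
    static boolean rejectsNullKey() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        try {
            map.put(null, "value");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // Returns true if putting a null value is rejected.
    static boolean rejectsNullValue() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        try {
            map.put("key", null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsNullKey() + " " + rejectsNullValue());
    }
}
```

One consequence is that a `null` returned by `get` can only mean that the key is absent.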


ConcurrentHashMaps support a set of sequential and parallel bulk operations that, unlike most {@link Stream} methods, are designed to be safely, and often sensibly, applied even with maps that are being concurrently updated by other threads; for example, when computing a snapshot summary of the values in a shared registry. There are three kinds of operation, each with four forms, accepting functions with Keys, Values, Entries, and (Key, Value) arguments and/or return values. Because the elements of a ConcurrentHashMap are not ordered in any particular way, and may be processed in different orders in different parallel executions, the correctness of supplied functions should not depend on any ordering, or on any other objects or values that may transiently change while computation is in progress; and except for forEach actions, should ideally be side-effect-free. Bulk operations on {@link java.util.Map.Entry} objects do not support method {@code setValue}.

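In other words: ConcurrentHashMap offers sequential and parallel bulk operations that, unlike most {@link Stream} methods, are designed to be applied safely, and often sensibly, even while other threads are updating the map, for example when computing a snapshot summary of the values in a shared registry. There are three kinds of operation, each in four forms that take functions over keys, values, entries, or (key, value) pairs. Because the elements have no particular order and may be processed in different orders in different parallel runs, a supplied function must not depend on ordering or on any object or value that may change while the computation is in progress, and except for forEach actions it should ideally be free of side effects. Bulk operations on {@link java.util.Map.Entry} objects do not support {@code setValue}.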

  • forEach: Perform a given action on each element. A variant form applies a given transformation on each element before performing the action.
  • search: Return the first available non-null result of applying a given function on each element; skipping further search when a result is found.
  • reduce: Accumulate each element. The supplied reduction function cannot rely on ordering (more formally, it should be both associative and commutative). There are five variants:
    • Plain reductions. (There is not a form of this method for (key, value) function arguments since there is no corresponding return type.)
    • Mapped reductions that accumulate the results of a given function applied to each element.
    • Reductions to scalar doubles, longs, and ints, using a given basis value.

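   forEach: run an action on every element. A variant first applies a transformation to each element and then runs the action on the result.
   search: apply a function to each element and return the first non-null result that becomes available, skipping the rest of the search once a result is found.
   reduce: accumulate all elements. The reduction function must not rely on ordering (formally, it should be associative and commutative). The five variants are:
           plain reductions (there is no form taking (key, value) function arguments because there is no corresponding return type);
           mapped reductions, which accumulate the results of a function applied to each element;
           three scalar reductions, to double, long, and int, each starting from a given basis value.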
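The three kinds of bulk operation listed above can be sketched as follows. The class `BulkOpsDemo`, the stock-keeping scenario and the helper names are assumptions for illustration; the calls themselves (`forEach` with a transformer, `search`, `reduceValues`) are ConcurrentHashMap methods.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

public class BulkOpsDemo {
    // forEach, transforming variant: turn each (key, value) into a line of
    // text, then hand the line to the action.
    static List<String> describe(ConcurrentHashMap<String, Integer> stock) {
        Queue<String> lines = new ConcurrentLinkedQueue<>(); // safe if run in parallel
        stock.forEach(1, (k, v) -> k + "=" + v, lines::add);
        List<String> sorted = new ArrayList<>(lines);
        Collections.sort(sorted); // map order is unspecified, so sort for display
        return sorted;
    }

    // search: return some key whose value is 0, or null if there is none.
    static String firstOutOfStock(ConcurrentHashMap<String, Integer> stock) {
        return stock.search(Long.MAX_VALUE, (k, v) -> v == 0 ? k : null);
    }

    // reduce, plain variant: sum all values; returns null for an empty map.
    static Integer totalUnits(ConcurrentHashMap<String, Integer> stock) {
        return stock.reduceValues(Long.MAX_VALUE, Integer::sum);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> stock = new ConcurrentHashMap<>();
        stock.put("apple", 3);
        stock.put("pear", 0);
        stock.put("plum", 5);
        System.out.println(describe(stock));
        System.out.println(firstOutOfStock(stock));
        System.out.println(totalUnits(stock));
    }
}
```

Addition is associative and commutative, so `Integer::sum` is a valid reducer regardless of the order in which elements are combined.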


These bulk operations accept a {@code parallelismThreshold} argument. Methods proceed sequentially if the current map size is estimated to be less than the given threshold. Using a value of {@code Long.MAX_VALUE} suppresses all parallelism. Using a value of {@code 1} results in maximal parallelism by partitioning into enough subtasks to fully utilize the {@link ForkJoinPool#commonPool()} that is used for all parallel computations. Normally, you would initially choose one of these extreme values, and then measure performance of using in-between values that trade off overhead versus throughput.

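In other words: every bulk operation takes a {@code parallelismThreshold}. If the estimated map size is below the threshold, the method runs sequentially. {@code Long.MAX_VALUE} suppresses all parallelism; {@code 1} gives maximal parallelism by splitting the work into enough subtasks to fully use the {@link ForkJoinPool#commonPool()} that runs all parallel computations. A common approach is to start with one of these two extremes and then measure in-between values that trade overhead against throughput.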
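The effect of the threshold can be sketched as below: the same reduction is run once sequentially and once with maximal parallelism, and both produce the same result. `ThresholdDemo`, `sample` and `sumWithThreshold` are hypothetical names used only for this example.

```java
import java.util.concurrent.ConcurrentHashMap;

public class ThresholdDemo {
    // Builds a map with keys and values 0..n-1.
    static ConcurrentHashMap<Integer, Integer> sample(int n) {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(i, i);
        }
        return map;
    }

    // Sums all values; the threshold only decides whether the work may be
    // split across the common ForkJoinPool, not what the result is.
    static long sumWithThreshold(ConcurrentHashMap<Integer, Integer> map, long threshold) {
        return map.reduceValuesToLong(threshold, Integer::longValue, 0L, Long::sum);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<Integer, Integer> map = sample(100);
        long sequential = sumWithThreshold(map, Long.MAX_VALUE); // never parallel
        long parallel = sumWithThreshold(map, 1);                // as parallel as possible
        System.out.println(sequential + " " + parallel);
    }
}
```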

 

The concurrency properties of bulk operations follow from those of ConcurrentHashMap: Any non-null result returned from {@code get(key)} and related access methods bears a happens-before relation with the associated insertion or update. The result of any bulk operation reflects the composition of these per-element relations (but is not necessarily atomic with respect to the map as a whole unless it is somehow known to be quiescent). Conversely, because keys and values in the map are never null, null serves as a reliable atomic indicator of the current lack of any result. To maintain this property, null serves as an implicit basis for all non-scalar reduction operations. For the double, long, and int versions, the basis should be one that, when combined with any other value, returns that other value (more formally, it should be the identity element for the reduction). Most common reductions have these properties; for example, computing a sum with basis 0 or a minimum with basis MAX_VALUE.

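In other words: bulk operations inherit their concurrency properties from ConcurrentHashMap itself. Any non-null result of {@code get(key)} and related access methods has a happens-before relation with the insertion or update that produced it, and the result of a bulk operation is the composition of these per-element relations; it is not atomic with respect to the whole map unless the map is known to be quiescent. Because keys and values are never null, null reliably signals that there is currently no result, which is why null is the implicit basis of all non-scalar reductions. For the double, long, and int versions, the basis must be the identity element of the reduction, that is, a value that leaves any other value unchanged when combined with it: 0 for a sum, MAX_VALUE for a minimum. (To be honest, the author finds this paragraph rather hard to follow.)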
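The role of the basis can be made concrete with a small sketch: 0 as the identity for a sum, `Long.MAX_VALUE` as the identity for a minimum, and `null` as the implicit basis of a non-scalar reduction. The class `BasisDemo` and its method names are assumptions for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

public class BasisDemo {
    // Scalar reduction: 0 is the identity for addition.
    static long sum(ConcurrentHashMap<String, Long> m) {
        return m.reduceValuesToLong(Long.MAX_VALUE, Long::longValue, 0L, Long::sum);
    }

    // Scalar reduction: Long.MAX_VALUE is the identity for min.
    static long min(ConcurrentHashMap<String, Long> m) {
        return m.reduceValuesToLong(Long.MAX_VALUE, Long::longValue, Long.MAX_VALUE, Math::min);
    }

    // Non-scalar reduction: the implicit basis is null, so an empty map yields null.
    static Long max(ConcurrentHashMap<String, Long> m) {
        return m.reduceValues(Long.MAX_VALUE, Math::max);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Long> m = new ConcurrentHashMap<>();
        System.out.println(sum(m) + " " + min(m) + " " + max(m)); // empty map: the bases show through
        m.put("a", 4L);
        m.put("b", 9L);
        System.out.println(sum(m) + " " + min(m) + " " + max(m));
    }
}
```

On an empty map the scalar forms return the basis unchanged, which is why a non-identity basis would silently skew the result on a non-empty map.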

 

Search and transformation functions provided as arguments should similarly return null to indicate the lack of any result (in which case it is not used). In the case of mapped reductions, this also enables transformations to serve as filters, returning null (or, in the case of primitive specializations, the identity basis) if the element should not be combined. You can create compound transformations and filterings by composing them yourself under this "null means there is nothing there now" rule before using them in search or reduce operations.

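In other words: search and transformation functions passed as arguments should likewise return null to say "no result", and such a null is simply not used. In mapped reductions this lets a transformation double as a filter: it returns null (or, for the primitive specializations, the identity basis) when the element should be left out. Compound transformations and filters can be composed by hand under the same "null means there is nothing there now" rule before being passed to search or reduce.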
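A hedged sketch of a transformation acting as a filter in a mapped reduction: odd values are mapped to null and therefore never reach the reducer. `FilterDemo` and `sumOfEvens` are names invented for this example.

```java
import java.util.concurrent.ConcurrentHashMap;

public class FilterDemo {
    // Sums only the even values. The transformer returns null for odd
    // values, which tells the reduction to skip them. The result is null
    // if no value passes the filter.
    static Integer sumOfEvens(ConcurrentHashMap<String, Integer> map) {
        return map.reduceValues(
            Long.MAX_VALUE,
            v -> v % 2 == 0 ? v : null, // filter: null means "nothing here"
            Integer::sum);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 4);
        System.out.println(sumOfEvens(map));
    }
}
```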

 

Methods accepting and/or returning Entry arguments maintain key-value associations. They may be useful for example when finding the key for the greatest value. Note that "plain" Entry arguments can be supplied using {@code new AbstractMap.SimpleEntry(k,v)}.

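In other words: the forms that take or return Entry arguments keep each key together with its value, which helps, for example, when looking for the key that holds the greatest value. A "plain" Entry can be created with {@code new AbstractMap.SimpleEntry(k,v)}.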
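Finding the key for the greatest value, as mentioned above, might look like the following sketch. `MaxEntryDemo`, `greatest` and the sample scores are assumptions; the result is copied into a plain `AbstractMap.SimpleEntry` so that it no longer depends on the map.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MaxEntryDemo {
    // Returns a plain snapshot of the entry with the greatest value,
    // or null if the map is empty.
    static Map.Entry<String, Integer> greatest(ConcurrentHashMap<String, Integer> scores) {
        Map.Entry<String, Integer> best = scores.reduceEntries(
            Long.MAX_VALUE,
            (a, b) -> a.getValue() >= b.getValue() ? a : b);
        if (best == null) {
            return null;
        }
        return new AbstractMap.SimpleEntry<>(best.getKey(), best.getValue());
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> scores = new ConcurrentHashMap<>();
        scores.put("ann", 71);
        scores.put("ben", 94);
        scores.put("cat", 88);
        System.out.println(greatest(scores));
    }
}
```

If two entries tie for the greatest value, which one is returned is unspecified, since the combination order is not fixed.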

 

Bulk operations may complete abruptly, throwing an exception encountered in the application of a supplied function. Bear in mind when handling such exceptions that other concurrently executing functions could also have thrown exceptions, or would have done so if the first exception had not occurred.

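In other words: a bulk operation may terminate abruptly by throwing an exception raised inside a supplied function. When handling such an exception, keep in mind that other functions running concurrently may also have thrown, or would have thrown had the first exception not occurred.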
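The abrupt completion described above can be sketched as follows. `BulkFailureDemo`, `failsOnNegative` and the choice of `IllegalStateException` are assumptions for illustration; the sketch catches `RuntimeException` rather than a narrower type and only shows that the failure reaches the caller of the bulk operation.

```java
import java.util.concurrent.ConcurrentHashMap;

public class BulkFailureDemo {
    // Runs a bulk forEachValue whose action throws on negative values.
    // Returns true if the bulk operation completed abruptly.
    static boolean failsOnNegative(ConcurrentHashMap<String, Integer> map, long threshold) {
        try {
            map.forEachValue(threshold, v -> {
                if (v < 0) {
                    throw new IllegalStateException("negative value: " + v);
                }
            });
            return false;
        } catch (RuntimeException e) {
            // Other elements may or may not have been processed by now,
            // and other subtasks may have failed as well.
            return true;
        }
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("ok", 1);
        map.put("bad", -1);
        System.out.println(failsOnNegative(map, Long.MAX_VALUE));
        System.out.println(failsOnNegative(map, 1));
    }
}
```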

 

Speedups for parallel compared to sequential forms are common but not guaranteed. Parallel operations involving brief functions on small maps may execute more slowly than sequential forms if the underlying work to parallelize the computation is more expensive than the computation itself. Similarly, parallelization may not lead to much actual parallelism if all processors are busy performing unrelated tasks.

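In other words: parallel forms are often faster than sequential ones, but not always. For brief functions on small maps, the cost of setting up the parallel computation can exceed the cost of the computation itself, and if all processors are busy with unrelated tasks, parallelization yields little actual parallelism.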

 

All arguments to all task methods must be non-null.

That is, passing null for any argument of a task method is not allowed.

 

References:

JDK 1.8 source code

JDK 1.8 ConcurrentHashMap official documentation
