HashMap Source Code Analysis

1. Introduction to HashMap

  • Hash table based implementation of the Map interface. This implementation provides all of the optional map operations, and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.

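For instance, a single null key and any number of null values are accepted, where Hashtable would throw a NullPointerException:

    import java.util.HashMap;
    import java.util.Map;

    public class NullKeyDemo {
        public static void main(String[] args) {
            Map<String, String> map = new HashMap<>();
            map.put(null, "value for the null key");  // one null key is allowed
            map.put("k", null);                       // null values are allowed
            System.out.println(map.get(null));        // value for the null key
            System.out.println(map.get("k"));         // null
        }
    }
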
  • This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iteration over collection views requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.

  • An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

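A small sketch of how the two parameters interact (the resize threshold is an internal field of HashMap; here it is simply computed by hand):

    public class ThresholdDemo {
        public static void main(String[] args) {
            int capacity = 16;          // DEFAULT_INITIAL_CAPACITY
            float loadFactor = 0.75f;   // DEFAULT_LOAD_FACTOR
            int threshold = (int) (capacity * loadFactor);
            // Once the entry count exceeds 16 * 0.75 = 12, the table is rehashed
            // and the bucket count roughly doubles from 16 to 32.
            System.out.println("resize after " + threshold + " entries");
        }
    }
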
  • As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

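That rule of thumb can be written as a small helper. Note that capacityFor below is a hypothetical utility for illustration, not a JDK method:

    import java.util.HashMap;
    import java.util.Map;

    public class PresizeDemo {
        // Hypothetical helper: a capacity large enough that 'expected' entries
        // never exceed capacity * loadFactor, so no rehash should ever occur.
        static int capacityFor(int expected, float loadFactor) {
            return (int) Math.ceil(expected / loadFactor);
        }

        public static void main(String[] args) {
            int expected = 1000;
            Map<Integer, String> map = new HashMap<>(capacityFor(expected, 0.75f)); // 1334
            for (int i = 0; i < expected; i++)
                map.put(i, "v" + i);  // the table is never resized while filling
        }
    }
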
  • If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to grow the table. Note that using many keys with the same {@code hashCode()} is a sure way to slow down performance of any hash table. To ameliorate impact, when keys are {@link Comparable}, this class may use comparison order among keys to help break ties.

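To see why Comparable keys matter, consider a key class whose every instance shares one hashCode (BadHashKey is a made-up class for illustration). All entries collide into a single bin, but because the keys are Comparable, lookups stay O(log n) once the bin is treeified:

    import java.util.HashMap;
    import java.util.Map;

    public class SameHashDemo {
        static final class BadHashKey implements Comparable<BadHashKey> {
            final int id;
            BadHashKey(int id) { this.id = id; }
            @Override public int hashCode() { return 42; }  // deliberately terrible
            @Override public boolean equals(Object o) {
                return o instanceof BadHashKey && ((BadHashKey) o).id == id;
            }
            @Override public int compareTo(BadHashKey o) { return Integer.compare(id, o.id); }
        }

        public static void main(String[] args) {
            Map<BadHashKey, Integer> map = new HashMap<>();
            for (int i = 0; i < 1000; i++)
                map.put(new BadHashKey(i), i);  // everything lands in one bin
            // The overloaded bin is eventually converted to a tree, so this
            // lookup uses compareTo rather than a linear scan.
            System.out.println(map.get(new BadHashKey(500)));  // 500
        }
    }
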
  • Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the map.

  • If no such object exists, the map should be “wrapped” using the {@link Collections#synchronizedMap Collections.synchronizedMap} method. This is best done at creation time, to prevent accidental unsynchronized access to the map:

       Map m = Collections.synchronizedMap(new HashMap(…));

  • The iterators returned by all of this class's "collection view methods" are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator will throw a {@link ConcurrentModificationException}. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.

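A minimal demonstration of fail-fast iteration: modifying the map directly during iteration throws, while removing through the iterator itself is safe:

    import java.util.ConcurrentModificationException;
    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;

    public class FailFastDemo {
        public static void main(String[] args) {
            Map<String, Integer> map = new HashMap<>();
            map.put("a", 1);
            map.put("b", 2);
            map.put("c", 3);

            // Throws: a structural modification behind the iterator's back.
            try {
                for (String key : map.keySet())
                    map.remove(key);  // modCount changes; the next next() notices
            } catch (ConcurrentModificationException e) {
                System.out.println("fail-fast: " + e);
            }

            // Safe: structural modification through the iterator itself.
            Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
            while (it.hasNext()) {
                it.next();
                it.remove();  // keeps the iterator's expected modCount in sync
            }
            System.out.println(map);  // {}
        }
    }
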
  • Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.

2. Implementation Notes

  • This map usually acts as a binned (bucketed) hash table, but when bins get too large, they are transformed into bins of TreeNodes, each structured similarly to those in java.util.TreeMap. Most methods try to use normal bins, but relay to TreeNode methods when applicable (simply by checking instanceof a node). Bins of TreeNodes may be traversed and used like any others, but additionally support faster lookup when overpopulated. However, since the vast majority of bins in normal use are not overpopulated, checking for existence of tree bins may be delayed in the course of table methods.

  • Tree bins (i.e., bins whose elements are all TreeNodes) are ordered primarily by hashCode, but in the case of ties, if two elements are of the same “class C implements Comparable<C>” type, then their compareTo method is used for ordering. (We conservatively check generic types via reflection to validate this – see method comparableClassFor.) The added complexity of tree bins is worthwhile in providing worst-case O(log n) operations when keys either have distinct hashes or are orderable. Thus, performance degrades gracefully under accidental or malicious usages in which hashCode() methods return values that are poorly distributed, as well as those in which many keys share a hashCode, so long as they are also Comparable. (If neither of these apply, we may waste about a factor of two in time and space compared to taking no precautions. But the only known cases stem from poor user programming practices that are already so slow that this makes little difference.)

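A simplified, hypothetical sketch of that ordering logic (not the JDK code itself; the real implementation lives in methods such as comparableClassFor and tieBreakOrder):

    public class TreeBinOrderSketch {
        // Compare hashes first; on a tie, fall back to compareTo when both keys
        // are the same Comparable class, and finally to identityHashCode.
        @SuppressWarnings({"unchecked", "rawtypes"})
        static int order(int h1, Object k1, int h2, Object k2) {
            if (h1 != h2)
                return Integer.compare(h1, h2);
            if (k1 instanceof Comparable && k2 != null && k1.getClass() == k2.getClass())
                return ((Comparable) k1).compareTo(k2);
            return Integer.compare(System.identityHashCode(k1),
                                   System.identityHashCode(k2));
        }

        public static void main(String[] args) {
            // "Aa" and "BB" famously share a hashCode, so ordering falls back
            // to String.compareTo here.
            System.out.println(order("Aa".hashCode(), "Aa", "BB".hashCode(), "BB"));
        }
    }
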
  • Because TreeNodes are about twice the size of regular nodes, we use them only when bins contain enough nodes to warrant use (see TREEIFY_THRESHOLD). And when they become too small (due to removal or resizing) they are converted back to plain bins. In usages with well-distributed user hashCodes, tree bins are rarely used. Ideally, under random hashCodes, the frequency of nodes in bins follows a Poisson distribution (http://en.wikipedia.org/wiki/Poisson_distribution) with a parameter of about 0.5 on average for the default resizing threshold of 0.75, although with a large variance because of resizing granularity. Ignoring variance, the expected occurrences of list size k are (exp(-0.5) * pow(0.5, k) / factorial(k)).

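The formula is just the Poisson probability mass function with λ = 0.5, and it can be evaluated directly; the loop below prints the expected fraction of bins holding k nodes:

    public class PoissonDemo {
        public static void main(String[] args) {
            double lambda = 0.5;           // average bin load at the 0.75 threshold
            double p = Math.exp(-lambda);  // the k = 0 term
            for (int k = 0; k <= 8; k++) {
                System.out.printf("P(bin size = %d) = %.8f%n", k, p);
                p = p * lambda / (k + 1);  // recurrence for exp(-0.5) * 0.5^k / k!
            }
            // By k = 8 (TREEIFY_THRESHOLD) the probability is below 1e-7, which
            // is why treeification should be rare under well-distributed hashes.
        }
    }
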
  • The root of a tree bin is normally its first node. However, sometimes (currently only upon Iterator.remove), the root might be elsewhere, but can be recovered following parent links (method TreeNode.root()).

  • All applicable internal methods accept a hash code as an argument (as normally supplied from a public method), allowing them to call each other without recomputing user hashCodes. Most internal methods also accept a “tab” argument, that is normally the current table, but may be a new or old one when resizing or converting.

  • When bin lists are treeified, split, or untreeified, we keep them in the same relative access/traversal order (i.e., field Node.next) to better preserve locality, and to slightly simplify handling of splits and traversals that invoke iterator.remove. When using comparators on insertion, to keep a total ordering (or as close as is required here) across rebalancings, we compare classes and identityHashCodes as tie-breakers.

  • The use and transitions among plain vs tree modes is complicated by the existence of subclass LinkedHashMap. See below for hook methods defined to be invoked upon insertion, removal and access that allow LinkedHashMap internals to otherwise remain independent of these mechanics. (This also requires that a map instance be passed to some utility methods that may create new nodes.)

  • The concurrent-programming-like SSA-based coding style helps avoid aliasing errors amid all of the twisty pointer operations.

3. HashMap Parameter Definitions
The transient fields (which do not need to be serialized) are not discussed here.

// The default initial capacity - must be a power of two. Here it is 16.
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
// The maximum capacity that may be set; it must not exceed 2^30.
static final int MAXIMUM_CAPACITY = 1 << 30;
// The load factor used when none is specified in the constructor.
// Load factor: a measure of how full the hash table is allowed to get.
static final float DEFAULT_LOAD_FACTOR = 0.75f;
// The bin size threshold at which a list bin is converted into a tree: 8.
static final int TREEIFY_THRESHOLD = 8;
// The bin size threshold at which a tree bin is converted back into a list: 6.
static final int UNTREEIFY_THRESHOLD = 6;
// The smallest table capacity at which bins may be treeified.
static final int MIN_TREEIFY_CAPACITY = 64;

These constants are all of the parameters involved in HashMap.
4. Ways to Create a HashMap
There are four ways to construct a HashMap; the JDK source for each follows, and a short usage sketch appears after the last constructor:
1. No-arg constructor: new HashMap()
2. With a given initial capacity: new HashMap(int initialCapacity)
3. With a given initial capacity and load factor:
new HashMap(int initialCapacity, float loadFactor)
4. Copying an existing map into the newly created map:
new HashMap(Map<? extends K, ? extends V> m)

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }
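
    // The threshold assignment above relies on tableSizeFor, which rounds the
    // requested capacity up to the next power of two (this is the JDK 8
    // implementation):
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }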

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @param  initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     */
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }

    /**
     * Constructs a new <tt>HashMap</tt> with the same mappings as the
     * specified <tt>Map</tt>.  The <tt>HashMap</tt> is created with
     * default load factor (0.75) and an initial capacity sufficient to
     * hold the mappings in the specified <tt>Map</tt>.
     *
     * @param   m the map whose mappings are to be placed in this map
     * @throws  NullPointerException if the specified map is null
     */
    public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        putMapEntries(m, false);
    }
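
A minimal usage sketch of the four constructors described above:

    import java.util.HashMap;
    import java.util.Map;

    public class ConstructorDemo {
        public static void main(String[] args) {
            Map<String, Integer> a = new HashMap<>();          // capacity 16, load factor 0.75
            Map<String, Integer> b = new HashMap<>(64);        // capacity 64, load factor 0.75
            Map<String, Integer> c = new HashMap<>(64, 0.5f);  // capacity 64, load factor 0.5
            a.put("x", 1);
            Map<String, Integer> d = new HashMap<>(a);         // copies all mappings from a
            System.out.println(d);                             // {x=1}
        }
    }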