[Java Collections] HashMap Source Code Analysis (Part 1)

  • This article analyzes the HashMap source code in JDK 1.8

HashMap characteristics

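  • Permits a null key and null values
  • Does not guarantee the order of the mappings; iteration order may differ from insertion order (use LinkedHashMap if insertion order must be preserved)
  • Tries to spread elements evenly across the buckets, so that get/put run in constant time
  • Iteration takes time proportional to the capacity (the length of table) plus the size (the number of key-value mappings), so the initial capacity should not be set too high, nor the load factor too low, when iteration performance matters
  • Initial capacity: the number of buckets in the hash table when it is created. The table length is always a power of two; a requested value that is not a power of two is rounded up by tableSizeFor()
  • Load factor: how full the hash table may get before it is resized. capacity * loadFactor = threshold, the number of mappings beyond which the table is resized (it is not the size of the HashMap)
  • A larger load factor lets the table fill up more before resizing, so it holds more elements in the same space, but the lists in the buckets get longer and lookups slow down. A smaller load factor keeps the buckets sparse, which wastes space but makes lookups faster
  • Not thread-safe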
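
The first two points can be checked with a short sketch. The class name HashMapTraitsDemo and its helper methods are made up for this illustration; only HashMap and LinkedHashMap come from the JDK.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class HashMapTraitsDemo {
    // Stores a null key and a null value in a plain HashMap.
    static Map<String, String> withNulls() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for null key");
        map.put("k", null);
        return map;
    }

    // Returns the keys of a LinkedHashMap in iteration order,
    // which is the insertion order.
    static List<String> linkedOrder() {
        Map<String, Integer> linked = new LinkedHashMap<>();
        linked.put("zebra", 1);
        linked.put("apple", 2);
        linked.put("mango", 3);
        return new ArrayList<>(linked.keySet());
    }

    public static void main(String[] args) {
        Map<String, String> map = withNulls();
        System.out.println(map.get(null));        // the value stored under the null key
        System.out.println(map.containsKey("k")); // true, although the value is null
        System.out.println(linkedOrder());        // [zebra, apple, mango]
    }
}
```

Because get() returns null both for a missing key and for a key mapped to null, containsKey() is the way to tell the two cases apart.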

HashMap definition

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable {
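  • Cloneable: a marker interface. A class must implement it and override Object.clone() for cloning to succeed; calling Object's clone() on an instance of a class that does not implement it throws CloneNotSupportedException.
  • Serializable: a marker interface indicating that the class can be serialized and deserialized.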
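
As a rough illustration of what the two interfaces give you in practice, the sketch below clones a HashMap and round-trips one through Java serialization. The class CloneAndSerializeDemo and its methods are hypothetical names for this example. Note that HashMap.clone() is documented as a shallow copy, so the clone shares its value objects with the original, whereas deserialization builds new objects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class CloneAndSerializeDemo {
    // Builds a small map whose value is a mutable list.
    static HashMap<String, List<Integer>> sample() {
        HashMap<String, List<Integer>> map = new HashMap<>();
        List<Integer> values = new ArrayList<>();
        values.add(1);
        map.put("a", values);
        return map;
    }

    // HashMap.clone() is a shallow copy: keys and values are not cloned.
    @SuppressWarnings("unchecked")
    static HashMap<String, List<Integer>> shallowCopy(HashMap<String, List<Integer>> src) {
        return (HashMap<String, List<Integer>>) src.clone();
    }

    // Serializes the map to a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static HashMap<String, List<Integer>> roundTrip(HashMap<String, List<Integer>> src) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(src);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (HashMap<String, List<Integer>>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<String, List<Integer>> src = sample();
        System.out.println(shallowCopy(src).get("a") == src.get("a")); // same list object
        System.out.println(roundTrip(src).equals(src));                // equal contents
    }
}
```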

HashMap data structure

  • Default configuration parameters
	/**
     * The default initial capacity - MUST be a power of two.
     */
     // Default initial capacity: 16; must be a power of two
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
     // Maximum capacity: 2^30
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The load factor used when none specified in constructor.
     */
     // Default load factor: 0.75
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     */
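     // Threshold for converting a list to a red-black tree: when an insertion makes a list longer than 8 nodes,
     // the list is converted (provided the table is large enough, see MIN_TREEIFY_CAPACITY)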
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     */
     // When a resize splits a red-black tree and a resulting bin has <= 6 nodes, that bin is converted back to a list
    static final int UNTREEIFY_THRESHOLD = 6;

    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     */
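     // Bins may be treeified (list converted to red-black tree) only when the table capacity is >= this value
     // Otherwise, when a bin holds too many nodes, the table is resized instead of treeifying
     // To avoid conflicts between resizing and treeification, this value should be at least 4 * TREEIFY_THRESHOLD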
    static final int MIN_TREEIFY_CAPACITY = 64;
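  • Storage structure:
  • HashMap holds an array of Node named table. The hash determines the array index, and each Node has a next field: collisions are resolved by separate chaining, so keys that map to the same index sit in the same linked list.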
transient Node<K,V>[] table;

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;

    // constructor, getKey(), getValue(), setValue() and toString() omitted

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}
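
To make separate chaining concrete, here is a small sketch using two strings that are known to share a hash code, "Aa" and "BB" (both hash to 2112). The class CollisionDemo and the helper bucketIndex are invented for this example; the index formula is the one HashMap uses, shown later in this article.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Bucket index for a table of the given power-of-two length,
    // using the same spreading and masking steps as HashMap in JDK 1.8.
    static int bucketIndex(Object key, int tableLength) {
        int h = key.hashCode();
        return (tableLength - 1) & (h ^ (h >>> 16));
    }

    // Puts both colliding keys into one map.
    static Map<String, Integer> collidingMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode());               // true
        System.out.println(bucketIndex("Aa", 16) == bucketIndex("BB", 16));   // true
        System.out.println(collidingMap());  // both mappings survive in the same bucket
    }
}
```

Both keys land in the same bucket, yet both mappings are kept, because the second node is linked behind the first through next.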


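  • Red-black tree node structure: JDK 1.8 introduced red-black trees into HashMap. When a list grows longer than TREEIFY_THRESHOLD = 8 (and the table has at least MIN_TREEIFY_CAPACITY = 64 buckets), the list is converted to a red-black tree
/**
 * Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn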
 * extends Node) so can be used as extension of either regular or
 * linked node.
 */
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // red-black tree links
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;
    TreeNode(int hash, K key, V val, Node<K,V> next) {
        super(hash, key, val, next);
    }
}

static class Entry<K,V> extends HashMap.Node<K,V> {
    Entry<K,V> before, after;
    Entry(int hash, K key, V value, Node<K,V> next) {
        super(hash, key, value, next);
    }
}
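  • From the inheritance chain above, TreeNode is a subclass of Node (via LinkedHashMap.Entry)

Food for thought:
Why doesn't TreeNode extend Node directly?

  • Related member variables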
	/**
     * Holds cached entrySet(). Note that AbstractMap fields are used
     * for keySet() and values().
     */
    transient Set<Map.Entry<K,V>> entrySet;

    /**
     * The number of key-value mappings contained in this map.
     */
    transient int size;

    /**
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     */
    transient int modCount;

    /**
     * The next size value at which to resize (capacity * load factor).
     *
     * @serial
     */
    // (The javadoc description is true upon serialization.
    // Additionally, if the table array has not been allocated, this
    // field holds the initial array capacity, or zero signifying
    // DEFAULT_INITIAL_CAPACITY.)
    // threshold = capacity * load factor; when size exceeds this value, resize() is called
    int threshold;

    /**
     * The load factor for the hash table.
     *
     * @serial
     */
    final float loadFactor;

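HashMap constructors

  • No-arg constructor: uses the default initial capacity (16) and load factor (0.75). Note that no table array is created here.
    As the constructor shows, HashMap does not create the table array; it only sets the load factor.
    The table is actually created later, on the first put().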
/**
 * Constructs an empty <tt>HashMap</tt> with the default initial capacity
 * (16) and the default load factor (0.75).
 */
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
  • Constructor with a specified initial capacity
/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and the default load factor (0.75).
 *
 * @param  initialCapacity the initial capacity.
 * @throws IllegalArgumentException if the initial capacity is negative.
 */
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
  • Constructor with a specified initial capacity and load factor
/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and load factor.
 *
 * @param  initialCapacity the initial capacity
 * @param  loadFactor      the load factor
 * @throws IllegalArgumentException if the initial capacity is negative
 *         or the load factor is nonpositive
 */
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}

tableSizeFor() returns the smallest power of two that is not less than initialCapacity. The implementation is quite clever; here is how it works:

/**
 * Returns a power of two size for the given target capacity.
 */
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
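  • If cap is already a power of two, subtracting 1 first and adding 1 at the end returns cap itself
  • If cap is 1, n is 0 and stays 0 through every step; adding 1 returns cap
  • If cap is 0 or negative, n is negative; the unsigned shifts and ORs leave n negative, so the method returns 1
  • If cap is greater than 1, n is greater than 0 and tableSizeFor() works as follows:
    • n is nonzero, so at least one bit is 1; let bit i be the highest 1 bit in the binary representation of n
    • After the unsigned shift by 1 and the OR with n, bits i and i-1 are both 1: the top two bits are set
    • After the unsigned shift by 2 and the OR with n, bits i, i-1, i-2 and i-3 are 1: the top four bits are set
    • And so on: each step doubles the run of 1s, so after the shift by 16 every bit below i is covered (a positive int has at most 31 significant bits). Any result >= MAXIMUM_CAPACITY = 1 << 30 is clamped to MAXIMUM_CAPACITY
    • After these steps, every bit from bit i downward is 1
    • Adding 1 then yields the smallest power of two that is not less than initialCapacity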
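
The bit smearing is easier to follow on a concrete number. The sketch below copies tableSizeFor() from the listing above into a standalone class (the class name TableSizeForDemo is made up) and traces cap = 17 by hand in main.

```java
class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same body as HashMap.tableSizeFor() in JDK 1.8.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        int n = 17 - 1;   // 10000
        n |= n >>> 1;     // 11000
        n |= n >>> 2;     // 11110
        n |= n >>> 4;     // 11111 (the remaining shifts change nothing)
        System.out.println(Integer.toBinaryString(n)); // 11111
        System.out.println(tableSizeFor(17));          // 32
    }
}
```

Subtracting 1 up front is what keeps exact powers of two unchanged: tableSizeFor(16) works on 15 (1111) and returns 16 rather than 32.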

Food for thought:
1. The no-arg constructor does not assign threshold, so why does this constructor?
2. One would expect threshold = tableSizeFor(initialCapacity) * loadFactor; why is it simply threshold = tableSizeFor(initialCapacity)?
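
A hint for both questions is in the comment on the threshold field: while the table is unallocated, threshold holds the initial array capacity. The first resize() then takes that value as newCap and only afterwards computes the real threshold. The sketch below mirrors that branch of resize(); the class InitialSizingDemo and the method firstResize are hypothetical names, and the parameter thresholdField stands for the value the constructor stored in threshold.

```java
class InitialSizingDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Mirrors the "oldCap == 0 && oldThr > 0" path of resize():
    // returns {newCap, newThr} for the first table allocation.
    static int[] firstResize(int thresholdField, float loadFactor) {
        int newCap = thresholdField;             // initial capacity was placed in threshold
        float ft = (float) newCap * loadFactor;  // the real threshold is computed only now
        int newThr = (newCap < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY)
                ? (int) ft : Integer.MAX_VALUE;
        return new int[] {newCap, newThr};
    }

    public static void main(String[] args) {
        // new HashMap<>(10, 0.75f) stores tableSizeFor(10) = 16 in threshold
        int[] r = firstResize(16, 0.75f);
        System.out.println(r[0] + " buckets, resize after " + r[1] + " mappings");
    }
}
```

So the constructor reuses the threshold field as temporary storage for the capacity, which avoids adding a separate field just for the initial capacity.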

  • Constructor that builds a HashMap from a given Map
/**
 * Constructs a new <tt>HashMap</tt> with the same mappings as the
 * specified <tt>Map</tt>.  The <tt>HashMap</tt> is created with
 * default load factor (0.75) and an initial capacity sufficient to
 * hold the mappings in the specified <tt>Map</tt>.
 *
 * @param   m the map whose mappings are to be placed in this map
 * @throws  NullPointerException if the specified map is null
 */
public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}

Adding elements

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

/**
 * Implements Map.put and related methods
 *
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode.
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

First, let's look at how the table array is created

if ((tab = table) == null || (n = tab.length) == 0)
	n = (tab = resize()).length;
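  • table is checked first; if it is null or empty, resize() is called and its return value is assigned to tab. The details are analyzed in the resize section below

resize(): creating or growing the table

  • Compute the new capacity newCap
  • Compute the new resize threshold newThr
  • Create the new table newTab
  • Move the elements of the old table into the new one
    • Iterate over the old table array
    • Iterate over each list or red-black tree
    • For each non-empty bucket, set the old slot to null, then place its nodes in the new table: each node either stays at index j or moves to index j + oldCap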
/**
 * Initializes or doubles table size.  If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
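
The lo/hi split in the loop above relies on the fact that, after doubling, a node's new index is decided by a single extra bit of its hash, the bit with value oldCap. A minimal sketch of that rule follows; the class ResizeSplitDemo and its methods are invented for this illustration.

```java
class ResizeSplitDemo {
    // New index after doubling, using the same test as resize():
    // (hash & oldCap) == 0 keeps the node in place, otherwise it moves by oldCap.
    static int indexAfterDoubling(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return ((hash & oldCap) == 0) ? oldIndex : oldIndex + oldCap;
    }

    // New index computed from scratch against the doubled table.
    static int recomputedIndex(int hash, int oldCap) {
        return hash & ((oldCap << 1) - 1);
    }

    // Checks that both computations agree for hashes 0..limit-1.
    static boolean agreesUpTo(int oldCap, int limit) {
        for (int hash = 0; hash < limit; hash++) {
            if (indexAfterDoubling(hash, oldCap) != recomputedIndex(hash, oldCap)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hashes 5 and 21 share bucket 5 in a 16-slot table.
        System.out.println(indexAfterDoubling(5, 16));   // 5  (stays in the "lo" list)
        System.out.println(indexAfterDoubling(21, 16));  // 21 (moves to the "hi" list at 5 + 16)
    }
}
```

This is why resize() never has to recompute a full index per node: one bit test sorts each node into the lo list or the hi list, and the relative order within each list is preserved.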

How the array index for a key-value pair is computed from the key

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
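// array index
i = (n - 1) & hash; // n is the length of table, a power of two
  • HashMap permits a null key; its hash is 0, so it is stored at index 0 of table
  • For a non-null key, the index computation is equivalent to the code below
// Illustrative helper, not part of HashMap; length is the table length
static int indexOf(Object key, int length) {
    int h = key.hashCode();
    return (h ^ (h >>> 16)) & (length - 1);
}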

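Why not simply use h & (length - 1)? What is h >>> 16, and what is it for?
The documentation explains: Computes key.hashCode() and spreads (XORs) higher bits of hash to lower.
The high 16 bits of hashCode are folded into the low 16 bits so that hashes spread out more evenly.
h >>> 16 is an unsigned right shift by 16 bits, which yields the high 16 bits of h. XORing it with h mixes the high bits into the low bits, so both halves influence the index even when the table is small

1010 1010 0001 0100 1111 0101	h=11146485
0000 0000 0000 0000 1010 1010	>>>16
1010 1010 0001 0100 0101 1111	^	// high bits mixed into the low bits
1111 1111 1111 1111 1111 1111	(length -1)=16777215
1010 1010 0001 0100 0101 1111	& result=11146335 // changes in both high and low bits are reflected, so indexes spread out as much as possible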
Next comes an if…else… that checks whether the slot for the new node is empty

if ((p = tab[i = (n - 1) & hash]) == null)
    tab[i] = newNode(hash, key, value, null);
else {
    // collision: resolve the conflict
}
  • i = (n - 1) & hash computes the array index from the hash
  • If the slot is empty, a Node is created with newNode() and placed directly in table
Node<K,V> newNode(int hash, K key, V value, Node<K,V> next) {
    return new Node<>(hash, key, value, next);
}
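  • If the slot is not empty, a collision has occurred and must be resolved

Resolving collisions

  • On a collision there are two cases: the key is the same, or the key is different
  • How to tell whether the keys are the same
    • First, the hashes of the keys are equal
    • Second, the keys are the same object, or equals() returns true
// p is the existing node, key is the key of the new node
p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k)))
  • If p is an instance of TreeNode, the bucket holds a red-black tree, and a suitable position for the insertion is found in the tree
  • Otherwise the bucket holds a singly linked list, which is walked node by node:
    • ① If the next node is null, the new node is appended to the tail; if the list is then longer than 8 nodes, it is converted to a red-black tree (or the table is resized if it is still small).
    • ② If the next node has the same hash and an equal key, its value is overwritten (the node stays the same; only the value changes).
// p = tab[i = (n - 1) & hash]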
Node<K,V> e; K k;
if (p.hash == hash &&
    ((k = p.key) == key || (key != null && key.equals(k))))
    e = p; // e holds the node whose key equals the new key
else if (p instanceof TreeNode)
    e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
    for (int binCount = 0; ; ++binCount) {
        if ((e = p.next) == null) { 
            p.next = newNode(hash, key, value, null); // key not found: append the new node to the tail
            if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                treeifyBin(tab, hash); // list is now longer than 8: treeify (or resize if the table is small)
            break;
        }
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            break;
        p = e;
    }
}
// the key already exists
if (e != null) { // existing mapping for key
    V oldValue = e.value;
    if (!onlyIfAbsent || oldValue == null)
        e.value = value;
    afterNodeAccess(e);
    return oldValue;
}
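
The "existing mapping" branch is what gives put() its return value and putIfAbsent() its behavior: putIfAbsent() goes through the same putVal() with onlyIfAbsent = true. A small sketch using only the public API follows (the class PutSemanticsDemo is a made-up name).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PutSemanticsDemo {
    // Collects the return values of a few put/putIfAbsent/get calls.
    static List<Integer> run() {
        Map<String, Integer> map = new HashMap<>();
        List<Integer> results = new ArrayList<>();
        results.add(map.put("a", 1));          // null: no previous mapping
        results.add(map.put("a", 2));          // 1: old value returned, value overwritten
        results.add(map.putIfAbsent("a", 3));  // 2: existing non-null value is kept
        results.add(map.get("a"));             // 2
        map.put("b", null);
        results.add(map.putIfAbsent("b", 9));  // null: the old value was null, so 9 is stored
        results.add(map.get("b"));             // 9
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [null, 1, 2, 2, null, 9]
    }
}
```

The last two lines correspond to the oldValue == null part of the condition: even with onlyIfAbsent set, a key mapped to null is treated as absent.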

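Converting a list to a red-black tree

  • If the table is null, or its length is less than the minimum capacity for treeification (MIN_TREEIFY_CAPACITY = 64), the table is resized instead so the elements spread out more evenly; no tree is built
  • If the table length is at least 64 (and the list is longer than 8), the list is converted to a red-black tree
  • Treeification
    • First, each Node is replaced with a TreeNode
    • The singly linked list thereby becomes a doubly linked list (through prev and next)
    • Then the doubly linked list is turned into a red-black tree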
/**
 * Replaces all linked nodes in bin at index for given hash unless
 * table is too small, in which case resizes instead.
 */
final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
    else if ((e = tab[index = (n - 1) & hash]) != null) {
        TreeNode<K,V> hd = null, tl = null;
        // The loop below walks the list, turning the singly linked Nodes into doubly linked TreeNodes
        do {
            TreeNode<K,V> p = replacementTreeNode(e, null);
            if (tl == null)
                hd = p;
            else {
                p.prev = tl;
                tl.next = p;
            }
            tl = p;
        } while ((e = e.next) != null);
        // Put the head node into table
        if ((tab[index] = hd) != null)
            hd.treeify(tab); // treeify() builds the tree structure starting from the head node
    }
}

TreeNode<K,V> replacementTreeNode(Node<K,V> p, Node<K,V> next) {
    return new TreeNode<>(p.hash, p.key, p.value, next);
}
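
Putting the putVal() check and the treeifyBin() check together, the decision after appending to a list bin can be sketched as below. This is a simplified model, not HashMap code: the class TreeifyDecisionDemo, the enum Action and the method afterAppend are all hypothetical, and binLengthBeforeInsert means the number of nodes already in the bin before the new one is appended.

```java
class TreeifyDecisionDemo {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { NONE, RESIZE, TREEIFY }

    // What happens right after a new node is appended to a list bin.
    static Action afterAppend(int binLengthBeforeInsert, int tableLength) {
        // putVal() appends when binCount == binLengthBeforeInsert - 1 and calls
        // treeifyBin() when binCount >= TREEIFY_THRESHOLD - 1, i.e. when the bin
        // already held at least 8 nodes.
        if (binLengthBeforeInsert < TREEIFY_THRESHOLD) {
            return Action.NONE;
        }
        // treeifyBin() resizes instead of treeifying while the table is small.
        return (tableLength < MIN_TREEIFY_CAPACITY) ? Action.RESIZE : Action.TREEIFY;
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(7, 64));  // NONE: the list has 8 nodes after the insert
        System.out.println(afterAppend(8, 16));  // RESIZE: 9 nodes, but the table is too small
        System.out.println(afterAppend(8, 64));  // TREEIFY: 9 nodes and a large enough table
    }
}
```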

Finally, check whether size exceeds the resize threshold

++modCount; // counts structural modifications
if (++size > threshold)
    resize();
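  • After a new element is inserted, size is incremented
  • The element is inserted first, then the resize check runs (note: JDK 1.7 resizes first, then inserts the element)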
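
modCount is what makes iterators fail-fast. Notice in putVal() that overwriting the value of an existing key returns before ++modCount, so only insertions of new keys count as structural modifications. The sketch below shows both cases through the public API (the class FailFastDemo and its methods are made up for this example).

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class FailFastDemo {
    static Map<String, Integer> twoEntries() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        return map;
    }

    // Adding a new key while iterating changes modCount,
    // so the iterator's next step throws.
    static boolean insertDuringIterationThrows() {
        Map<String, Integer> map = twoEntries();
        try {
            for (String key : map.keySet()) {
                map.put(key + "-copy", 0);
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Overwriting the value of an existing key does not change modCount.
    static boolean updateDuringIterationThrows() {
        Map<String, Integer> map = twoEntries();
        try {
            for (String key : map.keySet()) {
                map.put(key, 99);
            }
            return false;
        } catch (ConcurrentModificationException unexpected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(insertDuringIterationThrows()); // true
        System.out.println(updateDuringIterationThrows()); // false
    }
}
```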

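When resize() is called

  • On the first put() after construction; at this point it only creates the table array
  • After an insertion, if size exceeds the resize threshold (threshold = capacity * loadFactor), the table is doubled
  • When a list grows past 8 nodes but the table is null or shorter than the minimum capacity for treeification (MIN_TREEIFY_CAPACITY = 64), the table is resized to spread the elements out instead of building a red-black tree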
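
To get a feel for how often the second case fires, the sketch below models the capacity of a map created with new HashMap<>() after n distinct keys have been inserted. It is a simplified model under stated assumptions: only threshold-triggered resizes are counted (resizes triggered by treeifyBin() are ignored), MAXIMUM_CAPACITY is not handled, and the class ResizeScheduleDemo is a made-up name.

```java
class ResizeScheduleDemo {
    // Modelled table capacity after inserting n distinct keys with the defaults
    // (capacity 16, load factor 0.75, so the first threshold is 12).
    static int capacityAfter(int n) {
        if (n == 0) {
            return 0;          // the table is not allocated before the first put()
        }
        int cap = 16;
        int thr = 12;
        for (int size = 1; size <= n; size++) {
            if (size > thr) {  // same check as "++size > threshold" in putVal()
                cap <<= 1;     // resize() doubles the capacity...
                thr <<= 1;     // ...and the threshold
            }
        }
        return cap;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(12));  // 16: the threshold is not yet exceeded
        System.out.println(capacityAfter(13));  // 32: the 13th insert triggers a resize
        System.out.println(capacityAfter(100)); // 256
    }
}
```

When the number of entries is known in advance, passing a suitable initial capacity to the constructor avoids these intermediate resizes.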

putAll(map)

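  • A typical use: create a new Map and use putAll() to add the contents of an existing Map to it
  • The source Map is iterated and each entry is put into the target Map. The keys and values are references to the same objects as in the source Map; no new key or value objects are created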
public void putAll(Map<? extends K, ? extends V> m) {
    putMapEntries(m, true);
}

final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
    int s = m.size();
    if (s > 0) {
        if (table == null) { // pre-size
            float ft = ((float)s / loadFactor) + 1.0F;
            int t = ((ft < (float)MAXIMUM_CAPACITY) ?
                     (int)ft : MAXIMUM_CAPACITY);
            if (t > threshold)
                threshold = tableSizeFor(t);
        }
        else if (s > threshold) // not s + size() > threshold, because in JDK 8 the resize check runs after each element is put
            resize();
        for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
            K key = e.getKey();
            V value = e.getValue();
            putVal(hash(key), key, value, false, evict);
        }
    }
}
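
Two details of putMapEntries() can be checked in isolation: the copy shares value objects with the source, and the pre-size target is s / loadFactor + 1 before it is rounded up by tableSizeFor(). The class PutAllDemo and its methods below are hypothetical helpers for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PutAllDemo {
    // putAll() copies references: a later change to a value object
    // is visible through both maps.
    static boolean sharesValues() {
        Map<String, List<String>> source = new HashMap<>();
        source.put("k", new ArrayList<>());
        Map<String, List<String>> copy = new HashMap<>();
        copy.putAll(source);
        source.get("k").add("added later");
        return copy.get("k") == source.get("k") && copy.get("k").size() == 1;
    }

    // The pre-size target computed in putMapEntries() when table == null,
    // before tableSizeFor() rounds it up to a power of two.
    static int presizeTarget(int s, float loadFactor) {
        float ft = ((float) s / loadFactor) + 1.0F;
        return (ft < (float) (1 << 30)) ? (int) ft : (1 << 30);
    }

    public static void main(String[] args) {
        System.out.println(sharesValues());           // true
        System.out.println(presizeTarget(12, 0.75f)); // 17, which tableSizeFor() rounds up to 32
    }
}
```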