[Java Containers] HashMap Source Code Analysis (Part 1)

This article analyzes the HashMap source code in JDK 1.8.

HashMap characteristics

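Permits a null key and null values.
Makes no guarantee about iteration order; entries may come out in a different order from the one in which they were inserted (use LinkedHashMap if insertion order must be preserved).
Tries to spread elements evenly across the buckets, so that get/put run in constant time on average.
Iteration time is proportional to the capacity (the length of table) plus the size (the number of key-value mappings), so the initial capacity should not be set too high, nor the load factor too low.
Initial capacity: the number of buckets when the hash table is created. The table length is always a power of two; a requested capacity that is not a power of two is rounded up to the next one.
Load factor: how full the table may get before it is resized. capacity * loadFactor = threshold, the number of mappings beyond which the table is resized.
A larger load factor lets the table fill up more before resizing, so the same table holds more elements; but with more elements the chains grow longer and lookups get slower. Conversely, a smaller load factor keeps the chains sparse, which wastes space but makes lookups faster.
Not thread-safe.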
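
To make the first two characteristics concrete, here is a minimal sketch. The class and method names (HashMapTraitsDemo, withNulls, linkedOrder) are made up for this illustration and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HashMapTraitsDemo {
    // Returns a HashMap holding a null key and a null value.
    static Map<String, String> withNulls() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for the null key");
        map.put("k", null);
        return map;
    }

    // Returns the keys of a LinkedHashMap in iteration order.
    static List<String> linkedOrder(String... keys) {
        Map<String, Integer> map = new LinkedHashMap<>();
        for (String key : keys) {
            map.put(key, key.length());
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        Map<String, String> map = withNulls();
        System.out.println(map.get(null));        // value for the null key
        System.out.println(map.containsKey("k")); // true, even though the value is null
        System.out.println(linkedOrder("zeta", "alpha", "mid")); // [zeta, alpha, mid]
    }
}
```

Because get() returns null both for a missing key and for a key mapped to null, containsKey() is the way to tell the two cases apart.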

HashMap definition

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable {

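Cloneable: a marker interface. A class must implement it and override Object's clone() method for clone() to succeed; calling Object.clone() on an instance whose class does not implement Cloneable throws CloneNotSupportedException.
Serializable: a marker interface indicating that instances of the class can be serialized and deserialized.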
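
As a small illustration of what Cloneable gives HashMap in practice, the sketch below clones a map and checks what is shared. HashMap.clone() returns a shallow copy: the map itself is new, but keys and values are not cloned. The class name HashMapCloneDemo and the helper shallowCopy are invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

final class HashMapCloneDemo {
    // Clones the map; HashMap.clone() returns Object, hence the unchecked cast.
    @SuppressWarnings("unchecked")
    static HashMap<String, List<Integer>> shallowCopy(HashMap<String, List<Integer>> source) {
        return (HashMap<String, List<Integer>>) source.clone();
    }

    public static void main(String[] args) {
        HashMap<String, List<Integer>> original = new HashMap<>();
        original.put("a", new ArrayList<>());
        HashMap<String, List<Integer>> copy = shallowCopy(original);

        System.out.println(copy != original);                   // true: a separate map
        System.out.println(copy.get("a") == original.get("a")); // true: the value object is shared
    }
}
```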

HashMap data structure

Default configuration parameters

	/**
     * The default initial capacity - MUST be a power of two.
     */
     // Default initial capacity: 16; must be a power of two
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
     // Maximum capacity: 2 to the power of 30
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The load factor used when none specified in constructor.
     */
     // Default load factor: 0.75
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     */
     // Threshold for converting a list to a red-black tree: when an insertion makes a list longer than 8 nodes, the list is converted
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     */
     // During a resize, a split tree bin left with <= 6 nodes is converted back to a linked list
    static final int UNTREEIFY_THRESHOLD = 6;

    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     */
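     // Bins may be treeified only when the table capacity is >= this value
     // Otherwise, a bin with too many nodes triggers a resize instead of treeification
     // To avoid conflicts between resizing and treeification, this value must be at least 4 * TREEIFY_THRESHOLD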
    static final int MIN_TREEIFY_CAPACITY = 64;

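Storage structure:
HashMap holds an array of Node called table. The array index is derived from the key's hash, and each Node has a next field, so collisions are resolved by separate chaining: entries that map to the same index are linked into one list.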

transient Node<K,V>[] table;

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}
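
To show how a table of chained nodes behaves, here is a deliberately simplified sketch of separate chaining. It is not the JDK implementation: the class ChainedTableSketch is hypothetical, it supports only non-null String keys and int values, and it never resizes or treeifies.

```java
final class ChainedTableSketch {
    // Simplified stand-in for HashMap.Node: one link in a bucket's chain.
    static final class Node {
        final int hash;
        final String key;
        int value;
        Node next;

        Node(int hash, String key, int value, Node next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node[] table = new Node[16]; // fixed power-of-two length; no resizing

    // Inserts or overwrites a mapping; a new node is appended to the tail of its chain.
    void put(String key, int value) {
        int hash = key.hashCode();
        int i = (table.length - 1) & hash;
        if (table[i] == null) {
            table[i] = new Node(hash, key, value, null);
            return;
        }
        Node p = table[i];
        while (true) {
            if (p.hash == hash && p.key.equals(key)) {
                p.value = value; // same key: overwrite
                return;
            }
            if (p.next == null) {
                p.next = new Node(hash, key, value, null);
                return;
            }
            p = p.next;
        }
    }

    // Walks the chain at the key's index; returns null when the key is absent.
    Integer get(String key) {
        int hash = key.hashCode();
        for (Node e = table[(table.length - 1) & hash]; e != null; e = e.next) {
            if (e.hash == hash && e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ChainedTableSketch t = new ChainedTableSketch();
        t.put("Aa", 1);
        t.put("BB", 2); // "Aa" and "BB" have the same hashCode (2112), so they share a bucket
        System.out.println(t.get("Aa") + " " + t.get("BB")); // 1 2
    }
}
```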

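Red-black tree node structure: JDK 1.8 introduced red-black trees into HashMap. When a list grows beyond TREEIFY_THRESHOLD = 8 nodes (and the table length is at least MIN_TREEIFY_CAPACITY = 64), the list is converted to a red-black tree.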

/**
 * Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn
 * extends Node) so can be used as extension of either regular or
 * linked node.
 */
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // red-black tree links
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;
    TreeNode(int hash, K key, V val, Node<K,V> next) {
        super(hash, key, val, next);
    }
}

// LinkedHashMap.Entry, declared in LinkedHashMap
static class Entry<K,V> extends HashMap.Node<K,V> {
    Entry<K,V> before, after;
    Entry(int hash, K key, V value, Node<K,V> next) {
        super(hash, key, value, next);
    }
}

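From the declarations above, TreeNode extends LinkedHashMap.Entry, which in turn extends HashMap.Node, so TreeNode is an indirect subclass of Node.

Food for thought:
Why doesn't TreeNode extend Node directly?

Related member variables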

	/**
     * Holds cached entrySet(). Note that AbstractMap fields are used
     * for keySet() and values().
     */
    transient Set<Map.Entry<K,V>> entrySet;

    /**
     * The number of key-value mappings contained in this map.
     */
    transient int size;

    /**
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     */
    transient int modCount;

    /**
     * The next size value at which to resize (capacity * load factor).
     *
     * @serial
     */
    // (The javadoc description is true upon serialization.
    // Additionally, if the table array has not been allocated, this
    // field holds the initial array capacity, or zero signifying
    // DEFAULT_INITIAL_CAPACITY.)
    // threshold = capacity * load factor; once size exceeds this value, the table is resized
    int threshold;

    /**
     * The load factor for the hash table.
     *
     * @serial
     */
    final float loadFactor;
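
The relationship between capacity, load factor, and threshold can be checked with a few lines of arithmetic. The sketch below mirrors the threshold computation that appears in resize(); the class ThresholdSketch and its threshold helper are names invented for this illustration.

```java
final class ThresholdSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Mirrors the computation in resize(): capacity * loadFactor, truncated to int,
    // or Integer.MAX_VALUE when the table cannot grow any further.
    static int threshold(int capacity, float loadFactor) {
        float ft = (float) capacity * loadFactor;
        return (capacity < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY)
                ? (int) ft : Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(32, 0.75f)); // 24
        System.out.println(threshold(64, 1.0f));  // 64
    }
}
```

With the defaults (capacity 16, load factor 0.75) the threshold is 12, so the 13th distinct key pushes size past the threshold and the table doubles to 32.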

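HashMap constructors

The no-argument constructor uses the default initial capacity of 16 and the default load factor of 0.75.
Note that it only sets the load factor and does not allocate the table array.
The array is actually created later, on the first put().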

/**
 * Constructs an empty <tt>HashMap</tt> with the default initial capacity
 * (16) and the default load factor (0.75).
 */
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

Constructor with a specified initial capacity

/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and the default load factor (0.75).
 *
 * @param  initialCapacity the initial capacity.
 * @throws IllegalArgumentException if the initial capacity is negative.
 */
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

Constructor with a specified initial capacity and load factor

/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and load factor.
 *
 * @param  initialCapacity the initial capacity
 * @param  loadFactor      the load factor
 * @throws IllegalArgumentException if the initial capacity is negative
 *         or the load factor is nonpositive
 */
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}

tableSizeFor() computes the smallest power of two that is not less than initialCapacity. The implementation is quite clever; here is how it works:

/**
 * Returns a power of two size for the given target capacity.
 */
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

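If cap is already a power of two, subtracting 1 first and adding 1 at the end returns cap itself.
If cap is 1, n is 0; the shifts leave it at 0, and adding 1 returns 1, which is again cap.
If cap is 0 or negative, n is negative; the unsigned shifts and ORs leave it negative, so the method returns 1.
If cap is greater than 1, n is positive and tableSizeFor() works as follows:
- n is non-zero, so at least one bit is set; let bit i be the highest set bit of n
- n >>> 1 has bit i-1 set; after ORing with n, bits i and i-1 are both set, so the top two bits are 1
- n >>> 2 then has bits i-2 and i-3 set; after ORing, bits i through i-3 are set, so the top four bits are 1
- and so on: after the shift by 16, up to 32 bits could be filled in; since the constructor caps initialCapacity at MAXIMUM_CAPACITY = 1 << 30, n has at most 30 one-bits in practice
- after these steps every bit from bit i downward is 1
- adding 1 then gives the smallest power of two that is not less than initialCapacity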
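
The cases above can be checked by running the same bit trick outside HashMap. The method body below is copied from the JDK 1.8 source shown earlier; the wrapper class TableSizeForDemo is just a name chosen for this sketch.

```java
final class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same bit trick as HashMap.tableSizeFor in JDK 1.8.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        for (int cap : new int[] {0, 1, 10, 16, 17, 1000}) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
        // Prints: 0 -> 1, 1 -> 1, 10 -> 16, 16 -> 16, 17 -> 32, 1000 -> 1024 (one per line).
        // Trace for cap = 10: n = 9 = 0b1001; after the shifts n = 0b1111 = 15; n + 1 = 16.
    }
}
```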

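Food for thought:
1. The no-argument constructor never assigns threshold, so why does this constructor assign it?
2. One would expect threshold = tableSizeFor(initialCapacity) * loadFactor; why is it simply threshold = tableSizeFor(initialCapacity)?
(Hint: see the comment on the threshold field and the oldThr > 0 branch of resize().)

Constructing a HashMap from an existing Map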

/**
 * Constructs a new <tt>HashMap</tt> with the same mappings as the
 * specified <tt>Map</tt>.  The <tt>HashMap</tt> is created with
 * default load factor (0.75) and an initial capacity sufficient to
 * hold the mappings in the specified <tt>Map</tt>.
 *
 * @param   m the map whose mappings are to be placed in this map
 * @throws  NullPointerException if the specified map is null
 */
public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}

Adding elements

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

/**
 * Implements Map.put and related methods
 *
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode.
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

First, how the table array gets created

if ((tab = table) == null || (n = tab.length) == 0)
	n = (tab = resize()).length;

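The code first checks whether table is null or empty; if so, it calls resize() and assigns the returned array to tab. How resize() works is covered next.

resize(): creating or growing the table

Compute the new capacity newCap
Compute the new resize threshold newThr
Create the new table newTab
Move the elements of the old table into the new table
- iterate over the table array
- iterate over each bin's linked list or red-black tree
- for every non-empty bin, set the old slot to null, then place its nodes in the new table: a single node is re-indexed with e.hash & (newCap - 1), while a list is split into a "lo" list that stays at index j and a "hi" list that moves to j + oldCap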

/**
 * Initializes or doubles table size.  If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
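
The lo/hi split in resize() relies on the fact that, when the capacity doubles, a node either keeps its index or moves by exactly oldCap, depending on the single bit hash & oldCap. The following sketch checks this against a full recomputation of the index; ResizeSplitDemo and indexAfterResize are names made up for the example.

```java
final class ResizeSplitDemo {
    // Index of a node after the table doubles from oldCap to 2 * oldCap,
    // decided the same way as the lo/hi split in resize().
    static int indexAfterResize(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return ((hash & oldCap) == 0) ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int[] hashes = {5, 21, 37, 53}; // all land in bucket 5 while the capacity is 16
        for (int hash : hashes) {
            int recomputed = hash & (2 * oldCap - 1); // full recomputation, for comparison
            System.out.println(hash + ": " + indexAfterResize(hash, oldCap) + " == " + recomputed);
        }
        // 5 and 37 stay at index 5; 21 and 53 move to index 21 (5 + 16).
    }
}
```

This is why resize() never needs to rehash a list node by node into arbitrary buckets: one bit test per node is enough, and the relative order within each half is preserved.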

Computing the array index for a key-value pair from its key

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
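// array index
i = (n - 1) & hash; // n is the length of table, always a power of two

HashMap permits a null key; its hash is 0, so it is stored at index 0 of table.
For a non-null key, the index computation is equivalent to the following code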

public int indexOf(Object key) {
	int h = key.hashCode();
	return (h ^ (h >>> 16)) & (length - 1);
}

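Why not simply use h & (length - 1)? What is h >>> 16 for?
The source comment explains: Computes key.hashCode() and spreads (XORs) higher bits of hash to lower.
In other words, the high 16 bits of the hashCode are folded into the low 16 bits so that hashes spread out more evenly. Because the index is hash & (length - 1) and length is usually small, only the low bits would otherwise take part in choosing the bucket.
h >>> 16 is an unsigned right shift by 16 bits, which yields the high 16 bits of h. XORing that with h mixes the high half into the low half, so both halves influence the result.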

1010 1010 0001 0100 1111 0101	h = 11146485
0000 0000 0000 0000 1010 1010	h >>> 16
1010 1010 0001 0100 0101 1111	h ^ (h >>> 16) = 11146335	// high and low bits both contribute
1111 1111 1111 1111 1111 1111	length - 1 = 16777215
1010 1010 0001 0100 0101 1111	& result = 11146335	// the index reflects changes in both the high and the low bits
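
The benefit of the spreading step is easier to see with a small table. In the sketch below, two hash codes that differ only in their high bits collide when the raw value is masked, but land in different buckets once the high bits are XORed into the low bits. HashSpreadDemo, spread, and indexFor are illustrative names; spread applies the same expression as HashMap.hash() to a raw hashCode.

```java
final class HashSpreadDemo {
    // Same spreading step as HashMap.hash(), applied to a raw hashCode.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table of length n (n must be a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int h1 = 0x10000; // the two raw hash codes differ only above bit 15
        int h2 = 0x20000;

        // Without spreading, only the low 4 bits are used and both collide at index 0.
        System.out.println(indexFor(h1, n) + " " + indexFor(h2, n));                 // 0 0
        // With spreading, the high bits are folded in and the indices differ.
        System.out.println(indexFor(spread(h1), n) + " " + indexFor(spread(h2), n)); // 1 2
    }
}
```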

Next comes an if…else… that checks whether the target slot is empty

if ((p = tab[i = (n - 1) & hash]) == null)
    tab[i] = newNode(hash, key, value, null);
else {
    // a collision occurred; resolve it
}

i = (n - 1) & hash derives the array index from the hash.
If the slot is empty, a Node is created with newNode() and stored directly in table:

Node<K,V> newNode(int hash, K key, V value, Node<K,V> next) {
    return new Node<>(hash, key, value, next);
}

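When the slot is not empty, a collision has occurred and must be resolved.

Resolving collisions

A collision falls into one of two cases: the key is the same, or the key is different.
How is key equality decided?
- first, the hashes must be equal
- second, the keys must be the same object or key.equals() must return true

// p is the existing node; key is the key being inserted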
p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k)))

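If p is a TreeNode, the bin is a red-black tree, and putTreeVal() finds the right place in the tree for the new mapping.
Otherwise the bin is a singly linked list, which is walked node by node:
- ① if the next node is null, the new node is appended to the tail; if the bin then holds more than 8 nodes, treeifyBin() is called
- ② if a node with the same hash and an equal key is found, the loop stops and its value is overwritten afterwards (the node stays the same; only value changes).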

// p = tab[i = (n - 1) & hash]
Node<K,V> e; K k;
if (p.hash == hash &&
    ((k = p.key) == key || (key != null && key.equals(k))))
    e = p; // e holds the node whose key equals the new key
else if (p instanceof TreeNode)
    e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
    for (int binCount = 0; ; ++binCount) {
        if ((e = p.next) == null) { 
            p.next = newNode(hash, key, value, null); // key not found: append the new node to the tail
            if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                treeifyBin(tab, hash); // the bin now holds more than 8 nodes: try to convert the list to a red-black tree
            break;
        }
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            break;
        p = e;
    }
}
// the key already exists
if (e != null) { // existing mapping for key
    V oldValue = e.value;
    if (!onlyIfAbsent || oldValue == null)
        e.value = value;
    afterNodeAccess(e);
    return oldValue;
}
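
The two collision cases can be observed from the outside with a key type that always collides. BadKey below is a hypothetical class written only for this demonstration: its hashCode() is constant, so every instance lands in the same bucket, and equals() alone decides whether a put appends a node or overwrites a value.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingKeysDemo {
    // Hypothetical key type whose hashCode is always the same, forcing every key into one bucket.
    static final class BadKey {
        final String name;

        BadKey(String name) {
            this.name = name;
        }

        @Override
        public int hashCode() {
            return 42;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).name.equals(name);
        }
    }

    // Inserts "a", "b", and "a" again; returns the resulting map.
    static Map<BadKey, Integer> build() {
        Map<BadKey, Integer> map = new HashMap<>();
        map.put(new BadKey("a"), 1);
        map.put(new BadKey("b"), 2); // same hash, different key: appended to the bucket
        map.put(new BadKey("a"), 3); // same hash, equal key: value overwritten
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = build();
        System.out.println(map.size());               // 2
        System.out.println(map.get(new BadKey("a"))); // 3
        System.out.println(map.get(new BadKey("b"))); // 2
    }
}
```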

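Converting a linked list to a red-black tree

If the table is null or its length is less than the minimum treeify capacity (MIN_TREEIFY_CAPACITY = 64), the table is resized instead, since a larger table spreads the elements more evenly and no tree is needed.
When the table length is at least 64 (and the list holds more than 8 nodes), the list is converted to a red-black tree.
Treeification steps
- convert each Node to a TreeNode
- link the TreeNodes into a doubly linked list (prev/next)
- build the red-black tree from that doubly linked list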

/**
 * Replaces all linked nodes in bin at index for given hash unless
 * table is too small, in which case resizes instead.
 */
final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
    else if ((e = tab[index = (n - 1) & hash]) != null) {
        TreeNode<K,V> hd = null, tl = null;
        // the loop below walks the list, turning the singly linked Nodes into doubly linked TreeNodes
        do {
            TreeNode<K,V> p = replacementTreeNode(e, null);
            if (tl == null)
                hd = p;
            else {
                p.prev = tl;
                tl.next = p;
            }
            tl = p;
        } while ((e = e.next) != null);
        // store the head node in the table
        if ((tab[index] = hd) != null)
            hd.treeify(tab); // treeify() is called on the head to build the tree starting from that node
    }
}

TreeNode<K,V> replacementTreeNode(Node<K,V> p, Node<K,V> next) {
    return new TreeNode<>(p.hash, p.key, p.value, next);
}
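
The choice between resizing and treeifying can be summarized as a small decision function. This is only a model of the conditions in putVal() and treeifyBin(), not JDK code; TreeifyDecisionSketch, Action, and afterAppend are hypothetical names, and binLength is assumed to be the number of nodes in a list bin right after a new node has been appended.

```java
final class TreeifyDecisionSketch {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { KEEP_LIST, RESIZE, TREEIFY }

    // What happens to a list bin that holds binLength nodes after an append,
    // in a table of the given length.
    static Action afterAppend(int tableLength, int binLength) {
        if (binLength <= TREEIFY_THRESHOLD) {
            return Action.KEEP_LIST; // putVal() does not call treeifyBin() yet
        }
        // treeifyBin(): a small table is resized instead of building a tree
        return (tableLength < MIN_TREEIFY_CAPACITY) ? Action.RESIZE : Action.TREEIFY;
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(16, 8)); // KEEP_LIST
        System.out.println(afterAppend(16, 9)); // RESIZE
        System.out.println(afterAppend(64, 9)); // TREEIFY
    }
}
```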

Finally, check whether size exceeds the resize threshold

++modCount; // structural modification counter, used by fail-fast iterators
if (++size > threshold)
    resize();

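After a new element is inserted, size is incremented.
The element is inserted first and the resize check comes afterwards (note: JDK 1.7 checks for a resize before inserting).

When resize() is called

On the first put() after construction, when it only creates the table array.
After an insertion, if size exceeds the resize threshold (threshold = capacity * loadFactor), to grow the table.
When a list grows past 8 nodes while the table is null or shorter than MIN_TREEIFY_CAPACITY = 64; resizing spreads the elements more evenly, so the list is not converted to a red-black tree.

putAll(map)

A typical use is to create a new Map and call putAll() to add the mappings of an existing Map to it.
putAll() iterates over the source Map and puts each entry into the target. The keys and values are the same object references held by the source Map; no new key or value objects are created.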

public void putAll(Map<? extends K, ? extends V> m) {
    putMapEntries(m, true);
}

final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
    int s = m.size();
    if (s > 0) {
        if (table == null) { // pre-size
            float ft = ((float)s / loadFactor) + 1.0F;
            int t = ((ft < (float)MAXIMUM_CAPACITY) ?
                     (int)ft : MAXIMUM_CAPACITY);
            if (t > threshold)
                threshold = tableSizeFor(t);
        }
        else if (s > threshold) // not s + size() > threshold, because JDK 8 resizes only after an element has been put
            resize();
        for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
            K key = e.getKey();
            V value = e.getValue();
            putVal(hash(key), key, value, false, evict);
        }
    }
}
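
That putAll() copies references rather than objects is easy to verify with a mutable value type. In this sketch, PutAllDemo and copyOf are names chosen for the example; only HashMap.putAll() itself comes from the JDK.

```java
import java.util.HashMap;
import java.util.Map;

final class PutAllDemo {
    // Copies every mapping of source into a fresh HashMap using putAll().
    static <K, V> Map<K, V> copyOf(Map<K, V> source) {
        Map<K, V> target = new HashMap<>();
        target.putAll(source);
        return target;
    }

    public static void main(String[] args) {
        Map<String, StringBuilder> original = new HashMap<>();
        original.put("greeting", new StringBuilder("hello"));

        Map<String, StringBuilder> copy = copyOf(original);
        copy.get("greeting").append(" world"); // mutates the object both maps refer to

        System.out.println(original.get("greeting"));                         // hello world
        System.out.println(copy.get("greeting") == original.get("greeting")); // true
    }
}
```

If independent values are needed, each value has to be copied explicitly before it is put into the new map.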
