HashMap Source Code (JDK 1.8)

HashMap

HashMap is a key-value lookup table.

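class HashMap<K,V> {
    // The table, initialized on first use and resized as necessary. When allocated, its length is always a power of two. (Length zero is also tolerated in some operations to allow bootstrapping mechanics that are currently not needed.)
    transient Node<K,V>[] table;
    // Cached entrySet() view
    transient Set<Map.Entry<K,V>> entrySet;
    // Number of key-value mappings
    transient int size;
    // Number of structural modifications, used for fail-fast iteration
    transient int modCount;
    // Resize threshold: the size above which the next resize happens (capacity * load factor)
    int threshold;
    // Load factor
    final float loadFactor;
}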

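HashMap uses the key's hash to choose a slot (bin) in the array and resolves hash collisions by separate chaining. A chain is a linked list while it holds at most 8 nodes and may be converted to a red-black tree once it grows beyond 8.

The putVal function

Below is the implementation of put: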

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash); // convert the linked list to a red-black tree
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

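Whatever happens next, the first step always runs: check whether table has been initialized (tab == null || tab.length == 0), and if not, call resize() to initialize it.

The rest of the code looks long, but it boils down to two cases:

The key already has a mapping
The key has no mapping

If a mapping exists (the hashes match and the key is identical or equals() returns true), the call can at most overwrite the old value. Otherwise the call necessarily causes a structural modification.

Because the storage is an array, the hash is used to compute the position in the array: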

n=tab.length
index=(n - 1) & hash

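This is why tab.length must be a power of two: in that case (n - 1) & hash equals hash % n for a non-negative hash (and Math.floorMod(hash, n) in general), so a cheap bit mask replaces the modulo.

One of netty's LoopChooser implementations uses the same trick.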
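
The masking identity can be checked with a small sketch. The class name IndexMaskDemo and the helper indexFor are made up for illustration; spread repeats the hash(Object) step that put and get call in JDK 8.

```java
final class IndexMaskDemo {
    // Same spreading step as java.util.HashMap.hash(Object) in JDK 8.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bin index for a table whose length n is a power of two (illustrative helper).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        for (Object key : new Object[] {"a", "hello", 42, -7, null}) {
            int hash = spread(key);
            // The mask and floorMod agree even when the hash is negative.
            System.out.println(key + " -> " + indexFor(hash, n)
                    + " (floorMod: " + Math.floorMod(hash, n) + ")");
        }
    }
}
```

Note that the plain % operator would return a negative result for a negative hash, which is one more reason the mask is the safer choice.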

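If the slot at index is null, no node with a key that equals(k) exists.

The index can collide, though: when the slot already holds a node, the red-black tree or linked list in that bin has to be searched for a node whose key equals(k).

After the search, the value of e tells whether an old node exists.

As for why the call only possibly overwrites the old value: e marks the existing mapping for the key.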

if (e != null) { // existing mapping for key
    V oldValue = e.value;
    if (!onlyIfAbsent || oldValue == null)
        e.value = value;
    afterNodeAccess(e);
    return oldValue;
}

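Here onlyIfAbsent and oldValue decide whether the new value replaces the old one: the value is overwritten unless onlyIfAbsent is true and the old value is non-null. Either way, the old value is returned.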
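
The effect of onlyIfAbsent is visible through the public API, since putIfAbsent passes true for it and put passes false. A minimal sketch (the class name OnlyIfAbsentDemo is made up):

```java
import java.util.HashMap;
import java.util.Map;

final class OnlyIfAbsentDemo {
    static Map<String, Integer> run() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.putIfAbsent("a", 2); // onlyIfAbsent, old value non-null: 1 is kept
        map.put("b", null);
        map.putIfAbsent("b", 3); // old value is null: replaced despite onlyIfAbsent
        map.put("a", 10);        // plain put: always replaced
        return map;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```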

If no existing mapping was found, a structural modification has taken place:

++modCount;
if (++size > threshold)
    resize();
afterNodeInsertion(evict); // empty implementation in HashMap (a hook for LinkedHashMap)
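
Because only the insertion path increments modCount, overwriting an existing key during iteration is tolerated, while adding a new key trips the fail-fast check. A small sketch (class and method names are made up):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

final class FailFastDemo {
    // Returns true if calling put(key, 0) while iterating throws
    // ConcurrentModificationException.
    static boolean throwsOnPut(String key) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String k : map.keySet()) {
                map.put(key, 0); // structural only if key is new
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("overwrite existing key: " + throwsOnPut("a"));
        System.out.println("insert new key: " + throwsOnPut("z"));
    }
}
```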

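Beyond that, there is one special structural change:

appending a node to a chain, which can change the bin's representation from a linked list to a tree.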

for (int binCount = 0; ; ++binCount) {
    if ((e = p.next) == null) {
        p.next = newNode(hash, key, value, null);
        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
            treeifyBin(tab, hash); // the key line
        break;
    }
    if (e.hash == hash &&
        ((k = e.key) == key || (key != null && key.equals(k))))
        break;
    p = e;
}

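TREEIFY_THRESHOLD is a final constant with the value 8, and there is no way to change it. treeifyBin is therefore called once the chain already held 8 nodes before the new node was appended.

Even when this function is called, however, the bin is not necessarily treeified. There is a second condition, tab.length >= MIN_TREEIFY_CAPACITY, where MIN_TREEIFY_CAPACITY is 64 and is likewise a final constant.

In other words, a bin is treeified only when **linked.size>=8 (counted before the new node is appended) && tab.length>=64**, which is quite different from the common online claim that linked.size>=8 alone is enough. When the table is shorter than 64, treeifyBin calls resize() instead.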

final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
    else if ((e = tab[index = (n - 1) & hash]) != null) {
        TreeNode<K,V> hd = null, tl = null;
        do {
            TreeNode<K,V> p = replacementTreeNode(e, null);
            if (tl == null)
                hd = p;
            else {
                p.prev = tl;
                tl.next = p;
            }
            tl = p;
        } while ((e = e.next) != null);
        if ((tab[index] = hd) != null)
            hd.treeify(tab);
    }
}
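
The two conditions can be summarized as a small decision function. This is only a model of the logic described above, not JDK code: the class, the enum and afterAppend are hypothetical names, and only the two constant values are taken from HashMap.

```java
final class TreeifyDecisionDemo {
    static final int TREEIFY_THRESHOLD = 8;     // same value as in HashMap
    static final int MIN_TREEIFY_CAPACITY = 64; // same value as in HashMap

    enum Action { NONE, RESIZE, TREEIFY }

    // nodesBeforeInsert: chain length before the new node is appended.
    static Action afterAppend(int nodesBeforeInsert, int tableLength) {
        if (nodesBeforeInsert < TREEIFY_THRESHOLD)
            return Action.NONE;    // putVal does not call treeifyBin
        if (tableLength < MIN_TREEIFY_CAPACITY)
            return Action.RESIZE;  // treeifyBin only resizes the table
        return Action.TREEIFY;     // the chain is converted to a tree
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(7, 64));
        System.out.println(afterAppend(8, 16));
        System.out.println(afterAppend(8, 64));
    }
}
```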

Back in putVal, the last step checks whether the incremented size exceeds the resize threshold. If it does, resize() is called.

The getNode function

Below is get:

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}
final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

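Compared with put, get is a good deal simpler:

Compute the index
Compare the hash of the first node at that index and then equals; if both match, return it directly
Otherwise search the red-black tree or the linked list

Why does Node store the hash? It serves two purposes:

It avoids recomputing the same hash over and over
For a mutable key, the node keeps the hash recorded at insertion time, so as long as the change to the key does not make equals fail, HashMap keeps working

There is also a usage pitfall.

Do not use code like the following to check whether a key-value mapping exists: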

map.get(k)==null;

Use the containsKey function instead:

public boolean containsKey(Object key) {
    return getNode(hash(key), key) != null;
}

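HashMap can store null values, and even a null value is held by a Node object, so get(k) == null cannot distinguish a missing key from a key mapped to null.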
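
A minimal sketch of the difference (the class name NullValueDemo and its sample data are made up):

```java
import java.util.HashMap;
import java.util.Map;

final class NullValueDemo {
    static Map<String, String> sample() {
        Map<String, String> map = new HashMap<>();
        map.put("k", null); // a Node exists for "k", its value is null
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = sample();
        System.out.println(map.get("k") == null);       // null value
        System.out.println(map.get("missing") == null); // no mapping at all
        System.out.println(map.containsKey("k"));
        System.out.println(map.containsKey("missing"));
    }
}
```

Both get calls return null, and only containsKey tells the two situations apart.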

The remove function

Below is the remove method:

public V remove(Object key) {
    Node<K,V> e;
    return (e = removeNode(hash(key), key, null, false, true)) == null ?
        null : e.value;
}
    final Node<K,V> removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node<K,V>[] tab; Node<K,V> p; int n, index;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {
            Node<K,V> node = null, e; K k; V v;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                node = p;
            else if ((e = p.next) != null) {
                if (p instanceof TreeNode)
                    node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
                else {
                    do {
                        if (e.hash == hash &&
                            ((k = e.key) == key ||
                             (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        p = e;
                    } while ((e = e.next) != null);
                }
            }
            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                if (node instanceof TreeNode)
                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
                else if (node == p)
                    tab[index] = node.next;
                else
                    p.next = node.next;
                ++modCount;
                --size;
                afterNodeRemoval(node);
                return node;
            }
        }
        return null;
    }

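The pattern is much the same as in put:

The key has a mapping
The key has no mapping

With no mapping there is nothing to do; null is returned.

If a mapping exists, the Node for the key is looked up.

The lookup has three cases:

The Node is stored directly in tab
The Node is stored in a red-black tree
The Node is stored in a linked list

Once a matching Node is found, what happens next depends on matchValue:

matchValue==false
matchValue==true

If the value does not need to match (or it does match), the node is removed. Removal also has three cases, and there is a small trick here: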

if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
    if (node instanceof TreeNode)
        ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
    else if (node == p)
        tab[index] = node.next;
    else
        p.next = node.next;
    ++modCount;
    --size;
    afterNodeRemoval(node);
    return node;
}
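
The matchValue flag is what separates remove(key) from the two-argument remove(key, value), which only deletes the node when the stored value matches. A short sketch (class name and sample data are made up):

```java
import java.util.HashMap;
import java.util.Map;

final class RemoveMatchValueDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = sample();
        System.out.println(map.remove("a", 99)); // value differs: nothing removed
        System.out.println(map.remove("a", 1));  // value matches: removed
        System.out.println(map.remove("b"));     // no value check: returns old value
        System.out.println(map.remove("zz"));    // no mapping: returns null
    }
}
```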

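Removing from a linked list

For a node that is neither the head of the list nor null, find its predecessor and set pre.next = pre.next.next. In removeNode, p trails one step behind e during the search, so p already is that predecessor; node == p holds only when the match is the first node of the bin.

When remove actually deletes a node, that is of course a structural modification as well.

afterNodeRemoval is again an empty implementation.

One surprising detail: when remove shrinks the amount of data, HashMap never calls resize().

Searching for references to resize() shows that it is only called on paths that add entries (treeifyBin is the function that attempts treeification).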

(Figure: references to resize())

A bin may, however, be converted back to a linked list by the untreeify function.

Below is resize:

    final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

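The first part computes newCap, again in several cases:

The map has not been initialized yet
The map's capacity has already reached the maximum
The map's capacity has not reached the maximum

When the map is initialized and below the maximum, newCap is always twice oldCap. This is also why the maximum capacity is 1<<30:

The maximum int value is (1<<31)-1, so 1<<30 is the largest power of two an int can hold. Doubling it overflows to a negative number, which would break all the comparisons. Therefore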

if (oldCap >= MAXIMUM_CAPACITY) {
    threshold = Integer.MAX_VALUE;
    return oldTab;
}

no expansion takes place, and threshold is set to Integer.MAX_VALUE.
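
The overflow is easy to reproduce. The sketch below uses a made-up class name; MAXIMUM_CAPACITY has the same value as the constant in HashMap. It also shows why the parentheses in (1<<31)-1 matter in Java, where - binds tighter than <<.

```java
final class MaxCapacityDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30; // same value as in HashMap

    static int doubled(int cap) {
        return cap << 1;
    }

    public static void main(String[] args) {
        System.out.println(MAXIMUM_CAPACITY);                      // largest power of two in an int
        System.out.println(doubled(MAXIMUM_CAPACITY));             // overflows to Integer.MIN_VALUE
        System.out.println((1 << 31) - 1 == Integer.MAX_VALUE);    // true
        System.out.println((1 << 31 - 1) == MAXIMUM_CAPACITY);     // true: parsed as 1 << 30
    }
}
```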

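If the capacity is within the limits (newCap < MAXIMUM_CAPACITY and oldCap >= DEFAULT_INITIAL_CAPACITY), the threshold for the next resize is doubled as well, i.e. newThr=oldThr<<1.

If the map has not been initialized yet and threshold is set, that value actually states the desired capacity after initialization: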

if (oldThr > 0) // initial capacity was placed in threshold
	newCap = oldThr;

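Otherwise initialization uses the default DEFAULT_INITIAL_CAPACITY (16), and the resize threshold is set to (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY), which is 12 because DEFAULT_LOAD_FACTOR is 0.75.

Finally there is a fallback step: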

if (newThr == 0) {
    float ft = (float)newCap * loadFactor;
    newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
              (int)ft : Integer.MAX_VALUE);
}

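This mainly covers the case where the map has not been initialized yet and threshold was specified; it also applies whenever the doubling branch above left newThr at 0.

At this point newCap is known, and a new array of length newCap is created to replace table.

Then the data is migrated, which again breaks down into several cases:

The current index holds nothing
The node at the current index has no chain
The current index holds a red-black tree
The current index holds a linked list

Cases 1 and 2 need no discussion. Start with the simpler case 4.

If the current slot has a chain, every node on the chain has the same (n-1)&hash value.

Because n is a power of two, there are two sub-cases:

The hash bit at the position of n's single 1 bit is 1
The hash bit at the position of n's single 1 bit is 0

Since newCap=oldCap<<1, in sub-case 2 (newCap-1)&hash==(oldCap-1)&hash, so the node stays at its current index. In sub-case 1, (newCap-1)&hash==((oldCap-1)&hash)|oldCap, and because oldCap is a power of two, ((oldCap-1)&hash)|oldCap==((oldCap-1)&hash)+oldCap.

The chain may therefore be split into two lists, which are built by appending at the tail so the relative order is preserved.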
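
The lo/hi rule can be checked numerically. In this sketch the class and both helpers are hypothetical names; predictedNewIndex applies the same single-bit test, (hash & oldCap) == 0, that resize() uses.

```java
final class ResizeSplitDemo {
    static int index(int hash, int cap) {
        return (cap - 1) & hash;
    }

    // New index predicted from the old index and the one bit tested in resize().
    static int predictedNewIndex(int hash, int oldCap) {
        int j = index(hash, oldCap);
        return (hash & oldCap) == 0 ? j : j + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // All four hashes share old index 5; after doubling they land on 5 or 21.
        for (int hash : new int[] {5, 21, 37, 53}) {
            System.out.println(hash + ": " + index(hash, oldCap) + " -> "
                    + predictedNewIndex(hash, oldCap)
                    + " (recomputed: " + index(hash, oldCap << 1) + ")");
        }
    }
}
```

The point of the trick is that resize() never has to recompute the full index: one bit of the stored hash decides between j and j + oldCap.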

Case 3

Note that TreeNode inherits from Node (via LinkedHashMap.Entry), so the linked-list structure is still there.

final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
    TreeNode<K,V> b = this;
    // Relink into lo and hi lists, preserving order
    TreeNode<K,V> loHead = null, loTail = null;
    TreeNode<K,V> hiHead = null, hiTail = null;
    int lc = 0, hc = 0;
    for (TreeNode<K,V> e = b, next; e != null; e = next) {
        next = (TreeNode<K,V>)e.next;
        e.next = null;
        if ((e.hash & bit) == 0) {
            if ((e.prev = loTail) == null)
                loHead = e;
            else
                loTail.next = e;
            loTail = e;
            ++lc;
        }
        else {
            if ((e.prev = hiTail) == null)
                hiHead = e;
            else
                hiTail.next = e;
            hiTail = e;
            ++hc;
        }
    }

    if (loHead != null) {
        if (lc <= UNTREEIFY_THRESHOLD)
            tab[index] = loHead.untreeify(map);
        else {
            tab[index] = loHead;
            if (hiHead != null) // (else is already treeified)
                loHead.treeify(tab);
        }
    }
    if (hiHead != null) {
        if (hc <= UNTREEIFY_THRESHOLD)
            tab[index + bit] = hiHead.untreeify(map);
        else {
            tab[index + bit] = hiHead;
            if (loHead != null)
                hiHead.treeify(tab);
        }
    }
}

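A fairly neat approach is used here:

Rebuild the linked lists
If a list's length is <= UNTREEIFY_THRESHOLD, call untreeify to turn the red-black tree back into a linked list
Otherwise re-treeify from the linked-list structure

There is one more trick: if hiHead turns out to be null while loHead is being processed, the tree structure has not changed and no re-treeify is needed; the same holds the other way round.

UNTREEIFY_THRESHOLD is a final constant with the value 6.

untreeify, the conversion back to a list, is only performed in split and removeTreeNode, and split is only called here.

The fact that TreeNode inherits the Node structure is also relied on in another frequently used place:

The containsValue function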

public boolean containsValue(Object value) {
    Node<K,V>[] tab; V v;
    if ((tab = table) != null && size > 0) {
        for (int i = 0; i < tab.length; ++i) {
            for (Node<K,V> e = tab[i]; e != null; e = e.next) {
                if ((v = e.value) == value ||
                    (value != null && value.equals(v)))
                    return true;
            }
        }
    }
    return false;
}

There are a few more optimizations:

public Set<K> keySet() {
    Set<K> ks = keySet;
    if (ks == null) {
        ks = new KeySet();
        keySet = ks; // cache
    }
    return ks;
}
public Collection<V> values() {
    Collection<V> vs = values;
    if (vs == null) {
        vs = new Values();
        values = vs; // cache
    }
    return vs;
}
// Used by the constructor that takes initialCapacity and by putMapEntries; returns the smallest power of two that is greater than or equal to cap
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16; // build the mask: every bit below the highest set bit is now 1
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
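
To see what tableSizeFor returns for a few inputs, the function can be copied into a standalone class. The wrapper class name is made up; the method body and the MAXIMUM_CAPACITY value are the JDK 8 ones shown above.

```java
final class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Copy of the JDK 8 HashMap.tableSizeFor shown above.
    static int tableSizeFor(int cap) {
        int n = cap - 1; // so that an exact power of two maps to itself
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        for (int cap : new int[] {0, 1, 15, 16, 17, 1000}) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
    }
}
```

The initial cap - 1 is what makes the result "greater than or equal to" rather than strictly greater: 16 stays 16, while 17 becomes 32.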

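The keySet() and values() views can be cached because KeySet and Values hold no data of their own; they read directly from the table that stores the actual entries.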
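
A small sketch of what "view" means in practice (the class and method names are made up): a key set obtained while the map is empty still sees later changes, changes made through it reach the map, and repeated keySet() calls hand back the same cached object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class KeySetViewDemo {
    static boolean viewIsLive() {
        Map<String, Integer> map = new HashMap<>();
        Set<String> keys = map.keySet(); // obtained while the map is empty
        map.put("a", 1);
        boolean seesPut = keys.contains("a"); // the view reads the table
        keys.remove("a");                     // removal through the view
        return seesPut && map.isEmpty() && keys == map.keySet();
    }

    public static void main(String[] args) {
        System.out.println(viewIsLive());
    }
}
```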

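computeIfAbsent computes a value when the key is absent. // the function does not receive an oldValue

computeIfPresent computes a value when the key is present. // the function receives the oldValue

Note that if the computed value is null, nothing is added to the map.

With compute, if the new value is null, the existing Node is removed, i.e. containsKey(k)==false afterwards.

Both computeIfAbsent and compute can cause a structural modification by inserting a node; compute and computeIfPresent can also cause one by removing a node.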
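
The behaviour of the three methods, including the null results, can be seen in one short run (the class name ComputeDemo and the keys are made up):

```java
import java.util.HashMap;
import java.util.Map;

final class ComputeDemo {
    static Map<String, Integer> run() {
        Map<String, Integer> map = new HashMap<>();
        map.computeIfAbsent("a", k -> 1);            // absent: inserts a=1
        map.computeIfAbsent("a", k -> 100);          // present: left unchanged
        map.computeIfAbsent("n", k -> null);         // null result: nothing added
        map.computeIfPresent("a", (k, v) -> v + 1);  // present: a=2
        map.computeIfPresent("x", (k, v) -> 1);      // absent: no effect
        map.compute("b", (k, v) -> v == null ? 10 : v + 1); // absent: b=10
        map.compute("a", (k, v) -> null);            // null result: a is removed
        return map;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```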

merge

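If the specified key is not yet associated with a value, or is associated with null, it is associated with the given non-null value. Otherwise the value is replaced by the result of the remapping function, or the mapping is removed if that result is null.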
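
Counting occurrences is a typical use of merge. In this sketch the class and both methods are made-up names; the second method shows the removal that happens when the remapping function returns null.

```java
import java.util.HashMap;
import java.util.Map;

final class MergeDemo {
    static Map<String, Integer> count(String... words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            // First occurrence stores 1, later ones add 1 to the old value.
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    static boolean removedByNullResult() {
        Map<String, Integer> counts = count("a");
        counts.merge("a", 1, (oldValue, value) -> null); // null result removes the mapping
        return !counts.containsKey("a");
    }

    public static void main(String[] args) {
        System.out.println(count("a", "b", "a"));
        System.out.println(removedByNullResult());
    }
}
```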

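Finally, the red-black tree part

Because a tree is an ordered structure, nodes have to be compared. Nodes are ordered by hash first; for equal hashes the rules are:

If the key implements the Comparable interface, compareTo is used (even then, if the comparison yields 0 the following rules still apply); this is what the comparableClassFor function is for
Otherwise the classes obtained via getClass are compared (by class name)
If getClass gives the same result, the identityHashCode values are compared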

// Returns x's Class if it is of the form "class C implements Comparable<C>", else null.
static Class<?> comparableClassFor(Object x) {
    if (x instanceof Comparable) {
        Class<?> c; Type[] ts, as; Type t; ParameterizedType p;
        if ((c = x.getClass()) == String.class) // bypass checks
            return c;
        if ((ts = c.getGenericInterfaces()) != null) {
            for (int i = 0; i < ts.length; ++i) {
                if (((t = ts[i]) instanceof ParameterizedType) &&
                    ((p = (ParameterizedType)t).getRawType() ==
                     Comparable.class) &&
                    (as = p.getActualTypeArguments()) != null &&
                    as.length == 1 && as[0] == c) // type arg is c
                    return c;
            }
        }
    }
    return null;
}

static int compareComparables(Class<?> kc, Object k, Object x) {
    return (x == null || x.getClass() != kc ? 0 :
            ((Comparable)k).compareTo(x));
}
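
The ordering rules matter most when many keys share one hash. The sketch below uses a hypothetical Key class whose hashCode is constant, so every entry lands in the same bin. Whether that bin is actually treeified cannot be observed through the public API, so the example only demonstrates that lookups stay correct; Key has the "class C implements Comparable<C>" shape that comparableClassFor looks for.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingKeyDemo {
    // Hypothetical key type: every instance has the same hashCode.
    static final class Key implements Comparable<Key> {
        final int id;
        Key(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
        @Override public int compareTo(Key other) {
            return Integer.compare(id, other.id);
        }
    }

    static Map<Key, Integer> build(int n) {
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new Key(i), i); // all keys collide in one bin
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Key, Integer> map = build(1000);
        System.out.println(map.size());
        System.out.println(map.get(new Key(777)));
        System.out.println(map.containsKey(new Key(1000)));
    }
}
```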
