Introduction

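I have recently been reading 小灰灰算法, and its section on hash tables is quite interesting. This post records some notes on hash tables and on how HashMap and ConcurrentHashMap in the JDK were optimized. See also the other articles in the 拾遺 series.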

What the book says

There are two ways to resolve hash table collisions:

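Open addressing
Suppose we want to insert Entry6, whose hash maps to slot 3.

Slot 3 is already occupied by Entry5, so we **probe forward through the array until we find the next free slot**. That is open addressing.

In this example, Entry6 ends up in slot 4.
TIPS:
ThreadLocal uses open addressing.

(Aside: these things really are not far from everyday code. Here is the set method of ThreadLocal's internal ThreadLocalMap):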

        private void set(ThreadLocal<?> key, Object value) {

            // We don't use a fast path as with get() because it is at
            // least as common to use set() to create new entries as
            // it is to replace existing ones, in which case, a fast
            // path would fail more often than not.

            Entry[] tab = table;
            int len = tab.length;
            int i = key.threadLocalHashCode & (len-1);

            for (Entry e = tab[i];   // line 1
                 e != null;
                 e = tab[i = nextIndex(i, len)]) {  // line2
                ThreadLocal<?> k = e.get();

                if (k == key) {  // line 3
                    e.value = value;
                    return;
                }

                if (k == null) {  // line4
                    replaceStaleEntry(key, value, i);
                    return;
                }
            }

            tab[i] = new Entry(key, value);
            int sz = ++size;
            if (!cleanSomeSlots(i, sz) && sz >= threshold)
                rehash();
        }

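// Analysis:
/**
 * line 1: start at the slot the hash maps to.
 * line 2: advance to the next slot on each iteration.
 * line 3: if the slot already holds this key, overwrite its value.
 * line 4: if the slot holds a stale entry (its weakly referenced key has been
 *         garbage-collected, so e.get() returns null), reuse it via replaceStaleEntry.
 * When the loop ends because tab[i] is null, a free slot has been found and
 * the new Entry is stored there.
 * The heart of the probing is the nextIndex method:
 */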
        private static int nextIndex(int i, int len) {
            return ((i + 1 < len) ? i + 1 : 0);
        }
That is, the slot right after the current one, wrapping around to slot 0 past the end of the array.
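The probing idea can be sketched outside of ThreadLocal as well. The class below is a minimal illustration, not the JDK's code: `LinearProbingMap` is a hypothetical name, the capacity is fixed, and there is no resize, removal, or stale-entry handling. It reuses the same wrap-around `nextIndex` step as the listing above.

```java
import java.util.Objects;

/** Minimal open-addressing map with linear probing (fixed capacity, no resize, no removal). */
final class LinearProbingMap {
    private final String[] keys;
    private final Object[] values;

    LinearProbingMap(int capacity) {
        keys = new String[capacity];
        values = new Object[capacity];
    }

    // Same wrap-around step as nextIndex in the ThreadLocalMap listing.
    private static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;
    }

    /** Stores the pair and returns the slot it ended up in, or -1 if the table is full. */
    int put(String key, Object value) {
        Objects.requireNonNull(key);
        int len = keys.length;
        int i = Math.floorMod(key.hashCode(), len);
        for (int probes = 0; probes < len; probes++, i = nextIndex(i, len)) {
            if (keys[i] == null || keys[i].equals(key)) {
                keys[i] = key;      // free slot, or same key: (over)write here
                values[i] = value;
                return i;
            }
        }
        return -1;
    }

    /** Probes from the home slot until the key or a free slot is found. */
    Object get(String key) {
        int len = keys.length;
        int i = Math.floorMod(key.hashCode(), len);
        for (int probes = 0; probes < len && keys[i] != null; probes++, i = nextIndex(i, len)) {
            if (keys[i].equals(key)) return values[i];
        }
        return null;
    }
}
```

The strings "Aa" and "BB" have the same hashCode (2112), so with a capacity of 8 the first lands in slot 0 and the second is pushed to slot 1, just like Entry5 and Entry6 in the example.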

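Separate chaining
With separate chaining, an element that collides is linked after the element already in that slot:
(so that, for example, slot 2 ends up holding a linked list)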
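As a rough illustration of chaining, here is a minimal sketch. `ChainedMap` and its `chainLength` helper are hypothetical names introduced for this example, and the table never resizes; it is not how the JDK lays out its classes.

```java
/** Minimal separate-chaining map: each slot holds the head of a singly linked list. */
final class ChainedMap {
    private static final class Node {
        final String key;
        Object value;
        Node next;
        Node(String key, Object value) { this.key = key; this.value = value; }
    }

    private final Node[] table;

    ChainedMap(int capacity) { table = new Node[capacity]; }

    private int indexFor(String key) {
        return Math.floorMod(key.hashCode(), table.length);
    }

    void put(String key, Object value) {
        int i = indexFor(key);
        Node last = null;
        for (Node e = table[i]; e != null; e = e.next) {
            if (e.key.equals(key)) { e.value = value; return; }  // same key: overwrite
            last = e;
        }
        Node node = new Node(key, value);
        if (last == null) table[i] = node;   // empty slot: node becomes the head
        else last.next = node;               // collision: link after the last element
    }

    Object get(String key) {
        for (Node e = table[indexFor(key)]; e != null; e = e.next) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    /** Number of entries chained in the slot that the key maps to. */
    int chainLength(String key) {
        int n = 0;
        for (Node e = table[indexFor(key)]; e != null; e = e.next) n++;
        return n;
    }
}
```

Inserting "Aa" and "BB" (equal hashCodes) produces a chain of length 2 in one slot, and every lookup in that slot has to walk the chain.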

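Analysis of the JDK implementations

Some of this has been covered elsewhere. If you are interested, read up on how the HashMap "deadlock" problem (the well-known infinite loop when a JDK 1.7 HashMap is resized concurrently) was resolved. Here I mainly go through some implementation details and changes in HashMap and ConcurrentHashMap in JDK 1.7 and JDK 8.

HashMap in JDK 1.7

A plain separate-chaining implementation. When collisions are severe (the list in one slot grows very long), lookups degrade to O(N).

The JDK 7 put method: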
    public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value); // null keys are supported, as this shows
        int hash = hash(key);
        int i = indexFor(hash, table.length);  // compute the bucket index for this key
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {  // walk every entry in bucket i
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value; // an entry with the same key exists: update its value
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
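        addEntry(hash, key, value, i);  // no matching key: add a new entry at index i
        return null;
    }

// addEntry implementation:
    void addEntry(int hash, K key, V value, int bucketIndex) {
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length); // grow when over the threshold and the target bucket is occupied
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }
      // then simply create an entry at bucketIndex (it becomes the head of that bucket's list)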
        createEntry(hash, key, value, bucketIndex);
    }

    void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }
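createEntry puts the new entry at the head of the bucket: the old head becomes the new entry's next. A small sketch of that head insertion follows; `HeadInsertDemo` and its `Entry` type are hypothetical and stripped down to a key and a next pointer.

```java
/** Sketch of head insertion into a bucket, in the style of createEntry. */
final class HeadInsertDemo {
    static final class Entry {
        final String key;
        final Entry next;
        Entry(String key, Entry next) { this.key = key; this.next = next; }
    }

    /** Returns the new head of the bucket after putting the key in front. */
    static Entry insertAtHead(Entry head, String key) {
        return new Entry(key, head);
    }

    /** Keys in traversal order, joined by "->". */
    static String describe(Entry head) {
        StringBuilder sb = new StringBuilder();
        for (Entry e = head; e != null; e = e.next) {
            if (sb.length() > 0) sb.append("->");
            sb.append(e.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Entry head = null;
        for (String k : new String[] {"a", "b", "c"}) head = insertAtHead(head, k);
        System.out.println(describe(head));
    }
}
```

The most recently inserted key is always found first. The JDK 8 putVal listing in this post appends at the tail instead (`p.next = newNode(...)`).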

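HashMap in JDK 8

To address this O(N) degradation, JDK 8 converts a bucket's linked list into a red-black tree once the list reaches a certain length (8 by default). If the table is still smaller than MIN_TREEIFY_CAPACITY, it resizes instead of treeifying. This is a major difference from JDK 1.7.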

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null); // the bucket is still empty: place the new node directly
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p; // the head node already holds an equal key
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
                // the bin has already become a tree: delegate to putTreeVal
            else {
            // otherwise the bin is still a linked list
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);  // the list has reached the threshold after this insert: convert the bin to a tree
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
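Both listings pick the bucket with a bit mask (`indexFor` in JDK 7, `(n - 1) & hash` here). This works as a modulo only because the table length is a power of two. A small sketch to check that, with `IndexMaskDemo` as a hypothetical helper class:

```java
final class IndexMaskDemo {
    /** Bucket index computed as (n - 1) & hash; assumes n is a power of two. */
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        for (int hash : new int[] {3, 19, 35, -1}) {
            // The mask and the non-negative remainder agree for power-of-two n.
            System.out.println(hash + " -> " + indexFor(hash, n)
                    + " (floorMod: " + Math.floorMod(hash, n) + ")");
        }
    }
}
```

The mask also copes with negative hash values without an extra sign check, which a plain `%` would not.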

// Converts all the list nodes in the bin at a given index into a red-black tree
    /**
     * Replaces all linked nodes in bin at index for given hash unless
     * table is too small, in which case resizes instead.
     */
    final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize();
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            TreeNode<K,V> hd = null, tl = null;
            do {
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            if ((tab[index] = hd) != null)
                hd.treeify(tab);
        }
    }

// TreeNode definition (excerpt):
    static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;  // clearly a red-black tree.
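To see the situation treeification is meant for, the sketch below forces every key into one bucket. `BadKey` is a hypothetical key class whose hashCode is deliberately constant. Lookups stay correct on any JDK; on JDK 8 and later such a bucket is expected to be treeified, and `BadKey` implements Comparable so that the tree can order keys with equal hashes.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingKeysDemo {
    /** Hypothetical key type: constant hashCode, so every key collides. */
    static final class BadKey implements Comparable<BadKey> {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    /** Fills a HashMap with n keys that all share one hash value. */
    static Map<BadKey, Integer> fill(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) map.put(new BadKey(i), i);
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = fill(10_000);
        long start = System.nanoTime();
        Integer v = map.get(new BadKey(9_999));
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println("value=" + v + ", lookup took about " + micros + " us");
    }
}
```

Timing numbers from a one-off run like this are only indicative; a proper comparison between JDK versions would need a benchmark harness.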

ConcurrentHashMap in JDK 1.7

In JDK 1.7, CHM uses the Segment as its unit of synchronization. One Segment guards many keys (it is effectively a small hash map of its own). Segment itself extends ReentrantLock.

    final Segment<K,V>[] segments;  // the segment array defined in CHM
    static final class Segment<K,V> extends ReentrantLock implements Serializable
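The segment idea is a form of lock striping. The sketch below is a simplified illustration under stated assumptions, not the JDK's code: `StripedMap` and `Stripe` are hypothetical names, each stripe wraps a plain HashMap, the stripe count is assumed to be a power of two, and reads take the lock too (the real JDK 1.7 get is lock-free).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Simplified lock-striping map: each stripe is a small HashMap guarded by its own lock. */
final class StripedMap<K, V> {
    // Like Segment, a stripe is itself the lock that guards its data.
    private static final class Stripe<K, V> extends ReentrantLock {
        final Map<K, V> map = new HashMap<>();
    }

    private final Stripe<K, V>[] stripes;

    @SuppressWarnings("unchecked")
    StripedMap(int stripeCount) {   // assumed to be a power of two
        stripes = (Stripe<K, V>[]) new Stripe[stripeCount];
        for (int i = 0; i < stripeCount; i++) stripes[i] = new Stripe<>();
    }

    private Stripe<K, V> stripeFor(Object key) {
        int h = key.hashCode();
        h ^= (h >>> 16);            // mix in the high bits; not the JDK's hash function
        return stripes[h & (stripes.length - 1)];
    }

    V put(K key, V value) {
        Stripe<K, V> s = stripeFor(key);
        s.lock();                   // only writers to the same stripe contend
        try {
            return s.map.put(key, value);
        } finally {
            s.unlock();
        }
    }

    V get(Object key) {
        Stripe<K, V> s = stripeFor(key);
        s.lock();                   // this sketch locks on reads; JDK 1.7 does not
        try {
            return s.map.get(key);
        } finally {
            s.unlock();
        }
    }
}
```

Two threads writing keys that fall into different stripes never block each other, which is the benefit the Segment design is after.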

On put:

    public V put(K key, V value) {
        Segment<K,V> s;
        if (value == null)
            throw new NullPointerException();
        int hash = hash(key);
        int j = (hash >>> segmentShift) & segmentMask;
        if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
             (segments, (j << SSHIFT) + SBASE)) == null) //  in ensureSegment
            s = ensureSegment(j);
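            // step 1: locate the segment
        return s.put(key, hash, value, false); // step 2: insert the entry into that segment
    }

// Segment.put implementation (excerpt)
HashEntry<K,V> node = tryLock() ? null :
                scanAndLockForPut(key, hash, value);  // take the segment lock: tryLock() first, otherwise retry and then block inside scanAndLockForPut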
            V oldValue;
            try {
            // put logic similar to HashMap's
                HashEntry<K,V>[] tab = table;
                int index = (tab.length - 1) & hash;
                HashEntry<K,V> first = entryAt(tab, index);
                for (HashEntry<K,V> e = first;;) {
                    if (e != null) {
                        K k;
                        if ((k = e.key) == key ||
                            (e.hash == hash && key.equals(k))) {
                            oldValue = e.value;
                            if (!onlyIfAbsent) {
                                e.value = value;
                                ++modCount;
                            }
                            break;
                        }
                        e = e.next;
                    }
                    else {
                        if (node != null)
                            node.setNext(first);
                        else
                            node = new HashEntry<K,V>(hash, key, value, first);
                        int c = count + 1;
                        if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                            rehash(node);
                        else
                            setEntryAt(tab, index, node);
                        ++modCount;
                        count = c;
                        oldValue = null;
                        break;
                    }
                }
            } finally {
            // finally, release the lock
                unlock();
            }

On get:

 Segment<K,V> s; // manually integrate access methods to reduce overhead
        HashEntry<K,V>[] tab;
        int h = hash(key);
        // volatile reads through Unsafe (getObjectVolatile) guarantee visibility here; no lock or CAS is involved
        long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
        if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
            (tab = s.table) != null) {
            for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
                     (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
                 e != null; e = e.next) {
                K k;
                if ((k = e.key) == key || (e.hash == h && key.equals(k)))
                    return e.value;
            }
        }
        return null;

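ConcurrentHashMap in JDK 8

JDK 8 made an important improvement: it narrowed the lock granularity, which raises concurrency. The structure changed as well.
The main structure of the new CHM is no longer the segments array but a Node array, just as in HashMap:

transient volatile Node<K,V>[] table;

There are also several special node types:
ForwardingNode is placed at the head of a bin during a resize to mark bins that have already been moved. ReservationNode is a placeholder node used by compute and computeIfAbsent, which also makes recursive calls on the same bin detectable. TreeBin sits at the head of a treeified bin in the table (it holds no real key or value). TreeNode is the tree node that holds the actual data. (Yes, the red-black tree from HashMap is kept in CHM as well.)
On put: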

    final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode()); // re-hash the hashCode so that poorly distributed user hashCodes collide less
        int binCount = 0;
        for (Node<K,V>[] tab = table;;) { // retry loop: repeats until the insert succeeds (it does not walk every bin)
            Node<K,V> f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();  // the table is initialized lazily, so check first
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                if (casTabAt(tab, i, null,
                             new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab, f); // the bin head is a ForwardingNode (resize in progress): help with the transfer, then retry
            else {
                V oldVal = null;
                synchronized (f) { // lock only f, the head node of this bin, so writers contend only when they hash to the same bin
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node<K,V> pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node<K,V>(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        else if (f instanceof TreeBin) {
                            Node<K,V> p;
                            binCount = 2;
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i); // bins are converted to red-black trees here as well
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
    }
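The two write paths in putVal (CAS into an empty bin, lock the head node otherwise) can be sketched in isolation. This is a simplified illustration under stated assumptions: `BinLockedMap` is a hypothetical class, the table has a fixed power-of-two capacity and never resizes, there are no tree bins or removals, and `AtomicReferenceArray` stands in for the Unsafe-based `tabAt`/`casTabAt` helpers.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Fixed-size sketch of per-bin locking: CAS into an empty bin, lock the head node otherwise. */
final class BinLockedMap<K, V> {
    private static final class Node<K, V> {
        final K key;
        volatile V val;
        volatile Node<K, V> next;
        Node(K key, V val) { this.key = key; this.val = val; }
    }

    private final AtomicReferenceArray<Node<K, V>> table;

    BinLockedMap(int capacity) {    // assumed to be a power of two; never resized
        table = new AtomicReferenceArray<>(capacity);
    }

    private int indexFor(Object key) {
        int h = key.hashCode();
        return (h ^ (h >>> 16)) & (table.length() - 1);
    }

    V put(K key, V value) {
        int i = indexFor(key);
        for (;;) {                                   // retry loop, as in putVal
            Node<K, V> head = table.get(i);
            if (head == null) {
                if (table.compareAndSet(i, null, new Node<>(key, value)))
                    return null;                     // empty bin: no lock needed
                continue;                            // lost the race: retry
            }
            synchronized (head) {                    // lock just this bin
                if (table.get(i) != head) continue;  // head changed before we locked
                for (Node<K, V> e = head;; e = e.next) {
                    if (e.key.equals(key)) {
                        V old = e.val;
                        e.val = value;
                        return old;
                    }
                    if (e.next == null) {
                        e.next = new Node<>(key, value);  // append at the tail
                        return null;
                    }
                }
            }
        }
    }

    V get(Object key) {             // lock-free read over volatile fields
        for (Node<K, V> e = table.get(indexFor(key)); e != null; e = e.next) {
            if (e.key.equals(key)) return e.val;
        }
        return null;
    }
}
```

Writers to different bins never touch the same monitor, which is the finer granularity compared with one lock per Segment.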

On get:

    public V get(Object key) {
        Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
        int h = spread(key.hashCode()); // compute the hash
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (e = tabAt(tab, (n - 1) & h)) != null) {
            if ((eh = e.hash) == h) {
                if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                    return e.val;
            }
            else if (eh < 0) // negative hash: a special node (ForwardingNode during a transfer, TreeBin, ...), so delegate to its find()
                return (p = e.find(h, key)) != null ? p.val : null;
            while ((e = e.next) != null) {
                if (e.hash == h &&
                    ((ek = e.key) == key || (ek != null && key.equals(ek))))
                    return e.val;
            }
        }
        return null;
    }
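The `eh < 0` branch works because `spread` always clears the sign bit, so a negative hash can only belong to a special node. The sketch below restates `spread` and the sentinel hash constants as I recall them from the JDK 8 source; `SpreadDemo` is a hypothetical wrapper class, and the values are worth checking against your own JDK.

```java
final class SpreadDemo {
    static final int HASH_BITS = 0x7fffffff; // usable bits of a normal node hash
    static final int MOVED     = -1;         // hash of a ForwardingNode
    static final int TREEBIN   = -2;         // hash of a TreeBin (root of a tree bin)
    static final int RESERVED  = -3;         // hash of a ReservationNode

    /** XORs the high 16 bits into the low 16 bits and forces the sign bit to 0. */
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    public static void main(String[] args) {
        for (int h : new int[] {0x12340000, -1, Integer.MIN_VALUE}) {
            System.out.println(Integer.toHexString(h) + " -> "
                    + Integer.toHexString(spread(h)));
        }
    }
}
```

Since ordinary nodes never carry a negative hash, get can tell a special bin head apart with a single comparison.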

scugxl


Java拾遺03: HashMap and ConcurrentHashMap across JDK versions
