How ConcurrentHashMap Works

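Many online collections of interview and written-test questions cover the differences between Hashtable and HashMap: Hashtable is thread-safe and does not allow null keys, whereas HashMap is not thread-safe and does allow a null key. The two also have different parents: Hashtable extends Dictionary, while HashMap extends AbstractMap (both implement Map). Because HashMap is not synchronized, code that needs a thread-safe HashMap must either add its own synchronization logic or use Collections.synchronizedMap(), which returns a synchronized wrapper around the map. That wrapper also pays the cost of synchronization, because every call locks the whole map. ConcurrentHashMap, introduced in JDK 1.5, offers a simple, safe, and lower-cost way to get a thread-safe hash map.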
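The null-key difference and the available thread-safe choices can be checked with a few lines of code. The following is a minimal sketch; the class name MapChoices and the helper acceptsNullKey are made up for this illustration and are not part of the JDK.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class: compares how the four map flavours treat a null key.
class MapChoices {
    /** Returns true if the map accepts a null key, false if put throws NullPointerException. */
    static boolean acceptsNullKey(Map<String, Integer> map) {
        try {
            map.put(null, 1);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> plain = new HashMap<String, Integer>();
        // Thread-safe wrapper: every method synchronizes on a single lock.
        Map<String, Integer> wrapped =
                Collections.synchronizedMap(new HashMap<String, Integer>());
        Map<String, Integer> legacy = new Hashtable<String, Integer>();
        Map<String, Integer> concurrent = new ConcurrentHashMap<String, Integer>();

        System.out.println("HashMap accepts null key:           " + acceptsNullKey(plain));
        System.out.println("synchronizedMap accepts null key:   " + acceptsNullKey(wrapped));
        System.out.println("Hashtable accepts null key:         " + acceptsNullKey(legacy));
        System.out.println("ConcurrentHashMap accepts null key: " + acceptsNullKey(concurrent));
    }
}
```

The wrapper inherits HashMap's tolerance of null keys, while Hashtable and ConcurrentHashMap both reject them.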

1. Overview of how ConcurrentHashMap synchronizes

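Both Hashtable and synchronizedMap synchronize by locking: an operation that needs the object first acquires its lock and releases it when the operation finishes. As the earlier Hashtable analysis showed, Hashtable's synchronized methods lock the entire hash table, so every operation locks the whole table. ConcurrentHashMap, in the segment-based implementation analyzed here (JDK 5 through 7), lets multiple modifications proceed concurrently. The key is lock striping (also described as lock separation or segment locking), which uses several locks, each guarding modifications to a different part of the hash table. Internally, ConcurrentHashMap represents these parts as segments (Segment); each segment is in effect a small hash table with its own lock. As long as modifications land on different segments, they can run concurrently. Because writers to different segments no longer block one another, throughput under contention is noticeably better than with a single table-wide lock.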
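The idea of lock striping can be sketched in a few dozen lines. The class below is a toy written for this article, not the JDK implementation: StripedMap, stripeFor, and the use of one HashMap per stripe are all assumptions made for illustration. Unlike ConcurrentHashMap, this sketch also takes the lock on reads, which keeps it simple and correct.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical toy: one lock plus one small HashMap per stripe.
class StripedMap<K, V> {
    private final ReentrantLock[] locks;
    private final Map<K, V>[] stripes;

    @SuppressWarnings("unchecked")
    StripedMap(int stripeCount) {
        locks = new ReentrantLock[stripeCount];
        stripes = (Map<K, V>[]) new Map[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            locks[i] = new ReentrantLock();
            stripes[i] = new HashMap<K, V>();
        }
    }

    /** Picks the stripe for a key; floorMod keeps the index non-negative. */
    int stripeFor(Object key) {
        return Math.floorMod(key.hashCode(), locks.length);
    }

    /** Locks only the stripe that owns the key, so other stripes stay available. */
    V put(K key, V value) {
        int i = stripeFor(key);
        locks[i].lock();
        try {
            return stripes[i].put(key, value);
        } finally {
            locks[i].unlock();
        }
    }

    /** Reads under the stripe lock (the real class avoids this lock on reads). */
    V get(Object key) {
        int i = stripeFor(key);
        locks[i].lock();
        try {
            return stripes[i].get(key);
        } finally {
            locks[i].unlock();
        }
    }
}
```

With 16 stripes, the Integer keys 1 and 2 fall into different stripes and can be updated in parallel, while keys 1 and 17 share a stripe and are serialized.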

2. Structure of ConcurrentHashMap

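As the figure shows, the whole ConcurrentHashMap can be divided into segments. Each segment can be regarded as a small Hashtable with its own lock, and each segment is further divided into entries. That is: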

 /**
     * The segments, each of which is a specialized hash table
     */
    final Segment<K,V>[] segments;

The design therefore involves three classes: ConcurrentHashMap, Segment, and HashEntry. HashEntry is defined as follows:

 static final class HashEntry<K,V> {
        final K key;
        final int hash;
        volatile V value;
        final HashEntry<K,V> next;

        HashEntry(K key, int hash, HashEntry<K,V> next, V value) {
            this.key = key;
            this.hash = hash;
            this.next = next;
            this.value = value;
        }

	@SuppressWarnings("unchecked")
	static final <K,V> HashEntry<K,V>[] newArray(int i) {
	    return new HashEntry[i];
	}
    }

Reads do not need a lock

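As the definition shows, every field except value is final, including next. This means a new entry can only be added at the head of a chain, never in the middle or at the tail. To make sure reads see the latest value, value is declared volatile, which avoids locking and makes reads more efficient.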

Locating a segment

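To speed up locating a segment and a hash slot within it, the number of segments and the number of hash slots in each segment are both powers of two (2^n), so both positions can be found with bit operations. With the default concurrency level of 16, which is also the number of segments, the high 4 bits of the hash select the segment, and the low bits of the hash (as many as the segment's table length requires) select the slot within that segment.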

    /**
     * Mask value for indexing into segments. The upper bits of a
     * key's hash code are used to choose the segment.
     */
    final int segmentMask;

    /**
     * Shift value for indexing within segments.
     */
    final int segmentShift;

The segmentFor(int hash) method

/**
     * Returns the segment that should be used for key with given hash
     * @param hash the hash code for the key
     * @return the segment
     */
    final Segment<K,V> segmentFor(int hash) {
        return segments[(hash >>> segmentShift) & segmentMask];
    }
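The arithmetic behind segmentShift and segmentMask can be worked through by hand. The sketch below assumes the shift and mask are derived the way the JDK 6 constructor does it, by rounding the concurrency level up to a power of two; SegmentIndexDemo and its method names are hypothetical. It also skips the extra bit-spreading step the real class applies to hashCode() before calling segmentFor.

```java
// Hypothetical demo class: reproduces the index arithmetic, not the JDK class itself.
class SegmentIndexDemo {
    /** Returns {segmentShift, segmentMask} for a given concurrency level. */
    static int[] shiftAndMask(int concurrencyLevel) {
        int sshift = 0; // number of bits needed to index the segments
        int ssize = 1;  // segment count, rounded up to a power of two
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
        return new int[] { 32 - sshift, ssize - 1 };
    }

    /** Same expression as segmentFor: the high bits choose the segment. */
    static int segmentIndex(int hash, int segmentShift, int segmentMask) {
        return (hash >>> segmentShift) & segmentMask;
    }

    /** Same expression as in put/remove: the low bits choose the slot. */
    static int slotIndex(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int[] sm = shiftAndMask(16);
        System.out.println("segmentShift = " + sm[0]); // 28
        System.out.println("segmentMask  = " + sm[1]); // 15

        int hash = 0xA0000003; // high 4 bits are 1010, low 4 bits are 0011
        System.out.println("segment = " + segmentIndex(hash, sm[0], sm[1])); // 10
        System.out.println("slot    = " + slotIndex(hash, 16));              // 3
    }
}
```

For 16 segments the shift is 28, so only the top 4 bits of the 32-bit hash survive the unsigned shift, and the mask 15 keeps exactly those 4 bits.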

Definition of Segment:

        /*
         * Segments maintain a table of entry lists that are ALWAYS
         * kept in a consistent state, so can be read without locking.
         * Next fields of nodes are immutable (final).  All list
         * additions are performed at the front of each bin. This
         * makes it easy to check changes, and also fast to traverse.
         * When nodes would otherwise be changed, new nodes are
         * created to replace them. This works well for hash tables
         * since the bin lists tend to be short. (The average length
         * is less than two for the default load factor threshold.)
         *
         * Read operations can thus proceed without locking, but rely
         * on selected uses of volatiles to ensure that completed
         * write operations performed by other threads are
         * noticed. For most purposes, the "count" field, tracking the
         * number of elements, serves as that volatile variable
         * ensuring visibility.  This is convenient because this field
         * needs to be read in many read operations anyway:
         *
         *   - All (unsynchronized) read operations must first read the
         *     "count" field, and should not look at table entries if
         *     it is 0.
         *
         *   - All (synchronized) write operations should write to
         *     the "count" field after structurally changing any bin.
         *     The operations must not take any action that could even
         *     momentarily cause a concurrent read operation to see
         *     inconsistent data. This is made easier by the nature of
         *     the read operations in Map. For example, no operation
         *     can reveal that the table has grown but the threshold
         *     has not yet been updated, so there are no atomicity
         *     requirements for this with respect to reads.
         *
         * As a guide, all critical volatile reads and writes to the
         * count field are marked in code comments.
         */

        private static final long serialVersionUID = 2249069246763182397L;

        /**
         * The number of elements in this segment's region.
         */
        transient volatile int count;

        /**
         * Number of updates that alter the size of the table. This is
         * used during bulk-read methods to make sure they see a
         * consistent snapshot: If modCounts change during a traversal
         * of segments computing size or checking containsValue, then
         * we might have an inconsistent view of state so (usually)
         * must retry.
         */
        transient int modCount;

        /**
         * The table is rehashed when its size exceeds this threshold.
         * (The value of this field is always <tt>(int)(capacity *
         * loadFactor)</tt>.)
         */
        transient int threshold;

        /**
         * The per-segment table.
         */
        transient volatile HashEntry<K,V>[] table;

        /**
         * The load factor for the hash table.  Even though this value
         * is same for all segments, it is replicated to avoid needing
         * links to outer object.
         * @serial
         */
        final float loadFactor;

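count tracks the number of entries in the segment. It is volatile and coordinates writes with reads so that reads see nearly the latest modifications. The coordination works like this: every write that changes the structure, such as adding or removing a node (changing a node's value does not count as a structural change), writes count at the end, and every read begins by reading count. This relies on the strengthened volatile semantics of Java 5: a write to a volatile variable happens-before every subsequent read of that same variable. modCount counts the structural changes to the segment; its main purpose is to detect whether a segment changed while several segments were being traversed, a point that belongs to the discussion of cross-segment operations. threshold is the limit beyond which the table is rehashed. The table array stores the segment's nodes; each array element is a hash chain of HashEntry nodes. table is also volatile, so reads see the latest table reference without synchronization.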
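The way a volatile write to count makes earlier plain writes visible can be shown in isolation. This is a minimal sketch under the Java 5 memory model, not code from the JDK: VolatilePiggybackDemo, payload, and runOnce are names invented here, and payload stands in for the contents of a bin.

```java
// Hypothetical demo class: a plain field published through a volatile guard.
class VolatilePiggybackDemo {
    private int payload;        // plain (non-volatile) data
    private volatile int count; // volatile guard, written last and read first

    /** Writer: change the data first, then write the volatile field. */
    void write(int value) {
        payload = value;
        count = 1; // write-volatile
    }

    /** Reader: read the volatile field first; returns -1 if nothing is published yet. */
    int read() {
        if (count != 0) { // read-volatile
            return payload; // guaranteed to see the value written before count
        }
        return -1;
    }

    /** Starts a reader thread that spins until the write becomes visible. */
    static int runOnce() {
        final VolatilePiggybackDemo demo = new VolatilePiggybackDemo();
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            int v;
            while ((v = demo.read()) == -1) {
                Thread.yield(); // keep polling the volatile guard
            }
            seen[0] = v;
        });
        reader.start();
        demo.write(42);
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // 42
    }
}
```

Once the reader observes count != 0, the happens-before edge from the volatile write guarantees it also observes payload == 42, even though payload itself is not volatile.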

Code for the remove operation

  /**
         * Remove; match on key only if value null, else match both.
         */
        V remove(Object key, int hash, Object value) {
            lock(); // acquire the segment lock
            try {
                int c = count - 1;
                HashEntry<K,V>[] tab = table; // read the volatile field once into a local
                int index = hash & (tab.length - 1); // index of the bin for this hash
                HashEntry<K,V> first = tab[index]; // head node of that bin
                HashEntry<K,V> e = first;
                while (e != null && (e.hash != hash || !key.equals(e.key)))
                    e = e.next; // walk to the node to remove

                V oldValue = null;
                if (e != null) {
                    V v = e.value;
                    if (value == null || value.equals(v)) { // found the entry to remove
                        oldValue = v;
                        // All entries following removed node can stay
                        // in list, but all preceding ones need to be
                        // cloned. Each clone is pushed onto the front of
                        // the new list, which starts at e.next, so the
                        // cloned prefix ends up in reverse order.
                        ++modCount;
                        HashEntry<K,V> newFirst = e.next;
                        for (HashEntry<K,V> p = first; p != e; p = p.next)
                            newFirst = new HashEntry<K,V>(p.key, p.hash,
                                                          newFirst, p.value);
                        tab[index] = newFirst;
                        count = c; // write-volatile
                    }
                }
                return oldValue;
            } finally {
                unlock();
            }
        }
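The cloning step is easier to follow on a small chain. The sketch below uses a stripped-down node with only a key and a final next pointer; RemoveByCloningDemo, Node, listOf, and render are names invented for this illustration, and locking is left out because the point is only how the new chain is built.

```java
// Hypothetical demo class: removal from a chain whose next pointers are final.
class RemoveByCloningDemo {
    static final class Node {
        final String key;
        final Node next; // immutable, as in HashEntry
        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    /** Builds a chain in the given order; the first key becomes the head. */
    static Node listOf(String... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--) {
            head = new Node(keys[i], head);
        }
        return head;
    }

    /** Returns the head of a new chain without target; the old chain is untouched. */
    static Node remove(Node first, String target) {
        Node e = first;
        while (e != null && !e.key.equals(target)) {
            e = e.next;
        }
        if (e == null) {
            return first; // nothing to remove
        }
        Node newFirst = e.next; // nodes after e are reused as-is
        for (Node p = first; p != e; p = p.next) {
            newFirst = new Node(p.key, newFirst); // clone and push to the front
        }
        return newFirst;
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append("->");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node original = listOf("A", "B", "C", "D");
        Node updated = remove(original, "C");
        System.out.println(render(original)); // A->B->C->D
        System.out.println(render(updated));  // B->A->D
    }
}
```

A reader that was already traversing the old chain keeps seeing A->B->C->D, which is why reads can proceed without a lock; the node D is shared by both chains, and the cloned prefix comes out reversed.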

Code for the put operation

  V put(K key, int hash, V value, boolean onlyIfAbsent) {
            lock();
            try {
                int c = count;
                if (c++ > threshold) // ensure capacity; rehash if over the threshold
                    rehash();
                HashEntry<K,V>[] tab = table;
                int index = hash & (tab.length - 1);
                HashEntry<K,V> first = tab[index];
                HashEntry<K,V> e = first;
                while (e != null && (e.hash != hash || !key.equals(e.key)))
                    e = e.next; // walk the chain looking for the key

                V oldValue;
                if (e != null) { // key already present: replace the value unless onlyIfAbsent
                    oldValue = e.value;
                    if (!onlyIfAbsent)
                        e.value = value;
                }
                else { // key not found: create a new entry whose next is first
                    oldValue = null;
                    ++modCount;
                    tab[index] = new HashEntry<K,V>(key, hash, first, value);
                    count = c; // write-volatile
                }
                return oldValue;
            } finally {
                unlock();
            }
        }
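From the caller's side, the onlyIfAbsent flag corresponds to the difference between the public put and putIfAbsent methods. The following usage sketch relies only on the documented ConcurrentHashMap API; the class name PutVersusPutIfAbsent is made up, and the mapping of the two public methods onto the flag is how the JDK 6 source appears to wire them, stated here as an assumption.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class: observable behaviour of put versus putIfAbsent.
class PutVersusPutIfAbsent {
    /** Returns the value observed at each step, in order. */
    static Integer[] run() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();
        Integer r1 = map.put("k", 1);         // null: key was absent, entry created
        Integer r2 = map.putIfAbsent("k", 2); // 1: key present, value left alone
        Integer r3 = map.get("k");            // 1
        Integer r4 = map.put("k", 3);         // 1: key present, value replaced
        Integer r5 = map.get("k");            // 3
        return new Integer[] { r1, r2, r3, r4, r5 };
    }

    public static void main(String[] args) {
        for (Integer r : run()) {
            System.out.println(r);
        }
    }
}
```

Both calls return the previous value, so a null result is how a caller tells that a new entry was created.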

The read operation

  V get(Object key, int hash) {
            if (count != 0) { // read-volatile
                HashEntry<K,V> e = getFirst(hash);//獲取頭節點
                while (e != null) {
                    if (e.hash == hash && key.equals(e.key)) {
                        V v = e.value;
                        if (v != null)
                            return v;
                        // null here means another thread is still publishing or changing this entry, so re-read it under the lock
                        return readValueUnderLock(e); // recheck
                    }
                    e = e.next;
                }
            }
            return null;
        }
