How ConcurrentHashMap Works
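Many collections of interview and written-test questions found online cover the differences between Hashtable and HashMap. Hashtable is thread-safe and does not allow null keys, whereas HashMap is not thread-safe and does allow a null key. Their parents also differ: Hashtable extends Dictionary, while HashMap extends AbstractMap (both implement Map). Because HashMap is not synchronized, using it from several threads means writing the synchronization logic yourself. Alternatively, Collections.synchronizedMap() returns a synchronized wrapper around a Map, but that approach also adds the cost of synchronization. JDK 1.5 introduced ConcurrentHashMap, which provides a simple, safe, and comparatively cheap thread-safe hash map.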
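The null-key differences mentioned above can be checked directly. The following is a minimal sketch; the class name MapChoicesDemo and the helper rejectsNullKey are made up for illustration and are not part of the JDK.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MapChoicesDemo {
    /** Hypothetical helper: true if the map rejects a null key with a NullPointerException. */
    static boolean rejectsNullKey(Map<String, Integer> map) {
        try {
            map.put(null, 1);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Plain HashMap: not thread-safe, accepts a null key.
        System.out.println("HashMap:           " + rejectsNullKey(new HashMap<String, Integer>()));
        // Synchronized wrapper: one lock for every call, null handling is that of the wrapped map.
        System.out.println("synchronizedMap:   "
                + rejectsNullKey(Collections.synchronizedMap(new HashMap<String, Integer>())));
        // Hashtable: every method is synchronized on the table, null keys are rejected.
        System.out.println("Hashtable:         " + rejectsNullKey(new Hashtable<String, Integer>()));
        // ConcurrentHashMap: thread-safe without a single global lock, null keys are rejected.
        System.out.println("ConcurrentHashMap: " + rejectsNullKey(new ConcurrentHashMap<String, Integer>()));
    }
}
```

Note that the synchronized wrapper keeps the null-key behaviour of the map it wraps; only the locking changes.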
1. Overview of ConcurrentHashMap's Synchronization
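Both Hashtable and synchronizedMap achieve synchronization with a lock: an operation that needs to access the object first acquires the lock and releases it when it finishes. As the earlier Hashtable analysis showed, Hashtable's synchronized methods lock the entire hash table, so every operation locks the whole table. ConcurrentHashMap, by contrast, allows several modifications to proceed concurrently. The key is lock striping (also called lock splitting or segment locking), which uses several locks, each guarding modifications to a different part of the hash table. Internally, ConcurrentHashMap represents these parts as segments (Segment); each segment is really a small hash table with its own lock. As long as modifications land in different segments, they can run concurrently. Because writers no longer all contend for one lock, throughput is noticeably better than with a single table-wide lock.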
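The idea of lock striping can be sketched independently of the JDK source. The class below is a hypothetical illustration (StripedCounterMap and all of its members are invented here); it is not how ConcurrentHashMap is implemented, it only shows several locks each guarding its own small table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

class StripedCounterMap {
    private final ReentrantLock[] locks;          // one lock per stripe
    private final Map<String, Integer>[] stripes; // one small table per stripe

    @SuppressWarnings("unchecked")
    StripedCounterMap(int stripeCount) {
        locks = new ReentrantLock[stripeCount];
        stripes = (Map<String, Integer>[]) new Map[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            locks[i] = new ReentrantLock();
            stripes[i] = new HashMap<String, Integer>();
        }
    }

    /** Picks the stripe for a key; keys in different stripes never share a lock. */
    int stripeFor(String key) {
        return (key.hashCode() & 0x7fffffff) % locks.length;
    }

    /** Increments the counter for key while holding only that key's stripe lock. */
    int increment(String key) {
        int i = stripeFor(key);
        locks[i].lock();
        try {
            Integer old = stripes[i].get(key);
            int next = (old == null) ? 1 : old + 1;
            stripes[i].put(key, next);
            return next;
        } finally {
            locks[i].unlock();
        }
    }

    /** Reads the counter under the stripe lock (this sketch does not attempt lock-free reads). */
    int get(String key) {
        int i = stripeFor(key);
        locks[i].lock();
        try {
            Integer v = stripes[i].get(key);
            return (v == null) ? 0 : v;
        } finally {
            locks[i].unlock();
        }
    }

    /** Runs several writers; each updates a shared key and a key of its own. */
    static int concurrentIncrements(int threads, int perThread) {
        StripedCounterMap map = new StripedCounterMap(16);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final String own = "own-" + t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.increment("shared");
                    map.increment(own);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.get("shared");
    }

    public static void main(String[] args) {
        System.out.println(concurrentIncrements(4, 1000));
    }
}
```

Updates to the per-thread keys only contend when two keys happen to hash to the same stripe, while updates to the shared key are serialized by a single stripe lock.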
2. Structure of ConcurrentHashMap
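As the diagram shows, the whole ConcurrentHashMap is divided into segments. Each segment can be viewed as a small Hashtable with its own lock, and each segment is further divided into entries. That is: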
    /**
     * The segments, each of which is a specialized hash table
     */
    final Segment<K,V>[] segments;
The design involves three classes: ConcurrentHashMap, Segment, and HashEntry. HashEntry is defined as follows:
    static final class HashEntry<K,V> {
        final K key;
        final int hash;
        volatile V value;
        final HashEntry<K,V> next;

        HashEntry(K key, int hash, HashEntry<K,V> next, V value) {
            this.key = key;
            this.hash = hash;
            this.next = next;
            this.value = value;
        }

        @SuppressWarnings("unchecked")
        static final <K,V> HashEntry<K,V>[] newArray(int i) {
            return new HashEntry[i];
        }
    }
Reads do not require locking
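Every field except value is final, including next. This means a new entry can only be added at the head of a chain, never in the middle or at the tail. To make sure reads see the latest value, value is declared volatile, which avoids locking and makes reads faster.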
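Why a final next field forces insertion at the head can be seen in a small sketch. HeadInsertDemo and its Node class are simplified stand-ins invented for this example, not the real HashEntry.

```java
class HeadInsertDemo {
    /** Simplified stand-in for HashEntry: only the value can change after construction. */
    static final class Node {
        final String key;
        volatile String value;
        final Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    /** The only way to add: build a new node that points at the old head. */
    static Node prepend(Node head, String key, String value) {
        return new Node(key, value, head);
    }

    /** Joins the keys of a chain in traversal order. */
    static String keys(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(',');
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node head = prepend(null, "a", "1");
        Node snapshot = head;             // what a reader that started earlier is holding
        head = prepend(head, "b", "2");   // a writer adds at the front
        System.out.println(keys(head));     // b,a
        System.out.println(keys(snapshot)); // a
    }
}
```

A reader that picked up the old head keeps traversing an intact chain, which is why readers never observe a half-linked list.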
Locating a segment
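To speed up locating a segment and a hash slot inside it, both the number of segments and the number of slots in each segment are powers of two (2^n), so both positions can be computed with bit operations. With the default concurrency level of 16, which is also the number of segments, the top 4 bits of the hash select the segment, and the low-order bits select the slot within the segment (how many low bits are used depends on the length of that segment's table).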
    /**
     * Mask value for indexing into segments. The upper bits of a
     * key's hash code are used to choose the segment.
     */
    final int segmentMask;

    /**
     * Shift value for indexing within segments.
     */
    final int segmentShift;
The segmentFor(int hash) method
    /**
     * Returns the segment that should be used for key with given hash
     * @param hash the hash code for the key
     * @return the segment
     */
    final Segment<K,V> segmentFor(int hash) {
        return segments[(hash >>> segmentShift) & segmentMask];
    }
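The shift-and-mask arithmetic can be tried out on its own. The sketch below assumes the shift and mask are derived from the concurrency level by rounding it up to a power of two, which matches the description above. SegmentIndexDemo and its method names are invented for illustration, and the supplemental hash spreading that the real class applies to hashCode() before this step is left out.

```java
class SegmentIndexDemo {
    /** Number of bits needed to index the segments (concurrency level rounded up to 2^n). */
    static int segmentBits(int concurrencyLevel) {
        int bits = 0;
        int size = 1;
        while (size < concurrencyLevel) {
            ++bits;
            size <<= 1;
        }
        return bits;
    }

    /** Shift that moves the top segment bits of a 32-bit hash down to the low end. */
    static int segmentShift(int concurrencyLevel) {
        return 32 - segmentBits(concurrencyLevel);
    }

    /** Mask that keeps exactly the segment bits. */
    static int segmentMask(int concurrencyLevel) {
        return (1 << segmentBits(concurrencyLevel)) - 1;
    }

    /** Same expression as segmentFor: the high bits choose the segment. */
    static int segmentIndex(int hash, int concurrencyLevel) {
        return (hash >>> segmentShift(concurrencyLevel)) & segmentMask(concurrencyLevel);
    }

    /** Inside a segment the low bits choose the slot; tableLength must be a power of two. */
    static int slotIndex(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int hash = 0xA0000005;
        System.out.println(segmentShift(16));          // 28
        System.out.println(segmentMask(16));           // 15
        System.out.println(segmentIndex(hash, 16));    // 10, the top four bits 1010
        System.out.println(slotIndex(hash, 8));        // 5, the low three bits 101
    }
}
```

Because the segment index comes from the high bits and the slot index from the low bits, the two choices draw on different parts of the hash.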
Definition of a segment:
    /*
     * Segments maintain a table of entry lists that are ALWAYS
     * kept in a consistent state, so can be read without locking.
     * Next fields of nodes are immutable (final).  All list
     * additions are performed at the front of each bin. This
     * makes it easy to check changes, and also fast to traverse.
     * When nodes would otherwise be changed, new nodes are
     * created to replace them. This works well for hash tables
     * since the bin lists tend to be short. (The average length
     * is less than two for the default load factor threshold.)
     *
     * Read operations can thus proceed without locking, but rely
     * on selected uses of volatiles to ensure that completed
     * write operations performed by other threads are
     * noticed. For most purposes, the "count" field, tracking the
     * number of elements, serves as that volatile variable
     * ensuring visibility.  This is convenient because this field
     * needs to be read in many read operations anyway:
     *
     *   - All (unsynchronized) read operations must first read the
     *     "count" field, and should not look at table entries if
     *     it is 0.
     *
     *   - All (synchronized) write operations should write to
     *     the "count" field after structurally changing any bin.
     *     The operations must not take any action that could even
     *     momentarily cause a concurrent read operation to see
     *     inconsistent data. This is made easier by the nature of
     *     the read operations in Map. For example, no operation
     *     can reveal that the table has grown but the threshold
     *     has not yet been updated, so there are no atomicity
     *     requirements for this with respect to reads.
     *
     * As a guide, all critical volatile reads and writes to the
     * count field are marked in code comments.
     */

    private static final long serialVersionUID = 2249069246763182397L;

    /**
     * The number of elements in this segment's region.
     */
    transient volatile int count;

    /**
     * Number of updates that alter the size of the table. This is
     * used during bulk-read methods to make sure they see a
     * consistent snapshot: If modCounts change during a traversal
     * of segments computing size or checking containsValue, then
     * we might have an inconsistent view of state so (usually)
     * must retry.
     */
    transient int modCount;

    /**
     * The table is rehashed when its size exceeds this threshold.
     * (The value of this field is always <tt>(int)(capacity *
     * loadFactor)</tt>.)
     */
    transient int threshold;

    /**
     * The per-segment table.
     */
    transient volatile HashEntry<K,V>[] table;

    /**
     * The load factor for the hash table.  Even though this value
     * is same for all segments, it is replicated to avoid needing
     * links to outer object.
     * @serial
     */
    final float loadFactor;
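count tracks the number of entries in the segment. It is volatile and coordinates writes with reads so that a read sees nearly the latest modifications. The coordination works like this: every write that changes the structure, such as adding or removing a node (changing a node's value does not count as a structural change), writes count, and every read begins by reading count. This relies on the strengthened volatile semantics introduced in Java 5, under which a write to a volatile variable happens-before a later read of the same variable. modCount counts structural changes to the segment; its main purpose is to detect whether a segment changed while several segments are being traversed, which is covered in more detail in the discussion of cross-segment operations. threshold is the limit beyond which a rehash is needed. The table array stores the segment's nodes; each element is a hash chain made of HashEntry nodes. table is also volatile, so the latest table reference can be read without synchronization.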
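The "write count last, read count first" protocol can be sketched with a much smaller structure. VolatileCountDemo and MiniSegment below are hypothetical names; the array stands in for the segment's table and synchronized stands in for the segment lock.

```java
class VolatileCountDemo {
    /** Hypothetical stand-in for a segment: plain storage published through a volatile counter. */
    static final class MiniSegment {
        private final int[] data;     // plain (non-volatile) storage
        private volatile int count;   // written last by writers, read first by readers

        MiniSegment(int capacity) {
            data = new int[capacity];
        }

        /** Writers are serialized, as they would be under the segment lock. */
        synchronized void add(int v) {
            int c = count;
            data[c] = v;      // structural change first
            count = c + 1;    // write-volatile: publishes everything written before it
        }

        /** Lock-free read: the volatile read of count comes before any access to data. */
        int sum() {
            int c = count;    // read-volatile
            int s = 0;
            for (int i = 0; i < c; i++) {
                s += data[i];
            }
            return s;
        }
    }

    /** Fills a segment with 1..n from another thread, then sums it without locking. */
    static int fillAndSum(int n) {
        MiniSegment seg = new MiniSegment(n);
        Thread writer = new Thread(() -> {
            for (int i = 1; i <= n; i++) {
                seg.add(i);
            }
        });
        writer.start();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seg.sum();
    }

    public static void main(String[] args) {
        System.out.println(fillAndSum(100)); // 5050
    }
}
```

In this demo the join() also orders the two threads; the volatile counter is what would protect a reader that runs while the writer is still active, since it only looks at slots that were filled before the count it observed was written.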
Code for the remove operation
    /**
     * Remove; match on key only if value null, else match both.
     */
    V remove(Object key, int hash, Object value) {
        lock(); // acquire the segment lock
        try {
            int c = count - 1;
            HashEntry<K,V>[] tab = table;           // read the volatile field once into a local
            int index = hash & (tab.length - 1);    // index of the bucket
            HashEntry<K,V> first = tab[index];      // head of the chain
            HashEntry<K,V> e = first;
            while (e != null && (e.hash != hash || !key.equals(e.key)))
                e = e.next;                         // find the node to remove

            V oldValue = null;
            if (e != null) {
                V v = e.value;
                if (value == null || value.equals(v)) { // the value matches as well
                    oldValue = v;
                    // All entries following removed node can stay
                    // in list, but all preceding ones need to be
                    // cloned.
                    // Every entry before the removed one is cloned: the first
                    // clone points to the removed entry's next, the second clone
                    // points to the first clone, the third to the second, and so
                    // on, so the clone of the removed entry's predecessor becomes
                    // the new head of the chain.
                    ++modCount;
                    HashEntry<K,V> newFirst = e.next;
                    for (HashEntry<K,V> p = first; p != e; p = p.next)
                        newFirst = new HashEntry<K,V>(p.key, p.hash,
                                                      newFirst, p.value);
                    tab[index] = newFirst;
                    count = c; // write-volatile
                }
            }
            return oldValue;
        } finally {
            unlock();
        }
    }
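The cloning loop in remove is easier to follow on a concrete chain. The sketch below reuses the same loop on a simplified node type; RemoveCloneDemo, Node, chain, and removeKey are names invented for this example.

```java
class RemoveCloneDemo {
    /** Simplified stand-in for HashEntry with an immutable next pointer. */
    static final class Node {
        final String key;
        final Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    /** Builds a chain whose traversal order matches the argument order. */
    static Node chain(String... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--) {
            head = new Node(keys[i], head);
        }
        return head;
    }

    /** Returns the new head after removing key; the original chain is left untouched. */
    static Node removeKey(Node first, String key) {
        Node e = first;
        while (e != null && !key.equals(e.key)) {
            e = e.next;
        }
        if (e == null) {
            return first;                 // key not present, nothing changes
        }
        Node newFirst = e.next;           // the tail after the removed node is shared as-is
        for (Node p = first; p != e; p = p.next) {
            newFirst = new Node(p.key, newFirst); // clone each predecessor in front
        }
        return newFirst;
    }

    /** Joins the keys of a chain in traversal order. */
    static String keys(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(',');
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node original = chain("a", "b", "c", "d");
        Node updated = removeKey(original, "c");
        System.out.println(keys(updated));  // b,a,d
        System.out.println(keys(original)); // a,b,c,d
    }
}
```

The cloned prefix comes out in reverse order, the tail after the removed node is shared between the old and new chains, and a reader still walking the old chain is unaffected.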
Code for the put operation
    V put(K key, int hash, V value, boolean onlyIfAbsent) {
        lock();
        try {
            int c = count;
            if (c++ > threshold) // ensure capacity: rehash if over the threshold
                rehash();
            HashEntry<K,V>[] tab = table;
            int index = hash & (tab.length - 1);
            HashEntry<K,V> first = tab[index];
            HashEntry<K,V> e = first;
            while (e != null && (e.hash != hash || !key.equals(e.key)))
                e = e.next; // walk the chain

            V oldValue;
            if (e != null) { // same key found: replace the value in place
                oldValue = e.value;
                if (!onlyIfAbsent)
                    e.value = value;
            }
            else { // not found: create a new entry whose next is first
                oldValue = null;
                ++modCount;
                tab[index] = new HashEntry<K,V>(key, hash, first, value);
                count = c; // write-volatile
            }
            return oldValue;
        } finally {
            unlock();
        }
    }
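The onlyIfAbsent flag corresponds to the difference between put and putIfAbsent on the public API. A short usage sketch (PutVariantsDemo is a made-up class name) shows the return values and the resulting contents:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class PutVariantsDemo {
    /** Runs a few puts and records each return value, followed by the final contents. */
    static String run() {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();
        StringBuilder log = new StringBuilder();
        log.append(map.put("a", 1)).append(' ');          // no previous value
        log.append(map.put("a", 2)).append(' ');          // replaces 1 and returns it
        log.append(map.putIfAbsent("a", 3)).append(' ');  // key present: value stays 2
        log.append(map.putIfAbsent("b", 4)).append(' ');  // key absent: 4 is stored
        log.append(map.get("a")).append(' ').append(map.get("b"));
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // null 1 2 null 2 4
    }
}
```

Both calls return the previous value, so a null return from putIfAbsent tells the caller that its value was the one stored.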
The read operation
    V get(Object key, int hash) {
        if (count != 0) { // read-volatile
            HashEntry<K,V> e = getFirst(hash); // head of the chain
            while (e != null) {
                if (e.hash == hash && key.equals(e.key)) {
                    V v = e.value;
                    if (v != null)
                        return v;
                    // null means another operation is changing the value or
                    // the table structure, so re-read under the lock
                    return readValueUnderLock(e); // recheck
                }
                e = e.next;
            }
        }
        return null;
    }
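The null check in get works as a signal only because ConcurrentHashMap never stores a null value. That property is easy to confirm through the public API; NullValueDemo and its helper methods are invented for this sketch.

```java
import java.util.concurrent.ConcurrentHashMap;

class NullValueDemo {
    /** True if put(key, null) is rejected with a NullPointerException. */
    static boolean rejectsNullValue() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<String, String>();
        try {
            map.put("k", null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    /** True if get returns null for a key that was never stored. */
    static boolean missingKeyReadsNull() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<String, String>();
        return map.get("absent") == null;
    }

    public static void main(String[] args) {
        System.out.println(rejectsNullValue());    // true
        System.out.println(missingKeyReadsNull()); // true
    }
}
```

Since a stored value can never be null, a null result from get always means the key has no mapping, and a null seen on an entry inside get can only be a transient state worth rechecking under the lock.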