ConcurrentHashMap in JDK 1.8: Structure, Principles, and Source Code Analysis

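Note: this article is based on JDK 1.8 and was compiled from online sources and several books. Apologies for any overlap with existing material; corrections of any mistakes are welcome.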

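Data structure

ConcurrentHashMap in JDK 1.8 abandons the Segment-based lock striping of earlier versions. It guarantees thread safety with Node + CAS + synchronized, and stores its data in a table array plus linked lists plus red-black trees. The first node of each array slot (bin) serves as the lock: CAS handles insertion into an empty bin, and synchronized on the head node guards updates to a non-empty bin. Locking is therefore per array element (Node) rather than per segment, which further reduces the chance of contention between threads.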

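Notes: 1. When a bin's list reaches 8 nodes (the default TREEIFY_THRESHOLD) and the table has at least 64 slots (MIN_TREEIFY_CAPACITY), JDK 1.8 converts the list to a red-black tree, which lowers lookup cost in that bin from O(N) to O(log N). With a smaller table, the map resizes instead.
2. The new field transient volatile CounterCell[] counterCells makes it cheap to count the elements in the map; this performs far better than size() in JDK 1.7.
3. Unsafe.getObjectVolatile reads the value at a given memory location with volatile semantics, so each read sees the most recent write.
4. In JDK 1.8, concurrencyLevel is kept only for compatibility with earlier versions; the constructor merely uses it to ensure that the initial capacity is >= concurrencyLevel.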

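Important fields

baseCount: the base element counter. When an element is inserted or removed, addCount() updates it with CAS; under contention the change is recorded in counterCells instead, so baseCount alone is not always the full count.

sizeCtl: a control field for table initialization and resizing.
A negative value means the table is being initialized or resized.
-1 means initialization is in progress.
-N is described in the source comment as N-1 threads actively resizing. In the actual JDK 1.8 implementation the high 16 bits hold a resize stamp and the low 16 bits hold the number of resizing threads plus one, so the value should not be read literally as -N.
0 or a positive value while the table is still null is the initial table size to use (0 means the default). Once the table has been initialized, sizeCtl holds the element count that triggers the next resize, much like a resize threshold: 0.75 times the current capacity, which matches the load factor.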
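The 0.75 relationship between capacity and the post-initialization sizeCtl can be checked with a little arithmetic. The following is a minimal sketch, not the JDK's code: SizeCtlDemo and thresholdFor are hypothetical names, and the expression n - (n >>> 2) is the integer form that the JDK 1.8 source uses in initTable() to get 0.75n without floating point.

```java
final class SizeCtlDemo {
    /** Returns 0.75 * capacity using integer shifts; capacity is assumed to be a power of two. */
    static int thresholdFor(int capacity) {
        return capacity - (capacity >>> 2);
    }

    public static void main(String[] args) {
        // Print the resize threshold for a few table sizes.
        for (int capacity = 16; capacity <= 128; capacity <<= 1) {
            System.out.println(capacity + " -> " + thresholdFor(capacity));
        }
    }
}
```

For the default capacity of 16 this gives 12, so the 13th element is the one that pushes the count past the threshold.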

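Important classes

Node: the core inner class. It wraps a key-value pair, and every entry inserted into ConcurrentHashMap is stored in one. Its definition is close to the one in HashMap, with some differences: the val and next fields are declared volatile (volatile gives visibility, it is not a lock; HashEntry in the JDK 7 version did the same), setValue is not supported so a Node's value cannot be changed through it, and a find method is added to support map.get().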

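TreeNode: the tree node class, the other core data structure. When a bin's list grows too long, its nodes are converted to TreeNodes. Unlike HashMap, the bin does not hold the red-black tree directly: the nodes are wrapped as TreeNodes and placed in a TreeBin object, and TreeBin manages the red-black tree. In ConcurrentHashMap, TreeNode extends Node, whereas in HashMap it extends LinkedHashMap.Entry<K,V>. As a Node subclass, TreeNode carries a next pointer, which makes traversal through TreeBin easier.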

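TreeBin: this class does not hold a user key or value itself; it wraps a set of TreeNodes. It stands in for the root of the tree: the slot in the ConcurrentHashMap table holds a TreeBin object, not a TreeNode, which is another difference from HashMap. TreeBin also maintains a simple read-write lock. When a TreeBin is constructed, its hash is set to the constant TREEBIN (-2), which serves only as a marker.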

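ForwardingNode: a node class that links two tables. It holds a nextTable reference to the new table; its key, value, and next are all null and its hash is -1 (MOVED). Its find method searches nextTable rather than a list headed by itself. A ForwardingNode matters only while the table is being resized: it is placed in the old table as a placeholder indicating that the bin was empty or has already been moved.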
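To make the description of Node concrete, here is a simplified stand-in. It is a sketch under assumptions, not the JDK source: the class name SketchNode is hypothetical, and only the traits named above are modeled (volatile val and next, an unsupported setValue, and a find method that walks the chain).

```java
import java.util.Map;

/** Simplified stand-in for ConcurrentHashMap.Node; hypothetical, not the JDK source. */
final class SketchNode<K, V> implements Map.Entry<K, V> {
    final int hash;
    final K key;
    volatile V val;                 // volatile: readers see the latest value without locking
    volatile SketchNode<K, V> next; // volatile: readers see newly appended nodes

    SketchNode(int hash, K key, V val, SketchNode<K, V> next) {
        this.hash = hash;
        this.key = key;
        this.val = val;
        this.next = next;
    }

    @Override public K getKey() { return key; }
    @Override public V getValue() { return val; }

    /** Direct modification through the entry is rejected. */
    @Override public V setValue(V value) { throw new UnsupportedOperationException(); }

    /** Walks the chain starting at this node and returns the matching node, or null. */
    SketchNode<K, V> find(int h, Object k) {
        if (k != null) {
            for (SketchNode<K, V> e = this; e != null; e = e.next) {
                K ek;
                if (e.hash == h && ((ek = e.key) == k || (ek != null && k.equals(ek))))
                    return e;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SketchNode<String, Integer> tail = new SketchNode<>(2, "b", 20, null);
        SketchNode<String, Integer> head = new SketchNode<>(1, "a", 10, tail);
        System.out.println(head.find(2, "b").getValue());
    }
}
```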

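Core methods

tabAt: returns the Node at index i, using a volatile read.

casTabAt: sets the Node at index i with CAS. It is safe under concurrency because the caller states what it believes the current value is: CAS compares the value in memory with that expected value and applies the update only if they match. A mismatch means the calling thread's view is stale and the write could overwrite another thread's change, so the update is rejected. This is similar to how SVN refuses a commit made against an out-of-date revision.

setTabAt: sets the Node at index i with a volatile write.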

/* ---------------- Table element access -------------- */

/*
 * Volatile access methods are used for table elements as well as
 * elements of in-progress next table while resizing.  All uses of
 * the tab arguments must be null checked by callers.  All callers
 * also paranoically precheck that tab's length is not zero (or an
 * equivalent check), thus ensuring that any index argument taking
 * the form of a hash value anded with (length - 1) is a valid
 * index.  Note that, to be correct wrt arbitrary concurrency
 * errors by users, these checks must operate on local variables,
 * which accounts for some odd-looking inline assignments below.
 * Note that calls to setTabAt always occur within locked regions,
 * and so in principle require only release ordering, not
 * full volatile semantics, but are currently coded as volatile
 * writes to be conservative.
 */

@SuppressWarnings("unchecked")
static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
	return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
}

static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i, Node<K,V> c, Node<K,V> v) {
	return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
}

static final <K,V> void setTabAt(Node<K,V>[] tab, int i, Node<K,V> v) {
	U.putObjectVolatile(tab, ((long)i << ASHIFT) + ABASE, v);
}
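The three accessors above depend on sun.misc.Unsafe, which application code should not use. As a rough analogue only, the same read, compare-and-set, and write semantics are available through the public AtomicReferenceArray class. The sketch below assumes that substitution; TableAccessDemo is a hypothetical name, and this is not how ConcurrentHashMap itself is written.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Public-API analogue of tabAt / casTabAt / setTabAt; hypothetical demo class. */
final class TableAccessDemo {
    private final AtomicReferenceArray<String> tab;

    TableAccessDemo(int n) {
        tab = new AtomicReferenceArray<>(n);
    }

    /** Volatile read of slot i. */
    String tabAt(int i) {
        return tab.get(i);
    }

    /** Replaces slot i with update only if it currently holds expected (compared by reference). */
    boolean casTabAt(int i, String expected, String update) {
        return tab.compareAndSet(i, expected, update);
    }

    /** Volatile write of slot i. */
    void setTabAt(int i, String v) {
        tab.set(i, v);
    }

    public static void main(String[] args) {
        TableAccessDemo t = new TableAccessDemo(4);
        System.out.println(t.casTabAt(0, null, "first"));  // slot was empty, so this succeeds
        System.out.println(t.casTabAt(0, null, "second")); // slot is no longer null, so this fails
        System.out.println(t.tabAt(0));
    }
}
```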

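The put operation

spread() rehashes the key's hashCode, and the result determines the key's index in the table.

If the table has not been created yet, initTable() initializes it first. If the bin at that index is empty, the new node is inserted with CAS and no lock is taken;
If the bin is not empty and its head node is not being moved (hash != MOVED), the head node is locked with synchronized. If it is being moved, the thread helps with the transfer (helpTransfer) and then retries.
If the head node's hash is >= 0, the bin is a linked list of nodes that map to the same index. The list is traversed, and either the matching node is updated or a new node is appended at the tail.
If the head node is a TreeBin, the bin is a red-black tree, and the tree insertion method putTreeVal inserts the new node;
If binCount is not 0, the put took effect in this bin. If the list length has reached 8, treeifyBin converts it to a red-black tree (or resizes the table when the table is still small). If oldVal is not null, this was an update that did not change the element count, so the old value is returned directly;

If a new node was inserted, addCount() is called to update the element count (baseCount).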
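The first step can be tried in isolation. The sketch below reproduces the spread computation as it appears in the JDK 1.8 source, (h ^ (h >>> 16)) & HASH_BITS; the surrounding SpreadDemo class and the indexFor helper are hypothetical names added for illustration. It shows why the high bits are folded in: two hash codes that differ only above bit 15 would otherwise land in the same slot of a small table.

```java
final class SpreadDemo {
    static final int HASH_BITS = 0x7fffffff; // keeps the result non-negative

    /** XORs the high 16 bits into the low 16 bits and clears the sign bit. */
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    /** Index of a hash in a table whose length is a power of two. */
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int h1 = 0x10000, h2 = 0x20000; // differ only in the upper bits
        // Without spreading, both collide in slot 0 of a 16-slot table.
        System.out.println(indexFor(h1, 16) + " " + indexFor(h2, 16));
        // After spreading, the upper bits influence the index.
        System.out.println(indexFor(spread(h1), 16) + " " + indexFor(spread(h2), 16));
    }
}
```

Clearing the sign bit matters because negative hash values are reserved for special nodes such as ForwardingNode and TreeBin.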

/**
 * Maps the specified key to the specified value in this table.
 * Neither the key nor the value can be null.
 *
 * <p>The value can be retrieved by calling the {@code get} method
 * with a key that is equal to the original key.
 *
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key
 * @return the previous value associated with {@code key}, or
 *         {@code null} if there was no mapping for {@code key}
 * @throws NullPointerException if the specified key or value is null
 */
public V put(K key, V value) {
	return putVal(key, value, false);
}

/** Implementation for put and putIfAbsent */
final V putVal(K key, V value, boolean onlyIfAbsent) {
	if (key == null || value == null) throw new NullPointerException();
	int hash = spread(key.hashCode());
	int binCount = 0;
	for (Node<K,V>[] tab = table;;) {
		Node<K,V> f; int n, i, fh;
		if (tab == null || (n = tab.length) == 0)
			tab = initTable();
		else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
			if (casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null)))
				break;                   // no lock when adding to empty bin
		}
		else if ((fh = f.hash) == MOVED)
			tab = helpTransfer(tab, f);
		else {
			V oldVal = null;
			synchronized (f) {
				if (tabAt(tab, i) == f) {
					if (fh >= 0) {
						binCount = 1;
						for (Node<K,V> e = f;; ++binCount) {
							K ek;
							if (e.hash == hash &&
								((ek = e.key) == key || (ek != null && key.equals(ek)))) {
								oldVal = e.val;
								if (!onlyIfAbsent)
									e.val = value;
								break;
							}
							Node<K,V> pred = e;
							if ((e = e.next) == null) {
								pred.next = new Node<K,V>(hash, key, value, null);
								break;
							}
						}
					}
					else if (f instanceof TreeBin) {
						Node<K,V> p;
						binCount = 2;
						if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) != null) {
							oldVal = p.val;
							if (!onlyIfAbsent)
								p.val = value;
						}
					}
				}
			}
			if (binCount != 0) {
				if (binCount >= TREEIFY_THRESHOLD)
					treeifyBin(tab, i);
				if (oldVal != null)
					return oldVal;
				break;
			}
		}
	}
	addCount(1L, binCount);
	return null;
}
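The locking pattern in putVal can be isolated in a much smaller class. The following is a sketch under heavy assumptions, not a substitute for ConcurrentHashMap: MiniBinMap is a hypothetical name, the table is a fixed 16 slots, there is no resizing, no tree bins, and no element counting, and AtomicReferenceArray stands in for the Unsafe accessors. What it keeps is the core idea: CAS into an empty bin, and synchronized on the head node for a non-empty bin, with a recheck that the head is unchanged.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Fixed-size sketch of the CAS + synchronized put pattern; hypothetical, no resizing or trees. */
final class MiniBinMap<K, V> {
    static final class Node<K, V> {
        final int hash;
        final K key;
        volatile V val;
        volatile Node<K, V> next;
        Node(int hash, K key, V val) { this.hash = hash; this.key = key; this.val = val; }
    }

    private final AtomicReferenceArray<Node<K, V>> tab = new AtomicReferenceArray<>(16);

    private static int spread(int h) { return (h ^ (h >>> 16)) & 0x7fffffff; }

    /** Returns the previous value for key, or null if the key was absent. */
    V put(K key, V value) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int i = (tab.length() - 1) & hash;
        for (;;) {
            Node<K, V> f = tab.get(i);
            if (f == null) {
                // Empty bin: publish the node with CAS, no lock needed.
                if (tab.compareAndSet(i, null, new Node<>(hash, key, value)))
                    return null;
                // Another thread won the race; loop and take the locked path.
            } else {
                synchronized (f) {
                    if (tab.get(i) == f) { // recheck that f is still the head
                        for (Node<K, V> e = f;; ) {
                            if (e.hash == hash && key.equals(e.key)) {
                                V old = e.val;
                                e.val = value;
                                return old;
                            }
                            Node<K, V> nx = e.next;
                            if (nx == null) {
                                e.next = new Node<>(hash, key, value); // append at the tail
                                return null;
                            }
                            e = nx;
                        }
                    }
                }
            }
        }
    }

    /** Lock-free read: relies on the volatile val and next fields. */
    V get(Object key) {
        int hash = spread(key.hashCode());
        for (Node<K, V> e = tab.get((tab.length() - 1) & hash); e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) return e.val;
        }
        return null;
    }

    public static void main(String[] args) {
        MiniBinMap<String, Integer> m = new MiniBinMap<>();
        m.put("a", 1);
        System.out.println(m.put("a", 2) + " " + m.get("a"));
    }
}
```

In this sketch the head never changes once set, so the recheck always passes; in the real class it guards against the bin having been moved or replaced between the read and the lock.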

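The get operation

1. Compute the hash and locate the array index from it: (n - 1) & h

2. Search according to the kind of node found at that index:

If the slot is null, return null.

If the head node's hash matches and its key equals the requested key, return its value.

If the head node's hash is negative, the bin is either being moved during a resize (ForwardingNode) or is a red-black tree (TreeBin); call find on the node and return the value it locates.

Otherwise the bin is a linked list; traverse it and return the value of the matching node, or null if there is none.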

/**
 * Returns the value to which the specified key is mapped,
 * or {@code null} if this map contains no mapping for the key.
 *
 * <p>More formally, if this map contains a mapping from a key
 * {@code k} to a value {@code v} such that {@code key.equals(k)},
 * then this method returns {@code v}; otherwise it returns
 * {@code null}.  (There can be at most one such mapping.)
 *
 * @throws NullPointerException if the specified key is null
 */
public V get(Object key) {
	Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
	int h = spread(key.hashCode());
	if ((tab = table) != null && (n = tab.length) > 0 &&
		(e = tabAt(tab, (n - 1) & h)) != null) {
		if ((eh = e.hash) == h) {
			if ((ek = e.key) == key || (ek != null && key.equals(ek)))
				return e.val;
		}
		else if (eh < 0)
			return (p = e.find(h, key)) != null ? p.val : null;
		while ((e = e.next) != null) {
			if (e.hash == h &&
				((ek = e.key) == key || (ek != null && key.equals(ek))))
				return e.val;
		}
	}
	return null;
}
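The branch on a negative hash is easier to follow with a toy model. The sketch below is a single-threaded illustration under stated assumptions: GetDispatchDemo, its Node, and Forward are hypothetical names, plain arrays replace the volatile table access, and Forward only imitates the idea of ForwardingNode (negative hash, lookup delegated to the next table). It mirrors the three cases of get: head match, special node, list traversal.

```java
final class GetDispatchDemo {
    static class Node {
        final int hash;
        final String key;
        final String val;
        Node next;

        Node(int hash, String key, String val, Node next) {
            this.hash = hash; this.key = key; this.val = val; this.next = next;
        }

        /** Default lookup: walk the list starting at this node. */
        Node find(int h, Object k) {
            for (Node e = this; e != null; e = e.next) {
                if (e.hash == h && k.equals(e.key)) return e;
            }
            return null;
        }
    }

    /** Hypothetical stand-in for ForwardingNode: hash is -1 and find searches the new table. */
    static final class Forward extends Node {
        final Node[] nextTable;

        Forward(Node[] nextTable) {
            super(-1, null, null, null);
            this.nextTable = nextTable;
        }

        @Override
        Node find(int h, Object k) {
            Node e = nextTable[(nextTable.length - 1) & h];
            return e == null ? null : e.find(h, k);
        }
    }

    static int spread(int h) { return (h ^ (h >>> 16)) & 0x7fffffff; }

    static String get(Node[] tab, Object key) {
        int h = spread(key.hashCode());
        Node e = tab[(tab.length - 1) & h];
        if (e == null) return null;                   // empty slot
        if (e.hash == h) {
            if (key.equals(e.key)) return e.val;      // head node matches
        } else if (e.hash < 0) {
            Node p = e.find(h, key);                  // special node decides how to search
            return p != null ? p.val : null;
        }
        while ((e = e.next) != null) {                // ordinary list traversal
            if (e.hash == h && key.equals(e.key)) return e.val;
        }
        return null;
    }

    public static void main(String[] args) {
        int h = spread("k".hashCode());
        Node[] newTab = new Node[4];
        newTab[(newTab.length - 1) & h] = new Node(h, "k", "v", null);
        Node[] oldTab = new Node[2];
        oldTab[(oldTab.length - 1) & h] = new Forward(newTab); // bin already moved
        System.out.println(get(oldTab, "k"));
    }
}
```

A reader that arrives at the old table during a resize is redirected to the new one and never has to wait for the transfer to finish.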

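The size() operation

counterCells is null at first. Under heavy concurrency, when two threads try to CAS baseCount at the same time, the thread that fails continues into the body of addCount() and records its change in a CounterCell;
If the counterCells array is null, fullAddCount() initializes it and records the change. The cellsBusy field is set with CAS, and only the thread that succeeds may initialize the CounterCell array;
If the CAS on cellsBusy fails, the thread tries again to CAS baseCount. If that succeeds it leaves the loop; otherwise it keeps looping and trying to add to a CounterCell.

In JDK 1.8 the element count is kept in baseCount, with part of the changes recorded in the CounterCell array. Summing baseCount and all CounterCell values gives the total number of elements. Because the sum is not taken atomically, the result is an estimate while updates are in flight.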

/**
 * {@inheritDoc}
 */
public int size() {
	long n = sumCount();
	return ((n < 0L) ? 0 :
			(n > (long)Integer.MAX_VALUE) ? Integer.MAX_VALUE :
			(int)n);
}

final long sumCount() {
	CounterCell[] as = counterCells; CounterCell a;
	long sum = baseCount;
	if (as != null) {
		for (int i = 0; i < as.length; ++i) {
			if ((a = as[i]) != null)
				sum += a.value;
		}
	}
	return sum;
}
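The base-plus-cells idea can be sketched with public atomics. This is an illustration under assumptions, not the JDK's implementation: StripedCounterSketch is a hypothetical name, the cell array is a fixed AtomicLongArray of 8 slots chosen at random per call, and there is no lazy creation, growth, or padding against false sharing as in the real CounterCell handling.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

/** Base counter plus fallback cells; hypothetical simplification of baseCount + counterCells. */
final class StripedCounterSketch {
    private final AtomicLong base = new AtomicLong();
    private final AtomicLongArray cells = new AtomicLongArray(8);

    void add(long x) {
        long b = base.get();
        if (!base.compareAndSet(b, b + x)) {
            // CAS on the base lost a race: record the change in a cell instead of retrying.
            int idx = ThreadLocalRandom.current().nextInt(cells.length());
            cells.addAndGet(idx, x);
        }
    }

    /** Sums the base and every cell; exact only when no add is in flight. */
    long sum() {
        long s = base.get();
        for (int i = 0; i < cells.length(); i++) s += cells.get(i);
        return s;
    }

    /** Runs the given number of threads, each adding 1 addsPerThread times, and returns the sum. */
    static long demo(int threads, int addsPerThread) {
        StripedCounterSketch c = new StripedCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < addsPerThread; i++) c.add(1);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c.sum();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 100_000));
    }
}
```

Every add lands in exactly one place, either the base or a cell, so the sum taken after all threads have finished equals the number of adds.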

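The CAS algorithm

CAS takes three operands, CAS(V, E, N): the memory location V, the expected value E, and the new value N. It compares E with the value currently stored at V (compare); if they are equal it writes N to V (swap), otherwise it fails and changes nothing. When several threads use CAS on the same variable at the same time, only one of them succeeds; the others fail without being blocked, are told that they failed, and may retry.
CAS maps to an atomic machine instruction at the hardware level, so it is usually cheaper than achieving thread safety by acquiring a lock, especially when contention is low.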
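A retry loop built on CAS can be written with the public AtomicInteger class. The sketch below is illustrative: CasRetryDemo and its method names are hypothetical, while compareAndSet is the standard AtomicInteger method. A thread that fails is not blocked; it rereads the value and tries again.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CasRetryDemo {
    /** Increments v with a CAS retry loop and returns the new value. */
    static int incrementWithCas(AtomicInteger v) {
        for (;;) {
            int expected = v.get();       // E: what this thread believes is stored
            int next = expected + 1;      // N: the value it wants to store
            if (v.compareAndSet(expected, next))
                return next;              // succeeded: nobody changed v in between
            // failed: another thread updated v first, so reread and retry
        }
    }

    /** Runs several threads that each increment a shared counter and returns the final value. */
    static int demo(int threads, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) incrementWithCas(counter);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        AtomicInteger v = new AtomicInteger(5);
        System.out.println(v.compareAndSet(5, 6)); // expected value matches, update applied
        System.out.println(v.compareAndSet(5, 7)); // stale expected value, update rejected
        System.out.println(demo(4, 1000));
    }
}
```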
