JDK 1.8: Understanding ConcurrentHashMap Line by Line (2)

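Introduction:

The previous post covered the background knowledge needed for ConcurrentHashMap in JDK 1.8, as a basic introduction. ConcurrentHashMap changed so much in JDK 1.8 that learning this new material is necessary. This is the second post in the series. Its goal is to get to know ConcurrentHashMap itself: I will introduce its key member variables and some key classes. You can read it alongside the HashMap material from my earlier posts, since much of it is very similar. The topics in the posts I have collected so far come up very frequently in interviews. You can find them here: http://blog.csdn.net/u012403290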

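How thread safety is achieved:
We all know that thread safety is the core property of ConcurrentHashMap, so how does it achieve it? In JDK 1.8 it relies mainly on CAS, together with synchronized on the head node of a bin (covered further down). The previous post already introduced CAS as a lock-free operation, so I will not repeat that here. On top of Unsafe, the class defines three table-access methods: a volatile read (tabAt), a CAS (casTabAt), and a volatile write (setTabAt). Atomic, visible operations of this kind are what make thread safety possible. Below I have copied the source that shows which member variables are accessed through CAS, followed by the methods that make table operations atomic: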

  // Unsafe mechanics: which member variables CAS makes atomic

    private static final sun.misc.Unsafe U;
    private static final long LOCKSTATE;
      static {
            try {
                U = sun.misc.Unsafe.getUnsafe();
                Class<?> k = TreeBin.class; // operates on TreeBin; this class is introduced later
                LOCKSTATE = U.objectFieldOffset
                    (k.getDeclaredField("lockState"));
            } catch (Exception e) {
                throw new Error(e);
            }
        }
--------------------------------------------------------------------------------------
    private static final sun.misc.Unsafe U;
    private static final long SIZECTL;
    private static final long TRANSFERINDEX;
    private static final long BASECOUNT;
    private static final long CELLSBUSY;
    private static final long CELLVALUE;
    private static final long ABASE;
    private static final int ASHIFT;

    static {
        try {
        // the fields below are introduced later in this post
            U = sun.misc.Unsafe.getUnsafe();
            Class<?> k = ConcurrentHashMap.class;
            SIZECTL = U.objectFieldOffset
                (k.getDeclaredField("sizeCtl"));
            TRANSFERINDEX = U.objectFieldOffset
                (k.getDeclaredField("transferIndex"));
            BASECOUNT = U.objectFieldOffset
                (k.getDeclaredField("baseCount"));
            CELLSBUSY = U.objectFieldOffset
                (k.getDeclaredField("cellsBusy"));
            Class<?> ck = CounterCell.class;
            CELLVALUE = U.objectFieldOffset
                (ck.getDeclaredField("value"));
            Class<?> ak = Node[].class;
            ABASE = U.arrayBaseOffset(ak);
            int scale = U.arrayIndexScale(ak);
            if ((scale & (scale - 1)) != 0)
                throw new Error("data type scale not a power of two");
            ASHIFT = 31 - Integer.numberOfLeadingZeros(scale);
        } catch (Exception e) {
            throw new Error(e);
        }
    }
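
To make the ABASE / ASHIFT arithmetic in the static block concrete, here is a small sketch of the same calculation. The base offset of 16 and the scale of 4 are assumed values for illustration only (the real ones come from Unsafe and depend on the JVM), and OffsetDemo, shiftFor, and elementOffset are names I made up.

```java
// Sketch of the offset arithmetic used by tabAt / casTabAt / setTabAt.
// The base offset and scale used in main are assumed values, not real JVM values.
class OffsetDemo {
    // Mirrors: ASHIFT = 31 - Integer.numberOfLeadingZeros(scale)
    static int shiftFor(int scale) {
        if ((scale & (scale - 1)) != 0)
            throw new IllegalArgumentException("scale must be a power of two");
        return 31 - Integer.numberOfLeadingZeros(scale);
    }

    // Mirrors: ((long)i << ASHIFT) + ABASE
    static long elementOffset(long base, int scale, int i) {
        return ((long) i << shiftFor(scale)) + base;
    }

    public static void main(String[] args) {
        System.out.println(shiftFor(4));               // log2 of the element size
        System.out.println(elementOffset(16L, 4, 3));  // 3 * 4 + 16
    }
}
```

Because the scale is a power of two, shifting the index left by ASHIFT is the same as multiplying it by the element size, which is why the static block rejects any other scale.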




// The three atomic access methods:

    /* ---------------- Table element access -------------- */

    /*
     * Volatile access methods are used for table elements as well as
     * elements of in-progress next table while resizing.  All uses of
     * the tab arguments must be null checked by callers.  All callers
     * also paranoically precheck that tab's length is not zero (or an
     * equivalent check), thus ensuring that any index argument taking
     * the form of a hash value anded with (length - 1) is a valid
     * index.  Note that, to be correct wrt arbitrary concurrency
     * errors by users, these checks must operate on local variables,
     * which accounts for some odd-looking inline assignments below.
     * Note that calls to setTabAt always occur within locked regions,
     * and so in principle require only release ordering, not
     * full volatile semantics, but are currently coded as volatile
     * writes to be conservative.
     */

    @SuppressWarnings("unchecked")
    static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
        return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
    }

    static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                        Node<K,V> c, Node<K,V> v) {
        return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
    }

    static final <K,V> void setTabAt(Node<K,V>[] tab, int i, Node<K,V> v) {
        U.putObjectVolatile(tab, ((long)i << ASHIFT) + ABASE, v);
    }
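
sun.misc.Unsafe is not meant for application code, so to experiment with the same three operations I would reach for java.util.concurrent.atomic.AtomicReferenceArray, which offers a volatile read, a CAS, and a volatile write on array slots. This is only an analogy to tabAt / casTabAt / setTabAt, not how ConcurrentHashMap is written; TableAccessDemo and installIfEmpty are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class TableAccessDemo {
    // Try to install a value into an empty slot.
    // Returns false if the slot was already occupied (for example, another thread won the race).
    static boolean installIfEmpty(AtomicReferenceArray<String> tab, int i, String v) {
        return tab.compareAndSet(i, null, v);   // plays the role of casTabAt(tab, i, null, v)
    }

    public static void main(String[] args) {
        AtomicReferenceArray<String> tab = new AtomicReferenceArray<>(16);
        System.out.println(installIfEmpty(tab, 3, "first"));   // slot was empty
        System.out.println(installIfEmpty(tab, 3, "second"));  // slot already taken
        System.out.println(tab.get(3));                        // volatile read, like tabAt
        tab.set(3, "replaced");                                // volatile write, like setTabAt
        System.out.println(tab.get(3));
    }
}
```

The point of the CAS is that a thread inserting into an empty bin does not need a lock: if the comparison fails, someone else got there first and the thread simply retries.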

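The fields and methods above provide much of the thread safety. One more piece is a JDK 1.8 optimization: earlier versions of ConcurrentHashMap locked a Segment, but Segment was removed in JDK 1.8 and the lock is now taken on the head Node of a single bin (note that synchronized locks the head node, as the source below shows). This reduces lock granularity, which lowers contention and improves performance. Here is how it appears in the source: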

// This fragment locks the head node during resizing; many other places do the same and are not listed here.
               synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        Node<K,V> ln, hn;
                        if (fh >= 0) {
                            int runBit = fh & n;
                            Node<K,V> lastRun = f;
                            for (Node<K,V> p = f.next; p != null; p = p.next) {
                                int b = p.hash & n;
                                if (b != runBit) {
                                    runBit = b;
                                    lastRun = p;
                                }
                            }
                            if (runBit == 0) {
                                ln = lastRun;
                                hn = null;
                            }
                            else {
                                hn = lastRun;
                                ln = null;
                            }
                            for (Node<K,V> p = f; p != lastRun; p = p.next) {
                                int ph = p.hash; K pk = p.key; V pv = p.val;
                                if ((ph & n) == 0)
                                    ln = new Node<K,V>(ph, pk, pv, ln);
                                else
                                    hn = new Node<K,V>(ph, pk, pv, hn);
                            }
                            setTabAt(nextTab, i, ln);
                            setTabAt(nextTab, i + n, hn);
                            setTabAt(tab, i, fwd);
                            advance = true;
                        }
                        else if (f instanceof TreeBin) {
                        .....
                   }
                }
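
The core of the locked block above is the test `hash & n`: when the table doubles from n to 2n, a node either stays at index i or moves to index i + n. Here is a minimal sketch of just that split. It leaves out the lastRun reuse, the node copying, and the locking from the real code, and SplitDemo and split are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class SplitDemo {
    // Returns two lists: index 0 holds the hashes that stay at bin i,
    // index 1 holds the hashes that move to bin i + n.
    static List<List<Integer>> split(int[] hashes, int n) {
        List<Integer> low = new ArrayList<>();
        List<Integer> high = new ArrayList<>();
        for (int h : hashes) {
            if ((h & n) == 0) low.add(h); else high.add(h);
        }
        List<List<Integer>> out = new ArrayList<>();
        out.add(low);
        out.add(high);
        return out;
    }

    public static void main(String[] args) {
        int n = 16;                        // old table length
        int[] hashes = {1, 17, 33, 49};    // all of these land in bin 1 when n = 16
        List<List<Integer>> parts = split(hashes, n);
        System.out.println("stay at bin 1:  " + parts.get(0));
        System.out.println("move to bin 17: " + parts.get(1));
    }
}
```

Only one extra bit of the hash decides the destination, so a bin never scatters over more than two bins of the new table. That is what lets a thread finish a whole bin while holding a single lock.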

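How data is stored:
Having seen how ConcurrentHashMap achieves thread safety, we should at least also know how it stores its data. The layout is an array of bins ("buckets"), where each bin holds a linked list of nodes or, for long bins, a red-black tree.

Some readers will ask: isn't this just the storage structure of HashMap? Segment was removed in JDK 1.8, so the structure really is extremely similar to HashMap. Thread safety is added on top of the HashMap design, and the head node of each bin is what gets locked during updates.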
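
Seen from the outside, the per-bin locking means several threads can update the map at once without losing updates. The sketch below uses only the public API (merge, which the ConcurrentHashMap documentation describes as atomic); ConcurrentCountDemo and countConcurrently are hypothetical names, and the thread and iteration counts are arbitrary.

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCountDemo {
    // Starts `threads` threads; each one increments the counter under "hits" `perThread` times.
    static int countConcurrently(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++)
                    map.merge("hits", 1, Integer::sum);  // atomic read-modify-write on one key
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 10_000));  // expected to equal 4 * 10_000
    }
}
```

Doing the same with a plain HashMap and a get-then-put sequence would be a data race and could lose increments.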

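Important member variables:

1. capacity: the capacity, i.e. the current storage size of the map. The source defines a default and a maximum. The default is used when no capacity is specified; the maximum means that once the capacity reaches this value, the table can no longer be resized.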

    /**
     * The largest possible table capacity.  This value must be
     * exactly 1<<30 to stay within Java array allocation and indexing
     * bounds for power of two table sizes, and is further required
     * because the top two bits of 32bit hash fields are used for
     * control purposes.
     */
    private static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The default initial table capacity.  Must be a power of 2
     * (i.e., at least 1) and at most MAXIMUM_CAPACITY.
     */
    private static final int DEFAULT_CAPACITY = 16;
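
Both Javadoc comments above insist that the table length is a power of two. Below is a sketch of the usual bit-smearing way to round a requested size up to a power of two, capped at MAXIMUM_CAPACITY. It shows the technique only; I am not covering here how the constructors derive the size they pass in, and CapacityDemo is a hypothetical name.

```java
class CapacityDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two >= c, capped at MAXIMUM_CAPACITY.
    // Subtracting 1 first keeps exact powers of two unchanged.
    static int tableSizeFor(int c) {
        int n = c - 1;
        n |= n >>> 1;    // copy the highest set bit into every lower position...
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;   // ...so n is now of the form 0b0111...1
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(16));             // already a power of two
        System.out.println(tableSizeFor(17));             // rounded up to the next one
        System.out.println(tableSizeFor((1 << 30) + 1));  // capped at the maximum
    }
}
```

A power-of-two length is what makes `hash & (length - 1)` a valid index, as the comment on the table access methods points out.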

2. loadFactor: the load factor. This is the same as in HashMap, and the default is also 0.75f. If it is unclear, see my earlier post on HashMap.

    /**
     * The load factor for this table. Overrides of this value in
     * constructors affect only the initial table capacity.  The
     * actual floating point value isn't normally used -- it is
     * simpler to use expressions such as {@code n - (n >>> 2)} for
     * the associated resizing threshold.
     */
    private static final float LOAD_FACTOR = 0.75f;
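
The Javadoc mentions that the float is rarely used and that `n - (n >>> 2)` is simpler. A quick sketch to check that the two agree for power-of-two table lengths (ThresholdDemo and threshold are hypothetical names):

```java
class ThresholdDemo {
    // Resize threshold for a power-of-two table length n: n minus a quarter of n.
    static int threshold(int n) {
        return n - (n >>> 2);
    }

    public static void main(String[] args) {
        for (int n = 16; n <= 128; n <<= 1)
            System.out.println(n + " -> " + threshold(n)
                    + " (0.75f * n = " + (int) (0.75f * n) + ")");
    }
}
```

For n = 16 this gives 12, the same as 0.75 × 16, without any floating-point arithmetic.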

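3. TREEIFY_THRESHOLD and UNTREEIFY_THRESHOLD: for background only. These two control the conversion between linked lists and red-black trees. The former means a list bin is converted to a red-black tree once it holds at least this many nodes; the latter means a tree bin that ends up with this many nodes or fewer when it is split during a resize is converted back to a list. I explained in detail why lists are converted to red-black trees in the HashMap post.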

    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2, and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     */
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     */
    static final int UNTREEIFY_THRESHOLD = 6;

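4. The next three parameters are for background only. They are used mainly during resizing and when helping a resize (when a thread enters put and finds that the map is being resized, it helps with the resize). The next post touches on them briefly.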

    /**
     * The number of bits used for generation stamp in sizeCtl.
     * Must be at least 6 for 32bit arrays.
     */
    private static int RESIZE_STAMP_BITS = 16;

    /**
     * The maximum number of threads that can help resize.
     * Must fit in 32 - RESIZE_STAMP_BITS bits.
     */
    private static final int MAX_RESIZERS = (1 << (32 - RESIZE_STAMP_BITS)) - 1;

    /**
     * The bit shift for recording size stamp in sizeCtl.
     */
    private static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;
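
To show how these three constants fit together, here is a sketch based on my reading of the JDK 8 source: a stamp is derived from the table length, shifted into the high bits of sizeCtl, and the low bits count the resizing threads. Treat it as an illustration and check it against your own JDK; ResizeStampDemo and sizeCtlForFirstResizer are names I made up.

```java
class ResizeStampDemo {
    static final int RESIZE_STAMP_BITS = 16;
    static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

    // Stamp for resizing a table of length n:
    // the leading-zero count of n, with the top bit of the 16-bit stamp set.
    static int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }

    // sizeCtl when the first thread starts a resize:
    // stamp in the high 16 bits, 2 (that is, 1 + one resizer) in the low 16 bits.
    static int sizeCtlForFirstResizer(int n) {
        return (resizeStamp(n) << RESIZE_STAMP_SHIFT) + 2;
    }

    public static void main(String[] args) {
        int sc = sizeCtlForFirstResizer(16);
        System.out.println(resizeStamp(16));            // stamp for n = 16
        System.out.println(sc < 0);                     // the stamp's top bit makes sizeCtl negative
        System.out.println(sc >>> RESIZE_STAMP_SHIFT);  // stamp recovered from the high bits
        System.out.println((sc & 0xFFFF) - 1);          // number of active resizers
    }
}
```

Because the stamp depends on the table length, a helper thread can tell whether the resize it wants to join is still the one for the table it saw.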

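5. The next two fields matter more: they are special hash values that tell a thread what kind of bin it is looking at. MOVED means the node is a forwarding node, i.e. a thread has already processed this bin during a resize. TREEBIN means the node is the head of a tree bin.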

    static final int MOVED     = -1; // hash for forwarding nodes
    static final int TREEBIN   = -2; // hash for roots of trees

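6. sizeCtl, the size-control field. This parameter is very important and shows up in every phase of ConcurrentHashMap. Different values indicate different states and serve different purposes:
① A negative value means the table is being initialized or resized; -1 means initialization.
② Otherwise, according to the field's Javadoc, a negative value is -(1 + the number of active resizing threads), so -N corresponds to N-1 threads resizing (as mentioned above, a thread that adds a value and finds a resize in progress helps with it). In the JDK 8 code the high 16 bits additionally carry a resize stamp, so the thread count sits in the low bits.
③ Zero or a positive value while the table has not been initialized is the initial table size to use (0 means the default). After initialization it holds the element count at which the next resize happens, much like a resize threshold: 0.75 times the current capacity, matching the load factor. When the number of elements reaches sizeCtl, the table is resized.

Note: in that last case the value plays the role of threshold in HashMap and is used to control resizing.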

Several essential inner classes:
To understand the internals of ConcurrentHashMap you need to know its related inner classes.

1. Node

    /**
     * Key-value entry.  This class is never exported out as a
     * user-mutable Map.Entry (i.e., one supporting setValue; see
     * MapEntry below), but can be used for read-only traversals used
     * in bulk tasks.  Subclasses of Node with a negative hash field
     * are special, and contain null keys and values (but are never
     * exported).  Otherwise, keys and vals are never null.
     */
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        volatile V val;  // declared volatile
        volatile Node<K,V> next; // declared volatile

        Node(int hash, K key, V val, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.val = val;
            this.next = next;
        }

        public final K getKey()       { return key; }
        public final V getValue()     { return val; }
        public final int hashCode()   { return key.hashCode() ^ val.hashCode(); }
        public final String toString(){ return key + "=" + val; }
        public final V setValue(V value) {
            throw new UnsupportedOperationException();  // the value cannot be set directly
        }

        public final boolean equals(Object o) {
            Object k, v, u; Map.Entry<?,?> e;
            return ((o instanceof Map.Entry) &&
                    (k = (e = (Map.Entry<?,?>)o).getKey()) != null &&
                    (v = e.getValue()) != null &&
                    (k == key || k.equals(key)) &&
                    (v == (u = val) || v.equals(u)));
        }

        /**
         * Virtualized support for map.get(); overridden in subclasses.
         */
        Node<K,V> find(int h, Object k) {
            Node<K,V> e = this;
            if (k != null) {
                do {
                    K ek;
                    if (e.hash == h &&
                        ((ek = e.key) == k || (ek != null && k.equals(ek))))
                        return e;
                } while ((e = e.next) != null);
            }
            return null;
        }
    }

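As the Node source above shows, its val and next fields are declared volatile. volatile was covered in the previous post: it gives val and next visibility and ordering guarantees, so readers can traverse a bin without locking and still see the latest writes. If you read the code carefully you will also notice that setValue() throws an exception, so the value cannot be set directly through that method. Node also provides a find method, which is used to look up a node.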
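
The Node Javadoc above also says that the keys and values of ordinary nodes are never null. Node itself is not visible from outside, but the rule shows up in the public API: putting a null key or a null value throws NullPointerException. A small sketch (NullRejectionDemo and throwsNpe are hypothetical names):

```java
import java.util.concurrent.ConcurrentHashMap;

class NullRejectionDemo {
    // Returns true if running the action throws NullPointerException.
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        System.out.println(throwsNpe(() -> map.put(null, "v")));  // null key is rejected
        System.out.println(throwsNpe(() -> map.put("k", null)));  // null value is rejected
        System.out.println(map.isEmpty());                        // nothing was stored
    }
}
```

This differs from HashMap, which accepts one null key and null values.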

2. TreeNode and TreeBin

 /**
     * Nodes for use in TreeBins
     */
    static final class TreeNode<K,V> extends Node<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;

        TreeNode(int hash, K key, V val, Node<K,V> next,
                 TreeNode<K,V> parent) {
            super(hash, key, val, next);
            this.parent = parent;
        }

        Node<K,V> find(int h, Object k) {
            return findTreeNode(h, k, null);
        }

        /**
         * Returns the TreeNode (or null if not found) for the given key
         * starting at given root.
         */
        final TreeNode<K,V> findTreeNode(int h, Object k, Class<?> kc) {
            if (k != null) {
                TreeNode<K,V> p = this;
                do  {
                    int ph, dir; K pk; TreeNode<K,V> q;
                    TreeNode<K,V> pl = p.left, pr = p.right;
                    if ((ph = p.hash) > h)
                        p = pl;
                    else if (ph < h)
                        p = pr;
                    else if ((pk = p.key) == k || (pk != null && k.equals(pk)))
                        return p;
                    else if (pl == null)
                        p = pr;
                    else if (pr == null)
                        p = pl;
                    else if ((kc != null ||
                              (kc = comparableClassFor(k)) != null) &&
                             (dir = compareComparables(kc, k, pk)) != 0)
                        p = (dir < 0) ? pl : pr;
                    else if ((q = pr.findTreeNode(h, k, kc)) != null)
                        return q;
                    else
                        p = pl;
                } while (p != null);
            }
            return null;
        }
    }


// TreeBin is too long to quote in full, so here is only its constructor:

 TreeBin(TreeNode<K,V> b) {
            super(TREEBIN, null, null, null);
            this.first = b;
            TreeNode<K,V> r = null;
            for (TreeNode<K,V> x = b, next; x != null; x = next) {
                next = (TreeNode<K,V>)x.next;
                x.left = x.right = null;
                if (r == null) {
                    x.parent = null;
                    x.red = false;
                    r = x;
                }
                else {
                    K k = x.key;
                    int h = x.hash;
                    Class<?> kc = null;
                    for (TreeNode<K,V> p = r;;) {
                        int dir, ph;
                        K pk = p.key;
                        if ((ph = p.hash) > h)
                            dir = -1;
                        else if (ph < h)
                            dir = 1;
                        else if ((kc == null &&
                                  (kc = comparableClassFor(k)) == null) ||
                                 (dir = compareComparables(kc, k, pk)) == 0)
                            dir = tieBreakOrder(k, pk);
                            TreeNode<K,V> xp = p;
                        if ((p = (dir <= 0) ? p.left : p.right) == null) {
                            x.parent = xp;
                            if (dir <= 0)
                                xp.left = x;
                            else
                                xp.right = x;
                            r = balanceInsertion(r, x);
                            break;
                        }
                    }
                }
            }
            this.root = r;
            assert checkInvariants(root);
        }

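As the source above shows, ConcurrentHashMap does not store TreeNode objects directly in the table; it wraps them in a TreeBin. In other words, a bin of a real ConcurrentHashMap holds a TreeBin object, not a TreeNode. TreeNode extends Node so that it carries the next pointer, which lets TreeBin walk from one TreeNode to the next. This is also one of the larger differences from HashMap.

3. ForwardingNode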

    /**
     * A node inserted at head of bins during transfer operations.
     */
    static final class ForwardingNode<K,V> extends Node<K,V> {
        final Node<K,V>[] nextTable;
        ForwardingNode(Node<K,V>[] tab) {
            super(MOVED, null, null, null);
            this.nextTable = tab;
        }

        Node<K,V> find(int h, Object k) {
            // loop to avoid arbitrarily deep recursion on forwarding nodes
            outer: for (Node<K,V>[] tab = nextTable;;) {
                Node<K,V> e; int n;
                if (k == null || tab == null || (n = tab.length) == 0 ||
                    (e = tabAt(tab, (n - 1) & h)) == null)
                    return null;
                for (;;) {
                    int eh; K ek;
                    if ((eh = e.hash) == h &&
                        ((ek = e.key) == k || (ek != null && k.equals(ek))))
                        return e;
                    if (eh < 0) {
                        if (e instanceof ForwardingNode) {
                            tab = ((ForwardingNode<K,V>)e).nextTable;
                            continue outer;
                        }
                        else
                            return e.find(h, k);
                    }
                    if ((e = e.next) == null)
                        return null;
                }
            }
        }
    }

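This static inner class is quite ingenious. It is used mainly during resizing: it is the node class that links the two tables, and its nextTable field points to the next table. Be careful how you read "table" here: the map does not permanently keep two tables; the next table exists only while a resize is in progress. During a resize, when a thread reaches a bin and finds it empty, it installs a ForwardingNode there; when a thread finishes processing a bin, it also sets that bin to a ForwardingNode. Other threads that encounter a ForwardingNode move on and continue traversing, which neatly resolves the multi-thread safety problem. Some readers will ask: what happens if one thread has started processing a bin but has not finished, the bin is not yet a ForwardingNode, and another thread arrives? Then you did not read the earlier part closely: while a bin is being processed, its head node (the first node in the bin) is locked, as described above.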
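
The forwarding idea can be sketched in a few lines. This is a deliberately simplified, single-threaded model: each slot holds a value directly instead of a chain of nodes, and there are no volatile reads or locks. Forward, ForwardingDemo, and lookup are hypothetical names, not the JDK's.

```java
class ForwardingDemo {
    // Marker placed in an old-table slot once that slot has been moved; it points at the new table.
    static final class Forward {
        final Object[] nextTable;
        Forward(Object[] nextTable) { this.nextTable = nextTable; }
    }

    // Looks up the slot for hash h, following forwarding markers into newer tables.
    static Object lookup(Object[] tab, int h) {
        while (true) {
            Object e = tab[h & (tab.length - 1)];
            if (e instanceof Forward)
                tab = ((Forward) e).nextTable;   // slot already migrated: retry in the next table
            else
                return e;                        // a value, or null if the slot is empty
        }
    }

    public static void main(String[] args) {
        Object[] oldTab = new Object[4];
        Object[] newTab = new Object[8];
        newTab[5] = "value-for-hash-5";          // hash 5 lands in slot 5 of the larger table
        oldTab[5 & 3] = new Forward(newTab);     // old slot 1 now forwards readers
        System.out.println(lookup(oldTab, 5));   // found through the forwarding marker
    }
}
```

The loop in lookup plays the same role as the `continue outer` loop in ForwardingNode.find: a reader that lands on a migrated slot keeps going into the next table instead of blocking.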

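That is all for this getting-acquainted stage; it should give you a working picture of these pieces. In the next post, the final one in the series, I will go line by line through three methods: transfer() for resizing, put() for insertion, and get() for lookup.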
