The Design Purpose of HashMap's Constants

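Which constants does HashMap define, and what is each one designed for? This article takes a closer look at the HashMap design by Doug Lea, Josh Bloch, Arthur van Hoff, and Neal Gafter. (Everything below is based on JDK 1.8.)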

Constant Design

(1) The default initial capacity of HashMap is 1 << 4 (that is, 16)

    /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

The comment on this constant says "MUST be a power of two". Why must it be a power of two?

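Underneath, HashMap is an array of bins, each holding a linked list (or a red-black tree). When an element is added, its bin index is computed as i = (n - 1) & hash. When the table size n is a power of two, this is equivalent to taking hash modulo n (hash % n for a non-negative hash). Index lookup is usually done by taking a remainder, but a bitwise AND (&) is cheaper than a remainder (%) operation, so the capacity must be a power of two precisely so that the faster AND can be used.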
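A minimal sketch of the index calculation described above, assuming n is a power of two. The class and method names (IndexDemo, indexFor) are hypothetical and not part of the JDK. The sketch only checks that the bitwise AND agrees with Math.floorMod, including for negative hash values, where the plain % operator would return a negative remainder.

```java
class IndexDemo {
    // Hypothetical helper mirroring the expression i = (n - 1) & hash.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16; // table size, a power of two
        int[] hashes = {0, 5, 17, 12345, -7, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int h : hashes) {
            // floorMod always returns a value in [0, n), just like the AND does.
            System.out.println(h + " -> " + indexFor(h, n)
                    + " (floorMod: " + Math.floorMod(h, n) + ")");
        }
    }
}
```

The equivalence breaks down as soon as n is not a power of two, because n - 1 is then no longer a run of consecutive 1 bits.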

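Why is the default 16 rather than 8 or 32? Too small and the table is resized frequently; too large and memory is wasted. This is the initial trade-off the JDK makes on our behalf.

(2) The maximum capacity of HashMap is 1 << 30, that is, 2 to the 30th power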

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

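An int occupies 4 bytes of 8 bits each, so it is a 32-bit integer. In principle one could shift left by 31 bits to get 2 to the 31st power, so why is the limit not 2^31? Because the leftmost bit of an int is the sign bit, which indicates whether the value is positive or negative. Consider the following example: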

    System.out.println(1 << 30);
    System.out.println(1 << 31);
    System.out.println(1 << 32);
    System.out.println(1 << 33);

Output:

1073741824
-2147483648
1
2

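1 << 31 sets the sign bit and yields a negative number. For larger shift distances Java uses only the low five bits of the distance, so 1 << 32 and 1 << 33 wrap around to 1 and 2. The largest power of two that an int can hold as a positive value is therefore 2^30, which is why that is HashMap's maximum capacity.

(3) The default load factor of HashMap is 0.75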
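The following is a small sketch that checks this claim directly. MaxCapacityDemo and isPowerOfTwo are hypothetical names, and MAXIMUM_CAPACITY is redeclared locally because the constant in HashMap is not public.

```java
class MaxCapacityDemo {
    // Same value as HashMap's constant, redeclared here for illustration.
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Hypothetical helper: true for positive values with exactly one bit set.
    static boolean isPowerOfTwo(int x) {
        return x > 0 && (x & (x - 1)) == 0;
    }

    public static void main(String[] args) {
        // The highest set bit of the largest int is bit 30.
        System.out.println(Integer.highestOneBit(Integer.MAX_VALUE) == MAXIMUM_CAPACITY);
        System.out.println(isPowerOfTwo(MAXIMUM_CAPACITY));
        // Doubling once more overflows into the sign bit.
        System.out.println(isPowerOfTwo(MAXIMUM_CAPACITY << 1));
    }
}
```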

    /**
     * The load factor used when none specified in constructor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

The load factor describes how full the hash table is allowed to get before it is resized. Here is how the source explains the load factor:

 * <p>As a general rule, the default load factor (.75) offers a good
 * tradeoff between time and space costs.  Higher values decrease the
 * space overhead but increase the lookup cost (reflected in most of
 * the operations of the <tt>HashMap</tt> class, including
 * <tt>get</tt> and <tt>put</tt>).  The expected number of entries in
 * the map and its load factor should be taken into account when
 * setting its initial capacity, so as to minimize the number of
 * rehash operations.  If the initial capacity is greater than the
 * maximum number of entries divided by the load factor, no rehash
 * operations will ever occur.

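Generally speaking, the default load factor of 0.75 offers a good balance between time and space costs. Higher values reduce the space overhead but increase the lookup cost (which shows up in most HashMap operations, including get and put). When setting the initial capacity, take into account the expected number of entries in the map and the load factor, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash will ever occur.

(4) The default treeify threshold of HashMap (linked list converted to red-black tree) is 8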
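The sizing rule in the last sentence can be sketched as follows. The helper capacityFor is a hypothetical name introduced for illustration; it simply requests a capacity strictly greater than the expected entry count divided by the load factor.

```java
import java.util.HashMap;
import java.util.Map;

class InitialCapacityDemo {
    // Hypothetical helper: a capacity request strictly greater than
    // expectedEntries / loadFactor, so the resize threshold is never reached.
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) (expectedEntries / loadFactor) + 1;
    }

    public static void main(String[] args) {
        int expected = 1000;
        int capacity = capacityFor(expected, 0.75f);
        Map<Integer, String> map = new HashMap<>(capacity, 0.75f);
        for (int i = 0; i < expected; i++) {
            map.put(i, "v" + i);
        }
        System.out.println("requested capacity: " + capacity + ", size: " + map.size());
    }
}
```

HashMap rounds the requested capacity up to a power of two, so the table actually allocated may be larger than the value passed in.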

    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     */
    static final int TREEIFY_THRESHOLD = 8;

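Since Java 8, HashMap's underlying data structure includes red-black trees: when an element is added and the linked list in its bin grows beyond 8 nodes, the list is converted to a red-black tree (provided the table is at least MIN_TREEIFY_CAPACITY in size, discussed below). Why is the threshold 8? Look at this comment in the HashMap source: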

     * Ideally, under random hashCodes, the frequency of
     * nodes in bins follows a Poisson distribution
     * (http://en.wikipedia.org/wiki/Poisson_distribution) with a
     * parameter of about 0.5 on average for the default resizing
     * threshold of 0.75, although with a large variance because of
     * resizing granularity. Ignoring variance, the expected
     * occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
     * factorial(k)). The first values are:
     *
     * 0:    0.60653066
     * 1:    0.30326533
     * 2:    0.07581633
     * 3:    0.01263606
     * 4:    0.00157952
     * 5:    0.00015795
     * 6:    0.00001316
     * 7:    0.00000094
     * 8:    0.00000006
     * more: less than 1 in ten million

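Ideally, under random hash codes and the default load factor of 0.75, the number of nodes per bin follows a Poisson distribution with a parameter of about 0.5, although with a large variance because of resizing granularity. The table shows that the probability of a bin holding 8 nodes is extremely small (about 6 in 100 million), so 8 was chosen as the threshold for converting a list into a red-black tree.

(5) The threshold for converting a tree back to a linked list is 6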
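The table in the source comment can be reproduced from the formula it quotes, exp(-0.5) * pow(0.5, k) / factorial(k). The sketch below does that; PoissonDemo and poisson are hypothetical names.

```java
class PoissonDemo {
    // Poisson probability mass: P(k) = exp(-lambda) * lambda^k / k!
    static double poisson(double lambda, int k) {
        double factorial = 1.0;
        for (int i = 2; i <= k; i++) {
            factorial *= i;
        }
        return Math.exp(-lambda) * Math.pow(lambda, k) / factorial;
    }

    public static void main(String[] args) {
        // Print the probability of a bin holding k nodes, for k = 0..8.
        for (int k = 0; k <= 8; k++) {
            System.out.printf("%d: %.8f%n", k, poisson(0.5, k));
        }
    }
}
```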

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     */
    static final int UNTREEIFY_THRESHOLD = 6;

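The treeify threshold is 8, so why is the threshold for converting back 6 rather than 7? The gap prevents frequent conversion back and forth between lists and trees. If it were 7, a HashMap that keeps inserting and removing elements, so that a bin's size hovers around 8, would repeatedly convert tree to list and list to tree, which is very inefficient.

(6) The minimum table capacity for treeification is 64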
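The effect of the gap can be illustrated with a toy model. This is a deliberately simplified sketch, not HashMap's actual logic: it tracks a single bin whose size swings between 7 and 9 and counts how many list/tree conversions occur for a given untreeify threshold. In the real class, conversions happen in more specific situations, such as when a bin is split during a resize. HysteresisDemo and conversions are hypothetical names.

```java
class HysteresisDemo {
    static final int TREEIFY_THRESHOLD = 8;

    // Toy model: counts list<->tree conversions for one bin whose size
    // goes 7 -> 9 -> 7 in every cycle.
    static int conversions(int untreeifyThreshold, int cycles) {
        int size = 7;
        boolean tree = false;
        int count = 0;
        for (int c = 0; c < cycles; c++) {
            for (int i = 0; i < 2; i++) { // two insertions
                size++;
                if (!tree && size > TREEIFY_THRESHOLD) {
                    tree = true;
                    count++;
                }
            }
            for (int i = 0; i < 2; i++) { // two removals
                size--;
                if (tree && size <= untreeifyThreshold) {
                    tree = false;
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("untreeify at 7: " + conversions(7, 1000));
        System.out.println("untreeify at 6: " + conversions(6, 1000));
    }
}
```

In this model a threshold of 7 converts twice per cycle, while a threshold of 6 converts once and then stays a tree.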

    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     */
    static final int MIN_TREEIFY_CAPACITY = 64;

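Why 64? When the capacity is below 64, hash collisions are more likely, so long lists are somewhat more likely to appear simply because the table is small. For long lists that arise this way, resizing the table should be preferred over an unnecessary treeification.

Reference: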
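The choice between resizing and treeifying can be sketched as a small decision function. This is a simplified illustration of the rule stated in the source comment, not the JDK's code; TreeifyDecisionDemo, Action, and onBinGrowth are hypothetical names.

```java
class TreeifyDecisionDemo {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { NONE, RESIZE, TREEIFY }

    // Simplified decision for a bin that holds binSize nodes after an insert.
    static Action onBinGrowth(int tableCapacity, int binSize) {
        if (binSize <= TREEIFY_THRESHOLD) {
            return Action.NONE;
        }
        // A long list in a small table is handled by growing the table instead.
        return tableCapacity < MIN_TREEIFY_CAPACITY ? Action.RESIZE : Action.TREEIFY;
    }

    public static void main(String[] args) {
        System.out.println(onBinGrowth(16, 9));
        System.out.println(onBinGrowth(64, 9));
        System.out.println(onBinGrowth(64, 8));
    }
}
```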

https://mp.weixin.qq.com/s/aU7aQmSaw7TuLL9ZF-dLgg
