/**
* Hash table based implementation of the <tt>Map</tt> interface. This
* implementation provides all of the optional map operations, and permits
* <tt>null</tt> values and the <tt>null</tt> key. (The <tt>HashMap</tt>
* class is roughly equivalent to <tt>Hashtable</tt>, except that it is
* unsynchronized and permits nulls.) This class makes no guarantees as to
* the order of the map; in particular, it does not guarantee that the order
* will remain constant over time.
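The source comment already states that HashMap permits null values and the null key.
Per the same comment, HashMap is roughly equivalent to Hashtable, except that HashMap is unsynchronized (not thread-safe) and permits nulls.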
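To make the null-handling difference concrete, here is a minimal sketch. The class name NullKeyDemo and the helper acceptsNullKey are made up for illustration; only HashMap and Hashtable come from the JDK.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

// Illustrative sketch: NullKeyDemo and acceptsNullKey are hypothetical names.
final class NullKeyDemo {
    /** Returns true if the given map accepts a null key without throwing. */
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put(null, "null key"); // HashMap allows one null key
        hashMap.put("k", null);        // and null values
        System.out.println(hashMap.get(null));                  // null key
        System.out.println(hashMap.containsKey("k"));           // true
        System.out.println(acceptsNullKey(new Hashtable<>()));  // false
    }
}
```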
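HashMap = array + singly linked list
A quick recap of arrays and linked lists:
Array: ArrayList ---> Object[]
Linked list: LinkedList (doubly linked list)
To follow how the HashMap source stores entries, you first need to know the data structure it stores them in.
Inner class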
/**
* Basic hash bin node, used for most entries. (See below for
* TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
*/
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}
/**
* The table, initialized on first use, and resized as
* necessary. When allocated, length is always a power of two.
* (We also tolerate length zero in some operations to allow
* bootstrapping mechanics that are currently not needed.)
*/
transient Node<K,V>[] table;
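Key point: the table is an array of Node, and each slot (bucket) can have further nodes chained off it.
So the questions are: how big is the array? What is its maximum size? When does it need to grow, and how is the resize done? How many nodes can hang off each slot?
These are probably all common interview questions.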
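The "array of nodes, each with a chain" idea can be sketched as a toy table. This is an assumption-laden illustration, not the JDK code: TinyChainedTable and Entry are hypothetical names, the table has a fixed length of 16 and never resizes, keys must be non-null, and new nodes are prepended to the chain for brevity (the real HashMap appends them).

```java
// Illustrative sketch only: a fixed-size chained hash table.
final class TinyChainedTable {
    /** Hypothetical node type mirroring the fields of HashMap.Node. */
    static final class Entry {
        final int hash;
        final String key;
        int value;
        Entry next;

        Entry(int hash, String key, int value, Entry next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Entry[] table = new Entry[16]; // length is a power of two

    void put(String key, int value) {
        int hash = key.hashCode();
        int index = hash & (table.length - 1);   // pick the bucket
        for (Entry e = table[index]; e != null; e = e.next) {
            if (e.hash == hash && e.key.equals(key)) {
                e.value = value;                 // key already present: overwrite
                return;
            }
        }
        // Not found: prepend a new node to this bucket's chain.
        table[index] = new Entry(hash, key, value, table[index]);
    }

    Integer get(String key) {
        int hash = key.hashCode();
        for (Entry e = table[hash & (table.length - 1)]; e != null; e = e.next) {
            if (e.hash == hash && e.key.equals(key)) {
                return e.value;
            }
        }
        return null; // no mapping for this key
    }
}
```

"Aa" and "BB" have the same hashCode, so they end up chained in the same bucket, yet both stay retrievable because the lookup compares keys with equals.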
Default capacity
/**
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
The default capacity is 16.
Maximum capacity
/**
* The maximum capacity, used if a higher value is implicitly specified
* by either of the constructors with arguments.
* MUST be a power of two <= 1<<30.
*/
static final int MAXIMUM_CAPACITY = 1 << 30;
The maximum capacity is 2 to the power of 30.
Load factor
/**
* The load factor used when none specified in constructor.
*/
static final float DEFAULT_LOAD_FACTOR = 0.75f;
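For example, 16 * 0.75 = 12
Once the number of entries in the map exceeds 12 (that is, when the 13th entry is inserted), the table is resized
How does it grow? ---------> by doubling (via the resize method) ------> why doubling is explained in detail later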
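The threshold arithmetic can be sketched as follows. ThresholdDemo and its threshold helper are hypothetical names used only for this illustration; the formula is simply capacity times load factor, truncated to an int.

```java
// Illustrative sketch: how the resize threshold scales as the capacity doubles.
final class ThresholdDemo {
    /** Number of entries the table can hold before a resize is triggered. */
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // Each doubling of the capacity doubles the threshold as well.
        for (int capacity = 16; capacity <= 128; capacity <<= 1) {
            System.out.println(capacity + " -> " + threshold(capacity, 0.75f));
        }
        // 16 -> 12
        // 32 -> 24
        // 64 -> 48
        // 128 -> 96
    }
}
```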
The array has a maximum size, so does the linked list have one too?
A chain can't just keep growing forever...
There is a limit
/**
* The bin count threshold for using a tree rather than list for a
* bin. Bins are converted to trees when adding an element to a
* bin with at least this many nodes. The value must be greater
* than 2 and should be at least 8 to mesh with assumptions in
* tree removal about conversion back to plain bins upon
* shrinkage.
*/
static final int TREEIFY_THRESHOLD = 8;
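Each bucket's chain holds at most 8 nodes
Once it grows past 8, the chain's structure is changed (in the JDK 8 source this only happens if the table length is at least MIN_TREEIFY_CAPACITY = 64; otherwise the table is resized instead)
Linked list ------> balanced binary tree
The balanced binary tree here is a red-black tree
Node ------------> TreeNode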
/**
* The bin count threshold for untreeifying a (split) bin during a
* resize operation. Should be less than TREEIFY_THRESHOLD, and at
* most 6 to mesh with shrinkage detection under removal.
*/
static final int UNTREEIFY_THRESHOLD = 6;
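If a tree bin shrinks to 6 nodes or fewer (checked when a bin is split during a resize), the tree is converted back to a linked list
TreeNode ------------> Node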
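The list-to-tree conversion is internal and cannot be observed through the public Map API, but its purpose can be sketched: even when many keys collide into one bucket, the map must stay correct. In this hypothetical example, BadKey deliberately returns a constant hashCode so every key lands in the same bucket; it also implements Comparable, which gives a tree bin an ordering to work with.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: CollidingKeysDemo and BadKey are hypothetical names.
final class CollidingKeysDemo {
    /** Key whose hashCode is deliberately constant, forcing collisions. */
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; } // every key collides

        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    /** Builds a map in which all keys fall into a single bucket. */
    static Map<BadKey, Integer> fill(int count) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < count; i++) {
            map.put(new BadKey(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = fill(100);
        System.out.println(map.size());              // 100
        System.out.println(map.get(new BadKey(57))); // 57
    }
}
```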
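So here comes yet another question
Given an array of size 16,
how is the slot for each Node decided? The Node inner class has a hash field, and that is what determines the position
Object --- hashCode( ) ---> yields a 32-bit integer (in binary)
While we are here, a quick look at hashCode( ) and equals( )
In Object: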
/**
* Returns a hash code value for the object. This method is
* supported for the benefit of hash tables such as those provided by
* {@link java.util.HashMap}.
* <p>
* The general contract of {@code hashCode} is:
* <ul>
* <li>Whenever it is invoked on the same object more than once during
* an execution of a Java application, the {@code hashCode} method
* must consistently return the same integer, provided no information
* used in {@code equals} comparisons on the object is modified.
* This integer need not remain consistent from one execution of an
* application to another execution of the same application.
* <li>If two objects are equal according to the {@code equals(Object)}
* method, then calling the {@code hashCode} method on each of
* the two objects must produce the same integer result.
* <li>It is <em>not</em> required that if two objects are unequal
* according to the {@link java.lang.Object#equals(java.lang.Object)}
* method, then calling the {@code hashCode} method on each of the
* two objects must produce distinct integer results. However, the
* programmer should be aware that producing distinct integer results
* for unequal objects may improve the performance of hash tables.
* </ul>
* <p>
* As much as is reasonably practical, the hashCode method defined by
* class {@code Object} does return distinct integers for distinct
* objects. (This is typically implemented by converting the internal
* address of the object into an integer, but this implementation
* technique is not required by the
* Java™ programming language.)
*
* @return a hash code value for this object.
* @see java.lang.Object#equals(java.lang.Object)
* @see java.lang.System#identityHashCode
*/
public native int hashCode();
If equals() returns true, the two hashCodes must be the same
If equals() returns false, the two hashCodes are not necessarily different
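Both halves of that contract can be sketched in a few lines. Point is a hypothetical value class written for this illustration; the "Aa" / "BB" pair is a well-known case of two unequal strings sharing a hashCode.

```java
import java.util.Objects;

// Illustrative sketch: HashContractDemo and Point are hypothetical names.
final class HashContractDemo {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (o == this) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        // Built from the same fields as equals, so equal points hash alike.
        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true

        // Unequal objects may still share a hash code.
        System.out.println("Aa".equals("BB"));                  // false
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
    }
}
```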
The source that computes the hash from the key
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
(h = key.hashCode()) ^ (h >>> 16)
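The high 16 bits are XORed into the low 16 bits. The bucket index only uses the low bits of the hash, so this lets the high half of the 32-bit hashCode influence the index too, making the resulting positions as different as possible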
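A small sketch shows the effect. HashSpreadDemo, spread and index are hypothetical names; spread repeats the expression above, and the two sample hash codes differ only in their high 16 bits.

```java
// Illustrative sketch: why XORing the high bits into the low bits helps.
final class HashSpreadDemo {
    /** Same expression as in HashMap.hash for a non-null key. */
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    /** Bucket index for a table whose capacity is a power of two. */
    static int index(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    public static void main(String[] args) {
        int h1 = 0x10000; // differs from h2 only in the high 16 bits
        int h2 = 0x20000;

        // Without spreading, both land in bucket 0.
        System.out.println(index(h1, 16) + " " + index(h2, 16)); // 0 0

        // With spreading, the high bits separate them.
        System.out.println(index(spread(h1), 16) + " " + index(spread(h2), 16)); // 1 2
    }
}
```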
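With the hash value in hand, the node's position can be determined from it
hash % 16
A bitwise AND is cheaper: hash & (16-1), i.e. hash & 15 (and, unlike %, it never yields a negative index for a negative hash)
So the source uses & in place of %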
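The relationship between the mask and the remainder can be checked with a short sketch. MaskVsModuloDemo and its helpers are hypothetical names. It compares the mask against Math.floorMod, because Java's % can return a negative result for a negative hash.

```java
// Illustrative sketch: masking vs. remainder for computing a bucket index.
final class MaskVsModuloDemo {
    static int byMask(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    static int byRemainder(int hash, int capacity) {
        return hash % capacity;
    }

    /** True if masking agrees with floorMod for every hash in a sample range. */
    static boolean maskMatchesFloorMod(int capacity) {
        for (int h = -1000; h <= 1000; h++) {
            if (byMask(h, capacity) != Math.floorMod(h, capacity)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(byRemainder(-7, 16));     // -7 (not a valid index)
        System.out.println(byMask(-7, 16));          // 9
        System.out.println(maskMatchesFloorMod(16)); // true
        System.out.println(maskMatchesFloorMod(10)); // false: 10 is not a power of two
    }
}
```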
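Back to the earlier question: why does the table double when the threshold is reached?
The initial capacity is 16
The AND is hash & (16-1) -----> hash & 15 ------> 15 in binary is 01111
Only when the low bits of the mask are all 1s does the result of the AND depend entirely on the (low bits of the) hash, so every slot can be reached
Doubling follows the same principle
16*2=32 hash & ( 32-1 ) -------> hash & 31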
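To see why the mask has to be all 1s, this sketch counts how many distinct buckets hash & (capacity - 1) can actually produce. PowerOfTwoMaskDemo and reachableBuckets are hypothetical names, and the hash range 0..1023 is an arbitrary sample.

```java
import java.util.BitSet;

// Illustrative sketch: which bucket indices a given mask can reach.
final class PowerOfTwoMaskDemo {
    /** Counts distinct values of hash & (capacity - 1) over sample hashes. */
    static int reachableBuckets(int capacity) {
        BitSet seen = new BitSet(capacity);
        for (int hash = 0; hash < 1024; hash++) {
            seen.set(hash & (capacity - 1));
        }
        return seen.cardinality();
    }

    public static void main(String[] args) {
        System.out.println(reachableBuckets(16)); // 16: mask 01111 reaches every slot
        System.out.println(reachableBuckets(32)); // 32: mask 11111 reaches every slot
        System.out.println(reachableBuckets(10)); // 4: mask 1001 reaches only 0, 1, 8, 9
    }
}
```

With a capacity of 10, six of the ten slots would never be used, which is why the capacity stays a power of two and doubling preserves that property.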
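Doubling the capacity
After a resize, the new slots need to take over part of the load from the existing ones
On every resize each node's index is recomputed from its stored hash (the hash itself is reused, not recalculated), which redistributes the nodes and makes better use of the slots
Take a node in the original array. After the resize, its position
can be one of only two places: the original index, or (original index + original capacity)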
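The two-position rule follows from the mask gaining exactly one extra bit when the capacity doubles. A minimal sketch, with ResizeSplitDemo and its methods as hypothetical names: the bit selected by hash & oldCapacity decides whether a node stays put or moves up by oldCapacity.

```java
// Illustrative sketch: where a node lands after the table doubles.
final class ResizeSplitDemo {
    /** New index derived from the old index and one extra hash bit. */
    static int newIndex(int hash, int oldCapacity) {
        int oldIndex = hash & (oldCapacity - 1);
        // The extra mask bit is 0: stay. The extra mask bit is 1: move up.
        return (hash & oldCapacity) == 0 ? oldIndex : oldIndex + oldCapacity;
    }

    /** Checks the shortcut against recomputing hash & (newCapacity - 1). */
    static boolean matchesDirectComputation(int oldCapacity) {
        int newCapacity = oldCapacity << 1;
        for (int hash = 0; hash < 4096; hash++) {
            if (newIndex(hash, oldCapacity) != (hash & (newCapacity - 1))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(newIndex(5, 16));  // 5: bit 16 is clear, index unchanged
        System.out.println(newIndex(21, 16)); // 21: bit 16 is set, 5 + 16
        System.out.println(matchesDirectComputation(16)); // true
    }
}
```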