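JDK Collections Source Code: HashMap

Having covered arrays and linked lists, let's turn to the HashMap source. HashMap is a key-value storage structure, and the typical implementation is an array plus linked lists, which is exactly what we just finished analyzing. The listings below come from a pre-JDK 8 version of the class. Without further ado, on to the source.

As usual, start with the constructors: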

public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);

        // Find a power of 2 >= initialCapacity
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;

        this.loadFactor = loadFactor;
        threshold = (int)(capacity * loadFactor);
        table = new Entry[capacity];
        init();
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @param  initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     */
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        threshold = (int)(DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);
        table = new Entry[DEFAULT_INITIAL_CAPACITY];
        init();
    }
public HashMap(Map<? extends K, ? extends V> m) {
        this(Math.max((int) (m.size() / DEFAULT_LOAD_FACTOR) + 1,
                      DEFAULT_INITIAL_CAPACITY), DEFAULT_LOAD_FACTOR);
        putAllForCreate(m);
    }

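HashMap provides four constructors. Start with the simplest, the no-argument one, which just initializes the member fields. loadFactor is the load factor, covered later; threshold is the product of the load factor and the capacity. table is the backing array, an array of Entry with a default length of 16. Note that the length is a power of two; there is a reason for that, which comes up later. init() is a hook with an empty default implementation. The two constructors that take a capacity do the same kind of initialization. The first one is worth a closer look: if the requested capacity is not a power of two, it finds the smallest power of two greater than or equal to it, by repeatedly shifting left. The last constructor is covered later.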
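The rounding and threshold arithmetic in the first constructor can be tried out in isolation. The sketch below only mirrors the loop shown above; the names `TableSizing`, `roundUpToPowerOfTwo` and `thresholdFor` are hypothetical and are not part of the JDK.

```java
// Hypothetical helper that mirrors the sizing loop in HashMap(int, float).
final class TableSizing {
    static final int MAXIMUM_CAPACITY = 1 << 30; // same cap HashMap uses

    // Smallest power of two that is >= requested, capped at MAXIMUM_CAPACITY.
    static int roundUpToPowerOfTwo(int requested) {
        if (requested < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " + requested);
        if (requested > MAXIMUM_CAPACITY)
            requested = MAXIMUM_CAPACITY;
        int capacity = 1;
        while (capacity < requested)
            capacity <<= 1;
        return capacity;
    }

    // Size at which the table is due to grow.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        for (int requested : new int[] {0, 1, 10, 16, 17}) {
            int capacity = roundUpToPowerOfTwo(requested);
            System.out.println(requested + " -> capacity " + capacity
                    + ", threshold " + thresholdFor(capacity, 0.75f));
        }
    }
}
```

A request for 10 is rounded up to 16, while a request for exactly 16 stays at 16, which is why "greater than or equal to" is the right reading of the loop.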

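Before looking at insertion, look at the structure of Entry, a static nested class of HashMap. Zooming in on a single node:

The structure is simple: each array slot holds a singly linked list of the entries whose hashes map to that slot. Now for the basic put method: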
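As a rough picture of what one node holds, here is a simplified sketch. `ChainNode` and `chainLength` are hypothetical names chosen for illustration; the real Entry also implements Map.Entry and carries hook methods such as recordAccess, which are left out here.

```java
// Simplified, hypothetical stand-in for HashMap.Entry: one node of a bucket's chain.
final class ChainNode<K, V> {
    final K key;
    V value;
    final int hash;        // cached so it does not have to be recomputed on resize
    ChainNode<K, V> next;  // next node in the same bucket, or null at the tail

    ChainNode(int hash, K key, V value, ChainNode<K, V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    // Number of nodes reachable from this one by following next pointers.
    int chainLength() {
        int n = 0;
        for (ChainNode<K, V> e = this; e != null; e = e.next)
            n++;
        return n;
    }
}
```

Passing the current head as the `next` argument is what makes head insertion a single assignment.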

public V put(K key, V value) {
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key.hashCode());
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

Note that a Map key may be null. A null key is handled separately by putForNullKey, because there is no hashCode() to call on null:

private V putForNullKey(V value) {
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        addEntry(0, null, value, 0);
        return null;
    }

As you can see, the entry for the null key always lives in table[0]. If no entry with a null key is found in that bucket, addEntry is called to add one.
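The observable behavior is easy to check against java.util.HashMap itself. In this small usage sketch, `NullKeyDemo` and `putTwice` are hypothetical names; only the public Map API is used.

```java
import java.util.HashMap;
import java.util.Map;

final class NullKeyDemo {
    // Puts two values under the null key and returns what each put() reported.
    static String[] putTwice(Map<String, String> map) {
        String first = map.put(null, "first");    // no previous mapping, so put returns null
        String second = map.put(null, "second");  // overwrites and returns the old value
        return new String[] {first, second};
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<String, String>();
        String[] returned = putTwice(map);
        System.out.println(returned[0] + ", " + returned[1]);
        System.out.println(map.get(null) + ", size=" + map.size());
    }
}
```

The second put does not grow the map: it takes the overwrite branch inside the loop rather than falling through to addEntry.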

 void addEntry(int hash, K key, V value, int bucketIndex) {
	Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
        if (size++ >= threshold)
            resize(2 * table.length);
    }

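For the null key the hash passed in is 0. This is where the threshold field comes in: if the map's size before the insertion has already reached the threshold, the array is doubled. threshold is derived directly from loadFactor, and its purpose is to limit hash collisions, trading extra space for faster lookups. Suppose we have a map with an initial capacity of 8 and the default load factor of 0.75, so the threshold is 6; storing the 7th element then calls resize to rebuild the table.

Here is resize: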
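The "7th element" figure follows from the post-increment in `size++ >= threshold`. The sketch below models only that bookkeeping and stores nothing; `ResizeTrigger` and `add` are hypothetical names.

```java
// Hypothetical model of the size/threshold bookkeeping in addEntry (no real storage).
final class ResizeTrigger {
    int capacity;
    int threshold;
    int size;
    final float loadFactor;

    ResizeTrigger(int capacity, float loadFactor) {
        this.capacity = capacity;
        this.loadFactor = loadFactor;
        this.threshold = (int) (capacity * loadFactor);
    }

    // Records one new entry; returns true if that insertion doubled the capacity.
    boolean add() {
        if (size++ >= threshold) {   // compares the size *before* this insertion
            capacity *= 2;
            threshold = (int) (capacity * loadFactor);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ResizeTrigger m = new ResizeTrigger(8, 0.75f);
        for (int n = 1; n <= 8; n++)
            System.out.println("insert #" + n + " resized=" + m.add()
                    + " capacity=" + m.capacity);
    }
}
```

With capacity 8 the threshold is 6, so inserts 1 through 6 leave the table alone and insert 7 is the one that doubles it to 16.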

void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable);
        table = newTable;
        threshold = (int)(newCapacity * loadFactor);
    }

The actual move is done by the transfer method.

 void transfer(Entry[] newTable) {
        Entry[] src = table;
        int newCapacity = newTable.length;
        for (int j = 0; j < src.length; j++) {
            Entry<K,V> e = src[j];
            if (e != null) {
                src[j] = null;
                do {
                    Entry<K,V> next = e.next;
                    int i = indexFor(e.hash, newCapacity);
                    e.next = newTable[i];
                    newTable[i] = e;
                    e = next;
                } while (e != null);
            }
        }
    }

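The outer for loop walks the array, and the inner do-while loop walks the linked list in each slot. For every node, indexFor recomputes the slot i in the new array from the node's cached hash. The node's next pointer is then set to the current chain at i, which puts e at the head of that chain, and the loop continues. A side effect of head insertion is that nodes landing in the same new slot end up in reverse order.

Analyzing putForNullKey has shown us HashMap's growth strategy. Now back to the rest of put, starting with the all-important hashing rule: int hash = hash(key.hashCode());. The hash method hashes the key's hash code a second time, to guard against poorly distributed hash codes. The resulting value is then turned into an array index by the indexFor method: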
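To see how a chain is split and relinked, here is a miniature version of the same loop. It is a sketch under simplifying assumptions: `TransferDemo`, its nested `Node`, and `describe` are hypothetical, keys are plain strings, and the hashes are supplied by hand.

```java
// Hypothetical miniature of transfer(): relink every chain into a larger table.
final class TransferDemo {
    static final class Node {
        final int hash;
        final String key;
        Node next;
        Node(int hash, String key, Node next) {
            this.hash = hash;
            this.key = key;
            this.next = next;
        }
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    static Node[] transfer(Node[] src, int newCapacity) {
        Node[] newTable = new Node[newCapacity];
        for (int j = 0; j < src.length; j++) {
            Node e = src[j];
            src[j] = null;
            while (e != null) {
                Node next = e.next;
                int i = indexFor(e.hash, newCapacity); // reuses the cached hash
                e.next = newTable[i];                  // push onto the head of the new chain
                newTable[i] = e;
                e = next;
            }
        }
        return newTable;
    }

    // Concatenates the keys of a chain from head to tail.
    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node e = head; e != null; e = e.next)
            sb.append(e.key);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Old table of length 4: hashes 1, 5 and 9 all map to slot 1, chained a -> b -> c.
        Node c = new Node(9, "c", null);
        Node b = new Node(5, "b", c);
        Node a = new Node(1, "a", b);
        Node[] old = new Node[4];
        old[1] = a;
        Node[] grown = transfer(old, 8);
        System.out.println("slot 1: " + describe(grown[1]));
        System.out.println("slot 5: " + describe(grown[5]));
    }
}
```

In the length-8 table, a and c still share slot 1 but now appear as c then a, while b moves to slot 5 on its own.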

static int indexFor(int h, int length) {
        return h & (length-1);
    }

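This method is neat. Because the array length is guaranteed to be a power of two, length-1 in binary is all ones in its low bits. ANDing the hash with that mask keeps exactly those low bits; if any mask bit were 0, that bit of the result would always be 0 and some slots could never be used. The AND is in effect the hash modulo length, so the index is spread evenly over the range 0 to length-1. The second half of put is much like putForNullKey, so we won't go through it again: if an entry with the same hash and an equal key is found, its value is overwritten; otherwise a new entry is added at the head of the chain.

The structure before element u is inserted:

The structure after u is inserted at slot 7 (assuming the hash maps to index 7):

With single insertion covered, on to bulk insertion. The source: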
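Both claims, that masking equals a modulo and that the extra hashing step helps, can be checked with a few lines. In this sketch the `hash` body is quoted from memory of the JDK 6 source and should be treated as an assumption; `IndexingDemo` is a hypothetical name, and Math.floorMod requires Java 8 or later.

```java
final class IndexingDemo {
    // Supplemental hash as recalled from the JDK 6 HashMap source (an assumption here).
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        // For a power-of-two length, the mask agrees with a non-negative modulo.
        for (int h : new int[] {0, 7, 16, 33, -1, Integer.MIN_VALUE})
            System.out.println(h + ": mask=" + indexFor(h, length)
                    + " floorMod=" + Math.floorMod(h, length));

        // Hash codes that differ only in their high bits share a slot
        // unless those bits are first mixed into the low bits.
        int a = 0x10000, b = 0x20000;
        System.out.println("raw:    " + indexFor(a, length) + " vs " + indexFor(b, length));
        System.out.println("spread: " + indexFor(hash(a), length)
                + " vs " + indexFor(hash(b), length));
    }
}
```

The mask only looks at the low bits, so without the second hash any two keys whose hash codes agree in the low four bits would collide in a 16-slot table no matter how different the rest is.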

 public void putAll(Map<? extends K, ? extends V> m) {
        int numKeysToBeAdded = m.size();
        if (numKeysToBeAdded == 0)
            return;

        /*
         * Expand the map if the map if the number of mappings to be added
         * is greater than or equal to threshold.  This is conservative; the
         * obvious condition is (m.size() + size) >= threshold, but this
         * condition could result in a map with twice the appropriate capacity,
         * if the keys to be added overlap with the keys already in this map.
         * By using the conservative calculation, we subject ourself
         * to at most one extra resize.
         */
        if (numKeysToBeAdded > threshold) {
            int targetCapacity = (int)(numKeysToBeAdded / loadFactor + 1);
            if (targetCapacity > MAXIMUM_CAPACITY)
                targetCapacity = MAXIMUM_CAPACITY;
            int newCapacity = table.length;
            while (newCapacity < targetCapacity)
                newCapacity <<= 1;
            if (newCapacity > table.length)
                resize(newCapacity);
        }

        for (Iterator<? extends Map.Entry<? extends K, ? extends V>> i = m.entrySet().iterator(); i.hasNext(); ) {
            Map.Entry<? extends K, ? extends V> e = i.next();
            put(e.getKey(), e.getValue());
        }
    }

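Nothing here is hard to follow except the check if (numKeysToBeAdded > threshold). It tests numKeysToBeAdded alone rather than numKeysToBeAdded + size. This is deliberately conservative: the incoming map may share keys with the existing one, and sizing for the sum could allocate a table twice as large as needed. The cost, as the comment says, is at most one extra resize during the puts that follow.

With insertion covered, the fourth constructor mentioned at the start is easy to follow; it works much like put. The difference is that putAllForCreate ends up calling createEntry: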
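The pre-sizing step can be pulled out into a pure function to see which capacities it picks. This is a sketch of the arithmetic only; `PutAllSizing` and `capacityAfterPresize` are hypothetical names, and the table length and threshold are passed in as plain parameters.

```java
// Hypothetical extraction of the pre-sizing arithmetic at the top of putAll.
final class PutAllSizing {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Table length after the up-front check, before any entry is actually put.
    static int capacityAfterPresize(int tableLength, int threshold,
                                    float loadFactor, int numKeysToBeAdded) {
        if (numKeysToBeAdded <= threshold)
            return tableLength;   // leave growth to the individual put calls
        int targetCapacity = (int) (numKeysToBeAdded / loadFactor + 1);
        if (targetCapacity > MAXIMUM_CAPACITY)
            targetCapacity = MAXIMUM_CAPACITY;
        int newCapacity = tableLength;
        while (newCapacity < targetCapacity)
            newCapacity <<= 1;
        return newCapacity;
    }

    public static void main(String[] args) {
        // A default-sized map: length 16, threshold 12, load factor 0.75.
        for (int n : new int[] {12, 13, 100})
            System.out.println(n + " keys -> capacity "
                    + capacityAfterPresize(16, 12, 0.75f, n));
    }
}
```

Note that the current size never enters the calculation, which is the conservative choice the paragraph above describes.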

void createEntry(int hash, K key, V value, int bucketIndex) {
	Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
        size++;
    }

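Because the constructor has already sized the table so that the threshold will not be exceeded, createEntry only adds the entry and skips the threshold check.

After insertion, by convention, comes removal. The method to look at is removeEntryForKey, which removes a node by key. A rough guess at the implementation: if the key is null, use hash 0 and look in slot 0; otherwise compute the index from the hash. Then walk the chain in that slot comparing keys, and return null if nothing matches.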

final Entry<K,V> removeEntryForKey(Object key) {
        int hash = (key == null) ? 0 : hash(key.hashCode());
        int i = indexFor(hash, table.length);
        Entry<K,V> prev = table[i];
        Entry<K,V> e = prev;

        while (e != null) {
            Entry<K,V> next = e.next;
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k)))) {
                modCount++;
                size--;
                if (prev == e)
                    table[i] = next;
                else
                    prev.next = next;
                e.recordRemoval(this);
                return e;
            }
            prev = e;
            e = next;
        }

        return e;
    }

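Focus on what the while loop does. Suppose the chain we found looks like this, and b is the node to remove:

On the first iteration the if condition is not met, so prev = e and e = next;

After a few iterations, suppose the if condition is now satisfied:

Here prev != e, so prev's next pointer is set directly to next, which unlinks b. So when is prev == e? Clearly, when the head of the chain is the node being removed; in that case table[i] is set to next, so the slot now starts at the following node.

With insertion and removal understood, the other methods such as get and containsKey are not hard to follow, so they are not covered one by one here.

What really deserves attention in HashMap is the variety of iterators it provides. HashMap has more inner classes than the ArrayList and LinkedList we looked at earlier.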
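The two branches can be exercised on a single chain. The sketch below keeps the same prev/e/next shape as removeEntryForKey but drops hashing and null keys; `ChainRemoval`, its `Node`, and the `head` field (standing in for table[i]) are hypothetical.

```java
// Hypothetical single-bucket version of the unlinking loop in removeEntryForKey.
final class ChainRemoval {
    static final class Node {
        final String key;
        Node next;
        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    Node head; // plays the role of table[i]

    // Unlinks the first node with an equal key; returns it, or null if absent.
    Node remove(String key) {
        Node prev = head;
        Node e = prev;
        while (e != null) {
            Node next = e.next;
            if (e.key.equals(key)) {
                if (prev == e)
                    head = next;       // removing the head: the slot itself is updated
                else
                    prev.next = next;  // bypass e
                return e;
            }
            prev = e;
            e = next;
        }
        return null;
    }

    // Concatenates the keys of the chain from head to tail.
    String describe() {
        StringBuilder sb = new StringBuilder();
        for (Node e = head; e != null; e = e.next)
            sb.append(e.key);
        return sb.toString();
    }

    public static void main(String[] args) {
        ChainRemoval bucket = new ChainRemoval();
        bucket.head = new Node("a", new Node("b", new Node("c", null)));
        bucket.remove("b");
        System.out.println(bucket.describe());
        bucket.remove("a");
        System.out.println(bucket.describe());
    }
}
```

Removing b takes the prev.next branch, while removing a afterwards takes the head branch, because prev and e both still point at the first node.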

private final class ValueIterator extends HashIterator<V> {
        public V next() {
            return nextEntry().value;
        }
    }

    private final class KeyIterator extends HashIterator<K> {
        public K next() {
            return nextEntry().getKey();
        }
    }

    private final class EntryIterator extends HashIterator<Map.Entry<K,V>> {
        public Map.Entry<K,V> next() {
            return nextEntry();
        }
    }

The key, value and entry iterators are all built on HashIterator, so let's look at HashIterator:

private abstract class HashIterator<E> implements Iterator<E> {
        Entry<K,V> next;	// next entry to return
        int expectedModCount;	// For fast-fail
        int index;		// current slot
        Entry<K,V> current;	// current entry

        HashIterator() {
            expectedModCount = modCount;
            if (size > 0) { // advance to first entry
                Entry[] t = table;
                while (index < t.length && (next = t[index++]) == null)
                    ;
            }
        }

        public final boolean hasNext() {
            return next != null;
        }

        final Entry<K,V> nextEntry() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            Entry<K,V> e = next;
            if (e == null)
                throw new NoSuchElementException();

            if ((next = e.next) == null) {
                Entry[] t = table;
                while (index < t.length && (next = t[index++]) == null)
                    ;
            }
	    current = e;
            return e;
        }

        public void remove() {
            if (current == null)
                throw new IllegalStateException();
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            Object k = current.key;
            current = null;
            HashMap.this.removeEntryForKey(k);
            expectedModCount = modCount;
        }

    }

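If you look closely, HashIterator does not implement one method of the Iterator interface, next(). That method is left for the subclasses, and the base class provides only the shared machinery. The constructor scans from slot 0 until it finds a slot that is not null. hasNext therefore only has to check whether next is null. The method to focus on is nextEntry, which HashIterator adds itself and which returns the next Entry object.

Again, a picture: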

<img src="https://img-blog.csdn.net/20150102193655302?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQvdGFuZ3lvbmd6aGU=/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center" alt="" />

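When the current slot's chain is exhausted (that is, next == null), the iterator moves on to the next non-null slot and continues from there. To keep the interface uniform, each subclass implements next on top of nextEntry. These iterators are used by keySet, values and entrySet, which are what developers actually see.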
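The modCount check in nextEntry and the resynchronization in remove are visible from the outside. The following usage sketch relies only on the public java.util API; `FailFastDemo` and its method names are hypothetical, and fail-fast detection is a best-effort aid rather than a guarantee.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class FailFastDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Adds a key behind the iterator's back; reports whether the next call to next() failed.
    static boolean modifyDirectlyWhileIterating() {
        Map<String, Integer> map = sample();
        Iterator<String> it = map.keySet().iterator();
        it.next();
        map.put("d", 4);   // structural change: the iterator's expected count is now stale
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Removes odd values through the iterator, which keeps the counts in sync.
    static int removeThroughIterator() {
        Map<String, Integer> map = sample();
        for (Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator(); it.hasNext(); ) {
            if (it.next().getValue() % 2 == 1)
                it.remove();
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("direct modification detected: " + modifyDirectlyWhileIterating());
        System.out.println("entries left after iterator removal: " + removeThroughIterator());
    }
}
```

The difference between the two methods is the last line of HashIterator.remove, which copies modCount back into expectedModCount after the entry is unlinked.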
