Java Basics: HashMap Construction, PUT, and GET

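  • A HashMap is built from a bucket array, linked lists, and red-black trees.
  • On insertion, the bucket index is derived from the key's hash as (n - 1) & hash, which equals hash % n because the array length n is always a power of two.
  • When two or more different keys map to the same index (a collision), the new entry is appended to the linked list that hangs off that array slot.
  • When the linked list in a bucket grows too long, it is converted to a red-black tree to keep lookups efficient.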
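
The index calculation can be sketched as follows. This is a minimal illustration, not the JDK source itself: the class name BucketIndexDemo and the helper names spread and indexFor are hypothetical, and spread only mirrors the idea of folding the high 16 bits of hashCode() into the low 16 bits. It shows that, for a power-of-two length, masking with (n - 1) gives the same result as a non-negative modulo.

```java
class BucketIndexDemo {
    // Hypothetical helper: fold the high 16 bits of hashCode() into the low 16 bits,
    // so that small tables still see some influence from the upper bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Hypothetical helper: for a power-of-two length n, (n - 1) & hash keeps the low bits,
    // which is the same as a non-negative hash mod n.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16; // default table length
        for (String key : new String[] {"apple", "banana", "cherry"}) {
            int hash = spread(key);
            System.out.println(key + " -> bucket " + indexFor(hash, n)
                    + " (floorMod gives " + Math.floorMod(hash, n) + ")");
        }
    }
}
```

Masking is cheaper than a division and also handles negative hash values, which a plain % would turn into a negative index.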

HashMap Structure

  • Keys are exposed as a Set, so duplicate keys are not allowed
/**
 * Returns a {@link Set} view of the keys contained in this map.
 * The set is backed by the map, so changes to the map are
 * reflected in the set, and vice-versa.  If the map is modified
 * while an iteration over the set is in progress (except through
 * the iterator's own <tt>remove</tt> operation), the results of
 * the iteration are undefined.  The set supports element removal,
 * which removes the corresponding mapping from the map, via the
 * <tt>Iterator.remove</tt>, <tt>Set.remove</tt>,
 * <tt>removeAll</tt>, <tt>retainAll</tt>, and <tt>clear</tt>
 * operations.  It does not support the <tt>add</tt> or <tt>addAll</tt>
 * operations.
 *
 * @return a set view of the keys contained in this map
 */
public Set<K> keySet() {
    Set<K> ks = keySet;
    if (ks == null) {
        ks = new KeySet();
        keySet = ks;
    }
    return ks;
}
  • Values are exposed as a Collection, so duplicate values are allowed (as in ArrayList or LinkedList)
/**
 * Returns a {@link Collection} view of the values contained in this map.
 * The collection is backed by the map, so changes to the map are
 * reflected in the collection, and vice-versa.  If the map is
 * modified while an iteration over the collection is in progress
 * (except through the iterator's own <tt>remove</tt> operation),
 * the results of the iteration are undefined.  The collection
 * supports element removal, which removes the corresponding
 * mapping from the map, via the <tt>Iterator.remove</tt>,
 * <tt>Collection.remove</tt>, <tt>removeAll</tt>,
 * <tt>retainAll</tt> and <tt>clear</tt> operations.  It does not
 * support the <tt>add</tt> or <tt>addAll</tt> operations.
 *
 * @return a view of the values contained in this map
 */
public Collection<V> values() {
    Collection<V> vs = values;
    if (vs == null) {
        vs = new Values();
        values = vs;
    }
    return vs;
}
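
The behaviour described in the two Javadoc comments above can be observed with a small sketch. The class name ViewsDemo and its sample() helper are hypothetical; only java.util classes are used.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class ViewsDemo {
    // Hypothetical helper: three distinct keys that all end up with the value 1.
    static Map<String, Integer> sample() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 1); // duplicate value: allowed
        m.put("c", 2);
        m.put("c", 1); // duplicate key: replaces the old value instead of adding an entry
        return m;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = sample();
        Set<String> keys = m.keySet();
        Collection<Integer> values = m.values();
        System.out.println(keys.size() + " keys, " + values.size() + " values");

        keys.remove("a"); // the view is backed by the map, so the mapping disappears too
        System.out.println("map still contains a? " + m.containsKey("a"));

        try {
            keys.add("d"); // the key view does not support add
        } catch (UnsupportedOperationException e) {
            System.out.println("keySet().add is not supported");
        }
    }
}
```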

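HashMap (before JDK 8)

HashMap is stored as an array plus linked lists

  • Arrays offer fast lookup by index but slow insertion and deletion
  • Linked lists offer fast insertion and deletion but slow lookup

HashMap combines the strengths of both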

[Figure: HashMap storage structure before JDK 8]

  • The default array length is 16; each of the 16 array slots (buckets) stores the head node of a linked list
/**
 * Default initial capacity: must be a power of two, i.e. 16 buckets by default.
 * The default initial capacity - MUST be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

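HashMap (JDK 8 and later)

HashMap is an array plus linked lists plus red-black trees

  • Arrays offer fast lookup by index but slow insertion and deletion
  • Linked lists offer fast insertion and deletion but slow lookup
  • The constant TREEIFY_THRESHOLD controls when a linked list is converted to a red-black tree

Data structures: the red-black tree algorithm


The rest of this article describes HashMap as implemented in JDK 8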


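HashMap Internal Structure

A composite structure: an array of Node<K,V> whose elements chain into linked lists through Node.next. The entrySet field is not the linked list; it only caches the Set<Map.Entry<K,V>> view returned by entrySet().

  • Basic fields
// bucket array
transient Node<K,V>[] table;
// cached entrySet() view (the per-bucket linked list is formed by Node.next)
transient Set<Map.Entry<K,V>> entrySet;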
  • Node<K,V>
/**
 * A Node consists of a hash value, a key-value pair, and a reference to the next node.
 */
static class Node<K,V> implements Map.Entry<K,V> {
    // cached hash of the key
    final int hash;
    // key
    final K key;
    // value
    V value;
    // next node in the same bucket
    Node<K,V> next;

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}

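The array is divided into buckets. The hash value determines which bucket a key-value pair is placed in, and pairs whose hashes map to the same bucket are stored as a linked list.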
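
A collision is easy to provoke with two well-known strings that share a hashCode(). The sketch below (class name CollisionDemo is hypothetical) shows that both entries survive and are told apart by equals(), which is what the chaining within a bucket provides.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // "Aa" and "BB" have the same String.hashCode(), so they land in the same bucket.
    static Map<String, String> build() {
        Map<String, String> m = new HashMap<>();
        m.put("Aa", "first");
        m.put("BB", "second");
        return m;
    }

    public static void main(String[] args) {
        System.out.println("same hashCode? " + ("Aa".hashCode() == "BB".hashCode()));
        Map<String, String> m = build();
        // Both mappings are kept; lookups walk the bucket and compare keys with equals().
        System.out.println(m.size() + " entries: Aa=" + m.get("Aa") + ", BB=" + m.get("BB"));
    }
}
```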

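  • When the length of a linked list exceeds TREEIFY_THRESHOLD (8 by default), the list is converted to a red-black tree, provided the table is already large enough (see MIN_TREEIFY_CAPACITY in the put summary)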
/**
 * The bin count threshold for using a tree rather than list for a
 * bin.  Bins are converted to trees when adding an element to a
 * bin with at least this many nodes. The value must be greater
 * than 2 and should be at least 8 to mesh with assumptions in
 * tree removal about conversion back to plain bins upon
 * shrinkage.
 */
// threshold for treeifying a bin
static final int TREEIFY_THRESHOLD = 8;
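  • Conversely, when a treeified bucket shrinks, it is converted back to a linked list to keep performance predictable: during a resize, a split bin with at most UNTREEIFY_THRESHOLD (6) nodes is untreeified, and removals that leave the tree too small trigger the same conversion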
/**
 * The bin count threshold for untreeifying a (split) bin during a
 * resize operation. Should be less than TREEIFY_THRESHOLD, and at
 * most 6 to mesh with shrinkage detection under removal.
 */
static final int UNTREEIFY_THRESHOLD = 6;

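HashMap Constructors

The constructors do not allocate the array. They only assign a few fields (loadFactor, threshold), so the table is created lazily the first time the HashMap is used.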

/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and load factor.
 *
 * @param  initialCapacity the initial capacity
 * @param  loadFactor      the load factor
 * @throws IllegalArgumentException if the initial capacity is negative
 *         or the load factor is nonpositive
 */
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}

/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and the default load factor (0.75).
 *
 * @param  initialCapacity the initial capacity.
 * @throws IllegalArgumentException if the initial capacity is negative.
 */
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

/**
 * Constructs an empty <tt>HashMap</tt> with the default initial capacity
 * (16) and the default load factor (0.75).
 */
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

/**
 * Constructs a new <tt>HashMap</tt> with the same mappings as the
 * specified <tt>Map</tt>.  The <tt>HashMap</tt> is created with
 * default load factor (0.75) and an initial capacity sufficient to
 * hold the mappings in the specified <tt>Map</tt>.
 *
 * @param   m the map whose mappings are to be placed in this map
 * @throws  NullPointerException if the specified map is null
 */
public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}
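
The two-argument constructor above stores tableSizeFor(initialCapacity) in threshold, which rounds the requested capacity up to a power of two. The following sketch follows the bit-smearing approach used in JDK 8; the class name CapacityDemo is hypothetical, and later JDK versions may compute the same result differently.

```java
class CapacityDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Round cap up to the nearest power of two (at least 1, at most MAXIMUM_CAPACITY).
    static int tableSizeFor(int cap) {
        int n = cap - 1;   // subtract 1 so that exact powers of two map to themselves
        n |= n >>> 1;      // copy the highest set bit into every lower position...
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;     // ...so n is now of the form 0b0...0111...1
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        for (int cap : new int[] {0, 1, 10, 16, 17, 1000}) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
    }
}
```

So new HashMap<>(10) ends up with a 16-slot table once it is first used, not a 10-slot one.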

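Summary of the HashMap structure

  • HashMap is an array plus linked lists plus red-black trees
  • The array has 16 buckets by default
  • The constructors do not allocate the array; it is allocated on first use (the first put), using the capacity passed to the constructor or the default

How the array, linked lists, and red-black trees are used and converted is explained through the put method below.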


Adding Elements to a HashMap (put)

  • Source code analysis
/**
  * Associates the specified value with the specified key in this map.
  * If the map previously contained a mapping for the key, the old
  * value is replaced.
  *
  * @param key key with which the specified value is to be associated
  * @param value value to be associated with the specified key
  * @return the previous value associated with <tt>key</tt>, or
  *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
  *         (A <tt>null</tt> return can also indicate that the map
  *         previously associated <tt>null</tt> with <tt>key</tt>.)
  */
 public V put(K key, V value) {
     return putVal(hash(key), key, value, false, true);
 }

/**
 * Implements Map.put and related methods
 *
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode.
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // if the table is null or empty, call resize() to initialize it
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // derive the bucket index from the hash: i = (n - 1) & hash
    if ((p = tab[i = (n - 1) & hash]) == null)
        // the bucket is empty: create a new node and place it directly in the array
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        // the first node in the bucket has the same hash and an equal key: remember it in e, its value is replaced below
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // otherwise, check whether this bucket already holds tree nodes
        else if (p instanceof TreeNode)
            // treeified bucket: insert using the tree algorithm
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        // not treeified: walk the linked list and append at the tail; treeify the list once its length exceeds TREEIFY_THRESHOLD
        else {
            for (int binCount = 0; ; ++binCount) {
                // reached the tail without finding the key: append a new node
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // binCount tracks how many nodes were walked; once the list exceeds TREEIFY_THRESHOLD, treeify the bin (treeifyBin)
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            // the key already exists in the HashMap: update its value (unless onlyIfAbsent forbids it)
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    // if the number of entries now exceeds threshold, grow the table with resize()
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

resize() is responsible both for initializing the table and for growing it
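
The return value of put and the effect of the onlyIfAbsent flag can be seen from the public API. In this sketch the class name PutReturnDemo and its run() helper are hypothetical; putIfAbsent is the standard Map method that, in HashMap, goes through putVal with onlyIfAbsent set to true.

```java
import java.util.HashMap;
import java.util.Map;

class PutReturnDemo {
    // Returns { first put, second put, putIfAbsent, final value } for the key "k".
    static Integer[] run() {
        Map<String, Integer> m = new HashMap<>();
        Integer r1 = m.put("k", 1);         // no previous mapping: returns null
        Integer r2 = m.put("k", 2);         // existing mapping: value replaced, old value returned
        Integer r3 = m.putIfAbsent("k", 3); // existing non-null value: left untouched, current value returned
        return new Integer[] {r1, r2, r3, m.get("k")};
    }

    public static void main(String[] args) {
        Integer[] r = run();
        System.out.println("put #1 returned " + r[0]);
        System.out.println("put #2 returned " + r[1]);
        System.out.println("putIfAbsent returned " + r[2] + ", map now holds " + r[3]);
    }
}
```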

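Summary of the put logic

  • 1. If the HashMap has not been initialized, initialize it (resize()).
  • 2. Compute the hash of the key and derive the array index from it.
  • 3. If there is no collision, place the node directly in the bucket.
    "Collision": different keys map to the same bucket index (their hash values need not be identical).
  • 4. If there is a collision, append the node to the bucket's linked list.
  • 5. If the list length exceeds the threshold and the table length has reached the minimum treeify capacity, convert the list to a red-black tree.
    Default threshold: TREEIFY_THRESHOLD=8
    Default minimum treeify capacity: MIN_TREEIFY_CAPACITY=64
    That is, a list is treeified only when its bucket holds more than 8 nodes and the table has at least 64 buckets. If a bucket holds more than 8 nodes but the table has fewer than 64 buckets, only a resize happens and the list is not treeified.
  • 6. If the key already exists, replace the old value with the new one.
  • 7. If the number of entries exceeds the threshold (capacity × load factor, 16 × 0.75 = 12 by default), resize: double the capacity and redistribute the nodes.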
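
The resize threshold in step 7 is plain arithmetic. The sketch below only reproduces that arithmetic for successive doublings; it does not inspect a real HashMap, and the class name ThresholdDemo and helper threshold are hypothetical.

```java
class ThresholdDemo {
    // threshold = capacity * load factor, truncated to an int
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f; // default load factor
        // Each resize doubles the capacity, and the threshold doubles with it.
        for (int capacity = 16; capacity <= 128; capacity <<= 1) {
            System.out.println("capacity " + capacity + " -> resize once size exceeds "
                    + threshold(capacity, loadFactor));
        }
    }
}
```

With the defaults, inserting the 13th entry pushes size past 12 and triggers the first resize to 32 buckets.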

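Getting Elements from a HashMap (get)

get applies the hash function to the key's hashCode() to locate the bucket, then compares keys within that bucket using key.equals(k) to find the matching node in the linked list (or tree), and returns that node's value, or null if no node matches.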

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    // locate the bucket from the hash: first = tab[(n - 1) & hash]
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node: compare the hash, then the key via == or equals()
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do {
                // walk the list and return the node whose key matches
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}
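
Because get returns null both for a missing key and for a key mapped to null, callers sometimes need containsKey to tell the two apart. A small sketch of this, with a hypothetical class name GetDemo and helper build():

```java
import java.util.HashMap;
import java.util.Map;

class GetDemo {
    static Map<String, String> build() {
        Map<String, String> m = new HashMap<>();
        m.put("present", null); // a key explicitly mapped to null
        m.put(null, "for-null"); // a null key is allowed; its hash is treated as 0
        return m;
    }

    public static void main(String[] args) {
        Map<String, String> m = build();
        // Both lookups return null, but for different reasons.
        System.out.println(m.get("present") + " / " + m.get("absent"));
        // containsKey distinguishes "mapped to null" from "not mapped at all".
        System.out.println(m.containsKey("present") + " / " + m.containsKey("absent"));
        System.out.println(m.get(null));
        System.out.println(m.getOrDefault("absent", "fallback"));
    }
}
```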

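Summary

  • A HashMap consists of an array, linked lists, and red-black trees
  • Each array slot is called a bucket; there are 16 buckets by default
  • On PUT, the key is hashed first and the array index is computed; a "hash collision" occurs when that bucket is already occupied by a different key
  • If there is no collision, the node is placed directly in the bucket
  • If there is a collision, the node is appended to the linked list of that bucket (a bucket has no fixed size; 8 is the treeify threshold)
  • When the list in a bucket grows beyond the threshold (8) and the table has at least 64 buckets, the list is treeified into a red-black tree
  • If the key already exists, the old value is replaced with the new one
  • If the number of entries exceeds the threshold (capacity × load factor, 16 × 0.75 = 12 by default), the table is resized: capacity doubles and the nodes are redistributed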