了解HashMap数据结构，超详细！

原創

程序员的时光

2020-10-24 17:13

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"写在前面"}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"小伙伴儿们，大家好！今天来学习HashMap相关内容，作为面试必问的知识点，来深入了解一波！"}]}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"思维导图："}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d1/d1a4c006620f1e619dfa7e6a7cfe6ce5.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}}],"text":"学习框架图"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1，HashMap集合简介"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"HashMap基于哈希表的Map接口实现，是以"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}},{"type":"strong"}],"text":"key-value"},{"type":"text","text":"存储形式存在，即主要用来存放键值对。HashMap的实现不是同步的，这意味着它不是线程安全的。它的key、value都可以为null。此外，HashMap中的映射不是有序的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"JDK1.8之前的HashMap由数组+链表组成的，数组是HashMap的主体，链表则是主要为了节解决"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}},{"type":"strong"}],"text":"哈希碰撞"},{"type":"text","text":"(两个对象调用的hashCode方法计算的哈希码值一致导致计算的数组索引值相同)而存在的（“拉链法”解决冲突）。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"JDK1.8之后在解决哈希冲突时有了较大的变化，当"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}},{"type":"strong"}],"text":"链表长度大于阈值"},{"type":"text","text":"（或者红黑树的边界值，默认为8）并且当前"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}},{"type":"strong"}],"text":"数组的长度大于64"},{"type":"text","text":"时，此时此索引位置上的所有数据改为使用红黑树存储。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"数组里面都是key-value的实例，在JDK1.8之前叫做Entry，在JDK1.8之后叫做Node。"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/1c/1c7f5d4f4839b72908faba89c2ab56d9.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"key-value实例"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"由于它的key、value都为null，所以在插入的时候会根据key的hash去计算一个index索引的值。计算索引的方法如下："}]},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"/**\n * 根据key求index的过程\n * 1,先用key求出hash值\n */\nstatic final int hash(Object key) {\n int h;\n return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);\n}\n//2,再用公式index = (n - 1) & hash（n是数组长度）\nint hash=hash(key);\nindex=(n-1)&hash;\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"这里的Hash算法本质上就是三步：取key的hashCode值、高位运算、取模运算。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"这样的话比如说put(\"A\",王炸)，插入了key为\"A\"的元素，这时候通过上述公式计算出插入的位置index，若index为3则结果如下（即hash(\"A\")=3）："}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/7c/7c21871ec7ee8c9f758992d568b6cd3f.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"那么，HashMap中的链表又是干什么用的呢？"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"大家都知道数组的长度是有限的，"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}},{"type":"strong"}],"text":"在有限的长度里面使用哈希函数计算index的值时，很有可能插入的k值不同，但所产生的hash是相同的"},{"type":"text","text":"（也叫做哈希碰撞），这也就是哈希函数存在一定的概率性。就像上面的K值为A的元素，如果再次插入一个K值为a的元素，很有可能所产生的index值也为3，也就是即hash(\"a\")=3；那这就形成了链表，这种解决哈希碰撞的方法也叫做拉链法。"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/06/06c43bf7aa4b497d858503e7c76c4ba7.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"当这个链表长度大于阈值8并且数组长度大于64则进行将链表变为红黑树。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"补充："}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"将链表转换成红黑树前会判断，"},{"type":"text","marks":[{"type":"strong"}],"text":"如果阈值大于8，但是数组长度小64，此时并不会将链表变为红黑树。而是选择进行数组扩容。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"这样做的目的是因为数组比较小，尽量避开红黑树结构，这种情况下变为红黑树结构，反而会降低效率，因为红黑树需要进行左旋，右旋，变色这些操作来保持平衡。同事数组长度小于64时，搜索时间相对快一些。所以综上所述为了提高性能和减少搜索时间，底层在阈值大于8并且数组长度大于64时，链表才转换为红黑树。具体可以参考treeifyBin方法。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"当然虽然增了红黑树作为底层数据结构，结构变得复杂了，但是阈值大于8并且数组长度大于64时，链表转换为红黑树时，效率也变得更高效。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"特点："}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"存取无序的"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"键和值位置都可以是null，但是键位置只能是一个null"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"键位置是唯一的，底层的数据结构控制键的"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"jdk1.8前数据结构是：链表 + 数组 jdk1.8之后是：链表 + 数组 + 红黑树"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"阈值(边界值) > 8 并且数组长度大于64，才将链表转换为红黑树，变为红黑树的目的是为了高效的查询。"}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"2，HsahMap底层数据结构"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.1，HashMap存储数据的过程"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/33/3372df90aec7531aa0993b6c5fb5ce3e.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"每一个Node结点都包含"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}},{"type":"strong"}],"text":"键值对的key，value"},{"type":"text","text":"还有计算出来的"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}},{"type":"strong"}],"text":"hash值"},{"type":"text","text":"，还保存着下一个"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}}],"text":" "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}},{"type":"strong"}],"text":"Node 的引用 next"},{"type":"text","text":"（如果没有下一个 Node，next = null），来看看Node的源码："}]},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"static class Node implements Map.Entry {\n final int hash;\n final K key;\n V value;\n Node next;\n ...\n }"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"HashMap存储数据需要用到put()方法，关于这些方法的详解，我们下节再说，这里简要说一下；"}]},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"public static void main(String[] args) {\n HashMap hmap=new HashMap<>();\n hmap.put(\"斑\",55);\n hmap.put(\"镜\",63);\n hmap.put(\"带土\",25);\n hmap.put(\"鼬\",9);\n hmap.put(\"佐助\",43);\n hmap.put(\"斑\",88);\n System.out.println(hmap);\n }"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"当创建HashMap集合对象的时候，在jdk1.8之前，构造方法中会创建很多"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}},{"type":"strong"}],"text":"长度是16的Entry[] table"},{"type":"text","text":"用来存储键值对数据的。在jdk1.8之后不是在HashMap的构造方法底层创建数组了，是在"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}},{"type":"strong"}],"text":"第一次调用put方法"},{"type":"text","text":"时创建的数组，"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}},{"type":"strong"}],"text":"Node[] table"},{"type":"text","text":"用来存储键值对数据的。"}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"比方说我们向哈希表中存储\"斑\"-55的数据，根据K值(\"斑\")调用String类中重写之后的hashCode()方法计算出值（数量级很大），然后结合数组长度采用取余（(n-1)&hash）操作或者其他操作方法来计算出向Node数组中存储数据的空间的索引值。如果计算出来的索引空间没有数据，则直接将\"斑\"-55数据存储到数组中。跟上面的\"A-王炸\"数据差不多。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我们回到上方的数组图，如果此时再插入\"A-蘑菇\"元素，那么首先根据Key值(\"A\")调用hashCode()方法结合数组长度计算出索引肯定也是3，此时比较后存储的\"A-蘑菇\"和已经存在的数据\"A-王炸\"的hash值是否相等，如果hash相等，此时发生hash碰撞。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"那么底层会调用\"A\"所属"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#F5222D","name":"red"}},{"type":"strong"}],"text":"类String中的equals方法"},{"type":"text","text":"比较两个key内容是否相等，若相等，则后添加的数据直接覆盖已经存在的Value，也就是\"蘑菇\"直接覆盖\"王炸\"；若不相等，继续向下和其他数据的key进行比较，如果都不相等，则规划出一个节点存储数据。"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/7d/7dce818b614372db22deffaffcba10c2.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"两个结点key值比较，是否覆盖"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.2，哈希碰撞相关的问题"}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"哈希表底层采用何种算法计算hash值？还有哪些算法可以计算出hash值？"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"底层是采用key的hashCode方法的值结合数组长度进行"},{"type":"text","marks":[{"type":"strong"}],"text":"无符号右移（>>>）"},{"type":"text","text":"、按位异或（^）、按位与（&）计算出索引的"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"还可以采用：平方取中法，取余数，伪随机数法。这三种效率都比较低。而无符号右移16位异或运算效率是最高的。"}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"当两个对象的hashCode相等时会怎么样？"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"会产生哈希碰撞，若"},{"type":"text","marks":[{"type":"strong"}],"text":"key值内容相同则替换旧的value"},{"type":"text","text":".否则连接到链表后面，"},{"type":"text","marks":[{"type":"strong"}],"text":"链表长度超过阈值8"},{"type":"text","text":"就转换为红黑树存储。"}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"何时发生哈希碰撞和什么是哈希碰撞,如何解决哈希碰撞？"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"只要"},{"type":"text","marks":[{"type":"strong"}],"text":"两个元素的key计算的哈希值相同"},{"type":"text","text":"就会发生哈希碰撞。jdk8前使用链表解决哈希碰撞。jdk8之后使用链表+红黑树解决哈希碰撞。"}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如果两个键的hashcode相同，如何存储键值对？"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"hashcode相同，通过equals比较内容是否相同。"},{"type":"text","marks":[{"type":"strong"}],"text":"相同：则新的value覆盖之前的value 不相同：则将新的键值对添加到哈希表中"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.3，红黑树结构"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"当位于一个链表中的元素较多，即hash值相等但是内容不相等的元素较多时，通过key值依次查找的效率较低。而jdk1.8中，哈希表存储采用数组+链表+红黑树实现，当链表长度(阀值)超过 8 时且当前数组的长度 > 64时，将链表转换为红黑树，这样大大减少了查找时间。jdk8在哈希表中"},{"type":"text","marks":[{"type":"strong"}],"text":"引入红黑树的原因只是为了查找效率更高。"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/2d/2d6bf6020091655bf7dade250e72cbc8.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"红黑树结构"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"JDK 1.8 以前 HashMap 的实现是数组+链表，即使哈希函数取得再好，也很难达到元素百分百均匀分布。当 HashMap 中有大量的元素都存放到同一个桶中时，这个桶下有一条长长的链表，这个时候 HashMap 就相当于一个单链表，假如单链表有 n 个元素，遍历的时间复杂度就是 O(n)，完全失去了它的优势。针对这种情况，JDK 1.8 中引入了红黑树（"},{"type":"text","marks":[{"type":"strong"}],"text":"查找时间复杂度为 O(logn)"},{"type":"text","text":"）来优化这个问题。当链表长度很小的时候，即使遍历，速度也非常快，但是当链表长度不断变长，肯定会对查询性能有一定的影响，所以才需要转成树。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2.4，存储流程图"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"HashMap存放数据是用的put方法，put 方法内部调用的是 putVal() 方法，所以对 put 方法的分析也是对 putVal 方法的分析，整个过程比较复杂，流程图如下："}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e4/e40955237c51d77cbddd5d04b53a66f6.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"来看看put()源码："}]},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"public V put(K key, V value) {\n //对key的hashCode()做hash，调用的是putVal方法\n return putVal(hash(key), key, value, false, true);\n }\n\n final V putVal(int hash, K key, V value, boolean onlyIfAbsent,\n boolean evict) {\n Node[] tab; Node p; int n, i;\n /*\n 1，tab为空则开始创建，\n 2，(tab = table) == null 表示将空的table赋值给tab,然后判断tab是否等于null，第一次肯定是null\n 3，(n = tab.length) == 0 表示没有为table分配内存\n 4，tab为空，执行代码 n = (tab = resize()).length; 进行扩容。并将初始化好的数组长度赋值给n.\n 5，执行完n = (tab = resize()).length，数组tab每个空间都是null\n */\n \n if ((tab = table) == null || (n = tab.length) == 0)\n //调用resize()方法进行扩容\n n = (tab = resize()).length;\n /*\n 1，i = (n - 1) & hash 表示计算数组的索引赋值给i，即确定元素存放在哪个桶中\n 2，p = tab[i = (n - 1) & hash]表示获取计算出的位置的数据赋值给节点p\n 3，(p = tab[i = (n - 1) & hash]) == null 判断节点位置是否等于null，\n 如果为null，则执行代码：tab[i] = newNode(hash, key, value, null);根据键值对创建新的节点放入该位置的桶中\n 小结：如果当前桶没有哈希碰撞冲突，则直接把键值对插入空间位置\n */ \n if ((p = tab[i = (n - 1) & hash]) == null)\n //节点位置为null，则直接进行插入操作\n tab[i] = newNode(hash, key, value, null);\n //节点位置不为null，表示这个位置已经有值了，于是需要进行比较hash值是否相等\n else {\n Node e; K k;\n /*\n 比较桶中第一个元素(数组中的结点)的hash值和key是否相等\n 1，p.hash == hash 中的p.hash表示原来存在数据的hash值 hash表示后添加数据的hash值比较两个hash值是否相等\n 2，(k = p.key) == key ：p.key获取原来数据的key赋值给k key表示后添加数据的key 比较两个key的地址值是否相等\n 3，key != null && key.equals(k)：能够执行到这里说明两个key的地址值不相等，那么先判断后添加的key是否等于null，如果不等于null再调用equals方法判断两个key的内容是否相等\n */\n if (p.hash == hash &&\n ((k = p.key) == key || (key != null && key.equals(k))))\n /*\n 说明：两个元素哈希值相等（哈希碰撞），并且key的值也相等\n 将旧的元素整体对象赋值给e，用e来记录\n */ \n e = p;\n // hash值不相等或者key不相等；判断p是否为红黑树结点\n else if (p instanceof TreeNode)\n // 是红黑树，调用树的插入方法\n e = ((TreeNode)p).putTreeVal(this, tab, hash, key, value);\n // 说明是链表节点，这时进行插入操作\n else {\n /*\n 1，如果是链表的话需要遍历到最后节点然后插入\n 2，采用循环遍历的方式，判断链表中是否有重复的key\n */\n for (int binCount = 0; ; ++binCount) {\n /*\n 1)e = p.next 获取p的下一个元素赋值给e\n 2)(e = p.next) == null 判断p.next是否等于null，等于null，说明p没有下一个元素，那么此时到达了链表的尾部，还没有找到重复的key,则说明HashMap没有包含该键\n 将该键值对插入链表中\n */\n if ((e = p.next) == null) {\n p.next = newNode(hash, key, value, null);\n //插入后发现链表长度大于8，转换成红黑树结构\n if (binCount >= TREEIFY_THRESHOLD - 1) \n //转换为红黑树\n treeifyBin(tab, hash);\n break;\n }\n //key值以及存在直接覆盖value\n if (e.hash == hash &&\n ((k = e.key) == key || (key != null && key.equals(k))))\n break;\n p = e;\n }\n }\n //若结点为null，则不进行插入操作\n if (e != null) { \n V oldValue = e.value;\n if (!onlyIfAbsent || oldValue == null)\n e.value = value;\n afterNodeAccess(e);\n return oldValue;\n }\n }\n //修改记录次数\n ++modCount;\n // 判断实际大小是否大于threshold阈值，如果超过则扩容\n if (++size > threshold)\n resize();\n // 插入后回调\n afterNodeInsertion(evict);\n return null;\n }"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"小结："}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"根据哈希表中元素个数确定是"},{"type":"text","marks":[{"type":"strong"}],"text":"扩容还是树形化"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"如果是树形化遍历桶中的元素，创建相同个数的树形节点，复制内容，建立起联系"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"然后让桶中的第一个元素指向新创建的树根节点，替换桶的链表内容为树形化内容"}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"3，HashMap的扩容机制"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我们知道，数组的容量是有限的，多次插入数据的话，到达一定数量就会进行扩容；先来看两个问题"}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"什么时候需要扩容？"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"当HashMap中的"},{"type":"text","marks":[{"type":"strong"}],"text":"元素个数超过数组长度loadFactor(负载因子)"},{"type":"text","text":"时，就会进行数组扩容，loadFactor的默认值是0.75,这是一个折中的取值。也就是说，默认情况下，数组大小为16，那么当HashMap中的元素个数超过16×0.75=12(这个值就是阈值)的时候，就把数组的大小扩展为2×16=32，即扩大一倍，然后"},{"type":"text","marks":[{"type":"strong"}],"text":"重新计算每个元素在数组中的位置"},{"type":"text","text":"，而这是一个非常耗性能的操作，所以如果我们已经预知HashMap中元素的个数，那么预知元素的个数能够有效的提高HashMap的性能。"}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"怎么进行扩容的？"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"HashMap在进行扩容时使用 resize() 方法，计算 table 数组的新容量和 Node 在新数组中的新位置，将旧数组中的值复制到新数组中，从而实现自动扩容。因为每次扩容都是翻倍，与原来计算的 (n-1)&hash的结果相比，只是多了一个bit位，所以节点要么就在原来的位置，要么就被分配到\""},{"type":"text","marks":[{"type":"strong"}],"text":"原位置+旧容量"},{"type":"text","text":"\"这个位置。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因此，我们在扩充HashMap的时候，不需要重新计算hash，只需要看看原来hash值新增的那个bit是1还是0就可以了，是0的话索引没变，是1的话索引变成“原索引+oldCap("},{"type":"text","marks":[{"type":"strong"}],"text":"原位置+旧容量"},{"type":"text","text":")”。这里不再详细赘述，可以看看下图为16扩充为32的resize示意图："}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/fc/fc1b19df3ec331e181c92fa047b08dff.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"hashmap扩容"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"4，HashMap数组长度为什么是2的次幂"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我们先看看它的成员变量："}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"序列化版本号"}]},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"private static final long serialVersionUID = 362498820763181265L;"}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"集合的初始化容量initCapacity"}]},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"//默认的初始容量是16 -- 1<<4相当于1*2的4次方---1*16\nstatic final int DEFAULT_INITIAL_CAPACITY = 1 << 4; "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"初始化容量默认是16，容量过大，遍历时会减慢速度，效率低；容量过小，那么扩容的次数变多，非常耗费性能。"}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"负载因子"}]},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"/**\n * The load factor used when none specified in constructor.\n */\n static final float DEFAULT_LOAD_FACTOR = 0.75f;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"初始默认值为0.75，若过大，会导致哈希冲突的可能性更大；若过小，扩容的次数也会提高。"}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"为什么必须是2的n次幂？"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"当向HashMap中添加一个元素的时候，需要根据key的hash值，去确定其在数组中的具体位置。HashMap为了提高存取效率，要尽量较少碰撞，就是要尽量把数据分配均匀，每个链表长度大致相同，这个实现就在把数据存到哪个链表中的算法。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"这个算法实际就是取模，hash%length，计算机中直接求余效率不如位移运算。所以源码中做了优化,使用 "},{"type":"codeinline","content":[{"type":"text","text":"hash&(length-1)"}]},{"type":"text","text":"，而实际上"},{"type":"codeinline","content":[{"type":"text","text":"hash%length"}]},{"type":"text","text":"等于"},{"type":"codeinline","content":[{"type":"text","text":"hash&(length-1)"}]},{"type":"text","text":"的前提是length是2的n次幂。"}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"如果输入值不是2的幂会怎么样？"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如果数组长度不是2的n次幂，计算出的索引特别容易相同，及其容易发生hash碰撞，导致其余数组空间很大程度上并没有存储数据，链表或者红黑树过长，效率降低。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"小结："}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1，当根据key的hash确定其在数组的位置时，如果n为2的幂次方"},{"type":"text","marks":[{"type":"strong"}],"text":"，可以保证数据的均匀插入"},{"type":"text","text":"，如果n不是2的幂次方，可能数组的一些位置永远不会插入数据，"},{"type":"text","marks":[{"type":"strong"}],"text":"浪费数组的空间，加大hash冲突。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2，一般可能会想通过 % 求余来确定位置，这样也可以，只不过性能不如 & 运算。而且当n是2的幂次方时：hash & (length - 1) == hash % length"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"3，因此，HashMap 容量为2次幂的原因，就是"},{"type":"text","marks":[{"type":"strong"}],"text":"为了数据的的均匀分布，减少hash冲突，"},{"type":"text","text":"毕竟hash冲突越大，代表数组中一个链的长度越大，这样的话会降低hashmap的性能"}]},{"type":"horizontalrule"},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"微信搜索公众号《"},{"type":"text","marks":[{"type":"size","attrs":{"size":12}},{"type":"color","attrs":{"color":"#F5222D","name":"red"}}],"text":"程序员的时光"},{"type":"text","text":"》"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#000000","name":"black"}}],"text":"好了，今天就先分享到这里了，下期继续给大家带来HashMap面试内容！"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#000000","name":"black"}}],"text":"更多干货、优质文章，欢迎关注我的原创技术公众号~"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}

發表評論

所有評論

還沒有人評論，想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.

相關文章

欧洲英国德国法国TikTok与YouTube海外网红达人的完美合作策略

【本篇由言同數字科技有限公司原創】在當今數字營銷時代，TikTok已成爲一種受歡迎的社交媒體平臺，尤其在年輕人中頗具影響力。而其中的直播帶貨更是吸引了衆多品牌的注意，成爲推廣產品和增加銷售的重要途徑。下面言同數字將針對海外TikTok網紅直

2024-05-03 22:36:01

ollama使用

ollama 僅支持。gguf的格式其他格式需要llama.cpp 轉換 curl https://ollama.ai/install.sh | sh ollama --version ollama pull llama2-chin

2024-05-01 00:42:55

「Qt Widget中文示例指南」如何实现一个快捷编辑器（一）

Qt 是目前最先進、最完整的跨平臺C++開發工具。它不僅完全實現了一次編寫，所有平臺無差別運行，更提供了幾乎所有開發過程中需要用到的工具。如今，Qt已被運用於超過70個行業、數千家企業，支持數百萬設備及應用。快捷編輯器示例展示瞭如何創建一

2024-04-30 23:36:29

解锁HDC 2024之旅：从购票到报名，全程攻略

本文分享自華爲雲社區《解鎖HDC 2024之旅：從購票到報名，全程攻略》，作者：華爲雲社區精選。 Hi，代碼界的小夥伴們，集結號已經吹響了！華爲開發者大會（HDC 2024）——這場匯聚了HarmonyOS NEXT鴻蒙星河版、盤古大模型5

2024-04-30 22:34:35

银行核心背后的落地工程体系丨Oracle - TiDB 数据迁移详解

本文作者：張顯華，孟凡輝，莊培培系列導讀：徐戟（白鱔）數據庫技術專家，Oracle ACE，PostgreSQL ACE Director 當前，國內大量的關鍵行業的核心繫統正在實現國產化替代，而與此同時，這些行業的數字化轉型也正在進入

2024-04-30 22:24:59

30 秒出服装设计稿，森马用函数计算+AIGC 整“新活”!

創新項目如何去賦能我們的業務，這件事情在森馬很重要。阿里雲函數計算幫我們屏蔽掉了想把AI落地到實際業務場景中 GPU 算力資源儲備、採購成本、技術門檻等很多難題，從而迅速做出決策，快人一步站在正確的起點，體驗新技術對整個服裝爆款設計、營銷

2024-04-30 21:12:14

消金公司2023财报解析：息差维持高位，信用成本攀升

來源 | 鐳射財經（leishecaijing） 2023年，是持牌消金行業承上啓下的關鍵一年，也是鍛造韌性、比拼內功最緊張的一年。一方面，住戶短期消費貸款餘額在2022年觸底後，伴隨經濟復甦、消費提振，於2023年重新回到上行軌道。短

2024-04-30 13:11:32

Linux下制作Nginx绿色免安装包

前言 linux下安裝nginx比較繁瑣，遇到內網部署環境更是麻煩，所以研究了下nginx綠色免安裝版的部署包製作，開箱即用，特此記錄分享，一下操作在centos8環境下安裝，如果需要其他內核系統的安裝（Debian/Ubuntu等），請在

2024-04-29 21:38:23

数字化转型新篇章：企业通往智能化的新范式

早在十多年前，一些具有前瞻視野的企業以實現“數字化”爲目標啓動轉型實踐。但時至今日，可以說尚無幾家企業能夠在真正意義上實現“數字化”。在實現“數字化”的征途上，人們發現，努力愈進，彷彿終點愈遠。究其原因，還在於轉型一直落後於技術邊界的拓展

2024-04-29 21:22:20

MindSpore强化学习：使用PPO配合环境HalfCheetah-v2进行训练

本文分享自華爲雲社區《MindSpore強化學習：使用PPO配合環境HalfCheetah-v2進行訓練》，作者： irrational。半獵豹（Half Cheetah）是一個基於MuJoCo的強化學習環境，由P. Wawrzyński

2024-04-29 10:33:13

巧用 TiCDC Syncpiont 构建银行实时交易和准实时计算一体化架构

本文闡述了某商業銀行如何利用 TiCDC Syncpoint 功能，在 TiDB 平臺上構建一個既能處理實時交易又能進行準實時計算的一體化架構，用以優化其零售資格業務系統的實踐。通過遷移到 TiDB 並巧妙應用 Syncpoint，該銀行成

2024-04-30 22:24:58

Apache DolphinScheduler支持Flink吗？

隨着大數據技術的快速發展，很多企業開始將Flink引入到生產環境中，以滿足日益複雜的數據處理需求。而作爲一款企業級的數據調度平臺，Apache DolphinScheduler也跟上了時代步伐，推出了對Flink任務類型的支持。 Flink

2024-04-30 11:49:27

从原始边列表到邻接矩阵Python实现图数据处理的完整指南

本文分享自華爲雲社區《從原始邊列表到鄰接矩陣Python實現圖數據處理的完整指南》，作者：檸檬味擁抱。在圖論和網絡分析中，圖是一種非常重要的數據結構，它由節點（或頂點）和連接這些節點的邊組成。在Python中，我們可以使用鄰接矩陣來表示

2024-04-30 10:34:05

如何通过前后端交互的方式制作Excel报表

前言 Excel擁有在辦公領域最廣泛的受衆羣體，以其強大的數據處理和可視化功能，成了無可替代的工具。它不僅可以呈現數據清晰明瞭，還能進行數據分析、圖表製作和數據透視等操作，爲用戶提供了全面的數據展示和分析能力。今天小編就爲大家介紹一下，如

2024-04-30 10:24:12

Python爬虫技术与数据可视化：Numpy、pandas、Matplotlib的黄金组合

前言在當今信息爆炸的時代，數據已成爲企業決策和發展的關鍵。而互聯網作爲信息的主要來源，網頁中蘊含着大量的數據等待被挖掘。Python爬蟲技術和數據可視化工具的結合，爲我們提供了一個強大的工具箱，可以幫助我們從網絡中抓取數據，並將其可視

2024-04-29 23:26:28

24小時熱門文章

最新文章

最新評論文章