Hash Algorithms and a Source-Code Analysis of Java's HashMap

Original

2020-02-22 16:00

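A hash algorithm, also called a hash function, maps input of any size to a short, fixed-size digest. A well-designed hash function produces few collisions, where a collision means that two different inputs yield the same digest.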
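To make the idea of a collision concrete, here is a small sketch using String.hashCode(). The class name CollisionDemo and the helper collides() are made up for illustration; the only library call is String.hashCode(), whose formula gives 65*31 + 97 for "Aa" and 66*31 + 66 for "BB", both 2112.

```java
public class CollisionDemo {
    // Hypothetical helper: true when two different strings share a hash code.
    public static boolean collides(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());      // 2112
        System.out.println("BB".hashCode());      // 2112
        System.out.println(collides("Aa", "BB")); // true
    }
}
```

Collisions like this are unavoidable in general, because the set of possible inputs is larger than the set of int values; a hash table only needs them to be rare and cheap to handle.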

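Here I walk through the HashMap implementation in the Java source code. The excerpts come from an older JDK; newer versions implement hash() and put() differently.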

static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
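The following sketch shows why this extra mixing step matters. It assumes a table of 16 slots indexed by the low 4 bits; the class name SpreadDemo and the sample values are chosen for illustration only. Hash codes that differ only in their high bits all land in bucket 0 when used directly, but are spread out after mixing.

```java
public class SpreadDemo {
    // Same bit-mixing steps as the hash(int) method quoted above.
    public static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    public static void main(String[] args) {
        // These values differ only in bits 16 and 17.
        int[] codes = {0x10000, 0x20000, 0x30000};
        for (int code : codes) {
            int raw = code & 15;         // bucket without mixing: always 0
            int mixed = hash(code) & 15; // bucket after mixing: 1, 2, 3
            System.out.printf("0x%05x raw=%d mixed=%d%n", code, raw, mixed);
        }
    }
}
```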

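This lets HashMap hash objects of any type. The parameter is an int because every Java object inherits hashCode() from java.lang.Object, and that method's return value is what gets passed in. The declaration of hashCode() is interesting in its own right.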

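public native int hashCode(); The native keyword means the method is implemented in native code (C/C++) rather than in Java. I therefore expected to find code that loads a native library, such as a DLL on Windows, but Object contains no such call. Here is what the class does contain: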

private static native void registerNatives();
static {
    registerNatives();
}

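At this point I was thoroughly lost. As far as I can tell, registerNatives() binds the class's native methods to implementations that live inside the JVM itself, which is why no separate library is loaded here. The only conclusion I can draw is this: HashMap first calls hashCode() on the key to get an int, then passes that int through hash() to get the final hash value. Now let's look at HashMap's put method.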

public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

    modCount++;
    addEntry(hash, key, value, i);
    return null;
}

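put first calls hash(key.hashCode()) on the key and uses the result to find a slot in table. table is the array that holds all the entries; its elements are Entry objects, each of which acts like a node in a linked list. See the code for the details.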

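Next, put looks at the slot for that index. If the slot is not empty, it walks the chain of entries there: when it finds an entry with the same hash and an equal key, it overwrites the value and returns the old one. Otherwise addEntry creates a new Entry and links it into the chain, so the slot at that index holds a linked list. This is one way of handling collisions, known as separate chaining. If the slot is empty, there is no collision and the new entry simply becomes its first element.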
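The chaining behaviour described above can be sketched in a few lines. This is a simplified illustration, not the JDK code: the class TinyChainedMap and its Node type are hypothetical, the table has a fixed 16 slots and never resizes, null keys are not supported, and the extra mixing step from hash() is left out.

```java
public class TinyChainedMap<K, V> {
    // One node in a bucket's singly linked list.
    private static final class Node<K, V> {
        final K key;
        V value;
        final Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    private int indexOf(Object key) {
        // The length is a power of two, so the mask keeps the low 4 bits.
        return key.hashCode() & (table.length - 1);
    }

    // Returns the previous value for the key, or null if there was none.
    public V put(K key, V value) {
        int i = indexOf(key);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                V old = n.value;
                n.value = value; // key already present: overwrite
                return old;
            }
        }
        // Key not found: link a new node in at the head of the chain.
        table[i] = new Node<>(key, value, table[i]);
        return null;
    }

    public V get(Object key) {
        for (Node<K, V> n = table[indexOf(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyChainedMap<String, Integer> map = new TinyChainedMap<>();
        map.put("Aa", 1);
        map.put("BB", 2); // same hashCode as "Aa", so the same bucket
        System.out.println(map.get("Aa"));    // 1
        System.out.println(map.get("BB"));    // 2
        System.out.println(map.put("Aa", 3)); // 1 (the old value)
    }
}
```

"Aa" and "BB" share a hash code, so both end up in one bucket, yet each lookup still finds the right value because the chain is searched with equals().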

One thing to note: the hash value can be very large, so how does it get mapped onto such a small array? The indexFor() method gives the answer:

static int indexFor(int h, int length) {
    return h & (length-1);
}

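After the bitwise AND, any value, however large, falls within the bounds of the array. This works because the table length in HashMap is always a power of two, so length-1 is a mask of low-order 1 bits and the result is always between 0 and length-1.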
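A quick sketch of the masking trick, assuming a table length of 16. The class name MaskDemo and the sample hash values are illustrative. For a power-of-two length the mask gives the same result as Math.floorMod, including for negative hash values.

```java
public class MaskDemo {
    // Same expression as the indexFor method quoted above.
    public static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // must be a power of two for the mask to work
        int[] hashes = {5, 21, 1_000_003, -1, Integer.MIN_VALUE};
        for (int h : hashes) {
            System.out.println(h + " -> " + indexFor(h, length)
                    + " (floorMod gives " + Math.floorMod(h, length) + ")");
        }
        // With length 12 the mask would be 11 (binary 1011): bit 2 is always
        // cleared, so indexes 4 to 7 could never be produced.
    }
}
```

The mask is cheaper than a modulo operation and never yields a negative index, which a plain h % length would for negative h.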

That's all I'll note down about hashing in Java.

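The translated article linked below is helpful for understanding hashing and covers a variety of hash algorithms. I've excerpted the following code from it for reference.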

http://blog.csdn.net/eaglex/article/details/6310727

package com.bplead.hash;

public class Hash {

    // RS hash (attributed to Robert Sedgewick).
    public static long RSHash(String str) {
        int b = 378551;
        int a = 63689;
        long hash = 0;
        for (int i = 0; i < str.length(); i++) {
            hash = hash * a + str.charAt(i);
            a = a + b;
        }
        return hash;
    }

    // JS hash (attributed to Justin Sobel).
    public static long JSHash(String str) {
        long hash = 1315423911;
        for (int i = 0; i < str.length(); i++) {
            hash ^= ((hash << 5) + str.charAt(i) + (hash >> 2));
        }
        return hash;
    }

    // SDBM hash, named after the sdbm database library.
    public static long SDBMHash(String str) {
        long hash = 0;
        for (int i = 0; i < str.length(); i++) {
            hash = str.charAt(i) + (hash << 6) + (hash << 16) - hash;
        }
        return hash;
    }

    // DJB hash (Daniel J. Bernstein): hash * 33 + c.
    public static long DJBHash(String str) {
        long hash = 5381;
        for (int i = 0; i < str.length(); i++) {
            hash = ((hash << 5) + hash) + str.charAt(i);
        }
        return hash;
    }

    // DEK hash (Donald E. Knuth).
    public static long DEKHash(String str) {
        long hash = str.length();
        for (int i = 0; i < str.length(); i++) {
            hash = ((hash << 5) ^ (hash >> 27)) ^ str.charAt(i);
        }
        return hash;
    }

    public static int hash(int h) {
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    public static void main(String[] args) {
        String temp = "HelloWorld12";
        // Mixed hash value, as HashMap computes it for this key.
        int h = hash(temp.hashCode());
        System.out.println(h);
        // Bucket index in a table of 16 slots (mask = 16 - 1).
        System.out.println(h & 15);
    }

}

akavyi
