The Companion Piece to equals: How to Implement an Efficient hashCode Method

1. When to implement hashCode

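The previous article covered how to implement the equals method. Like equals, hashCode is a method of the base class Object. So when should hashCode be overridden? Most of us probably already know the answer: whenever instances of our class are used with the collections framework, in particular hash-based collections such as HashSet and HashMap, we almost always have to implement both equals and hashCode, never just one of the two. Why? Consider an example: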

import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class Goods {

	public int id;
	public String goodsName;
	
	@Override
	public boolean equals(Object obj) {
		if(this == obj) 
			return true;
		if(!(obj instanceof Goods)) 
			return false;
		Goods target = (Goods)obj;
		if(this.id != target.id)
			return false;
		// Null-safe comparison: two null names are treated as equal
		return Objects.equals(this.goodsName, target.goodsName);
	}
	
	// Note: hashCode is deliberately NOT overridden in this example
	
	public static void main(String[] args) {
		Set<Goods> goodsSet = new HashSet<Goods>();
		Goods goods = null;
		for(int i=0;i<2;i++) {
			goods = new Goods();
			goods.id=1;
			goods.goodsName="Coca-Cola";
			// contains() looks up the bucket chosen by hashCode before it calls equals
			if(!goodsSet.contains(goods)) {
				goodsSet.add(goods);
			}
		}
		System.out.println(goodsSet);
	}
}

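The Goods class above implements equals but not hashCode. We construct two Goods objects, and our business logic expects goods with the same id and goodsName to count as one product, not two. Yet when the code above adds these "identical" goods to a set one after the other, both end up in the set. The reason is that the inherited Object.hashCode typically returns different values for distinct instances, so HashSet looks in a different bucket and never gets as far as calling equals. From a business point of view this is unacceptable.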
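To confirm that the missing hashCode is the cause, here is a minimal sketch of the same class with hashCode added. The class name GoodsWithHash, the constructor, and the helper distinctCount are made up for this illustration, and Objects.hash is used only as a quick stand-in; a hand-written version is discussed in section 3.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical variant of Goods that overrides hashCode as well as equals.
class GoodsWithHash {
    final int id;
    final String goodsName;

    GoodsWithHash(int id, String goodsName) {
        this.id = id;
        this.goodsName = goodsName;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof GoodsWithHash)) return false;
        GoodsWithHash target = (GoodsWithHash) obj;
        return id == target.id && Objects.equals(goodsName, target.goodsName);
    }

    @Override
    public int hashCode() {
        // Built from exactly the fields that equals compares.
        return Objects.hash(id, goodsName);
    }

    // Adds two equal goods and returns how many elements the set ends up holding.
    static int distinctCount() {
        Set<GoodsWithHash> set = new HashSet<>();
        set.add(new GoodsWithHash(1, "Coca-Cola"));
        set.add(new GoodsWithHash(1, "Coca-Cola"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount()); // prints 1
    }
}
```

With both methods in place the second add is recognized as a duplicate, so the set keeps a single element.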

2. The contract Object defines for hashCode

The Javadoc on the hashCode method of the base class Object spells out the following rules:

 * <ul>
 * <li>Whenever it is invoked on the same object more than once during
 *     an execution of a Java application, the {@code hashCode} method
 *     must consistently return the same integer, provided no information
 *     used in {@code equals} comparisons on the object is modified.
 *     This integer need not remain consistent from one execution of an
 *     application to another execution of the same application.
 * <li>If two objects are equal according to the {@code equals(Object)}
 *     method, then calling the {@code hashCode} method on each of
 *     the two objects must produce the same integer result.
 * <li>It is <em>not</em> required that if two objects are unequal
 *     according to the {@link java.lang.Object#equals(java.lang.Object)}
 *     method, then calling the {@code hashCode} method on each of the
 *     two objects must produce distinct integer results.  However, the
 *     programmer should be aware that producing distinct integer results
 *     for unequal objects may improve the performance of hash tables.
 * </ul>

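In short, the passage says:

As long as the information used by equals has not been modified, repeated calls to hashCode on the same object must return the same value. The value does not have to stay the same from one run of the application to the next;
If two objects are equal according to equals, calling hashCode on each of them must return the same value;
If two objects are unequal according to equals, hashCode is not required to return different values. Still, unequal objects should produce different hash codes wherever possible, because that improves hash table performance.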
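The second and third rules can be observed with java.lang.String, whose hash function is fixed by its Javadoc. The sketch below is only an illustration; the class name HashContractDemo and its helper methods are made up for it.

```java
class HashContractDemo {
    // Rule 2: whenever equals says "equal", the hash codes must match.
    static boolean equalImpliesSameHash(Object a, Object b) {
        return !a.equals(b) || a.hashCode() == b.hashCode();
    }

    // Rule 3 permits this: unequal objects that happen to share a hash code.
    static boolean unequalButColliding(Object a, Object b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String first = new String("hello");
        String second = new String("hello"); // a distinct but equal instance
        System.out.println(equalImpliesSameHash(first, second)); // true

        // "Aa" is 65*31 + 97 and "BB" is 66*31 + 66; both come to 2112.
        System.out.println(unequalButColliding("Aa", "BB")); // true
    }
}
```

A collision like the one between "Aa" and "BB" is legal. It only costs performance, because both keys land in the same bucket and equals has to tell them apart.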

3. How to write an efficient hashCode

With these rules in mind, let us look at how to implement hashCode both correctly and efficiently. We know hashCode returns an int, so consider the following override first.

	@Override
	public int hashCode() {
		return 11; // Poor hash code: every object lands in the same bucket, so the hash table degenerates into a linked list
	}

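This implementation does satisfy the three rules defined by Object. The trouble is that every object returns the same hash code, so all entries collide in a single bucket and the hash table effectively degenerates into a linked list, where a lookup has to walk the entries one by one instead of jumping straight to the right bucket. Such a careless hashCode is therefore a bad idea; the goal should be to make different objects produce different hash codes as far as possible.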
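One way to make that cost visible without relying on timing is to count how often equals has to run while filling a HashSet. The following is a sketch under stated assumptions: the class CollisionCostDemo, the nested Key class, and the counter are all hypothetical and exist only for this measurement.

```java
import java.util.HashSet;
import java.util.Set;

class CollisionCostDemo {
    static int equalsCalls = 0; // how many times equals was invoked

    // Hypothetical key whose hash code is either the constant 11 or derived from its value.
    static final class Key {
        final int value;
        final boolean constantHash;

        Key(int value, boolean constantHash) {
            this.value = value;
            this.constantHash = constantHash;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return other.value == value && other.constantHash == constantHash;
        }

        @Override
        public int hashCode() {
            return constantHash ? 11 : Integer.hashCode(value);
        }
    }

    // Inserts n distinct keys and returns the number of equals calls that were needed.
    static int equalsCallsForInserts(int n, boolean constantHash) {
        equalsCalls = 0;
        Set<Key> set = new HashSet<>();
        for (int i = 0; i < n; i++) {
            set.add(new Key(i, constantHash));
        }
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("constant hash: " + equalsCallsForInserts(1000, true));
        System.out.println("spread hash:   " + equalsCallsForInserts(1000, false));
    }
}
```

With the constant hash, every insert has to be compared against keys already sitting in the same bucket. With distinct hash codes the set can tell the keys apart from the hash value alone, so equals is needed far less often.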

So how do we quickly write a hashCode that is both reasonable and efficient? It is simple; follow these steps:

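Declare an int variable named result and initialize it to the hash code of the first significant field of the object. Then compute the hash code c of the second significant field and fold it into result with the formula result = result * 31 + c. Repeat this for every remaining significant field;
If a significant field f is a primitive, its hash code is Type.hashCode(f), where Type is the corresponding wrapper class such as Integer or Double;
If a significant field is an object reference, its hash code is the value returned by calling hashCode on that reference. If the field is null, use 0 or some other constant;
If a significant field is an array, iterate over all of its elements, compute the hash code of each one, and fold each into result with the same formula result = result * 31 + c;
Return result.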

Here is the code:

import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class Goods {

	public int id;	
	public String goodsName;	
	List<Integer> refGoods = new ArrayList<>(); // ids of related goods
	
	@Override
	public boolean equals(Object obj) {
		if(this == obj) 
			return true;
		if(!(obj instanceof Goods)) 
			return false;		

		Goods target = (Goods)obj;
		if(this.id != target.id)
			return false;
		// Null-safe comparison: two null names are treated as equal
		if(!Objects.equals(this.goodsName, target.goodsName))
			return false;
		
		// List.equals checks size, order and elements, which keeps equals symmetric
		// and consistent with the order-sensitive accumulation in hashCode below
		return this.refGoods.equals(target.refGoods);
	}
	
	@Override
	public int hashCode() {
		// Start from the first significant field
		int result = Integer.hashCode(this.id);
		
		// Fold in each further field with result = 31 * result + c
		result = 31 * result + (this.goodsName == null?0:this.goodsName.hashCode());
		for(Integer i : this.refGoods) {
			result = 31 * result + i.hashCode();
		}
		
		return result;
	}
}
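The class above stores its related ids in a List, so the array step from the recipe is not exercised. As a hedged illustration of that step, the sketch below folds an int array by hand and compares the result with the JDK's Arrays.hashCode, which is documented to use the same 31-based accumulation starting from 1. The class name ArrayHashDemo and the sample array are assumptions made for this example.

```java
import java.util.Arrays;

class ArrayHashDemo {
    // Manual accumulation over an array field using result = 31 * result + c.
    static int manualHash(int[] values) {
        int result = 1; // same starting value that Arrays.hashCode uses
        for (int v : values) {
            result = 31 * result + Integer.hashCode(v);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] refGoodsIds = {3, 7, 42}; // hypothetical related-goods ids
        System.out.println(manualHash(refGoodsIds) == Arrays.hashCode(refGoodsIds)); // true
    }
}
```

In practice, calling Arrays.hashCode on an array field is usually the shorter way to apply this step; the manual loop is shown only to make the accumulation explicit.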

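Why does the hash code computation use the number 31? I think it is largely a convention, but there is also some reasoning behind it. Two factors are involved: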

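31 is an odd prime that is neither too large nor too small. With an even multiplier, information is lost once the multiplication overflows, because multiplying by 2 is equivalent to a left shift and the high-order bits simply fall off. If the odd prime is very small, for example 3, the hash codes computed for short inputs stay clustered in a narrow range, which leads to more collisions. If it is much larger, intermediate results overflow the int range sooner; in Java that overflow wraps around silently rather than failing. An odd prime of moderate size such as 31 is therefore a reasonable choice;
The multiplication is cheap. 31 * n equals (n << 5) - n, so it can be replaced by a shift and a subtraction, and a JVM may apply this optimization automatically. How much this gains depends on the hardware, and on modern processors the difference is likely to be small.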
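Both points can be checked with a few lines of code. The sketch below is an illustration only: the class MultiplierDemo and its helpers fold and shiftIdentityHolds are invented for it, and 32 stands in for "an even multiplier".

```java
class MultiplierDemo {
    // Folds the values with result = multiplier * result + v, starting from 0.
    static int fold(int multiplier, int[] values) {
        int result = 0;
        for (int v : values) {
            result = multiplier * result + v;
        }
        return result;
    }

    // True when 31 * n equals (n << 5) - n; int overflow wraps the same way on both sides.
    static boolean shiftIdentityHolds(int n) {
        return 31 * n == (n << 5) - n;
    }

    public static void main(String[] args) {
        System.out.println(shiftIdentityHolds(12345));             // true
        System.out.println(shiftIdentityHolds(Integer.MAX_VALUE)); // true

        // Two 8-element inputs that differ only in their first element.
        int[] a = {1, 0, 0, 0, 0, 0, 0, 0};
        int[] b = {2, 0, 0, 0, 0, 0, 0, 0};

        // Multiplying by 32 seven times shifts the first element left by 35 bits,
        // which pushes it out of the 32-bit int entirely.
        System.out.println(fold(32, a) == fold(32, b)); // true: the difference is lost
        System.out.println(fold(31, a) == fold(31, b)); // false: 31 is odd, nothing is shifted out
    }
}
```

The even multiplier silently discards the earliest fields once enough values have been folded in, while the odd multiplier keeps every field's contribution even after overflow.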

4. Speeding up hashCode for immutable classes

If a class is immutable and its hash code is expensive to compute, cache the hash code inside the object so that it is not recomputed on every call.

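// Fragment: address and goodsList are assumed to be final fields of the enclosing immutable class
private int hashCode; // cached value; 0 means "not computed yet"

@Override
public int hashCode(){
    int result = hashCode;
    if(result == 0){
        result = this.address.hashCode();
        result = result*31 + Objects.hashCode(this.goodsList); // java.util.Objects, null-safe
        hashCode = result; // store the value so later calls skip the computation
    }
    return result;
}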
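For context, here is a self-contained sketch of a complete immutable class built around that caching idea. Everything in it is an assumption made for illustration: the class name Order, its two fields, and the computations counter, which exists only to show that the full calculation runs once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical immutable class with a lazily cached hash code.
final class Order {
    static int computations = 0;   // demo-only counter of full hash computations

    private final String address;
    private final List<Integer> goodsList;
    private int hashCode;          // 0 means "not computed yet"

    Order(String address, List<Integer> goodsList) {
        this.address = address;
        // Defensive copy: later changes to the caller's list cannot make the cache stale.
        this.goodsList = Collections.unmodifiableList(new ArrayList<>(goodsList));
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Order)) return false;
        Order other = (Order) obj;
        return Objects.equals(address, other.address) && goodsList.equals(other.goodsList);
    }

    @Override
    public int hashCode() {
        int result = hashCode;
        if (result == 0) {
            computations++;
            result = Objects.hashCode(address);
            result = 31 * result + goodsList.hashCode();
            hashCode = result; // cache for subsequent calls
        }
        return result;
    }

    public static void main(String[] args) {
        Order order = new Order("1 Main Street", Arrays.asList(10, 20, 30));
        System.out.println(order.hashCode() == order.hashCode()); // true
        System.out.println("full computations: " + computations);
    }
}
```

Two caveats apply to this sketch. An object whose real hash code happens to be 0 is recomputed on every call, which is harmless but gains nothing. The cache field is also written without synchronization; the worst case is that two threads compute the same value twice, which is the same trade-off java.lang.String makes for its own cached hash.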

5. Summary

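To sum up, whenever we override equals we must also override hashCode; otherwise hash-based collections such as HashSet and HashMap will not behave correctly. hashCode also has to follow the three general rules defined by the Object class, and it should spread unequal objects across different hash codes, so that the program runs both correctly and efficiently.
equals and hashCode are like two inseparable sisters; the companion piece to this article is "How to Implement the equals Method".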
