Understanding the Java Virtual Machine in Depth (14): Making Good Use of JVM Method Inlining

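In IntelliJ IDEA, Ctrl+Alt+M extracts a method: select a piece of code, hit the shortcut, and you are done. Eclipse offers the same refactoring under Alt+Shift+M. I dislike long methods, to the point that even the following one feels too long to me: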

public void processOnEndOfDay(Contract c) {
    if (DateUtils.addDays(c.getCreated(), 7).before(new Date())) {
        priorityHandling(c, OUTDATED_FEE);
        notifyOutdated(c);
        log.info("Outdated: {}", c);
    } else {
        if (sendNotifications) {
            notifyPending(c);
        }
        log.debug("Pending {}", c);
    }
}

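First of all, it has a condition that is hard to read. How it is implemented does not matter right now; what it does is what counts. Let's extract it first: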

public void processOnEndOfDay(Contract c) {
    if (isOutDate(c)) {
        priorityHandling(c, OUTDATED_FEE);
        notifyOutdated(c);
        log.info("Outdated: {}", c);
    } else {
        if (sendNotifications) {
            notifyPending(c);
        }
        log.debug("Pending {}", c);
    }
}

private boolean isOutDate(Contract c) {
    return DateUtils.addDays(c.getCreated(), 7).before(new Date());
}

Clearly, this method does not belong here:

public void processOnEndOfDay(Contract c) {
    if (c.isOutDate()) {
        priorityHandling(c, OUTDATED_FEE);
        notifyOutdated(c);
        log.info("Outdated: {}", c);
    } else {
        if (sendNotifications) {
            notifyPending(c);
        }
        log.debug("Pending {}", c);
    }
}
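The call to c.isOutDate() presumes that Contract now owns the date check. The class itself is not shown in the article, so the following is only a minimal sketch of what it might look like. It swaps Date and DateUtils for java.time (an assumption, and Duration.ofDays(7) means exactly 7 x 24 hours rather than calendar days), and the overload taking an explicit "now" is a hypothetical addition that makes the rule easy to check.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical stand-in for the article's Contract class; only the date check is modelled.
class Contract {
    // Assumption: a contract is outdated 7 days after creation, as in the original condition.
    private static final Duration MAX_AGE = Duration.ofDays(7);

    private final Instant created;

    Contract(Instant created) {
        this.created = created;
    }

    // Instance-method version of the extracted check, evaluated against the current time.
    boolean isOutDate() {
        return isOutDate(Instant.now());
    }

    // Overload with an explicit "now" so the rule can be verified deterministically.
    boolean isOutDate(Instant now) {
        return created.plus(MAX_AGE).isBefore(now);
    }
}
```

With the check living on Contract, processOnEndOfDay no longer needs to know how "outdated" is defined.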

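Notice the difference? My IDE turned isOutDate into an instance method of Contract, which is where it belongs. I am still not happy, though. The method does too many different things. One branch runs business logic (priorityHandling), sends a system notification, and writes a log message. The other branch sends a notification depending on a condition and also logs. Let's first extract the handling of outdated contracts into a separate method: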

public void processOnEndOfDay(Contract c) {
    if (c.isOutDate()) {
        handleOutdated(c);
    } else {
        if (sendNotifications) {
            notifyPending(c);
        }
        log.debug("Pending {}", c);
    }
}

private void handleOutdated(Contract c) {
    priorityHandling(c, OUTDATED_FEE);
    notifyOutdated(c);
    log.info("Outdated: {}", c);
}

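Some would say this is good enough, but the asymmetry between the two branches bothers me. handleOutdated works at a higher level of abstraction, while the else branch deals in details. Software should be clear and easy to read, so code at different levels of abstraction should not be mixed together. I am happier with this: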

public void processOnEndOfDay(Contract c) {
    if (c.isOutDate()) {
        handleOutdated(c);
    } else {
        stillPending(c);
    }
}

private void stillPending(Contract c) {
    if (sendNotifications) {
        notifyPending(c);
    }
    log.debug("Pending {}", c);
}

private void handleOutdated(Contract c) {
    priorityHandling(c, OUTDATED_FEE);
    notifyOutdated(c);
    log.info("Outdated: {}", c);
}

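This example may look a bit contrived, but I actually wanted to demonstrate something else. It is less common these days, but some developers are still afraid to extract methods because they worry it will hurt runtime performance. They do not realize that the JVM is an excellent piece of software (arguably far ahead of the Java language itself) with many surprising runtime optimizations built in. First, shorter methods are easier for the JVM to reason about: the control flow is more obvious, scopes are shorter, and side effects are easier to see. With a very long method the JVM may simply give up on optimizing it. The second reason is even more important: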

Method inlining

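If the JVM detects that some small method is executed very frequently, it replaces the call to that method with the method body itself. Take this example: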

private int add4(int x1, int x2, int x3, int x4) {
    return add2(x1, x2) + add2(x3, x4);
}

private int add2(int x1, int x2) {
    return x1 + x2;
}

After the code has run for a while, the JVM will almost certainly eliminate the calls to add2 and effectively turn your code into:

private int add4(int x1, int x2, int x3, int x4) {
    return x1 + x2 + x3 + x4;
}

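Note that we are talking about the JVM, not the compiler. javac is fairly conservative when it generates bytecode and leaves all of this work to the JVM. That design decision turned out to be a very wise one: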

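First, the JVM knows the target environment (CPU, memory, architecture), so it can optimize more aggressively. Second, the JVM can observe the runtime characteristics of your code, for example which methods are executed most often or which virtual methods have only one implementation. Third, .class files produced by an old compiler run faster on a newer JVM, and you are far more likely to upgrade the JVM than to recompile your source code.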
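One way to watch these decisions is to run a small hot loop and ask HotSpot to report what it inlines. The class below is a hypothetical demo, not code from the article: it reuses the add4/add2 pair from above, made static for brevity, and calls add4 often enough that the JIT is likely to compile it.

```java
// Hypothetical demo class; add4/add2 mirror the methods shown earlier (static here for brevity).
class InliningDemo {

    private static int add4(int x1, int x2, int x3, int x4) {
        return add2(x1, x2) + add2(x3, x4);
    }

    private static int add2(int x1, int x2) {
        return x1 + x2;
    }

    // Calls add4 in a loop so that it becomes hot; the sum keeps the work from being discarded.
    static long run(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += add4(i, 1, 2, 3);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(1_000_000));
    }
}
```

On a HotSpot JVM, launching it with java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InliningDemo should list the inlining decisions made by the JIT. The exact output format and the decisions themselves vary between JVM versions and machines.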

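Let's put these assumptions to the test. I wrote a small program that deserves the title of worst-ever implementation of the divide-and-conquer principle. add128 takes 128 arguments and calls add64 twice, once for the first half and once for the second. add64 is similar, except that it calls add32 twice. As you might guess, it all ends with add2, which does the heavy lifting. Some of the arguments are omitted to spare your eyes: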

public class ConcreteAdder {
 
  public int add128(int x1, int x2, int x3, int x4, ... more ..., int x127, int x128) {
    return add64(x1, x2, x3, x4, ... more ..., x63, x64) +
        add64(x65, x66, x67, x68, ... more ..., x127, x128);
  }
 
  private int add64(int x1, int x2, int x3, int x4, ... more ..., int x63, int x64) {
    return add32(x1, x2, x3, x4, ... more ..., x31, x32) +
        add32(x33, x34, x35, x36, ... more ..., x63, x64);
  }
 
  private int add32(int x1, int x2, int x3, int x4, ... more ..., int x31, int x32) {
    return add16(x1, x2, x3, x4, ... more ..., x15, x16) +
        add16(x17, x18, x19, x20, ... more ..., x31, x32);
  }
 
  private int add16(int x1, int x2, int x3, int x4, ... more ..., int x15, int x16) {
    return add8(x1, x2, x3, x4, x5, x6, x7, x8) + add8(x9, x10, x11, x12, x13, x14, x15, x16);
  }
 
  private int add8(int x1, int x2, int x3, int x4, int x5, int x6, int x7, int x8) {
    return add4(x1, x2, x3, x4) + add4(x5, x6, x7, x8);
  }
 
  private int add4(int x1, int x2, int x3, int x4) {
    return add2(x1, x2) + add2(x3, x4);
  }
 
  private int add2(int x1, int x2) {
    return x1 + x2;
  }

}

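As you can see, a single call to add128 results in 127 method calls in total (add128 itself plus 126 nested calls). That is a lot. For reference, here is a straightforward implementation: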
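The figure of 127 follows from the halving structure: 1 call to add128, 2 to add64, 4 to add32, and so on down to 64 calls to add2. A small sketch that counts them, using a hypothetical helper class rather than anything from the article:

```java
// Counts the calls made by a halving adder without writing out 128 parameters.
class CallCounter {

    // Number of method invocations needed to sum `width` values when every
    // method splits its arguments in half and add2 performs the actual addition.
    static int calls(int width) {
        if (width == 2) {
            return 1;                      // add2 itself, no further calls
        }
        return 1 + 2 * calls(width / 2);   // this method plus both of its halves
    }

    public static void main(String[] args) {
        System.out.println(calls(128));    // prints 127
    }
}
```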

public class InlineAdder {
 
    public int add128n(int x1, int x2, int x3, int x4, ... more ..., int x127, int x128) {
        return x1 + x2 + x3 + x4 + ... more ... + x127 + x128;
    } 
}

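Finally, here is an implementation that uses an abstract class and inheritance. 127 virtual method calls are quite expensive. These methods require dynamic dispatch, which is more demanding and, one would think, rules out inlining. Or does it?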

public abstract class Adder {
 
  public abstract int add128(int x1, int x2, int x3, int x4, ... more ..., int x127, int x128);
 
  public abstract int add64(int x1, int x2, int x3, int x4, ... more ..., int x63, int x64);
 
  public abstract int add32(int x1, int x2, int x3, int x4, ... more ..., int x31, int x32);
 
  public abstract int add16(int x1, int x2, int x3, int x4, ... more ..., int x15, int x16);
 
  public abstract int add8(int x1, int x2, int x3, int x4, int x5, int x6, int x7, int x8);
 
  public abstract int add4(int x1, int x2, int x3, int x4);
 
  public abstract int add2(int x1, int x2);
}

And its implementation:

public class VirtualAdder extends Adder {
 
  @Override
  public int add128(int x1, int x2, int x3, int x4, ... more ..., int x128) {
    return add64(x1, x2, x3, x4, ... more ..., x63, x64) +
        add64(x65, x66, x67, x68, ... more ..., x127, x128);
  }
 
  @Override
  public int add64(int x1, int x2, int x3, int x4, ... more ..., int x63, int x64) {
    return add32(x1, x2, x3, x4, ... more ..., x31, x32) +
        add32(x33, x34, x35, x36, ... more ..., x63, x64);
  }
 
  @Override
  public int add32(int x1, int x2, int x3, int x4, ... more ..., int x32) {
    return add16(x1, x2, x3, x4, ... more ..., x15, x16) +
        add16(x17, x18, x19, x20, ... more ..., x31, x32);
  }
 
  @Override
  public int add16(int x1, int x2, int x3, int x4, ... more ..., int x16) {
    return add8(x1, x2, x3, x4, x5, x6, x7, x8) + add8(x9, x10, x11, x12, x13, x14, x15, x16);
  }
 
  @Override
  public int add8(int x1, int x2, int x3, int x4, int x5, int x6, int x7, int x8) {
    return add4(x1, x2, x3, x4) + add4(x5, x6, x7, x8);
  }
 
  @Override
  public int add4(int x1, int x2, int x3, int x4) {
    return add2(x1, x2) + add2(x3, x4);
  }
 
  @Override
  public int add2(int x1, int x2) {
    return x1 + x2;
  }
}

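Encouraged by some enthusiastic readers of my other article about the overhead of @Cacheable, I wrote a simple benchmark comparing the overhead of the over-split ConcreteAdder and VirtualAdder. The results were unexpected and a little puzzling. I ran the test on two machines (shown in red and blue in the charts): the same program, except that the second machine has more CPU cores and is 64-bit.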

Environment details: [chart and environment table not preserved in this copy]
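The benchmark code itself is not reproduced in this text. As a rough illustration of the idea only, the sketch below times a split adder against a flat one using plain System.nanoTime. Everything in it is an assumption: the classes are hypothetical, scaled down to 8 arguments, and a naive timing loop like this is far less trustworthy than a dedicated benchmark harness.

```java
// Rough timing sketch; a dedicated benchmark harness gives far more reliable numbers.
class AdderTiming {

    // Scaled-down, hypothetical version of the over-split adder (8 arguments instead of 128).
    static class SplitAdder {
        int add8(int a, int b, int c, int d, int e, int f, int g, int h) {
            return add4(a, b, c, d) + add4(e, f, g, h);
        }
        private int add4(int a, int b, int c, int d) { return add2(a, b) + add2(c, d); }
        private int add2(int a, int b) { return a + b; }
    }

    // Hand-inlined counterpart, like InlineAdder.
    static class FlatAdder {
        int add8(int a, int b, int c, int d, int e, int f, int g, int h) {
            return a + b + c + d + e + f + g + h;
        }
    }

    static long sumSplit(int reps) {
        SplitAdder adder = new SplitAdder();
        long sum = 0;
        for (int i = 0; i < reps; i++) {
            sum += adder.add8(i, 1, 2, 3, 4, 5, 6, 7);
        }
        return sum;
    }

    static long sumFlat(int reps) {
        FlatAdder adder = new FlatAdder();
        long sum = 0;
        for (int i = 0; i < reps; i++) {
            sum += adder.add8(i, 1, 2, 3, 4, 5, 6, 7);
        }
        return sum;
    }

    public static void main(String[] args) {
        int reps = 50_000_000;
        // Warm-up runs so both loops have a chance to be JIT-compiled before timing.
        sumSplit(reps);
        sumFlat(reps);

        long t0 = System.nanoTime();
        long split = sumSplit(reps);
        long t1 = System.nanoTime();
        long flat = sumFlat(reps);
        long t2 = System.nanoTime();

        System.out.printf("split: %d ms, flat: %d ms (checksums %d / %d)%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, split, flat);
    }
}
```

If inlining works as described, the two timings should end up close to each other, although the numbers depend heavily on the JVM and the machine.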

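It looks like the JVM on the slower machine is more eager to inline, and not only in the version with simple private method calls but in the virtual-method version too. How is that possible? The JVM discovered that Adder has only one subclass, which means there is only one version of each abstract method. If you load another subclass (or several) at runtime, you can expect performance to drop, because the JVM can no longer rely on that assumption when inlining. Setting that aside, the benchmark suggests that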

these method calls are not merely cheap, they cost nothing at all!

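Method calls (and the documentation value they add for readability) exist only in your source code and in the compiled bytecode; at runtime they are removed entirely (inlined).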
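The single-subclass observation can be explored with a scaled-down hierarchy. The classes below are hypothetical and only sketch the experiment: the same loop runs first while one implementation of add2 is in use, and then again after a second subclass has been instantiated. It assumes HotSpot's usual lazy class loading, so CheckedAdder is likely (not guaranteed) to be loaded only in the second phase.

```java
// Hypothetical, scaled-down hierarchy: one abstract method with two possible implementations.
abstract class SmallAdder {
    abstract int add2(int a, int b);

    int add4(int a, int b, int c, int d) {
        return add2(a, b) + add2(c, d);   // virtual call sites
    }
}

class PlainAdder extends SmallAdder {
    @Override int add2(int a, int b) { return a + b; }
}

class CheckedAdder extends SmallAdder {
    @Override int add2(int a, int b) { return Math.addExact(a, b); }
}

class HierarchyDemo {
    static long run(SmallAdder adder, int reps) {
        long sum = 0;
        for (int i = 0; i < reps; i++) {
            sum += adder.add4(i, 1, 2, 3);
        }
        return sum;
    }

    public static void main(String[] args) {
        int reps = 20_000_000;
        // Phase 1: only PlainAdder is in use, so add2 has a single known implementation.
        System.out.println(run(new PlainAdder(), reps));
        // Phase 2: CheckedAdder gives add2 a second implementation; any
        // "single implementation" assumption made in phase 1 no longer holds.
        System.out.println(run(new CheckedAdder(), reps));
        System.out.println(run(new PlainAdder(), reps));
    }
}
```

Running this with -XX:+PrintCompilation on HotSpot may show compiled methods being marked "made not entrant" around the second phase, which is how the JVM reports that it discarded code built on the old assumption. Whether and when that happens depends on the JVM version.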

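I do not fully understand the second result either. The faster machine B seems to run the single-method version a bit faster and the other two a bit slower. Maybe it prefers to delay inlining? The results do differ, but the gap is not that large. As with optimizing stack trace generation: if you try to improve performance by inlining methods by hand, making them ever longer and more complicated, you are solving the wrong problem.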

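P.S. The 64-bit machine may be slower because the method-length conditions that the JVM applies when deciding whether to inline are different (larger) there.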

Original article sources:

http://www.javacodegeeks.com/2013/02/how-aggressive-is-method-inlining-in-jvm.html
http://it.deepinmind.com/java/2014/03/01/JVM的方法內聯.html

張小琦
