The internal implementation of Erlang catch (first draft)

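Recently a colleague on my project team gave a talk on Erlang's internal data representation (Eterm). Eterm is short for Erlang Term and represents a value of any type in Erlang; that is, every value an Erlang program can use is represented as an Eterm. Common atoms, numbers, lists and tuples, and even pids, ports, funs and ETS tables can all be represented as Eterms.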

Eterm

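Inside the VM, Eterms fall into three broad categories: lists, boxed objects, and immediates. (The split exists because complex data cannot be represented in a single machine word, which is 4 bytes on a 32-bit machine and 8 bytes on a 64-bit machine.)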

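Boxed objects represent the complex data types, such as tuples, big integers and binaries. Immediates represent simple, small values, such as small integers, atoms and pids.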
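The three categories can be told apart by looking at the low bits of a term word. The following is a minimal sketch, not the VM's code: it assumes a 2-bit primary tag with the values shown (as I recall them from erl_term.h), and make_boxed_sketch is a hypothetical helper added for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t Eterm;   /* one machine word per term */

/* Assumed primary tag values: the low 2 bits of every term word. */
enum {
    TAG_PRIMARY_HEADER = 0x0,  /* header word of a boxed object on the heap */
    TAG_PRIMARY_LIST   = 0x1,  /* tagged pointer to a cons cell */
    TAG_PRIMARY_BOXED  = 0x2,  /* tagged pointer to a boxed object */
    TAG_PRIMARY_IMMED1 = 0x3   /* the value is stored in the word itself */
};

#define TAG_PRIMARY_MASK 0x3

/* Extract the primary tag of a term word. */
static unsigned primary_tag(Eterm t)
{
    return (unsigned)(t & TAG_PRIMARY_MASK);
}

/* Hypothetical helper: tag a word-aligned heap pointer as a boxed term. */
static Eterm make_boxed_sketch(const Eterm *ptr)
{
    return (Eterm)ptr | TAG_PRIMARY_BOXED;
}

/* An immediate needs no heap storage at all. */
static int is_immediate(Eterm t)
{
    return primary_tag(t) == TAG_PRIMARY_IMMED1;
}
```

Because heap objects are word-aligned, the low two bits of a pointer are always free, which is what makes room for the tag.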

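If you are interested in this topic, start with siyao's article "The internal representation and implementation of Erlang data types". It is very well written, so I will not repeat it here.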

But why does catch show up among these data types? I believe many readers have wondered about that, so this article is built around catch.

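The catch expression

Form: catch Expr

If Expr evaluates without raising an exception, the result of Expr is returned. If an exception is raised, it is caught: a throw returns the thrown term, and any other exception returns an {'EXIT', Reason} tuple.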

1> catch 1+2.
3
2> catch 1+a.
{'EXIT',{badarith,[...]}}
3> catch throw(hello).
hello

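How is a catch immediate created?

This data type is used only inside the Erlang VM; Erlang programs never manipulate it directly. A catch immediate is produced whenever code contains a catch or try-catch expression, and it is placed on the stack of the Erlang process.

Source code analysis

A simple example illustrates this.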

-module(test).
-compile(export_all).

t() ->
    catch erlang:now().

Save this as test.erl and run erlc -S test.erl to get the assembly listing test.S, which contains:

{function, t, 0, 2}.
  {label,1}.
    {line,[{location,"test.erl",4}]}.
    {func_info,{atom,test},{atom,t},0}.
  {label,2}.
    {allocate,1,0}.
    {'catch',{y,0},{f,3}}.
    {line,[{location,"test.erl",5}]}.
    {call_ext,0,{extfunc,erlang,now,0}}.
  {label,3}.
    {catch_end,{y,0}}.
    {deallocate,1}.
    return.

Note that catch and catch_end come in pairs. Next, compile the module to opcodes, which are easier to match against the VM source.

1> c(test).
{ok,test}
2> erts_debug:df(test).

This generates the file test.dis, which contains:

04B55938: i_func_info_IaaI 0 test t 0
04B5594C: allocate_tt 1 0
04B55954: catch_yf y(0) f(0000871B)
04B55960: call_bif_e erlang:now/0
04B55968: catch_end_y y(0)
04B55970: deallocate_return_Q 1

Of these, allocate_tt and deallocate_return_Q are implemented in beam_hot.h and the rest in beam_emu.c, where the relevant code is easy to find.

First, look at these instructions:

// beam_emu.c
 OpCase(catch_yf):
     c_p->catches++;       // increment the process's catch count
     yb(Arg(0)) = Arg(1);  // store the catch immediate, i.e. f(0000871B), in the stack slot
     Next(2);              // go on to the next instruction

// beam_emu.c
 OpCase(catch_end_y): {
     c_p->catches--;  // decrement the process's catch count
     make_blank(yb(Arg(0)));  // overwrite the catch immediate with NIL; the catch is discarded
     if (is_non_value(r(0))) { // an exception occurred
	 if (x(1) == am_throw) {  // throw(Term): return Term
	     r(0) = x(2);
	 } else {
	     if (x(1) == am_error) {  // error(Term): add the current stack trace
	         SWAPOUT;
		 x(2) = add_stacktrace(c_p, x(2), x(3));
		 SWAPIN;
	     }
	     /* only x(2) is included in the rootset here */
	     if (E - HTOP < 3 || c_p->mbuf) {	/* Force GC in case add_stacktrace()
						 * created heap fragments */
		 // heap space is short or heap fragments exist: run a GC so no data stays off-heap
		 SWAPOUT;
		 PROCESS_MAIN_CHK_LOCKS(c_p);
		 FCALLS -= erts_garbage_collect(c_p, 3, reg+2, 1);
		 ERTS_VERIFY_UNUSED_TEMP_ALLOC(c_p);
		 PROCESS_MAIN_CHK_LOCKS(c_p);
		 SWAPIN;
	     }
	     r(0) = TUPLE2(HTOP, am_EXIT, x(2));
	     HTOP += 3;
	 }
     }
     CHECK_TERM(r(0));
     Next(1); // go on to the next instruction
 }

//beam_hot.h
  OpCase(deallocate_return_Q):
    { 
    DeallocateReturn(Arg(0)); // free the allocated stack frame and return to the saved CP (CP is the return address pointer)
    }

DeallocateReturn is actually a macro, defined as follows:

#define DeallocateReturn(Deallocate)       \
  do {                                     \
    int words_to_pop = (Deallocate);       \
    /* read the return address (the saved CP) from the top of the stack */ \
    SET_I((BeamInstr *) cp_val(*E));       \
    E = ADD_BYTE_OFFSET(E, words_to_pop);  \
    CHECK_TERM(r(0));                      \
    /* jump to the instruction at the return address */ \
    Goto(*I);                              \
  } while (0)

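At this point some readers may be wondering: is the catch discussed here really the catch immediate mentioned earlier?

Anyone looking for the catch immediate will find these two macros: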

// erl_term.h 
#define make_catch(x)	(((x) << _TAG_IMMED2_SIZE) | _TAG_IMMED2_CATCH)  // build a catch immediate
#define is_catch(x)	(((x) & _TAG_IMMED2_MASK) == _TAG_IMMED2_CATCH)  // test for a catch immediate

These two macros construct and test catch immediates; the code below uses them.
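To make the encoding concrete, here is a small sketch that decodes the f(0000871B) operand from the disassembly above. The tag constants (a 6-bit tag with the value 0x1B for catch) are assumptions recalled from erl_term.h of that era, written without the leading underscore, and catch_val is assumed to be the inverse of make_catch; check your own source tree before relying on them.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t Eterm;

/* Assumed values: a 6-bit "immediate 2" tag, 0x1B meaning "catch". */
#define TAG_IMMED2_SIZE   6
#define TAG_IMMED2_MASK   0x3F
#define TAG_IMMED2_CATCH  0x1B

/* Pack a catch-table index into one tagged word. */
static Eterm make_catch(Eterm index)
{
    return (index << TAG_IMMED2_SIZE) | TAG_IMMED2_CATCH;
}

/* Does this word carry the catch tag? */
static int is_catch(Eterm t)
{
    return (t & TAG_IMMED2_MASK) == TAG_IMMED2_CATCH;
}

/* Recover the catch-table index from a catch immediate. */
static Eterm catch_val(Eterm t)
{
    return t >> TAG_IMMED2_SIZE;
}
```

Under these assumptions 0x871B splits into the tag 0x1B (its low 6 bits) and the index 0x21C, that is 540, so the operand is a tagged table index rather than a code address.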

Now let's look at how the VM parses and loads catch code:

// beam_load.c: part of the BEAM loading process, abridged
static void final_touch(LoaderState* stp)
{
    int i;
    int on_load = stp->on_load;
    unsigned catches;
    Uint index;
    BeamInstr* code = stp->code;
    Module* modp;

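    /*
     * Allocate catch indices and patch the catch_yf instructions.
     * The f(0000871B) seen earlier is produced here; it refers to an entry in the beam_catches table.
     * A catch immediate cannot hold a whole beam_catches entry, so it holds only the index.
     */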

    index = stp->catches;
    catches = BEAM_CATCHES_NIL;
    while (index != 0) {  // walk every catch_yf instruction
	BeamInstr next = code[index];
	code[index] = BeamOpCode(op_catch_yf); // set the opcode address of catch_yf

	// take the catch_end instruction address and build a beam_catches entry
	catches = beam_catches_cons((BeamInstr *)code[index+2], catches);  
	code[index+2] = make_catch(catches); // turn the beam_catches index into a catch immediate
	index = next;
    }
    modp = erts_put_module(stp->module);
    modp->curr.catches = catches;

    /*
     * ....
     */ 
}
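The loop above chains a module's catches together through a table: each call stores one handler address and links it to the previous entry. The sketch below shows that idea only. The names catch_entry, catches_cons, catches_car and the fixed table size are hypothetical; the real beam_catches table also grows and reuses freed entries, which is left out here.

```c
#include <assert.h>
#include <stddef.h>

#define CATCH_TABLE_SIZE 16
#define CATCHES_NIL ((unsigned)-1)   /* plays the role of BEAM_CATCHES_NIL */

typedef struct {
    const void *cp;   /* address of the handler code (the catch_end) */
    unsigned cdr;     /* index of the module's previous entry, or CATCHES_NIL */
} catch_entry;

static catch_entry catch_table[CATCH_TABLE_SIZE];
static unsigned catch_table_used = 0;

/* Store a handler address, link it to the previous entry, return the new index. */
static unsigned catches_cons(const void *cp, unsigned cdr)
{
    assert(catch_table_used < CATCH_TABLE_SIZE);
    unsigned i = catch_table_used++;
    catch_table[i].cp = cp;
    catch_table[i].cdr = cdr;
    return i;
}

/* Look up the handler address stored at a given index. */
static const void *catches_car(unsigned i)
{
    assert(i < catch_table_used);
    return catch_table[i].cp;
}
```

The index returned by catches_cons is what ends up inside the catch immediate, and the cdr links let the loader find every catch of a module again later, starting from the last index it stored.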

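Next, let's see when execution actually reaches this catch-handling code.

Careful readers will notice that many error paths in the VM end in the same jump:

// applying a fun (anonymous function)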
  OpCase(i_apply_fun): { 
     BeamInstr *next;

     SWAPOUT;
     next = apply_fun(c_p, r(0), x(1), reg);
     SWAPIN;
     if (next != NULL) {
	 r(0) = reg[0];
	 SET_CP(c_p, I+1);
	 SET_I(next);
	 Dispatchfun();
     }
     goto find_func_info;  // errors take this path
 }

  // arithmetic errors and failed checks end up here
 lb_Cl_error: {
     if (Arg(0) != 0) { // a label address was supplied: execute the jump instruction
	 OpCase(jump_f): { // the implementation of jump
	 jump_f:
	     SET_I((BeamInstr *) Arg(0));
	     Goto(*I);
	 }
     }
     ASSERT(c_p->freason != BADMATCH || is_value(c_p->fvalue));
     goto find_func_info;  // errors take this path
 }
 
 // invalid timeout value in a receive ... after
 OpCase(i_wait_error): {
	 c_p->freason = EXC_TIMEOUT_VALUE;
	 goto find_func_info; // errors take this path
 }

So what exactly does find_func_info do?

/* Fall through here */
 find_func_info: { 
     reg[0] = r(0);
     SWAPOUT;
     I = handle_error(c_p, I, reg, NULL); // get the address of the exception handler code
     goto post_error_handling;
 }
 
 post_error_handling:
     if (I == 0) { // no handler: the process has been terminated, so schedule another one
	 goto do_schedule;
     } else {
	 r(0) = reg[0];
	 ASSERT(!is_value(r(0)));
	 if (c_p->mbuf) { // heap fragments exist: run a GC
	     erts_garbage_collect(c_p, 0, reg+1, 3);
	 }
	 SWAPIN;
	 Goto(*I); // jump to the handler
     }
 }

Then, a quick look at the handle_error function.

// beam_emu.c: the VM's exception handling function
static BeamInstr* handle_error(Process* c_p, BeamInstr* pc, Eterm* reg, BifFunction bf)
{
    Eterm* hp;
    Eterm Value = c_p->fvalue;
    Eterm Args = am_true;
    c_p->i = pc;    /* In case we call erl_exit(). */

    ASSERT(c_p->freason != TRAP); /* Should have been handled earlier. */

    /*
     * Check if we have an arglist for the top level call. If so, this
     * is encoded in Value, so we have to dig out the real Value as well
     * as the Arglist.
     */
    if (c_p->freason & EXF_ARGLIST) {
	  Eterm* tp;
	  ASSERT(is_tuple(Value));
	  tp = tuple_val(Value);
	  Value = tp[1];
	  Args = tp[2];
    }

    /*
     * Save the stack trace info if the EXF_SAVETRACE flag is set. The
     * main reason for doing this separately is to allow throws to later
     * become promoted to errors without losing the original stack
     * trace, even if they have passed through one or more catch and
     * rethrow. It also makes the creation of symbolic stack traces much
     * more modular.
     */
    if (c_p->freason & EXF_SAVETRACE) {
        save_stacktrace(c_p, pc, reg, bf, Args);
    }

    /*
     * Throws that are not caught are turned into 'nocatch' errors
     */
    if ((c_p->freason & EXF_THROWN) && (c_p->catches <= 0) ) {
        hp = HAlloc(c_p, 3);
        Value = TUPLE2(hp, am_nocatch, Value);
        c_p->freason = EXC_ERROR;
    }

    /* Get the fully expanded error term */
    Value = expand_error_value(c_p, c_p->freason, Value);

    /* Save final error term and stabilize the exception flags so no
       further expansion is done. */
    c_p->fvalue = Value;
    c_p->freason = PRIMARY_EXCEPTION(c_p->freason);

    /* Find a handler or die */
    if ((c_p->catches > 0 || IS_TRACED_FL(c_p, F_EXCEPTION_TRACE))
	&& !(c_p->freason & EXF_PANIC)) {
	BeamInstr *new_pc;
        /* The Beam handler code (catch_end or try_end) checks reg[0]
	   for THE_NON_VALUE to see if the previous code finished
	   abnormally. If so, reg[1], reg[2] and reg[3] should hold the
	   exception class, term and trace, respectively. (If the
	   handler is just a trap to native code, these registers will
	   be ignored.) */
	reg[0] = THE_NON_VALUE;
	reg[1] = exception_tag[GET_EXC_CLASS(c_p->freason)];
	reg[2] = Value;
	reg[3] = c_p->ftrace;
        if ((new_pc = next_catch(c_p, reg))) { // find the nearest catch on the process stack
	    c_p->cp = 0;	/* To avoid keeping stale references. */
	    return new_pc;   // return the address of the catch_end instruction
	}
	if (c_p->catches > 0) erl_exit(1, "Catch not found");
    }
    ERTS_SMP_UNREQ_PROC_MAIN_LOCK(c_p);
    terminate_proc(c_p, Value);
    ERTS_SMP_REQ_PROC_MAIN_LOCK(c_p);
    return NULL;  // no handler found: terminate_proc() has ended the process
}
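The source of next_catch is not shown here, so the following is only a sketch of the idea behind it: walk the stack from its top and stop at the first word that carries the catch tag. The function name next_catch_sketch is hypothetical, the tag constants are the same assumed values as for make_catch, and the real function does more (for example, it deals with return addresses and tracing).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Eterm;

/* Assumed tag layout of a catch immediate (see make_catch / is_catch). */
#define TAG_IMMED2_SIZE   6
#define TAG_IMMED2_MASK   0x3F
#define TAG_IMMED2_CATCH  0x1B

static Eterm make_catch(Eterm index) { return (index << TAG_IMMED2_SIZE) | TAG_IMMED2_CATCH; }
static int   is_catch(Eterm t)       { return (t & TAG_IMMED2_MASK) == TAG_IMMED2_CATCH; }
static Eterm catch_val(Eterm t)      { return t >> TAG_IMMED2_SIZE; }

/*
 * Scan from the top of the stack towards its bottom and return the
 * catch-table index held by the first catch word found, or -1 if the
 * stack holds no catch. stack[0] is the most recently pushed word.
 */
static long next_catch_sketch(const Eterm *stack, size_t depth)
{
    for (size_t i = 0; i < depth; i++) {
        if (is_catch(stack[i])) {
            return (long)catch_val(stack[i]);
        }
    }
    return -1;
}
```

Because the scan starts at the top of the stack, the innermost catch is always found first, which is why nesting needs no extra bookkeeping beyond the words already on the stack.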

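Discussion

Why is the catch placed on the process stack and implemented as an immediate?

1. Handling interruption by an exception

Erlang follows the fail-fast principle: when an error occurs, an exception is raised and the process is killed. To catch the exception and obtain a result at the point of interruption, the VM has to record the address to return to when the interruption happens.

2. Nested catches

Because catch can be nested to any depth, and code called inside a catch can itself use catch, a single variable of a simple type cannot represent it. An array or linked-list structure is needed to represent the chain of catch levels.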

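Further questions

Process heap and process stack

Why is catch implemented on the process stack, rather than on the process heap or as a variable of the VM scheduler thread?
First, the basic scheduling unit of the Erlang VM is the Erlang process. To run any piece of code there must be an Erlang process to run it. We can run code freely in the shell only because a process created by the shell evaluates it and returns the result to us.

Most data produced while an Erlang process runs code lives on the process's heap and stack (ETS data, binaries and atoms excepted). How, then, do the process stack and the process heap relate to each other?

In the VM's underlying implementation, both the stack and the heap of an Erlang process are located on the heap of the OS process/thread, because the stack space the OS provides is quite limited.

The heap and the stack share one memory block: the low address is the bottom of the heap and the high address is the bottom of the stack. The gap between the heap top and the stack top is the part of the process's memory not yet in use. Allocations shrink the gap from both ends, and when it runs out a GC is performed.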
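The layout can be sketched as a single array with two pointers that move towards each other. This is an illustration under simplifying assumptions, not the VM's allocator: proc_mem and its functions are hypothetical names, the block has a fixed size, and "GC needed" is reported to the caller instead of being performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Eterm;

#define PROC_MEM_WORDS 32

typedef struct {
    Eterm words[PROC_MEM_WORDS];  /* one block shared by heap and stack */
    Eterm *htop;                  /* heap top: grows towards higher addresses */
    Eterm *stop;                  /* stack top: grows towards lower addresses */
} proc_mem;

static void proc_mem_init(proc_mem *m)
{
    m->htop = m->words;                   /* heap starts at the low end */
    m->stop = m->words + PROC_MEM_WORDS;  /* stack starts at the high end */
}

/* Free words left between the heap top and the stack top. */
static size_t proc_mem_free(const proc_mem *m)
{
    return (size_t)(m->stop - m->htop);
}

/* Reserve n heap words; NULL means the gap is too small and a GC is needed. */
static Eterm *heap_alloc(proc_mem *m, size_t n)
{
    if (proc_mem_free(m) < n) {
        return NULL;
    }
    Eterm *p = m->htop;
    m->htop += n;
    return p;
}

/* Push one word on the stack; 0 means the gap is exhausted and a GC is needed. */
static int stack_push(proc_mem *m, Eterm w)
{
    if (proc_mem_free(m) < 1) {
        return 0;
    }
    *--m->stop = w;
    return 1;
}
```

The test E - HTOP < 3 in catch_end_y is the same kind of check: it asks whether the gap still has room for the three words of the {'EXIT', Reason} tuple.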

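In Erlang, the difference between the process stack and the process heap is that the stack holds only simple data. For complex data it holds only the head word (the list and boxed terms discussed at the beginning), and the actual data is placed on the heap.

Although the VM stores complex data on the process heap, the stack keeps references (pointers) to it, whereas heap data never holds references (pointers) into the stack. This reduces the cost of GC: the collector only needs to scan the relatively small stack instead of the whole heap and stack. Writes to the process dictionary likewise keep references that point into the process heap (off-heap data is left aside for now).

What a catch actually stands for is a beam_catch_t structure, so on the stack it can only be a single immediate that refers to an index or pointer position. Moreover, a catch is tied to the code context the process is running, can be nested, and handles interruption by exceptions. Keeping it register-like, as a variable of the VM scheduler thread, would make the design considerably more complex.

try-catch and tail recursion

A call inside a try-catch construct cannot be tail-recursive: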

t() ->
    try
        do_something(),
        t()
    catch
        _:_ -> ok
    end.

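For this code the compiler emits try and try_end instructions, and the call to t() is just an ordinary local call: try_end still has to run after it returns, so it cannot be a tail call. Readers who are interested can print the assembly listing to check. catch has the same limitation.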

erlang:hibernate

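Calling this function puts the current process into a wait state and reduces its memory footprint. It suits a process that does not expect any message in the near future. It does not suit a process that receives messages frequently, because the repeated memory clean-up is itself a significant cost.

However, using erlang:hibernate makes an enclosing catch or try-catch ineffective.

As shown above, catch and try-catch push the address to return to on error onto the stack, while erlang:hibernate discards the stack, so the try-catch no longer takes effect.

The workaround is as follows, modelled on the implementation of proc_lib:hibernate: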

hibernate(M, F, A) when is_atom(M), is_atom(F), is_list(A) ->
    erlang:hibernate(?MODULE, wake_up, [M, F, A]).

wake_up(M, F, A) when is_atom(M), is_atom(F), is_list(A) ->
    try
        apply(M, F, A)
    catch
        _Class:Reason -> exit(Reason)
    end.

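Closing words

I started writing this article a long time ago, but I have been busy lately and also caught a slight cold, so it was written in a hurry; hence the "first draft" in the title. I hope you will forgive that. Still, haste is no excuse for mistakes: if you spot an error, please point it out and I will correct it as soon as I see it. Thank you for your support.

2015/4/3: revised the explanation of deallocate_return_Q in the source code analysis.

Reference: http://blog.csdn.net/mycwq/article/details/44661219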
