lua_gc Source Code Study, Part 6

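The mark phase, the most intricate part of the GC, has now been covered. What remains is simple and can be finished in one sitting today.

Sweeping happens in two phases: one cleans up strings, the other cleans up all other objects. See the code at line 573 of lgc.c: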

    case GCSsweepstring: {
      lu_mem old = g->totalbytes;
      sweepwholelist(L, &g->strt.hash[g->sweepstrgc++]);
      if (g->sweepstrgc >= g->strt.size)  /* nothing more to sweep? */
        g->gcstate = GCSsweep;  /* end sweep-string phase */
      lua_assert(old >= g->totalbytes);
      g->estimate -= old - g->totalbytes;
      return GCSWEEPCOST;
    }
    case GCSsweep: {
      lu_mem old = g->totalbytes;
      g->sweepgc = sweeplist(L, g->sweepgc, GCSWEEPMAX);
      if (*g->sweepgc == NULL) {  /* nothing more to sweep? */
        checkSizes(L);
        g->gcstate = GCSfinalize;  /* end sweep phase */
      }
      lua_assert(old >= g->totalbytes);
      g->estimate -= old - g->totalbytes;
      return GCSWEEPMAX*GCSWEEPCOST;
    }

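In GCSsweepstring, each step calls sweepwholelist to clean one bucket of the strt hash table. Ideally every string hashes to its own bucket with no collisions, so each bucket holds a single string. Line 68 of lstring.c shows this assumption: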

  if (tb->nuse > cast(lu_int32, tb->size) && tb->size <= MAX_INT/2)
    luaS_resize(L, tb->size*2);  /* too crowded */

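When the number of strings in the hash table (nuse) exceeds the number of buckets (size), Lua doubles the number of buckets. In other words, the table is sized on the assumption of one element per bucket.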
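
The growth rule can be tried out on a toy chained hash table. This is only a sketch under stated assumptions: ToyTable, toy_init, toy_resize and toy_insert are hypothetical names, not Lua's API, the node stores just a hash value, and the MAX_INT/2 guard is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for Lua's stringtable; every name here is hypothetical. */
typedef struct ToyNode {
  struct ToyNode *next;
  unsigned int hash;
} ToyNode;

typedef struct ToyTable {
  ToyNode **bucket;  /* array of chain heads */
  int size;          /* number of buckets (a power of two) */
  int nuse;          /* number of stored nodes */
} ToyTable;

static void toy_init (ToyTable *tb, int size) {
  tb->bucket = calloc((size_t)size, sizeof(ToyNode *));
  assert(tb->bucket != NULL);
  tb->size = size;
  tb->nuse = 0;
}

/* Rehash every node into a fresh bucket array of `newsize' entries. */
static void toy_resize (ToyTable *tb, int newsize) {
  ToyNode **nb = calloc((size_t)newsize, sizeof(ToyNode *));
  int i;
  assert(nb != NULL);
  for (i = 0; i < tb->size; i++) {
    ToyNode *n = tb->bucket[i];
    while (n != NULL) {
      ToyNode *next = n->next;
      unsigned int h = n->hash & (unsigned int)(newsize - 1);
      n->next = nb[h];  /* push onto the new chain */
      nb[h] = n;
      n = next;
    }
  }
  free(tb->bucket);
  tb->bucket = nb;
  tb->size = newsize;
}

/* Insert a node, then double the table once nodes outnumber buckets. */
static void toy_insert (ToyTable *tb, unsigned int hash) {
  ToyNode *n = malloc(sizeof(ToyNode));
  unsigned int h = hash & (unsigned int)(tb->size - 1);
  assert(n != NULL);
  n->hash = hash;
  n->next = tb->bucket[h];
  tb->bucket[h] = n;
  tb->nuse++;
  if (tb->nuse > tb->size)
    toy_resize(tb, tb->size * 2);  /* too crowded */
}
```

With four buckets, the table stays at size 4 for the first four inserts and doubles to 8 on the fifth.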

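It is worth noting that with an incremental GC, string objects may not be swept completely in this phase. If the string table grows as described above between two steps of GCSsweepstring, the table is rehashed, and strings not yet swept may well land in buckets that GCSsweepstring has already passed. When that happens, some string objects get no chance to be reset to white during the current GC cycle. In extreme cases, even one call to fullgc may not clear all the garbage. Note, though, that luaS_resize in Lua 5.1.4 returns immediately when gcstate is GCSsweepstring (its comment reads "cannot resize during GC traverse"), so in that version the resize is skipped during this phase and the scenario arises only where that guard is missing.

One more detail about string objects. Lua reuses TString objects with equal values, and two copies of the same string value must never exist. Because the GC runs incrementally, a TString that is waiting to be collected may come back to life. So a check is made when a new TString is about to be created; see line 86 of lstring.c: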

    if (ts->tsv.len == l && (memcmp(str, getstr(ts), l) == 0)) {
      /* string may be dead */
      if (isdead(G(L), o)) changewhite(o);
      return ts;
    }

The same problem exists for upvalues, and a similar check is in place there.
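
The resurrection check is easier to see on a stripped-down model of the two-white scheme. The sketch below is modelled on the isdead and changewhite macros, but the bit values and all toy_ names are assumptions made for illustration, not Lua's definitions.

```c
#include <assert.h>

/* Two white colours; the bit values are assumed for this sketch. */
#define TOY_WHITE0     0x01u
#define TOY_WHITE1     0x02u
#define TOY_WHITEBITS  (TOY_WHITE0 | TOY_WHITE1)

typedef struct ToyObj {
  unsigned char marked;  /* colour bits of the object */
} ToyObj;

/* The white that is NOT current: objects carrying it are garbage. */
static unsigned char toy_otherwhite (unsigned char currentwhite) {
  return (unsigned char)(currentwhite ^ TOY_WHITEBITS);
}

static int toy_isdead (unsigned char currentwhite, const ToyObj *o) {
  return (o->marked & toy_otherwhite(currentwhite) & TOY_WHITEBITS) != 0;
}

/* Flip an object from one white to the other. */
static void toy_changewhite (ToyObj *o) {
  o->marked ^= TOY_WHITEBITS;
}

/* A lookup found an existing object: revive it if the sweeper has not
   reached it yet, then hand it out again. */
static ToyObj *toy_reuse (unsigned char currentwhite, ToyObj *o) {
  if (toy_isdead(currentwhite, o))
    toy_changewhite(o);  /* resurrect it */
  return o;
}
```

An object carrying the old white is dead until toy_reuse flips it to the current white, after which the sweeper treats it as alive.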

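GCSWEEPCOST here is presumably an empirical value. As seen earlier, singlestep returns the approximate cost of the step it just performed, which lets each call to the luaC_step API take roughly the same amount of time. The mark phase derives this value from the number of bytes scanned. Freeing a string takes roughly constant time regardless of its length, so GCSWEEPCOST stands for the cost of freeing one object.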

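GCSsweep cleans the entire GCObject list. The list is long, so it too is processed in segments. The pointer sweepgc records the traversal position, and each step visits GCSWEEPMAX objects. The cost is about the same whether or not an object is freed: a live object needs its color bits reset, and a dead one needs its memory released. Each GCSsweep step is necessarily more expensive than a GCSsweepstring step, and its time is estimated as GCSWEEPMAX*GCSWEEPCOST.

The actual cleanup is done by the sweeplist function at line 408 of lgc.c.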
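
How a flat per-step estimate keeps each call bounded can be sketched with a toy driver. Everything here is hypothetical: ToySweeper, toy_sweepstep and toy_step are invented names, the two constants are assumed values, and the loop only imitates the idea of spending a budget on the costs that steps report.

```c
#include <assert.h>

#define TOY_SWEEPMAX   40  /* objects visited per sweep step (assumed) */
#define TOY_SWEEPCOST  10  /* estimated cost of one object (assumed) */

typedef struct ToySweeper {
  long remaining;  /* objects still waiting to be swept */
  long steps;      /* number of steps taken so far */
} ToySweeper;

/* One sweep step: visit at most TOY_SWEEPMAX objects and report a flat
   cost estimate, whether or not the full quota was used. */
static long toy_sweepstep (ToySweeper *s) {
  long n = s->remaining < TOY_SWEEPMAX ? s->remaining : TOY_SWEEPMAX;
  s->remaining -= n;
  s->steps++;
  return (long)TOY_SWEEPMAX * TOY_SWEEPCOST;
}

/* Run steps until the budget is spent or nothing is left to sweep. */
static void toy_step (ToySweeper *s, long budget) {
  while (budget > 0 && s->remaining > 0)
    budget -= toy_sweepstep(s);
}
```

With 100 objects and a budget of 800, one call performs two steps and leaves 20 objects for the next call.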

static GCObject **sweeplist (lua_State *L, GCObject **p, lu_mem count) {
  GCObject *curr;
  global_State *g = G(L);
  int deadmask = otherwhite(g);
  while ((curr = *p) != NULL && count-- > 0) {
    if (curr->gch.tt == LUA_TTHREAD)  /* sweep open upvalues of each thread */
      sweepwholelist(L, &gco2th(curr)->openupval);
    if ((curr->gch.marked ^ WHITEBITS) & deadmask) {  /* not dead? */
      lua_assert(!isdead(g, curr) || testbit(curr->gch.marked, FIXEDBIT));
      makewhite(g, curr);  /* make it white (for next cycle) */
      p = &curr->gch.next;
    }
    else {  /* must erase `curr' */
      lua_assert(isdead(g, curr) || deadmask == bitmask(SFIXEDBIT));
      *p = curr->gch.next;
      if (curr == g->rootgc)  /* is the first element of the list? */
        g->rootgc = curr->gch.next;  /* adjust first */
      freeobj(L, curr);
    }
  }
  return p;
}

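The second half is easy to follow: a live object gets makewhite, a dead one gets freeobj. sweeplist starts not from rootgc but from sweepgc (the two may hold the same value), so the head node needs a small adjustment.

The first part, "sweep open upvalues of each thread", needs some explanation. Why do upvalues have to be swept separately? That goes back to how upvalues are stored.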
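
The traversal pattern, a pointer to the link being examined plus a count limit, can be isolated in a small sketch. It is a simplified model, not Lua's code: ToyObj, toy_push and toy_sweeplist are hypothetical, and a plain marked flag stands in for the colour test.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ToyObj {
  struct ToyObj *next;
  int marked;  /* 1 if the mark phase reached it (stand-in for colours) */
} ToyObj;

/* Push a new object on the front of the list and return the new head. */
static ToyObj *toy_push (ToyObj *head, int marked) {
  ToyObj *o = malloc(sizeof(ToyObj));
  assert(o != NULL);
  o->marked = marked;
  o->next = head;
  return o;
}

/* Sweep at most `count' objects starting at *p. Survivors get their mark
   cleared for the next cycle; the rest are unlinked and freed. Returns
   the link at which the next call should resume. */
static ToyObj **toy_sweeplist (ToyObj **p, long count, int *freed) {
  ToyObj *curr;
  while ((curr = *p) != NULL && count-- > 0) {
    if (curr->marked) {   /* not dead? */
      curr->marked = 0;   /* reset for the next cycle */
      p = &curr->next;
    }
    else {                /* must erase `curr' */
      *p = curr->next;    /* unlink through the pointer to the link */
      free(curr);
      (*freed)++;
    }
  }
  return p;
}
```

Because the function returns the link where it stopped, a caller can store that value and resume later, which is the role sweepgc plays in the text above.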

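Upvalues are not kept in the global GCObject list. Instead they live in each thread's own L (the openupval field). Why this design? Because, as with the string table, upvalues must be unique: only one object may refer to a given variable. At run time the existing upvalues of the current thread therefore have to be traversed. To save memory, Lua does not give upvalues an extra pointer for a separate list; it borrows the singly linked list that every GCObject already has. Each thread's upvalues thus form a chain of their own. See the luaF_findupval function at line 53 of lfunc.c.

Not all upvalues are stored this way, however. A closed upvalue no longer needs to sit in the thread's list; luaF_close moves it to another GCObject list. See line 106 of lfunc.c: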
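
The uniqueness requirement amounts to a find-or-create walk over the thread's chain. The sketch below is loosely modelled on that idea and makes several assumptions: ToyUpVal and toy_findupval are hypothetical names, stack slots are plain int cells, the chain is kept sorted by descending slot address, and the dead-object check is omitted.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ToyUpVal {
  struct ToyUpVal *next;
  int *v;  /* points at the captured stack slot */
} ToyUpVal;

/* Return the open upvalue for `slot', creating and linking a new one if
   none exists. The chain is sorted by descending slot address, so the
   walk can stop early. Returns NULL if allocation fails. */
static ToyUpVal *toy_findupval (ToyUpVal **openupval, int *slot) {
  ToyUpVal **pp = openupval;
  ToyUpVal *uv;
  while (*pp != NULL && (*pp)->v >= slot) {
    if ((*pp)->v == slot)  /* found a corresponding upvalue? */
      return *pp;
    pp = &(*pp)->next;
  }
  uv = malloc(sizeof(ToyUpVal));  /* not found: create a new one */
  if (uv == NULL) return NULL;
  uv->v = slot;
  uv->next = *pp;  /* chain it in the proper position */
  *pp = uv;
  return uv;
}
```

Asking twice for the same slot yields the same object, so two closures capturing one variable end up sharing it.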

      unlinkupval(uv);
      setobj(L, &uv->u.value, uv->v);
      uv->v = &uv->u.value;  /* now current value lives here */
      luaC_linkupval(L, uv);  /* link upvalue into `gcroot' list */

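Also, some upvalues are closed from birth; they can be constructed directly through luaF_newupval.

Strictly speaking, sweeping open upvalues adds to the load of a single sweeplist call and ought to be counted in the return value of singlestep. But that would complicate the sweeplist interface and raise the implementation cost. Since threads are usually few and the GC cost is only an estimate anyway, no special handling is done.

The last stage of the GC is GCSfinalize.

It uses the GCTM function to call, one at a time, the gc metamethod of a userdata that is due for collection. See line 446 of lgc.c: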
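
What "closing" does to the value pointer can be shown in a few lines. This is a minimal sketch with hypothetical names (ToyUpVal, toy_close, toy_isclosed) and an int in place of a TValue; the list relinking is left out.

```c
#include <assert.h>

typedef struct ToyUpVal {
  int *v;     /* where the value currently lives */
  int value;  /* private storage, used once the upvalue is closed */
} ToyUpVal;

/* Copy the value out of the stack slot and point the upvalue at itself. */
static void toy_close (ToyUpVal *uv) {
  uv->value = *uv->v;
  uv->v = &uv->value;  /* now current value lives here */
}

/* An upvalue is closed when it points at its own storage. */
static int toy_isclosed (const ToyUpVal *uv) {
  return uv->v == &uv->value;
}
```

After toy_close, later writes to the old stack slot no longer affect what the upvalue sees.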

static void GCTM (lua_State *L) {
  global_State *g = G(L);
  GCObject *o = g->tmudata->gch.next;  /* get first element */
  Udata *udata = rawgco2u(o);
  const TValue *tm;
  /* remove udata from `tmudata' */
  if (o == g->tmudata)  /* last element? */
    g->tmudata = NULL;
  else
    g->tmudata->gch.next = udata->uv.next;
  udata->uv.next = g->mainthread->next;  /* return it to `root' list */
  g->mainthread->next = o;
  makewhite(g, o);
  tm = fasttm(L, udata->uv.metatable, TM_GC);
  if (tm != NULL) {
    lu_byte oldah = L->allowhook;
    lu_mem oldt = g->GCthreshold;
    L->allowhook = 0;  /* stop debug hooks during GC tag method */
    g->GCthreshold = 2*g->totalbytes;  /* avoid GC steps */
    setobj2s(L, L->top, tm);
    setuvalue(L, L->top+1, udata);
    L->top += 2;
    luaD_call(L, L->top - 2, 0);
    L->allowhook = oldah;  /* restore hooks */
    g->GCthreshold = oldt;  /* restore threshold */
  }
}
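
The first few lines of GCTM treat tmudata as a circular list addressed through its last element. A minimal sketch of that removal, assuming hypothetical names ToyUdata and toy_popfirst and leaving out the relinking into the main list:

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyUdata {
  struct ToyUdata *next;
  int id;
} ToyUdata;

/* `*last' points at the last node of a circular list, so (*last)->next
   is the first node. Detach and return the first node, or NULL if the
   list is empty. */
static ToyUdata *toy_popfirst (ToyUdata **last) {
  ToyUdata *o;
  if (*last == NULL) return NULL;
  o = (*last)->next;          /* get first element */
  if (o == *last)             /* last element? */
    *last = NULL;
  else
    (*last)->next = o->next;  /* bypass the first element */
  o->next = NULL;
  return o;
}
```

Keeping a pointer to the last node gives constant-time access to both ends with a single pointer, which suits a queue that is appended at the tail and consumed from the head.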

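The logic of GCTM is straightforward. The point to watch is that the gc metamethod should avoid triggering the GC again, so GCthreshold is temporarily raised to a fairly large value. This does not fully prevent GC reentry, and user code can even trigger it explicitly by mistake, but usually this is not a serious problem: the worst case is that GCthreshold ends up set to a not-quite-right value, which causes no logical error.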
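
The save, raise and restore pattern around the metamethod call can be modelled as follows. It is a sketch under assumptions: ToyGC, toy_alloc, toy_finalizer and toy_callfinalizer are hypothetical, and a counter stands in for actually running a GC step.

```c
#include <assert.h>

typedef struct ToyGC {
  unsigned long totalbytes;  /* bytes currently allocated */
  unsigned long threshold;   /* a GC step triggers at or above this */
  int steps;                 /* GC steps triggered so far */
} ToyGC;

/* Account for an allocation; trigger a step when over the threshold. */
static void toy_alloc (ToyGC *g, unsigned long n) {
  g->totalbytes += n;
  if (g->totalbytes >= g->threshold)
    g->steps++;
}

/* Hypothetical finalizer body: it allocates a little memory. */
static void toy_finalizer (ToyGC *g) {
  toy_alloc(g, 16);
}

/* Call a finalizer with the threshold raised, then restore it. */
static void toy_callfinalizer (ToyGC *g, void (*fin)(ToyGC *)) {
  unsigned long oldt = g->threshold;
  g->threshold = 2 * g->totalbytes;  /* avoid GC steps */
  fin(g);
  g->threshold = oldt;               /* restore threshold */
}
```

Run through toy_callfinalizer, the 16-byte allocation triggers no step; the same allocation made directly does, because the threshold is back at its old value.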
