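Android's Low Memory Killer (LMK) is a memory management mechanism derived from the OOM killer in the standard Linux kernel. When the system runs low on memory, it kills non-essential processes to free their memory. Two criteria determine which process is chosen: its oom_adj and the amount of memory it occupies.
oom_adj represents a process's priority: the higher the value, the lower the priority and the more likely the process is to be killed. Each oom_adj level can be paired with a free-memory threshold. The Android kernel checks from time to time whether free memory has fallen below one of these thresholds. If it has, the kernel kills the non-essential process with the largest oom_adj; if several processes share that value, the one occupying the most memory is killed first. This repeats until free memory is back above the threshold.
The LowMemoryKiller settings are stored mainly in two files: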
/sys/module/lowmemorykiller/parameters/adj
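/sys/module/lowmemorykiller/parameters/minfree
The adj file holds the oom_adj levels at which the system kills processes, and the values in minfree are expressed in memory pages. When free memory falls below one of the minfree thresholds, processes whose adj is at or above the corresponding level are killed.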
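As a small illustration of how the two files pair up, the sketch below parses an adj string and a minfree string and prints each threshold in KB. It assumes both files hold comma-separated integers and that a page is 4 KB. The class name LmkParams and the sample values are made up for the example and are not read from a real device.

```java
import java.util.Arrays;

// Hypothetical helper, not part of AOSP: pairs adj levels with minfree thresholds.
final class LmkParams {
    static final int PAGE_SIZE_KB = 4; // assumption: 4 KB pages

    // Parses a comma-separated list such as "0,100,200" into longs.
    static long[] parse(String csv) {
        return Arrays.stream(csv.trim().split(","))
                .mapToLong(s -> Long.parseLong(s.trim()))
                .toArray();
    }

    // Converts a page count to kilobytes.
    static long pagesToKb(long pages) {
        return pages * PAGE_SIZE_KB;
    }

    public static void main(String[] args) {
        long[] adj = parse("0,100,200,300,900,906");                    // sample adj content
        long[] minfree = parse("18432,23040,27648,32256,55296,80640");  // sample minfree content
        for (int i = 0; i < adj.length; i++) {
            System.out.println("adj >= " + adj[i] + " is killed below "
                    + pagesToKb(minfree[i]) + " KB free");
        }
    }
}
```

Reading the real files requires a device that still ships the in-kernel lowmemorykiller driver and sufficient permissions, so the sketch works on strings only.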
The adj and minfree thresholds are set by updateOomLevels():
private void updateOomLevels(int displayWidth, int displayHeight, boolean write) {
    // Scale buckets from avail memory: at 300MB we use the lowest values t
    .............
    if (write) {
        // Layout: one command int followed by a (minfree, adj) pair per level.
        ByteBuffer buf = ByteBuffer.allocate(4 * (2*mOomAdj.length + 1));
        buf.putInt(LMK_TARGET);
        for (int i=0; i<mOomAdj.length; i++) {
            // mOomMinFree is in KB; convert it to pages.
            buf.putInt((mOomMinFree[i]*1024)/PAGE_SIZE);
            buf.putInt(mOomAdj[i]);
        }
        // Pass the buffer to lmkd, which finally writes the values to the kernel driver nodes.
        writeLmkd(buf);
        SystemProperties.set("sys.sysctl.extra_free_kbytes", Integer.toString(reserve));
    }
}
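To make the buffer layout concrete, here is a minimal sketch that builds the same kind of packet from two arrays and can be read back int by int. The class LmkTargetPacket is hypothetical, and the value 0 for LMK_TARGET is an assumption about the lmkd protocol rather than something shown above.

```java
import java.nio.ByteBuffer;

// Sketch of the LMK_TARGET packet built in updateOomLevels(); not AOSP code.
final class LmkTargetPacket {
    static final int LMK_TARGET = 0;        // assumption: command code understood by lmkd
    static final int PAGE_SIZE = 4 * 1024;  // memory pages are 4K

    // Encodes the command word followed by (minfree in pages, adj) pairs.
    static ByteBuffer encode(int[] minFreeKb, int[] adj) {
        ByteBuffer buf = ByteBuffer.allocate(4 * (2 * adj.length + 1));
        buf.putInt(LMK_TARGET);
        for (int i = 0; i < adj.length; i++) {
            buf.putInt((minFreeKb[i] * 1024) / PAGE_SIZE); // KB -> pages
            buf.putInt(adj[i]);
        }
        buf.flip(); // make the buffer readable from the start
        return buf;
    }
}
```

With minFreeKb = {73728, 92160} and adj = {0, 100}, the buffer holds five ints: the command, then 18432 and 0, then 23040 and 100.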
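Core adj-related methods in Android's ActivityManagerService
The functions that adjust a process's adj:
- updateOomAdjLocked: updates the adj; returns false when the target process is null or has been killed, and true otherwise;
- computeOomAdjLocked: computes the adj and returns the computed RawAdj value;
- applyOomAdjLocked: applies the adj; returns false when the target process needs to be killed, and true otherwise.
Of these, computeOomAdjLocked is the most commonly used; it is the method the others call whenever the adj needs to be updated.
In its implementation, updateOomAdjLocked calls computeOomAdjLocked and then applyOomAdjLocked. What actually gets set is the process's oom_score_adj, that is, the file /proc/<pid>/oom_score_adj.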
private final boolean updateOomAdjLocked(ProcessRecord app, int cachedAdj,
        ProcessRecord TOP_APP, boolean doingAll, long now) {
    if (app.thread == null) {
        return false;
    }
    computeOomAdjLocked(app, cachedAdj, TOP_APP, doingAll, now);
    return applyOomAdjLocked(app, doingAll, now, SystemClock.elapsedRealtime());
}
private final boolean applyOomAdjLocked(ProcessRecord app, boolean doingAll, long now,
        long nowElapsed) {
    boolean success = true;
    // Copy curRawAdj into setRawAdj
    if (app.curRawAdj != app.setRawAdj) {
        app.setRawAdj = app.curRawAdj;
    }
    if (app.curAdj != app.setAdj) {
        // Send the app's adj value to the lmkd daemon
        ProcessList.setOomAdj(app.pid, app.info.uid, app.curAdj);
        app.setAdj = app.curAdj;
    }
    .....
}
ProcessList defines the numeric value of each OOM priority level:
/**
 * Activity manager code dealing with processes.
 */
public final class ProcessList {
    // OOM adjustments for processes in various states:
    // Uninitialized value for any major or minor adj fields
    static final int INVALID_ADJ = -10000;
    // Adjustment used in certain places where we don't know it yet.
    // (Generally this is something that is going to be cached, but we
    // don't know the exact value in the cached range to assign yet.)
    static final int UNKNOWN_ADJ = 1001;
    // This is a process only hosting activities that are not visible,
    // so it can be killed without any disruption.
    static final int CACHED_APP_MAX_ADJ = 906;
    static final int CACHED_APP_MIN_ADJ = 900;
    // The B list of SERVICE_ADJ -- these are the old and decrepit
    // services that aren't as shiny and interesting as the ones in the A list.
    static final int SERVICE_B_ADJ = 800;
    // This is the process of the previous application that the user was in.
    // This process is kept above other things, because it is very common to
    // switch back to the previous app. This is important both for recent
    // task switch (toggling between the two top recent apps) as well as normal
    // UI flow such as clicking on a URI in the e-mail app to view in the browser,
    // and then pressing back to return to e-mail.
    static final int PREVIOUS_APP_ADJ = 700;
    // This is a process holding the home application -- we want to try
    // avoiding killing it, even if it would normally be in the background,
    // because the user interacts with it so much.
    static final int HOME_APP_ADJ = 600;
    // This is a process holding an application service -- killing it will not
    // have much of an impact as far as the user is concerned.
    static final int SERVICE_ADJ = 500;
    // This is a process with a heavy-weight application. It is in the
    // background, but we want to try to avoid killing it. Value set in
    // system/rootdir/init.rc on startup.
    static final int HEAVY_WEIGHT_APP_ADJ = 400;
    // This is a process currently hosting a backup operation. Killing it
    // is not entirely fatal but is generally a bad idea.
    static final int BACKUP_APP_ADJ = 300;
    // This is a process only hosting components that are perceptible to the
    // user, and we really want to avoid killing them, but they are not
    // immediately visible. An example is background music playback.
    static final int PERCEPTIBLE_APP_ADJ = 200;
    // This is a process only hosting activities that are visible to the
    // user, so we'd prefer they don't disappear.
    static final int VISIBLE_APP_ADJ = 100;
    static final int VISIBLE_APP_LAYER_MAX = PERCEPTIBLE_APP_ADJ - VISIBLE_APP_ADJ - 1;
    // This is the process running the current foreground app. We'd really
    // rather not kill it!
    static final int FOREGROUND_APP_ADJ = 0;
    // This is a process that the system or a persistent process has bound to,
    // and indicated it is important.
    static final int PERSISTENT_SERVICE_ADJ = -700;
    // This is a system persistent process, such as telephony. Definitely
    // don't want to kill it, but doing so is not completely fatal.
    static final int PERSISTENT_PROC_ADJ = -800;
    // The system process runs at the default adjustment.
    static final int SYSTEM_ADJ = -900;
    // Special code for native processes that are not being managed by the system (so
    // don't have an oom adj assigned by the system).
    static final int NATIVE_ADJ = -1000;
    // Memory pages are 4K.
    static final int PAGE_SIZE = 4*1024;
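In the kernel OOM mechanism, the key parameters are oom_adj, oom_score_adj and oom_score.
Every process has these three parameters. In Linux, under low-memory conditions, the system uses their values to decide which process to kill.
- oom_adj: the process priority; the larger the value, the lower the priority and the more readily the process is killed. Range [-16, 15], plus the special value -17, which exempts the process from OOM killing.
- oom_score_adj: range [-1000, 1000] in the kernel; this is the value that AMS sets.
- oom_score: does not appear to be used anywhere in the LMK policy; presumably only the OOM killer uses it.
To raise a process's priority and keep it from being killed, its oom_score_adj has to be lowered.
For Android's lowmemorykiller, it is mainly the AMS service that dynamically updates the OOM levels and adjusts the priority of app processes.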
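The relationship between the legacy oom_adj scale and oom_score_adj can be sketched as follows. This assumes the linear scaling the kernel applies when a value is written to the legacy oom_adj file; the class name OomAdjScale is hypothetical and the constants mirror the ranges listed above.

```java
// Sketch of the legacy oom_adj -> oom_score_adj scaling; constants mirror the kernel's.
final class OomAdjScale {
    static final int OOM_DISABLE = -17;         // special value: never OOM-kill
    static final int OOM_ADJUST_MAX = 15;       // top of the oom_adj range
    static final int OOM_SCORE_ADJ_MAX = 1000;  // top of the oom_score_adj range

    static int toScoreAdj(int oomAdj) {
        if (oomAdj == OOM_ADJUST_MAX) {
            return OOM_SCORE_ADJ_MAX; // the top value maps straight to the maximum
        }
        // Integer division truncates toward zero, as it does in C.
        return (oomAdj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
    }
}
```

Under this scaling an oom_adj of 6 becomes 352 and -17 becomes -1000, so the ordering of priorities is the same on both scales.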
How AMS sets oom_score_adj
frameworks\base\services\core\java\com\android\server\am\ProcessList.java
public static final void setOomAdj(int pid, int uid, int amt) {
    long start = SystemClock.elapsedRealtime();
    ByteBuffer buf = ByteBuffer.allocate(4 * 4);
    buf.putInt(LMK_PROCPRIO);
    buf.putInt(pid);
    buf.putInt(uid);
    buf.putInt(amt);
    writeLmkd(buf);
    ……….
}
writeLmkd communicates with lmkd over a socket, using LocalSocket:
sLmkdSocket = new LocalSocket(LocalSocket.SOCKET_SEQPACKET);
sLmkdSocket.connect(
        new LocalSocketAddress("lmkd", LocalSocketAddress.Namespace.RESERVED));
sLmkdOutputStream = sLmkdSocket.getOutputStream();
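The lowmemorykiller driver
The lowmemorykiller driver lives in drivers/staging/android/lowmemorykiller.c.
LMK is implemented by registering a shrinker. The shrinker is the Linux kernel's standard mechanism for reclaiming pages and is driven by the kernel thread kswapd; see kswapd in mm/vmscan.c. Alternatively, when an app allocates memory and finds too little available, the kernel blocks the allocating process and enters the slow path of the allocation logic to reclaim memory (including memory compression through ZRAM); see __alloc_pages_slowpath in mm/page_alloc.c.
The core idea of LMK: among the processes with the largest oom_score_adj, pick the one with the largest RSS as the process to kill. This is implemented in lowmem_scan in lowmemorykiller.c.
lowmem_scan is registered in the shrinker list of vmscan (kernel/mm/vmscan.c):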
static struct shrinker lowmem_shrinker = {
    .scan_objects = lowmem_scan,
    .count_objects = lowmem_count,
    .seeks = DEFAULT_SEEKS * 16
};
static int __init lowmem_init(void)
{
    …………
    register_shrinker(&lowmem_shrinker);
    return 0;
}
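Then, whenever the kernel memory-management thread kswapd is scheduled, it triggers the lowmem_scan memory scan, and with it the low memory killer, through kswapd_shrink_node -> …… -> scan_objects.
lowmem_scan uses the system's current free memory and each process's oom_score_adj to decide which process gets killed.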
int other_free = global_page_state(NR_FREE_PAGES) - totalreserve_pages;
int other_file = global_node_page_state(NR_FILE_PAGES) -
                 global_node_page_state(NR_SHMEM) -
                 global_node_page_state(NR_UNEVICTABLE) -
                 total_swapcache_pages();
………….
for (i = 0; i < array_size; i++) {
    minfree = lowmem_minfree[i];
    /* Trigger condition: free size < minfree and cache size < minfree */
    if (other_free < minfree && other_file < minfree) {
        if (to_be_aggressive != 0 && i > 3) {
            i -= to_be_aggressive;
            if (i < 3)
                i = 3;
        }
        min_score_adj = lowmem_adj[i];
        break;
    }
}
other_free roughly corresponds to the free size in /proc/meminfo, and other_file to the cache size.
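The threshold lookup can be tried out in isolation with a simplified port of the loop. This is a sketch only: the to_be_aggressive branch is left out, the class name MinScoreAdj is hypothetical, and returning OOM_SCORE_ADJ_MAX + 1 when no threshold is crossed is an assumption about how the driver initialises min_score_adj.

```java
// Simplified port of the threshold loop in lowmem_scan; not the kernel code itself.
final class MinScoreAdj {
    static final int OOM_SCORE_ADJ_MAX = 1000;

    // Returns the lowest oom_score_adj eligible for killing, or
    // OOM_SCORE_ADJ_MAX + 1 when neither free nor file pages are below any threshold.
    static int select(long otherFree, long otherFile, long[] minfree, int[] adj) {
        int n = Math.min(minfree.length, adj.length);
        for (int i = 0; i < n; i++) {
            // Both counters must be below the same threshold to trigger this level.
            if (otherFree < minfree[i] && otherFile < minfree[i]) {
                return adj[i];
            }
        }
        return OOM_SCORE_ADJ_MAX + 1;
    }
}
```

With sample thresholds 18432, 23040, 27648, 32256, 55296, 80640 and adj levels 0, 100, 200, 300, 900, 906, a system with 20000 free pages and 60000 file pages only crosses the last threshold, so only processes at 906 or above become candidates.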
/* Among the processes with the largest oom_score_adj, select the one with the largest RSS. */
selected_oom_score_adj = min_score_adj;
for_each_process(tsk) {
    ………..
    if (selected) {
        if (oom_score_adj < selected_oom_score_adj)
            continue;
        if (oom_score_adj == selected_oom_score_adj && tasksize <= selected_tasksize)
            continue;
    }
    selected = p;
    selected_tasksize = tasksize;
    selected_oom_score_adj = oom_score_adj;
    lowmem_print(2, "select '%s' (%d), adj %hd, size %d, to kill\n", p->comm, p->pid, oom_score_adj, tasksize);
}
…………….
send_sig(SIGKILL, selected, 0);
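The selection rule in the loop above can be expressed as a small, self-contained sketch. Proc is a hypothetical stand-in for the kernel's task structure, and skipping processes below minScoreAdj or with no resident memory is an assumption about the part of the loop that is elided above.

```java
import java.util.List;

// Sketch of the victim choice in lowmem_scan; Proc is a hypothetical stand-in for task_struct.
final class VictimPicker {
    static final class Proc {
        final int pid;
        final int oomScoreAdj;
        final long rssPages;

        Proc(int pid, int oomScoreAdj, long rssPages) {
            this.pid = pid;
            this.oomScoreAdj = oomScoreAdj;
            this.rssPages = rssPages;
        }
    }

    // Returns the process to kill, or null if none has oom_score_adj >= minScoreAdj.
    static Proc pick(List<Proc> procs, int minScoreAdj) {
        Proc selected = null;
        for (Proc p : procs) {
            if (p.oomScoreAdj < minScoreAdj || p.rssPages <= 0) {
                continue; // not eligible at this memory-pressure level
            }
            if (selected != null) {
                if (p.oomScoreAdj < selected.oomScoreAdj) {
                    continue; // a less important process is already selected
                }
                if (p.oomScoreAdj == selected.oomScoreAdj && p.rssPages <= selected.rssPages) {
                    continue; // same importance, but frees no more memory
                }
            }
            selected = p;
        }
        return selected;
    }
}
```

Given two cached processes at 906 with 300 and 700 resident pages, the one with 700 pages is picked, since killing it frees more memory at the same priority.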
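When memory is reclaimed
Memory has three watermarks, min < low < high. When free memory drops to the low watermark, kswapd starts reclaiming, and it stops once free memory is back up to the high watermark.
If kswapd reclaims more slowly than memory is being consumed and free memory drops to the min watermark, direct reclaim kicks in and blocks the application.
The path by which memory is swapped out to the swap partition: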
kswapd()-->balance_pgdat()-->shrink_zone()-->shrink_inactive_list()
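Setting the watermarks
Each zone has its own watermarks. The min watermark can be set through /proc/sys/vm/min_free_kbytes, which determines watermark[min] for every zone in the system. From min and the size of each zone, the kernel then computes that zone's low and high watermarks. The watermarks of a zone can be inspected with: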
XXXXX:/ # cat /proc/zoneinfo
Node 0, zone DMA
per-node stats
………….
pages free 285389
min 1392
low 2054
high 2716
node_scanned 0
…………..
The relevant code is in mm/page_alloc.c: __setup_per_zone_wmarks.
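As a rough illustration of how low and high follow from min, the sketch below assumes the formula used by mainline kernels since 4.6, where the gap between watermarks is the larger of min/4 and a fraction of the zone's managed pages set by watermark_scale_factor. The class name ZoneWatermarks is hypothetical, and vendor kernels that carry the extra_free_kbytes patch may compute the low watermark differently.

```java
// Sketch of the per-zone low/high computation, assuming the mainline 4.6+ formula.
final class ZoneWatermarks {
    // Returns {min, low, high} in pages for one zone.
    static long[] compute(long zoneMinPages, long managedPages, long watermarkScaleFactor) {
        // Gap between watermarks: max(min / 4, managed * factor / 10000).
        long step = Math.max(zoneMinPages / 4,
                managedPages * watermarkScaleFactor / 10000);
        return new long[] { zoneMinPages, zoneMinPages + step, zoneMinPages + 2 * step };
    }
}
```

For example, a min of 1392 pages with a hypothetical 662000 managed pages and a scale factor of 10 gives low = 2054 and high = 2716. The managed-page count here is chosen only so that the numbers line up with the sample output; it was not read from the device.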
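Finally:
To raise a process's priority and keep it from being killed, its oom_score_adj has to be lowered. When an activity is created, started or finished, when a service is created, started or finished, and in similar scenarios, the call chain is AMS.applyOomAdjLocked ==> ProcessList.setOomAdj ==> lmkd, which updates /proc/<pid>/oom_score_adj.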