Android framework: watchdog

Original

jianpan_zouni

2020-07-05 06:53

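A watchdog is exactly what the name suggests. At a company where I interned, the watchdog was a process that monitored other processes and restarted any that died.

The Android watchdog works on a similar principle: it posts a message to each monitored thread and measures how long the reply takes. On timeout it kills system_server; zygote then exits as well, and init restarts zygote. Only the Android userspace restarts and the kernel is untouched, so recovery is relatively fast.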
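The probe-and-timeout idea can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the framework code: `ProbeSketch` and `respondsWithin` are hypothetical names, and a single-threaded `ExecutorService` stands in for the monitored thread's message queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not the AOSP Watchdog class.
class ProbeSketch {
    // Posts a no-op probe to the monitored thread and reports whether the
    // probe ran within timeoutMs. A blocked thread never gets to run it.
    static boolean respondsWithin(ExecutorService monitored, long timeoutMs) {
        CountDownLatch done = new CountDownLatch(1);
        monitored.execute(done::countDown);
        try {
            return done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        System.out.println("idle thread responds: " + respondsWithin(worker, 500));

        // Simulate a stuck thread: this task blocks until it is released.
        CountDownLatch release = new CountDownLatch(1);
        worker.execute(() -> {
            try {
                release.await();
            } catch (InterruptedException ignored) {
                // Nothing to clean up in this sketch.
            }
        });
        System.out.println("blocked thread responds: " + respondsWithin(worker, 200));

        release.countDown();
        worker.shutdown();
    }
}
```

On an otherwise idle machine the first probe is expected to succeed and the second to time out. A real watchdog would react to the timeout by collecting diagnostics and killing the process instead of printing.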

Let's dig into the code:

1. Startup, in SystemServer:

final Watchdog watchdog = Watchdog.getInstance();

watchdog.init(context, mActivityManagerService);

Watchdog.getInstance().start();

2. getInstance() implements the singleton pattern; the first call invokes the Watchdog constructor:

    private Watchdog() {
        super("watchdog");
        // Initialize handler checkers for each common thread we want to check.  Note
        // that we are not currently checking the background thread, since it can
        // potentially hold longer running operations with no guarantees about the timeliness
        // of operations there.

        // The shared foreground thread is the main checker.  It is where we
        // will also dispatch monitor checks and do other work.
        mMonitorChecker = new HandlerChecker(FgThread.getHandler(),
                "foreground thread", DEFAULT_TIMEOUT);
        mHandlerCheckers.add(mMonitorChecker);
        // Add checker for main thread.  We only do a quick check since there
        // can be UI running on the thread.
        mHandlerCheckers.add(new HandlerChecker(new Handler(Looper.getMainLooper()),
                "main thread", DEFAULT_TIMEOUT));
        // Add checker for shared UI thread.
        mHandlerCheckers.add(new HandlerChecker(UiThread.getHandler(),
                "ui thread", DEFAULT_TIMEOUT));
        // And also check IO thread.
        mHandlerCheckers.add(new HandlerChecker(IoThread.getHandler(),
                "i/o thread", DEFAULT_TIMEOUT));
        // And the display thread.
        mHandlerCheckers.add(new HandlerChecker(DisplayThread.getHandler(),
                "display thread", DEFAULT_TIMEOUT));

        // Initialize monitor for Binder threads.
        addMonitor(new BinderThreadMonitor());

        mOpenFdMonitor = OpenFdMonitor.create();

        // See the notes on DEFAULT_TIMEOUT.
        assert DB ||
                DEFAULT_TIMEOUT > ZygoteConnectionConstants.WRAPPED_PID_TIMEOUT_MILLIS;
    }

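The constructor adds a HandlerChecker for the foreground thread, the main thread, the UI thread, the I/O thread, and the display thread to the mHandlerCheckers list. The foreground-thread checker is also kept in mMonitorChecker, which is where monitor checks are dispatched. Finally, the constructor registers a BinderThreadMonitor through addMonitor() and creates the OpenFdMonitor for file descriptors.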
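The singleton access from step 2 can be sketched as follows. `WatchdogLike` is a hypothetical stand-in that shows only the shape of the pattern (private constructor, static accessor). The `synchronized` keyword is an assumption added here as a conservative choice; this article does not show whether the framework's getInstance() uses it.

```java
// Hypothetical stand-in for the real class; only the singleton shape is shown.
class WatchdogLike extends Thread {
    private static WatchdogLike sInstance;

    // Private constructor: the only way to obtain an instance is getInstance().
    private WatchdogLike() {
        super("watchdog");
    }

    // Lazily creates the single instance. synchronized (an assumption of this
    // sketch) keeps concurrent callers from constructing two instances.
    static synchronized WatchdogLike getInstance() {
        if (sInstance == null) {
            sInstance = new WatchdogLike();
        }
        return sInstance;
    }

    public static void main(String[] args) {
        // Both calls return the same object, named by the constructor.
        System.out.println(getInstance() == getInstance()); // prints "true"
        System.out.println(getInstance().getName());        // prints "watchdog"
    }
}
```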

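3. Watchdog monitoring

Watchdog provides two kinds of monitoring. One uses the monitor() callback to detect a deadlock or stall in a service's critical section; the other posts a message to detect whether a service's thread is blocked. For example, AMS is covered by a monitor, while the system_server threads it runs on are covered by message posting. The corresponding registration methods are: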

addMonitor()

addThread()

Monitor-based checking is opt-in: a service implements the Watchdog.Monitor interface itself and registers through addMonitor().
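As a rough sketch of what such a monitor looks like, the following plain-Java example mirrors the one-method callback shape. `MonitorSketch`, `FakeService`, and `monitorReturns` are hypothetical names, and the assumption is that the service guards its state with its own intrinsic lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a lock-probing monitor; not framework code.
class MonitorSketch {
    // Mirrors the shape of the callback: a single no-argument method.
    interface Monitor {
        void monitor();
    }

    // Hypothetical service that guards its state with its own lock.
    static class FakeService implements Monitor {
        @Override
        public void monitor() {
            // Returns at once if the lock is free; hangs if another thread
            // is stuck while holding it.
            synchronized (this) { }
        }
    }

    // Runs monitor() on a helper thread and reports whether it returned in time.
    static boolean monitorReturns(Monitor m, long timeoutMs) {
        CountDownLatch done = new CountDownLatch(1);
        Thread helper = new Thread(() -> {
            m.monitor();
            done.countDown();
        });
        helper.setDaemon(true); // a hung helper must not keep the JVM alive
        helper.start();
        try {
            return done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        FakeService service = new FakeService();
        System.out.println("lock free: " + monitorReturns(service, 500));

        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (service) {
                held.countDown();
                try {
                    release.await();
                } catch (InterruptedException ignored) {
                    // Nothing to clean up in this sketch.
                }
            }
        });
        holder.start();
        held.await(); // wait until the holder really owns the lock
        System.out.println("lock held: " + monitorReturns(service, 200));
        release.countDown();
    }
}
```

The empty synchronized block is the whole check: being able to enter it is the evidence that no other thread is stuck inside the critical section.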

When the watchdog fires, the log carries one of two messages, "Blocked in handler on ..." or "Blocked in monitor ...", corresponding to the two kinds of blockage.

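4. How the watchdog works

Watchdog is a Thread, so start() ends up executing run(). The run() method is fairly long.

It first enters an infinite loop and calls scheduleCheckLocked() on each HandlerChecker to schedule a check.

Inside this function:

1. If the checker has no monitors and the thread's looper is idle, polling for messages, it sets mCompleted to true and returns; a thread in that state cannot be blocked.

2. If mCompleted is false, a check is already in flight, so it returns.

3. Otherwise it sets mCompleted to false, records the start time, and calls postAtFrontOfQueue(this) to start the check.

If the thread is not blocked, the message posted by postAtFrontOfQueue is handled quickly: the checker's run() method executes on the monitored thread and calls monitor() on each registered service. A monitor() implementation simply tries to acquire the service's lock; if it gets the lock, the service is not blocked.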

        @Override
        public void run() {
            final int size = mMonitors.size();
            for (int i = 0 ; i < size ; i++) {
                synchronized (Watchdog.this) {
                    mCurrentMonitor = mMonitors.get(i);
                }
                mCurrentMonitor.monitor();
            }

            synchronized (Watchdog.this) {
                mCompleted = true;
                mCurrentMonitor = null;
            }
        }
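The three scheduling branches and the completion flag can be modelled outside Android. This is a simplified sketch under stated assumptions: `CheckerSketch` is a hypothetical class, an `Executor` stands in for the Handler, a `BooleanSupplier` stands in for the looper's idle check, and the `posts` counter exists only so the behaviour can be observed.

```java
import java.util.ArrayDeque;
import java.util.concurrent.Executor;
import java.util.function.BooleanSupplier;

// Hypothetical model of a handler checker; not the framework class.
class CheckerSketch implements Runnable {
    private final Executor target;         // stands in for the Handler
    private final BooleanSupplier polling; // stands in for the looper idle check
    private final int monitorCount;
    private boolean completed = true;
    private long startTimeMs;              // when the pending probe was posted
    int posts = 0;                         // observation aid for this sketch

    CheckerSketch(Executor target, BooleanSupplier polling, int monitorCount) {
        this.target = target;
        this.polling = polling;
        this.monitorCount = monitorCount;
    }

    synchronized void scheduleCheck() {
        if (monitorCount == 0 && polling.getAsBoolean()) {
            completed = true; // idle thread with nothing to call: not blocked
            return;
        }
        if (!completed) {
            return;           // a probe is still in flight
        }
        completed = false;
        startTimeMs = System.currentTimeMillis();
        posts++;
        target.execute(this); // the probe runs on the monitored thread
    }

    // Runs on the monitored thread once it gets around to the probe.
    @Override
    public synchronized void run() {
        completed = true;
    }

    synchronized boolean isCompleted() {
        return completed;
    }

    public static void main(String[] args) {
        // addFirst mimics posting at the front of the queue.
        ArrayDeque<Runnable> queue = new ArrayDeque<>();
        CheckerSketch checker = new CheckerSketch(queue::addFirst, () -> false, 1);

        checker.scheduleCheck();
        checker.scheduleCheck(); // ignored: the first probe is still pending
        System.out.println(checker.posts + " " + checker.isCompleted()); // prints "1 false"

        queue.poll().run();      // the "thread" finally handles the probe
        System.out.println(checker.isCompleted());                       // prints "true"
    }
}
```

If the queue is never drained, `completed` stays false, which is exactly the condition the timeout logic looks for.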

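4. Failure-reporting logic

On each pass, evaluateCheckerCompletionLocked() combines the states of all checkers; each state depends on how long the checker has been waiting for its reply:

COMPLETED: the reply arrived, so nothing is blocked.

WAITING: 0 to 30 seconds have elapsed; keep waiting.

WAITED_HALF: 30 to 60 seconds have elapsed, more than half of the timeout; start dumping stack traces through AMS.

OVERDUE: 60 seconds have elapsed; a blockage has occurred.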
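The state thresholds above amount to a small piece of arithmetic, sketched here. `CompletionStateSketch`, `stateOf`, and `worstOf` are hypothetical names; the numeric values of the constants are an assumption, chosen so that a larger value means a worse state and the overall result is simply the maximum across checkers.

```java
// Hypothetical sketch of the timeout classification; not framework code.
class CompletionStateSketch {
    // Ordered so that a larger value means a worse state (assumption).
    static final int COMPLETED = 0;
    static final int WAITING = 1;
    static final int WAITED_HALF = 2;
    static final int OVERDUE = 3;

    // Classifies one checker from its completion flag and elapsed time.
    static int stateOf(boolean completed, long elapsedMs, long timeoutMs) {
        if (completed) {
            return COMPLETED;
        }
        if (elapsedMs < timeoutMs / 2) {
            return WAITING;
        }
        if (elapsedMs < timeoutMs) {
            return WAITED_HALF;
        }
        return OVERDUE;
    }

    // The overall state is the worst state among all checkers.
    static int worstOf(int... states) {
        int worst = COMPLETED;
        for (int s : states) {
            worst = Math.max(worst, s);
        }
        return worst;
    }

    public static void main(String[] args) {
        long timeoutMs = 60_000; // the 60-second limit described above
        System.out.println(stateOf(true, 45_000, timeoutMs));  // prints "0"
        System.out.println(stateOf(false, 10_000, timeoutMs)); // prints "1"
        System.out.println(stateOf(false, 45_000, timeoutMs)); // prints "2"
        System.out.println(stateOf(false, 60_000, timeoutMs)); // prints "3"
        System.out.println(worstOf(COMPLETED, WAITED_HALF, WAITING)); // prints "2"
    }
}
```

Taking the maximum means a single overdue thread is enough to put the whole pass in the overdue state.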

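The watchdog then collects the blocked services and threads and writes the log and a DropBox entry.

Finally, it kills the process: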

                Slog.w(TAG, "*** WATCHDOG KILLING SYSTEM PROCESS: " + subject);
                WatchdogDiagnostics.diagnoseCheckers(blockedCheckers);
                Slog.w(TAG, "*** GOODBYE!");
                Process.killProcess(Process.myPid());
                System.exit(10);

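5. Rebooting on a broadcast

init() also calls registerReceiver() to register a BroadcastReceiver for the system reboot broadcast. When that broadcast arrives, RebootRequestReceiver.onReceive() runs and calls rebootSystem() to reboot the system. This lets other modules (such as CTS) reboot the device by sending a broadcast. So besides detecting hangs, the watchdog has a second important job: receiving this broadcast and rebooting the system.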
