Kernel-space lock contention: configuration and usage

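Overview

This article covers the following:

  1. Ways to debug kernel locks, such as lock wait and hold times, failures to acquire a lock, and lock dependencies
  2. Using trace

1 Config options involved in enabling lock contention tracking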
config LOCKDEP  
        bool  
        depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT  
        select STACKTRACE  
        select FRAME_POINTER if !MIPS && !PPC && !ARM_UNWIND && !S390 && !MICROBLAZE && !ARC && !SCORE && !X86  
        select KALLSYMS  
        select KALLSYMS_ALL  
  
config LOCKDEP_SMALL  
        bool  
  
config LOCK_STAT  
        bool "Lock usage statistics"  
        depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT  
        select LOCKDEP  
        select DEBUG_SPINLOCK  
        select DEBUG_MUTEXES  
        select DEBUG_RT_MUTEXES if RT_MUTEXES  
        select DEBUG_LOCK_ALLOC  
        default n  
        help  
         This feature enables tracking lock contention points  
  
         For more details, see Documentation/locking/lockstat.txt  
  
         This also enables lock events required by "perf lock",  
         subcommand of perf.  
         If you want to use "perf lock", you also need to turn on  
         CONFIG_EVENT_TRACING.  
  
         CONFIG_LOCK_STAT defines "contended" and "acquired" lock events.  
         (CONFIG_LOCKDEP defines "acquire" and "release" events.)  

The config options above are disabled by default.

2 Enabling the config options via menuconfig

Go to Kernel hacking -> Lock Debugging (spinlocks, mutexes, etc...) and select
"Lock debugging: detect incorrect freeing of live locks" and "Lock usage statistics". Their dependencies are then selected automatically.

After saving and exiting, you can inspect the resulting config diff in the kernel tree:

# Lock Debugging (spinlocks, mutexes, etc...)  
 #  
-# CONFIG_DEBUG_RT_MUTEXES is not set  
-# CONFIG_DEBUG_SPINLOCK is not set  
-# CONFIG_DEBUG_MUTEXES is not set  
+CONFIG_DEBUG_RT_MUTEXES=y  
+CONFIG_DEBUG_SPINLOCK=y  
+CONFIG_DEBUG_MUTEXES=y  
 # CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set  
-# CONFIG_DEBUG_LOCK_ALLOC is not set  
+CONFIG_DEBUG_LOCK_ALLOC=y  
 # CONFIG_PROVE_LOCKING is not set  
-# CONFIG_LOCK_STAT is not set  
+CONFIG_LOCKDEP=y  
+CONFIG_LOCK_STAT=y  
+# CONFIG_DEBUG_LOCKDEP is not set  
 # CONFIG_DEBUG_ATOMIC_SLEEP is not set  
 # CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set  
 # CONFIG_LOCK_TORTURE_TEST is not set  

The lock trace events are now enabled. Build the boot image and flash it to the device.
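Before flashing, it can be worth confirming that the generated .config really contains the options from the diff above. The following is a minimal sketch, assuming the .config contents are available as a string; the helper name `missing_lock_options` and the `REQUIRED` list are illustrative, not part of any kernel tooling.

```python
import re

# Options enabled in the config diff above (illustrative list; extend as needed,
# e.g. with CONFIG_EVENT_TRACING if "perf lock" is going to be used).
REQUIRED = (
    "CONFIG_LOCKDEP",
    "CONFIG_LOCK_STAT",
    "CONFIG_DEBUG_SPINLOCK",
    "CONFIG_DEBUG_MUTEXES",
    "CONFIG_DEBUG_LOCK_ALLOC",
)


def missing_lock_options(config_text, required=REQUIRED):
    """Return the required options that are not set to 'y' in a .config text."""
    enabled = set(re.findall(r"^(CONFIG_\w+)=y[ \t]*$", config_text, flags=re.MULTILINE))
    return [opt for opt in required if opt not in enabled]


# Hypothetical .config fragment in which LOCK_STAT was left disabled.
sample_config = "\n".join([
    "CONFIG_DEBUG_SPINLOCK=y",
    "CONFIG_DEBUG_MUTEXES=y",
    "CONFIG_DEBUG_LOCK_ALLOC=y",
    "# CONFIG_PROVE_LOCKING is not set",
    "CONFIG_LOCKDEP=y",
    "# CONFIG_LOCK_STAT is not set",
])
print(missing_lock_options(sample_config))  # ['CONFIG_LOCK_STAT']
```

An empty list means every option in `REQUIRED` is set to `y`.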

3 Verifying that the lock contention trace feature is enabled

Once the options from step 2 are enabled, the following interfaces show up in adb shell:

  1. Lock trace events are added. There are four of them; what each one is for is explained in detail below.
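  2. Lock statistics nodes are added. Their implementation is in kernel/locking/lockdep_proc.c:
    ● lock_stat: per-lock-class contention statistics, including wait and hold times
    ● lockdep: lists every lock class known to lockdep together with its dependencies, i.e. the locks taken while it is held
    ● lockdep_stats: summary counters for lockdep, such as the number of lock classes, dependencies and chains, and the maximum locking depth
    ● locks: the state of file locks; this one is implemented in fs/locks.c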


4 How to use the lock trace events and the proc nodes
4.1 lock trace event

As section 3 showed, there are four lock trace events. They are defined in include/trace/events/lock.h and used in kernel/locking/lockdep.c:

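  1. lock_acquire: an attempt to acquire a lock, for example a mutex, rcu_read_lock, or a spinlock
  2. lock_acquired: the lock has actually been acquired
  3. lock_release: the lock is released
  4. lock_contended: the lock is already held by someone else, so the acquirer has to wait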

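So how do you capture the lock trace events?
Use the script below, which is the usual trace-capture command with the lock events added (they must be included). It is run from the tracing directory, typically /sys/kernel/debug/tracing: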

echo 40000 > buffer_size_kb && echo irq sched_switch sched_wakeup sched_waking cpu_frequency cpu_idle lock > set_event && cat set_event && echo > trace  && echo 1 > tracing_on && sleep 10 && echo 0 > tracing_on && cat trace > /data/trace.txt   

When the script finishes, pull trace.txt from the data directory and open it in vim. You will see output similar to the following:

sleep-2875  [004] d.s2   748.861607: sched_waking: comm=rcu_sched pid=9 prio=120 target_cpu=000
sleep-2875  [004] d.s3   748.861608: lock_acquire: 00000000d4eff177 &rq->lock
sleep-2875  [004] d.s3   748.861609: lock_contended: 00000000d4eff177 &rq->lock
sleep-2875  [004] d.s3   748.861662: lock_acquired: 00000000d4eff177 &rq->lock
sleep-2875  [004] d.s3   748.861663: lock_acquire: 000000000ff04d65 read tk_core.seq
sleep-2875  [004] d.s3   748.861664: lock_release: 000000000ff04d65 tk_core.seq
sleep-2875  [004] d.s3   748.861666: lock_acquire: 00000000cc726398 read rcu_read_lock
sleep-2875  [004] d.s3   748.861667: lock_release: 00000000cc726398 rcu_read_lock
sleep-2875  [004] d.s3   748.861668: lock_release: 00000000d4eff177 &rq->lock
sleep-2875  [004] d.s2   748.861669: lock_acquire: 00000000cc726398 read rcu_read_lock
sleep-2875  [004] d.s2   748.861670: lock_acquire: 00000000cc726398 read rcu_read_lock
sleep-2875  [004] d.s2   748.861671: lock_release: 00000000cc726398 rcu_read_lock
sleep-2875  [004] d.s2   748.861672: lock_acquire: 00000000cc726398 read rcu_read_lock
sleep-2875  [004] d.s2   748.861673: lock_release: 00000000cc726398 rcu_read_lock
sleep-2875  [004] d.s2   748.861674: lock_acquire: 00000000cc726398 read rcu_read_lock
sleep-2875  [004] d.s2   748.861675: lock_release: 00000000cc726398 rcu_read_lock
sleep-2875  [004] d.s2   748.861676: lock_acquire: 00000000cc726398 read rcu_read_lock
sleep-2875  [004] d.s2   748.861677: lock_release: 00000000cc726398 rcu_read_lock
sleep-2875  [004] d.s2   748.861678: lock_acquire: 00000000cc726398 read rcu_read_lock
sleep-2875  [004] d.s2   748.861679: lock_release: 00000000cc726398 rcu_read_lock
sleep-2875  [004] d.s2   748.861683: lock_release: 00000000cc726398 rcu_read_lock
sleep-2875  [004] d.s3   748.861684: lock_acquire: 00000000d4eff177 &rq->lock
sleep-2875  [004] d.s3   748.861685: lock_contended: 00000000d4eff177 &rq->lock
sleep-2875  [004] d.s3   748.861719: lock_acquired: 00000000d4eff177 &rq->lock
sleep-2875  [004] d.s4   748.861720: lock_acquire: 00000000cc2f375c &rq->lock
sleep-2875  [004] d.s4   748.861720: lock_acquired: 00000000cc2f375c &rq->lock
sleep-2875  [004] d.s4   748.861721: lock_acquire: 000000000ff04d65 read tk_core.seq
sleep-2875  [004] d.s4   748.861722: lock_release: 000000000ff04d65 tk_core.seq
sleep-2875  [004] d.s4   748.861724: lock_acquire: 00000000cc726398 read rcu_read_lock
sleep-2875  [004] d.s4   748.861725: lock_release: 00000000cc726398 rcu_read_lock
sleep-2875  [004] d.s4   748.861726: lock_acquire: 00000000cc726398 read rcu_read_lock
sleep-2875  [004] d.s4   748.861727: lock_release: 00000000cc726398 rcu_read_lock
sleep-2875  [004] d.s4   748.861728: lock_release: 00000000d4eff177 &rq->lock
sleep-2875  [004] d.s3   748.861730: lock_release: 00000000cc2f375c &rq->lock

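From this trace we can see the following:

  1. rcu_read_lock goes straight from lock_acquire to lock_release, so it is held only very briefly.
  2. &rq->lock is a spinlock and goes through lock_acquire → lock_acquired → lock_release. A lock_contended trace event appears right after lock_acquire, which means the &rq->lock spinlock was contended at that moment: someone else held it and the acquirer had to wait.
  3. When contention occurs, statistics such as how long the current task waited for the lock are recorded; section 4.2 covers them in detail.
  4. The time spent acquiring the lock and the time it is held can be read directly from the timestamps. For example, the first &rq->lock acquisition was contended at 748.861609 and acquired at 748.861662, a wait of about 53 µs.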
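The wait and hold times in the trace can also be computed by a script instead of read by eye. The sketch below assumes the default ftrace line layout seen in the trace above and task names without spaces; `lock_timings` and `LINE_RE` are illustrative names. It measures wait time as lock_contended → lock_acquired and hold time as lock_acquired → lock_release, so locks that never emit lock_acquired (such as rcu_read_lock in this trace) are skipped.

```python
import re
from collections import defaultdict

# Matches the four lock events; timestamps are split into seconds and
# microseconds so the arithmetic stays in integers.
LINE_RE = re.compile(
    r"(?P<task>\S+-\d+)\s+\[\d+\]\s+\S+\s+(?P<sec>\d+)\.(?P<usec>\d{6}):\s+"
    r"(?P<event>lock_(?:acquire|contended|acquired|release)):\s+"
    r"(?P<addr>[0-9a-f]+)\s+(?P<name>.+?)\s*$"
)


def lock_timings(lines):
    """Return {lock name: {"wait_us": [...], "hold_us": [...]}}."""
    contended_at = {}  # (task, lock address) -> time of lock_contended
    acquired_at = {}   # (task, lock address) -> time of lock_acquired
    stats = defaultdict(lambda: {"wait_us": [], "hold_us": []})
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not a lock event (e.g. sched_waking)
        ts = int(m["sec"]) * 1_000_000 + int(m["usec"])
        key = (m["task"], m["addr"])
        # lock_acquire prints optional "try " / "read " prefixes before the name.
        name = re.sub(r"^(?:try )?(?:read )?", "", m["name"])
        event = m["event"]
        if event == "lock_contended":
            contended_at[key] = ts
        elif event == "lock_acquired":
            if key in contended_at:
                stats[name]["wait_us"].append(ts - contended_at.pop(key))
            acquired_at[key] = ts
        elif event == "lock_release" and key in acquired_at:
            stats[name]["hold_us"].append(ts - acquired_at.pop(key))
    return dict(stats)


# Four lines taken from the trace above.
sample = [
    "sleep-2875  [004] d.s3   748.861608: lock_acquire: 00000000d4eff177 &rq->lock",
    "sleep-2875  [004] d.s3   748.861609: lock_contended: 00000000d4eff177 &rq->lock",
    "sleep-2875  [004] d.s3   748.861662: lock_acquired: 00000000d4eff177 &rq->lock",
    "sleep-2875  [004] d.s3   748.861668: lock_release: 00000000d4eff177 &rq->lock",
]
print(lock_timings(sample))  # {'&rq->lock': {'wait_us': [53], 'hold_us': [6]}}
```

On a real capture, pass the open trace file instead of `sample`, for example `lock_timings(open("trace.txt", encoding="utf-8"))`.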

4.2 Lock proc node information

There are three main nodes, all under /proc on the device:

cat /proc/lock_stat:

/proc # cat lock_stat | head -n 40                                                                                                                                                                 
lock_stat version 0.4  
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
                              class name    con-bounces    contentions   waittime-min   waittime-max waittime-total   waittime-avg    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total   holdtime-avg  
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
  
                 &(&n->list_lock)->rlock:        441821         441946           1.81          80.23     1891806.60           4.28        2488861       10314115           0.00         137.08    64789787.05           6.28  
                 -----------------------  
                 &(&n->list_lock)->rlock         114751          [<0000000051f28fcf>] ___slab_alloc+0x194/0x5ec  
                 &(&n->list_lock)->rlock         105155          [<000000000913f330>] deactivate_slab+0x2ec/0x540  
                 &(&n->list_lock)->rlock         169288          [<00000000fb58074e>] free_debug_processing+0x34/0x26c  
                 &(&n->list_lock)->rlock          12316          [<00000000f00be1d8>] kfree+0x30c/0x55c  
                 -----------------------  
                 &(&n->list_lock)->rlock         111558          [<0000000051f28fcf>] ___slab_alloc+0x194/0x5ec  
                 &(&n->list_lock)->rlock          89847          [<000000000913f330>] deactivate_slab+0x2ec/0x540  
                 &(&n->list_lock)->rlock         192366          [<00000000fb58074e>] free_debug_processing+0x34/0x26c  
                 &(&n->list_lock)->rlock          12305          [<00000000f00be1d8>] kfree+0x30c/0x55c  
  
.............................................................................................................................................................................................................................  
  
                               &rq->lock:        351895         351959           1.88         564.00     9605797.10          27.29        1569868        5892437           0.00        1061.00   172708414.01          29.31  
                               ---------  
                               &rq->lock           5739          [<00000000807dcaa6>] task_rq_lock+0x74/0xc4  
                               &rq->lock           7268          [<00000000f033900c>] pick_next_task_fair+0x378/0x7e4  
                               &rq->lock          85000          [<0000000043e94103>] try_to_wake_up+0x1bc/0x554  
                               &rq->lock          46881          [<0000000078183a31>] try_to_wake_up+0x3b8/0x554  
                               ---------  
                               &rq->lock          24883          [<00000000f57857d0>] update_blocked_averages+0x50/0xc44  
                               &rq->lock           3577          [<00000000807dcaa6>] task_rq_lock+0x74/0xc4  
                               &rq->lock          14525          [<00000000f033900c>] pick_next_task_fair+0x378/0x7e4  
                               &rq->lock              5          [<0000000007411a3e>] walt_set_window_start+0xd4/0x134  
  
.............................................................................................................................................................................................................................  
  
                 &sg_policy->update_lock:        259980         259986           1.88         212.12     1681170.67           6.47        1729740        3292457           1.88         270.81    28368055.35           8.62  
                 -----------------------  
                 &sg_policy->update_lock         259986          [<00000000077def88>] sugov_update_shared+0x5c/0x1b0  
                 -----------------------  
                 &sg_policy->update_lock         259986          [<00000000077def88>] sugov_update_shared+0x5c/0x1b0  
  
.............................................................................................................................................................................................................................  

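The output above shows the statistics for every lock class, accumulated from boot up to the moment the node is read. Times are reported in microseconds. The fields include:

  1. The lock's name
  2. The number of contentions
  3. Wait time for the lock: minimum, maximum, average, and total
  4. Hold time for the lock: minimum, maximum, average, and total
  5. The number of times the lock was acquired
    and so on…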
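Because /proc/lock_stat is wide and long, it helps to parse the per-class summary rows and sort them. The sketch below assumes the version 0.4 column layout shown above (a class name ending in a colon, followed by twelve numeric columns); `parse_lock_stat` and `top_by_wait` are illustrative helper names. The contention-point rows with call sites are ignored.

```python
import re

# Column names in the order of the header shown above.
FIELDS = (
    "con-bounces", "contentions", "waittime-min", "waittime-max",
    "waittime-total", "waittime-avg", "acq-bounces", "acquisitions",
    "holdtime-min", "holdtime-max", "holdtime-total", "holdtime-avg",
)
# A summary row is "<class name>:" followed by exactly twelve numbers.
SUMMARY_RE = re.compile(
    r"^\s*(?P<name>\S.*?):\s+(?P<values>[\d.]+(?:\s+[\d.]+){11})\s*$"
)


def parse_lock_stat(text):
    """Return one dict per lock class summary row."""
    rows = []
    for line in text.splitlines():
        m = SUMMARY_RE.match(line)
        if m:
            row = dict(zip(FIELDS, map(float, m["values"].split())))
            row["name"] = m["name"]
            rows.append(row)
    return rows


def top_by_wait(rows, n=3):
    """Return the n lock classes with the largest total wait time."""
    return sorted(rows, key=lambda r: r["waittime-total"], reverse=True)[:n]


# Summary rows copied from the output above, plus one call-site row.
sample = """
 &(&n->list_lock)->rlock: 441821 441946 1.81 80.23 1891806.60 4.28 2488861 10314115 0.00 137.08 64789787.05 6.28
 &(&n->list_lock)->rlock 114751 [<0000000051f28fcf>] ___slab_alloc+0x194/0x5ec
 &rq->lock: 351895 351959 1.88 564.00 9605797.10 27.29 1569868 5892437 0.00 1061.00 172708414.01 29.31
 &sg_policy->update_lock: 259980 259986 1.88 212.12 1681170.67 6.47 1729740 3292457 1.88 270.81 28368055.35 8.62
"""
for row in top_by_wait(parse_lock_stat(sample)):
    print(row["name"], row["contentions"], row["waittime-total"])
```

For this sample, &rq->lock comes first: it has fewer contentions than the slab list_lock, but a much larger total wait time.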

cat /proc/lockdep:

This node prints every lock class along with the locks associated with it.
The output looks something like this:

/proc # cat lockdep | head -n 100                                                                                                                                                              
all lock classes:  
0000000044beef8b ....: logbuf_lock  
  
00000000c80448bf ....: (console_sem).lock  
  
00000000700ad619 ....: console_lock  
  
00000000ef505732 ....: cgroup_mutex  
  
0000000042291e92 ....: console_owner_lock  
  
000000002e29cf8c ....: console_owner  
  
000000004e6f50fd ....: devtree_lock  
  
000000003c9defa6 ....: resource_lock  
  
00000000947b85f2 ....: pm_mutex  
  
0000000010004418 ....: primary_crng.lock  
  
0000000020d31ff4 ....: input_pool.lock  
  
00000000588a9d5b ....: "warn_unseeded_randomness".lock  
  
00000000e13be140 ....: lock  
  
0000000005822fb5 ....: cpu_hotplug_lock.rw_sem  
  
00000000b0cf55a3 ....: cpuhp_state_mutex  
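The class list can run to thousands of entries, so a small filter is handy when looking for one subsystem's locks. The sketch below assumes the three-column layout shown above (address, usage field, class name); kernels built with other lockdep options may print extra columns that this pattern does not handle. `parse_lockdep_classes` is an illustrative name.

```python
import re

# "<address> <usage field>: <class name>", as in the listing above.
CLASS_RE = re.compile(r"^(?P<addr>[0-9a-f]+)\s+(?P<usage>\S+):\s+(?P<name>.+?)\s*$")


def parse_lockdep_classes(text):
    """Return a list of (address, usage field, class name) tuples."""
    classes = []
    for line in text.splitlines():
        m = CLASS_RE.match(line)
        if m:
            classes.append((m["addr"], m["usage"], m["name"]))
    return classes


# A few lines copied from the listing above.
sample = """all lock classes:
0000000044beef8b ....: logbuf_lock

00000000c80448bf ....: (console_sem).lock

00000000700ad619 ....: console_lock

00000000ef505732 ....: cgroup_mutex
"""
console_locks = [name for _, _, name in parse_lockdep_classes(sample) if "console" in name]
print(console_locks)  # ['(console_sem).lock', 'console_lock']
```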

cat /proc/lockdep_stats:

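The output is shown below. It consists mainly of summary counters, many of them broken down by context (hardirq, softirq, process):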

/proc # cat lockdep_stats                                                                                                                                                                          
 lock-classes:                         1851 [max: 8191]  
 direct dependencies:                     0 [max: 32768]  
 indirect dependencies:                   0  
 all direct dependencies:                 0  
 in-hardirq chains:                       0  
 in-softirq chains:                       0  
 in-process chains:                       0  
 stack-trace entries:                 20581 [max: 524288]  
 combined max dependencies:               1  
 hardirq-safe locks:                      0  
 hardirq-unsafe locks:                    0  
 softirq-safe locks:                      0  
 softirq-unsafe locks:                    0  
 irq-safe locks:                          0  
 irq-unsafe locks:                        0  
 hardirq-read-safe locks:                 0  
 hardirq-read-unsafe locks:               0  
 softirq-read-safe locks:                 0  
 softirq-read-unsafe locks:               0  
 irq-read-safe locks:                     0  
 irq-read-unsafe locks:                   0  
 uncategorized locks:                  1851  
 unused locks:                            0  
 max locking depth:                      19  
 debug_locks:                             1  
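Several of these counters carry a `[max: N]` limit, so one practical use of the node is checking how close lockdep is to exhausting its tables. The sketch below assumes the `name: value [max: N]` layout shown above; `parse_lockdep_stats` and `usage_report` are illustrative names.

```python
import re

# "name: value" with an optional "[max: N]" suffix.
STAT_RE = re.compile(
    r"^\s*(?P<key>[^:]+):\s+(?P<value>\d+)(?:\s+\[max:\s+(?P<max>\d+)\])?\s*$"
)


def parse_lockdep_stats(text):
    """Return {counter name: (value, limit or None)}."""
    stats = {}
    for line in text.splitlines():
        m = STAT_RE.match(line)
        if m:
            limit = int(m["max"]) if m["max"] else None
            stats[m["key"].strip()] = (int(m["value"]), limit)
    return stats


def usage_report(stats):
    """Return {counter name: fraction of the limit used} for limited counters."""
    return {name: value / limit for name, (value, limit) in stats.items() if limit}


# Lines copied from the output above.
sample = """
 lock-classes:                         1851 [max: 8191]
 direct dependencies:                     0 [max: 32768]
 stack-trace entries:                 20581 [max: 524288]
 max locking depth:                      19
 debug_locks:                             1
"""
for name, fraction in usage_report(parse_lockdep_stats(sample)).items():
    print(f"{name}: {fraction:.1%} of limit")
```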

Lock event information is very helpful for lock debugging.
For a hands-on case, see another post: "pr_emerg is slow: investigating how it affects performance".
