Using the SMMU PMU to view performance data
-v0.1 Sherlock 2019.9.28 init
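The ARM SMMU provides performance statistics registers (Performance Monitor
Counter Groups, PMCG). The corresponding driver has been merged into the
mainline Linux kernel and can be used together with the user-space perf tool.
This note describes how to use it.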
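- First, make sure the system has the arm_smmuv3_pmu module, or that the
driver is built into the kernel. The driver source is
drivers/perf/arm_smmuv3_pmu.c in the kernel tree, and the kernel config
option is CONFIG_ARM_SMMU_V3_PMU.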
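One way to check the config option is to look it up in a kernel config file. The sketch below is only illustrative: the helper name smmu_pmu_config_state is made up for this note, and it assumes the config is available as a plain-text file, which depends on the distribution.

```shell
# Hypothetical helper: report how CONFIG_ARM_SMMU_V3_PMU is set in a
# plain-text kernel config file whose path is passed as $1.
smmu_pmu_config_state() {
    case "$(grep -E '^CONFIG_ARM_SMMU_V3_PMU=' "$1" 2>/dev/null)" in
        *=y) echo built-in ;;   # driver compiled into the kernel image
        *=m) echo module ;;     # driver built as arm_smmuv3_pmu.ko
        *)   echo not-set ;;    # option missing, disabled, or file unreadable
    esac
}
```

A possible call is smmu_pmu_config_state "/boot/config-$(uname -r)", assuming the distribution installs the config at that path. If the result is "module", the driver still has to be loaded (for example with modprobe arm_smmuv3_pmu) before the PMCG devices show up.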
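- Make sure the UEFI firmware on the board describes an SMMU PMCG node for
the SMMU of the module you want to test (on ACPI systems this is a PMCG node
in the IORT table). Without this node, SMMU PMCG cannot be used even if the
driver above is loaded.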
- When everything works, dmesg | grep pmcg shows messages like the following:
Ubuntu:/ # dmesg | grep pmcg
[ 1232.379951] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.8.auto: option mask 0x1
[ 1232.380040] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.8.auto: Registered PMU @ 0x0000000148020000 using 8 counters with Individual filter settings
[ 1232.380094] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.9.auto: option mask 0x1
[ 1232.380142] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.9.auto: Registered PMU @ 0x0000000201020000 using 8 counters with Individual filter settings
[ 1232.380190] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.10.auto: option mask 0x1
[ 1232.380241] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.10.auto: Registered PMU @ 0x0000000100020000 using 8 counters with Individual filter settings
[ 1232.380286] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.11.auto: option mask 0x1
[ 1232.380337] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.11.auto: Registered PMU @ 0x0000000140020000 using 8 counters with Individual filter settings
[ 1232.380397] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.12.auto: option mask 0x1
[ 1232.380445] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.12.auto: Registered PMU @ 0x0000200148020000 using 8 counters with Individual filter settings
[ 1232.380491] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.13.auto: option mask 0x1
[ 1232.380542] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.13.auto: Registered PMU @ 0x0000200201020000 using 8 counters with Individual filter settings
[ 1232.380601] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.14.auto: option mask 0x1
[ 1232.380653] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.14.auto: Registered PMU @ 0x0000200100020000 using 8 counters with Individual filter settings
[ 1232.380698] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.15.auto: option mask 0x1
[ 1232.380770] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.15.auto: Registered PMU @ 0x0000200140020000 using 8 counters with Individual filter settings
- Use perf list | grep pmcg to list the PMCG event types the system supports:
Ubuntu:/ # perf list | grep pmcg
[...]
smmuv3_pmcg_140020/config_cache_miss/ [Kernel PMU event]
smmuv3_pmcg_140020/config_struct_access/ [Kernel PMU event]
smmuv3_pmcg_140020/cycles/ [Kernel PMU event]
smmuv3_pmcg_140020/pcie_ats_trans_passed/ [Kernel PMU event]
smmuv3_pmcg_140020/pcie_ats_trans_rq/ [Kernel PMU event]
smmuv3_pmcg_140020/tlb_miss/ [Kernel PMU event]
smmuv3_pmcg_140020/trans_table_walk_access/ [Kernel PMU event]
smmuv3_pmcg_140020/transaction/ [Kernel PMU event]
[...]
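The PMU names that appear in this listing can also be read from sysfs, where the kernel exposes each registered perf PMU as a directory. A minimal sketch, assuming the usual /sys/bus/event_source/devices location; the helper name list_pmcg_pmus is hypothetical and the directory can be overridden for testing:

```shell
# Hypothetical helper: print the names of all SMMUv3 PMCG PMUs found in
# the perf event_source directory (default: the usual sysfs location).
list_pmcg_pmus() {
    dir=${1:-/sys/bus/event_source/devices}
    for d in "$dir"/smmuv3_pmcg_*; do
        # If nothing matches, the glob stays literal and -e fails.
        [ -e "$d" ] && basename "$d"
    done
    return 0
}
```

Running list_pmcg_pmus with no argument should print one line per PMCG, such as smmuv3_pmcg_140020 on the board shown above.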
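- Before using a PMCG, work out which PMCG the device under test sits behind.
PMCGs are named smmuv3_pmcg_<phys_addr_page>, where phys_addr_page is the
physical base address of the SMMU PMCG with the low 12 bits dropped (that is,
shifted right by 12). For example, the PMCG registered at 0x0000000140020000
in the dmesg output above appears as smmuv3_pmcg_140020 in perf list. This
naming seems somewhat unfriendly: it is hard for a user to work out which
PMCG corresponds to which device.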
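The naming rule can be sketched in shell as follows. The helper names pmcg_name and pmcg_names_from_dmesg are made up for this note, and the sed pattern assumes the "Registered PMU @ <addr>" wording shown in the dmesg output above.

```shell
# Hypothetical helper: derive the perf PMU name from a PMCG base address
# by dropping the low 12 bits.
pmcg_name() {
    printf 'smmuv3_pmcg_%x\n' "$(( $1 >> 12 ))"
}

# Hypothetical helper: read dmesg-style lines on stdin and print the PMU
# name for every "Registered PMU @ <addr>" message found.
pmcg_names_from_dmesg() {
    sed -n 's/.*Registered PMU @ \(0x[0-9a-fA-F]*\).*/\1/p' |
    while read -r addr; do
        pmcg_name "$addr"
    done
}
```

For example, pmcg_name 0x0000000140020000 prints smmuv3_pmcg_140020, and dmesg | pmcg_names_from_dmesg lists one name per registered PMCG. This only converts addresses to names; it does not tell you which devices sit behind which SMMU.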
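- To observe the SMMU statistics while a program runs, use, for example:
perf stat -e smmuv3_pmcg_<phys_addr_page>/tlb_miss/ <your_program>
This reports the number of SMMU TLB misses counted while the program runs.
Replace tlb_miss with any other event listed by perf list | grep pmcg above
to count that event instead.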
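Several events of one PMCG can be counted in a single run, since perf stat -e accepts a comma-separated event list. A small sketch that builds such a list; the helper name pmcg_events is hypothetical, and the event names must be ones that perf list actually reports for that PMCG:

```shell
# Hypothetical helper: join several event names of one PMCG into the
# comma-separated list expected by "perf stat -e".
# Usage: pmcg_events <pmu_name> <event>...
pmcg_events() {
    pmu=$1
    shift
    list=
    for ev in "$@"; do
        # Prepend a comma only when the list is already non-empty.
        list="${list:+$list,}$pmu/$ev/"
    done
    printf '%s\n' "$list"
}
```

It could be used as perf stat -e "$(pmcg_events smmuv3_pmcg_140020 tlb_miss transaction)" <your_program>. The dmesg output above reports 8 counters per PMCG, so it is probably best to keep the number of events counted at once within that limit.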
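- On a real system several devices may sit behind one SMMU. To see the
statistics of a single device only, use, for example:
perf stat -e smmuv3_pmcg_<phys_addr_page>/tlb_miss,filter_enable=1,filter_span=0,filter_stream_id=0x75/ <your_program>
The filter parameters go inside the event specification, before the closing
slash. filter_span=0 requests an exact StreamID match. Here 0x75 is the
device's stream_id; for a PCIe device this is generally the device's BDF
number.
(FIXME: how are the device and function numbers encoded?)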
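On the open question above: a PCIe requester ID packs the bus number into bits 15:8, the device number into bits 7:3 and the function number into bits 2:0. The sketch below computes that value from an lspci-style address. It assumes the StreamID seen by the SMMU equals the requester ID; on a real platform the firmware mapping may add an offset, so treat the result as a starting point. The helper name bdf_to_rid is made up for this note.

```shell
# Hypothetical helper: turn a PCI address "bb:dd.f" (hex fields, no domain
# prefix) into the 16-bit requester ID:
#   bus in bits 15:8, device in bits 7:3, function in bits 2:0.
# Assumption: StreamID == requester ID (no firmware-defined offset).
bdf_to_rid() {
    bus=${1%%:*}      # text before the first ':'
    rest=${1#*:}      # "dd.f"
    dev=${rest%%.*}   # text before the '.'
    fn=${rest#*.}     # text after the '.'
    printf '0x%x\n' "$(( (0x$bus << 8) | (0x$dev << 3) | 0x$fn ))"
}
```

Under this assumption, the stream_id 0x75 used above would correspond to device 00:0e.5, since (0x0e << 3) | 5 = 0x75. The value can be substituted directly into the event, for example filter_stream_id=$(bdf_to_rid 00:0e.5).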