How to Detect Memory Leaks in Kernel Code on Solaris


This article uses a sample driver (tleak.c, tleak.conf) to show how mdb's ::findleaks command can detect memory leaks in kernel code.

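Note that the sample application in the previous article leaked memory on the heap; when that program exits, its heap is released, so the system is not affected. The tleak driver presented here, by contrast, leaks kernel memory, so use it with care. If you are not familiar with the kernel, do not run this driver or the steps below on your own machine. (USE AT YOUR OWN RISK)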

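tleak is a pseudo character device. Every open performs one memory allocation and overwrites the pointer to the previous one, so from the second open onward the earlier buffer is leaked. Its main function, tleak_open(), is defined as follows: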

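static int
tleak_open(dev_t *devp, int flag, int otyp, cred_t *credp)
{
	/* Only character-device opens are supported. */
	if (otyp != OTYP_CHR)
		return (EINVAL);
	/* Overwrites tleak_addr on every open; the old buffer is never freed. */
	tleak_addr = kmem_zalloc(100, KM_SLEEP);
	return (0);
}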
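
To make the leak pattern concrete, here is a minimal user-space model in C. It is not driver code: calloc() stands in for kmem_zalloc(), and the names model_addr, live_buffers, model_open_leaky(), model_open_guarded() and model_release() are hypothetical, introduced only for illustration. The guarded variant shows one possible way to avoid the leak, not necessarily how a real driver should be structured.

```c
#include <stdlib.h>

/* Stand-in for the driver's tleak_addr pointer (assumed to be a global). */
void *model_addr = NULL;
/* Number of buffers allocated and not yet freed. */
int live_buffers = 0;

/* Mirrors tleak_open(): allocate unconditionally, overwrite the pointer. */
int
model_open_leaky(void)
{
	void *p = calloc(1, 100);

	if (p == NULL)
		return (-1);
	model_addr = p;		/* any previous buffer is now unreachable */
	live_buffers++;
	return (0);
}

/* One possible guard: allocate only when no buffer is currently held. */
int
model_open_guarded(void)
{
	if (model_addr != NULL)
		return (0);
	return (model_open_leaky());
}

/* Free the single buffer still reachable through model_addr. */
void
model_release(void)
{
	if (model_addr != NULL) {
		free(model_addr);
		model_addr = NULL;
		live_buffers--;
	}
}
```

After three calls to model_open_leaky() followed by model_release(), two buffers remain allocated with nothing pointing at them; repeated calls to model_open_guarded() never add more than one.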

First, set the system variable kmem_flags to enable the debugging features of the kernel memory allocator, which are disabled by default. To do so, add the following line to /etc/system:
set kmem_flags=0xf
Reboot the machine, then use mdb to confirm the value of kmem_flags:
$ mdb -k
Loading modules: [ unix krtld genunix specfs dtrace cpu.AuthenticAMD.15 uppc pcplusmp ufs ip sctp usba uhci s1394 nca fcp fctl lofs zfs random audiosup md cpc crypto fcip logindmux ptm sppp nfs ]
> kmem_flags/X
kmem_flags:
kmem_flags: f
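
The value 0xf is a bit mask. As a rough illustration, the sketch below decodes it in C using the KMF_* bit values as I recall them from the Solaris/illumos header <sys/kmem_impl.h>; treat those values as an assumption and check them against your release. The function name describe_kmem_flags() is hypothetical.

```c
#include <string.h>

/* Debug flag bits (assumed to match <sys/kmem_impl.h>). */
#define	KMF_AUDIT	0x00000001	/* transaction auditing */
#define	KMF_DEADBEEF	0x00000002	/* deadbeef checking */
#define	KMF_REDZONE	0x00000004	/* redzone checking */
#define	KMF_CONTENTS	0x00000008	/* freed-buffer content logging */

/*
 * Write a "|"-separated list of the known flags set in `flags` into `out`
 * (capacity `outlen`). Returns the number of known flags that were set.
 */
int
describe_kmem_flags(unsigned int flags, char *out, size_t outlen)
{
	static const struct {
		unsigned int	bit;
		const char	*name;
	} tbl[] = {
		{ KMF_AUDIT,	"AUDIT" },
		{ KMF_DEADBEEF,	"DEADBEEF" },
		{ KMF_REDZONE,	"REDZONE" },
		{ KMF_CONTENTS,	"CONTENTS" },
	};
	int n = 0;

	if (outlen == 0)
		return (0);
	out[0] = '\0';
	for (size_t i = 0; i < sizeof (tbl) / sizeof (tbl[0]); i++) {
		if ((flags & tbl[i].bit) == 0)
			continue;
		if (n > 0)
			strncat(out, "|", outlen - strlen(out) - 1);
		strncat(out, tbl[i].name, outlen - strlen(out) - 1);
		n++;
	}
	return (n);
}
```

With a 64-byte buffer, describe_kmem_flags(0xf, ...) reports all four flags. Auditing (bit 0x1) is the one that records the allocation stack traces that ::findleaks and ::bufctl display.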

Next, compile and install the tleak driver.
$ /usr/sfw/bin/gcc -D_KERNEL -c tleak.c
$ ld -dy -r -o tleak tleak.o

$ cp tleak /kernel/drv/
$ cp tleak.conf /kernel/drv/

$ add_drv tleak

add_drv loads the driver automatically; check with modinfo:
$ modinfo | grep tleak
194 fa15bb04 484 205 1 tleak (Test kernel memory leak v0.1)

The device file /devices/pseudo/tleak@0:tleak has now been created under /devices. Run cat several times to open the device and generate the leak:
$ cat /devices/pseudo/tleak@0:tleak

Force a system crash dump, which also reboots the machine:
$ mdb -K
Loaded modules: [ audiosup crypto cpc uppc ptm ufs unix zfs krtld s1394 sppp ipcnca uhci lofs genunix ip logindmux usba specfs pcplusmp nfs md random sctp cpu.AuthenticAMD.15 ]
[0]> $<systemdump

Note that "mdb -K" can only be run on the console. Alternatively, running "reboot -d" on the console or in a terminal also makes the kernel dump core.

After the machine has rebooted, use mdb to examine the kernel crash dump produced in the previous step:
$ cd /var/crash//
$ ls
bounds unix.0 vmcore.0
$ mdb -k 0
Loading modules: [ unix krtld genunix specfs dtrace cpu.AuthenticAMD.15 uppc pcplusmp ufs ip sctp usba uhci s1394 nca fcp fctl lofs zfs random audiosup md cpc crypto fcip logindmux ptm sppp nfs ]
> ::status
debugging crash dump vmcore.0 (32-bit) from mars
operating system: 5.11 snv_34 (i86pc)
panic message:
BAD TRAP: type=e (#pf Page fault) rp=d4e7cdb8 addr=0 occurred in module "" due to a NULL pointer dereference
dump content: kernel pages only
> ::findleaks
CACHE     LEAKED   BUFCTL CALLER
dac2e6f0       2 d3f14980 AcpiOsAllocate+0x15
dac2e6f0       5 d3f20c40 AcpiOsAllocate+0x15
dac2e6f0       1 d3f14ae8 AcpiOsAllocate+0x15
dac2e6f0       1 d3f1e618 AcpiOsAllocate+0x15
dac2e6f0       7 d3f20cb8 AcpiOsAllocate+0x15
dac2e6f0       2 d3f20b50 AcpiOsAllocate+0x15
dac32030       1 d4ec7748 tleak_open+0x35
------------------------------------------------------------
   Total      19 buffers, 976 bytes
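
As I understand it, ::findleaks works like a conservative garbage collector: it treats every word reachable from the kernel's roots as a possible pointer, marks the allocated buffers those words point into, and reports whatever is left unmarked. The following is only a toy model of that idea in C, not mdb's implementation; sim_buf_t, find_buf(), scan_words() and count_leaks() are hypothetical names, and the addresses in the usage note are made up.

```c
#include <stddef.h>
#include <stdint.h>

#define	NWORDS	4	/* payload words kept per simulated buffer */

typedef struct sim_buf {
	uintptr_t	addr;		/* simulated start address */
	size_t		size;		/* simulated size in bytes */
	uintptr_t	words[NWORDS];	/* payload, scanned for pointers */
	int		marked;		/* 1 once found reachable */
} sim_buf_t;

/* Return the buffer whose address range contains `p`, or NULL. */
static sim_buf_t *
find_buf(sim_buf_t *bufs, size_t nbufs, uintptr_t p)
{
	for (size_t i = 0; i < nbufs; i++) {
		if (p >= bufs[i].addr && p < bufs[i].addr + bufs[i].size)
			return (&bufs[i]);
	}
	return (NULL);
}

/* Treat each word as a possible pointer; mark hits and scan inside them. */
static void
scan_words(sim_buf_t *bufs, size_t nbufs, const uintptr_t *w, size_t nw)
{
	for (size_t i = 0; i < nw; i++) {
		sim_buf_t *b = find_buf(bufs, nbufs, w[i]);

		if (b == NULL || b->marked)
			continue;
		b->marked = 1;
		scan_words(bufs, nbufs, b->words, NWORDS);
	}
}

/* Mark everything reachable from the roots, then count unmarked buffers. */
size_t
count_leaks(sim_buf_t *bufs, size_t nbufs, const uintptr_t *roots,
    size_t nroots)
{
	size_t leaked = 0;

	for (size_t i = 0; i < nbufs; i++)
		bufs[i].marked = 0;
	scan_words(bufs, nbufs, roots, nroots);
	for (size_t i = 0; i < nbufs; i++) {
		if (!bufs[i].marked)
			leaked++;
	}
	return (leaked);
}
```

For example, with three 112-byte buffers at made-up addresses 0x1000, 0x2000 and 0x3000 and a single root word holding 0x3000 (playing the role of tleak_addr), count_leaks() reports the two buffers that nothing points to.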

> d4ec7748$<bufctl_audit
            ADDR          BUFADDR        TIMESTAMP           THREAD
                            CACHE          LASTLOG         CONTENTS
        d4ec7748         d4db0300       a1397b121b         d64db340
                         dac32030         db0f0628         dbb62e98
                 kmem_cache_alloc_debug+0x256
                 kmem_cache_alloc+0x97
                 kmem_zalloc+0x4b
                 tleak_open+0x35
                 dev_open+0x27
                 spec_open+0x3cc
                 fop_open+0x6e
                 vn_openat+0x42a
                 copen+0x287
                 open64+0x20

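At this point we can pinpoint where tleak leaks memory: the kmem_zalloc() call in tleak_open(). Going one step further, let's look at which buffers the driver allocated and freed: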
> ::walk kmem_log | ::bufctl ! grep tleak
ADDR       BUFADDR     TIMESTAMP     THREAD      CALLER
---------- ----------- ------------- ----------- -------------------
db2bebf8   d4db0380    a49a0fccba    d64db340    tleak_open+0x35
db0f0628   d4db0300    a1397b121b    d64db340    tleak_open+0x35
db0bc394   d51b3380    9f58e81dab    d64db340    tleak_open+0x35

The log shows that tleak_open() was called three times, which means memory was allocated three times (in other words, cat was run three times).

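In addition, mdb's ::kmem_verify can be used to detect memory corruption such as out-of-bounds accesses. For such cases mdb provides a rich set of commands and macros that make it easy to find out which threads have touched the bad memory. For example: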
> d4db0300::whatis
d4db0300 is d4db0300+0, bufctl d4ec7748 allocated from kmem_alloc_112


::bufctl -a filters the memory allocation log by buffer address. In this example the buffer was touched only by tleak_open():
> ::walk kmem_log | ::bufctl -a d4db0300
ADDR BUFADDR TIMESTAMP THREAD CALLER
db0f0628 d4db0300 a1397b121b d64db340 tleak_open+0x35
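
Conceptually, this filter is a linear pass over the logged transactions that keeps the records whose buffer address matches. A minimal sketch in C follows; log_rec_t is a hypothetical, simplified record layout (the real audit records carry more fields, including the stack trace), and filter_by_bufaddr() is an illustrative name rather than anything mdb exposes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one logged allocator transaction. */
typedef struct log_rec {
	uint32_t	addr;		/* address of the log record itself */
	uint32_t	bufaddr;	/* buffer the transaction touched */
	uint64_t	timestamp;	/* time of the transaction */
	uint32_t	thread;		/* thread that performed it */
} log_rec_t;

/*
 * Copy into `out` (capacity `max`) every record in `log` whose bufaddr
 * equals `target`. Returns the number of records copied.
 */
size_t
filter_by_bufaddr(const log_rec_t *log, size_t n, uint32_t target,
    log_rec_t *out, size_t max)
{
	size_t found = 0;

	for (size_t i = 0; i < n && found < max; i++) {
		if (log[i].bufaddr == target)
			out[found++] = log[i];
	}
	return (found);
}
```

Fed the three tleak_open() records from the log listing and the target d4db0300, the function keeps only the record at db0f0628, which matches the single line mdb printed.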

::kgrep searches for references to the specified buffer:
> d4db0300::kgrep | ::whatis -a
db0f062c is dac43000+4ad62c (vmem_seg dac11168) from kmem_log vmem arena
db0f062c is dac43000+4ad62c (vmem_seg dac11258) from heap vmem arena
d4ec774c is d4ec7748+4, allocated from kmem_bufctl_audit_cache
d4ec774c is d4ec7000+74c (vmem_seg d4ea9ac8) from kmem_msb vmem arena
d4ec774c is d4ec7000+74c (vmem_seg d4ea9bb8) from kmem_metadata vmem arena
d4ec774c is d4ec4000+374c (vmem_seg d4ea6d20) from heap vmem arena
d504693c is d5046920+1c, allocated from kmem_magazine_7
d504693c is d5046000+93c (vmem_seg d4eb98e8) from kmem_msb vmem arena
d504693c is d5046000+93c (vmem_seg d4eb99d8) from kmem_metadata vmem arena
d504693c is d5044000+293c (vmem_seg d4eb66f8) from heap vmem arena
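
The idea behind this kind of search can be sketched as a scan over pointer-sized words looking for one value. The C below is only an illustration under that assumption, not how mdb implements ::kgrep; fake_bufctl_t and scan_for_pointer() are hypothetical. Hits such as d4ec774c = d4ec7748+4 in the output above are consistent with the buffer address being stored in the second word of a record on this 32-bit system, which is what the stand-in struct mimics.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a record that stores a buffer address. */
typedef struct fake_bufctl {
	uintptr_t	next;	/* first word */
	uintptr_t	addr;	/* second word: address of the buffer */
	uintptr_t	slab;	/* third word */
} fake_bufctl_t;

/*
 * Scan `nwords` pointer-sized words starting at `mem` for `target`.
 * Returns the byte offset of the first match, or -1 if there is none.
 */
long
scan_for_pointer(const uintptr_t *mem, size_t nwords, uintptr_t target)
{
	for (size_t i = 0; i < nwords; i++) {
		if (mem[i] == target)
			return ((long)(i * sizeof (uintptr_t)));
	}
	return (-1);
}
```

Scanning a fake_bufctl_t whose addr field holds 0xd4db0300 returns the offset of that field, one pointer width into the struct; with 4-byte pointers that is the +4 seen in the listing.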