Linux內存管理 (22)內存檢測技術(slub_debug/kmemleak/kasan)【轉】 Linux內存管理專題

轉自:https://www.cnblogs.com/arnoldlu/p/8568090.html

專題:Linux內存管理專題

關鍵詞:slub_debug、kmemleak、kasan、oob、Redzone、Padding。

 

Linux常見的內存訪問錯誤有:

  • 越界訪問(out of bounds)
  • 訪問已經釋放的內存(use after free)
  • 重複釋放
  • 內存泄露(memory leak)
  • 棧溢出(stack overflow)

不同的工具有不同的側重點,本章主要從slub_debug、kmemleak、kasan三個工具介紹。

kmemleak側重於內存泄露問題發現。

slub_debug和kasan有一定的重複,部分slub_debug問題需要藉助slabinfo去發現;kasan更快,所有問題獨立上報,缺點是需要高版本GCC支持(gcc 4.9.2 or gcc 5.0)。

1 測試環境準備

更新內核版本到Kernel v4.4,然後編譯:

git clone https://github.com/arnoldlu/linux.git -b running_kernel_4.4

export ARCH=arm64

export CROSS_COMPILE=aarch64-linux-gnu-

make defconfig

make bzImage -j4 ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-

2. slub_debug

關鍵詞:Red Zone、Padding、Object Layout。

Linux內核中,小塊內存大量使用slab/slub分配器,slub_debug提供了內存檢測小功能。

內存中比較容易出錯的地方有:

  • 訪問已經釋放的內存
  • 越界訪問
  • 重複釋放內存

關於slub_debug的兩篇文章:《圖解slub》《SLUB DEBUG原理

2.1 編譯支持slub_debug內核

首先需要打開General setup -> Enable SLUB debugging support,然後再選擇Kernel hacking -> Memory Debugging -> SLUB debugging on by default。

CONFIG_SLUB=y

CONFIG_SLUB_DEBUG=y

CONFIG_SLUB_DEBUG_ON=y

CONFIG_SLUB_STATS=y

2.3 測試環境:slabinfo、slub.ko

通過slub.ko模擬內存異常訪問,有些可以直接顯示,有些需要通過slabinfo -v來查看。

在tools/vm目錄下,執行如下命令,生成可執行文件slabinfo。放入_install目錄,打包到zImage中。

make slabinfo CFLAGS=-static ARCH=arm CROSS_COMPILE=arm-linux-gnueabi-

將編譯好的slabinfo放入sbin。

下面三個測試代碼:https://github.com/arnoldlu/linux/tree/running_kernel_4.4/test_code/slub_debug

test_code/slub_debug目錄下執行make.sh,將slub.ko/slub2.ko/slub3.ko放入data。

2.4 進行測試

啓動QEMU:

qemu-system-aarch64 -machine virt -cpu cortex-a57 -machine type=virt -smp 2 -m 2048 -kernel arch/arm64/boot/Image --append "rdinit=/linuxrc console=ttyAMA0 loglevel=8 slub_debug=UFPZ" -nographic

F:在free的時候會執行檢查。

Z:表示Red Zone的意思。

P:是Poison的意思。

U:會記錄slab的使用者信息,如果打開,會會顯示分配釋放對象的棧回溯。

在slub_debug打開SLAB_STORE_USER選項後,可以清晰地看到問題點的backtrace。

2.5 測試結果

 內存越界訪問包括Redzone overwritten和Object padding overwritten。

重複釋放對應Object already free。訪問已釋放內存爲Posion overwritten。

2.5.1 Redzone overwritten

 執行insmod data/slub.ko,使用slabinfo -v查看結果。

複製代碼
static void create_slub_error(void)
{
  buf = kmalloc(32, GFP_KERNEL);
  if(buf) {
    memset(buf, 0x55, 80);-----------------------------------雖然分配32字節,但是對應分配了64字節。所以設置爲80字節訪問觸發異常。從buf開始的80個字節仍然被初始化成功。
  }
}
複製代碼

雖然kmalloc申請了32字節的slab緩衝區,但是內核分配的是kmalloc-64。所以memset 36字節不會報錯,將36改成大於64即可。

一個slub Debug輸出包括四大部分:

=============================================================================

BUG kmalloc-64 (Tainted: G O ): Redzone overwritten-------------------------------------------------------------1. 問題描述:slab名稱-kmalloc-64,什麼錯誤-Redzone overwritten。
-----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: 0xeddb3640-0xeddb3643. First byte 0x55 instead of 0xcc------------------------------------------------1.1 問題起始和結束地址,這裏一共4字節。
INFO: Allocated in 0x55555555 age=1766 cpu=0 pid=771---------------------------------------------------------1.2 slab的分配棧回溯
0x55555555
0xbf002014
do_one_initcall+0x90/0x1d8
do_init_module+0x60/0x38c
load_module+0x1bac/0x1e94
SyS_init_module+0x14c/0x15c
ret_fast_syscall+0x0/0x3c
INFO: Freed in do_one_initcall+0x78/0x1d8 age=1766 cpu=0 pid=771-----------------------------------------1.3 slab的釋放棧回溯
do_one_initcall+0x78/0x1d8
do_init_module+0x60/0x38c
load_module+0x1bac/0x1e94
SyS_init_module+0x14c/0x15c
ret_fast_syscall+0x0/0x3c
INFO: Slab 0xefdb5660 objects=16 used=14 fp=0xeddb3700 flags=0x0081-----------------------------------1.4 slab的地址,以及其它信息。
INFO: Object 0xeddb3600 @offset=1536 fp=0x55555555-----------------------------------------------------------1.5 當前Object起始,及相關信息

Bytes b4 eddb35f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ------------2. 問題slab對象內容。2.1 打印問題slab對象內容之前一些字節。
Object eddb3600: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU---------2.2 slab對象內容,全部爲0x55。
Object eddb3610: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU
Object eddb3620: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU
Object eddb3630: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU
Redzone eddb3640: 55 55 55 55 UUUU----------------------------------------------------------------------------------2.3 Redzone內容,問題出在這裏。
Padding eddb36e8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ------------2.4 Padding內容,爲了對象對齊而補充。
Padding eddb36f8: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
CPU: 2 PID: 773 Comm: slabinfo Tainted: G B O 4.4.0+ #93--------------------------------------------------------3. 檢查問題點的棧打印,這裏是由於slabinfo找出來的。
Hardware name: ARM-Versatile Express
[<c0016588>] (unwind_backtrace) from [<c0013070>] (show_stack+0x10/0x14)
[<c0013070>] (show_stack) from [<c0244130>] (dump_stack+0x78/0x88)
[<c0244130>] (dump_stack) from [<c00e1874>] (check_bytes_and_report+0xd0/0x10c)
[<c00e1874>] (check_bytes_and_report) from [<c00e1a14>] (check_object+0x164/0x234)
[<c00e1a14>] (check_object) from [<c00e29bc>] (validate_slab_slab+0x198/0x1bc)
[<c00e29bc>] (validate_slab_slab) from [<c00e578c>] (validate_store+0xac/0x190)
[<c00e578c>] (validate_store) from [<c0146780>] (kernfs_fop_write+0xb8/0x1b4)
[<c0146780>] (kernfs_fop_write) from [<c00ebfc4>] (__vfs_write+0x1c/0xd8)
[<c00ebfc4>] (__vfs_write) from [<c00ec808>] (vfs_write+0x90/0x170)
[<c00ec808>] (vfs_write) from [<c00ed008>] (SyS_write+0x3c/0x90)
[<c00ed008>] (SyS_write) from [<c000f3c0>] (ret_fast_syscall+0x0/0x3c)
FIX kmalloc-64: Restoring 0xeddb3640-0xeddb3643=0xcc----------------------------------------------------------4. 問題點是如何被解決的,此處恢復4個字節爲0xcc。

2.5.2 Object padding overwritten

複製代碼
void create_slub_error(void)
{
  int i;

  buf = kmalloc(32, GFP_KERNEL);
  if(buf) {
    buf[-1] = 0x55;------------------------------------------------------------------------向左越界訪問
    kfree(buf);
  }
}
複製代碼

執行insmod data/slub4.ko,結果如下。

這裏的越界訪問和之前有點不一樣的是,這裏向左越界。覆蓋到了Padding區域。

al: slub error test init
=============================================================================
BUG kmalloc-128 (Tainted: G O ): Object padding overwritten------------------------------------------------------覆蓋到Padding區域
-----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: 0xffff80007767e9ff-0xffff80007767e9ff. First byte 0x55 instead of 0x5a
INFO: Allocated in call_usermodehelper_setup+0x44/0xb8 age=1 cpu=1 pid=789
alloc_debug_processing+0x17c/0x188
___slab_alloc.constprop.30+0x3f8/0x440
__slab_alloc.isra.27.constprop.29+0x24/0x38
kmem_cache_alloc+0x1ec/0x260
call_usermodehelper_setup+0x44/0xb8
/ # kobject_uevent_env+0x494/0x500
kobject_uevent+0x10/0x18
load_module+0x18cc/0x1d78
SyS_init_module+0x150/0x178
el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2dd9f80 objects=16 used=9 fp=0xffff80007767ea00 flags=0x4081
INFO: Object 0xffff80007767e800 @offset=2048 fp=0xffff80007767ea00

Bytes b4 ffff80007767e7f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Object ffff80007767e800: 00 01 00 00 00 00 00 00 08 e8 67 77 00 80 ff ff ..........gw....
Object ffff80007767e810: 08 e8 67 77 00 80 ff ff f8 83 0c 00 00 80 ff ff ..gw............
Object ffff80007767e820: 00 00 00 00 00 00 00 00 00 6e aa 00 00 80 ff ff .........n......
Object ffff80007767e830: 00 23 67 78 00 80 ff ff 18 23 67 78 00 80 ff ff .#gx.....#gx....
Object ffff80007767e840: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff80007767e850: b8 8e 32 00 00 80 ff ff 00 23 67 78 00 80 ff ff ..2......#gx....
Object ffff80007767e860: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff80007767e870: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Redzone ffff80007767e880: cc cc cc cc cc cc cc cc ........
Padding ffff80007767e9c0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff80007767e9d0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff80007767e9e0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff80007767e9f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 55 ZZZZZZZZZZZZZZZU
CPU: 0 PID: 790 Comm: mdev Tainted: G B O 4.4.0+ #116
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff800000089738>] dump_backtrace+0x0/0x108
[<ffff800000089854>] show_stack+0x14/0x20
[<ffff8000003253c4>] dump_stack+0x94/0xd0
[<ffff800000196460>] print_trailer+0x128/0x1b8
[<ffff800000196848>] check_bytes_and_report+0xd8/0x118
[<ffff800000196928>] check_object+0xa0/0x240
[<ffff8000001987e0>] free_debug_processing+0x128/0x380
[<ffff80000019a1cc>] __slab_free+0x344/0x4a0
[<ffff80000019ab94>] kfree+0x1ec/0x220
[<ffff8000000c8278>] umh_complete+0x58/0x68
[<ffff8000000c83d8>] call_usermodehelper_exec_async+0x150/0x170
[<ffff800000085c50>] ret_from_fork+0x10/0x40
FIX kmalloc-128: Restoring 0xffff80007767e9ff-0xffff80007767e9ff=0x5a---------------------------------------------------------問題處理是將對應字節恢復爲0x5a。

2.5.3 Object already free

複製代碼
void create_slub_error(void)
{
  buf = kmalloc(32, GFP_KERNEL);
  if(buf) {
    memset(buf, 0x55, 32);
    kfree(buf);
    printk("al: Object already freed");
    kfree(buf);
  }
}
複製代碼

內核中free執行流程如下:

複製代碼
kfree
  ->slab_free
    ->__slab_free
      ->kmem_cache_debug
        ->free_debug_processing
->on_freelist
複製代碼

執行insmod data/slub2.ko,結果如下。

重複釋放,是對同一個對象連續釋放了多次。

al: slub error test init
al: Object already freed
=============================================================================
BUG kmalloc-128 (Tainted: G B O ): Object already free------------------------------------------------------------------在64位系統,32字節的kmalloc變成了kmalloc-128,問題類型是:Object already free,也即重複釋放。
-----------------------------------------------------------------------------

INFO: Allocated in create_slub_error+0x20/0x80 [slub2] age=0 cpu=1 pid=791------------------------------------內存分配點棧回溯
alloc_debug_processing+0x17c/0x188
___slab_alloc.constprop.30+0x3f8/0x440
__slab_alloc.isra.27.constprop.29+0x24/0x38
kmem_cache_alloc+0x1ec/0x260
create_slub_error+0x20/0x80 [slub2]
my_test_init+0x14/0x28 [slub2]
do_one_initcall+0x90/0x1a0
do_init_module+0x60/0x1cc
load_module+0x18dc/0x1d78
SyS_init_module+0x150/0x178
el0_svc_naked+0x24/0x28
INFO: Freed in create_slub_error+0x50/0x80 [slub2] age=0 cpu=1 pid=791------------------------------------------內存釋放點棧回溯
free_debug_processing+0x17c/0x380
__slab_free+0x344/0x4a0
kfree+0x1ec/0x220
create_slub_error+0x50/0x80 [slub2]
my_test_init+0x14/0x28 [slub2]
do_one_initcall+0x90/0x1a0
do_init_module+0x60/0x1cc
load_module+0x18dc/0x1d78
SyS_init_module+0x150/0x178
el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2dda800 objects=16 used=7 fp=0xffff8000776a0800 flags=0x4081
INFO: Object 0xffff8000776a0800 @offset=2048 fp=0xffff8000776a0a00

Bytes b4 ffff8000776a07f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Object ffff8000776a0800: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk-----------------內存內容打印,供128字節。
Object ffff8000776a0810: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0820: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0830: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0840: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0850: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0860: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0870: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
Redzone ffff8000776a0880: bb bb bb bb bb bb bb bb ........
Padding ffff8000776a09c0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776a09d0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776a09e0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776a09f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
CPU: 1 PID: 791 Comm: insmod Tainted: G B O 4.4.0+ #116--------------------------------------------------------------此處問題在insmod就發現了,所以檢查出問題的進程就是insmod。
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff800000089738>] dump_backtrace+0x0/0x108
[<ffff800000089854>] show_stack+0x14/0x20
[<ffff8000003253c4>] dump_stack+0x94/0xd0
[<ffff800000196460>] print_trailer+0x128/0x1b8
[<ffff800000198954>] free_debug_processing+0x29c/0x380
[<ffff80000019a1cc>] __slab_free+0x344/0x4a0
[<ffff80000019ab94>] kfree+0x1ec/0x220
[<ffff7ffffc008060>] create_slub_error+0x60/0x80 [slub2]
[<ffff7ffffc00a014>] my_test_init+0x14/0x28 [slub2]
[<ffff800000082930>] do_one_initcall+0x90/0x1a0
[<ffff80000014647c>] do_init_module+0x60/0x1cc
[<ffff800000120704>] load_module+0x18dc/0x1d78
[<ffff800000120cf0>] SyS_init_module+0x150/0x178
[<ffff800000085cb0>] el0_svc_naked+0x24/0x28
FIX kmalloc-128: Object at 0xffff8000776a0800 not freed------------------------------------------------------------------處理的結果是,此處slab 對象是沒有被釋放。

2.5.4 Poison overwritten

訪問已釋放內存的測試代碼如下:

複製代碼
static void create_slub_error(void)
{
  buf = kmalloc(32, GFP_KERNEL);-----------------------此時的buf內容都是0x6B
  if(buf) {
    kfree(buf);
    printk("al: Access after free");
    memset(buf, 0x55, 32);-----------------------------雖然被釋放,但是memset仍然生效了變成了0x55。
  }
}
複製代碼

執行insmod data/slub3.ko ,使用slabinfo -v查看結果。

=============================================================================

BUG kmalloc-128 (Tainted: G B O ): Poison overwritten----------------------------------------------slab名稱爲kmalloc-64,問題類型是:Poison overwritten,即訪問已釋放內存。
-----------------------------------------------------------------------------

 

INFO: 0xffff800077692800-0xffff80007769281f. First byte 0x55 instead of 0x6b
INFO: Allocated in create_slub_error+0x28/0xf0 [slub3] age=1089 cpu=1 pid=793----------分配點的棧回溯
alloc_debug_processing+0x17c/0x188
___slab_alloc.constprop.30+0x3f8/0x440
__slab_alloc.isra.27.constprop.29+0x24/0x38
kmem_cache_alloc+0x1ec/0x260
create_slub_error+0x28/0xf0 [slub3]
0xffff7ffffc00e014
do_one_initcall+0x90/0x1a0
do_init_module+0x60/0x1cc
load_module+0x18dc/0x1d78
SyS_init_module+0x150/0x178
el0_svc_naked+0x24/0x28
INFO: Freed in create_slub_error+0x80/0xf0 [slub3] age=1089 cpu=1 pid=793--------------釋放點的棧回溯
free_debug_processing+0x17c/0x380
__slab_free+0x344/0x4a0
kfree+0x1ec/0x220
create_slub_error+0x80/0xf0 [slub3]
0xffff7ffffc00e014
do_one_initcall+0x90/0x1a0
do_init_module+0x60/0x1cc
load_module+0x18dc/0x1d78
SyS_init_module+0x150/0x178
el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2dda480 objects=16 used=16 fp=0x (null) flags=0x4080
INFO: Object 0xffff800077692800 @offset=2048 fp=0xffff800077692400

 

Bytes b4 ffff8000776927f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Object ffff800077692800: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU--------前32字節仍然被修改成功。
Object ffff800077692810: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU
Object ffff800077692820: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692830: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692840: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692850: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692860: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692870: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
Redzone ffff800077692880: bb bb bb bb bb bb bb bb ........
Padding ffff8000776929c0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776929d0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776929e0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776929f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
CPU: 0 PID: 795 Comm: slabinfo Tainted: G B O 4.4.0+ #116
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff800000089738>] dump_backtrace+0x0/0x108
[<ffff800000089854>] show_stack+0x14/0x20
[<ffff8000003253c4>] dump_stack+0x94/0xd0
[<ffff800000196460>] print_trailer+0x128/0x1b8
[<ffff800000196848>] check_bytes_and_report+0xd8/0x118
[<ffff800000196a54>] check_object+0x1cc/0x240
[<ffff800000197920>] alloc_debug_processing+0x108/0x188
[<ffff800000199670>] ___slab_alloc.constprop.30+0x3f8/0x440
[<ffff8000001996dc>] __slab_alloc.isra.27.constprop.29+0x24/0x38
[<ffff8000001998dc>] kmem_cache_alloc+0x1ec/0x260
[<ffff8000001d42fc>] seq_open+0x34/0x90
[<ffff80000022059c>] kernfs_fop_open+0x194/0x370
[<ffff8000001afb04>] do_dentry_open+0x214/0x318
[<ffff8000001b0dc8>] vfs_open+0x58/0x68
[<ffff8000001bf338>] path_openat+0x460/0xdf0
[<ffff8000001c0ff0>] do_filp_open+0x60/0xe0
[<ffff8000001b117c>] do_sys_open+0x12c/0x218
[<ffff8000001fd53c>] compat_SyS_open+0x1c/0x28
[<ffff800000085cb0>] el0_svc_naked+0x24/0x28
FIX kmalloc-128: Restoring 0xffff800077692800-0xffff80007769281f=0x6b

 

FIX kmalloc-128: Marking all objects used
SLUB: kmalloc-128 210 slabs counted but counter=211
slabinfo (795) used greatest stack depth: 12976 bytes left

 

3. kmemleak

kmemleak是內核提供的一種檢測內存泄露工具,啓動一個內核線程掃描內存,並打印發現新的未引用對象數量。

3.1 支持kmemleak內核選項

要使用kmemlieak,需要打開如下內核選項。

Kernel hacking->Memory Debugging->Kernel memory leak detector:

CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=400
# CONFIG_DEBUG_KMEMLEAK_TEST is not set
CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y---------或者關閉此選項,則不需要在命令行添加kmemleak=on。

3.2 構造測試環境

同時還需要在內核啓動命令行中添加kmemleak=on。

qemu-system-aarch64 -machine virt -cpu cortex-a57 -machine type=virt -smp 2 -m 2048 -kernel arch/arm64/boot/Image --append "rdinit=/linuxrc console=ttyAMA0 loglevel=8 kmemleak=on" -nographic

 測試代碼如下:

複製代碼
static char *buf;

void create_kmemleak(void)
{
  buf = kmalloc(120, GFP_KERNEL);
  buf = vmalloc(4096);

}
複製代碼

3.3 進行測試

進行kmemleak測試之前,需要寫入scan觸發掃描操作。

然後通過讀kmemlean節點讀取相關信息。

  • 打開kmemlean掃描功能:echo scan > sys/kernel/debug/kmemleak
  • 加載問題module:insmod data/kmemleak.ko
  • 等待問題發現:kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
  • 查看kmemleak結果:cat /sys/kernel/debug/kmemleak

3.4 分析測試結果

每處泄露,都標出泄露地址和大小;相關進程信息;內存內容dump;棧回溯。

kmemleak會提示內存泄露可疑對象的具體棧調用信息、可疑對象的大小、使用哪個函數分配、二進制打印。

複製代碼
unreferenced object 0xede22dc0 (size 128):-------------------------------------第一處可疑泄露128字節
  comm "insmod", pid 765, jiffies 4294941257 (age 104.920s)--------------------相關進程信息
  hex dump (first 32 bytes):---------------------------------------------------二進制打印
    6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
    6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  backtrace:-------------------------------------------------------------------棧回溯
    [<bf002014>] 0xbf002014
    [<c000973c>] do_one_initcall+0x90/0x1d8
    [<c00a71f4>] do_init_module+0x60/0x38c
    [<c0086898>] load_module+0x1bac/0x1e94
    [<c0086ccc>] SyS_init_module+0x14c/0x15c
    [<c000f3c0>] ret_fast_syscall+0x0/0x3c
    [<ffffffff>] 0xffffffff
unreferenced object 0xf12ba000 (size 4096):
  comm "insmod", pid 765, jiffies 4294941257 (age 104.920s)
  hex dump (first 32 bytes):
    d8 21 00 00 02 18 00 00 e4 21 00 00 02 18 00 00  .!.......!......
    46 22 00 00 02 18 00 00 52 22 00 00 02 18 00 00  F"......R"......
  backtrace:
    [<c00d77c8>] vmalloc+0x2c/0x34
    [<bf002014>] 0xbf002014
    [<c000973c>] do_one_initcall+0x90/0x1d8
    [<c00a71f4>] do_init_module+0x60/0x38c
    [<c0086898>] load_module+0x1bac/0x1e94
    [<c0086ccc>] SyS_init_module+0x14c/0x15c
    [<c000f3c0>] ret_fast_syscall+0x0/0x3c
    [<ffffffff>] 0xffffffff
複製代碼

4. kasan

相關文檔閱讀:《Kasan - Linux 內核的內存檢測工具》《KASAN實現原理》。

kasan暫不支持32位ARM,支持ARM64和X86。

kasan是一個動態檢查內存錯誤的工具,可以檢查內存越界訪問、使用已釋放內存、重複釋放以及棧溢出。

4.1 使能kasan

使用kasan,必須打開CONFIG_KASAN。

Kernel hacking->Memory debugging->KASan: runtime memory debugger

CONFIG_HAVE_ARCH_KASAN=y
CONFIG_KASAN=y
# CONFIG_KASAN_OUTLINE is not set
CONFIG_KASAN_INLINE=y
CONFIG_TEST_KASAN=m

4.2 代碼分析

kasan_report

  ->kasan_report_error

    ->print_error_description

    ->print_address_description

    ->print_shadow_for_address

4.3 測試用及分析

kasan提供了一個測試程序test_kacan.c,將其編譯成模塊,加載到內核。可以模擬很多內存錯誤場景。

kasan可以檢測到越界訪問、訪問已釋放內存、重複釋放等類型錯誤,其中重複釋放藉助於slub_debug。

insmod data/kasan.ko

越界訪問包括slab越界、棧越界、全局變量越界;訪問已釋放內存use-after-free;重複釋放可以被slub_debug識別。

4.3.1 slab-out-of-bounds

複製代碼
static noinline void __init kmalloc_oob_right(void)
{
    char *ptr;
    size_t size = 123;

    pr_info("out-of-bounds to right\n");
    ptr = kmalloc(size, GFP_KERNEL);
    if (!ptr) {
        pr_err("Allocation failed\n");
        return;
    }

    ptr[size] = 'x';
    kfree(ptr);
}
複製代碼

此種錯誤類型是對slab的越界訪問,包括左側、右側、擴大、縮小後越界訪問。除了數組賦值,還包括memset、指針訪問等等。

al: kasan error test init
kasan test: kmalloc_oob_right out-of-bounds to right
==================================================================
BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0xa4/0xe0 [kasan] at addr ffff800066539c7b----------------錯誤類型是slab-out-of-bounds,在kmalloc_oob_right中產生。
Write of size 1 by task insmod/788
=============================================================================
BUG kmalloc-128 (Tainted: G O ): kasan: bad access detected-------------------------------------------------------------------slab非法非法訪問
-----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: Allocated in kmalloc_oob_right+0x54/0xe0 [kasan] age=0 cpu=1 pid=788--------------------------------------------問題點kmalloc_oob_right的棧回溯
alloc_debug_processing+0x17c/0x188
___slab_alloc.constprop.30+0x3f8/0x440
__slab_alloc.isra.27.constprop.29+0x24/0x38
kmem_cache_alloc+0x220/0x280
kmalloc_oob_right+0x54/0xe0 [kasan]
kmalloc_tests_init+0x18/0x70 [kasan]
do_one_initcall+0x11c/0x310
do_init_module+0x1cc/0x588
load_module+0x48cc/0x5dc0
SyS_init_module+0x1a8/0x1e0
el0_svc_naked+0x24/0x28
INFO: Freed in do_one_initcall+0x10c/0x310 age=0 cpu=1 pid=788
free_debug_processing+0x17c/0x368
__slab_free+0x344/0x4a0
kfree+0x21c/0x250
do_one_initcall+0x10c/0x310
do_init_module+0x1cc/0x588
load_module+0x48cc/0x5dc0
SyS_init_module+0x1a8/0x1e0
el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2994e00 objects=16 used=2 fp=0xffff800066539e00 flags=0x4080
INFO: Object 0xffff800066539c00 @offset=7168 fp=0xffff800066538200

Bytes b4 ffff800066539bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................------------------------------內存dump
Object ffff800066539c00: 00 82 53 66 00 80 ff ff 74 65 73 74 73 5f 69 6e ..Sf....tests_in
Object ffff800066539c10: 69 74 20 5b 6b 61 73 61 6e 5d 00 00 00 00 00 00 it [kasan]......
Object ffff800066539c20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539db0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539dc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539dd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539de0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539df0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
CPU: 1 PID: 788 Comm: insmod Tainted: G B O 4.4.0+ #108------------------------------------------------------------------打印此log消息的棧回溯
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff80000008e938>] dump_backtrace+0x0/0x270
[<ffff80000008ebbc>] show_stack+0x14/0x20
[<ffff800000735bb0>] dump_stack+0x100/0x188
[<ffff800000318f60>] print_trailer+0xf8/0x160
[<ffff80000031ea8c>] object_err+0x3c/0x50
[<ffff8000003209a0>] kasan_report_error+0x240/0x558
[<ffff800000320e90>] __asan_report_store1_noabort+0x48/0x50
[<ffff7ffffc008324>] kmalloc_oob_right+0xa4/0xe0 [kasan]
[<ffff7ffffc009070>] kmalloc_tests_init+0x18/0x70 [kasan]
[<ffff80000008309c>] do_one_initcall+0x11c/0x310
[<ffff8000002648c4>] do_init_module+0x1cc/0x588
[<ffff800000206724>] load_module+0x48cc/0x5dc0
[<ffff800000207dc0>] SyS_init_module+0x1a8/0x1e0
[<ffff800000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
ffff800066539b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff800066539b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff800066539c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03
^
ffff800066539c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff800066539d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

4.3.2 user-after-free

user-after-free是釋放後使用的意思。

複製代碼
static noinline void __init kmalloc_uaf(void)
{
    char *ptr;
    size_t size = 10;

    pr_info("use-after-free\n");
    ptr = kmalloc(size, GFP_KERNEL);
    if (!ptr) {
        pr_err("Allocation failed\n");
        return;
    }

    kfree(ptr);
    *(ptr + 8) = 'x';
}
複製代碼

測試結果如下:

kasan test: kmalloc_uaf use-after-free
==================================================================
BUG: KASAN: use-after-free in kmalloc_uaf+0xac/0xe0 [kasan] at addr ffff800066539e08
Write of size 1 by task insmod/788
=============================================================================
BUG kmalloc-128 (Tainted: G B O ): kasan: bad access detected
-----------------------------------------------------------------------------

INFO: Allocated in kmalloc_uaf+0x54/0xe0 [kasan] age=0 cpu=1 pid=788
alloc_debug_processing+0x17c/0x188
___slab_alloc.constprop.30+0x3f8/0x440
__slab_alloc.isra.27.constprop.29+0x24/0x38
kmem_cache_alloc+0x220/0x280
kmalloc_uaf+0x54/0xe0 [kasan]
kmalloc_tests_init+0x48/0x70 [kasan]
do_one_initcall+0x11c/0x310
do_init_module+0x1cc/0x588
load_module+0x48cc/0x5dc0
SyS_init_module+0x1a8/0x1e0
el0_svc_naked+0x24/0x28
INFO: Freed in kmalloc_uaf+0x84/0xe0 [kasan] age=0 cpu=1 pid=788
free_debug_processing+0x17c/0x368
__slab_free+0x344/0x4a0
kfree+0x21c/0x250
kmalloc_uaf+0x84/0xe0 [kasan]
kmalloc_tests_init+0x48/0x70 [kasan]
do_one_initcall+0x11c/0x310
do_init_module+0x1cc/0x588
load_module+0x48cc/0x5dc0
SyS_init_module+0x1a8/0x1e0
el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2994e00 objects=16 used=1 fp=0xffff800066539e00 flags=0x4080
INFO: Object 0xffff800066539e00 @offset=7680 fp=0xffff800066539800

Bytes b4 ffff800066539df0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539e00: 00 98 53 66 00 80 ff ff 00 00 00 00 00 00 00 00 ..Sf............
Object ffff800066539e10: 00 9e 53 66 00 80 ff ff d0 51 12 00 00 80 ff ff ..Sf.....Q......
Object ffff800066539e20: 00 00 00 00 00 00 00 00 e0 14 6d 01 00 80 ff ff ..........m.....
Object ffff800066539e30: 00 69 a3 66 00 80 ff ff 18 69 a3 66 00 80 ff ff .i.f.....i.f....
Object ffff800066539e40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539e50: 30 da 73 00 00 80 ff ff 00 69 a3 66 00 80 ff ff 0.s......i.f....
Object ffff800066539e60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539e70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
CPU: 1 PID: 788 Comm: insmod Tainted: G B O 4.4.0+ #108
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff80000008e938>] dump_backtrace+0x0/0x270
[<ffff80000008ebbc>] show_stack+0x14/0x20
[<ffff800000735bb0>] dump_stack+0x100/0x188
[<ffff800000318f60>] print_trailer+0xf8/0x160
[<ffff80000031ea8c>] object_err+0x3c/0x50
[<ffff8000003209a0>] kasan_report_error+0x240/0x558
[<ffff800000320e90>] __asan_report_store1_noabort+0x48/0x50
[<ffff7ffffc00874c>] kmalloc_uaf+0xac/0xe0 [kasan]
[<ffff7ffffc0090a0>] kmalloc_tests_init+0x48/0x70 [kasan]
[<ffff80000008309c>] do_one_initcall+0x11c/0x310
[<ffff8000002648c4>] do_init_module+0x1cc/0x588
[<ffff800000206724>] load_module+0x48cc/0x5dc0
[<ffff800000207dc0>] SyS_init_module+0x1a8/0x1e0
[<ffff800000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
ffff800066539d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff800066539d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff800066539e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff800066539e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff800066539f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

4.3.3 stack-out-of-bounds

棧越界訪問是函數中數組越界,在實際工程中經常出現,問題難以發現。

複製代碼
static noinline void __init kasan_stack_oob(void)
{
    char stack_array[10];
    volatile int i = 0;
    char *p = &stack_array[ARRAY_SIZE(stack_array) + i];

    pr_info("out-of-bounds on stack\n");
    *(volatile char *)p;
}
複製代碼

 

kasan test: kasan_stack_oob out-of-bounds on stack
==================================================================
BUG: KASAN: stack-out-of-bounds in kasan_stack_oob+0xa8/0xf0 [kasan] at addr ffff800066acb95a
Read of size 1 by task insmod/788
page:ffff7bffc29ab2c0 count:0 mapcount:0 mapping: (null) index:0x0
flags: 0x0()
page dumped because: kasan: bad access detected
CPU: 1 PID: 788 Comm: insmod Tainted: G B O 4.4.0+ #108
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff80000008e938>] dump_backtrace+0x0/0x270
[<ffff80000008ebbc>] show_stack+0x14/0x20
[<ffff800000735bb0>] dump_stack+0x100/0x188
[<ffff800000320c90>] kasan_report_error+0x530/0x558
[<ffff800000320d00>] __asan_report_load1_noabort+0x48/0x50
[<ffff7ffffc0080a8>] kasan_stack_oob+0xa8/0xf0 [kasan]
[<ffff7ffffc0090b0>] kmalloc_tests_init+0x58/0x70 [kasan]
[<ffff80000008309c>] do_one_initcall+0x11c/0x310
[<ffff8000002648c4>] do_init_module+0x1cc/0x588
[<ffff800000206724>] load_module+0x48cc/0x5dc0
[<ffff800000207dc0>] SyS_init_module+0x1a8/0x1e0
[<ffff800000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
ffff800066acb800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff800066acb880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
>ffff800066acb900: f1 f1 04 f4 f4 f4 f2 f2 f2 f2 00 02 f4 f4 f3 f3
^
ffff800066acb980: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
ffff800066acba00: f1 f1 00 00 00 00 00 00 00 00 f3 f3 f3 f3 00 00
==================================================================

4.3.4 global-out-of-bounds

複製代碼
static char global_array[10];

static noinline void __init kasan_global_oob(void)
{
    volatile int i = 3;
    char *p = &global_array[ARRAY_SIZE(global_array) + i];

    pr_info("out-of-bounds global variable\n");
    *(volatile char *)p;
}
複製代碼

測試結果如下:

kasan test: kasan_global_oob out-of-bounds global variable
==================================================================
BUG: KASAN: global-out-of-bounds in kasan_global_oob+0x9c/0xe8 [kasan] at addr ffff7ffffc001c8d
Read of size 1 by task insmod/788
Address belongs to variable global_array+0xd/0xffffffffffffe3f8 [kasan]
CPU: 1 PID: 788 Comm: insmod Tainted: G B O 4.4.0+ #108
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff80000008e938>] dump_backtrace+0x0/0x270
[<ffff80000008ebbc>] show_stack+0x14/0x20
[<ffff800000735bb0>] dump_stack+0x100/0x188
[<ffff800000320c90>] kasan_report_error+0x530/0x558
[<ffff800000320d00>] __asan_report_load1_noabort+0x48/0x50
[<ffff7ffffc00818c>] kasan_global_oob+0x9c/0xe8 [kasan]
[<ffff7ffffc0090b4>] kmalloc_tests_init+0x5c/0x70 [kasan]
[<ffff80000008309c>] do_one_initcall+0x11c/0x310
[<ffff8000002648c4>] do_init_module+0x1cc/0x588
[<ffff800000206724>] load_module+0x48cc/0x5dc0
[<ffff800000207dc0>] SyS_init_module+0x1a8/0x1e0
[<ffff800000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
ffff7ffffc001b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff7ffffc001c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff7ffffc001c80: 00 02 fa fa fa fa fa fa 00 00 00 00 00 00 00 00
^
ffff7ffffc001d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff7ffffc001d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================

5. 小結

kmemleak檢查內存泄露的獨門絕技,讓其有一定市場空間。但功能比較單一,專注於內存泄露問題。

對於非ARM64/x86平臺,只能使用slub_debug進行內存問題分析;kasan更高效,但也需要更高的內核和GCC版本支持。

 

聯繫方式:[email protected]
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章