About Oops on Linux (hantown, ChinaUnix blog)

The following content was moved over from my old Baidu Space and is kept here as a backup.

The following is reposted from the web:

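When writing kernel code you will often run into an oops, and most of them are caused by invalid pointers. Below is reposted English material about oops messages that someone else had translated.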
Oops Messages
Most bugs show themselves in NULL pointer dereferences or by the use of other incorrect pointer values. The usual outcome of such bugs is an oops message.

Almost any address used by the processor is a virtual address and is mapped to physical addresses through a complex structure of page tables (the exceptions are physical addresses used with the memory management subsystem itself). When an invalid pointer is dereferenced, the paging mechanism fails to map the pointer to a physical address, and the processor signals a page fault to the operating system. If the address is not valid, the kernel is not able to "page in" the missing address; it (usually) generates an oops if this happens while the processor is in supervisor mode.
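
To make the page-table lookup a little more concrete, here is a minimal sketch of how a 32-bit virtual address is split into the indices the hardware walks. It assumes classic 32-bit x86 paging without PAE and with 4 KiB pages (10-bit directory index, 10-bit table index, 12-bit offset); the struct and function names are made up for illustration and are not kernel APIs.

```c
#include <stdint.h>

/*
 * Assumption: classic 32-bit x86 paging, no PAE, 4 KiB pages.
 * The names below are illustrative only, not kernel interfaces.
 */
struct va_parts {
    uint32_t dir_index;   /* bits 31..22: entry in the page directory */
    uint32_t table_index; /* bits 21..12: entry in the page table */
    uint32_t page_offset; /* bits 11..0:  byte offset inside the page */
};

/* Split a virtual address into the three fields used by the page walk. */
struct va_parts split_virtual_address(uint32_t va)
{
    struct va_parts p;

    p.dir_index   = va >> 22;
    p.table_index = (va >> 12) & 0x3ffu;
    p.page_offset = va & 0xfffu;
    return p;
}
```

For the NULL pointer all three fields are zero; if the corresponding entry is not marked present, the walk fails and the processor raises the page fault described above.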

An oops displays the processor status at the time of the fault, including the contents of the CPU registers and other seemingly incomprehensible information. The message is generated by printk statements in the fault handler (arch/*/kernel/traps.c) and is dispatched as described earlier in Section 4.2.1.

Let's look at one such message. Here's what results from dereferencing a NULL pointer on a PC running Version 2.6 of the kernel. The most relevant information here is the instruction pointer (EIP), the address of the faulty instruction.

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
d083a064
Oops: 0002 [#1]
SMP
CPU:    0
EIP:    0060:[<d083a064>]    Not tainted
EFLAGS: 00010246   (2.6.6)
EIP is at faulty_write+0x4/0x10 [faulty]
eax: 00000000   ebx: 00000000   ecx: 00000000   edx: 00000000
esi: cf8b2460   edi: cf8b2480   ebp: 00000005   esp: c31c5f74
ds: 007b   es: 007b   ss: 0068
Process bash (pid: 2086, threadinfo=c31c4000 task=cfa0a6c0)
Stack: c0150558 cf8b2460 080e9408 00000005 cf8b2480 00000000 cf8b2460 cf8b2460
       fffffff7 080e9408 c31c4000 c0150682 cf8b2460 080e9408 00000005 cf8b2480
       00000000 00000001 00000005 c0103f8f 00000001 080e9408 00000005 00000005
Call Trace:
 [<c0150558>] vfs_write+0xb8/0x130
 [<c0150682>] sys_write+0x42/0x70
 [<c0103f8f>] syscall_call+0x7/0xb

Code: 89 15 00 00 00 00 c3 90 8d 74 26 00 83 ec 0c b8 00 a6 83 d0

This message was generated by writing to a device owned by the faulty module, a module built deliberately to demonstrate failures. The implementation of the write method of faulty.c is trivial:

ssize_t faulty_write(struct file *filp, const char __user *buf,
                     size_t count, loff_t *pos)
{
    /* make a simple fault by dereferencing a NULL pointer */
    *(int *)0 = 0;
    return 0;
}

As you can see, what we do here is dereference a NULL pointer. Since 0 is never a valid pointer value, a fault occurs, which the kernel turns into the oops message shown earlier. The calling process is then killed.

The faulty module has a different fault condition in its read implementation:

ssize_t faulty_read(struct file *filp, char __user *buf,
                    size_t count, loff_t *pos)
{
    int ret;
    char stack_buf[4];

    /* Let's try a buffer overflow */
    memset(stack_buf, 0xff, 20);
    if (count > 4)
        count = 4;
    /* copy 4 bytes to the user */
    ret = copy_to_user(buf, stack_buf, count);
    if (!ret)
        return count;
    return ret;
}

This method copies a string into a local variable (here, 20 bytes of 0xff written by memset); unfortunately, the string is longer than the 4-byte destination array. The resulting buffer overflow causes an oops when the function returns. Since the return instruction brings the instruction pointer to nowhere land, this kind of fault is much harder to trace, and you can get something such as the following:

EIP:    0010:[<00000000>]
Unable to handle kernel paging request at virtual address ffffffff
 printing eip:
ffffffff
Oops: 0000 [#5]
SMP
CPU:    0
EIP:    0060:[<ffffffff>]    Not tainted
EFLAGS: 00010296   (2.6.6)
EIP is at 0xffffffff
eax: 0000000c   ebx: ffffffff   ecx: 00000000   edx: bfffda7c
esi: cf434f00   edi: ffffffff   ebp: 00002000   esp: c27fff78
ds: 007b   es: 007b   ss: 0068
Process head (pid: 2331, threadinfo=c27fe000 task=c3226150)
Stack: ffffffff bfffda70 00002000 cf434f20 00000001 00000286 cf434f00 fffffff7
       bfffda70 c27fe000 c0150612 cf434f00 bfffda70 00002000 cf434f20 00000000
       00000003 00002000 c0103f8f 00000003 bfffda70 00002000 00002000 bfffda70
Call Trace:
 [<c0150612>] sys_read+0x42/0x70
 [<c0103f8f>] syscall_call+0x7/0xb

Code:  Bad EIP value.

In this case, we see only part of the call stack (vfs_read and faulty_read are missing), and the kernel complains about a "bad EIP value." That complaint, and the offending address (ffffffff) listed at the beginning, are both hints that the kernel stack has been corrupted.
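
The two listings also differ in the number printed after "Oops:" (0002 for the write, 0000 for the bad return). On x86 this is the page-fault error code, and its low three bits say what kind of access faulted. The following is a small sketch that decodes those bits; the struct and function names are invented for this example, and only the bit meanings come from the x86 architecture.

```c
#include <stdbool.h>
#include <stdio.h>

/* Low three bits of the x86 page-fault error code printed after "Oops:". */
struct oops_code {
    bool protection_fault; /* bit 0: 1 = protection violation, 0 = page not present */
    bool write_access;     /* bit 1: 1 = faulting access was a write, 0 = a read */
    bool user_mode;        /* bit 2: 1 = fault raised in user mode, 0 = kernel mode */
};

/* Decode a numeric error code into its three flags. */
struct oops_code decode_oops_code(unsigned int code)
{
    struct oops_code c;

    c.protection_fault = (code & 0x1u) != 0;
    c.write_access     = (code & 0x2u) != 0;
    c.user_mode        = (code & 0x4u) != 0;
    return c;
}

/*
 * Parse a line such as "Oops: 0002 [#1]" (the code is printed in hex).
 * Returns 0 on success, -1 if the line does not have the expected shape.
 */
int parse_oops_line(const char *line, struct oops_code *out)
{
    unsigned int code;

    if (sscanf(line, "Oops: %x", &code) != 1)
        return -1;
    *out = decode_oops_code(code);
    return 0;
}
```

Read this way, 0002 is a kernel-mode write to a page that is not present (the store through the NULL pointer), while 0000 is a kernel-mode read of a non-present page (the attempt to fetch code at ffffffff).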

In general, when you are confronted with an oops, the first thing to do is to look at the location where the problem happened, which is usually listed separately from the call stack. In the first oops shown above, the relevant line is:

EIP is at faulty_write+0x4/0x10 [faulty]

Here we see that we were in the function faulty_write, which is located in the faulty module (which is listed in square brackets). The hex numbers indicate that the instruction pointer was 4 bytes into the function, which appears to be 10 (hex) bytes long, that is, 16 bytes. Often that is enough to figure out what the problem is.

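
As an illustration of how that line breaks down into symbol, offset, size, and module, here is a minimal user-space sketch that parses it with sscanf. The struct and function names are hypothetical helpers written for this example; the only assumption about the input is the "symbol+offset/size [module]" layout seen in the 2.6 oops above.

```c
#include <stdio.h>

/* Fields of an "EIP is at symbol+0xOFF/0xSIZE [module]" line. */
struct oops_location {
    char symbol[64];      /* function containing the faulting instruction */
    unsigned long offset; /* bytes from the start of the function */
    unsigned long size;   /* length of the function in bytes */
    char module[64];      /* module name, or "" if none was printed */
};

/*
 * Parse the location line. Returns 0 on success, -1 if the line has no
 * symbol+offset/size part (for example "EIP is at 0xffffffff").
 */
int parse_eip_line(const char *line, struct oops_location *loc)
{
    int n;

    loc->module[0] = '\0';
    n = sscanf(line, "EIP is at %63[^+]+%lx/%lx [%63[^]]]",
               loc->symbol, &loc->offset, &loc->size, loc->module);
    return (n >= 3) ? 0 : -1;
}
```

Fed the line from the first oops, this yields symbol faulty_write, offset 4, size 16, and module faulty. The line from the second oops, which contains only a raw address, is rejected.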
If you need more information, the call stack shows you how you got to where things fell apart. The stack itself is printed in hex form; with a bit of work, you can often determine the values of local variables and function parameters from the stack listing. Experienced kernel developers can benefit from a certain amount of pattern recognition here; for example, if we look at the stack listing from the faulty_read oops:

Stack: ffffffff bfffda70 00002000 cf434f20 00000001 00000286 cf434f00 fffffff7
       bfffda70 c27fe000 c0150612 cf434f00 bfffda70 00002000 cf434f20 00000000
       00000003 00002000 c0103f8f 00000003 bfffda70 00002000 00002000 bfffda70

The ffffffff at the top of the stack is part of our string that broke things. On the x86 architecture, by default, the user-space stack starts just below 0xc0000000; thus, the recurring value 0xbfffda70 is probably a user-space stack address; it is, in fact, the address of the buffer passed to the read system call, replicated each time it is passed down the kernel call chain. On the x86 (again, by default), kernel space starts at 0xc0000000, so values above that are almost certainly kernel-space addresses, and so on.
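
This kind of pattern recognition can be sketched as a small classifier for stack words. It assumes the default 32-bit x86 split at 0xc0000000 described above; the extra rule that treats values in the top 4 KiB as small negative integers (such as ffffffff or fffffff7) is a heuristic added for this example, and all names are hypothetical.

```c
#include <stdint.h>

/* Assumption: default 3G/1G split on 32-bit x86, as described in the text. */
#define KERNEL_SPACE_START 0xc0000000u

enum word_kind {
    WORD_LOW_RANGE,    /* below the split: user-space address or plain data */
    WORD_KERNEL_RANGE, /* at or above the split: probably a kernel address */
    WORD_NEGATIVE_INT  /* heuristic: top 4 KiB, likely -1, a negative errno, or fill bytes */
};

/* Give a rough guess about what a 32-bit word from an oops stack dump is. */
enum word_kind classify_stack_word(uint32_t word)
{
    if (word >= 0xfffff000u)
        return WORD_NEGATIVE_INT;
    if (word >= KERNEL_SPACE_START)
        return WORD_KERNEL_RANGE;
    return WORD_LOW_RANGE;
}
```

Applied to the listing, bfffda70 falls in the low range, c0150612 and cf434f00 fall in the kernel range, and ffffffff and fffffff7 are flagged as probable data rather than addresses. The result is only a hint: a count such as 00002000 also lands in the low range.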

Finally, when looking at oops listings, always be on the lookout for the "slab poisoning" values discussed at the beginning of this chapter. Thus, for example, if you get a kernel oops where the offending address is 0xa5a5a5a5, you are almost certainly forgetting to initialize dynamic memory somewhere.
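
A check for that pattern is easy to sketch. The helper below tests whether a 32-bit value consists of one byte repeated four times, and a wrapper applies it to the 0xa5 poison byte mentioned above. Both function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Poison byte for uninitialized slab memory, as mentioned in the text. */
#define POISON_BYTE_UNINIT 0xa5u

/* True if every byte of value equals the given byte. */
bool is_repeated_byte(uint32_t value, uint8_t byte)
{
    uint32_t pattern = 0x01010101u * byte; /* replicate byte into all four lanes */

    return value == pattern;
}

/* True if a faulting address looks like uninitialized-slab poison. */
bool looks_like_uninit_poison(uint32_t addr)
{
    return is_repeated_byte(addr, POISON_BYTE_UNINIT);
}
```

This only catches the exact pattern. When the kernel dereferences a field at some offset from a poisoned pointer, the faulting address is the pattern plus that offset, so a near match deserves the same suspicion.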

Please note that you see a symbolic call stack (as shown above) only if your kernel is built with the CONFIG_KALLSYMS option turned on. Otherwise, you see a bare, hexadecimal listing, which is far less useful until you have decoded it in other ways.
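
One way to decode a bare address by hand is to look it up in a sorted symbol table, which is conceptually what the symbolic trace does for you. The sketch below is a user-space illustration, not the kernel's kallsyms code. The table is a hypothetical excerpt whose start addresses are derived from the listings above by subtracting each printed offset from the matching return address on the stack (for example c0150558 minus 0xb8 for vfs_write).

```c
#include <stddef.h>
#include <stdint.h>

struct ksym {
    uint32_t start;   /* address of the first byte of the function */
    const char *name;
};

/*
 * Hypothetical excerpt, sorted by address. Start addresses are derived
 * from the oops listings; the last entry only marks the end of the excerpt.
 */
static const struct ksym symtab[] = {
    { 0xc01504a0u, "vfs_write" },
    { 0xc01505d0u, "sys_read"  },
    { 0xc0150640u, "sys_write" },
    { 0xc01506b0u, "(end of excerpt)" },
};

/*
 * Resolve addr to symbol+offset/size. The size is taken as the distance
 * to the next symbol. Returns 0 on success, -1 if addr is outside the table.
 */
int resolve_address(uint32_t addr, const char **name,
                    uint32_t *offset, uint32_t *size)
{
    size_t n = sizeof(symtab) / sizeof(symtab[0]);
    size_t lo = 0, hi = n - 1;

    if (addr < symtab[0].start || addr >= symtab[n - 1].start)
        return -1;

    /* Binary search; invariant: symtab[lo].start <= addr < symtab[hi].start */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;

        if (symtab[mid].start <= addr)
            lo = mid;
        else
            hi = mid;
    }

    *name = symtab[lo].name;
    *offset = addr - symtab[lo].start;
    *size = symtab[hi].start - symtab[lo].start;
    return 0;
}
```

With this table, c0150558 resolves to vfs_write at offset 0xb8 of 0x130 and c0150612 to sys_read at offset 0x42 of 0x70, matching the symbolic traces. On a real system the table would come from a symbol listing for the running kernel rather than being typed in by hand.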
