Linux kernel crash analysis example

Issue reported:

With USB connected in Mass Storage mode, copy a file from the external SD card to the clipboard.

Then disconnect USB and try to paste the clipboard file into the internal SD card; the paste fails.

Reconnect USB, and the target crashes several minutes later.


Crash Context:

<4>[32284.249267] C0 [      swapper/0] CPU: 0    Tainted: G        W     (3.4.5 #1)
<4>[32284.256499] C0 [      swapper/0] PC is at DWC_WORKQ_SCHEDULE+0xd4/0x108
<4>[32284.263152] C0 [      swapper/0] LR is at DWC_WORKQ_SCHEDULE+0x94/0x108
<4>[32284.269866] C0 [      swapper/0] pc : [<c0332fc4>]    lr : [<c0332f84>]    psr: 800001d3
<4>[32284.269897] C0 [      swapper/0] sp : c0947dc0  ip : dc050000  fp : 00000000
<4>[32284.285064] C0 [      swapper/0] r10: 00000000  r9 : 00000000  r8 : db376e04
<4>[32284.292174] C0 [      swapper/0] r7 : db370000  r6 : c0318050  r5 : db2d6a40  r4 : d972e140
<4>[32284.300597] C0 [      swapper/0] r3 : 00000000  r2 : caa91840  r1 : d972e158  r0 : 0000007f
<4>[32284.309020] C0 [      swapper/0] Flags: Nzcv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
<4>[32284.318389] C0 [      swapper/0] Control: 10c53c7d  Table: 8291806a  DAC: 00000015
<4>[32284.326018] C0 [      swapper/0] 
<4>[32284.326049] C0 [      swapper/0] PC: 0xc0332f44:
<4>[32284.334014] C0 [      swapper/0] 2f44  e595000c ebfffc5c e3a00044 ebffff3d e1a04000 e88000c0 e5805008 e59f007c
<4>[32284.344451] C0 [      swapper/0] 2f64  ebffff3a e59f1078 e59f2078 e584000c e1a03000 e59f0070 e58d4000 ebfffd7b
<4>[32284.354858] C0 [      swapper/0] 2f84  e3a03c05 e5843018 e284301c e584301c e5843020 e2841018 e59f3050 e5843024
<4>[32284.365264] C0 [      swapper/0] 2fa4  e2853010 e5843010 e5952014 e5842014 e5952010 e1520003 05854010 15953014
<4>[32284.375671] C0 [      swapper/0] 2fc4  15834010 e5854014 e5950000 ebf4e615 e28dd018 e8bd40f0 e28dd004 e12fff1e
<4>[32284.386077] C0 [      swapper/0] 2fe4  c0a71d34 c063fef4 c076b31a c07a722f c03326a8 e92d4038 e1a04000 e59f501c
<4>[32284.396484] C0 [      swapper/0] 3004  ea000003 e5953004 e2444001 e59f0010 e12fff33 e3540000 1afffff9 e8bd8038
<4>[32284.406890] C0 [      swapper/0] 3024  c09944e0 066665b0 e92d4008 e59f3008 e5933008 e12fff33 e8bd8008 c09944e0
<4>[32284.417358] C0 [      swapper/0] 
<4>[32284.417358] C0 [      swapper/0] LR: 0xc0332f04:
<4>[32284.425353] C0 [      swapper/0] 2f04  e1a06001 e1a07002 e3a01080 e59d202c e59f00c8 e58d300c ebfc2336 e5950008
<4>[32284.435790] C0 [      swapper/0] 2f24  e28d1010 ebfffd28 e5953004 e59d1010 e2833001 e5950008 e5853004 eb0b2c0d
<4>[32284.446197] C0 [      swapper/0] 2f44  e595000c ebfffc5c e3a00044 ebffff3d e1a04000 e88000c0 e5805008 e59f007c
<4>[32284.456634] C0 [      swapper/0] 2f64  ebffff3a e59f1078 e59f2078 e584000c e1a03000 e59f0070 e58d4000 ebfffd7b
<4>[32284.467040] C0 [      swapper/0] 2f84  e3a03c05 e5843018 e284301c e584301c e5843020 e2841018 e59f3050 e5843024
<4>[32284.477477] C0 [      swapper/0] 2fa4  e2853010 e5843010 e5952014 e5842014 e5952010 e1520003 05854010 15953014
<4>[32284.487884] C0 [      swapper/0] 2fc4  15834010 e5854014 e5950000 ebf4e615 e28dd018 e8bd40f0 e28dd004 e12fff1e
<4>[32284.498321] C0 [      swapper/0] 2fe4  c0a71d34 c063fef4 c076b31a c07a722f c03326a8 e92d4038 e1a04000 e59f501c

System Triage Procedure:

1) From the crash context, identify the faulting function: PC is at DWC_WORKQ_SCHEDULE+0xd4/0x108

2) Get the assembly code for the offending function via objdump

3) The ARM assembly code is listed below

EXPORT_SYMBOL(DWC_WORKQ_FREE);

void DWC_WORKQ_SCHEDULE(dwc_workq_t *wq, dwc_work_callback_t work_cb,
			void *data, char *format, ...)
{
    107c:	e52d3004 	push	{r3}		; (str r3, [sp, #-4]!)	// function start 0x107c + offset 0xd4 => 0x1150
    1080:	e92d40f0 	push	{r4, r5, r6, r7, lr}
    1084:	e24dd018 	sub	sp, sp, #24
    1088:	e1a05000 	mov	r5, r0
	int64_t flags;
	work_container_t *container;
	static char name[128];

	va_list args;
	va_start(args, format);
    108c:	e28d3030 	add	r3, sp, #48	; 0x30
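
The arithmetic behind the 0x1150 offset can be checked mechanically. The sketch below uses hypothetical helper names (objdump_addr and runtime_func_start are not part of the kernel or the driver). It assumes only that the +0xd4 in DWC_WORKQ_SCHEDULE+0xd4/0x108 is a byte offset from the function's first instruction, which sits at 0x107c in the object file.

```c
#include <stdint.h>

/* Address of the faulting instruction inside the objdump listing:
 * function start in the object file plus the offset from the oops. */
static uint32_t objdump_addr(uint32_t func_start_in_obj, uint32_t oops_offset)
{
    return func_start_in_obj + oops_offset;
}

/* Runtime address of the function's first instruction:
 * the faulting PC minus the offset from the oops. */
static uint32_t runtime_func_start(uint32_t pc, uint32_t oops_offset)
{
    return pc - oops_offset;
}
```

With the values from this crash, objdump_addr(0x107c, 0xd4) gives 0x1150 and runtime_func_start(0xc0332fc4, 0xd4) gives 0xc0332ef0.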

4) Check the code at offset 0x1150

#ifdef DEBUG
	DWC_CIRCLEQ_INSERT_TAIL(&wq->entries, container, entry);
    1130:	e2853010 	add	r3, r5, #16
    1134:	e5843010 	str	r3, [r4, #16]
    1138:	e5952014 	ldr	r2, [r5, #20]
    113c:	e5842014 	str	r2, [r4, #20]
    1140:	e5952010 	ldr	r2, [r5, #16]
    1144:	e1520003 	cmp	r2, r3
    1148:	05854010 	streq	r4, [r5, #16]
    114c:	15953014 	ldrne	r3, [r5, #20]
    1150:	15834010 	strne	r4, [r3, #16]							// faulting PC 0xc0332fc4
    1154:	e5854014 	str	r4, [r5, #20]
#endif

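5) Check the ARM instruction against the crash context: the dump line starting at 2fc4 (the PC, 0xc0332fc4) begins with 15834010, the same encoding as the strne at offset 0x1150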

<4>[32284.375671] C0 [      swapper/0] 2fc4  15834010 e5854014 e5950000 ebf4e615 e28dd018 e8bd40f0 e28dd004 e12fff1e
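
To double-check the match by hand, the word at the PC can be picked out of the dump line and decoded. The following is a minimal sketch, not a full disassembler: the struct and function names are made up for illustration, and it handles only the ARM load/store word or byte encoding with an immediate offset (bits 27:25 equal to 010), which is the form 15834010 uses.

```c
#include <stdint.h>

/* Decoded fields of an ARM LDR/STR with immediate offset.
 * The B, P and W bits are ignored in this sketch. */
struct ldr_str_imm {
    unsigned cond;     /* condition code, bits 31:28 (0x1 = NE) */
    unsigned add;      /* U bit: 1 = base + offset, 0 = base - offset */
    unsigned is_load;  /* L bit: 1 = LDR, 0 = STR */
    unsigned rn;       /* base register number */
    unsigned rd;       /* register that is loaded or stored */
    unsigned imm12;    /* unsigned 12-bit offset */
};

/* Index of the 32-bit word holding `pc` in a dump line that
 * starts at address `line_base`. */
static unsigned word_index(uint32_t line_base, uint32_t pc)
{
    return (pc - line_base) / 4;
}

/* Returns 1 and fills *out if `insn` is LDR/STR with an immediate
 * offset, 0 otherwise. */
static int decode_ldr_str_imm(uint32_t insn, struct ldr_str_imm *out)
{
    unsigned cond = insn >> 28;

    /* bits 27:25 must be 010; cond 0xF is a different encoding space */
    if (((insn >> 25) & 0x7u) != 0x2u || cond == 0xFu)
        return 0;

    out->cond    = cond;
    out->add     = (insn >> 23) & 1u;
    out->is_load = (insn >> 20) & 1u;
    out->rn      = (insn >> 16) & 0xFu;
    out->rd      = (insn >> 12) & 0xFu;
    out->imm12   = insn & 0xFFFu;
    return 1;
}
```

For 0x15834010 this yields cond = 1 (NE), a store, rn = 3, rd = 4 and imm12 = 16, that is strne r4, [r3, #16]. word_index(0xc0332fc4, 0xc0332fc4) is 0, so the instruction is the first word on the 2fc4 line.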

6) Now we can conclude that the NULL pointer dereference occurs at offset 0x1150, in strne r4, [r3, #16].

7) Check against the C source of DWC_CIRCLEQ_INSERT_TAIL()

#define DWC_CIRCLEQ_INSERT_TAIL(head, elm, field) do {            \
    (elm)->field.cqe_next = DWC_CIRCLEQ_END(head);            \
    (elm)->field.cqe_prev = (head)->cqh_last;            \
    if ((head)->cqh_first == DWC_CIRCLEQ_END(head))            \
        (head)->cqh_first = (elm);                \
    else                                \
        (head)->cqh_last->field.cqe_next = (elm);    \
    (head)->cqh_last = (elm);    \
} while (0)

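8) Analyze the crash context

r3 : 00000000

The preceding ldrne r3, [r5, #20] loads (head)->cqh_last into R3, so R3 = (head)->cqh_last and it is a NULL pointer. The strne then writes to address 0x10 (R3 + 16), which faults.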
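
Two details from the register dump support this reading. The sketch below (helper names are illustrative, not kernel APIs) tests the Z flag in psr: 800001d3 to confirm that the NE condition held, so the strne really executed, and computes the address the store targeted.

```c
#include <stdint.h>

/* Z (zero) flag is bit 30 of the ARM program status register. */
#define PSR_Z (1u << 30)

/* The NE condition passes when the Z flag is clear. */
static int cond_ne_passes(uint32_t psr)
{
    return (psr & PSR_Z) == 0;
}

/* Effective address of `str rX, [rn, #imm]` with a positive offset. */
static uint32_t store_address(uint32_t rn_value, uint32_t imm)
{
    return rn_value + imm;
}
```

cond_ne_passes(0x800001d3) returns 1, which agrees with the lowercase z in "Flags: Nzcv", and store_address(0, 16) is 0x10, the offset of the cqe_next field applied to a NULL element pointer.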

9) Add protection code for the NULL pointer: check (head)->cqh_last before dereferencing it.
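
One possible shape for such a guard is sketched below. This is not the driver's actual fix: the types work_item and work_queue_head and the functions circleq_init and circleq_insert_tail_checked are simplified stand-ins written for illustration, mirroring only the cqh_first/cqh_last and cqe_next/cqe_prev fields used by the macro. The end marker is assumed to be the head's own address, which is what the add r3, r5, #16 / str r3, [r4, #16] pair in the listing stores into cqe_next.

```c
#include <stddef.h>

struct work_item;

/* Simplified stand-in for the circular queue head. */
struct work_queue_head {
    struct work_item *cqh_first;
    struct work_item *cqh_last;
};

/* Simplified stand-in for a queued element and its "entry" field. */
struct work_item {
    struct {
        struct work_item *cqe_next;
        struct work_item *cqe_prev;
    } entry;
};

/* The head's own address serves as the end-of-queue marker. */
#define CIRCLEQ_END(head) ((struct work_item *)(void *)(head))

/* An empty queue points both ends at the end marker, never at NULL. */
static void circleq_init(struct work_queue_head *head)
{
    head->cqh_first = CIRCLEQ_END(head);
    head->cqh_last = CIRCLEQ_END(head);
}

/* Same steps as DWC_CIRCLEQ_INSERT_TAIL, but refuses to touch a queue
 * whose tail pointer is NULL. Returns 0 on success, -1 on a bad queue. */
static int circleq_insert_tail_checked(struct work_queue_head *head,
                                       struct work_item *elm)
{
    if (head == NULL || elm == NULL || head->cqh_last == NULL)
        return -1;

    elm->entry.cqe_next = CIRCLEQ_END(head);
    elm->entry.cqe_prev = head->cqh_last;
    if (head->cqh_first == CIRCLEQ_END(head))
        head->cqh_first = elm;
    else
        head->cqh_last->entry.cqe_next = elm;
    head->cqh_last = elm;
    return 0;
}
```

A caller would check the return value and skip scheduling the work item on -1. The guard avoids the crash but does not explain how cqh_last became NULL while cqh_first was not the end marker, so the queue's lifetime around USB disconnect and reconnect is still worth reviewing.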

