Kernel crash

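1. Problem description

  If the screen is connected when the machine powers on, manually loading and unloading the driver both work normally. If the screen is not connected at power-on, manually unloading the driver crashes the kernel.

2. Log excerpt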

[  135.779814] Unable to handle kernel paging request at virtual address ffff000000dd36a8
[  135.780842] Mem abort info:
[  135.781203]   Exception class = IABT (current EL), IL = 32 bits
[  135.781954]   SET = 0, FnV = 0
[  135.782348]   EA = 0, S1PTW = 0
[  135.782750] ====>AEE dump_stack start
[  135.782766] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G         C      4.14.98-07190-gbcdaf61-dirty #5
[  135.784373] Hardware name: Freescale i.MX8QXP MEK (DT)
[  135.785030] Call trace:
[  135.785361] [<ffff00000808b198>] dump_backtrace+0x0/0x414
[  135.786059] [<ffff00000808b5c0>] show_stack+0x14/0x1c
[  135.786710] [<ffff000008adc008>] dump_stack+0x90/0xb0
[  135.787363] [<ffff00000809d1f4>] __do_kernel_fault+0x98/0x114
[  135.788098] [<ffff00000809d638>] do_translation_fault+0x48/0xa0
[  135.788857] [<ffff0000080812d0>] do_mem_abort+0x4c/0xcc
[  135.789526] Exception stack(0xffff00000800bc50 to 0xffff00000800bd90)
[  135.790350] bc40:                                   ffff800879194000 ffff000000dd36a8
[  135.791349] bc60: ffff800879194000 ffff800878108c80 0000000000000080 ffff800879194978
[  135.792346] bc80: 0010000000000000 4010000100000000 ffff80087ff33da8 0000000000000004
[  135.793343] bca0: 4000000100000000 ffff00000800be78 0000000000000001 0000000000007d00
[  135.794340] bcc0: 0000000000000001 0000000000000000 ffff0000081299dc 0000000000000000
[  135.795338] bce0: 0000000000000000 ffff800879194978 ffff800879194978 ffff000000dd36a8
[  135.796335] bd00: 0000000000000101 ffff800879194000 ffff000009189f38 ffff800878108c80
[  135.797333] bd20: ffff000008f98908 ffff000009189f38 ffff800878108c80 ffff00000800bd90
[  135.798330] bd40: ffff000008126b00 ffff00000800bd90 ffff000000dd36a8 0000000080000145
[  135.799328] bd60: ffff000009189f38 ffff000008f97018 0000ffffffffffff ffff000008f97018
[  135.800323] bd80: ffff00000800bd90 ffff000000dd36a8
[  135.800949] [<ffff000008083054>] el1_da+0x24/0x84
[  135.801556] [<ffff000000dd36a8>] 0xffff000000dd36a8
[  135.802184] [<ffff000008126d74>] expire_timers+0xdc/0x164
[  135.802874] [<ffff000008126e94>] run_timer_softirq+0x98/0x180
[  135.803609] [<ffff000008081a30>] __do_softirq+0x130/0x364
[  135.804304] [<ffff0000080b485c>] irq_exit+0xbc/0xec
[  135.804934] [<ffff00000810e268>] __handle_domain_irq+0x64/0xac
[  135.805678] [<ffff000008081858>] gic_handle_irq+0xd4/0x17c
[  135.806377] Exception stack(0xffff00000938bdd0 to 0xffff00000938bf10)
[  135.807199] bdc0:                                   ffff000008f97018 0000800876f99000
[  135.808197] bde0: 0000800876f99000 ffff00000938bf20 0000800876f99000 0038815500000000
[  135.809194] be00: 0000000043738680 ffff0000272f3850 ffff8008781095a0 ffff00000938be90
[  135.810192] be20: 00000000000008c0 00000000b3e6a559 0000000000000078 00000000f4b01930
[  135.811187] be40: 00000000b3e6a391 0000000000000000 ffff0000081299dc 0000000000000000
[  135.812184] be60: 0000000000000000 ffff000008f97000 ffff000009189000 0000000000000001
[  135.813182] be80: ffff000008fa1850 ffff000009189fdc ffff000009189000 0000000000000000
[  135.814179] bea0: 0000000000000000 0000000000000000 0000000000000000 ffff00000938bf10
[  135.815176] bec0: ffff000008085894 ffff00000938bf10 ffff000008085898 0000000060000145
[  135.816171] bee0: ffff80087ff36700 0000001f9ca44f00 ffffffffffffffff 0000000000000001
[  135.817166] bf00: ffff00000938bf10 ffff000008085898
[  135.817792] [<ffff000008083230>] el1_irq+0xb0/0x124
[  135.818421] [<ffff000008085898>] arch_cpu_idle+0x2c/0x1c0
[  135.819113] [<ffff000008af8e14>] default_idle_call+0x18/0x2c
[  135.819841] [<ffff0000080fa14c>] do_idle+0x1ac/0x26c
[  135.820477] [<ffff0000080fa3c4>] cpu_startup_entry+0x20/0x24
[  135.821203] [<ffff000008091900>] secondary_start_kernel+0x104/0x110
[...] 79194040 ffff8008 79194050 ffff8008 79194050 ffff8008
[  135.853754] 4060  79194060 ffff8008 79194060 ffff8008 79194070 ffff8008 79194070 ffff8008
[  135.854817]
[  135.854817] X2: 0xffff800879193f80:
[  135.855447] 3f80  2159b880 ffff7e00 00001000 00000000 2159b8c0 ffff7e00 00001000 00000000
[  135.856505] 3fa0  2159b900 ffff7e00 00001000 00000000 2159b940 ffff7e00 00001000 00000000
[  135.857564] 3fc0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  135.858623] 3fe0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[...]
[  135.888074] 3dc8  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  135.889132] 3de8  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  135.890189] 3e08  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  135.891257]
[  135.891257] X19: 0xffff8008791948f8:
[  135.891898] 48f8  00000000 00000000 00000000 00000000 ffffffe0 0000000f 79194910 ffff8008
[  135.892958] 4918  79194910 ffff8008 00dd4168 ffff0000 00000040 00000000 79194930 ffff8008
[  135.894018] 4938  79194930 ffff8008 00dd8f08 ffff0000 00000000 00000000 00000000 00000000
[  135.895077] 495[...]0000000 79194930 ffff8008
[  135.903135] 4938  79194930 ffff8008 00dd8f08 ffff0000 00000000 00000000 00000000 00000000
[  135.904194] 4958  00000000 00000000 00dd36f0 ffff0000 79194000 ffff8008 00000001 00000000
[  135.905253] 4978  00000200 dead0000 00000000 00000000 ffff5f9c 00000000 00dd36a8 ffff0000
[  135.906310] 4998  79194000 ffff8008 1d000001 00000000 00000000 00000000 00000000 00000000
[  135.907369] 49b8  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  135.908427] 49d8  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  135.909492]
[...]
[  135.938213] Exception stack(0xffff00000800bc50 to 0xffff00000800bd90)
[  135.939034] bc40:                                   ffff800879194000 ffff000000dd36a8
[  135.940032] bc60: ffff800879194000 ffff800878108c80 0000000000000080 ffff800879194978
[  135.941031] bc80: 0010000000000000 4010000100000000 ffff80087ff33da8 0000000000000004
[  135.942028] bca0: 4000000100000000 ffff00000800be78 0000000000000001 0000000000007d00
[  135.943025] bcc0: 0000000000000001 0000000000000000 ffff0000081299dc 0000000000000000
[  135.944023] bce0: 0000000000000000 ffff800879194978 ffff8008[...]0x130/0x364
[  135.952372] [<ffff0000080b485c>] irq_exit+0xbc/0xec
[  135.952999] [<ffff00000810e268>] __handle_domain_irq+0x64/0xac
[  135.953745] [<ffff000008081858>] gic_handle_irq+0xd4/0x17c
[  135.954442] Exception stack(0xffff00000938bdd0 to 0xffff00000938bf10)
[  135.955264] bdc0:                                   ffff000008f97018 0000800876f99000
[  135.956261] bde0: 0000800876f99000 ffff00000938bf20 0000800876f99000 0038815500000000
[  135.957259] be00: 0000000043738680 ffff0000272f3850 ffff8008781095a0 ffff00000938be90
[  135.958257] be20: 00000000000008c0 00000000b3e6a55[...]

3. Root cause analysis

[  135.779814] Unable to handle kernel paging request at virtual address ffff000000dd36a8

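  The kernel could not handle a paging request at this virtual address. Broadly, this class of error has three common causes.

1. Fault address 00000000 (or another very small address): a NULL pointer was dereferenced. Recent kernels usually report this case as "Unable to handle kernel NULL pointer dereference".
2. Unable to handle kernel paging request at virtual address 20100110: an out-of-bounds write corrupted the memory that held the pointer. The hard part is then finding where that memory was modified, and why.
3. Unable to handle kernel paging request at virtual address c074838c: an attempt to modify protected memory, for example a variable declared const.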
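
The value of the faulting address is itself a clue. The sketch below classifies an address under assumptions that the original does not state: an arm64 4.14 kernel with 48-bit virtual addresses, 4 KiB pages and KASAN disabled, where loadable modules are placed in the 128 MiB just below the kernel image. All names in it (`classify_fault_addr`, `MODULES_START_ASSUMED`, and so on) are hypothetical and only serve the illustration.

```c
#include <stdint.h>

/* Assumed layout: arm64, 48-bit VA, no KASAN. Not confirmed by the log. */
#define PAGE_SIZE_ASSUMED     0x1000ULL
#define MODULES_START_ASSUMED 0xffff000000000000ULL
#define MODULES_END_ASSUMED   0xffff000008000000ULL /* kernel image begins here */

enum fault_kind {
	FAULT_NULL_RANGE,    /* first page: looks like a NULL dereference */
	FAULT_MODULE_REGION, /* inside the assumed module area */
	FAULT_OTHER          /* anything else: needs a closer look */
};

/* Classify a faulting virtual address by the range it falls into. */
static enum fault_kind classify_fault_addr(uint64_t addr)
{
	if (addr < PAGE_SIZE_ASSUMED)
		return FAULT_NULL_RANGE;
	if (addr >= MODULES_START_ASSUMED && addr < MODULES_END_ASSUMED)
		return FAULT_MODULE_REGION;
	return FAULT_OTHER;
}
```

Under these assumptions, `classify_fault_addr(0xffff000000dd36a8ULL)` returns `FAULT_MODULE_REGION` for the address in this log, while the resolved kernel frames such as `0xffff000008126d74` fall outside that range. That would be consistent with a pointer into module code rather than into the kernel image.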

[  135.800949] [<ffff000008083054>] el1_da+0x24/0x84
[  135.801556] [<ffff000000dd36a8>] 0xffff000000dd36a8
[  135.802184] [<ffff000008126d74>] expire_timers+0xdc/0x164 (runs the callbacks of expired timers)
[  135.802874] [<ffff000008126e94>] run_timer_softirq+0x98/0x180
[  135.803609] [<ffff000008081a30>] __do_softirq+0x130/0x364
[  135.804304] [<ffff0000080b485c>] irq_exit+0xbc/0xec
[  135.804934] [<ffff00000810e268>] __handle_domain_irq+0x64/0xac
[  135.805678] [<ffff000008081858>] gic_handle_irq+0xd4/0x17c

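  This suggests that a timer was left running. The frame directly above expire_timers has no symbol, and its address equals the faulting address. The abort is an instruction abort (IABT), so the CPU faulted while trying to execute a timer callback at an address that is no longer mapped, which is what the code of an unloaded module looks like.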
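
Spotting the frame without a symbol can be automated when the log is long. The following is a minimal sketch, assuming the call-trace line format shown in this log (`[<address>] symbol+off/size`, or a bare `0x...` address when no symbol is known). The function name `frame_is_unresolved` is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Inspect one dmesg line.
 * Returns  1 if it is a call-trace frame printed as a raw address,
 *          0 if it is a frame with a resolved symbol,
 *         -1 if the line is not a call-trace frame at all.
 * On 0 or 1, *addr_out (if non-NULL) receives the frame address.
 */
static int frame_is_unresolved(const char *line, unsigned long long *addr_out)
{
	const char *p = strstr(line, "[<");
	char *end;
	unsigned long long addr;

	if (!p)
		return -1;

	/* Parse the hex address between "[<" and ">]". */
	addr = strtoull(p + 2, &end, 16);
	if (end == p + 2 || strncmp(end, ">]", 2) != 0)
		return -1;

	end += 2;
	while (*end == ' ')
		end++;

	if (addr_out)
		*addr_out = addr;

	/* A C symbol cannot start with a digit, so "0x" means no symbol. */
	return strncmp(end, "0x", 2) == 0;
}
```

Fed the three frames around the fault, only the `0xffff000000dd36a8` line is reported as unresolved. The resolved frame just below it in the trace (here `expire_timers`) then tells you which subsystem made the call.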

4. Code inspection

int cyttsp6_release(struct cyttsp6_core_data *cd)
{
	struct device *dev = cd->dev;
	cyttsp6_proximity_release(dev);
	cyttsp6_btn_release(dev);
	cyttsp6_mt_release(dev);
	
#ifdef CONFIG_HAS_EARLYSUSPEND
	unregister_early_suspend(&cd->es);
#elif defined(CONFIG_FB)
	fb_unregister_client(&cd->fb_notifier);
#endif
	
#if NEED_SUSPEND_NOTIFIER
	unregister_pm_notifier(&cd->pm_notifier);
#endif

	/*
	 * Suspend the device before freeing the startup_work and stopping
	 * the watchdog since sleep function restarts watchdog on failure
	 */
	pm_runtime_suspend(dev);
	pm_runtime_disable(dev);
	cancel_work_sync(&cd->startup_work);
	cyttsp6_stop_wd_timer(cd);
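	/* The fix: this call was missing. del_timer_sync() would also wait for a callback already running on another CPU. */
	del_timer(&cd->cyttsp6_recovery_timer);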
	device_init_wakeup(dev, 0);
	remove_sysfs_interfaces(cd, dev);
	free_irq(cd->irq, cd);
	if (cd->cpdata->init)
		cd->cpdata->init(cd->cpdata, 0, dev);
	dev_set_drvdata(dev, NULL);
	cyttsp6_del_core(dev);
	cyttsp6_free_si_ptrs(cd);
	kfree(cd);

	return 0;
}

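5. Summary of the problem

  The release function did not delete the timer. In theory, unloading the driver at any time should then crash the kernel, yet it does not when the screen is connected at power-on. Reading the code explains why: when the driver is loaded and the screen initializes successfully, the timer is deleted, so with the screen attached both loading and unloading work without problems. Without the screen, initialization fails, so the timer is never deleted. Unloading the driver then leaves the timer behind, and the kernel crashes when it fires.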
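
The four combinations of "screen attached or not" and "release deletes the timer or not" can be laid out in a small user-space model. This is only a toy sketch of the state described above, not driver code: `toy_timer`, `toy_driver` and the functions are hypothetical, and the assumption that the timer is armed during probe is inferred from the description rather than shown in the original.

```c
#include <stdbool.h>

/* Toy stand-in for a kernel timer: only tracks whether it is still queued. */
struct toy_timer {
	bool pending;
};

struct toy_driver {
	struct toy_timer recovery_timer; /* models cd->cyttsp6_recovery_timer */
	bool loaded;                     /* false once the "module" is unloaded */
};

/* Load: the timer is armed, and deleted again only if the screen initializes. */
static void toy_probe(struct toy_driver *d, bool screen_attached)
{
	d->loaded = true;
	d->recovery_timer.pending = true;
	if (screen_attached)
		d->recovery_timer.pending = false;
}

/* Unload: delete_timer models whether release calls del_timer(). */
static void toy_release(struct toy_driver *d, bool delete_timer)
{
	if (delete_timer)
		d->recovery_timer.pending = false;
	d->loaded = false;
}

/* True if a timer would later fire into a driver that is already gone. */
static bool toy_crashes(bool screen_attached, bool delete_timer)
{
	struct toy_driver d = { { false }, false };

	toy_probe(&d, screen_attached);
	toy_release(&d, delete_timer);
	return d.recovery_timer.pending && !d.loaded;
}
```

Only `toy_crashes(false, false)` returns true: no screen, and a release path that does not delete the timer. Adding the deletion to the release path makes all four cases safe, which matches the fix in section 4.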

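6. Lessons learned

1. Kernel crashes are usually caused by an access to an invalid address or by resources that were not fully released.
2. Concentrate the search on the driver itself; do not get lost in the kernel source.
3. Start from the release function and check whether every resource the driver holds is actually released.
4. Read the call trace in the log carefully; the key to the problem is often there.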

  
