Platform: Allwinner H3
System: OpenWrt Linux
Symptom: the failure appears intermittently after a reboot; the next boot is normal again.
Log:
[ 1.781438] Key type dns_resolver registered
[ 1.785824] Registering SWP/SWPB emulation handler
[ 1.795843] hctosys: unable to open rtc device (rtc0)
[ 1.801462] vcc3v0: disabling
[ 1.804431] vcc3v3: disabling
[ 1.807396] vcc5v0: disabling
[ 1.810378] ALSA device list:
[ 1.813343] No soundcards found.
[ 1.887068] VFS: Mounted root (squashfs filesystem) readonly on device 31:4.
[ 1.896322] Freeing unused kernel memory: 2048K
[ 1.941487] random: crng init done
[ 2.747800] NOHZ: local_softirq_pending 282
[ 2.752001] NOHZ: local_softirq_pending 282
[ 2.756185] NOHZ: local_softirq_pending 282
[ 2.760372] NOHZ: local_softirq_pending 282
[ 2.764551] NOHZ: local_softirq_pending 282
[ 2.768736] NOHZ: local_softirq_pending 282
[ 2.772919] NOHZ: local_softirq_pending 282
[ 2.777103] NOHZ: local_softirq_pending 282
[ 2.781286] NOHZ: local_softirq_pending 282
[ 2.863388] SQUASHFS error: xz decompression failed, data probably corrupt
[ 2.870281] SQUASHFS error: squashfs_read_data failed to read block 0x1fd9ce
[ 2.877320] SQUASHFS error: Unable to read fragment cache entry [1fd9ce]
[ 2.884024] SQUASHFS error: Unable to read page, block 1fd9ce, size 2bcc
[ 2.890740] SQUASHFS error: Unable to read fragment cache entry [1fd9ce]
[ 2.897431] SQUASHFS error: Unable to read page, block 1fd9ce, size 2bcc
[ 2.904139] SQUASHFS error: Unable to read fragment cache entry [1fd9ce]
[ 2.910841] SQUASHFS error: Unable to read page, block 1fd9ce, size 2bcc
[ 2.917536] SQUASHFS error: Unable to read fragment cache entry [1fd9ce]
[ 2.924239] SQUASHFS error: Unable to read page, block 1fd9ce, size 2bcc
[ 2.930949] SQUASHFS error: Unable to read fragment cache entry [1fd9ce]
[ 2.937640] SQUASHFS error: Unable to read page, block 1fd9ce, size 2bcc
/sbin/init: error while loading shared libraries: /lib/librt.so.1: cannot read file data: Input/output error
[ 2.977864] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
[ 2.977864]
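Analysis:
1. Because the failure is intermittent, the squashfs rootfs image itself is most likely not corrupt.
2. Lowering the SPI clock frequency and adding pull-up resistors made no difference.
3. In the end I suspected a connection with the "NOHZ: local_softirq_pending" messages; after changing the kernel tick configuration, the problem no longer reproduced.
Fix: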
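The number in the NOHZ lines is a hexadecimal bitmask of the softirq vectors that were still pending when the tick was about to be stopped. As a rough aid for reading it, the Python sketch below decodes such a mask. The helper name decode_softirq_pending is made up for illustration, and the bit order is assumed to match the softirq enum in include/linux/interrupt.h of 4.14-era kernels.

```python
# Softirq vector names in bit order (bit 0 is HI), assumed to match the
# enum in include/linux/interrupt.h for 4.14-era kernels.
SOFTIRQ_NAMES = [
    "HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
    "IRQ_POLL", "TASKLET", "SCHED", "HRTIMER", "RCU",
]


def decode_softirq_pending(mask_text):
    """Return the names of the softirq vectors set in a hex mask string."""
    mask = int(mask_text, 16)  # the kernel prints this value in hex
    return [name for bit, name in enumerate(SOFTIRQ_NAMES) if mask & (1 << bit)]


print(decode_softirq_pending("282"))  # prints ['TIMER', 'SCHED', 'RCU']
```

For the 282 in the log above this gives TIMER, SCHED and RCU. That only shows which vectors were pending, not why the SPI read failed.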
--- a/target/linux/sunxi/config-4.14
+++ b/target/linux/sunxi/config-4.14
@@ -289,6 +289,7 @@ CONFIG_HW_CONSOLE=y
 CONFIG_HW_RANDOM=y
 CONFIG_HW_RANDOM_TIMERIOMEM=y
 CONFIG_HZ_FIXED=0
+CONFIG_HZ_PERIODIC=y
 CONFIG_I2C=y
 CONFIG_I2C_BOARDINFO=y
 CONFIG_I2C_CHARDEV=y
@@ -377,9 +378,6 @@ CONFIG_NLS=y
 CONFIG_NLS_CODEPAGE_437=y
 CONFIG_NLS_ISO8859_1=y
 CONFIG_NO_BOOTMEM=y
-CONFIG_NO_HZ=y
-CONFIG_NO_HZ_COMMON=y
-CONFIG_NO_HZ_IDLE=y
 CONFIG_NR_CPUS=8
 CONFIG_NVMEM=y
 CONFIG_NVMEM_SUNXI_SID=y
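To check that a rebuilt kernel really picked up the change, the tick mode can be read back from the kernel config. The following is a minimal sketch, not part of the original fix: tick_mode and read_running_config are hypothetical helpers, and reading /proc/config.gz assumes the kernel was built with CONFIG_IKCONFIG_PROC, which may not be the case on a given OpenWrt image.

```python
import gzip


def tick_mode(config_text):
    """Classify the tick mode from the text of a kernel config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_NO_HZ is not set" start with "#" and are skipped.
        if line.startswith("CONFIG_") and line.endswith("=y"):
            enabled.add(line[:-2])
    if "CONFIG_HZ_PERIODIC" in enabled:
        return "periodic"
    if "CONFIG_NO_HZ_FULL" in enabled:
        return "nohz_full"
    if "CONFIG_NO_HZ_IDLE" in enabled:
        return "nohz_idle"
    return "unknown"


def read_running_config(path="/proc/config.gz"):
    """Read the running kernel's config (assumes CONFIG_IKCONFIG_PROC=y)."""
    with gzip.open(path, "rt") as fh:
        return fh.read()


# Sample text taken from the patched config fragment above.
sample = "CONFIG_HZ_FIXED=0\nCONFIG_HZ_PERIODIC=y\nCONFIG_I2C=y\n"
print(tick_mode(sample))  # prints periodic
```

On a target that exposes /proc/config.gz, tick_mode(read_running_config()) should report "periodic" after the patch and "nohz_idle" before it.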
Suspected cause:
Possibly, with CONFIG_NO_HZ_IDLE enabled, the CPU tick is stopped while the CPU is idle, and this disrupts SPI reads and writes.