Platform: Allwinner H3
System: OpenWrt Linux
Symptom: occurs intermittently after a reboot; the next boot is normal again.
Log:
[ 1.781438] Key type dns_resolver registered
[ 1.785824] Registering SWP/SWPB emulation handler
[ 1.795843] hctosys: unable to open rtc device (rtc0)
[ 1.801462] vcc3v0: disabling
[ 1.804431] vcc3v3: disabling
[ 1.807396] vcc5v0: disabling
[ 1.810378] ALSA device list:
[ 1.813343] No soundcards found.
[ 1.887068] VFS: Mounted root (squashfs filesystem) readonly on device 31:4.
[ 1.896322] Freeing unused kernel memory: 2048K
[ 1.941487] random: crng init done
[ 2.747800] NOHZ: local_softirq_pending 282
[ 2.752001] NOHZ: local_softirq_pending 282
[ 2.756185] NOHZ: local_softirq_pending 282
[ 2.760372] NOHZ: local_softirq_pending 282
[ 2.764551] NOHZ: local_softirq_pending 282
[ 2.768736] NOHZ: local_softirq_pending 282
[ 2.772919] NOHZ: local_softirq_pending 282
[ 2.777103] NOHZ: local_softirq_pending 282
[ 2.781286] NOHZ: local_softirq_pending 282
[ 2.863388] SQUASHFS error: xz decompression failed, data probably corrupt
[ 2.870281] SQUASHFS error: squashfs_read_data failed to read block 0x1fd9ce
[ 2.877320] SQUASHFS error: Unable to read fragment cache entry [1fd9ce]
[ 2.884024] SQUASHFS error: Unable to read page, block 1fd9ce, size 2bcc
[ 2.890740] SQUASHFS error: Unable to read fragment cache entry [1fd9ce]
[ 2.897431] SQUASHFS error: Unable to read page, block 1fd9ce, size 2bcc
[ 2.904139] SQUASHFS error: Unable to read fragment cache entry [1fd9ce]
[ 2.910841] SQUASHFS error: Unable to read page, block 1fd9ce, size 2bcc
[ 2.917536] SQUASHFS error: Unable to read fragment cache entry [1fd9ce]
[ 2.924239] SQUASHFS error: Unable to read page, block 1fd9ce, size 2bcc
[ 2.930949] SQUASHFS error: Unable to read fragment cache entry [1fd9ce]
[ 2.937640] SQUASHFS error: Unable to read page, block 1fd9ce, size 2bcc
/sbin/init: error while loading shared libraries: /lib/librt.so.1: cannot read file data: Input/output error
[ 2.977864] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
[ 2.977864]
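Analysis:
1. Because the failure is intermittent, the squashfs rootfs image itself is presumably not the problem.
2. Lowering the SPI clock frequency, adding pull-up resistors, and similar measures made no difference.
3. Finally suspected a connection to the "NOHZ: local_softirq_pending" messages; after changing the tick configuration, the problem no longer reproduced.
Fix: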
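One way to support point 1 is to read the rootfs partition several times and compare digests. The following is only a minimal sketch: the helper names `read_digest` and `reads_are_stable` are hypothetical, and the device path `/dev/mtdblock4` is an assumption derived from "device 31:4" in the log (major 31 is the MTD block driver). It also assumes Python is available on the target, which a stock OpenWrt image may not provide. Repeated reads can be served from the page cache, so a matching result is weak evidence unless caches are dropped between passes.

```python
import hashlib


def read_digest(path, chunk_size=64 * 1024):
    """Return the SHA-256 hex digest of everything readable from path."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()


def reads_are_stable(path, passes=5):
    """Read path `passes` times; True if every pass yields the same digest."""
    digests = {read_digest(path) for _ in range(passes)}
    return len(digests) == 1


# Example call (path is an assumption, see above):
#   reads_are_stable("/dev/mtdblock4")
# To avoid cached reads, run `echo 3 > /proc/sys/vm/drop_caches` between passes.
```

A stable result only shows that the flash contents read back consistently once the system is up; it does not rule out a read error that happens only during early boot.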
--- a/target/linux/sunxi/config-4.14
+++ b/target/linux/sunxi/config-4.14
@@ -289,6 +289,7 @@ CONFIG_HW_CONSOLE=y
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_TIMERIOMEM=y
CONFIG_HZ_FIXED=0
+CONFIG_HZ_PERIODIC=y
CONFIG_I2C=y
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_CHARDEV=y
@@ -377,9 +378,6 @@ CONFIG_NLS=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_NO_BOOTMEM=y
-CONFIG_NO_HZ=y
-CONFIG_NO_HZ_COMMON=y
-CONFIG_NO_HZ_IDLE=y
CONFIG_NR_CPUS=8
CONFIG_NVMEM=y
CONFIG_NVMEM_SUNXI_SID=y
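After rebuilding, it may be worth confirming that the running kernel really uses the periodic tick. The sketch below is an illustration, not part of the original fix: `tick_mode` is a hypothetical helper that classifies a kernel config text by the standard `CONFIG_HZ_PERIODIC`, `CONFIG_NO_HZ_IDLE`, and `CONFIG_NO_HZ_FULL` symbols. Reading `/proc/config.gz` assumes the kernel was built with `CONFIG_IKCONFIG_PROC`, which may not be the case on this image.

```python
def tick_mode(config_text):
    """Classify the timer tick mode from kernel config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    if "CONFIG_HZ_PERIODIC" in enabled:
        return "periodic"
    if "CONFIG_NO_HZ_FULL" in enabled:
        return "nohz_full"
    if "CONFIG_NO_HZ_IDLE" in enabled:
        return "nohz_idle"
    return "unknown"


# Example on the target (assumes /proc/config.gz exists):
#   import gzip
#   with gzip.open("/proc/config.gz", "rt") as f:
#       print(tick_mode(f.read()))
```

With the diff above applied, the expected classification is "periodic"; the original config-4.14 would classify as "nohz_idle".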
Suspected cause:
Possibly, with CONFIG_NO_HZ_IDLE enabled, the CPU tick was stopped while idle, which caused the SPI reads and writes to go wrong.