# apt install linux-crashdump
Ubuntu adds an extra GRUB configuration file, /etc/default/grub.d/kdump-tools.cfg, so there is no need to set the crashkernel size in /etc/default/grub.
# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.18.0-17-generic root=UUID=6d42d019-fc60-4eae-9da1-bb494e587cfc ro intel_iommu=on biosdevname=0 pci=realloc crashkernel=256M
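To confirm the reservation without reading the whole command line, the crashkernel= value can be pulled out on its own. This is a minimal sketch; the function name crashkernel_param is hypothetical, and it assumes the parameter appears as a single space-separated token, as in the output above.

```shell
# Print the value of the crashkernel= boot parameter.
# $1: file holding the kernel command line (defaults to /proc/cmdline).
crashkernel_param() {
    # One parameter per line, then keep only what follows "crashkernel=".
    tr ' ' '\n' < "${1:-/proc/cmdline}" | sed -n 's/^crashkernel=//p'
}
```

Run with no argument on the machine above, it should print the same 256M seen in /proc/cmdline.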
# cd /sys/kernel/
# cat kexec_*
1
268435456
0
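The three numbers above come from kexec_crash_loaded, kexec_crash_size and kexec_loaded, in that order (the glob expands alphabetically). A small sketch that prints each value next to its file name and converts the size to MiB; both function names are made up for these notes.

```shell
# Print every kexec_* file in a directory as "name: value".
# $1: directory to scan (defaults to /sys/kernel).
show_kexec_state() {
    local dir=${1:-/sys/kernel} f
    for f in "$dir"/kexec_*; do
        [ -r "$f" ] || continue
        printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Convert a byte count to whole MiB.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}
```

bytes_to_mib 268435456 prints 256, which matches crashkernel=256M on the command line.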
Test generating a vmcore:
panic ()
{
echo 1 > /proc/sys/kernel/sysrq;
echo c > /proc/sysrq-trigger
}
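Triggering a panic when no crash kernel is loaded just reboots (or hangs) the machine without producing a dump, so a guard before the test may be worthwhile. A sketch, assuming kexec_crash_loaded reads 1 once the crash kernel is in place; crash_kernel_loaded is a hypothetical helper name.

```shell
# Succeed only if the crash kernel is reported as loaded.
# $1: flag file to read (defaults to /sys/kernel/kexec_crash_loaded).
crash_kernel_loaded() {
    local flag=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$flag" ] && [ "$(cat "$flag")" = "1" ]
}
```

Usage, combined with the panic function above: crash_kernel_loaded && panic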
# cd /var/crash
# ll
total 44K
drwxr-sr-x 2 root whoopsie 4.0K Apr 23 19:09 201904231909
drwxr-sr-x 2 root whoopsie 4.0K Apr 23 21:45 201904232144
-rw-r--r-- 1 root whoopsie 309 Apr 23 21:47 kexec_cmd
-rw-r----- 1 root whoopsie 32K Apr 23 19:11 linux-image-4.18.0-17-generic-201904231909.crash
lrwxrwxrwx 1 root whoopsie 30 Apr 23 19:11 vmcore.0 -> 201904231909/dump.201904231909
lrwxrwxrwx 1 root whoopsie 30 Apr 23 21:47 vmcore.1 -> 201904232144/dump.201904232144
ln-crash ()
{
cd /var/crash;
local dir=$(ls -td $(date +%Y)*/ | head -1);
local n=$(ls vmcore* | wc -l);
ln -s ${dir}dump* vmcore.$n
}
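A more defensive variant of the same idea, as a sketch: it avoids parsing ls output, takes the crash directory as a parameter, and picks the first unused vmcore.N index instead of counting existing links. It assumes the dump directories are named with a numeric timestamp as in the listing above, so the last glob match is the newest. The name link_latest_dump is hypothetical.

```shell
# Link the newest dump under a crash directory as the next free vmcore.N.
# $1: crash directory (defaults to /var/crash).
link_latest_dump() {
    local crash_dir=${1:-/var/crash}
    local newest="" d dump n=0

    # Timestamped names sort chronologically, so the last match is the newest.
    for d in "$crash_dir"/[0-9]*/; do
        [ -d "$d" ] && newest=${d%/}
    done
    if [ -z "$newest" ]; then
        echo "no dump directory under $crash_dir" >&2
        return 1
    fi

    # Find the first index that is not already taken (even by a dangling link).
    while [ -e "$crash_dir/vmcore.$n" ] || [ -L "$crash_dir/vmcore.$n" ]; do
        n=$((n + 1))
    done

    # Create a relative link, like the existing vmcore.0 and vmcore.1.
    for dump in "$newest"/dump.*; do
        [ -e "$dump" ] || continue
        ln -s "${newest##*/}/${dump##*/}" "$crash_dir/vmcore.$n"
        return 0
    done
    echo "no dump file in $newest" >&2
    return 1
}
```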
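Testing showed that the steps above work fine on a laptop but not on a server: the crash kernel runs into "out of memory, killed process" errors and then the system hangs. The cause turned out to be that the Mellanox NIC was being initialized and consumed too much memory. Edit /etc/modprobe.d/blacklist.conf and add the following two lines: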
blacklist mlx5_ib
blacklist mlx5_core
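The two lines can also be appended from a script without creating duplicates on a second run. A minimal sketch; blacklist_module is a hypothetical helper, and it needs to run as root to write under /etc/modprobe.d.

```shell
# Append "blacklist <module>" lines to a modprobe config file, skipping
# any line that is already present.
# $1: config file; remaining arguments: module names.
blacklist_module() {
    local conf=$1 mod
    shift
    for mod in "$@"; do
        grep -qxF "blacklist $mod" "$conf" 2>/dev/null ||
            printf 'blacklist %s\n' "$mod" >> "$conf"
    done
}
```

Usage: blacklist_module /etc/modprobe.d/blacklist.conf mlx5_ib mlx5_core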
Then regenerate the initrd.img file:
# cd /var/lib/kdump
# /bin/rm -rf *
# kdump-config unload
# kdump-config load
That fixes the problem. I still do not really understand why the NIC gets initialized in the first place, though.
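One way to investigate is to check whether the mlx5 modules are packed into the kdump initrd at all. The sketch below filters a file listing for kernel modules whose file name starts with a given prefix. The function name is hypothetical, and it assumes the listing comes from lsinitramfs (part of initramfs-tools) run against the initrd under /var/lib/kdump.

```shell
# Read an initrd file listing on stdin and print the kernel modules
# (.ko, optionally compressed) whose file name starts with the prefix in $1.
initrd_modules_matching() {
    grep -E "/$1"'[^/]*\.ko(\.[a-z0-9]+)?$'
}
```

Usage: lsinitramfs /var/lib/kdump/initrd.img-4.18.0-17-generic | initrd_modules_matching mlx5

No output would suggest the driver is not in the initrd and is being loaded from somewhere else.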
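If you build and boot a custom kernel, the kdump-tools service will quite possibly fail to start. In that case you can still produce a vmcore using the official kernel and initrd: edit the script /usr/sbin/kdump-config and change KVER=`uname -r` to KVER=4.18.0-17-generic.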
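The same edit can be scripted with sed. This is only a sketch: pin_kdump_kver is a hypothetical helper, it assumes the assignment appears in the script exactly as KVER=`uname -r`, and it leaves a .bak copy next to the file so the change is easy to revert.

```shell
# Replace KVER=`uname -r` in a script with a fixed kernel version.
# $1: script to edit; $2: kernel version to pin.
pin_kdump_kver() {
    local script=$1 kver=$2
    # The backticks sit inside single quotes so the shell does not run them.
    sed -i.bak -e 's/^\([[:space:]]*\)KVER=`uname -r`/\1KVER='"$kver"'/' "$script"
}
```

Usage (as root): pin_kdump_kver /usr/sbin/kdump-config 4.18.0-17-generic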
Install linux-image-dbgsym:
function install-dbgsym
{
# Write (rather than append to) the ddebs source list, so rerunning this does not duplicate entries.
echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-updates main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-proposed main restricted universe multiverse" | sudo tee /etc/apt/sources.list.d/ddebs.list
# Install the ddebs signing key before refreshing the package index.
sudo apt -y install ubuntu-dbgsym-keyring
sudo apt-get update
# Debug symbols for the running kernel.
sudo apt -y install linux-image-$(uname -r)-dbgsym
}
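With the dbgsym package installed, the unstripped kernel image should be at /usr/lib/debug/boot/vmlinux-<version>, and the crash utility can open a vmcore against it. A sketch with hypothetical helper names; it assumes crash is installed and that vmcore.0 is one of the links created earlier.

```shell
# Print the path where the dbgsym package places the debug vmlinux.
# $1: kernel version (defaults to the running kernel).
debug_vmlinux() {
    printf '/usr/lib/debug/boot/vmlinux-%s\n' "${1:-$(uname -r)}"
}

# Open a vmcore in crash using the matching debug vmlinux.
# $1: vmcore path (defaults to /var/crash/vmcore.0); $2: kernel version.
open_vmcore() {
    local core=${1:-/var/crash/vmcore.0} kver=${2:-$(uname -r)}
    crash "$(debug_vmlinux "$kver")" "$core"
}
```

Usage (as root): open_vmcore /var/crash/vmcore.1 4.18.0-17-generic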
# cd /var/lib/kdump
# unmkinitramfs initrd.img-4.18.0-17-generic 1
/etc/init.d/kdump-tools
an init script to automatically load a kdump kernel, or save a vmcore and reboot.
/etc/default/kdump-tools
the kdump-tools configuration file
/var/crash/kernel_link
a link to the current debug kernel
/var/crash/kexec_cmd
the last kexec_cmd executed by kdump-config