crashdump_viewer:start().
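A crashdump is a text file that records a large amount of system information, which is invaluable for analyzing system performance and state and for troubleshooting. It is therefore very useful to be able to obtain a crashdump file while the system is running normally.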
[root@Betty upu]# ps aux|grep upu
root 2185 0.0 0.0 12908 796 ? S 13:03 0:00 /opt/mcu/upu/erts-6.0/bin/run_erl -daemon /tmp//opt/mcu/upu/ /opt/mcu/upu/log exec /opt/mcu/upu/bin/upu console ''
root 2186 12.3 1.1 507936 43688 pts/0 Ssl+ 13:03 0:00 /opt/mcu/upu/erts-6.0/bin/beam.smp -K true -- -root /opt/mcu/upu -progname upu -- -home /root -- -boot /opt/mcu/upu/releases/1/upu -mode embedded -config /opt/mcu/upu/etc/upu.config -mnesia dir '/opt/mcu/upu/data' -sname upu@Betty -setcookie upu -- console
root 2237 0.0 0.0 103252 848 pts/6 S+ 13:03 0:00 grep upu
root 2525 0.0 0.0 10956 396 ? S Feb01 0:12 /opt/mcu/upucore/erts-6.0/bin/epmd -daemon
[root@Betty upu]#
Log in via remsh, then exit by pressing Ctrl+C followed by a
[root@Betty upu]# ./bin/upu remote_console
Erlang/OTP 17 [erts-6.0] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V6.0 (abort with ^G)
(upu@Betty)1>
(upu@Betty)1>
(upu@Betty)1>
BREAK: (a)bort (c)ontinue (p)roc info (i)nfo (l)oaded
(v)ersion (k)ill (D)b-tables (d)istribution
a
[root@Betty upu]#
As shown below, the operation above has no effect on the service process (the reason is explained later).
[root@Betty upu]#
[root@Betty upu]# ps aux|grep upu
root 2185 0.0 0.0 12908 796 ? S 13:03 0:00 /opt/mcu/upu/erts-6.0/bin/run_erl -daemon /tmp//opt/mcu/upu/ /opt/mcu/upu/log exec /opt/mcu/upu/bin/upu console ''
root 2186 3.1 1.1 507348 43600 pts/0 Ssl+ 13:03 0:00 /opt/mcu/upu/erts-6.0/bin/beam.smp -K true -- -root /opt/mcu/upu -progname upu -- -home /root -- -boot /opt/mcu/upu/releases/1/upu -mode embedded -config /opt/mcu/upu/etc/upu.config -mnesia dir '/opt/mcu/upu/data' -sname upu@Betty -setcookie upu -- console
root 2328 0.0 0.0 103252 848 pts/6 S+ 13:03 0:00 grep upu
root 2525 0.0 0.0 10956 404 ? S Feb01 0:12 /opt/mcu/upucore/erts-6.0/bin/epmd -daemon
[root@Betty upu]#
[root@Betty upu]#
Log in via remsh again, then press Ctrl+C followed by A
[root@Betty upu]# ./bin/upu remote_console
Erlang/OTP 17 [erts-6.0] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V6.0 (abort with ^G)
(upu@Betty)1>
BREAK: (a)bort (c)ontinue (p)roc info (i)nfo (l)oaded
(v)ersion (k)ill (D)b-tables (d)istribution
A
Crash dump was written to: erl_crash.dump
Crash dump requested by user已放弃 (core dumped)
[root@Betty upu]#
As shown below, this operation likewise has no effect on the upu process (the reason is explained later), and it also produces an erl_crash.dump file.
[root@Betty upu]#
[root@Betty upu]# ps aux|grep upu
root 2185 0.0 0.0 12908 796 ? S 13:03 0:00 /opt/mcu/upu/erts-6.0/bin/run_erl -daemon /tmp//opt/mcu/upu/ /opt/mcu/upu/log exec /opt/mcu/upu/bin/upu console ''
root 2186 0.6 1.1 507604 43864 pts/0 Ssl+ 13:03 0:00 /opt/mcu/upu/erts-6.0/bin/beam.smp -K true -- -root /opt/mcu/upu -progname upu -- -home /root -- -boot /opt/mcu/upu/releases/1/upu -mode embedded -config /opt/mcu/upu/etc/upu.config -mnesia dir '/opt/mcu/upu/data' -sname upu@Betty -setcookie upu -- console
root 2463 0.0 0.0 103252 848 pts/6 S+ 13:05 0:00 grep upu
root 2525 0.0 0.0 10956 408 ? S Feb01 0:12 /opt/mcu/upucore/erts-6.0/bin/epmd -daemon
[root@Betty upu]#
[root@Betty upu]# ll
总用量 360
drwxr-xr-x 2 root root 4096 2月 26 14:27 bin
drwxr-xr-x 2 root root 4096 3月 1 16:53 data
-rw-r----- 1 root root 334226 3月 4 13:05 erl_crash.dump
drwxr-xr-x 3 root root 4096 2月 26 14:27 erts-6.0
drwxr-xr-x 2 root root 4096 3月 1 16:52 etc
drwxr-xr-x 12 root root 4096 2月 26 14:27 lib
drwxr-xr-x 2 root root 4096 3月 4 13:03 log
drwxr-xr-x 3 root root 4096 2月 26 14:27 releases
drwxr-xr-x 2 root root 4096 2月 26 14:28 system
[root@Betty upu]#
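At this point it would seem that the runtime state of the upu process can be analyzed from this erl_crash.dump file (I have since confirmed that this conclusion is flawed).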
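Before analyzing a dump, it can help to check which node wrote it and why. The following is a minimal sketch, not part of the original procedure: the function name dump_summary is hypothetical, and it assumes the usual crash dump text layout, with a "Slogan:" line in the header, a "=node:" tag when the node was distributed, and one "=proc:" tag per Erlang process.

```shell
#!/bin/sh
# dump_summary [FILE]
# Prints the slogan, the node name, and the number of process sections
# found in an Erlang crash dump. Hypothetical helper for illustration.
dump_summary() {
    dump=${1:-erl_crash.dump}
    if [ ! -r "$dump" ]; then
        echo "cannot read $dump" >&2
        return 1
    fi
    # The header carries a line of the form "Slogan: <reason>".
    slogan=$(sed -n 's/^Slogan: //p' "$dump" | head -n 1)
    # A distributed node records its name as "=node:<name>".
    node=$(sed -n 's/^=node://p' "$dump" | head -n 1)
    # Each Erlang process is introduced by a "=proc:<pid>" tag.
    procs=$(grep -c '^=proc:' "$dump" || true)
    echo "slogan: $slogan"
    echo "node: ${node:-unknown}"
    echo "processes: $procs"
}
```

Usage: dump_summary erl_crash.dump. If the node name printed is not that of the service node, the dump describes a different VM and says nothing about the service process.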
[root@Betty upu]#
[root@Betty upu]# ps aux|grep upu
root 2185 0.0 0.0 12908 796 ? S 13:03 0:00 /opt/mcu/upu/erts-6.0/bin/run_erl -daemon /tmp//opt/mcu/upu/ /opt/mcu/upu/log exec /opt/mcu/upu/bin/upu console ''
root 2186 0.1 1.5 507936 58048 pts/0 Ssl+ 13:03 0:01 /opt/mcu/upu/erts-6.0/bin/beam.smp -K true -- -root /opt/mcu/upu -progname upu -- -home /root -- -boot /opt/mcu/upu/releases/1/upu -mode embedded -config /opt/mcu/upu/etc/upu.config -mnesia dir '/opt/mcu/upu/data' -sname upu@Betty -setcookie upu -- console
root 2525 0.0 0.0 10956 412 ? S Feb01 0:12 /opt/mcu/upucore/erts-6.0/bin/epmd -daemon
root 2928 0.0 0.0 103252 844 pts/6 S+ 13:21 0:00 grep upu
[root@Betty upu]#
[root@Betty upu]# ll
总用量 36
drwxr-xr-x 2 root root 4096 2月 26 14:27 bin
drwxr-xr-x 2 root root 4096 3月 1 16:53 data
drwxr-xr-x 3 root root 4096 2月 26 14:27 erts-6.0
drwxr-xr-x 2 root root 4096 3月 1 16:52 etc
drwxr-xr-x 12 root root 4096 2月 26 14:27 lib
drwxr-xr-x 2 root root 4096 3月 4 13:03 log
drwxr-xr-x 3 root root 4096 2月 26 14:27 releases
drwxr-xr-x 2 root root 4096 2月 26 14:28 system
[root@Betty upu]#
Log in via remsh and run erlang:halt("abort").
[root@Betty upu]#
[root@Betty upu]# ./bin/upu remote_console
Erlang/OTP 17 [erts-6.0] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V6.0 (abort with ^G)
(upu@Betty)1>
(upu@Betty)1> erlang:halt("abort").
*** ERROR: Shell process terminated! (^G to start new job) ***
BREAK: (a)bort (c)ontinue (p)roc info (i)nfo (l)oaded
(v)ersion (k)ill (D)b-tables (d)istribution
^C[root@Betty upu]#
[root@Betty upu]#
After exiting, an erl_crash.dump file has been generated, and it is larger than the one produced by Ctrl+C, A (the reason is explained later).
[root@Betty upu]# ll
总用量 1404
drwxr-xr-x 2 root root 4096 2月 26 14:27 bin
drwxr-xr-x 2 root root 4096 3月 1 16:53 data
-rw-r----- 1 root root 1400355 3月 4 13:21 erl_crash.dump
drwxr-xr-x 3 root root 4096 2月 26 14:27 erts-6.0
drwxr-xr-x 2 root root 4096 3月 1 16:52 etc
drwxr-xr-x 12 root root 4096 2月 26 14:27 lib
drwxr-xr-x 2 root root 4096 3月 4 13:03 log
drwxr-xr-x 3 root root 4096 2月 26 14:27 releases
drwxr-xr-x 2 root root 4096 2月 26 14:28 system
[root@Betty upu]#
As shown below, the service process has now terminated.
[root@Betty upu]# ps aux|grep upu
root 2525 0.0 0.0 10956 416 ? S Feb01 0:12 /opt/mcu/upucore/erts-6.0/bin/epmd -daemon
root 3053 0.0 0.0 103252 848 pts/6 S+ 13:24 0:00 grep upu
[root@Betty upu]#
[root@Betty upu]# ./bin/upu start
[root@Betty upu]#
[root@Betty upu]# ps aux|grep upu
root 2525 0.0 0.0 10956 424 ? S Feb01 0:12 /opt/mcu/upucore/erts-6.0/bin/epmd -daemon
root 3365 0.0 0.0 12908 792 ? S 13:27 0:00 /opt/mcu/upu/erts-6.0/bin/run_erl -daemon /tmp//opt/mcu/upu/ /opt/mcu/upu/log exec /opt/mcu/upu/bin/upu console ''
root 3366 27.3 1.0 511852 38788 pts/0 Ssl+ 13:27 0:00 /opt/mcu/upu/erts-6.0/bin/beam.smp -K true -- -root /opt/mcu/upu -progname upu -- -home /root -- -boot /opt/mcu/upu/releases/1/upu -mode embedded -config /opt/mcu/upu/etc/upu.config -mnesia dir '/opt/mcu/upu/data' -sname upu@Betty -setcookie upu -- console
root 3415 0.0 0.0 103252 848 pts/6 S+ 13:27 0:00 grep upu
[root@Betty upu]#
Use the kill command to send SIGUSR1 to the service process
[root@Betty upu]#
[root@Betty upu]# kill -s SIGUSR1 3366
[root@Betty upu]#
[root@Betty upu]# ps aux|grep upu
root 2525 0.0 0.0 10956 424 ? S Feb01 0:12 /opt/mcu/upucore/erts-6.0/bin/epmd -daemon
root 3429 0.0 0.0 103252 848 pts/6 S+ 13:28 0:00 grep upu
[root@Betty upu]#
[root@Betty upu]# ll
总用量 1400
drwxr-xr-x 2 root root 4096 2月 26 14:27 bin
drwxr-xr-x 2 root root 4096 3月 1 16:53 data
-rw-r----- 1 root root 1395568 3月 4 13:28 erl_crash.dump
drwxr-xr-x 3 root root 4096 2月 26 14:27 erts-6.0
drwxr-xr-x 2 root root 4096 3月 1 16:52 etc
drwxr-xr-x 12 root root 4096 2月 26 14:27 lib
drwxr-xr-x 2 root root 4096 3月 4 13:03 log
drwxr-xr-x 3 root root 4096 2月 26 14:27 releases
drwxr-xr-x 2 root root 4096 2月 26 14:28 system
As shown above, this method also produces an erl_crash.dump file, but the service process terminates.
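The signal-based approach can be wrapped in a small script. This is only a sketch under stated assumptions: request_dump is a hypothetical name, the caller supplies the PID of the beam.smp process, and the helper does nothing more than send SIGUSR1 and wait for the process to exit, since the VM terminates after writing the dump, as observed above.

```shell
#!/bin/sh
# request_dump PID [TIMEOUT_SECONDS]
# Sends SIGUSR1 to PID and waits until the process has exited.
# Hypothetical helper; the target VM is expected to stop.
request_dump() {
    pid=$1
    timeout=${2:-30}
    # Refuse to continue if the PID is missing or not running.
    if [ -z "$pid" ] || ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process: $pid" >&2
        return 1
    fi
    kill -s USR1 "$pid" || return 1
    waited=0
    # Poll once per second until the process disappears or we time out.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "process $pid still running after ${timeout}s" >&2
            return 2
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "process $pid exited"
}
```

For example, request_dump 3366 would correspond to the kill command shown above. Because the VM stops, this is only suitable when a restart of the service is acceptable.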
[root@Betty upu]# ./bin/upu start
[root@Betty upu]#
[root@Betty upu]#
[root@Betty upu]# ps aux|grep upu
root 2525 0.0 0.0 10956 428 ? S Feb01 0:12 /opt/mcu/upucore/erts-6.0/bin/epmd -daemon
root 3645 0.0 0.0 12908 792 ? S 13:31 0:00 /opt/mcu/upu/erts-6.0/bin/run_erl -daemon /tmp//opt/mcu/upu/ /opt/mcu/upu/log exec /opt/mcu/upu/bin/upu console ''
root 3646 43.5 1.1 512108 42036 pts/0 Ssl+ 13:31 0:00 /opt/mcu/upu/erts-6.0/bin/beam.smp -K true -- -root /opt/mcu/upu -progname upu -- -home /root -- -boot /opt/mcu/upu/releases/1/upu -mode embedded -config /opt/mcu/upu/etc/upu.config -mnesia dir '/opt/mcu/upu/data' -sname upu@Betty -setcookie upu -- console
root 3693 0.0 0.0 103252 848 pts/6 S+ 13:31 0:00 grep upu
[root@Betty upu]#
[root@Betty upu]# ll
总用量 36
drwxr-xr-x 2 root root 4096 2月 26 14:27 bin
drwxr-xr-x 2 root root 4096 3月 1 16:53 data
drwxr-xr-x 3 root root 4096 2月 26 14:27 erts-6.0
drwxr-xr-x 2 root root 4096 3月 1 16:52 etc
drwxr-xr-x 12 root root 4096 2月 26 14:27 lib
drwxr-xr-x 2 root root 4096 3月 4 13:03 log
drwxr-xr-x 3 root root 4096 2月 26 14:27 releases
drwxr-xr-x 2 root root 4096 2月 26 14:28 system
[root@Betty upu]#
Use the kill command to send SIGUSR2 to the service process
[root@Betty upu]#
[root@Betty upu]# kill -s SIGUSR2 3646
[root@Betty upu]# ll
总用量 36
drwxr-xr-x 2 root root 4096 2月 26 14:27 bin
drwxr-xr-x 2 root root 4096 3月 1 16:53 data
drwxr-xr-x 3 root root 4096 2月 26 14:27 erts-6.0
drwxr-xr-x 2 root root 4096 3月 1 16:52 etc
drwxr-xr-x 12 root root 4096 2月 26 14:27 lib
drwxr-xr-x 2 root root 4096 3月 4 13:03 log
drwxr-xr-x 3 root root 4096 2月 26 14:27 releases
drwxr-xr-x 2 root root 4096 2月 26 14:28 system
[root@Betty upu]# ps aux|grep upu
root 2525 0.0 0.0 10956 428 ? S Feb01 0:12 /opt/mcu/upucore/erts-6.0/bin/epmd -daemon
root 3706 0.0 0.0 103252 844 pts/6 S+ 13:31 0:00 grep upu
[root@Betty upu]#
As shown above, this method does not produce an erl_crash.dump file, yet the upu process still terminates.
==============
Important additional notes: