
soft lockup

梁丘逸仙
2023-12-01

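/* Assumed declarations (not shown in the original snippet): baseline time and per-CPU system-time snapshots. */
static unsigned long beattime;
static u64 heartbeats[NR_CPUS];

static void dump_softlock_debug(unsigned long data);

/* DEFINE_TIMER() initializes the timer statically, so a separate init_timer(&softlock_timer) call is redundant. */
DEFINE_TIMER(softlock_timer, dump_softlock_debug, 0, 0);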
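static void dump_softlock_debug(unsigned long data)
{
    int i, reboot = 0;  /* must start at 0, otherwise the panic check below reads garbage */
    u64 system[NR_CPUS], num_jifs;

    /* Jiffies elapsed since the last heartbeat snapshot. */
    num_jifs = jiffies - beattime;
    for_each_possible_cpu(i) {
        /* System-mode CPU time accumulated on this CPU since the snapshot. */
        system[i] = kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM] - heartbeats[i];
    }

    for_each_possible_cpu(i) {
        /* If elapsed time minus system time is under 10 ms, the CPU spent almost
         * the whole interval in kernel mode, so treat it as wedged. */
        if ((num_jifs - cputime_to_jiffies(system[i])) < msecs_to_jiffies(10)) {
            WARN(1, "cpu %d wedged\n", i);
            smp_call_function_single(i, smp_dumpstack, NULL, 1);
            reboot = 1;
        }
    }

    if (reboot) {
        panic_timeout = 10;
        trigger_all_cpu_backtrace();
        panic("Soft lock on CPUs\n");
    }
}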
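The per-CPU test above can be tried outside the kernel. The following is a minimal user-space sketch, assuming all quantities have already been converted to jiffies; cpu_looks_wedged() and count_wedged() are hypothetical names for illustration, not kernel APIs. Unlike the original unsigned subtraction, the sketch also treats a system-time delta larger than the elapsed time as "fully busy" instead of letting the subtraction wrap around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the check in dump_softlock_debug(); values are in jiffies.
 * Returns 1 if the CPU spent all but `slack` jiffies of the interval in system mode. */
static int cpu_looks_wedged(uint64_t elapsed, uint64_t system_delta, uint64_t slack)
{
    /* Accounting granularity can make system_delta exceed elapsed;
     * count that as fully busy rather than letting the subtraction wrap. */
    if (system_delta >= elapsed)
        return 1;
    return (elapsed - system_delta) < slack;
}

/* Count how many CPUs in the sample look wedged. */
static size_t count_wedged(const uint64_t *system_delta, size_t ncpu,
                           uint64_t elapsed, uint64_t slack)
{
    size_t i, n = 0;

    assert(system_delta != NULL || ncpu == 0);
    for (i = 0; i < ncpu; i++)
        n += (size_t)cpu_looks_wedged(elapsed, system_delta[i], slack);
    return n;
}
```

With an interval of 1000 jiffies and a slack of 10, a CPU that accumulated 995 jiffies of system time is flagged, while one that accumulated 990 is not, because the remaining 10 jiffies are not strictly below the slack.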

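Inside some tasklet function:

{
    int i;  /* loop index; declared here on the assumption that it is not declared elsewhere */

    /* Record the baseline time and each CPU's system time, then re-arm the timer. */
    beattime = jiffies;

    for_each_possible_cpu(i) {
        heartbeats[i] = kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM];
    }

    mod_timer(&softlock_timer, jiffies + SOFT_LOCK_TIME * HZ);
}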

================================================

How to Deliberately Crash a System when Soft Lockup Occurs

Information

When the system experiences soft lockups, e.g. "BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0] Pid: 0", a vmcore generated at the time of the soft lockup is needed for further investigation of the issue.

Details

Starting from Red Hat Enterprise Linux 5.3, it is now possible to have the vmcore dump generated automatically at the time of a soft-lockup.
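To implement this, first set up and test kdump.
Then run the command below to make the kernel panic when a soft lockup occurs. Note that sysctl -w changes only the running value; to keep the setting across reboots, also add kernel.softlockup_panic=1 to /etc/sysctl.conf.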
sysctl -w kernel.softlockup_panic=1
This should now result in the system deliberately crashing and generating a vmcore at the time of a soft-lockup.
Soft lockups are situations in which the kernel's scheduler subsystem has not been given a chance to perform its job for more than 10 seconds.
They can be caused by defects in the kernel, by hardware issues or by extremely high workloads. The kernel includes code (in kernel/softlockup.c) to detect these situations and take action on them.
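To confirm that the setting took effect, the value can be read back from /proc/sys. The following is a minimal sketch, assuming the kernel exposes the setting as a file containing a single integer; read_sysctl_int() is a hypothetical helper written for illustration, not a kernel or libc API.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one integer from a sysctl file under /proc/sys.
 * Returns 0 on success and stores the value in *out; returns -1 on failure. */
static int read_sysctl_int(const char *path, long *out)
{
    char buf[64];
    char *end;
    long v;
    FILE *f;

    assert(path != NULL && out != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;              /* file absent or unreadable */
    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return -1;              /* empty file or read error */
    }
    fclose(f);

    errno = 0;
    v = strtol(buf, &end, 10);
    if (errno != 0 || end == buf)
        return -1;              /* out of range, or no digits parsed */
    *out = v;
    return 0;
}
```

After the sysctl -w command above, calling read_sysctl_int("/proc/sys/kernel/softlockup_panic", &v) should leave v equal to 1 on kernels that expose this file; where the file does not exist, the call returns -1.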

Issue

End users may see CPU soft lockup messages in the log files under heavy load. These are informational messages indicating that a CPU did not respond to the soft lockup timer within the timer window (currently 10 seconds on Red Hat Enterprise Linux). Under heavy load, they do not necessarily indicate a problem with the system.

Solution

The current upstream setting for this soft lockup timer parameter is 60 seconds.
Raising kernel.softlockup_thresh from its default of 10 to 30 or above gets rid of this message.
# sysctl -w kernel.softlockup_thresh=30
OR
Add this line to /etc/sysctl.conf (takes effect on next reboot, or immediately after running sysctl -p):
      kernel.softlockup_thresh=30
OR
Change value dynamically; only affects the system's current value:
      echo 30 > /proc/sys/kernel/softlockup_thresh

