Disabling Deep Sleep C-states
CPU deep sleep C-states can cause unpredictable performance and system hangs in NexentaStor. Refer to the attached document for the reasoning behind the recommendation to disable deep sleep C-states.
Overview
NexentaStor hosts can become unresponsive when deep sleep C-states are enabled.
Brief Explanation of C-states
C-states are simply CPU idle states. The exact implementation in hardware varies slightly, but in general the states are numbered upward from C0, the state in which the CPU is executing instructions.
Higher C-states represent turning off more of the processor functions. As more of the functions are turned off, turning them back on can take longer, leading to unpredictable performance. In this paper, deep sleep C-states are defined as C3 or higher.
Note that changes in CPU frequency, often described as P-states, do not appear to negatively impact NexentaStor appliance operation. This paper describes changes to C-state settings only.
How to determine if C-states are enabled
To obtain the current C-State registers from a NexentaStor appliance:
nmc@myhost:/$ option expert_mode=1
nmc@myhost:/$ !kstat -p cpu_info:::current_cstate
You are about to enter the Unix ("raw") shell and execute low-level Unix command(s).
Warning: using low-level Unix commands is not recommended! Execute? Yes
cpu_info:0:cpu_info0:current_cstate 0
cpu_info:1:cpu_info1:current_cstate 1
cpu_info:2:cpu_info2:current_cstate 1
cpu_info:3:cpu_info3:current_cstate 0
...
If every CPU shows a C-state of 0, 1, or 2, then the results of this test are inconclusive, because a CPU's C-state can change quickly and a single reading can miss a brief deep sleep. If at least one CPU is in C-state 3 or greater, then consider disabling deep sleep C-states.
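Because a single reading can miss a short-lived deep sleep state, it can help to sample the kstat several times. The following is a minimal sketch, not a Nexenta-supplied tool: the function names deep_cstate_cpus and sample_deep_cstates and the KSTAT_CMD override are hypothetical, and the script assumes it is run from the raw Unix shell with output in the format shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads `kstat -p cpu_info:::current_cstate` output on
# stdin and prints the instance number of each CPU currently in C3 or deeper.
deep_cstate_cpus() {
    awk '
        # Each data line looks like: cpu_info:0:cpu_info0:current_cstate 0
        ($1 ~ /:current_cstate$/) && ($2 + 0 >= 3) {
            split($1, parts, ":")   # parts[2] is the CPU instance number
            print parts[2]
        }
    '
}

# Hypothetical helper: takes several readings, because one reading can miss a
# brief deep sleep. Arguments: number of samples (default 10) and seconds
# between samples (default 1). KSTAT_CMD is an assumed override for testing.
sample_deep_cstates() {
    samples=${1:-10}
    i=0
    while [ "$i" -lt "$samples" ]; do
        ${KSTAT_CMD:-kstat -p cpu_info:::current_cstate} | deep_cstate_cpus
        i=$((i + 1))
        sleep "${2:-1}"
    done | sort -n -u   # report each affected CPU once
}
```

Calling sample_deep_cstates 30 1 would take thirty readings one second apart and list every CPU seen in C3 or deeper at least once. An empty result is still inconclusive, for the same reason a single reading is.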
A second test is to observe the maximum number of possible C-states:
nmc@myhost:/$ option expert_mode=1
nmc@myhost:/$ !kstat -p cpu_info:::supported_max_cstates
You are about to enter the Unix ("raw") shell and execute low-level Unix command(s).
Warning: using low-level Unix commands is not recommended! Execute? Yes
cpu_info:0:cpu_info0:supported_max_cstates 4
cpu_info:1:cpu_info1:supported_max_cstates 4
cpu_info:2:cpu_info2:supported_max_cstates 4
cpu_info:3:cpu_info3:supported_max_cstates 4
...
If the number of supported_max_cstates is greater than 2, then consider disabling deep sleep C-states.
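The comparison against 2 can be scripted in the same way. This is a sketch under the same assumptions as the kstat output shown above; the function name max_supported_cstates is hypothetical and is not part of NexentaStor.

```shell
#!/bin/sh
# Hypothetical helper: reads `kstat -p cpu_info:::supported_max_cstates`
# output on stdin and prints the largest value reported by any CPU.
# Prints nothing if no matching lines are found.
max_supported_cstates() {
    awk '
        # Each data line looks like: cpu_info:0:cpu_info0:supported_max_cstates 4
        $1 ~ /:supported_max_cstates$/ {
            seen = 1
            if ($2 + 0 > max) max = $2 + 0
        }
        END { if (seen) print max + 0 }
    '
}

# Example use from the raw Unix shell (left commented out because it needs kstat):
#   max=$(kstat -p cpu_info:::supported_max_cstates | max_supported_cstates)
#   [ "${max:-0}" -gt 2 ] && echo "consider disabling deep sleep C-states"
```

Reporting only the maximum keeps the check to a single comparison, since one CPU above 2 is enough to warrant a look at the BIOS settings.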
Disabling Deep Sleep C-states
The method for disabling deep sleep C-states varies widely, just as BIOS implementations vary widely. The exact settings and descriptions may vary. In this section, the procedure for disabling deep sleep C-states is shown for some servers commonly used for NexentaStor appliances.
In general, it is best to change the C-state settings during installation or pre-production testing. The changes require a NexentaStor service outage that can impact production schedules. If a system is currently running in production, consider disabling deep sleep C-states at the next scheduled maintenance window.
HP DL-series servers
For many HP DL-series servers, the default BIOS configuration is "c-state enable," which includes C3 and higher C-states. To disable deep sleep C-states, modify the BIOS settings. In the BIOS menu, select "Power Management", then "Advanced Power Management", then "Modify C-STATE", and disable the use of deep sleep C-states. When disabled, only C-states C0 and C1 are enabled.
Because this is a BIOS setting, an HP system must be rebooted at least once for the change to take effect. After the reboot, verify the result from inside NexentaStor as detailed above.
Intel SR2600 and SR2625 Servers Sold by Quanta
Quanta chose to interpret "C-states enabled" in their BIOS implementation as the range from C0 to C2 only. When disabled, C-states are restricted to C0 or C1. Thus "C-states enabled" on a Quanta system does not enable deep sleep C-states. Also, the BIOS appears to transition from C1 to C2 rarely, leaving the processors in C0 or C1 under many workloads.
Pogo Linux Storage Director
Some Storage Director systems have been shipped with C3 and C6 C-states enabled. To disable the C3 and C6 C-states on Storage Director Z2 systems:
Step 1) Enter the BIOS during POST with the [F2] key.
Step 2) Navigate to the "Advanced>Processor Configuration" Menu and set the 'Processor C3' and 'Processor C6' Settings to [Disabled].
Step 3) Save and exit the BIOS using the [F10] Key.
Dell Servers
Many Dell systems have been observed to aggressively enter deep sleep C-states.
Some Dell servers offer BIOS options to enable C1E and C3 separately. Both the C1E and the deep sleep C3 C-states have been observed to cause unpredictable performance on Dell servers.
SuperMicro Servers
Many SuperMicro servers ship with deep sleep C-states enabled. It is advised to disable them from the BIOS as recommended above.
Next Steps
BIOS revisions can bring changes to the behavior of deep sleep C-states. Other OS vendors have noted similar problems with deep sleep C-state transitions. Nexenta recommends disabling deep sleep C-states. If improvements to the C-state implementations show predictable performance and stability, then this recommendation can be revisited.
Nexenta recognizes that consistency in operations can reduce total cost of ownership. Disabling deep sleep C-states can be a standard setting for system deployment for all OSes.
Other Hardware Affected
Broadcom 5709 and 5716 NIC
The Broadcom 5709 and 5716 NICs have been observed to be particularly susceptible to a system freeze issue. One suspected link is that the Broadcom driver included on the installation disk contains a known defect, RHBZ 511368, which leads to a lost interrupt vector and subsequent loss of network connectivity when the server is under load. With no network requests arriving, the server and its virtual machines become idle, the CPUs enter deep sleep C-states, and the system can hang. Servers with these Broadcom NICs are therefore susceptible to freezes under both low and high system load. Disabling deep sleep C-states appears to remedy this condition.
References
External links for further information:
Dell: http://lists.us.dell.com/pipermail/linux-poweredge/2010-May/042280.html
Microsoft: http://support.microsoft.com/kb/2000977
IBM: http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5083648
IBM: http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5085841&brandind=5000008
VMware: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1028656
Oracle: http://forums.oracle.com/forums/thread.jspa?threadID=1924462&start=15&tstart=0
RedHat: https://access.redhat.com/kb/docs/DOC-26837 (available for Red Hat subscribers)