What are all these "Bug: soft lockup" messages about?
This document (7017652) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server 12
SUSE Linux Enterprise Server 11
Situation
May 25 07:23:59 XXXXXXX kernel: [13445315.881356] BUG: soft lockup - CPU#16 stuck for 23s! [yyyyyyy:81602]
These are followed by various stack traces. This document tries to explain what the soft lockup messages mean.
Resolution
A 'soft lockup' watchdog timeout can happen if the kernel is busy working on a huge number of objects that need to be scanned, freed, or allocated.
The stack traces of those tasks can give a first idea about what the tasks were doing. However, to be able to examine the cause behind the messages, a kernel dump would be needed.
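Before looking at individual stack traces, it can help to see which CPUs and tasks are reported most often. The following is a minimal sketch, assuming the messages have the same format as the example in the Situation section; the function name summarize_soft_lockups is hypothetical and not part of any SUSE tool.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and count soft lockup
# reports per CPU and task, most frequent first.
summarize_soft_lockups() {
    grep 'BUG: soft lockup' |
        # Keep only "CPU#<n> <task>:<pid>" from each matching line.
        sed -n 's/.*soft lockup - \(CPU#[0-9]*\) stuck for [0-9]*s! \[\([^]]*\)].*/\1 \2/p' |
        sort | uniq -c | sort -rn
}
```

It could be fed from the kernel log, for example with journalctl -k | summarize_soft_lockups, or from a log file such as /var/log/messages. Which of these is available depends on the logging setup and is an assumption here.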
These messages cannot be disabled entirely, but in some cases raising the threshold before a soft lockup is reported can ease the situation:
server1:~ # echo 20 > /proc/sys/kernel/watchdog_thresh
or
server1:~ # echo "kernel.watchdog_thresh=20" > /etc/sysctl.d/99-watchdog_thresh.conf
server1:~ # sysctl -p /etc/sysctl.d/99-watchdog_thresh.conf
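To see what a given setting means in practice: according to the kernel's sysctl documentation, the soft lockup threshold is twice the value of watchdog_thresh, so a value of 20 would report a soft lockup only after roughly 40 seconds. The sketch below shows that arithmetic and, where the file is readable, prints the value currently in effect. The helper name soft_lockup_threshold is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the soft lockup threshold in seconds
# for a given watchdog_thresh value (threshold = 2 * watchdog_thresh).
soft_lockup_threshold() {
    echo $(( $1 * 2 ))
}

# Report the current setting if the sysctl file is available.
if [ -r /proc/sys/kernel/watchdog_thresh ]; then
    current=$(cat /proc/sys/kernel/watchdog_thresh)
    echo "watchdog_thresh=${current}s, soft lockup reported after about $(soft_lockup_threshold "$current")s"
fi
```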
For more information on how to configure and capture a kernel dump, see: Configure crashkernel memory for kernel core dump analysis
Cause
The watchdog daemon will send a non-maskable interrupt (NMI) to all CPUs in the system, which in turn print the stack traces of their currently running tasks.
Additional Information
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7017652
- Creation Date: 31-May-2016
- Modified Date: 22-Dec-2023
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com