Kernel soft lockup with blk_mq_update in traces
This document (000020248) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server 15
Situation
kernel: [1726320.308008] NMI watchdog: BUG: soft lockup - CPU#104 stuck for 23s!
followed by
kernel: [1726320.336383] ? blk_mq_update_queue_map+0x20/0x20
in the logs.
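To check whether a system matches this signature, the log can be searched for both messages. The following is a minimal sketch. The helper name has_blk_mq_lockup and the log path in the usage note are assumptions for illustration and are not part of the original report.

```shell
#!/bin/sh
# Return 0 if the given log file contains both a soft lockup message
# and a blk_mq_update symbol in a trace; return non-zero otherwise.
# has_blk_mq_lockup is a hypothetical helper name.
has_blk_mq_lockup() {
    log_file="$1"
    grep -q 'BUG: soft lockup' "$log_file" &&
        grep -q 'blk_mq_update' "$log_file"
}
```

Assuming the kernel messages are written to /var/log/messages, a call such as has_blk_mq_lockup /var/log/messages && echo "signature found" would report a match. The two messages are only checked for presence, not for order or proximity.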
Resolution
The disk I/O statistics involved in this issue are the ones shown in
/proc/diskstats
In the Azure environment, the Hyper-V storvsc driver in Linux can set the
can_queue
parameter value too high, which can result in allocating too many "tags" when operating with blk-mq enabled.
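The value currently in effect can usually be inspected through sysfs. The sketch below assumes that each SCSI host exposes a can_queue attribute under /sys/class/scsi_host. The helper name show_can_queue is hypothetical, and the output does not by itself indicate which value counts as too high.

```shell
#!/bin/sh
# Print "<host> <can_queue>" for every SCSI host found under the given
# sysfs directory (default: /sys/class/scsi_host).
# show_can_queue is a hypothetical helper name.
show_can_queue() {
    host_root="${1:-/sys/class/scsi_host}"
    for f in "$host_root"/host*/can_queue; do
        # Skip hosts without a readable attribute (or an unmatched glob).
        [ -r "$f" ] || continue
        host=$(basename "$(dirname "$f")")
        printf '%s %s\n' "$host" "$(cat "$f")"
    done
}
```

Calling show_can_queue without arguments reads the live system. Passing a directory as the first argument allows the function to be tried against a copy of the sysfs tree.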
There are several workarounds possible:
1. The most obvious is to move to a Linux kernel version 5.0 or later. However, this is the most problematic option in a production environment, as it depends on whether a kernel 5.0 or later is available.
2. Disable blk-mq and use the older block subsystem in the Linux kernel.
This workaround could negate some I/O performance gains of the parallelism that the blk-mq subsystem provides.
To apply this workaround, remove
scsi_mod.use_blk_mq=y
from the kernel boot line and reboot.
3. Disable disk I/O stats.
This can be done on-the-fly on a per-disk basis on a running system by
echo "0" > /sys/block/<device>/queue/iostats
This has to be done for each disk device in the system and does not persist across a reboot.
4. Reduce the "can_queue" value.
This value cannot be set directly. The desired effect can be achieved by adding the kernel boot line options
hv_storvsc.storvsc_ringbuffer_size=131072 hv_storvsc.storvsc_vcpus_per_sub_channel=1024
A reboot is required, during which these values are passed to the hv_storvsc driver.
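After the reboot for option 4, it may be useful to confirm that both options actually reached the kernel command line. This is a minimal sketch that only inspects /proc/cmdline. The helper name has_storvsc_options is hypothetical, and the check does not verify that the driver accepted the values.

```shell
#!/bin/sh
# Return 0 if the kernel command line file (default: /proc/cmdline)
# contains both hv_storvsc options from option 4 with the values
# given in this document; return non-zero otherwise.
# has_storvsc_options is a hypothetical helper name.
has_storvsc_options() {
    cmdline_file="${1:-/proc/cmdline}"
    grep -Eq '(^| )hv_storvsc\.storvsc_ringbuffer_size=131072( |$)' "$cmdline_file" &&
        grep -Eq '(^| )hv_storvsc\.storvsc_vcpus_per_sub_channel=1024( |$)' "$cmdline_file"
}
```

A call such as has_storvsc_options && echo "options present" checks the running system.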
Option 4 only works on Azure. Option 3 is not persistent. Option 1 might not be possible.
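Because option 3 has to be applied to every disk and is lost on reboot, the per-disk command can be wrapped in a loop. The following is a sketch under the assumption that every block device of interest has a queue/iostats file under /sys/block. The helper name disable_iostats is hypothetical.

```shell
#!/bin/sh
# Write 0 to queue/iostats for every block device under the given
# sysfs directory (default: /sys/block). On a real system this
# needs to run as root.
# disable_iostats is a hypothetical helper name.
disable_iostats() {
    block_root="${1:-/sys/block}"
    for f in "$block_root"/*/queue/iostats; do
        # Skip devices without a writable iostats file.
        [ -w "$f" ] || continue
        echo 0 > "$f"
    done
}
```

Calling disable_iostats without arguments applies the setting to all disks on the running system. As the setting is not persistent, the function would have to be run again after each boot, for example from a boot-time script.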
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000020248
- Creation Date: 14-May-2021
- Modified Date: 09-Jun-2021
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com