Servers with Broadcom LSI MegaRAID SAS storage controllers randomly crash on boot

This document (000021663) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 15 SP5
SUSE Linux Enterprise Server 15 SP6


Situation

On Dell PowerEdge R640 servers with Broadcom LSI MegaRAID storage controllers, some servers were observed to crash intermittently during the later stages of boot after the SLES 15 SP5 kernel was updated from 5.14.21-150500.55.80 to 5.14.21-150500.55.83. In this particular setup, the kernel crashed while cloud-init was running.

The kernel log contained these messages:

BUG: kernel NULL pointer dereference, address: 0000000000000130
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 18a1b6067 P4D 18a1b6067 PUD 1c0b16067 PMD 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 62 PID: 0 Comm: swapper/62 Kdump: loaded Tainted: G          IOE  X    5.14.21-150500.55.83-default #1 SLE15-SP5 33240b69c7203f8eab122d545288fe507f185c6b
Hardware name: Dell Inc. PowerEdge R640/0H28RR, BIOS 2.22.2 09/12/2024
RIP: 0010:complete_cmd_fusion+0x439/0x650 [megaraid_sas]
...
Call Trace:
 <IRQ>
 megasas_isr_fusion+0x84/0x90 [megaraid_sas d7ec7902299b887bdbcf4651784a30e2e26d5c22]
 __handle_irq_event_percpu+0x36/0x1a0
 handle_irq_event_percpu+0x30/0x70
 handle_irq_event+0x34/0x60
 handle_edge_irq+0x7e/0x1a0
 __common_interrupt+0x3b/0xb0
 common_interrupt+0x58/0xa0
 </IRQ>
 <TASK>

Kernel crash dumps from different servers showed the exact same call trace and NULL pointer dereference.
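To check a saved kernel log for this signature, a small shell helper along the following lines may be useful. It is only a sketch: the function name has_megaraid_oops is made up for illustration, and it matches just the two lines from the excerpt above that identify the crash (the NULL pointer dereference and the RIP in complete_cmd_fusion).

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and return 0 only if it
# contains both the NULL pointer dereference message and an RIP line that
# points into complete_cmd_fusion of the megaraid_sas module.
has_megaraid_oops() {
    log=$(cat)
    printf '%s\n' "$log" | grep -q 'kernel NULL pointer dereference' &&
        printf '%s\n' "$log" | grep -q 'RIP:.*complete_cmd_fusion.*megaraid_sas'
}

# Example usage (the file name is a placeholder for a kernel log saved
# from the crashed system, for instance one extracted from a crash dump):
#   has_megaraid_oops < saved-dmesg.txt && echo "signature found"
```

A match only indicates that the log looks like the crash described here; it does not replace an analysis of the crash dump.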

Resolution

To avoid this issue, use one of the two options below:

  • Disable the smp_affinity_enable feature when the megaraid_sas kernel driver is loaded during boot. This can be done in two ways:
    1. Add 'megaraid_sas.smp_affinity_enable=0' to the kernel options in the bootloader configuration. On the kernel command line, a module parameter must be prefixed with the module name.
    2. Create a .conf file in /etc/modprobe.d/ and generate a new initrd. For example:
       echo 'options megaraid_sas smp_affinity_enable=0' > /etc/modprobe.d/megaraid_sas.conf
       dracut -f

  • Configure irqbalance to leave the IRQs of the megaraid_sas driver alone. This is the "cleanest" approach.

As per 'man irqbalance', the -m/--banmod option ensures that irqbalance does not change the affinity of any IRQs belonging to the given module. This option can be configured through IRQBALANCE_ARGS in the /etc/sysconfig/irqbalance file.
To apply it, add '--banmod=megaraid_sas' to the list of options in the IRQBALANCE_ARGS= line in /etc/sysconfig/irqbalance, then restart the irqbalance service.
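The irqbalance change can also be scripted, which may help when many servers need it. The following is a minimal sketch, not a supported tool: the function name add_banmod is hypothetical, and it assumes GNU sed and an IRQBALANCE_ARGS line whose value is enclosed in double quotes. It refuses to touch a line in any other format.

```shell
#!/bin/sh
# Hypothetical helper: add --banmod=megaraid_sas to IRQBALANCE_ARGS in the
# given sysconfig file, without duplicating it on repeated runs.
# Assumes GNU sed (for -i) and a double-quoted IRQBALANCE_ARGS="..." line.
add_banmod() {
    conf=$1
    opt='--banmod=megaraid_sas'

    # Already configured: leave the file untouched.
    if grep -q -e "^IRQBALANCE_ARGS=.*$opt" "$conf"; then
        return 0
    fi

    if grep -q '^IRQBALANCE_ARGS=".*"$' "$conf"; then
        # Empty value: set the option. Otherwise: append it to the existing ones.
        sed -i \
            -e "s/^IRQBALANCE_ARGS=\"\"\$/IRQBALANCE_ARGS=\"$opt\"/" -e t \
            -e "s/^IRQBALANCE_ARGS=\"\(.*\)\"\$/IRQBALANCE_ARGS=\"\1 $opt\"/" \
            "$conf"
    elif grep -q '^IRQBALANCE_ARGS=' "$conf"; then
        # Unquoted or otherwise unexpected value: do not guess.
        echo "unexpected IRQBALANCE_ARGS format in $conf, edit it by hand" >&2
        return 1
    else
        # No IRQBALANCE_ARGS line at all: add one.
        echo "IRQBALANCE_ARGS=\"$opt\"" >> "$conf"
    fi
}

# Example usage (as root), followed by a service restart:
#   add_banmod /etc/sysconfig/irqbalance && systemctl restart irqbalance
```

Keeping the function idempotent means it can be run from configuration management on every pass without growing the option list.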

Cause

This issue could happen on other hardware and on any recent SLES version and kernel. In the case described here, it appeared between two SLES 15 SP5 kernel versions on Dell R640 servers.

The megasas firmware/hardware has a design limitation which does not allow live migration of irq handlers and these servers have the irqbalance service enabled. The design decision was made by the hardware vendor (LSI Logic/Broadcom) and has always been present.

This design creates a small race window, which may or may not be hit on systems with megasas hardware and irqbalance enabled.
At some point, due to kernel or driver changes, the race window may get hit and the server will oops with a NULL pointer dereference in complete_cmd_fusion.

irqbalance mostly works fine for network cards, where it does not matter on which queue an RX flow arrives, as RX and TX flows are independent.

For storage HBAs, submission and completion are tightly coupled (I/O can only continue once a matching completion for any given submission has been received), so rearranging the interrupt mapping for any of those cards may cause problems like the one described in this document.

This is especially true for the megaraid HBAs, which require each completion to arrive on the queue specified by the submission command.

The driver itself has no fallback to look for completions on other queues, so if the interrupt affinity is changed, completions will arrive on the wrong queue, causing command timeouts and an HBA reset.

HBAs known to be affected by this issue:

  • Broadcom LSI MegaRAID SAS-3 3108
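
To see whether a host is in the combination described above (megaraid_sas loaded with smp_affinity_enable still on, and irqbalance running), the current driver setting can be read from sysfs. The sketch below assumes that the driver exposes the parameter under /sys/module/megaraid_sas/parameters/, which should be verified on the system in question. The function name and its optional root-prefix argument are hypothetical and exist only to make the example easy to try out.

```shell
#!/bin/sh
# Hypothetical helper: print the current smp_affinity_enable value of the
# megaraid_sas module, or a note if the parameter file cannot be read.
# An optional first argument is prepended to the sysfs path (for testing).
megaraid_affinity_state() {
    root=${1:-}
    param="$root/sys/module/megaraid_sas/parameters/smp_affinity_enable"
    if [ ! -r "$param" ]; then
        echo "megaraid_sas not loaded"
        return 0
    fi
    echo "smp_affinity_enable=$(cat "$param")"
}

# Example usage on a live system:
#   megaraid_affinity_state
#   systemctl is-active irqbalance
```

If the first command reports smp_affinity_enable=1 and the second reports active, neither of the two options from the Resolution section is in effect through these settings; a --banmod entry in IRQBALANCE_ARGS would still need to be checked separately.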

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000021663
  • Creation Date: 08-Jan-2025
  • Modified Date: 23-Jan-2025
    • SUSE Linux Enterprise Server
