File System Gets Corrupted On Disks Connected to H330 PERC On Dell EMC AMD Based Servers

This document (7023840) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 12 Service Pack 3 (SLES 12 SP3)
SUSE Linux Enterprise Server 12 Service Pack 4 (SLES 12 SP4)
SUSE Linux Enterprise Server 15
Dell EMC PowerEdge R6415
Dell EMC PowerEdge R7415
Dell EMC PowerEdge R7425

Situation

When heavy I/O runs on disks connected to a PERC H330 controller on AMD-based Dell EMC PowerEdge servers, file system corruption can occur.

Resolution

  • Update the BIOS to version 1.8.7 or later, which marks the PERC H330 address range as a unity mapping.
  • A kernel fix is in progress. Details of the fix will be added here.

Note that either the kernel update or the BIOS update is sufficient to fix the issue. Applying both is not required.
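
To check whether a system already meets the BIOS requirement, the installed version can be compared against 1.8.7. The following is a minimal sketch, not an official SUSE or Dell tool. It assumes that the BIOS version string on these servers is dotted-numeric (as "1.8.7" suggests) and that it is exposed through the standard Linux sysfs path /sys/class/dmi/id/bios_version. The function names are hypothetical.

```python
from pathlib import Path

# Minimum BIOS version named in the Resolution section.
MIN_BIOS = (1, 8, 7)


def parse_version(text):
    """Turn a dotted-numeric string such as '1.8.7' into a tuple of ints.

    Assumption: the version is purely dotted-numeric. Any other format
    raises ValueError instead of being guessed at.
    """
    return tuple(int(part) for part in text.strip().split("."))


def bios_has_fix(version_text, minimum=MIN_BIOS):
    """Return True if the given BIOS version is at least the minimum."""
    return parse_version(version_text) >= minimum


def read_bios_version(path="/sys/class/dmi/id/bios_version"):
    """Read the BIOS version string that the kernel exposes via sysfs."""
    return Path(path).read_text().strip()
```

On an affected server, `bios_has_fix(read_bios_version())` would return True once the BIOS has been updated. Comparing integer tuples rather than strings keeps a version such as 1.10.0 ordered after 1.8.7.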


Cause

PERC H330 controllers have no memory of their own and must use system memory for RAID operations. The BIOS therefore reserves a memory range, reports it as an “exclusion range”, and marks it as reserved in the IVRS table. The amd_iommu driver uses this information to program the exclusion range registers in the IOMMU, which disables IOMMU translation for that region of memory. Two problems can lead to the data corruption:

  1) The exclusion range is not reserved in the IOVA domain of the PERC H330, so addresses within it can be used for DMA, resulting in data corruption.

  2) The IVRS table in the BIOS provides the starting address and length of the exclusion range for the H330. When the AMD IOMMU driver sets up the exclusion range, it adds the length to the starting address to obtain the ending address, which it programs into the exclusion range limit register of the IOMMU. The correct ending address is the starting address plus the length, minus one. As a result, the exclusion range covers one extra page past the end of the range specified by the BIOS. If the kernel uses the address of this extra page as an IOVA, data corruption results.
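
The off-by-one in the second cause can be illustrated with a small calculation. This is a sketch under stated assumptions, not the driver code: it assumes a 4 KiB page size and a limit register that holds the page-aligned address of the last excluded page. The starting address and length are hypothetical values, and the function names are invented for illustration.

```python
PAGE_SIZE = 4096                 # assumed page size
PAGE_MASK = ~(PAGE_SIZE - 1)     # clears the offset bits within a page


def limit_as_described(start, length):
    """Limit computed as start + length, which lands one page too far."""
    return (start + length) & PAGE_MASK


def limit_corrected(start, length):
    """Limit computed from the last byte of the range: start + length - 1."""
    return (start + length - 1) & PAGE_MASK


def bypasses_translation(iova, start, limit):
    """True if the page holding iova lies between the start and limit pages."""
    page = iova & PAGE_MASK
    return (start & PAGE_MASK) <= page <= limit


# Hypothetical exclusion range: 16 pages starting at 0x10000000.
start = 0x10000000
length = 0x10000
next_page = start + length       # first page after the BIOS-specified range

print(hex(limit_as_described(start, length)))   # 0x10010000
print(hex(limit_corrected(start, length)))      # 0x1000f000
```

With the limit computed as start plus length, `bypasses_translation(next_page, start, limit_as_described(start, length))` is True, so an IOVA on that page would skip translation. With the corrected limit, the same check is False and only the 16 pages specified by the BIOS are excluded.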

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7023840
  • Creation Date: 25-Apr-2019
  • Modified Date: 03-Mar-2020
    • SUSE Linux Enterprise Server


For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com
