intel_uncore_has_discovery_tables issue with Intel Sapphire Rapids Middle Core Count CPUs

This document (000021138) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 15 Service Pack 5
SUSE Linux Enterprise Server 15 Service Pack 4

Situation

The following call trace appears in dmesg.log on hardware containing Intel Sapphire Rapids (SPR) Middle Core Count (MCC) CPUs:
 
[    1.173376] WARNING: CPU: 1 PID: 1 at ../arch/x86/events/intel/uncore_discovery.c:184 intel_uncore_has_discovery_tables+0x494/0x5c0
[    1.173388] Modules linked in:
[    1.173392] Supported: Yes
[    1.173395] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.14.21-150400.22-default #1 SLE15-SP4 0b6a6578ade2de5c4a0b916095dff44f76ef1704
[    1.173403] Hardware name: Dell Inc. PowerEdge XR5610/075K3T, BIOS 1.0.0 02/20/2023
[    1.173405] RIP: 0010:intel_uncore_has_discovery_tables+0x494/0x5c0
[    1.173411] Code: 00 4c 8b 4b 38 48 83 c2 04 48 83 c7 04 47 8b 0c 01 44 89 4a fc 4c 8b 5b 40 47 8b 04 03 44 89 47 fc 45 39 d1 75 c4 48 89 0c 24 <0f> 0b 48 89 c7 e8 92 68 2d 00 48 8b 0c 24 48 89 cf e8 86 68 2d 00
[    1.173415] RSP: 0000:ff5c9a5500057d50 EFLAGS: 00010246
[    1.173419] RAX: ff3bc1f4c26cd180 RBX: ff3bc1f4c551f600 RCX: ff3bc1f4c26cd610
[    1.173422] RDX: ff3bc1f4c26cd18c RSI: 0000000000000002 RDI: ff3bc1f4c26cd61c
[    1.173425] RBP: 0000000000000044 R08: 0000000000018000 R09: 0000000000000003
[    1.173428] R10: 0000000000000003 R11: ff3bc1f4c26cd2f0 R12: ff5c9a550321a000
[    1.173430] R13: 0000000000000000 R14: ff3bc1f4c51fe000 R15: 0000000000000100
[    1.173432] FS:  0000000000000000(0000) GS:ff3bc2143f440000(0000) knlGS:0000000000000000
[    1.173436] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.173439] CR2: 0000000000000000 CR3: 0000001bbac10001 CR4: 0000000000771ee0
[    1.173442] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.173444] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[    1.173446] PKRU: 55555554
[    1.173448] Call Trace:
[    1.173451]  <TASK>
[    1.173457]  ? rdinit_setup+0x2b/0x2b
[    1.173466]  intel_uncore_init+0x88/0x4ee
[    1.173473]  ? perf_pmu_register+0x2bb/0x440
[    1.173481]  ? uncore_types_init+0x1fa/0x1fa
[    1.173486]  ? rdinit_setup+0x2b/0x2b
[    1.173491]  do_one_initcall+0x3e/0x200
[    1.173500]  kernel_init_freeable+0x236/0x298
[    1.173509]  ? rest_init+0xd0/0xd0
[    1.173515]  kernel_init+0x16/0x120
[    1.173519]  ret_from_fork+0x1f/0x30
[    1.173528]  </TASK>

This may affect some perf functionality on Intel Sapphire Rapids CPUs.
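To check whether a saved kernel log contains this warning, a small script along the following lines may help. It is a minimal sketch, not a SUSE-provided tool: the function name `find_uncore_warnings` and the regular expression are assumptions modelled on the WARNING line shown in the trace above, and the source line number (184) is deliberately not hardcoded because it can differ between kernel builds.

```python
import re

# Pattern modelled on the WARNING line in the trace above (an assumption,
# not an official signature). The line number after "uncore_discovery.c:"
# is left open because it may vary between kernel builds.
WARN_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at \S*uncore_discovery\.c:\d+ "
    r"intel_uncore_has_discovery_tables\+"
)


def find_uncore_warnings(log_text):
    """Return every log line that matches the uncore discovery warning."""
    return [line for line in log_text.splitlines() if WARN_RE.search(line)]


if __name__ == "__main__":
    import sys

    # Example: dmesg | python3 check_uncore.py
    hits = find_uncore_warnings(sys.stdin.read())
    for line in hits:
        print(line)
    sys.exit(1 if hits else 0)
```

Run against the output of `dmesg` or a stored dmesg.log, the script prints any matching lines and exits with status 1 if at least one was found.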

Resolution

The kernel mitigates the issue by using a hardcoded, pre-defined table in place of the broken discovery table.

Fixed in these maintenance releases:
     SLES 15 SP5 - kernel 5.14.21-150500.55.7.1 (July-19-2023)
     SLES 15 SP4 - kernel 5.14.21-150400.24.55 (Mar-31-2023)
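The following sketch compares a kernel release string (for example, the output of `uname -r`) with the fixed versions listed above. The names `FIXED`, `parse_release` and `is_fixed` are hypothetical helpers, not part of any SUSE tool. It also assumes that the running kernel may report fewer version components than the package version, so only the components present in both strings are compared; treat the result as a hint and confirm against the installed kernel package.

```python
import re

# Fixed releases as listed in this document, keyed by the service-pack
# stream number that appears as the fourth numeric component.
FIXED = {
    150400: "5.14.21-150400.24.55",   # SLES 15 SP4
    150500: "5.14.21-150500.55.7.1",  # SLES 15 SP5
}


def parse_release(release):
    """Return the numeric components of a release string as a tuple.

    Only the first two dash-separated fields are used, so a flavor
    suffix such as "-default" is ignored.
    """
    core = "-".join(release.split("-")[:2])
    return tuple(int(n) for n in re.findall(r"\d+", core))


def is_fixed(release):
    """Return True/False for SP4 and SP5 kernels, None for other streams."""
    got = parse_release(release)
    if len(got) < 4 or got[3] not in FIXED:
        return None
    want = parse_release(FIXED[got[3]])
    # Assumption: compare only the components both strings provide.
    n = min(len(got), len(want))
    return got[:n] >= want[:n]


if __name__ == "__main__":
    import platform

    print(is_fixed(platform.release()))
```

For the kernel shown in the call trace, `is_fixed("5.14.21-150400.22-default")` returns `False`, which is consistent with the warning being logged on that system.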

Cause

The discovery table for UPI (Ultra Path Interconnect) units on Sapphire Rapids (SPR) Middle Core Count (MCC) CPUs is broken. This triggers a kernel warning during boot in the function intel_uncore_has_discovery_tables.

Additional Information

The fix for this issue appears in the kernel changelog for both SLES15 SP5 and SLES15 SP4 as a part of the following set of patches:
 
* Tue Mar 14 2023 tonyj@suse.de
- perf/x86/uncore: Don't WARN_ON_ONCE() for a broken discovery
  table (bsc#1206824, bsc#1206493, bsc#1206492).
- perf/x86/uncore: Add a quirk for UPI on SPR (bsc#1206824,
  bsc#1206493, bsc#1206492).
- perf/x86/uncore: Ignore broken units in discovery table
  (bsc#1206824, bsc#1206493, bsc#1206492).
- perf/x86/uncore: Fix potential NULL pointer in
  uncore_get_alias_name (bsc#1206824, bsc#1206493, bsc#1206492).
- perf/x86/uncore: Factor out uncore_device_to_die() (bsc#1206824,
  bsc#1206493, bsc#1206492).
- perf/x86/intel/uncore: Make set_mapping() procedure void
  (bsc#1206824, bsc#1206493, bsc#1206492).
- perf/x86/intel/uncore: Update sysfs-devices-mapping file
  (bsc#1206824, bsc#1206493, bsc#1206492).
- perf/x86/intel/uncore: Enable UPI topology discovery for
  Sapphire Rapids (bsc#1206824, bsc#1206493, bsc#1206492).
- perf/x86/intel/uncore: Enable UPI topology discovery for
  Icelake Server (bsc#1206824, bsc#1206493, bsc#1206492).
- perf/x86/intel/uncore: Get UPI NodeID and GroupID (bsc#1206824,
  bsc#1206493, bsc#1206492).
- perf/x86/intel/uncore: Enable UPI topology discovery for
  Skylake Server (bsc#1206824, bsc#1206493, bsc#1206492).
- perf/x86/intel/uncore: Generalize get_topology() for SKX PMUs
  (bsc#1206824, bsc#1206493, bsc#1206492).
- perf/x86/intel/uncore: Disable I/O stacks to PMU mapping on
  ICX-D (bsc#1206824, bsc#1206493, bsc#1206492).
- perf/x86/intel/uncore: Clear attr_update properly (bsc#1206824,
  bsc#1206493, bsc#1206492).
- perf/x86/intel/uncore: Introduce UPI topology type (bsc#1206824,
  bsc#1206493, bsc#1206492).
- perf/x86/intel/uncore: Generalize IIO topology support
  (bsc#1206824, bsc#1206493, bsc#1206492).
- commit 23fd14b

 

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000021138
  • Creation Date: 17-Jul-2023
  • Modified Date: 26-Jul-2023
    • SUSE Linux Enterprise Server

