Faulty Intel chipsets cause problems with interrupt remapping
This document (7014344) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server 11 Service Pack 3
Situation
The reported symptoms range from network link state flapping to partial or full loss of communication on network cards.
In all reported cases, the kernel logs messages like
"kernel: do_IRQ: x.xxx No irq handler for vector (irq -1)"
in the syslog.
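To check whether a system has logged this message, the syslog can be searched for it. The following is a minimal sketch and not part of the original advisory: the function name count_irq_errors is hypothetical, and the pattern assumes that the "x.xxx" placeholder in the sample message stands for two numbers separated by a dot.

```shell
# Count "No irq handler for vector" messages in a log read from stdin.
# Hypothetical helper: the pattern is derived from the sample message
# quoted above; the actual numbers vary per system.
count_irq_errors() {
    # grep -c prints the number of matching lines; "|| true" keeps the
    # exit status at 0 when there are no matches.
    grep -cE 'do_IRQ: [0-9]+\.[0-9]+ No irq handler for vector' || true
}

# Example usage on a live system (assumes the default syslog file):
#   count_irq_errors < /var/log/messages
```

A non-zero count only shows that the symptom is present. Whether the chipset is one of the affected ones still has to be verified separately.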
Resolution
Intel has provided firmware updates/errata to the BIOS of the affected chipsets.
Examples:
Intel® 5520 and Intel® 5500 Chipset Specification Update
Errata
47. Intel VT-d: Receiving two identical interrupt requests in back to back
cycles may corrupt attributes of remapped interrupt, or hang
subsequent interrupt-remap-cache invalidation command.
and
53. Intel VT-d: In-flight remap-able interrupts not drained on interrupt
invalidation command
Intel® X58 Express Chipset Specification Update
Errata
62. Intel VT-d: UP Workstation ONLY. Receiving two identical interrupt
requests in back to back cycles may corrupt attributes of remapped
interrupt, or hang subsequent interrupt-remap-cache invalidation
command.
69. Intel® VT-d: In-flight remap-able interrupts not drained on interrupt
invalidation command
In some deployments, however, updating firmware on systems in the field may be difficult or undesirable.
To help customers in this situation, a quirk has recently been introduced in the upstream Linux kernel.
When this code detects one of the affected chipsets with interrupt remapping enabled, it disables interrupt remapping in the kernel. In addition, the kernel is tainted and the message below is logged in /var/log/messages and dmesg.
This system BIOS has enabled interrupt remapping
on a chipset that contains an erratum making that
feature unstable. To maintain system stability
interrupt remapping is being disabled. Please
contact your BIOS vendor for an update.
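Whether the kernel has printed this warning can be checked by searching the kernel log for a phrase from the message. This is a minimal sketch: the function name quirk_warning_logged is hypothetical, and it assumes that the phrase "interrupt remapping is being disabled" appears on a single log line, as in the message quoted above.

```shell
# Succeeds if the kernel log read from stdin contains the quirk warning.
# Hypothetical helper; matches a fixed phrase from the message above.
quirk_warning_logged() {
    grep -qF 'interrupt remapping is being disabled'
}

# Example usage on a live system:
#   dmesg | quirk_warning_logged && echo "Quirk warning was logged"
```

Older messages may already have rotated out of the dmesg buffer, so searching /var/log/messages in the same way may also be necessary.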
This change is included in the SLE 11 SP3 Linux kernel version 3.0.101-0.21.1. The SLE 11 SP2 (LTSS) kernel version 3.0.101-0.7.19.1 includes a reduced version of the same quirk: the kernel is tainted and the warning is printed, but interrupt remapping is not actually disabled.
If a system is exhibiting the symptoms described above, it is recommended to first determine whether the system is equipped with one of the faulty chipsets.
This can be done with the following command:
# /sbin/lspci -nn | grep -qE '8086:(340[36].*rev 13|3405.*rev (12|13|22))' && echo "Interrupt remapping is broken"
If the command outputs "Interrupt remapping is broken", continue below. Otherwise this document does not apply.
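The check above only reports that a match exists. To see which device matched, the same pattern can be applied without the quiet option. This is a minimal sketch that reuses the pattern from the command above; the function name list_affected_chipsets is hypothetical.

```shell
# Print the lspci lines that match the affected device IDs and revisions.
# Reads "lspci -nn" output from stdin and uses the same pattern as the
# detection command above. Hypothetical helper name.
list_affected_chipsets() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E '8086:(340[36].*rev 13|3405.*rev (12|13|22))' || true
}

# Example usage on a live system:
#   /sbin/lspci -nn | list_affected_chipsets
```

Empty output corresponds to the case where this document does not apply.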
The quirk introduced in the Linux kernel is merely a workaround to handle broken hardware.
Contact the hardware vendor and request a firmware update that addresses the problem in order to get the root cause fixed.
If a firmware update is not an option, install the kernel version mentioned above (or later) on SLE 11 SP3 systems. For SLE 11 SP2 there will not be a kernel workaround that disables interrupt remapping.
On some systems it is also possible to disable interrupt remapping in the BIOS. The corresponding BIOS option goes by different names, however; one example is "Intel VT-d".
As a temporary workaround, interrupt remapping can be disabled by adding
intremap=off
to the list of kernel command line parameters in the boot loader configuration using the YaST bootloader module.
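After the next reboot, the running kernel's command line can be inspected to confirm that the parameter took effect. This is a minimal sketch: the function name intremap_disabled is hypothetical, and it assumes the parameters are separated by single spaces, as in /proc/cmdline.

```shell
# Succeeds if the kernel command line read from stdin contains the
# exact parameter intremap=off. Hypothetical helper name.
intremap_disabled() {
    # Put each parameter on its own line, then match the whole line.
    tr ' ' '\n' | grep -qxF 'intremap=off'
}

# Example usage after rebooting with the new parameter:
#   intremap_disabled < /proc/cmdline && echo "intremap=off is set"
```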
Interrupt remapping is mostly useful when using PCI pass-through in KVM-based virtualization scenarios. Customers not using KVM, or using KVM without PCI pass-through, can simply disable interrupt remapping; no further action is needed. However, customers using PCI pass-through in KVM must additionally update their KVM configuration so that it no longer uses PCI pass-through. If the KVM configuration is not updated, the affected KVM guests will no longer start.
Cause
Additional Information
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7014344
- Creation Date: 20-Dec-2013
- Modified Date: 28-Sep-2022
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com