Security Vulnerability: "L1 Terminal Fault" (L1TF) aka CVE-2018-3615, CVE-2018-3620 & CVE-2018-3646.
This document (7023077) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server 12
SUSE Linux Enterprise Server 11
Situation
Researchers have found that during speculative execution, page table address lookups do not honor the page table present bit and other reserved bits, so that speculative execution can read memory content of other processes or other VMs if that content is present in the shared L1 data cache of the same core.
The issue is called "Level 1 Terminal Fault", or short "L1TF".
The following processor families are not affected:
- The Atom family (Cedarview, Cloverview, Lincroft, Penwell, Pineview, Silvermont, Airmont, Merrifield)
- The Core Duo Yonah variants (2006 - 2008)
- The XEON PHI family
- Processors which have the ARCH_CAP_RDCL_NO bit set in the IA32_ARCH_CAPABILITIES MSR.
If the bit is set the CPU is also not affected by the Meltdown vulnerability.
The issue has three variants, depending on the attack level, each with its own CVE:
- OS level: CVE-2018-3620
- VMM level: CVE-2018-3646
- SGX enclave level: CVE-2018-3615
SUSE's mitigations cover the OS and VMM levels.
Note that this requires the targeted memory content to be loaded into the L1 data cache by another process or VM, which is hard to control for an active attacker.
Resolution
Microcode updates
Install the latest microcode updates from Intel to enable the L1 data cache flush feature. These are provided either by SUSE in the "ucode-intel" or "microcode_ctl" packages, and/or via your BIOS / system vendor. The CPU microcode releases from Intel addressing Spectre v4 in recent months already contain this feature.
SUSE provides "ucode-intel" and "microcode_ctl" packages containing the Intel provided CPU microcode bundles, and versions 20180703 and newer, also already contain this feature.
The feature is shown in /proc/cpuinfo as the "flush_l1d" flag when available.
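For example, its presence can be checked with a simple, illustrative shell one-liner:
# Prints "flush_l1d available" once the updated microcode is loaded
grep -qw flush_l1d /proc/cpuinfo && echo "flush_l1d available" || echo "flush_l1d missing"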
Linux Kernel updates
Linux kernel updates are needed both to provide protection from local users and to enable the protections in the KVM hypervisor built into the Linux kernel. SUSE has published, or is in the process of publishing, fixed kernels for all maintained distributions.
Cause
Additional Information
Linux Kernel updates
After installing and booting the Linux kernel update, a new sysfs variable will show the state of the available mitigations:
/sys/devices/system/cpu/vulnerabilities/l1tf
The following values can appear on non-VMX guest systems:
Not affected
The CPU is not affected by this problem
Mitigation: PTE Inversion
The in-kernel protection is active
(Note that this basic mitigation is always active, and has no performance impact.)
VMX: SMT vulnerable
SMT is enabled and vulnerable
VMX: SMT disabled
SMT is disabled
The state of the microcode-based L1D flush mitigation is also shown:
L1D vulnerable
L1D flushing is disabled
L1D conditional cache flushes
L1D flush is conditionally enabled; the cache is flushed before entering a VM only when host memory may have been exposed between VM exit and VM entry
L1D cache flushes
L1D flush is unconditionally enabled, flushing unconditionally before entering VMs
L1D EPT disabled
L1D flush is disabled, as Extended Page Tables are either not present or disabled
"Mitigation: PTE Inversion; VMX: SMT vulnerable, L1D conditional cache flushes"
XEN Hypervisor updates
Mitigation control on the kernel boot command line
Linux kernel boot command line options are available to control these mitigations. The main boot option is "l1tf":
l1tf=off
Disables the L1TF mitigations and emits no warnings. (Note: this option only controls the mitigation for the VMM side of the flaw.)
l1tf=full
Enables all mitigations for L1TF, including disabling SMT (Simultaneous Multithreading). SMT control is still possible after boot using sysfs variables. A warning will be emitted if VMs are started in an unsafe configuration.
l1tf=full,force
Same as "full", but SMT control is not possible after boot.
l1tf=flush
Leaves SMT enabled and enables the conditional hypervisor mitigations. Hypervisors will emit a warning when the first VM is started in an unsafe configuration.
l1tf=flush,nosmt
Disables SMT and enables the conditional hypervisor mitigations. Hypervisors will emit a warning when the first VM is started in an unsafe configuration.
l1tf=flush,nowarn
Same as "flush", but hypervisors will not emit a warning when the first VM is started in an unsafe configuration.
The setting "flush" is the current default on upstream and SUSE kernels.
Mitigation control for KVM
- Controlling the L1 datacache flush behaviour is possible with the "kvm-intel.vmentry_l1d_flush" option:
kvm-intel.vmentry_l1d_flush=always
The L1D cache is flushed on every VMENTER.
kvm-intel.vmentry_l1d_flush=cond
The L1D cache is flushed on VMENTER only when there can be leak of host memory between VMEXIT and VMENTER. This could still leak some host data, like address space layout.
kvm-intel.vmentry_l1d_flush=never
Disables the L1D cache flush mitigation.
The default setting here is "cond".
The l1tf "full" setting overrides the settings of this configuration variable.
- Controlling use of Extended Page Tables is possible with the "kvm-intel.ept" option:
kvm-intel.ept=0
The Extended Page tables support is switched off.
SUSE recommends leaving this enabled and instead using the L1D cache flush and SMT mitigations.
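Whether EPT is currently enabled can be verified through the corresponding module parameter (Y means enabled):
cat /sys/module/kvm_intel/parameters/ept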
- SMT control options:
On the kernel boot commandline:
nosmt
SMT is disabled, but can later be re-enabled in the system.
nosmt=force
SMT is disabled and cannot be re-enabled in the system.
If this option is not passed, SMT is enabled. Again, the "l1tf" option overrides this option.
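As an illustration, on a GRUB2-based system such as SUSE Linux Enterprise Server 12, a boot option like "l1tf=flush,nosmt" or "nosmt" can be added roughly as follows (adjust for your boot loader and configuration):
# Check which options are currently active
cat /proc/cmdline
# Add the desired option to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub,
# then regenerate the boot loader configuration and reboot:
grub2-mkconfig -o /boot/grub2/grub.cfg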
Changing the system settings of SMT states is also possible, using specific sysfs control files:
/sys/devices/system/cpu/smt/control
This file shows the current SMT control state and allows SMT to be disabled or (re)enabled.
Possible states are:
on
SMT is supported and enabled.
off
SMT is supported, but disabled. Only primary SMT threads can be onlined.
forceoff
SMT is supported, but disabled. Further control is not possible.
notsupported
SMT is not supported.
Potential values that can be written into this file:
on
off
forceoff
/sys/devices/system/cpu/smt/active
This file shows whether SMT is enabled and active, where active means that multiple threads run on one core.
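For example, as root (the sysfs setting does not persist across reboots unless also configured on the boot command line):
# Show the current SMT control state and whether SMT is active
cat /sys/devices/system/cpu/smt/control
cat /sys/devices/system/cpu/smt/active
# Disable SMT at runtime; write "on" to re-enable it (unless forced off)
echo off > /sys/devices/system/cpu/smt/control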
Linux Kernel Live Patches
SUSE is mitigating CVE-2018-3646 with live patches, which add support for the L1D cache flush method, defaulting to "conditional".
When the CPU does not provide the L1D cache flush feature (updated microcode), the kernel will flush the L1 cache by reading 64 KB of data.
Additionally, the SMT / hyperthread enablement guidance mentioned above applies, since full mitigation with untrusted guests is only achieved by disabling SMT at this time.
By default, SMT is kept enabled after the live patch is applied.
To disable SMT, SUSE provides a script called "klp-kvm-l1tf-ctrl-smt" in the "kgraft" (for SUSE Linux Enterprise 12) and "kernel-livepatch-tools" packages.
Please also have a look at the man page of klp-kvm-l1tf-ctrl-smt.
Disabling SMT:
klp-kvm-l1tf-ctrl-smt -d
Enabling SMT:
klp-kvm-l1tf-ctrl-smt -e
Guidance:
No virtualization
The Linux kernel update will fully mitigate the issue.
Virtualization with trusted guests
If the guest OS can be trusted and runs an updated kernel, the system is protected against L1TF and needs no further action.
Virtualization with untrusted guests
If SMT is not supported by the processor, or is disabled in the BIOS or by the kernel, only flushing the L1 data cache when switching between VMs is required.
If SMT is supported and active, the following scenarios are possible:
- Guests can be confined to single cores or to groups of cores not shared with other guests.
While this reduces the attack surface greatly, interrupts or kernel threads could still run on those cores in parallel with malicious code, and data used by those could be exposed to attackers.
- In addition to isolating guests to single cores or groups of cores, confining interrupt handling to other cores can further reduce the attack surface, but it is still possible that kernel threads run on those cores. (A sketch of such CPU pinning with libvirt follows after this list.)
- Only disabling SMT and enabling L1 Datacache flushes provides maximum protection.
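As a sketch of the core-isolation approach with libvirt/KVM: the guest name "vm1" and the CPU numbers below are examples only; take the actual core/thread topology from "lscpu -e" or "virsh capabilities".
# Pin both vCPUs of guest "vm1" to the two SMT threads of one physical core
virsh vcpupin vm1 0 2
virsh vcpupin vm1 1 6
# Keep the emulator threads of the guest off those CPUs
virsh emulatorpin vm1 0-1,3-5,7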
Addendum:
A message may be observed in dmesg output indicating that the hardware has too much memory and that the L1TF mitigation is therefore not in effect.
The problem is, however, more likely to be seen on hardware where memory is placed at unexpected physical address ranges than on systems that actually have too much memory.
The workaround depends on the physical address limit of the CPU, which can be checked with:
grep physical /proc/cpuinfo
When the returned limit contains "36 bits", the following kernel boot parameter can be added:
mem=32G
Note: The most commonly encountered physical address limit will likely be 36 bits, but other limits may be encountered as well.
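A small sketch of the underlying arithmetic, assuming the "address sizes" line of /proc/cpuinfo reflects the real limit (see the note below about Nehalem): the safe mem= value is half of the physically addressable range.
# Read the physical address width and print half of the addressable range in GB
bits=$(awk '/address sizes/ {print $4; exit}' /proc/cpuinfo)
echo "physical address bits: $bits"
echo "suggested mem= limit: $(( (1 << (bits - 1)) / 1024 / 1024 / 1024 ))G"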
Background:
L1TF mitigation relies on using the upper bits of the physically addressable memory range, and therefore the system cannot have any memory placed at the upper half of the physically addressable memory. This means that on platforms with 36b address limit (64GB) only the low [0, 32GB] physical address range can be populated.
There are platforms, however, that place memory above the low address range, and then the L1TF mitigation is disabled by default.
The mem=32G workaround will, however, effectively cut off any memory residing above the 32GB physical address. How much memory is lost depends on the exact memory layout.
Please note :
Nehalem and certain other microarchitectures (uarchs) might report a smaller physical address limit while internally supporting 44 bits. SUSE will be back-porting a patch for this issue and will release it in an upcoming maintenance update.
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7023077
- Creation Date: 11-Jun-2018
- Modified Date: 03-Mar-2020
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com