Hanging processes due to CPU throttling
This document (000021525) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server 12 SP5
SUSE Linux Enterprise Server 15 SP2
SUSE Linux Enterprise Server 15 SP3
Situation
Processes can hang on systems that configure CPU cgroup throttling, e.g. via systemd's CPUQuota= directive or Kubernetes CPU limits.
Further analysis can be done with a debugger by inspecting the state of the runqueues and throttled lists, as is done in crash dump analysis or, on live systems, as described in this commit.
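As a lighter first check before attaching a debugger, the throttling counters in the cgroup's cpu.stat file (nr_periods, nr_throttled) can show whether a group is being throttled at all. The following is a minimal Python sketch and not part of the original analysis. The function names and the sample values are hypothetical, and the location of cpu.stat depends on the cgroup hierarchy in use.

```python
def parse_cpu_stat(text):
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name integer" lines.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats):
    """Fraction of enforcement periods in which the group was throttled."""
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods


# Hypothetical sample content; on a real system, read the cpu.stat file
# of the affected cgroup instead.
sample = "nr_periods 200\nnr_throttled 50\nthrottled_time 123456789\n"
stats = parse_cpu_stat(sample)
print(throttled_ratio(stats))
```

A high ratio only shows that throttling is active. It does not by itself prove the starvation described in this document, which still requires inspecting the runqueues.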
Resolution
- Upgrade to SUSE Linux Enterprise Server 15 SP4 or newer.
- Reconsider the CPU throttling configuration: a quota <= 5% * nr_cpus is susceptible to starvation on affected kernels (nr_cpus is the number of CPUs system-wide or in the cpuset restriction of the throttled cgroup).
- Use cpuset to restrict CPU consumption. In particular, a cpuset with a single CPU eliminates throttling unfairness in principle.
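The quota rule of thumb above can be checked with a few lines of Python. This is a sketch under one assumption: the quota is expressed as a percentage of a single CPU, as with systemd's CPUQuota= directive, so that "5% * nr_cpus" becomes 5 * nr_cpus percent. The function name is hypothetical.

```python
def quota_is_susceptible(quota_percent, nr_cpus):
    """Return True if the quota falls in the range flagged as susceptible.

    quota_percent: quota as a percentage of ONE CPU (assumed CPUQuota= style).
    nr_cpus: CPUs available to the cgroup (system-wide or its cpuset).
    """
    if quota_percent <= 0 or nr_cpus <= 0:
        raise ValueError("quota_percent and nr_cpus must be positive")
    return quota_percent <= 5 * nr_cpus


# 20% of one CPU on 8 CPUs: threshold is 40%, so this is susceptible.
print(quota_is_susceptible(20, 8))
# 50% of one CPU on 8 CPUs: above the 40% threshold.
print(quota_is_susceptible(50, 8))
# With a single-CPU cpuset the threshold drops to 5%.
print(quota_is_susceptible(20, 1))
```

Note that the threshold grows with the CPU count: on a 64-CPU machine, even a quota worth two full CPUs (200%) is below 5 * 64 = 320% and therefore falls in the susceptible range.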
Cause
Historically, the implementation of CPU throttling has not been entirely fair when a workload runs on multiple CPUs. The quota is divided among the CPUs in slices (which are smaller than the quota itself), and under certain conditions some CPUs may not receive any portion of the quota. Anything that is supposed to run on such a CPU within the CPU-restricted cgroup is then effectively starved. The behavior of the scheduler was eventually fixed in the SUSE Linux kernels, but the changes are too intrusive to be backported to older kernels.
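The slice arithmetic can be illustrated with a deliberately crude model. It assumes the common kernel defaults of a 100 ms enforcement period and a 5 ms bandwidth slice (both are tunable, so verify them on the system in question), and it assumes the worst case in which every CPU that gets a slice keeps a full one. The function names are hypothetical and the real scheduler is more involved than this.

```python
def slices_per_period(quota_us, slice_us=5000):
    """Number of full slices the quota can be split into per period."""
    return quota_us // slice_us


def cpus_possibly_without_slice(quota_us, nr_cpus, slice_us=5000):
    """CPUs left with no full slice if every other CPU takes one."""
    return max(0, nr_cpus - slices_per_period(quota_us, slice_us))


# A 20 ms quota per 100 ms period (20% of one CPU) on 8 CPUs yields
# 4 slices of 5 ms, so up to 4 CPUs may receive nothing in a period.
print(slices_per_period(20000))
print(cpus_possibly_without_slice(20000, 8))
# A 50 ms quota yields 10 slices, enough for all 8 CPUs in this model.
print(cpus_possibly_without_slice(50000, 8))
```

Under these assumptions, 5 ms out of 100 ms per CPU is where the 5% * nr_cpus figure in the Resolution section comes from: below it, there are fewer full slices than CPUs.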
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000021525
- Creation Date: 09-Aug-2024
- Modified Date: 28-Aug-2024
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com