Cluster reboots frequently without cause in logs (PACEMAKER)
This document (7018594) is provided subject to the disclaimer at the end of this document.
Environment
Situation
2017-01-04T14:29:38.217687-06:00 saturn kernel: [160693.230387] cgroup: fork rejected by pids controller in /system.slice/pacemaker.service
2017-01-04T14:36:49.590171-06:00 saturn kernel: [ 42.724175] cgroup: fork rejected by pids controller in /system.slice/pacemaker.service
2017-01-04T16:42:34.676537-06:00 saturn kernel: [ 41.473142] cgroup: fork rejected by pids controller in /system.slice/pacemaker.service
This only happens on SLES 12 SP2. Clusters running SLES 11, SLES 12, or SLES 12 SP1 are not affected.
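One way to confirm the symptom is to count the rejection messages for the pacemaker unit in the system log. The following is a minimal sketch; count_fork_rejections is a hypothetical helper name, not part of any SUSE tooling, and the log file to search is passed as an argument because its location depends on the logging setup.

```shell
#!/bin/sh
# Hypothetical helper: count "fork rejected" kernel messages for one unit.
# $1 = log file to search
# $2 = systemd unit name (for example pacemaker.service)
count_fork_rejections() {
    logfile="$1"
    unit="$2"
    # -F matches the text literally; -c prints the number of matching lines.
    grep -F -c "fork rejected by pids controller in /system.slice/${unit}" "$logfile"
}
```

Assuming the kernel messages are written to /var/log/messages, it could be called as: count_fork_rejections /var/log/messages pacemaker.service. On systemd versions that support the task limit, systemctl show -p TasksCurrent -p TasksMax pacemaker.service should report how close the unit is to its limit.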
Resolution
https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12-SP2/#fate-320358
Because a pacemaker cluster may manage many resources and spawn many lrmd processes, the per-unit task limit described in the release notes above (DefaultTasksMax, 512 by default) can be reached in some environments. As the release notes state, this is a possible limitation of an otherwise good default setting. It should be taken into account during planning, before the cluster is configured.
To alleviate the issue, increase the limit. In
/etc/systemd/system.conf
change the entry
#DefaultTasksMax=512
to
DefaultTasksMax=8192
and activate the setting with
systemctl daemon-reload
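The edit to the configuration file can also be scripted. The following is a minimal sketch under stated assumptions: set_default_tasks_max is a hypothetical helper name, GNU sed is assumed to be available, and the file is assumed to already contain a DefaultTasksMax line (commented or not), as the shipped /etc/systemd/system.conf does.

```shell
#!/bin/sh
# Hypothetical helper: uncomment and set DefaultTasksMax in a systemd
# manager configuration file.
# $1 = path to the configuration file
# $2 = new limit (for example 8192)
set_default_tasks_max() {
    conf="$1"
    limit="$2"
    # Stop if there is no DefaultTasksMax line (commented or not) to change.
    grep -Eq '^#?[[:space:]]*DefaultTasksMax=' "$conf" || return 1
    # Keep a copy of the original file next to it.
    cp -p "$conf" "$conf.bak" || return 1
    # Replace the whole line, dropping a leading "#" if present.
    sed -i -E "s/^#?[[:space:]]*DefaultTasksMax=.*/DefaultTasksMax=${limit}/" "$conf"
}
```

As root, it could be called as: set_default_tasks_max /etc/systemd/system.conf 8192, followed by systemctl daemon-reload. Afterwards, systemctl show -p DefaultTasksMax should report the value systemd currently uses; checking it is worthwhile rather than assuming the change took effect.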
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7018594
- Creation Date: 07-Feb-2017
- Modified Date: 03-Mar-2020
- SUSE Linux Enterprise High Availability Extension
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com