Pacemaker fails to start after running crm cluster init

This document (000021397) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise High Availability Extension 15 SP5
SUSE Linux Enterprise Server for SAP Applications 15 SP5

Situation

Pacemaker fails to start after running crm cluster init -y:

sles15sp5-node1:~ # crm cluster init -y
INFO: Loading "default" profile from /etc/crm/profiles.yml
INFO: SSH key for root does not exist, hence generate it now
INFO: SSH key for hacluster does not exist, hence generate it now
INFO: Configuring csync2
INFO: Starting csync2.socket service on sles15sp5-node1
INFO: BEGIN csync2 checking files
INFO: END csync2 checking files
INFO: Configuring corosync (unicast)
WARNING: Not configuring SBD - STONITH will be disabled.
INFO: Hawk cluster interface is now running. To see cluster status, open:
INFO:   https://192.168.150.40:7630/
INFO: Log in with username 'hacluster', password 'linux'
WARNING: You should change the hacluster password to something more secure!
INFO: BEGIN Waiting for cluster
............................................................. ERROR: FAIL Waiting for cluster
ERROR: cluster.init: Time out waiting for cluster.

Pacemaker fails to start because it depends on Corosync, which failed to start due to an error writing to /dev/shm/:

sles15sp5-node1:~ # journalctl -u pacemaker -e --no-pager
Mar 11 14:32:39 sles15sp5-node1 systemd[1]: Dependency failed for Pacemaker High Availability Cluster Manager.
Mar 11 14:32:39 sles15sp5-node1 systemd[1]: pacemaker.service: Job pacemaker.service/start failed with result 'dependency'.

sles15sp5-node1:~ # journalctl -u corosync -e --no-pager
...text truncated for brevity...
Mar 11 14:32:38 sles15sp5-node1 corosync[1981]:   [QB    ] couldn't allocate file /dev/shm/qb-1981-2283-19-HKPoVX/qb-request-cfg-data: No space left on device (28)
Mar 11 14:32:38 sles15sp5-node1 corosync[1981]:   [QB    ] couldn't create file for mmap
Mar 11 14:32:38 sles15sp5-node1 corosync[1981]:   [QB    ] qb_rb_open:/dev/shm/qb-1981-2283-19-HKPoVX/qb-request-cfg: No space left on device (28)
Mar 11 14:32:38 sles15sp5-node1 corosync[1981]:   [QB    ] shm connection FAILED: No space left on device (28)
Mar 11 14:32:38 sles15sp5-node1 corosync[1981]:   [QB    ] Error in connection setup (/dev/shm/qb-1981-2283-19-HKPoVX/qb): No space left on device (28)
Mar 11 14:32:38 sles15sp5-node1 corosync[1975]: Starting Corosync Cluster Engine (corosync): [FAILED]
...text truncated for brevity...
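The journal check above can be condensed into a small filter. This is a minimal sketch, not part of the original procedure: the function name has_shm_enospc is hypothetical, and the pattern only assumes that the error lines look like the ones quoted above (a path under /dev/shm/ followed by "No space left on device").

```shell
# Succeed if the log text on stdin contains a "No space left on device"
# error that refers to a path under /dev/shm/.
# The function name is hypothetical (not a crmsh or Corosync tool).
# Usage: <log source> | has_shm_enospc
has_shm_enospc() {
    grep -q '/dev/shm/.*No space left on device'
}

# Example (run as root on the affected node):
#   journalctl -u corosync --no-pager | has_shm_enospc && echo "check /dev/shm"
```

A match only indicates that writes under /dev/shm/ failed for lack of space. It does not by itself show whether the tmpfs is missing or merely full, so the mount check is still needed.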

In this case, the cause is that no tmpfs file-system is mounted at /dev/shm/:

sles15sp5-node1:~ # grep -c /dev/shm /proc/mounts
0
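The grep -c check counts every line that contains the string /dev/shm, including any mount below it. A stricter variant, sketched here under the assumption that /proc/mounts uses the usual "device mountpoint fstype options" field order, matches the mount point and file-system type exactly. The function name is_tmpfs_mounted and its optional second argument are illustrative only.

```shell
# Return 0 if a tmpfs is mounted exactly at the given mount point.
# Function name and arguments are hypothetical helpers for this sketch.
# Usage: is_tmpfs_mounted [mount_point] [mounts_file]
is_tmpfs_mounted() {
    # /proc/mounts fields: device mountpoint fstype options dump pass
    awk -v mp="${1:-/dev/shm}" '
        $2 == mp && $3 == "tmpfs" { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}

if is_tmpfs_mounted /dev/shm; then
    echo "/dev/shm: tmpfs mounted"
else
    echo "/dev/shm: tmpfs NOT mounted"
fi
```

The second argument exists only so the check can be tried against a saved copy of /proc/mounts, for example one collected from another node.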

Resolution

Mount the tmpfs file-system at /dev/shm/, or reboot the server, then re-initialize or restart the cluster:

sles15sp5-node1:~ # mount -o nosuid,nodev,strictatime,mode=1777 -t tmpfs tmpfs /dev/shm
sles15sp5-node1:~ # grep /dev/shm /proc/mounts
tmpfs /dev/shm tmpfs rw,nosuid,nodev,inode64 0 0
sles15sp5-node1:~ # crm cluster init -y
INFO: Loading "default" profile from /etc/crm/profiles.yml
INFO: Configuring csync2
INFO: Starting csync2.socket service on sles15sp5-node1
INFO: BEGIN csync2 checking files
INFO: END csync2 checking files
INFO: Configuring corosync (unicast)
WARNING: Not configuring SBD - STONITH will be disabled.
INFO: Hawk cluster interface is now running. To see cluster status, open:
INFO:   https://192.168.150.40:7630/
INFO: Log in with username 'hacluster'
INFO: BEGIN Waiting for cluster
........... INFO: END Waiting for cluster
INFO: Loading initial cluster configuration
INFO: Done (log saved to /var/log/crmsh/crmsh.log)
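To avoid mounting a second tmpfs over an existing one, the mount step can be guarded by the same check used in the diagnosis. The following is a sketch only: shm_mount_command is a hypothetical helper that prints the mount command from this document when no tmpfs is found at /dev/shm, and prints nothing otherwise. It deliberately does not run the command itself.

```shell
# Print the mount command from this document only when no tmpfs is
# mounted at /dev/shm; print nothing when the mount already exists.
# The function name is a hypothetical helper for this sketch.
# Usage: shm_mount_command [mounts_file]
shm_mount_command() {
    if awk '$2 == "/dev/shm" && $3 == "tmpfs" { found = 1 }
            END { exit (found ? 0 : 1) }' "${1:-/proc/mounts}"; then
        return 0
    fi
    echo "mount -o nosuid,nodev,strictatime,mode=1777 -t tmpfs tmpfs /dev/shm"
}

# Review the output first; to apply it as root, pipe it to a shell:
#   shm_mount_command | sh
shm_mount_command
```

If the cluster had already been initialized before the failure, restarting it with crm cluster start after the mount is likely sufficient, instead of running crm cluster init again.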

Cause

A root user has mistakenly unmounted the tmpfs file-system that is normally mounted at /dev/shm/ by running umount -a or umount /dev/shm. When Corosync starts as a dependency of Pacemaker, it writes files to /dev/shm/. These files fit when /dev/shm/ is a tmpfs file-system, but not when /dev/shm/ is just a directory in the 4 MiB devtmpfs file-system that is mounted at /dev/.
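The fallback described above can be made visible by looking up which mount actually backs /dev/shm. The sketch below picks the longest mount point in /proc/mounts that is a prefix of the path. The function name backing_mount is hypothetical, and the lookup is a simplification that ignores escaped characters in mount point names.

```shell
# Print "<mount point> <fstype>" for the mount that backs a path,
# chosen as the longest mount point that is a prefix of the path.
# The function name is a hypothetical helper for this sketch.
# Usage: backing_mount <path> [mounts_file]
backing_mount() {
    awk -v path="$1" '
        {
            mp = $2
            # A mount point matches if it equals the path, is "/", or is
            # a parent directory of the path.
            if (path == mp || mp == "/" || index(path, mp "/") == 1) {
                # ">=" lets a later mount on the same point take precedence.
                if (length(mp) >= length(best)) { best = mp; fstype = $3 }
            }
        }
        END { if (best != "") print best, fstype }
    ' "${2:-/proc/mounts}"
}

backing_mount /dev/shm
```

On a healthy node this should report /dev/shm with type tmpfs. On a node in the failed state described here, it should report /dev with type devtmpfs, which is the small file-system that Corosync then fills.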

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

Document ID: 000021397
Creation Date: 11-Mar-2024
Modified Date: 11-Mar-2024
- SUSE Linux Enterprise High Availability Extension
- SUSE Linux Enterprise Server for SAP Applications

For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com