One HAE node fails to start at boot with openais showing help screen
This document (7011300) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise High Availability Extension 11 (HAE)
Split Brain Detection (SBD) Partitions
Situation
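One node fails to start the cluster stack at boot; instead of starting, sbd prints its syntax (help) screen. The /etc/sysconfig/sbd files do not match on all nodes in the cluster. The following boot output and configuration details were found: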
<snip>
Starting OpenAIS/Corosync daemon (corosync): Starting SBD - Shared storage fencing tool.
Syntax:
sbd <options> <command> <cmdarguments>
Options:
<snip/>
Node 1 /etc/sysconfig/sbd
SBD_DEVICE="/dev/sdd1;/dev/sdc1;/dev/sdb1"
SBD_OPTS="-W"
Node 2 /etc/sysconfig/sbd
SBD_DEVICE="/dev/sdb1;/dev/sdc1;/dev/sdd1"
SBD_OPTS="-W"
The HAE Cluster Information Base stonith sbd resource configuration:
<primitive class="stonith" id="stonith-sbd" type="external/sbd">
<instance_attributes id="stonith-sbd-instance_attributes">
<nvpair id="stonith-sbd-instance_attributes-sbd_device" name="sbd_device" value="/dev/sdb1;/dev/sdc1;/dev/sdd1"/>
</instance_attributes>
</primitive>
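A mismatch like the one above can be spotted by comparing the SBD_DEVICE values of two copies of the file as plain strings, since the order of the devices counts. The following is a minimal sketch, not part of the original procedure; the function names sbd_device_value and sbd_lists_match are hypothetical, and it assumes the value is double-quoted as in the excerpts above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of sbd or the HAE tooling).

# Print the SBD_DEVICE value from a sysconfig-style file, without the quotes.
# Assumes the value is double-quoted, as in the excerpts in this document.
sbd_device_value() {
    sed -n 's/^SBD_DEVICE="\([^"]*\)".*$/\1/p' "$1"
}

# Compare the device lists of two copies of /etc/sysconfig/sbd.
# A plain string comparison is used on purpose: the same devices in a
# different order count as a mismatch.
sbd_lists_match() {
    a=$(sbd_device_value "$1")
    b=$(sbd_device_value "$2")
    if [ -n "$a" ] && [ "$a" = "$b" ]; then
        echo "match: $a"
    else
        echo "MISMATCH: '$a' vs '$b'"
        return 1
    fi
}
```

To use it, copy each node's /etc/sysconfig/sbd to a scratch location (for example with scp) and run sbd_lists_match on the two copies. With the Node 1 and Node 2 files shown above it reports a mismatch, because the device order differs.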
Resolution
1. Fix the list of devices and their order.
# cat /etc/sysconfig/sbd
SBD_DEVICE="/dev/sdb1;/dev/sdc1;/dev/sdd1"
SBD_OPTS="-W"
2. Confirm that the shared device names listed in /etc/sysconfig/sbd exist with the same device path on all nodes and refer to the same physical media. That is, /dev/sdb1, /dev/sdc1 and /dev/sdd1 must exist on every node, and each path must point to the same device on every node. One device cannot be /dev/sdc1 on node1 and /dev/sdf1 on node2.
3. Confirm that the same /etc/sysconfig/sbd exists on all nodes by copying the file from step 1 to every other node in the cluster.
# scp /etc/sysconfig/sbd hn2:/etc/sysconfig/
4. Reformat the SBD partition on each device listed:
# sbd -d /dev/sdb1 -d /dev/sdc1 -d /dev/sdd1 create
5. Reboot one node in the cluster. When it comes back online, reboot another node. Repeat the process until each node in the cluster has been rebooted.
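Part of step 2 can be scripted. The sketch below, run on each node, reads SBD_DEVICE from the configuration file and reports whether each listed path exists on that node. The function name check_sbd_devices is hypothetical, and the value is assumed to be double-quoted and separated by semicolons as shown above. It checks path existence only; confirming that a path refers to the same physical media on every node still needs a separate comparison, for example of the persistent names under /dev/disk/by-id.

```shell
#!/bin/sh
# Hypothetical helper (not part of sbd or the HAE tooling).

# Print each device named in SBD_DEVICE and whether that path exists on
# this node. Returns 1 if any path is missing, 2 if no list was found.
# The test uses -e so the function can be tried on ordinary files;
# -b would be the stricter check for real block devices.
check_sbd_devices() {
    conf=${1:-/etc/sysconfig/sbd}
    list=$(sed -n 's/^SBD_DEVICE="\([^"]*\)".*$/\1/p' "$conf")
    if [ -z "$list" ]; then
        echo "no SBD_DEVICE found in $conf"
        return 2
    fi
    rc=0
    old_ifs=$IFS
    IFS=';'                     # devices are separated by semicolons
    for dev in $list; do
        if [ -e "$dev" ]; then
            echo "ok      $dev"
        else
            echo "missing $dev"
            rc=1
        fi
    done
    IFS=$old_ifs
    return $rc
}
```

Called without arguments, check_sbd_devices reads /etc/sysconfig/sbd. A "missing" line on any node points to a device path that differs between nodes.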
The corrected configuration files would look like this.
Node 1 /etc/sysconfig/sbd
SBD_DEVICE="/dev/sdb1;/dev/sdc1;/dev/sdd1"
SBD_OPTS="-W"
Node 2 /etc/sysconfig/sbd
SBD_DEVICE="/dev/sdb1;/dev/sdc1;/dev/sdd1"
SBD_OPTS="-W"
The HAE Cluster Information Base stonith sbd resource configuration would look like either of the following:
<primitive class="stonith" id="stonith-sbd" type="external/sbd">
<instance_attributes id="stonith-sbd-instance_attributes">
<nvpair id="stonith-sbd-instance_attributes-sbd_device" name="sbd_device" value="/dev/sdb1;/dev/sdc1;/dev/sdd1"/>
</instance_attributes>
</primitive>
-OR-
<primitive class="stonith" id="stonith-sbd" type="external/sbd" />
Cause
Both the device names and their order are important: SBD_DEVICE must list the same devices in the same order on every node. It is a safe practice to modify /etc/sysconfig/sbd on one node and then always copy it to all other nodes in the cluster. If the stonith sbd resource in the CIB sets the sbd_device parameter, that parameter must contain the same device list. If the sbd_device parameter is missing from the CIB, the cluster uses the devices listed in /etc/sysconfig/sbd.
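The relationship between the CIB and /etc/sysconfig/sbd can be checked with a short script. The sketch below reads CIB XML on standard input, extracts the sbd_device value if one is present, and compares it with SBD_DEVICE from the configuration file. The function names are hypothetical, and the sed pattern assumes that name="sbd_device" is followed by value="..." on the same line, as in the excerpt in this document; other CIB layouts would need a real XML parser.

```shell
#!/bin/sh
# Hypothetical helpers (not part of sbd or the HAE tooling).

# Read CIB XML on stdin and print the value of the sbd_device nvpair, if any.
# Assumes name="sbd_device" precedes value="..." on one line.
cib_sbd_device() {
    sed -n 's/.*name="sbd_device"[[:space:]]*value="\([^"]*\)".*/\1/p'
}

# Compare the CIB (XML on stdin) with a sysconfig file (first argument,
# default /etc/sysconfig/sbd). Order counts, so strings are compared as-is.
check_cib_against_sysconfig() {
    conf=${1:-/etc/sysconfig/sbd}
    cib=$(cib_sbd_device)
    sys=$(sed -n 's/^SBD_DEVICE="\([^"]*\)".*$/\1/p' "$conf")
    if [ -z "$cib" ]; then
        echo "CIB has no sbd_device; $conf applies: $sys"
    elif [ "$cib" = "$sys" ]; then
        echo "match: $sys"
    else
        echo "MISMATCH: CIB '$cib' vs $conf '$sys'"
        return 1
    fi
}
```

On a running cluster node the XML could be supplied with cibadmin -Q, for example: cibadmin -Q | check_cib_against_sysconfig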
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7011300
- Creation Date: 02-Nov-2012
- Modified Date: 12-Oct-2022
- SUSE Linux Enterprise High Availability Extension
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com