Preventing a Fence Race in Split Brain (COROSYNC,PACEMAKER)
This document (7022467) is provided subject to the disclaimer at the end of this document.
Environment
Situation
Node1 sees that Node2 is gone and fences it:
Nov 20 15:17:40 [117052] node1 cib: notice: crm_update_peer_state_iter: Node node2 state is now lost | nodeid=168364360 previous=member source=crm_update_peer_proc
Nov 20 15:17:41 [117056] node1 pengine: warning: pe_fence_node: Node node2 will be fenced because the node is no longer part of the cluster
Node2 sees that Node1 is gone and fences it at the same time:
Nov 20 15:17:40 [16727] node2 cib: notice: crm_update_peer_state_iter: Node node1 state is now lost | nodeid=168364359 previous=member source=crm_update_peer_proc
Nov 20 15:17:41 [16731] node2 pengine: warning: stage6: Scheduling Node node1 for STONITH
As a result, both nodes fence each other. Data integrity is maintained, but all services are lost completely.
Resolution
Add the following parameter to the fencing (STONITH) resource:
pcmk_delay_max=<Seconds>
For an IPMI device, the resource definition could look like this:
primitive brie_stonith_ducal stonith:external/ipmi \
    params pcmk_delay_max=20 hostname=ducal ipaddr=10.162.192.209 userid=admin passwd=xxxx interface=lanplus \
    op monitor interval=1800 timeout=20
This makes it more likely that one fencing device acts with a delay, so that one node completes its fencing action before the other starts. Which node fences which is irrelevant at that moment, because a cluster without quorum has no way to determine the right node to fence.
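The "more likely" above can be illustrated with a small simulation. This is only a rough sketch under stated assumptions, not a description of Pacemaker's internals. It assumes that both fencing devices use the same pcmk_delay_max, that each delay is drawn uniformly between 0 and that maximum, and that a fence action needs a fixed time to take effect. The function name race_avoided_fraction and the parameter fence_duration are hypothetical and exist only for this illustration.

```python
import random


def race_avoided_fraction(delay_max, fence_duration, trials=100_000, seed=0):
    """Estimate how often two random fencing delays are far enough apart.

    Assumption (not taken from Pacemaker's source): each node draws its
    delay uniformly from [0, delay_max], and the fence race is avoided
    when the two delays differ by at least fence_duration seconds.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    avoided = 0
    for _ in range(trials):
        delay_node1 = rng.uniform(0, delay_max)
        delay_node2 = rng.uniform(0, delay_max)
        # The node with the shorter delay fences first; it wins cleanly
        # only if its action finishes before the other delay expires.
        if abs(delay_node1 - delay_node2) >= fence_duration:
            avoided += 1
    return avoided / trials


if __name__ == "__main__":
    # pcmk_delay_max=20 as in the example above; the durations are made up.
    for fence_duration in (1, 2, 5):
        fraction = race_avoided_fraction(20, fence_duration)
        print(f"fence takes {fence_duration}s: race avoided in about {fraction:.0%} of trials")
```

Under these assumptions, a larger pcmk_delay_max relative to the time a fence action takes raises the chance that only one node is fenced, but a random delay alone never reduces the chance of a race to zero.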
Alternatively, a static (non-random) delay can be set on the fencing resource with:
pcmk_delay_base=<Seconds>
Additional Information
params pcmk_delay_base=0 ...
params pcmk_delay_base=36 ...
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7022467
- Creation Date: 18-Dec-2017
- Modified Date: 23-Feb-2021
- SUSE Linux Enterprise High Availability Extension
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com