Disk latency may cause unwanted node fencing
This document (7011350) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise High Availability Extension 12
SUSE Linux Enterprise High Availability Extension 11
Situation
sbd: [18584]: WARN: Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)
sbd: [18584]: WARN: Latency: No liveness for 5 s exceeds threshold of 3 s (healthy servants: 0)
sbd: [18584]: WARN: Latency: No liveness for 6 s exceeds threshold of 3 s (healthy servants: 0)
sbd: [18585]: WARN: Latency: 6 exceeded threshold 3 on disk /dev/disk/by-id/dm-uuid-mpath-3600508b40007015738922001340000
The sbd partition metadata shows the following:
# /usr/sbin/sbd -d /dev/sdb1 dump
==Dumping header on disk /dev/sdb1
Header version : 2
Number of slots : 255
Sector size : 512
Timeout (watchdog) : 5
Timeout (allocate) : 2
Timeout (loop) : 1
Timeout (msgwait) : 10
==Header on disk /dev/sdb1 is dumped
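When comparing the logged latency against the configured timeouts, it can help to pull the individual values out of the dump. The following is a minimal sketch that parses a copy of the dump text shown above. The helper name dump_timeout is hypothetical and not part of sbd, and the sample text is embedded in the script so that nothing is read from a real device.

```shell
# Sample lines copied from the dump output shown above.
dump='Timeout (watchdog) : 5
Timeout (allocate) : 2
Timeout (loop) : 1
Timeout (msgwait) : 10'

# Hypothetical helper: print the value of one named timeout.
#   $1 = timeout name (watchdog, allocate, loop, msgwait)
#   $2 = dump text
dump_timeout() {
    printf '%s\n' "$2" |
        awk -F' *: *' -v key="Timeout ($1)" '$1 == key { print $2 }'
}

watchdog=$(dump_timeout watchdog "$dump")
msgwait=$(dump_timeout msgwait "$dump")
echo "watchdog=${watchdog}s msgwait=${msgwait}s"
```

On a live system the same helper could presumably be fed the output of the dump command instead of the embedded sample.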
Resolution
hn1:~ # cat /etc/sysconfig/sbd
SBD_DEVICE="/dev/sdb1;/dev/sdc1;/dev/sdd1"
SBD_OPTS="-W"
hn1:~ # sbd -1 10 -4 20 -d /dev/sdb1 -d /dev/sdc1 -d /dev/sdd1 create
Initializing device /dev/sdb1
Creating version 2 header on device 3
Initializing 255 slots on device 3
Device /dev/sdb1 is initialized.
Initializing device /dev/sdc1
Creating version 2 header on device 3
Initializing 255 slots on device 3
Device /dev/sdc1 is initialized.
Initializing device /dev/sdd1
Creating version 2 header on device 3
Initializing 255 slots on device 3
Device /dev/sdd1 is initialized.
hn1:~ # sbd -d /dev/sdb1 dump
==Dumping header on disk /dev/sdb1
Header version : 2
Number of slots : 255
Sector size : 512
Timeout (watchdog) : 10
Timeout (allocate) : 2
Timeout (loop) : 1
Timeout (msgwait) : 20
==Header on disk /dev/sdb1 is dumped
Cause
You can increase the watchdog timeout to 10 or even 20 seconds. The timeouts are configured at creation time, so you need to recreate the sbd device to change them. When you do, adjust the msgwait timeout at the same time to approximately twice the watchdog timeout.
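The relationship between the two timeouts can be sketched as a small script. It derives msgwait as exactly twice the watchdog timeout (the text only says "approximately twice") and prints the matching create command with the -1 and -4 options used in the Resolution section. The helper names msgwait_for and sbd_create_cmd are hypothetical. The command is printed rather than executed, because create reinitializes the device.

```shell
# Hypothetical helper: derive msgwait as twice the watchdog timeout.
msgwait_for() {
    echo $(( $1 * 2 ))
}

# Hypothetical helper: print (do not run) an sbd create command.
#   $1   = watchdog timeout in seconds
#   $2.. = sbd devices
sbd_create_cmd() {
    watchdog=$1
    shift
    cmd="sbd -1 $watchdog -4 $(msgwait_for "$watchdog")"
    for dev in "$@"; do
        cmd="$cmd -d $dev"
    done
    echo "$cmd create"
}

sbd_create_cmd 10 /dev/sdb1 /dev/sdc1 /dev/sdd1
```

With a watchdog timeout of 10 and the three devices from /etc/sysconfig/sbd, the printed command is the one shown in the Resolution section.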
You can decrease the latency impact by adding SBD partitions. For example, if you have three SBD partitions, at least two of those devices would need to exceed the latency threshold before a self-fence would occur.
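The three-partition example can be expressed as simple arithmetic. This is a rough sketch that assumes a plain majority rule, which matches the three-device case described above. sbd may handle other device counts differently, so results for them are untested assumptions. The helper names count_devices and failures_to_self_fence are hypothetical.

```shell
# Hypothetical helper: count the ';'-separated entries in an
# SBD_DEVICE-style string.
count_devices() {
    old_ifs=$IFS
    IFS=';'
    set -- $1
    IFS=$old_ifs
    echo "$#"
}

# Hypothetical helper, assuming a plain majority rule: a node self-fences
# once it no longer sees more than half of its devices, which happens
# after ceil(n/2) devices exceed the latency threshold.
failures_to_self_fence() {
    echo $(( ($1 + 1) / 2 ))
}

n=$(count_devices "/dev/sdb1;/dev/sdc1;/dev/sdd1")
echo "devices=$n failures_to_self_fence=$(failures_to_self_fence "$n")"
```

For the three devices configured in SBD_DEVICE this gives 2, in line with the statement that at least two devices must exceed the threshold. For a single device it gives 1, so one slow disk is enough to trigger a self-fence.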
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7011350
- Creation Date: 12-Nov-2012
- Modified Date: 24-Aug-2022
- SUSE Linux Enterprise High Availability Extension
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com