Stonith unreliable (HEARTBEAT, COROSYNC)

This document (7007731) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 11
SUSE Linux Enterprise Server 11 Service Pack 1

Situation

The STONITH device in the cluster does not work reliably. Because the fencing mechanism fails, the cluster fails as well.
Example entries in /var/log/messages:
Jan 28 10:25:44 nlpihafsrv01 stonith-ng: [8071]: info: stonith_query: Found 0 matching devices for 'wlpihafsrv04'
Jan 28 10:25:44 wlpihafsrv03 stonith-ng: [7906]: info: stonith_query: Found 0 matching devices for 'wlpihafsrv04'
This happens even though the STONITH device was tested beforehand and all settings were checked.
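
To check whether a node logs the same symptom, the messages file can be searched for stonith-ng queries that found no device. The following is a minimal sketch, assuming each log entry sits on a single line in the file; the function name count_stonith_misses is hypothetical and only used for illustration.

```shell
#!/bin/sh
# Count stonith-ng queries that found no fencing device.
# Usage: count_stonith_misses LOGFILE [NODE]
#   e.g. count_stonith_misses /var/log/messages wlpihafsrv04
count_stonith_misses() {
    logfile=$1
    node=${2:-}
    if [ -n "$node" ]; then
        # -F treats the pattern as a fixed string, -c prints the match count
        grep -cF "stonith_query: Found 0 matching devices for '$node'" "$logfile"
    else
        grep -cF "stonith_query: Found 0 matching devices for" "$logfile"
    fi
}
```

A non-zero count for a node that has a tested STONITH device suggests the situation described here.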

Resolution

The stonith-timeout parameter is set too low, probably to around 10 seconds. Set it to

stonith-timeout="60s"

With the longer timeout, the STONITH operation succeeds and the cluster remains stable.
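
One way to apply this value, assuming the cluster is managed with the crm shell, is to set it as a cluster-wide property. This is a sketch rather than a verified procedure for every setup; run it as root on one cluster node.

```shell
# Show the current setting, if one is configured explicitly
crm configure show | grep stonith-timeout

# Set the cluster-wide fencing timeout to 60 seconds
crm configure property stonith-timeout="60s"
```

If the first command prints nothing, no explicit stonith-timeout value is present in the configuration.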

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7007731
  • Creation Date: 31-Jan-2011
  • Modified Date: 03-Mar-2020
    • SUSE Linux Enterprise Server
