SUSE Support

Cluster node fence as SAPHanaTopology fails with error code 1 (OCF_ERR_GENERIC) during a normal cluster stop

This document (000020964) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server for SAP Applications 12
SUSE Linux Enterprise Server for SAP Applications 15

Situation

An SAP HANA cluster node is fenced during a normal cluster stop (# crm cluster stop) because the stop operation of the SAPHanaTopology resource fails with exit code 1 (OCF_ERR_GENERIC). The cluster logs show:
 

#/var/log/messages

hana02 pacemakerd[15396]:  notice: Caught 'Terminated' signal
hana02 pacemaker-execd[15399]:  notice: rsc_SAPHanaTopology_HA1_HDB00 stop (call 83, PID 29963) exited with status 1 (execution time 2740ms, queue time 0ms)
hana02 pacemaker-controld[15402]:  notice: Result of stop operation for rsc_SAPHanaTopology_HA1_HDB00 on hana02: error
hana01 pacemaker-schedulerd[13601]:  warning: Scheduling Node hana02 for STONITH
hana01 pacemaker-schedulerd[13601]:  notice: Stop of failed resource rsc_SAPHanaTopology_HA1_HDB00:1 is implicit after hana02 is fenced


#/var/log/pacemaker/pacemaker.log

hana02 pacemaker-execd     [15399] (log_finished)  notice: rsc_SAPHanaTopology_HA1_HDB00 stop (call 83, PID 29963) exited with status 1 (execution time 2740ms, queue time 0ms)
hana02 pacemaker-controld  [15402] (process_lrm_event)  notice: Result of stop operation for rsc_SAPHanaTopology_HA1_HDB00 on hana02: error | rc=1 call=83 key=rsc_SAPHanaTopology_HA1_HDB00_stop_0 confirmed=true cib-update=96
hana01 pacemaker-schedulerd[13601] (unpack_rsc_op_failure) warning: Unexpected result (error) was recorded for stop of rsc_SAPHanaTopology_HA1_HDB00:1 | rc=1 id=rsc_SAPHanaTopology_HA1_HDB00_last_0
hana01 pacemaker-schedulerd[13601] (pe_fence_node) warning: Cluster node hana02 will be fenced: rsc_SAPHanaTopology_HA1_HDB00:1 failed there
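
To check whether a node hit this signature, the log lines above can be searched for. The following is a minimal sketch, not a SUSE tool: the function name find_topology_stop_failures is hypothetical, and the pattern assumes the resource name starts with rsc_SAPHanaTopology_ as in the excerpts above.

```shell
# Sketch: print pacemaker-execd log lines where a SAPHanaTopology stop
# operation exited with status 1.
# find_topology_stop_failures is a hypothetical helper name.
find_topology_stop_failures() {
    # $1: log file to scan, for example /var/log/messages
    # The trailing "( |$)" keeps "status 1" from matching "status 10" and so on.
    grep -E 'pacemaker-execd.*rsc_SAPHanaTopology_[^ ]* stop .*exited with status 1( |$)' "$1"
}
```

For example, find_topology_stop_failures /var/log/messages or find_topology_stop_failures /var/log/pacemaker/pacemaker.log. Adjust the pattern if the resource uses a different naming scheme.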

Resolution

A maintenance update containing the fix has been released. Update SAPHanaSR to the latest version, or ensure that the installed SAPHanaSR version is at least the following:
 
SUSE version                   SAPHanaSR version
SLES12 (SP4, SP5)              SAPHanaSR-0.162.1-3.29.1
SLES15 (SP1, SP2, SP3, SP4)    SAPHanaSR-0.162.1-150000.4.31.1
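
The installed package version can be compared against the fixed version from the table above. This is a minimal sketch under stated assumptions: version_at_least is a hypothetical helper, it relies on GNU sort -V (which approximates but is not identical to RPM version ordering), and FIXED must be set by hand to the row that matches the installed SLES release.

```shell
# Sketch: succeed when version-release $1 is greater than or equal to $2.
# version_at_least is a hypothetical helper; it uses GNU `sort -V`,
# which only approximates RPM's own version comparison.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Run the check only where rpm exists and SAPHanaSR is installed.
if command -v rpm >/dev/null 2>&1 && rpm -q SAPHanaSR >/dev/null 2>&1; then
    installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' SAPHanaSR)
    # Assumption: a SLES15 system. Use 0.162.1-3.29.1 on SLES12.
    FIXED="0.162.1-150000.4.31.1"
    if version_at_least "$installed" "$FIXED"; then
        echo "SAPHanaSR $installed contains the fix"
    else
        echo "SAPHanaSR $installed is older than $FIXED, update needed"
    fi
fi
```

If the check reports that an update is needed, the package can typically be updated with zypper update SAPHanaSR, following the usual cluster maintenance procedures.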

Cause

This regression was introduced in the following releases of the SAPHanaSR package:
 
SUSE version                   SAPHanaSR version
SLES12 (SP4, SP5)              SAPHanaSR-0.162.0-3.26.1
SLES15 (SP1, SP2, SP3, SP4)    SAPHanaSR-0.162.0-150000.4.28.1

Status

Reported to Engineering

Additional Information

OCF_ERR_GENERIC (1) is a generic error. A resource agent uses this exit code only when none of the more specific error codes describes the problem. The cluster resource manager interprets this exit code as a soft error: unless specifically configured otherwise, it attempts to recover a resource that failed with OCF_ERR_GENERIC in place, usually by restarting the resource on the same node. However, a failed stop operation defaults to on-fail=fence when STONITH is enabled (and to on-fail=block otherwise), so the failed stop leads to the cluster node being fenced.
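
The default behavior described above can be summarized as a small decision function. This is an illustrative sketch only, not Pacemaker code: default_on_fail is a hypothetical name, and it models just the defaults mentioned in this section (an explicit on-fail setting on the operation would override them).

```shell
# Sketch of the default on-fail decision described above.
# default_on_fail is a hypothetical helper for illustration only.
default_on_fail() {
    # $1: operation name (start, stop, monitor, ...)
    # $2: "true" when STONITH is enabled, anything else otherwise
    if [ "$1" = "stop" ]; then
        if [ "$2" = "true" ]; then
            echo fence    # failed stop with STONITH: the node is fenced
        else
            echo block    # failed stop without STONITH: the resource is blocked
        fi
    else
        echo restart      # other soft errors: recover in place
    fi
}
```

In the situation covered by this document, the operation is stop and STONITH is enabled, so the function prints fence, which matches the fencing of hana02 seen in the logs.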

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000020964
  • Creation Date: 04-Feb-2023
  • Modified Date: 06-Feb-2023
    • SUSE Linux Enterprise Server for SAP Applications
