SAPHanaController monitor timeout leads to database restart

This document (000021249) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise for SAP Applications 12
SUSE Linux Enterprise for SAP Applications 15
 

Situation


In certain situations on an SAP HANA scale-out system replication cluster, the
resource agent's monitor call to "landscapeHostConfiguration.py" times out
twice in a row on one of the nodes. This results in a local restart of the
SAP HANA database at that site.

The resource agent logs the monitor timeouts on the affected cluster node, similar to:

... SAPHanaController ... RA ==== begin action monitor_clone (0.180.0.0628.1823) ====
... SAPHanaController ... RA: HANA_CALL TIMEOUT after 120 seconds running command 'landscapeHostConfiguration.py --sapcontrol=1'
... SAPHanaController ... RA: landscapeHostConfiguration.py second TIMEOUT after 120 seconds
... SAPHanaController ... RA ==== end action monitor_clone with rc=1 (0.180.0.0628.1823) (265s)====

Pacemaker messages are logged on that node similar to:

... Processing failed monitor of rsc_SAPHanaCon_P42_HDB02:0 on hana85: unknown error
... Initiating demote operation rsc_SAPHanaCon_P42_HDB02_demote_0 on hana85
... Initiating stop operation rsc_SAPHanaCon_P42_HDB02_stop_0 on hana85
... Initiating start operation rsc_SAPHanaCon_P42_HDB02_start_0 on hana85

Pacemaker actions are logged on the designated coordinator similar to:

... Setting hana_p42_clone_state[hana85]: PROMOTED -> DEMOTED from hana85
... Setting hana_p42_clone_state[hana85]: DEMOTED -> UNDEFINED from hana85
... Setting hana_p42_clone_state[hana85]: UNDEFINED -> DEMOTED from hana85
... Setting hana_p42_clone_state[hana85]: DEMOTED -> PROMOTED from hana85
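
To check whether a node has hit this situation, the resource agent messages shown above can be searched for in the system log. The following is a minimal sketch, not a SUSE-provided tool. It assumes the log wording matches the examples in this document, which may differ between resource agent versions. The function name find_double_timeouts and the SAMPLE_LOG list are hypothetical and exist only for illustration.

```python
import re

# Patterns copied from the example log lines in this document. The exact
# wording may differ between resource agent versions (assumption).
FIRST_TIMEOUT = re.compile(
    r"HANA_CALL TIMEOUT after (\d+) seconds running command "
    r"'landscapeHostConfiguration\.py"
)
SECOND_TIMEOUT = re.compile(
    r"landscapeHostConfiguration\.py second TIMEOUT after (\d+) seconds"
)

# The four resource agent lines from the Situation section, used as sample input.
SAMPLE_LOG = [
    "... SAPHanaController ... RA ==== begin action monitor_clone (0.180.0.0628.1823) ====",
    "... SAPHanaController ... RA: HANA_CALL TIMEOUT after 120 seconds running command 'landscapeHostConfiguration.py --sapcontrol=1'",
    "... SAPHanaController ... RA: landscapeHostConfiguration.py second TIMEOUT after 120 seconds",
    "... SAPHanaController ... RA ==== end action monitor_clone with rc=1 (0.180.0.0628.1823) (265s)====",
]


def find_double_timeouts(lines):
    """Return (line_number, seconds) for every 'second TIMEOUT' message that
    follows a first timeout within the same monitor action."""
    hits = []
    first_seen = False
    for number, line in enumerate(lines, start=1):
        if "SAPHanaController" not in line:
            continue  # ignore messages from other components
        if "begin action monitor_clone" in line:
            first_seen = False  # a new monitor action starts
        elif FIRST_TIMEOUT.search(line):
            first_seen = True
        else:
            match = SECOND_TIMEOUT.search(line)
            if match and first_seen:
                hits.append((number, int(match.group(1))))
                first_seen = False
    return hits


print(find_double_timeouts(SAMPLE_LOG))
```

To use it on a real system, pass the lines of the relevant log file instead of SAMPLE_LOG. Where that log lives depends on the local logging configuration.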

Resolution

Update the package SAPHanaSR-ScaleOut to version 0.185 or newer.
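
Whether a node already runs a fixed package can be checked by comparing the installed version with 0.185. The sketch below is one possible way to do that. It assumes the package version consists of plain dotted numbers and that the rpm command is available on the node. The helper names version_tuple, is_fixed and installed_version are hypothetical.

```python
import subprocess

REQUIRED = "0.185"  # minimum version named in the Resolution section


def version_tuple(version):
    """Turn a dotted version such as '0.185.1' into (0, 185, 1).
    Assumes purely numeric components; non-numeric parts are skipped."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def is_fixed(installed, required=REQUIRED):
    """True if the installed version is at least the required version."""
    return version_tuple(installed) >= version_tuple(required)


def installed_version(package="SAPHanaSR-ScaleOut"):
    """Ask rpm for the installed version of the package.
    Returns None if rpm reports that the package is not installed."""
    result = subprocess.run(
        ["rpm", "-q", "--queryformat", "%{VERSION}", package],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None
```

On a cluster node, is_fixed(installed_version()) would then indicate whether the update is still needed, provided installed_version() did not return None. Run the check on every node, since the package is installed per node.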

Cause

The issue is caused by the HANA tools that older resource agent versions use
to call landscapeHostConfiguration.py. Those tools depend heavily on
infrastructure such as NFS and directory services. Newer versions of the
resource agents (RAs) SAPHanaController and SAPHanaTopology call
landscapeHostConfiguration.py directly. As a result, temporary infrastructure
problems have less impact on the HA cluster's monitor calls.

Note: If the HANA database itself is affected, and not only the tools,
the stop operation will fail. In that case the node is fenced and
a takeover is eventually triggered.


Status

Reported to Engineering

Additional Information

See also:

Manual pages ocf_suse_SAPHanaController(7), ocf_suse_SAPHanaTopology(7)

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 000021249
  • Creation Date: 26-Oct-2023
  • Modified Date: 10-Jan-2024
    • SUSE Linux Enterprise Server for SAP Applications
