Handling failed NFS share in SUSE HA cluster for HANA system replication

This document (000019904) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server for SAP Applications 15
SUSE Linux Enterprise Server for SAP Applications 12

Situation

For HANA scale-out systems, the directory /hana/shared/<SID>/ is provided as an NFS share to all nodes of a site. The directory contains binaries, tools, and other components needed for running and monitoring the HANA database.

In case of NFS failure, HANA might stop working but the Linux cluster might not take action in a reasonable time.
Because NFS is obligatory for the directory /hana/shared/<SID>/ on scale-out systems, they are affected more often than scale-up systems.

The Linux cluster could manage the filesystems as cluster resources. However, putting HANA filesystems under cluster control would massively increase complexity: dependencies between HANA instances, nodes, and filesystems would need to be implemented by cluster constraints.

Resolution

The NFS share for /hana/shared/<SID>/ is mounted by the OS as usual.
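Before adding any cluster resource, it can help to confirm that the directory really is backed by NFS on each node. The following is a minimal sketch, not part of any SUSE tooling; the function name fs_type_of is hypothetical, and it assumes stat(1) from GNU coreutils is available.

```shell
#!/bin/sh
# Hypothetical pre-check helper (assumption, not shipped with the cluster stack):
# report the type of the filesystem that backs a directory.
# Usage: fs_type_of <directory>
fs_type_of() {
    # Fail early if the directory does not exist on this node.
    [ -d "$1" ] || { echo "missing"; return 1; }
    # stat -f queries the filesystem instead of the file;
    # %T prints the filesystem type in human-readable form.
    stat -f -c %T "$1"
}
```

Called as fs_type_of /hana/shared/<SID>/ with the real SID, the function should report an NFS type on every HANA node of the site, and "missing" where the share is not mounted and the directory does not exist.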

A dummy filesystem resource /hana/shared/<SID>/check/ is added to the cluster. If this filesystem reports monitor failures, the node gets fenced and a take-over is initiated. The monitor and action timeouts can be chosen shorter than the HANA timeouts to get faster fail-over actions. By letting the Linux cluster fence a node immediately on filesystem monitor failure, the take-over time can be decreased even further. Regular start and stop of HANA is not affected by this dummy filesystem resource.
Since a bind-mounted dummy filesystem is used as the cluster resource, the NFS shares of both sites can be monitored by the same common clone resource.

Shown below is an example of a clone resource for a scale-out HANA's /hana/shared/<SID>/check/ directory. Details like mount point and NFS options depend on the particular environment, and timeouts are always subject to tuning. The example assumes SID "ADA" and instance number "00". A dummy directory "check" is used for the bind mount, and the majority maker node is "vm-majority".
Mandatory parameters are the mount option "bind", the monitor on-fail action "fence", and OCF_CHECK_LEVEL "20".

---
primitive rsc_fs_check_ADA_HDB00 Filesystem \
  params device="/hana/shared/ADA/check/" \
  directory="/hana/shared/check/" fstype=nfs4 \
  options="bind,defaults,rw,hard,proto=tcp,intr,noatime,vers=4,lock" \
  op monitor interval=120 timeout=120 on-fail=fence \
  op_params OCF_CHECK_LEVEL=20 \
  op start interval=0 timeout=120 \
  op stop interval=0 timeout=120

clone cln_fs_check_ADA_HDB00 rsc_fs_check_ADA_HDB00 \
  meta clone-node-max=1 interleave=true

location fs_check_not_on_majority_maker \
  cln_fs_check_ADA_HDB00 -inf: vm-majority
---
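According to ocf_heartbeat_Filesystem(7), a monitor with OCF_CHECK_LEVEL 20 tests whether a status file can be written and read on the filesystem. To get a feeling for how such a probe behaves on a given share before tuning the monitor timeout, a similar check can be run by hand. The following is a minimal sketch only, not the resource agent's actual code; the function name fs_rw_probe, the probe file name, and the default limit of 10 seconds are assumptions, and it relies on timeout(1) from GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not the Filesystem RA itself):
# write a small probe file into a directory and read it back,
# giving up after a time limit.
# Usage: fs_rw_probe <directory> [timeout_seconds]
fs_rw_probe() {
    dir="$1"
    limit="${2:-10}"            # assumed default, tune as needed
    probe="$dir/.rw_probe.$$"   # hypothetical probe file name
    token="probe-$$"
    # Run write and read-back in a child shell so that timeout(1) can
    # end it with SIGKILL if the filesystem does not answer in time.
    if timeout -s KILL "$limit" sh -c \
        'printf "%s\n" "$2" > "$1" && [ "$(cat "$1")" = "$2" ]' \
        sh "$probe" "$token" 2>/dev/null
    then
        rm -f "$probe"
        echo "OK"
        return 0
    fi
    # No cleanup here: touching a hanging share again could block.
    echo "FAILED"
    return 1
}
```

For the example above, fs_rw_probe /hana/shared/ADA/check 10 prints "OK" while the share is writable and "FAILED" if the write or read-back fails or does not finish within the limit. The time such a probe needs under normal load gives a lower bound for a sensible monitor timeout.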

Possible side effects:

1. The filesystem monitor is not related to HANA system replication. Thus, on NFS failure the Linux cluster might decide to fence a HANA primary even if the system replication is not in sync. Due to the resulting srHook=SFAIL state, the HANA secondary will not get promoted to primary. Although such a useless fence could also be triggered by a HANA monitor failure, it is more likely to happen with the intentionally shorter NFS monitor timeouts.

2. In some environments NFS is used for /hana/data/<SID>/ and /hana/log/<SID>/ as well as for /hana/shared/<SID>/. In such cases all shares are usually provided by the same NFS server via the same network. If that server or network fails, all shares are affected. Thus the aforementioned dummy resource will cover failures of all three HANA filesystems.

Cause

The SAPHana(Controller) RA checks HANA and SR status only, but not any infrastructure. Infrastructure outages are not acted upon as long as HANA does not report them. Field experience shows that in several cases HANA does not report a failure but just stops working (for example, with a stuck archiver) for a long time. Depending on the configuration, it might take hours before a take-over is initiated by the Linux cluster.

Status

Top Issue

Additional Information

ocf_heartbeat_Filesystem(7)
mount(8)
https://documentation.suse.com/sbp/all/single-html/SLES4SAP-hana-sr-guide-PerfOpt-15/
https://documentation.suse.com/sbp/all/single-html/SLES4SAP-hana-sr-guide-PerfOpt-12/

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.