SAPHanaSR HANA system replication automation without layer 2 network
This document (000020333) is provided subject to the disclaimer at the end of this document.
Environment
Situation
The HANA's virtual IP address and virtual hostname are managed by the Domain Name Service (DNS). The respective DNS infrastructure allows Dynamic DNS updates with a short time-to-live (TTL).
Resolution
The virtual hostname mapping switches between two virtual IP addresses, one for each site. A virtual IP address is set only if the HANA on that site is a functional primary database. Once the HANA primary is stopped, the virtual IP address is removed.
Obviously, this solution depends on the DNS infrastructure and on well-behaved clients. In any case, expect a DNS update to be many times slower than an ARP update. This configuration is therefore intended to address the lack of a common layer 2 network in a specific environment. It is not meant as a general concept for metro clusters.
For the outlined solution, an example HA cluster CIB resource configuration is shown below. It covers the dnsupdate and IPAddr2 resources and their constraints, and it replaces the usual configuration for IPAddr2 and similar resources. The SAPHana, SAPHanaTopology and stonith_sbd resources are not shown; they are configured as usual.
---
#
# dnsupdate and IPAddr2 at site_1
#
primitive rsc_ip_HNA_vm1 IPAddr2 \
 params ip=10.0.1.121 cidr_netmask=32 \
 op monitor interval=10 timeout=20
#
primitive rsc_dnsupdate_HNA_vm1 dnsupdate \
 params hostname=hna.my.site ip=10.0.1.121 ttl=60 \
 keyfile="/etc/ddns/update.key" server=10.0.0.1 serverport=53 \
 unregister_on_stop=true \
 op monitor timeout=30 interval=20 \
 op_params depth=0
#
group grp_ip_HNA_vm1 \
 rsc_dnsupdate_HNA_vm1 rsc_ip_HNA_vm1 \
 meta resource-stickiness=1
#
location loc_ip_on_master_HNA_vm1 grp_ip_HNA_vm1 \
 rule -inf: hana_hna_roles ne 4:P:master1:master:worker:master
#
location loc_ip_not_on_vm2 \
 grp_ip_HNA_vm1 -inf: vm2
#
# dnsupdate and IPAddr2 at site_2
#
primitive rsc_ip_HNA_vm2 IPAddr2 \
 params ip=10.0.2.122 cidr_netmask=32 \
 op monitor interval=10 timeout=20
#
primitive rsc_dnsupdate_HNA_vm2 dnsupdate \
 params hostname=hna.my.site ip=10.0.2.122 ttl=60 \
 keyfile="/etc/ddns/update.key" server=10.0.0.1 serverport=53 \
 unregister_on_stop=true \
 op monitor timeout=30 interval=20 \
 op_params depth=0
#
group grp_ip_HNA_vm2 \
 rsc_dnsupdate_HNA_vm2 rsc_ip_HNA_vm2 \
 meta resource-stickiness=1
#
location loc_ip_on_master_HNA_vm2 grp_ip_HNA_vm2 \
 rule -inf: hana_hna_roles ne 4:P:master1:master:worker:master
#
location loc_ip_not_on_vm1 \
 grp_ip_HNA_vm2 -inf: vm1
#
---
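To make the dnsupdate parameters more tangible, the following sketch prints the kind of nsupdate(8) batch that would register the site_1 address by hand. It is an illustration only: the function name build_nsupdate_batch is hypothetical, and the batch is not claimed to be identical to what the dnsupdate resource agent sends.

```shell
#!/bin/sh
# Hypothetical helper: print an nsupdate(8) batch that replaces the A record
# of a virtual hostname. Nothing is sent; the batch only goes to stdout.
# Arguments: <dns-server> <port> <hostname> <ttl> <ip>
build_nsupdate_batch() {
    server=$1 port=$2 host=$3 ttl=$4 ip=$5
    printf 'server %s %s\n' "$server" "$port"
    # Drop any existing A record first, so that only one site is registered.
    printf 'update delete %s A\n' "$host"
    printf 'update add %s %s A %s\n' "$host" "$ttl" "$ip"
    printf 'send\n'
}

# Values taken from the site_1 resources in the configuration example.
build_nsupdate_batch 10.0.0.1 53 hna.my.site 60 10.0.1.121
```

On a test system, the output could be piped into "nsupdate -k /etc/ddns/update.key" to check that the DNS server accepts signed updates for the record, assuming the key file from the example is in place.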
Requirements and Limitations
- The solution depends on a DNS server outside the cluster. This server needs to allow Dynamic DNS changes for host entries. It also needs to allow a short time-to-live (TTL). See the manual pages ocf_heartbeat_dnsupdate(7) and nsupdate(8) for the required features.
- Allowing Dynamic DNS updates from HA cluster nodes needs to comply with the relevant security standards.
- The DNS TTL needs to be aligned with the expected recovery time objective (RTO). That means a TTL of around 30-60 seconds, which might increase the load on the DNS server. Even a TTL of 30 seconds is many times longer than the usual ARP update done by IPAddr2.
- Applications might ignore the TTL and cache resolved hostnames. In that case, applications might get stuck in TCP retransmissions (tcp_retries2).
- It is necessary to carefully test whether the configuration works in a given environment. In particular, the behaviour of the DNS infrastructure and of the clients needs to be observed.
- A failed stop action of the dnsupdate resource will cause a node fence.
- The solution is meant for the SAP HANA system replication scale-up performance-optimized scenario. Due to its complexity, the active/active read-enabled scenario is not targeted.
- Any administrative takeover of the HANA primary database should follow a documented procedure. This procedure needs to rule out the risk of a duplicate HANA primary. It should further ensure that all clients follow the takeover.
- Due to these requirements and limitations, the solution is not a direct replacement for existing HA concepts that are based on moving a single IP address. The solution based on Dynamic DNS is merely an automated disaster recovery (DR) solution for specific environments.
- Please contact SUSE services or your responsible support provider before implementing this solution.
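When testing how the TTL relates to the RTO, it can help to look at the record as clients see it and to estimate how long a well-behaved client may keep using the old address. The sketch below is a rough aid under stated assumptions: the helper names parse_dig_answer and worst_case_redirect are hypothetical, the 90 second takeover time is an invented sample value, and the estimate ignores clients that cache addresses beyond the TTL.

```shell
#!/bin/sh
# Hypothetical helper: read "dig +noall +answer" style lines on stdin and
# print "<ttl> <address>" for every A record found.
parse_dig_answer() {
    awk '$3 == "IN" && $4 == "A" { print $2, $5 }'
}

# Hypothetical helper: rough upper bound, in seconds, until a well-behaved
# client resolves the new address: time until DNS is updated plus the TTL.
worst_case_redirect() {
    takeover=$1 ttl=$2
    echo $((takeover + ttl))
}

# Sample answer line in the whitespace-separated layout that dig prints.
sample='hna.my.site. 60 IN A 10.0.1.121'
printf '%s\n' "$sample" | parse_dig_answer

# Assumed takeover time of 90 seconds, purely for illustration.
worst_case_redirect 90 60
```

On a live system, the sample line could be replaced with the output of "dig +noall +answer hna.my.site A @10.0.0.1", run repeatedly during a takeover test to watch the address change and the TTL count down on caching resolvers.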
Additional Information
- ocf_heartbeat_dnsupdate(7)
- ocf_heartbeat_IPAddr2(7)
- nsupdate(8)
- dig(1)
- named.conf(5)
- https://tools.ietf.org/html/rfc3007
- https://documentation.suse.com/sle-ha/15-SP1/html/SLE-HA-all/cha-ha-geo-ip-relocation.html
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000020333
- Creation Date: 14-Jul-2021
- Modified Date: 16-Jul-2021
- SUSE Linux Enterprise Server for SAP Applications
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com