SAP HANA Cluster – automated OS patching with SUSE Manager and Salt states


The following article has been contributed by Bo Jin, Sales Engineer at SUSE and Linux Consultant.

 

 

Challenge and motivation

SUSE Linux Enterprise Server for SAP Applications is not just a great product for running SAP workloads. SUSE also provides best practice guides for building reliable SAP HANA System Replication (SR) high availability clusters on top of it.

Customers often struggle with patching SAP systems that run on SUSE Linux Enterprise Server in a Pacemaker cluster. The main reason is the need to reboot the operating system (OS) after a kernel patch has been installed. Even if SUSE Live Patching is in use, you still need to apply all patches to the OS once in a while within a scheduled “maintenance window”. During the maintenance window, you as an SAP Basis administrator need to run cluster commands to move, stop and start resources.

But what if you could automate the OS patching by using SUSE Manager and Salt states while keeping SAP HANA downtime short?

I have developed several Salt execution and state modules that interact with the Pacemaker cluster configuration and management tools crm and crm_mon, and with the SAPHanaSR-showAttr command, in order to query the cluster status.

These Salt modules will be used in Salt states which in turn enable a fully automated patching process for SAP HANA SR scale-up clusters.

 

The solution in brief

The SUSE Best Practices guide SAP HANA System Replication Scale-Up – Performance Optimized Scenario describes the maintenance of a cluster in considerable detail. If they are not automated, the steps must be executed manually. The Salt states, modules, runners and reactors that I have developed and that are described here follow the best practice instructions exactly.

Some of the “golden rules” of working with Pacemaker clusters I strictly follow are:

  • “Never change a cluster if the cluster state is not IDLE”
  • “Don’t change or configure an SAP HANA master-slave cluster resource if the system replication status is not SOK.”
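
The two rules above can be sketched as a single guard that every change has to pass. This is only an illustration, not the code of the author's modules: the function names are hypothetical, and it assumes that the cluster state has already been read as a Pacemaker DC state string (such as S_IDLE) and the replication status as a SAPHanaSR sync state string (such as SOK or SFAIL).

```python
# Minimal sketch of the two "golden rules" as a pre-change guard.
# Assumption: the caller has already queried the cluster and passes in
# the DC state and the system replication sync state as plain strings.

IDLE_STATE = "S_IDLE"  # Pacemaker DC state when no transition is running
SYNC_OK = "SOK"        # sync state of a healthy system replication


def refusal_reasons(cluster_state, sync_state):
    """Return a list of reasons why the cluster must not be changed now."""
    reasons = []
    if cluster_state.strip() != IDLE_STATE:
        reasons.append("cluster state is %s, not %s" % (cluster_state.strip(), IDLE_STATE))
    if sync_state.strip() != SYNC_OK:
        reasons.append("system replication status is %s, not %s" % (sync_state.strip(), SYNC_OK))
    return reasons


def may_change_cluster(cluster_state, sync_state):
    """True only if both golden rules are satisfied."""
    return not refusal_reasons(cluster_state, sync_state)
```

A workflow step would call may_change_cluster() first and stop, or wait and retry, whenever it returns False.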

Based on these rules, the patch workflow has been tested as described below.

 

The patching workflow

The following section explains the patching workflow at a glance. For a two-node SAP HANA scale-up cluster, the stage “Patch diskless node” is not needed, and you can continue directly with the primary node.

 

Stage 1: Patch secondary site

  • Apply the Salt state to all member nodes of the cluster:

# salt "hana-*" state.apply myhana

  • The Salt module will detect the node roles as primary and secondary – and diskless_node in case of a three-node cluster.
  • Start with the secondary node.
  • The SAP HANA SR scale-up cluster master-slave resource will be set into maintenance mode.
  • The secondary node will be patched and rebooted.
  • After the secondary node has been restarted, Pacemaker will be started.
  • The master-slave resource will be activated (unset maintenance mode).

 

Stage 2: Patch diskless node (optional)

  • Start patching the diskless_node in case of a diskless setup.
  • diskless_node will be rebooted after patching.

 

Stage 3: Patch primary site

  • Re-discover the node roles as primary, secondary and diskless_node in case of a three-node cluster.
  • Execute Salt states on the primary node.
  • Move the master-slave resource to the other node which is secondary at the moment.
  • The SAP HANA SR scale-up cluster master-slave resource will be set into maintenance mode.
  • The old primary node will be patched and rebooted.
  • After the old primary node has been restarted, Pacemaker will be started.
  • Clear the Pacemaker cli-ban location constraint so that this node can be used again as the new secondary site.
  • The master-slave resource will be activated (unset maintenance mode).
  • The old primary has become the new secondary.
  • Now you are finished 😀.
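
The order of the three stages can be written down as a small plan builder. The following is a hedged sketch for illustration only: the function name, the role dictionary and the step labels are hypothetical and do not come from the author's states. It only shows that the diskless stage is skipped for a two-node cluster and that the primary is handled last.

```python
# Sketch: turn the detected node roles into an ordered list of
# (hostname, step) tuples that mirrors the three stages described above.
# All names and step labels are illustrative assumptions.

def build_patch_plan(roles):
    """roles maps 'primary' and 'secondary' (and optionally
    'diskless_node') to hostnames."""
    primary = roles["primary"]
    secondary = roles["secondary"]
    diskless = roles.get("diskless_node")

    # Stage 1: patch the secondary site
    plan = [
        (secondary, "set msl maintenance on"),
        (secondary, "patch and reboot"),
        (secondary, "start pacemaker"),
        (secondary, "set msl maintenance off"),
    ]

    # Stage 2: only needed in a three-node (diskless) setup
    if diskless:
        plan.append((diskless, "patch and reboot"))

    # Stage 3: patch the primary site after moving the resource away
    plan += [
        (primary, "move msl resource to " + secondary),
        (primary, "set msl maintenance on"),
        (primary, "patch and reboot"),
        (primary, "start pacemaker"),
        (primary, "clear cli-ban constraint"),
        (primary, "set msl maintenance off"),
    ]
    return plan
```

For example, build_patch_plan({"primary": "hana-1", "secondary": "hana-2"}) yields ten steps, starting with hana-2 and ending with hana-1.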

The workflow uses the Salt reactor and requisite systems.

  • Requisites: The Salt requisite system is used to create relationships between states. This provides a method to easily define inter-dependencies between states.
  • Reactors: Salt’s Reactor system allows Salt to trigger actions in response to an event.
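
To make the reactor idea more concrete, the sketch below imitates how an event tag is matched against configured tag patterns to decide which reactions to run. It is a simplified model written for this article, not Salt's implementation and not the author's reactor configuration; the reactor file path is hypothetical. The tag salt/minion/<id>/start is the event a minion sends when it starts, which is why a reboot can be used as a trigger for the next workflow step.

```python
# Simplified model of reactor matching: event tag -> list of reactions.
# Assumption: patterns are shell-style globs, as in a Salt reactor map.
from fnmatch import fnmatchcase


def match_reactions(reactor_map, event_tag):
    """Return all reactions whose tag pattern matches the event tag.

    reactor_map is a list of (pattern, [reaction, ...]) pairs."""
    matched = []
    for pattern, reactions in reactor_map:
        if fnmatchcase(event_tag, pattern):
            matched.extend(reactions)
    return matched


# Hypothetical map: continue patching when a HANA node comes back up.
REACTOR_MAP = [
    ("salt/minion/hana-*/start", ["/srv/reactor/continue_patching.sls"]),
]
```

With this map, the start event of hana-2 after its reboot selects the hypothetical continue_patching.sls reaction, while unrelated events select nothing.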

Feel free to adjust the reactor and requisites to map the workflow steps to your needs.

 

High level architecture

 

Salt modules for SAP HANA System Replication scale-up cluster

My colleagues from SUSE development created a great set of Salt execution modules, called salt-shaptools, that allows us to set up and configure new SAP HANA and NetWeaver clusters. In order to automate the patching of the cluster nodes, I have developed a few additional Salt modules that use crm, crm_mon and SAPHanaSR-showAttr to query the status of the SAP HANA cluster resources and nodes prior to patching the OS.

These execution modules are:

  • bocrm.check_if_maintenance
  • bocrm.check_if_nodes_online
  • bocrm.check_sr_status
  • bocrm.delete_cli_ban_rule
  • bocrm.find_cluster_nodes
  • bocrm.get_dc
  • bocrm.get_msl_resource_info
  • bocrm.if_cluster_state_idle
  • bocrm.is_cluster_idle
  • bocrm.is_quorum
  • bocrm.move_msl_resource
  • bocrm.off_msl_maintenance
  • bocrm.pacemaker
  • bocrm.patch_diskless_node
  • bocrm.set_msl_maintenance
  • bocrm.set_off_msl_maintenance
  • bocrm.set_on_msl_maintenance
  • bocrm.start_pacemaker
  • bocrm.stop_pacemaker
  • bocrm.sync_status
  • bocrm.wait_for_cluster_idle
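
As an illustration of what a function such as bocrm.wait_for_cluster_idle has to achieve, here is a minimal polling sketch. It is not the code from the repository: the function name, its parameters and the injected get_state callable are assumptions made for this example. In a real module, get_state would wrap one of the cluster tools mentioned above.

```python
# Sketch: poll the cluster state until it is idle or a timeout expires.
# get_state is any callable returning the current DC state as a string;
# sleep and clock are injectable so the loop can be tested without waiting.
import time


def wait_until_idle(get_state, timeout=300.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Return True as soon as the state is S_IDLE, False on timeout."""
    deadline = clock() + timeout
    while True:
        if get_state().strip() == "S_IDLE":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Returning False instead of raising an exception leaves the decision to the caller, for example to abort the patch run and leave the cluster untouched.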

 

SUSE Manager in action

In order to create patch and reboot jobs, I also created Salt runner modules which call the SUSE Manager API. The main advantage of scheduling these jobs through SUSE Manager, instead of triggering them directly from a Salt state via cmd.run, is that SUSE Manager keeps a record of every patch job, which is valuable for audit and compliance reasons. These runner modules are:

  • checkjob_status.py
  • patch_hana.py
  • reboot_host.py
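
The following sketch shows roughly how patch and reboot jobs can be scheduled through the SUSE Manager XML-RPC API. It is not the content of the runner scripts listed above. The helper names are hypothetical, and the API methods used (system.getRelevantErrata, system.scheduleApplyErrata and system.scheduleReboot) are given as I understand them from the API documentation, so please verify the signatures against the documentation of your SUSE Manager version. The session key is assumed to come from a prior auth.login call.

```python
# Sketch: schedule patch and reboot actions via the SUSE Manager API.
# Helper names are hypothetical; check method signatures for your version.
import xmlrpc.client
from datetime import datetime


def connect(server_url):
    """Create an XML-RPC proxy; the API is served under /rpc/api."""
    return xmlrpc.client.ServerProxy(server_url.rstrip("/") + "/rpc/api")


def schedule_patches(client, session_key, system_id, when):
    """Schedule all relevant patches; return the list of action IDs."""
    errata = client.system.getRelevantErrata(session_key, system_id)
    errata_ids = [erratum["id"] for erratum in errata]
    if not errata_ids:
        return []  # nothing to patch, so no action is created
    return client.system.scheduleApplyErrata(
        session_key, system_id, errata_ids, when)


def schedule_reboot(client, session_key, system_id, when):
    """Schedule a reboot; return the action ID."""
    return client.system.scheduleReboot(session_key, system_id, when)
```

A caller would pass a datetime such as datetime.now() as when, wait until the patch actions have completed, and only then schedule the reboot. The returned action IDs are what makes the jobs traceable in SUSE Manager afterwards.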

 

More information

More detailed information about the SaltStack configurations, modules and states which I have created for fully automated patching of SAP HANA database scale-up clusters can be found in my GitHub repository at https://github.com/bjin01/salt-sap-patching, which is licensed under GPL v3.0. Long live Salt, SUSE Manager and Pacemaker 😀 !

Meike Chabowski Meike Chabowski works as Documentation Strategist at SUSE. Before joining the SUSE Documentation team, she was Product Marketing Manager for Enterprise Linux Servers at SUSE, with a focus on Linux for Mainframes, Linux in Retail, and High Performance Computing. Prior to joining SUSE more than 20 years ago, Meike held marketing positions with several IT companies like defacto and Siemens, and was working as Assistant Professor for Mass Media. Meike holds a Master of Arts in Science of Mass Media and Theatre, as well as a Master of Arts in Education from University of Erlangen-Nuremberg/ Germany, and in Italian Literature and Language from University of Parma/Italy.