Safely making snapshots of cluster nodes in an ESXi environment
This document (000020853) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise High Availability Extension 12 SP5
SUSE Linux Enterprise High Availability Extension 12 SP4
Situation
It is possible to create backups of cluster nodes in an ESXi environment using the ESXi snapshot feature.
However, this scenario must observe the VM configuration limitations that VMware has described here:
https://kb.vmware.com/s/article/2151774
The key point in the VMware documentation is that any shared storage used by the cluster must be excluded from the snapshot.
If the cluster is used for SAP HANA, VMware states the following:
"A note on VMware snapshots: in contrast to VMware clones, which are exact copies of a virtual machine, a VMware snapshot represents the state of a virtual machine at the time it was taken, and it might negatively affect the performance of the virtual machine. This is based on how long it has been in place, the number of snapshots taken, and how much the virtual machine and its guest operating system have changed since the time it was taken. When a snapshot should be taken; for instance, before installing a new SAP HANA patch, take the snapshot and do not select the option “Snapshot the virtual machine’s memory”. The general recommendation is that you shut down the SAP HANA VM prior a snapshot being created."
This is documented on page 46 of sap_hana_on_vmware_vsphere_best_practices_guide-white-paper.pdf.
Resolution
Steps to perform before initiating the snapshot:
snaphost01:~ # crm status
Cluster Summary:
  * Stack: corosync
  * Current DC: snaphost01 (version 2.0.5+20201202.ba59be712-150300.4.21.1-2.0.5+20201202.ba59be712) - partition with quorum
  * Last updated: Fri Nov 11 13:29:31 2022
  * Last change: Fri Nov 11 13:29:28 2022 by root via cibadmin on snaphost01
  * 2 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ snaphost01 snaphost02 ]

Full List of Resources:
  * stonith-sbd (stonith:external/sbd): Started snaphost01
  * very-important-resource01 (ocf::heartbeat:Dummy): Started snaphost02

snaphost01:~ # crm node standby snaphost02
snaphost01:~ # ssh snaphost02
Last login: Fri Nov 11 13:08:25 2022 from 172.16.171.11
snaphost02:~ # crm cluster stop
INFO: Cluster services stopped on snaphost02
snaphost02:~ # exit
logout
Connection to snaphost02 closed.
snaphost01:~ # crm status
Cluster Summary:
  * Stack: corosync
  * Current DC: snaphost01 (version 2.0.5+20201202.ba59be712-150300.4.21.1-2.0.5+20201202.ba59be712) - partition with quorum
  * Last updated: Fri Nov 11 13:32:24 2022
  * Last change: Fri Nov 11 13:31:57 2022 by root via crm_attribute on snaphost01
  * 2 nodes configured
  * 2 resource instances configured

Node List:
  * Node snaphost02: OFFLINE (standby)
  * Online: [ snaphost01 ]

Full List of Resources:
  * stonith-sbd (stonith:external/sbd): Started snaphost01
  * very-important-resource01 (ocf::heartbeat:Dummy): Started snaphost01

snaphost01:~ #
The snaphost02 node is now ready for the snapshot to be taken.
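The pre-snapshot steps above can be sketched as a small helper function. This is only an illustrative sketch, not part of the SUSE or VMware documentation: it assumes the crm shell is available and that passwordless root SSH works from the node running it to the node being snapshotted. The DRY_RUN variable and the function names are hypothetical conveniences for previewing the commands.

```shell
# Sketch only: drain a cluster node before an ESXi snapshot is taken.
# Assumes crmsh is installed and root SSH access between the nodes.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

pre_snapshot() {
    node="$1"
    # Put the node into standby so its resources migrate to the other node
    run crm node standby "$node"
    # Stop the cluster stack on the node to be snapshotted
    run ssh "$node" crm cluster stop
}
```

For example, `DRY_RUN=1 pre_snapshot snaphost02` prints the two commands without executing them, which is a safe way to verify the sequence before using it.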
Steps to perform after the snapshot has been taken:
snaphost01:~ # ssh snaphost02
Last login: Fri Nov 11 13:32:08 2022 from 172.16.171.11
snaphost02:~ # crm cluster start
INFO: BEGIN Starting pacemaker(delaying start of sbd for 10s)
INFO: END Starting pacemaker(delaying start of sbd for 10s)
INFO: Cluster services started on snaphost02
snaphost02:~ # crm node online snaphost02
INFO: online node snaphost02
snaphost02:~ # exit
logout
Connection to snaphost02 closed.
snaphost01:~ # crm status
Cluster Summary:
  * Stack: corosync
  * Current DC: snaphost01 (version 2.0.5+20201202.ba59be712-150300.4.21.1-2.0.5+20201202.ba59be712) - partition with quorum
  * Last updated: Fri Nov 11 13:38:42 2022
  * Last change: Fri Nov 11 13:38:35 2022 by root via cibadmin on snaphost02
  * 2 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ snaphost01 snaphost02 ]

Full List of Resources:
  * stonith-sbd (stonith:external/sbd): Started snaphost01
  * very-important-resource01 (ocf::heartbeat:Dummy): Started snaphost01

snaphost01:~ #
The snapshot has now been created, and the cluster is back online and fully functional without any hiccups.
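The post-snapshot steps can be sketched in the same style. Again, this is only a hedged sketch with the same assumptions (crmsh, root SSH between nodes, hypothetical DRY_RUN helper); in the transcript above `crm node online` is run on snaphost02 itself, while the sketch issues it locally, which is equivalent because the crm shell can manage node attributes from any cluster node.

```shell
# Sketch only: bring a node back into the cluster after the snapshot.
# Assumes crmsh is installed and root SSH access between the nodes.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

post_snapshot() {
    node="$1"
    # Start the cluster stack again on the snapshotted node
    run ssh "$node" crm cluster start
    # Take the node out of standby so it can host resources again
    run crm node online "$node"
}
```

As before, `DRY_RUN=1 post_snapshot snaphost02` previews the commands without executing them.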
The same steps can be performed for the other node.
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000020853
- Creation Date: 17-Nov-2022
- Modified Date: 17-Nov-2022
- SUSE Linux Enterprise High Availability Extension
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com