SAP MaxDB fails to stop and is unable to fail over to another remote site during a reboot or forced reboot
This document (000021485) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise for SAP Applications 12
Situation
- In a two-node SAP MaxDB cluster, the database failed to stop and fail over to the other node during a node reboot. When the node came back up, the services consistently restarted on that same node.
- The SAP MaxDB resource entered a BLOCK state during reboot tests. During unexpected system reboots, SAP MaxDB resources likewise could not fail over to the other node and remained in a blocked state.
- You may encounter the following error message during the reboot test:
[ERROR] Failed to start saphostagent.service: systemdI_sdbus_cmd: StartUnit '\''saphostagent.service'\'' - failed: Transaction for saphostagent.service/start is destructive (wickedd-dhcp6.service has '\''stop'\'' job queued, but '\''start'\'' is included in transaction).'
Resolution
To address the issue where saphostagent.service stops before the Pacemaker cluster, as noted in SAP Note 3139184, create systemd drop-in files that manage the service dependencies.
Here are the detailed steps to handle this situation:
Perform these steps on both cluster nodes.
1. Stop the Cluster on Both Nodes
# crm cluster stop --all
2. Create Drop-In Directory for saphostagent
# sudo mkdir -p /etc/systemd/system/saphostagent.service.d
3. Create Drop-In File for saphostagent
# sudo vi /etc/systemd/system/saphostagent.service.d/HA.conf
Add the following lines to the file:
[Service]
Restart=no
4. Create Drop-In Directory for pacemaker
# sudo mkdir -p /etc/systemd/system/pacemaker.service.d
5. Create Drop-In File for pacemaker
# sudo vi /etc/systemd/system/pacemaker.service.d/00-pacemaker.conf
Add the following lines to the file:
[Unit]
Description=pacemaker-dropin
Wants=saphostagent.service
After=saphostagent.service
6. Reload the Systemd Daemon
# sudo systemctl daemon-reload
7. Verify the Extended Link
Check whether the extended link has been created:
# systemd-delta | grep pacemaker
You should see something like:
[EXTENDED] /usr/lib/systemd/system/pacemaker.service → /etc/systemd/system/pacemaker.service.d/00-pacemaker.conf
8. Start the Cluster on Both Nodes
# crm cluster start --all
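Steps 2 through 5 can also be scripted instead of editing the files in vi. The following is a minimal sketch and not part of the original procedure: the function name `write_ha_dropins` and its optional directory argument are hypothetical, while the file contents are the ones listed in steps 3 and 5. It only writes the files, so the daemon reload from step 6 is still required afterwards.

```shell
#!/bin/sh
# Sketch: create the two drop-in files from steps 2-5 non-interactively.
# write_ha_dropins and its argument are illustrative names, not SUSE tooling.
write_ha_dropins() {
    # Base directory for unit overrides; defaults to the path used in the steps.
    base="${1:-/etc/systemd/system}"

    mkdir -p "$base/saphostagent.service.d" "$base/pacemaker.service.d" || return 1

    # Step 3: do not restart saphostagent automatically.
    printf '%s\n' '[Service]' 'Restart=no' \
        > "$base/saphostagent.service.d/HA.conf" || return 1

    # Step 5: make pacemaker want saphostagent and order it after saphostagent.
    printf '%s\n' '[Unit]' 'Description=pacemaker-dropin' \
        'Wants=saphostagent.service' 'After=saphostagent.service' \
        > "$base/pacemaker.service.d/00-pacemaker.conf" || return 1
}
```

As root, calling `write_ha_dropins` with no argument on each node writes to /etc/systemd/system. Passing a scratch directory as the argument lets you inspect the generated files before touching the live configuration.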
Explanation
- Stopping the Cluster: Ensures no cluster operations interfere with the changes.
- Creating Drop-In Directories: Holds the configuration overrides for each unit.
- Drop-In Files: Modify service behavior to manage dependencies.
  - saphostagent.service: Prevents automatic restart.
  - pacemaker.service: Ensures saphostagent.service starts before and stops after pacemaker.service.
- Reloading Daemon: Applies the changes.
- Verification: Confirms that the pacemaker unit is extended by the drop-in.
- Starting the Cluster: Resumes cluster operations with the new settings.
By following these steps, you can ensure that saphostagent.service is managed correctly in relation to the Pacemaker cluster, preventing it from stopping prematurely.
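The ordering described above depends on the Wants= and After= directives in the pacemaker drop-in. As a rough sanity check, assuming the file path from step 5, a small helper can confirm that both directives are present. `check_pacemaker_dropin` is a hypothetical name used only for illustration, and the check is a simplification: it matches whole lines and does not verify that they sit in the [Unit] section.

```shell
#!/bin/sh
# Sketch: confirm that a pacemaker drop-in orders pacemaker after saphostagent.
# check_pacemaker_dropin is an illustrative helper, not an existing tool.
check_pacemaker_dropin() {
    # Drop-in to inspect; defaults to the path created in step 5.
    file="${1:-/etc/systemd/system/pacemaker.service.d/00-pacemaker.conf}"

    [ -r "$file" ] || { echo "missing: $file"; return 1; }

    # Both directives must appear as exact, whole lines in the file.
    for directive in 'Wants=saphostagent.service' 'After=saphostagent.service'; do
        if ! grep -qxF "$directive" "$file"; then
            echo "not found: $directive"
            return 1
        fi
    done
    echo "ok"
}
```

On a running node, `systemctl show pacemaker.service --property=After --property=Wants` reports the effective values once the daemon has been reloaded, which is the more authoritative check.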
Cause
saphostagent.service was killed before the cluster services stopped during the reboot action. As a result, the cluster resource was unable to stop the DB under cluster control.
Additional Information
- Manual pages systemd(1)
- 3139184 - Linux: systemd integration for sapstartsrv and SAP Host Agent
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000021485
- Creation Date: 08-Jul-2024
- Modified Date: 31-Jul-2024
- SUSE Linux Enterprise High Availability Extension
- SUSE Linux Enterprise Server for SAP Applications
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com