Failed cluster actions in crm_mon
This document (7012145) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server 11
Situation
crm_mon reports failed resource actions. Some resources failed on every node, while one failed on a single node only.
# /usr/sbin/crm_mon -r -1
--snip--
Online: [ node4 node5 node6 node19 ]
Full list of resources:
STONITH_SBD (stonith:external/sbd): Started node19
Clone Set: ctdb-clone [ctdb]
ctdb:0 (ocf::heartbeat:CTDB): Started node4 FAILED
ctdb:1 (ocf::heartbeat:CTDB): Started node5 FAILED
ctdb:2 (ocf::heartbeat:CTDB): Started node6 FAILED
Resource Group: firewall_group
External_IP (ocf:heartbeat:IPaddr): Started on node6
FW_Rules (lsb:iptables): Started on node6
netmon (ocf:heartbeat:ethmonitor): Started on node6
Resource Group: apache2
webip (ocf:heartbeat:IPaddr): Started node2
websrv (ocf:heartbeat:apache): Stopped
Failed actions:
ctdb:0_monitor_10000 (node=node4, call=155, rc=1, status=complete): unknown error
ctdb:1_monitor_10000 (node=node5, call=135, rc=1, status=complete): unknown error
ctdb:2_monitor_10000 (node=node6, call=499, rc=1, status=complete): unknown error
netmon_monitor_7000 (node=node4, call=45, rc=-2, status=Time Out): unknown exec error
websrv (node=node4, call=15, rc=5, status=complete): not installed
websrv (node=node5, call=12, rc=5, status=complete): not installed
websrv (node=node6, call=16, rc=5, status=complete): not installed
Resolution
In the Failed actions: list above, notice that netmon failed only on node4. The crm shell lets you reduce cluster communication by cleaning up that resource on node4 alone. The other resources failed on all nodes and need to be cleaned up on all nodes, so the node argument is omitted from their crm shell commands.
# crm resource cleanup ctdb-clone
# crm resource cleanup netmon node4
# crm resource cleanup apache2
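To decide whether a cleanup needs the node argument, it can help to summarize which nodes each resource failed on. The following is a minimal sketch, not a supported tool. It assumes the Failed actions: line format shown in the Situation section, and the function name failed_nodes_by_resource is hypothetical. It reports primitive names (for example ctdb and websrv), so the matching clone or group name (ctdb-clone, apache2) still has to be looked up in the resource list.

```shell
#!/bin/sh
# Sketch only: summarize the "Failed actions:" section of crm_mon one-shot
# output as "<resource>: <nodes it failed on>".
# Assumption: each failure line looks like
#   <resource>[_<operation>_<interval>] (node=<node>, call=..., rc=..., ...): <text>
# failed_nodes_by_resource is a hypothetical helper name, not a cluster command.
failed_nodes_by_resource() {
    awk '
        /^Failed actions:/ { in_failed = 1; next }
        in_failed && /\(node=/ {
            res = $1
            sub(/_[a-z]+_[0-9]+$/, "", res)   # drop a suffix such as _monitor_10000
            sub(/:[0-9]+$/, "", res)          # drop a clone instance suffix such as :0
            node = $2
            sub(/^\(node=/, "", node)
            sub(/,$/, "", node)
            if (!(res in nodes)) {
                order[++n] = res              # remember first-seen order
                nodes[res] = node
            } else if (index(" " nodes[res] " ", " " node " ") == 0) {
                nodes[res] = nodes[res] " " node
            }
        }
        END { for (i = 1; i <= n; i++) print order[i] ": " nodes[order[i]] }
    '
}
```

Usage, reading from the same command as in the Situation section:

# /usr/sbin/crm_mon -r -1 | failed_nodes_by_resource

A resource listed with a single node, like netmon here, is a candidate for a cleanup limited to that node.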
Cause
Additional Information
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7012145
- Creation Date: 15-Apr-2013
- Modified Date: 03-Mar-2020
- SUSE Linux Enterprise High Availability Extension
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com