Usage of crm report for SLES HAE

This document (7007262) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 15
SUSE Linux Enterprise Server 12
SUSE Linux Enterprise Server 11
SUSE Linux Enterprise Server 10

Situation

An incident within the cluster needs to be investigated, and SUSE Technical Support has requested a "crm report" (formerly hb_report) for analysis.

Resolution

crm report below refers to the crmsh (CRM Shell) crm subcommand used to generate a cluster report.
crm_report below refers to the output file generated by crm report. (crm_report also works as a command but, like hb_report, it is deprecated in next-generation crmsh.)

 

The "crm report" utility (previously hb_report) is an essential tool for finding problems in a SLES HAE cluster. In a cluster context it is important to capture the log files and configuration of all cluster nodes at the time of the incident being investigated. Other tools and methods can be used for this, but they are not as efficient in a SLES HAE cluster context.

For crm report to work as intended, it should gather all information from all nodes. See the 'Additional Information' section for details.

Before uploading your data to Support, always double-check that the final "crm_report" file contains one subdirectory per cluster node, named after the node and holding that node's data.
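
The check described above can be scripted. The following is a minimal sketch, not part of crm report itself: the function name missing_nodes is hypothetical, and it assumes that archive members are laid out as <report name>/<node name>/<file>.

```shell
# Hypothetical helper: print every expected node name that has no
# subdirectory in a tar listing. The listing is read from stdin,
# the expected node names are passed as arguments.
missing_nodes() {
    listing=$(cat)
    for node in "$@"; do
        # Assumed member layout: <report name>/<node name>/<file>
        if ! printf '%s\n' "$listing" |
            awk -F/ -v n="$node" '$2 == n && NF > 2 { found = 1 } END { exit !found }'
        then
            printf '%s\n' "$node"
        fi
    done
}
```

A possible call, using the archive and node names that appear in the examples of this document, would be: tar -tjf crm_report-Mon-14-Oct-2024.tar.bz2 | missing_nodes oldhanaa1 oldhanaa2. No output means every listed node has a subdirectory in the report.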

 

Collecting cluster report 


 
If the SSH connection between your nodes is already configured and working (see 'Additional Information' if it still needs to be configured), you can run the crm report command on any cluster node. Running it on one node is sufficient, because crm report collects the logs and cluster configuration from all cluster members.
 
crm report can also extract cluster history from rotated logs. For example, it inspects rotated /var/log/pacemaker/pacemaker.log-<date>.xz files and, if the logged events fall within the defined time range, copies them into the final crm_report file. If the rotated logs have already been removed, those events cannot be collected.
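
To see which rotated logs are still present on a node before collecting the report, a small helper such as the following could be used. This is only a sketch: the function name list_rotated_logs is hypothetical, and it assumes the pacemaker.log-<date>.xz naming mentioned above.

```shell
# Hypothetical helper: list rotated pacemaker logs in a directory,
# sorted by file name. $1 is the log directory and defaults to
# /var/log/pacemaker. The pacemaker.log-<date>.xz naming is assumed.
list_rotated_logs() {
    dir=${1:-/var/log/pacemaker}
    for f in "$dir"/pacemaker.log-*.xz; do
        # Skip the unexpanded pattern when nothing matches
        [ -e "$f" ] && printf '%s\n' "$f"
    done | sort
}
```

If the list does not reach back to the time of the incident, crm report will not be able to include those events, and the logs would have to come from a backup instead.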
 

Example 1:  Collecting cluster report as root user 

 This example shows collection of the report as the root user. Assuming there was an incident to investigate on 14.10.2024 at 16:45, the interesting data would be from this time and from some time before and after it, to ensure that all information that might have led to the incident is captured.
 
The timeframe in question could be from 14.10.2024 00:00 to 14.10.2024 23:59.  It is also often helpful to force the resulting output filename to contain both the date and time it was generated.

The following is an example of such parameters for a crm report:
 

# crm report -f "2024/10/14 00:00" -t "2024/10/14 23:59" /tmp/crm_report-$(date +"%Y%m%d-%H%M")


With the syntax above, the resulting output is created with a name in this format:

drwxrwsr-x+ 4 sfsc-dlm suse      14 Oct 14 13:09 crm_report-20241014-1348
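
When the same whole-day window is needed for different incident dates, the command line can be generated rather than typed. The sketch below only prints the command so that it can be reviewed before it is run; the function name day_report_cmd is hypothetical, and the /tmp destination is simply taken from the example above.

```shell
# Hypothetical helper: print (do not run) a crm report command that
# covers the whole day of an incident. $1 is the date as YYYY-MM-DD.
day_report_cmd() {
    # crm report examples in this document use YYYY/MM/DD
    day=$(printf '%s\n' "$1" | sed 's|-|/|g')
    # $(date ...) is printed literally so that it is evaluated only
    # when the printed command is actually executed
    printf 'crm report -f "%s 00:00" -t "%s 23:59" /tmp/crm_report-$(date +"%%Y%%m%%d-%%H%%M")\n' \
        "$day" "$day"
}
```

Calling day_report_cmd 2024-10-14 prints the same command line as shown in Example 1.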
 

Example 2:  Collecting cluster report example as non-root user with sudo 

 
To collect the report as a non-root user with sudo (see the sudo configuration in ‘Additional Information’), add the ‘-u <non-root user>’ option. An example:
 

sudo crm report -f "2024/10/14 00:00" -t "2024/10/14 23:59"  -u sadmin1 


 
An example of how to double-check the content of the crm_report file:

 

# ls crm_report* 
crm_report-Mon-14-Oct-2024.tar.bz2 

 

Checking whether pacemaker.log was collected for the time in question: 

 

# tar --wildcards -xOjf ./crm_report-Mon-14-Oct-2024.tar.bz2 crm_report-Mon-14-Oct-2024/*/pacemaker.log | sed '1b;$b;d'
Oct 14 00:05:55 oldhanaa1 pacemaker-controld  [31997] (crm_timer_popped)        info: Cluster Recheck Timer (I_PE_CALC) just popped (900000ms) 
Oct 14 23:50:56 oldhanaa1 pacemaker-controld  [31997] (do_state_transition)     notice: State transition S_TRANSITION_ENGINE -> S_IDLE | input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd 

 

Checking whether ha-log.txt (ha-log.txt is the same as the messages file) was collected for the time in question: 

 

# tar --wildcards -xOjf ./crm_report-Mon-14-Oct-2024.tar.bz2 crm_report-Mon-14-Oct-2024/*/ha-log.txt | sed '1b;$b;d'
2024-10-14T00:00:01.007277+02:00 oldhanaa2 CRON[10297]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 1 1) 
2024-10-14T23:55:01.691802+02:00 oldhanaa2 CRON[21550]: (root) CMD ([ -x /usr/lib64/sa/sa2 ] && exec /usr/lib64/sa/sa2 -A) 



In the above output, the ha-log.txt events come from only one node, which means that the data from the other node, which is also needed for the analysis, is missing. In that case a sysadmin has to restore the messages file of the missing node for the time in question manually from a backup and provide it separately.
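
Because tar -xO writes all matching archive members to a single stream, the checks above print one first and one last line for the whole archive rather than per node. A per-node variant could look like the following sketch. The function names first_last and per_node_range are hypothetical; the tar options are the same ones used in the commands above.

```shell
# Hypothetical helper: print the first and last line of stdin,
# each prefixed with the label given as $1.
first_last() {
    awk -v label="$1" '
        NR == 1 { first = $0 }
                { last = $0 }
        END {
            if (NR >= 1) print label ": " first
            if (NR > 1)  print label ": " last
        }'
}

# Hypothetical helper: for every member of archive $1 whose name ends
# in /$2 (for example pacemaker.log), show its first and last line.
per_node_range() {
    tar -tjf "$1" | grep "/$2\$" | while IFS= read -r member; do
        tar -xOjf "$1" "$member" | first_last "$member"
    done
}
```

For example, per_node_range ./crm_report-Mon-14-Oct-2024.tar.bz2 ha-log.txt would print one pair of lines per node directory, which makes a node with missing or incomplete data easier to spot.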

Additional Information

If the SSH connections between the cluster nodes that crm report uses have not yet been configured, there are two options. Depending on your environment's requirements, crm report gathers data from the other nodes either via root SSH access to those nodes, or via a defined user with sudo (see 'Running cluster reports without root access' for details).
  

Configuration to collect cluster report as root with root SSH access between cluster nodes 


 
Root SSH access between cluster nodes is configured by default if the ha-cluster-bootstrap package was used for the initial cluster deployment, or if the YaST cluster module was used.
 
If the cluster was set up manually, or if root SSH access was removed or is not working, it is best to set up SSH keys without a password, so that the script can traverse the cluster without a sysadmin having to type the root password for each and every node of the cluster.

To set up SSH keys for this (RSA is used in this example), run the following command as the root user:

 

ssh-keygen -t rsa  


 
This creates the following two key files in the /root/.ssh/ directory:

 

id_rsa 
id_rsa.pub

    

The public key has to be copied over to all remaining cluster nodes: 
 

ssh-copy-id OTHER_NODE 



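Also add the public part of this SSH key to the local authorized keys file, so that crm report started from other nodes works as well (the command below assumes that the public key file is available as /tmp/id_rsa.pub):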
 

cat /tmp/id_rsa.pub >> /root/.ssh/authorized_keys  



After this, root can use ssh without a password from one server to another. This should be done on each and every member of the cluster.
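
Whether passwordless root access really works towards every node can be verified before crm report is run. The following is a minimal sketch: the function name unreachable_nodes and the overridable SSH_CMD variable are hypothetical, while BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
# Hypothetical helper: print every node that root cannot reach over
# SSH without being prompted. Node names are passed as arguments.
# SSH_CMD is an assumption of this sketch (it makes the command
# replaceable for dry runs) and defaults to ssh.
unreachable_nodes() {
    for node in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password
        if ! ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 \
            "root@$node" true </dev/null >/dev/null 2>&1
        then
            printf '%s\n' "$node"
        fi
    done
}
```

Run as root on each cluster member, for example as unreachable_nodes node1 node2. Any node name that is printed still needs its key setup to be checked.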
 
If root SSH access is too permissive for your needs, either collect cluster reports without root (as described below) or see the sshd_config(5) man page for the 'Match' block, which can be used to restrict access for a particular user.

   

Configuration to collect cluster report without root user 


 
General documentation for collecting cluster report without root user is available at 'Running cluster reports without root access'. 
 
This option uses SSH agent forwarding and sudo. SSH agent forwarding allows connections from an authentication agent (such as ssh-agent(1)) to be forwarded, so that a sysadmin's local SSH keys can be used to log in to a final node via a jump host. In this case the jump host is the cluster node where the cluster report is collected, and the final node is any remaining cluster node.
 
An example of a sudoers(5) definition (in this case a user in the ‘sysadmins’ user group, who has access to all cluster nodes via SSH with their own SSH key, needs to collect the cluster report as a non-root user):
 

Host_Alias CLUSTER = node1, node2 
Runas_Alias R = root 
Defaults!HA_ALLOWED env_keep+=SSH_AUTH_SOCK 
Cmnd_Alias HA_ALLOWED = /usr/sbin/crm_report *, /usr/sbin/crm report * 
%sysadmins CLUSTER = (R) NOPASSWD: HA_ALLOWED


 
This sudoers(5) definition needs to be present on all cluster nodes. It allows the user to preserve the SSH_AUTH_SOCK environment variable (which points to the UNIX socket used by SSH to obtain the keys from the SSH agent) while running crm report as root via sudo.
 
The user who wants to collect the cluster report without the root account must ensure that forwarding of the authentication agent connection (such as ssh-agent(1)) is enabled when connecting to the node where the cluster report collection will be run, e.g. with 'ssh -A' in the OpenSSH client or with ‘Allow agent forwarding’ in PuTTY.
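
After logging in to the node, a quick check can show whether an agent socket is actually available in the session. This is a sketch only; the function name agent_available is hypothetical.

```shell
# Hypothetical helper: succeed only if SSH_AUTH_SOCK is set and
# points to an existing UNIX socket, i.e. an agent is reachable
# from this session.
agent_available() {
    [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]
}
```

If agent_available succeeds, ssh-add -l can then be used to list the keys the agent offers. If it fails, agent forwarding was most likely not enabled for the connection.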
 

What does the "hb" in the legacy hb_report utility stand for?

The Heartbeat project, on which the original Pacemaker cluster stack was developed.  Remnants of this legacy project name can still be found in Pacemaker with the "ocf::heartbeat" resource agents.

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7007262
  • Creation Date: 25-Nov-2010
  • Modified Date: 29-Oct-2024
    • SUSE Linux Enterprise High Availability Extension
    • SUSE Linux Enterprise Server


For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com
