Using mdadm to send e-mail alerts for RAID failures
This document (7001034) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Desktop 10
SUSE Linux Enterprise Server 10 Service Pack 1
SUSE Linux Enterprise Server 10
Situation
Mdadm is a command line utility that can be used to create, manage, and monitor Linux software RAID devices.
This TID explains how to use mdadm to monitor and report issues with a software RAID configuration in SUSE Linux Enterprise. It does not explain how to set up software RAID; the mdadm steps below assume that the system already has an active software RAID configuration.
Steps for setting up e-mail alerting of errors with mdadm:
E-mail error alerting with mdadm can be accomplished in several ways:
- Using a command line directly
- Using the /etc/mdadm.conf file to specify an e-mail address
NOTE: E-mail alerts are sent only for the following events: Fail, FailSpare, DegradedArray, and TestMessage
Specifying an e-mail address using the mdadm command line
Using the command line simply involves including the e-mail address in the command. The following shows the mdadm command and how to set it up so that it runs every time the system is started.
mdadm --monitor --scan --daemonize --mail=jdoe@somemail.com
The command can be put in /etc/init.d/boot.local so that it runs every time the system is started.
To verify that mdadm is running, type the following in a terminal window:
ps aux | grep mdadm
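One caveat with the check above is that the grep process itself can appear in the output. The following is a minimal sketch of a stricter check. The function name has_mdadm_monitor is hypothetical, and the pattern assumes that the monitor was started with the long --monitor option as shown earlier.

```shell
#!/bin/bash
# Hypothetical helper (not part of mdadm): succeeds if the process list
# read from standard input contains an mdadm monitor process.
has_mdadm_monitor() {
    # The [m] bracket keeps this grep from matching its own command line.
    grep -q -e '[m]dadm .*--monitor'
}

if ps -eo args | has_mdadm_monitor; then
    echo "mdadm monitor is running"
else
    echo "mdadm monitor is NOT running"
fi
```

Because the function reads from standard input, it can also be fed saved ps output when checking a system after the fact.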
Specifying an e-mail address using the mdadm.conf file
Using mdadm with the /etc/mdadm.conf file is very similar to the command line, except that the e-mail address is included in the mdadm.conf file. The following is an example of an mdadm.conf file:
#~~~~~~~~~~~~ Sample mdadm.conf file ~~~~~~~~~~~~~~~~~~~~~~~~
DEVICE partitions
ARRAY /dev/md0 level=raid1 UUID=1e60d34a:2900a5a6:016ce23d:edbe1177
ARRAY /dev/md1 level=raid1 UUID=b9db4840:b9f19361:ed0112d1:74f6071a
ARRAY /dev/md2 level=raid1 UUID=f6135aa0:dc21f04e:24d4c1e1:4fe7b596
MAILADDR jdoe@somemail.com
#~~~~~~~~~~~~ end of file ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The lines beginning with # were added for this documentation.
With the e-mail address in /etc/mdadm.conf, the command line is reduced to the following:
mdadm --monitor --scan --daemonize
This command can be added to /etc/init.d/boot.local so that mdadm runs every time the system is started.
NOTE: It has been found that mdadm will not send an e-mail if the DEVICE partitions line does not exist in the /etc/mdadm.conf file. If the file lacks these entries, a new /etc/mdadm.conf file can be created by using the following command (this overwrites any existing file):
mdadm --detail --scan > /etc/mdadm.conf
The MAILADDR line could then be added as well.
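Before starting the monitor, it may help to confirm that the file contains the entries discussed above. The following is a rough sketch of such a check. The function name check_mdadm_conf is hypothetical, and the sketch only matches the full uppercase keywords used in the sample file, even though mdadm.conf keywords are case insensitive and may be abbreviated.

```shell
#!/bin/bash
# Hypothetical helper: report which of the keywords used in the sample
# mdadm.conf are missing from a file. Returns 0 only if all are present.
check_mdadm_conf() {
    local conf=${1:-/etc/mdadm.conf} missing=0 keyword
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 2
    fi
    for keyword in DEVICE ARRAY MAILADDR; do
        # A keyword must start its line; lines starting with '#' are comments.
        if ! grep -Eq "^${keyword}[[:space:]]" "$conf"; then
            echo "missing: $keyword"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_mdadm_conf with no argument checks /etc/mdadm.conf; a file created only from mdadm --detail --scan output would be expected to report at least a missing MAILADDR line.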
Running an external program when an event occurs
Another option provided with the /etc/mdadm.conf file is to run an external application when an error is detected.
An example application could be something as simple as a script that causes messages to pop up on the screen when an event occurs. The following script is one example:
NOTE: The following script is for example purposes only and is NOT supported by SUSE.
#!/bin/bash
#
# mdadm RAID health check
#
# mdadm passes the event name as $1 and the affected device as $2
#
# Copy the positional parameters into readable variable names
event=$1
device=$2
# Check the event, then pop up a window with the appropriate message
if [ "$event" == "Fail" ]; then
    xmessage "A failure has been detected on device $device"
elif [ "$event" == "FailSpare" ]; then
    xmessage "A failure has been detected on spare device $device"
elif [ "$event" == "DegradedArray" ]; then
    xmessage "A Degraded Array has been detected on device $device"
elif [ "$event" == "TestMessage" ]; then
    xmessage "A Test Message has been generated on device $device"
fi
#~~~~~~~~~~~~~~~~~~~~~~~~~ End of Script ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To add an external program simply add the following line to the /etc/mdadm.conf file:
PROGRAM /etc/raid-events
Here, /etc/raid-events is the file that contains the script listed above. Ensure that the file is marked as executable, for example with chmod +x /etc/raid-events.
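On a server without a graphical session, a pop-up window may never be seen. The following is a sketch of an alternative handler that writes to syslog instead, assuming the logger utility is available. The function name format_raid_event and the raid-events tag are hypothetical. mdadm runs the program with the event name, the array device and, for some events, a component device as a third argument.

```shell
#!/bin/bash
# Sketch of a headless event handler (an alternative to the xmessage script).
# Arguments from mdadm: $1 event name, $2 md device, $3 optional component.
format_raid_event() {
    local event=$1 array=$2 component=${3:-}
    local msg="mdadm event $event on $array"
    if [ -n "$component" ]; then
        msg="$msg (component $component)"
    fi
    printf '%s\n' "$msg"
}

# When called with at least an event and a device, log the message to syslog.
if [ "$#" -ge 2 ]; then
    logger -t raid-events "$(format_raid_event "$@")"
fi
```

The handler can be tried by hand before mdadm ever calls it, for example by running it with the arguments TestMessage /dev/md0 and then looking for the raid-events tag in the system log.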
Testing the configuration to ensure that e-mails are sent
After everything has been set up, you can verify that the e-mail alerts are sent and can be received by running mdadm in test mode. This can be accomplished by doing the following:
- Open a terminal window and type su to log in as root
- Type mdadm --monitor --scan --test
An e-mail should be received for each array device listed in the /etc/mdadm.conf file.
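Since one test e-mail is expected per array, counting the ARRAY lines gives the number of messages to look for. The following is a small sketch under that assumption. The function name expected_test_mails is hypothetical, and only the uppercase ARRAY keyword from the sample file is matched.

```shell
#!/bin/bash
# Hypothetical helper: count the ARRAY lines in an mdadm.conf-style file,
# which is the number of test e-mails to expect.
expected_test_mails() {
    local conf=${1:-/etc/mdadm.conf}
    grep -Ec '^ARRAY[[:space:]]' "$conf"
}

# Possible usage, as root (the --oneshot option is not part of the original
# steps; it makes the monitor check the arrays once and then exit):
#   expected_test_mails
#   mdadm --monitor --scan --test --oneshot
```

For the sample mdadm.conf shown earlier, which lists md0, md1, and md2, the count would be three.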
If e-mails are not received, the /var/log/mail* files can be used to help debug why the failure occurred. The most common cause is that the e-mail address is being blocked by the receiving gateway.
Also ensure that postfix is installed on the system, because mdadm uses postfix to send the e-mails.
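When searching the mail logs, filtering by the recipient address narrows the output considerably. The following is a minimal sketch. The function name mail_log_lines_for is hypothetical, and the limit of 20 lines is an arbitrary choice.

```shell
#!/bin/bash
# Hypothetical helper: print the last 20 lines in the given log files that
# mention a recipient address.
mail_log_lines_for() {
    local address=$1
    shift
    # -h: omit file names; -F: treat the address as a fixed string, not a regex
    grep -hF -- "$address" "$@" 2>/dev/null | tail -n 20
}

# Possible usage, as root:
#   mail_log_lines_for jdoe@somemail.com /var/log/mail*
```

If the function prints nothing after a test run, the message most likely never reached the local mail system, which points back to the mdadm configuration or a missing postfix installation.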
Resolution
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7001034
- Creation Date: 25-Jul-2008
- Modified Date: 12-Mar-2021
- SUSE Linux Enterprise Desktop
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com