Using smartmontools to detect impending hard disk failure
This document (7004508) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server 10
SUSE Linux Enterprise Desktop 11
SUSE Linux Enterprise Desktop 10
SUSE Linux Enterprise Server 9
Situation
The smartd daemon has reported drive errors in /var/log/messages.
You notice errors after running the smartctl --all command on one or more disk devices.
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:     132089        0         0    132089     132089        850.315           0
write:         0        0         0         0          0        415.707           0

Non-medium error count:    90680
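The counters in an error log like the one above can be pulled out with a short script when several disks have to be checked. The following is only a minimal sketch: the function name summarize_error_counters is hypothetical, and it assumes the SCSI-style "Error counter log" layout shown here, with total errors corrected in the fourth numeric column and total uncorrected errors in the last one.

```shell
# Hypothetical helper: summarize the "Error counter log" section of
# smartctl --all output read from standard input.
# Assumes the SCSI-style layout shown in this document.
summarize_error_counters() {
    awk '
        # Data rows: label, fast, delayed, rewrites, corrected,
        # invocations, gigabytes, uncorrected
        $1 == "read:" || $1 == "write:" {
            sub(":", "", $1)
            printf "%s corrected=%s uncorrected=%s\n", $1, $5, $8
        }
        # The non-medium error count is the last field on its line
        /Non-medium error count:/ {
            printf "non-medium=%s\n", $NF
        }
    '
}
```

A possible invocation, where /dev/sda is only a placeholder device name, is: smartctl --all /dev/sda | summarize_error_counters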
After running supportconfig -o SMART you see disk errors in the fs-smartmon.txt file.
Resolution
Additional Information
WARNING: To prevent system hangs caused by buggy devices, smartd is turned off by default, or smartmontools is not installed at all. Test smartd manually before enabling it through the Runlevel Editor or with /sbin/chkconfig --add smartd.
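One way to test smartd manually is its one-time check mode (-q onecheck), which registers the devices, checks them once, and exits. The sketch below wraps that call. The function name try_smartd_once and the SMARTD override variable are hypothetical, and the command is assumed to be run as root on a system where smartmontools is installed.

```shell
# Hypothetical helper: run smartd once in the foreground and report the result.
# SMARTD is an illustrative override for the binary; it defaults to "smartd".
try_smartd_once() {
    if "${SMARTD:-smartd}" -q onecheck; then
        echo "smartd one-time check passed"
    else
        status=$?
        echo "smartd one-time check failed (exit status $status)" >&2
        return "$status"
    fi
}
```

If the check passes and the system stays responsive, the service could then be enabled, for example with: try_smartd_once && /sbin/chkconfig --add smartd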
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7004508
- Creation Date: 23-Sep-2009
- Modified Date: 06-Mar-2021
- SUSE Linux Enterprise Desktop
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com