As noted in Disk Goes Boom, one of my colocated machines had a nasty disk failure. The disks arrived today. I hope to figure out how bad the damage is, replace the bad disk, and ship them back to get installed.

In the meantime, I've done something that I should have done a year ago. I installed the smartsuite package on my two remaining machines. It comes with a command-line tool named smartctl that provides various options for poking and prodding at your SMART-aware disks. (You can read more about S.M.A.R.T. technology here.) It also comes with a daemon that keeps an eye on the health of your disks and puts messages in syslog to let you know what's up with them.

Now all I need to do is figure out which messages to watch for in syslog. Once I do, I'll set up a cron job to alert me if any problems show up.
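Here is a rough sketch of what that cron job might run. It is only a guess at this point: I am assuming the daemon tags its syslog lines with "smartd", and the keyword list is a placeholder I made up, not the actual messages the daemon emits. The script name and the log path are assumptions too.

```python
import re
import sys

# Assumption: the daemon's syslog lines carry a "smartd" tag,
# optionally followed by a PID in brackets, e.g. "smartd[123]:".
SMARTD_TAG = re.compile(r"\bsmartd(\[\d+\])?:")

# Hypothetical keywords. Replace these once the real messages are known.
ALERT_WORDS = ("fail", "error", "unreadable", "uncorrectable")


def suspicious_lines(lines):
    """Return the smartd lines that mention any alert keyword."""
    hits = []
    for line in lines:
        if not SMARTD_TAG.search(line):
            continue  # not from the SMART daemon
        lowered = line.lower()
        if any(word in lowered for word in ALERT_WORDS):
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    # Usage (hypothetical script name): python smart_watch.py /var/log/syslog
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/syslog"
    with open(path, errors="replace") as log:
        for hit in suspicious_lines(log):
            print(hit)
```

Since cron normally mails any output a job produces to the job's owner, printing the matching lines should be enough to get an alert, as long as mail delivery works on the box. One weakness of this sketch: it rescans the whole log on every run, so the same line will trigger repeated mail until the log rotates.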

Posted by jzawodn at February 07, 2003 10:38 PM

Reader Comments
# Gerald said:

Hey, linking to Active Smart - that's interesting. Their software is good, but it only runs on Microsoft operating systems and it is not free. I tested the trial version a few weeks ago because of a hard disk failure reported by the BIOS. Newer BIOSes have a SMART option, but it gets annoying once the alarm is switched on.

on February 9, 2003 01:45 PM
# Justin said:

Ha, you too ;)

I haven't had the failure yet, but I've been getting some scary clicking noises. smartctl rocks -- it told me not to worry too much, and make sure I make some backups. I'm paraphrasing, but basically it indicated minor issues and nothing serious....

on February 10, 2003 11:08 AM
# x said:

Nice!

on February 12, 2003 07:20 AM
# Bruce Allen said:

You might want to have a look at a more recent version of smartctl and smartd:
http://smartmontools.sourceforge.net/
Among other things, these let you run self-tests on the disk and monitor the results.

Debian versions are available -- see the URL above for links.

on February 25, 2003 06:32 AM
# Mike said:

Instead of monitoring the log with cron, why not just (from default)
edit
/etc/default/smartmontools
start_smartd=yes <--uncomment this only

then edit
/etc/smartd.conf

DEVICESCAN <--comment out
/dev/hda -H -m admin@example.com <-- add this line

then;
/etc/init.d/smartmontools restart

That way it silently checks the drives until one of them reports an error. If a drive has an error, you get an email about it.

on September 3, 2004 03:04 AM
# Mike said:

ooops, your blog busted with certain chars.

from default debian packages
edit /etc/default/smartmontools
uncomment start_smartd=yes
edit /etc/smartd.conf
comment out DEVICESCAN
add a line /dev/hda -H -m admin@example.com

on September 3, 2004 03:09 AM