As noted in Disk Goes Boom, one of my colocated machines had a nasty disk failure. The disks arrived today. I hope to figure out how bad the damage is, replace the bad disk, and ship them back to get installed.
In the meantime, I've done something that I should have done a year ago. I installed the smartsuite package on my two remaining machines. It comes with a command-line tool named smartctl that provides various options for poking and prodding at your SMART-aware disks. (You can read more about S.M.A.R.T. technology here.) It also comes with a daemon that keeps an eye on the health of your disks and writes messages to syslog to let you know what's up with them.
Now all I need to do is figure out which messages to watch for in syslog. Once I do, I'll set up a cron job to alert me if any problems show up.
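A minimal sketch of what such a cron-driven check could look like, in Python. Everything specific here is an assumption rather than something taken from the smartsuite documentation: the daemon tag ("smartd"), the keyword list, and the function name find_smart_warnings are placeholders to be adjusted once the real messages are known.

```python
import re
import sys

# Assumed keywords; the daemon's actual wording may differ, so adjust
# this pattern once the real syslog messages have been observed.
SUSPICIOUS = re.compile(r"fail|error|reallocated", re.IGNORECASE)


def find_smart_warnings(lines, daemon="smartd"):
    """Return lines logged by `daemon` that contain a suspicious keyword.

    The daemon tag "smartd" is an assumption about how the SMART
    monitoring daemon identifies itself in syslog.
    """
    hits = []
    for line in lines:
        if daemon in line and SUSPICIOUS.search(line):
            hits.append(line.rstrip("\n"))
    return hits


# Only runs when a log path is given on the command line,
# e.g. from a cron entry.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as log:
        warnings = find_smart_warnings(log)
    if warnings:
        # cron mails any output a job produces to the job's owner,
        # so printing the matches is enough to trigger an alert.
        print("\n".join(warnings))
        sys.exit(1)
```

Run from cron with the syslog file as its only argument, the script stays silent when nothing matches, so mail only arrives when there is something to look at. It rescans the whole file on every run, so a real setup would want to remember how far it had already read.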
Posted by jzawodn at February 07, 2003 10:38 PM