As noted in Disk Goes Boom, one of my colocated machines had a nasty disk failure. The disks arrived today. I hope to figure out how bad the damage is, replace the bad disk, and ship them back to get installed.

In the meantime, I've done something that I should have done a year ago. I installed the smartsuite package on my two remaining machines. It comes with a command-line tool named smartctl that provides various options for poking and prodding at your SMART-aware disks. (You can read more about S.M.A.R.T. technology here.) It also comes with a daemon that keeps an eye on the health of your disks and writes messages to syslog to let you know what's up with them.

Now all I need to do is figure out which messages to watch for in syslog. Once I do, I'll set up a cron job to alert me if any problems show up.
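
Here is a minimal sketch of what that cron job might run. It is written under assumptions, not from the smartsuite documentation: I'm guessing that the daemon's syslog lines contain the string "smartd", that the log lives at /var/log/syslog, and that the keyword list below is a reasonable first cut at "worrying" messages. The function name and the pattern are made up for illustration and would need adjusting once the real messages are known.

```python
import re
import sys

# Assumption: these keywords are only a guess at what a worrying smartd
# message might contain; tune the list after looking at real log lines.
SUSPECT = re.compile(r"fail|error|pending|uncorrect", re.IGNORECASE)


def suspicious_smartd_lines(lines):
    """Return lines that mention smartd and match one of the keywords."""
    return [
        line.rstrip("\n")
        for line in lines
        if "smartd" in line and SUSPECT.search(line)
    ]


def main(path="/var/log/syslog"):  # assumed log location; varies by system
    try:
        with open(path, errors="replace") as log:
            hits = suspicious_smartd_lines(log)
    except OSError as exc:
        # An unreadable log is worth hearing about too.
        print(f"could not read {path}: {exc}", file=sys.stderr)
        return
    if hits:
        print("\n".join(hits))


if __name__ == "__main__":
    main()
```

The script prints nothing when all is well. Since cron normally mails any output of a job to the crontab's owner, a silent run means no email and a noisy run means an alert. Note that it rescans the whole log each time, so the same line will be reported on every run until the log rotates.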

Posted by jzawodn at February 07, 2003 10:38 PM


Reader Comments
# Gerald said:

Hey, linking to Active Smart, that's interesting. Their software is good, but it only runs on Microsoft operating systems and it is not free. I tested the trial version a few weeks ago after my BIOS reported a hard disk failure. Newer BIOSes have a SMART option, but it gets annoying once the alarm is switched on.

on February 9, 2003 01:45 PM
# Justin said:

Ha, you too ;)

I haven't had the failure yet, but I've been getting some scary clicking noises. smartctl rocks -- it told me not to worry too much, and make sure I make some backups. I'm paraphrasing, but basically it indicated minor issues and nothing serious....

on February 10, 2003 11:08 AM
# x said:

Nice!

on February 12, 2003 07:20 AM
# Bruce Allen said:

You might want to have a look at a more recent version of smartctl and smartd:
http://smartmontools.sourceforge.net/
Among other things, these let you run self-tests on the disk and monitor the results.

Debian versions are available -- see the URL above for links.

on February 25, 2003 06:32 AM
# Mike said:

Instead of monitoring the log with cron, why not just (from default)
edit
/etc/default/smartmontools
start_smartd=yes <--uncomment this only

then edit
/etc/smartd.conf

DEVICESCAN <--comment out
/dev/hda -H -m admin@example.com <-- add this line

then;
/etc/init.d/smartmontools restart

That way it silently checks the drives until something reports an error. If a drive has an error, you get an email about it.

on September 3, 2004 03:04 AM
# Mike said:

Oops, your blog choked on certain characters.

From the default Debian packages:
edit /etc/default/smartmontools
uncomment start_smartd=yes
edit /etc/smartd.conf
comment out DEVICESCAN
add a line /dev/hda -H -m admin@example.com

on September 3, 2004 03:09 AM