A funny thing happened today. Something we can all learn from.
In the last week, I've been helping some folks at work do some performance testing and tuning with MySQL. One group's problem seems to be solved. The other, however, was running into pretty poor performance. Today one of them IM'd me (is that a verb now?) with some concern. He was seeing swapping on the machine. And it was really slow.
After being interrupted by a few phone calls, I asked how much memory was in the box. 5GB he tells me. Okay, that should be more than sufficient. At that point we talked about his memory settings in MySQL. He had a reasonably sized innodb_buffer_pool. I think it was 1.5GB or so.
After a bit of thinking, I realized that there was something really wacky going on. There had to be. He sent me the output of top and it showed that mysqld was indeed using about 1.4GB of RAM. Not much else. Hmm.
That blew my only theory. I figured that there were some other random memory intensive processes running on the box. But no, nothing.
It was at this point that I was completely out of ideas. The data made no sense, so he was clearly not telling me something. Not that he was hiding information; he simply wasn't seeing it, and I was mostly relying on his descriptions.
So I got a login on the machine... and found the problem in about 45 seconds.
The machine had 512MB (or 0.5GB) of RAM, not 5GB. It was swapping because, really, that's what he had told it to do.
I started by verifying the basic assumptions. I looked at what processes were running, how much disk space was on the box, how much physical RAM it had, and... that was it. I was done.
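For what it's worth, that particular check is easy to script. What follows is only a rough sketch, not what I actually ran on the box: it assumes a Linux machine where /proc/meminfo reports a MemTotal line in kB, and the function names (mem_total_mb, buffer_pool_fits) are made up for illustration. The 1536MB figure stands in for the "1.5GB or so" buffer pool mentioned above.

```python
import re


def mem_total_mb(meminfo_text):
    """Return physical RAM in MB, parsed from /proc/meminfo-style text.

    Assumes a line of the form "MemTotal:  <number> kB", as Linux reports it.
    """
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no MemTotal line found")
    return int(match.group(1)) // 1024


def buffer_pool_fits(meminfo_text, buffer_pool_mb):
    """Return True if the configured buffer pool is smaller than physical RAM.

    This is a crude sanity check only: it ignores everything else
    that needs memory (the OS, other buffers, other processes).
    """
    return buffer_pool_mb < mem_total_mb(meminfo_text)


# Hypothetical sample resembling a 512MB machine.
sample = "MemTotal:       524288 kB\nMemFree:         10240 kB\n"
print(mem_total_mb(sample))            # physical RAM in MB
print(buffer_pool_fits(sample, 1536))  # does a 1.5GB buffer pool fit?
```

On a real Linux box you would pass in the contents of /proc/meminfo instead of the sample string. The point is not the script itself but that the number comes from the machine rather than from anyone's memory of it.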
(If you think I'm picking on this guy or making fun of him, you're going to completely miss the point, so stop now and leave no comments please.)
The Moral of the Story
We've all been there before. You know, things simply don't make a damned bit of sense when you're debugging some weird-ass problem or piece of code. That's when you really need a second set of eyes, ears, or both.
A tactic I've used before (when facing many strange problems in my code) is to bug someone else to come over so that I can explain to them how it works. Four times out of five, as I'm explaining it I figure out the bug. The other one time? The guy (or gal) I'm explaining it to finds some really stupid, basic thing I'm doing wrong. (Like misreading memory info.)
We all do this.
These sanity checks (or something like them) are vital to figuring out computer-related problems. And I'm sure they're just as critical in so many other detail-oriented pursuits: science, engineering, medicine, detective work, and so on.
The biggest problem that I seem to have with them is not doing them soon enough.
Are there other sanity check strategies you've found useful? I'd love to hear about 'em...
Posted by jzawodn at January 12, 2004 09:51 PM