I spent a fair amount of time on Friday trying to figure out why our FreeBSD servers running MySQL 4.0.2 were doing so much better than our Linux servers running MySQL 4.0.2. They're all slaves of the same 3.23.51 master and get roughly equal query loads, thanks to our Alteon load-balancers (yes, the ones that occasionally stop working right).
What I noticed while watching each of them with mytop is that the Linux boxes seem to have far more slow queries than the FreeBSD boxes. Now the FreeBSD boxes in question are newer. They're Compaq DL-380s with dual 1.2 GHz CPUs, 2GB of RAM, and 6 36GB SCSI disks. The Linux boxes are a bit older and slower. But the difference was still surprising. Over the last 24 hours, the FreeBSD boxes had each logged 3 slow queries, while the Linux boxes had logged a few thousand of them. Clearly something was up.
So I got on the boxes and noticed something odd. The load average on the Linux machines was higher than I'd expect. Rather than being in the 0.5 - 2.0 range, it was hitting between 7 and 9 during busy times. Odd. I ran top for a while to see if I noticed anything odd. Sure enough, after a few minutes, I found the pattern. The kswapd process was using up a fair amount of CPU time--sometimes as much as 99% of one CPU.
It gets more interesting. Both Linux boxes have swap disabled. It's been that way ever since I got sick of dealing with the 2.4 kernel's brain-dead virtual memory system last year. Why would kswapd even be running on a system with no swap? I have no idea.
But I decided to do some research and see if anyone had seen this before. The closest I got was this message on the linux-kernel mailing list, a complaint by MySQL AB's own Sascha Pachev.
He noted similarly odd behavior and asked that Rik look into it. Unfortunately, I haven't been able to find any follow-up messages.
So I went back to looking at the configuration on the machines in question. Both have 2GB of RAM, roughly half of which is for MySQL. I have the key_buffer set to 512M, and the InnoDB buffer pool (innodb_buffer_pool_size) is set to 512M as well. That leaves 1GB for the OS cache, buffers, and related stuff. It should be more than enough, shouldn't it?
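The arithmetic behind that 1GB figure can be sketched in a few lines of Python. The sizes come from the configuration described above; the function name is made up for illustration, and the sketch assumes the two global buffers are the only large MySQL allocations (per-thread buffers and other overhead are ignored).

```python
def memory_left_for_os(total_mb, mysql_buffers_mb):
    """Return the MB left over for the OS after MySQL's large global buffers."""
    return total_mb - sum(mysql_buffers_mb.values())

# 2GB of RAM, with 512M each for the MyISAM key buffer and the InnoDB buffer pool.
buffers = {"key_buffer": 512, "innodb_buffer_pool": 512}
left = memory_left_for_os(2048, buffers)
print(left)  # 1024
```

This is only a rough budget. The real footprint of mysqld will be somewhat larger than the sum of these two settings.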
Just for the heck of it, I backed both values down to 384M and restarted MySQL. After an hour or so, things began to look bleak again. Lots of slow queries and the kswapd process (actually a kernel thread) was getting more CPU time than I'd like. It was at this point that I really began to marvel at the situation. The FreeBSD VM subsystem never does stupid things like this. In fact, our MySQL/FreeBSD boxes rarely swap unless I do something really stupid. How can the one in Linux be this much worse? Beats me.
Anyway, even more frustrated, I decided to re-enable swap and reboot the machine. At this point, I had little to lose. Once it came back up and I got MySQL started, things looked okay. kswapd wasn't as busy, and there were fewer slow queries. In fact, after 1 day and 9 hours, the server has only logged 66 slow queries. But according to top there's about 47MB of swap in use. The resident size of mysqld is 736MB, while its overall size is 816MB. Apparently the kernel swapped out part of the buffer pool for InnoDB or the MyISAM key buffer.
I guess that extra gig of memory isn't enough for it.
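For anyone who wants to track the swap figure without watching top, here is a minimal sketch that computes it from the SwapTotal and SwapFree fields of /proc/meminfo. The function names and the sample text are made up for illustration (the numbers were chosen to give 47MB), and the parser assumes the usual "Key: value kB" line format.

```python
def parse_meminfo(text):
    """Collect the first number on each 'Key: value' line of /proc/meminfo."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip lines without a colon or without a leading numeric value.
        if sep and parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields

def swap_used_mb(fields):
    """Swap in use, in MB, from the SwapTotal and SwapFree values (kB)."""
    return (fields["SwapTotal"] - fields["SwapFree"]) / 1024.0

# Made-up sample; on a real box, read the text from /proc/meminfo instead.
sample = """MemTotal:  2064000 kB
SwapTotal: 1048576 kB
SwapFree:  1000448 kB
"""
print(swap_used_mb(parse_meminfo(sample)))  # 47.0
```

On a live machine, something like `swap_used_mb(parse_meminfo(open("/proc/meminfo").read()))` should give the same number top reports, assuming both fields are present.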
I fail to understand what it's doing. But the machine seems to perform better with swap enabled. The only theory I've developed so far goes like this: With swap disabled, the kernel (being very stupid) goes looking for pages that it can swap out. It finds them but cannot swap them to disk. Next time around, it repeats this process, never realizing how futile it is. With swap finally enabled, it can swap out some memory and get the breathing room that it thinks it needs.
If anyone has hints on how this can be tuned (like telling the kernel not to bother), I'd LOVE to hear about it.
Linux may have FreeBSD beat when it comes to threading, but it sure could learn a lot from FreeBSD when it comes to virtual memory management.
Update #2: Allow me to respond to some feedback that I've seen so far. First off, we've been running 2.4.18 for quite a while now. We started with 2.4.9, tried 2.4.12 and 2.4.16. There's only so much time I can spend switching kernel versions and re-testing. Now that 2.4.19 is out, we'll give it a shot.
A few folks have suggested that since FreeBSD is the best tool for the job, I should just shut up and use it. If only that were the case. I'll post another entry in a few days detailing the problems with running a high-volume MySQL server on FreeBSD. It has issues of its own, mostly related to FreeBSD's poor threads implementation.
Thanks for all the feedback so far. Some of it looks promising. The flames, however, are simply ignored.
Posted by jzawodn at August 04, 2002 02:42 AM