August 26, 2003

gzip vs. bzip2

I've been doing a bit of compression performance testing related to some possible MySQL development (think "better compressed MyISAM tables") and was shocked at the difference between gzip and bzip2.

Given a MyISAM data file of roughly 2,661,512KB (or 2.5GB), I compressed it with both gzip and bzip2, using their respective -9 options to achieve maximal compression, and timed each. I did this twice. I then decompressed the resulting file to stdout, sent the output to /dev/null, and timed that too. The times are in mm:ss and the sizes are in KB.

         comp time   comp. size   decomp time
gzip     14:31       349,736      0:55
bzip2    39:44       275,344      9:46
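
For anyone who wants to try a similar comparison on their own data, here's a minimal sketch. It is not the procedure used for the numbers above (those came from the gzip and bzip2 command-line tools); it assumes Python's standard gzip and bz2 modules at compression level 9 instead, and it works on bytes held in memory, so it only suits files far smaller than 2.5GB. The names benchmark and run and the sample data are made up for illustration.

```python
import bz2
import gzip
import time


def benchmark(name, compress, decompress, data):
    """Time one compress/decompress round trip and record the compressed size."""
    start = time.perf_counter()
    packed = compress(data)
    comp_seconds = time.perf_counter() - start

    start = time.perf_counter()
    unpacked = decompress(packed)
    decomp_seconds = time.perf_counter() - start

    # Sanity check: the round trip must give back the original bytes.
    assert unpacked == data
    return {
        "name": name,
        "comp_seconds": comp_seconds,
        "size_bytes": len(packed),
        "decomp_seconds": decomp_seconds,
    }


def run(data):
    """Benchmark gzip and bzip2, both at their maximum level (9)."""
    return [
        benchmark("gzip", lambda d: gzip.compress(d, compresslevel=9),
                  gzip.decompress, data),
        benchmark("bzip2", lambda d: bz2.compress(d, compresslevel=9),
                  bz2.decompress, data),
    ]


if __name__ == "__main__":
    # Stand-in data (an assumption); swap in the bytes of a real file
    # to get numbers that mean something.
    sample = b"id,name,created\n" + b"42,example row,2003-08-26\n" * 20_000
    for row in run(sample):
        print(f'{row["name"]:6} comp {row["comp_seconds"]:.3f}s  '
              f'size {row["size_bytes"]} bytes  '
              f'decomp {row["decomp_seconds"]:.3f}s')
```

Absolute times from this will not match the command-line tools, and highly repetitive sample data compresses far better than a real table would, so treat the output as a rough relative comparison only.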

Needless to say, I was blown away by the results. It's clear that bzip2 produces smaller compressed files, but it does so at a very big cost in time--especially if you're thinking of using it in an application that requires frequent decompression. It's one thing to compress files and not look at them again for a few years. But we're talking about compressed tables that'd see lots of use.
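
To put rough numbers on that cost, here's a small sketch that does the arithmetic on the figures from the table. The only inputs are the measurements above; the helper names to_seconds and summarize are made up for illustration.

```python
ORIGINAL_KB = 2_661_512  # size of the uncompressed MyISAM data file

# Measurements copied from the table above (times in mm:ss, sizes in KB).
RESULTS = {
    "gzip": {"comp": "14:31", "size_kb": 349_736, "decomp": "0:55"},
    "bzip2": {"comp": "39:44", "size_kb": 275_344, "decomp": "9:46"},
}


def to_seconds(mmss):
    """Convert an 'mm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def summarize(results, original_kb):
    """Compare bzip2 against gzip on size and time."""
    gz, bz = results["gzip"], results["bzip2"]
    return {
        # Compressed size as a fraction of the original file.
        "gzip_ratio": gz["size_kb"] / original_kb,
        "bzip2_ratio": bz["size_kb"] / original_kb,
        # How much smaller the bzip2 output is than the gzip output.
        "bzip2_size_saving": 1 - bz["size_kb"] / gz["size_kb"],
        # How many times longer bzip2 takes than gzip.
        "comp_slowdown": to_seconds(bz["comp"]) / to_seconds(gz["comp"]),
        "decomp_slowdown": to_seconds(bz["decomp"]) / to_seconds(gz["decomp"]),
    }


if __name__ == "__main__":
    for key, value in summarize(RESULTS, ORIGINAL_KB).items():
        print(f"{key}: {value:.3f}")
```

By this arithmetic the bzip2 file is about 21% smaller than the gzip one, but it took roughly 2.7 times as long to compress and between 10 and 11 times as long to decompress.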

Wow.

What about myisampack and MySQL's compressed tables? We tried that already. The resulting file is 921,888KB (900MB). We need to do quite a bit better than that.

Posted by jzawodn at 04:26 PM

Yahoo! News RSS Feeds Launched

This has been in the works for a while and it's finally up for real. Visit http://news.yahoo.com/rss for details.

RSS is alive and well at Yahoo. Watch for more in the future. :-)

Congrats and thanks to Jeff and team for making it happen!

Posted by jzawodn at 02:01 PM