Notes from Brad's LiveJournal talk at the 2004 MySQL User's Conference...

Brad walks through the evolution of LiveJournal from a small college project on one server to a large multi-machine, replicated, load-balanced backend. The evolution looks like the sort of thing I've seen play out many times now.

He eventually hit the write wall on replication, so they had to go from a single master with many slaves to smaller clusters. Each user lives in their own cluster, which contains a master and several slaves. There's still a single "global cluster" that's used to map users to clusters. This scales much better.
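
A minimal sketch of what that user-to-cluster mapping might look like in Perl/DBI (hostnames, table, and column names are my guesses, not LJ's actual schema):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Connect to the global cluster, which knows where each user lives.
    my $global = DBI->connect("DBI:mysql:database=global;host=global-master",
                              "lj", "secret", { RaiseError => 1 });

    sub dbh_for_user {
        my ($userid) = @_;

        # Map the user to a cluster via the global cluster.
        my ($clusterid) = $global->selectrow_array(
            "SELECT clusterid FROM usermap WHERE userid = ?", undef, $userid);

        # Connect to that cluster's master for writes.
        return DBI->connect("DBI:mysql:database=lj;host=cluster$clusterid-master",
                            "lj", "secret", { RaiseError => 1 });
    }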

This caused a few problems. Auto-increments were no longer globally unique, and giving each cluster its own number space would make it hard to migrate users to other clusters. So instead each user gets their own number space, which they implemented with multi-column primary keys.
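
With a composite primary key, IDs only have to be unique within one user, so a user's rows can move between clusters without colliding. A sketch with hypothetical table names, using MySQL's LAST_INSERT_ID(expr) trick to atomically bump a per-user counter:

    # $dbh is the user's cluster master (see the earlier sketch).
    my $userid = 42;

    # The primary key is (userid, jitemid), not a global auto-increment.
    $dbh->do(q{
        CREATE TABLE log (
            userid   INT UNSIGNED NOT NULL,
            jitemid  INT UNSIGNED NOT NULL,
            posttime DATETIME     NOT NULL,
            PRIMARY KEY (userid, jitemid)
        )
    });

    # Atomically grab the next per-user ID from a counter table.
    # LAST_INSERT_ID(expr) makes the new value readable on this
    # connection without a second round trip.
    $dbh->do("UPDATE counter SET max = LAST_INSERT_ID(max + 1) WHERE userid = ?",
             undef, $userid);
    my $jitemid = $dbh->{mysql_insertid};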

LJ is always fighting a battle against I/O or CPU bottlenecks. They split MyISAM tables into multiple tables and databases to get a bit more concurrency.
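
The win comes from MyISAM's table-level locking: N separate tables means N separate locks, so unrelated writes stop queueing behind each other. A hypothetical way to spread one hot table across shards (not LJ's actual scheme):

    my $SHARDS = 16;

    sub table_for_user {
        my ($userid) = @_;
        return "log" . ($userid % $SHARDS);   # log0 .. log15
    }

    # $dbh is a connected DBI handle.
    my $table = table_for_user($userid);
    $dbh->do("INSERT INTO $table (userid, posttime) VALUES (?, NOW())",
             undef, $userid);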

Amusing machine names--mostly from South Park and various meats.

Eventually they moved to a master-master setup to reduce the impact of having so many single points of failure. This makes a lot of their maintenance easier, but they have to be careful about conflict resolution.
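
The notes don't say how LJ avoided write conflicts; one standard trick for master-master is to give each master a disjoint auto-increment sequence. MySQL 5.0 added server variables for exactly this, so the sketch below is the generic technique, not necessarily what LJ did at the time:

    # Master A hands out odd IDs, master B even ones, so both
    # masters can accept inserts without key collisions.
    # auto_increment_increment / auto_increment_offset are real
    # MySQL 5.0+ server variables.
    $dbh_a->do("SET GLOBAL auto_increment_increment = 2");
    $dbh_a->do("SET GLOBAL auto_increment_offset    = 1");

    $dbh_b->do("SET GLOBAL auto_increment_increment = 2");
    $dbh_b->do("SET GLOBAL auto_increment_offset    = 2");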

They use Akamai for static content (images, css, etc).

LiveJournal uses both InnoDB and MyISAM, picking the right one for the job and trying not to use any features specific to either one of 'em. Brad recommends designing for both. (I agree in some cases and disagree in others...)

Email done via Postfix + MySQL rather than static DBM files that need to be rebuilt. Each Postfix server gets its own MySQL install.
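
A Postfix MySQL lookup map looks roughly like this (this uses the newer query syntax, and the file, schema, and query are my guesses at a typical setup, not LJ's actual config):

    # /etc/postfix/mysql-aliases.cf (hypothetical)
    user     = postfix
    password = secret
    hosts    = localhost
    dbname   = mail
    query    = SELECT destination FROM aliases WHERE address = '%s'

main.cf then points at it with something like virtual_alias_maps = mysql:/etc/postfix/mysql-aliases.cf, and alias changes take effect immediately, with no postmap rebuild step.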

LJ logs Apache hits in MySQL too. They use MyISAM and INSERT DELAYED with a big delayed insert buffer. Their proxy boxes use mod_proxy + mod_rewrite, and mod_rewrite talks to an external program that picks a server based on how busy the backend servers are.
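
INSERT DELAYED is a MyISAM-specific feature: the client doesn't wait for the row to hit disk, and the "big delayed insert buffer" would be the delayed_queue_size server variable. A sketch with a hypothetical table:

    # Hypothetical access-log table. MyISAM only: DELAYED doesn't
    # work with InnoDB.
    $dbh->do(q{
        CREATE TABLE access_log (
            logtime DATETIME     NOT NULL,
            vhost   VARCHAR(64)  NOT NULL,
            url     VARCHAR(255) NOT NULL,
            status  SMALLINT     NOT NULL
        ) ENGINE=MyISAM
    });

    # INSERT DELAYED queues the row in memory and returns at once;
    # the server batches queued rows into the table in the background.
    $dbh->do("INSERT DELAYED INTO access_log (logtime, vhost, url, status)
              VALUES (NOW(), ?, ?, ?)",
             undef, $vhost, $url, $status);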

Brad is writing a proxy in C# (for fun, apparently) to replace some of this, maybe.

Uses DBI::Role to get db handles in Perl.

LJ relies pretty heavily on caching nowadays. None of the stuff in MySQL was quite what they needed, so they built memcached, which is now used by LJ, Slashdot, Wikipedia, and others. The original version was in Perl; it's now written in C. Lots of O(1) operations inside make it quite fast. The client can do multi-server parallel fetching (kick ass!). They run multiple instances on boxes with more than 4GB of RAM and see a 90-93% hit rate on the cache.
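
Brad's Perl client (Cache::Memcached) shows the multi-server, parallel-fetch usage. A minimal cache-aside sketch (server addresses, key names, and the user table are mine):

    use Cache::Memcached;

    # One client object; keys are hashed across all the servers.
    # Two instances on the second box model a >4GB machine running
    # multiple memcacheds.
    my $memd = Cache::Memcached->new({
        servers => [ "10.0.0.1:11211",
                     "10.0.0.2:11211", "10.0.0.2:11212" ],
    });

    # Classic cache-aside: try the cache, fall back to MySQL, then
    # repopulate. ($dbh is a connected DBI handle.)
    sub load_user {
        my ($userid) = @_;
        my $user = $memd->get("user:$userid");
        return $user if $user;

        $user = $dbh->selectrow_hashref(
            "SELECT * FROM user WHERE userid = ?", undef, $userid);
        $memd->set("user:$userid", $user, 3600);   # expire in an hour
        return $user;
    }

    # get_multi is the parallel fetch: one round trip per server
    # involved, not one per key.
    my $users = $memd->get_multi(map { "user:$_" } 1 .. 50);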

They sniff queries off the wire using Net::Pcap rather than having to stop/start MySQL just to toggle logging. (Nice!)
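
A rough sketch of that kind of sniffer using Net::Pcap plus the NetPacket modules (the interface name is mine, and real MySQL protocol parsing would need TCP reassembly and multi-packet handling, which this ignores):

    use strict;
    use warnings;
    use Net::Pcap;
    use NetPacket::Ethernet;
    use NetPacket::IP;
    use NetPacket::TCP;

    my $err;
    # Capture full packets on eth0 in promiscuous mode.
    my $pcap = Net::Pcap::open_live("eth0", 65535, 1, 0, \$err)
        or die "pcap: $err";

    # Only watch traffic headed to MySQL.
    my $filter;
    Net::Pcap::compile($pcap, \$filter, "dst port 3306", 1, 0) == 0
        or die "bad filter";
    Net::Pcap::setfilter($pcap, $filter);

    Net::Pcap::loop($pcap, -1, \&got_packet, undef);

    sub got_packet {
        my ($user, $hdr, $pkt) = @_;
        my $ip  = NetPacket::IP->decode(NetPacket::Ethernet->decode($pkt)->{data});
        my $tcp = NetPacket::TCP->decode($ip->{data});
        my $payload = $tcp->{data};
        return unless length($payload) > 5;

        # MySQL wire format: 3-byte length, 1-byte sequence, then the
        # command byte. Command 3 is COM_QUERY; the query text follows.
        my $cmd = ord(substr($payload, 4, 1));
        print substr($payload, 5), "\n" if $cmd == 3;
    }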

Brad's slides are on-line.

Posted by jzawodn at April 14, 2004 11:45 AM

Reader Comments
# Christopher Schmidt said:

This is one of the nicest summaries I've seen yet of the current state of LJ's backend in writing. I knew all this from working on the site for so long, but it's sometimes hard to put it all into a short, concise form. Brad's slides are good, but personally, reading them just hurts my head.

Just wanted to say "Nice job."

on April 19, 2004 01:42 PM