Notes from Brian's MySQL Replication and Clustering talk at the 2004 MySQL User's Conference...
3.23.xx had single threaded replication, 4.0 had dual-threads to implement read-ahead replication (trivia note: that was my idea). Understanding the control files: master.info, relay log, binary logs, etc.
Be careful of things that do not replicate properly, such as a UDF that generates random numbers. It's possible to have different storage engines on the master and slave.
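To make the random-number problem concrete, here's a toy sketch in Python (not MySQL code). It assumes the slave re-runs each logged statement rather than copying the master's result. The dicts standing in for tables, the `apply_statement` helper, and the seeded generators are all invented for illustration.

```python
import random

def apply_statement(table, statement, rng):
    """Run one logged 'statement' against a table (just a dict here)."""
    key, func = statement
    table[key] = func(rng)

# The "binary log" holds statements, not the values they produced.
binlog = [
    ("fixed", lambda rng: 42),            # deterministic: same on every server
    ("lucky", lambda rng: rng.random()),  # non-deterministic: depends on the server
]

master, slave = {}, {}
master_rng = random.Random(1)  # each server has its own random state
slave_rng = random.Random(2)

for stmt in binlog:
    apply_statement(master, stmt, master_rng)
    apply_statement(slave, stmt, slave_rng)

# Keys whose values differ between master and slave after replay.
diverged = sorted(k for k in master if master[k] != slave[k])
print(diverged)
```

Only the row written by the random function ends up different, which is exactly the kind of silent drift to watch for.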
Brian's slides are impossible to read if you're not in the first 4-6 rows. Doh!
When setting up replication, you need to make an initial snapshot of your master. The typical techniques were covered (mysqldump, rsync/scp, mysqlhotcopy, etc.). He doesn't mention mysqlsnapshot or flushing the binlog on the server (the common case). Weird.
Replication commands: SHOW SLAVE HOSTS, PURGE MASTER LOGS, RESET MASTER, SHOW MASTER STATUS, SHOW SLAVE STATUS, STOP SLAVE, START SLAVE, etc. When replication fails: mysqlbinlog, checking slave status, SET GLOBAL SQL_SLAVE_SKIP_COUNTER, etc. Hmm, he just confused removing the relay logs and removing the master.info file. Whoops.
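For the "checking slave status" step, here's a rough sketch of the sort of check I'd script. It doesn't talk to a server: it assumes you've already fetched one row of SHOW SLAVE STATUS into a dict, and it only looks at the `Slave_IO_Running`, `Slave_SQL_Running`, and `Last_Error` columns. The `diagnose_slave` function and the sample rows are hypothetical.

```python
def diagnose_slave(status):
    """Return a list of problems found in one SHOW SLAVE STATUS row (a dict)."""
    problems = []
    if status.get("Slave_IO_Running") != "Yes":
        problems.append("I/O thread is not running (not fetching from the master)")
    if status.get("Slave_SQL_Running") != "Yes":
        problems.append("SQL thread is not running (not applying the relay log)")
    if status.get("Last_Error"):
        problems.append("last error: " + status["Last_Error"])
    return problems

# Hand-written sample rows, not output from a real server.
healthy = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes", "Last_Error": ""}
broken = {
    "Slave_IO_Running": "Yes",
    "Slave_SQL_Running": "No",
    "Last_Error": "Duplicate entry '7' for key 1",
}

for name, row in (("healthy", healthy), ("broken", broken)):
    print(name, diagnose_slave(row))
```

If the SQL thread has stopped on a bad statement, this is where you'd go read the logs with mysqlbinlog before deciding whether skipping the statement is safe.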
Brian warns about shared-disk clustering with MyISAM tables. NFS bad. Very bad. You can do "clustering" with one master, many slaves. Works well for read-intensive applications that can tolerate a bit of latency. Not all can. Sometimes it's better to put up slaves that you can hammer on for reporting purposes, such as with real-time Apache logging.
Now we're looking at what I call tiered replication (master to slave to many slaves). This keeps a burden off the master, but I'd argue the burden of replication is so low that it's a non-issue in most cases.
Multi-master replication. Hmm. Brian described a star replication topology that either can't work or he glossed over a couple of major points. I need to ask about that one. Circular replication is next. He didn't really warn about how fragile this is.
Okay, yeah. He did gloss over some stuff because MySQL can't do that out of the box.
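On the circular replication point, here's a toy model of why I call it fragile. It assumes a ring where each server replicates only from its predecessor and an event stops once it would return to the server that wrote it (MySQL uses the server ID for that). The `propagate` function and the server names are invented for illustration.

```python
def propagate(ring, origin, down=()):
    """Return the servers that end up with an event first written on `origin`.

    `ring` lists servers in replication order; each one feeds the next.
    `down` is the set of servers that are currently not replicating.
    """
    received = [origin]
    start = ring.index(origin)
    for step in range(1, len(ring)):
        nxt = ring[(start + step) % len(ring)]
        if nxt in down:
            break  # the chain is cut: nothing downstream sees the event
        received.append(nxt)
    return received

ring = ["A", "B", "C", "D"]
print(propagate(ring, "A"))              # everyone is up
print(propagate(ring, "A", down={"B"}))  # one dead link
```

With everything up, a write on A reaches all four servers. Take B down and the same write never gets past A, even though C and D are perfectly healthy.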
Ahh, now he's explaining that replication is log-based. Should have done that much earlier on.
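Since "log-based" is the key idea, here's a minimal sketch of it in Python. This is a toy model, not how the server is implemented: the `Master` and `Slave` classes, the `io_step`/`sql_step` methods, and the position counters are all made up. They loosely mirror the binary log, the relay log, and the two slave threads from 4.0.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.binlog = []  # ordered record of every change

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))

class Slave:
    def __init__(self):
        self.data = {}
        self.relay_log = []  # local copy of events fetched from the master
        self.read_pos = 0    # how far into the master's binlog we have read
        self.exec_pos = 0    # how far into the relay log we have applied

    def io_step(self, master):
        """Fetch new events from the master's binlog (the 'I/O thread')."""
        new_events = master.binlog[self.read_pos:]
        self.relay_log.extend(new_events)
        self.read_pos += len(new_events)

    def sql_step(self):
        """Apply fetched events in order (the 'SQL thread')."""
        for key, value in self.relay_log[self.exec_pos:]:
            self.data[key] = value
        self.exec_pos = len(self.relay_log)

master, slave = Master(), Slave()
master.write("a", 1)
master.write("b", 2)

slave.io_step(master)   # events are on the slave but not yet applied
fetched_only = dict(slave.data)
slave.sql_step()        # now the slave catches up
print(fetched_only, slave.data)
```

Splitting fetch from apply is the read-ahead trick: the slave can keep pulling events off the master even while it's still busy applying older ones.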
Replication in 5.0: it'll work with the cluster product correctly. All masters will get copies of the cluster inserts into their binlogs. Storage engine injection is coming too, meaning a custom engine can add stuff to the binlog. Row-based (or logic-based) replication will be coming as well, it seems.
Replication in 5.1: adding multiple threads to replication. That means a slave can have multiple IO threads. Hmm. Some details not clear. It seems to make use of multiple network paths. But I'm not sure how the binlogs get split out on the master.
More info on-line in the manual, mailing list, and so on.
Posted by jzawodn at April 15, 2004 06:48 AM