April 18, 2008

High Performance MySQL, Second Edition

While my involvement can generously be labeled as "minimal", the second edition of High Performance MySQL is slated to hit store shelves soon.

High Performance MySQL, second edition
pre-order on Amazon.com

More info is available on O'Reilly.

Thanks to Baron, Peter, Vadim, and Arjen for picking up the torch to get a greatly expanded second edition done and out the door. There's a heck of a lot of new material in it.

Posted by jzawodn at 07:47 AM

January 16, 2008

Congrats to MySQL AB and Sun!

As long-time readers here know, I've been a fan of MySQL for many, many years--so much that I even wrote a book about it. It's a fantastic product that's unmatched in many ways. Over the years, I've had the pleasure of getting to know many of the early folks at MySQL AB, including Monty and David (co-founders), Marten (CEO), numerous engineers (Brian, Jim, Mark), sales folks (Kerry, Larry), and so on.

So I was very happy to see the announcement this morning that Sun is Buying MySQL AB. Sun is a great company that really gets Open Source and is making some big and very smart bets on it for their future. I think it's a great home for MySQL.

There's some smart thinking going on over at Sun these days. That Jonathan guy has a good head on his shoulders. :-)

Congrats again to everyone involved!

Posted by jzawodn at 09:50 AM

January 09, 2007

COEUM Software: Spammers!

Normally I ignore the spam that makes it thru to my inbox. It's usually pretty easy to pick out by reading the subject and sender's address (or name). I'll just mark it as spam and go on with life. But this particular gem came thru a few days ago with a subject of "MySQL Performance and Tuning...", a topic that's been near and dear to my heart.


So you can imagine how pissed I was to open the message, only to be confronted by one of those annoying large image-looking spam messages.

Worst of all is their insulting language at the bottom of the spam:

If you no longer wish to receive these emails...

It's written to imply that there was a point in history when I did wish to receive them.

Fuck that.

If you're looking for real MySQL Performance Consulting (which I used to do), let me know. I can refer you to a non-sleazy company or two. COEUM clearly doesn't deserve your business.

Posted by jzawodn at 05:07 PM

July 10, 2006

Head MySQL Geek Needed at Yahoo!

It seems like only yesterday that Jeremy Cole came to Yahoo! to take the job I vacated to join the Yahoo! Search team.

Well, he's out on his own now and we need some serious MySQL geekage around here.

The official job description looks like this:

Yahoo!'s Platform Engineering group is looking for a MySQL expert to provide consulting, training, and internal support for MySQL and data storage technologies. You will be working with teams to help them understand how MySQL may fit into their applications, making internal releases of custom MySQL binaries from source, analyzing database performance, and helping others to tune their hardware and software settings.
An ideal candidate has designed distributed and high-performance application architectures, is familiar with MySQL replication and load balancing, and knows the features, pros, and cons of MySQL's storage engines. Familiarity with Perl DBI, PHP, and MySQL administration tools is also required.
A source-level understanding of MySQL helpful but not required.
A BS/MS in Computer Science or equivalent and 4+ years experience with MySQL and Unix (FreeBSD/Linux) is required.

Shoot me a resume if you're interested.

Posted by jzawodn at 02:22 PM

February 15, 2006

Oracle Acquisitions are not about MySQL

I've been thinking about this for the last day or so and have come to the conclusion that Oracle's acquisition of Sleepycat Software (and Berkeley DB) is not about MySQL. Even when combined with their previous purchase of Innobase Oy (and InnoDB), it's not about MySQL.

With all due respect to Phil Windley (and Gadgetopia), you're wrong. Oracle is thinking much bigger and more strategically than "put the squeeze on MySQL."

Trying to put MySQL out of business would be a fairly short-term tactical move. I think Oracle is looking 5 years down the road and seeing what the world looks like as the commoditization of enterprise scale infrastructure software components continues. They're seeing that they "own" a progressively smaller piece of that pie unless they act soon. The rumors of Oracle eyeing JBoss and others are completely in line with this thinking.

If Oracle can become a one-stop shop for folks building the next generation of big business applications, whether or not they use "traditional" Oracle software, the company manages to stay relevant in the new world--and that includes their lucrative consulting services.

Is this reminiscent of IBM's approach to Linux circa 2001? It sure is.

Think bigger guys. Oracle's not just a database company and hasn't been for years.

Now, they could still end up putting the squeeze on MySQL along the way. But I suspect that'd be a happy byproduct of larger moves they're making.

What do you think?

Posted by jzawodn at 07:22 AM

February 14, 2006

Oracle Buys Berkeley DB, Sleepycat Software

Wow, the rumors were true. Oracle is snapping up Open Source Database companies now. First it was Innobase (see Oracle buys Innobase. MySQL between rock and hard place?) and now it's Sleepycat Software.

The purchase of Sleepycat, which has been rumored for weeks, gives Oracle another open-source product to complement its proprietary database offerings. At an investor conference last week, Oracle CEO Larry Ellison reiterated the company's strategy to generate revenue from a combination of open-source and proprietary software.

Sleepycat produces and supports the famed Berkeley DB embedded database engine and has radically improved its features since the version 1.x days. Nowadays you get a small, fast, transactional database engine with industrial-grade reliability and replication.

It's interesting to note that MySQL's first transactional storage engine (BDB) was created on top of Berkeley DB. Their more popular transactional storage engine (InnoDB) is built on top of technology produced by Innobase, which Oracle bought last year.

This leads to the obvious question: What is Oracle up to? Are they trying to do to Open Source Databases what Yahoo appears to be doing to Web 2.0 companies?

There's been speculation of a master plan at Oracle that involves buying up various bits of the Open Source infrastructure used in building applications. Is JBoss next, as some have suggested?

We'll see.

Posted by jzawodn at 07:50 AM

January 30, 2006

Someone AJAXified mytop!

Check this out. Someone has built an AJAX-powered version of mytop, the little console-based MySQL monitoring tool I wrote years ago. I guess it's now buzzword compliant.

If you can't wait to get your hands on it, head over to the AjaxMyTop project on SourceForge.

Everything old is new again. :-)

Posted by jzawodn at 12:31 PM

December 02, 2005

MySQL 5.1 Appears: Partitioning, Dynamic Plugins, and more...

The first public release of MySQL 5.1 is available now. The MySQL 5.1.3 alpha release is a developer preview that gives early adopters, fans, and hard-core database geeks a chance to kick the tires of the next big release.

Major new features include:

Partitioned tables. You can have single tables spread across multiple disks based on how you define them at table creation time.

A Plugin API and support for dynamically loading new modules of code into the server. The first example of this is pluggable full-text parsers. That means you'll be able to write a custom parser to index any sort of oddball textual data you might want to store and retrieve. MySQL still handles the details of executing the queries, so you need only worry about the specifics of parsing your data.

The instance manager has been beefed up with additional SHOW commands for getting at information about log files. You can also issue SET commands that change configuration options which get written back to the configuration file.

VARCHAR fields on cluster tables really are VARCHAR fields now.
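To make the partitioned-tables item above a bit more concrete, here is a toy model of range partitioning: each row is routed to a storage location based on a value in the row. This is only a conceptual sketch in Python, not MySQL's implementation; the year bounds and disk paths are made up for illustration.

```python
from bisect import bisect_right

# Hypothetical layout: each partition holds rows whose year is strictly
# below its upper bound. Bounds and paths are invented for this example.
PARTITIONS = [
    (2004, "/disk1/p_old"),   # years before 2004
    (2005, "/disk2/p_2004"),  # year 2004
    (2006, "/disk3/p_2005"),  # year 2005
]
UPPER_BOUNDS = [bound for bound, _ in PARTITIONS]


def partition_for(year):
    """Return the storage location for a row with the given year."""
    index = bisect_right(UPPER_BOUNDS, year)
    if index == len(PARTITIONS):
        raise ValueError(f"no partition accepts year {year}")
    return PARTITIONS[index][1]


print(partition_for(2004))
```

The point of the sketch is that the routing rule is fixed when the table is defined, which is why the layout has to be decided at table creation time.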

And there's lots more. The MySQL documentation, as always, has the gory details.

Posted by jzawodn at 07:16 PM

October 07, 2005

Oracle buys Innobase. MySQL between rock and hard place?

As reported in several sources (Slashdot, InfoWorld, AP on Yahoo, Reuters), Oracle has acquired Innobase Oy for an undisclosed sum of money. This appears to be a strategic move by Oracle to put MySQL between a rock and hard place.

Innobase is the company that provides the underlying code for the InnoDB storage engine in MySQL. It's the de-facto choice for developers who need high concurrency, row-level locking, and transactions in MySQL. For many years now, MySQL AB and Innobase Oy (founded by Heikki Tuuri) have worked closely together to make that technology a seamless part of MySQL.

Like all of the MySQL code, InnoDB is dual licensed. That means you can freely use it under the GPL or buy a license for it if your usage would violate the GPL.

MySQL's public reaction right now isn't the "holy f$@%ing shit!" that likely occurred internally. Kaj Arno, MySQL's VP of Community Relations, sent out a message to many MySQL users today titled "MySQL AB Welcomes Oracle to the FOSS Database Market".

The message began by saying:

MySQL AB and the Free / Open Source database market today received some unexpected recognition by Oracle, through their acquisition of Innobase Oy.
So what does this have to do with MySQL?
Well, Innobase is the provider of the popular InnoDB Storage Engine in MySQL. One of the things our users appreciate about MySQL is its unique pluggable storage engine architecture. You have the flexibility to choose from number of storage engines including MyISAM, Memory, Merge, Cluster and InnoDB. And with MySQL 5.0, we added the new Archive and Federated storage engines.
Just like the rest of MySQL Server and its Storage Engines, InnoDB is released under the GPL. With this license, our users have complete freedom to use, develop, modify the code base as they wish. That is why MySQL has chose the GPL: to protect the freedom that users value in free / open source software.

Later on, Kaj makes an effort to calm the fears of MySQL users by saying that MySQL will continue to support all their users and work with Oracle as a "normal business partner."

The big elephant in the room, however, is the uncertainty around Oracle's future plans for the InnoDB source code. Their press release says:

Innobase is an innovative small company that develops open source database technology. Oracle intends to continue developing the InnoDB technology and expand our commitment to open source software. Oracle has already developed and contributed an open source clustered file system to Linux. We expect to make additional contributions in the future.

As well as:

InnoDB is not a standalone database product: it is distributed as a part of the MySQL database. InnoDB's contractual relationship with MySQL comes up for renewal next year. Oracle fully expects to negotiate an extension of that relationship.

I expect those negotiations could be quite interesting. Maybe not next year, but the year after? Oracle could decide to put the squeeze on MySQL someday in a way that hurts their customers but not "the community" (those using the GPL version).

MySQL is now faced with the prospect of licensing technology they cannot ship without from their biggest rival. Interestingly, there's always been one piece of the InnoDB puzzle that's not available under the GPL: the InnoDB Hot Backup Tool. Without it, database administrators cannot back up their InnoDB tables without shutting down MySQL or at least locking out all transactions.

Oracle just bought themselves a whole lot of leverage with MySQL AB and a talented team of database engineers to boot.

I've always wondered why MySQL AB didn't buy Innobase Oy years ago. It always made complete sense from where I sat. But I'm hardly an insider when it comes to the relationship between those companies. Needless to say, that relationship just got far more "interesting."

I hope, for the sake of the community and the company (I've known many MySQL employees for years), that Oracle is true to their promises. But it is Oracle, so I'm naturally skeptical.

Posted by jzawodn at 09:26 PM

September 27, 2005

MySQL 5.0 Release Candidate Available

It’s been a long time coming but the wait is nearly over (and well worth it). If you haven’t been following MySQL 5.0 development very closely, the MySQL 5.0 Release Candidate has the following major new features:

  • Views (both read-only and updatable views)
  • Stored Procedures and Stored Functions
  • Triggers
  • Server-side cursors (read-only, non-scrolling)
  • Precision math
  • Larger VARCHARS (up to 64kb)
  • ARCHIVE storage engine
  • FEDERATED storage engine

I could write a lot more about several of those and may do so at some point. If you've been waiting for any of those features, give 5.0 a try. They're looking to get all of those last minute bugs squashed ASAP.

More details available in MySQL 5.0 in a Nutshell. Give it a try!

Oh, and speaking of open source... Check out Open Source Goes Corporate in this week's InformationWeek. I had a chance to talk with Larry a few weeks ago about how much MySQL, Apache, FreeBSD, PHP, Linux, and other Open Source software we use at Yahoo!

Posted by jzawodn at 07:52 AM

August 07, 2005

CFP: Open Source Database Conference 05

From a friend...

I am writing to make you aware of the CfP for the upcoming Open Source Database Conference. Software & Support Verlag, the producer of internationally renowned conferences such as JAX, International PHP Conference, ApacheCon Europe and others, announces a new conference for the international Free Software/Open Source community:

Open Source Database Conference 05
November 7 to 9, 2005
Frankfurt, Germany

It is our pleasure to invite you to become part of this new conference by sending you the official Call for Paper. Your submissions would be very much appreciated. Please find further information on the conference and on the submission proceeding below.

THE CONFERENCE AT A GLANCE

  • Open Source Database Conference 05: November 7 to 9, 2005
  • Event location: NH-Hotel Frankfurt-Mörfelden
  • Main conference: November 8 and 9, 2005
  • Power workshops (tutorials): November 7, 2005
  • Duration of a session: 75 minutes
  • Duration of a power workshop (whole day): ~ 6 hours
  • Duration of a power workshop (half-day): ~ 3 hours

The main conference is divided into the following tracks:

  • Database Fundamentals
  • Database Development
  • Database Administration
  • Business Intelligence
  • Free Software/Open Source Database Business

Conference topics to be covered are:

  • Database Administration
  • Migration to Free Software/Open Source Databases
  • Performance tuning and optimization
  • APIs/Connectors
  • New Technologies
  • Lowering TCO with Free Software and Open Source RDBMS
  • Case studies
  • Community-related topics

Languages, technologies: all (Java, PHP…)

We are looking forward to your submission and wish you all the best!

Posted by jzawodn at 10:05 PM

May 09, 2005

MySQL Super Smack now maintained by Tony Bourke

The MySQL stress testing tool known as Super Smack used to live on my web site after I adopted it. However, I've now turned it over to Tony Bourke. The new URL for Super Smack is http://vegan.net/tony/supersmack/

Thanks to Tony for stepping up to handle Super Smack. Shortly after I really got into working on the book, I had little time left to deal with Super Smack. He's been putting it to good use and should keep development moving along nicely.

I'll get a redirect up on the old super-smack page shortly.

Posted by jzawodn at 07:48 AM

April 19, 2005

At MySQL User Conference Today

As previously noted, I'll be at the MySQL User Conference in Santa Clara today.

I'm looking forward to several of the talks on the schedule:

Who knows... maybe I'll see you there.

Posted by jzawodn at 09:28 AM

April 08, 2005

Don't Forget the MySQL Users Conference

The 2005 MySQL Users Conference isn't far off. It runs from April 18th - 21st at the Santa Clara Convention Center.

The session list is impressive. MySQL and O'Reilly have put together a great program. I'll be participating in a panel discussion titled Scaling and High Availability Challenges that's moderated by Om Malik on Tuesday the 19th.

Posted by jzawodn at 08:36 AM

November 19, 2004

Jeremy Cole will be Yahooing Soon too!

One of the conditions on taking my new job was that I find my replacement to handle many of Yahoo's growing MySQL related needs. I worried that it was going to take a long time, because good MySQL folks are scarce. If you've ever tried to hire one, you know this. That meant I had to either get lucky or spend quite some time doing double duty.

By some quirk of fate and excellent timing, Jeremy Cole was available. He visited Yahoo recently, we offered him a job, and he's starting December 6th.

You may have seen him speak at conferences past. He spent over 4 years with MySQL AB and knows MySQL inside-out.

Welcome aboard!

The good news is that other Yahoos don't have to learn a new name. They can still think "MySQL question... I'll ask Jeremy." :-)

How about that? We got Russell and Jeremy both recently. Who's gonna be next?

Posted by jzawodn at 06:33 PM

October 20, 2004

My Book on Slashdot, or... Many of My Friends Have No Life

And here's how I know. Within 10 minutes of the Slashdot review of High Performance MySQL, many of them contacted me via IM, e-mail, and whatnot. It frightens me to think of how often they're reloading that page.

Anyway, the review is quite positive. I'm glad Steve liked the book and fully expected all the "MySQL SUCKS!" comments that appeared moments later. Apparently some nerds have an infinite capacity for fighting vi vs. emacs, mysql vs. postgresql, gnome vs. kde, and all those other stupid battles.

Oh, well. As long as half of them buy the book, I'm happy. :-)

Posted by jzawodn at 12:30 PM

September 07, 2004

How Cheap is MySQL?

Speaking of MySQL, if these new results are to be believed, MySQL is about 60% the cost of Oracle.

MySQL  ... $ 82.74/op
Oracle ... $139.84/op

That's measuring "total operations per second", which isn't the "normal" database comparison metric. But let's assume it's a fair way to compare databases anyway. (We could argue all day about the "right" way to do it.)
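As a quick check on the "about 60%" figure, here is the arithmetic using only the two per-operation costs quoted above. Python is used purely as a calculator here, and the variable names are my own.

```python
# Cost per operation, as quoted in the results above.
mysql_cost_per_op = 82.74
oracle_cost_per_op = 139.84

# MySQL's cost expressed as a fraction of Oracle's.
ratio = mysql_cost_per_op / oracle_cost_per_op

print(f"MySQL costs {ratio:.1%} of what Oracle costs per operation")
```

The ratio comes out a little under 60%, so the rounded figure holds up.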

That's surprising, given the vast difference in size between the two companies, isn't it? Back when I said:

MySQL is to Oracle as Linux is to Windows. It will slowly but steadily creep up the food chain, just like Linux has.

I didn't expect it to climb the price ladder quite this quickly.

Wow.

Makes you wonder what things will look like when Oracle cuts prices to match, doesn't it?

Posted by jzawodn at 10:49 AM

Who kidnapped Monty?

Is it just me, or is MySQL getting a little too corporate these days?

They recently posted an "interview" with Monty titled Catching up with Monty that seems a lot more like a marketing piece than something Monty would actually say or write:

But then again, what do I know? I've only spoken with him a few times a year for the last few years (in person and via e-mail). This is the first time he ever sounded like he was speaking through a heavy PR/Marketing filter.

I went on a Linux GeekCruise earlier and it was a very rewarding event for me. To be able to mingle and discuss (for days) your Linux problems and ideas with some of the important people behind Linux was both fun and very educational.
We wanted to give MySQL users the same opportunity that the Linux people have on the Linux GeekCruise. People can come discuss and learn about MySQL and also get practical tips and solutions to problems that they face when using MySQL. There is plenty of free time during the MySQL Cruise, and I am sure we will spend a notable part of that time to discuss solving practical problems given to us by users on the trip.
Another reason that I like events like the MySQL GeekCruise is that it gives me the chance to meet MySQL users and through them get a better understanding of what they really want from the MySQL products in the future.

It's customary when documenting an interview to mention who conducted the interview. But "we caught up with him" does nothing to reveal the identity of this person. Was it via e-mail or in person?

So, who kidnapped Monty and replaced him with this evil clone? :-)

It's not Zak, since he resigned. And I doubt it's Jim, since he just doesn't seem like that kind of guy.

Don't get me wrong, I'm not criticizing Monty. I have tons of respect for him. I'm criticizing his clone or whoever crafted that "interview" on his behalf.

Posted by jzawodn at 10:37 AM

August 17, 2004

Beta Software is Funny

I've been spending a lot of time working to get a MySQL Cluster up and running. The documentation leaves out a lot of critical info (or presents it in a less than optimal order). But I can mostly live with that.

However, one of the programs has confusing command-line arguments. ndbd, the "data node" piece of the NDB storage engine (which is all the cluster brains), cracks me up:

root@mysql1:/home/mysql/var/cluster# ../../libexec/ndbd -h
Usage: ../../libexec/ndbd [--version] [-v] [--start] [-s] [--no-nostart] [-n]
   [--deamon] [-d] [--initial] [-i] [--connect-string=constr] [-c constr]
   [--usage] [-?] The MySQL Cluster kernel
-v, --version                      Print version
-s, --start                        Start ndb immediately
-n, --no-nostart                   Don't start ndb immediately
-d, --deamon                       Start ndb as deamon
-i, --initial                      Start ndb immediately
-c constr, --connect-string=constr "nodeid=;host="

-?, --usage                        Print help

Okay, let me get this straight. There are two ways to make it start up immediately (-s and -i). But if you want it not to start immediately you'd use the poorly named --no-nostart option.

Which of those, if any, is the default if I specify none?

Beats me.

I've configured something wrong. My management node comes up fine but my first data node is decidedly unhappy. And the error it logs might as well be in ROT-13 octal.

Back to head scratching.

Once I get this working, I'll post my recipe and work to include it in the second edition of High Performance MySQL.

Update: I've got all four data nodes up and talking to the management node. Excellent. Now I need a client...

Posted by jzawodn at 04:21 PM

August 10, 2004

Yahoo! Job Opening: Software Engineer

Yup, another job opening. If you're interested or know someone who'd kick ass in this job, let me know.

The job is on-site in Sunnyvale, California.

<job_posting>

Enjoy solving hard problems creatively? Know all the GOF patterns? Can you make database schemas into 3rd Normal form? Do you know the difference between REST, SOAP, and MOM?

We are looking for an engineer to architect new services, build shared libraries, and refactor existing systems. You will work with Yahoo! News, Sports, Weather, Finance, Health and other groups to build exciting systems. You will deliver complex projects in demanding deadlines while helping other engineers design and implement their systems.

If you've had experience building high-throughput systems, can design a class hierarchy in your sleep, and know all about web services, then we're looking for you.

Qualifications

  • 5-7 years experience designing modern systems
  • BS or MS in Computer Science
  • Knowledge of practical application of design patterns
  • Good written/verbal communication skills and strong investigation, research and evaluation skills
  • Clear prioritization skills in a chaotic, fast-paced environment
  • Web services experience a plus
  • C/C++, Perl, Java
  • Apache
  • XML/XSL
  • MySQL/Oracle

</job_posting>

I know some (many?) of the folks you'd be working with in this job. They're smart folks who love building great technology.

Oh, and don't ask me what "exciting systems" are. We all know that job listings are partly sales pitches, so that's what you get I guess.

Posted by jzawodn at 09:43 PM

July 15, 2004

Making mysql a bit more shell-like

I use the mysql command-line client a lot. It acts like a mini-shell that makes it easy to send commands or queries to a MySQL server and view the results. However, its navigational facilities aren't terribly shell-like.

To change to a particular database, you must use the "USE" command:

mysql> use test
Database changed

To get a list of databases, you must use the "SHOW DATABASES" command:

mysql> show databases;
+------------+
| Database   |
+------------+
| Tribute911 |
| mysql      |
| test       |
+------------+
3 rows in set (0.07 sec)

To get a list of tables within a database, you'd typically use the "SHOW TABLES" command after having USEd that database:

mysql> use test
Database changed

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| BODY_SIDE      |
| Filer          |
| aatest         |
| blah           |
| http_auth      |
| passwd         |
| t1             |
| t2             |
| test           |
+----------------+
9 rows in set (0.07 sec)

Oftentimes I'll forget that I'm at a "mysql>" command prompt, type "ls" or "cd", and just expect it to know what I want.

Today I fixed that bug. As of now, my copy of the client understands the "cd" and "ls" commands. The patch is available (against the latest 4.0 BitKeeper tree) and has been sent to the developers for possible inclusion.

Here it is in action:

mysql> ls
+------------+
| Database   |
+------------+
| Tribute911 |
| mysql      |
| test       |
+------------+
3 rows in set (0.07 sec)

mysql> cd mysql
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> ls
+-----------------+
| Tables_in_mysql |
+-----------------+
| columns_priv    |
| db              |
| func            |
| host            |
| tables_priv     |
| user            |
+-----------------+
6 rows in set (0.07 sec)

I like it. :-)
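For the curious, the mapping behind that session can be sketched in a few lines. This is not the patch itself (which lives in the mysql client source); it is a minimal Python illustration, with a hypothetical `translate` function, of the behavior inferred from the output above: `ls` lists databases when no database is selected and lists tables otherwise, and `cd name` becomes `USE name`.

```python
def translate(command, current_db=None):
    """Map a shell-style shortcut to the SQL it stands for.

    Anything that is not a recognized shortcut is returned unchanged.
    """
    words = command.strip().rstrip(";").split()
    if not words:
        return command
    verb = words[0].lower()
    if verb == "ls" and len(words) == 1:
        # Assumption: with no database selected, list databases;
        # otherwise list the tables in the current database.
        return "SHOW TABLES" if current_db else "SHOW DATABASES"
    if verb == "cd" and len(words) == 2:
        return f"USE {words[1]}"
    return command


print(translate("ls"))
print(translate("cd mysql"))
print(translate("ls", current_db="mysql"))
```

Regular SQL passes through untouched, so the shortcuts only kick in for the two shell-style verbs.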

Update: In the comments, Russell asks about setting the prompt to contain the current database name. Of course you can do that!

mysql> prompt mysql (\d)>
PROMPT set to 'mysql (\d)> '
mysql ((none))> use mysql
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql (mysql)>

That was added quite a while ago.

Posted by jzawodn at 01:11 PM

July 13, 2004

OSCON Tutorial Hacking

O'Reilly really wants the slides or material for the MySQL Performance Workshop session that I'm doing at OSCON in a few weeks.

I'll give you one guess. See if you can figure out what I'm doing tonight...

Good guess!

I'm gonna start with plain text and worry about what (if any) "presentation software" I want to use later. I'm really tempted to just have them print an outline as the handout. All I really need is a whiteboard and a few markers. For some reason, I have no trouble talking about MySQL in a semi-intelligent manner for hours on end if you stick me in front of a room full of people and give me a few markers and a whiteboard.

But the mental stress involved in actually assembling a presentation drives me nuts. And I procrastinate. A lot. Besides, how am I really supposed to know what will truly be useful to the folks in the room until I get into that room and talk with them a bit?

Anyway, enough stalling...

Posted by jzawodn at 09:27 PM

July 08, 2004

Database Abstraction Layers Must Die!

Beware of men preaching of false hope.

Take, for example, the way some folks feel like they need a database abstraction layer in their applications. Rasmus has long argued against them, and I've agreed with his reasoning and conclusion. (Because it's correct!)

I was reminded of this when I recently read "rant, by request...", in which the author argues against the Smarty PHP template system. Why? Because PHP is itself a templating system. Adding another layer increases complexity, degrades performance, and generally doesn't really improve things.

So why do folks do it? Because PHP is also a programming language and they feel the need to "dumb it down" or insulate themselves (or others) from the "complexity" of PHP.

In that same article, I find myself strongly disagreeing with something else the author says:

Pick any book on PHP from a shelf in your local bookstore, and look how result rows from a MySQL database are printed. (MySQL is of course the DBMS used in those books, which should already give you a clue about how bad the book is.) The mysql_-functions are used all over the place in the presentation layer.

So, only bad books discuss MySQL in their examples? Let's look past that obvious bashing, and continue...

Here, in these forums, we have learned people to not use those mysql_-functions directly, but use a database abstraction layer instead. This makes coding simpler (no need to know all those functions for the various DBMS's) and when they decide to use another DBMS instead of MySQL (and they undoubtedly will at some point), the conversion will be painless.

The fact that he generally pisses on MySQL isn't what bugs me, though it doesn't help. What bothers me is the double-standard. He's advocating "raw" PHP instead of more "abstract" templating languages because they're bigger, slower, more complicated. But when it comes to the database side of things, he's suddenly arguing for the bigger, fatter, slower abstractions again?

This makes no sense for several reasons. Let's look at them.

The Portability Fallacy

The author uses an argument I hear all the time: If you use a good abstraction layer, it'll be easy to move from $this_database to $other_database down the road.

That's bullshit. It's never easy.

In any non-trivial database backed application, nobody thinks of switching databases as an easy matter. Thinking that "the conversion will be painless" is a fantasy.

Good engineers try to select the best tools for the job and then do everything they can to take advantage of their tool's unique and most powerful features. In the database world, that means specific hints, indexing, data types, and even table structure decisions. If you truly limit yourself to the subset of features that is common across all major RDBMSes, you're doing yourself and your clients a huge disservice.

That's no different from saying "I'm going to limit myself to the subset of PHP that's the same in Perl and C, because I might want to switch languages one day and 'painlessly' port my code."

That just doesn't happen.

The cost of switching databases after an application is developed and deployed is quite high. You have possible schema and index changes, syntax changes, optimization and tuning work to re-do, hints to adjust or remove, and so on. Changing mysql_foo() to oracle_foo() is really the least of your problems. You're gonna touch most, if not all, of your SQL--or you'll at least need to verify it.

That doesn't sound "painless" to me.
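To illustrate why swapping the driver call is the easy part, here is one query ("the ten newest orders, nudging the optimizer toward a particular index") written for two databases. The table, column, and index names are hypothetical. The MySQL form uses LIMIT and a USE INDEX hint; the Oracle form uses ROWNUM and an optimizer-hint comment. Python is used only to hold the two strings side by side.

```python
# Hypothetical table `orders` with an index `idx_created`.
# Both statements ask for the same rows, yet the SQL text differs.
TOP_TEN_NEWEST = {
    "mysql": (
        "SELECT id, created FROM orders USE INDEX (idx_created) "
        "ORDER BY created DESC LIMIT 10"
    ),
    "oracle": (
        "SELECT id, created FROM ("
        "SELECT /*+ INDEX(orders idx_created) */ id, created "
        "FROM orders ORDER BY created DESC"
        ") WHERE ROWNUM <= 10"
    ),
}

for database, sql in TOP_TEN_NEWEST.items():
    print(f"{database}: {sql}")
```

An abstraction layer that only renames the function call leaves every one of these differences for you to find and fix by hand.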

Good Programming vs. "Neutral" APIs

The author is also clearly unhappy with the alternative, having mysql_foo() and mysql_bar() functions all over the application. Well, I may be nuts here, but I never have that problem. I use a revolutionary new programming technique. Instead of littering my code with those calls, I put my core data access layer into a library--a separate piece of reusable code that I can include in various parts of my application and... reuse!

That means if I ever decide to make major changes to the way my application interacts with the database (persistent connections, replication awareness, load balancing, different error handling), I'm able to do so without searching every damned file in my code base for mysql_* functions that I need to tweak.
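A minimal sketch of that idea follows, assuming Python and its built-in sqlite3 module as a stand-in for PHP and the mysql_* functions so the example runs anywhere. The `DataAccess` class, its method names, and the `users` table are all hypothetical.

```python
import sqlite3


class DataAccess:
    """The only place in the application that talks to the database driver."""

    def __init__(self, dsn=":memory:"):
        # Connection policy (persistent connections, replicas, load
        # balancing) would be decided here, in one place.
        self._conn = sqlite3.connect(dsn)

    def execute(self, sql, params=()):
        # Central error handling: change it here, not in every caller.
        try:
            with self._conn:  # commits on success, rolls back on error
                return self._conn.execute(sql, params).fetchall()
        except sqlite3.Error as exc:
            raise RuntimeError(f"query failed: {sql!r}") from exc

    def create_schema(self):
        self.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

    def add_user(self, name):
        self.execute("INSERT INTO users (name) VALUES (?)", (name,))

    def user_names(self):
        rows = self.execute("SELECT name FROM users ORDER BY id")
        return [row[0] for row in rows]


# Application code never touches the driver directly.
db = DataAccess()
db.create_schema()
db.add_user("alice")
db.add_user("bob")
print(db.user_names())
```

Note that this is a library of application-specific queries, not a database-neutral API: the SQL inside it is free to use whatever the chosen database does best.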

I never thought this was rocket science, but apparently it has eluded him. Somehow he manages to see the benefit of separating presentation from logic, but never considered separating the data access layer from the data processing layer.

Some things never cease to amaze me--and make me very sad at the same time.

Posted by jzawodn at 09:35 AM

June 23, 2004

Chapter 7 (Replication) of High Performance MySQL is On-Line

Order High Performance MySQL from Amazon.com

You can find a complete copy of the Replication chapter from High Performance MySQL (my book) over on dev.mysql.com.

Read it.

Learn something.

Then buy a copy.

Learn even more.

Marvel at your new MySQL knowledge.

Then send me a good resume so I can get you a job here at Yahoo.

Or not. :-)

About a month ago, they also posted the full text of chapter 6 (Performance Tuning). If that doesn't convince you to buy a copy, then I'm out of luck. We're not giving them any more chapters to put up.

(Hmm. This may be the first time I've pimped my book and my employer in the same blog entry. Weird.)

Posted by jzawodn at 02:01 PM

April 21, 2004

Understanding the MySQL Release Cycle

In his MySQL Conference Roundup post, Russell confessed his confusion about how MySQL versions are numbered and developed:

I was actually disappointed to hear that 4.1 won't be going beta until next month and won't be in production until Q4!!! I was told it was going to be production *a lot* sooner (as in this week). Urgh. As I'm using the spatial indexing stuff, and it'd be nice if it was more solid. And what about 5.0? I thought 4.1 was going to become 5.0 when launched? I'm confused.

Heh. It is confusing, especially since different projects do it differently (Linux kernel, Apache, etc). The good news is that MySQL is fairly easy to understand.

There are always at least three versions (or categories of versions) to know about. Let's give them names and rough descriptions that match the way I think about it.

  • legacy: these versions are no longer actively developed. If you find a bug, you're usually told to upgrade to stable. Major security flaws, however, generally seem to be fixed. Today this is the 3.23.xx series.
  • stable: the stable series is safe for use in production. Bugs are actively fixed and new versions appear a few times per year. Today this is the 4.0.xx series.
  • development: this is where all the new stuff happens. The code may or may not be safe for production use at any given time and may or may not exist as alpha or beta releases as they reach major milestones and need testing. Today this is both 4.1.xx and 5.0.xx.

That last bit is where the confusion comes from. The 4.1.xx series is likely to go beta in a month or two (I'd guess). It's where you're going to find spatial (2-D) indexing, subqueries, multiple character set support, prepared statements, and lots of other goodies. The 5.0.xx series is going to take longer--just as you'd expect. It's where the work for stored procedures is going on.

So 4.1 will not become 5.0 when launched, but 5.0 will certainly inherit all of 4.1's features.
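The three categories can be written down as a small lookup. This is only a sketch in Python: the table reflects the state of things as described in this post (April 2004), and the function name is made up for illustration.

```python
# Series status as described in this post (April 2004). It will go stale.
SERIES_STATUS = {
    "3.23": "legacy",
    "4.0": "stable",
    "4.1": "development",
    "5.0": "development",
}


def release_category(version):
    """Map a version string such as '4.0.18' to its category."""
    # The first two components identify the series; the third is the release.
    series = ".".join(version.split(".")[:2])
    return SERIES_STATUS.get(series, "unknown")
```

So release_category("4.0.18") gives "stable", while both "4.1.1" and "5.0.0" come back as "development", which is exactly the overlap that causes the confusion.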

It's also worth checking the relevant section of the MySQL manual to get their take on all this.

Oh, he heavily pimps High Performance MySQL in that post too. Thanks, Russ!

Posted by jzawodn at 08:29 PM

April 19, 2004

High Performance MySQL (the book) is Shipping!

As noted on our site for the book, O'Reilly brought the first copies to the 2004 MySQL User's Conference in Orlando and it had sold out by the middle of the second day.

Woohoo!

Now Amazon reports that the book is shipping in 1-2 weeks (rather than being pre-order), and we even have our first customer review posted too.

I read the front matter and first three chapters on the plane ride home last night, pen in hand. I didn't mark as much stuff as I expected to but did manage to find a couple of dumb things we may want to fix for subsequent printings.

Thanks to everyone who has pre-ordered a copy or bought one at the show. The response has been amazing so far!

Derek has a picture of the books for sale at the show booth. I'll link to it as soon as he posts the pics.

Update: Posted.

Posted by jzawodn at 01:59 PM

April 15, 2004

Introduction to the MySQL Administrator

Notes from the Introduction to the MySQL Administrator talk at the 2004 MySQL User's Conference...

(I initially wanted to go to the "High Availability Solutions with Databases" session, but it seems to be a big Sun product pitch. Bad move, Sun. Your sessions should clearly state that they're heavily focusing on your solutions.)

Alfredo is walking thru the UI of the Administrator. It seems to be fully functional now on Windows, with Linux and Mac OS X lagging a bit behind. The gist of this is that the GUI tool removes the need to touch the my.cnf file or interact with the MySQL command-line tool for most administrative operations (security/privileges, account management, schema changes, config changes, replication setup, and so on).

Documenting this in the second edition of the book will be a challenge, mainly because I find most GUI admin tools a bit tedious.

It's hard to write a lot about this tool without screen shots, since it's clearly a very visual application. I am impressed with how far along the tool is both in terms of functionality and the user interface itself. This will go a long way toward making MySQL more approachable to those who shun the command line.

Toolkits used on various platforms are... Windows: Delphi, Linux: GNOME/C++, Mac OS X: Cocoa.

Hmm. Short talk.

Posted by jzawodn at 11:40 AM

InnoDB Multiple Tablespaces and Compressed Tables

Notes from Heikki Tuuri's InnoDB Multiple Tablespaces and Compressed Tables talk at the 2004 MySQL User's Conference...

Heikki spent quite some time as an academic doing math and later computer science. He wrote the first line of code in 1994 and was trying to figure out why relational databases were so slow. Five years later he had 110,000 lines of code and could run a TPC-C benchmark. He met Monty in 2000. Work on the InnoDB/MySQL interface began shortly after and took about 6 months. First release in 2001.

Heikki owns the company (Innobase Oy) and he now has two employees: Pekka is working on hot backup and Marko is working on compressed tablespace development. A third guy will be coming on board soon as well. Innobase Oy is an OEM supplier for MySQL AB. They make money via hot backup licenses, royalty from MySQL licenses, and support contracts. The company is profitable.

Development slowed in 2003 because Heikki was dealing with new users, bug reports, and support contracts/questions. New hires will speed that up again.

Multiple tablespaces appeared in MySQL 4.1.1 (Dec 2003). Sponsored by RightNow Technologies of Montana. Each table can be stored in its own .ibd file rather than one massive tablespace.

To enable it, just add innodb_file_per_table to my.cnf and you're good to go. But you do still need one ibdata file for the data dictionary, undo logs (rollback segment), and so on.
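A quick way to check whether a config has the option turned on, sketched in Python with the standard-library configparser. This assumes the option sits in the [mysqld] section and only tests for its presence; the function name and sample file contents are made up, and real my.cnf files can contain things (such as include directives) that configparser will not understand.

```python
import configparser

# Example my.cnf contents; the datadir value is just a placeholder.
SAMPLE_MY_CNF = """\
[mysqld]
datadir = /var/lib/mysql
innodb_file_per_table
"""


def file_per_table_enabled(my_cnf_text):
    """Return True if [mysqld] contains the innodb_file_per_table option."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.cnf allows bare options with no "= value"
        strict=False,          # tolerate repeated sections or options
        interpolation=None,    # treat '%' in values literally
    )
    parser.read_string(my_cnf_text)
    # has_option() is simply False when the section itself is missing.
    return parser.has_option("mysqld", "innodb_file_per_table")
```

Calling file_per_table_enabled(SAMPLE_MY_CNF) returns True; drop the last line of the sample and it returns False.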

If you later disable support for multiple tablespaces, InnoDB will continue to use the old files while putting new tables back in the single tablespace. InnoDB doesn't handle symlinked tables during an ALTER TABLE like MyISAM does. The new table will end up in the default location, not where the original table lived.

You cannot move .ibd files from one instance to another. The .ibd files contain transaction ids and log sequence numbers. You also cannot move tables around on the same machine. Instead, use RENAME TABLE (and maybe a symlinked directory) to accomplish it.

Interesting... for restoring an old version of a table, you can remove the current one using ALTER TABLE mytable DISCARD TABLESPACE and then ALTER TABLE mytable IMPORT TABLESPACE to restore the old one. But the table must be clean, meaning no uncommitted records, all buffered inserts must be done, and all delete marked records must have been purged.

Compressed table formats will help reduce the disk usage required to store data. NULLs will take no space, many of the length records have been removed, and they've added on-the-fly zip compression to further reduce space. The typical InnoDB user should see at least a 50% reduction in disk space usage.

Old tables will not be automatically converted to the new format. Pages will still be 16k, but most pages will end up as 8k on disk. They'll need some sort of compression prediction but are still hacking on that, it seems.
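To get a feel for the 16k-to-8k question, here is a toy check in Python using the standard zlib module. It is not InnoDB's algorithm or page format, just an illustration of why some kind of prediction is needed: whether a page fits depends entirely on its contents. The constant and function names are made up.

```python
import os
import zlib

PAGE_SIZE = 16 * 1024        # in-memory page size mentioned in the talk
COMPRESSED_SIZE = 8 * 1024   # on-disk target mentioned in the talk


def fits_compressed(page):
    """Return True if a full 16k page zlib-compresses into 8k or less."""
    if len(page) != PAGE_SIZE:
        raise ValueError("expected a full 16k page")
    return len(zlib.compress(page)) <= COMPRESSED_SIZE


# Repetitive data (think padded or mostly identical rows) shrinks easily.
repetitive_page = b"abcd" * (PAGE_SIZE // 4)

# Random bytes are effectively incompressible and will not fit.
random_page = os.urandom(PAGE_SIZE)
```

fits_compressed(repetitive_page) is True and fits_compressed(random_page) is False, so a real implementation has to decide what to do with pages that miss the target.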

They also are looking at in-memory compression so that more pages will fit in the buffer pool. (Nice!)

Upcoming features: Linux async disk I/O (available in Linux 2.6) and on-line index generation without setting locks. Unknown ETA.

They've added automatic index generation on foreign keys (in case you forget to create them). This should be in 4.0.19, he thinks. Sounds like it'll be in the next 4.1.x release for sure.

Posted by jzawodn at 08:03 AM

MySQL Replication and Clustering

Notes from Brian's MySQL Replication and Clustering talk at the 2004 MySQL User's Conference...

3.23.xx had single threaded replication, 4.0 had dual-threads to implement read-ahead replication (trivia note: that was my idea). Understanding the control files: master.info, relay log, binary logs, etc.

Be careful of things that do not replicate properly, such as a UDF that generates random numbers. It's possible to have different storage engines on the master and slave.
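A toy model of why that bites, in Python. It assumes the slave re-executes each logged statement rather than copying the resulting rows; the "statements" and function names here are invented for the illustration and have nothing to do with the real binlog format.

```python
import random


def apply_statement(table, statement, rng):
    """Execute one logged statement against a table (a plain list)."""
    if statement == "INSERT RAND":
        # The value is decided at execution time, on whichever server runs it.
        table.append(rng.random())
    else:
        # A literal value is the same no matter where it is executed.
        table.append(statement)


def replay(log, seed):
    """Replay a statement log on a server with its own random state."""
    rng = random.Random(seed)
    table = []
    for statement in log:
        apply_statement(table, statement, rng)
    return table


safe_log = ["a", "b", "c"]
unsafe_log = ["a", "INSERT RAND", "c"]
```

replay(safe_log, seed=1) and replay(safe_log, seed=2) produce identical tables, while replaying unsafe_log with those two seeds does not: the master and slave have silently diverged.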

Brian's slides are impossible to read if you're not in the first 4-6 rows. Doh!

When setting up replication, you need to make an initial snapshot of your master. Typical techniques covered (mysqldump, rsync/scp, mysqlhotcopy, etc). He doesn't mention mysqlsnapshot or flushing the binlog on the server (the common case). Weird.

Replication commands: SHOW SLAVE HOSTS, PURGE MASTER LOGS, RESET MASTER, SHOW MASTER STATUS, SHOW SLAVE STATUS, STOP SLAVE, START SLAVE, etc. When replication fails: mysqlbinlog, checking slave status, SET GLOBAL SQL_SLAVE_SKIP_COUNTER, etc. Hmm, he just confused removing the relay logs and removing the master.info file. Whoops.

Brian warns about shared-disk clustering with MyISAM tables. NFS bad. Very bad. You can do "clustering" with one master, many slaves. Works well for read-intensive applications that can tolerate a bit of latency. Not all can. Sometimes it's better to put up slaves that you can hammer on for reporting purposes, such as with real-time apache logging.
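The one-master, many-slaves setup usually shows up in application code as a small router. A minimal sketch in Python, with hypothetical names and a deliberately naive test for what counts as a read:

```python
import itertools


class ReplicaRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; with no slaves, everything hits the master.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def server_for(self, sql):
        # Naive classification: anything that starts with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._slaves is not None:
            return next(self._slaves)
        return self.master


router = ReplicaRouter("master", ["slave1", "slave2"])
```

Reads that cannot tolerate replication lag should still be sent to the master explicitly; this sketch makes no attempt to detect them.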

Now we're looking at what I call tiered replication (master to slave to many slaves). This keeps a burden off the master, but I'd argue the burden of replication is so low that it's a non-issue in most cases.

Multi-master replication. Hmm. Brian described a star replication topology that either can't work or he glossed over a couple of major points. I need to ask about that one. Circular replication is next. He didn't really warn about how fragile this is.

Okay, yeah. He did gloss over some stuff because MySQL can't do that out of the box.

Ahh, now he's explaining that replication is log-based. Should have done that much earlier on.

Replication in 5.0: it'll work with the cluster product correctly. All masters will get copies of the cluster inserts into their binlogs. Storage engine injection is coming too, meaning a custom engine can add stuff to the binlog. Row-based (or logic-based) replication will be coming as well, it seems.

Replication in 5.1: adding multiple threads to replication. That means a slave can have multiple IO threads. Hmm. Some details not clear. It seems to make use of multiple network paths. But I'm not sure how the binlogs get split out on the master.

More info on-line in the manual, mailing list, and so on.

Posted by jzawodn at 06:48 AM

April 14, 2004

MySQL Cluster High Availability Features

Notes from the MySQL Cluster High Availability Features talk at the 2004 MySQL User's Conference...

Redundancy between nodes via heartbeat. Redundancy between clusters via replication. In other words, NDB provides local and global redundancy. System recovery for a full-scale shutdown. Hot backup and restore. The architecture in general was designed to eliminate single points of failure.

Lots of diagrams illustrating recovery in various failure scenarios that I can't ASCII copy too easily. Doh!

Posted by jzawodn at 02:51 PM