December 14, 2009

Trust Oracle? Why?

posted at 06:27 PM | PermaLink

In a 10-point press release issued today Oracle has listed a series of "commitments" regarding their acquisition of MySQL by way of acquiring Sun.

I am not impressed.

As a former employee of a large Internet company (the largest at the time, in fact) that used both Oracle and MySQL, I'm utterly puzzled by this. I can't think of why we should trust Oracle to do right by the users of MySQL--especially the non-paying users.

You see, for years Oracle worked agressively behind the scenes to discredit MySQL and tried hard to understand how their customers could ever consider using such a "toy" instead of their flagship product. In fact, it was so important to Oracle that they offered some very substantial discounts to customers who were using MySQL and Oracle. In some cases the discounts were so impressive that their motivation was clear: cut off the opportunity for MySQL to grow and spread in such organizations. (Remember what happened to Netscape when Microsoft gave away Internet Explorer for free?)

The funny thing is that it really didn't work. MySQL was already a fast moving train with lots of momentum. And it was still accelerating.

It was clear that Oracle saw MySQL as a threat to their business. When they eventually bought Innobase (the company that makes the InnoDB storage engine), many of us got more than a bit nervious. That put Oracle in a position of having a choke hold on the one componenet that was critical to MySQL's future success. They could have just shut down development entirely. But that may have made their motives a bit too clear.

Since then they've continued to develop InnoDB. However, the pace hasn't exactly been agressive and their openness around that has left me (and others) wondering what their longer term plans really are. The few tidbits we get seem to be overly vague. Could they have been throttling development of InnoDB? Or not providing the same resources that MySQL (and now Sun) would have? It's hard to say.

But here's the thing that continues to bug me...

Back a few years ago when Oracle dismissing MySQL in public while working hard against it in private, I realized that they were simply trying everything they could to protect their crowned jewels: public denials and classic FUD paired with hush-hugh backroom deals.

Nobody has managed to explain, in even a mildly convincing way, what has changed since then. Why should we suddenly trust Oracle? Their crowned jewels are still threatened by MySQL.

Convince me.

See Also: Monty's appeal is selfless!

Posted by jzawodn at 06:27 PM

November 16, 2009

Speaking at San Francisco MySQL Meetup

posted at 07:00 AM | PermaLink

I'll be speaking Tuesday night at the San Francisco MySQL Meetup. Here's the blurb from the announcement:

Jeremy Zawodny will talk about Craigslist's current use of MySQL, where it's painful, and how things are being re-architected to make the pains go away. This includes hardware changes, sphinx, redis, memcached, and some custom Perl work.

Despite not living too far away from the SF MySQL Meetup venue, I'm a little annoyed at myself for never having attended. So this will be my first time and I'm really looking forward to meeting local like-minded folks.

Thanks to Mike and Erin for recruiting me.

If you're coming and have specific things you'd like me to talk about, drop me a note in the comments or on Twitter.

Posted by jzawodn at 07:00 AM

November 09, 2009

Fixing Poor MySQL Default Configuration Values

posted at 07:49 AM | PermaLink

I've recently been accumulating some MySQL configuration variables that have defaults which have proven to be problematic in a high-volume production environment. The thing they all have in common is a network blip or two can trigger some very undesirable behavior.

max_connect_errors

If a client is having trouble connecting to MySQL, the server will give up waiting after connect_timeout seconds and increment the counter which tracks the number of connect errors it has seen for the host. Then, when that value reaches max_connect_errors, the client will be locked out until you issue a FLUSH HOSTS command. Worse yet, if you have occasionally network blips and never need to restart your MySQL boxes, these errors can accumulate over time and eventually cause you middle of the night pain.

See Host 'host name' is blocked in the MySQL docs. Sadly, there is no way to disable this check entirely. Setting the variable to 0 doesn't accomplish that. Your only real solutions are (a) setting it to a very high value (max_connect_errors=1844674407370954751), and (b) running an occasional FLUSH HOSTS command.

connect_timeout

This is related to the above problem. In situations of network congestion (either at the client or server), it's possible for an initial connection to take several seconds to complete. But the default value for connect_timeout is 5 seconds. When you trip over that, the max_connect_errors problem above kicks in.

To avert this, try setting connect_timeout to a value more like 15 or 20. And also consider making thread_cache_size a non-zero value. That will help in situations when the server occasionally gets a high number of new connections in a very short period of time.

skip-name-resolve

MySQL does a reverse DNS lookup on every incoming connection by default. This sucks. It seems that no matter how good your infrastructure is, there are blips in DNS service. MySQL's host cache exists to keep those lookups to a minimum. Yet I've seen this cause pain off and on for eight years now. I can only assume there's a bug in the host cache or the resolver library when this happens.

I recommend adding skip-name-resolve to your /etc/my.cnf to skip DNS entirely. Just use IP addresses or ranges for your GRANTs. It seems that slow replies from DNS servers can also help you to trip over connect_timeout as well. Imagine having 2 or 3 DNS servers configured but the first one is unavailable.

slave_net_timeout

When the network connection between a master and slave database is interrupted in a way that neither side can detect (like a firewall or routing change), you must wait until slave_net_timeout seconds have passed before the salve realizes that something is wrong. It'll then try to reconnect to the master and pick up where it left off. That's awesome.

However, the default value is 3600 seconds. That's a full hour! FAIL.

Who wants their slaves to sit idle for that long before checking to see if something might be wrong? I can't think of anyone who wants that.

My suggestion, if you're in a busy environment, is that you set that to something closer to 30 seconds.

Posted by jzawodn at 07:49 AM

October 21, 2009

Oracle and MySQL

posted at 08:41 PM | PermaLink

In The EU and MySQL, Tim Bray treads lightly on the topic of Oracle's pending ownership of MySQL if the Sun acquisition goes through. I left a comment on his post, but he's likely to be heavily moderating what appears there since he works for Sun.

So here's what I posted on his blog.

I haven't yet seen anyone explain what motivation Oracle has for pouring resources into MySQL, especially if it eats away at their DBMS business on the low end.
I've been puzzling over this since their acquisition of Innobase Oy (the makers of InnoDB) years back. Is Oracle serious about seeing MySQL grow and succeed, or was that just a way to get a strangle-hold on a critical piece of MySQL?
I've never had the chance to ask Ken Jacobs that. Actually, I have but it would have been kind of rude. And even if I did, I'm not sure I could trust the answer.
I doubt this comment will get published, but as a MySQL long time user, supporter, advocate, and author I'm really glad to see things like PBXT, MariaDB, and Percona's XtraDB out there.

Really, we need that kind of diversity in Open Source. A MySQL/InnoDB "monopoly" wouldn't have been healthy in the long run.

A reporter contacted me today to ask, among other things, if I think Oracle was/is threatened by MySQL. Oracle claims that they serve two different markets, etc. He wasn't so sure.

Sadly, there's some background information that I should not publish here, but suffice it to say that Oracle was and probably still is threatened by MySQL. Their sales/marketing tactics made this quite clear long ago. But those deals were rarely public--for good reason.

Posted by jzawodn at 08:41 PM

June 16, 2009

My Drizzle Article in Linux Magazine (XtraDB and Sphinx too!)

posted at 07:13 AM | PermaLink

After a few years off, I've been doing some writing for Linux Magazine (which is on-line only) again recently. First off, my just published feature article is Drizzle: Rethinking the MySQL Database Kernel. As you might have guessed, it looks at Drizzle and some of the reasoning behind forking and re-working MySQL.

I'm also writing a weekly column that we've been calling "Bottom of the Stack" (RSS) which started a few weeks ago. Recent articles are:

The basic idea is that I'll be writing about back-end data processing and systems--the sort of stuff that lives in the bottom half of the traditional LAMP stack.

If you have ideas of stuff you'd like to cover, please drop me a line.

As a side note, I wrote my first article for Linux Magazine back in June of 2001: MySQL Performance Tuning. Those were the MySQL 3.23 days. How time flies!

An amazing credit to some of the folks involved with Linux Magazine, all of my past writings are available there.

Posted by jzawodn at 07:13 AM

May 27, 2009

The Big ALTER TABLE Test

posted at 07:34 AM | PermaLink

As previously noted, I've been playing with XtraDB a bit at work. Over a week ago I decided to test compression on one of our larger tables and it took a bit longer than I expected.

(root@db_server) [db_name]> ALTER TABLE table_name \
    ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4;
Query OK, 825994826 rows affected (8 days 14 hours 23 min 47.08 sec)
Records: 825994826  Duplicates: 0  Warnings: 0

Zoiks!

It's too bad we couldn't use all the cores on the machine for the ALTER TABLE, huh?

On the plus side, the file sizes aren't too bad.

Before:

-rw-rw---- 1 mysql mysql 1638056067072 2009-05-24 09:23 table_name.ibd

After:

-rw-rw---- 1 mysql mysql  587617796096 2009-05-27 07:14 table_name.ibd

I'll have more to say about XtraDB and the compression options in a later post. But given the interest that my Twitter messages about this big ALTER TABLE generated yesterday, I figured I'd share a bit more detail here.

For anyone doing the math at home, that's going from rougly 1.5TB to 500GB (the new file size is slightly inflated, since this slave managed to replicate about a week's worth of data before I caught it). I was hoping for 4:1 compression and managed about 3:1.

Posted by jzawodn at 07:34 AM

May 14, 2009

MySQL 5.1.34 and XtraDB 1.0.3-5

posted at 08:17 AM | PermaLink

For a couple weeks now, we've had a MySQL server at work running MySQL 5.1.34 and the Percona XtraDB 1.0.3-5 plug-in. I'm testing an upgrade path for our current MySQL 5.0.xx based servers.

Aside from some confusion about the initial setup (getting the built-in InnoDB to stay out of the way), things have gone very well. All of our largest and most active tables have been converted to the new Barracuda file format and I tested compression on the two largest. The first didn't fare so well, but it's a fairly over-indexed table with small rows. The second, however, contains a decent sized TEXT column (classified posting bodies) and it compresses quite nicely. Any change in CPU utilization is not significant.

I hope to soon get a second server running and try to increase the compression ratio, going from KEY_BLOCK_SIZE of 8K to 4K to see if we can squeeze some more out of it without much penalty.

I love all the extra stats provided by the InnoDB plug-in and the Percona (and Google) enhancements. There are a lot of knobs that I've not yet tried to turn, but it's good to know they're available when that day comes.

More to come...

See Also: Is MySQL 5.1 a compelling upgrade?

Posted by jzawodn at 08:17 AM

April 30, 2009

Is MySQL 5.1 a compelling upgrade?

posted at 08:00 AM | PermaLink

Of the many things I noticed last week at the MySQL Conference, one of the most notable was how many companies have not upgraded from MySQL 5.0 to 5.1 yet. Craigslist is in that camp and it seems that we're joined by the likes of Facebook, Google, Yahoo, and about half a dozen other companies that use MySQL heavily.

Come to think of it, SmugMug are the only folks I've talked with who've made the jump (video).

So it's not much of a surprise that Percona is asking if they should backport 5.4 fixes to 5.0.

Given our usage of MySQL to date, the only really compelling reason to upgrade is to get access to the InnoDB plug-in (and XtraDB). I'd like to get compression, some of the various performance patches, and tuning options, so plug-in support is a requirement. But beyond that, I just don't see anything new in 5.1 that we need.

As I noted in The Real or Official MySQL? Does Not Matter!, the storage engines matter more than the various add-on features in the server itself.

Have you upgraded or are you thinking about it? If so, why? If not, why not?

Posted by jzawodn at 08:00 AM

April 28, 2009

Slides from "MySQL and Search at Craigslist"

posted at 08:11 AM | PermaLink

Last week I delivered a talk titled "MySQL and Search at Craigslist" as part of the 2009 MySQL Conference and Expo. I talked about some of the good and bad of our MySQL work and also talked a lot about our recent Sphinx deployment. The slides are embedded below and here, thanks to SlideShare. (Anyone know why Google Docs doesn't yet handle OpenOffice presentations?)

I gave a copy to O'Reilly but don't yet see them on the conference site.

The usual disclaimers apply: I said a lot that's not well reflected in the slides, and I'm sure they're less informative without the audio or video that may or may not have been captured. Either way, hopefully they're useful to folks who saw the talk and even a few of those who did not.

I also delivered a condensed version of this talk at the Percona Performance Conference and those slides are available too.

Thanks to everyone who provided useful feedback and discussion before and after the talks.

Posted by jzawodn at 08:11 AM

April 24, 2009

MySQL and Percona Conferences Rocked

posted at 07:30 AM | PermaLink

Now that it's Friday, I can finally come up for air and say that both the MySQL and Percona Conferences (which I wrote about earlier) surpassed my expectations. Having the two going on semi-concurrently worked out pretty well. At no time did I find myself without at least two sessions I wanted to see. Often times I had to really cut lunch short to make sure I didn't miss anything.

Other MySQL Conference veterans I asked has very similar responses.

There's a ton of other stuff to digest, and I hope to write up some thoughts in the coming days and weeks.

As usual, a big part of the conference for me was being able to meet up with people I don't see often or who I've never met face to face. Meeting up with people solving similar problems at places like Facebook, Google, Mixi, and other high-traffic sites is invaluable.

Being able get a good sense of what new storage engines are out there and how modern hardware is changing database systems (SSDs, multi-core servers) proved to be very educational in ways I didn't expect.

All the politics and posturing around the Oracle/Sun buyout weren't as significant as I'd expected. People really wanted to get down to business and technology.

I enjoyed giving my talks and answering questions about problems that we have and haven't solved yet at Craigslist.

I'm heading over to Drizzle Devloper Day in a couple hours.

Posted by jzawodn at 07:30 AM

April 20, 2009

MySQL and Percona Performance Conferece Lineup

posted at 10:32 AM | PermaLink

Amidst all the Oracle/Sun/MySQL news today, the MySQL Conference kicks off this week. So I just spent a few minutes putting together my picks for the sessions I'd like to attend at the MySQL Conference and the Percona Performance Conference (schedule). There's quite a lineup and I have some hard choices to make. Both groups have put together excellent events. And, wow, there are a lot of new storage engines and appliances coming out.

To make my life easier, I'm putting the list of interesting sessions from both conferences here so I can try to decide where to spend my attention.

Sessions I'm Presenting or Part Of

I'm presenting MySQL and Search at Craigslist on Tuesday morning and am part of The Great Open Cloud Shootout on Wednesday morning.

On Friday, I'll be at the Drizzle Developer Day to talk about "What Craigslist Needs from Drizzle."

Tuesday Sessions: MySQL

  • 8:30am - State of the Dolphin - Karen Padir
  • 9:15am - This is Not a Web App: MySQL at Google - Mark Callaghan
  • 10:50am - MySQL and Search at Craigslist - Jeremy Zawodny
  • 11:55am - InnoDB: Innovative Technologies for Performance and Data Protection - Ken Jacobs, Heikki Tuuri
  • 2:00pm - Falcon Storage Engine - Designed for Speed - Kevin Lewis, Ann Harrison
  • 3:05pm - The PBXT Storage Engine: Meeting Future Challenges - Paul McCullagh
  • 4:25pm - Solving Common SQL Problems with SeqEngine - Beat Vontobel
  • 5:15pm - Hadoop and MySQL: Friends with Benefits - Frank Mashraqi

Wednesday Sessions: MySQL

  • 8:50am - The Great Open Cloud Shootout - panel
  • 10:50am - Build Your Own MySQL Time Machine - Chuck Bell, Mats Kindahl
  • 11:55am - Using Q4M: A Message Queue Storage Engine for MySQL - Kazuho Oku
  • 11:55am - libdrizzle: A New Client Library for Drizzle and MySQL - Eric Day
  • 11:55am - libdrizzle: A New Client Library for Drizzle and MySQL - Eric Day
  • 2:00pm - Maria: The New Transactional Storage Engine for MySQL - Monty Widenius
  • 2:00pm - SAN Performance on a Internal Disk Budget: The Coming SSD Revolution - Matthew Yonkovit
  • 2:00pm - Crash Recovery and Media Recovery in InnoDB - Heikki Tuuri
  • 3:05pm - MySQL Performance on EC2 - Mark Callaghan
  • 4:25pm - Perl Stored Procedures for MySQL - Antony Curtis
  • 4:25pm - High Availability and Scalability Patches from Google - Ben Handy, Justin Tolmer
  • 5:15pm - Optimizing MySQL Performance for ZFS - Allan Packer, Neelakanth Nadgir
  • 5:15pm - Redundant Storage Cluster: For When It's Just Too Big - Bob Burgess
  • 5:15pm - Inserts at Drive Speed: Designing a Custom Storage Engine for Write-Mostly Applications - Ben Haley

Wednesday Sessions: Percona

  • 9:00am - Maria In Depth - Monty Widenius
  • 9:55am - The Return of Gearman - Eric Day
  • 11:15am - Fighting Replication Lag - Peter Zaitsev
  • 12:45pm - Evaluating Disk Backends for MySQL Servers - Ewen Fortune
  • 1:35pm - Database Performance with Proxy Architectures - Robert Hodges
  • 2:00pm - Covering Indexes: Orders-of-Magnitude Improvements - Dr. Bradley C. Kuszmaul
  • 5:00pm - Sphinx and MySQL: A Perfect Match - Andrew Aksyonoff
  • 6:15pm - InnoDB Performance Tuning - Peter Zaitsev
  • 7:05pm - CouchDB: Behind the Buzz - Jan Lehnardt
  • 7:55pm - Linux Filesystems: Who, What, and Where - Stewart Smith
  • 9:10pm - Open Q&A: Performance - panel

Thursday Sessions: MySQL

  • 8:30am - The SmugMug Tale - Don MacAskill
  • 10:50am - SQL is Dead - Monty Taylor
  • 10:50am - Map/Reduce and Queues for MySQL Using Gearman - Eric Day, Brian Aker
  • 10:50am - Dormando's Proxy for MySQL - Alan Kasindorf
  • 11:55am - Memory Management in MySQL and Drizzle - Stewart Smith
  • 11:55am - Improving Performance by Running MySQL Multiple Times - MC Brown
  • 2:00pm - MySQL Row Change Event Extraction and Publish - Gene Pang
  • 2:00pm - InnoDB Performance and Usability Patches - Vadim Tkachenko, Ewen Fortune
  • 2:50pm - Make Your Life Easier with Maatkit - Baron Schwartz
  • 2:50pm - BLOB Streaming: Efficient Reliable BLOB Handling for all Storage Engines - Barry Leslie
  • 3:50pm - Database We Can Believe In: Stories from the Front Lines (of the Obama Campaign) - many speakers

Thursday Sessions: Percona

  • 9:55am - Pushing the Envelope - Don MacAskill
  • 10:50am - The Life of a Dirty Page - Mark Callaghan
  • 1:35pm - High Performance MySQL from a Boring Architecture - Baron Schwartz
  • 3:15pm - Hypertable - Doug Judd
  • 4:30pm - Drizzle's Approach to Improving Performance of the Server - Jay Pipes
  • 7:50pm - MySQL Replication: Getting The Most From Slaves - Peter Zaitsev
  • 9:05pm - Open Q&A: Feature Request Bonanza - panel

Friday: Drizzle Developer Day

Drizzle Developer Day is on Friday at Sun in Santa Clara. I'm looking forward to many of the talks (some of which will be completely over my head but interesting anyway).

Back To Work!

And, with that, I need to go work on my presentation! See you at the conference... I'll try to post interesting tidbits on Twitter and tag 'em with #MySQLConf like others appear to be doing.

I just found out that I can share my public MySQL Conference Schedule.

Posted by jzawodn at 10:32 AM

Oracle Buying Sun, Gets MySQL

posted at 06:57 AM | PermaLink

Interesting news this morning, just as the 2009 MySQL Conference is starting. As is being reported all over the place, Oracle has agreed to buy Sun at $9.50 per share, giving them to a ton of great technology (Solaris, ZFS, MySQL, DTrace, etc.).

One one of the biggest threats to Oracle's core database business (at the low end, at least) for a while now has been MySQL. And now they're poised to own MySQL after Sun bought it not long ago. (It seems like yesterday that Oracle bought Innobase.)

As I noted a while back, the MySQL landscape is changing.

This news is sure to make the conference more... interesting.

Oracle, please get the InnoDB team together with the MySQL team and see about GPLing ZFS.

Posted by jzawodn at 06:57 AM

March 31, 2009

The Real or Official MySQL? Does Not Matter!

posted at 07:23 AM | PermaLink

Yesterday Patrick Galbraith asked What is the official branch of MySQL? which got a lot of attention, including on Slashdot (and the token PostgeSQL comments quickly appeared).

Here's the funny thing. It doesn't matter anymore. Patrick's question is interesting in an academic sense, but it's mainly a distraction from what really matters. (Hint: What's the official Linux and who really cares? Ubuntu? RedHat? Debian? CentOS?)

Storage Engines

Nowadays what matters is the set of available storage engines. InnoDB, Percona's XtraDB, PrimeBase's PBXT, Maria, Falcon, and several others are available or will be soon. I predict that for the foreseeable future, any MySQL distribution or derivative must support the storage engine plug-in API that MySQL 5.1 defined. And since that's the case, it largely won't matter which flavor you using.

Protocol(s)

Look at what's happened in the world of key/value databases in the last few years. More than a few of them speak the memcached protocol as either their native and default or an optional add-on. I suspect the same thing will be the case here. All MySQL distributions and derivatives will speak the "traditional" MySQL protocol (just like memecached has the old protocol). Some of them, notably Drizzle, will have other (newere, better) protocols available as well (much like memcached has the new binary protocol).

Summary

In summary, the choice of MySQL version or derivative won't matter as much as you might think because they'll have the same Storge Engine plug-ins available (thanks to the shared plugin-in API), they'll all speak a common protocol (this may not be true for replication--watch that area closely), and will largely offer the same subset of SQL and SQL extensions.

They'll all be supported by different groups/companies (including some "database appliance" vendors), will all be tuned differently and aimed at slightly different uses cases, and will certainly benefit from a lot of cross-pollination.

That doesn't sound so bad to me.

The fact that nobody can point to the "real" MySQL in a few years just won't matter. Does anyone ask (anymore) which is the "real" Linux? Nope. And for very similar reasons. Think of MySQL as "kernel" and Storage Engine as "filesystem" and you'll realize we've been down this road before.

We're looking at the upgrade from 5.0 to 5.1 soon at Craigslist and don't know if we'll be using InnoDB or XtraDB yet. Time will tell.

See Also: The New MySQL Landscape, which I wrote a few months back--before a good chunk of the MySQL team had left Sun.

Posted by jzawodn at 07:23 AM

February 10, 2009

Playing With CouchDB: First Impressions

posted at 10:58 AM | PermaLink

About a week ago, Nat posted Open Source NG Databases on O'Reilly Radar. That caught my interest because I'm playing with some "alternative" databases for some of our data at Craigslist. Don't get me wrong, MySQL is great. But MySQL isn't well suited to every use case out there either. (I'll talk more about this at the MySQL Conference.)

Meanwhile, I left a comment on that posting about CouchDB and have been playing with it a bit more since then--mostly loading in test data, figuring out the data footprint, performance, etc.

Overall, I'm impressed and encouraged. I agree with what Ben Bangert said. The simple API is great but the lack of a schema to worry about really makes my life simple in this application. I don't have any initial plans for views, but writing them in Javascript is an interesting idea. I can definitely appreciate the flexibility there. And having good replication built-in solves one of my big needs.

I'm sure my thinking will have evolve after I've loaded a few hundred million documents in, but so far I'm really liking it. The CPAN modules in Net::CouchDb do a pretty good job and get you up and running quickly. I had a knee-jerk response to tweak a few things there but quickly realize that they're far from being the bottleneck anyway.

It seems that without any tuning or fancy work, I can get about 75-100 inerts/sec on my desktop class Ubuntu box (Intel Core 2 Duo, 2.66GHz, 1GB RAM, single 80GB SATA disk). That's not bad for out-of-the-box performance. And doing the math on space used for a document set (after compaction), I'm seeing roughly ~3KB/doc. That's a bit more than I expected but really not bad at all.

I wonder if there's a future for gzip compression in CouchDB. Or maybe we should just use ZFS...

Posted by jzawodn at 10:58 AM

January 16, 2009

Sphinx Search at Craigslist

posted at 09:55 AM | PermaLink

A couple days ago, Andrew posted a news item titled Sphinx goes billions to the Sphinx web site.

Last but not least, Powered By section, now at 113 sites and counting, was updated and restyled. I had long wondered how much Sphinx search queries are performed per month if we sum all the sites using it, and whether we already hit 1B page views per month or not. Being open-source, there's no easy way to tell. But now with the addition of craigslist to Powered By list I finally know that we do. Many thanks to Jeremy Zawodny who worked hard on making that happen, my itch is no more. :-)

Well, I guess the cat's out of the bag! My first project at Craigslist was replacing MySQL FULLTEXT indexing with Sphinx. It wasn't the easiest road in the world, for a variety of reasons, but we got it all working and it's been humming along very well ever since. And I learned a heck of a lot about both Sphinx and craigslist internals in the process too.

I'm not going to go into a lot of details on the implementation here, other than to say Sphinx is faster and far more resource efficient than MySQL was for this task. In the MySQL and Search and Craigslist talk I'm giving at the 2009 MySQL Users Conference, I'll go into a lot more detail about the unique problems we had and how we solved them.

For what it's worth, the implementation isn't really done. I did update the search help page on the site to reflect some of the capabilities (hey, look! OR searches!) but there are features I have planned that I'd like to expose as time allows.

Posted by jzawodn at 09:55 AM

January 07, 2009

Don't Bet on Moore Saving Your Ass

posted at 07:10 AM | PermaLink

Over on the 37signals blog, DHH writes Mr. Moore gets to punt on sharding. His argument is basically that if you continually delay fixing your data storage and retrieval layer, Moore's Law will be there to save our ass--over and over again.

Bzzzt. Wrong answer.

Depending on future improvements to fix your own bad planning is a risky way to build an on-line service--especially one you expect to grow and charge money for.

It's easy to forget history in this industry (as Paul pointed out in the comments on that post). There was a point a few years ago when people still believed the clock speed of CPUs would be doubling roughly every 18 months for half the cost. Putting aside that Moore's Law is really about transistor density and not raw speed, we all ended up taking a funny little detour anyway.

Until recently, the sweet spot (in terms of cost and power use) was probably a dual CPU, dual core server with 16 or 32GB of RAM. But soon that'll be dual quads with 32 or 64GB of RAM. And then it'll be quad eight core CPUs with 128GB or whatever.

But notice that nowadays we're not all running 6.4GHz CPUs in our servers. Instead we're running multi-core CPUs at slower clock speeds. Those two are definitely not equivalent.

A funny thing happens as you add cores and CPUs. You begin to find that the underlying software doesn't always... get this... scale. That's right. Software designed in a primarily single or dual CPU world starts to show its age and performance limitation in a world where you have 8, 16, or 32 cores per server (and more if you're running one of those crazy Sun boxes).

You see, David is talking specifically about MySQL (and probably InnoDB), which is currently being patched by outside developers precisely because it has multi-core issues . Its locking is expensive and not granular enough to utilize all those cores. It's expensive in terms of memory use too. And there are assumptions built into the I/O subsystem that don't scale well in today's world of fast multi-disk RAID units, SSDs, and SANs. People are hitting these issues in the real world and it's definitely becoming a serious bottleneck.

See Also: The New MySQL Landscape.

Moore's Law is no silver bullet here. A fundamental change has occurred in the hardware platform and now we're all playing catch-up in one way or another.

I'll discuss this a bit in my upcoming MySQL Conference Talk too. The world is not nearly as clear or simple as DHH is suggesting. Perhaps they can get by with constantly postponing the work of sharding their database, but that doesn't mean you should follow their lead.

Posted by jzawodn at 07:10 AM

December 19, 2008

Talk Announcement: MySQL and Search at Craigslist

posted at 07:54 AM | PermaLink

I recently learned that my talk has been accepted for the 2009 MySQL Conference in Santa Clara, California. It is currently scheduled for Tuesday the 21st and titled MySQL and Search at Craigslist.

Here's the abstract (which I've promised to expand upon soon):

Millions of people search for things every day on craigslist: tickets, cars, garage sales, jobs, events, and so on.
This talk will look at the recent evolution of database and search architecture at Craigslist, including performance, caching, partitioning, and other tweaks. We'll pay special attention to the unique challenges of doing this for a large data set that has an especially high churn rate (new posts, edits, and deletes).
And we strive to do this using as little hardware and power as possible.

If you're coming to the conference, drop by and harass me. :-)

If you're not sure check out the full schedule--there's a lot of good stuff packed into the conference already and a lot of talks are still not even posted.

Posted by jzawodn at 07:54 AM

December 08, 2008

The New MySQL Landscape

posted at 07:02 AM | PermaLink

Interesting things are afoot in the MySQL world. You see, it used to be that the MySQL world consisted of about 20-40 employees of MySQL AB (this funny distributed Swedish company that built and supported the open source MySQL database server), a tiny handful of MySQL mailing lists, and large databases were counted in gigabytes not terabytes. A Pentium III was still a decent server. Replication was a new feature!

Hey, anyone remember the Gemini storage engine? :-)

How times have changed...

Nowadays MySQL is sort of a universe onto itself. There are multiple storage engines (though MyISAM and InnoDB are still the popular ones), version 5.1 is out (finally), and the whole company made it over 400 employees before it was gobbled up by Sun Microsystems (a smart move, IMHO, though history will judge that) a while back.

If I had to guess 5 years or so ago what would be interesting to me today about MySQL, I'd have been really, really wrong. The future rarely turns out like we think. Just ask Hillary Clinton.

Here's a little of what's rattling around in the MySQL part of my little brain these days...

Outside Support, Patches, and Forks

The single most interesting and surprising thing to me is both the number and necessity of third-party patches for enhancing various aspects of MySQL and InnoDB. Companies like Percona, Google, Proven Scaling, Prime Base Technologies, and Open Query are all doing so in one way or another.

On the one hand, it's excellent validation of the Open Source model. Thanks to reasonable licensing, companies other than Sun/MySQL are able to enhance and fix the software and give their changes back to the world.

Some organizations are providing just patches. Others, like Percona are providing their own binaries--effectively forks of MySQL/InnoDB. Taking things a step further, the OurDelta project aims to aggregate these third party patches and provide source and binaries for various platforms. In essences, you can get a "better" MySQL than the one Sun/MySQL gives you today. For free.

Meanwhile, development on InnoDB continues. Oh, did I mention the part where they were bought by Oracle (yes, *that* Oracle) a while back? Crazy shit, I tell you. But it makes sense if you squint right.

Anyway, the vibe I'm getting is that folks are frustrated because there's not a lot of communication coming out of the InnoDB development team these days. I can't personally verify that. It's been years since I corresponded with Heikki Tuuri (the creator of InnoDB). So folks like Mark Callaghan of Google have been busy analyzing and patching it to scale better for their needs.

And we all benefit.

Drizzle

Taking things a step further yet, the Drizzle project is a re-making of MySQL started primarily by Brian Aker, who worked as MySQL's Director of Architecture for years. Brian is now at Sun and, along with a handful of others at Sun and elsewhere, is ripping out a lot of the stuff in a fork of MySQL that doesn't get used much, needlessly complicated the code, or is simply no longer needed.

In essence, they're taking a hard look at MySQL and asking what it really needs to provide for a lot of it's uses today: Web and "cloud" stuff. He visited us at Craigslist a few months ago to talk about the project a bit and get our input and feedback. I believe it was that day I joined one of the mailing list and started following what's going on. Heck, I even build Drizzle on an Atom-powered MSI Wind PC regularly.

It's great to see a re-think of MySQL going on... keeping the good, getting rid of the bad, and modularizing the stuff that people often want to do differently (authentication, for example).

It's even better to see the group that's hacking on it. They really have their heads on straight.

Unanswered Questions

Why is all this even necessary? Are the "enterprise" customers and their demands taking focus away from what used to be the core use and users of MySQL? Is Sun hard to work with?

It's clear that both the MySQL and InnoDB teams could be doing more to help. But having worked at a large company for long enough, I realize that things are rarely as simple as they should be.

Will this stuff get integrated back into mainline MySQL? Will Linux distributions like Ubuntu, Debian, and Red Hat pick up OurDelta builds? What about Drizzle?

Will Drizzle hit its target and be the sleek and lean database kernel that MySQL once could have been?

Hard to say.

It's hard to guess what the future holds and too easy to play armchair quarterback about the work of others. But these are question worth wondering about a bit.

What's it all mean?

Nowadays MySQL has a much slower release cycle that it used to. It's still available in "commecial" and free ("community") releases. There's still a company behind it--a much larger one in fact. But one that also has a vested interest in showing how it works better on their storage appliances or 256 "core" computers and whatnot.

Clustering is still very niche. Transactions are not.

Meanwhile, all the cutting edge stuff (at least from the point of view of scaling) is happening outside Sun/MySQL and being integrated by OurDelta and even Drizzle. The OutDelta builds are gaining steam quickly and Drizzle is shaping up.

Heck, I'm hoping to get an OurDelta box or two on-line at work sometime soon. And I'd like to put a Drizzle node up too. I want to see how the InnoDB patches help and also play with the InnoDB plug-in (and its page compression).

The next few years are proving to be far more interesting than I might have expected from a project and technology that looked like was on a track straight for Open Source maturity.

And you know what? I like it.

Posted by jzawodn at 07:02 AM

September 08, 2008

Long Term Data Archiving Formats, Storage, and Architecture

posted at 04:24 PM | PermaLink

I'm thinking about ways to store archival data for the long term and wanted to solicit anyone who's been down this road for some input, advice, warnings, etc.

Background

Essentially I'm dealing with a system where "live" and "recently live" data is stored in a set of replicated MySQL servers and queried in real-time. As time goes on, however, older "expired" data is moved to a smaller set of replicated "archive" servers that also happen to be MySQL.

This is problematic for a few reasons, but rather than be all negative, I'll express what I'm trying to do in the form of some goals.

Goals

There are a few high-level things I'd like this archive to handle based on current and future needs:

  1. Be able to store data for the foreseeable future. That means hundreds of millions of records and, ultimately, billions.
  2. Fast access to a small set of records. In other words, I like having MySQL and indexes that'll get me what I want in a hurry without having to write a lot of code. The archive needs to be able to handle real-time queries quickly. It does this today and needs to continue to work.
  3. Future-proof file/data format(s). One problem with simply using MySQL is that there will be schema changes over time. A column may be added or dropped or renamed. That change can't easily be implemented retroactively on a larger data set in a big table or whatnot. But if you don't then code needs to be willing to deal with those changes, NULLs appearing, etc.
  4. Fault tolerance. In other words, the data has to live in more than once place.
  5. Support for large scans on the data. This can be to do full-text style searches, looking for patterns that can't easily be indexed, computing statistics, etc.
  6. It's worth noting that data is added to the archive on a constant basis and it is queried regularly in a variety of ways. But there are no delete or updates occurring. It's a write heavy system most of the time.

Pieces of a Possible Solution

I'm not sure that a single tool or piece of infrastructure will ever solve all the needs, but I'm thinking there may be several open source solutions that can be put to use.

You'll notice that this involves duplicating data, but storage is fairly cheap. And each piece is especially good at solving one or more of our needs.

  1. MySQL. I still believe there's a need for having a copy of the data either denormalized or in a star schema in a set of replicated MySQL instances using MyISAM. The transactional overhead of InnoDB isn't really needed here. To keep things manageable one might create tables per month or quarter or year. Down the road maybe Drizzle makes sense?
  2. Sphinx. I've been experimenting with Sphinx for indexing large amounts of textual data (with some numeric attributes) and it works remarkably well. This would be very useful instead of building MySQL full-text indexes or doing painful LIKE queries.
  3. Hadoop/HDFS and Flat Files or a simple record structure. To facilitate fast batch processing of large chunks of data, it'd be nice to have everything stored in HDFS as part of a small Hadoop cluster where one can use Hadoop Streaming to run jobs over the entire data set. But what's good future-proof file format that's efficient? We could use something like XML (duh), JSON, or even Protocol Buffers. And it may make sense to compress the data with gzip too. Maybe put a month's worth of data per file and compress? Even Pig could be handy down the road.

While it involves some data duplication, I believe these pieces could do a good job of handling a wide variety of use cases: real-time simple queries, full-text searching, and more intense searching or statistical processing that can't be pre-indexed.

So what else is there to consider here? Other tools or considerations when dealing with a growing archive of data whose structure may grow and change over time?

I'm mostly keeping discussion of storage hardware out of this, since it's not the piece I really deal with (a big disk is a big disk for most of my purposes), but if you have thoughts on that, feel free to say so. So far I'm only thinking 64bit Linux boxes with RAID for MySQL and non-RAID for HDFS and Sphinx.

Related Posts

Posted by jzawodn at 04:24 PM

August 21, 2008

Dumber is Faster with Large Data Sets (and Disk Seeks)

posted at 02:21 PM | PermaLink

dumb I remember reading Disk is the new Tape earlier this year and how much it resonated. That's probably because I was working for Yahoo at the time and hearing a lot about their use of Hadoop for data processing. In fact, I even did a couple videos (1 and 2) about that.

Anyway, I recently faced the reality of this myself. When I wrote about The Long Term Performance of InnoDB I'd been beating my head against a wall trying to get millions of records out of InnoDB efficiently.

It was taking days to get all the records. Yes, days!

After joking that it'd probably be faster to just dump the tables out and do the work myself in Perl, I thought about Disk is the new Tape and realized what I was doing wrong.

Allow me to offer some background and explain...

There are several tables involved in the queries I needed to run. Two of them are "core" tables and the other two are LEFT JOINed because they hold optional data for the rows I'm pulling. There are well over a hundred million records to consider and I need only about 10-15% of them.

And these records fall into roughly 500 categories. So what I'd been doing is fetching a list of categories, running a query for each category to find the rows I actually need, processing the results, and writing them to disk for further processing.

The query looked something like this:

    SELECT field1, field2, field3, ... field N
      FROM stuff_meta sm, stuff s
 LEFT JOIN stuff_attributes sa ON sm.item_id = sa.item_id
 LEFT JOIN stuff_dates      sd ON sm.item_id = sd.item_id
     WHERE sm.item_id = s.item_id
       AND sm.cat_id  = ?
       AND sm.status IN ('A', 'B', 'C')

That seemed, at least in theory, to be the obvious way to approach the problem. But the idea of waiting several days for the results led me to think a bit more about it (and to try some InnoDB tuning along the way).

While it seems very counter-intuitive, this was sticking in my head:

I’m still trying to get my head around this concept of "linear" data processing. But I have found that I can do some things faster by reading sequentially through a batch of files rather than trying to stuff everything in a database (RDF or SQL) and doing big join queries.

So I gave it a try. I wrote a new version of the code that eliminated the two AND bits in the WHERE clause. Combining that with using mysql_use_result in the client API, meant it had to process a stream of many tens of millions of records, handle the status filtering and shorting records into buckets based on cat_id (and some extra bookkeeping).

As an aside, I should note that there used to be an ORDER BY on that original query, but I abandoned that early on when I saw how much work MySQL was doing to sort the records. While it made my code a bit easier, it was far more efficient to track things outside the database.

Anyway, the end result was that I was able to get all the data I needed in merely 8 hours. In other words, treating MySQL as an SQL powered tape drive yielded a 12 fold improvement in performance.

Put another way, taking the brain-dead stupid, non-SQL, mainframe-like approach got me results 12 times faster than doing it the seemingly "correct" way.

Now this isn't exactly what the whole disk vs. tape thing is about but it's pretty close. I'm aware that InnoDB works with pages (that will contain multiple records, some of which I don't need) and that's part of the problem in this scenario. But it's a really interesting data point. And it's certainly going to change my thinking about working with our data in the future.

Actually, it already has. :-)

Dumber is faster.

As I've mentioned before, craigslist is hiring. We like good Perl/MySQL hackers. And good sysadmin and network types too.

Ping me if you're interested in either role.

Posted by jzawodn at 02:21 PM

August 12, 2008

The Long Term Performance of InnoDB

posted at 01:05 PM | PermaLink

The InnoDB storage engine has done wonders for MySQL users that needed higher concurrency than MyISAM could provide for demanding web applications. And the automatic crash recovery is a real bonus too.

But InnoDB's performance (in terms of concurrency, not really raw speed) comes at a cost: disk space. The technique for achieving this, multiversion concurrency control, can chew up a lot of space. In fact, that Wikipedia article says:

The obvious drawback to this system is the cost of storing multiple versions of objects in the database. On the other hand reads are never blocked, which can be important for workloads mostly involving reading values from the database.

Indeed.

Imagine a set of database tables will tens of millions of rows and a non-trivial amount of churn (new records coming in and old ones being expired or removed all the time). You might see this in something like a large classifieds site, for example.

Furthermore imagine that you're using master-slave replication and the majority of reads hit the slaves. And some of those slaves are specifically used for longer running queries. It turns out that the combination of versioning, heavy churn, and long running queries can lead to a substantial difference in the size of a given InnoDB data file (.ibd) on disk.

Just how much of a difference are we talking about? Easily a factor of 4-5x or more. And when you're dealing with hundreds of gigabytes, that starts to add up!

It's no secret that InnoDB isn't the best choice for data warehouse looking applications. But the disk bloat, fragmentation, and ongoing degradation in performance may be an argument for having some slaves that keep the same data in MyISAM tables.

I know, I know. I can do the ALTER TABLE trick to make InnoDB shrink the table by copying all the rows to a new one, but that does take time. Using InnoDB is definitely not a use it and forget about it choice--but what database engine is, really?.

Looking at the documentation for the InnoDB plug-in, I expect to see a real reduction in I/O when using the new indexes and compression on a data set like this. (Others sure have.) But I don't yet have a sense of how stable it is.

Anyone out there in blog-land have much experience with it?

Posted by jzawodn at 01:05 PM

April 18, 2008

High Performance MySQL, Second Edition

posted at 07:47 AM | PermaLink

While my involvement can generously be labeled as "minimal", the second edition of High Performance MySQL is slated to hit store shelves soon.

High Performance MySQL, second edition
pre-order on Amazon.com

More info is available on O'Reilly.

Thanks to Baron, Peter, Vadim, and Arjen for picking up the torch to get a greatly expanded seconded edition done and out the door. There's a heck of a lot of new material in it.

Posted by jzawodn at 07:47 AM

January 16, 2008

Congrats to MySQL AB and Sun!

posted at 09:50 AM | PermaLink

As long-time readers here know, I've been a fan of MySQL for many, many years--so much that I even wrote a book about it. It's a fantastic product that's unmatched in many ways. Over the years, I've had the pleasure of getting to know many of the early folks at MySQL AB, including the Monty and David (co-founders), Marten (CEO), numerous engineers (Brian, Jim, Mark), sales folks (Kerry, Larry), and so on.

So I was very happy to see the announcement this morning that Sun is Buying MySQL AB. Sun is a great company that really gets Open Source and is making some big and very smart bets on it for their future. I think it's a great home for MySQL.

There's some smart thinking going on over at Sun these days. That Jonathan guy has a good head on his shoulders. :-)

Congrats again to everyone involved!

Posted by jzawodn at 09:50 AM

January 09, 2007

COEUM Software: Spammers!

posted at 05:07 PM | PermaLink

Normally I ignore the spam that makes it thru to my inbox. It's usually pretty easy to pick out by reading the subject and sender's address (or name). I'll just mark it as spam and go on with life. But this particular gem came thru a few days ago with a subject of "MySQL Performance and Tuning...", a topic that's been near and dear to my heart.

coeum software: spammers
click to see full-sized copy

So you can imagine how pissed I was to open the message, only to be confronted by one of those annoying large image-looking spam messages.

Worst of all is their insulting language at the bottom of the spam:

If you no longer wish to receive these emails...

It's written to imply that there was a point in history when I did wish to receive them.

Fuck that.

If you're looking for real MySQL Performance Consulting (which I used to do), let me know. I can refer you to a non-sleazy company or two. COEUM clearly doesn't deserve your business.

Posted by jzawodn at 05:07 PM

July 10, 2006

Head MySQL Geek Needed at Yahoo!

posted at 02:22 PM | PermaLink

It seems like only yesterday that Jeremy Cole came to Yahoo! to take the job I vacated to join the Yahoo! Search team.

Well, he's out on his own now and we need some serious MySQL geekage around here.

The official job description looks like this:

Yahoo!'s Platform Engineering group is looking for a MySQL expert to provide consulting, training, and internal support for MySQL and data storage technologies. You will be working with teams to help them understand how MySQL may fit into their applications, making internal releases of custom MySQL binaries from source, analyzing database performance, and helping others to tune their hardware and software settings.
An ideal candidate has designed distributed and high-performance application architectures, is familiar with MySQL replication and load balancing, and knows the features, pros, and cons of MySQL's storage engines. Familiarity with Perl DBI, PHP, and MySQL administration tools is also required.
A source-level understanding of MySQL helpful but not required.
A BS/MS in Computer Science or equivalent and 4+ years experience with MySQL and Unix (FreeBSD/Linux) is required.

Shoot me a resume if you're interested.

Posted by jzawodn at 02:22 PM

February 15, 2006

Oracle Acquisitions are not about MySQL

posted at 07:22 AM | PermaLink

I've been thinking about this for the last day or so and have come to the conclusion that Oracle's acquisition of Sleepycat Software (and Berkeley DB) is not about MySQL. Even when combined with their previous purchase of Innobase Oy (and InnoDB), it's not about MySQL.

With all due respect to Phil Windley (and Gadgetopia), you're wrong. Oracle is thinking much bigger and more strategically than "put the squeeze on MySQL."

Trying to put MySQL out of business would be a fairly short-term tactical move. I think Oracle is looking 5 years down the road and seeing what the world looks like as the commoditization of enterprise scale infrastructure software components continues. They're seeing that they "own" a progressively smaller piece of that pie unless they act soon. The rumors of Oracle eyeing JBoss and others are completely in line with this thinking.

If Oracle can become a one-stop shop for folks building the next generation of big business applications, whether or not they use "traditional" Oracle software, the company manages to stay relevant in the new world--and that includes their lucrative consulting services.

Is this reminiscent of IBM's approach to Linux circa 2001? It sure is.

Think bigger guys. Oracle's not just a database company and hasn't been for years.

Now, they could still end up putting the squeeze on MySQL along the way. But I suspect that'd be a happy byproduct of larger moves they're making.

What do you think?

Posted by jzawodn at 07:22 AM

February 14, 2006

Oracle Buys Berkeley DB, Sleepycat Software

posted at 07:50 AM | PermaLink

Wow, the rumors were true. Oracle is snapping up Open Source Database companies now. First it was Innobase (see Oracle buys Innobase. MySQL between rock and hard place?) and now it's Sleepycat Software.

The purchase of Sleepycat, which has been rumored for weeks, gives Oracle another open-source product to complement its proprietary database offerings. At an investor conference last week, Oracle CEO Larry Ellison reiterated the company's strategy to generate revenue from a combination of open-source and proprietary software.

They produce and support the famed Berkeley DB embedded database engine and have radically improved it's features since the version 1.x days. Nowadays you get a small, fast, transactional database engine with industrial grade reliability and replication.

It's interesting to note that MySQL's first transactional storage engine (BDB) was created on top of Berkeley DB. Their more popular transactional storage engine (InnoDB) is built on top of technology produced by Innobase, which Oracle bought last year.

This leads to the obvious question: What is Oracle up to? Are they trying to do to Open Source Databases what Yahoo appears to be doing to Web 2.0 companies?

There's been speculation of a master plan at Oracle that involves buying up various bits of the Open Source infrastructure used in building applications. Is JBoss next, as some have suggested?

We'll see.

Posted by jzawodn at 07:50 AM

January 30, 2006

Someone AJAXified mytop!

posted at 12:31 PM | PermaLink

ajax mytop Check this out. Someone has built and AJAX powered version of mytop, the little console based MySQL monitoring tool I wrote years ago. I guess it's now buzzword compliant.

If you can't wait to get your hands on it, head over to the AjaxMyTop project on SourceForge.

Everything old is new again. :-)

Posted by jzawodn at 12:31 PM

December 02, 2005

MySQL 5.1 Appears: Partitioning, Dynamic Plugins, and more...

posted at 07:16 PM | PermaLink

The first public release of MySQL 5.1 is available now. The MySQL 5.1.3 alpha release is a developer preview that gives early adopters, fans, and hard-core database geeks a chance to kick the tires of the next big release.

Major new features include:

Partitioned tables. You can have single tables spread across multiple disks based on how you define them at table creation time.

A Plugin API and support for dynamically loading new modules of code into the server. The first example of this is pluggable full-text parsers. That means you'll be able to write a custom parser to index any sort of oddball textual data you might want to store and retrieve. MySQL still handles the details of executing the queries, so you need only worry about the specifics of parsing your data.

The instance manager has been beefed up with additional SHOW commands for getting at information about log files. You can also issue SET commands that change configuration options which get written back to the configuration file.

VARCHAR fields on cluster tables really are VARCHAR fields now.

And there's lots more. The MySQL documentation, as always, has the gory details.

Posted by jzawodn at 07:16 PM

October 07, 2005

Oracle buys Innobase. MySQL between rock and hard place?

posted at 09:26 PM | PermaLink

As reported in several sources (Slashdot, InfoWorld, AP on Yahoo, Reuters), Oracle has acquired Innobase Oy for an undisclosed sum of money. This appears to be a strategic move by Oracle to put MySQL between a rock and hard place.

Innobase is the company that provides the underlying code for the InnoDB storage engine in MySQL. It's the de-facto choice for developers who need high concurrency, row-level locking, and transactions in MySQL. For many years now, MySQL AB and Innobase Oy (founded by Heikki Tuuri) have worked closely together to make that technology a seamless part of MySQL.

Like all of the MySQL code, InnoDB is dual licensed. That means you can freely use it under the GPL or buy a license for it if your usage would violate the GPL.

MySQL's public reaction right now isn't the "holy f$@%ing shit!" that likely occurred internally. Kaj Arno, MySQL's VP of Community Relations, sent out a message to many MySQL users today titled " MySQL AB Welcomes Oracle to the FOSS Database Market".

The message began by saying:

MySQL AB and the Free / Open Source database market today received some unexpected recognition by Oracle, through their acquisition of Innobase Oy.
So what does this have to do with MySQL?
Well, Innobase is the provider of the popular InnoDB Storage Engine in MySQL. One of the things our users appreciate about MySQL is its unique pluggable storage engine architecture. You have the flexibility to choose from number of storage engines including MyISAM, Memory, Merge, Cluster and InnoDB. And with MySQL 5.0, we added the new Archive and Federated storage engines.
Just like the rest of MySQL Server and its Storage Engines, InnoDB is released under the GPL. With this license, our users have complete freedom to use, develop, modify the code base as they wish. That is why MySQL has chose the GPL: to protect the freedom that users value in free / open source software.

Later on, Kaj makes an effort to calm the fears of MySQL users by saing that MySQL will continue to support all their users and work with Oracle as a "normal business partner."

The big elephant in the room, however, the uncertainty around Oracle's future plans for the InnoDB source code. Their press release says:

Innobase is an innovative small company that develops open source database technology. Oracle intends to continue developing the InnoDB technology and expand our commitment to open source software. Oracle has already developed and contributed an open source clustered file system to Linux. We expect to make additional contributions in the future.

As well as:

InnoDB is not a standalone database product: it is distributed as a part of the MySQL database. InnoDB's contractual relationship with MySQL comes up for renewal next year. Oracle fully expects to negotiate an extension of that relationship.

I expect those negotiations could be quite interesting. Maybe not next year, but the year after? Oracle could decide to put the squeeze on MySQL someday in a way that hurts their customers but not "the community" (those using the GPL version).

MySQL is now faced with the prospect of licensing technology they cannot ship without from their biggest rival. Interestingly, there's always been once piece of the InnoDB puzzle that's not available under the GPL: the InnoDB Hot Backup Tool. Without it, database administrators cannot backup their InnoDB tables without shutting down MySQL or at least locking out all transactions.

Oracle just bought themselves a whole lot of leverage with MySQL AB and a talented team of database engineers to boot.

I've always wondered why MySQL AB didn't buy Innobase Oy years ago. It always made complete sense from where I sat. But I'm hardly an insider when it comes to the relationship between those companies. Needless to say, that relationship just got far more "interesting."

I hope, for the sake of the community and the company (I've known many MySQL employees for years), that Oracle is true to their promises. But it is Oracle, so I'm naturally skeptical.

Posted by jzawodn at 09:26 PM