On Open Source Citizenship (by Jeremy Zawodny)

With a title like "Why Google and Yahoo! can't be better open source citizens" one might think that our companies were squeezing as much as possible out of the open source world and giving little back.

But after reading Matt Asay's post a few times, I've begun to wonder how much open source code he's been publishing.

Putting aside the many contributions that Yahoo and Google have already made to various open source projects (Linux, FreeBSD, Perl, MySQL, PHP, etc.), I'd like to debunk his conclusion:

All of which means, as Tim pointed out, that these companies have failed to write code according to a cardinal open source principle: modularity. Yahoo! and Google can't open source more code because their code is too tightly bound together - layer upon layer upon layer requiring layer upon layer upon layer. This doesn't mean that Yahoo! and Google are bad, but it is disappointing that they are such heavy users of open source, and have architected themselves into a corner that makes giving back impossible or problematic.

Well, I've got news for you, Matt. Some of our code is tightly bound together. You know why? Because abstractions often cost performance, and performance costs money. When you're serving billions of pages per day, even the small stuff adds up quickly. But there really is a rhyme and reason behind the systems architecture and various bits of code.

But the layering or tight coupling isn't the only problem. Heck, it's not even the biggest problem.

So let's suppose that we decided to release "what we can" into the open source world. Of course, there'd be a lot of legal vetting first. Code licensing is a real minefield, but let's suppose that we cleared that hurdle. It would look as if Yahoo was doing exactly what businesses looking to get into open source are told NOT to do: throwing some half-baked code "over the wall" and slapping a license on it.

You noted this in your article too:

Jeremy eventually owned up to a reason that I found much more compelling - disappointing, but compelling. Jeremy said that Yahoo's applications are tightly bound together, making it difficult to open one piece without giving away information about how the remainder is written, or making it useless because knowing 1/10th of the application wouldn't be helpful (because of all the unknown code).

Right. There'd be places in the code where magic voodoo functions are called but we couldn't really talk about what they do or how they might work. That's called our secret sauce or "business logic" if you prefer. A good deal of that is kept under wraps for very legitimate reasons. It'd be like the FBI and CIA documents you get under a Freedom of Information request. The really juicy stuff is always hidden behind the thick black lines.

Open Source is supposed to be "open", right?

How would that look? Would it encourage outsiders to use, improve, and hack on our code? Or would it make us look like we don't really "get" open source? Like we're trying to get free labor without anything in return?

Putting aside the fact that there'd be some "we can't talk about what happens here" holes in the code, getting contributions back for the code would be... interesting. Are you willing to give us the necessary rights to use it in all the ways we'd like? Bug fixes might come in and be useful, but what about features? "Here's a patch that finally adds 'foo' to Yahoo! Mail." Then we'd be beat up for not integrating it fast enough. It's tricky to introduce new features to a product that tens of millions of people are using.

But I'm jumping the gun a bit. How would a developer test a bug fix or new feature? Aside from the most trivial examples, you'd need to replicate a sufficient chunk of our infrastructure for that to work. Does that mean we have to release the code and documentation regarding our package management and distribution tools? What about our naming conventions, network ACLs, and so on? Where does it stop?

So Matt, I guess what I'm trying to say is that this is way more complicated than "Yahoo and Google don't write modular code." And I suspect that you know that but decided to opt for the cheap headline.

I don't want to end this on a completely negative note, so here's one example of Yahoo! code that's been released on SourceForge under an open source license (BSD) with good documentation and support: YUI, our User Interface library. As I wrote on the YDN blog, a fair number of folks were interested in it. I'm even told that Valueclick is using it to revamp their site. Why not? We're using it all over Yahoo! (Google has examples too, I'm sure. Perhaps ExplorerCanvas is one?)

How about that?

Claiming that we can't be better open source citizens because of our inability to write good software is, pardon my French, just crap. Being good open source citizens means contributing where it makes sense, allowing our employees to be a part of the open source world, and helping to evangelize the benefits of open source software.

I think that if you've spent any time looking at Yahoo's or Google's involvement, you'd notice that we've been doing more and more to help the open source world.

Posted by jzawodn at August 03, 2006 07:40 AM

Reader Comments

# Ryan said:

I think a good example of this is the Python project... while its goal is to tweak it for their own uses and ensure its success, Google is responsible for all the major updates, fixes, etc. in the language.

Granted they hired its inventor... but it's still a good example. They also put out a framework like the YUI (but the YUI seems to be capable of doing more)

Speaking of optimized code, I just bought your book Jeremy... I was doing some web research for mysql optimization and you came up first result for a powerpoint slide that actually helped me with my project... so thanks!

on August 3, 2006 08:23 AM

# Andre Stechert said:

It's easy to take for granted the work that's contributed by Y & G. Google's a bit more obvious: the summer of code, code.google.com, and support of individual contribution are heavily discussed. Yahoo's contributions tend to be a little less publicized, but check out, e.g., how much of the hadoop-dev mailing list traffic originates within Yahoo. That's map-reduce for the masses, in progress, booyah. Unfortunately, like PHP/MySQL/Apache/FreeBSD/etc., it's not started from within Yahoo, so I guess the Apache foundation gets the credit? Too bad nobody notices. On the flip side, don't forget the contributions made by folks like BradFitz/LJ/memcached/perlbal/mogilefs. Also large services. Also serving billions of web pages. However, fully open source and subprojects used heavily outside their company, leading to better subprojects.

on August 3, 2006 09:55 AM

# Jeremy Zawodny said:

Andre:

Good points. Too many people assume that this is done so that we "get the credit", but there are many other good reasons too. It's pretty clear that we benefit from a healthy open source community, so it's in our interest to help it stay healthy and grow.

on August 3, 2006 10:28 AM

# Nate Koechley said:

Small YUI clarification: sibling projects "support" YUI (such as the Design Patterns Library, http://developer.yahoo.com/ypatterns/), but the developer support forum for YUI code is at http://groups.yahoo.com/group/ydn-javascript/ and bug and feature request trackers are at sourceforge: http://sourceforge.net/tracker/?group_id=165715

Thanks,
Nate

on August 3, 2006 10:34 AM

# grumpY! said:

there is a gap between open source projects and community projects. YUI is open source, but it is not a community project. "commit bits" (for lack of a better term) do not appear to exist outside of yahoo's employee base, even though some incredibly detailed and advanced commentary on the codebase has been entered on the mailing list by external users.

ultimately the open tools market tends to gravitate towards community projects. look at redhat/fedora vs ubuntu. look at xfree86 vs xorg. there is a lot of evidence to indicate that locking out the community leads to dead projects where there are community alternatives, or forks where there are no alternatives and the licensing allows.

if yahoo just wants some free bug reporting, the current approach is likely adequate. if yahoo wants to have the dominant dhtml library, a community component is a must.

on August 3, 2006 10:36 AM

# Jeremy Zawodny said:

Nate:

Fixed link, thanks. I actually intended to point to the mailing list, but I guess I had the wrong one in the clipboard.

on August 3, 2006 10:51 AM

# Jeremy Zawodny said:

grumpY!:

Why would Yahoo want a dominant DHTML library? I'm puzzled as to why you'd suggest that.

on August 3, 2006 10:55 AM

# Chris DiBona said:

Well said, Jeremy, well said. You encapsulate my thoughts exactly.

on August 3, 2006 11:45 AM

# grumpY! said:

perhaps "dominant" was not the correct term to use. perhaps "relevant". i do not mean this in a derisive manner. i believe there is ample evidence to suggest that open source projects that do not engage a community tend to become irrelevant.

even to attract a diverse audience of free bug-squashers, the public must perceive the code as relevant.

on August 3, 2006 12:03 PM

# Scott Plumlee said:

Slightly off topic, but do you know if Yahoo donated to OpenSSH in the recent round of donations they received? Mozilla and Google had both donated either $7500 or $10000, I can't recall exactly. Don't know if you have any weight in that department, but it would be a nice (non-code) offering back to the community. I wonder if cash or code is more appreciated by the majority of open source projects and developers.

on August 3, 2006 12:44 PM

# Sam Newman said:

I'm using YUI on my current project - hopefully in a month or two I'll be able to let you know what it is (damn contracts). All I can say is that I've been impressed by its quality and it's saved me a load of time. Some more advanced examples are lacking from the documentation, but hopefully I (and others) will be able to contribute to that over time.

And Scott, as for what is better - cash or code - I guess I'd have to say it depends on how much money, and what code :-)

on August 3, 2006 04:18 PM

# James Briggs said:

Another point to make when outsiders ask about a company's Open Source contributions is this. It is often very hard to get patches accepted unless you employ a committer. I've experienced that with several projects, and that problem is often mentioned at technical talks by others. You can really bang your head against a wall for a long time, often past the point of being worthwhile.

on August 3, 2006 04:34 PM

# Justin Mason said:

to James Briggs --

maybe, since the company is getting all this open source code for free, employing a committer or two is a reasonable price to pay, and a good way to "give back"? TANSTAAFL. ;)

I really didn't get Matt Asay's post. Google and Yahoo! are some of the *best* companies out there, in terms of how much they contribute back to open source -- while they may not open a whole lot of their own code, they certainly employ lots of committers and provide backing for many projects. That counts for a lot.

on August 4, 2006 05:15 AM

# Shrikant Joshi said:

Sometimes I hate open source.

After all, we want open source only when we want to customize things to our own requirements. Other times, we are often happy with what we have.

Maybe I think like that because I am not a developer and I am happy with M$ and its 'monopoly'.

Or maybe...

Regards,
Shri.

on August 4, 2006 05:47 AM

# Tom Harrison said:

Jeremy -

I totally agree with your points. Open Source is one of the most compelling and interesting phenomena of recent years; it has enabled many companies to become great (Yahoo, Google, several I have worked at and many others included) because there was a point when we all had no money and the idea of a cheap box running free software we could fix if we needed to was pretty compelling.

The thing about Open Source is that it's asymmetric -- users are not always the best contributors (except, perhaps of bug fixes or extensions). Most Open Source is written by people who do it for something other than money. One of those things is the basic urge to do something that is uncompromised. Having spent a dirty, dirty week of hacking to get something done, I know that for my company I make compromises that are right for the business. We'll clean the hacks up later, if it makes sense, but the things I have contributed to the world are small but pristine -- libraries of useful methods, and such. If someone uses my code, I am happy, but probably don't know. I wrote it because it was good and unfettered by the conflicts of business.

If we start assuming tit-for-tat, the whole thing breaks.

Tom

on August 4, 2006 03:45 PM

# Dossy Shiobara said:

Jeremy, thanks for explaining one of the difficulties of open sourcing a previously closed source body of work. I put down some of my own thoughts about this as it relates to the AOLserver community here:

http://dossy.org/archives/000316.html

on August 8, 2006 09:40 AM

# Kevin Burton said:

Let's face it. Tightly bound code is bad. It's bad from an engineering perspective. While I was at Rojo I wrote too much tightly bound code. My excuse was that I was under a deadline and I had to get work done.

Once I left, the existing engineers eventually had to take the time to fix things but at least there were more of them (not just me) and the company had funding.

What I'm trying to say is that if you have a piece of core architecture code like memcache, mysql or lbpool (code.tailrank.com/lbpool, which is a recent OSS release of ours) it forces you to clean up your mess which makes you a better coder.

If I were a product manager at Yahoo and I had an alpha geek working for me on infrastructure code I'd insist that it was OSSd so I didn't look like a jerk when the guy moved on to greener pastures.

Kevin

on August 8, 2006 08:54 PM

# Jeremy Zawodny said:

Where does the infrastructure stop and the application begin?

Most of our low-level infrastructure is modular--in the sense that it's a platform used by nearly everyone. But we don't build needless abstraction layers on top of it (usually).

on August 8, 2006 08:58 PM

# chad said:

As a former yahoo, I would love to see something like yinst released. I think it fit the perfect spot for intra-company software deployment. In fact, I routinely taunt my current coworkers with "yinst could do that" or "yinst made this so easy".

Of course, these are purely selfish thoughts because I want to make my life easier, not because I have a desire to make yinst better.

on August 9, 2006 11:36 PM

Disclaimer: The opinions expressed here are mine and mine alone. My current, past, or previous employers are not responsible for what I write here, the comments left by others, or the photos I may share. If you have questions, please contact me. Also, I am not a journalist or reporter. Don't "pitch" me.

Privacy: I do not share or publish the email addresses or IP addresses of anyone posting a comment here without consent. However, I do reserve the right to remove comments that are spammy, off-topic, or otherwise unsuitable based on my comment policy. In a few cases, I may leave spammy comments but remove any URLs they contain.