This is too cool. The inventor of the World-Wide Web, Tim Berners-Lee, will be on Talk of the Nation's Science Friday in moments.
I've never heard Tim at a conference before, but I'd love to someday.
Since I'm a bit too old to go around the neighborhood asking for candy, I'm spending a lot of time working on the book today. No blog stuff.
Okay, maybe. I'll mention that AmphetaDesk 0.93.1 has been released.
So you may have read about the fact that we're thinking pretty hard about re-architecting things to use lots of XML at work. Now we're facing an interesting challenge, and I'd like to ask the blog world for advice. Surely we're not the first group to encounter this.
The problem is that we have tons of data that we need to represent in XML. Much of the data is related to stock tickers. For example, given YHOO, we have earnings information, P/E ratio, average daily volume, EPS, full company name, and so on. However, we also have some data that doesn't map to particular tickers--instead it maps to more general symbols that we use internally (industry codes, etc.).
We're trying to figure out how to create an XML Schema (or several?) that encompasses the full collection of data that we may want to publish both on-line and internally. It's hard to figure out the right approach or where to start.
Do we build one gigantic schema? If so, what problems do we run into down the road? Will we be generating new versions too often?
Should we instead build several schemas? One for us, one for those who consume our data, and others as needed?
If anyone has written about this from a practical point of view, we'd love pointers to it. (At least I would.) Theory is all well and good, but if you haven't been through this exercise before, I'm going to be skeptical about your recommendations. Why? This feels like a hard problem that looks easy on the surface.
Also, what about naming conventions for both namespaces and elements? That's likely to become a semi-religious debate quickly, but it can't hurt (too much) to ask. :-)
This is a little funny. Yahoo got slashdotted today. It was because of Michael Radwin's PHPCon 2002 talk on Yahoo adopting PHP. The funny thing is that it held up just fine--served by a single FreeBSD server running Apache. The hardware was nothing special. So, why is it that when most sites get "featured" on Slashdot, they crumble?
They generally have two fatal flaws: (1) not enough bandwidth, and (2) dynamic content. We're fortunate enough to have some excellent network connectivity, so we can handle a lot of traffic. The fact that public.yahoo.com was serving static files, no PHP or anything fancy, meant that the CPU had time to spare. During the peak of traffic, the CPU was still over 50% idle much of the time. Running a tail -f against the Apache log was quite amusing. It was scrolling really, really fast.
Reading over the comments, I noticed that almost everyone suffered from a similar mental disorder: they didn't bother to read the slides before commenting. It's really pathetic. Won't the Slashdot freaks ever learn?
Ah, well. I suppose that's what's so great about free speech, huh?
The latest release of mytop is now available. Various minor bugs are fixed, and I've added stats for the query cache in MySQL 4.x. The announcement should be on Freshmeat shortly too.
After the conference was officially over, I had some time to hang out with Zak and Jim (from MySQL AB), Shane from ActiveState, Scott, and a few other folks. We munched on the hotel bar's snacks, had a few drinks, and chatted about lots of geeky stuff and some not-so-geeky stuff.
But if there was any doubt as to our true nature, the truth was revealed when Zak busted out a notebook to get down-and-dirty with the source code to figure out the right way of fixing PHP's mysql_pconnect() so that it'd be less wasteful of connections.
Heh. Oh, well. Another conference over. Met lots of good people, some new and some old.
Perhaps some of them will even be at PHPCon East 2003. :-)
The conference was closed by Dirk, one of the founders of Rackspace, discussing the critical role that PHP played in getting Rackspace off the ground. He focused on PHP's integration, quick development times, and flexibility.
He then threw us a bit of a curve ball by revealing that a sizable chunk of their PHP code is being replaced by Python. The silver lining in the story is that SOAP is allowing them to keep much of the customer-facing stuff in PHP and the back-end code in Python.
In chatting after his talk, I was impressed that Dirk remembered me from last year's Open Source Database Summit. He did a keynote talk there too. Apparently he remembered my first talk about Yahoo and MySQL.
On Friday afternoon, I had to spend some time with the conference staff to go over various things. In the time I had left, I bounced between Scott's PHP Security talk and George's High Performance PHP talk. Scott's was a little basic for the audience, as he notes on his weblog. George's seemed dead-on. I learned how to do things in PHP that I knew how to do with Perl--benchmarking and profiling. It's good to know that PHP has those bases covered.
I didn't get a chance to visit Shane's SOAP talk. I would have liked to sit in for a few minutes, but I got caught up in George's presentation.
On Friday morning, I sat in on Stephan Schmidt's "Introduction to XSLT with PHP" presentation. What I found interesting here was not how XSLT works (I already knew that) but the two things I learned. First, I finally got a handle on XPath syntax. I'd heard that it is powerful--a sort of "regular expressions for XML"--but never spent more than 10 seconds looking at it. Now I have a much better appreciation for it.
The second thing I got out of it was an idea of how many XSLT related PHP modules are floating around out there. I expected there was just one or two. The good news is that XSLT with PHP seems to be decent now and rapidly improving.
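To give a flavor of the "regular expressions for XML" idea, here's a minimal sketch. It's in Python rather than PHP, and it uses only the small XPath subset that the standard library's ElementTree understands. The XML document, its element names, and its attributes are all made up for illustration; they're not from Stephan's slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the <talks>/<talk> structure and the "day" and
# "speaker" attributes are invented for this example.
DOC = """<talks>
  <talk day="Thursday" speaker="Rasmus Lerdorf">Opening keynote</talk>
  <talk day="Friday" speaker="Stephan Schmidt">Introduction to XSLT with PHP</talk>
  <talk day="Friday" speaker="Michael Radwin">Making the Case for PHP at Yahoo!</talk>
</talks>"""

root = ET.fromstring(DOC)

# Path expression with a predicate: every <talk> child whose day is Friday.
friday_titles = [t.text for t in root.findall("./talk[@day='Friday']")]

# ".//" matches at any depth, much like a wildcard over the tree.
all_speakers = [t.get("speaker") for t in root.findall(".//talk")]

print(friday_titles)
print(all_speakers)
```

The predicate in square brackets is what makes it feel like pattern matching: you describe the shape of what you want and let the library walk the tree.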
On Friday morning, I attended Michael Radwin's Making the Case for PHP at Yahoo! talk, even though I'd seen it the day before at work. The room was packed. A lot of people were interested in what we're doing with PHP at Yahoo. And Michael's talk did a great job of explaining things.
He started with an overview of Yahoo's server-side "scripting" technology, from the early days all the way through to today. He also spent some time discussing what makes Yahoo special and how that factors into our requirements for a scripting language.
He then discussed the selection process we went through and the benchmarking we performed. Finally, he discussed what we've learned in the 3-4 months since PHP was first deployed at Yahoo.
After all the talks were nearly done, I hooked up with George, Scott, Bryan & Tiffany (of Pyzine fame), and a few others. We headed downstairs for drinks and food in the hotel bar.
We chatted about tons of stuff. Google, Yahoo, weblogs, Dave Winer, O'Reilly's lack of PHP books, the World Series, and so on. The food wasn't terribly good (don't order the peach cobbler) but the beer was.
Other groups ventured up to San Francisco for food, dancing, and other festivities.
On Thursday evening, the last Work-in-Progress talk I attended was Philippe Lewicki's "Enterprise Application Migration to PHP/MySQL" in which he described his company's approach to migrating a typical business application to the web using PHP and MySQL. The current system runs on a Mac server with clients on Windows. The clients can generate simple reports and graphs, as well as running standard queries and entering new data. By using Mozilla, MySQL, PHP, and some interesting PHP modules and add-ons, they've been able to provide a pretty compelling web and open source-based alternative.
Some of the things he demonstrated were really impressive. I'm starting to wish I had taken more notes. Or any notes.
On Thursday evening, I attended George Schlossnagle's Work-in-Progress talk on Apache_Hooks, a project to allow PHP to access the various request phases of Apache. George is actually an accomplished mod_perl and PHP hacker. So he got involved with this project (originally conceived by Rasmus) to try and level the playing field a bit between PHP and mod_perl.
To provide a bit of context, imagine being able to write an Apache authentication handler in PHP that could then let control pass on to a mod_perl content handler.
Apache_Hooks is still very experimental but it seems to work reasonably well. It's in a separate branch of the PHP CVS repository for now. Nobody knows if or when it'll become mainstream, but it is very cool stuff. It sounds like a few folks were interested enough to try running it in production.
I have a copy of his presentation, but I'm sure he'll have one on-line soon. Check the PHPCon web site in a week or two. We're trying to gather all the presentation links there.
This WiP also demonstrated the power of wireless networking in an amusing way. George was having trouble with the projector because his newer TiBook doesn't have a standard VGA out plug and he forgot the adaptor for it. Nobody else had one either. We puzzled over what to do until someone realized that 5 or 6 of the 10 of us in the room all had laptops with 802.11b cards and VGA out. So we set up an ad-hoc wireless network and FTP'd the slides from George's machine.
Update: As George notes, his Apache Hooks talk is now on-line.
On Thursday afternoon, I attended Scott Johnson's Software Engineering Practices for Large-Scale PHP Projects talk. His talk was very popular (had to move to the ballroom) and very good. Scott did an excellent job of reminding us all that just because PHP is easy to code, we shouldn't take shortcuts and forget everything that software engineering stands for.
Much of the advice was very practical and often backed up with real-world examples to illustrate some of his points. Give the talk a look.
On Thursday afternoon, I gave my Scaling MySQL and PHP talk. Amusingly, I put the talk together only 1.5 days before the conference after I found out that a speaker had canceled and they needed to fill a spot.
The alternate title for the talk is "The making, breaking, and repair of remember.yahoo.com." It covers the project to build remember.yahoo.com in one week's time, the site launch, and most of the problems we faced while it was on-line.
The talk was well received and I enjoyed presenting it. It's always fun to say, "look, we did some stupid things--try and learn from our mistakes."
On Thursday morning, I attended Christian Wenz's talk about Microsoft's ASP.NET, Web Services, and PHP. He began with a short introduction to Web Services and XML. Then he demonstrated building and using Web Services using Microsoft's ASP.NET and C#. Then he discussed how this relates to PHP. Is PHP dead? Can it compete?
He then discussed a few ways that you can both produce and use Web Services using various PHP modules. It's not as easy as Microsoft makes it, but it's certainly not impossible. And it's only going to get easier as time goes on.
I actually learned more from this talk about ASP.NET than I did about PHP. I feel like I understand both technologies better as a result.
On Thursday morning, Rasmus Lerdorf (now working at Yahoo) gave the opening keynote. (Expect it to appear here someday.) He covered physics, rocket science, the web problem, and a little bit of PHP along the way. One of his main points was that the web problem isn't fundamentally difficult. Unlike complex web software from various commercial vendors, PHP provides the basic tools you need to build solutions to "the web problem" without feeling like you need a degree in rocket science to get started.
There was a bit of discussion about changes to the language in PHP 4.3 and/or 5.0. The one point that came up repeatedly is that PHP will create references to objects by default, rather than copying them. That may break some existing code, but it'll do What Everyone Already Expects so it's a Good Thing.
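For anyone wondering why that distinction matters, here's a rough sketch of the two behaviors. It's Python, not PHP: Python already hands objects around by reference, so an explicit copy.copy() stands in for the copy-by-default behavior being replaced. The Counter class and increment() function are hypothetical names I made up for the illustration.

```python
import copy


class Counter:
    """Hypothetical class, used only for this illustration."""

    def __init__(self):
        self.value = 0


def increment(counter):
    # Modifies whatever object it was handed.
    counter.value += 1


# Reference semantics: the function works on the caller's object,
# so the caller sees the change.
shared = Counter()
increment(shared)
by_reference_result = shared.value

# Copy semantics (simulated with an explicit copy): the function works on
# a duplicate, so the caller's object is left untouched.
copied = Counter()
increment(copy.copy(copied))
by_copy_result = copied.value

print(by_reference_result, by_copy_result)
```

The second case is the surprise: the change silently disappears. Code that quietly relied on getting a copy is the kind that could break under the new behavior.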
Now that I'm mostly recovered from PHPCon 2002 (still a lot of work e-mail to plow through), I'll try and recount what I remember of the keynotes, sessions, and so on.
In general, I enjoyed the conference a lot. Met some interesting people and learned some new tricks--as always.