Jeremy Zawodny's blog: May 06, 2003 Archives

May 06, 2003

Microsoft "knows" the Internet

Somehow I ended up on Microsoft's web site. I clicked around and found some on-line quizzes.

Find out how smart you really are by taking the Faster Smarter Online Challenge. You may enter the sweepstakes once per quiz. The more quizzes you play, the better your chances to win a digital camera or other cool prizes!

Oh boy! I wanna win!

I clicked on quiz #3 (Internet) because I think I know something about the Internet--maybe even as much as Bill himself.

It required a plug-in. The window that popped up had the title Faster Smarter Internet. Click the image at the right to see for yourself.

Riiiiight. Yeah, those guys up in Redmond really do get the Internet, don't they?

I guess I failed the quiz. You win, Bill. I'm not worthy.

Posted by jzawodn at 09:17 PM

Ready, Aim, ...

This might be fun as long as you don't cross the streams.

Posted by jzawodn at 08:25 PM

Tidbits

There's some stuff that I've been meaning to check out. But I haven't had a chance yet:

Hackers and Painters (Paul Graham)
The Hundred-Year Language (Paul Graham)
Related Entries (Kalsey)
MTSQL (Brad Choate)
Rsync Snapshots (Mike Rubel)
Blog Buttons
Mac OS X Journaling

Well, at least I can close all those browser windows now. I'll let the blog remember.

Posted by jzawodn at 08:12 PM

Nice Pics

I don't remember how I came across them, but I wish I took pictures this nice.

Posted by jzawodn at 08:04 PM

Debugging, step #1

Always verify that the bug exists before you go looking for the cause.

I just finished tracking down a bug that boiled down to one of two things, depending on how you interpret the situation: (1) wrong expectations, or (2) not understanding the algorithm.

Instead of verifying the bug, I set off looking for the cause. After quite a while I felt no closer to finding it, so I decided to attack it in reverse. In doing so, I convinced myself that the output was correct. Then I was able to explain it to the person who reported the bug to me and all was well.

In this particular case, the code was relatively young. I wrote it a month or two back to generate the "related search" feature on Yahoo Search. For example, when you search for "jeremy" you'll see several related searches at the top of the result page.

The algorithm is really quite powerful and produces some fascinating insights. I implemented and tuned it; I did not invent it.

Anyhoo, part of the problem was that the code normally works with millions and millions of lines of input (and may take days to finish). But the test case that "proved" the bug contained maybe 20 lines of input. In working with millions of lines of input, there's a lot of noise that we throw out. You see, the haystack in which the needle is buried also contains a fair amount of dung. But the dung threshold for millions of lines of input is vastly different from what you'd use for 20 lines of input.

The net result was that the 20-line run produced less data than expected, even though the code was doing just what it was designed to do.
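To make the threshold effect concrete, here's a toy sketch in Python. It is not the related-search algorithm; it only counts how often each line appears and drops anything below a cutoff. The function name, the sample queries, and both cutoff values are made up for illustration.

```python
from collections import Counter

def frequent_items(lines, min_count):
    """Return the items that occur at least min_count times in lines."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return {item: n for item, n in counts.items() if n >= min_count}

# Hypothetical 20-line test input: a few repeated queries plus one-off noise.
small_input = (
    ["jeremy"] * 4
    + ["mysql"] * 3
    + ["perl"] * 2
    + [f"noise{i}" for i in range(11)]
)

# A cutoff sized for a huge input throws away everything in the small one.
print(frequent_items(small_input, min_count=50))  # {}

# A cutoff sized for 20 lines keeps the repeated queries and drops the noise.
print(frequent_items(small_input, min_count=2))
# {'jeremy': 4, 'mysql': 3, 'perl': 2}
```

Same code, same input, and the only thing that changed is the cutoff. With the big-input setting, an empty result is the correct answer.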

Had I spent 5 minutes up front doing a sanity check, I'd have noticed the "bug" quite a while ago.

Lesson learned. Verify bug first, then look for bug cause.

Posted by jzawodn at 04:19 PM

Over 20,000 queries per second

I was running some MySQL benchmarks the other day to test performance with a small (mostly static) database and a big query cache. Imagine my surprise when I was able to get over 20,000 queries per second.

Wow.

Here's the best part--the hardware is over two years old.

Yup. The MySQL box was a Dual P3-933 with 1GB RAM and some 10k RPM SCSI disks in a RAID-5 setup. (Not that the disks mattered at all.) It was running MySQL 4.0.12 on RedHat 9.0 with the 2.4.20-6smp kernel.
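For anyone wondering how a number like that gets computed, here's a minimal Python sketch of the arithmetic: run a query callable many times and divide the count by the elapsed time. This is not the benchmark tool I used. The names queries_per_second and fake_query are hypothetical, and fake_query is a dictionary lookup standing in for a cached result, so it never touches a real MySQL server.

```python
import time

def queries_per_second(run_query, n_queries):
    """Call run_query n_queries times and return the measured throughput."""
    start = time.perf_counter()
    for _ in range(n_queries):
        run_query()
    elapsed = time.perf_counter() - start
    return n_queries / elapsed

# Stand-in for a cached query result (assumption: no real database involved).
cache = {"SELECT 1": 1}

def fake_query():
    return cache["SELECT 1"]

qps = queries_per_second(fake_query, 100_000)
print(f"{qps:,.0f} queries per second")
```

To measure a real server, swap fake_query for a function that sends a query through whatever client library you use. Keep in mind that a single-threaded loop like this mostly measures one client's round trips, not what the server can handle under concurrent load.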

Now, where can I find a dual 2.4GHz machine on which to repeat the test... :-)

Update: It looks like I'll be able to run the test on a dual 2.8GHz FreeBSD 4.8 box in the next couple of days. Excellent.

Posted by jzawodn at 10:33 AM