Gmail Spam Filtering Update (by Jeremy Zawodny)

Last week I asked "Did Gmail's Spam Filtering Freak Out This Week?" because it certainly had for me.

I was contacted by a member of Gmail's anti-spam team over the weekend. He asked for a few sample messages and was then able to diagnose the problem in fairly short order. In the meantime, several days' worth of manually reclassifying email as "not spam" had improved things quite a bit.

At this point it's almost back to normal. However, it'll be a few weeks before I trust it to the point that I can ignore the spam folder on most days. But I'm impressed by the speed with which the system seemed to learn from my inputs and the team's interest in getting this resolved.

Posted by jzawodn at January 18, 2007 05:56 AM

Reader Comments

# Alistair said:

Why would you be surprised by that though, it is in their interest to want to resolve something like that as soon as possible?

Al.

on January 18, 2007 06:35 AM

# Jeremy Zawodny said:

I'm always surprised when large organizations react quickly to the problems of a single individual.

Call me cynical, but I've dealt with too many banks, insurance companies, and enterprise software vendors.

Maybe it's an American thing... I dunno. But either way, it's a compliment. :-)

on January 18, 2007 06:52 AM

# Justin Mason said:

'I'm always surprised when large organizations react quickly to the problems of a single individual.'

s/individual/A-list blogger/, dude ;)

So they maintain individual probabilistic-classifier training databases for each user? interesting! I think they may be the only large ISP to do so.

on January 18, 2007 07:52 AM
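
As an illustration of the idea in the comment above, here is a minimal sketch of what per-user training data for a probabilistic classifier could look like. It uses a plain naive Bayes model with add-one smoothing; the class name `PerUserSpamFilter` and its methods are hypothetical, and nothing here is claimed to reflect how Gmail actually works.

```python
import math
from collections import Counter, defaultdict


class PerUserSpamFilter:
    """Naive Bayes with separate token counts per user (an assumed design)."""

    def __init__(self):
        self.spam_tokens = defaultdict(Counter)  # user -> token counts in spam
        self.ham_tokens = defaultdict(Counter)   # user -> token counts in non-spam
        self.spam_msgs = Counter()               # user -> number of spam messages
        self.ham_msgs = Counter()                # user -> number of non-spam messages

    def train(self, user, text, is_spam):
        """Update only this user's counts from one labeled message."""
        tokens = text.lower().split()
        if is_spam:
            self.spam_tokens[user].update(tokens)
            self.spam_msgs[user] += 1
        else:
            self.ham_tokens[user].update(tokens)
            self.ham_msgs[user] += 1

    def spam_probability(self, user, text):
        """Estimate P(spam | text) from this user's training data alone."""
        spam, ham = self.spam_tokens[user], self.ham_tokens[user]
        vocab = len(set(spam) | set(ham)) + 1
        n_spam, n_ham = sum(spam.values()), sum(ham.values())
        # Smoothed prior odds from the user's own message counts.
        log_odds = math.log((self.spam_msgs[user] + 1) / (self.ham_msgs[user] + 1))
        for tok in text.lower().split():
            p_spam = (spam[tok] + 1) / (n_spam + vocab)
            p_ham = (ham[tok] + 1) / (n_ham + vocab)
            log_odds += math.log(p_spam / p_ham)
        # Clamp to keep exp() in range for long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1 / (1 + math.exp(-log_odds))
```

With this layout, the same word can count as evidence of spam for one user and as evidence of legitimate mail for another, since each user's "not spam" corrections only touch that user's counts.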

# Jeremy Zawodny said:

I've long believed that their spam filtering has a non-trivial personal component to it. That's the only way I've been able to explain its behavior and excellent performance over the years.

Of course, there's obviously a lot of system-wide smarts that go into the filtering too. Let's face it, it's easier to identify a lot of the spam out there when you're able to see a lot of the spam out there in high volume.

on January 18, 2007 07:55 AM

# Xavier Cazin said:

I've always considered the "Report Spam" button a perfect example of how the long tail can improve an application without even noticing it: if 100 humans decide that the same email is (or is not) spam, Gmail gets far better information than from any pattern recognition algorithm, and they will take advantage of that to confirm to the next user (aka the next spam reporter) that one of their best features is their anti-spam feature. Which is true.

on January 18, 2007 10:06 AM

# Chris Messina said:

Yeah, actually, I thought I was alone but apparently not.

I'm averaging about 3000 spams a day -- mostly due to my catchall account, which I've now redirected to a new Gmail account (which only gets stuff from services I've signed up for). After having that account for 3 days, it has 20,000 spam messages. Thankfully they're no longer in my main inbox.

But yeah, I also experienced the painful blip in their algorithm adjustment.

on January 18, 2007 06:23 PM

# lundy said:

On a semi-related note, I use hotmail as my secondary e-mail account. Do they have ANY spam filters? It's ridiculous. Even e-mails I've marked as junk mail in the past get through.

on January 19, 2007 06:10 AM

# Brian Ewins said:

A recent google techtalk on 'Human Computation':
http://video.google.co.uk/videoplay?docid=-8246463980976635143

suggested that a key factor in people-based-filters is to put knowns in with the unknowns, eg gmail might *add* small amounts of known spam to your mail, so that if a spambot user is rating all spam as non-spam it'll show up.

I have to wonder if this is the real cause of the trickle of spam I still get through gmail. Some of the recent 'lottery' spams I've had are too obvious not to have been caught.

on January 23, 2007 02:48 AM
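
The "knowns in with the unknowns" idea from the comment above can be sketched roughly as follows: seed a few messages whose label is already known, then track how each user rates them. The class names, the accuracy threshold, and the minimum sample size are all hypothetical choices for illustration, not anything Gmail is known to do.

```python
from dataclasses import dataclass, field


@dataclass
class RaterRecord:
    """Tracks how one user labels messages whose true label is already known."""
    correct: int = 0
    total: int = 0

    def accuracy(self) -> float:
        # With no evidence yet, report a neutral value.
        return self.correct / self.total if self.total else 0.5


@dataclass
class GoldSeeder:
    # Hypothetical threshold and minimum sample size, chosen for illustration.
    min_accuracy: float = 0.8
    min_samples: int = 5
    raters: dict = field(default_factory=dict)

    def record(self, user: str, known_is_spam: bool, user_said_spam: bool) -> None:
        """Record a user's verdict on a seeded message with a known label."""
        rec = self.raters.setdefault(user, RaterRecord())
        rec.total += 1
        if known_is_spam == user_said_spam:
            rec.correct += 1

    def is_trusted(self, user: str) -> bool:
        """Trust a user until enough seeded messages show low accuracy."""
        rec = self.raters.get(user)
        if rec is None or rec.total < self.min_samples:
            return True
        return rec.accuracy() >= self.min_accuracy
```

A spambot account that marks every seeded spam message as "not spam" would fall below the threshold after a handful of samples, at which point its reports could be ignored or down-weighted.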

# GlobalWarming Awareness2007 said:

I think there is definitely a global component as well. I'm always receiving those blatantly spammy job offers through my application/resumé gmail redirect address (which I'm certain is being found through putting that address on my resumé on HotJobs).

They all go like this:
From: Careers
Dear ,

Recently my assistant saw your resume online and forwarded it to me as a possible candidate for a tutor position with our company.

http://www.goodgradesnow...

Click here to be removed from any future employment offers.

on January 24, 2007 08:28 PM

# said:

Google spam problems affect more than gmail. Here's an email I sent to Google today:

I am having a huge (for me) new spam problem - not with my new gmail account (????@gmail.com), but with the account listed as my gmail default from/reply to address.

I don't think you adequately inform accountholders of the exposure risk to their other email accounts, and I intend to post this correspondence online to increase public awareness of this particular security risk in using a gmail account.

The details:

I opened a gmail account last month (mid Feb). As my default from/reply to address, I chose an address I've had a few years, but seldom used (?????@?????.rr.com).

From Mar 1-13 only, I had my gmail account retrieve mail from a few of my other POP3 accounts. All mail incoming to gmail was forwarded to yet another of my addresses.

Since I opened my gmail account, I have sent only ONE email from that gmail account. I used the ?????@?????.rr.com from/reply to address. I sent it to someone I regularly mail to using my other email accounts.

In the last 14 hours (Mar 19) I have received over 100 spams to my ?????@?????.rr.com email address (not coming through gmail). I have never received any spam to that address before - or that much spam to all my 7 other accounts combined! (I have tight security on my systems, and use the Cloudmark spam filter.)

This new spam must be a security problem originating with gmail, since it's never been a security problem with any of my accounts (or the mail recipient's account) in the past.

Today I have removed all reference to other email addresses from my gmail account, and permanently deleted all mail.

Unfortunately, I don't expect that will solve my new spam problem.

I expect an apology from Google. More than that, I expect your prominent disclosure of NEW security risks to your accountholders' other email addresses.

on March 19, 2007 01:51 PM

# Randy Stewart said:

Jeremy-

Hmmm....methinks you have connections in high places in Mountain View, but maybe I'm just cynical. False positives, IMHO, are far worse than false negatives. I've been getting quite a few of them myself as I'm POPping other accounts into Gmail. One of my Google Alerts even ended up in my spam folder...

I mentioned in an email to you recently about Boxbe.com, the email marketplace I'm working on. Today we launched integration with Gmail, so rather than using a new Boxbe email address, you could use your existing email address at Gmail.

While this won't help your false positives issue, it might help some of the other commenters out.

Here's Boxbe's URL http://www.boxbe.com
and instructions on how to set it up to work with Gmail -
http://blog.boxbe.com/help/how-to/integrating-boxbe-with-gmail

Cheers,
Randy Stewart

on March 20, 2007 11:32 AM

# Challenge-Response hater said:

Boxbe is yet another scummy Challenge-Response system, with a few twists.

Users of Challenge-Response anti-spam systems are either selfish or ignorant! Either they're IGNORANT of the fact that the system forces all the victims whose addresses were forged in the spam sent to the user to filter the user's spam for them - i.e. deal with challenges received in response to mail the victims never sent, or they're NOT ignorant of this fact, and use the system anyway, which means they're SELFISH.

I'm sure Randy is aware that boxbe sends tons of spam to innocent victims. It seems he doesn't care and is eager to peddle his product here too.

Boxbe marketing material:
"Anyone who isn't on your Guest List will receive a request to verify message before it is delivered to your inbox." (which goes on: "Legitimate marketers who want to reach you have the option of paying a small fee that you set so that they can get their message through to you.")

on April 5, 2008 10:15 AM

# Shawn said:

You were contacted by a member of Gmail's anti-spam team?!!! I'm jealous.
I know you posted this two years ago, but I am in the throes of exactly the same problem. Other than replies to an email from me, EVERY email I have received in the last few days has gone to spam, including those in my contacts. I am now watching my spam folder diligently (I guess I can call it my inbox until resolved), but can find no way to resolve the problem. Oddly, there doesn't seem to be any real spam getting in there though, and before I would say there were 100 per day... how do I get a Gmail human to get in touch?

Shawn

on March 6, 2009 10:51 AM

Disclaimer: The opinions expressed here are mine and mine alone. My current, past, or previous employers are not responsible for what I write here, the comments left by others, or the photos I may share. If you have questions, please contact me. Also, I am not a journalist or reporter. Don't "pitch" me.

Privacy: I do not share or publish the email addresses or IP addresses of anyone posting a comment here without consent. However, I do reserve the right to remove comments that are spammy, off-topic, or otherwise unsuitable based on my comment policy. In a few cases, I may leave spammy comments but remove any URLs they contain.