You'd think the comment spammers would be a bit smarter, but apparently not. Over 80% of all attempted spam hits on my site provide no HTTP Referer data. None of them work, of course, because my MT install isn't quite what they think it is (they don't know to type "jeremy" in the extra field). But it still takes up a bit of CPU effort to ask cgiwrap to fire up mt-comments.cgi and whatnot.

So I finally did what I've been meaning to do for a while now:

    RewriteCond  %{HTTP_REFERER}        ^$
    RewriteCond  %{REQUEST_METHOD}      ^POST$
    RewriteRule  ^/mt/mt-comments\.cgi  - [F]

And 20 seconds after I did an apachectl graceful it blocked an attempt.
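
For anyone who doesn't read mod_rewrite, the decision those three lines make can be sketched in Python. This is only an illustration of the logic, not how Apache evaluates the rules internally; the function name `should_block` and its parameters are made up for the example.

```python
import re


def should_block(method: str, path: str, referer: str) -> bool:
    """Mirror the rewrite lines: RewriteCond lines are ANDed by default,
    and the rule only fires when the request path matches its pattern."""
    no_referer = re.search(r"^$", referer) is not None      # %{HTTP_REFERER} ^$
    is_post = re.search(r"^POST$", method) is not None      # %{REQUEST_METHOD} ^POST$
    hits_script = re.search(r"^/mt/mt-comments\.cgi", path) is not None
    return no_referer and is_post and hits_script           # True means 403 ([F])


print(should_block("POST", "/mt/mt-comments.cgi", ""))                       # True
print(should_block("POST", "/mt/mt-comments.cgi", "http://example.com/x"))   # False
print(should_block("GET", "/mt/mt-comments.cgi", ""))                        # False
```

A missing Referer header shows up as an empty string, which is why `^$` catches it. The `-` in the rule means "don't rewrite the URL", and `[F]` sends back 403 Forbidden.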

Morons, I tell you.

Posted by jzawodn at January 08, 2005 04:17 PM

Reader Comments
# Matt said:

Not morons, they just do as little work as possible for as long as possible. Referrer checking, hidden form fields, randomized time-limited and IP-based hidden form fields, everything (except a JS hashcash implementation) has only worked for a few weeks. WP Hashcash hasn't been defeated yet, but the spammers don't need to because they've just moved to Trackbacks.

on January 8, 2005 04:38 PM
# Jeremy Zawodny said:

Yeah, you're exactly right.

But "morons" was the best insult I could come up with without saying "fucktards" I guess.

on January 8, 2005 04:40 PM
# Nick W said:

Sensible spammers wouldn't be trying to spam a tech blog heh.. you could just 302 the really persistent ones and really teach them a lesson...

The sensible ones I know are trying to avoid live blogs and hit abandoned ones: longer shelf life for their links and no one gets hurt.

Still, it's a piece of cake to write a bot, so I guess with the bar being so low, 80% not even bothering to put in referer data is not so surprising.

I'm ashamed to say I don't know mod_rewrite, but it looks pretty simple - care to explain it, Jeremy?

ADDED - Oh man... I just PM'd you at threadwatch because I couldn't comment! - took me a while to figure out that I have blocked the referer header in Firefox heh... man, I need to get some sleep..

Collateral damage, Jeremy...

on January 8, 2005 05:24 PM
# Wilhem said:

The collateral damage is not just for Firefox users: there are several firewall and "security" products, especially the awful but extremely popular Norton Internet Security series which block HTTP_REFERER by default. None of these people can now comment on your blog either.

on January 8, 2005 06:03 PM
# Greg said:

Now the spammers will switch to GET, or doesn't MT support that?

on January 8, 2005 06:15 PM
# josh said:

As somebody's already mentioned, a lot of software firewalls block this. I had this in place for a while (only directly in the CGI) but people complained, so I made it check the referrer and, if it doesn't exist, forward them to a new page (mt-comment2.cgi or something) without the referring bit (and a brief explanation).

on January 8, 2005 06:19 PM
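
The softer approach josh describes, sending referer-less commenters to a second page instead of refusing them, might look roughly like the sketch below. Everything named here is an assumption for illustration: `route_comment_post`, the `Decision` type, and the `/mt/mt-comment2.cgi` path (josh only says "mt-comment2.cgi or something").

```python
from typing import NamedTuple


class Decision(NamedTuple):
    status: int      # HTTP status to send back
    location: str    # redirect target, empty when there is none


# Hypothetical second comment form that skips the referer check.
FALLBACK_URL = "/mt/mt-comment2.cgi"


def route_comment_post(method: str, referer: str) -> Decision:
    """Let normal requests through; redirect referer-less POSTs to the fallback."""
    if method == "POST" and referer == "":
        return Decision(302, FALLBACK_URL)
    return Decision(200, "")


print(route_comment_post("POST", ""))
print(route_comment_post("POST", "http://example.com/post.html"))
```

Browsers generally turn a redirected POST into a GET and drop the form data, which is presumably why the fallback page carries a brief explanation and asks the visitor to submit again.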
# Darryl said:

The trackback spammers have gone crazy. I've blocked hundreds in the last day or two. I think the target is definitely old weblogs where no one is bothering to up the protection. That must be ripe picking right now....

So, Jeremy, what do those mod_rewrite rules do?

on January 8, 2005 08:12 PM
# Jason Clark said:

Enjoy the morons while you can, they always improve. I see very few referer-less spammers on my site anymore. Now, they work in pairs. One machine (or group of machines, with IP addresses from all over) will come through GETing my pages, and spamming the referer. Shortly thereafter, a second machine (or group) comes through POSTing the spam, using the correct referer. By strictly policing my referers and using mod_rewrite to 403 known bad referers, I'm reducing the attempts, but it's labor intensive.

Your extra field seems like a good idea- not hidden, so GET-and-POST bots don't work, and not image based (a la CAPTCHA) so accessible.

For me, the single most effective spam trap has been to disallow the use of HTML tags. Because my comment system allows the use of Markdown syntax, literal tags can always be assumed to be spam. Of course, if everyone did it, it wouldn't work.

My next step is to look into using mod_security to inspect all POST requests for missing standard headers, but as always, it will remain an arms race.

on January 8, 2005 08:50 PM
# Jason Clark said:

Correction to my last comment: I disallow the use of HTML <A> tags, and treat any seen as a spam flag. (forgot to escape)

on January 8, 2005 08:51 PM
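
A check along the lines Jason describes could be as small as the following. This is a guess at the idea rather than his actual code; `ANCHOR_TAG` and `looks_like_spam` are names invented for the sketch.

```python
import re

# Matches an opening anchor tag in any letter case, e.g. <a href=...> or <A HREF=...>.
# The \b keeps other tags that merely start with "a" (such as <abbr>) from matching.
ANCHOR_TAG = re.compile(r"<a\b[^>]*>", re.IGNORECASE)


def looks_like_spam(comment_text: str) -> bool:
    """Flag a comment that contains a literal HTML anchor tag."""
    return ANCHOR_TAG.search(comment_text) is not None


print(looks_like_spam('Cheap pills <A HREF="http://example.com/">here</A>'))  # True
print(looks_like_spam("See [my site](http://example.com/) for details"))      # False
```

Markdown-style links pass untouched, so legitimate commenters who use the supported syntax are not affected.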
# kasia said:

I go a little further with my check..

RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} !.mt-tb\.cgi*
RewriteCond %{HTTP_REFERER} !.*unix-girl\.com.*
RewriteRule (.*) /post_error.html [R,L]

on January 8, 2005 08:56 PM
# said:

You don't need those .* matches at the end of the rules, they will match partial unless you tell it otherwise by ending the match with $ :)


And blocking links in comments seems like a pretty good idea... If my pagerank ever gets up high enough that I get hit by bots that are smart enough to see the hidden input, I might look into that...

on January 8, 2005 09:09 PM
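
The point about the trailing `.*` is easy to check. Python's `re.search` is unanchored in the same way a mod_rewrite pattern is, so it works as a stand-in here; the referer URL below is invented for the example.

```python
import re

# Made-up referer on the site from kasia's rule.
referer = "http://www.unix-girl.com/blog/archives/000123.html"

# With and without the surrounding .* the pattern matches the same strings.
with_wildcards = re.search(r".*unix-girl\.com.*", referer) is not None
without_wildcards = re.search(r"unix-girl\.com", referer) is not None

# Only an explicit anchor such as $ restricts where the match may sit.
anchored_at_end = re.search(r"unix-girl\.com$", referer) is not None

print(with_wildcards, without_wildcards, anchored_at_end)  # True True False
```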
# Collin said:

Uh, that was odd. Sometime between when I entered my comment and hit post, it managed to forget my info and post it anyway... Strange that it allows even a blank name :)

on January 8, 2005 09:10 PM
# Che Dong said:

Hi Jeremy:
My blog pinged this article, but the ping came back with:

2005.01.09 07:29:14 220.249.25.80 Ping 'http://jeremy.zawodny.com/mt/mt-tb.cgi?tb_id=3820' failed: Need a TrackBack ID (tb_id).

Che Dong

on January 8, 2005 11:24 PM
# Jeremy Zawodny said:

It's a known "feature" (or bug) of MT 3.xx that you're using. Other folks can Trackback me just fine.

I suppose it'll go away when I upgrade to MT 3.xx myself someday.

on January 8, 2005 11:30 PM
# Johan Petersson said:

People with browsers and firewalls that block referrers should already experience lots of other problems. Many sites use the field to limit access to images (to prevent "hotlinking") and downloads.

Blocking the referrer field is not a particularly net-friendly thing to do, and the security/privacy benefits of doing so are negligible. On a company site you may want to cater for such users, but for a personal blog requiring a good referrer is not an unreasonable restriction.

on January 9, 2005 04:41 AM
# Mike Boone said:

I'm not running MT, but I've been watching some comment spammers spidering my site. In the past couple months they've been providing bogus referer data. It started with fake .info domains (which I started blocking) but now they're using less easily blockable domains. The losers spidering my site have expanded past the blog into the rest of the site, so now my referer logs are almost useless...probably about 75% of the referers listed are junk. They've been unsuccessful at comment spamming me, but the referer spam makes up for it. :(

on January 9, 2005 06:39 AM
# Moe said:

Jeremy, didn't you mention that comment spammers are reading your blog the other day?

on January 9, 2005 11:18 AM
# Jeremy Zawodny said:

The comment spammers who read my blog (at least the ones I know of) are smarter than this.

on January 9, 2005 11:35 AM
# Moe said:

I guess they're not into phentermine, then ;-)

on January 9, 2005 01:43 PM
# Arvind said:

This method has been beaten to death many times. I've seen a similar system around since the 14th of July 2004. See here: http://blog.kung-foo.tv/archives/001037.php and the comments discuss why it isn't a good system

on January 10, 2005 06:09 AM
# c. s. said:

A few days ago I had my first serious spam attempt in almost 2 years. The spammer accidentally triggered a bug in my code that caused my website to send me a complete vardump at the time of the error. Had it not been for the error, the spammer would have actually gotten through.

I've posted a partial of the vardump on my website ($_POST only) if you're curious. In the meantime, I've added a quick and dirty scoring system to continue keeping the mess out.

I have to say one thing, though: you have to be REALLY stupid to blindly attack a home grown system.

on January 13, 2005 10:24 AM
# Scott Johnson said:

I have had this page bookmarked for a while now, and I have *finally* gotten around to implementing a similar technique. mod_rewrite is fun. ;-)

on September 18, 2005 06:11 PM
# micha said:

Hey,

for my part, i had a lot of problems with the 85.255.112.0, 85.255.113.0 and 85.255.114.0 networks, so i do this:

[code]
RewriteCond %{REMOTE_ADDR} ^85\.255\.11[2-4]\.[0-9]+$
RewriteCond %{REQUEST_METHOD} ^POST$
RewriteRule ^/(.*)$ http://inhoster.com/abuse.php?name=test&email=abuse@inhoster.com&subject=spam&message=a_lot_of_spam_from_%{REMOTE_ADDR} [L,R]
[/code]

not very nice, but all my abuse mails have been ignored :|

if you don't want the bots taking your resources, redirect them to their own site. bots always follow redirects, because some blogs use redirects before trackbacks and comments are posted :)

regards from germany :)

on February 24, 2006 10:02 PM
# George said:

This is a cool tutorial, thanks for the heads up. Really appreciate.

on December 30, 2007 07:18 AM