I've been deleting more and more spam comments from my blog recently. I'm this close to hacking MT to call SpamAssassin before allowing a comment to actually post.
Am I the first? Has someone else already done the work? Google hasn't located anything relevant for me. And I figure it'll be a 5-10 minute hack once I get back into the MT codebase again.
Posted by jzawodn at July 10, 2003 02:55 PM
Looking to lose some weight before summer hits?
Hehe..
Anyways, I did a little reading and here are a couple of interesting links. Mostly they discuss the blog spam subject with no clear way to resolve the problem.
http://dotnetguy.techieswithcats.com/archives/003066.shtml
http://www.moik78.com/2003_04_01_moik78_archive.html#200095459
I guess it depends on how they are hitting your blog. If it is an automated robot, this might be of some use...
http://macrofun.pvpers.com/archives/000069.html
http://littlegreenfootballs.com/weblog/?entry=5803#c0034
You might be onto something :)
An interesting note: while searching, I came across your name in this blog: http://www.google.ca/search?q=cache:dYHF762nv-oJ:www.notestips.com/80256B3A007F2692/1/NAMO-5KC2YN+blog+form+spam&hl=en&ie=UTF-8 wtf!
SA's not going to work great for blog spam -- it depends pretty heavily on having a complete set of RFC2822 headers (i.e., on the spam being an email) in order to do much for catching spam. If the problem is with your mail->blog gateway, then it might do a pretty good job, but if it's posts directly to your blog, it will be a lot less useful (if useful at all).
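To make the point above concrete, here is a minimal sketch of what wrapping a blog comment in an RFC 2822 message might look like, using Python's standard email package. The function name, the field names, the blog-comments@example.com address, and the X-Comment-URL header are all made up for illustration; nothing here comes from MT or SpamAssassin. Note that every header is synthesized by the blog itself, which is exactly why header-based mail rules have little to work with.

```python
from email.message import EmailMessage
from email.utils import formataddr, formatdate, make_msgid


def comment_to_message(author, email_addr, url, body, remote_ip):
    """Wrap a blog comment in a synthetic RFC 2822 message.

    All header choices below are assumptions made for this sketch.
    """
    msg = EmailMessage()
    msg["From"] = formataddr((author, email_addr))
    msg["To"] = "blog-comments@example.com"      # hypothetical address
    msg["Subject"] = "Blog comment"
    msg["Date"] = formatdate(localtime=False)
    msg["Message-ID"] = make_msgid(domain="example.com")
    msg["X-Comment-URL"] = url                   # hypothetical custom header
    msg["X-Originating-IP"] = remote_ip
    msg.set_content(body)
    return msg


if __name__ == "__main__":
    m = comment_to_message(
        "Zip Codes", "zip@example.com",
        "http://get-zip-codes-for-free.com/",
        "Nice article, keep up the good work", "192.0.2.1",
    )
    # The text form is what a mail filter reading stdin would be given.
    print(m.as_string())
```

Only the body and the comment URL carry any real signal here, so a filter fed this message is effectively judging the comment text alone.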
I'd be real curious to see what happened if you ran all the good comments through a Bayesian filter and then saw what it had to say about the bogus ones.
Time for a MT plugin. :-)
Are you actually getting spam comments that SA could do anything with? I get two or three a week, but they're all just 5-10 words, usually quoted from somewhere up the page or "Nice article, keep up the good work"-ish, with a URL that doesn't seem likely to be a weblog and keywords for the name. Since there are plenty of people who comment with non-name-names, I can't see SA or anything Bayesian doing any good ("Zip Codes" is the only "name" I've ever gotten more than one spam from). The only idea I've heard so far that seemed likely to work is a shared blacklist, so that at least once I ID "get-zip-codes-for-free.com" as a spammer, you can just automatically block any comments with that URL. Just have to figure out how to get around the fact that people will constantly report scripting.com or whoever else has enemies, and we're set.
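The shared-blacklist idea above could be sketched roughly as follows. This is an illustration under stated assumptions, not an existing service: the report format (domain mapped to the set of people who reported it) and the min_reporters threshold are invented here as one possible way to blunt malicious reports against sites like scripting.com, and the function names are hypothetical.

```python
import re
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def domain_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    if "://" not in url:
        url = "http://" + url          # tolerate bare domains
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def build_blacklist(reports, min_reporters=3):
    """Keep only domains reported by at least min_reporters distinct people.

    reports maps domain -> set of reporter ids (an assumed format).
    """
    return {d for d, who in reports.items() if len(who) >= min_reporters}


def is_blacklisted(comment_url, comment_body, blacklist):
    """True if the commenter's URL or any URL in the body is blacklisted."""
    urls = [comment_url] + URL_RE.findall(comment_body)
    return any(domain_of(u) in blacklist for u in urls if u)


if __name__ == "__main__":
    reports = {
        "get-zip-codes-for-free.com": {"alice", "bob", "carol"},
        "scripting.com": {"someone-with-a-grudge"},
    }
    blacklist = build_blacklist(reports)
    print(is_blacklisted("http://www.get-zip-codes-for-free.com/", "Nice article", blacklist))
    print(is_blacklisted("http://scripting.com/", "Good point", blacklist))
```

A simple reporter count is easy to game with fake accounts, so a real shared list would need some notion of trusted reporters; that part is left open here, as it is in the comment.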
LJ has comment screening, and security controls... :)
I don't suffer from this so I'm not sure how it works. Are they posted by "real people" or bots? If bots, could you not use similar methods to those used when creating new Yahoo! mail accounts. The kind of thing where there's an image with a number in it and the person commenting has to type this number in. A hassle for your genuine readers though...
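The "type the number from the image" check described above might be verified on the server along these lines. This sketch covers only the bookkeeping, not drawing the image, and it is not how Yahoo! does it: the HMAC-signed token, the SECRET_KEY value, and the function names are assumptions chosen so the blog does not have to store each challenge.

```python
import hashlib
import hmac
import secrets

SECRET_KEY = b"change-me"  # hypothetical; keep a real key private


def _sign(answer, nonce):
    """Tie an answer to a nonce without revealing the answer in the form."""
    data = f"{nonce}:{answer}".encode()
    return hmac.new(SECRET_KEY, data, hashlib.sha256).hexdigest()


def new_challenge():
    """Return (answer, nonce, token).

    The answer would be rendered into an image; the nonce and token go
    into hidden form fields next to the comment box.
    """
    answer = f"{secrets.randbelow(10**5):05d}"
    nonce = secrets.token_hex(8)
    return answer, nonce, _sign(answer, nonce)


def check_answer(typed, nonce, token):
    """True if the number the commenter typed matches the signed token."""
    return hmac.compare_digest(_sign(typed.strip(), nonce), token)


if __name__ == "__main__":
    answer, nonce, token = new_challenge()
    print(check_answer(answer, nonce, token))
```

As written, a solved nonce and token pair could be replayed, so a real deployment would also need to expire or remember used nonces.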
If SA only works well with full e-mail + headers, perhaps a Bayesian filter would be a good solution. The filter I use for my mail is Bogofilter. It would be incredibly easy to integrate into something like MT; however, finding enough blog-spam and blog-ham for it to learn from may be the tricky part.
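To give a feel for what a Bayesian filter does with comment text, here is a toy naive Bayes classifier with add-one smoothing. It is not Bogofilter's actual algorithm, and the class name, tokenizer, and training sentences are invented for illustration. It also shows the tricky part mentioned above: the filter is only as good as the blog-spam and blog-ham it has been trained on.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


class CommentFilter:
    """Toy two-class naive Bayes over word counts (a sketch only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def spam_probability(self, text):
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"])) or 1
        total_spam = sum(self.counts["spam"].values())
        total_ham = sum(self.counts["ham"].values())
        # Start from the smoothed prior odds, then add per-word evidence.
        log_odds = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for tok in tokenize(text):
            p_spam = (self.counts["spam"][tok] + 1) / (total_spam + vocab)
            p_ham = (self.counts["ham"][tok] + 1) / (total_ham + vocab)
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    f = CommentFilter()
    f.train("buy cheap pills now http", "spam")
    f.train("buy cheap zip codes free", "spam")
    f.train("I use bogofilter for my mail", "ham")
    f.train("SpamAssassin needs full headers", "ham")
    print(f.spam_probability("buy cheap pills"))
    print(f.spam_probability("bogofilter for mail headers"))
```

With only a handful of training comments the scores are not meaningful; the short "Nice article" style spam described earlier in the thread shares most of its words with genuine comments, which is the hard case for this approach.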
Guess one of the advantages of not being an A- or B-list blogger is that you don't get hit with these :) I think I've had one so far! Well, other than some very irrelevant stuff posted on a couple of high-Google-ranking entries (cats in heat), but only one real spam.
i think that junkeater, normally focussing on guestbook spam, could be an interesting spam blocker. perhaps someone could give it a try for blog comments. looks like a spam assassin for guestbooks, forums and so on.
I have the very occasional spam posted to a perl-based message board I run. I noticed Pegasus mail has an inbuilt spam trap which just uses a bunch of rules to tag spam. Must investigate more, but I should think it'd be fairly easy to create a perl script to filter using those rules...
Of course, with a message board, the spammer has instant results as to whether their spam has gone through or not, so I may be stuck.
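A rule-based tagger of the kind described above could look something like this. The rules, weights, and threshold are made up for illustration and are not Pegasus Mail's rules; they simply encode a few traits of comment spam mentioned elsewhere in this thread (generic praise, very short text, lots of links).

```python
import re

# Each rule is (name, points, predicate). All values are assumptions.
RULES = [
    ("generic_praise", 2.0,
     lambda c: re.search(r"\b(nice|great)\s+(site|article)\b", c["body"], re.I) is not None),
    ("short_body", 1.0,
     lambda c: len(c["body"].split()) <= 10),
    ("many_links", 2.0,
     lambda c: len(re.findall(r"https?://", c["body"], re.I)) >= 3),
]


def score_comment(comment, rules=RULES, threshold=3.0):
    """Return (total_score, names_of_rules_hit, flagged)."""
    hits = [(name, points) for name, points, test in rules if test(comment)]
    total = sum(points for _, points in hits)
    return total, [name for name, _ in hits], total >= threshold


if __name__ == "__main__":
    print(score_comment({"body": "Nice article, keep up the good work"}))
    print(score_comment({"body": "SA depends on having a complete set of "
                                 "headers, so it will be less useful for "
                                 "posts made directly to the blog."}))
```

Holding flagged posts for review instead of rejecting them outright would avoid giving the spammer the instant feedback mentioned above, at the cost of a delay for genuine posters.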
Stop that spam now!
Brand new innovative anti-spam system allows you to moderate your comments before they appear on your blog: helps attract women by keeping the trolls away!
Available now from http://www.scriptygoddess.com/archives/003944.php .
To unsubscribe, please send an email to active_address_found@example.com
;)
Nice Site, I would like to show you mine sometime.
Thanks for the information. Your site has very much helped me. A lot of the interesting information.