For When You Can't Have The Real Thing

2004-06-20

Created by dave. Last edited by dave, 19 years and 254 days ago.

Rogers Email Spam Hell

At some point in recent history, one of my email addresses got flooded with spam, to the point that fetchmail/SpamAssassin/FileScan couldn't get through it fast enough before some resource got choked out.

I discovered this while messing around with the Tera-Byte hosting traffic statistics. They seem to claim an astronomical amount of traffic (as in, 15-20 Gb per month for some months) which doesn't mesh with what my spot-checking indicates. But I'll discuss that another time -- this is about email.

Fetchmail, being charmingly fastidious, apparently does not issue DELE commands for individual emails until all the pending emails have been successfully downloaded. So if you have 20 emails, and delivery gets jammed on message number 18, fetchmail aborts, leaving all 20 emails in the inbox marked as unread. By the next time you go back to your inbox, new mail has arrived, so you now have 22 messages waiting instead of 20. Since you again barf on number 18, nothing gets cleared out… Lather, rinse, repeat.

What I had to do was run fetchmail by hand, figure out which messages it was jamming on, and then telnet to the POP server and issue the DELE commands myself to get rid of the messages that had already been downloaded.
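The same cleanup can be scripted instead of typed into telnet. The following is a minimal sketch using Python's standard poplib module, not the procedure actually used here. The host name, account name, password and the range of message numbers are placeholders, and the helper names connect and delete_messages are made up for illustration. Note that POP3 only removes messages marked with DELE once the session ends cleanly with QUIT.

```python
import poplib


def connect(host, user, password):
    """Open a plain POP3 session and log in (host/user/password are placeholders)."""
    conn = poplib.POP3(host)
    conn.user(user)
    conn.pass_(password)
    return conn


def delete_messages(conn, message_numbers):
    """Mark each message number for deletion, then commit by ending the session."""
    deleted = []
    for number in message_numbers:
        conn.dele(number)  # sends "DELE <number>" to the server
        deleted.append(number)
    # The server only removes the marked messages when it receives QUIT.
    conn.quit()
    return deleted


# Hypothetical usage: drop messages 1 through 17, which were already downloaded.
# delete_messages(connect("pop.example.com", "someuser", "secret"), range(1, 18))
```

Message numbers are only valid for the current session, so it is worth checking them with conn.list() before deleting anything.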

As a side effect of all this, I discovered that while fetchmail and sendmail can process email messages in parallel, the SpamAssassin setup I have here does not. It has a lock file system to prevent a massive number of hungry Perl processes from starting up and choking the computer. And at roughly 6 seconds of wall time required to process each message, this can become a bottleneck -- especially when there are 400 messages in the pipeline waiting to come down.
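To put rough numbers on that, here is a back-of-the-envelope sketch using the 6 seconds and 400 messages mentioned above. The workers parameter is a hypothetical knob added only to show what some parallelism would buy; the lock file setup described here corresponds to workers=1.

```python
def drain_time_seconds(pending, seconds_per_message, workers=1):
    """Estimate the wall time to clear a backlog when at most `workers` scans run at once."""
    batches = -(-pending // workers)  # ceiling division
    return batches * seconds_per_message


serial = drain_time_seconds(400, 6)        # one scan at a time, as with the lock file
parallel = drain_time_seconds(400, 6, 4)   # hypothetical: four scans at once

print(serial / 60, "minutes serial")       # 2400 s, i.e. 40 minutes
print(parallel / 60, "minutes with 4 workers")
```

So a 400-message backlog ties up the scanner for about 40 minutes if nothing else goes wrong, which is plenty of time for a fetch run to get jammed.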

So I've temporarily moved to the spamd/spamc setup, and SA processing time has dropped to less than a second per message. FileScan virus scanning, on the other hand, is still about 5-10 seconds per message.
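The speedup comes from spamd staying resident with Perl and the rules already loaded, while spamc is only a small client that hands each message to the daemon. As an illustration of how a script might talk to that setup, here is a sketch that pipes a raw message through spamc in check-only mode (the -c flag, which prints score/threshold and exits with status 1 for spam). The function name check_with_spamc and the spamc_cmd parameter are made up for this example, and it assumes spamd is already running with default settings.

```python
import subprocess


def check_with_spamc(raw_message, spamc_cmd=("spamc", "-c")):
    """Pipe one raw message to spamc and return (is_spam, "score/threshold")."""
    proc = subprocess.run(
        list(spamc_cmd),
        input=raw_message,       # the full message, headers and body, as bytes
        stdout=subprocess.PIPE,
    )
    # With -c, spamc exits with status 1 when the message is judged to be spam.
    is_spam = proc.returncode == 1
    return is_spam, proc.stdout.decode().strip()


# Hypothetical usage:
# with open("message.eml", "rb") as f:
#     print(check_with_spamc(f.read()))
```

In normal delivery spamc is run without -c, so that it returns the rewritten message rather than just a verdict.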

I don't know what caused the bottleneck. It appears to me that there was less than three days' worth of email waiting (i.e. the first message in the inbox was less than three days old), which is a bit disturbing because it means I am getting more than 100 spam messages a day.

This is a collection of technical information, much of it learned the hard way. Consider it a lab book or a /info directory. I doubt much of it will be of use to anyone else.

Copyright 2000-2002 Matthias L. Jugel and Stephan J. Schmidt