For When You Can't Have The Real Thing

2004-09-18 #1

Created by dave. Last edited by dave, 19 years and 171 days ago. Viewed 3,640 times. #1

Spam Statistics:

Results from spamassassin-learn today:
  • 95 definite spam
  • 635 probably spam
  • 111 escaped spam (i.e. not tagged, but manually filtered instead)
  • 395 ham messages
So in other words, of 1236 messages I received in the sample period, only 32% were real messages. And the bulk of those were mailing list messages I glanced at and didn't read in any detail.

It also shows that of 841 unwanted emails I received in the sample period, SpamAssassin caught almost 87% of them. This is an important number to think about when being unhappy about the amount of spam filtering through into the mailbox.
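The two percentages above can be re-derived from the four sample counts. This is only a sketch of the arithmetic; the variable names are mine and are not anything SpamAssassin reports.

```python
# Sample-period counts copied from the list above.
definite_spam = 95
probable_spam = 635
escaped_spam = 111   # not tagged, filtered by hand instead
ham = 395

total = definite_spam + probable_spam + escaped_spam + ham
unwanted = definite_spam + probable_spam + escaped_spam
caught = definite_spam + probable_spam   # what SpamAssassin tagged

ham_share = ham / total          # share of real messages
catch_rate = caught / unwanted   # share of junk that was tagged

print(f"total={total}, unwanted={unwanted}")
print(f"ham share:  {ham_share:.1%}")
print(f"catch rate: {catch_rate:.1%}")
```

The catch rate divides by unwanted mail only, not by all mail, which is why the escaped spam counts against it and the ham does not.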

There are a couple of problems with these numbers. They don't include email viruses. They don't take into account the email which is still sitting in my inbox awaiting classification (although it is all ham, since as I write this it has just been manually filtered). We can look at the lifetime statistics to get a feel for some of those numbers. Since I started with SpamAssassin, I have these lifetime stats:

  • Total Messages: 23393
  • Probable Spam: 14567
  • Definite Spam: 1991
  • Viruses: 664
  • Delivered Normally: 6159
So that's 23393 messages. Only a 2.8% virus rate (664 of 23393), much lower than I thought, although this does not reflect new viruses that the scanner didn't recognize initially. We can see from above that the 'escaped spam' rate is approximately (111/(111+395))=22%. Therefore, we can assume that about 22% of the Delivered Normally amount is also escaped spam, meaning only about 4804 are real messages, or 21% of the total.
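As a rough check of the lifetime estimate, here is the same extrapolation as a sketch. It assumes the reported Total Messages figure (23393) is the right denominator and uses the rounded 22% escaped-spam rate from the sample period; the variable names are mine.

```python
# Lifetime counts copied from the list above.
reported_total = 23393
probable_spam = 14567
definite_spam = 1991
viruses = 664
delivered = 6159

# The four categories add up to 23381, slightly under the reported total.
category_sum = probable_spam + definite_spam + viruses + delivered

# Escaped-spam rate from the sample period, rounded to 22% as in the text.
escaped_rate = round(111 / (111 + 395), 2)

# Assumption: the same share of "Delivered Normally" mail is escaped spam.
real_messages = round(delivered * (1 - escaped_rate))

virus_rate = viruses / reported_total
real_share = real_messages / reported_total

print(f"category sum: {category_sum}")
print(f"virus rate:   {virus_rate:.1%}")
print(f"real mail:    {real_messages} ({real_share:.1%} of total)")
```

Using the unrounded escaped-spam rate instead of 22% moves the real-message estimate by only a few messages, so the conclusion does not depend on the rounding.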

That means, overall, 4 out of 5 emails I have received since 22 April 2004 have been junk.

That's mind-boggling.

This is a collection of technical information, much of it learned the hard way. Consider it a lab book or a /info directory. I doubt much of it will be of use to anyone else.

Copyright 2000-2002 Matthias L. Jugel and Stephan J. Schmidt