Spam Reduction Notes

Jim Tittsler

These are notes from a talk given at the Tokyo Linux Users Group technical meeting on January 18th, 2003. It was intended as a survey of the current techniques that you might like to try, with an emphasis on what works for me.


Chapman: Have you got anything without spam?
Jones: Well, there's spam egg sausage and spam, that's not got much spam in it.
Chapman: I don't want ANY spam!
Monty Python's Flying Circus, Episode 25, June 25, 1970

What is SPAM?

SPAM can

SPAM is a registered trademark of Hormel Foods, LLC, for luncheon meat.

What is spam?

Start again...
typical definition
Unsolicited Commercial Email (UCE)
I prefer to stress the untargeted, bulk nature of the mail
unsolicited, automated, Email

Escalating problem

Nov 2002: Brightmail says 36% of traffic is spam
Jan 2003: MessageLabs predicts spam exceeds ham by July
IBM Spamato discovery photo

Delete Key

Angela Pang demonstrating her method for eliminating spam by hitting the delete key.

Multiple Email Addresses

ad hoc content filtering



Exim ACLs

  # Accept mail to postmaster or abuse in any local domain,
  # regardless of source.
  accept  local_parts   = postmaster:abuse
          domains       = +local_domains

  # Reject if the sender is a known spammer.
  deny        senders = @@cdb;/etc/exim/spam-domains.cdb
              message = message from spammer rejected

  # Deny unless the sender address can be verified.
  require verify            = sender

Public (realtime) Blacklists

  # DNS blacklists.
  # There are a variety of realtime blacklists that attempt to
  # identify spam sources, open relays, and even dialup address
  # blocks.  You really need to check the policy published by
  # each list before deciding to use it.
  deny dnslists = : \  
         : \ 
       message  = rejected because $sender_host_address is in the blacklist at $dnslist_domain\n\ 
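
For anyone unfamiliar with what a DNS blacklist check does behind the scenes, here is a minimal Python sketch. The sending host's IPv4 octets are reversed, the list's zone is appended, and the resulting name is looked up; any answer means the address is listed. The zone `dnsbl.example.org` and both function names are placeholders invented for illustration, not a real list and not part of Exim.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets plus the list's zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers for this address (i.e. it is listed)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        # No such name, or the lookup failed: treat as "not listed".
        return False


# Hypothetical zone name, for illustration only.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Treating a failed lookup as "not listed" is a deliberate choice in this sketch: a blacklist that is down should not cause mail to be rejected.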


Tagged Message Delivery Agent


Make it computationally expensive for spammers (or at least people not on your whitelist) to send you mail.
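
One way to impose that cost is a hashcash-style proof of work: the sender must find a stamp whose hash starts with a given number of zero bits, which takes many attempts to produce but a single hash to check. The notes do not say which scheme was shown, so treat the following as an illustrative sketch only. The `resource:counter` stamp format, the function names, and the 12-bit difficulty are assumptions, and this is not the real hashcash stamp format.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a hash digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mint(resource: str, bits: int) -> str:
    """Search for a stamp whose SHA-1 digest has at least `bits` leading zero bits.

    Expected work is about 2**bits hash computations.
    """
    for counter in count():
        stamp = f"{resource}:{counter}"  # simplified, hypothetical stamp format
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp


def verify(stamp: str, resource: str, bits: int) -> bool:
    """Check a stamp with a single hash: cheap for the recipient."""
    if not stamp.startswith(resource + ":"):
        return False  # stamp was minted for a different recipient
    return leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits


stamp = mint("jim@example.org", 12)  # roughly 4096 attempts on average
print(stamp, verify(stamp, "jim@example.org", 12))
```

Binding the stamp to the recipient address means a spammer cannot reuse one stamp for a whole mailing list; the cost is paid per recipient.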

Vipul's Razor (AKA Spamnet)

automated filtering

A Plan for Spam

scoring results for Paul Graham method
Graphs from SpamBayes background information
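
As a reminder of what is being scored: in "A Plan for Spam" each token gets a spam probability, the most "interesting" tokens (those furthest from 0.5) are selected, and their probabilities are combined with a naive Bayes product. A minimal sketch follows; the function name is mine, and the defaults (15 tokens, clamping to 0.01 and 0.99) follow the essay as I recall it.

```python
from math import prod


def graham_combine(token_probs, n_interesting=15, lo=0.01, hi=0.99):
    """Combine per-token spam probabilities in the style of 'A Plan for Spam'."""
    # Clamp so that no single token can force the result to exactly 0 or 1.
    clamped = [min(hi, max(lo, p)) for p in token_probs]
    # Keep only the tokens whose probability is furthest from neutral (0.5).
    interesting = sorted(clamped, key=lambda p: abs(p - 0.5), reverse=True)
    interesting = interesting[:n_interesting]
    spam = prod(interesting)
    ham = prod(1.0 - p for p in interesting)
    return spam / (spam + ham)


print(graham_combine([0.99, 0.99, 0.2]))  # strongly spammy
print(graham_combine([0.01, 0.01, 0.8]))  # strongly hammy
```

Because the score is a ratio of products, a handful of strong tokens pushes it very close to 0 or 1, leaving little room for a "not sure" answer.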

Robinson Combining

scoring results for method suggested by Gary Robinson
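
Robinson's suggestion replaces the raw product with geometric means, so the score moves away from 0.5 more gradually. A minimal sketch of the combining step as I understand it (the function name is mine, and the token probabilities are assumed to be already computed and strictly between 0 and 1):

```python
from math import prod


def robinson_combine(token_probs):
    """Combine token probabilities using Gary Robinson's P/Q geometric means."""
    n = len(token_probs)
    if n == 0:
        raise ValueError("need at least one token probability")
    # P measures "spamminess", Q measures "hamminess".
    P = 1.0 - prod(1.0 - p for p in token_probs) ** (1.0 / n)
    Q = 1.0 - prod(token_probs) ** (1.0 / n)
    S = (P - Q) / (P + Q)  # in the range -1 .. 1
    return (1.0 + S) / 2.0  # rescaled to 0 .. 1


print(robinson_combine([0.9, 0.8, 0.7]))
print(robinson_combine([0.5, 0.5, 0.5]))
```

Taking the n-th root keeps one or two extreme tokens from dominating, which spreads scores out across the middle of the range instead of piling them up at the ends.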

SpamBayes Chi-Squared

produce two numbers
typical results of current SpamBayes classifier
Chi-squared distribution explanation
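
The two numbers are a spam indicator S and a ham indicator H, each obtained by feeding the token probabilities through a chi-squared test (Fisher's method for combining probabilities). The sketch below is written from memory of the SpamBayes approach rather than copied from its source, so treat the names and details as assumptions; `chi2_q` only handles an even number of degrees of freedom, which is all this method needs.

```python
from math import exp, log


def chi2_q(x2, v):
    """Probability that a chi-squared variable with v degrees of freedom exceeds x2.

    Only valid for even v.
    """
    if v % 2 != 0:
        raise ValueError("v must be even")
    m = x2 / 2.0
    term = total = exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)  # guard against rounding slightly above 1


def chi_combine(token_probs):
    """Return (S, H, score); token_probs must be strictly between 0 and 1."""
    n = len(token_probs)
    S = 1.0 - chi2_q(-2.0 * sum(log(1.0 - p) for p in token_probs), 2 * n)
    H = 1.0 - chi2_q(-2.0 * sum(log(p) for p in token_probs), 2 * n)
    return S, H, (S - H + 1.0) / 2.0


print(chi_combine([0.99] * 10))               # clear spam
print(chi_combine([0.99] * 5 + [0.01] * 5))   # strong evidence both ways
```

When a message carries strong evidence in both directions, S and H are both close to 1 and the final score lands near 0.5, which gives a natural "unsure" category instead of a confident wrong answer.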

Using SpamBayes

Mailing List Applications?

Will filters end spam?

Tools in other domains

look for outlying events


Spam Conference

Watch the video archives of the experts speaking on the subject at the Spam Conference held at MIT on January 17th, 2003.