Photograph by Leo Townsend - copyright Lomas Davies 2008
Building Design Online Articles - Hugh Davies

20 November 2009
What are the odds of defeating spam?

Thomas Bayes was a Presbyterian minister and mathematician who spent much of his life working on the theory of probability. It was only after his death in 1761 that Bayes’ theorem was published. It is this theorem that lies behind the Bayesian filters that underpin today’s anti-spam software.

Spam email is a significant problem. In many architects’ practices it can account for over 50% of incoming email. Anti-spam software can use several methods to separate spam from legitimate email, including whitelists, blacklists, keyword filtering and Bayesian filtering.

Whitelists are lists of email senders who are considered to be legitimate contacts (eg people in your address book). Blacklists are the exact opposite, being custom or commercial lists of known spam originators. Keyword filtering rejects emails on the basis of particular words (eg Viagra) appearing within them.
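The three list-based checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the addresses, the keyword list and the order in which the checks run are all hypothetical, and real anti-spam products are considerably more elaborate.

```python
import re

# Hypothetical example lists; none of these entries come from a real product.
WHITELIST = {"client@example.com"}      # trusted senders, eg your address book
BLACKLIST = {"offers@spam.example"}     # known spam originators
KEYWORDS = {"viagra"}                   # words that trigger rejection


def classify(sender: str, body: str) -> str:
    """Return 'accept', 'reject' or 'unknown' using simple list checks."""
    sender = sender.strip().lower()
    # Assumption: the whitelist is checked first, so a trusted sender
    # is never rejected because of a keyword.
    if sender in WHITELIST:
        return "accept"
    if sender in BLACKLIST:
        return "reject"
    # Split the body into lower-case words and look for any listed keyword.
    words = set(re.findall(r"[a-z']+", body.lower()))
    if words & KEYWORDS:
        return "reject"
    return "unknown"
```

An email that matches none of the lists comes back as 'unknown', which is where a subtler method such as Bayesian filtering has to take over.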

Bayesian filtering is particularly clever in that it estimates, for each word, the probability that an email containing it is junk, and then combines these figures to estimate the probability that the whole email is junk. Bayesian filtering can also "learn" from its mistakes, recalculating probabilities on being told it has failed to spot spam or has falsely identified a legitimate email as spam.
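To make the idea concrete, here is a minimal sketch of a Bayesian filter in Python. It is a textbook "naive Bayes" classifier, not the method used by any particular product; the class name BayesFilter, the labels "spam" and "ham" (a common nickname for legitimate email) and the smoothing constants are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


class BayesFilter:
    """Toy naive Bayes spam filter (illustrative only)."""

    def __init__(self):
        # For each label, how many trained emails contained each word.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        # How many emails of each label have been seen.
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Each distinct lower-case word counts once per email.
        return set(re.findall(r"[a-z']+", text.lower()))

    def train(self, text, label):
        """Learn from one email known to be 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(self.tokens(text))

    def word_prob(self, word, label):
        # Share of emails with this label that contain the word.
        # The +1 and +2 (an assumption) stop unseen words scoring zero.
        seen = self.word_counts[label][word]
        return (seen + 1) / (self.message_counts[label] + 2)

    def spam_probability(self, text):
        """Estimate the probability (0 to 1) that an email is spam."""
        total = sum(self.message_counts.values())
        log_spam = math.log((self.message_counts["spam"] + 1) / (total + 2))
        log_ham = math.log((self.message_counts["ham"] + 1) / (total + 2))
        # Combine the per-word evidence; logs avoid tiny products.
        for word in self.tokens(text):
            log_spam += math.log(self.word_prob(word, "spam"))
            log_ham += math.log(self.word_prob(word, "ham"))
        diff = log_ham - log_spam
        if diff > 700:  # guard against overflow in math.exp
            return 0.0
        return 1 / (1 + math.exp(diff))
```

Calling train() again on an email the filter got wrong is the "learning" step: the word counts shift, and so does the score of every later email containing those words.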

The advantage of filtering spam on your own computer is that you can manage your own whitelist simply by keeping your address book up to date, and it is very easy to fish out false positives - legitimate email labelled as spam - by looking in your junk email folder. The disadvantages are that the filtering may be inadequate and that all these junk emails still travel through the whole system before being filtered.

The next level up is to use spam-filtering software on your mail server. Mail server software may have its own built-in mail filtering (eg SpamAssassin in Kerio Mail Server), and there are third-party solutions for Exchange servers (eg GFI MailEssentials). Using server-level filtering means that all interfaces to your mail (workstation, webmail, laptops and mobile phones) are equally protected. It is possible to "train" the Bayesian filtering, optimising it for the whole network. "Training" can also be achieved by individual users moving spam into the junk folder rather than just deleting it.

Coarse filter

Beyond this you can try to catch spam before it even reaches your mail server, through hosted spam-filtering services such as those provided by MessageLabs. But as false positives are not as readily available to end-users, a hosted service is perhaps best used as a coarse filter, augmented by finer server-level filtering.
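The coarse-then-fine arrangement might look something like the following sketch. The two thresholds and the function name route are hypothetical, chosen purely for illustration; the spam score is assumed to be a probability between 0 and 1, such as a Bayesian filter might produce.

```python
# Hypothetical thresholds, not taken from any real service.
HOSTED_REJECT_AT = 0.99  # coarse: hosted service blocks only near-certain spam
SERVER_JUNK_AT = 0.80    # fine: mail server moves likely spam to the junk folder


def route(spam_score: float) -> str:
    """Decide where an email ends up, given its spam score (0 to 1)."""
    if not 0.0 <= spam_score <= 1.0:
        raise ValueError("spam_score must be between 0 and 1")
    if spam_score >= HOSTED_REJECT_AT:
        # Never reaches the mail server, so users cannot easily retrieve it.
        return "rejected by hosted service"
    if spam_score >= SERVER_JUNK_AT:
        # Still delivered, so a false positive can be fished out.
        return "junk folder"
    return "inbox"
```

Setting the hosted threshold very high keeps the hard-to-recover rejections rare, while anything merely suspicious lands in a junk folder that users can still check.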

While spam is annoying, it is potentially less damaging than false positives. Waiting endlessly for an email from a potential client about the "erection" of a new building because overzealous spam filters have blocked it causes much more frustration than any number of junk emails.

Since no spam filtering is infallible, it is worth remembering a few golden rules:

  • Never open an attachment that has arrived by unsolicited email.
  • Never reply to spam or try to "unsubscribe" - it only encourages them!
  • If it sounds too good to be true then it probably is!



Lomas Davies Limited | Director - Hugh Davies
Registered No 4056784 in England and Wales at 19 Goodge Street London W1P 1FD | VAT Registered No 769 9003 88