Bill Pechey

Free filters keep spam in check

A growing range of open-source filtering software is available to help keep inboxes free of junk

Written by Bill Pechey

There have been a lot of unsolicited emails over the years, but I don't need to tell anybody that the number has gone through the roof over the past 12 months or so. Like so many other people, I find over half of my email is now unwanted and something has to be done. Fortunately, there is light at the end of the tunnel.

I have been playing with some interesting software called PopFile for about the last six months. It installs on my computer as an email proxy, intercepts all my incoming mail sessions and processes the messages, marking those it considers to be spam. My email client puts the suspects in a separate folder so I can check how well it's all working.

In fact, the software works much better than I imagined, sorting my emails with an accuracy of about 99.6 percent, although this does mean that about one in 250 emails is put in the wrong category. PopFile works by learning the characteristics of spam and wanted email by looking at each word and keeping a record of how often that word appears in each category. It then calculates a probability that a particular message is spam by combining the separate probabilities of the words, using Bayes' Rule, a method that now seems to be known as Bayesian filtering.
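For readers who like to see the idea in code, here is a minimal Python sketch of word-count Bayesian filtering. It is not PopFile's actual code: the class name, the tokeniser and the add-one smoothing are all assumptions made for illustration, and a real filter would also look at headers, attachments and much more.

```python
import math
import re
from collections import Counter


class WordCountClassifier:
    """Toy Bayesian filter (hypothetical, not PopFile's implementation)."""

    def __init__(self, categories=("spam", "wanted")):
        # How often each word has been seen in each category.
        self.word_counts = {c: Counter() for c in categories}
        # How many training messages each category has received.
        self.message_counts = {c: 0 for c in categories}

    @staticmethod
    def tokenize(text):
        # Assumption: lower-cased runs of letters, digits and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, category):
        """Record the words of one message under the given category."""
        self.word_counts[category].update(self.tokenize(text))
        self.message_counts[category] += 1

    def probabilities(self, text):
        """Return {category: probability} for a message via Bayes' Rule."""
        vocabulary = set()
        for counts in self.word_counts.values():
            vocabulary.update(counts)
        total_messages = sum(self.message_counts.values())
        n_categories = len(self.word_counts)

        log_scores = {}
        for category, counts in self.word_counts.items():
            # Prior: share of training messages in this category (smoothed).
            prior = (self.message_counts[category] + 1) / (total_messages + n_categories)
            score = math.log(prior)
            total_words = sum(counts.values())
            for word in self.tokenize(text):
                # Add-one smoothing; the extra +1 reserves a slot for unseen words.
                likelihood = (counts[word] + 1) / (total_words + len(vocabulary) + 1)
                score += math.log(likelihood)
            log_scores[category] = score

        # Convert log scores back into probabilities that sum to 1.
        top = max(log_scores.values())
        weights = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(weights.values())
        return {c: w / norm for c, w in weights.items()}

    def classify(self, text):
        probs = self.probabilities(text)
        return max(probs, key=probs.get)


# Made-up training messages, purely for illustration.
classifier = WordCountClassifier()
classifier.train("cheap pills buy now", "spam")
classifier.train("win money now", "spam")
classifier.train("meeting agenda for monday", "wanted")
classifier.train("project report attached", "wanted")

print(classifier.classify("buy cheap pills"))
print(classifier.classify("monday project meeting"))
```

The sums are done with logarithms because multiplying many small word probabilities together would otherwise underflow on long messages.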

For the first few weeks, you have to pick out those messages that are wrongly categorised but, after that, the system makes so few errors that many people just don't bother to deal with them.

The beauty of all this is that the system gets better with time and adapts very quickly to new types of spam. I find that it is very good at picking out emails with viruses even if they appear to be from people I know.

Of course, you might actually want some types of spam. The system handles this by adapting to each user's preferences.

As I've said, a wanted message is occasionally wrongly categorised as spam. That message might be very important, so you do have to check through the spam before you delete it. This doesn't take long if everything is in the same place. Some of the programs can be adjusted so that they are much more likely to let some spam through than to file a wanted message wrongly.
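One common way to get this kind of bias is to file a message as spam only when its spam probability clears a high threshold. The Python sketch below shows the trade-off. The function names, the threshold values and the scored messages are all invented for illustration and do not correspond to the settings of any particular program.

```python
def file_message(spam_probability, spam_threshold=0.5):
    """Pick a folder for a message from its estimated spam probability."""
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0 and 1")
    return "spam" if spam_probability >= spam_threshold else "inbox"


def count_errors(scored_messages, spam_threshold):
    """Return (missed_spam, lost_wanted) for (probability, is_spam) pairs."""
    missed_spam = 0
    lost_wanted = 0
    for probability, is_spam in scored_messages:
        folder = file_message(probability, spam_threshold)
        if is_spam and folder == "inbox":
            missed_spam += 1   # spam that slipped into the inbox
        elif not is_spam and folder == "spam":
            lost_wanted += 1   # wanted mail wrongly filed as spam
    return missed_spam, lost_wanted


# Hypothetical scores: (estimated spam probability, really spam?)
scored = [
    (0.99, True), (0.80, True), (0.60, True),
    (0.70, False), (0.10, False), (0.02, False),
]

print(count_errors(scored, spam_threshold=0.5))  # (0, 1)
print(count_errors(scored, spam_threshold=0.9))  # (2, 0)
```

Raising the threshold from 0.5 to 0.9 lets two spam messages through in this made-up sample, but no wanted message ends up in the spam folder.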

Having been thoroughly fascinated by this technology, I decided to look into it further and found that this type of email classifier has been around for about five years. The original work was done separately by IBM and Microsoft but others have refined the techniques since and the performance has improved steadily. There are now quite a lot of filtering programs that work this way, and many, like PopFile, are open source.

Microsoft is now so confident about the technique that it will include it in the next version of Outlook, its email client.

Is this the beginning of the end for spammers? Well, not yet. The only way to defeat spammers is to make spamming uneconomic, which means reducing response rates to extremely low levels. Spammers will fight back and try to devise ways of getting their messages through, probably by making spam look very similar to normal email, but that would cause it to lose much of its appeal. There is hope.

Have your say: reply to IT Week
