SpamAssassin with IMAP

This whole project started when I wanted to send a copy of all of my e-mail to my cell phone as a text message (Verizon phones all have e-mail addresses, @vtext.com). The problem is that each message costs 2 cents to receive, and I didn't want to pay for spamming my own cell phone. My solution was to set up SpamAssassin on my mail server to filter the spam out first using Bayesian filtering. I also didn't want to go through the pain of forwarding every false positive and false negative back through SpamAssassin to help it learn, and I didn't want the Subject line modified with the standard SPAM header. Since I was using an IMAP server, I figured there must be an easier way. As it turns out, SpamAssassin is relatively easy to configure to do this. This tutorial explains how to set up SpamAssassin in single-user mode so that it learns from IMAP folders.

1. Install SpamAssassin as usual if it isn't installed already.

2. Run this script on your existing e-mail, then add it to your crontab. Each time it runs, SpamAssassin learns from the messages in those folders. The first run will probably take a long time.

    #!/bin/sh
    # sa-learn.sh
    # use the mbox flag only if your folders are in mbox format
    # (SpamAssassin 3.x names these flags --no-sync and --sync;
    # --no-rebuild and --rebuild are the older spellings)
    sa-learn --no-rebuild --mbox --ham ~/mail/Inbox
    sa-learn --no-rebuild --mbox --spam ~/mail/Spam
    sa-learn --rebuild

Check your results by typing 'sa-learn --dump magic' at a prompt. If you don't have at least 200 ham and 200 spam messages, lower the minimums in your user_prefs file (step 4) so that Bayesian filtering starts sooner.

3. Edit your .procmailrc file. This is what passes the e-mail messages to SpamAssassin and moves them into the Spam folder.
It should look something like this:

    #~/.procmailrc
    # SpamAssassin sample procmailrc
    #
    # Pipe the mail through spamassassin (replace 'spamassassin' with 'spamc'
    # if you use the spamc/spamd combination)
    #
    # The condition line ensures that only messages smaller than 250 kB
    # (250 * 1024 = 256000 bytes) are processed by SpamAssassin. Most spam
    # isn't bigger than a few k and working with big messages can bring
    # SpamAssassin to its knees.
    #
    # The lock file ensures that only 1 spamassassin invocation happens
    # at 1 time, to keep the load down.
    #
    :0fw: spamassassin.lock
    * < 256000
    | spamassassin

    # All mail tagged as spam (eg. with a score higher than the set threshold)
    # is moved to "Spam".
    # Your Spam folder name goes on the last line of this recipe.
    :0:
    * ^X-Spam-Status: Yes
    mail/Spam

    # vtext.com compatible forwarding code
    # get "From" address and store
    :0 h
    FROM=|formail -IReply-To: -rtzxTo:

    :0c:
    # This filters out most system messages
    * !^From: .*\@localhost.*
    * !^From: .*(postmaster|MAILER-DAEMON)\@.*
    # Must specify some addresses to sendmail so it decodes properly for SMS text messages
    # -r is for where any error messages should go
    # For Verizon, they will strip out anything past the first 160 characters
    #| /usr/sbin/sendmail -r myrealemailaddress@mydomain.com -f $FROM number@vtext.com
    # For Cingular/AT&T, they send as many SMS messages as it takes for the entire
    # email to get to your phone, this limits to 1 message (160 characters) per email
    | formail -k -XSubject: -XFrom: -I "From: $FROM" | head -c 162 | /usr/sbin/sendmail -r myrealemailaddress@mydomain.com -f $FROM number@cingularme.com
    # Cingular text format    normal mail format    character difference
    # FRM:<from>              From: <from>          +2
    # SUBJ:<subject>          Subject: <subject>    +4
    # MSG:<body>              <body>                -4
    #                                               +2 total

    # Normal code for forwarding a copy of ham messages, not fully compatible with vtext.com
    #:0c:
    # ! forwardaddress@domain.com

    # Work around procmail bug: any output on stderr will cause the "F" in "From"
    # to be dropped. This will re-add it.
    :0
    * ^^rom[ ]
    {
      LOG="*** Dropped F off From_ header! Fixing up. "

      :0 fhw
      | sed -e '1s/^/F/'
    }

4. Edit your user_prefs file. This is where you customize the SpamAssassin settings.

    #~/.spamassassin/user_prefs
    # SpamAssassin user preferences file. See 'perldoc Mail::SpamAssassin::Conf'
    # for details of what can be tweaked.
    #########################################################

    # How many hits before a mail is considered spam.
    #required_hits 5

    report_safe 0

    # You may need to set these lower than the default 200 early on to get
    # SpamAssassin to start filtering, depending on your initial training from step 2.
    #bayes_min_ham_num 10
    #bayes_min_spam_num 10

    # Whitelist and blacklist addresses are now file-glob-style patterns, so
    # "friend@somewhere.com", "*@isp.com", or "*.domain.net" will all work.
    # whitelist_from someone@somewhere.com

    # Add your own customised scores for some tests below. The default scores are
    # read from the installed spamassassin rules files, but you can override them
    # here. To see the list of tests and their default scores, go to
    # http://spamassassin.org/tests.html .
    #
    # score SYMBOLIC_TEST_NAME n.nn

    # I raised these scores to more effectively filter out spam
    # Add more as you see fit
    score BAYES_50 5.1
    score BAYES_56 5.1
    score BAYES_60 5.1
    score BAYES_70 5.1
    score BAYES_80 5.1
    score BAYES_90 5.1

    # This will be sure to filter all ADV messages
    score ADVERT_CODE 5.1
    score ADVERT_CODE2 5.1

5. Now all you need to do is periodically check your Spam folder for false positives and move them to the Inbox, and move any spam that reached the Inbox to the Spam folder. The next run of the sa-learn.sh script will re-learn the moved messages into the proper group. If you continue to have problems, consult the SpamAssassin documentation.
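
The 200-message check from step 2 can also be scripted instead of read off the dump by eye. What follows is only a sketch, not part of my original setup. It assumes 'sa-learn --dump magic' reports the counts on lines ending in "non-token data: nspam" and "non-token data: nham", with the count in the third column. The function names bayes_counts and bayes_ready are made up for this example. Both functions read the dump on standard input.

```shell
#!/bin/sh
# Hypothetical helpers, not part of SpamAssassin.
# Assumed input line format (from `sa-learn --dump magic`):
#   0.000          0        208          0  non-token data: nspam

# bayes_counts: print the learned spam and ham message counts.
bayes_counts() {
    awk '
        $NF == "nspam" { spam = $3 }
        $NF == "nham"  { ham = $3 }
        END { printf "nspam=%d nham=%d\n", spam, ham }
    '
}

# bayes_ready [MIN]: succeed only if both counts are at least MIN
# (default 200, the SpamAssassin default minimum).
bayes_ready() {
    min=${1:-200}
    awk -v min="$min" '
        $NF == "nspam" { spam = $3 }
        $NF == "nham"  { ham = $3 }
        END { exit !(spam + 0 >= min + 0 && ham + 0 >= min + 0) }
    '
}
```

With SpamAssassin installed, 'sa-learn --dump magic | bayes_counts' prints the two counts, and 'sa-learn --dump magic | bayes_ready || echo "keep training"' tells you when either count is still under 200. If you lowered bayes_min_ham_num and bayes_min_spam_num to 10 in step 4, pass 10 as the argument to bayes_ready.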
This page was last updated on Wednesday, October 1, 2008. If you have questions, comments, or other feedback about this page, send e-mail using the Contact Form.