Sophisticated Email Filtering

Very little about the Internet aggravates me more than spam. It’s recently taken a turn for the worse, at least in my inbox. Maybe I’m just paying the price for being a web user since the beginning and putting my real email addresses into too many form fields. After seven years on the web, it’s impossible to tell how many mailing lists my email address has been sold to. Regardless, Linux offers a sophisticated way to filter those useless emails: procmail.

procmail, like many other Linux programs, is completely configurable to meet your needs. To check whether procmail was included in your stock install, use any of the following commands:

    which procmail
    type procmail
    whereis procmail

If procmail is installed, any of these commands will print the path to the program. If not, you’ll need to download and install it before going further.

procmail is classic Linux shorthand for “process mail,” a name that describes its job perfectly. Invoked from a .forward file in the user’s home directory, procmail reads incoming mail, separates the body from the header, and uses the data in those two chunks to handle mail in whatever manner the .procmailrc file defines. The .procmailrc file, in turn, contains “recipes” for handling mail, most of which combine regular expressions with delivery instructions. Given the importance of regular expressions to procmail, some background is necessary.
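Before the regular-expression background, the header/body split is worth a quick illustration. The sketch below is not procmail itself; it only shows, with sed, that a message's header ends at the first blank line. The sample message and the function names header_of and body_of are made up for this example.

```shell
#!/bin/sh
# A made-up sample message: header lines, one blank line, then the body.
msg='From: chris@example.com
To: feedback@example.com
Subject: Hello

First line of the body.
Second line of the body.'

# Header: everything before the first blank line.
header_of() { printf '%s\n' "$1" | sed '/^$/,$d'; }

# Body: everything after the first blank line.
body_of() { printf '%s\n' "$1" | sed '1,/^$/d'; }

echo "== header =="
header_of "$msg"
echo "== body =="
body_of "$msg"
```

By default, procmail tests its conditions against the header chunk only, which is why the recipes further down talk about To: and From: lines.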

Regular expressions are a precise way of looking at and interpreting imprecise data: strings of characters. While individual characters are certainly precise, the strings they form when put together can vary enormously. The job of a regular expression is to recognize patterns of characters within those strings. procmail uses regular expressions to match patterns that might indicate what type of mail you’ve received.

A thorough study of regular expressions and their uses in Linux could fill an entire college semester. Luckily for us, most procmail recipes get by with a small subset of regular expression syntax. Let’s take a look:

Character   Matches
^           start of a line
$           end of a line
.           any single character
*           the preceding character any number of times, including zero
+           the preceding character one or more times
?           the preceding character zero or one time
|           either the expression before it or the one after it
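To get a feel for these characters without touching your mail, you can try them with grep. This is only a sketch: grep -E stands in for procmail's own matcher, and the -i flag mimics procmail's default of ignoring case. The helper names matches and show, and the sample strings, are invented for this example.

```shell
#!/bin/sh
# matches PATTERN STRING: succeeds when STRING matches PATTERN.
# grep -E is a stand-in for procmail's matcher; -i ignores case.
matches() { printf '%s\n' "$2" | grep -Eiq -- "$1"; }

# show PATTERN STRING: report whether the pattern matched.
show() {
    if matches "$1" "$2"; then
        echo "match:    $1   against   $2"
    else
        echo "no match: $1   against   $2"
    fi
}

show '^Subject:' 'Subject: hello'      # ^ anchors at the start of the line
show 'hello$'    'Subject: hello'      # $ anchors at the end of the line
show 'h.llo'     'Subject: hallo'      # . stands for any single character
show 'hel*o'     'Subject: heo'        # * allows zero occurrences of l
show 'hel+o'     'Subject: heo'        # + needs at least one l
show 'colou?r'   'Subject: color'      # ? makes the u optional
show 'spam|junk' 'Subject: junk mail'  # | accepts either alternative
```

The interesting pair is hel*o and hel+o against the same string: the first matches and the second does not, which is the whole difference between * and +.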

Now for a few common procmail recipes, and how their regular expressions help sort the wheat from the chaff.

    :0:
    * ^TOLockergnome
    lockergnome

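This recipe tells procmail to start a new recipe [:0], to use a lockfile while delivering (the trailing colon), to match any destination header that mentions Lockergnome [^TOLockergnome], and to deliver the message to the mail folder named lockergnome. Note that ^TO is not the literal string “TO” at the start of a line. It is a special procmail macro that covers the destination headers (To:, Cc:, and their relatives), so the recipe catches mail addressed to Lockergnome whichever of those headers carries the address. Pretty simple, huh? Let’s add a bit to make the focus tighter.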
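Before tightening it, here is a rough shell sketch of what that first recipe checks. It is an approximation under stated assumptions: the hypothetical function addressed_to looks only at To:, Cc:, and Bcc: lines, whereas procmail's real ^TO macro covers more header names, and the sample messages are made up.

```shell
#!/bin/sh
# addressed_to NAME: reads a message on stdin and succeeds if a To:, Cc: or
# Bcc: header line mentions NAME. Simplified stand-in for procmail's ^TO
# macro (an assumption of this sketch, not the macro's real definition).
addressed_to() {
    # Keep only the header (everything before the first blank line),
    # then test the destination lines, ignoring case.
    sed '/^$/,$d' | grep -Eiq -- "^(To|Cc|Bcc):.*$1"
}

cc_msg='From: list@example.com
Cc: Lockergnome@example.com
Subject: Newsletter

Hello there.'

body_only_msg='From: someone@example.com
To: me@example.com
Subject: Question

To: lockergnome is mentioned only down here.'

for m in "$cc_msg" "$body_only_msg"; do
    if printf '%s\n' "$m" | addressed_to lockergnome; then
        echo "file in the lockergnome folder"
    else
        echo "leave for the next recipe"
    fi
done
```

The second message is left alone because the name appears only in the body, and the check never looks past the header. With that picture in mind, here is the tighter version.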

    :0:
    * ^To:.*(rantrave|suggest|submit|feedback)
    * !^From:.*(chris|lori|amy|jake|furo|randy|adam)
    feedback

This recipe tells procmail to look for a To: line containing rantrave, suggest, submit, or feedback, and to check that the message is not from one of the in-house Gnomes; the leading ! negates the second condition. Both conditions must hold for the recipe to fire. When they do, procmail delivers the email to the feedback folder. This is a genuinely useful filter, because it keeps an internal reply to one of the Lockergnome addresses from being filed as feedback. These are relatively simple examples. More can be found in the procmailex man page.
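The way the two conditions combine can be sketched in shell as well. Again, this is an illustration rather than procmail's own logic: the function name is_feedback and the sample messages are hypothetical, grep -Ei stands in for procmail's matcher, and both patterns are anchored to the start of a header line.

```shell
#!/bin/sh
# is_feedback: reads a message on stdin; succeeds only if BOTH conditions hold.
is_feedback() {
    # Keep only the header (everything before the first blank line).
    header=$(sed '/^$/,$d')
    # Condition 1: a To: line mentions one of the feedback addresses.
    printf '%s\n' "$header" |
        grep -Eiq '^To:.*(rantrave|suggest|submit|feedback)' || return 1
    # Condition 2 (negated): no From: line names an in-house sender.
    ! printf '%s\n' "$header" |
        grep -Eiq '^From:.*(chris|lori|amy|jake|furo|randy|adam)'
}

outside_msg='From: reader@example.org
To: feedback@example.com
Subject: Great issue

Loved it.'

inside_msg='From: chris@example.com
To: feedback@example.com
Subject: Re: Great issue

Thanks!'

for m in "$outside_msg" "$inside_msg"; do
    if printf '%s\n' "$m" | is_feedback; then
        echo "deliver to the feedback folder"
    else
        echo "leave for the next recipe"
    fi
done
```

One caveat applies to the sketch and the recipe alike: a pattern such as chris matches anywhere after From:, so an outside sender whose address happens to contain one of those names would be skipped too.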

That’s a fairly big chunk for one day. Tomorrow, we’ll talk about how to set up the .procmailrc file and look at some other recipes you may want to use for some very sophisticated email filtering.
