Struggling with Spam? How to Filter Spam Emails

How to Filter Spam Emails Effectively Using Sieve

Spam emails are a persistent nuisance, clogging inboxes and posing security risks. While many email providers offer basic spam filtering, a more granular and customizable approach can significantly improve accuracy and reduce false positives. This article explores how to leverage Sieve, a powerful scripting language for filtering email messages on a mail server, to create robust and personalized spam filtering rules. We’ll delve into practical examples of Sieve scripts to identify and handle spam, focusing on techniques beyond simple keyword blocking.

Understanding Sieve: A Primer on Email Filtering

Sieve (RFC 5228) is a standard scripting language designed specifically for filtering email. It allows users to create rules that automatically sort, forward, reject, or otherwise manipulate incoming messages based on a variety of criteria. Unlike client-side filters, Sieve filters operate directly on the mail server, providing a more efficient and centralized solution. This means filtering occurs before the email even reaches your inbox, conserving bandwidth and processing power on your local machine.

The core principle behind Sieve is “test then action.” Each rule consists of one or more tests that determine whether a message matches specific criteria, followed by one or more actions to perform on matching messages. Sieve supports a wide range of tests, including matching against headers, sender and recipient addresses, message size, and, with the body extension, body content. Actions include filing messages into specific folders, discarding them, redirecting them to another address, or, with extensions such as editheader, modifying message headers.
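
The “test then action” structure can be sketched as a short script. This is a minimal illustration, not a recommended rule set: the folder name “Lists”, the list identifier, and both addresses are hypothetical placeholders.

```sieve
require ["fileinto"];

# Test: does the List-Id header mention this (hypothetical) mailing list?
if header :contains "List-Id" "announce.example.org" {
    # Action: file the message into a (hypothetical) "Lists" folder.
    fileinto "Lists";
} elsif address :is "From" "billing@example.com" {
    # Test above: exact match on the From address (hypothetical sender).
    # Action: forward the message to another (hypothetical) address.
    redirect "accounts@example.com";
} else {
    # No test matched: deliver to the inbox as usual.
    keep;
}
```

Only fileinto needs a require line here; header, address, redirect, and keep are part of the base language.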

Example 1: Basic Spam Identification Based on Header

This example demonstrates a simple Sieve rule to identify spam based on a specific header. Many spam emails contain headers that are easily identifiable as spam indicators. Here, we’ll look for the “X-Spam-Flag” header, commonly added by mail servers that have already performed some level of spam filtering.

require ["fileinto", "imap4flags"];

if header :contains "X-Spam-Flag" "YES" {
  # Set the flag first: fileinto uses the flags that are current when it runs.
  setflag "\\Seen";
  fileinto "Junk";
  stop;
}

Explanation:

  • require ["fileinto", "imap4flags"];: This line declares the extensions used in the script. fileinto allows moving messages to specific folders, and imap4flags allows setting IMAP flags like “\Seen”.
  • if header :contains "X-Spam-Flag" "YES": This is the core filtering logic. It checks if the header “X-Spam-Flag” contains the value “YES”. The :contains match type performs substring matching, and Sieve’s default comparator (i;ascii-casemap) makes the comparison case-insensitive.
  • setflag "\\Seen";: This sets the “\Seen” flag, which marks the message as read. It must come before fileinto, because fileinto stores the message with the flags that are in effect at the moment it runs. If you prefer to leave the message unread, remove this line.
  • fileinto "Junk";: If the header test is true, this action moves the message to the “Junk” folder (you might need to adjust this to your server’s specific spam folder name, like “Spam”).
  • stop;: This line is crucial. It tells the Sieve interpreter to stop processing further rules. Without it, later rules would still be evaluated for the same message and could, for example, file a second copy into another folder.

Example 2: Discarding Spam with a Specific Subject

Sometimes, specific spam campaigns use consistent subject lines. This example demonstrates how to discard (delete) messages based on a particular subject. Use this with caution, as it permanently removes messages.

# discard is part of the Sieve base language, so no require line is needed.
if header :contains "Subject" "Viagra" {
  discard;
  stop;
}

Explanation:

  • No require line: discard is a core action defined in RFC 5228, not an extension. Listing "discard" in a require statement makes compliant servers reject the script, because no extension of that name exists.
  • if header :contains "Subject" "Viagra": This checks if the “Subject” header contains the word “Viagra”.
  • discard;: If the subject line contains “Viagra”, the message is silently dropped. It is never stored in any folder, and the sender is not notified.
  • stop;: As before, this stops further rule processing.

Important Note: Be very careful when using the discard action. Incorrectly configured rules can lead to the accidental deletion of legitimate emails. Always test thoroughly before deploying discard rules to a production environment.
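
While a rule is still being evaluated, a lower-risk variant is to file matching messages somewhere you can review them instead of discarding them. The sketch below assumes a hypothetical folder named “Junk-Review” that already exists on the server, and uses the :flags argument that the imap4flags extension adds to fileinto.

```sieve
require ["fileinto", "imap4flags"];

if header :contains "Subject" "Viagra" {
    # File into a hypothetical review folder, marked as read,
    # instead of discarding the message outright.
    fileinto :flags "\\Seen" "Junk-Review";
    stop;
}
```

Once the review folder has shown no legitimate mail for a while, the fileinto line can be swapped for discard.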

Example 3: Using Regular Expressions for More Flexible Subject Matching

The :contains match type is useful for simple string matching, but regular expressions provide much greater flexibility. This example demonstrates how to use regular expressions to match a pattern in the subject line, allowing for variations in the spam subject.

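require ["regex", "fileinto", "imap4flags"];

if header :regex "Subject" "^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$" {
  # Set the flag first: fileinto uses the flags that are current when it runs.
  setflag "\\Seen";
  fileinto "Junk";
  stop;
}

Explanation:

  • require ["regex", "fileinto", "imap4flags"];: This line includes the regex extension, which enables regular expression matching. The extension is not part of the base Sieve standard, so confirm that your server supports it.
  • if header :regex "Subject" "^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$": This uses the :regex match type to test the “Subject” header against a regular expression that matches a subject consisting solely of an email address (the ^ and $ anchors require the whole subject to match). The final character class is written [A-Za-z]; a | inside square brackets would be matched literally rather than acting as “or”. This is just an example regex and might need to be adjusted for the spam you actually receive.
  • setflag "\\Seen";: Sets the “\Seen” flag so that the message is filed as read.
  • fileinto "Junk";: If the subject line matches the regular expression, the message is moved to the “Junk” folder.
  • stop;: Stops further rule processing.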

This example demonstrates how powerful regular expressions can be in identifying patterns that are common in spam emails. Note that regular expression matching can be computationally intensive, so use them judiciously to avoid performance issues.
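
When a full regular expression is more than the job needs, the base language’s :matches match type offers simple wildcards and requires no extension. A minimal sketch, with hypothetical subject phrases chosen only for illustration:

```sieve
require ["fileinto"];

# ":matches" supports "*" (any run of characters) and "?" (one character).
# Both phrases below are illustrative placeholders.
if header :matches "Subject" ["*100% free*", "*act now*"] {
    fileinto "Junk";
    stop;
}
```

The test is true if the subject matches any pattern in the list, so new phrases can be added without writing another rule.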

Advanced Header Analysis for Spam Detection

Beyond simple header checks like the “X-Spam-Flag,” a deeper analysis of email headers can reveal more subtle indicators of spam. Examining the “Received” headers, “Message-ID,” and other fields can expose inconsistencies or patterns that are characteristic of spam emails. This section will explore techniques for analyzing these headers and creating Sieve rules to identify and filter spam based on header anomalies.

Example 1: Checking for Multiple “Received” Headers

Spam emails often have a large number of “Received” headers, which can indicate that the message has been relayed through multiple servers, possibly to obscure the original source. While not all emails with many “Received” headers are spam, it can be a useful indicator when combined with other tests.

require ["fileinto", "imap4flags", "relational", "comparator-i;ascii-numeric"];

if size :over 100K { # Ignore very large emails
  stop;
}

# :count compares the number of "Received" headers with the key, numerically.
if header :count "gt" :comparator "i;ascii-numeric" "Received" ["5"] {
  setflag "\\Seen";
  fileinto "Junk";
  stop;
}

Explanation:

  • require ["fileinto", "imap4flags", "relational", "comparator-i;ascii-numeric"];: The relational extension provides the :count match type, and comparator-i;ascii-numeric makes the comparison numeric rather than textual. The variables extension is not needed for this rule.
  • if size :over 100K { stop; }: This line stops processing for messages larger than 100 KB, so they are delivered to the inbox unchanged. Processing large emails can be resource-intensive, and spam filters often perform better by focusing on smaller messages.
  • if header :count "gt" :comparator "i;ascii-numeric" "Received" ["5"]: This checks if the number of “Received” headers is greater than 5. Sieve has no standalone count test; counting is done with the :count match type on the header test, and "gt" means “greater than”.
  • setflag "\\Seen";: Sets the “\Seen” flag. It comes before fileinto so that the flag is applied when the message is filed.
  • fileinto "Junk";: If the number of “Received” headers exceeds 5, the message is moved to the “Junk” folder.
  • stop;: Stops further rule processing.

This rule helps identify emails that have been relayed through an unusually large number of servers, which is often associated with spam. Adjust the threshold (5 in this example) based on your specific email environment.
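
Because the hop count is only a weak signal on its own, it can be combined with a second test using allof. The sketch below is one possible combination, not a tested policy: it assumes the relational extension is available and exempts messages carrying a “List-Id” header, on the assumption that mailing-list traffic legitimately passes through extra servers.

```sieve
require ["fileinto", "relational", "comparator-i;ascii-numeric"];

if allof (
    # More than 5 "Received" headers (threshold taken from the example above).
    header :count "gt" :comparator "i;ascii-numeric" "Received" ["5"],
    # Assumption: mailing-list mail is exempt from the hop-count rule.
    not exists "List-Id"
) {
    fileinto "Junk";
    stop;
}
```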

Example 2: Checking for Invalid or Missing “Message-ID” Headers

The “Message-ID” header is a unique identifier for each email message. Spam emails may sometimes have invalid or missing “Message-ID” headers. This rule checks for the absence of a “Message-ID” header.

require ["fileinto", "imap4flags"];

if not exists "Message-ID" {
  setflag "\\Seen";  # set first: fileinto uses the flags current when it runs
  fileinto "Junk";
  stop;
}

Explanation:

  • require ["fileinto", "imap4flags"];: This includes the necessary extensions.
  • if not exists "Message-ID": This checks if the “Message-ID” header is absent. The exists test checks for the presence of a header, and not negates the result.
  • fileinto "Junk";: If the “Message-ID” header is missing, the message is moved to the “Junk” folder.
  • setflag "\\Seen";: Sets the “\Seen” flag on the message.
  • stop;: Stops further rule processing.

This rule can help identify spam emails that are poorly constructed or that deliberately omit the “Message-ID” header to avoid tracking.
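
The rule above only catches a missing header. To also catch a header that is present but malformed, one option is to test its overall shape with :matches. This is a loose sketch: the pattern only requires angle brackets around something containing an “@”, which is an assumption about what counts as “valid” rather than a full syntax check.

```sieve
require ["fileinto"];

# True when no Message-ID header has the shape <left@right>.
# This also covers the case where the header is absent entirely.
if not header :matches "Message-ID" "<*@*>" {
    fileinto "Junk";
    stop;
}
```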

Example 3: Analyzing the “Received” Header Chain for Suspicious Hops

This more advanced example analyzes the “Received” headers to identify suspicious patterns in the server hops. It checks if any of the “Received” headers indicate a local server claiming to have received the message from itself (which is often a sign of a spoofed or misconfigured server). This requires the “header” test with regular expressions and can be complex.

require ["fileinto", "imap4flags", "regex"];

if anyof (header :regex "Received" "from localhost",
          header :regex "Received" "from 127\\.0\\.0\\.1") {
  setflag "\\Seen";
  fileinto "Junk";
  stop;
}

Explanation:

  • require ["fileinto", "imap4flags", "regex"];: Includes the necessary extensions. The variables extension is not used, so it is not listed.
  • if anyof (header :regex "Received" "from localhost", header :regex "Received" "from 127\\.0\\.0\\.1"): This condition is true if any “Received” header contains “from localhost” or “from 127.0.0.1”. anyof is a logical OR over the tests in its list, while allof is a logical AND; the two cannot be written back to back as “anyof allof”, so only anyof is used here. Note that each period in “127.0.0.1” is escaped (\\.) because a period is a special character in regular expressions, and the backslash itself must be doubled inside a Sieve string.
  • setflag "\\Seen";: Sets the “\Seen” flag. It comes before fileinto so that the flag is applied when the message is filed.
  • fileinto "Junk";: If the condition is met, the message is moved to the “Junk” folder.
  • stop;: Stops further rule processing.

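This example demonstrates how to analyze the “Received” header chain for suspicious patterns, which can be an indicator of spam or spoofed emails. Legitimate mail that passes through a content filter running on the mail server itself can also show a localhost hop, so check a sample of your own legitimate mail before relying on this rule.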

Expert Tip: An online mail header analyzer can help you visualize and understand the structure of email headers, making it easier to identify patterns and create effective filtering rules. Examining legitimate emails alongside spam emails can help you differentiate between normal header patterns and suspicious anomalies.

Filtering Based on Body Content and Regular Expressions

While header analysis can be effective, analyzing the body content of emails is often necessary to identify more sophisticated spam campaigns. This section focuses on techniques for filtering spam based on keywords, phrases, and patterns found within the email body. We’ll explore the use of regular expressions to create flexible and powerful content-based filtering rules.

Example 1: Blocking Emails Containing Specific Keywords

This example demonstrates how to block emails that contain specific keywords commonly associated with spam. This is a basic but often effective technique, especially for blocking obvious spam messages.

require ["fileinto", "imap4flags", "body"];

if body :contains "online casino" {
  setflag "\\Seen";  # set first: fileinto uses the flags current when it runs
  fileinto "Junk";
  stop;
}

Explanation:

  • require ["fileinto", "imap4flags", "body"];: This includes the body extension, which is required for accessing the email body content.
  • if body :contains "online casino": This checks if the email body contains the phrase “online casino”. The :contains match type performs a case-insensitive search.
  • fileinto "Junk";: If the keyword is found, the message is moved to the “Junk” folder.
  • setflag "\\Seen";: Sets the “\Seen” flag on the message.
  • stop;: Stops further rule processing.

You can extend this rule by adding more elsif conditions to check for other spam-related keywords. Remember to choose keywords carefully to avoid false positives.
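
As an alternative to a chain of elsif conditions, the body test accepts a list of keys and is true if any one of them is found. A minimal sketch; the extra phrases are illustrative placeholders, not a vetted keyword list:

```sieve
require ["fileinto", "body"];

# True if the body contains any one of the listed phrases.
if body :contains ["online casino", "wire transfer fee", "claim your prize"] {
    fileinto "Junk";
    stop;
}
```

Keeping the phrases in one list makes it easier to review them together when you are checking for false positives.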

Example 2: Using Regular Expressions to Match URLs with Specific Domains

Spam emails often contain URLs that link to malicious or undesirable websites. This example demonstrates how to use regular expressions to match URLs with specific domains within the email body.

require ["fileinto", "imap4flags", "body", "regex"];

if body :regex "https?://(www\\.)?example\\.com" {
  setflag "\\Seen";  # set first: fileinto uses the flags current when it runs
  fileinto "Junk";
  stop;
}

Explanation:

  • require ["fileinto", "imap4flags", "body", "regex"];: This includes the necessary extensions.
  • if body :regex "https?://(www\\.)?example\\.com": This uses the :regex match type to search for URLs that contain “example.com”. Let’s break down this regex:
    • https?://: Matches “http://” or “https://”. The ? makes the “s” optional.
    • (www\\.)?: Matches “www.” optionally. The parentheses create a group, and the ? makes the entire group optional. The period is escaped (\\.) because it is a special character in regular expressions, and the backslash is doubled because it must itself be escaped inside a Sieve string.
    • example\\.com: Matches “example.com”. Again, the period is escaped.
  • fileinto "Junk";: If the URL pattern is found, the message is moved to the “Junk” folder.
  • setflag "\\Seen";: Sets the “\Seen” flag on the message.
  • stop;: Stops further rule processing.

Replace “example.com” with the domain you want to block. You can modify the regular expression to match more complex URL patterns.
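
One way to extend the pattern is to cover several domains and any subdomain in a single expression. The sketch below assumes the same regex extension as the example above; “example.com” and “example.net” remain placeholders for the domains you want to block.

```sieve
require ["fileinto", "body", "regex"];

# ([a-z0-9-]+\\.)*  allows zero or more subdomain labels, e.g. "mail.".
# (example\\.com|example\\.net)  matches either placeholder domain.
if body :regex "https?://([a-z0-9-]+\\.)*(example\\.com|example\\.net)" {
    fileinto "Junk";
    stop;
}
```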

Example 3: Using the :text Body Transform for More Robust Body Matching

The :text body transform tells the body test to search the textual content of the message. The server makes a best effort to decode transfer encodings and character sets, and some implementations also strip HTML markup. It is the default transform, so a plain body :contains test already uses it; writing :text explicitly documents the intent and contrasts with :raw, which matches against the undecoded body. A transform is not a match type, so :contains must still be given; without it the test defaults to :is and would require the entire body to equal the phrase.

require ["fileinto", "imap4flags", "body"];

if body :text :contains "urgent payment required" {
  setflag "\\Seen";
  fileinto "Junk";
  stop;
}

Explanation:

  • require ["fileinto", "imap4flags", "body"];: This includes the necessary extensions.
  • if body :text :contains "urgent payment required": This uses the :text transform together with the :contains match type to search for the phrase “urgent payment required” within the textual content of the email body.
  • setflag "\\Seen";: Sets the “\Seen” flag. It comes before fileinto so that the flag is applied when the message is filed.
  • fileinto "Junk";: If the phrase is found, the message is moved to the “Junk” folder.
  • stop;: Stops further rule processing.

The :text transform is generally preferred over :raw when you want to focus on the actual textual content of the email rather than its encoded form. Because decoding is handled by the server, a phrase can be found even when the message is sent as base64 or quoted-printable, which can lead to more accurate results.

Quote: “Effective spam filtering requires a layered approach. Combining header analysis with body content filtering, and regularly updating your rules, is essential for staying ahead of spammers.” – Email Security Expert

Implementing Whitelists and Blacklists Effectively

Whitelists and blacklists are fundamental components of any effective spam filtering system. Whitelists ensure that emails from trusted senders are always delivered, while blacklists prevent emails from known spammers from reaching your inbox. This section will explore how to implement and manage whitelists and blacklists effectively using Sieve.

Example 1: Creating a Simple Whitelist

This example demonstrates how to create a whitelist that allows emails from specific sender addresses to bypass spam filtering. It checks the “From” header against a list of trusted email addresses.

require ["fileinto", "imap4flags"];

if anyof (
    address :is "From" "friend1@example.com",
    address :is "From" "colleague@company.com",
    address :is "From" "family@example.net"
) {
  # Whitelisted sender - deliver to inbox
  stop;
}

# If not whitelisted, process spam filtering rules
if header :contains "X-Spam-Flag" "YES" {
  setflag "\\Seen";  # set first: fileinto uses the flags current when it runs
  fileinto "Junk";
  stop;
}

Explanation:

  • require ["fileinto", "imap4flags"];: Includes the necessary extensions.
  • if anyof (address :is "From" "friend1@example.com", address :is "From" "colleague@company.com", address :is "From" "family@example.net"): This checks if the “From” header matches any of the whitelisted email addresses. The address :is test performs an exact match on the email address in the specified header. The anyof condition means that if *any* of the email addresses match, the entire condition is true.
  • stop;: If the sender is whitelisted, the stop action ends script processing before any spam filtering rule runs. Because no other action has been taken, Sieve’s implicit keep delivers the message to the inbox.
  • The following section contains your standard spam filtering rule (in this case checking the “X-Spam-Flag” header). This is only executed if the email is *not* whitelisted.

Add more address :is conditions to include additional whitelisted senders. Place this whitelist rule at the beginning of your Sieve script to ensure that it’s processed before any spam filtering rules.
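
As the list grows, the separate address :is tests can be collapsed into a single test with a list of addresses, which is true if the sender matches any entry. A minimal sketch using the same placeholder addresses as above:

```sieve
# One test, many trusted senders; no extension is required.
if address :is "From" [
    "friend1@example.com",
    "colleague@company.com",
    "family@example.net"
] {
    keep;   # Deliver to the inbox explicitly.
    stop;   # Skip all later spam rules.
}
```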

Example 2: Creating a Simple Blacklist

This example demonstrates how to create a blacklist that automatically moves emails from known spam senders to the “Junk” folder. It checks the “From” header against a list of blacklisted email addresses.

require ["fileinto", "imap4flags"];

if anyof (
    address :is "From" "spammer1@bad-domain.com",
    address :is "From" "advertiser@spam-site.net",
    address :is "From" "fraudster@dodgy-email.org"
) {
  setflag "\\Seen";  # set first: fileinto uses the flags current when it runs
  fileinto "Junk";
  stop;
}

Explanation:

  • require ["fileinto", "imap4flags"];: Includes the necessary extensions.
  • if anyof (address :is "From" "spammer1@bad-domain.com", address :is "From" "advertiser@spam-site.net", address :is "From" "fraudster@dodgy-email.org"): This checks if the “From” header matches any of the blacklisted email addresses.
  • fileinto "Junk";: If the sender is blacklisted, the message is moved to the “Junk” folder.
  • setflag "\\Seen";: Sets the “\Seen” flag on the message.
  • stop;: Stops further rule processing.

Add more address :is conditions to include additional blacklisted senders. Be cautious when adding senders to the blacklist, as incorrect entries can lead to the accidental filtering of legitimate emails.

Example 3: Using Domain-Based Whitelisting with Regular Expressions

Instead of whitelisting individual email addresses, you can whitelist entire domains. This example demonstrates how to use regular expressions to whitelist all email addresses from a specific domain.

require ["fileinto", "imap4flags", "regex"];

if address :regex "From" "@example\\.com$" {
  # Whitelist all senders from example.com
  stop;
}

# If not whitelisted, process spam filtering rules
if header :contains "X-Spam-Flag" "YES" {
  setflag "\\Seen";  # set first: fileinto uses the flags current when it runs
  fileinto "Junk";
  stop;
}

Explanation:

  • require ["fileinto", "imap4flags", "regex"];: Includes the necessary extensions.
  • if address :regex "From" "@example\\.com$": This uses the address :regex test to check if the “From” header contains an email address that ends with “@example.com”. The $ anchor in the regular expression ensures that the match occurs at the end of the address, and the period is escaped (\\.) so that it matches a literal dot.
  • stop;: If the sender’s domain is whitelisted, the stop action prevents further rule processing.
  • The following section contains your standard spam filtering rule (in this case checking the “X-Spam-Flag” header). This is only executed if the email is *not* whitelisted.

This rule allows all senders from the “example.com” domain to bypass spam filtering. Use this technique with caution, as it can also allow spam emails from compromised accounts within the whitelisted domain.
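
If your server does not offer the regex extension, the same effect can be sketched with the :domain address part, which is part of the base language. “example.com” is the same placeholder domain as above, and the subdomain line is an optional assumption about what you want to trust.

```sieve
if anyof (
    # Exact match on the domain part of the From address.
    address :domain :is "From" "example.com",
    # Optional: also trust subdomains such as mail.example.com.
    address :domain :matches "From" "*.example.com"
) {
    keep;
    stop;
}
```

Comparing only the domain part also avoids accidental matches elsewhere in the address.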

Comparison Table: Whitelists vs. Blacklists

Feature | Whitelist | Blacklist
Purpose | Ensure delivery of emails from trusted senders | Prevent delivery of emails from known spammers
Effect on Filtering | Bypasses spam filtering rules | Triggers spam filtering actions
Risk of False Positives | Low (if managed carefully) | Higher (requires careful monitoring)
Maintenance | Requires ongoing maintenance to add and remove senders | Requires ongoing maintenance to add and remove senders
Best Used For | Trusted contacts, important notifications | Known spammers, domains associated with spam

Important Considerations: Regularly review your whitelists and blacklists to ensure their accuracy and effectiveness. Remove outdated or incorrect entries to prevent false positives or missed spam messages. Consider using a combination of whitelists, blacklists, and other filtering techniques for a more robust spam filtering system.
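
Putting the pieces together, a combined script might be ordered as whitelist first, then blacklist, then the general spam check. This is a skeleton under the same assumptions as the earlier examples: all addresses are placeholders, and “Junk” stands for whatever your server calls its spam folder.

```sieve
require ["fileinto", "imap4flags"];

# 1. Whitelist: trusted senders skip every later rule.
if address :is "From" ["friend1@example.com", "colleague@company.com"] {
    keep;
    stop;
}

# 2. Blacklist: known spam senders go straight to Junk, marked as read.
if address :is "From" ["spammer1@bad-domain.com", "advertiser@spam-site.net"] {
    setflag "\\Seen";
    fileinto "Junk";
    stop;
}

# 3. Everything else: rely on the server's spam verdict.
if header :contains "X-Spam-Flag" "YES" {
    setflag "\\Seen";
    fileinto "Junk";
    stop;
}
```

The order matters: because each block ends with stop, a sender that appears on both lists is treated as whitelisted.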

Testing and Deploying Sieve Filters

After creating Sieve filters, it’s crucial to test them thoroughly before deploying them to a production environment. Testing helps identify errors, prevent false positives, and ensure that the filters are working as intended. This section will explore techniques for testing Sieve filters and deploying them effectively on a mail server.

Example 1: Using a Sieve Testing Tool

Many mail servers and email clients provide built-in Sieve testing tools that allow you to simulate the filtering process without actually delivering or modifying emails. These tools typically allow you to input an email message and a Sieve script, and then display the actions that would be taken by the script. Check your mail server or email client documentation for specific instructions on using their Sieve testing tool.

For example, with Dovecot’s Pigeonhole Sieve implementation, you might test using a command line such as:

sieve-test -t - -c /etc/dovecot/dovecot.conf -u testuser@example.com myscript.sieve testmail.txt

Explanation:

  • sieve-test: This is the command-line tool for testing Sieve scripts.
  • -t -: Enables runtime trace output and writes it to standard output, which provides more detailed information about the filtering process. Options can differ between versions, so confirm them with man sieve-test on your system.
  • -c /etc/dovecot/dovecot.conf: Specifies the Dovecot configuration file. Adjust the path to your actual Dovecot configuration file.
  • -u testuser@example.com: Specifies the user account to use for testing. This helps simulate the user's specific configuration and settings.
  • myscript.sieve: Specifies the Sieve script to test.
  • testmail.txt: The file containing the email message you want to test. sieve-test takes it as the second positional argument, after the script file.

The output will show you how the Sieve script processes the email and which actions would be taken. This allows you to verify that the script is working as intended and identify any potential errors.

Example 2: Using a Test Email Account

A more practical approach is to create a dedicated test email account and send test emails through your mail server. This allows you to test the entire filtering process, including the interaction between the Sieve script and other mail server components.

To use this method:

  • Create a new email account specifically for testing Sieve filters.
  • Upload your Sieve script to this test account.
  • Send various test emails to the test account, including both legitimate emails and simulated spam emails.
  • Check the test account's inbox and spam folder to verify that the filters are working correctly. Are legitimate emails being delivered to the inbox? Are spam emails being moved to the spam folder?
  • Analyze the email headers of the processed messages to understand how the Sieve script has modified them.

This method provides a more realistic testing environment compared to using a Sieve testing tool. It allows you to identify any potential issues related to mail server configuration, character encoding, or other factors that can affect the filtering process.

Example 3: Deploying a Sieve Script Using Managesieve

Once you've thoroughly tested your Sieve script, you can deploy it to your mail server using ManageSieve. ManageSieve (RFC 5804) is a protocol for managing Sieve scripts on a remote mail server. Many email clients support ManageSieve, allowing you to upload and activate Sieve scripts directly from your email client interface.

Alternatively, you can use a command
