Spam Prevention

What is SPAM?

Spam, also known as Unsolicited Commercial Email (UCE), is commonly defined as commercial communication from organizations you have no relationship with. Some have mistakenly interpreted spam as being any unwanted email message that shows up for any reason. The latter can include email from friends or relatives you no longer want to communicate with as well as organizations or mailing lists you have had a relationship with but lost interest in.

Jade Networks uses the more formal definition of spam as being UCE - from organizations you have not requested commercial email from. Email from organizations, mailing lists, or relationships you have not asked to quit communicating with is NOT considered spam and is the responsibility of the message recipient to deal with. With that said, spam is a HUGE problem on the Internet. Most sales ads sent as spam are frauds/scams, with many containing viruses and/or phishing hooks. They try to trick the recipient into visiting a fraudulent website in an attempt to steal personal and/or financial information.

In addition to the above methods of getting the recipient to click on a link, there are other dangers and methods that criminals use. Get-rich-quick schemes such as the common 419 schemes are perhaps the oldest scam type used on the Internet. Not only is the flood of spam a significant inconvenience, wasting time and money, it can be dangerous to users not familiar with the socially engineered messages designed to deceive and steal.

Unlike other security threats, spam is directed at the end user and not the network. There are tools, however, that can be used to detect and deflect many of these types of attacks.

How to Block?

Effective blocking of spam is unfortunately not as easy as it might seem. The problem is not so much how to block the bad stuff from coming in, but how to do it while AT THE SAME TIME not affecting legitimate messages flowing in and out. Usually the blocking of a good message by mistake (also known as a false positive) is considered much worse than letting bad messages in. Many techniques have been tried with varying degrees of success with no single approach completely blocking spammers. A few of these are discussed below.

User Filters

Bayesian spam filtering techniques were used in many early email server solutions attempting to block spam. These and other similar types of filtering techniques are available in Email User Agents such as Thunderbird and others, including some webmail systems. These are the programs that users use to send and receive email. Through continued use and the classification of messages as spam, the mail program "learns" or gets a better idea of what patterns in a mail message represent spam for a given user. The system can then take action on these messages on behalf of the user. The accuracy of these systems varies considerably, so usage is not as common as the designers would have liked. In addition, normally only the more technically sophisticated users tend to use these features, which limits their usefulness.
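The "learning" described above can be illustrated with a very small word-frequency classifier. This is only a sketch of the general naive Bayes idea, not the algorithm used by Thunderbird or any other mail program; the class name, the whitespace tokenizer and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy word-frequency classifier (hypothetical, for illustration only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Record the words of a message the user has classified.
        self.message_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        # Score the message under each class using log-likelihoods
        # with add-one smoothing so unseen words do not zero the result.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_msgs = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log((self.message_counts[label] + 1) / (total_msgs + 2))
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Turn the two log scores into a probability of spam;
        # the clamp keeps math.exp from overflowing on long messages.
        diff = max(min(log_scores["ham"] - log_scores["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize click now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("lunch on monday", "ham")
```

With this training data, spam_probability("free money") comes out above 0.5 and spam_probability("monday meeting") below it. A real filter needs far more training mail per user, which is one reason accuracy varies so much in practice.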

Content Filters (Server Side)

Email messages can also be analyzed and acted upon when received by the Message Transfer Agent (MTA) on the inbound server. One of the most common server-side content analysis engines is SpamAssassin. This open source solution takes a message and optionally runs several different filters against it to determine a spam score. Some of the available plug-ins to SpamAssassin include hooks for the Distributed Checksum Clearinghouse (DCC), Pyzor, and Razor. There are many more filters (including server-side bayesian and phishing filters) that can be plugged into SpamAssassin as documented on the project website.
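The idea of running several filters and combining them into a single spam score can be sketched as follows. This is not SpamAssassin code or its rule syntax; the rule names, weights and the threshold of 5.0 are hypothetical values chosen only to show how individual hits add up to a decision.

```python
from typing import Callable, Dict, List, Tuple

# Each check inspects the raw message text and returns True when it fires.
Check = Callable[[str], bool]


def subject_all_caps(message: str) -> bool:
    # Fires when a Subject header is written entirely in upper case.
    return any(
        line.startswith("Subject:") and line[8:].strip().isupper()
        for line in message.splitlines()
    )


# Hypothetical rules and weights, for illustration only.
RULES: List[Tuple[str, float, Check]] = [
    ("SUBJECT_ALL_CAPS", 1.5, subject_all_caps),
    ("MENTIONS_LOTTERY", 2.5, lambda m: "lottery" in m.lower()),
    ("WIRE_TRANSFER", 3.0, lambda m: "wire transfer" in m.lower()),
]


def score_message(message: str, threshold: float = 5.0) -> Dict[str, object]:
    """Sum the weights of every rule that fires and compare with a threshold."""
    hits = [(name, weight) for name, weight, check in RULES if check(message)]
    total = sum(weight for _, weight in hits)
    return {
        "score": total,
        "hits": [name for name, _ in hits],
        "is_spam": total >= threshold,
    }
```

Because no single rule reaches the threshold on its own, one weak indicator does not condemn a message; several have to agree. Raising the threshold trades fewer false positives for more spam getting through.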

We have found that there is a large variance in the accuracy of the different network based content filters, with many returning unacceptably high false positive rates. While we use SpamAssassin internally, we are very cautious about integrating additional content filters, as in some situations they have proven to block too many good messages along with the bad. With that said, we enabled Razor 2 content analysis in the Jade MTA's in August, 2015.  From the Razor home page:

Razor is a distributed, collaborative, spam detection and filtering network. Through user contribution, Razor establishes a distributed and constantly updating catalogue of spam in propagation that is consulted by email clients to filter out known spam. Detection is done with statistical and randomized signatures that efficiently spot mutating spam content. User input is validated through reputation assignments based on consensus on report and revoke assertions which in turn is used for computing confidence values associated with individual signatures.

Greylisting

Most spammers use a one-shot approach to the distribution of spam. Normal email systems queue messages and, when problems are encountered, will store the message and retry sending at a later time. This overhead, both in terms of logic and local storage, is more than most spammers can justify. For this reason they normally attempt to send the message once and then discard it regardless of whether it was received or not.

With this in mind many Internet mail servers (MTA's) keep track of the senders and recipients of messages that flow through their systems. When a new message arrives, if there is a record of the external sender and local recipient address pair the message is immediately accepted (pending other security checks). If not (which is usually the case for spam messages) the MTA issues a temporary failure reply code (a 4xx code in SMTP) to the sending system.  This signals to the foreign system that we cannot accept the email now due to a local problem but that it should try again later.

Normal mail systems will keep the message and retry later during their next queue run (hence a temporary error condition).  The Jade configuration uses a one minute delay during which time new message delivery attempts will continue to be deferred.  After this delay any new attempts to send email using this sender/recipient pair will succeed with the new sender/recipient pair information stored in a local database.  Further communication using these addresses will pass through the system with no delays.
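The greylisting decision described above can be sketched in a few lines. This is a simplified in-memory illustration, not the Jade implementation; the class and method names are hypothetical, and a real MTA would keep the pairs in a persistent database and expire old entries.

```python
import time
from typing import Dict, Optional, Set, Tuple

GREYLIST_DELAY_SECONDS = 60  # the one minute delay mentioned above


class Greylist:
    """In-memory greylist keyed on (sender, recipient) pairs (hypothetical)."""

    def __init__(self, delay: float = GREYLIST_DELAY_SECONDS):
        self.delay = delay
        self.first_seen: Dict[Tuple[str, str], float] = {}
        self.known_pairs: Set[Tuple[str, str]] = set()

    def check(self, sender: str, recipient: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        pair = (sender.lower(), recipient.lower())
        if pair in self.known_pairs:
            return "accept"  # pair has already passed greylisting
        # Remember when this pair was first seen (no change if already stored).
        first = self.first_seen.setdefault(pair, now)
        if now - first >= self.delay:
            # The sender retried after the delay, so record the pair as known.
            self.known_pairs.add(pair)
            del self.first_seen[pair]
            return "accept"
        return "defer"  # the MTA would answer with a temporary (4xx) reply


greylist = Greylist()
first_try = greylist.check("a@example.org", "b@example.com", now=0)
early_retry = greylist.check("a@example.org", "b@example.com", now=30)
later_retry = greylist.check("a@example.org", "b@example.com", now=61)
```

Here first_try and early_retry are deferred and later_retry is accepted, after which the same pair passes without delay. A sender that never retries never reaches the accept branch.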

The first time a new sender tries to communicate normally results in a short initial delay.  Almost all non-spam messages do get retried and delivered within a short period.  Our experience with greylisting is that this simple mechanism alone accounts for a spam reduction of over 80% through our MTA's with no message loss from legitimate senders.

Recipient Address Validation

Many older email systems blindly accept all messages that are submitted and try to deliver them the best they can after accepting them. This has proven to be a bad model, as spammers usually forge the message envelope sender address to something that either cannot be replied to or that points to an innocent third party's address. When the latter happens and delivery fails, a non-delivery notification is sent to the forged sender address.  This is known as backscatter (see the Wikipedia backscatter article for an in-depth discussion of this).  Backscatter messages are considered by many as another form of spam and can get the receiving MTA blacklisted as well.

Backed up non-delivery notification messages can cause many additional problems in the receiving MTA.  If too many messages get backed up it can affect mail system performance. The solution to these problems is to verify all recipient addresses during the SMTP transaction (SMTP RCPT TO command).  Any submission attempts to improper recipient addresses then get shut down at the SMTP level before they are ever accepted in the first place.  Most modern email systems take this approach, as do the Jade Networks MTA's.

Another tactic that spammers use is dictionary lookups for possible user names to try.  These names are used to construct email addresses when sending email. It is not uncommon for spammers to simply walk through addresses using the SMTP RCPT TO command until they get a valid account name. Older MTA's will blindly reply with a 250 OK return for all and worry about the consequences later. MTA's that perform address validation during the SMTP transaction will let the remote MTA know right away if an address is valid or not.

There is a serious security problem with providing criminals immediate feedback in their search for valid users.  Unless throttled, the criminal can quickly figure out who most of your users are using this technique, even if they never send email.  One defence against this type of attack is to keep track of the failed attempts and to disconnect the remote MTA after an error threshold has been reached. Jade Networks is configured with a threshold of 2, so a single failed attempt is tolerated but a second one is not. Technically this is a violation of the SMTP specifications; however, in the many years we have used this across many sites we have yet to encounter any message loss as a result.  This approach effectively shuts down dictionary attacks at the SMTP level.
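The failure threshold can be pictured as a counter kept per SMTP connection. The sketch below is a hypothetical illustration rather than the Jade configuration; the class name and reply texts are made up, and 250, 550 and 421 are used here simply as the standard SMTP codes for success, rejected mailbox and closing the connection.

```python
class RcptSession:
    """Tracks failed RCPT TO attempts for one SMTP connection (hypothetical)."""

    def __init__(self, valid_recipients, max_failures=2):
        self.valid = {address.lower() for address in valid_recipients}
        self.max_failures = max_failures  # threshold of 2, as described above
        self.failures = 0

    def rcpt_to(self, address):
        # Valid recipients are accepted and do not count against the sender.
        if address.lower() in self.valid:
            return "250 OK"
        self.failures += 1
        if self.failures >= self.max_failures:
            # Threshold reached: tell the remote MTA we are closing the channel.
            return "421 Too many invalid recipients, closing connection"
        return "550 No such user"


session = RcptSession(["alice@example.com"])
replies = [
    session.rcpt_to("bob@example.com"),    # first miss is tolerated
    session.rcpt_to("alice@example.com"),  # valid address
    session.rcpt_to("carol@example.com"),  # second miss ends the session
]
```

A dictionary attack therefore learns at most one invalid name per connection before being cut off, which makes walking a list of candidate names far more expensive.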

Sender Policy Framework (SPF)

The Simple Mail Transfer Protocol (SMTP) is used to transfer email messages between systems.  SMTP is in fact so "simple" that without proper safeguards in place it is trivial to forge the sender address.  Criminals often use this technique to make email appear to come from someone else - most often an innocent unsuspecting person with no relationship to the fraudulent message being sent.  The criminal hides behind innocent people to conceal their actions.

One solution to this problem in widespread use across the Internet is the Sender Policy Framework (SPF).  SPF uses the Domain Name System (DNS) to publish, for each domain, records (today published as TXT records) that list the MTA's authorized to originate messages for that domain.  When email is received, the receiving MTA looks at the SMTP MAIL FROM: value to determine the sender domain.  A DNS check is then made for this domain to see if SPF records exist.  If they do, this information is compared against the connecting IP address of the sending MTA.  If the connecting IP address is authorized by the SPF record the message is accepted.   If not, the message is rejected during the SMTP session with an SPF violation error.
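The comparison step can be sketched as below. This is a deliberately reduced illustration: it assumes the SPF record text has already been fetched from DNS, and it understands only ip4:/ip6: mechanisms followed by a final "all" term. Real SPF evaluation also handles mechanisms such as include, a and mx, plus several result types, none of which are modelled here. The function name and the sample record are hypothetical.

```python
import ipaddress


def spf_allows(record: str, client_ip: str) -> bool:
    """Check a connecting IP against the ip4:/ip6: terms of an SPF record."""
    ip = ipaddress.ip_address(client_ip)
    for term in record.split()[1:]:  # skip the leading "v=spf1"
        if term.startswith(("ip4:", "ip6:")):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip.version == network.version and ip in network:
                return True  # connecting host is an authorized sender
        elif term in ("-all", "~all"):
            return False  # nothing matched before the catch-all
    return False


# Hypothetical record using documentation address ranges.
EXAMPLE_RECORD = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.25 -all"
authorized = spf_allows(EXAMPLE_RECORD, "192.0.2.10")
forged = spf_allows(EXAMPLE_RECORD, "203.0.113.5")
```

In this example authorized is True and forged is False, so a message claiming the domain but arriving from 203.0.113.5 would be refused during the SMTP session.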

SPF checks are very effective in fighting the reception of email with forged envelope sender addresses.  Jade configures SPF in all Jade DNS nameservers for all Jade and hosted domains.  SPF is also used for all inbound messages to its MTA's.  Jade SPF DNS records for customer domains are initially configured to list the appropriate Jade MTA's.  For most customers this works fine as long as email is sent through the Jade network.  If there is a need for a customer to do message submission through other MTA's please let us know so we can adjust the SPF records accordingly.  Use of public MTA's is highly discouraged however as anyone else with access to these submission points will also have the ability to send masqueraded messages appearing to come from the customer domain.

For additional information on SPF there is an excellent article with extensive references on the Wikipedia site.

DNS Blacklists (RBL's)

Spammers rarely just send out a few messages but rather try to spew as many as they can to get their message out.   It's a numbers game - the more rubbish they spew the higher the chance that they will get a bite somewhere even if the percentages are low.  Network operators around the world try to keep track of where spam messages originate.   While many are propagated through botnets using hijacked computing resources, many are also sent through conventional channels.  DNS Blacklists, also known as DNS Blackhole Lists (DNSBL's) and Real-time Blackhole Lists (RBL's), are databases of known IP addresses associated with spam senders.  These databases are stored using the DNS and can be queried by MTA's (using the sender's IP address) to determine if the remote sender is listed or not.
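A DNS blacklist query is an ordinary DNS lookup on a specially constructed name: the octets of the sender's IPv4 address are reversed and prepended to the list's zone. The sketch below shows that construction. The function names are hypothetical, the zone is whatever list the operator has chosen, and treating any answer at all as "listed" is a simplification since lists document their own return codes.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a given list."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 addresses only")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the list answers for this address (needs network access)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False  # no such name: the address is not listed


query = dnsbl_query_name("192.0.2.99", "dnsbl.example.org")
```

For the address 192.0.2.99 and the placeholder zone dnsbl.example.org, the name queried is 99.2.0.192.dnsbl.example.org. Because the check is a single DNS query, it is cheap enough to run on every inbound connection.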

Many RBL's exist but there are large differences in the various lists.  The listing (and de-listing) criteria between different list providers can be significant.  The more aggressive providers will easily add addresses to their lists with false-positive rates typically being high.  Overall accuracy rates can be problematic if too many good messages get rejected along with the bad.

At Jade we have run tests with various RBL's configured into our mail servers.  Most have, for our customers at least, resulted in unacceptably high false-positive rates.  The loss of messages critical to business can cost companies far more than a few extra obvious spam messages to delete.  At the time of this writing we use only one RBL - zen.spamhaus.org.  This RBL has very low false-positive rates while still catching most spam getting through our other countermeasures (see above).  With that said it is also not 100% effective with some messages still getting through.

Non-SMTP Access Controls

Spam and other fraudulent messages that get through the above controls and are detected are transferred to a special spam handling account.  An automated system then retrieves them and extracts the information about the connecting host.  An abuse report message is then constructed from this and other diagnostic information and sent to the Jade Abuse Report account and also to the abuse contact for the site which submitted the message to the Jade MTA (if one was found in the remote whois data).

The abuse report is sent using the Internet standard Abuse Reporting Format (ARF) (see Abuse Report Reporting Standards for links to the technical standards).  The text portion of the abuse report contains a brief description of the report, the original message headers, the ARF message header summary section, and the WHOIS data.  The original message is attached to this administrative message intact without any modification.

Once received by the Jade Abuse Report account the message is reviewed for a second time.   If it is determined that the original message was not spam, the block is removed and the remote admin is notified.  Domains that do not have abuse contact information published get silently blocked with no notification.  The double human verification of spam messages was designed to reduce the possibility of improperly blocking remote sites.  So far there have been no false positives with this approach.
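The overall shape of an ARF message (a human-readable part, a machine-readable feedback part, and the untouched original message) can be sketched with Python's standard email package. This is not the Jade SpamTools code; the function name, the User-Agent string and the addresses are hypothetical, and only a handful of the feedback fields defined by the standard are filled in.

```python
from email import message_from_string
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart
from email.mime.text import MIMEText


def build_abuse_report(original_raw, source_ip, description, reporter, abuse_contact):
    """Assemble a three-part multipart/report message in ARF style."""
    report = MIMEMultipart("report", report_type="feedback-report")
    report["From"] = reporter
    report["To"] = abuse_contact
    report["Subject"] = "Abuse report for " + source_ip

    # Part 1: brief human-readable description of the report.
    report.attach(MIMEText(description, "plain"))

    # Part 2: machine-readable feedback fields (a minimal subset).
    fields = MIMENonMultipart("message", "feedback-report")
    fields.set_payload(
        "Feedback-Type: abuse\r\n"
        "User-Agent: ExampleSpamTools/0.1\r\n"  # hypothetical tool name
        "Version: 1\r\n"
        "Source-IP: " + source_ip + "\r\n"
    )
    report.attach(fields)

    # Part 3: the original message, attached without modification.
    report.attach(MIMEMessage(message_from_string(original_raw)))
    return report


example_report = build_abuse_report(
    "From: sender@example.org\nSubject: You won\n\nClaim your prize.\n",
    "192.0.2.1",
    "This is an abuse report for a message received from 192.0.2.1.",
    "abuse-reports@example.net",
    "abuse@example.org",
)
```

Keeping the original message as its own attached part is what lets the receiving abuse desk examine the headers exactly as they arrived.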

At the time the message is processed by our internal spam handling process the offending sending IP address is automatically entered into our spam blacklist.  This is currently implemented as an IP block on our mail servers.  Should the review determine that the message was inappropriately submitted (it was not spam) the block is manually removed and the remote admin is notified.  We are working on a system where the blacklist will eventually be entered into a Jade RBL (which we will then also make public).

Jade Networks Email Configuration

Jade Networks uses all of the above mechanisms in the fight against spam.  Of the above countermeasures, greylisting and RBL use together catch an estimated 90% or more of the received spam.  SPF checks catch some as well.   Received spam that gets through results in approximately 10 - 20 messages a day entering our local IP blacklists (our current messaging volume is relatively low).  We intend to re-evaluate the Razor 2 content filtering system but will not do this until we have had a chance to consult with our customers and schedule accordingly.

A quick note on the local blacklists - our long term plan here is to integrate this into a new host based security solution we are working on.  There are a few problems with the current solution.   To start with, blacklisted MTA's cannot connect to our server at all.  A better solution is to allow the connection and reject it with an SMTP error indicating they are blacklisted.  This is how our RBL rejections work.

While blocking at the IP level has proven quite effective in blocking known spam senders it has also resulted in some collateral damage.  There are currently no automated controls for removing addresses from the list.   The new system we are working on will have a database backend where histories for each IP address can be stored and tracked.  We are also working on network policies that determine under what conditions a site can be removed from the blacklist (but not our database records).  We should have this automated system operational by the end of 2015.

Current Development Work

We are always looking for ways to improve the Jade systems.  In addition to the above mentioned plans for the new Jade security backend we have short term plans for additional enhancements to the email backend.  The Jade SpamTools were upgraded to be fully ARF compliant in September 2015.  At that time the system was changed to allow automatic submission to upstream MTA's when valid abuse reporting addresses were found in the submitting MTA's whois record.  This, combined with enhanced reporting and ARF standard compliance, has so far resulted in better and faster responses from some of the larger providers.

During October, 2015 we will be running tests and will eventually deploy a DomainKeys Identified Mail (DKIM) solution (see DKIM Standards for the technical details).  This will complement our current SPF authentication tools.  After DKIM integration we will be evaluating the newer DMARC recommendations for possible integration into the Jade mail infrastructure.

See Also