Synopsis

Spam affects everyone whose email address is published on the web. It also affects many who have gone to some length to keep their address private. It's getting worse, and those making money from it are innovating rapidly to defeat the mail filters that leverage the theories of the Reverend Bayes. It's a war.

Rather than making email obsolete, abandoning the ubiquity of the @ sign, or charging for email, perhaps more schemes like the Sender Policy Framework (SPF) need to be rolled out. SPF has flaws, by the way. Amongst other things it can push the roll out of SPF ... early adopters can see no benefit.

Many years ago (2002*), on the mailing list for Apache's Java Mail Server (JAMES), an extension to SMTP was proposed and discussed.  The idea: there needed to be a phased update of mail servers to a new standard whereby the origin of emails could be confirmed.  "Did You Send This" (DYST) was born.

Details

Your SMTP server caches a digest of your email for an hour or so.  Before passing the mail to the recipient's inbox, the destination SMTP server calls back to your SMTP server and asks "Did you send this?" in respect of the digest of the message it has just received.  It actually does the HELO business first, then issues a DYST command, which would be new to the SMTP command set.
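A minimal sketch of the sending side, assuming nothing beyond what is described above: the server remembers a digest of each outgoing message for about an hour and answers yes or no when asked. The class name, the method names and the choice of SHA-256 are all my own assumptions for illustration; no DYST command exists in SMTP today, so the network part is left out entirely.

```python
import hashlib
import time


class SentDigestCache:
    """Sender-side memory of recently sent messages (hypothetical helper)."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds          # "an hour or so"
        self._clock = clock              # injectable so the expiry can be tested
        self._expiry_by_digest = {}

    @staticmethod
    def digest(message: bytes) -> str:
        # SHA-256 is an assumption; any hash both servers agree on would do.
        return hashlib.sha256(message).hexdigest()

    def remember(self, message: bytes) -> str:
        """Record an outgoing message and return its digest."""
        d = self.digest(message)
        self._expiry_by_digest[d] = self._clock() + self._ttl
        return d

    def did_you_send_this(self, digest: str) -> bool:
        """Answer a DYST query: True only for unexpired digests we recorded."""
        expiry = self._expiry_by_digest.get(digest)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._expiry_by_digest[digest]   # discard stale entries lazily
            return False
        return True
```

The destination server would compute the same digest over the message it received and pass it to `did_you_send_this` via the (not yet existing) DYST command.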

It would involve a new IETF RFC and a wide-scale upgrade to SMTP software over a period of time.  You could consider it a phased implementation, though.  Some universities roll it out first and, after a period of warning about or marking incoming email, they turn off support for mail originating from servers that do not support DYST.  Hotmail, Yahoo (etc.) upgrade their servers to support DYST but do not turn off support on the same schedule as the universities.  Corporate leviathans would be last: the laggards in the bell curve of adoption.

The trick would be to call back to the alleged originator before fully accepting the email, and to issue some sort of rejection if the originator does not confirm it.  Sending servers that do not get it and simply retry over and over could be blacklisted; thus the fact that an email did get sent could be apparent to the genuine sender.  In fact, a blacklist system would be a good way of reducing the number of redundant DYST checks.
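The receiving side could be sketched roughly as follows. This is an illustration under my own assumptions: `ask_origin` is a hypothetical stand-in for the network callback to the alleged originator, the failure threshold of three is arbitrary, and the blacklist is keyed on the address of the connecting server rather than on the domain it claims to speak for.

```python
import hashlib
from collections import Counter


class DystReceiver:
    """Receiving-side check with a simple blacklist (hypothetical helper)."""

    def __init__(self, ask_origin, max_failures=3):
        # ask_origin(claimed_host, digest) -> bool stands in for the DYST callback.
        self._ask_origin = ask_origin
        self._max_failures = max_failures
        self._failures = Counter()
        self.blacklist = set()

    def handle(self, peer_ip: str, claimed_host: str, message: bytes) -> str:
        if peer_ip in self.blacklist:
            return "reject"                      # no redundant DYST check needed
        digest = hashlib.sha256(message).hexdigest()
        if self._ask_origin(claimed_host, digest):
            return "accept"
        # The alleged originator denied sending it: reject and count the failure.
        self._failures[peer_ip] += 1
        if self._failures[peer_ip] >= self._max_failures:
            self.blacklist.add(peer_ip)
        return "reject"
```

Once a peer is blacklisted, `handle` rejects without calling back at all, which is where the saving in redundant checks comes from.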

Pressure to change

The pressure from that action would cause more and more sysadmins to upgrade their SMTP or mail software, or switch to a 'joined up' solution that does support it.  After a period of time, most mail servers are seen to be upgraded, and the remainder find it impossible to email many people.

Spammers will still take over Windows 95 boxes in various bedrooms and hijack the user's bona fide account for spamming, but the ISP is going to get early warning now, and the account could be suspended pretty quickly.  That would happen not as a result of a gazillion DYST calls, but because of profiling or counting the outgoing mails, and indeed checking their reply-to or sent-as header entries.  This means the hijacked account can still send spam, but established mechanisms can close it down.

Flaws

There are multiple flaws in the implementation, though.

It means, for one, that I can't run hammant dot org email through ThoughtWorks dot com's email system at moments of convenience. Lots of techies mix and match their incoming and outgoing email. Sorry, but it has to go in the name of spam reduction.

Also, particularly for the larger organizations, the SMTP server that is used for outgoing email is not the same one as (or not connected to) the one used for incoming email from external correspondents. Thus the server being asked might answer "no, I did not send that" to every DYST check. Solvable, of course, with upgrades to the more 'sophisticated' email solutions.

Does anyone want to write an RFC with me? Someone with better academic phrasing than me, who has done one before?

* 2002 when predictions were like this ...

"Jupiter Media Metrix says the average American received 571 spam messages in 2001. By 2006, that same person will receive 1,479"

Update - 12th Sept.

Andy Hedges makes some comments.  Some clarifications from me:

Digest Expense

It is true that MD5 digests are expensive to produce. It could be that the hashing function is less expensive than an MD5 digest, or operates only on the first 100 chars of the message payload.
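To make those options concrete, here is a small sketch comparing a full MD5 digest, an MD5 over only the first 100 bytes, and a cheaper non-cryptographic checksum. The function names are mine, and CRC-32 is only an assumed example of a "less expensive" function; nothing here is a proposed standard.

```python
import hashlib
import zlib


def full_md5(payload: bytes) -> str:
    """Digest over the whole message payload."""
    return hashlib.md5(payload).hexdigest()


def prefix_md5(payload: bytes, prefix_len: int = 100) -> str:
    """Digest over only the first prefix_len bytes, so cost no longer grows with size."""
    return hashlib.md5(payload[:prefix_len]).hexdigest()


def cheap_checksum(payload: bytes) -> str:
    """CRC-32 as an example of a cheaper, non-cryptographic alternative."""
    return format(zlib.crc32(payload), "08x")
```

The trade-off is that two messages sharing the same first 100 bytes get the same `prefix_md5`, and a CRC is easy to collide deliberately, so the cheaper the function, the weaker the confirmation.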

Instead of a hash, one could think of just the message ID that email usually comes with as a 'cost free' token used for DYST confirmation. Here is a sample from my own inbox:

  Message-Id: <4E8F0A7A-DB08-493A-93DA-5DBF49C77F0E@pobox.com>
  Message-Id: <50074EDE-25C5-4855-8D8B-853DBBD2DA21@apache.org>
  Message-Id: <20060905073421.29562.qmail@web82706.mail.mud.yahoo.com>

The first two above look like they are generated by some function. The last looks to map to a design somehow as there is a date in it.  Perhaps the first two map to a design as well, but they are possibly obfuscated.  

If we were to consider the message ID as the thing to be used instead of a digest of the message, then we'd better be sure that no implementation is subject to more elaborate spoofing by spammers as they raise their bar.  Andy did not suggest message-id, but to understand why a digest or some other function over the message body is desirable, we have to be sure that the number/phrase/sequence that DYST exchanges is not spoofable.  Message IDs could, for many systems, be spoofable, like serial numbers.
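A toy illustration of that concern, under assumed names: if the origin server confirms by Message-Id alone, a spammer who guesses or reuses an ID can attach any body to it, whereas a digest is tied to the body itself. The `OriginServer` class and the serial-style ID below are invented for the example.

```python
import hashlib


def body_digest(body: bytes) -> str:
    return hashlib.sha256(body).hexdigest()


class OriginServer:
    """Remembers what it sent, by Message-Id and by body digest (hypothetical)."""

    def __init__(self):
        self.sent_ids = set()
        self.sent_digests = set()

    def send(self, message_id: str, body: bytes) -> None:
        self.sent_ids.add(message_id)
        self.sent_digests.add(body_digest(body))

    def dyst_by_id(self, message_id: str) -> bool:
        return message_id in self.sent_ids

    def dyst_by_digest(self, digest: str) -> bool:
        return digest in self.sent_digests


origin = OriginServer()
origin.send("<1001@example.org>", b"Lunch on Friday?")

# A spammer reuses (or guesses) the serial-style ID with a different body.
forged_id, forged_body = "<1001@example.org>", b"Cheap pills!"
id_check_passes = origin.dyst_by_id(forged_id)
digest_check_passes = origin.dyst_by_digest(body_digest(forged_body))
```

Here the ID-only check is fooled while the digest check is not, which is the reason for preferring some function over the message body.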

One hour wait

> "I have to wait 1 hour for my email! This will be a show stopper for many"

There is no one hour wait. That's only how long the server sending the message keeps the digest (or other token) before discarding it. Most implementations will check in real time, thus the delay to your inbox would be on the order of a second.

Not different to SPF?

They are quite similar. SPF relies on an enhancement to DNS being rolled out across most of the internet, as well as on mail systems being upgraded.  DYST relies on an upgrade to mail servers only.  Considering the OSI seven-layer model, both DNS and DYST are in the 'application' layer. Even though DNS often feels lower down in that model, it is not.  Considering the bell curve of adoption, both could benefit from some early adopters switching off the ability to receive mail from those that have not upgraded.

I feel, perhaps, that because DNS and mail are disassociated, it is more difficult for me to upgrade my systems to be compliant with SPF standards. If Gmail is your email, then you have no problem: they can handle the upgrades for millions at a time.  If you have a domain name like mychildhoodnickname.info and you want to receive email as me@mychildhoodnickname.info, then it could be a challenge finding web hosts and mail hosts (like luxsci.com) that can play together with a fair set of DNS entries hosted somewhere. Unless, of course, you have your own server.

In SPF's favor perhaps is the fact that mails do not have to be individually checked. If the SPF check has been done for the message's combination of alleged sender, and IP address of actual sender it does not need to be done again.
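That saving can be sketched as a simple cache keyed on the pair of alleged sender domain and actual sending IP. The `lookup` callable below is a hypothetical stand-in for a real SPF evaluation, which I am not attempting to reproduce here, and a real deployment would presumably also expire entries as the underlying DNS records change.

```python
class SpfResultCache:
    """Remembers SPF verdicts per (sender domain, sending IP) pair (hypothetical)."""

    def __init__(self, lookup):
        # lookup(domain, ip) -> bool is an assumed placeholder for the real SPF check.
        self._lookup = lookup
        self._results = {}

    def is_authorised(self, sender_domain: str, sender_ip: str) -> bool:
        key = (sender_domain.lower(), sender_ip)
        if key not in self._results:
            # Only the first message for this pair pays for the check.
            self._results[key] = self._lookup(*key)
        return self._results[key]
```

By contrast, a DYST check is per message, because the digest differs for every email.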

In summary, DYST is not a revolutionary spam-busting technique, just one proposed in the same timeframe as SPF. Andy's article is otherwise well written and thought-provoking.

Published

July 27th, 2006