Synopsis
Spam affects everyone whose email address is published on the
web. It also affects many who have gone to some lengths to keep their
address private. It's getting worse, and those who are making money
from it are innovating rapidly to defeat the mail filters that leverage
the theories of Reverend Bayes. It's a war.
Rather than make email (and the ubiquitous @ sign) obsolete, or charge
for email, perhaps more schemes like the Sender Policy Framework (SPF)
idea need to be rolled out. SPF has flaws, by the way. Amongst other
things, it is hard to push the rollout of SPF ... early adopters can see no benefit.
Many years ago (2002*), on the mailing list for Apache's Java Mail Server (JAMES),
an extension to SMTP was proposed and discussed. The idea: there
needed to be a phased update of mail servers to a new standard whereby
the origin of emails could be confirmed. "Did You Send This"
(DYST) was born...
Details
Your SMTP server caches a digest of each email you send for an hour or so.
The destination SMTP server, before passing the mail to the
recipient's inbox, calls back to your SMTP server and asks "Did you
send this?" in respect of the digest of the message it has just
received. It actually does the HELO business first, then issues a DYST
command, which would be new to the SMTP command set.
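The exchange above can be sketched roughly as follows. This is only an illustration: it assumes SHA-256 as the digest, and uses in-memory Python objects to stand in for the two SMTP servers. The names `SendingServer`, `did_you_send_this` and `accept` are hypothetical; no DYST command exists in SMTP today.

```python
import hashlib
import time
from typing import Dict, Optional


def digest(message: bytes) -> str:
    """Digest of the raw message (SHA-256 is an assumption, not part of the proposal)."""
    return hashlib.sha256(message).hexdigest()


class SendingServer:
    """Hypothetical sending side: remembers digests of recently sent mail."""

    def __init__(self, ttl_seconds: float = 3600.0) -> None:
        self.ttl_seconds = ttl_seconds
        self._sent: Dict[str, float] = {}  # digest -> time the message was sent

    def send(self, message: bytes, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._sent[digest(message)] = now

    def did_you_send_this(self, message_digest: str, now: Optional[float] = None) -> bool:
        """Answer the (hypothetical) DYST query for one digest."""
        now = time.time() if now is None else now
        sent_at = self._sent.get(message_digest)
        if sent_at is None:
            return False
        if now - sent_at > self.ttl_seconds:
            del self._sent[message_digest]  # expired: forget it
            return False
        return True


def accept(message: bytes, alleged_origin: SendingServer, now: Optional[float] = None) -> bool:
    """Receiving side: call back to the alleged origin before delivering to the inbox."""
    return alleged_origin.did_you_send_this(digest(message), now)
```

In a real deployment the call inside `accept` would be a network round trip to the alleged originating server rather than a method call.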
It would involve a new IETF RFC and a wide-scale upgrade of SMTP
software over a period of time. You could consider it to be a
phased implementation, though. Some universities roll it out
first and, after a period of warning about or marking incoming email,
they turn off support for mail originating from servers that do not
support DYST. Hotmail, Yahoo (etc.) upgrade their servers to
support DYST but do not turn off support on the same schedule as
the universities would. Corporate leviathans would be last: the
laggards in the bell curve of adoption.
The trick would be to call back to the alleged originator before fully
accepting the email, and to issue some sort of rejection if the
originator does not confirm it. Sending servers that do not get it and
simply retry over and over could be blacklisted; thus the fact that an
email did get sent could be apparent to the genuine sender. In fact, a
blacklist system would be a good way of reducing the number of
redundant DYST checks.
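One way to picture how a blacklist could cut down on redundant DYST checks is below. It is a sketch under my own assumptions: the threshold of three failures, the `DystBlacklist` class and the `dyst_check` callable (standing in for the real callback) are all hypothetical.

```python
from typing import Callable, Dict


class DystBlacklist:
    """Counts failed DYST checks per sending server (threshold is an assumption)."""

    def __init__(self, max_failures: int = 3) -> None:
        self.max_failures = max_failures
        self._failures: Dict[str, int] = {}

    def record_failure(self, server: str) -> None:
        self._failures[server] = self._failures.get(server, 0) + 1

    def is_blacklisted(self, server: str) -> bool:
        return self._failures.get(server, 0) >= self.max_failures


def should_accept(
    server: str,
    message_digest: str,
    blacklist: DystBlacklist,
    dyst_check: Callable[[str, str], bool],
) -> bool:
    """Reject blacklisted servers without calling back; otherwise do the DYST check."""
    if blacklist.is_blacklisted(server):
        return False  # no callback needed, which is the saving
    if dyst_check(server, message_digest):
        return True
    blacklist.record_failure(server)
    return False
```

Once a server has failed often enough, later mail from it is refused without any callback at all.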
Pressure to change
The pressure from that action would cause more and more sysadmins to
upgrade their SMTP or mail software, or switch to a 'joined up'
solution that did support it. After a period of time, most mail
servers are seen to be upgraded, and the remainder find it impossible
to email many people.
Spammers will still take over Windows 95 boxes in various bedrooms and
hijack the user's bona fide account for spamming, but the ISP is going
to get early warning now, and the account could be suspended pretty
quickly. Not as a result of a gazillion DYST calls, but
because of profiling/counting the outgoing mails, and indeed checking their
reply-to or sent-as header entries. This means the hijacked
account can still send spam, but established mechanisms can close it
down.
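The "counting the outgoing mails" idea might look something like the sketch below. The limits (100 messages per hour) and the `OutgoingMailMonitor` class are my own illustrative assumptions, not anything an ISP is known to use.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class OutgoingMailMonitor:
    """Sliding-window count of outgoing mail per account (limits are assumptions)."""

    def __init__(self, max_messages: int = 100, window_seconds: float = 3600.0) -> None:
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._sent_times: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, account: str, timestamp: float) -> bool:
        """Record one outgoing mail; return True if the account now looks suspicious."""
        times = self._sent_times[account]
        times.append(timestamp)
        # Drop sends that have fallen out of the window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) > self.max_messages
```

An ISP could suspend, or at least review, any account for which `record` returns True.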
Flaws
There are multiple flaws in the implementation, though...
It means, for one, that I can't run hammant dot org email through thoughtworks
dot com's email system at moments of convenience. Lots of techies mix
and match their incoming and outgoing email. Sorry, but it has to go in
the name of spam reduction.
Also, particularly for the larger organizations, the SMTP server that
is used for outgoing email is not the same one as (or not connected to)
the one used for incoming email from external correspondents. Thus the
answer to every DYST check might come back as "no, I did not send that".
Solvable, of course, with upgrades to the more 'sophisticated' email
solutions.
Does anyone want to write an RFC with me? Someone with better academic phrasing than me, who has done one before?
* 2002, when predictions were like this ...
"Jupiter Media Metrix says the average American received 571 spam messages in
2001. By 2006, that same person will receive 1,479"
Update - 12th Sept.
Andy Hedges makes some comments. Some clarifications from me:
Digest Expense
It is true that MD5 digests are expensive to produce. It could be that
the hashing function is less expensive than an MD5 digest, or only
operates on the first 100 chars of the message payload.
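As a rough illustration of the cheaper-token idea, here is a sketch that hashes only the first 100 bytes of the payload. CRC32 is my own stand-in for "something cheaper than MD5"; it is easy to forge, so treat this as a picture of the cost trade-off only. The `cheap_token` name is hypothetical.

```python
import zlib


def cheap_token(payload: bytes, prefix_len: int = 100) -> str:
    """Checksum over only the first `prefix_len` bytes of the message payload."""
    return format(zlib.crc32(payload[:prefix_len]), "08x")
```

Note the obvious weakness: two messages that share their first 100 bytes get the same token.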
Instead of a hash, one could think of just the message ID that email
usually comes with as a 'cost free' token used for DYST confirmation.
Here are some samples from my own inbox:
Message-Id: <4E8F0A7A-DB08-493A-93DA-5DBF49C77F0E@pobox.com>
Message-Id: <50074EDE-25C5-4855-8D8B-853DBBD2DA21@apache.org>
Message-Id: <20060905073421.29562.qmail@web82706.mail.mud.yahoo.com>
The first two above look like they are generated by some function. The last
looks to map to a design somehow as there is a date in it.
Perhaps the first two map to a design as well, but they are possibly obfuscated.
If we were to consider message ID as the thing to be used instead of a
digest of the message, then we'd better be sure that all implementations
are not subject to more elaborate spoofing by spammers as they
raise their game. Andy did not suggest message-id, but to
understand why a digest or some other function over the message body is
desirable, we have to be sure that the number/phrase/sequence that
DYST exchanges is not spoofable. Message ids could, for many
systems, be spoofable - like serial numbers.
One hour wait
> "I have to wait 1 hour for my email! This will be a show stopper for many"
There is no one hour wait. That's only how long the server sending the message keeps
the digest (or other token) before discarding it. Most implementations will
check in realtime, thus the delay to your inbox would be on the order of a second.
Not different to SPF?
They are quite similar. SPF relies on an enhancement to DNS being rolled
out across most of the internet, as well as mail systems being upgraded.
DYST relies on an upgrade to mail servers only. Considering
the ISO seven layer model,
both DNS and DYST are in the 'application' layer. Even though DNS often
feels lower down in that model, it is not. Considering the
bell curve of adoption,
both could benefit from some early adopters switching off the ability
to receive mail from those that have not upgraded.
I feel, perhaps, that because DNS and mail are disassociated, it is
more difficult for me to upgrade my systems to be compliant with SPF
standards. If gmail is your email, then you have no problem - they can
handle the upgrades for millions at a time. If you have a domain
name like mychildhoodnickname.info and you want to receive email as
me@mychildhoodnickname.info, then it could be a challenge finding web
hosts and mail hosts (like
luxsci.com) that can play together with a fair set of DNS entries hosted somewhere. Unless, of course, you have your own server.
In SPF's favor perhaps is the fact that mails do not have to be
individually checked. If the SPF check has been done for the message's
combination of alleged sender, and IP address of actual sender it does
not need to be done again.
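The "does not need to be done again" point amounts to caching results per (alleged sender, IP address) pair, roughly as below. The `lookup` callable is a hypothetical stand-in for a real SPF evaluation, and this sketch ignores expiry; a real cache would presumably need to honour DNS record lifetimes.

```python
from typing import Callable, Dict, Tuple


class SpfResultCache:
    """Remembers SPF verdicts per (sender domain, client IP) pair."""

    def __init__(self, lookup: Callable[[str, str], bool]) -> None:
        self._lookup = lookup  # hypothetical: performs the actual SPF check
        self._results: Dict[Tuple[str, str], bool] = {}

    def check(self, sender_domain: str, client_ip: str) -> bool:
        key = (sender_domain.lower(), client_ip)
        if key not in self._results:
            # Only the first message for this pair pays for the lookup.
            self._results[key] = self._lookup(sender_domain, client_ip)
        return self._results[key]
```

DYST has no equivalent shortcut for genuine mail, since each digest is specific to one message.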
In summary, DYST is not a revolutionary spam busting technique, just one
proposed in the same timeframe as SPF. Andy's article is otherwise well
written and thought provoking.