FamilyGreenberg.com


Last Updated
03/06/2008 10:58 AM

Solving the Spam Problem - A New Approach (1/10/03)

(By the way, it should go without saying, but these opinions are mine & only mine.)

- Solving the Spam Problem - A New Approach -

In yet another example of a real-world phenomenon re-establishing itself in cyberspace, people are now realizing that the junk mail that fills their physical mailboxes is filling their e-mail boxes with equal voracity. In a recent InternetWeek article, Mitch Wagner discusses the subject with his friend, Barry Shein, a man with perhaps the greatest professional title ever: President of The World ("The World" is a 20-person Internet Service Provider in Boston). Barry says, "Spam is a thousand times more horrible than you can ever imagine. . . [it's] taking down the entire Internet." Others argue that spam is no more likely to take down the Internet than junk mail is likely to take down the U.S. Post Office. What follows is a brief analysis of how we got into this mess, how we're going about solving it, and why a new approach may be in order.

The Problem

Bruce Schneier, founder and CTO of Counterpane Internet Security, is quoted in the InternetWeek article with a great one-sentence summary of the spam problem: "The incremental cost of sending one more piece of e-mail is free . . . because the cost of sending a million pieces of e-mail is essentially the same as sending a dozen."

This business model is unique to spam. In every other form of marketing (even web-based advertising) there is a direct, identifiable cost to communicate your message. This is important, because it forces marketers to target their demographic and optimize their spend. Without this control mechanism, the cost of figuring out whom to communicate with is, by definition, higher than the cost of simply communicating with everyone.

To illustrate, consider the characteristics of the ideal direct mail campaign: Mailings are sent to everyone in the world who wants the product at the maximum price they're willing to pay. There is a 100% response rate, since those who received the mailing want the product at the offered price, and those who don't want the product didn't receive the mailing at all. Now consider the ideal spam campaign: e-mail is sent to every e-mail address on the planet on a regular basis. Everyone who wants the product at the offered price buys it, and everyone who doesn't want it ignores it. The response rate is just above 0%, but the spammer couldn't care less. Why spend time and money winnowing the list when all the resulting non-responses are free?
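
The economics above can be put into a rough sketch. Every figure here (the margin per sale, the postage, the response rates, the list sizes) is a made-up assumption chosen only to show the shape of the argument, and campaign_profit is a hypothetical helper, not anyone's real model.

```python
def campaign_profit(recipients, response_rate, margin_per_sale, cost_per_message):
    """Profit = margin earned from responders minus the cost of sending."""
    sales = recipients * response_rate
    return sales * margin_per_sale - recipients * cost_per_message


# Hypothetical figures, for illustration only.
MARGIN = 20.00      # dollars earned per sale
POSTAGE = 0.50      # dollars to send one piece of direct mail
EMAIL_COST = 0.0    # marginal cost of sending one more e-mail

# Direct mail: an untargeted blast versus a carefully targeted list.
blast_mail = campaign_profit(1_000_000, 0.0001, MARGIN, POSTAGE)
targeted_mail = campaign_profit(1_000, 0.10, MARGIN, POSTAGE)

# Spam: the same untargeted blast, but every message is free to send.
blast_spam = campaign_profit(1_000_000, 0.0001, MARGIN, EMAIL_COST)

print("untargeted direct mail:", round(blast_mail, 2))
print("targeted direct mail:  ", round(targeted_mail, 2))
print("untargeted spam:       ", round(blast_spam, 2))
```

Under these assumptions the untargeted postal blast loses money, the targeted mailing makes money, and the untargeted e-mail blast makes money without any targeting at all. That is the whole problem in three lines of output.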

The Current Solution - Viruses, Part Two

There are some interesting parallels between the spam problem and the other bane of the cyber-community - viruses.

Spammers and hackers both eat up CPU cycles, bandwidth and storage within our IT infrastructures. They both divert the attention of our IT staffs from productive, value-added activities toward damage control and preventative maintenance. Finally, they both create detours for the end-users, who must pause on their way to wherever they're going to deal with a virus or a mailbox full of spam.

Ironically, our response to the two problems also eats up infrastructure capacity and human cycles. We build filters to weed out the viruses/spam. The hackers/spammers modify their approach to bypass our filters, and we counter with enhanced filters. This challenges the hackers/spammers yet again, and the process repeats. This all-too-familiar vicious cycle may enhance the end-user experience, but it also feeds the virus/spam resource parasite. In fact, we've created quite the cottage industry around anti-virus technology, and are rapidly on our way to doing the same for anti-spam.

Despite the similar response, the two problems actually have some very distinct differences. Anyone with intermediate programming skills and an internet connection can create a virus, and identifying that individual before the virus is unleashed is essentially impossible. There are too many legitimate applications out there, and the ones that are viruses have little in the way of distinguishing characteristics until they're released into the network. This leaves us with the reactive, inefficient approach of filtering.

Spammers, on the other hand, require the use of a mail server to do their dirty work. What's more, the behavioral patterns of spammers are distinctly different from those of the average e-mail user. This provides an opportunity that I believe hasn't been fully explored.

A New Approach

Rather than reacting to a bandwidth hog by hogging more bandwidth (as we're forced to do with viruses), what if we attacked the spam problem by attacking its source? If it were possible to reduce or eliminate the creation of spam, we could save the time and money we currently spend fighting spam, as well as the time and money that the spam itself is costing us in terms of technical and human capital.

The key here is to eliminate the unique "zero marginal cost" characteristic of spam by defining it as a separate and distinct service from e-mail, and charging different rates for it accordingly. After all, the U.S. Post Office currently identifies commercial mailings as a separate product and charges differently for them. If we wanted to reduce junk mail in America, which would we try first: asking the post office to increase the price of commercial mail, or installing "junk mail filters" on each and every mailbox in America in an attempt to prevent the mail carrier from depositing the offending letter at its destination? And if we did select the latter for whatever reason, wouldn't the post office complain (as IT managers are currently complaining) about all the wasted bandwidth involved in receiving, sorting, and eventually destroying the millions of undeliverable letters?

So, this leaves us with the challenge of identifying spam as a unique service. For that, we turn to the cyber-equivalent of the post office, the ISP. Since I don't run an ISP, I'm lacking in real data, but I bet the following estimate is directionally correct:

E-mails Sent per User, per Month	Share of Users
0 - 50	20%
51 - 500	40%
501 - 10,000	30%
10,001 - 50,000	0%
50,001 +	10%

The first group (0 - 50) are the casual internet users. They only have an ISP account in the first place because their kids or grandkids told them to get one, and the only e-mail they ever receive is from a select few family members or friends. Many don't know how to respond to the e-mail they do receive, and those who do rarely bother. On a busy day, they may send four or five e-mails, and conservatively average 1-2 per day (or about 50 per month).

The second group is the more typical "home" internet user. This person has a longer address book (maybe 100-200 names), and corresponds regularly with various friends and family members. They check their home e-mail accounts once a day or so, and typically have 20-30 e-mails to read each time. Of these, they reply to about half of them, and generate a few self-initiated e-mails as well, for a daily average of 15-20 (or 500 per month).

The third group is the IT professional set, who use their e-mail account as their primary means of correspondence for business and pleasure throughout the day. They use e-mail more than they use the telephone, and may send and receive a few hundred e-mails each day. The busiest among them may send an average of one e-mail every five minutes (throughout the 24-hour day), for a total of just under 10,000 e-mails per month.

The fifth group (skipping the fourth group for a minute) are the spammers. Since it would be almost physically impossible to average 50,000+ e-mails per month (that's more than one e-mail per minute throughout the 24-hour day, with no time for eating, sleeping, or heck - even writing the e-mails), we can surmise that these e-mails are being sent by a machine rather than a person. Voila! Our commercial/bulk mailers, a.k.a. the spammers.
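
The send rates used for these groups are easy to sanity-check. This is a quick sketch that assumes a 30-day month and a sender who never stops; the function and constant names are made up for the example.

```python
MINUTES_PER_DAY = 24 * 60
DAYS_PER_MONTH = 30  # assumption: a 30-day month


def emails_per_month(minutes_between_emails):
    """Monthly total for someone sending one e-mail every N minutes, nonstop."""
    per_day = MINUTES_PER_DAY // minutes_between_emails
    return per_day * DAYS_PER_MONTH


every_five_minutes = emails_per_month(5)  # 288 per day -> 8,640 per month
every_minute = emails_per_month(1)        # 1,440 per day -> 43,200 per month

print(every_five_minutes, every_minute)
```

One e-mail every five minutes, around the clock, comes to 8,640 per month, a little under the 10,000 ceiling of the third group. One e-mail per minute, around the clock, comes to 43,200, which still falls short of 50,000.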

A few notes about this (theoretical) data: First, the presence of the fourth group (the one that contains 0% of the users) is critical to identifying spammers. Essentially, we are theorizing that there is a wide chasm between the most active e-mail user and the most casual spammer. It is this chasm that would allow ISPs to impose a marginal cost (e.g., a per-email charge for all users that send more than 30,000 e-mails per month in the above example) for spammers, without affecting even the most intense personal e-mail user.
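
As a rough sketch of what such a bill might look like: the flat fee and the per-e-mail charge below are invented numbers, and I'm assuming the charge applies only to e-mails beyond the threshold (it could just as easily apply to every e-mail once the threshold is crossed). Amounts are kept in whole cents to avoid rounding surprises.

```python
FLAT_MONTHLY_RATE_CENTS = 2000   # hypothetical flat fee ($20.00)
BULK_THRESHOLD = 30_000          # e-mails per month, from the example above
PER_EMAIL_CHARGE_CENTS = 1       # hypothetical charge per e-mail over the threshold


def monthly_bill_cents(emails_sent):
    """Flat rate for everyone, plus a per-e-mail charge above the threshold."""
    extra_emails = max(0, emails_sent - BULK_THRESHOLD)
    return FLAT_MONTHLY_RATE_CENTS + extra_emails * PER_EMAIL_CHARGE_CENTS


casual_user = monthly_bill_cents(50)          # flat rate only
power_user = monthly_bill_cents(10_000)       # flat rate only
spammer = monthly_bill_cents(1_000_000)       # flat rate + 970,000 chargeable e-mails

print(casual_user, power_user, spammer)
```

With these made-up rates, the grandparent and the IT professional pay exactly the same $20.00, while a million-message campaign costs $9,720.00. The spammer finally has a reason to trim the list.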

Second, the numbers and percentages in the above chart are not nearly as important as the presence of the distinct strata. If I've under-estimated e-mail usage, and a particular ISP finds the fourth group to reside between 30,000 and 75,000 e-mails per month, then so be it - the opportunity still exists to price the two products differently: a flat, monthly rate for personal e-mail users (as exists today), and a per-email charge for spammers (as exists for just about every other form of commercial advertising).
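
An ISP could test the "distinct strata" idea against its own logs by bucketing each account's monthly total and looking for an empty band. Here is a minimal sketch; the bucket edges come from the chart above, but the sample volumes are invented, and usage_distribution is a hypothetical helper rather than anything an ISP actually runs.

```python
from bisect import bisect_left

# Upper bound of each bucket; the last bucket is open-ended.
BUCKET_UPPER_BOUNDS = [50, 500, 10_000, 50_000]
BUCKET_LABELS = ["0 - 50", "51 - 500", "501 - 10,000", "10,001 - 50,000", "50,001 +"]


def usage_distribution(monthly_volumes):
    """Return the percentage of accounts that fall into each bucket."""
    if not monthly_volumes:
        raise ValueError("need at least one account")
    counts = [0] * len(BUCKET_LABELS)
    for volume in monthly_volumes:
        # bisect_left puts a volume equal to an upper bound in that bucket.
        counts[bisect_left(BUCKET_UPPER_BOUNDS, volume)] += 1
    total = len(monthly_volumes)
    return {label: 100 * n / total for label, n in zip(BUCKET_LABELS, counts)}


# Invented sample: e-mails sent last month by ten accounts.
sample = [10, 30, 120, 300, 400, 450, 2_000, 7_500, 9_000, 600_000]
distribution = usage_distribution(sample)
empty_buckets = [label for label, pct in distribution.items() if pct == 0]

for label, pct in distribution.items():
    print(label, pct)
```

If a bucket (or a run of buckets) comes back empty between the heaviest human users and the lightest bulk mailers, that is where the per-e-mail threshold belongs. If nothing comes back empty, the bucket edges need adjusting before drawing any conclusions.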

Third, this strategy requires participation at a critical mass to be effective. If a single ISP were to implement such a pricing structure, spammers would simply move to another ISP. This would do little for the ISP except cost them a few accounts, as they would still have to filter spam from other ISPs, respond to end user complaints, and so on. To succeed, an industry standard must develop. ISPs could advertise themselves as "spam fighters," generating goodwill amongst customers who are particularly sensitive to spam. If a few of the major vendors were to come on board (e.g., Yahoo, AOL, MSN), spam could very quickly cease to be an anomaly in the marketing world, and spammers, like other marketers, would be forced to more carefully consider their message and their audience before embarking on a campaign.

Fourth, the "vicious cycle" aspect of technology does not go away in this framework. There would be a nominal infrastructure cost for the ISPs to install such a billing plan (tracking number of e-mails sent per user, as opposed to simply billing everyone a common flat rate). Privacy advocates would likely be watching, so there would need to be sufficient comfort that billing based on the monthly total of e-mails sent did not violate any previously sacred personal space. Also, spammers would likely find ways to beat the system (e.g., sending 29,999 e-mails per month from each of 1,000 e-mail accounts), which would necessitate more complex rules (e.g., surcharges for more than 100 accounts with a given billing address).
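
The account-splitting dodge and the billing-address rule can be sketched like this. The 100-account limit and the 30,000 threshold come from the examples above; the second check (adding up volume across all accounts at one address) is an extra assumption of this sketch, and the addresses and function name are invented.

```python
from collections import defaultdict

BULK_THRESHOLD = 30_000          # e-mails per month, as in the example above
MAX_ACCOUNTS_PER_ADDRESS = 100   # from the surcharge example above


def flag_billing_addresses(accounts):
    """accounts: iterable of (billing_address, emails_sent), one pair per account.

    Returns a dict mapping each suspicious billing address to a reason.
    """
    total_sent = defaultdict(int)
    account_count = defaultdict(int)
    for address, emails_sent in accounts:
        total_sent[address] += emails_sent
        account_count[address] += 1

    flagged = {}
    for address in total_sent:
        if account_count[address] > MAX_ACCOUNTS_PER_ADDRESS:
            flagged[address] = "too many accounts"
        elif total_sent[address] > BULK_THRESHOLD:
            # Assumption of this sketch: volume is summed per billing address.
            flagged[address] = "combined volume over threshold"
    return flagged


# A spammer splitting mail across 1,000 accounts, and an ordinary household.
accounts = [("1 Spam Way", 29_999)] * 1_000 + [("22 Elm St", 40), ("22 Elm St", 300)]
flagged = flag_billing_addresses(accounts)
print(flagged)
```

Every one of the spammer's accounts stays under the per-account threshold, but the billing address gives the game away. Of course, the next move is 1,000 different billing addresses, which is exactly the cycle described above.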

The solution is not without its costs, but if The President of the World is really spending 30% of his staff expenses and four hours per day of his personal time on spam, then it's likely a cost he's willing to absorb.