Welcome to the Spam-O-Matic !
What's a Spam-O-Matic ?
What's that, you say ? Spam-O-Matic ? What's that ?
The Spam-O-Matic is a product of Spam-O-Matic Inc.
It is an artificially intelligent electronic internet mail filtering system, designed to reduce the amount of unwanted e-mail clogging up the internet these days. Some people see literally thousands of unwanted "junk mails" clogging their inbox daily, making internet e-mail nearly useless for them. Spam-O-Matic was designed because I had that problem and needed a foolproof system, or as close to one as one can get. As far as we know, no other product on the market offers the same reliability, effectiveness, and ease of use in an enterprise-wide, server-based system, either as an appliance or as an installable program compatible with any known internet mail server.
Why we wrote it.
We think it's possible to stop spam, and that content-based filters are the way to do it. The Achilles heel of the spammers is their message. They can circumvent any other barrier you set up. They have so far, at least. But they have to deliver their message, whatever it is. If we can write software that recognizes their messages, there is no way they can get around that.
How it works.
OK. How does it work ? What do I need to do ?
Spam-O-Matic works by examining your e-mail before it gets to you, performing a probability analysis, and tagging your e-mail with a token, and possibly a "tag" to aid you in quickly dispatching unwanted junk.
The token looks like this ....
<SoM7GXDi0>
though the actual letters and numbers will be different. The token is appended to the end of the subject line of every e-mail you receive. If Spam-O-Matic determines that you would probably rather not see this particular bit of earth-shattering new product ( or whatever ), then Spam-O-Matic will also prepend a REJECT tag at the front of the subject line. This allows you to very simply set up your normal e-mail filtering or sorting rules to look for the REJECT tag, and file rejects appropriately.
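As a rough illustration of such a sorting rule, here is a minimal Python sketch. It assumes the REJECT tag is the first word of the subject and that tokens always have the shape seen in the examples in this guide ( "<SoM" followed by letters and digits, then ">" ). Both are guesses from the examples, not a published format, and the folder names are made up.

```python
import re

# Assumed token shape, inferred only from the examples in this guide.
TOKEN_RE = re.compile(r"(<SoM[0-9A-Za-z]+>)\s*$")


def extract_token(subject):
    """Return the token at the end of the subject line, or None if absent."""
    match = TOKEN_RE.search(subject)
    return match.group(1) if match else None


def sort_folder(subject):
    """Pick a folder for a message based on the tag at the front of its subject."""
    # Assumption: the REJECT tag appears as the first word of the subject.
    if subject.lstrip().startswith("REJECT"):
        return "Rejects"  # hypothetical folder name
    return "Inbox"        # hypothetical folder name
```

Most mail programs can express the same rule directly ( "subject begins with REJECT, move to folder" ) without any code.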
Artificial Intelligence ?
OK. You mention words like "artificial intelligence", "train" and "learn".
Do I have to teach this thing ?
Yes, in a word, you do.
Spam-O-Matic works very much like a secretary screening all of your e-mail.
Except that Spam-O-Matic doesn't take coffee breaks, lunch, or quit for the day, and doesn't suffer stress, fatigue, or lack of days off. Like a new secretary, it will need to learn YOUR routines, and YOUR e-mail habits.
Spam-O-Matic tailors itself to YOU and the types of e-mail messages YOU receive. It does not depend on other outside influences, black-hole servers, or even the rest of the people using the same filter. Spam-O-Matic does, in fact, learn what YOUR e-mail looks like, and tailors its filtering uniquely to you. In order to achieve this, it is necessary to teach it what you want, and what you don't want. Like any child, it may make mistakes, and it may be necessary to spank it once in a while, by sending back any mistakes it may make for corrective action. This is another reason Spam-O-Matic is not just another "spam filter" like so many. It custom tailors itself to YOUR e-mail. What YOU want, and what YOU don't want, based on YOUR real world experience.
How it learns YOUR preferences.
Spam-O-Matic requires an initial 8 day training period, to learn your real-world e-mail habits. It will appear to be doing nothing except adding a token to your mail for the first week or two AFTER you initially introduce yourself by sending back the first message you want the filters to learn, either as a keeper, or as a reject ( spam. )
It is ESSENTIAL that Spam-O-Matic initially learn what the first several good e-mails look like, in order to avoid falsely rejecting good messages.
In order to quickly make the filters very effective, manually forward ( individual ) copies of the first 100 good e-mails you receive to the keep address. NOTE: Do NOT "batch" a bunch together, especially if you are using Microsoft. These will be learned as good mail IMMEDIATELY. ( this requirement will be eliminated in a very near future version ) Once this is done, any junk sent to the spam address will be trained at the earliest training run ( as early as within a few hours. ) Note that NO spam is trained within the first 8 days to give the filters a chance to see what good e-mail "looks like" AND to give YOU a chance to make sure that anything not sent to spam, is in fact, good and wanted e-mail.
If you never send anything to it, it will not "wake up" to your existence, and your e-mail will remain totally unfiltered as it likely is now. Once you send any message to Spam-O-Matic it will begin accumulating a cache of samples of your e-mail. We are almost fanatic about not throwing away good messages, so Spam-O-Matic assumes by default that you want everything sent to you. If this is correct, and the e-mail you get is all good and wanted mail, you need do nothing further. Spam-O-Matic will constantly update its database with all of the mail you receive and do not send back as examples of what is good and wanted. When you get mail that you do not want, and it is not tagged as a REJECT, simply send it to the spam training e-mail address ( see your network administrator ) and Spam-O-Matic will learn this as an example of what you do NOT want.
Resetting the filtering engine.
I've changed my mind, or made some bad mistakes, and sent several of the wrong messages to the filter. What can I do ?
Spam-O-Matic does provide a way through the web interface to re-set the engine to its original state, where it knows nothing about you, and allows you to start over. All of the training you may have done is lost, and the filter is returned to a factory shipped state.
Tags added to mail
As Spam-O-Matic learns ( and it learns quickly ) it will begin to tag mail that is probably a reject, as a REJECT. If it is, you should do nothing. From this point on, all you need do, is to send back, either to spam or to keep, messages that were incorrectly tagged, or incorrectly not tagged.
Spam-O-Matic may tag a message as QUESTIONABLE. This means that Spam-O-Matic suspects the message is a REJECT, but is not certain enough to actually REJECT it. Please send back all QUESTIONABLE messages, either to spam ( as a reject sample ) or to keep ( as a good message example ), as this will refine the filters and make Spam-O-Matic more effective.
A Spam-O-Matic token.
Spam-O-Matic appends a token to the end of each message's subject line that looks like this:
<SoM7GXDi0>
White and Black lists.
Spam-O-Matic supports both WHITE and BLACK lists. A whitelist entry is an address or characteristic of a message that you consider to be so important that you do not want it rejected, or questioned regardless of what it is, especially if it is a virus !!! It comes from a source so trusted that you always want it, regardless, no matter what. It's also so wanted and above reproach that there is a certainty that this message should be used as an automatic training example of what you DO want to get.
A blacklist entry ( Spam-O-Matic comes with some defaults of known spammers ) is the opposite: an address or characteristic of a message that you are so sure should be REJECTED, even if it is your paycheck, that it should be dismissed with extreme prejudice and used as a training example of what you do not want.
In order to help you in the event a blacklist entry becomes "wrong" for some reason ( more on this below ) these are specially tagged as BLACK-listed at the beginning of the subject line.
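The way the lists override the content filter can be pictured with a small Python sketch. This is only an illustration: it models list entries as plain sender addresses, and it assumes the whitelist is checked before the blacklist, which this guide does not actually state.

```python
def classify(sender, whitelist, blacklist, content_tag):
    """Return (subject_tag, auto_train_as) for one incoming message.

    content_tag is whatever the content filter would have chosen on its own:
    "", "QUESTIONABLE" or "REJECT".
    auto_train_as is "keep", "spam", or None when the lists do not apply.
    """
    address = sender.lower()
    # Assumption: the whitelist wins if an address is somehow on both lists.
    if address in {entry.lower() for entry in whitelist}:
        return ("", "keep")              # never tagged, trained as wanted mail
    if address in {entry.lower() for entry in blacklist}:
        return ("BLACK-listed", "spam")  # tagged, trained as unwanted mail
    return (content_tag, None)           # normal probability analysis applies
```

Note how a whitelisted sender bypasses the content verdict entirely, which is why a hijacked whitelisted address is so troublesome.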
So typically, this is what happens....
You receive e-mail.
Normally, you need do nothing with mail you DO want to see, and it will be used to update the knowledge base so that Spam-O-Matic will recognize what you consider to be good mail.
Any mail that is correctly tagged as REJECT, you need do nothing with, except to delete it. Spam-O-Matic has correctly assessed that there is a very high probability that this is the kind of message you do not want to see. We recommend that you review the REJECT messages for the first few months, to be sure that there are no mistakes. Probably there will be few, if any, but Spam-O-Matic Inc. assumes no liability for a missed correction on your part, or for not teaching it that it made a mistake.
As the system learns ( and it learns very quickly ) you will find more and more that you need do nothing at all, except send the rare but occasional example of new kinds of junk back to spam. It won't be long before the ONLY action you need take, is to send the new examples of junk as the spammers try to circumvent the filters, back to spam. Spam-O-Matic will then stay as current as the newest spam becomes.
About standard X-Headers
If you are using any of the Microsoft mailers, Outlook or Outlook Express,
any version, this does not apply to you. Microsoft mailers strip away all standard X-headers
before displaying messages.
Spam-O-Matic adds certain X- headers to all mail messages received through it.
This is normal.
Some of them are used internally by the Spam-O-Matic engine itself, and some
are merely informative.
You certainly can filter and / or sort on them, if your system will allow you to
do so, and this will be more reliable than what may ( or may not ) appear in Subject: lines.
The possible standard headers that Spam-O-Matic may add are :
If a message is tagged as REJECT in the subject line, there is a very high probability that this is a message, and a "kind" of message that you don't want to see. You can filter in whatever mailer you use, to take any message with REJECT in the subject line, and do what you will with it. We HIGHLY recommend moving it to a REJECT folder or directory, and reviewing them to be sure they are correctly tagged. There is a very slight chance that a mistake might be made.
Mail that Spam-O-Matic has learned is the "kind" of mail that you DO want, will have no tag, and will look and act just like your e-mail did before employing Spam-O-Matic, except that it will have a token appended to the subject line. If Spam-O-Matic has made an error, and allowed junk through, you simply forward this message to spam, and Spam-O-Matic does the rest.
Mail that the Spam-O-Matic isn't quite sure about, or is pretty sure but not quite enough to risk rejecting a false-positive, will be tagged QUESTIONABLE. Questionable messages are NOT scheduled for training as anything, and absolutely should be sent back to keep or spam as appropriate.
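The three outcomes described above can be sketched in Python as a pair of cut-offs on the spam probability. The threshold values and the exact subject layout below are invented for illustration. The real engine's numbers and formatting are not documented here.

```python
# Hypothetical cut-offs, chosen only to illustrate the three outcomes.
REJECT_THRESHOLD = 0.95
QUESTIONABLE_THRESHOLD = 0.70


def tag_subject(subject, token, spam_probability):
    """Return the subject line as the user would see it after filtering."""
    if spam_probability >= REJECT_THRESHOLD:
        prefix = "REJECT "        # very probably unwanted
    elif spam_probability >= QUESTIONABLE_THRESHOLD:
        prefix = "QUESTIONABLE "  # suspicious, but not enough to reject
    else:
        prefix = ""               # looks like wanted mail: token only
    return f"{prefix}{subject} {token}"
```

Keeping a wide QUESTIONABLE band between the two cut-offs is one way to express the bias toward never rejecting good mail.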
False Positives and False Negatives
Um....
What's a "false positive" ?
A false-negative is junk that got through and should not have.
False positives are innocent emails that get mistakenly identified as REJECT. For most users, missing legitimate email is an order of magnitude worse than receiving spam, so a filter that yields false positives is like an acne cure that carries a risk of death to the patient.
The more spam a user gets, the less likely he'll be to notice one innocent mail sitting in his spam folder. And strangely enough, the better your spam filters get, the more dangerous false positives become, because when the filters are really good, users will be more likely to ignore everything they catch.
This is the one failing of Spam-O-Matic that we don't know how to fix, except by breaking Spam-O-Matic to make it less effective, and we don't want to do that !
You should forward all of these to either keep or spam as appropriate, so that the fine-tuning of the mail filtering engine will increase the efficiency of the filter, and at the same time reduce the possibility of falsely rejecting mail that you consider good mail.
When you send junk mail to the filter, it is not trained immediately, but is scheduled for training at the next training run on the filters. This gives you a chance to correct an error, should you accidentally hit the wrong address, or prematurely hit the send button. Simply send that message again, to the keep address, and it will be removed from the schedule, and immediately trained as a keep-er.
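The scheduling behavior described above can be modeled with a short Python sketch. The class and method names are hypothetical, and messages are identified by their token only. It is a simplified picture, not the engine's actual code.

```python
class TrainingSchedule:
    """Toy model: spam waits for the next training run, keep is immediate."""

    def __init__(self):
        self.pending_spam = set()  # tokens scheduled for the next run
        self.trained = {}          # token -> "spam" or "keep"

    def send_to_spam(self, token):
        # Not trained yet, which leaves time to correct a mistake.
        self.pending_spam.add(token)

    def send_to_keep(self, token):
        # Cancels any pending spam training and trains as good mail at once.
        self.pending_spam.discard(token)
        self.trained[token] = "keep"

    def run_training(self):
        # The periodic training run learns everything still scheduled.
        for token in sorted(self.pending_spam):
            self.trained[token] = "spam"
        self.pending_spam.clear()
```

Sending a token to spam and then to keep before the next run leaves it trained as a keeper, with nothing left on the schedule.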
Senders that you have added to the default blacklist ( by sending a copy of a message that they have sent you to blacklist ) will have ALL mail tagged as BLACK-listed, and will further be used to update the filters as to what a reject looks like.
Similarly, senders that you have added to the default whitelist ( by sending a copy of a message that they have sent you to whitelist ) will have ALL mail accepted as good mail, and will further be used to update the filters as to what a good and wanted message looks like. THINK VERY CAREFULLY ABOUT THIS ! These are NOT tagged so if something got through that you think maybe it should not have, it's a good idea to check on your whitelist, and to keep it minimal after the first month or so. The probability analysis will have learned by then, what "kind" of mail your white-listed senders normally send you.
We highly recommend NOT adding mailing lists to your whitelist. I did this, and the list got spammed, creating quite a bit of work for me since ANYTHING whitelisted is auto-trained as good and wanted mail, even if it is not. Besides, the content filtering engine will quickly learn to recognise legit messages from a list, and reliably reject spam if the list gets spammed.
Black or White list corrections
If you make a mistake and add someone to either your whitelist or your blacklist
that you prefer not be there, you can remove them from the respective list
through the web interface, and thereafter they will be treated as any other
e-mail you would receive.
This will be particularly useful in the event ( it does happen, has happened
to me, and is the reason that I am not on the default whitelist ) some spammer
hijacks an e-mail address that is on your whitelist.
REMOVE that entry as soon as possible, and then send all of that junk to
spam, to prevent what was received from being auto-trained as examples
of good e-mail since the sender was on the whitelist.
Note that normally, training takes place 7 to 8 days after a message was originally received, so you have 6 or 7 days at most to correct this, should it happen.
How will Spam-O-Matic communicate with me ?
Normal operations produce no messages to you, but
it's possible you will get e-mail from Spam-O-Matic.
"Scheduled for Training as" and "Trained as"
When you get a message from Spam-O-Matic with a subject line:
Subject: Scheduled for Training as Spam:Fw: hi <SoM7GXDi0>
Spam-O-Matic has correctly received your report that it made a mistake and that the original was unwanted junk. Because you deliberately sent the message to spam, Spam-O-Matic will learn it as an example of what you do NOT want to see. This message will be trained at the earliest training run, and will NOT wait the normal 7 or 8 days !
OR, when you get a message from Spam-O-Matic with a subject line:
Subject: Trained as NOT spam: Fw: Staff Meeting <SoM7GXDi0>
Spam-O-Matic has correctly received your report that it made a mistake and that the original was wanted, and it has learned that this message is an example of what you DO want to see. Alternatively, you made a mistake and sent a good and wanted message to spam, Spam-O-Matic informed you that it was scheduled as an example of junk, and you sent another message correcting your error. Spam-O-Matic responded to that, corrected the error before any damage was done, and immediately trained this message as a keeper.
Corrective Measures
If Spam-O-Matic makes a mistake, and incorrectly marks good mail as a REJECT,
then you need to send that message ( a copy is fine, as long as it has the
Spam-O-Matic token at the end of the subject line, and actually all that
is needed is the token on the subject line ) to keep.
Spam-O-Matic will then send you a confirmation that the message has been
trained as an example of what you DO want to receive.
NOTE: anything you send to keep is IMMEDIATELY trained as soon as you
reply to the confirmation request, as good mail,
and is not learned later during off-hours as would be the normal case.
This is deliberate, and helps prevent future messages of this "kind" from
being wrongly REJECTed almost immediately.
You MUST NOT send good mail to spam by mistake, else you will increase the chances of mail being falsely tagged as spam when it is not !! This is a correctable error, as mentioned above, but must be corrected right away ! Likewise, you must not send spam to keep, nor do anything with mail that has been correctly REJECTed once the filters have been activated, as this will reduce the effectiveness of Spam-O-Matic. This is not fatal, but will take some time to un-train, as Spam-O-Matic can be quite stubborn when unlearning what it was told is good mail. We are rather heavily biased toward NEVER tagging good mail as junk, and sending junk to keep will exaggerate this behavior and decrease the effectiveness.
If you are NOT using a mailer that attempts to add "functionality" then
Spam-O-Matic has the capability to learn mail patterns that are not originally
received through the filters.
You ( presumably ) have sent a sample of junk to spam or good mail to keep.
With a subject line that looked like this:
We presume the sample came from you, but we must be sure. Spam-O-Matic will
send a confirmation request to where the message apparently came from, but
ONLY if you are a legitimate user with a real account on the filter.
If you are, you will get the request. Note that the sample now has a token
appended to the subject, and the CONFIRM REQUEST blatantly added to the front
of the Subject: line. This is deliberate, in order to get your attention
quickly. It then also has what the request was for, in this case to
Train JUNK: as an example of junk mail. The line also now has a token appended.
If this is a legitimate request from you, you simply REPLY to the message.
Spam-O-Matic has a number of internal security checking routines that will
check the validity of the request, AND your reply. If it is legit, then
Spam-O-Matic will perform the requested action on the message EXACTLY as
it was received from you ( not necessarily what you thought you saw )
If you are using a text mailer, where what you see IS what you get,
then this affords you another way to teach the filters, both junk, and keep.
With any mailer, requests to add to the blacklist, or the whitelist will
always generate a confirmation request.
In the second case, see the notes further down in this document.
It can mean one of three things.
Your Microsoft mailer may tell you:
"One or more of the pictures in this message could not be found.
YES you are sure.
This is not an error in Spam-O-Matic.
"Message Character Set Conflict
This message contains some text having a character set other than the default
character set. When sending this message you have the following options."
This is not an error in Spam-O-Matic.
Microsoft is known to mangle encoded messages that are in odd or foreign
character sets, and makes them impossible to find in the message cache.
Fortunately, although badly mangled, Microsoft DOES show you the
Spam-O-Matic key token in the original subject line, even though Microsoft
mailers mangle it beyond recognition when the message is sent.
Delete everything from the subject line except the message ID token
which will resemble something like <SoM0GXGi0>
Send the message. You will get the warning box again.
For reasons we don't fully understand, by deleting everything else from
the subject line, Microsoft will send the token without mangling it.
Spam-O-Matic will use an original cached copy as-received,
and not a mangled version that your mail reader is going to send
regardless of what you do.
Alternatively,
COPY this key token into a new message subject line, or type it in manually
EXACTLY as it appears, including the <> and everything between them,
( note, it is case-sensitive and an error will cause other errors ) and
send the message.
Spam-O-Matic will recover the original un-mangled message from this key,
and will train properly.
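For those who script their mail, the "token only" message can be built as in the Python sketch below. It uses the standard library's email package. The addresses are placeholders ( ask your network administrator for the real training addresses ), and the token pattern is inferred from the examples in this guide rather than from any published format.

```python
import re
from email.message import EmailMessage

# Assumed token shape, inferred only from the examples in this guide.
TOKEN_RE = re.compile(r"<SoM[0-9A-Za-z]+>")


def token_only_message(displayed_subject, sender, training_address):
    """Build a new message whose subject is ONLY the token, copied exactly."""
    match = TOKEN_RE.search(displayed_subject)
    if match is None:
        raise ValueError("no Spam-O-Matic token found in the subject line")
    message = EmailMessage()
    message["From"] = sender
    message["To"] = training_address  # placeholder for the spam or keep address
    message["Subject"] = match.group(0)  # case is preserved, nothing else is sent
    return message
```

For example, token_only_message("Fw: O3tfdSlPTzrSnMeZI ?????????? <SoM0GXGi0>", "me@example.com", "spam@example.com") produces a message whose subject is just the token.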
Very often, Chinese and other foreign "junk" will have a subject line
that looks like this....
When you forward this message to Spam-O-Matic, Microsoft is known
to mangle this kind of message beyond all recognition,
and makes them impossible to find in the message cache.
Fortunately, although badly mangled, Microsoft DOES show you the
Spam-O-Matic key token in the subject line, even though Microsoft will
mangle it beyond recognition when the message is sent.
In a case like this, delete everything from the subject line
except the Spam-O-Matic token, so that it looks like this...
then choose "Send As Is" when Microsoft warns you that the Chinese
pict-o-grams ( or other non-standard characters ) won't be sent properly.
Spam-O-Matic will not care.
If you get a message from Spam-O-Matic with a subject line:
This is not an error in Spam-O-Matic.
If, however, you fit the fourth case, you should generate a new message
to Spam-O-Matic.
In the subject line, put ONLY the token which Microsoft will show you
correctly, but did not send correctly, in the original message.
( you may have to look for it, and type it in manually )
Look in the body of the request message for a token that looks
like this ( the actual letters and numbers may be different ) :
<SoM0GX2R0>
If you get a message from Spam-O-Matic with a fraud warning,
it can mean one of two things:
In the first case, you need to see your network administrator.
In the second case, you should generate a new message to Spam-O-Matic.
In the subject line, put ONLY the token which Microsoft will show you
correctly, but did not send correctly, in the original message.
( you may have to look for it, and type it in manually )
Look in the body of the fraud warning message for a token that looks
like this ( the actual letters and numbers may be different ) :
<SoM0GX2R0>
If you get a message from Spam-O-Matic with a subject line that looks
like this:
It can mean one of three things.
If you think you've found a bug, you need to contact your network
administrator, who will provide us with the relevant feedback.
More than likely, Microsoft has mangled the message, and mangled the
original token so badly that we can't recognize it.
Fortunately, although badly mangled, Microsoft DOES show you the
Spam-O-Matic key token in the ORIGINAL message text area
lower down in the expired warning, even though Microsoft will
mangle it beyond recognition when the message is sent.
COPY this key into a new message subject line, and send it.
For reasons we haven't yet figured out, Microsoft doesn't mangle the
tokens when you manually copy them this way, and the system will use
them correctly.
This will be corrected in a future version.
It is necessary that this be done, otherwise the system will assume
that the original message was a keeper, and will auto-train as such,
making Spam-O-Matic less effective.
Speaking of confirmations.....
When you get a message from Spam-O-Matic with a subject line:
Spam-O-Matic CONFIRM REQUEST: Train JUNK: (fwd) Re: Some Subject <SoM5Sa420>
This is what happened....
More than likely, you are using an HTML "web-page as e-mail" type of
mailer, such as Microsoft, ( especially Windows XP service pack 2 ) or Netscape, and others, and your
mailer has mangled the original token beyond recognition.
In that case, see the notes about Microsoft further down, as they also
apply in this case, to Netscape, AOL, and any other mailer that "
Subject: Some Subject
OR your mailer has either eliminated or mangled the original token.
Several things MUST happen before Spam-O-Matic will accept this, in order
to prevent spammers from spoofing the filters.
Spam-O-Matic has cached the sample, and will wait a maximum of 25 hours for
a confirmation before taking ANY further action.
If you do not respond, Spam-O-Matic will assume that this was an attempt
by some spammer to spoof your filters, and will delete the message, taking
no action on it at all.
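The 25 hour confirmation window can be pictured with this Python sketch. The class and method names are hypothetical and the security checks mentioned in this guide are left out. Only the timing rule comes from the text above.

```python
from datetime import datetime, timedelta

CONFIRM_WINDOW = timedelta(hours=25)  # maximum wait stated in this guide


class PendingConfirmations:
    """Toy model of samples cached while waiting for the user's reply."""

    def __init__(self):
        self._cached_at = {}  # token -> time the sample was cached

    def cache(self, token, now):
        self._cached_at[token] = now

    def confirm(self, token, now):
        """Return True if the reply arrived in time and the sample may be trained."""
        cached_at = self._cached_at.pop(token, None)
        if cached_at is None:
            return False  # unknown or already purged: take no action
        return now - cached_at <= CONFIRM_WINDOW

    def purge_expired(self, now):
        """Delete unconfirmed samples older than the window and return their tokens."""
        expired = [t for t, at in self._cached_at.items() if now - at > CONFIRM_WINDOW]
        for token in expired:
            del self._cached_at[token]
        return expired
```

A reply after 24 hours is accepted in this model, while a sample left unanswered for 26 hours is purged and never trained.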
Fraud and Spoofing.
If you get a message from Spam-O-Matic with a fraud warning, it can mean one
of two things:
In the first case, you need to see your network administrator.
Expired Tokens
If you get a message from Spam-O-Matic with a subject line that looks
like this:
Subject: Spam-O-Matic ERROR Message: The token, <SoM0GX2R0> has EXPIRED
If you think you've found a bug, you need to contact your network
administrator, who will provide us with the relevant feedback.
Microsoft specific notes.
Certain characteristics of some mailers, particularly Microsoft Outlook,
and Outlook Express are worthy of special notes.
Spam-O-Matic is one of the few filters that is compatible with Microsoft
Outlook, Outlook Express, and Microsoft Exchange.
Although every attempt has been made to make Spam-O-Matic more intelligent
than any other spam filter, you may get some errors.
When you send this message these pictures will not be included.
Are you sure you want to send this message ? "
Spam-O-Matic does not care if pictures are sent back,
as we use an original cached copy EXACTLY as-received, and not a mangled
version that your mail reader is going to send regardless of what you do.
First, CANCEL this box !
There is a work around. As far as we know, this ONLY
applies to Microsoft Outlook, and Microsoft Outlook Express
( but there may be others of which we are not aware ).
ALWAYS choose "Send As Is"
Delete the original message.
Subject: Fw: O3tfdSlPTzrSnMeZI ?????????? <SoM0GXGi0>
or will possibly contain a string of little square boxes.
Subject: <SoM0GXGi0>
Spam-O-Matic CONFIRM REQUEST: Train JUNK: (fwd) Re: Some Subject <SoM5Sa420>
it can mean one of four things:
If you fit any of the first three cases, simply reply to the
request message, and Spam-O-Matic will correctly process your
original request.
or anywhere on the original message ( probably still in your recycle bin )
including in the fancy borders.
Now, send a new message to Spam-O-Matic that consists of only
a subject line that looks like this, containing that token EXACTLY
as it appears in the message you received ( remember, it is case-sensitive ):
Subject: <SoM0GX2R0>
Spam-O-Matic will correctly process the original, unmangled
message, and you can disregard the confirmation request.
Note that this is not a bug in Spam-O-Matic, but is
a "feature" of certain mail programs.
This is not an error in Spam-O-Matic.
Now, send a new message to Spam-O-Matic that consists of only
a subject line that looks like this:
Subject: <SoM0GX2R0>
containing that token EXACTLY
as it appears in the message you received ( remember, it is case-sensitive )
Spam-O-Matic will correctly process the original, unmangled
message, and you can disregard the fraud warning.
Note that this is not a bug in Spam-O-Matic, but is
a "feature" of certain mail programs.
Remember, Spam-O-Matic is biased in favor of allowing some junk
through, so that a real keeper is not mistakenly tagged as spam.
Allowing junk to be trained as a keeper will further this bias,
making Spam-O-Matic somewhat less effective than the 97.03 percent
effective that we typically see.
What do you think ?
If you think ( as we do ) that the Spam-O-Matic is the best server
side mail filter available, tell everyone.
If you think anything less, please tell us.