Hercule Filter is a detective for spam-typical HTML and forged mail headers.
You may find the latest version at www.hinzen.de/Spamihilator
and contact the author at edy@hinzen.de.
The current version number of this plugin will be shown in the upper right corner of the options dialog.
Please note that 'Hercule' first runs the simple, fast checks and then those that may take more time.
The mail headers are checked first, because Spamihilator delivers them first.
If the mail is marked as Spam in one of the (internally defined) sections of tests, 'Hercule' stops its processing.
This means that 'Hercule' reports only the first sign(s) found, although the mail probably contains more indications of Spam.
If you select an entry in Spamihilator's recycle bin, the filter reason will be shown at the label "Spam Words" as usual.
For some options, the number of occurrences found will be shown, too.
External image (2) means, for example, that two external images have been found.
If you need detailed information about the rejection reason, please use the logging options as described below.
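
The staged processing described above can be pictured with a small sketch. It is not the plugin's actual code: the two checks, the section layout, and the function names are hypothetical, and showing the count only when it is greater than one is an assumption based on the "External image (2)" example.

```python
from typing import Callable, List, Optional, Tuple

# A check returns the number of occurrences it found in the raw mail text.
Check = Callable[[str], int]

def has_empty_subject(mail: str) -> int:
    # Hypothetical header check: 1 if a Subject header exists but is empty.
    for line in mail.splitlines():
        if not line:
            break  # a blank line ends the header block
        if line.lower().startswith("subject:"):
            return 0 if line[len("subject:"):].strip() else 1
    return 0

def count_external_images(mail: str) -> int:
    # Hypothetical body check: counts <img> tags loaded via http(s).
    return mail.lower().count('<img src="http')

# Sections ordered from cheap (headers) to expensive (HTML body).
SECTIONS: List[List[Tuple[str, Check]]] = [
    [("Empty subject", has_empty_subject)],
    [("External image", count_external_images)],
]

def scan(mail: str) -> Optional[str]:
    for section in SECTIONS:
        reasons = []
        for label, check in section:
            count = check(mail)
            if count > 1:
                reasons.append(f"{label} ({count})")
            elif count == 1:
                reasons.append(label)
        if reasons:
            # Stop here: later sections are never scanned.
            return ", ".join(reasons)
    return None
```

With this layout, a mail that has an empty subject and external images is reported only as "Empty subject", which mirrors why the recycle bin shows just the first sign(s) found.
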
The options can be set on several panels in the options dialog.
You may set the options back to their defaults by using the [Reset] button.
This help file explains the various option panels in separate sections. The explanations are structured as follows:
Title of option panel | |
---|---|
Subtitle of option panel | |
Option, as you read it in the dialog | Description
Message as shown in the recycle bin of Spamihilator.
Examples
|
Header | |
---|---|
Mark as SPAM, if mail header ... | |
contains forged date | Detects wrong dates (e.g. violating RFC 2822).
Forged date
|
contains date elder than one year | Detects dates that are more than one year old.
Elder than one year
|
has bad charset | Detects invalid charsets.
Bad charset
|
has bad subject | Detects subjects containing runs of white space or fillers like ".....".
Bad subject
|
has empty subject | Detects mails with empty subjects.
Empty subject
|
contains authentication warning | Detects mails with warnings issued by the "sendmail" program.
Authentication warning found
|
contains invalid IP addresses | Detects mails with IP addresses that violate internet standards.
Invalid IP address
|
reveals mail address | Detects mails revealing your mail address in header fields where it is not necessary.
Header reveals your mail address
|
has more than one BCC field | In general, the BCC (Blind Carbon Copy) field should not be sent, although some programs send it anyway. If more than one BCC header is found, the mail is likely SPAM.
Multiple BCC fields found
|
forged mail header | This option checks for several violations of the standards for mail headers.
Forged mail header
|
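
Two of the header checks above can be illustrated with Python's standard `email` package. This is a minimal sketch, not the plugin's implementation: the function name `header_findings` is hypothetical, "one year" is approximated as 365 days, and treating an unparsable date as "Forged date" is an assumption.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

def header_findings(raw: str, now: datetime) -> list:
    """Return recycle-bin style messages for a raw mail (illustrative only)."""
    msg = message_from_string(raw)
    findings = []

    date_value = msg.get("Date")
    if date_value is not None:
        try:
            sent = parsedate_to_datetime(date_value)
        except (TypeError, ValueError):
            # Assumption: a date that cannot be parsed counts as forged.
            findings.append("Forged date")
        else:
            if sent.tzinfo is None:
                # A "-0000" zone yields a naive datetime; treat it as UTC.
                sent = sent.replace(tzinfo=timezone.utc)
            if now - sent > timedelta(days=365):
                findings.append("Elder than one year")

    # get_all returns every occurrence of a repeated header field.
    if len(msg.get_all("Bcc", [])) > 1:
        findings.append("Multiple BCC fields found")
    return findings
```

The caller passes a timezone-aware `now`, which keeps the function deterministic and easy to test.
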
HTML (1) | |
---|---|
Mark as SPAM, if detected ... | |
HTML hides e-mail address | Detects tricks that hide your address in the HTML so that the sender can identify you, e.g. if you forward the SPAM to abuse newsgroups.
HTML hides your mail address
|
link containing e-mail address | Detects external links containing your mail address (in clear text or encrypted).
URL reveals your mail address
|
link perhaps revealing your identity | Detects external links with parameters (like "aff_id=0815_4177") that probably reveal your address.
URL reveals identity
|
external images | Detects use of external images.
External image
|
zero sized images | Detects use of zero-sized (and thereby invisible) images.
Zero sized image
|
image link contains e-mail address | Detects use of image links that reveal that you have opened the mail.
Image reveals your mail address
|
external frames | Detects use of external frames. These could be used to forge mails that lead you to believe that a trusted company (e.g. your bank) has written to you.
External frame
|
invisible frames | Detects use of invisible frames. These can be used to reveal your address or to download intrusion programs.
Invisible frame
|
empty mail | Detects empty mails. Some spam programs apparently crash from time to time and send mails without any content.
Empty mail
|
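
As an illustration of the image checks above, the following sketch counts external and zero-sized images with Python's standard `html.parser`. The class name `ImageScanner` and the exact rules (an `http`/`https` source counts as external, a `width` or `height` of `0` counts as zero-sized) are assumptions made for this example, not the plugin's documented behaviour.

```python
from html.parser import HTMLParser

class ImageScanner(HTMLParser):
    """Counts external and zero-sized <img> tags (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.external = 0
        self.zero_sized = 0

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = (attributes.get("src") or "").lower()
        # Assumption: anything fetched over http(s) is "external".
        if src.startswith(("http://", "https://")):
            self.external += 1
        # Assumption: an explicit 0 for width or height makes it invisible.
        if attributes.get("width") == "0" or attributes.get("height") == "0":
            self.zero_sized += 1
```

An image that is both external and zero-sized is the classic tracking pixel: loading it tells the sender that the mail was opened.
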
HTML (2) | |
---|---|
Mark as SPAM, if detected ... | |
more than ... invalid HTML tags | If checked, the number of misspelled or invalid HTML tags may not exceed this value.
Please don't set this value too low, because people may send you mails with manually written HTML tags that contain some typos.
Note: The list of valid HTML tags is held in the file "HerculeFilter.ini" but cannot be edited using the options dialog.
Invalid HTML tags
|
more than ... too long tags | If checked, defines how many HTML tags may exceed the currently longest possible tag length.
The tag <blockquote> is currently the longest valid HTML tag with ten characters. A tag longer than 12 characters will be recognized as too long. Use of this option is currently discouraged, because strings like <www.hinzen.de/Spamihilator> could be misinterpreted as a tag instead of a term in brackets.
Too long HTML tag
|
more than ... bad HTML tags | Detects bad tags typically used by spammers.
Sample: <S§R>
Bad HTML tags
|
bad URLs | Detects bad URLs, e.g. with redirect or hiding tricks.
Bad URLs
|
URLs containing ... | If checked, detects the use of URLs containing one of the entered substrings. Checks hyperlinks as well as image, frame, and stylesheet links, among others.
Black listed URL
|
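
The "too long tags" option and its known weakness can be shown with a short sketch. The regular expression and the function name are hypothetical, and measuring the tag name without the angle brackets is an assumption; only the limit of 12 characters comes from the description above.

```python
import re

# Limit taken from the option description; how the plugin measures
# the length (with or without brackets) is an assumption here.
MAX_TAG_NAME_LENGTH = 12

# Matches "<name ...>" or "</name>" and captures the name.
TAG_PATTERN = re.compile(r"<\s*/?\s*([^\s<>]+)[^<>]*>")

def count_too_long_tags(html: str) -> int:
    """Count bracketed terms whose name exceeds the limit."""
    return sum(
        1
        for match in TAG_PATTERN.finditer(html)
        if len(match.group(1)) > MAX_TAG_NAME_LENGTH
    )
```

A plain-text mail containing `<www.hinzen.de/Spamihilator>` is counted as well, which is exactly the kind of false positive the description warns about.
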
Tricks | |
---|---|
Mark as SPAM, if detected ... | |
more than ... random words | Detects use of random words, typically intended to fool spam filters.
Counts e.g. the number of words at the end of the mail without any punctuation marks.
Random words
|
more than ... META tags | Detects use of masses of META tags intended to fool spam filters.
Too much META tags
|
SPAM-typical HTML | Detects use of HTML tags and structures typically used by spammers.
Spam-typical HTML
|
Intrusion-typical HTML | Detects use of HTML intended to infect your system with viruses and trojans.
(It doesn't detect the viruses themselves; it can only try to check for the HTML techniques typically used.)
Intrusion-typical HTML (viruses, trojans)
|
forgotten placeholders | Detects placeholders left in the mail when a spammer used a random-word program and probably set wrong keywords.
Scans e.g. strings like
Place holders in body
|
URL spoofing | Detects whether a spammer tries to show you a URL other than the one actually used.
URL spoofing
|
scripting | Detects if scripting is used for some spamming tricks.
Contains script
|
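
One common form of URL spoofing is a link whose visible text looks like one address while its `href` points somewhere else. The sketch below illustrates that single case with the standard library; the class name `SpoofScanner`, the helper `_host`, and the rule "visible text that starts like a URL must have the same host as the href" are assumptions for illustration, not the plugin's actual test.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

def _host(url: str) -> str:
    # Hypothetical helper: extract the lower-cased host name of a URL.
    if "://" not in url:
        url = "http://" + url
    return (urlparse(url).hostname or "").lower()

class SpoofScanner(HTMLParser):
    """Counts links whose visible URL differs from the real target."""

    def __init__(self):
        super().__init__()
        self.spoofed = 0
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "a" or self._href is None:
            return
        shown = "".join(self._text).strip()
        # Only compare when the visible text itself looks like a URL.
        if shown.lower().startswith(("http://", "https://", "www.")):
            if _host(shown) != _host(self._href):
                self.spoofed += 1
        self._href = None
```

Links with ordinary text such as "click here" are ignored, because there is no displayed address that could contradict the real one.
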
Style | |
---|---|
Mark as SPAM, if detected ... | |
tiny letters | Detects use of tiny letters that humans cannot read but that confuse spam filters.
Tiny letters
|
hidden letters | Detects use of hidden letters that humans cannot read but that confuse spam filters.
Invisible letters
|
white letters | Detects use of white letters that humans cannot read but that confuse spam filters.
White letters
|
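
The three style checks can be approximated by inspecting inline `style` attributes. This is a rough sketch under stated assumptions: the threshold `TINY_LIMIT`, the regular expressions, and the function name `style_findings` are hypothetical, only inline styles with double quotes are examined, and white text is flagged without checking the background colour.

```python
import re

TINY_LIMIT = 2  # hypothetical threshold: font sizes up to 2px/pt count as tiny

STYLE_ATTR = re.compile(r'style\s*=\s*"([^"]*)"', re.IGNORECASE)
FONT_SIZE = re.compile(r"font-size:(\d+)(?:px|pt)")
# The lookbehind keeps "background-color:" from matching.
WHITE_COLOR = re.compile(r"(?<![-\w])color:(white|#fff|#ffffff)(;|$)")

def style_findings(html: str) -> list:
    """Return recycle-bin style messages for suspicious inline styles."""
    found = set()
    for style in STYLE_ATTR.findall(html):
        css = style.lower().replace(" ", "")
        size = FONT_SIZE.search(css)
        if size and int(size.group(1)) <= TINY_LIMIT:
            found.add("Tiny letters")
        if "display:none" in css or "visibility:hidden" in css:
            found.add("Invisible letters")
        if WHITE_COLOR.search(css):
            found.add("White letters")
    return sorted(found)
```

A real filter would also have to consider stylesheets, `<font>` attributes, and text whose colour merely matches the background.
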
Logging | |
---|---|
Mode | |
Never | No logging takes place. |
Standard | Standard logging. In general, only errors are reported. |
Verbose | For every rejected mail the sender, the subject and the rejection-reason are logged. |
Extended | In addition to the above, the corresponding dialog option that caused the rejection will be listed. Durations will be shown. |
Debug-Mode | Only for debugging. Start and end of subroutines and the contents of the mails are logged, too. |
Remove previous log | With this option you define how long previous log entries will be kept. |

Version | Remarks |
---|---|
1.2.0.0 | Added "Gray list". |
1.1.0.3 | Recognizes more scripts hidden in CSS. Accepts "?xml"-notation of xmlns. |
1.0.9.7 | Corrected bug in previous version that caused most settings to be deactivated, regardless of user settings. |
1.0.9.6 | Improved performance for mails with big attachments. Logging mode "extended" now shows durations. |
1.0.9.5 | Fixed serious bug that marked all dates as forged if a date separator other than "." is defined in the local settings. Improved recognition of external files (e.g. images). Improved logging details. |
1.0.9.4 | Fixed small bug concerning message-id headers including comments. |
1.0.9.3 | Fixed bug showing no filter reason when logging was deactivated. |
1.0.9.1 | Improved scan for external images / frames. Improved performance of HTML scan. Now accepts XML name spaces (xmlns, as used e.g. by Office programs) in HTML scan. Fixed bug that marked time zones given without a plus or minus sign ("+" or "-"), e.g. "0100", as forged dates. |