What Is Anti-Spam | Part 2
Spam Filtering By Keywords Or Addresses
This method is very limited because it rejects or sorts mail according to predefined vocabulary rules that mark certain terms as prohibited. Keywords that occur frequently in spam, such as “sex”, “viagra” or “money”, form the basis of such rules. Similarly, we may decide to block all messages from a particular sender, a specific domain, or even an entire country.
This method has a high error rate and is ineffective when spammers disguise the words they use (“vi@gr@”, “s3x”, etc.). In that case, regular expressions should be used instead.
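A minimal sketch of keyword filtering and its two weaknesses, assuming a hypothetical banned-word list built from the examples in the text. The function name and the punctuation handling are illustrative choices, not part of any particular filter.

```python
# Hypothetical list of prohibited terms, taken from the examples in the text.
BANNED_WORDS = {"sex", "viagra", "money"}


def is_spam_by_keyword(message: str) -> bool:
    """Return True if any word of the message is on the banned list."""
    words = message.lower().split()
    # Strip common punctuation so that "viagra!" still matches "viagra".
    cleaned = {w.strip(".,!?;:\"'()") for w in words}
    return not BANNED_WORDS.isdisjoint(cleaned)


if __name__ == "__main__":
    print(is_spam_by_keyword("Cheap viagra here!"))          # plain keyword
    print(is_spam_by_keyword("Cheap vi@gr@ here!"))          # disguised keyword
    print(is_spam_by_keyword("Send money for the invoice"))  # legitimate mail
```

The first message is caught, the disguised spelling slips through, and the third message is flagged even though it could be perfectly legitimate, which is the kind of error the text describes.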
Filtering by regular expressions
A regular expression (often shortened to “regex” in computing) is a pattern that can be applied to a string to see whether the string matches it. For example, “a digit followed by three letters, followed by a space, then a digit” could be written this way: /^[0-9]{1}[A-Za-z]{3} [0-9]{1}$/. Using regular expressions to find variations of “sensitive” words increases the chances of catching spam. For example, if a spammer attempts to foil a keyword filter by using the word “viiaaagraa”, the regular expression /^vi+a+gra+$/i (a ‘v’ followed by one or more ‘i’, followed by one or more ‘a’, followed by a ‘g’, an ‘r’, and one or more ‘a’, regardless of case) will find the word. Obviously, this example is very simple, but more complex expressions can detect much more subtle and sophisticated variations. A limitation of regular expressions is illustrated by the Scunthorpe problem, in which a harmless word is flagged because it contains a prohibited string, producing false positives.
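The two patterns above can be tried out with Python's standard re module. This is only a sketch: the constant names are illustrative, and NAIVE_SEX is a hypothetical unanchored pattern added here to show how a Scunthorpe-style false positive arises.

```python
import re

# "A digit, three letters, a space, then a digit", as described in the text.
DIGIT_LETTERS = re.compile(r"^[0-9]{1}[A-Za-z]{3} [0-9]{1}$")

# Variations of "viagra" with repeated letters, ignoring case.
VIAGRA = re.compile(r"^vi+a+gra+$", re.IGNORECASE)

# Hypothetical unanchored pattern: it matches anywhere inside a word,
# which is what leads to Scunthorpe-style false positives.
NAIVE_SEX = re.compile(r"sex", re.IGNORECASE)


def matches_viagra_variant(word: str) -> bool:
    """Return True if the whole word is a stretched spelling of 'viagra'."""
    return VIAGRA.match(word) is not None


if __name__ == "__main__":
    print(DIGIT_LETTERS.match("7abc 4") is not None)
    print(matches_viagra_variant("viiaaagraa"))
    print(matches_viagra_variant("vi@gr@"))
    print(NAIVE_SEX.search("Essex County Council") is not None)
```

The last call flags a harmless place name because it happens to contain the prohibited string, while “vi@gr@” still escapes this particular pattern and would need a broader expression.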
Analysis of viruses and attachments
Emails often have attachments, and these may contain viruses. It is therefore important to include an antivirus in the message-sorting process. Content filters often integrate one. For example, it is not uncommon for SpamAssassin and ClamAV to be used together.
Spam: Images
Images are one of the major difficulties facing content filters. Indeed, it is virtually impossible to determine whether an image is legitimate or not (spammers often use images to conceal text). One technique for judging whether the images in an email are legitimate is to look at the number of images and how they are placed in the message. This can be a good indication of the nature of the message. Furthermore, it is possible to generate a checksum of the image and compare it with checksums published on the Internet (much like an RBL). This lets the system check whether the image has already been used in an email classified as spam.
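The checksum comparison can be sketched as follows. SHA-256 is an assumption made for this example (the text does not name a checksum algorithm), and the set of known checksums is a hypothetical stand-in for a shared list published on the Internet.

```python
import hashlib
from typing import Set


def image_checksum(image_bytes: bytes) -> str:
    """Return a hex checksum of the raw image data (SHA-256 assumed here)."""
    return hashlib.sha256(image_bytes).hexdigest()


def is_known_spam_image(image_bytes: bytes, known_checksums: Set[str]) -> bool:
    """Return True if this exact image has already been seen in spam."""
    return image_checksum(image_bytes) in known_checksums


# Placeholder bytes standing in for an image attachment.
spam_image = b"fake image bytes used in a spam campaign"

# Hypothetical shared list of checksums of images already seen in spam.
known_spam_checksums = {image_checksum(spam_image)}

if __name__ == "__main__":
    print(is_known_spam_image(spam_image, known_spam_checksums))
    print(is_known_spam_image(b"holiday photo bytes", known_spam_checksums))
```

An exact checksum only recognizes byte-for-byte copies of an image, so a sender who alters even one byte would produce a different checksum.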
Filtering by sending server
This type of filtering can ban email addresses, domains, or servers. Any message from an entry on the blacklist is blocked by the anti-spam system. The entries are often defined by a system administrator who, from experience, is able to identify the most common sources of spam. This technique is not limited to spam in the strict sense of the term: it can also block mail from legitimate sources if the system administrator considers them harmful. Obviously, this type of filtering is highly subjective and depends on the goodwill and diligence of the person who maintains the list.
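A minimal sketch of such a blacklist, covering the three kinds of entries mentioned above. All addresses, domains, and IP addresses are hypothetical placeholders (reserved example names and documentation ranges), and is_blocked is an illustrative function name.

```python
# Hypothetical blacklist entries maintained by an administrator.
BLOCKED_ADDRESSES = {"offers@spam.example"}
BLOCKED_DOMAINS = {"bulkmail.example"}
BLOCKED_SERVERS = {"192.0.2.10"}


def is_blocked(sender: str, server_ip: str) -> bool:
    """Return True if the sender address, its domain, or the server is listed."""
    address = sender.strip().lower()
    # Everything after the last "@" is treated as the sender's domain.
    domain = address.rpartition("@")[2]
    return (
        address in BLOCKED_ADDRESSES
        or domain in BLOCKED_DOMAINS
        or server_ip in BLOCKED_SERVERS
    )


if __name__ == "__main__":
    print(is_blocked("Offers@Spam.example", "198.51.100.1"))
    print(is_blocked("alice@bulkmail.example", "198.51.100.1"))
    print(is_blocked("bob@friendly.example", "192.0.2.10"))
    print(is_blocked("bob@friendly.example", "198.51.100.1"))
```

Only the last message gets through. Note that the third one is blocked purely because of the server it came from, whatever its content.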
“Realtime Blackhole List”
Realtime Blackhole Lists (RBLs) exist to provide lists of servers known to be major senders of spam, and to list the major spammers. An RBL is in effect a large, widely shared blacklist. The principle is simple: when a filter receives an email, it checks whether the sending server appears in an RBL. If it does, the email is categorized as spam.
Which RBLs a filter uses as sources is usually decided by the system administrator. This method therefore comes with its share of controversy, because some RBLs are deemed more effective than others. The choice of RBLs thus directly influences the effectiveness of the anti-spam system. In addition, some RBLs have looser rules than others for adding a server to their list, which further complicates the situation. To mitigate this problem, we can consult several RBLs and block a source only if it appears in two of them.
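Consulting several RBLs and requiring agreement between them could look like the sketch below. Real RBLs are typically queried over DNS. The in-memory dictionaries here are hypothetical stand-ins so that the example runs without network access, and the list names, IP addresses, and min_listings parameter are all assumptions.

```python
from typing import Dict, Set

# Hypothetical in-memory stand-ins for three RBLs (name -> listed server IPs).
RBLS: Dict[str, Set[str]] = {
    "rbl-a.example": {"192.0.2.10", "192.0.2.20"},
    "rbl-b.example": {"192.0.2.10"},
    "rbl-c.example": {"192.0.2.20", "192.0.2.30"},
}


def listing_count(server_ip: str, rbls: Dict[str, Set[str]]) -> int:
    """Count how many of the consulted RBLs list this server."""
    return sum(1 for listed in rbls.values() if server_ip in listed)


def should_block(
    server_ip: str, rbls: Dict[str, Set[str]], min_listings: int = 2
) -> bool:
    """Block only when at least min_listings RBLs agree on the server."""
    return listing_count(server_ip, rbls) >= min_listings


if __name__ == "__main__":
    for ip in ("192.0.2.10", "192.0.2.30", "203.0.113.5"):
        print(ip, listing_count(ip, RBLS), should_block(ip, RBLS))
```

With this threshold, a server listed by a single, overly strict RBL is still accepted, while one that two independent lists agree on is blocked.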
Continued…

