There is a plethora of spam-prevention modules for Drupal (e.g. HoneyPot, reCAPTCHA, Mollom, Spamicide, MotherMayI, Spambot). They generally work well, but not all websites are the same. I run a website that has a forum with content created by authenticated users. Account registration is open but protected. The node forms are protected too, but I still get very specific spam: mostly posts pertaining to some fake technical support service, or advertisements for Ugg boots.
Nice try "AndrewBrown".
After a few years I thought about using Rules to look for certain common words/phrases. A single-word comparison isn't too difficult, but using regular expressions in Rules isn't as easy as I had hoped. For starters, the Text Comparison option for RegEx doesn't support flags, so a case-insensitive match is a bit trickier. My workaround was to spell out every letter as a character class, e.g. [Cc][Uu] to match "cu" in any case.
Here's the example that I came up with, which includes both case-insensitive phrase matching and a search for a phone number. Several of the spam posts I observed had phone numbers, and I really don't want users posting phone numbers in this website's forum. When any one of the matching conditions is met (and the current user is not user 1), the node is unpublished, the user is blocked, an email is sent to a content moderator, and a message is displayed to the user. Replace the example domain/email values and adjust the search patterns as needed in this Rules export:
{ "rules_forum_spam_filter" : {
    "LABEL" : "Forum SPAM Filter",
    "PLUGIN" : "reaction rule",
    "OWNER" : "rules",
    "TAGS" : [ "spam" ],
    "REQUIRES" : [ "rules" ],
    "ON" : { "node_presave--forum" : { "bundle" : "forum" } },
    "IF" : [
      { "OR" : [
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Cc][Uu][Ss][Tt][Oo][Mm][Ee][Rr][[:space:]]*[Ss][Uu][Pp][Pp][Oo][Rr][Tt]",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Cc][Uu][Ss][Tt][Oo][Mm][Ee][Rr][[:space:]]*[Ss][Ee][Rr][Vv][Ii][Cc][Ee]",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Hh][Ee][Ll][Pp][[:space:]]*[Ll][Ii][Nn][Ee]",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Gg][Mm][Aa][Ii][Ll][[:space:]]*[Hh][Ee][Ll][Pp]",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "(\\+0?1\\s)?\\(?\\d{3}\\)?[\\s.-]\\d{3}[\\s.-]\\d{4}",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Gg][Mm][Aa][Ii][Ll][[:space:]]*[Cc][Uu][Ss][Tt][Oo][Mm][Ee][Rr]",
              "operation" : "regex"
            }
          }
        ]
      },
      { "NOT data_is" : { "data" : [ "site:current-user:uid" ], "value" : "1" } }
    ],
    "DO" : [
      { "node_unpublish" : { "node" : [ "node" ] } },
      { "user_block" : { "account" : [ "site:current-user" ] } },
      { "mail" : {
          "to" : "to@example.com",
          "subject" : "Possible spam at example.com",
          "message" : "Please see node [node:nid] by author [site:current-user:uid].",
          "from" : "from@example.com",
          "language" : [ "" ]
        }
      },
      { "drupal_message" : {
          "message" : "The forum post that you submitted appears to be spam. It will be evaluated over the next few days to confirm. Your account has temporarily been suspended.",
          "type" : "error"
        }
      }
    ]
  }
}
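A more compact variant of the phrase conditions may be possible. This is only a sketch, and it rests on an assumption that is not confirmed above: that Rules hands the "match" value to PHP's PCRE engine (preg_match) unchanged. The fact that both [[:space:]] and \d work in the export suggests this, and if it holds, the inline modifier (?i) at the start of the pattern should switch on case-insensitive matching without needing a flags field. The fragment below folds the five phrase patterns into one condition using alternation; it would replace those five "text_matches" entries inside the "OR" list, with the phone number condition left as it is. Test it on a development site before relying on it.

```json
{ "text_matches" : {
    "text" : [ "node:body:value" ],
    "match" : "(?i)(customer[[:space:]]*(support|service)|help[[:space:]]*line|gmail[[:space:]]*(help|customer))",
    "operation" : "regex"
  }
}
```

The trade-off is readability versus granularity: one pattern is easier to extend with new phrases, while separate conditions make it easier to see in the Rules UI which phrase is being checked.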