regex - SpamAssassin regular expression not matching
I have read similar questions here but, regular expressions not being created equal, I have not been able to find a solution to my problem.
I am working on a rule for SpamAssassin to tell whether the recipient's e-mail username is contained in the body of the message. For example, an e-mail sent to testuser@somedomain.com that contains testuser in the body of the message. I have written and tested the regular expression on regex101, and it matches as expected, but when I create the rule it does not work when I test it in SpamAssassin.
Here is the expression:
/to:\s([a-z0-9][-a-z0-9]{1,19})\@somedomain\.com[a-z0-9\s=;:\/\.-]*\1\b/i
It should match the e-mail address in the To: header (or anywhere in the body of the message matching the format to: user@somedomain.com). As mentioned before, the expression matches as expected on regex101, but when I make it a rule in SpamAssassin it does not match.
If I remove the leading to:\s it will match, but I am only concerned with matching the e-mail address in the To: header. I have tried these various mutations of the expression:
/to:\s([a-z0-9][-a-z0-9]{1,19})\@somedomain\.com[a-z0-9\s=;:\/\.-]*\1\b/i
/to: ([a-z0-9][-a-z0-9]{1,19})\@somedomain\.com[a-z0-9\s=;:\/\.-]*\1\b/i
/to:[\s]{0,2}([a-z0-9][-a-z0-9]{1,19})\@somedomain\.com[a-z0-9\s=;:\/\.-]*\1\b/i
/:\s([a-z0-9][-a-z0-9]{1,19})\@somedomain\.com[a-z0-9\s=;:\/\.-]*\1\b/i
/\s([a-z0-9][-a-z0-9]{1,19})\@somedomain\.com[a-z0-9\s=;:\/\.-]*\1\b/i
None of the previous rules match, but this one does:
/([a-z0-9][-a-z0-9]{1,19})\@somedomain\.com[a-z0-9\s=;:\/\.-]*\1\b/i
Here is the text I am using for testing:
subject: test spam mail (gtube) private jet rental
message-id: <gtube1.1010101@example.net>
date: wed, 23 jul 2003 23:30:00 +0200
from: sender <sender@live.com>
to: recipient@somedomain.com
precedence: junk
mime-version: 1.0
content-type: text/plain; charset=us-ascii
content-transfer-encoding: 7bit

recipient gtube, generic test unsolicited bulk email
This should match on to: recipient@somedomain.com .... recipient, but I can only get a match when I remove to:\s from the expression. The full expression tests out fine on regex101, so this seems to be specific to SpamAssassin, but I'm not sure.
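As a rough cross-check of the regex101 result, the sample message and the six variants above can be run through Python's re module. This is only a sketch: Python's engine is not the Perl engine SpamAssassin uses, and the names MESSAGE, TAIL, PREFIXES and matched_username are made up for illustration. It says nothing about which part of the message SpamAssassin actually hands to the rule.

```python
import re

# Sample message from the question, one header per line (lowercased as posted).
MESSAGE = "\n".join([
    "subject: test spam mail (gtube) private jet rental",
    "message-id: <gtube1.1010101@example.net>",
    "date: wed, 23 jul 2003 23:30:00 +0200",
    "from: sender <sender@live.com>",
    "to: recipient@somedomain.com",
    "precedence: junk",
    "mime-version: 1.0",
    "content-type: text/plain; charset=us-ascii",
    "content-transfer-encoding: 7bit",
    "",
    "recipient gtube, generic test unsolicited bulk email",
])

# Common part of every variant: username, domain, filler, backreference.
TAIL = r"([a-z0-9][-a-z0-9]{1,19})\@somedomain\.com[a-z0-9\s=;:\/\.-]*\1\b"

# The different leading parts tried in the question ("" is the one that worked).
PREFIXES = [r"to:\s", r"to: ", r"to:[\s]{0,2}", r":\s", r"\s", ""]


def matched_username(prefix, text=MESSAGE):
    """Return the captured username, or None if the variant does not match."""
    m = re.search(prefix + TAIL, text, re.IGNORECASE)
    return m.group(1) if m else None


for prefix in PREFIXES:
    print(repr(prefix), "->", matched_username(prefix))
```

Against this flat text every variant captures the same username, which agrees with what regex101 shows and suggests the difference lies in the text SpamAssassin tests the rule against rather than in the expression itself.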
Edit
Here is an updated version of the expression that does not allow a dash at the end of the username, but does allow one in the middle:
/\bto:\s([a-z0-9][-a-z0-9]{0,18}[a-z0-9])\@somedomain\.com[a-z0-9\s=;:\/\.-]*\b\1\b/i
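To see what the updated username part accepts, it can be tested on its own. This is a minimal sketch in Python; USERNAME and is_valid_username are illustrative names, and fullmatch stands in for the surrounding to:\s and \@ that bound the group in the real rule.

```python
import re

# Username part of the updated expression, taken out of the full rule.
USERNAME = re.compile(r"[a-z0-9][-a-z0-9]{0,18}[a-z0-9]", re.IGNORECASE)


def is_valid_username(name):
    """True if the whole string is accepted by the username sub-pattern."""
    return USERNAME.fullmatch(name) is not None


for name in ["testuser", "test-user", "testuser-", "-testuser", "t"]:
    print(name, is_valid_username(name))
```

Note that the sub-pattern requires at least two characters and at most twenty, so a single-character username is rejected as well.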
With assistance from @sln in chat, I came up with the following expression, which matches the full rule as expected:
/to:\s+([a-z0-9][-a-z0-9]{1,18}[a-z0-9])\@somedomain\.com[\s\S]*?\1\b/i
That will match to: username@somedomain.com ... username, as it should. For the most part, it will match any e-mail message that contains the recipient's username in the body of the message. In our case, many of the spam e-mails we receive contain the username, such as:
Greetings username! Blah blah blah spam message.
What ended up fixing it was replacing the [a-z0-9\s=;:\/\.-]* following the e-mail address with [\s\S]*?
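The effect of that replacement can be illustrated with a small Python sketch. The message below is hypothetical: it places a Date: header after To:, so that characters outside the restricted class (a comma and a plus sign) sit between the address and the username. Whether this is what happened inside SpamAssassin is an assumption; the sketch only shows how the two fillers differ.

```python
import re

# Filler limited to the original character class.
RESTRICTED = re.compile(
    r"to:\s+([a-z0-9][-a-z0-9]{1,18}[a-z0-9])\@somedomain\.com"
    r"[a-z0-9\s=;:\/\.-]*\1\b",
    re.IGNORECASE,
)

# Filler that lazily accepts any character, including newlines.
LAZY_ANY = re.compile(
    r"to:\s+([a-z0-9][-a-z0-9]{1,18}[a-z0-9])\@somedomain\.com"
    r"[\s\S]*?\1\b",
    re.IGNORECASE,
)

# Hypothetical message: the "," and "+" in the date header are not in the class.
MESSAGE = (
    "to: username@somedomain.com\n"
    "date: wed, 23 jul 2003 23:30:00 +0200\n"
    "\n"
    "greetings username! blah blah blah spam message.\n"
)

print("restricted:", bool(RESTRICTED.search(MESSAGE)))
print("lazy any:  ", bool(LAZY_ANY.search(MESSAGE)))
```

Any single character outside the class between the address and the repeated username is enough to stop the restricted version, while [\s\S]*? skips over whatever is there and stops at the first later occurrence of the username.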