Using regular expressions in content dictionaries

Mar 21st, 2008

I need to create a content dictionary containing regular expressions. I also need to use the "\" to escape some characters that would otherwise be regex meta-characters. When using a regex in a message filter, the "\" must be doubled because of parsing issues. This is clearly documented in the manual. What isn't documented is whether this must be done when the regex is within a content dictionary.

Here's an example:

if (mail-from == "@bad-domain\\.com$") { drop(); }


I want to change this filter to:
if (mail-from-dictionary-match("bad-domains")) { drop(); }


So what do I put in the content dictionary, "@bad-domain\.com$" or "@bad-domain\\.com$"?

Thanks,
smatheis_ironport Mon, 03/24/2008 - 22:31

You could set up a test with a truly bogus domain, and then simulate the message with the Trace feature; this should show whether the content filter works or not.

Donald Nash Mon, 03/24/2008 - 22:50

While not using the exact methodology you mention, that's what I'm doing now.

kluu_ironport Mon, 03/24/2008 - 23:45

You should use this:

"@bad-domain\.com$"

The above tells the system to escape the "." (which otherwise matches any character) so that it means a literal period.

If you used this,

@bad-domain\\.com$

What the system would match is a literal backslash followed by any character, for example "@bad-domain\.com", because the first backslash escapes the second, which is then taken literally, and the "." goes back to meaning any character. So the doubled backslash is the wrong format.
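To illustrate the difference, here is a minimal sketch using Python's `re` module. This is only an assumption-laden illustration: Python's regex engine is not the AsyncOS one, and the helper name `matches` is made up for this example. The patterns are written as raw strings so that Python adds no extra layer of escaping, which mimics typing them straight into the dictionary.

```python
import re

# The two candidate dictionary entries, exactly as typed (raw strings,
# so Python itself does not interpret the backslashes).
single = re.compile(r"@bad-domain\.com$")   # backslash escapes the dot
double = re.compile(r"@bad-domain\\.com$")  # escaped backslash, then "any character"

def matches(pattern, text):
    """Hypothetical helper: True if the pattern is found anywhere in text."""
    return pattern.search(text) is not None

# Single backslash: only a literal period is accepted.
print(matches(single, "user@bad-domain.com"))     # True
print(matches(single, "user@bad-domainXcom"))     # False

# Double backslash: the address must contain a literal backslash.
print(matches(double, "user@bad-domain.com"))     # False
print(matches(double, "user@bad-domain\\.com"))   # True (string holds one backslash)
print(matches(double, "user@bad-domain\\Xcom"))   # True (dot matches any character)
```

In other words, with the doubled form a normal address at the bad domain would not match at all in this sketch.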

The only reason you see it in the final results when you've committed changes is that the system adds the backslash for you so that there's no error when it gets compiled.


Also, you could have left the single backslash out completely and it would probably still work.

"@bad-domain.com$"

If you used that as your pattern in the dictionary, it would match against these:

@bad-domain.com
@bad-domainncom
@bad-domain1com
@bad-domain&com

Basically, the "." means any character. But to be precise, you should add only one backslash in front of each special character. Here is a list of special characters:

| ( ) [ { ^ $ * + ? .
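As a rough sketch of both points, the following Python snippet checks the unescaped pattern against the four strings above and then builds the escaped form by putting a single backslash in front of each character from the list. Python's `re` module is used only as a stand-in for the appliance's regex engine, and `SPECIAL` and `escape_literal` are hypothetical names invented for this example.

```python
import re

# The special characters listed above (assumed to be the full set here).
SPECIAL = set("|()[{^$*+?.")

def escape_literal(text):
    """Prefix each listed special character with exactly one backslash."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)

# The unescaped pattern, and one built from the escaped domain text.
unescaped = re.compile("@bad-domain.com$")
escaped = re.compile("@" + escape_literal("bad-domain.com") + "$")

candidates = [
    "@bad-domain.com",
    "@bad-domainncom",
    "@bad-domain1com",
    "@bad-domain&com",
]

# Collect which candidates each pattern accepts.
unescaped_hits = [c for c in candidates if unescaped.search(c)]
escaped_hits = [c for c in candidates if escaped.search(c)]

print(unescaped_hits)  # all four candidates
print(escaped_hits)    # only "@bad-domain.com"
```

The escaped pattern produced here is `@bad-domain\.com$`, the same single-backslash form recommended above.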

For a detailed explanation about special characters and how to use them, please see the Advanced User Guide.
[https://supportportal.ironport.com/irppcnctr/srvcd?u=http://secure-suppo...

Donald Nash Tue, 03/25/2008 - 00:20

Kluu,

I know about regular expressions. What I didn't know was whether backslashes needed to be doubled when they appear in content dictionaries, as they do when they appear in message filters. The manual clearly states that they need to be doubled in message filters due to issues with the AsyncOS parser. It does not say whether or not they need to be doubled when they appear in content dictionaries. The section about dictionaries says something to the effect of, "See the section about message filters in the Advanced User Guide for more info on regular expressions," but does not elaborate further. That's why I asked for clarification here.

FYI, I have discovered via my testing exactly what you explained here: backslashes should not be doubled when used in content dictionaries.

Thanks.
