Regular expressions and content filters

Unanswered Question
Jun 17th, 2009

I'm having difficulty dropping unwanted mail that contains Chinese characters. I have a content filter that looks for the gb2312 charset, but it fails to match properly.

Sample header:
Content-Type: text/html; charset="gb2312"

header("Content-type") == "(?i)gb2312"
I ran a trace, but the condition does not match and the message gets through.
I have noticed some messages have the Content-type header with a different case like "Content-Type". I tried it both ways and it fails.
Some messages are Chinese but do not specify a character set. I typically see something like ?utf-8? in the subject line, and that will not match either.

kluu_ironport Fri, 06/19/2009 - 21:55

You may have to try matching the message body of the email instead. The following support portal KB article goes over the common character sets and how to match against them.

How to block Russian / Cyrillic / Ukrainian char sets
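
For checking the regex itself outside the appliance, a rough Python sketch along these lines may help. It is not filter syntax and not taken from the KB article: the helper names (`decoded_subject`, `looks_chinese`) are made up for illustration, and treating the CJK Unified Ideographs range as "Chinese" is an assumption. It tests the same `(?i)gb2312` pattern against the Content-Type value, then decodes a `=?utf-8?...?=` subject and looks for CJK characters, which covers messages that never declare gb2312.

```python
import re
from email import message_from_string
from email.header import decode_header

# Same case-insensitive pattern as in the filter condition from the question.
CHARSET_RE = re.compile(r"(?i)gb2312")
# CJK Unified Ideographs block; an assumption about what counts as "Chinese".
CJK_RE = re.compile(r"[\u4e00-\u9fff]")


def decoded_subject(msg):
    """Decode RFC 2047 encoded-words in the Subject header into plain text."""
    parts = []
    for raw, charset in decode_header(msg.get("Subject", "")):
        if isinstance(raw, bytes):
            parts.append(raw.decode(charset or "ascii", errors="replace"))
        else:
            parts.append(raw)
    return "".join(parts)


def looks_chinese(raw_message):
    """Return True if the charset or the decoded subject suggests Chinese."""
    msg = message_from_string(raw_message)
    # Header name lookup in the email package ignores case.
    content_type = msg.get("content-type", "")
    if CHARSET_RE.search(content_type):
        return True
    return bool(CJK_RE.search(decoded_subject(msg)))


if __name__ == "__main__":
    sample = 'Content-Type: text/html; charset="gb2312"\nSubject: hi\n\nbody\n'
    print(looks_chinese(sample))
```

The point of the second check is that a utf-8 message carries no charset name that identifies the language, so the charset test alone cannot catch it; the text has to be decoded and inspected.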


