Ironport ESA System Overview Graph Statistics Do Not Match Logs

macklinmc
Level 1

We use the C670 Monitor | Overview web page to gather statistics. Using a web page scraper and the Export function, we grab the CSV file and load the hourly statistics into a database for long-term trending. We've noticed, however, that when we compare one hour of mail log or smtp_debug against these stats, we can't reconcile the numbers. The log files show far fewer connections overall, with on the order of 50% of the information missing in any given hour. In the mail logs the key is "New SMTP ICID" from an outside interface, and that count is expected to match the Overview graph's Incoming total across the various categories. In smtp debug the key is the "Info: ICID [number] address [ipaddress]" log entry where the IP address is external, i.e. not RFC 1918. The smtp debug and mail log counts match exactly. The Overview graph numbers don't match on incoming connections or SBRS rejects, not even close.
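For readers who want to reproduce the smtp debug count described above, here is a minimal Python sketch. The regular expression is modeled only on the "Info: ICID [number] address [ipaddress]" wording in this post; the exact layout of real log lines is an assumption, and the function names are hypothetical.

```python
import ipaddress
import re

# Modeled on the "ICID [number] address [ipaddress]" entry quoted above.
# The real log layout may differ; adjust the pattern to your own logs.
ICID_ADDR = re.compile(r"ICID (\d+) address ([0-9A-Fa-f:.]+)")

# The three private IPv4 ranges defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_external(addr):
    """Return True if addr parses as an IP address outside the RFC 1918 ranges."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False  # not a parseable IP address
    return not any(ip in net for net in RFC1918)


def count_external_connections(lines):
    """Count distinct ICIDs whose logged address is external."""
    icids = set()
    for line in lines:
        m = ICID_ADDR.search(line)
        if m and is_external(m.group(2)):
            icids.add(m.group(1))
    return len(icids)
```

Feeding it the lines of one hour of log gives the connection count for that hour; it says nothing about recipients per connection.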

We are using the overview page stats as a dashboard display and much to my horror when I went to the logs to explain some unexpected spike on 1 or 2 machines in the complex, there was no correlation to the overview display.

I've read that under high stress the IronPort will throw away SBRS reject info, and that would roughly explain the difference, but the discrepancy also occurs when the boxes don't appear stressed, i.e. no CPU, memory or disk alerts. Anyone have any ideas?

Andreas Mueller
Level 4

Hello Macklin,

the difference you see arises because the numbers shown in the overview statistics are counted per recipient of each message, not per connection. The main reasons are that more than one message can be sent over a single connection, each message can have multiple recipients, and each recipient may be processed by a different policy, filter, etc. Also, for connections rejected due to bad reputation, there is a multiplier of 3 (configurable), because for these connections we don't know the exact number of recipients the sender was about to inject.

Thus, the numbers of connections and messages per hour will never match (unless you configure your mail flow policies to allow only one message and one recipient per connection); the message count will always be well above the number of connections. This also explains the spikes you see occasionally, where a single host injects a couple hundred messages/recipients via a single connection. This is very common when newsletters or campaigns are sent out.
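To make the arithmetic concrete, here is a rough Python sketch of how a recipient-based total could be estimated from log-derived counts. The formula is only an approximation of the behaviour described above, not the appliance's actual calculation, and the function name and sample numbers are hypothetical.

```python
def estimated_overview_recipients(accepted_recipients, rejected_connections, multiplier=3):
    """Estimate a recipient-based total.

    accepted_recipients: recipients counted on accepted connections.
    rejected_connections: connections rejected due to bad reputation.
    multiplier: assumed recipients per rejected connection (3 by default,
    described above as configurable).
    """
    return accepted_recipients + multiplier * rejected_connections


# Hypothetical hour: 1000 connections, 400 of them rejected by reputation,
# and the remaining 600 carrying 900 recipients in total.
connections = 1000
estimate = estimated_overview_recipients(900, 400)  # 900 + 3 * 400 = 2100
print(f"connections: {connections}, estimated recipient total: {estimate}")
```

With these made-up numbers the connection count comes to a bit under half of the estimated recipient total, which is the same order of gap reported in the question.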

Unfortunately I cannot think of an easy way to retrieve the exact numbers from the logs. In theory you'd simply count the lines showing

MID 123 ICID 345 RID 0 To: <userl@example.com>

for a given time frame, but that would count both incoming and outgoing messages equally.
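As a sketch of that counting approach, the following Python snippet tallies recipient lines per ICID. The pattern is built only from the sample line above, so treat it as an assumption about the log layout; the function name is hypothetical, and the caller is expected to pass in only the lines for the time frame of interest. As noted, it does not distinguish incoming from outgoing mail.

```python
import re
from collections import Counter

# Built from the sample line "MID 123 ICID 345 RID 0 To: <...>" shown above.
RCPT_LINE = re.compile(r"MID (\d+) ICID (\d+) RID (\d+) To: <([^>]*)>")


def recipients_per_icid(lines):
    """Return a Counter mapping ICID (as a string) to its number of recipient lines."""
    counts = Counter()
    for line in lines:
        m = RCPT_LINE.search(line)
        if m:
            counts[m.group(2)] += 1
    return counts
```

The sum of the counter's values is the total recipient count for the supplied lines, and `most_common()` shows which connections injected the most recipients, which can help trace a spike to a single host.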

Hope that helps,

Andreas

Thanks for the reply. Multiple recipients is an important factor I hadn't looked at yet. I expect that to account for some of the difference, but not 50%. The ICID...RID...To: records in the mail log and the "<< RCPT TO" records in smtp debug yield a greater number, but still not even close to the amount of incoming email reported in the stats. There's still 40% of the total missing. I've been looking at busy hour samples. I'm going to look at some of the low volume time periods to see if the difference is constant.

Hello Macklin,

note that there is also the multiplier I mentioned before, which adds three recipients for every rejected connection; these are most likely the 40% you are still missing.

Regards,

Andreas

Thanks for the additional message Andreas. Don't know how I missed the multiplier comment first time. Must be getting old. That explains the discrepancy in the numbers.