This message is (possibly) more of a philosophical question than a plea for help, or at least I thought I would ask before logging a TAC case via our support provider.
We are using 7206 VXR router hardware (256 MB DRAM, NSE-1) for the gateway hosts on our DMZ, and these boxes have been in production ever since the NSE-1 engine became available (roughly 18 months to 2 years ago).
Apart from enforcing security policy, these routers also control access from our internal network to the Internet on a purely "go / no go" basis for each internal host. We charge for access to the Internet, so hosts that are not permitted to access the Internet must not be able to "run up" a large traffic bill. This requires quite granular control over which hosts can and cannot access the Internet.
Thus we end up with an access list of approximately 3,500 entries. Most of them are "permit" entries for individual hosts or small masked groups of hosts, alongside the standard "don't let SNMP in from the outside" style entries.
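To give an idea of the shape of the list, here is a minimal Python sketch of how entries of this kind could be generated from a list of permitted hosts. It is an illustration only, not our actual tooling: the ACL number 101, the function name `acl_lines` and the 10.1.x.x addresses are all made up for the example. Adjacent hosts are merged into a masked group where they happen to align on a block boundary.

```python
import ipaddress


def acl_lines(hosts, acl_number=101):
    """Return IOS-style 'permit' lines for the given host addresses.

    acl_number and the 'permit ip ... any' form are assumptions made
    for illustration; a real list would follow local policy.
    """
    # Merge adjacent host addresses into the smallest set of networks.
    nets = ipaddress.collapse_addresses(
        [ipaddress.ip_network(h) for h in hosts]
    )
    lines = []
    for net in nets:
        if net.prefixlen == 32:
            # Single host entry.
            src = f"host {net.network_address}"
        else:
            # Masked group: IOS ACLs use a wildcard (inverse) mask.
            src = f"{net.network_address} {net.hostmask}"
        lines.append(f"access-list {acl_number} permit ip {src} any")
    return lines


if __name__ == "__main__":
    # Hypothetical permitted hosts: four contiguous addresses and one loner.
    permitted = ["10.1.2.4", "10.1.2.5", "10.1.2.6", "10.1.2.7", "10.1.9.20"]
    for line in acl_lines(permitted):
        print(line)
```

With the hypothetical input above, the four contiguous hosts become one masked entry and the remaining host stays a "host" entry, which is roughly the mix we have across the 3,500 lines.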
The 7206 VXR routers are running IOS version 12.2(10a), and more recently we have tried 12.2(11)T. However, for as long as the TurboACL (access-list compiled) functionality has been available, we have been seeing 5 - 6 minutes (depending upon traffic levels) of repeated error log entries of the type: "%SYS-3-CPUHOG: Task ran for 5564 msec (107/3), process = TurboACL, PC = 60660E28 -Traceback= 60660E30 6047228C 60471EE0 60471FAC 60471E48 60471A14 60473384 60474FE8 604750BC 6047520C" whenever the access lists are modified, which is normally once a day during an ACL update.
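For anyone wanting to put numbers on this from their own syslog, the following is a rough Python sketch that pulls the "Task ran for N msec" figure out of CPUHOG lines for a given process. The regular expression is based only on the single message quoted above, so it is an assumption that other releases format the message the same way; the function name `cpuhog_msec` is mine.

```python
import re

# Pattern derived from the one CPUHOG message quoted in this post;
# other IOS releases may format the line differently.
CPUHOG_RE = re.compile(
    r"%SYS-3-CPUHOG: Task ran for (\d+) msec \(\d+/\d+\), process = ([^,]+),"
)


def cpuhog_msec(log_lines, process="TurboACL"):
    """Return the msec durations of CPUHOG entries for one process."""
    durations = []
    for line in log_lines:
        match = CPUHOG_RE.search(line)
        if match and match.group(2).strip() == process:
            durations.append(int(match.group(1)))
    return durations


if __name__ == "__main__":
    sample = [
        # The message quoted above (traceback shortened here).
        "%SYS-3-CPUHOG: Task ran for 5564 msec (107/3), process = TurboACL, "
        "PC = 60660E28 -Traceback= 60660E30 6047228C",
        # Placeholder for any unrelated log line; it is simply skipped.
        "some unrelated log line",
    ]
    hogs = cpuhog_msec(sample)
    print(f"{len(hogs)} TurboACL CPUHOG entries, {sum(hogs)} msec in total")
```

Summing the durations per ACL update would show whether the hogging grows with the length of the list or with traffic levels.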
Needless to say, during access-list compilation the router itself is very sluggish and reports 100% CPU utilisation, and other process-orientated tasks (e.g. AppleTalk RTMP updates) tend to be affected by the CPU load.
I should say that one small bonus of the 12.2 IOS train is that the routers now at least continue to route traffic during the access-list compilation process. Prior to 12.2, traffic to and from the Internet was likely to be halted, or at least very "patchy", during compilation.
I have checked the technical tips on troubleshooting "CPUHOG" messages and scanned the output of the "Bug Toolkit". There are references to this type of error for other platforms and/or earlier versions of IOS, but nothing current for the 12.2 IOS train.
So, my question is more one of "should I accept the CPU hog messages" as just a symptom of using long ACLs? Should I go ahead and log a fault call regarding this problem? Should we be rethinking how ACLs are applied at our network perimeter? Is the 7206 VXR the right hardware for handling these kinds of access requirements? Are there any "in depth" technical tips on tuning TurboACL for long access lists such as ours?
Lastly, I should add that the 7206 routers are handling 10 - 20 Mb/s of varied traffic, and I have configured NetFlow (ip route-cache flow) on all interfaces after researching and noting that this would also be useful in our situation, where we have big access lists and lots of short-lived connection style traffic.
In earlier versions of IOS I have tried turning off the TurboACL functionality and noted that router utilisation floats at around 70% - 90%, as opposed to 10% - 20% with TurboACL turned on. So obviously the compiled access lists are a wonderful thing. Our issue is that the routers are crippled while we update the access lists, not with how they perform once everything is compiled and working.
If you have got to the end of this long message then "well done"; all comments gratefully received.