My 2651XM's CPU reaches 100% and the router becomes unstable. Once I reload, it is OK again, but CPU usage then starts increasing slowly: 27%, 28%, ... 61%, 62%, ... 91%, 92% ... boom. Reload, and the same thing happens. When I run "show processes cpu sorted", IP Input takes the lead. The odd part (maybe I don't know how to interpret "show processes cpu sorted") is this: say the first line reports 60% usage for the last 5 seconds, and the following lines show each process's last-5-second usage, such as IP Input 23%, process x z%, and so on. When I add up all the processes' last-5-second usage, the sum is NOT equal to the total stated in the first line. Something is incrementally using my CPU and not freeing it, and I can't see it.
What is the exact version of IOS you are running?
We had a situation lately with a version of IOS for our 6500s, where it was using up memory and not freeing it. Every 2 weeks or so when the boxes would run out of memory, they would reboot on their own or just crash and do core dump. We used a different version of IOS and it seems to be ok for now.
So, I would suggest trying a different version of IOS.
The exact IOS image I use is c2600-advsecurityk9-mz.124-4.XC7.bin. My router was also using a lot of memory and crashing. I checked the crash files and saw lots of NAT translations, so I decreased the dynamic translation timeout, and that resolved the memory issue. But CPU usage is still high. AFAIK, IP Input is the process that handles process switching, but CEF is enabled on all interfaces. Output is below:
Switching path    Pkts In    Chars In   Pkts Out    Chars Out
Process            188064    25360571     235406    299542784
Cache misses          576           -          -            -
Fast              2355443   731664062    3025315   2296058772
Auton/SSE               0           0          0            0
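For a rough sense of scale, the counters above can be turned into a process-switched percentage. This is only a minimal Python sketch with the figures copied from the output; the function name process_switched_share is made up for illustration, and leaving "Cache misses" out of the totals is an assumption (it has no Pkts Out figure to compare against).

```python
# Packet counters copied from the switching-path output: (pkts_in, pkts_out).
# "Cache misses" is left out on purpose (assumption: not a separate path total).
counters = {
    "Process": (188064, 235406),
    "Fast": (2355443, 3025315),
    "Auton/SSE": (0, 0),
}

def process_switched_share(counters):
    """Return the percentage of packets in the Process path as (in, out)."""
    total_in = sum(pkts_in for pkts_in, _ in counters.values())
    total_out = sum(pkts_out for _, pkts_out in counters.values())
    proc_in, proc_out = counters["Process"]
    return 100.0 * proc_in / total_in, 100.0 * proc_out / total_out

share_in, share_out = process_switched_share(counters)
print(f"process-switched: {share_in:.1f}% in, {share_out:.1f}% out")
```

With these numbers it comes out at roughly 7% in each direction, so most of the traffic is taking the fast path rather than going through IP Input.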
I suspect the design, which is configured as a "router on a stick". Int f0/0 faces the internet and int f0/1 is a trunk with 26 subinterfaces, each the gateway for one of 26 individual VLANs. Inter-VLAN routing is denied by ACL; only subinterface-to-internet traffic is allowed. I know that a router-on-a-stick configuration is bad practice, but that still doesn't explain the incremental CPU usage. Here is the weird thing:
CPU utilization for five seconds: 69%/56%; one minute: 64%; five minutes: 63%
 PID Runtime(ms)  Invoked   uSecs   5Sec   1Min   5Min TTY Process
  61     3850219   111311   34589  9.25%  9.68%  8.05%   0 IP Input
 142       43985     2739   16058  1.47%  0.53%  0.54%   0 encrypt proc
  43       20963      145  144572  0.65%  0.08%  0.06%   0 Per-minute Jobs
  64        3816      811    4705  0.65%  0.05%  0.06%  66 SSH Process
 170      157796    32858    4802  0.40%  0.39%  0.34%   0 IP NAT Ager
 181       51320    63841     803  0.32%  0.18%  0.16%   0 NAT MIB Helper
 104       54616    13644    4002  0.16%  0.18%  0.15%   0 DHCPD Receive
   2       33681     1551   21715  0.08%  0.06%  0.08%   0 Load Meter
  30       35350     6479    5456  0.08%  0.05%  0.06%   0 TTY Background
  31       11830     7249    1631  0.08%  0.02%  0.00%   0 Per-Second Jobs
 164       11406     3899    2925  0.08%  0.01%  0.00%   0 Syslog
  53        2684     6483     414  0.08%  0.00%  0.00%   0 PI MATM Aging Pr
  95       74804    11796    6341  0.08%  0.14%  0.13%   0 Inspect process
  14           4        2    2000  0.00%  0.00%  0.00%   0 DDR Timers
The sum of the individual processes' CPU usage is not equal to the figure given in the line "CPU utilization for five seconds: 69%/56%; one minute: 64%; five minutes: 63%".
Even IP Input is less than 10%.
Does "show log" tell you anything about unusual activity related to a process?
Also, how big is your route table?
Are you running BGP?
No BGP, no dynamic routing protocols, and about 25 routes in the table, most of them connected routes of the subinterfaces. I started logging to a syslog server but the logs look fine. What recent IOS version can I download for the 2651XM?
Agreed Glen, but it was already configured that way when I arrived. One "deny any any" ACL entry that was inserted via the firewall wizard was logging; I removed it but the problem still goes on. I also upgraded to the latest IOS, increased max. Upon seeing some VFR fragment maximum messages in syslog, I changed the default value of 16 to "virtual-reassembly max-fragments 64". The problem still persists. NAT is also CPU and memory intensive, but the max dynamic translation count is 3000.
The thing that drives me crazy is that I CAN'T locate that CPU-hogging process!! Even Windows is capable of that via Task Manager!!!
If it's not one of the CPU processes, then it is probably being driven at interrupt level. This doc might help; if not, you may have to get the TAC involved.
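To make that concrete: in the "five seconds: 69%/56%" line, the first figure is total CPU and the second is the share spent at interrupt level, so the per-process lines should only add up to about the difference. Below is a small Python sketch using the numbers from the output posted earlier. The function name split_cpu_line is made up for illustration, and the list covers only the processes that were pasted, not the full table.

```python
import re

header = ("CPU utilization for five seconds: 69%/56%; "
          "one minute: 64%; five minutes: 63%")

# 5Sec column of each process line pasted earlier in the thread.
five_sec = [9.25, 1.47, 0.65, 0.65, 0.40, 0.32, 0.16,
            0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.00]

def split_cpu_line(line):
    """Return (total, interrupt, process) percentages for the 5-second field."""
    m = re.search(r"five seconds:\s*(\d+)%/(\d+)%", line)
    if m is None:
        raise ValueError("no five-second total/interrupt pair found")
    total, interrupt = int(m.group(1)), int(m.group(2))
    # Whatever is not interrupt level is what the process list can account for.
    return total, interrupt, total - interrupt

total, interrupt, process = split_cpu_line(header)
listed = sum(five_sec)
print(f"total {total}%, interrupt {interrupt}%, process level {process}%")
print(f"sum of listed processes: {listed:.2f}%")
```

The listed processes add up to about 13%, which matches 69 minus 56. The other 56% is interrupt-level work, which never appears as a line in the process list.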
Glen, thanks for the helpful link! I got some output via profiling in order to find out what is loading the interrupt level so much (interface? NAT? etc.), but I don't know how to interpret the output. If I could establish that the cause is the 26 subinterfaces hogging f0/1, and thus the CPU, I would add an L3 switch to resolve the issue. But I can't approve the purchase without knowing the exact reason.