High CPU usage caused by interrupts

Mar 7th, 2007

Dear community.

Since I started using my Cisco 2621 as my Internet gateway and upgraded my DSL connection to 20 Mbit/1 Mbit, my CPU usage has risen to almost 90%. It reaches that level when I am downloading at 15 Mbit/s.

A 'show proc cpu sorted' shows that roughly 80% of the CPU usage is caused not by any process but by interrupts:

CPU utilization for five seconds: 87%/81%; one minute: 1%; five minutes: 3%

PID Runtime(ms)  Invoked  uSecs   5Sec   1Min   5Min TTY Process
 41       68700    76661    896  0.95%  0.19%  0.16%   0 IP Input
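
For readers unfamiliar with the header line: in 'show processes cpu' output the first five-second figure is total CPU utilization and the figure after the slash is the share spent at interrupt level, so the difference is what processes consumed. A minimal Python sketch of that arithmetic follows; the function name `split_cpu_line` is made up for illustration and is not part of any Cisco tool.

```python
import re

def split_cpu_line(line):
    """Return (total, interrupt, process) percentages from the five-second figure."""
    match = re.search(r"five seconds:\s*(\d+)%/(\d+)%", line)
    if match is None:
        raise ValueError("no 'five seconds: X%/Y%' figure found")
    total, interrupt = int(match.group(1)), int(match.group(2))
    # Process-level share is whatever is left after interrupt-level work.
    return total, interrupt, total - interrupt

# Header line copied from the output above.
header = "CPU utilization for five seconds: 87%/81%; one minute: 1%; five minutes: 3%"
total, interrupt, process = split_cpu_line(header)
print(f"total {total}%, interrupt {interrupt}%, process {process}%")
```

For the line above this leaves only a few percent for process-level work, which fits the small IP Input figure.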

I am already using 'ip cef' on both FastEthernet interfaces, and the output of 'sh int stat' looks very healthy to me, judging by the ratio of route-cache to process-switched packets:

FastEthernet0/0
Switching path    Pkts In   Chars In   Pkts Out  Chars Out
Processor           75394    6417511      46889    5303988
Route cache        415701   36392514     645159  763558263
Total              491095   42810025     692048  768862251

FastEthernet0/1
Switching path    Pkts In   Chars In   Pkts Out  Chars Out
Processor           16157    3375901      27213    2645817
Route cache        639952  764440471     415434   37734183
Total              656109  767816372     442647   40380000
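
The cache/process ratio mentioned above can be made explicit. The sketch below computes the share of packets handled in the route cache for each interface and direction, using the counters copied from the output above; `route_cache_share` is a hypothetical helper name.

```python
def route_cache_share(processor_pkts, route_cache_pkts):
    """Fraction of packets handled in the route cache rather than by a process."""
    total = processor_pkts + route_cache_pkts
    return route_cache_pkts / total if total else 0.0

# (Processor, Route cache) packet counts copied from the 'sh int stat' output.
counters = {
    "FastEthernet0/0 in": (75394, 415701),
    "FastEthernet0/0 out": (46889, 645159),
    "FastEthernet0/1 in": (16157, 639952),
    "FastEthernet0/1 out": (27213, 415434),
}

for name, (processor, cached) in counters.items():
    print(f"{name}: {route_cache_share(processor, cached):.1%} route cache")
```

Packets switched in the route cache are handled at interrupt level, so a high share here goes together with interrupt time, not process time, dominating the CPU figure.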

NIDWORKR01#sh ip cef sum

IP CEF with switching (Table Version 116), flags=0x0

94 routes, 0 reresolve, 0 unresolved (0 old, 0 new), peak 0

94 leaves, 127 nodes, 144864 bytes, 119 inserts, 25 invalidations

0 load sharing elements, 0 bytes, 0 references

universal per-destination load sharing algorithm, id 3EDCA274

3(0) CEF resets, 1 revisions of existing leaves

Resolution Timer: Exponential (currently 1s, peak 1s)

1 in-place/0 aborted modifications

refcounts: 32828 leaf, 32768 node

Table epoch: 0 (94 entries at this epoch)

Adjacency Table has 90 adjacencies

81 IPv4 adjacencies

I have already followed the Cisco document on high CPU usage caused by interrupts, but those steps did not reduce my CPU usage. Can anybody point me in a direction for troubleshooting the traffic that is causing this? Could the problem be CEF related?

Btw, IOS version 12.3(21)

Thanks in advance!

Danilo Dy Wed, 03/07/2007 - 15:51

Can you post the output of "show proc cpu" or "show tech-support"?

diondohmen Thu, 03/08/2007 - 10:37

Hi Medan,

Here is my 'show proc cpu' output while downloading at 15 Mbit/s:

NIDWORKR01#sh proc cpu | ex 0.0

CPU utilization for five seconds: 93%/81%; one minute: 24%; five minutes: 13%

PID Runtime(ms)  Invoked  uSecs   5Sec   1Min   5Min TTY Process
  3       20036      303  66125 10.95%  4.83%  2.44%  66 SSH Process
 41        2480     1368   1812  0.55%  0.32%  0.22%   0 IP Input

I already tried upgrading to the latest 12.3(22), but no luck. I think it's a bug somewhere, but I am not sure. I hope someone can point me toward a way to show which interrupts are driving the CPU through the roof.

Danilo Dy Thu, 03/08/2007 - 17:00

Hi,

Your high CPU utilization is caused by SSH.

PID CPU Time Process

3 28.69 SSH Process

Try the following, and use Telnet unless you are connecting from an unsecured network (the Internet):

!

line vty 0 4

transport input telnet ssh

To improve TCP performance:

!

service nagle

service tcp keepalive-in

service tcp keepalive-out

diondohmen Thu, 03/08/2007 - 23:41

Hi Medan,

SSH did indeed cause some CPU spikes, but those came from the session I used to capture the "show tech-support" output. After that I started a 15 Mbit/s download and logged out of my SSH session. When I checked my Cacti graph a while later, CPU usage was still peaking at 75%.

I will try enabling the services you mentioned above to improve TCP performance.

Thanks already!

Danilo Dy Fri, 03/09/2007 - 00:19

When it happens again, use Telnet (not SSH) to connect, capture "show tech-support" before you reload the router (if you are going to reload it), and post it here.

Your IOS is okay; it's the latest (GD) :)

diondohmen Fri, 03/09/2007 - 02:11

Thanks, Medan,

I will use Telnet the next time I run "show tech-support", and I will keep this topic updated. Thanks so far.

diondohmen Fri, 03/09/2007 - 09:53

Hi again,

I enabled Telnet access to the box and ran "show tech-support" while the CPU was pegged again. See the attached file.

Danilo Dy Fri, 03/09/2007 - 10:48

Yeesh, that is high CPU utilization.

1. Add the following

!

service nagle

service tcp keepalive-in

service tcp keepalive-out

2. Check this link http://www.cisco.com/en/US/products/hw/routers/ps359/products_tech_note09186a00801c2af0.shtml#pxf_punts

Of the causes listed there, only the following still need checking; I already checked the others against your "show tech-support" output and they are okay:

# An inappropriate switching path is configured on the router

# The CPU is performing alignment corrections

# Too many packets are being punted from PXF to the Route Processor (RP)

Your IOS is okay, and the router interfaces are not overloaded with traffic.

Do you have "%ALIGN-3-CORRECT" messages in the logs? If so, capture the output of the "show align" command.
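
If the router logs are exported to a text file or syslog server, the check for "%ALIGN-3-CORRECT" can be scripted. This is only a sketch: `find_log_messages` is a hypothetical helper, and the sample lines are placeholders, not real router output.

```python
def find_log_messages(log_lines, mnemonic="%ALIGN-3-CORRECT"):
    """Return the log lines that contain the given IOS message mnemonic."""
    return [line for line in log_lines if mnemonic in line]

# Placeholder lines for illustration only; real log lines will look different.
sample_log = [
    "<timestamp> %ALIGN-3-CORRECT: <details elided>",
    "<timestamp> <some other message>",
]

hits = find_log_messages(sample_log)
print(f"{len(hits)} matching line(s)")
```

The same function can look for any other mnemonic by passing it as the second argument.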

diondohmen Fri, 03/09/2007 - 16:29

I have added those three services, which gave me the following results:

NIDWORKR01#sh proc cpu | ex 0.0

CPU utilization for five seconds: 91%/76%; one minute: 43%; five minutes: 19%

PID Runtime(ms)  Invoked  uSecs   5Sec   1Min   5Min TTY Process
123       10932      298  36684 13.91%  3.20%  0.96%  66 Virtual Exec

Alignment errors:

NIDWORKR01#sh align

No spurious memory references have been recorded.

I don't have any log lines about alignment or other possible misbehavior either :/

I will check items 1 and 3 tomorrow. By the way, here is something that might be a reference point in solving this mystery:

Buffers:

NIDWORKR01#sh buffers

Buffer elements:

1118 in free list (1000 max allowed)

343951 hits, 0 misses, 1119 created

Public buffer pools:

Small buffers, 104 bytes (total 50, permanent 50, peak 96 @ 1d03h):

48 in free list (20 min, 150 max allowed)

202772 hits, 42 misses, 46 trims, 46 created

1 failures (0 no memory)

Middle buffers, 600 bytes (total 25, permanent 25, peak 49 @ 1d03h):

23 in free list (10 min, 150 max allowed)

29261 hits, 8 misses, 24 trims, 24 created

0 failures (0 no memory)

Big buffers, 1536 bytes (total 50, permanent 50, peak 92 @ 1d03h):

49 in free list (5 min, 150 max allowed)

29746 hits, 220 misses, 42 trims, 42 created

135 failures (0 no memory)

VeryBig buffers, 4520 bytes (total 10, permanent 10, peak 13 @ 1d03h):

10 in free list (0 min, 20 max allowed)

76 hits, 59 misses, 3 trims, 3 created

59 failures (0 no memory)

Large buffers, 5024 bytes (total 1, permanent 0, peak 3 @ 1d03h):

1 in free list (0 min, 10 max allowed)

8 hits, 51 misses, 12 trims, 13 created

51 failures (0 no memory)

Huge buffers, 18024 bytes (total 1, permanent 0, peak 2 @ 1d03h):

1 in free list (0 min, 4 max allowed)

4 hits, 47 misses, 11 trims, 12 created

47 failures (0 no memory)

You can see some failures occurring in a few buffer pools; I still have to look into that.
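
To see at a glance which pools report failures, the 'sh buffers' text can be summarized per pool. The sketch below assumes the line layout shown above (a pool header, then a "hits, misses" line, then a "failures" line); `buffer_failures` is a hypothetical name, and the sample is an excerpt of the output above.

```python
import re

def buffer_failures(text):
    """Map each buffer pool name to (misses, failures) from 'show buffers' text."""
    pools = {}
    current = None
    misses = 0
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"(\w+) buffers, \d+ bytes", line)
        if header:
            current, misses = header.group(1), 0
            continue
        if current is None:
            continue  # lines before the first pool header (e.g. buffer elements)
        miss = re.search(r"(\d+) misses", line)
        if miss:
            misses = int(miss.group(1))
        fail = re.match(r"(\d+) failures", line)
        if fail:
            pools[current] = (misses, int(fail.group(1)))
            current = None
    return pools

# Excerpt copied from the output above.
sample = """Big buffers, 1536 bytes (total 50, permanent 50, peak 92 @ 1d03h):
49 in free list (5 min, 150 max allowed)
29746 hits, 220 misses, 42 trims, 42 created
135 failures (0 no memory)
VeryBig buffers, 4520 bytes (total 10, permanent 10, peak 13 @ 1d03h):
10 in free list (0 min, 20 max allowed)
76 hits, 59 misses, 3 trims, 3 created
59 failures (0 no memory)"""

for pool, (misses, failures) in buffer_failures(sample).items():
    print(f"{pool}: {misses} misses, {failures} failures")
```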

Thanks again

Danilo Dy Fri, 03/09/2007 - 18:55

Hi,

Regarding the buffer counters, this is strange, because the input/output queues of the interfaces look normal and the throttles and drops are zero.

I also checked your free memory and it looks good. Do you have this error in your log? %SYS-2-MALLOCFAIL

The buffer counters can only be cleared by reloading the router. If you can arrange maintenance downtime, reload the router to clear them and then watch the counters, because you might be looking at historical failures that have nothing to do with the current problem.

Incorrectly adjusting the system buffers can severely affect hardware and network performance, so we want to make sure that you really have a buffer problem. I'm still looking for the buffer calculator, but I can't find it :)

Here are some useful links about buffer counters:

http://www.cisco.com/en/US/products/hw/iad/ps397/products_tech_note09186a00800a7b85.shtml

http://www.cisco.com/en/US/products/hw/routers/ps133/products_tech_note09186a00800a7b80.shtml

If you have a back-to-back maintenance contract with Cisco, I suggest you open a TAC case with them.

diondohmen Sat, 03/10/2007 - 02:06

I will schedule a reboot tonight to clear the buffer counters. One thing I may have forgotten to mention is that the DSL connection terminates in a Thomson SpeedTouch ADSL2+ modem set to bridged mode, so it passes all connections through to the next hop, the 2621. I have already disabled all unneeded services on the modem, such as IDS and the firewall, but maybe the communication between the two is not optimal. I'll keep you posted!

diondohmen Thu, 09/06/2007 - 11:10

I finally had time to do a reboot, but I am still seeing high CPU usage caused by interrupts. Is this simply the most a Cisco 2600 router can do with NAT enabled?

Thanks..
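
One way to approach that question is to estimate the packet rate the router sees, since forwarding cost is driven by packets per second more than by bits per second. The rough sketch below derives an average packet size from the FastEthernet0/1 route-cache inbound counters posted earlier in this thread and scales it to a 15 Mbit/s download. The helper names are hypothetical, and the result ignores the return traffic and any framing overhead in the "Chars" counters, so treat it as a ballpark figure to hold against a platform's published performance numbers.

```python
def average_packet_bytes(packets, chars):
    """Average packet size in bytes from a packet count and a byte count."""
    return chars / packets

def packets_per_second(bits_per_second, avg_packet_bytes):
    """Packet rate needed to carry the given bit rate at that packet size."""
    return bits_per_second / 8 / avg_packet_bytes

# Route cache 'Pkts In' and 'Chars In' for FastEthernet0/1, from earlier in the thread.
avg = average_packet_bytes(639952, 764440471)
pps = packets_per_second(15_000_000, avg)
print(f"average packet: {avg:.0f} bytes, about {pps:.0f} packets/s at 15 Mbit/s")
```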

diondohmen Sat, 09/08/2007 - 06:43

Ah. Thanks for sharing this chart.

Still, I wonder what kind of hardware ISPs use in their routers for the SOHO market these days. The standard router my ISP delivers on a 20 Mbit line is an SMC 79xx. I can't imagine these routers have better hardware built in than Cisco's?

But according to this PDF, the CPU usage is indeed within the expected range.
