04-25-2014 11:54 AM
Hi everyone.
We are a little concerned about frequent CPU utilization peaks on a Cisco Nexus 7k (NX-OS version 6.2(2)). The peaks often reach 100%. Here's the behavior:
# show processes cpu history
[ASCII graph omitted: CPU% per second (last 60 seconds); # = average CPU%. Average mostly at or below 20%, with a few samples at 30-40%.]
[ASCII graph omitted: CPU% per minute (last 60 minutes); * = maximum CPU%, # = average CPU%. Average 10-20%; per-minute maximum between 31% and 99%.]
[ASCII graph omitted: CPU% per hour (last 72 hours); * = maximum CPU%, # = average CPU%. Average 10-20%; maximum 98-100% in every hour.]
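The digit rows printed above each graph in show processes cpu history are read vertically: in the per-minute graph, the upper row holds the tens digit and the lower row the units digit of each column's peak. A minimal Python sketch, using the two rows from the per-minute graph in the original output, decodes them into percentages. The function name decode_history_row and the 90% cutoff are assumptions chosen for illustration.

```python
from __future__ import annotations


def decode_history_row(tens: str, units: str) -> list[int]:
    """Combine two vertically stacked digit rows into per-column percentages."""
    if len(tens) != len(units):
        raise ValueError("digit rows must have the same length")
    return [int(t) * 10 + int(u) for t, u in zip(tens, units)]


# The two digit rows printed above the per-minute graph in the post.
TENS = "899465895558965598634693655934549356699455996449946569555794"
UNITS = "350898451398942329713869146452267760759323381261483307572275"

peaks = decode_history_row(TENS, UNITS)

# 90% is an arbitrary cutoff picked for this example, not a Cisco threshold.
busy_minutes = sum(1 for p in peaks if p >= 90)

print(
    f"peak range: {min(peaks)}-{max(peaks)}%, "
    f"minutes at or above 90%: {busy_minutes} of {len(peaks)}"
)
# prints: peak range: 31-99%, minutes at or above 90%: 16 of 60
```

Decoding the rows this way makes it easier to see how often the peaks occur, which the star columns alone only show in 10% bands.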
I got these outputs from the Nexus:
# show processes cpu sort | ex 0.0
CPU utilization for five seconds: 23%/0%; one minute: 14%; five minutes: 15%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
----- ----------- -------- ----- ------ ------ ------ --- -----------
10294 54052210 14300533 3 4.50% 1.45% 1.96% - statsclient
5345 12271480 3637181 3 1.91% 0.59% 0.51% - oc_usd
5326 7924650 2329273 3 1.24% 0.44% 0.34% - sensor
4444 23015630 8269523 2 1.05% 0.90% 0.89% - platform
5216 22313520 8459875 2 0.95% 0.96% 0.97% - sysinfo
5261 7913470 2572469 3 0.47% 0.29% 0.31% - diagclient
5232 13726360 6699935 2 0.38% 0.52% 0.54% - diagmgr
5108 4240770 130432066 0 0.19% 0.16% 0.16% - pdsd
5322 3282920 3755142 0 0.19% 0.13% 0.13% - R2D2_usd
# show system resources
Load average: 1 minute: 0.88 5 minutes: 0.61 15 minutes: 0.56
Processes : 1439 total, 1 running
CPU states : 4.50% user, 10.00% kernel, 85.50% idle
CPU0 states : 6.00% user, 10.00% kernel, 84.00% idle
CPU1 states : 3.00% user, 10.00% kernel, 87.00% idle
Memory usage: 8260668K total, 4833504K used, 3427164K free
Current memory status: OK
# show system internal processes cpu
top - 11:20:10 up 14 days, 8:35, 3 users, load average: 0.62, 0.56, 0.56
Tasks: 650 total, 1 running, 649 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.7%us, 6.8%sy, 0.4%ni, 85.9%id, 0.0%wa, 0.2%hi, 1.0%si, 0.0%st
Mem: 8260668k total, 5333156k used, 2927512k free, 81408k buffers
Swap: 0k total, 0k used, 0k free, 1954140k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10945 ryramire 20 0 3868 1788 1148 R 5.6 0.0 0:00.06 top
4444 root 20 0 199m 13m 9236 S 3.7 0.2 383:33.06 pfm
5327 root -2 0 180m 14m 8552 S 1.9 0.2 763:20.54 xbar_driver
8109 root 20 0 278m 21m 12m S 1.9 0.3 99:24.51 pm
9109 root 20 0 278m 21m 12m S 1.9 0.3 230:52.80 pm
9836 svc-isan 20 0 255m 23m 14m S 1.9 0.3 48:06.21 stp
1 root 20 0 1988 608 532 S 0.0 0.0 0:12.30 init
2 root 15 -5 0 0 0 S 0.0 0.0 0:00.01 kthreadd
3 root RT -5 0 0 0 S 0.0 0.0 0:02.10 migration/0
4 root 15 -5 0 0 0 S 0.0 0.0 8:43.30 ksoftirqd/0
5 root -2 -5 0 0 0 S 0.0 0.0 0:48.33 watchdog/0
6 root RT -5 0 0 0 S 0.0 0.0 0:02.13 migration/1
7 root 15 -5 0 0 0 S 0.0 0.0 7:41.01 ksoftirqd/1
Is this a normal condition? I've read this in the Cisco documentation:
The Cisco NX-OS operating system takes advantage of preemptive CPU multitasking, so processes can take advantage of an idle CPU in order to complete tasks faster. Therefore, the history option may report CPU spikes that do not necessarily indicate a problem.
Any recommendations? Is there any troubleshooting I can do? Are there other useful commands? What would normal conditions look like?
Thanks a lot in advance!
Fabio.
04-25-2014 08:03 PM
Hi Fabio,
If it is not happening that frequently and you are not seeing any drops, then it is fine.
Also check the output of show log for any messages of concern. If you find an issue, it would be best to troubleshoot it with EEM, because the spikes happen at random times.
Check the link below for the EEM config:
https://supportforums.cisco.com/document/64816/eem-script-nexus-7000-switches-monitor-cpu-utilization
Thanks-
Afroz
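The idea behind an event-triggered capture can be sketched outside the switch. The Python below is not EEM syntax; it only illustrates, under assumed thresholds, why a trigger with separate entry and exit values fires once per spike rather than on every high sample. The function name spike_triggers, the thresholds 60 and 30, and the last two sample values are assumptions for illustration; the first four samples are five-second figures reported in this thread.

```python
def spike_triggers(samples, entry=60, exit_=30):
    """Return the indices at which a capture would fire.

    A capture fires when a sample reaches `entry` while the trigger is
    armed. The trigger re-arms only after a sample drops to `exit_` or below.
    """
    triggers = []
    armed = True
    for index, value in enumerate(samples):
        if armed and value >= entry:
            triggers.append(index)
            armed = False
        elif not armed and value <= exit_:
            armed = True
    return triggers


# 23, 72, 81, 74 are five-second CPU% values from the thread;
# 20 and 85 are made-up values added to show the re-arm behaviour.
samples = [23, 72, 81, 74, 20, 85]
print(spike_triggers(samples))  # prints: [1, 5]
```

With a single threshold the three consecutive high samples would each fire a capture; the exit value keeps the log to one entry per spike.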
05-07-2014 07:06 AM
Hi!
I ran an EEM script to capture a "show proc cpu sort" as soon as a CPU spike was detected. The results were:
CPU utilization for five seconds: 72%/4%; one minute: 26%; five minutes: 16%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
----- ----------- -------- ----- ------ ------ ------ --- -----------
4444 40205050 12415521 3 34.59% 7.25% 2.27% - platform
10294 94405200 24897408 3 8.26% 2.57% 2.20% - statsclient
5327 81533840 3494387 23 1.33% 1.92% 1.86% - xbar_driver_usd
5216 39232700 14916113 2 0.76% 0.72% 0.72% - sysinfo
5108 7333350 226398351 0 0.19% 0.17% 0.16% - pdsd
5222 6022890 1824713 3 0.19% 0.11% 0.12% - plog_sup
4 903680 140208710 0 0.09% 0.03% 0.02% - ksoftirqd/0
7 794690 116303858 0 0.09% 0.03% 0.01% - ksoftirqd/1
3507 3066470 14086611 0 0.09% 0.03% 0.01% - sysmgr
CPU utilization for five seconds: 81%/4%; one minute: 27%; five minutes: 16%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
----- ----------- -------- ----- ------ ------ ------ --- -----------
4444 40906590 12564873 3 38.73% 7.26% 2.27% - platform
5948 2715690 31449621 0 6.69% 1.77% 0.43% - snmpd
5327 82972760 3553756 23 2.58% 2.10% 1.89% - xbar_driver_usd
5326 14357780 4218517 3 2.48% 0.42% 0.33% - sensor
5880 2190560 46777275 0 2.39% 0.70% 0.18% - netstack
5345 22232320 6582964 3 2.29% 0.54% 0.50% - oc_usd
5232 24871960 12054356 2 2.00% 0.59% 0.55% - diagmgr
5216 39797820 15159288 2 1.81% 0.81% 0.74% - sysinfo
5261 14339770 4692217 3 1.33% 0.35% 0.33% - diagclient
5865 897270 17393212 0 0.86% 0.26% 0.07% - pktmgr
3507 3077410 14181344 0 0.47% 0.06% 0.02% - sysmgr
5108 7461870 230410770 0 0.19% 0.16% 0.16% - pdsd
6269 234950 1185977 0 0.19% 0.04% 0.01% - ethpm
CPU utilization for five seconds: 74%/4%; one minute: 31%; five minutes: 17%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
----- ----------- -------- ----- ------ ------ ------ --- -----------
4444 40910890 12569459 3 41.17% 9.97% 2.92% - platform
5948 2716500 31454618 0 7.75% 2.25% 0.55% - snmpd
5880 2190750 46780136 0 1.81% 0.79% 0.21% - netstack
5216 39797980 15160241 2 1.53% 0.87% 0.75% - sysinfo
5327 82972820 3553760 23 0.57% 1.98% 1.87% - xbar_driver_usd
5865 897330 17394277 0 0.57% 0.28% 0.08% - pktmgr
3507 3077460 14182233 0 0.47% 0.10% 0.03% - sysmgr
27 322310 921709 0 0.38% 0.03% 0.00% - kide/1
5108 7461890 230411316 0 0.19% 0.16% 0.16% - pdsd
5222 6128730 1856342 3 0.19% 0.13% 0.14% - plog_sup
5322 5942590 6780564 0 0.19% 0.12% 0.13% - R2D2_usd
5102 8910 303848 0 0.09% 0.00% 0.00% - vshd
5217 153080 977776 0 0.09% 0.00% 0.00% - statsprofiler
5230 4410 164888 0 0.09% 0.01% 0.00% - evmc
Is this a normal condition? It seems the spikes are always related to the process "platform".
Thanks!
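The observation that "platform" tops every capture can be checked mechanically. A minimal Python sketch, assuming the row layout shown in the captures (PID, Runtime(ms), Invoked, uSecs, 5Sec, 1Min, 5Min, TTY, Process), picks the busiest process by the 5Sec column. The names ROW, top_process and SNAPSHOTS are hypothetical, and only the first two rows of each capture are included to keep it short.

```python
import re

# Matches one data row of `show processes cpu sort`:
# PID  Runtime(ms)  Invoked  uSecs  5Sec  1Min  5Min  TTY  Process
ROW = re.compile(
    r"^\s*\d+\s+\d+\s+\d+\s+\d+\s+([\d.]+)%\s+[\d.]+%\s+[\d.]+%\s+\S+\s+(\S+)\s*$"
)


def top_process(snapshot):
    """Return (process_name, five_sec_percent) for the busiest row."""
    rows = []
    for line in snapshot.splitlines():
        match = ROW.match(line)
        if match:  # header, separator and summary lines do not match
            rows.append((match.group(2), float(match.group(1))))
    if not rows:
        raise ValueError("no process rows found in snapshot")
    return max(rows, key=lambda row: row[1])


# First two process rows of each of the three captures in this post.
SNAPSHOTS = [
    """CPU utilization for five seconds: 72%/4%; one minute: 26%; five minutes: 16%
4444 40205050 12415521 3 34.59% 7.25% 2.27% - platform
10294 94405200 24897408 3 8.26% 2.57% 2.20% - statsclient""",
    """CPU utilization for five seconds: 81%/4%; one minute: 27%; five minutes: 16%
4444 40906590 12564873 3 38.73% 7.26% 2.27% - platform
5948 2715690 31449621 0 6.69% 1.77% 0.43% - snmpd""",
    """CPU utilization for five seconds: 74%/4%; one minute: 31%; five minutes: 17%
4444 40910890 12569459 3 41.17% 9.97% 2.92% - platform
5948 2716500 31454618 0 7.75% 2.25% 0.55% - snmpd""",
]

for number, snapshot in enumerate(SNAPSHOTS, start=1):
    name, pct = top_process(snapshot)
    print(f"capture {number}: {name} at {pct}%")
```

Run against the full captures appended by the EEM script, the same function would show whether any process other than "platform" ever leads a spike.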
05-07-2014 10:14 AM
Hi,
It looks fine to me.
The Cisco NX-OS operating system takes advantage of preemptive CPU multitasking, so processes can take advantage of an idle CPU in order to complete tasks faster. Therefore, the history option may report CPU spikes that do not necessarily indicate a problem. However, if average CPU usage remains high compared to normal, baseline CPU usage for a particular network, you might need to investigate high CPU usage.
Thanks-
Afroz
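The distinction between a short spike and a sustained rise can be illustrated with the summary lines already posted in this thread. A minimal Python sketch, assuming the five-minute figure from the first capture (15%) as the baseline and an arbitrary 10-point margin, labels each summary line. The names parse_summary and classify and the margin are assumptions for illustration; the value after the slash is parsed but not used.

```python
import re

SUMMARY = re.compile(
    r"five seconds:\s*(\d+)%/(\d+)%;\s*one minute:\s*(\d+)%;\s*five minutes:\s*(\d+)%"
)


def parse_summary(line):
    """Extract the three averages from a 'CPU utilization for ...' line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not a CPU utilization summary line")
    five_sec, _after_slash, one_min, five_min = map(int, match.groups())
    return {"five_sec": five_sec, "one_min": one_min, "five_min": five_min}


def classify(line, baseline_five_min, margin=10):
    """Label a summary line as 'sustained', 'spike' or 'normal'."""
    stats = parse_summary(line)
    limit = baseline_five_min + margin
    if stats["five_min"] > limit:
        return "sustained"  # the average itself has moved above baseline
    if stats["five_sec"] > limit:
        return "spike"  # short burst while the average stays near baseline
    return "normal"


# Summary lines copied from the captures in this thread.
LINES = [
    "CPU utilization for five seconds: 23%/0%; one minute: 14%; five minutes: 15%",
    "CPU utilization for five seconds: 72%/4%; one minute: 26%; five minutes: 16%",
    "CPU utilization for five seconds: 81%/4%; one minute: 27%; five minutes: 16%",
    "CPU utilization for five seconds: 74%/4%; one minute: 31%; five minutes: 17%",
]

for line in LINES:
    print(classify(line, baseline_five_min=15), "<-", line)
```

With these inputs the first line is labelled normal and the three EEM captures are labelled spike, since the five-minute average stays at 16-17% throughout.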