
high cpu due to Collection proce

dhalevi
Level 1

Hello,

I have a system that is experiencing high CPU utilization due to the "Collection proce" process. Does anyone have any information about this process?

CPU utilization for five seconds: 100%/0%; one minute: 100%; five minutes: 100%
PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
343   289389920   1443139     200537 98.42% 97.69% 97.33%   0 Collection proce
137     2445976   1623863       1506  1.10%  0.31%  0.28%   0 BGP Router
345      401492    250680       1601  0.15%  0.08%  0.09%   0 HIDDEN VLAN Proc
   4           0         1          0  0.00%  0.00%  0.00%   0 EM Action CNS
   5           4       179         22  0.00%  0.00%  0.00%   0 Retransmission o
   6           0         5          0  0.00%  0.00%  0.00%   0 IPC ISSU Dispatc
   7           0         1          0  0.00%  0.00%  0.00%   0 PF Redun ICC Req
   8     3490512    194986      17901  0.00%  0.64%  0.71%   0 Check heaps
   3          84       896         93  0.00%  0.00%  0.00%   1 SSH Process
  10           0         2          0  0.00%  0.00%  0.00%   0 Timers
   2         684     99607          6  0.00%  0.00%  0.00%   0 Load Meter
  12         172    482233          0  0.00%  0.00%  0.00%   0 ARP Background
  13           0         2          0  0.00%  0.00%  0.00%   0 ATM Idle Timer
  14           0         1          0  0.00%  0.00%  0.00%   0 ATM ASYNC PROC
  15           0         1          0  0.00%  0.00%  0.00%   0 AAA_SERVER_DEADT
   9          28       151        185  0.00%  0.00%  0.00%   0 Pool Manager
  11        1040     33022         31  0.00%  0.00%  0.00%   0 ARP Input
  18           0       546          0  0.00%  0.00%  0.00%   0 EEM ED Syslog
  19           0         1          0  0.00%  0.00%  0.00%   0 IFS Agent Manage
  20           0      8348          0  0.00%  0.00%  0.00%   0 IPC Dynamic Cach

thanks!
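
For anyone triaging similar output, here is a minimal Python sketch that parses pasted "show processes cpu" text and lists the processes above a CPU threshold. The row layout is taken from the output above; the 50% threshold and the names `parse_proc_cpu` and `busiest` are assumptions made for illustration, not part of any Cisco tool.

```python
import re
from typing import List, NamedTuple


class ProcCpu(NamedTuple):
    pid: int
    five_sec: float
    one_min: float
    five_min: float
    name: str


# One process row: PID, Runtime(ms), Invoked, uSecs, three percentages, TTY, name.
ROW = re.compile(
    r"^\s*(\d+)\s+\d+\s+\d+\s+\d+\s+"
    r"([\d.]+)%\s+([\d.]+)%\s+([\d.]+)%\s+\d+\s+(.+?)\s*$"
)


def parse_proc_cpu(text: str) -> List[ProcCpu]:
    """Return one ProcCpu per process row; header and summary lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            pid, s5, m1, m5, name = m.groups()
            rows.append(ProcCpu(int(pid), float(s5), float(m1), float(m5), name))
    return rows


def busiest(rows: List[ProcCpu], threshold: float = 50.0) -> List[ProcCpu]:
    """Processes whose five-minute average is at or above threshold (assumed 50%)."""
    hot = [r for r in rows if r.five_min >= threshold]
    return sorted(hot, key=lambda r: r.five_min, reverse=True)


SAMPLE = """\
CPU utilization for five seconds: 100%/0%; one minute: 100%; five minutes: 100%
PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
343   289389920   1443139     200537 98.42% 97.69% 97.33%   0 Collection proce
137     2445976   1623863       1506  1.10%  0.31%  0.28%   0 BGP Router
  8     3490512    194986      17901  0.00%  0.64%  0.71%   0 Check heaps
"""

for proc in busiest(parse_proc_cpu(SAMPLE)):
    print(proc.pid, proc.name, proc.five_min)
```

Filtering on the five-minute column rather than the five-second one is a design choice here: it separates a sustained condition like this one from a short spike.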


Sergei Vasilenko
Cisco Employee

Hello,

Which platform and IOS version is that?

Please provide at least the "show ver" output.

Thanks,

Sergey

Hi Sergey,

I have the same issue on a Catalyst 6503-E. Here is the output of "show proc cpu" and "show ver":

CPU utilization for five seconds: 99%/5%; one minute: 70%; five minutes: 84%
PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
511     6863628     43386     158199 94.48% 61.99% 74.73%   0 Collection proce
331     3365244  16305707        206  0.07%  0.02%  0.01%   0 CEF: IPv4 proces
   2        8304   2229535          3  0.00%  0.00%  0.00%   0 Load Meter      
   1           4       115         34  0.00%  0.00%  0.00%   0 Chunk Manager   
   3       76044    278801        272  0.00%  0.00%  0.00%   0 IPv6 RIB Redistr
   4           0        28          0  0.00%  0.00%  0.00%   0 Retransmission o
   7           0         1          0  0.00%  0.00%  0.00%   0 PF Redun ICC Req
   5           0         2          0  0.00%  0.00%  0.00%   0 IPC ISSU Dispatc
   6           0         1          0  0.00%  0.00%  0.00%   0 PF Redun ICC Req
  10           0         2          0  0.00%  0.00%  0.00%   0 Timers          
   8   142388776   7116784      20007  0.00%  0.72%  1.01%   0 Check heaps     
  12           0         1          0  0.00%  0.00%  0.00%   0 AAA_SERVER_DEADT
  13           0         2          0  0.00%  0.00%  0.00%   0 AAA high-capacit
  14           0         1          0  0.00%  0.00%  0.00%   0 Policy Manager  
  15          12        26        461  0.00%  0.00%  0.00%   0 Entity MIB API  
   9         988      2334        423  0.00%  0.00%  0.00%   0 Pool Manager    
  11     9278880  81934656        113  0.00%  0.05%  0.02%   0 ARP Input       
  18        2364    185490         12  0.00%  0.00%  0.00%   0 IPC Dynamic Cach
  19           0         1          0  0.00%  0.00%  0.00%   0 NTI Example Proc
  20       23584  11125407          2  0.00%  0.00%  0.00%   0 IPC Periodic Tim
  16           8       151         52  0.00%  0.00%  0.00%   0 EEM ED Syslog   
  22           0         1          0  0.00%  0.00%  0.00%   0 IPC Process leve
  23    15866508 172741161         91  0.00%  0.02%  0.00%   0 IPC Seat Manager
  24           0         1          0  0.00%  0.00%  0.00%   0 IPC Session Serv
  25           0         1          0  0.00%  0.00%  0.00%   0 IPC Stdby Update
  26           0         2          0  0.00%  0.00%  0.00%   0 DDR Timers      
  27           0         2          0  0.00%  0.00%  0.00%   0 Dialer event    
  28           0         1          0  0.00%  0.00%  0.00%   0 ifIndex Receive 
  29           0         2          0  0.00%  0.00%  0.00%   0 Serial Backgroun
  30           0         1          0  0.00%  0.00%  0.00%   0 Crash writer    
  31      315788   3150514        100  0.00%  0.00%  0.00%   0 EnvMon          
  17           0         1          0  0.00%  0.00%  0.00%   0 IFS Agent Manage
  21        2376  11125419          0  0.00%  0.00%  0.00%   0 IPC Deferred Por
  34        9332  11125361          0  0.00%  0.00%  0.00%   0 GraphIt         
  35         120       904        132  0.00%  0.00%  0.00%   0 rf proxy rp agen
  36           0         3          0  0.00%  0.00%  0.00%   0 rf proxy message
  37           4         3       1333  0.00%  0.00%  0.00%   0 client_entity_se
  38           0         1          0  0.00%  0.00%  0.00%   0 SERIAL A'detect 
  39           0         1          0  0.00%  0.00%  0.00%   0 Connection Mgr  
  40           0         2          0  0.00%  0.00%  0.00%   0 Snmp ICC Process
  41           4        41         97  0.00%  0.00%  0.00%   0 Cat6k SNMP      
  32           0         1          0  0.00%  0.00%  0.00%   0 IPC ISSU Version
  43           0         1          0  0.00%  0.00%  0.00%   0 ARP Snoop       
  44        3744  11125410          0  0.00%  0.00%  0.00%   0 Dynamic ARP Insp
  45           0         1          0  0.00%  0.00%  0.00%   0 Critical Bkgnd  
  46       13940   1123940         12  0.00%  0.00%  0.00%   0 Net Background

Cisco IOS Software, s72033_rp Software (s72033_rp-IPSERVICESK9_WAN-M), Version 12.2(33)SXI2a, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2009 by Cisco Systems, Inc.
Compiled Wed 02-Sep-09 01:00 by prod_rel_team

ROM: System Bootstrap, Version 12.2(17r)SX5, RELEASE SOFTWARE (fc1)

br01.nsw uptime is 18 weeks, 3 days, 3 hours, 57 minutes
Uptime for this control processor is 18 weeks, 3 days, 3 hours, 35 minutes
Time since br01.nsw switched to active is 18 weeks, 3 days, 3 hours, 34 minutes
System returned to ROM by s/w reset at 13:26:54 EST Mon Jul 5 2010 (SP by power-on)
System restarted at 13:30:17 EST Mon Jul 5 2010
System image file is "disk0:/s72033-ipservicesk9_wan-mz.122-33.SXI2a.bin"


This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.

A summary of U.S. laws governing Cisco cryptographic products may be found at:
http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to
export@cisco.com.

cisco WS-C6503-E (R7000) processor (revision 1.3) with 983008K/65536K bytes of memory.
Processor board ID FOX11140BDF
SR71000 CPU at 600Mhz, Implementation 0x504, Rev 1.2, 512KB L2 Cache
Last reset from s/w reset
1 Virtual Ethernet interface
10 Gigabit Ethernet interfaces
1917K bytes of non-volatile configuration memory.
8192K bytes of packet buffer memory.

65536K bytes of Flash internal SIMM (Sector size 512K).
Configuration register is 0x2102

Can you please help? Thanks a lot.

Regards,

Yoesa

Hello,

Do you have Netflow Collector (NFC) enabled on the switch? If so, try disabling it and check whether the situation changes. If I remember correctly, the Collection process is related to NFC.

Cheers.

Yes, it has been enabled for a long time and never caused high CPU. The problem only occurred after enabling IPv6.

Cheers.

OK. Is Netflow Collector enabled for both IPv4 and IPv6, or only for IPv4 while IPv6 is also running?

If possible, try disabling NFC for IPv6 and check whether the situation improves.

Cheers.

andtoth
Level 4

Hi,

It would be great if you could attach the output of the 'show platform hardware capacity' ('sh pla ha ca') command.

Are you seeing FIB TCAM exception messages in the logs? If yes, look at the %Used column for IPv6 under FIB TCAM usage in the 'sh pla ha ca fo' output. If it is close to 100%, you can increase the TCAM space allocated to IPv6 with the 'mls cef maximum-routes ipv6' global configuration command.
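
The %Used check described above can be sketched in Python as follows. This is a rough illustration that assumes the "FIB TCAM usage" rows look like the ones posted later in this thread; the 90% threshold and the names `parse_fib_tcam` and `near_full` are hypothetical, chosen only for the example.

```python
import re
from typing import List, NamedTuple


class FibRegion(NamedTuple):
    width_bits: int
    protocols: str
    total: int
    used: int
    pct_used: int


# Matches rows such as: "72 bits (IPv4, MPLS, EoM)   524288   333479   64%"
REGION = re.compile(r"^\s*(\d+) bits \(([^)]*)\)\s+(\d+)\s+(\d+)\s+(\d+)%\s*$")


def parse_fib_tcam(text: str) -> List[FibRegion]:
    """Extract the per-width FIB TCAM rows from pasted capacity output."""
    regions = []
    for line in text.splitlines():
        m = REGION.match(line)
        if m:
            bits, protos, total, used, pct = m.groups()
            regions.append(
                FibRegion(int(bits), protos, int(total), int(used), int(pct))
            )
    return regions


def near_full(regions: List[FibRegion], threshold_pct: int = 90) -> List[FibRegion]:
    """Regions at or above the (assumed) 90% warning threshold."""
    return [r for r in regions if r.pct_used >= threshold_pct]


SAMPLE = """\
             FIB TCAM usage:                     Total        Used       %Used
                  72 bits (IPv4, MPLS, EoM)     524288      333479         64%
                 144 bits (IP mcast, IPv6)      262144        3629          1%
"""

regions = parse_fib_tcam(SAMPLE)
for region in regions:
    print(region.protocols, region.pct_used)
print("near full:", [r.protocols for r in near_full(regions)])
```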

For more details, refer to the Catalyst 6500/6000 Switch High CPU Utilization guide at the following link:

http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a00804916e0.shtml#cb

Andras

Hi,

We're running Netflow only for IPv4.

Here is the "sh pla ha ca" output:

System Resources
  PFC operating mode: PFC3BXL
  Supervisor redundancy mode: administratively sso, operationally sso
  Switching resources: Module   Part number               Series      CEF mode
                       1        WS-SUP720-3BXL        supervisor           CEF
                       2        WS-X6408A-GBIC           classic           CEF

Power Resources
  Power supply redundancy mode: administratively redundant
                                operationally redundant
  System power: 1370W, 0W (0%) inline, 714W (52%) total allocated
  Powered devices: 0 total, 0 Class3, 0 Class2, 0 Class1, 0 Class0, 0 Cisco

Flash/NVRAM Resources
  Usage: Module Device               Bytes:      Total          Used     %Used
         1  RP  bootflash:                    65536000      47311860       72%
         1  SP  disk0:                       523935744      98476032       19%
         1  SP  disk1:                       512180224      98476032       19%
         1  SP  sup-bootflash:                65536000      47105404       72%
         1  SP  const_nvram:                    129004           556        1%
         1  SP  nvram:                         1964024         80241        4%

CPU Resources
  CPU utilization: Module             5 seconds       1 minute       5 minutes
                   1  RP               1% /  0%             4%              5%
                   1  SP               9% /  1%            40%             39%
  Processor memory: Module   Bytes:       Total           Used           %Used
                    1  RP             913627792      450402152             49%
                    1  SP             874635228      200710384             23%
  I/O memory: Module         Bytes:       Total           Used           %Used
              1  RP                    67108864       21605604             32%
              1  SP                    67108864       20823512             31%

EOBC Resources
  Module                     Packets/sec     Total packets     Dropped packets
  1  RP      Rx:                      12         444665437                   0
             Tx:                      11         442809313                   0
  1  SP      Rx:                       6         412214693                   0
             Tx:                       6         414070869                   0

VLAN Resources
  VLANs: 4094 total, 5 VTP, 0 extended, 21 internal, 4068 free

L2 Forwarding Resources
           MAC Table usage:   Module  Collisions  Total       Used       %Used
                              1                0  65536         12          1%

             VPN CAM usage:                       Total       Used       %Used
                                                    512          0          0%
L3 Forwarding Resources
             FIB TCAM usage:                     Total        Used       %Used
                  72 bits (IPv4, MPLS, EoM)     524288      333479         64%
                 144 bits (IP mcast, IPv6)      262144        3629          1%

                     detail:      Protocol                    Used       %Used
                                  IPv4                      333477         64%
                                  MPLS                           1          1%
                                  EoM                            1          1%

                                  IPv6                        3622          1%
                                  IPv4 mcast                     4          1%
                                  IPv6 mcast                     3          1%

            Adjacency usage:                     Total        Used       %Used
                                               1048576         329          1%

     Forwarding engine load:
                     Module       pps   peak-pps                     peak-time
                     1         153639    3931042  19:31:34 EST Thu Aug 12 2010

Netflow Resources
          TCAM utilization:       Module       Created      Failed       %Used
                                  1             249130           0         95%
          ICAM utilization:       Module       Created      Failed       %Used
                                  1                  0           0          0%

                 Flowmasks:   Mask#   Type        Features
                      IPv4:       0   reserved    none
                      IPv4:       1   Intf Ful    Intf NDE L3 Feature
                      IPv4:       2   unused      none
                      IPv4:       3   reserved    none

                      IPv6:       0   reserved    none
                      IPv6:       1   unused      none
                      IPv6:       2   unused      none
                      IPv6:       3   reserved    none

CPU Rate Limiters Resources
             Rate limiters:       Total         Used      Reserved       %Used
                    Layer 3           9            4             1         44%
                    Layer 2           5            3             3         60%

ACL/QoS TCAM Resources
  Key: ACLent - ACL TCAM entries, ACLmsk - ACL TCAM masks, AND - ANDOR,
       QoSent - QoS TCAM entries, QOSmsk - QoS TCAM masks, OR - ORAND,
       Lbl-in - ingress label, Lbl-eg - egress label, LOUsrc - LOU source,
       LOUdst - LOU destination, ADJ - ACL adjacency

  Module ACLent ACLmsk QoSent QoSmsk Lbl-in Lbl-eg LOUsrc LOUdst  AND  OR  ADJ
  1          1%     2%     1%     1%     1%     1%     0%     0%   0%  0%   1%

L3 Multicast Resources
  IPv4 replication mode: ingress
  IPv6 replication mode: ingress
  Bi-directional PIM Designated Forwarder Table usage: 4 total, 0 (0%) used
  Replication capability: Module                              IPv4        IPv6
                          1                                 egress      egress
                          2                                ingress     ingress
  MET table Entries: Module                             Total    Used    %Used
                     1                                  65516       6       1%

QoS Policer Resources
  Aggregate policers: Module                      Total         Used     %Used
                      1                            1024            1        1%
  Microflow policer configurations: Module        Total         Used     %Used
                                    1                64            1        1%

Switch Fabric Resources
  Bus utilization: not supported
  Fabric utilization:     Ingress                    Egress
    Module  Chanl  Speed  rate  peak                 rate  peak              
    1       0        20G    0%    1% @10:07 12Nov10    0%    0%              
  Switching mode: Module                                        Switching mode
                  1                                                        bus

Interface Resources
  Interface drops:
    Module    Total drops:    Tx            Rx      Highest drop port:  Tx  Rx
    2                  262191024           266                           2   7

  Interface buffer sizes:
    Module                            Bytes:     Tx buffer           Rx buffer
    2     (asic-1)                                  442368               81920
IBC Resources
  Module                     Packets/sec     Total packets     Dropped packets
  1  RP      Rx:                     115         378197650                   0
             Tx:                     104         255639906                   0
  1  SP      Rx:                       0           8981461                   0
             Tx:                     548        5780396178                   0

SPAN Resources
  Source sessions: 16 maximum, 0 used
    Type                             Max      Used
    Local                              2(*)      0
    Local-tx                          14         0
    RSPAN source                       2(*)      0
    ERSPAN source                      2(*)      0
    Capture                            1(*)      0
    Service module                     1(*)      0
    OAM loopback                       1(*)      0
      * - shared source sessions and the total can not exceed 2
  Destination sessions: 64 maximum, 0 used
    Type                             Max      Used
    RSPAN destination                 64(*)      0
    ERSPAN destination                23(*)      0
      * - shared destination sessions and the total can not exceed 64

Multicast LTL Resources
  Usage:   30656 Total, 565 Used

Cheers.

Hello,

I see that IPv6 resources are almost unused, but IPv4 resources are heavily used, both in the FIB TCAM (64%) and in the Netflow TCAM (95%).

Could you please answer these questions?

- Do you have Netflow enabled on many L3 interfaces?

- Do you have many IP flows? (Check with "show mls netflow flowmask")

- Do you have Netflow Sampler enabled? (Check with "show mls sampling") Please note that Netflow sampling on a SUP720 is done in software by the CPU.

- Paste the output of "show mls netflow table-contention detail" and "show mls netflow aging"

Regards.

Hello,

There are 8 interfaces.

show mls netflow flowmask
current ip   flowmask for unicast:   if-full
current ipv6 flowmask for unicast:    null 

show mls sampling        
Netflow Sampling is Disabled

show mls netflow table-contention detail
Earl in Module 1
Detailed Netflow CAM (TCAM and ICAM) Utilization
================================================
TCAM Utilization             :   71%
ICAM Utilization             :   2%
Netflow TCAM count           :   186500
Netflow ICAM count           :   3
Netflow Creation Failures    :   0
Netflow CAM aliases          :   0

show mls netflow aging
             enable timeout  packet threshold
             ------ -------  ----------------
normal aging true       300        N/A
fast aging   false      32         100     
long aging   true       1920       N/A

Cheers.

Hello,

The high CPU load seems to be related to the large number of IP flows being tracked by Netflow. The more active flows Netflow is handling, the more CPU is needed to maintain the cache.
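
For scale, the two flow counts reported in this thread (249130 and 186500) can be related to the reported utilization figures (95% and 71%) with a little arithmetic. The sketch below assumes the Netflow TCAM holds 262144 (256K) entries; that capacity is an assumption inferred from the numbers, not something stated in the outputs above.

```python
# Assumed Netflow TCAM capacity (256K entries); inferred, not taken from the output.
ASSUMED_NETFLOW_TCAM_ENTRIES = 256 * 1024


def utilization_pct(flow_count: int, capacity: int = ASSUMED_NETFLOW_TCAM_ENTRIES) -> float:
    """Share of the table occupied by flow_count entries, in percent."""
    return 100.0 * flow_count / capacity


# Flow counts as reported by "sh pla ha ca" and "show mls netflow table-contention detail".
for flows in (249130, 186500):
    print(flows, round(utilization_pct(flows)))
```

Under that assumption both readings line up with the percentages the switch printed, which suggests the table was swinging between roughly 70% and 95% full between the two samples.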

You could try disabling Netflow where it is not really needed and compare the results.

Regards.

High CPU due to the Collection process is not related to Netflow. It is usually caused by an underlying routing issue and is related to some CEF internals. I highly recommend that you verify your routing configuration to make sure you don't have any routes that cannot be properly recursed (a recursion loop). I also advise you to open a TAC case so that someone can work with you to get to the root cause.

Atif

You might want to check the following documents for more information about Netflow TCAM utilization.

Catalyst 6500/6000 Switches NetFlow Configuration and Troubleshooting

http://www.cisco.com/en/US/products/hw/switches/ps708/products_configuration_example09186a0080721701.shtml

Common Error Messages on Catalyst 6500/6000 Series Switches Running Cisco IOS Software

%EARL_NETFLOW-4-TCAM_THRLD: Netflow TCAM threshold exceeded, TCAM Utilization [[dec]%]

http://www.cisco.com/en/US/products/hw/switches/ps700/products_tech_note09186a00801b42bf.shtml#prob1a

Hello,

We have two uplinks from the gateway to our core and two uplinks from the core to our border. When we enable two IPv6 BGP sessions from the gateway to our two core devices, the Collection process drives the CPU high, but if I enable only one leg of the IPv6 BGP sessions the CPU is fine. I'm assuming this issue is not related to Netflow. Has anyone had the same issue, and can you offer any advice?

Cheers.

Hi,
You probably have routes pointing to each other, which causes an infinite recursive lookup within the routing table. The Collection process is where the platform finds dependent objects and performs CEF internal updates.
1. Check the routing table together with the configured static routes to make sure there is no infinite recursive lookup.
2. Check for link and neighbor flapping, and verify ARP.
3. If possible, run "debug arp" and "debug ip routing", preferably only during scheduled downtime.
You might consider opening a Service Request for this so that TAC can help you investigate this issue further.
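
To illustrate what an infinite recursive lookup looks like (step 1 above), here is a small Python sketch. It uses a deliberately simplified model, a table mapping each prefix to a next-hop address, and is not how CEF actually resolves routes. The prefixes are documentation addresses and the function names are hypothetical.

```python
import ipaddress
from typing import Dict, List, Optional

# prefix -> next-hop address, or None for a directly connected prefix
Routes = Dict[str, Optional[str]]


def longest_match(routes: Routes, addr: str) -> Optional[str]:
    """Return the most specific prefix in routes that covers addr, if any."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix in routes:
        net = ipaddress.ip_network(prefix)
        if net.version != ip.version or ip not in net:
            continue
        if best is None or net.prefixlen > ipaddress.ip_network(best).prefixlen:
            best = prefix
    return best


def find_recursion_loop(routes: Routes, prefix: str) -> Optional[List[str]]:
    """Follow next hops from prefix; return the looping chain, or None."""
    seen: List[str] = []
    current: Optional[str] = prefix
    while current is not None:
        if current in seen:
            # We came back to a prefix already visited: resolution never ends.
            return seen[seen.index(current):] + [current]
        seen.append(current)
        next_hop = routes[current]
        if next_hop is None:
            return None  # reached a connected prefix, resolution terminates
        current = longest_match(routes, next_hop)
    return None  # next hop has no covering route: unresolved, but not a loop


# Hypothetical table: the first two routes resolve through each other.
ROUTES: Routes = {
    "2001:db8:1::/48": "2001:db8:2::1",
    "2001:db8:2::/48": "2001:db8:1::1",
    "2001:db8:3::/48": "2001:db8:ffff::1",
    "2001:db8:ffff::/64": None,
}

for prefix in ROUTES:
    print(prefix, find_recursion_loop(ROUTES, prefix))
```

On a real device the same question is answered by walking the routing table by hand: for each static or BGP route, look up its next hop and confirm that the chain ends at a connected interface rather than returning to a route already visited.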
Andras