diagnostic error message Catalyst 6513 SWITCH

neo
Level 1

·  Platform type: Cisco Catalyst 6513 Switch

·  Release Version/Firmware/IOS: "s72033-ipservicesk9_wan-mz.122-33.SXH.bin"

·  Problem Statement: The system diagnostic report for module 12 shows that online diagnostic test 36, TestErrorCounterMonitor, has failed five times.

Q1: When are entries added to the diagnostic event log, given that the test runs every 30 seconds (the interval listed for test 36 in the "sh diagnostic content" output below)?

Q2: Does it create a log entry every time the test fails, or only once it exceeds its defined threshold?

Q3: What is TestErrorCounterMonitor actually monitoring?

# log extract

Jun 17 07:46:30.973 gmt: %CONST_DIAG-SP-4-ERROR_COUNTER_WARNING: Module 12 Error counter exceeds threshold, system operation continue.

Jun 17 07:46:30.973 gmt: %CONST_DIAG-SP-4-ERROR_COUNTER_DATA: ID:48 IN:0 PO:255 RE:575 RM:255 DV:23 EG:2 CF:10 TF:6108
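
The ERROR_COUNTER_DATA payload is a run of KEY:value pairs, so it can be split mechanically for trending or comparison between occurrences. Here is a minimal Python sketch; the function name `parse_counter_data` is made up for illustration, and the two-letter keys are kept as opaque codes because their meanings are not stated in this thread.

```python
import re
from typing import Dict

# Matches a two-letter key followed by an integer, e.g. "CF:10".
PAIR = re.compile(r"\b([A-Z]{2}):(\d+)\b")

def parse_counter_data(line: str) -> Dict[str, int]:
    """Return the KEY:value pairs that follow ERROR_COUNTER_DATA as a dict."""
    marker = "ERROR_COUNTER_DATA:"
    _, found, payload = line.partition(marker)
    if not found:
        raise ValueError("not an ERROR_COUNTER_DATA line")
    return {key: int(value) for key, value in PAIR.findall(payload)}

# Line copied from the log extract above.
line = ("Jun 17 07:46:30.973 gmt: %CONST_DIAG-SP-4-ERROR_COUNTER_DATA: "
        "ID:48 IN:0 PO:255 RE:575 RM:255 DV:23 EG:2 CF:10 TF:6108")
fields = parse_counter_data(line)
print(fields["CF"], fields["TF"])
```

Keeping the keys uninterpreted avoids guessing at field meanings; only the values that change from one occurrence to the next (DV, CF and TF in this thread) need to be compared.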

sh diagnostic result module 12 detail

Current bootup diagnostic level: minimal

Module 12: CEF720 4 port 10-Gigabit Ethernet  SerialNo : SAL11445M5P

  Overall Diagnostic Result for Module 12 : PASS

  Diagnostic level at card bootup: minimal

  Test results: (. = Pass, F = Fail, U = Untested)

  ___________________________________________________________________________

    1) TestFabricCh0Health -------------> .

          Error code ------------------> 0 (DIAG_SUCCESS)

          Total run count -------------> 12594395

          Last test execution time ----> Jun 18 2010 10:05:10

          First test failure time -----> n/a

          Last test failure time ------> n/a

          Last test pass time ---------> Jun 18 2010 10:05:10

          Total failure count ---------> 0

          Consecutive failure count ---> 0

  ___________________________________________________________________________

OUTPUT OMITTED

   34) TestUnusedPortLoopback:

      Port  1  2  3  4

      ----------------

            U  .  .  .

          Error code ------------------> 0 (DIAG_SUCCESS)

          Total run count -------------> 1157755

          Last test execution time ----> Jun 18 2010 10:04:30

          First test failure time -----> n/a

          Last test failure time ------> n/a

          Last test pass time ---------> Jun 18 2010 10:04:30

          Total failure count ---------> 0

          Consecutive failure count ---> 0

  ___________________________________________________________________________

   35) TestOBFL ------------------------> .

          Error code ------------------> 0 (DIAG_SUCCESS)

          Total run count -------------> 1

          Last test execution time ----> Mar 21 2008 02:09:59

          First test failure time -----> n/a

          Last test failure time ------> n/a

          Last test pass time ---------> Mar 21 2008 02:09:59

          Total failure count ---------> 0

          Consecutive failure count ---> 0

  ___________________________________________________________________________

   36) TestErrorCounterMonitor ---------> .

          Error code ------------------> 0 (DIAG_SUCCESS)

          Total run count -------------> 2267647

          Last test execution time ----> Jun 18 2010 10:05:02

          First test failure time -----> Jun 17 2010 07:46:30

          Last test failure time ------> Jun 17 2010 07:48:32

          Last test pass time ---------> Jun 18 2010 10:05:02

          Total failure count ---------> 5

          Consecutive failure count ---> 0

          Error Records ---------------> n/a

  ___________________________________________________________________________

   37) TestPortTxMonitoring:

      Port  1  2  3  4

      ----------------

            .  U  U  U

          Error code ------------------> 3 (DIAG_SKIPPED)

          Total run count -------------> 141050

          Last test execution time ----> Dec 04 2009 08:59:20

          First test failure time -----> n/a

          Last test failure time ------> n/a

          Last test pass time ---------> Dec 04 2009 08:59:20

          Total failure count ---------> 0

          Consecutive failure count ---> 0

  ___________________________________________________________________________

# show diagnostic events

06/17 07:45:57.179 E  [12]    TestErrorCounterMonitor: ID:48 IN:0 PO:255 RE:575

                              RM:255 DV:11 EG:2 CF:9 TF:6107

06/17 07:46:30.976 E  [12]    TestErrorCounterMonitor: ID:48 IN:0 PO:255 RE:575

                              RM:255 DV:23 EG:2 CF:10 TF:6108

06/17 07:46:30.976 E  [12]    TestErrorCounterMonitor Failed

06/17 07:47:02.636 E  [12]    TestErrorCounterMonitor: ID:48 IN:0 PO:255 RE:575

                              RM:255 DV:13 EG:2 CF:11 TF:6109

06/17 07:47:02.636 E  [12]    TestErrorCounterMonitor Failed

06/17 07:47:32.888 E  [12]    TestErrorCounterMonitor: ID:48 IN:0 PO:255 RE:575

                              RM:255 DV:21 EG:2 CF:12 TF:6110

06/17 07:47:32.888 E  [12]    TestErrorCounterMonitor Failed

06/17 07:48:02.888 E  [12]    TestErrorCounterMonitor: ID:48 IN:0 PO:255 RE:575

                              RM:255 DV:14 EG:2 CF:13 TF:6111

06/17 07:48:02.888 E  [12]    TestErrorCounterMonitor Failed

06/17 07:48:32.888 E  [12]    TestErrorCounterMonitor: ID:48 IN:0 PO:255 RE:575

                              RM:255 DV:34 EG:2 CF:14 TF:6112

06/17 07:48:32.888 E  [12]    TestErrorCounterMonitor Failed
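
The event timestamps above bear on Q1 and Q2. A small Python sketch, using only the timestamps and CF values copied from this output, computes the gap between consecutive entries and checks which entries were followed by a "Failed" line. Reading CF as a consecutive-failure count is an assumption, based on it rising by one per entry; it is not confirmed anywhere in this thread.

```python
from datetime import datetime

# (timestamp, CF value, followed by a "Failed" line?) copied from the events above.
events = [
    ("07:45:57.179", 9, False),
    ("07:46:30.976", 10, True),
    ("07:47:02.636", 11, True),
    ("07:47:32.888", 12, True),
    ("07:48:02.888", 13, True),
    ("07:48:32.888", 14, True),
]

THRESHOLD = 10  # threshold column for test 36 in "sh diagnostic content"

# Seconds elapsed between each pair of consecutive event entries.
times = [datetime.strptime(ts, "%H:%M:%S.%f") for ts, _, _ in events]
gaps = [round((b - a).total_seconds(), 3) for a, b in zip(times, times[1:])]
print(gaps)

# Does "Failed" appear exactly on the entries whose CF has reached THRESHOLD?
failed_matches_threshold = all((cf >= THRESHOLD) == failed for _, cf, failed in events)
failed_count = sum(failed for _, _, failed in events)
print(failed_matches_threshold, failed_count)
```

In this sample the gaps are between 30 and 34 seconds, and the five "Failed" entries line up with the "Total failure count" of 5 reported for test 36. The sample also contains an entry at CF:9 with no "Failed" line, which suggests that data entries can be logged before the threshold is reached, though six events are too few to treat that as a rule.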

# Show module 12

EMBC-GLA-A1-6513-01#sh module 12

Mod Ports Card Type                              Model              Serial No.

--- ----- -------------------------------------- ------------------ -----------

12    4  CEF720 4 port 10-Gigabit Ethernet      WS-X6704-10GE      SAL11445M5P

Mod MAC addresses                       Hw    Fw           Sw           Status

--- ---------------------------------- ------ ------------ ------------ -------

12  001c.584b.5d08 to 001c.584b.5d0b   2.6   12.2(14r)S5  12.2(33)SXH  Ok

Mod  Sub-Module                  Model              Serial       Hw     Status

---- --------------------------- ------------------ ----------- ------- -------

12  Distributed Forwarding Card WS-F6700-DFC3B     SAL1137090N  4.6    Ok

Mod  Online Diag Status

---- -------------------

12  Pass

# show diagnostic result module 12 detail  - test 36

  ___________________________________________________________________________

   36) TestErrorCounterMonitor ---------> .

          Error code ------------------> 0 (DIAG_SUCCESS)

          Total run count -------------> 2264981

          Last test execution time ----> Jun 17 2010 10:58:18

          First test failure time -----> Jun 17 2010 07:46:30

          Last test failure time ------> Jun 17 2010 07:48:32

          Last test pass time ---------> Jun 17 2010 10:58:18

          Total failure count ---------> 5

          Consecutive failure count ---> 0

          Error Records ---------------> n/a

EMBC-GLA-A1-6513-01#sh diagnostic content module 12

Module 12: CEF720 4 port 10-Gigabit Ethernet

  Diagnostics test suite attributes:

    M/C/* - Minimal bootup level test / Complete bootup level test / NA

      B/* - Basic ondemand test / NA

    P/V/* - Per port test / Per device test / NA

    D/N/* - Disruptive test / Non-disruptive test / NA

      S/* - Only applicable to standby unit / NA

      X/* - Not a health monitoring test / NA

      F/* - Fixed monitoring interval test / NA

      E/* - Always enabled monitoring test / NA

      A/I - Monitoring is active / Monitoring is inactive

      R/* - Power-down line cards and need reload supervisor / NA

      K/* - Require resetting the line card after the test has completed / NA

      T/* - Shut down all ports and need reload supervisor / NA

                                                          Test Interval   Thre-

  ID   Test Name                          Attributes      day hh:mm:ss.ms shold

  ==== ================================== ============    =============== =====

    1) TestFabricCh0Health -------------> ***N****A***    000 00:00:05.00 10

    2) TestFabricCh1Health -------------> ***N****A***    000 00:00:05.00 10

    3) TestTransceiverIntegrity --------> **PD*X**I***    not configured  n/a

    4) TestLoopback --------------------> M*PD*X**I***    not configured  n/a

    5) TestScratchRegister -------------> ***N****A***    000 00:00:30.00 5

    6) TestSynchedFabChannel -----------> ***N****A***    000 00:00:02.00 6

    7) TestDontLearn -------------------> C**D*X**I***    not configured  n/a

    8) TestConditionalLearn ------------> M**D*X**I***    not configured  n/a

    9) TestNewLearn --------------------> C**D*X**I***    not configured  n/a

   10) TestStaticEntry -----------------> C**D*X**I***    not configured  n/a

   11) TestIndexLearn ------------------> C**D*X**I***    not configured  n/a

   12) TestCapture ---------------------> C**D*X**I***    not configured  n/a

   13) TestTrap ------------------------> C**D*X**I***    not configured  n/a

   14) TestMacNotification -------------> M**N****A***    000 00:00:15.00 10

   15) TestFibDevices ------------------> M**D*X**I***    not configured  n/a

   16) TestIPv4FibShortcut -------------> M**D*X**I***    not configured  n/a

   17) TestIPv6FibShortcut -------------> M**D*X**I***    not configured  n/a

   18) TestL3HealthMonitoring ----------> ***N**FEA***    000 00:00:05.00 10

   19) TestNATFibShortcut --------------> M**D*X**I***    not configured  n/a

   20) TestMPLSFibShortcut -------------> M**D*X**I***    not configured  n/a

   21) TestL3Capture -------------------> C**D*X**I***    not configured  n/a

   22) TestL3VlanMet -------------------> M**D*X**I***    not configured  n/a

   23) TestIngressSpan -----------------> M**D*X**I***    not configured  n/a

   24) TestEgressSpan ------------------> M**D*X**I***    not configured  n/a

   25) TestAclPermit -------------------> M**D*X**I***    not configured  n/a

   26) TestAclDeny ---------------------> M**D*X**I***    not configured  n/a

   27) TestQos -------------------------> M**D*X**I***    not configured  n/a

   28) TestNetflowShortcut -------------> M**D*X**I***    not configured  n/a

   29) TestFibTcamSSRAM ----------------> ***D*X**I*K*    not configured  n/a

   30) TestAsicMemory ------------------> ***D*X**I*K*    not configured  n/a

   31) TestEobcStressPing --------------> ***D*X**I***    not configured  n/a

   32) TestFirmwareDiagStatus ----------> M**N****I***    000 00:00:15.00 10

   33) TestAsicSync --------------------> ***N****A***    000 00:00:15.00 10

   34) TestUnusedPortLoopback ----------> **PN****A***    000 00:01:00.00 10

   35) TestOBFL ------------------------> M**N****I***    000 00:00:15.00 10

   36) TestErrorCounterMonitor ---------> ***N****A***    000 00:00:30.00 10

   37) TestPortTxMonitoring ------------> **PN****A***    000 00:01:15.00 5 

___________________________________________________________________________
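
The legend above assigns one meaning to each of the twelve attribute positions, so a string such as ***N****A*** can be decoded position by position. The sketch below is a hypothetical Python helper (the names `LEGEND`, `decode_attributes` and `interval_seconds` are made up here). It also converts the "day hh:mm:ss.ms" interval column into seconds, assuming the part after the decimal point is a fraction of a second.

```python
# One entry per attribute position, in the order the legend lists them.
LEGEND = [
    {"M": "Minimal bootup level test", "C": "Complete bootup level test"},
    {"B": "Basic ondemand test"},
    {"P": "Per port test", "V": "Per device test"},
    {"D": "Disruptive test", "N": "Non-disruptive test"},
    {"S": "Only applicable to standby unit"},
    {"X": "Not a health monitoring test"},
    {"F": "Fixed monitoring interval test"},
    {"E": "Always enabled monitoring test"},
    {"A": "Monitoring is active", "I": "Monitoring is inactive"},
    {"R": "Power-down line cards and need reload supervisor"},
    {"K": "Require resetting the line card after the test has completed"},
    {"T": "Shut down all ports and need reload supervisor"},
]

def decode_attributes(attrs: str) -> list:
    """Translate a 12-character attribute string; '*' means not applicable."""
    if len(attrs) != len(LEGEND):
        raise ValueError("expected 12 attribute characters")
    return [LEGEND[i][ch] for i, ch in enumerate(attrs) if ch != "*"]

def interval_seconds(text: str) -> float:
    """Convert 'ddd hh:mm:ss.ms' (e.g. '000 00:00:30.00') to seconds."""
    days, clock = text.split()
    hours, minutes, seconds = clock.split(":")
    return int(days) * 86400 + int(hours) * 3600 + int(minutes) * 60 + float(seconds)

# Test 36, TestErrorCounterMonitor, as listed in the table above.
print(decode_attributes("***N****A***"))
print(interval_seconds("000 00:00:30.00"))
```

For test 36 this gives a non-disruptive, actively monitored test with a 30-second interval, which is consistent with the spacing of the entries in the diagnostic event log.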

I also checked interface Te12/1 and no errors are incrementing; however, there were some before I cleared the counters.

EMBC-GLA-A1-6513-01#sh int TenGigabitEthernet12/1

TenGigabitEthernet12/1 is up, line protocol is up (connected)

  Hardware is C6k 10000Mb 802.3, address is 001c.584b.5d08 (bia 001c.584b.5d08)

  Description: Affinity 10GE link to Node4 Ten12/1

  MTU 9216 bytes, BW 10000000 Kbit, DLY 10 usec,

     reliability 255/255, txload 41/255, rxload 38/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 10Gb/s

  input flow-control is off, output flow-control is off

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 00:00:48, output 25w6d, output hang never

  Last clearing of "show interface" counters never

  Input queue: 0/2000/275/0 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 1516462000 bits/sec, 218080 packets/sec

  5 minute output rate 1643902000 bits/sec, 279871 packets/sec

     2640941005675 packets input, 2078604386534022 bytes, 0 no buffer

     Received 673540248 broadcasts (555685351 multicasts)

     0 runts, 0 giants, 0 throttles

     275 input errors, 12 CRC, 137 frame, 0 overrun, 0 ignored

     0 watchdog, 0 multicast, 0 pause input

     0 input packets with dribble condition detected

     3602648249198 packets output, 2550625033536035 bytes, 0 underruns

     0 output errors, 0 collisions, 9 interface resets

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 PAUSE output

     0 output buffer failures, 0 output buffers swapped out

EMBC-GLA-A1-6513-01#clear counters te12/1

Clear "show interface" counters on this interface [confirm]

EMBC-GLA-A1-6513-01#sh int te12/1       

TenGigabitEthernet12/1 is up, line protocol is up (connected)

  Hardware is C6k 10000Mb 802.3, address is 001c.584b.5d08 (bia 001c.584b.5d08)

  Description: Affinity 10GE link to Node4 Ten12/1

  MTU 9216 bytes, BW 10000000 Kbit, DLY 10 usec,

     reliability 255/255, txload 41/255, rxload 38/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 10Gb/s

  input flow-control is off, output flow-control is off

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 00:00:19, output 25w6d, output hang never

  Last clearing of "show interface" counters 00:00:09

  Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 1494942000 bits/sec, 216709 packets/sec

  5 minute output rate 1627539000 bits/sec, 278728 packets/sec

     1147270 packets input, 950837147 bytes, 0 no buffer

     Received 51 broadcasts (43 multicasts)

     0 runts, 0 giants, 0 throttles

     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

     0 watchdog, 0 multicast, 0 pause input

     0 input packets with dribble condition detected

     1498177 packets output, 1053983275 bytes, 0 underruns

     0 output errors, 0 collisions, 0 interface resets

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 PAUSE output

     0 output buffer failures, 0 output buffers swapped out

EMBC-GLA-A1-6513-01#

EMBC-GLA-A1-6513-01#

EMBC-GLA-A1-6513-01#

EMBC-GLA-A1-6513-01#sh int te12/1

TenGigabitEthernet12/1 is up, line protocol is up (connected)

  Hardware is C6k 10000Mb 802.3, address is 001c.584b.5d08 (bia 001c.584b.5d08)

  Description: Affinity 10GE link to Node4 Ten12/1

  MTU 9216 bytes, BW 10000000 Kbit, DLY 10 usec,

     reliability 255/255, txload 41/255, rxload 38/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 10Gb/s

  input flow-control is off, output flow-control is off

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 00:00:39, output 25w6d, output hang never

  Last clearing of "show interface" counters 00:00:28

  Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 1491257000 bits/sec, 216549 packets/sec

  5 minute output rate 1622851000 bits/sec, 278654 packets/sec

     6115546 packets input, 5133924051 bytes, 0 no buffer

     Received 312 broadcasts (248 multicasts)

     0 runts, 0 giants, 0 throttles

     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

     0 watchdog, 0 multicast, 0 pause input

     0 input packets with dribble condition detected

     7931073 packets output, 5582418978 bytes, 0 underruns

     0 output errors, 0 collisions, 0 interface resets

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 PAUSE output

     0 output buffer failures, 0 output buffers swapped out

EMBC-GLA-A1-6513-01# 

EMBC-GLA-A1-6513-01#

EMBC-GLA-A1-6513-01#

EMBC-GLA-A1-6513-01#sh int te12/1

TenGigabitEthernet12/1 is up, line protocol is up (connected)

  Hardware is C6k 10000Mb 802.3, address is 001c.584b.5d08 (bia 001c.584b.5d08)

  Description: Affinity 10GE link to Node4 Ten12/1

  MTU 9216 bytes, BW 10000000 Kbit, DLY 10 usec,

     reliability 255/255, txload 43/255, rxload 40/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 10Gb/s

  input flow-control is off, output flow-control is off

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 00:00:32, output 25w6d, output hang never

  Last clearing of "show interface" counters 00:09:17

  Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 1595449000 bits/sec, 231021 packets/sec

  5 minute output rate 1714311000 bits/sec, 296204 packets/sec

     128232515 packets input, 110342792638 bytes, 0 no buffer

     Received 5423 broadcasts (4430 multicasts)

     0 runts, 0 giants, 0 throttles

     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

     0 watchdog, 0 multicast, 0 pause input

     0 input packets with dribble condition detected

     164887292 packets output, 118865884510 bytes, 0 underruns

     0 output errors, 0 collisions, 0 interface resets

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 PAUSE output

     0 output buffer failures, 0 output buffers swapped out


Hitesh Vinzoda
Level 4

Hi,

Please check the link below for the error message:

http://www.cisco.mn/en/US/docs/switches/lan/catalyst6500/ios/12.2SX/configuration/guide/diagtest.pdf

HTH

Hitesh Vinzoda

Pls rate useful posts

Thanks for taking the time to reply.

I am more concerned about the errors this diagnostic test is meant to track, i.e. what it is actually monitoring, because the TestErrorCounterMonitor message only implies that a threshold has been reached or exceeded.

My concern stems from the fact that these diagnostics are used for proactive event management, and that aim is defeated if I can't get to the bottom of the exact errors the test tracks and the impact those errors have, or will eventually have, if they are not traced in time.

cheers

Neo@cisco

Hello all,

Do I assume that no one has ever come across this diagnostic error message before?

skoirala
Cisco Employee

I ran into the same situation today with a 6500-NEB A switch. If you have resolved it, please let me know. Thank you.

I have tried to open the posted link but it is an invalid URL. Could you please re-check this link?

The thing is that an end customer of mine is having this same issue on one of their core switches, and they want to know whether anything more serious can arise from these logs.

Also, I found in a Cisco document that this might be caused by the card not being inserted properly. The link:

http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a00801751d7.shtml#counter_exceeds

I hope someone can clarify what this issue is about, since these are the errors we are receiving on the customer's switch:

Sep 18 16:16:53: %CONST_DIAG-SP-4-ERROR_COUNTER_WARNING: Module 1 Error counter exceeds threshold, system operation continue.

Sep 18 16:16:53: %CONST_DIAG-SP-4-ERROR_COUNTER_DATA: ID:69 IN:6 PO:0 RE:417 RM:0 DV:3 EG:2 CF:10 TF:201

Should I open a different thread for my own issue? I posted here because I think the two are very closely related; sorry if that wasn't appropriate.
