Module down log

ray_stone

Hello,

Our monitoring team reported that one of the Cisco modules was down, but when I checked, it was working fine. I then found the following log entry:

Jun 11 02:21:47.270 GMT: %OIR-SP-3-PWRCYCLE: Card in module 2, is being power-cycled off (Module not responding to Keep Alive polling)

Can someone suggest what causes this error and what the appropriate solution is?

(Note: this device is in a production network.)

Thanks.

Accepted Solutions

Hi Ray,

The message "(Module not responding to Keep Alive polling)" indicates that the linecard was reloaded due to loss of keepalives between the module and the supervisor. Because the sup did not receive a reply to the keepalives it sent to the module, it reset the card to recover from the problem condition.

Such a failure could be caused by SW or HW issues, or by loose seating of the card in the chassis slot. Could you collect and provide the following outputs, please?

  show module

  show diagnostic events

  check for crashinfo files in "dir /all" and get their content with the "more ..." command (or copy them using FTP/TFTP)
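
If these messages are being collected from a syslog archive, the module number and the reason text can be pulled out of each line before looking at the outputs above. The following is a minimal Python sketch. The regular expression is written against the single sample line quoted in the question, not against any Cisco format specification, and `parse_pwrcycle` is a hypothetical helper name.

```python
import re

# Pattern built from the one sample message in this thread (an assumption,
# not an official message format definition).
PWRCYCLE_RE = re.compile(
    r"%OIR-SP-3-PWRCYCLE: Card in module (?P<module>\d+), "
    r"is being power-cycled off \((?P<reason>[^)]*)\)"
)


def parse_pwrcycle(line):
    """Return (module_number, reason), or None if the line is not a PWRCYCLE message."""
    match = PWRCYCLE_RE.search(line)
    if match is None:
        return None
    return int(match.group("module")), match.group("reason")


# The log line from the question, joined back into a single line.
sample = (
    "Jun 11 02:21:47.270 GMT: %OIR-SP-3-PWRCYCLE: Card in module 2, "
    "is being power-cycled off (Module not responding to Keep Alive polling)"
)

print(parse_pwrcycle(sample))
```

Counting how often the same module number shows up over time gives a rough idea of whether the reset was a one-off or a recurring problem.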

Kind Regards,
Ivan

8 Replies

ray_stone

#sh version

Cisco Internetwork Operating System Software

IOS (tm) s3223_rp Software (s3223_rp-ENTSERVICESK9_WAN-M), Version 12.2(18)SXF10, RELEASE SOFTWARE (fc1)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2007 by cisco Systems, Inc.

Compiled Fri 13-Jul-07 03:27 by

Image text-base: 0x40101040, data-base: 0x42D54780

ROM: System Bootstrap, Version 12.2(17r)SX3, RELEASE SOFTWARE (fc1)

BOOTLDR: s3223_rp Software (s3223_rp-ENTSERVICESK9_WAN-M), Version 12.2(18)SXF10, RELEASE SOFTWARE (fc1)

device_name_changed uptime is 8 weeks, 16 hours, 32 minutes

Time since device_name_changed switched to active is 8 weeks, 16 hours, 31 minutes

System returned to ROM by  power cycle (SP by power on)

System restarted at 10:35:28 GMT Sun Apr 15 2012

System image file is "sup-bootdisk:s3223-entservicesk9_wan-mz.122-18.SXF10.bin"

A summary of U.S. laws governing Cisco cryptographic products may be found at:

http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to

export@cisco.com.

cisco WS-C6506-E (R7000) processor (revision 1.1) with 458752K/65536K bytes of memory.

Processor board ID SAL1133XZY5

R7000 CPU at 300Mhz, Implementation 0x27, Rev 3.3, 256KB L2, 1024KB L3 Cache

Last reset from power-on

SuperLAT software (copyright 1990 by Meridian Technology Corp).

X.25 software, Version 3.0.0.

Bridging software.

TN3270 Emulation software.

3 Virtual Ethernet/IEEE 802.3 interfaces

240 FastEthernet/IEEE 802.3 interfaces

9 Gigabit Ethernet/IEEE 802.3 interfaces

1915K bytes of non-volatile configuration memory.

65536K bytes of Flash internal SIMM (Sector size 512K).

Configuration register is 0x2102

Thanks Ivan for your instant support.

Please confirm that the commands you are suggesting won't impact production services or cause any outage.

I'm being very cautious here; nothing must go wrong.

Thanks.

You're welcome, Ray. As for the commands, they are short "show" commands, so they don't have any impact on services.

Looking at the outputs below, it looks like the module is working fine and diagnostics did not detect any HW test failure on it.

Could you check for crashinfo files in the switch file systems, e.g. bootflash? Use "show file sys" to get the full list of attached file systems.

Kind Regards,
Ivan

See below:

device_name_changed#sh file systems

File Systems:

     Size(b)     Free(b)      Type  Flags  Prefixes

           -           -    opaque     rw   system:

           -           -    opaque     rw   tmpsys:

           -           -    opaque     ro   flexwan-fpd:

    65536000    65259164     flash     rw   bootflash:

*  255877120   255877120      disk     rw   disk0:

   255938560   206774272      disk     rw   sup-bootdisk:

    46267964           0    opaque     ro   sup-microcode:

           0   164072736    opaque     wo   sup-image:

      126956      126280     nvram     rw   const_nvram:

     1961976     1850885     nvram     rw   nvram:

           -           -    opaque     rw   null:

           -           -    opaque     ro   tar:

           -           -   network     rw   tftp:

           -           -    opaque     wo   syslog:

           -           -   network     rw   rcp:

           -           -   network     rw   ftp:

           -           -   network     rw   scp:

           -           -    opaque     ro   cns:

device_name_changed#dir /all bootflash:crashinfo_20071126-204837

Directory of bootflash:/crashinfo_20071126-204837

    1  -rw-      276706  Nov 26 2007 20:48:37 +00:00  crashinfo_20071126-204837

Is that the only crashinfo file in the "show bootflash:" output? If so, it is irrelevant: the file is from 2007.
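
The same date check can be scripted when there are many crashinfo files to go through. This is a minimal Python sketch that assumes the file name always follows the `crashinfo_YYYYMMDD-HHMMSS` pattern seen in the listing above. The event year (2012) is an assumption taken from the restart date in the "sh version" output, since the syslog line itself carries no year. The one-day window and the function names are illustrative choices.

```python
from datetime import datetime, timedelta


def crashinfo_time(filename):
    """Parse the timestamp embedded in a name like crashinfo_YYYYMMDD-HHMMSS."""
    stamp = filename.split("crashinfo_", 1)[1]
    return datetime.strptime(stamp, "%Y%m%d-%H%M%S")


def is_relevant(filename, event_time, window=timedelta(days=1)):
    """True if the crashinfo file was written within `window` of the event."""
    return abs(crashinfo_time(filename) - event_time) <= window


# Time of the PWRCYCLE message; the year is assumed, see the note above.
event = datetime(2012, 6, 11, 2, 21, 47)

name = "crashinfo_20071126-204837"
print(crashinfo_time(name))
print(is_relevant(name, event))
```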

In that case we have only one symptom: keepalive loss.

Such issues are caused by one of the following:

- module HW failure (loses or corrupts some of the packets, or goes offline)

- loose connection to the backplane of the chassis

- supervisor failure (can't keep the keepalive process running)

HW issues could be transient, e.g. data parity errors on the ASICs.

I suggest monitoring the module and physically reseating it (pull it out, then push it firmly back in) should the issue recur. If that does not help, replace the module.

Kind Regards,
Ivan

Device_Name_Changed#sh diagnostic events module 2

Diagnostic events (storage for 500 events, 13 events recorded)

Event Type (ET): I - Info, W - Warning, E - Error

Time Stamp         ET [Card] Event Message

------------------ -- ------ --------------------------------------------------

04/15 10:37:04.536 I  [2]    Diagnostics Passed

05/10 06:23:39.861 I  [2]    Diagnostics Passed

05/11 02:43:17.606 I  [2]    Diagnostics Passed

05/13 10:21:35.382 I  [2]    Diagnostics Passed

06/06 11:13:20.442 I  [2]    Diagnostics Passed

06/09 04:20:26.523 I  [2]    Diagnostics Passed

06/10 01:58:47.753 I  [2]    Diagnostics Passed

06/11 02:22:54.210 I  [2]    Diagnostics Passed

Device_Name_Changed#sh module 2

Mod Ports Card Type                              Model              Serial No.

--- ----- -------------------------------------- ------------------ -----------

  2   48  48 port 10/100 mb RJ21 ethernet        WS-X6148-21AF      SAL1126T2VM

Mod MAC addresses                       Hw    Fw           Sw           Status

--- ---------------------------------- ------ ------------ ------------ -------

  2  001b.d4e4.2830 to 001b.d4e4.285f   1.2   5.4(2)       8.5(0.46)RFW Ok

Mod  Sub-Module                  Model              Serial       Hw     Status

---- --------------------------- ------------------ ----------- ------- -------

  2  IEEE Voice Daughter Card    WS-F6K-FE48-AF     SAL1126T096  1.8    Ok

Mod  Online Diag Status

---- -------------------

  2  Pass

 

Device_Name_Changed#dir /all

Directory of disk0:/

No files in directory

255877120 bytes total (255877120 bytes free)

Device_Name_Changed#
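
The "sh diagnostic events module 2" output above can also be lined up against the syslog message from the question. The following is a minimal Python sketch that uses only the event lines pasted in this thread. It assumes the year is 2012 (the event lines carry no year; 2012 comes from the restart date in "sh version") and that both timestamps come from the same clock. `parse_events` is a hypothetical helper, and the regular expression is fitted to the pasted lines only.

```python
import re
from datetime import datetime

# Matches lines such as: 06/11 02:22:54.210 I  [2]    Diagnostics Passed
EVENT_RE = re.compile(
    r"^(\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+([IWE])\s+\[(\d+)\]\s+(.*)$"
)


def parse_events(text, year):
    """Return a list of (timestamp, event_type, card, message) tuples."""
    events = []
    for line in text.splitlines():
        m = EVENT_RE.match(line.strip())
        if m:
            ts = datetime.strptime(f"{year}/{m.group(1)}", "%Y/%m/%d %H:%M:%S.%f")
            events.append((ts, m.group(2), int(m.group(3)), m.group(4).strip()))
    return events


output = """
04/15 10:37:04.536 I  [2]    Diagnostics Passed
05/10 06:23:39.861 I  [2]    Diagnostics Passed
05/11 02:43:17.606 I  [2]    Diagnostics Passed
05/13 10:21:35.382 I  [2]    Diagnostics Passed
06/06 11:13:20.442 I  [2]    Diagnostics Passed
06/09 04:20:26.523 I  [2]    Diagnostics Passed
06/10 01:58:47.753 I  [2]    Diagnostics Passed
06/11 02:22:54.210 I  [2]    Diagnostics Passed
"""

events = parse_events(output, year=2012)  # year is an assumption

# Time of the PWRCYCLE syslog message from the question (year assumed).
power_cycle = datetime(2012, 6, 11, 2, 21, 47, 270000)

# First diagnostic event recorded at or after the power cycle.
first_after = min(ts for ts, _, _, _ in events if ts >= power_cycle)
delay = (first_after - power_cycle).total_seconds()

print(len(events), "events parsed")
print(f"first event after power cycle: {delay:.2f} s later")
```

With the lines above, the last "Diagnostics Passed" entry falls roughly a minute after the power-cycle message, which would be consistent with the module coming back online and passing its diagnostics after the reset. Whether the earlier entries also correspond to resets cannot be determined from this output alone.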
