
ASR9k:5.1.3: is there somewhere that provides a list of reset reason decodes?

Hello,

 

I'm trying to find a list of the different reload reasons shown in the "admin show logging onboard uptime" output. I ask because we had a power-cycle reload and I was unable to find any information on reset reason 0x02.

RP/0/RSP0/CPU0:router#show clock
Thu Aug 27 19:28:53.307 UTC
19:28:53.312 UTC Thu Aug 27 2015

RP/0/RSP0/CPU0:router#show ver brief
Thu Aug 27 19:27:50.310 UTC

Cisco IOS XR Software, Version 5.1.3[Default]
Copyright (c) 2014 by Cisco Systems, Inc.

ROM: System Bootstrap, Version 0.71(c) 1994-2012 by Cisco Systems,  Inc.

router uptime is 2 hours, 26 minutes
System image file is "disk0:asr9k-os-mbi-5.1.3.CSCur18995-1.0.0/0x100305/mbiasr9k-rsp3.vm"

cisco ASR9K Series (Intel 686 F6M14S4) processor with 6291456K bytes of memory.
Intel 686 F6M14S4 processor at 2127MHz, Revision 2.174
ASR 9006 AC Chassis with PEM Version 2

2 Management Ethernet
24 TenGigE
24 DWDM controller(s)
24 WANPHY controller(s)
503k bytes of non-volatile configuration memory.
6271M bytes of hard disk.
12582896k bytes of disk0: (Sector size 512 bytes).
12582896k bytes of disk1: (Sector size 512 bytes).
RP/0/RSP0/CPU0:CUSREDMPDCA902#

 

RP/0/RSP0/CPU0:router#admin show logging onboard uptime
Thu Aug 27 19:17:32.701 UTC

-------------------------------------------------------------------------------
UPTIME CONTINUOUS INFORMATION (Node: node0_RSP0_CPU0)
-------------------------------------------------------------------------------
Current reset reason : 0x02
Current uptime       :    0 years  0 weeks 0 days  2 hours  0 minutes
-------------------------------------------------------------------------------
Time Stamp          |
MM/DD/YYYY HH:MM:SS | Users operation
-------------------------------------------------------------------------------
11/07/2014 15:05:52   File cleared by user request.
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
UPTIME CONTINUOUS INFORMATION (Node: node0_0_CPU0)
-------------------------------------------------------------------------------
Current reset reason : 0x05
Current uptime       :    0 years  0 weeks 0 days  2 hours  0 minutes
-------------------------------------------------------------------------------
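For anyone collecting this output from several nodes, here is a minimal sketch of pulling the per-node reset reason out of captured "admin show logging onboard uptime" text. The function name parse_reset_reasons and the SAMPLE string are made up for illustration; the two regular expressions only assume the "(Node: ...)" and "Current reset reason : 0x.." lines look the way they do in the output above.

```python
import re

# Sample text copied from the output above (separator lines omitted).
SAMPLE = """\
UPTIME CONTINUOUS INFORMATION (Node: node0_RSP0_CPU0)
Current reset reason : 0x02
Current uptime       :    0 years  0 weeks 0 days  2 hours  0 minutes
UPTIME CONTINUOUS INFORMATION (Node: node0_0_CPU0)
Current reset reason : 0x05
Current uptime       :    0 years  0 weeks 0 days  2 hours  0 minutes
"""

# Assumed line shapes, taken from the output shown in this thread.
NODE_RE = re.compile(r"\(Node:\s*(\S+)\)")
REASON_RE = re.compile(r"Current reset reason\s*:\s*(0x[0-9a-fA-F]+)")


def parse_reset_reasons(text):
    """Return {node_name: reset_reason_as_int} from captured CLI text."""
    reasons = {}
    node = None
    for line in text.splitlines():
        match = NODE_RE.search(line)
        if match:
            # Remember which node the following lines belong to.
            node = match.group(1)
            continue
        match = REASON_RE.search(line)
        if match and node is not None:
            reasons[node] = int(match.group(1), 16)
    return reasons


if __name__ == "__main__":
    for node, code in parse_reset_reasons(SAMPLE).items():
        print(f"{node}: 0x{code:02x}")
```

The result is a plain dictionary of node name to integer code, which can then be looked up against whatever decode list applies to your release.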

 

 

Accepted Solutions

xthuijs
Cisco Employee

This may help heather:

CPU Reset Reason

There is also CPU reset reason from CBC printed at every reboot.

Selecting ROMMON Image...

DDR in Interleaved mode

POST 1 : PASSED : code 0 : DDR2 Memory Quick Test

CPU Reset Reason = 0x0002 <----

POST 2 : PASSED : code 0 : FPGA Flash Image CRC Checks

Loading Field Programmable Devices:

The reset reasons recorded in OBFL are:

    CPU_RESET_UNKNOWN = 1,         (CBC was reset after CPU was reset. So, CBC doesn't know)

    CPU_RESET_OIR_POR = 2,         (Board was plugged-in and CBC powered-on board by default)

    CPU_RESET_SRESET = 3,          (CBC received a CAN message to S-Reset CPU)

    CPU_RESET_HRESET = 4,          (CBC received a CAN message to H-Reset CPU)

    CPU_RESET_POR = 5,             (CBC received a CAN message to Power-Off or Power-Cycle CPU)

    CPU_RESET_WDOG_SRESET = 6,     (Watchdog expired and CBC S-Reset CPU so CPU can collect core-dump)

    CPU_RESET_WDOG_HRESET = 7,     (Watchdog expired and CBC H-Reset CPU)

    CPU_RESET_WDOG_POR = 8,        (Watchdog expired and CBC power-cycled board)

    CPU_RESET_PSEQFAIL_POR = 9,    (CBC power-cycled board following power-sequencer failure)

    CPU_RESET_PWR_OFF = 10,        (Board powered-off)

    CPU_RESET_PLDREQ_SRESET = 11,  (Lance / Mace S-Reset CPU)

    CPU_RESET_PLDREQ_HRESET = 12,  (Lance / Mace H-Reset CPU)

    CPU_RESET_AUTO_RESET = 13,     (CPU reset autonomously without informing CBC)

    CPU_RESET_MCLR_PROLONGED_HOLD = 14,  (CPU held in reset for several minutes, typically during PLD upgrade)

xander
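The list in the accepted solution can be turned into a small lookup, so a value such as 0x02 from "admin show logging onboard uptime" or 0x0002 from the boot log is decoded directly. This is only a sketch: the names and numbers are copied from the post above, while the class name CpuResetReason, the function decode_reset_reason, and the "UNLISTED" fallback text are hypothetical. Codes outside 1 to 14 may exist on other releases and are simply reported as unlisted.

```python
from enum import IntEnum


class CpuResetReason(IntEnum):
    """Reset reasons as listed in the accepted solution (values 1..14)."""
    CPU_RESET_UNKNOWN = 1
    CPU_RESET_OIR_POR = 2
    CPU_RESET_SRESET = 3
    CPU_RESET_HRESET = 4
    CPU_RESET_POR = 5
    CPU_RESET_WDOG_SRESET = 6
    CPU_RESET_WDOG_HRESET = 7
    CPU_RESET_WDOG_POR = 8
    CPU_RESET_PSEQFAIL_POR = 9
    CPU_RESET_PWR_OFF = 10
    CPU_RESET_PLDREQ_SRESET = 11
    CPU_RESET_PLDREQ_HRESET = 12
    CPU_RESET_AUTO_RESET = 13
    CPU_RESET_MCLR_PROLONGED_HOLD = 14


def decode_reset_reason(value):
    """Accept a hex string such as '0x02' or an int; return the reason name."""
    code = int(value, 16) if isinstance(value, str) else int(value)
    try:
        return CpuResetReason(code).name
    except ValueError:
        # Hypothetical fallback label for codes not in the list above.
        return f"UNLISTED(0x{code:02x})"


if __name__ == "__main__":
    for raw in ("0x02", "0x05", "0x08"):
        print(raw, "->", decode_reset_reason(raw))
```

With the values from this thread, 0x02 maps to CPU_RESET_OIR_POR and 0x05 to CPU_RESET_POR.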

Hello Xander,

 

I need your help decoding the current reset reason on IOS XR. I have a 9912 box where I see frequent LC reloads with the following output:

 

RP/0/RP1/CPU0:COHERNGNGIA9K01#admin show logging onboard uptime location 0/6/C$
Fri May 18 22:35:56.023 IST
-------------------------------------------------------------------------------
UPTIME CONTINUOUS INFORMATION (Node: node0_6_CPU0)
-------------------------------------------------------------------------------
Current reset reason : 0x05
Current uptime       :    0 years  0 weeks 0 days 19 hours  0 minutes
-------------------------------------------------------------------------------

RP/0/RP1/CPU0:COHERNGNGIA9K01#admin show logging onboard uptime location 0/7/C$
Fri May 18 22:35:56.593 IST
-------------------------------------------------------------------------------
UPTIME CONTINUOUS INFORMATION (Node: node0_7_CPU0)
-------------------------------------------------------------------------------
Current reset reason : 0x05
Current uptime       :    0 years  0 weeks 0 days 19 hours  0 minutes.

I want to know what the 0x05 and 0x08 reset reasons mean.

 

Thanks in Advance

Jai

Hello,

5 is usually a power-cycle request and 8 is a watchdog reset. You can find more detailed information in
- show reboot history loc <>
- show log
to start with.

Niko
HTH,
Niko

Hi Niko,

 

Thanks for the quick reply. From what I understand, the watchdog error/log messages appear when something is triggered at the kernel level. Unfortunately, our logging buffer is small and I am not able to pull the data out of show log. Also, the card is a VSM card, so show reboot history does not apply.

 

Thanks

Jai

Hi Jai,

Do you have syslog configured, and can you pull the data from it? Without logs we may not be able to find the root cause.
You can open a Service Request with TAC and share "show tech vsm" so it can be verified in more detail and health-checked, but it is also beneficial to configure a bigger logging buffer and collect the logs as soon as you notice the problem.

Do you have any system monitoring these alerts so you can react quickly? Do you see any crash/core files under:

harddisk:/dumper
or
harddisk:/np
corresponding to the time of LC crash?
HTH,
Niko

What release and SMUs/SPs are you running on this node? We have already fixed some silent reloads specific to VSM.

/Aleksandar

Hello Nikolay/Aleksandar,

 

Please find below the version details of the router on which the reload happened:

 

Cisco IOS XR Software, Version 6.1.4[Default]
Copyright (c) 2017 by Cisco Systems, Inc.

ROM: System Bootstrap, Version 14.28(c) 1994-2014 by Cisco Systems, Inc.

COHERNGNGIA9K01 uptime is 4 weeks, 5 days, 10 hours, 13 minutes
System image file is "disk0:asr9k-os-mbi-6.1.4/0x100305/mbiasr9k-rsp3.vm"

cisco ASR9K Series (Intel 686 F6M14S4) processor with 33554432K bytes of memory.
Intel 686 F6M14S4 processor at 1899MHz, Revision 2.174
ASR 9912 10 Line Card Slot Chassis with V3 DC PEM

 

RP/0/RP1/CPU0:COHERNGNGIA9K01#admin show install active summary
Mon May 21 12:49:19.759 IST
Default Profile:
SDRs:
Owner
Active Packages:
disk0:asr9k-services-infra-6.1.4
disk0:asr9k-doc-px-6.1.4
disk0:asr9k-fpd-px-6.1.4
disk0:asr9k-k9sec-px-6.1.4
disk0:asr9k-mcast-px-6.1.4
disk0:asr9k-mgbl-px-6.1.4
disk0:asr9k-mini-px-6.1.4
disk0:asr9k-mpls-px-6.1.4
disk0:asr9k-optic-px-6.1.4
disk0:asr9k-services-px-6.1.4
disk0:asr9k-video-px-6.1.4
disk0:asr9k-px-6.1.4.CSCvd96886-1.0.0
disk0:asr9k-px-6.1.4.CSCvf85579-1.0.0
disk0:asr9k-px-6.1.4.CSCvi09149-1.0.0

Can you open a TAC SR for this, please?

Hello 

 

We opened multiple TAC cases but have not received a conclusive response for this issue.

Reference SR-684226422 

Hi, we have a customer who is experiencing the following module resets on a 9906/RSP880-LT running 7.1.3 with an A99-12X100GE LC:

 

0/RSP0/ADMIN0:Mar 28 06:03:37.611 UTC: canbus_driver[4049]: %PLATFORM-CANB_SERVER-7-CBC_PRE_RESET_NOTIFICATION : Node 0/3 CBC-0, reset reason CPU_RESET_WDOG_SRESET (0x06000000)
0/RSP0/ADMIN0:Mar 28 06:03:37.612 UTC: canbus_driver[4049]: %PLATFORM-CANB_SERVER-7-CBC_PRE_RESET_NOTIFICATION : Node 0/3 CBC-0, reset reason CPU_RESET_PLDREQ_SRESET (0x0b000000)
0/RSP0/ADMIN0:Mar 28 06:03:37.616 UTC: shelf_mgr[4083]: %INFRA-SHELF_MGR-6-HW_EVENT : Rcvd HW event HW_EVENT_RESET, event_reason_str 'HW Event RESET' for card 0/3
0/RSP0/ADMIN0:Mar 28 06:03:45.221 UTC: shelf_mgr[4083]: %INFRA-SHELF_MGR-6-HW_EVENT : Rcvd HW event HW_EVENT_POWERED_ON, event_reason_str 'HW Event Powered ON' for card 0/3

0/RSP0/ADMIN0:Mar 28 06:03:53.609 UTC: canbus_driver[4049]: %PLATFORM-CANB_SERVER-7-CBC_PRE_RESET_NOTIFICATION : Node 0/3 CBC-0, reset reason CPU_RESET_WDOG_HRESET (0x07000000)
0/RSP0/ADMIN0:Mar 28 06:03:53.614 UTC: shelf_mgr[4083]: %INFRA-SHELF_MGR-6-HW_EVENT : Rcvd HW event HW_EVENT_RESET, event_reason_str 'HW Event RESET' for card 0/3
0/RSP0/ADMIN0:Mar 28 06:03:54.668 UTC: canbus_driver[4049]: %PLATFORM-CANB_SERVER-7-CBC_PRE_RESET_NOTIFICATION : Node 0/3 CBC-0, reset reason CPU_RESET_WDOG_POR (0x08000000)
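In the four CBC_PRE_RESET_NOTIFICATION lines above, the top byte of the hex value happens to match the number of the named reason in the list from the accepted solution (0x06000000 with CPU_RESET_WDOG_SRESET = 6, 0x0b000000 with CPU_RESET_PLDREQ_SRESET = 11, and so on). Below is a minimal sketch that extracts the node, the reason name, and that top byte from such a line. The regular expression and the function name parse_pre_reset are assumptions based only on the log lines in this post, and the meaning of the lower three bytes is not known here.

```python
import re

# Pattern assumed from the CBC_PRE_RESET_NOTIFICATION lines quoted above.
PRE_RESET_RE = re.compile(
    r"Node (\S+) CBC-\d+, reset reason (\w+) \((0x[0-9a-fA-F]{8})\)"
)


def parse_pre_reset(line):
    """Return (node, reason_name, top_byte_code) or None if no match."""
    match = PRE_RESET_RE.search(line)
    if not match:
        return None
    node = match.group(1)
    name = match.group(2)
    raw = int(match.group(3), 16)
    # Observed in this thread only: the reason number sits in the top byte.
    return node, name, raw >> 24


if __name__ == "__main__":
    sample = (
        "0/RSP0/ADMIN0:Mar 28 06:03:37.611 UTC: canbus_driver[4049]: "
        "%PLATFORM-CANB_SERVER-7-CBC_PRE_RESET_NOTIFICATION : "
        "Node 0/3 CBC-0, reset reason CPU_RESET_WDOG_SRESET (0x06000000)"
    )
    print(parse_pre_reset(sample))
```

Sorting the parsed events by timestamp for one node gives the reset sequence at a glance, for example a watchdog S-Reset followed by an H-Reset and then a power cycle as in the log above.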

 

Here are the FPD versions:

 

0/3  A99-12X100GE  1.0  CBC           CURRENT  46.06  46.06
0/3  A99-12X100GE  1.0  IPU-FPGA      CURRENT  1.89   1.89
0/3  A99-12X100GE  1.0  IPU-FSBL      CURRENT  1.113  1.113
0/3  A99-12X100GE  1.0  IPU-Linux     CURRENT  1.113  1.113
0/3  A99-12X100GE  1.0  Morra-0       CURRENT  1.02   1.02
0/3  A99-12X100GE  1.0  Morra-1       CURRENT  1.02   1.02
0/3  A99-12X100GE  1.0  Primary-BIOS  CURRENT  9.33   9.33
0/3  A99-12X100GE  1.0  Sideswipe-0   CURRENT  1.02   1.02
0/3  A99-12X100GE  1.0  Sideswipe-1   CURRENT  1.02   1.02
0/3  A99-12X100GE  1.0  SSDa-SMART    N/A      7.05   7.05

 

Any idea what can be causing these issues?

 

Thanks

 

 

If you are getting constant reloads due to CBC CPU resets, it means the voltage regulator on the board has failed and can no longer manage power for the board properly, so the board fails to boot. These messages act as a health monitor to make sure the cards will work properly once booted.

 

I would RMA this board.

 

Sam
