
ASR1004 failure

waspagv
Level 1

A few days ago our Cisco ASR1004 stopped working and created a crash dump. After the incident the device continued to work.

The log is:


017951: Oct 13 13:59:43 SAMST: %CPPOSLIB-3-ERROR_NOTIFY: F0: cpp_driver-0: cpp_driver-0 encountered an error -Traceback= 1#d1babdf44dc893e5cee82b5a267753da errmsg:F39B000+2250 cpp_common_os:FA07000+BBC0 cpp_common_os:FA07000+B9D0 cpp_drv_cmn:FB52000+1A22B0 cpp_drv_cmn:FB52000+1A9C00 cpp_drv_cmn:FB52000+C0168 cpp_drv_cmn:FB52000+C08AC cpp_drv_cmn:FB52000+87558 cpp_drv_cmn:FB52000+80214 cpp_drv_cmn:FB52000+1A98A4 cpp_drv_cmn:FB52000+1A1A60 time:EFC3000+5744 evlib:F6FC000+E328 evlib:F6FC000+104D4 :10000000+6B20 c:E89B
017952: Oct 13 13:59:43 SAMST: %CPPHA-3-FAULT: F0: cpp_ha: CPP:0.0 desc:DPE0_CPE_CPE_DPE_INT_SET_0_LEAF_INT_INT_PHY_ERROR det:DRVR(interrupt) class:OTHER sev:FATAL id:1071 cppstate:STOPPED res:UNKNOWN flags:0x7 cdmflags:0x1
017953: Oct 13 13:59:43 SAMST: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
017954: Oct 13 13:59:43 SAMST: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
017955: Oct 13 13:59:43 SAMST: %CPPHA-3-FAULT: F0: cpp_ha: CPP:0.0 desc:DPE0_CPE_CPE_DPE_DUI_LEAF_INT_INT_DUI_CHN0_DRAM_MBE det:DRVR(interrupt) class:MBE sev:FATAL id:1100 cppstate:STOPPED res:SUCCESS flags:0x7 cdmflags:0x1
017956: Oct 13 13:59:43 SAMST: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
017957: Oct 13 13:59:43 SAMST: %CPPDRV-6-INTR: F0: cpp_driver-0: CPP10(0) Interrupt : Last 16 Interrupts Since Boot
017958: Oct 13 13:59:43 SAMST: %CPPDRV-6-INTR: F0: cpp_driver-0: CPP10(0) Interrupt : 17-Oct-13 13:59:43.825984 UTC+0400:HALT:DPE0_CPE_CPE_DPE_INT_SET_0_LEAF_INT_INT_PHY_ERROR
017959: Oct 13 13:59:43 SAMST: %CPPDRV-6-INTR: F0: cpp_driver-0: CPP10(0) Interrupt : 17-Oct-13 13:59:43.825984 UTC+0400:HALT:DPE0_CPE_CPE_DPE_DUI_LEAF_INT_INT_DUI_CHN0_DRAM_MBE
017960: Oct 13 13:59:43 SAMST: %CPPDRV-6-INTR: F0: cpp_driver-0: CPP10(0) Interrupt : 17-Oct-13 13:59:43.825984 UTC+0400:DPE0_CPE_CPE_DPE_DUI_LEAF_INT_INT_DUI_CHN0_DRAM_SBE
017961: Oct 13 13:59:43 SAMST: %CPPDRV-6-INTR: F0: cpp_driver-0: CPP10(0) Interrupt : 17-Oct-13 13:59:43.825984 UTC+0400:HEDP_HED_HALTED_IN_63_0_LEAF_INT_INT_HALTED5
017962: Oct 13 13:59:43 SAMST: %CPPDRV-6-INTR: F0: cpp_driver-0: CPP10(0) Interrupt : 17-Oct-13 13:59:43.825984 UTC+0400:HOT:INFP_INF_IDMU_LEAF_INT_INT_TRANSACTION_TIMEOUT
017963: Oct 13 13:59:44 SAMST: %IOSXE-3-PLATFORM: F0: cpp_cdm: CPP crashed, core file /tmp/corelink/Izhevsk28_ESP_0_cpp-mcplo-ucode_101317135943.core.gz
017964: Oct 13 13:59:44 SAMST: %IOSXE_OIR-6-OFFLINECARD: Card (fp) offline in slot F0
017965: Oct 13 13:59:44 SAMST: %ASR1000_RP_ALARM-6-INFO: ASSERT MAJOR module F0 Unknown state
017966: Oct 13 13:59:44 SAMST: %ASR1000_RP_ALARM-6-INFO: ASSERT CRITICAL module R0 No Working ESP
017967: Oct 13 13:59:44 SAMST: %CPPDRV-3-LOCKDOWN: F0: fman_fp_image: CPP10(0) CPP Driver LOCKDOWN due to fatal error.
017968: Oct 13 13:59:48 SAMST: %CPPDRV-3-LOCKDOWN: F0: cpp_cp: CPP10(0) CPP Driver LOCKDOWN due to fatal error.
017969: Oct 13 14:00:14 SAMST: %OSPF-5-ADJCHG: Process 10, Nbr 217.14.199.10 on TenGigabitEthernet0/0/0.4080 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017970: Oct 13 14:00:14 SAMST: %OSPF-5-ADJCHG: Process 3226, Nbr 217.14.207.26 on TenGigabitEthernet0/0/0.2 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017971: Oct 13 14:00:15 SAMST: %OSPF-5-ADJCHG: Process 11, Nbr 217.14.199.43 on TenGigabitEthernet0/0/0.4081 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017972: Oct 13 14:00:15 SAMST: %OSPF-5-ADJCHG: Process 3226, Nbr 217.14.207.27 on TenGigabitEthernet0/0/0.2 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017973: Oct 13 14:00:16 SAMST: %OSPF-5-ADJCHG: Process 11, Nbr 217.14.199.34 on TenGigabitEthernet0/0/0.4081 from FULL to DOWN, Neighbor Down: Dead timer expired
017974: Oct 13 14:00:16 SAMST: %OSPF-5-ADJCHG: Process 10, Nbr 217.14.199.9 on TenGigabitEthernet0/0/0.4080 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017975: Oct 13 14:00:17 SAMST: %OSPF-5-ADJCHG: Process 3226, Nbr 217.14.207.25 on TenGigabitEthernet0/0/0.2 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017976: Oct 13 14:00:17 SAMST: %OSPF-5-ADJCHG: Process 10, Nbr 217.14.199.11 on TenGigabitEthernet0/0/0.4080 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017977: Oct 13 14:00:18 SAMST: %OSPF-5-ADJCHG: Process 10, Nbr 217.14.199.3 on TenGigabitEthernet0/0/0.4080 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017978: Oct 13 14:00:18 SAMST: %OSPF-5-ADJCHG: Process 11, Nbr 217.14.199.41 on TenGigabitEthernet0/0/0.4081 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017979: Oct 13 14:00:18 SAMST: %OSPF-5-ADJCHG: Process 3226, Nbr 217.14.207.46 on TenGigabitEthernet0/0/0.2 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017980: Oct 13 14:00:19 SAMST: %OSPF-5-ADJCHG: Process 10, Nbr 217.14.199.2 on TenGigabitEthernet0/0/0.4080 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017981: Oct 13 14:00:21 SAMST: %OSPF-5-ADJCHG: Process 11, Nbr 217.14.199.35 on TenGigabitEthernet0/0/0.4081 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017982: Oct 13 14:00:22 SAMST: %OSPF-5-ADJCHG: Process 11, Nbr 217.14.199.40 on TenGigabitEthernet0/0/0.4081 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017983: Oct 13 14:00:22 SAMST: %OSPF-5-ADJCHG: Process 3226, Nbr 217.14.207.24 on TenGigabitEthernet0/0/0.2 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017984: Oct 13 14:00:22 SAMST: %OSPF-5-ADJCHG: Process 3226, Nbr 217.14.207.51 on TenGigabitEthernet0/0/0.2 from 2WAY to DOWN, Neighbor Down: Dead timer expired
017985: Oct 13 14:00:23 SAMST: %OSPF-5-ADJCHG: Process 3226, Nbr 217.14.207.21 on TenGigabitEthernet0/0/0.2 from FULL to DOWN, Neighbor Down: Dead timer expired
017986: Oct 13 14:00:23 SAMST: %RADIUS-4-RADIUS_DEAD: RADIUS server 192.168.2.222:1812,1813 is not responding.
017987: Oct 13 14:00:23 SAMST: %OSPF-5-ADJCHG: Process 10, Nbr 217.14.199.1 on TenGigabitEthernet0/0/0.4080 from FULL to DOWN, Neighbor Down: Dead timer expired
017988: Oct 13 14:00:24 SAMST: %RADIUS-4-RADIUS_DEAD: RADIUS server 192.168.2.129:1612,1613 is not responding.
017989: Oct 13 14:00:30 SAMST: %CPPHA-3-CDMDONE: F0: cpp_ha: CPP 0 microcode crashdump creation completed.
017990: Oct 13 14:00:30 SAMST: %IOSXE-6-PLATFORM: F0: cpp_cdm: Shutting down CPP MDM while client(s) still connected
017991: Oct 13 14:00:30 SAMST: %IOSXE-6-PLATFORM: F0: cpp_ha: Shutting down CPP MDM while client(s) still connected
017992: Oct 13 14:00:30 SAMST: %IOSXE-6-PLATFORM: F0: cpp_ha: Shutting down CPP CDM while client(s) still connected
017993: Oct 13 14:00:30 SAMST: %PMAN-3-PROCHOLDDOWN: F0: pman.sh: The process cpp_ha_top_level_server has been helddown (rc 69)
017994: Oct 13 14:00:30 SAMST: %PMAN-3-PROCHOLDDOWN: F0: pman.sh: The process cpp_cdm_svr has been helddown (rc 69)
017995: Oct 13 14:00:46 SAMST: %HA_EM-3-FMPD_SMTP: Error occurred when sending mail to SMTP server: 217.14.192.18 : timeout error
017996: Oct 13 14:00:53 SAMST: %HA_EM-3-FMPD_SMTP: Error occurred when sending mail to SMTP server: 217.14.192.18 : timeout error
017997: Oct 13 14:00:53 SAMST: %HA_EM-3-FMPD_SMTP: Error occurred when sending mail to SMTP server: 217.14.192.18 : timeout error
017998: Oct 13 14:01:03 SAMST: %RADIUS-3-ALLDEADSERVER: Group HydRadius: No active radius servers found. Id 197.
017999: Oct 13 14:01:05 SAMST: %RADIUS-3-ALLDEADSERVER: Group DialACCT: No active radius servers found. Id 9.
018000: Oct 13 14:01:23 SAMST: %RADIUS-6-SERVERALIVE: Group HydRadius: Radius server 192.168.2.222:1812,1813 is responding again (previously dead).
018001: Oct 13 14:01:23 SAMST: %RADIUS-4-RADIUS_ALIVE: RADIUS server 192.168.2.222:1812,1813 is being marked alive.
018002: Oct 13 14:01:23 SAMST: %RADIUS-4-RADIUS_DEAD: RADIUS server 192.168.2.222:1812,1813 is not responding.
018003: Oct 13 14:01:23 SAMST: %RADIUS-3-ALLDEADSERVER: Group HydRadius: No active radius servers found. Id 170.
018004: Oct 13 14:01:24 SAMST: %RADIUS-6-SERVERALIVE: Group DialACCT: Radius server 192.168.2.129:1612,1613 is responding again (previously dead).
018005: Oct 13 14:01:24 SAMST: %RADIUS-4-RADIUS_ALIVE: RADIUS server 192.168.2.129:1612,1613 is being marked alive.
018006: Oct 13 14:01:24 SAMST: %RADIUS-4-RADIUS_DEAD: RADIUS server 192.168.2.129:1612,1613 is not responding.
018007: Oct 13 14:01:24 SAMST: %RADIUS-3-ALLDEADSERVER: Group DialACCT: No active radius servers found. Id 0.
018008: Oct 13 14:02:14 SAMST: %ASR1000_RP_ALARM-6-INFO: CLEAR MAJOR module F0 Unknown state
018009: Oct 13 14:02:14 SAMST: %ASR1000_RP_ALARM-6-INFO: ASSERT MAJOR module F0 Disabled
018010: Oct 13 14:02:18 SAMST: %ASR1000_RP_ALARM-6-INFO: CLEAR MAJOR module F0 Disabled
018011: Oct 13 14:02:18 SAMST: %ASR1000_RP_ALARM-6-INFO: ASSERT MAJOR module F0 Boot state
018012: Oct 13 14:02:23 SAMST: %RADIUS-6-SERVERALIVE: Group HydRadius: Radius server 192.168.2.222:1812,1813 is responding again (previously dead).
018013: Oct 13 14:02:23 SAMST: %RADIUS-4-RADIUS_ALIVE: RADIUS server 192.168.2.222:1812,1813 is being marked alive.
018014: Oct 13 14:02:24 SAMST: %RADIUS-6-SERVERALIVE: Group DialACCT: Radius server 192.168.2.129:1612,1613 is responding again (previously dead).
018015: Oct 13 14:02:24 SAMST: %RADIUS-4-RADIUS_ALIVE: RADIUS server 192.168.2.129:1612,1613 is being marked alive.
018016: Oct 13 14:02:24 SAMST: %RADIUS-4-RADIUS_DEAD: RADIUS server 192.168.2.222:1812,1813 is not responding.
018017: Oct 13 14:02:24 SAMST: %RADIUS-3-ALLDEADSERVER: Group HydRadius: No active radius servers found. Id 210.
018018: Oct 13 14:02:24 SAMST: %RADIUS-4-RADIUS_DEAD: RADIUS server 192.168.2.129:1612,1613 is not responding.
018019: Oct 13 14:02:24 SAMST: %RADIUS-3-ALLDEADSERVER: Group DialACCT: No active radius servers found. Id 217.
018020: Oct 13 14:03:24 SAMST: %RADIUS-6-SERVERALIVE: Group HydRadius: Radius server 192.168.2.222:1812,1813 is responding again (previously dead).
018021: Oct 13 14:03:24 SAMST: %RADIUS-4-RADIUS_ALIVE: RADIUS server 192.168.2.222:1812,1813 is being marked alive.
018022: Oct 13 14:03:24 SAMST: %RADIUS-6-SERVERALIVE: Group DialACCT: Radius server 192.168.2.129:1612,1613 is responding again (previously dead).
018023: Oct 13 14:03:24 SAMST: %RADIUS-4-RADIUS_ALIVE: RADIUS server 192.168.2.129:1612,1613 is being marked alive.
018024: Oct 13 14:03:24 SAMST: %RADIUS-4-RADIUS_DEAD: RADIUS server 192.168.2.129:1612,1613 is not responding.
018025: Oct 13 14:03:24 SAMST: %RADIUS-3-ALLDEADSERVER: Group DialACCT: No active radius servers found. Id 227.
018026: Oct 13 14:03:24 SAMST: %RADIUS-4-RADIUS_DEAD: RADIUS server 192.168.2.222:1812,1813 is not responding.
018027: Oct 13 14:03:24 SAMST: %RADIUS-3-ALLDEADSERVER: Group HydRadius: No active radius servers found. Id 244.
018028: Oct 13 14:03:36 SAMST: %IOSXE_OIR-6-ONLINECARD: Card (fp) online in slot F0
018029: Oct 13 14:03:36 SAMST: %ASR1000_RP_ALARM-6-INFO: CLEAR CRITICAL module R0 No Working ESP
018030: Oct 13 14:03:38 SAMST: %ASR1000_RP_ALARM-6-INFO: CLEAR MAJOR module F0 Boot state
018031: Oct 13 14:03:58 SAMST: %CPPHA-7-START: F0: cpp_ha: CPP 0 preparing image /tmp/sw/fp/0/0/fp/mount/usr/cpp/bin/qfp-ucode-esp10
018032: Oct 13 14:03:59 SAMST: %CPPHA-7-START: F0: cpp_ha: CPP 0 startup init image /tmp/sw/fp/0/0/fp/mount/usr/cpp/bin/qfp-ucode-esp10
018033: Oct 13 14:04:07 SAMST: %CPPHA-7-START: F0: cpp_ha: CPP 0 running init image /tmp/sw/fp/0/0/fp/mount/usr/cpp/bin/qfp-ucode-esp10
018034: Oct 13 14:04:07 SAMST: %CPPHA-7-READY: F0: cpp_ha: CPP 0 loading and initialization complete
018035: Oct 13 14:04:08 SAMST: %IOSXE-6-PLATFORM: F0: cpp_cp: Process CPP_PFILTER_EA_EVENT__API_CALL__REGISTER
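
The two %CPPHA-3-FAULT lines carry most of the diagnostic detail as key:value pairs. A minimal Python sketch for pulling those fields out is shown here; the helper name `parse_fault` is made up for illustration, and reading the MBE/SBE suffixes as multi-bit and single-bit memory errors is an assumption, not something the log itself states.

```python
import re

# The two %CPPHA-3-FAULT lines copied from the log above.
LOG = """\
017952: Oct 13 13:59:43 SAMST: %CPPHA-3-FAULT: F0: cpp_ha: CPP:0.0 desc:DPE0_CPE_CPE_DPE_INT_SET_0_LEAF_INT_INT_PHY_ERROR det:DRVR(interrupt) class:OTHER sev:FATAL id:1071 cppstate:STOPPED res:UNKNOWN flags:0x7 cdmflags:0x1
017955: Oct 13 13:59:43 SAMST: %CPPHA-3-FAULT: F0: cpp_ha: CPP:0.0 desc:DPE0_CPE_CPE_DPE_DUI_LEAF_INT_INT_DUI_CHN0_DRAM_MBE det:DRVR(interrupt) class:MBE sev:FATAL id:1100 cppstate:STOPPED res:SUCCESS flags:0x7 cdmflags:0x1
"""

# One "key:value" token; values run up to the next whitespace.
FIELD = re.compile(r"(\w+):(\S+)")


def parse_fault(line):
    """Return the key:value fields of a %CPPHA-3-FAULT line, or None.

    Hypothetical helper: the trailing ': ' in the marker keeps
    %CPPHA-3-FAULTCRASH lines from matching.
    """
    if "%CPPHA-3-FAULT: " not in line:
        return None
    _, _, tail = line.partition("cpp_ha: ")
    return dict(FIELD.findall(tail))


faults = [f for f in map(parse_fault, LOG.splitlines()) if f]

for f in faults:
    print(f["id"], f["class"], f["sev"], f["desc"])
```

Laid out this way, the second fault (id 1100, class MBE) points at DRAM channel 0 of the forwarding processor, which is worth mentioning when the case is raised.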

 

The show version output is:

Cisco IOS Software, IOS-XE Software (PPC_LINUX_IOSD-ADVENTERPRISE-M), Version 15.2(4)S4, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2013 by Cisco Systems, Inc.
Compiled Sun 01-Sep-13 09:48 by mcpre

IOS XE Version: 03.07.04.S

Cisco IOS-XE software, Copyright (c) 2005-2013 by cisco Systems, Inc.
All rights reserved. Certain components of Cisco IOS-XE software are
licensed under the GNU General Public License ("GPL") Version 2.0. The
software code licensed under GPL Version 2.0 is free software that comes
with ABSOLUTELY NO WARRANTY. You can redistribute and/or modify such
GPL code under the terms of GPL Version 2.0. For more details, see the
documentation or "License Notice" file accompanying the IOS-XE software,
or the applicable URL provided on the flyer accompanying the IOS-XE
software.


ROM: IOS-XE ROMMON

Izhevsk28 uptime is 2 weeks, 4 days, 2 hours, 10 minutes
Uptime for this control processor is 2 weeks, 4 days, 2 hours, 12 minutes
System returned to ROM by reload at 15:25:51 SAMST Thu Sep 28 2017
System restarted at 15:29:50 SAMST Thu Sep 28 2017
System image file is "bootflash:/asr1000rp1-adventerprise.03.07.04.S.152-4.S4.bin"
Last reload reason: Reload Command


cisco ASR1004 (RP1) processor with 1694416K/6147K bytes of memory.
Processor board ID FOX1335GEYW
2 Ten Gigabit Ethernet interfaces
32768K bytes of non-volatile configuration memory.
4194304K bytes of physical memory.
937983K bytes of eUSB flash at bootflash:.
39004543K bytes of SATA hard disk at harddisk:.

Configuration register is 0x2102
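
The uptime figures above can be cross-checked against the crash time to see whether the route processor itself reloaded. A rough Python sketch of that arithmetic follows; it assumes the syslog stamp "Oct 13 13:59:43" falls in 2017 (as the interrupt timestamps "17-Oct-13" suggest) and that both clocks use the same time zone.

```python
from datetime import datetime, timedelta

# Values copied from the "show version" output above.
# "System restarted at 15:29:50 SAMST Thu Sep 28 2017"
restarted = datetime(2017, 9, 28, 15, 29, 50)
# "uptime is 2 weeks, 4 days, 2 hours, 10 minutes"
uptime = timedelta(weeks=2, days=4, hours=2, minutes=10)

# Assumption: the crash log stamp is in 2017 and in the same time zone.
crash = datetime(2017, 10, 13, 13, 59, 43)

# Approximate moment the "show version" command was run.
captured = restarted + uptime

# True if the last restart predates the crash and the output was taken after it.
rp_stayed_up = restarted < crash < captured

print("show version captured around:", captured)
print("RP stayed up through the crash:", rp_stayed_up)
```

Under those assumptions the last restart predates the crash, which is consistent with the log: only the forwarding card in slot F0 went offline and came back, while the RP kept running.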

 

Crash dump: https://drive.google.com/open?id=0BxBorEHS62byZWU5bGJaNk5oMmM

Please help me understand what happened and what the cause is: hardware, software, or misconfiguration.

1 Reply

Mark Malone
VIP Alumni
Hi,
There seem to be a few known bugs very similar to this online, so you should open a case with TAC.
Another option is to upgrade the software and see whether it happens again. If it does not, chances are it was a software bug, since faulty hardware usually repeats itself fairly quickly. You can also run the show tech output through the Cisco CLI Analyzer on the Cisco website.
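
To watch for a repeat, one option is to count crash events per day in saved syslog text. The Python sketch below is only an illustration: the function name `crashes_per_day` is hypothetical, the sample holds lines copied from the log in the question, and it keys on the single "CPP crashed, core file" message because the FAULTCRASH message is logged several times for one incident.

```python
import re
from collections import Counter

# Lines copied from the log in the question; a real check would read
# the full saved syslog instead of this short sample.
SAMPLE = """\
017953: Oct 13 13:59:43 SAMST: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
017954: Oct 13 13:59:43 SAMST: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
017956: Oct 13 13:59:43 SAMST: %CPPHA-3-FAULTCRASH: F0: cpp_ha: CPP 0.0 unresolved fault detected, initiating crash dump.
017963: Oct 13 13:59:44 SAMST: %IOSXE-3-PLATFORM: F0: cpp_cdm: CPP crashed, core file /tmp/corelink/Izhevsk28_ESP_0_cpp-mcplo-ucode_101317135943.core.gz
"""

# Sequence number, "Mon DD", time, time zone, then the crash message.
CRASH = re.compile(
    r"^\d+: (\w{3} +\d+) \d\d:\d\d:\d\d \S+ "
    r"%IOSXE-3-PLATFORM: .*CPP crashed, core file"
)


def crashes_per_day(text):
    """Count 'CPP crashed' messages per logged day (the stamps carry no year)."""
    days = Counter()
    for line in text.splitlines():
        match = CRASH.match(line)
        if match:
            days[match.group(1)] += 1
    return days


for day, count in crashes_per_day(SAMPLE).items():
    print(day, count)
```

A single entry, as with this sample, fits either explanation; several entries spread over days or weeks on the same software would lean toward hardware, in line with the reasoning above.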