Cisco Support Community

Double fault PSC/PAC migration

A PSC/PAC double failure can occur if the PSC/PAC lever is unlocked and the card is then pulled out while the migration is still in progress.

A PSC/PAC double failure can occur if the lever of the PSC/PAC is unlocked and the card is then pulled out before the PSC/PAC migration has completed. Two seconds after a PSC lever is unlocked, the PSC starts to migrate. If the PSC is pulled at any time during the migration, this counts as a double failure and both PSCs involved in the migration are rebooted. The system is then left with too few active PSCs to host the required tasks, so it restarts tasks wherever it can in an attempt to return to a stable state. The cleanest approach is to perform a CLI-based PSC migration first, and only then unlatch and remove the card. The following logs indicate a double failure scenario:

******** show rct stats *******

RCT stats Details (Last 7 Actions)
Action         Type      From To   Start Time                Duration
-------------- --------- ---- ---- ------------------------  ---------- 
PSC migration  Planned     4    2  2010-Jan-27+16:05:57.141  Failed/Aborted
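
When many chassis have to be checked, rows like the one above can be picked out of saved `show rct stats` output with a short script. The following is a minimal Python sketch; the column layout is assumed from the single sample row shown here, and `failed_migrations` is a hypothetical helper name, not a StarOS tool.

```python
import re

# Assumed row layout (inferred from the sample row only):
# <action words>  <type>  <from slot>  <to slot>  <start time>  <duration or status>
ROW = re.compile(
    r"^(?P<action>\S+(?: \S+)*?)\s{2,}(?P<type>\S+)\s+(?P<src>\d+)\s+(?P<dst>\d+)\s+"
    r"(?P<start>\S+)\s+(?P<duration>.+?)\s*$"
)

def failed_migrations(text):
    """Return (from_slot, to_slot, start_time) for each failed migration row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # header, separator, or unrelated line
        if "migration" in m["action"].lower() and "failed" in m["duration"].lower():
            rows.append((int(m["src"]), int(m["dst"]), m["start"]))
    return rows
```

Header and separator lines do not match the pattern because they have no numeric slot columns, so they are skipped.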

Note: the first log entry below marks the actual start of the migration, 2 seconds after the lever is lifted. There is no real-time log entry for the lever lift itself.

2010-Jan-27+16:05:57.093 [csp 7009 info] [8/0/1795 <cspctrl:0> pctrl_helpers.c:2535] [hardware internal system critical-info] Migrating all tasks on the Packet Services Card 2 in slot 4 to the one in slot 2.

2010-Jan-27+16:05:58.200 [csp 7017 critical] [8/0/1795 <cspctrl:0> spctrl_events.c:1333] [hardware external system] Unexpected card pull!  The Packet Services Card 2 with serial number PLB44099509 in slot 4 was not ready for removal!

2010-Jan-27+16:05:58.200 [csp 7031 critical] [8/0/1795 <cspctrl:0> spctrl_events.c:1375] [hardware internal system critical-info] No additional Packet Services Card 2 cards available!  Tasks may be lost!

2010-Jan-27+16:05:58.395 [npuctrl 16020 info] [8/0/1792 <npuctrl:0> rl_sf_handler.c:3784] [software internal system critical-info] Unexpected card pull on slot 4, SF recovery will be disabled for 10 seconds

2010-Jan-27+16:05:58.474 [rct 13041 warning] [8/0/1779 <rct:0> rct_pac.c:1261] [software internal system critical-info] Recovery action aborted card 4: PAC migration, card reset
2010-Jan-27+16:05:58.504 [rct 13041 warning] [8/0/1779 <rct:0> rct_pac.c:1297] [software internal system critical-info] Recovery action aborted card 2: PAC migration, card reset
2010-Jan-27+16:05:58.504 [rct 13013 info] [8/0/1779 <rct:0> rct_pac.c:978] [software internal system critical-info] Card 4 shutdown started
2010-Jan-27+16:05:58.514 [rct 13014 info] [8/0/1779 <rct:0> rct_pac.c:1105] [software internal system critical-info] Card 4 shutdown completed, took 0.010 sec
2010-Jan-27+16:05:58.514 [rct 13013 info] [8/0/1779 <rct:0> rct_pac.c:978] [software internal system critical-info] Card 2 shutdown started
2010-Jan-27+16:05:58.524 [rct 13014 info] [8/0/1779 <rct:0> rct_pac.c:1105] [software internal system critical-info] Card 2 shutdown completed, took 0.010 sec

2010-Jan-27+16:05:58.525 [rct 13003 info] [8/0/1779 <rct:0> rct_pac.c:596] [software internal system critical-info] Card task migration failed from card 4 to card 2
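
The signature in these entries is a "Migrating all tasks" message for a slot followed by an "Unexpected card pull!" message for the same slot. A minimal Python sketch that looks for this pairing in a saved log is shown below. The regular expressions are modelled only on the two csp messages above, and the function name is hypothetical. The sketch does not track migration completion (no such message appears in this sample), so it assumes the input covers a single migration attempt.

```python
import re
from datetime import datetime

TS_FORMAT = "%Y-%b-%d+%H:%M:%S.%f"  # e.g. 2010-Jan-27+16:05:57.093

MIGRATE = re.compile(
    r"^(\S+) .*Migrating all tasks on the .* in slot (\d+) to the one in slot (\d+)"
)
PULL = re.compile(
    r"^(\S+) .*Unexpected card pull!.* in slot (\d+) was not ready for removal"
)

def find_pulls_during_migration(lines):
    """Return (source_slot, dest_slot, seconds_after_migration_start) tuples."""
    started = {}  # source slot -> (migration start time, destination slot)
    hits = []
    for line in lines:
        m = MIGRATE.match(line)
        if m:
            started[int(m[2])] = (datetime.strptime(m[1], TS_FORMAT), int(m[3]))
            continue
        p = PULL.match(line)
        if p and int(p[2]) in started:
            t0, dst = started[int(p[2])]
            delta = (datetime.strptime(p[1], TS_FORMAT) - t0).total_seconds()
            hits.append((int(p[2]), dst, delta))
    return hits
```

Applied to the entries above, this reports that slot 4 was pulled about 1.1 seconds into its migration to slot 2.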

Note that the following SNMP traps are also logged:

(PACMigrateStart) from card 4 to card 2

(CardDown) card 4

(CardRemoved) card 4

(ServiceLossPTACs) Active PSC (3 configured active, 2 operationally active)

(PACMigrateFailed) from card 4 to card 2

(CardDown) card 2

Above was imported from Starent Networks Knowledgebase Article # 10940

Another example:

Taken from SR 621530819, the following is output from two PSCs, 16 and 10, resetting within a few seconds of each other due to missed heartbeats. PSC 4 becomes active first and recovers most (but not all) of the sessions from PSC 16. PSC 10 then resets without any standby available to take over, so all of its calls are dropped, and when PSC 16 comes back up it starts taking new calls from scratch. The effect can be seen in the lagging call count of the sessmgr instances on PSC 16 (roughly 1,570 to 1,580 each, versus roughly 2,820 to 2,890 on cards 14 and 15):

14/0 sessmgr        198  10% 100% 501.2M  1900M   69  500  2887 28160 I   good
14/0 sessmgr        202 7.8% 100% 501.6M  1900M   69  500  2887 28160 I   good
14/0 sessmgr        206 9.4% 100% 501.9M  1900M   69  500  2887 28160 I   good
14/0 sessmgr        215 7.3% 100% 501.8M  1900M   71  500  2888 28160 I   good
14/0 sessmgr        217 8.2% 100% 501.7M  1900M   70  500  2883 28160 I   good
14/0 sessmgr        230 7.4% 100% 501.7M  1900M   71  500  2883 28160 I   good

15/0 sessmgr        172 8.3% 100% 426.6M  1900M   72  500  2839 28160 I   good
15/0 sessmgr        177 8.4% 100% 426.0M  1900M   72  500  2826 28160 I   good
15/0 sessmgr        183 8.3% 100% 426.9M  1900M   70  500  2835 28160 I   good
15/0 sessmgr        189 8.9% 100% 426.0M  1900M   73  500  2837 28160 I   good
15/0 sessmgr        234 7.7% 100% 426.9M  1900M   72  500  2834 28160 I   good
15/0 sessmgr        266 7.9% 100% 425.5M  1900M   71  500  2824 28160 I   good
15/0 sessmgr        279 9.2% 100% 425.2M  1900M   73  500  2838 28160 I   good

16/0 sessmgr         88 7.1% 100% 375.7M  1900M   61  500  1582 28160 I   good
16/0 sessmgr         91 4.9% 100% 375.4M  1900M   60  500  1580 28160 I   good
16/0 sessmgr         99 6.3% 100% 376.7M  1900M   62  500  1570 28160 I   good
16/0 sessmgr        122 5.2% 100% 375.1M  1900M   61  500  1576 28160 I   good
16/0 sessmgr        127 6.9% 100% 375.1M  1900M   60  500  1566 28160 I   good
16/0 sessmgr        138 6.1% 100% 376.3M  1900M   62  500  1581 28160 I   good
16/0 sessmgr        142 6.3% 100% 375.4M  1900M   66  500  1569 28160 I   good
16/0 sessmgr        143 5.1% 100% 375.9M  1900M   64  500  1567 28160 I   good
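
The same comparison can be made numerically. The sketch below is a minimal Python example that averages the per-sessmgr call count for each card and flags cards that fall well below the busiest one. It assumes that the tenth whitespace-separated field of each row (2887, 1582, and so on) is the in-use session count, which is how the rows above appear to be laid out. The function name and the 0.75 threshold are illustrative choices, not part of any StarOS tooling.

```python
from collections import defaultdict

def mean_sessions_per_card(text, lag_ratio=0.75):
    """Return ({card: mean sessions per sessmgr}, [cards lagging the busiest card])."""
    per_card = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Expect rows such as: 16/0 sessmgr 88 7.1% 100% 375.7M 1900M 61 500 1582 28160 I good
        if len(fields) >= 11 and fields[1] == "sessmgr":
            card = int(fields[0].split("/")[0])
            per_card[card].append(int(fields[9]))  # assumed: in-use session count
    means = {card: sum(vals) / len(vals) for card, vals in per_card.items()}
    if not means:
        return means, []
    busiest = max(means.values())
    lagging = sorted(card for card, mean in means.items() if mean < lag_ratio * busiest)
    return means, lagging
```

With the rows above as input, card 16 is the only card flagged as lagging.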


SNMP traps (chronological order)


Wed Apr 25 15:58:20 2012 Internal trap notification 73 (ManagerFailure) facility hatcpu instance 161 card 8 cpu 0

Wed Apr 25 15:58:21 2012 Internal trap notification 55 (CardActive) card 4 type Packet Services Card 3

Wed Apr 25 15:58:21 2012 Internal trap notification 73 (ManagerFailure) facility hatcpu instance 101 card 8 cpu 0

Wed Apr 25 15:58:24 2012 Internal trap notification 35 (PortLinkDown) card 20 port 1 ifindex 335609856
Wed Apr 25 15:58:24 2012 Internal trap notification 1024 (PortDown) card 20 port 1 ifindex 335609856port type 10G Ethernet
Wed Apr 25 15:58:24 2012 Internal trap notification 35 (PortLinkDown) card 20 port 1 ifindex 335609858
Wed Apr 25 15:58:24 2012 Internal trap notification 35 (PortLinkDown) card 20 port 1 ifindex 335609859
Wed Apr 25 15:58:24 2012 Internal trap notification 35 (PortLinkDown) card 20 port 1 ifindex 335609860
Wed Apr 25 15:58:24 2012 Internal trap notification 35 (PortLinkDown) card 20 port 1 ifindex 335609861
Wed Apr 25 15:58:24 2012 Internal trap notification 93 (CardStandby) card 40 type Redundancy Crossbar Card
Wed Apr 25 15:58:24 2012 Internal trap notification 93 (CardStandby) card 41 type Redundancy Crossbar Card
Wed Apr 25 15:58:24 2012 Internal trap notification 35 (PortLinkDown) card 19 port 1 ifindex 318832640
Wed Apr 25 15:58:24 2012 Internal trap notification 1024 (PortDown) card 19 port 1 ifindex 318832640port type 10G Ethernet
Wed Apr 25 15:58:25 2012 Internal trap notification 55 (CardActive) card 20 type 10 Gig Ethernet Line Card
Wed Apr 25 15:58:25 2012 Internal trap notification 93 (CardStandby) card 19 type 10 Gig Ethernet Line Card

Wed Apr 25 15:58:27 2012 Internal trap notification 60 (CardDown) card 16 type Packet Services Card 3

Wed Apr 25 15:58:28 2012 Internal trap notification 35 (PortLinkDown) card 26 port 1 ifindex 436273152
Wed Apr 25 15:58:28 2012 Internal trap notification 1024 (PortDown) card 26 port 1 ifindex 436273152port type 10G Ethernet
Wed Apr 25 15:58:28 2012 Internal trap notification 60 (CardDown) card 26 type 10 Gig Ethernet Line Card
Wed Apr 25 15:58:30 2012 Internal trap notification 5 (CardUp) card 26 type 10 Gig Ethernet Line Card

Wed Apr 25 15:58:30 2012 Internal trap notification 60 (CardDown) card 10 type Packet Services Card 3

Wed Apr 25 15:58:31 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2824 calls recovered 2824 passed audit 2820 prior to audit 2820 all call lines 5411 time elapsed ms 8792.

Wed Apr 25 15:58:31 2012 Internal trap notification 55 (CardActive) card 40 type Redundancy Crossbar Card
Wed Apr 25 15:58:31 2012 Internal trap notification 55 (CardActive) card 41 type Redundancy Crossbar Card
Wed Apr 25 15:58:32 2012 Internal trap notification 36 (PortLinkUp) card 19 port 1 ifindex 318832640
Wed Apr 25 15:58:32 2012 Internal trap notification 1025 (PortUp) card 19 port 1 ifindex 318832640port type 10G Ethernet
Wed Apr 25 15:58:32 2012 Internal trap notification 55 (CardActive) card 19 type 10 Gig Ethernet Line Card

Wed Apr 25 15:58:32 2012 Internal trap notification 36 (PortLinkUp) card 20 port 1 ifindex 335609856
Wed Apr 25 15:58:32 2012 Internal trap notification 1025 (PortUp) card 20 port 1 ifindex 335609856port type 10G Ethernet
Wed Apr 25 15:58:32 2012 Internal trap notification 36 (PortLinkUp) card 20 port 1 ifindex 335609859
Wed Apr 25 15:58:32 2012 Internal trap notification 1025 (PortUp) card 20 port 1 ifindex 335609859port type 10G Ethernet
Wed Apr 25 15:58:32 2012 Internal trap notification 36 (PortLinkUp) card 20 port 1 ifindex 335609861
Wed Apr 25 15:58:32 2012 Internal trap notification 1025 (PortUp) card 20 port 1 ifindex 335609861port type 10G Ethernet
Wed Apr 25 15:58:32 2012 Internal trap notification 36 (PortLinkUp) card 20 port 1 ifindex 335609858
Wed Apr 25 15:58:32 2012 Internal trap notification 1025 (PortUp) card 20 port 1 ifindex 335609858port type 10G Ethernet
Wed Apr 25 15:58:32 2012 Internal trap notification 1040 (DiameterIpv6PeerUp)  context HSGWin ipaddr 2001:4888:203:fff2:c0:116:0:e end point name 0012-diamproxy.JPTRFLGNPN2.Rf.vzims.com
Wed Apr 25 15:58:32 2012 Internal trap notification 35 (PortLinkDown) card 20 port 1 ifindex 335609858
Wed Apr 25 15:58:32 2012 Internal trap notification 1024 (PortDown) card 20 port 1 ifindex 335609858port type 10G Ethernet
Wed Apr 25 15:58:32 2012 Internal trap notification 35 (PortLinkDown) card 20 port 1 ifindex 335609859
Wed Apr 25 15:58:32 2012 Internal trap notification 1024 (PortDown) card 20 port 1 ifindex 335609859port type 10G Ethernet
Wed Apr 25 15:58:32 2012 Internal trap notification 35 (PortLinkDown) card 20 port 1 ifindex 335609861
Wed Apr 25 15:58:32 2012 Internal trap notification 1024 (PortDown) card 20 port 1 ifindex 335609861port type 10G Ethernet
Wed Apr 25 15:58:32 2012 Internal trap notification 36 (PortLinkUp) card 20 port 1 ifindex 335609858
Wed Apr 25 15:58:32 2012 Internal trap notification 1025 (PortUp) card 20 port 1 ifindex 335609858port type 10G Ethernet

Wed Apr 25 15:58:32 2012 Internal trap notification 36 (PortLinkUp) card 20 port 1 ifindex 335609859
Wed Apr 25 15:58:32 2012 Internal trap notification 1025 (PortUp) card 20 port 1 ifindex 335609859port type 10G Ethernet
Wed Apr 25 15:58:32 2012 Internal trap notification 36 (PortLinkUp) card 20 port 1 ifindex 335609860
Wed Apr 25 15:58:32 2012 Internal trap notification 1025 (PortUp) card 20 port 1 ifindex 335609860port type 10G Ethernet
Wed Apr 25 15:58:32 2012 Internal trap notification 36 (PortLinkUp) card 20 port 1 ifindex 335609861
Wed Apr 25 15:58:32 2012 Internal trap notification 1025 (PortUp) card 20 port 1 ifindex 335609861port type 10G Ethernet
Wed Apr 25 15:58:32 2012 Internal trap notification 93 (CardStandby) card 19 type 10 Gig Ethernet Line Card
Wed Apr 25 15:58:32 2012 Internal trap notification 83 (ServiceLossPTACs) Active PSC (13 configured active, 12 operationally active)


Wed Apr 25 15:58:33 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2829 calls recovered 2829 passed audit 2826 prior to audit 2818 all call lines 5390 time elapsed ms 10215.
Wed Apr 25 15:58:33 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2821 calls recovered 2821 passed audit 2817 prior to audit 2818 all call lines 5411 time elapsed ms 10399.
Wed Apr 25 15:58:33 2012 Internal trap notification 1099 (ManagerRestart) facility aamgr instance 4 card 10 cpu 0
Wed Apr 25 15:58:34 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2828 calls recovered 2828 passed audit 2823 prior to audit 2823 all call lines 5428 time elapsed ms 11437.
Wed Apr 25 15:58:35 2012 Internal trap notification 36 (PortLinkUp) card 26 port 1 ifindex 436273152
Wed Apr 25 15:58:35 2012 Internal trap notification 1025 (PortUp) card 26 port 1 ifindex 436273152port type 10G Ethernet
Wed Apr 25 15:58:35 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2829 calls recovered 2829 passed audit 2818 prior to audit 2811 all call lines 5380 time elapsed ms 12273.
Wed Apr 25 15:58:35 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2823 calls recovered 2823 passed audit 2820 prior to audit 2816 all call lines 5413 time elapsed ms 11667.
Wed Apr 25 15:58:35 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2831 calls recovered 2830 passed audit 2821 prior to audit 2823 all call lines 5455 time elapsed ms 12887.
Wed Apr 25 15:58:36 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2825 calls recovered 2825 passed audit 2815 prior to audit 2817 all call lines 5410 time elapsed ms 12228.
Wed Apr 25 15:58:36 2012 Internal trap notification 1099 (ManagerRestart) facility aamgr instance 25 card 10 cpu 0
Wed Apr 25 15:58:36 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2826 calls recovered 2826 passed audit 2811 prior to audit 2815 all call lines 5415 time elapsed ms 13356.
Wed Apr 25 15:58:36 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2822 calls recovered 2821 passed audit 2812 prior to audit 2817 all call lines 5422 time elapsed ms 13872.
Wed Apr 25 15:58:36 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2821 calls recovered 2820 passed audit 2808 prior to audit 2810 all call lines 5443 time elapsed ms 12776.
Wed Apr 25 15:58:36 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2829 calls recovered 2829 passed audit 2819 prior to audit 2819 all call lines 5418 time elapsed ms 13743.
Wed Apr 25 15:58:37 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2828 calls recovered 2827 passed audit 2817 prior to audit 2819 all call lines 5401 time elapsed ms 14224.
Wed Apr 25 15:58:37 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2827 calls recovered 2827 passed audit 2813 prior to audit 2821 all call lines 5410 time elapsed ms 14263.
Wed Apr 25 15:58:37 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2824 calls recovered 2824 passed audit 2812 prior to audit 2817 all call lines 5382 time elapsed ms 13433.
Wed Apr 25 15:58:37 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2821 calls recovered 2821 passed audit 2808 prior to audit 2808 all call lines 5382 time elapsed ms 13555.
Wed Apr 25 15:58:37 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2816 calls recovered 2816 passed audit 2807 prior to audit 2810 all call lines 5400 time elapsed ms 13630.
Wed Apr 25 15:58:37 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2830 calls recovered 2830 passed audit 2823 prior to audit 2821 all call lines 5414 time elapsed ms 13816.
Wed Apr 25 15:58:37 2012 Internal trap notification 1099 (ManagerRestart) facility aamgr instance 30 card 10 cpu 0
Wed Apr 25 15:58:37 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2821 calls recovered 2821 passed audit 2811 prior to audit 2812 all call lines 5411 time elapsed ms 14969.
Wed Apr 25 15:58:37 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2826 calls recovered 2826 passed audit 2818 prior to audit 2818 all call lines 5383 time elapsed ms 14038.
Wed Apr 25 15:58:38 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2815 calls recovered 2815 passed audit 2806 prior to audit 2812 all call lines 5400 time elapsed ms 14141.
Wed Apr 25 15:58:38 2012 Internal trap notification 183 (SessMgrRecoveryComplete) Slot Number 4 Cpu Number 0 fetched from aaa mgr 2833 calls recovered 2833 passed audit 2826 prior to audit 2827 all call lines 5409 time elapsed ms 15004.
Wed Apr 25 15:58:39 2012 Internal trap notification 1099 (ManagerRestart) facility aamgr instance 50 card 10 cpu 0
Wed Apr 25 15:58:40 2012 Internal trap notification 1099 (ManagerRestart) facility aamgr instance 56 card 10 cpu 0
Wed Apr 25 15:58:41 2012 Internal trap notification 1099 (ManagerRestart) facility aamgr instance 75 card 10 cpu 0
Wed Apr 25 15:58:42 2012 Internal trap notification 1099 (ManagerRestart) facility aamgr instance 81 card 10 cpu 0
Wed Apr 25 15:58:43 2012 Internal trap notification 1099 (ManagerRestart) facility aamgr instance 89 card 10 cpu 0
Wed Apr 25 15:58:43 2012 Internal trap notification 1099 (ManagerRestart) facility aamgr instance 90 card 10 cpu 0
Wed Apr 25 15:58:44 2012 Internal trap notification 1099 (ManagerRestart) facility aamgr instance 103 card 10 cpu 0
Wed Apr 25 15:58:46 2012 Internal trap notification 1099 (ManagerRestart) facility aamgr instance 126 card 10 cpu 0

Wed Apr 25 16:01:33 2012 Internal trap notification 5 (CardUp) card 16 type Packet Services Card 3
Wed Apr 25 16:01:33 2012 Internal trap notification 55 (CardActive) card 16 type Packet Services Card 3

Wed Apr 25 16:01:34 2012 Internal trap notification 1110 (ServiceLossPTACsClear) Slots 0 and 0 has configured for card type Unknown Card (0x00000000), one of them is active now

Wed Apr 25 16:02:30 2012 Internal trap notification 5 (CardUp) card 10 type Packet Services Card 3
Wed Apr 25 16:02:30 2012 Internal trap notification 93 (CardStandby) card 10 type Packet Services Card 3
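
To see how long each PSC was out of service, the CardDown and CardUp traps can be paired per card. The following is a minimal Python sketch. The trap line format is assumed from the lines above, and `card_outages` is a hypothetical helper name.

```python
import re
from datetime import datetime

TRAP = re.compile(
    r"^(?P<ts>\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4}) Internal trap notification \d+ "
    r"\((?P<name>CardDown|CardUp)\) card (?P<card>\d+) type (?P<type>.+)$"
)

def card_outages(lines, card_type_prefix="Packet Services Card"):
    """Return {card: seconds between its CardDown trap and its next CardUp trap}."""
    down_at = {}
    outages = {}
    for line in lines:
        m = TRAP.match(line.strip())
        if not m or not m["type"].startswith(card_type_prefix):
            continue  # not a CardDown/CardUp trap for the card type of interest
        ts = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S %Y")
        card = int(m["card"])
        if m["name"] == "CardDown":
            down_at[card] = ts
        elif card in down_at:
            outages[card] = (ts - down_at.pop(card)).total_seconds()
    return outages
```

For the traps above, this gives 186 seconds for card 16 and 240 seconds for card 10.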


Logs (Reverse chronological order)


2012-Apr-25+15:58:28.480 [rct 13014 info] [8/0/4366 <rct:0> rct_pac.c:1062] [software internal system critical-info syslog] Card 10 shutdown completed, took 0.032 sec
2012-Apr-25+15:58:28.448 [rct 13013 info] [8/0/4366 <rct:0> rct_pac.c:941] [software internal system critical-info syslog] Card 10 shutdown started
2012-Apr-25+15:58:28.372 [csp 7031 critical] [8/0/4405 <cspctrl:0> spctrl_events.c:3807] [hardware internal system critical-info syslog] No additional Packet Services Card 3 cards available!  Tasks may be lost!

2012-Apr-25+15:58:28.346 [csp 7019 critical] [8/0/4405 <cspctrl:0> spctrl_events.c:3632] [hardware internal system diagnostic] The Packet Services Card 3 with serial number PLB38108766 in slot 10 has failed and will be reset and brought back online. (Device=CPU_1, Reason=CPU_CRITICAL_TASK_FAILURE, Status=[CPU0 MB: Boot Done HB_cpu: 0F:05] [CPU1 HB_cpu: 00:00] [CPU2 HB_cpu: 00:00] [CPU3 HB_cpu: 00:00] [GPIO_IN: 00,ff,ff,ff] [GPIO_OUT: 01,ff,00,ff])

2012-Apr-25+15:58:23.606 [rct 13014 info] [8/0/4366 <rct:0> rct_pac.c:1062] [software internal system critical-info syslog] Card 16 shutdown completed, took 1.868 sec
2012-Apr-25+15:58:21.740 [rct 13013 info] [8/0/4366 <rct:0> rct_pac.c:941] [software internal system critical-info syslog] Card 16 shutdown started

2012-Apr-25+15:58:21.727 [sitmain 4019 warning] [8/0/4402 <npuctrl:0> sit_api.c:3757] [software internal system critical-info syslog] Message bounced to facility sitmain instance 160
2012-Apr-25+15:58:21.708 [hat 3033 error] [8/0/4364 <hatsystem:0> atsystem_fail.c:1277] [hardware internal system critical-info diagnostic] Card error detected on card 10 device CPU_1 reason CPU_CRITICAL_TASK_FAILURE

2012-Apr-25+15:58:21.026 [csp 7019 critical] [8/0/4405 <cspctrl:0> spctrl_events.c:3632] [hardware internal system diagnostic] The Packet Services Card 3 with serial number PLB38108781 in slot 16 has failed and will be reset and brought back online. (Device=CPU_1, Reason=CPU_CRITICAL_TASK_FAILURE, Status=[CPU0 MB: Boot Done HB_cpu: 0E:04] [CPU1 HB_cpu: 00:00] [CPU2 HB_cpu: 00:00] [CPU3 HB_cpu: 00:00] [GPIO_IN: 00,ff,ff,ff] [GPIO_OUT: 01,ff,00,ff])
2012-Apr-25+15:58:20.966 [hat 3033 error] [8/0/4364 <hatsystem:0> atsystem_fail.c:1277] [hardware internal system critical-info diagnostic] Card error detected on card 16 device CPU_1 reason CPU_CRITICAL_TASK_FAILURE
2012-Apr-25+15:58:20.912 [hat 3051 warning] [8/0/4364 <hatsystem:0> hat_hb_lib.c:1262] [software internal system critical-info syslog] ICMP ping results from card 16: rtt(cpu:0,if:0,seq:0)=0ms rtt(cpu:1,if:0,seq:0)=0ms
2012-Apr-25+15:58:20.912 [hat 3014 critical] [8/0/4364 <hatsystem:0> hat_hb_lib.c:721] [software internal system syslog] HAT instance 0 found Critical task hatcpu/161 failed on cpu 16/1. Missed heartbeat sequence 1937379 ns_age 4 bounce_code N/A uptime 1938309.
hb 1937369 :09.645, RV, - 00000000 00000000, tx 0ms, rx 0ms, rtt 0ms
hb 1937370 :10.645, RV, - 00000000 00000000, tx 0ms, rx 0ms, rtt 0ms
hb 1937371 :11.646, RV, - 00000000 00000000, tx 2ms, rx 0ms, rtt 2ms
hb 1937372 :12.647, RV, - 00000000 00000000, tx 0ms, rx 2ms, rtt 2ms
hb 1937373 :13.647, RV, - 00000000 00000000, tx 0ms, rx 0ms, rtt 0ms
hb 1937374 :14.648, RV, - 00000000 00000000, tx 1ms, rx 5ms, rtt 6ms
hb 1937375 :15.649, RV, - 00000000 00000000, tx 0ms, rx 0ms, rtt 0ms
hb 1937376 :16.649, RV, - 00000000 00000000, tx 0ms, rx 0ms, rtt 0ms
hb 1937377 :17.650, RV, - 00000000 00000000, tx 0ms, rx 0ms, rtt 0ms
hb 1937378 :18.650, BV, - 00000000 00000000, tout 1258ms, age 5
afd 19273658 :19.352, rtt 2ms
afd 19273659 :19.452, rtt 1ms
afd 19273660 :19.553, rtt 0ms
afd 19273661 :19.653, tout
afd 19273662 :19.754, rtt 0ms
afd 19273663 :19.855, rtt 1ms
afd 19273664 :19.955, rtt 0ms
afd 19273665 :20.056, rtt 0ms
afd 19273666 :20.156, rtt 0ms
afd 19273667 :20.256, rtt 1ms
afd 19273668 :20.356, rtt 0ms
afd 19273669 :20.457, rtt 2ms
afd 19273670 :20.558, rtt 2ms
afd 19273671 :20.659, rtt 0ms
afd 19273672 :20.759, rtt 0ms
afd 19273673 :20.859, rtt 0ms

2012-Apr-25+15:58:20.911 [hat 3051 warning] [8/0/4364 <hatsystem:0> hat_hb_lib.c:1262] [software internal system critical-info syslog] ICMP ping results from card 10: rtt(cpu:0,if:0,seq:0)=0ms rtt(cpu:1,if:0,seq:0)=0ms
2012-Apr-25+15:58:20.910 [hat 3014 critical] [8/0/4364 <hatsystem:0> hat_hb_lib.c:721] [software internal system syslog] HAT instance 0 found Critical task hatcpu/101 failed on cpu 10/1. Missed heartbeat sequence 1937379 ns_age 4 bounce_code N/A uptime 1938317.
hb 1937369 :09.645, RV, - 00000000 00000000, tx 3ms, rx 3ms, rtt 6ms
hb 1937370 :10.645, RV, - 00000000 00000000, tx 2ms, rx 0ms, rtt 2ms
hb 1937371 :11.646, RV, - 00000000 00000000, tx 3ms, rx 0ms, rtt 3ms
hb 1937372 :12.646, RV, - 00000000 00000000, tx 0ms, rx 2ms, rtt 2ms
hb 1937373 :13.647, RV, - 00000000 00000000, tx 0ms, rx 0ms, rtt 0ms
hb 1937374 :14.648, RV, - 00000000 00000000, tx 1ms, rx 0ms, rtt 1ms
hb 1937375 :15.649, RV, - 00000000 00000000, tx 0ms, rx 0ms, rtt 0ms
hb 1937376 :16.649, RV, - 00000000 00000000, tx 0ms, rx 0ms, rtt 0ms
hb 1937377 :17.650, RV, - 00000000 00000000, tx 2ms, rx 3ms, rtt 5ms
hb 1937378 :18.650, BV, - 00000000 00000000, tout 1258ms, age 5
afd 19273658 :19.352, rtt 2ms
afd 19273659 :19.452, rtt 1ms
afd 19273660 :19.553, rtt 0ms
afd 19273661 :19.653, tout
afd 19273662 :19.754, rtt 0ms
afd 19273663 :19.855, rtt 1ms
afd 19273664 :19.955, rtt 0ms
afd 19273665 :20.056, rtt 0ms
afd 19273666 :20.156, rtt 0ms
afd 19273667 :20.256, rtt 1ms
afd 19273668 :20.356, rtt 0ms
afd 19273669 :20.457, rtt 2ms
afd 19273670 :20.558, rtt 2ms
afd 19273671 :20.659, rtt 0ms
afd 19273672 :20.759, rtt 0ms
afd 19273673 :20.859, rtt 0ms
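
The probe history that follows each "Missed heartbeat" message can be scanned for timed-out entries. The sketch below is a minimal Python example that returns every `hb` or `afd` line marked `tout`. The line format is assumed from the output above, the function name is hypothetical, and no meaning is attached to the other fields.

```python
import re

# Assumed layout: "<hb|afd> <sequence> :<ss.mmm>, <remaining fields>"
PROBE = re.compile(r"^(?P<kind>hb|afd) (?P<seq>\d+) :(?P<sec>\d\d\.\d{3}),(?P<rest>.*)$")

def timed_out_probes(lines):
    """Return (kind, sequence, seconds_field) for each probe line marked 'tout'."""
    timed_out = []
    for line in lines:
        m = PROBE.match(line.strip())
        if m and re.search(r"\btout\b", m["rest"]):
            timed_out.append((m["kind"], int(m["seq"]), m["sec"]))
    return timed_out
```

On either card's history above, this returns the `hb` entry at :18.650 and the `afd` entry at :19.653.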

Version history
Revision #:
1 of 1
Last update:
‎01-25-2012 06:14 AM