Cat4507R Sup IV wrong standby state

Unanswered Question
Aug 25th, 2009

I have a 4507R with redundant Sup IVs running IOS 12.2(25)EWA6. They are in SSO mode; however, the standby is in Standby Cold, not Standby Hot, so when I try "redundancy force-switchover" it tells me the standby is not ready. I have searched and cannot find how to force the standby Sup to go from cold to hot. An identical 4507R next to this one, with the same Sups, same IOS, and same redundancy mode, is fine with the standby in "hot" mode.


Any ideas? This is a production system, so the maintenance window is closed for the day, but I can try things again early tomorrow.


gkuzmowycz Tue, 08/25/2009 - 08:02

Thanks, but we've been through that and similar documents.


SW_NAME#sh modu
Chassis Type : WS-C4507R

[snip]

Mod  Redundancy role     Redundancy mode     Redundancy status
----+-------------------+-------------------+-------------------
 1   Standby Supervisor  SSO                 Standby cold
 2   Active Supervisor   SSO                 Active


SW_NAME#sh redun state
my state = 13 -ACTIVE
peer state = 4 -STANDBY COLD
Mode = Duplex
Unit = Secondary
Unit ID = 2

Redundancy Mode (Operational) = Stateful Switchover
Redundancy Mode (Configured) = Stateful Switchover
Split Mode = Disabled
Manual Swact = Disabled Reason: Progression in progress
Communications = Up

client count = 23
client_notification_TMR = 240000 milliseconds
keep_alive TMR = 9000 milliseconds
keep_alive count = 0
keep_alive threshold = 18
RF debug mask = 0x0


As I said, an identical switch next to this one, same Sup engines, same IOS, shows Standby Hot.
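
For anyone who wants to compare several chassis at once, here is a minimal Python sketch that reads pasted "show redundancy states" text and reports whether the peer is Standby Hot. The function names are made up for illustration, and the parsing assumes the simple "key = value" layout shown above; other IOS releases may format the output differently.

```python
def parse_redundancy_states(text):
    """Collect 'key = value' lines from pasted CLI output into a dict.

    Keys are lower-cased; lines without '=' (prompts, blanks) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key.strip().lower()] = value.strip()
    return fields


def standby_is_hot(fields):
    """True only when the peer state line reports STANDBY HOT."""
    return "STANDBY HOT" in fields.get("peer state", "").upper()


# Sample taken from the output posted above.
SAMPLE = """\
my state = 13 -ACTIVE
peer state = 4 -STANDBY COLD
Mode = Duplex
Redundancy Mode (Operational) = Stateful Switchover
Redundancy Mode (Configured) = Stateful Switchover
Manual Swact = Disabled Reason: Progression in progress
"""

fields = parse_redundancy_states(SAMPLE)
print("peer state:", fields["peer state"])
print("standby hot:", standby_is_hot(fields))
```

The "Manual Swact" value is kept as one string, so the "Reason:" text is still available if you want to log why a manual switchover is disabled.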


Giuseppe Larosa Tue, 08/25/2009 - 12:07

Hello George,

During a maintenance window, try resetting the standby supervisor.


Something has gone wrong and it was stopped by the active supervisor.


If it is supported, try

sh redundancy history


and check the logs for communication errors between the two supervisors.


Hope to help

Giuseppe


jbrenesj Tue, 08/25/2009 - 12:48

Start planning ahead, because most likely the sup in slot 1 will fail diagnostics on the next reload. I have seen it a couple of times.

Do you remember whether slot 1 was formerly the active sup? It might have done a switchover. Run "more slavecrashinfo:data" to check whether there are any crash details for slot 1.

gkuzmowycz Tue, 08/25/2009 - 17:52

Thanks for the advice. Slot 1 was at one time the active sup. I don't recall when it switched over.


I will look at this in the morning. But we are on maintenance, so should we just open a ticket and RMA the Sup?


gkuzmowycz Tue, 08/25/2009 - 17:58

I meant to add somewhere in this thread, that even the Standby Cold Sup is semi-responsive. I can tftp to slavebootflash, etc. Don't know if this changes anybody's diagnosis.

gkuzmowycz Wed, 08/26/2009 - 04:45

OK, the "standby cold" Sup failed when I did "redundancy reload peer". So RMA time.


However, another update to a comment made up-thread. Reading the RPR/SSO doc very carefully, and reviewing configs for like the hundredth time, it seems that the switch in question is not quite as identical to its brother as I thought. The switch with the bad Sup has a ROM of


ROM: 12.1(12r)EW

Dagobah Revision 95, Swamp Revision 24


whereas its brother is


ROM: 12.2(20r)EW1

Dagobah Revision 226, Swamp Revision 31


According to the doc, "The minimum ROMMON requirement for running SSO is Cisco IOS Release 12.1(20r)EW1 or Cisco IOS Release 12.2(20r)EW1."


So it seems we could not run SSO. Can we upgrade the ROM? If so, how?
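
To make the comparison concrete, here is a rough Python sketch that checks a ROM string against the two minimums quoted from the doc. The function names and the ordering rule are my own assumptions: it compares only the number in parentheses and the trailing rebuild digit within the same 12.1 or 12.2 release, which may not match how Cisco actually orders ROMMON versions.

```python
import re

# Matches strings such as "12.1(12r)EW" or "12.2(20r)EW1".
ROM_RE = re.compile(r"(\d+)\.(\d+)\((\d+)r\)([A-Z]+)(\d*)")

# Minimums quoted from the doc: 12.1(20r)EW1 and 12.2(20r)EW1,
# stored as {release: (build, rebuild)}.
MIN_SSO = {(12, 1): (20, 1), (12, 2): (20, 1)}


def parse_rom(version):
    """Split a ROM version string into (release, (build, rebuild))."""
    m = ROM_RE.fullmatch(version.strip())
    if m is None:
        raise ValueError("unrecognized ROM version: %r" % version)
    major, minor, build, _train, rebuild = m.groups()
    # A missing rebuild digit (plain "EW") is treated as rebuild 0.
    return (int(major), int(minor)), (int(build), int(rebuild or 0))


def meets_sso_minimum(version):
    """True/False against the quoted minimum; None if the release is not listed."""
    release, level = parse_rom(version)
    minimum = MIN_SSO.get(release)
    if minimum is None:
        return None
    return level >= minimum


for rom in ("12.1(12r)EW", "12.2(20r)EW1"):
    print(rom, "meets quoted SSO minimum:", meets_sso_minimum(rom))
```

By this reading, the ROM on the switch with the bad Sup falls below the quoted minimum, while its brother sits exactly at it.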


