vPC query: possible to force "up" with a single switch?

Answered Question
Sep 18th, 2010

I believe it may not be possible, but I thought I'd ask around to see if anyone else had run into this. I had an interesting (understatement!) failure today. Our data center experienced a classic cascading power failure (a circuit failure shunts load to the "backup" or "secondary" circuit, overloading that circuit as well). The initial result was a fully down N5K that had been a member of a functioning vPC pair. The other N5K remained up. However, the data center operators, in their zeal to troubleshoot, apparently power-cycled both the down N5K AND the N5K that had initially stayed up, repeatedly.


The combo joy was a dead N5K (green lights, no boot, TAC case already opened for RMA) and the other N5K that booted without ever having a functioning vPC peer.


The end result was a production network completely down, even though one of the N5Ks was up and available. Since it couldn't talk to its peer, it wouldn't bring up any of the vPCs (uplinks or server links) or the peer-link. I poked around and didn't see any obvious way to "force" it to bring them up. I'll be opening a TAC case for that, just to be complete. I've also sent a query to my SE team.


Nevertheless, I wondered if anyone else had experienced this scenario and had come up with a solution. I was fortunate in that I had a spare N5K that I could configure identically to the failed unit and migrate the connections to; the vPCs all came up fine after that. When I first installed this system, I tested single-switch failures and that worked as expected (once the vPC is up, it survives a partner failure just fine). I just didn't consider the possibility of both switches rebooting, with one failing to power back up.


Thoughts?


Hagen

Correct Answer
Darren Ramsey Mon, 09/20/2010 - 08:16

Yes I have seen this in our lab environment.


http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&from=myNotification&bugId=CSCte95521


Looks like it's fixed in a pending release.



The only way I could make it work was to remove the "vpc xx" statement from all the port-channels until the dead 5K recovered.
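For anyone who has to do this across a lot of port-channels, here is a rough sketch of how the removal (and later restore) commands could be generated from a saved running-config. It is only an illustration: the function name `vpc_removal_commands` and the sample config are made up for this example, and it assumes the usual `no vpc <id>` negation form under `interface port-channel<N>`, so check the syntax against your NX-OS release before pasting anything into a switch. The `vpc peer-link` port-channel is deliberately left alone.

```python
import re


def vpc_removal_commands(running_config: str) -> tuple[list[str], list[str]]:
    """Return (remove, restore) command lists for every port-channel
    stanza that carries a 'vpc <id>' line. Hypothetical helper."""
    remove, restore = [], []
    current = None  # name of the port-channel stanza we are inside, if any
    for line in running_config.splitlines():
        header = re.match(r"^interface\s+(port-channel\s*\d+)\s*$", line, re.IGNORECASE)
        if header:
            current = header.group(1)
            continue
        if line and not line[0].isspace():
            current = None  # any other top-level line ends the stanza
            continue
        member = re.match(r"^\s+vpc\s+(\d+)\s*$", line)  # skips 'vpc peer-link'
        if current and member:
            vpc_id = member.group(1)
            # Assumed negation syntax: 'no vpc <id>'
            remove += [f"interface {current}", f"  no vpc {vpc_id}"]
            restore += [f"interface {current}", f"  vpc {vpc_id}"]
    return remove, restore


# Made-up sample config, for illustration only
SAMPLE = """\
interface port-channel1
  switchport mode trunk
  vpc peer-link

interface port-channel10
  switchport mode trunk
  vpc 10

interface Ethernet1/1
  channel-group 10 mode active
"""

remove_cmds, restore_cmds = vpc_removal_commands(SAMPLE)
print("\n".join(remove_cmds))
```

Keeping the `restore` list around makes it easy to put the `vpc` statements back once the dead 5K has recovered.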

amenh Mon, 09/20/2010 - 08:24

Darren, that's it exactly. I guess I'm pending the release of software! I'm hopeful it won't happen again, but nice to know I'm not crazy (at least this time). Dumping the vPC config popped into my head, but since I had the shelf spare N5K, I went for that "fix".


Thanks for the bug ID.


Hagen

Darren Ramsey Mon, 09/20/2010 - 08:41

This could be a scary situation for sure in a production data center environment with the exact conditions you described.


I'm hoping this will be fixed in 5.0(2)N1(1), and you'll also get some cool new features like FEX pre-provisioning, config-sync, and config rollback.
