Many of you might be upgrading to version 7.0(1)N1(1), as it is becoming a standard. Although it brings various features that might help, you need to take extra care during the overall NX-OS upgrade procedure for 7.0. If you upgrade to 7.0 from any version, there is a chance that your zone might be isolated and IP connectivity lost to everything behind the pair of N5Ks. We just experienced this.
We upgraded from 5.0(3)N2(1) to 7.0(1)N1(1).
First, after upgrading, we lost all FEX and FC configs on N5K-1. We found that this was due to bug CSCul22703. The workaround was to upgrade from 5.0 to 5.2(1)N1(6) or 7 first, and then upgrade to 7.0(1)N1(1). This drastically increases the overall time needed to upgrade the N5K boxes.
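To make the two-step workaround concrete, here is a minimal Python sketch that decides whether an intermediate hop is needed. The function names are made up for illustration, and the rule it encodes (anything older than 5.2(1)N1(6) goes through that release before 7.0) is only an assumption drawn from our experience above, not an official Cisco upgrade matrix. Comparing versions as plain integer tuples is also a simplification.

```python
import re

# Matches Nexus 5000 style version strings such as "5.0(3)N2(1)".
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)\)N(\d+)\((\d+)\)$")


def parse_version(text):
    """Split a version string like '5.0(3)N2(1)' into a tuple of ints."""
    match = VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def plan_upgrade(current, target="7.0(1)N1(1)", intermediate="5.2(1)N1(6)"):
    """Return the releases to install, in order (hypothetical helper)."""
    cur = parse_version(current)
    tgt = parse_version(target)
    mid = parse_version(intermediate)
    if cur >= tgt:
        return []  # already at or beyond the target
    # Assumption from this post: releases older than the intermediate one
    # should not jump straight to 7.0 (see bug CSCul22703).
    if tgt[:2] == (7, 0) and cur < mid:
        return [intermediate, target]
    return [target]


print(plan_upgrade("5.0(3)N2(1)"))
```

For our starting point, 5.0(3)N2(1), this gives two hops, which is exactly why the upgrade window grows. Check the release notes for your own platform before relying on any such rule.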
Our second issue was bizarre: we lost all IP connectivity to everything behind N5K-2 after N5K-1 came up with 7.0. We saw the N7K AGGR blocking STP instances for all VLANs behind the N5Ks. We thought that with vPC we would have no more STP issues, but that's not the case :)
We initially thought it was due to bridge assurance, since BA is enabled by default on 5.0 (port type network) and is disabled on 7.0, but that wasn't the case.
Losing connectivity to the 2nd switch:
This was due to bug CSCuo74024. Because of this bug, the STP BPDUs weren't forwarded from the 2nd switch to the 1st switch, since there was an issue with BPDUs over LACP-hashed interfaces. The STP instance blocked the layer 2 connection to all VLANs on the 2nd switch, since the N5K was trying to become the root for that VLAN (which is how it should work). During this time only the IP connectivity was disrupted. The SAN connection was still working (we checked various server logs too).
The workaround for this problem was to shut down one of the links in the LACP bundle from AGGR to N5K, so that the BPDUs don't go over the hashed LACP link.
We encountered a few other bugs during this upgrade:
VTP version goes from V2 to V1 when upgrading from 5.0 to 5.2. Minor; doesn't affect anything.
FCoE QoS config is not seen in the running config on 5.2. Minor; it is probably the default config and so isn't shown.
Strangely, we had a kernel panic on the first switch when downgrading from 7.0 to 5.0. It happened just once; we ran downgrades many times but couldn't recreate the issue. With this kernel panic the SFP microcode was lost (and defaulted to V0.0.0.0). Because of this, all the SFPs in module 1 went down (vPC peer link, uplinks, FEX links, etc.). Cisco AS said it was due to bug CSCuo46284. The workaround was to upgrade the switch to 7.0 again, since the SFP uC package is built into the 7.0 image.
Please make sure you have everything checked before you upgrade to 7.0. If you have a test lab, please test this there before putting it into production.
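One check that would have caught our first problem quickly is comparing a saved copy of the running config from before the upgrade with one taken after it. The Python sketch below is only an illustration: the function name, the keyword list and the sample config text are made up, and it compares lines flatly without understanding the config hierarchy.

```python
def lost_lines(before, after, keywords=("fex", "fc", "vsan")):
    """List keyword-matching config lines present before but missing after."""
    after_lines = {line.strip() for line in after.splitlines()}
    lost = []
    for raw in before.splitlines():
        line = raw.strip()
        if not line or line in after_lines:
            continue  # blank, or still present after the upgrade
        if any(word in line.lower() for word in keywords):
            lost.append(line)
    return lost


# Illustrative snippets only, not taken from our switches.
before_cfg = """feature fcoe
fex 100
interface fc2/1
interface Ethernet1/1
"""
after_cfg = """feature fcoe
interface Ethernet1/1
"""

for line in lost_lines(before_cfg, after_cfg):
    print("missing after upgrade:", line)
```

In practice you would feed it the two saved config files instead of the inline strings. It only flags lines for a human to look at; it does not restore anything.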