I have a situation with a UCS implementation to a Brocade SAN fabric:
B200 M2 blades w/Palo (M81KR)
Brocade 5300 switches with NPIV enabled
ESXi 5 update 1
2 vHBAs per host; one to each fabric; using default VSAN1
EMC CX-480 array
After the initial UCS configuration and enabling NPIV on the Brocade ports, we provisioned service profiles (using WWNN/WWPN pools) to the blades, installed ESXi 5, and configured ESXi networking/storage. Once the hosts logged into the Brocade fabric, the SAN admin zoned the LUNs to each host's WWPNs and the LUNs appeared on the ESXi hosts.
Everything was working great until I made some UCS LAN configuration changes (related to QoS and MTU). The MTU changes triggered a reset of the service profiles and ultimately a reboot of the ESXi hosts. When they came back, the LUNs had disappeared. The ESXi HBAs show no paths or LUNs after a rescan. The UCS storage uplink ports show active and have no errors. We reviewed the Brocade side and the fabric switches do see the WWPNs. However, when we drill down, there is no driver/HBA type associated with the WWPNs (as there is for other physical hosts, e.g. those with QLogic HBAs). I'm not privy to the SAN switch configuration, so this is just an observation made by the SAN admin.
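For anyone wanting to cross-check what the fabric sees against what UCS assigned, one option is to diff the two WWPN lists. This is only a minimal Python sketch, assuming the WWPNs from the service profiles and from the switch name server output are pasted in by hand. The function names and the sample WWPN values are hypothetical and not taken from this environment.

```python
import re


def normalize_wwpn(raw: str) -> str:
    """Return a WWPN as 16 lowercase hex digits in colon-separated pairs."""
    # Strip separators (colons, dashes, spaces) so different notations compare equal.
    digits = re.sub(r"[^0-9a-fA-F]", "", raw).lower()
    if len(digits) != 16:
        raise ValueError(f"not a 64-bit WWPN: {raw!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 16, 2))


def compare_wwpns(expected, seen):
    """Split WWPNs into (logged in, missing from fabric, unexpected on fabric)."""
    exp = {normalize_wwpn(w) for w in expected}
    got = {normalize_wwpn(w) for w in seen}
    return sorted(exp & got), sorted(exp - got), sorted(got - exp)


if __name__ == "__main__":
    # Hypothetical values: WWPNs assigned by UCS vs. WWPNs reported by the fabric.
    ucs_wwpns = ["20:00:00:25:B5:0A:00:01", "20:00:00:25:B5:0B:00:01"]
    fabric_wwpns = ["20000025b50a0001"]
    logged_in, missing, unexpected = compare_wwpns(ucs_wwpns, fabric_wwpns)
    print("logged in:", logged_in)
    print("missing from fabric:", missing)
    print("unexpected on fabric:", unexpected)
```

Anything in the "missing" list never completed a fabric login, which points at the UCS/uplink side. If every WWPN is logged in but the LUNs are still gone, zoning or the host driver is the more likely suspect.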
I am having the SAN admin confirm NPIV on the ports using portCfgShow. Also, I'm using one global VSAN/FCoE VLAN for both fabrics since Brocade doesn't understand the VSAN concept, but I'm wondering whether I should create a VSAN for each fabric using different FCoE VLANs, as we would in an MDS configuration. It was working with just the one global VSAN/FCoE VLAN, but that's the only thing I can think of from a UCS configuration point of view. In my opinion, this looks like a fabric issue.
Would love to hear any thoughts and thanks in advance!!
Thanks for the responses and sorry for not replying. The loss of LUNs was resolved by upgrading the Cisco VIC driver for ESXi to the latest 188.8.131.52 version. According to TAC, VMware published an unvalidated driver with ESXi 5 update 1. All LUNs are now visible even across reboots and service profile resets.
However, we upgraded UCS to 2.0(3b) during testing, and a known 'cosmetic' FLOGI error (CSCtz88841), which was supposedly fixed in 2.0(2m), has resurfaced. In our testing, we validated that it is truly cosmetic and random (it appears and disappears across reboots); all storage paths remain active and the LUNs are presented to the hosts without incident.