Has anyone out there implemented FCoE multihop on the new 2.1 firmware? I moved my lab UCS over to it and storage performance tanked. Here is what I have in my lab.
2 Nexus 5500s - used for both LAN and FC switching
EMC CX4-120 - connected to the 2 Nexus 5500s
UCS 6120s with 2104 IOMs - pre-FCoE setup was a 2-port FC SAN port-channel in each fabric to the Nexus 5ks.
6 B200 M1 servers
I cabled up an additional twinax from Fabric Interconnect A to the Nexus 5548 in FC Fabric A
I cabled up an additional twinax from Fabric Interconnect B to the Nexus 5548 in FC Fabric B
Configured the interfaces as FCoE uplink in UCS
Placed the FCoE interfaces in the appropriate VSAN
One fabric at a time, I disabled the FC SAN port-channel to force the vHBAs to log in over the FCoE uplink. I didn't have any issues with this, and the vHBAs showed up in the flogi table of the Nexus 5ks on the correct vfc interface.
Rebooted my ESXi hosts (boot from SAN) and the hosts rebooted fine using the FCoE uplinks to the Nexus 5ks.
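In case it helps anyone trying the same thing, the checks I ran on each Nexus 5k to confirm the logins and the vfc state were roughly these (the VSAN and vfc numbers are just placeholders from my lab, not anything meaningful):

  show flogi database vsan 10    <- vHBA pWWNs should show up against the vfc tied to the FCoE uplink
  show interface vfc 110         <- should be up and trunking the right VSAN, bound to the FCoE uplink from the FI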
When I powered on a VM, it took over 20 minutes to boot and it never got to the point where it was usable.
No drops or errors on the FCoE or vfc interfaces of the Fabric Interconnects and Nexus 5ks.
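To rule that out I was looking at counters along these lines on the 5k uplink ports (interface numbers are just examples from my lab):

  show interface ethernet 1/5 counters errors
  show queuing interface ethernet 1/5
  show interface ethernet 1/5 priority-flow-control

The priority-flow-control output is worth watching too, since PFC pauses wouldn't necessarily show up as drops.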
As soon as I re-enabled the FC SAN port-channels (4 Gb SFPs) and disabled the FCoE uplinks, performance was fine again.
I am hoping this has something to do with the older Gen 1 hardware (6100-series FIs and 2100-series IOMs) and not a bug.
I have 2 other HP rack-mount servers with QLogic CNAs doing FCoE to the same VMFS LUNs with no issues.
Case ID: 624922297
Tech-support files from both Nexus switches and from UCSM are attached.
I figured out my performance issue with using multi-hop FCoE from UCS to Nexus 5k.
We have our UCS QoS System Classes changed from the default of Best Effort 50% and FC 50%. We are using the other classes to put some limits around traffic like vMotion and to give VM network traffic and NFS traffic guaranteed percentages.
On the Nexus 5k side we had the default QoS of 50/50 configured.
To test, I reset UCS back to the defaults and my performance issue went away. The mismatch in QoS between UCS and the Nexus 5k was causing the issue.
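If you want to compare the two sides, the active classes on the Nexus 5k can be dumped with something like the commands below; I believe the same show commands work from the Fabric Interconnect's NX-OS shell (connect nxos), although I mostly checked the UCS side in the UCSM GUI:

  show policy-map system type network-qos    <- which classes are no-drop and their MTU
  show policy-map system type queuing        <- bandwidth percentages per class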
I then went back and enabled one QoS system class at a time to see which one was causing the issue. When I enabled the Platinum system class, I immediately started noticing a performance issue accessing storage. The Platinum class was being used for IP storage and has no-drop configured.
I can't find anywhere that this is documented, but it looks like having no-drop configured on 2 different QoS groups at the same time causes issues.
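I didn't go down this path, but my understanding is that if you really wanted a second no-drop class you would also have to define a matching no-drop class on the Nexus 5k side so the PFC/no-drop behaviour lines up end to end. A rough sketch of what that looks like on a 5k is below - the names, CoS value, and qos-group are made up for illustration, so check the Nexus 5000 QoS configuration guide rather than trusting this verbatim:

  class-map type qos match-all IP-STORAGE
    match cos 5
  policy-map type qos CLASSIFY-IN
    class IP-STORAGE
      set qos-group 2
  class-map type network-qos IP-STORAGE-NQ
    match qos-group 2
  policy-map type network-qos SYSTEM-NQ
    class type network-qos IP-STORAGE-NQ
      pause no-drop
    class type network-qos class-fcoe
      pause no-drop
      mtu 2158
    class type network-qos class-default
  system qos
    service-policy type qos input CLASSIFY-IN
    service-policy type network-qos SYSTEM-NQ

In my case the simpler fix was just not to run a second no-drop class in UCS.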
Here is a screenshot of the QoS system class configuration that works with FCoE:
vMotion is mapped to Best Effort
ESXi Mgmt is mapped to Silver
VM traffic is mapped to Gold
I don't really need a QoS policy or vNICs for IP storage, but if I did, I would enable packet drop on Platinum and map the NFS/iSCSI vNICs to Platinum.