We went through a firmware upgrade on our UCS environment from 2.1.1e to 2.1.2c on Thursday. The updates appeared to go well, but almost immediately, we've been seeing storage performance issues on 5 of our 9 blades (ESXi 5.1U1 servers). The issue seems to be isolated to the HBA on fabric A for the 5 hosts in question. The HBA for fabric B seems OK.
I'm hoping for some tips on how to troubleshoot this from the UCS side. I've been able to verify that everything is unchanged and appears fine everywhere in the chain from the blades through to the storage array, but I just don't have solid tools for troubleshooting an issue like this.
We've got 2x 6248UP talking to a 5548UP, which talks to a VNX5500 via FC. Our 9 blades (B200M3) are spread across 3 chassis (3 per chassis). Blades with problems are in all three chassis, so the issue isn't isolated to a specific chassis. I have a VSAN for each fabric, each connected to an 8 Gb FC port on each SP (i.e. VSAN 200 talks to port 0 on SPA and SPB, and VSAN 201 talks to port 1 on SPA and SPB). Each FI uses a 2-port port channel for FCoE traffic to the 5548.
I appear to have only two symptoms that are visible to me:
1) ScsiDeviceIO failures in ESXi logs. On affected systems, these are happening several times/second. It appears that IO eventually goes through, but performance is degraded.
2) PowerPath reports path failures in the ESXi logs and in the error counters shown by rpowermt.
I'm able to put the HBA for fabric A into standby mode using PowerPath to force all IO to fabric B, and the issues appear to clear (no ScsiDeviceIO errors, no path failures), so we're still functional.
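To put numbers on symptom 1, a rough Python sketch like the one below could tally the ScsiDeviceIO failures by device and status codes. It assumes the failure lines in vmkernel.log follow the common `... to dev "naa...." failed H:0x.. D:0x.. P:0x..` wording; the regex and the `tally_failures` helper are illustrative, not something taken from our hosts, so adjust them if the lines look different. As I understand it, H is the host (initiator-side) status, D the device status and P the plugin status, so failures dominated by non-zero H codes would tend to point at the HBA or fabric rather than the array.

```python
import re
from collections import Counter

# Assumed wording of a vmkernel.log failure line; adjust if your build differs.
FAILURE_RE = re.compile(
    r'ScsiDeviceIO:.*?to dev "(?P<dev>[^"]+)" failed '
    r'H:(?P<host>0x[0-9a-fA-F]+) '
    r'D:(?P<device>0x[0-9a-fA-F]+) '
    r'P:(?P<plugin>0x[0-9a-fA-F]+)'
)


def tally_failures(lines):
    """Count ScsiDeviceIO failures per (device, H, D, P) status tuple."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            key = (
                match["dev"],
                match["host"].lower(),
                match["device"].lower(),
                match["plugin"].lower(),
            )
            counts[key] += 1
    return counts
```

Feeding it the lines of a copied vmkernel.log (for example `tally_failures(open("vmkernel.log"))`) and printing `most_common()` on the result should show which LUNs and status codes dominate on an affected host.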
There are no errors in UCS manager, nothing visible on the storage array or 5548 switch.
I will likely be opening a support case for this issue, but ahead of or alongside that, can anyone provide some feedback on how to troubleshoot conditions where FCoE HBAs or connections appear to be having issues but aren't completely down? I'd like to strengthen my knowledge in this area, as it's my weakest in managing our environment and I don't like being in a position where I'm unable to help myself.