We have a perplexing problem with our EMC VNX storage, Cisco MDS 9148 switches, Emulex HBAs, and SPARC T5 / Solaris 11 configuration, and we don't know where to look.
When we provision a storage group from the VNX to a two-node cluster -- without any clustering software installed -- we observe that "format" takes over 5 minutes to enumerate 140 LUNs / 1200 paths. The problem bounces back and forth between the two servers: the same command takes 4 seconds on the other node.
When we put Veritas SFRAC on, the problem gets worse. It causes all sorts of issues: slow startup, booting, server shutdown, and disk formatting.
It must be something simple. We've tested numerous scenarios trying to isolate the problem and have engaged every vendor we have contact with.
I've tried forcing all interfaces to 8 Gb, but that didn't help either. The connect speed negotiates to 8 Gb when the interfaces are left at the default auto setting.
I'd appreciate any suggestions you may have.
You seem to have a serious performance problem, correct?
- Is the VNX active/standby per LUN?
- How is the VNX connected to the MDS?
- How many paths does the server have to each LUN?
- What kind of FC multipathing software are you using?
This is a new installation, and the issue seems widespread. The slow "format" problem is only an indication that something is wrong before we install any multipathing or clustering software. If we do install those, they work only marginally.
The VNX is in ALUA (failover mode 4). Each LUN has 8 paths: 2 switches, 2 Emulex ports from two HBAs.
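As a sanity check on the numbers, here is a back-of-the-envelope sketch (hypothetical helper functions, not any vendor tool): with 2 HBA ports and 4 FE ports zoned per HBA port, each LUN should present 8 paths, and 140 LUNs works out to about 1120 device paths, consistent with the ~1200 reported.

```python
# Hypothetical path arithmetic for this topology (not a vendor tool).
# Each HBA port sees every FE port it is zoned to as a separate path.

def paths_per_lun(hba_ports: int, fe_ports_per_hba: int) -> int:
    return hba_ports * fe_ports_per_hba

def total_device_paths(luns: int, hba_ports: int, fe_ports_per_hba: int) -> int:
    return luns * paths_per_lun(hba_ports, fe_ports_per_hba)

print(paths_per_lun(2, 4))            # 8 paths per LUN, as observed
print(total_device_paths(140, 2, 4))  # 1120 device paths, in line with ~1200
```

This is why a single-path baseline (suggested later in the thread) shrinks the discovery workload by a factor of eight.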
Can you please post a diagram of the setup?
Have you tested the performance with just one path? (Could it be path thrashing?)
Certain hosts such as Windows, Solaris and AIX will require the system to rediscover their disks in order for ALUA to be enabled. It is recommended that the system be rebooted once the change is made.
We rebooted many times trying to fix this. I reduced the paths to 4 and the problem remained. We uninstalled PowerPath, and it didn't help. We also tried without SFRAC 6.1.1; that didn't help either.
When we have the "format" problem on either node, the trick is to take the node out of the storage group and add it back in. Then, at some point later, the problem comes back.
- Is it correct that you have a dual fabric, with each MDS connecting the host as well as the storage?
- And the storage is dual-homed to both MDS switches? If yes, why?
- Do you really believe that multipathing with 8 links is necessary? MP sometimes creates more problems than it solves.
- I repeat: do a baseline with one link, eliminating MP!
In summary: I am 100% convinced that this problem has nothing to do with the MDS; it is related to the storage controller and/or the host and/or MP (EMC PowerPath).
The diagram goes something like this:

SPA/SPB --- FE 0a,2a <----> 9148 (A) <----> HBA0
        --- FE 0b,2b
SPA/SPB --- FE 3a,5a <----> 9148 (B) <----> HBA1
        --- FE 3b,5b
Even-numbered FE ports go to switch A, odd ports go to switch B. This server has a lot of storage, so we want more FE ports for IOPS.
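The even/odd rule above can be sketched as a small helper (hypothetical function and names; I'm assuming FE port labels of the form number-plus-SP-letter, e.g. "0a" = port 0 on SPA, as in the diagram):

```python
# Sketch of the even/odd FE-port-to-fabric rule described above.
# Port labels are assumed to be <port number><SP letter>, e.g. "0a", "3b".

def fabric_for_fe_port(port: str) -> str:
    number = int(port[:-1])  # "0a" -> 0, "3b" -> 3
    return "switchA" if number % 2 == 0 else "switchB"

for p in ["0a", "2a", "3a", "5a", "0b", "2b", "3b", "5b"]:
    print(p, "->", fabric_for_fe_port(p))
```

Under this rule each switch carries two FE ports from each SP, which is how every LUN keeps a path to both SPs even if one whole fabric goes down.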
When I turn off all SAN ports, the servers run "format" and it returns quickly.
When "format" runs slow, a DTrace user-stack trace shows very low counts for various function calls:
CPU ID FUNCTION:NAME
326 83159 :tick-5sec
In this case, it takes about 5 minutes to show 140 LUNs with 1200 paths.
The other node took 4 seconds to return, and the count was not 66 but somewhere around 6000.
Thanks for the suggestion. I will try stripping down to just one path.
Update: Symantec gave us the following to try in /etc/system and the situation improves somewhat.
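For reference, the /etc/system tunable in question is quoted later in the thread; as a fragment it would look like this (comment mine):

```
* Disable ssd driver retries on SCSI reservation conflict (per Symantec)
set ssd:ssd_retry_on_reservation_conflict=0x0
```

Changes to /etc/system take effect only after a reboot.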
If I understand you correctly, it means that controller A is dual-homed to MDS fabrics A and B, and the same applies to controller B.
My understanding of ALUA is that you mix optimized and non-optimized paths to a LUN.
I would therefore try yet another setup:
SPA --- FE 0a,2a,3a,5a <----> 9148 (A) <----> HBA0
SPB --- FE 0b,2b,3b,5b <-----> 9148 (B) <-----> HBA1
No, SPA's FE ports shouldn't all be connected to switch A, because that wouldn't provide redundancy. There are 4 paths from each SP to the two switches: even-numbered ports go to switch A and odd-numbered ports to switch B. The LUNs are spread half and half between the two SPs.
I too am convinced that something is amiss in the setup before we load any clustering or MP software. We got rid of PowerPath because ALUA/ASL/DMP can spread I/O among optimized and non-optimized paths.
With the setting "set ssd:ssd_retry_on_reservation_conflict=0x0" that Symantec suggested, the situation has improved somewhat and my colleague is happy. This takes the wind out of my sails a bit, but I'll continue to look because I want to get to the root of the matter.
I was aware of the redundant design; by the way, have a look at the Cisco Validated Design for UCS and NetApp, which in Fig. 2 shows exactly your setup (it's N5k instead of MDS).