We currently have two pairs of Brocade 48000 Directors at each site that are filling up very rapidly. We also have two pairs of Brocade 3800s with ISLs over dark fibre. We have purchased some 8-Port Enhanced Data Muxponder cards for the Cisco ONS 15454 Multiservice Transport Platform (MSTP) for SAN extension, as we are going to need a lot more bandwidth: we are looking at using VTLs at each site to back up the other site, and we are also doing geographically dispersed clusters (geo clusters), which require synchronous TrueCopy between the HDS USPs. The plan is to move all the ISLs to the 48000s and retire the 3800s.
The current Brocade switches log endless zero-buffer-credit messages (about 300,000 per hour), but there is no apparent way to fix this: they only have 29 buffer credits available per ISL, and the distance between the sites is small enough that 29 should be plenty. In theory we only need 20, yet the switches exhaust all 29 constantly. I don't even know if this is a problem.
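For what it's worth, the "in theory we only need 20" figure can be sanity-checked with the usual back-of-the-envelope credit sizing: enough credits to cover the round-trip time of the link at full frame rate. This is a sketch, not a vendor formula; the ~5 µs/km propagation delay, 2148-byte full-size frame, and 8b/10b encoding (10 baud per byte) are my assumptions, and the 20 km / 2 Gbps inputs are just illustrative.

```python
import math

def bb_credits_needed(distance_km: float, line_rate_gbaud: float,
                      frame_bytes: int = 2148) -> int:
    """Minimum BB credits to keep a link streaming at full rate.

    Rule of thumb: you need one outstanding credit per full-size frame
    'in flight' during the round trip (frame out + R_RDY back).
    """
    round_trip_s = 2 * distance_km * 5e-6          # ~5 us/km in fibre, both ways
    # 8b/10b line encoding: 10 baud on the wire per byte of frame
    frame_time_s = frame_bytes * 10 / (line_rate_gbaud * 1e9)
    return math.ceil(round_trip_s / frame_time_s)

# 2 Gbps FC is 2.125 Gbaud on the wire; a 20 km link works out to ~20 credits
print(bb_credits_needed(20, 2.125))   # -> 20
```

So 29 credits comfortably covers a sub-10 km link; the zero-credit counts would have to be coming from something other than raw distance.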
I am looking at proposing a Cisco solution to handle the geo clusters and the increased number of ISLs for the backups. I seriously doubt my work will look at the director class, as they think they have spent enough on the Brocade switches.
I am interested in knowing whether anyone thinks something like the 9216i would be a better fit with the ONS 15454 MSTP, since both are Cisco products. In particular, the buffer-to-buffer credits seem to be much higher on the MDS than on the Brocade; then again, if there were real issues with the Brocade, surely they would have addressed them by now.
The reason I see the zero buffer credits all the time is that I am using HDS Tuning Manager, and it reports on them for some reason.
Our HDS systems have spare ports, which means we could take those and the connections to the geo cluster hosts and run them through a 9216 in each datacenter. That's enough redundancy with four switches. The ONS seems to understand the Cisco B2B credits.
I find it strange that there is no Brocade forum site, or at least not one that I can find.
This HDS Tuning Manager application... is it able to see/query "no BB credit available" in the Brocade switches? I would think not; rather, it is seeing them from where it sits as an F port, not seeing the E port link, since class 3 Fibre Channel flow control is point-to-point. I don't know for sure, but that is my guess. I would make sure the zero-BB-credit events are referencing the F(L) port and not the E port.

So, yes, the 16-port blade can do 255 BB credits on a port, and the newer 12-, 24-, and 48-port blades can do more than that. But to design this, I would look at how much traffic you are pushing. Am I right to estimate this is a 10-20 km distance? Over optical, the SSM doing local-ack/write acceleration might get you some performance for HDS TrueCopy, but a cost-benefit analysis would need to be done.

I can't speak for the Brocade exactly, since it is not my equipment, but I believe it is still a pass-through device. Since the MDS is a store-and-forward device, we have the capability to buffer frames, which would reduce your chances of hitting zero-BB-credit conditions. I would also consider how much you are going to need 2G versus 4G, now and in the future, as you look at how to position the hardware for this design.
All Tuning Manager presents is Buffer Credit Zero State and Input Buffer Full. The latter never shows anything other than zero, so that's got to be a good thing. It knows the particular ports are E ports, and they are the only ones complaining about buffer credit zero. I would like to know the difference, if any, between the BB credits on the Brocade versus the Cisco equipment. It rather surprises me that the maximum I can get is 29 on a port on the Brocade. You can get more, but that is another licence, and as the link is less than 10 km, you shouldn't need it.
Basically what tuning manager is showing is:
tim_txcrd_z: Amount of time that frame transmission is blocked by a transmit credit of zero. This is sampled once every 348 clocks of the 106.25 MHz clock, and the counter is incremented by 1 if the condition is true.
I am seeing hundreds of thousands of these tim_txcrd_z increments every hour.
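The counter definition quoted above also lets you translate raw counts into actual blocked time: each increment represents one sample interval of 348 clocks at 106.25 MHz (about 3.3 µs). A quick sketch, using the 300,000-per-hour figure from earlier in the thread as an illustrative input:

```python
# Convert tim_txcrd_z counts into time spent at zero Tx credits,
# per the counter definition: sampled once every 348 clocks at 106.25 MHz.

SAMPLE_PERIOD_S = 348 / 106.25e6     # ~3.28 microseconds per sample

def blocked_fraction(counts: int, window_s: float = 3600.0) -> float:
    """Fraction of the window the port spent blocked on zero Tx credits.

    Treats each count as one full sample interval of blocking, so this
    is an estimate, not an exact measurement.
    """
    return counts * SAMPLE_PERIOD_S / window_s

# 300,000 counts in an hour is under a second of total blocked time:
pct = blocked_fraction(300_000) * 100
print(f"{pct:.3f}% of the hour at zero Tx credit")   # -> 0.027%
```

Seen that way, hundreds of thousands of counts per hour may sound alarming but amounts to a tiny fraction of the hour actually spent blocked.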
The clustering solutions need the TrueCopy synchronous writes to complete without any issues, and we have to meet a specific response time. We are running some clusters with sync writes now, but they are small compared to the new ones we are looking at doing.
I am interested in knowing if the ONS and MDS are a better option. I am much more familiar with the MDS than the Brocade.
On a Cisco board, you are always going to be told that the ONS and MDS are a better option... :) I would think the MDS is a better option not only because it is stronger on the BB credits you suggest are the core of the issue, but because it gives you more tools to do your work. The MDS has a simple show command, and you would catch a zero-BB-credit situation if it occurred. You have more config options to make the hardware do what you want, e.g. moving BB credits around. You have four interop modes. You get VSANs to do what you want. You can turn on Performance Monitor for free for 119 days to test your E ports. I don't want to do a marketing list here, but the point is that there are more things you can do with the product so the business can get work done.
I am still guessing that your tool is looking at the F port link and not the E port link. Not to say I know exactly what it is reporting, but flow control is point-to-point, so unless your device is one end of the E port, it really doesn't know what's going on beyond the fact that it has no credit to send from its own perspective on the world. If the E port has 29 credits and the F port on the BRCD has, say, 7, then once the array sends 7 frames with no R_RDY back, the array thinks it has zero BB Tx credits. In that example the credit count still sits at 22 on the E port. This assumes the HDS is using class 3 FCP.
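That per-hop accounting can be sketched as a toy model. This is purely illustrative, using the hypothetical numbers from the example above (7 F-port credits, 29 E-port credits), not real switch behaviour:

```python
# Toy model of point-to-point BB-credit flow control: each hop keeps its
# own credit count, so the array's F port can hit zero while the E port
# (the ISL) still has credits to spare.

class Link:
    """One flow-controlled hop, tracked from the sender's side."""

    def __init__(self, credits: int):
        self.credits = credits          # sender's remaining Tx credits

    def send_frame(self) -> bool:
        """Consume one credit; returns False if blocked at zero."""
        if self.credits == 0:
            return False
        self.credits -= 1
        return True

    def r_rdy(self) -> None:
        """Receiver hands a credit back."""
        self.credits += 1

f_port = Link(credits=7)    # array <-> switch (the F port hop)
e_port = Link(credits=29)   # switch <-> switch (the E port / ISL hop)

# Array tries to burst 10 frames while the switch is slow returning R_RDY:
sent = sum(f_port.send_frame() for _ in range(10))
# The switch forwards the frames it received onto the ISL:
for _ in range(sent):
    e_port.send_frame()

print(sent, f_port.credits, e_port.credits)   # 7 frames sent; F port at 0, E port at 22
```

The array sees "zero BB credit" on its F port after 7 frames even though the ISL still has 22 credits available, which is exactly why a counter read from the array side says little about the E port.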
Also, if your tuning device is able to see out into the fabric and knows a zero-BB-credit event is occurring, it may not be on the E link. It might be on the other side of the E port, as traffic goes out to the other array doing replication. 29 BB credits should definitely be enough on a link of less than 10 km. So, if you are writing a lot of data to the other side, maybe it's the other array that can't keep up.
I tend to agree with you that something is amiss in the middle or at the other datacenter. Most of the zero BB credits are reported from one datacenter, which is the initiator of most of the traffic. All the F ports are good and are not reporting any issues.
I raised my idea of using something like a 9216A with the ONS so I could be assured of seeing and perhaps diagnosing the problems. I only wanted four of them, and after my boss spent 4 million on new storage that is going to use the SAN extension, he is still not convinced that Cisco is better. He just said they were too expensive. When it all goes pear-shaped in the future, I will have my design ready to go.
I have TrueCopy in the lab and have seen B2B credit transitions to zero on the MDS switch port connected to the HDS in the hundreds of thousands. Even while these transitions to zero were climbing rapidly, the HDS was still able to achieve 123 MB/sec throughput across a pair of MDS switches (no distance), which is exactly the same throughput it got when the two HDS arrays were cabled back to back. So if you're seeing the Tx credit hit zero on an F port, I would not worry about it.
Thanks for the response. What I am seeing is on an E port, not an F port. Once we get the ONS configured, we will be able to set up trunking ISLs, and hopefully that will sort something out. I am still trying to convince management to leave the Brocades for internal datacenter use and use MDS switches for all inter-datacenter networking, especially with the large increase in backups to be done across the links.