UCS Switching Questions

Answered Question
Jul 22nd, 2010

I'm a little concerned as to why 2:1 oversubscription is the best possible ratio between the servers and the Fabric Extenders (e.g., 8 x 10G interfaces from the blades, with 4 x 10G uplink ports). If there is no QoS there (b/c there is no switching), then what happens if those uplinks get congested? It seems there is no way to prioritize the traffic until the Fabric Interconnect.

Overall Rating: 5 (3 ratings)
Jeremy Waldrop Fri, 07/23/2010 - 04:18

There are QoS and rate-limiting policies in UCS that can be applied to the vNICs and vHBAs defined in a Service Profile. If this UCS is for VMware, you can also implement the Nexus 1000v and do QoS at the VM veth level.
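
Roughly speaking (just an illustration of the knobs involved, not actual UCS Manager syntax), a QoS policy attached to a vNIC boils down to something like this:

```python
# Rough sketch of what a UCS QoS / rate-limiting policy amounts to when it is
# referenced by a vNIC or vHBA in a Service Profile. Field names and values
# here are illustrative, not the actual UCS Manager object model.
qos_policy = {
    "name": "vm-gold",        # hypothetical policy name
    "priority": "gold",       # system class: best-effort, bronze, silver, gold, platinum, fc
    "rate": "line-rate",      # or an explicit cap in kbps, e.g. 5_000_000 for ~5 Gb/s
    "burst": 10240,           # burst size in bytes for the rate limiter
    "host_control": "none",   # "full" lets the host (e.g. the 1000v) set CoS itself
}

# The Service Profile's vNIC simply references the policy by name.
vnic = {"name": "eth0", "fabric": "A", "qos_policy": qos_policy["name"]}
```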

lamav Fri, 07/23/2010 - 06:40

If you can't run line rate in this age of 10G Ethernet and low-latency (LL) requirements, then this solution is no solution.

lamav Fri, 07/23/2010 - 06:36

It's not a 2:1 oversubscription. It's 1:1.


The configuration you're describing is not supported and is considered incomplete by Cisco's UCS reference architecture.


Both FEXs and both 6100s need to be installed, no matter how many blades you have in place.


That would give you 8 downlinks and 8 uplinks, 1:1.

Correct Answer
Robert Burns Fri, 07/23/2010 - 22:36

Oversubscription between the Chassis & Interconnects is not 1:1...


The three options are 8:1, 4:1 & 2:1


Each blade has a single backplane link to each fabric.  Two links per server x 8 servers = 16 lossless FCoE links total.


To manage congestion, 802.1Qbb Priority Flow Control is used. Above & beyond that, if you think you're pushing anything close to line rate, you can implement the 1000v & QoS.


Robert

Aaron Dhiman Sat, 07/24/2010 - 09:30

Brilliant.  So, I imagine with 802.1Qbb, you could choose not to use Flow Control with any RTP traffic (running through a UCM MTP for instance) to eliminate jitter?


Also, what are the trade-offs of a 1000v implementation?  I imagine maybe more delay and jitter...  And is there extra licensing or anything?

Robert Burns Sat, 07/24/2010 - 17:40

Aaron,


There are not really any "noticeable" tradeoffs with adding on the 1000v.  Of course there will be a few CPU cycles allocated to run the VEM agent, but it's not going to slow down your overall system performance or impact network throughput at all.


To handle RTP traffic you can apply a "Gold/Platinum" level no-drop QoS policy to the traffic.  PFC gives you 8 queues to assign to various traffic levels, so you can ensure your real-time traffic is unaffected.  Cisco has already begun to deploy Unified Communications (Call Manager) on UCS, which has very strict latency & bandwidth requirements. The UCS-level QoS policies will assist with Blade -> Interconnect congestion, whereas the 1000v QoS will prioritize your VMs' traffic exiting the host.
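
To give a rough picture (purely illustrative - the CoS values shown are the UCS Manager defaults, and which Ethernet class you make no-drop is configurable), the mapping might look like:

```python
# Illustrative mapping of traffic onto UCS QoS system classes (not a config).
# CoS values are the UCS Manager defaults; "no_drop" means PFC (802.1Qbb)
# pauses that class under congestion instead of dropping frames. The no-drop
# choice for Ethernet classes is configurable; FC is always no-drop.
system_classes = {
    "platinum":    {"cos": 5,     "no_drop": True,  "example": "RTP / real-time media"},
    "gold":        {"cos": 4,     "no_drop": False, "example": "call signaling"},
    "silver":      {"cos": 2,     "no_drop": False, "example": "general VM traffic"},
    "bronze":      {"cos": 1,     "no_drop": False, "example": "backup / bulk"},
    "best-effort": {"cos": "any", "no_drop": False, "example": "everything else"},
    "fc":          {"cos": 3,     "no_drop": True,  "example": "FCoE storage"},
}

for name, cls in system_classes.items():
    behavior = "no-drop" if cls["no_drop"] else "drop-eligible"
    print(f"{name:12s} CoS {cls['cos']}: {behavior:13s} e.g. {cls['example']}")
```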


As far as licensing, I "believe" there's still a promotion which gives customers the 1000v when they purchase a UCS system - you'd have to discuss this with your account team to confirm.  Normally there is a per-socket license involved with all 1000v vSphere participating hosts.  If you have 4 dual-socket servers you would need 8 1000v licenses.  With these licenses you get access to each & every feature (except for the NAM - available on the Nexus 1010 only).  Keep in mind that using the 1000v distributed switch requires the VMware Enterprise Plus license for vCenter.  (Same requirement as activating the VMware vNetwork Distributed Switch.)


Robert

lamav Sat, 07/24/2010 - 10:54

Good correction. Rated.


You are right. I forgot that the outbound NIC traffic is active/active, not active/standby. In this age of 10G and LL networking, Cisco couldn't provide 1:1? In my opinion, QoS is a non-solution in a high-performance data center because the prioritization of one type of traffic is to the detriment of another. It's managed unfairness.


Why do you recommend the 1000v? How will it help push more traffic? By keeping local VM-VM traffic in the box instead of traversing the uplinks?


Thanks

Robert Burns Sat, 07/24/2010 - 17:56

I guess it comes down to requirements.  In any high-performance datacenter there's going to be oversubscription - you show me one that offers end-to-end line rate and I'll stand corrected.


Most server administrators feel that if their NIC supports 1G or 10G, then "by golly" they should have 1G/10G right to the core.  Almost all of Cisco's datacenter design practices incorporate some level of oversubscription.  Typically this ranges from 20:1 (Access -> Distribution) up to 4:1 (Distribution -> Core).  It just doesn't make practical sense to exceed these ratios in most circumstances to accommodate traffic patterns that occur less than 1% of the time.


That being said, if you REALLY REALLY want your 10G pipe from your UCS blade all the way to the Interconnect, then you can use Pin Groups and dedicate 1-2 links to your blade. In our experience the 2:1 max ratio up to a non-blocking cut-through switch like the UCS 6100 has been more than adequate.


So if you're trying to compare a 10G pass-through module on something like an HP c7000, you'd better ensure your upstream physical switch can handle the traffic at line rate.  Store-and-forward switches can't keep up with hardware-based cut-through switches to begin with.


Robert

lamav Sat, 07/24/2010 - 18:07

Robert, how about 1:1 to the....access layer?? Rack servers with 10G NICs plug into a 10G switchport that operates at line rate and....voila! So why not do this in a blade enclosure environment, especially when the UCS requires inter-blade traffic to traverse the uplinks to the ToR and come back down.


In the days of yesteryear, reaching line rate bandwidth may have occurred 1% of the time. In the age of virtualization sprawl, unified fabrics, SR-IOV and I/O consolidation, bandwidth requirements will exceed anything we have seen in the past. Trust me, Cisco's requirement for FCoE is going to be 40G in the next few months.


I do agree that not everyone has such stringent low latency requirements, so the 2:1, 4:1 and 8:1 oversubscription rates may suit them just fine. But for those who do have stringent LL requirements, like financials, the 1:1 should be an option.


And by the way, the Cisco Nexus 5K performed miserably in latency tests with unicast and multicast traffic compared to Arista, Blade or Extreme Networks. Even Dell's PowerConnect store-and-forward switches performed remarkably better than the 5K.


http://www.networkworld.com/reviews/2010/011810-ethernet-switch-test-latency.html

Robert Burns Sat, 07/24/2010 - 18:31

We're still in the first generation of UCS - give us time.  For the current 5108 Chassis there is only the 2104 FEX option, but if the market demands it I'm sure we'll release an 8-port version.  Yes, you're right, 40G is just around the corner, as is 100G.  These are in development now, and not far off.  As these standards continue to mature they will be incorporated into our products.  We have done a great deal of financials & banking implementations with UCS, and through the course of our proofs of concept, the current over-subscription rates have met their requirements.  I've seen few datacenter infrastructures that are capable of handling 10G to the access layer efficiently.  Wait until 40/100G creeps onto the scene - it's going to be very interesting.


As for the N5K performance - there are always going to be a hundred different reviews on which products excel in various tests & situations.  All I can offer is that it's important to do thorough testing prior to selecting any one technology for your environment.


Robert

lamav Sat, 07/24/2010 - 18:42

OK, Robert....I only expect the best from Cisco, which is why my criticism is harsh. Nice convo...


By the way, can you elaborate on your recommendation to deploy the 1000v?


Thank ya.

lamav Tue, 07/27/2010 - 19:12

Robert, I found this document from Brad Hedlund, a Cisco blogster, and I was right. The oversubscription rate is 1:1 in the UCS under normal circumstances.


http://bradhedlund.com/2010/03/02/the-folly-in-hp-vs-ucs-tolly/


Each of the 8 blades has a dual-port NIC, but only ONE port is active and the other is STANDBY. That means that the server is offering 10G of outbound traffic to the fabric extenders, and they provide 10G uplinks to the fabric interconnect. So, it's 1:1.


From a pure interface availability perspective, then, yes, there are 16 downlinks and 8 uplinks - 2:1 - but that is not how you measure oversubscription. For example, I can have ten 1-Gbps downlinks and one 10-Gbps uplink; is my oversubscription 10:1? Of course not, it's 1:1.


Victor

Robert Burns Tue, 07/27/2010 - 19:47

Victor,


Your statement is incorrect.


"Each of 8 blades has a dual port NIC, but only ONE port is active and the other is STANDBY."'


Each adapter is ACTIVE.  This gives each blade 2 x 10Gb/s connections.  In addition to both adapters being active, you can also enable "Fabric Failover" on first-generation CNAs (M71KR-E/Q & M81KR-VIC), which creates a "standby" vif on the opposite fabric interconnect.


Have a read through our official documentation:


http://www.cisco.com/en/US/docs/unified_computing/ucs/sw/gui/config/guide/GUI_Config_Guide_chapter1.html#concept_6FB0460E86CD44B1AE24892BBE365996


"The number of active links from the server to the fabric interconnect

Oversubscription is affected by how many servers are in a particular chassis and how bandwidth-intensive those servers are. The oversubscription ratio will be reduced if the servers which generate a large amount of traffic are not in the same chassis, but are shared between the chassis in the system. The number of cables between chassis and fabric interconnect determines the oversubscription ratio. For example, one cable results in 8:1 oversubscription, two cables result in 4:1 oversubscription, and four cables result in 2:1 oversubscription. The lower oversubscription ratio will give you higher performance, but is also more costly."
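
To make the arithmetic concrete, here's a quick back-of-the-envelope sketch (assuming the 8-blade 5108 chassis, with one 10G backplane link per blade to each 2104 FEX, as described above):

```python
# Back-of-the-envelope check of the ratios quoted above: each 2104 FEX sees
# 8 blades x 10 Gb/s of server-facing bandwidth, and 1, 2 or 4 x 10 Gb/s
# uplink cables to its fabric interconnect.
SERVER_LINKS_PER_FEX = 8
LINK_SPEED_GBPS = 10

def oversubscription(uplink_cables: int) -> float:
    """Server-facing bandwidth divided by uplink bandwidth, per FEX."""
    downstream = SERVER_LINKS_PER_FEX * LINK_SPEED_GBPS
    upstream = uplink_cables * LINK_SPEED_GBPS
    return downstream / upstream

for cables in (1, 2, 4):
    print(f"{cables} cable(s): {oversubscription(cables):.0f}:1")
# -> 1 cable(s): 8:1, 2 cable(s): 4:1, 4 cable(s): 2:1
```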


Regards,


Robert

lamav Tue, 07/27/2010 - 19:59

Robert....


Roberto....


lol


Did you read Brad's link? He is saying the opposite.....now, maybe he's wrong, but I would be shocked if he is, because he is supposed to be the Cisco UCS guru expert freak. Cisco even points to his blog as a reference....


Can you check out his link and comment on it?

Aaron Dhiman Tue, 07/27/2010 - 20:11

I would say that either way, I am still very impressed with UCS and am looking to deploy it soon.  It is quite revolutionary.  In reality, say you have four servers running on a single blade.  Are you ever going to average 5 Gbps consistently on all four servers to cause any sort of problem with 20 Gbps of vNIC bandwidth?  I don't think so, so this level of over-subscription does not concern me.  I do intend to activate all 8 connections to the Interconnects though.

Robert Burns Tue, 07/27/2010 - 20:20

Victor..


Victory..


Yes, I read Brad's blog.  He's not posting inaccurate info, but his topology also includes only a single vNIC per blade.  That being said, he's limiting the amount of bandwidth per blade by only using a single vNIC with failover, which is a valid topology.  2nd-generation Menlo cards will NOT support failover.  If you want any form of redundancy you would need two vNICs, one for each fabric.


So if you have an adapter that supports failover and you only configure a single vNIC, then you will only pump 10Gb/s to the blade, whereas you'll have 20Gb/s of bandwidth with a dual-adapter service profile.


This is just a difference of requirements.  Do you want 2 x 10Gb/s adapters with oversubscription, or a single adapter without?  90% of UCS customers are deploying dual adapters with their service profiles.


Robert

simon.geary Wed, 07/28/2010 - 06:55

An interesting thread, this - I just wanted to chip in with my own question about oversubscription that I have been wondering about. So a blade server has two 10-gig CNA ports (depending on card type). The operating system then thinks it sees two 10 GbE ports and also two 4 Gbps FC ports as well, correct? Does this not mean there is a theoretical maximum of 28 gig utilising the available 20 gig? Maybe I don't understand correctly, but it seems that there is already oversubscription to begin with right on the CNA, before traffic even hits the FEX. Is that right?

Robert Burns Wed, 07/28/2010 - 13:51

Simon,


That is correct.  The host OS will see two 10G Ethernet adapters and two Fibre Channel SCSI adapters.  As with any current CNA (not just mezzanine cards for blades), they have a maximum transmission of 10Gb/s which is "wide open" for Ethernet, but limited to 4Gb/s for FC traffic.  I would hazard a guess that 3rd-gen CNAs might incorporate the 8Gb/s Fibre Channel chips into the cards, but we'll have to wait and see.  When you open storage up to those rates, QoS & CoS would need to play a major role in managing your drop vs. no-drop traffic queues.  As you can tell from CNA vendors other than Cisco (Emulex & Qlogic), there is a solid understanding that a host under normal operation will not be hindered by a certain amount of oversubscription.  Understand that FCoE traffic is much more efficient than TCP (due to QCN & PFC for congestion), so you can rest assured the paths from your blades to your interconnects will experience less oversubscription than you might find in the regular Ethernet world.
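
And to put numbers on your 28-over-20 observation (a rough sketch, assuming a dual-port CNA presenting one 10GbE and one 4G FC interface per port):

```python
# Logical vs. physical bandwidth on a dual-port CNA. Each physical port is a
# single 10 Gb/s unified link carrying both Ethernet and FCoE-encapsulated FC;
# the OS simply sees separate Ethernet and FC adapters on top of it.
PORTS = 2
PHYSICAL_PER_PORT_GBPS = 10        # one 10 Gb/s unified fabric link per port
LOGICAL_PER_PORT_GBPS = 10 + 4     # 10GbE NIC + 4G FC HBA presented to the OS

physical = PORTS * PHYSICAL_PER_PORT_GBPS   # 20 Gb/s of real bandwidth
logical = PORTS * LOGICAL_PER_PORT_GBPS     # 28 Gb/s of advertised interfaces

print(f"{logical}G offered over {physical}G physical "
      f"-> {logical / physical:.1f}:1 at the CNA itself")
# -> 28G offered over 20G physical -> 1.4:1 at the CNA itself
```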


Robert
