Welcome to the Cisco Networking Professionals Ask the Expert conversation. This is an opportunity to discuss Cisco NX-OS with Mark Berly and Omar Sultan. Learn how this feature-rich OS delivers unprecedented levels of availability. Mark is a senior manager of product management in Cisco Systems' Data Center Business Unit. He is the product line manager for software and management products on Cisco's next-generation switching products. Mark is a recognized leader in Cisco and helped develop and manage critical programs such as the Financial Test Lab. As manager of the Financial Test Lab, he was responsible for leading an engineering team while assisting in designing and certifying some of the world's largest enterprise networks based on the Cisco Catalyst 6500 Series platform. In addition, he led an escalation team resolving customer critical issues. He also helped launch the widely recognized SafeHarbor testing program. Omar Sultan is the solution manager for data center switching in the Data Center solutions team. Omar has spent over 20 years in the IT industry and has developed a broad portfolio of experience including data center management, network operations, system engineering, field sales, and business systems analysis.
Remember to use the rating system to let Mark and Omar know if you have received an adequate response.
Mark and Omar might not be able to answer each question due to the volume expected during this event. Our moderators will post many of the unanswered questions in other discussion forums shortly after the event. This event lasts through March 21, 2008. Visit this forum often to view responses to your questions and the questions of other community members.
I'm finding it difficult to visualise the phrase "self-healing" - what does this mean in practical terms?
Self-healing refers to the ability of the operating system to rapidly detect and correct problems in such a way that they are transparent to the data flowing over the network. As an example, in NX-OS we can rapidly detect failed processes and statefully restart them, generally in milliseconds. For you, this means there is no need to reconverge the network; since traffic is forwarded in hardware, there is no impact to the services running over the network infrastructure.
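To picture the idea in code: here is a toy Python sketch of the pattern behind stateful restart, where a supervisor keeps a process's runtime state outside the process itself so a replacement instance can resume without relearning anything. All names and the data model here are illustrative only; this is not how NX-OS is implemented internally.

```python
# Toy illustration of stateful process restart: state is checkpointed
# outside the process, so a restarted instance picks up where the old
# one left off and peers never see a topology change.
class StatefulSupervisor:
    def __init__(self):
        self.state_store = {}  # state persisted outside the processes

    def checkpoint(self, proc_name, state):
        """A process periodically checkpoints its runtime state."""
        self.state_store[proc_name] = dict(state)

    def restart(self, proc_name):
        """On failure, hand the saved state to a fresh instance."""
        return dict(self.state_store.get(proc_name, {}))


sup = StatefulSupervisor()
sup.checkpoint("ospf", {"neighbors": ["10.0.0.2"], "lsdb_entries": 42})

# The process dies; the supervisor restarts it with its old state, so
# there is no need to re-form adjacencies or re-learn the topology.
recovered = sup.restart("ospf")
print(recovered["neighbors"])
```

The key point the sketch captures is that recovery speed comes from restoring state, not rebuilding it.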
Interesting - for an event such as this I would expect a trap of some sort. Are there new MIBs for these new events?
There are syslog messages that would alert you. As for a MIB, getting something defined is a long process. In NX-OS we have a well-defined XML API which will allow you to retrieve, as XML, anything you can get via show commands. I think this would be your better option to get more data from the box.
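As a rough illustration of what consuming XML-formatted show-command output looks like, here is a minimal Python sketch using the standard library's ElementTree parser. The XML sample below is simplified and made up for illustration; the actual NX-OS XML schema differs, but the parsing approach is the same.

```python
# Parse a (hypothetical, simplified) XML rendering of show-command
# output and pull out individual fields programmatically.
import xml.etree.ElementTree as ET

sample = """
<show_version>
  <chassis>Nexus 7000</chassis>
  <sys_version>4.0(1)</sys_version>
  <uptime_days>12</uptime_days>
</show_version>
"""

root = ET.fromstring(sample)
version = root.findtext("sys_version")
uptime = int(root.findtext("uptime_days"))
print(version, uptime)
```

Because the output is structured rather than screen-scraped CLI text, tooling built this way does not break when the human-readable formatting of a show command changes.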
What type of info would you be interested in seeing?
Is there a general trend towards using an XML API instead of MIBs, or is this a temporary solution until new MIBs are developed?
Our whole DC landscape is really geared towards a MIB-driven management approach, and I'm not sure how much extra effort would be required to start building XML-API-based tools.
With regard to this specific capability, stateful process restartability, I would like to understand what information you would like to see exposed via a MIB.
Well, I'm trying not to focus too much on one specific point of instrumentation, but that sounds like something I'd like to know about.
It may be indicative of something more insidious, or it may not, but I'd like a trap so that I can make the decision. Often, there are warning signs before a serious event and I wouldn't like to discover after the event that my BGP process had been restarted every 5 minutes for a month before a major outage.
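The concern above, spotting a process that restarts every few minutes before it becomes an outage, is easy to sketch regardless of whether the events arrive as traps or syslogs. The following Python sketch flags a process that restarts too often within a time window; the threshold, window, and timestamps are hypothetical values chosen for illustration.

```python
# Flag a "flapping" process: raise an alert if `threshold` or more
# restart events fall within any sliding window of length `window`.
from datetime import datetime, timedelta

def flag_flapping(restart_times, window=timedelta(hours=1), threshold=3):
    """Return True if `threshold` restarts occur within any `window`."""
    times = sorted(restart_times)
    for i in range(len(times)):
        j = i
        while j < len(times) and times[j] - times[i] <= window:
            j += 1
        if j - i >= threshold:
            return True
    return False

# Three restarts in 12 minutes: this is the warning sign the poster
# wants surfaced before a major outage.
restarts = [datetime(2008, 3, 1, 10, 0),
            datetime(2008, 3, 1, 10, 5),
            datetime(2008, 3, 1, 10, 12)]
print(flag_flapping(restarts))
```

Whether the input comes from SNMP traps or parsed syslog messages, the monitoring logic is the same; what matters is that the device emits a discrete event per restart.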
Maybe that's a valid concern or maybe you can convince me that it isn't... I'm open-minded on the issue, and although I know a lot about device instrumentation via SNMP, I freely admit I don't have much experience with XML APIs and their ability to tell you what's happening.
I am just trying to understand what you want to see via SNMP. It sounds like a trap is what you would like; if that is the case, then no worries, you will get syslogs and the ability to send traps. It seemed like you had some other ideas for information you would like to see provided via a MIB, but if it is just a trap, then no worries.
I just have a comment to add here. I was in a Cisco lab this week with a Nexus 7000 running NX-OS connected via OSPF to a Catalyst 6500. When the OSPF process was killed on the Nexus, NX-OS recognized the failure and restarted OSPF so fast (with state) that the Catalyst 6500 never knew what happened: no lost hellos, no neighbor resets, no route changes, nothing. As if nothing ever happened. This was quite impressive to see!
It is kinda cool. In demos, we will do something like turn on debugging for Spanning Tree, then kill the STP process, and you can see there are no topology changes. As you say, neighbors don't even notice. One of the keys to all this is the fact that the restart is stateful, so the new process does not have to re-learn the topology.
Do you get the same transparency with IOS upgrades? (I guess that should be NX-OS upgrades...)
--sorry about the late reply...travelling yesterday--
Do you mean ISSU-type upgrades without impact to the data forwarding plane?
I'm interested in the effect of both patching the OS (e.g. security updates) and also the effect of upgrading to an entirely new version of OS.
I seem to remember reading somewhere that some kind of upgrade caused a blade reset, but I may be mistaken.
NX-OS is a fully modular OS, so individual processes run on top of a common Linux kernel. This means that we have highly granular patching capabilities--we could patch STP code without touching other parts of the OS, for example. A full OS upgrade, on the other hand, can be done without impacting data forwarding, as long as you have dual supervisors: you upgrade the standby supervisor, roll over to it, then upgrade the other supervisor. Because the data and control planes are loosely coupled, this happens without impact to data forwarding.
Thanks for the replies guys, I look forward to getting my hands on a couple of boxes to test.
In relation to VDCs, we currently maintain an entire DC just for development, testing and pre-prod environments. Would you consider it feasible to use VDCs for this purpose? By that I mean hosting one set of hardware and virtualising test environments alongside a production environment. Can VDCs run different versions of the OS?
It is certainly reasonable to mix test and production environments on a single physical chassis without them impacting each other. In fact, we expect this to be a common scenario. One other feature of NX-OS that really helps with testing is the ability to save checkpoints and rollback to them if things don't go as planned--not that that ever happens. :)
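The checkpoint-and-rollback idea mentioned above can be pictured with a small Python sketch: snapshot the running configuration before a risky change, then restore the snapshot if the test goes badly. The data model here is a toy stand-in for illustration, not the actual NX-OS feature or its command syntax.

```python
# Toy model of configuration checkpoint/rollback: named snapshots of
# the running config, restored wholesale on rollback.
import copy

class Config:
    def __init__(self):
        self.running = {"vlans": [10, 20], "stp_mode": "rapid-pvst"}
        self.checkpoints = {}

    def checkpoint(self, name):
        """Save a named, independent snapshot of the running config."""
        self.checkpoints[name] = copy.deepcopy(self.running)

    def rollback(self, name):
        """Replace the running config with a saved snapshot."""
        self.running = copy.deepcopy(self.checkpoints[name])


cfg = Config()
cfg.checkpoint("pre-test")
cfg.running["vlans"].append(999)   # risky test change
cfg.rollback("pre-test")           # things didn't go as planned
print(cfg.running["vlans"])
```

The deep copies matter: each checkpoint must be independent of the live config, or later edits would silently corrupt the snapshot you plan to roll back to.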
Currently, there is a restriction that all VDCs must be running the same version of NX-OS.
I didn't get a chance to test an ISSU of NX-OS; that would have been fun. But as Omar pointed out, you can do this with two sup engines in a stateful, hitless fashion. The main reason is that the sup engines do not actually pass any traffic (unlike the Sup720); the Nexus sup engines process control-plane traffic only. Basically, Andrew, the Nexus 7000 was designed to *never* go down, even during scheduled maintenance. These guys thought of everything, and most of the HA design has been commonplace for several years in the SAN with the MDS and SAN-OS.
This is how we get to a capacity of 230 Gbps per slot:
From the I/O module to the fabric module there is a series of thick copper traces and high-density connectors. Each pair of copper traces is today clocked at 3.125 Gbps. Each fabric channel comprises 16 pairs of copper, for 16 x 3.125 Gbps, or 50 Gbps of half-duplex and 25 Gbps of full-duplex bandwidth per fabric channel (each copper pair at 3.125 Gbps is unidirectional). We have to encode on the wire, and we use a 24b/26b encoding scheme that yields just north of 23 Gbps of real-world bandwidth per fabric channel.
Each switch fabric chip has 26 fabric channels. Two fabric channels connect from each switch fabric ASIC to each line card in the 10-slot chassis. Two fabric channels per slot at 23 Gbps each is 46 Gbps per slot per fabric module.
The Nexus 7000 can hold up to five fabric modules, so with five switch fabric modules at 46 Gbps each we have 230 Gbps per slot. Now I do want to clarify that this is 230 Gbps in and 230 Gbps out concurrently from every slot in the system when five fabric modules are present. Bandwidth scales up and down linearly with the addition or removal of fabric modules; you need a minimum of one and a maximum of five.
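The arithmetic above can be worked through step by step. This short Python snippet just restates the numbers from the explanation, so you can see where the roughly 230 Gbps per slot comes from.

```python
# Per-slot fabric bandwidth arithmetic for the Nexus 7000, following
# the figures in the explanation above.
PAIRS_PER_CHANNEL = 16        # copper pairs per fabric channel
GBPS_PER_PAIR = 3.125         # each pair is unidirectional

raw_half_duplex = PAIRS_PER_CHANNEL * GBPS_PER_PAIR   # 50 Gbps
raw_full_duplex = raw_half_duplex / 2                 # 25 Gbps each way

# 24b/26b encoding: ~23 Gbps of real-world bandwidth per channel
usable_per_channel = raw_full_duplex * 24 / 26

CHANNELS_PER_SLOT_PER_FABRIC = 2
FABRIC_MODULES = 5

per_slot_per_fabric = usable_per_channel * CHANNELS_PER_SLOT_PER_FABRIC
per_slot = per_slot_per_fabric * FABRIC_MODULES       # ~230 Gbps each way
print(f"{per_slot:.1f} Gbps per slot, each direction")
```

The exact figure comes out slightly above 230 Gbps; the article's round numbers (23, 46, 230) are the same values truncated for readability.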
You can read a bit more about this at http://blogs.cisco.com/datacenter/2008/01/lets_talk_bandwidth.html. Doug said it so well, I figured I'd just quote him on it. :)
As we build out data centers more and more with 10GE uplinks it becomes more important for us not to have blocking links. This problem seems to be solved with the 6500-VSS but is there an equivalent solution with NX-OS?
Yes, we do have similar functionality, called virtual Port Channel (vPC), in our next release, which is scheduled for the end of this calendar year. Also remember that the N7K has 64 line-rate 10GE ports on the 10-slot chassis and will have 128 line-rate 10GE ports on the 18-slot chassis, which will come out at the end of CY 2008.
A final question - I've been reading the literature on DCNM, which seems to be the network management tool of choice for the Nexus platform. However, it seems to manage just the Nexus platform, so I'd like to know if there are any plans to extend the support to other platforms.
I don't visualise us doing a forklift upgrade to Nexus, but more of a gradual evolution, which means that we will have both Nexus and 6500 together. Having yet another tool which only looks after one device seems limiting - can it cope if a bunch of the trunks connect to 6500s?
Ask away! So DCNM is a platform in transition. It is currently set up to manage L2/L3 and CTS on the Nexus 7K. Once the N7K has FCoE capabilities, you will have a single pane of glass to manage storage, L2, L3, and CTS. At this time, there is no plan to have DCNM span the Nexus and Catalyst families. Within the DC 3.0 vision, the goal is to have VFrame be the tool that handles the complexities of cross-platform operations and management. In the interim, we have given you a consistent CLI to simplify things.
BTW, the gradual evolution is the way to go. We expect the prime insertion point for the N7K to be the core. From there, we see an eventual transition of the agg layer from the c6500 to the N7K as the access layer transitions from GE to 10GE. The rate of migration is going to be customer dependent.