Vishal Mehta is a customer support engineer for Cisco’s Data Center Server Virtualization TAC team based in San Jose, California. He has been working in the TAC for the past three years with a primary focus on data center technologies such as Cisco Nexus 5000, Cisco UCS, Cisco Nexus 1000v, and virtualization. He has presented at Cisco Live in Orlando 2013 and will present at Cisco Live Milan 2014 (BRKCOM-3003, BRKDCT-3444, and LABDCT-2333). He holds a master’s degree from Rutgers University in electrical and computer engineering and has CCIE certification (number 37139) in routing and switching and service provider.
The following experts helped Vishal answer some of the questions asked during the session: Ali Haider and Gunjan Patel. Ali and Gunjan are support engineers in TAC and have extensive knowledge of data center virtualization topics.
You can download the slides of the presentation in PDF format here. The related Ask the Expert session is available here. The complete recording of this live webcast can be accessed here.
L2 and L3 mode Related
Q. Why is L3 preferred over L2 for VSM to VEM connectivity? Are there known issues or limitations with L2?
A. L3 mode is easier to troubleshoot, and it gives you the freedom to place the VEM and VSM in different subnets. It also minimizes broadcast traffic when there are multiple VEMs and VSMs. There are no limitations in L2.
Q. Is moving from L2 to L3 for VSM-to-VEM connectivity disruptive?
A. It should not cause any disruption in the data plane. L2 and L3 here are just modes for control traffic, so any disruption during the move is temporary and affects only the control plane. Existing VMs will continue to forward traffic; new VMs may face issues, but only temporarily.
Q. Is there an L3 control option between the primary and secondary VSM for HA?
A. No, currently we have an L2 control plane between VSMs for HA. We do not use L3 for this purpose.
HA Clustering Related
Q. Do you recommend deploying Cisco Nexus 1000V HA over an IPsec VPN tunnel?
A: That can add a lot of encapsulation overhead. As I said, it should be fine if it meets the 10 ms latency requirement. You will need VSG if you want to use IPsec or any advanced security feature with Nexus 1000v.
Q. What is the latency requirement for HA clustering over a WAN?
A: You need 10 ms latency for HA.
Q. How is split brain avoided in Nexus 1000v if you have two VSMs?
A: Nexus 1000v VSMs maintain a heartbeat and only one VSM is active at a time. This mechanism helps avoid split brain scenarios. However, in certain scenarios where both VSMs lose network connectivity, split brain can still occur.
VSMs communicate over Layer 2, and as long as they are able to talk to each other a split-brain scenario will not arise. If you have a split brain in your network, try moving both VSMs to the same host and sending some traffic through the vSwitch so that they have solid Layer 2 connectivity, then work out what caused the split brain.
Q: Can we have two VSMs across data centers? Are there any requirements or restrictions?
A: Yes, you can have VSMs across data centers. Make sure that you have L2 connectivity between the data centers and that the latency requirements are met: 10 ms for HA and 100 ms for VEM to VSM if they are in different data centers.
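As a rough illustration of the two limits quoted above, the following Python sketch checks a measured latency against them. The function name, the link-type labels, and the choice to treat the limits as inclusive are assumptions made for this example only; they are not part of any Cisco tool.

```python
# Latency limits quoted in the answer above, in milliseconds.
HA_MAX_LATENCY_MS = 10.0         # VSM to VSM (HA pair)
VEM_VSM_MAX_LATENCY_MS = 100.0   # VEM to VSM

# Hypothetical labels for the two kinds of link discussed in the session.
LIMITS_MS = {"ha": HA_MAX_LATENCY_MS, "vem-vsm": VEM_VSM_MAX_LATENCY_MS}


def check_latency(link_type, measured_ms):
    """Return True if measured_ms is within the limit for link_type.

    Treating the limit as inclusive is an assumption of this sketch.
    """
    if link_type not in LIMITS_MS:
        raise ValueError("unknown link type: %r" % (link_type,))
    return measured_ms <= LIMITS_MS[link_type]


if __name__ == "__main__":
    print(check_latency("ha", 8.5))         # HA link measured at 8.5 ms
    print(check_latency("vem-vsm", 120.0))  # VEM to VSM link measured at 120 ms
```

The measured value would come from whatever latency measurement you already trust between the two sites.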
Q. Can we have VMFEX as a backup to the VSM, or more than two VSMs? Can you elaborate on this?
A: No, VMFEX and VSM are complementary to each other. They can coexist, but they cannot manage the same infrastructure, so you cannot have VMFEX as a backup for the VSM. With VSM you can have a pair (active and standby), but you cannot have more than two in HA redundancy. So you can have only one pair of VSMs managing the same data center, not three or four.
Q. Why does VMware VUM not verify whether a VEM can still connect to the VSM after updating the ESX host with patches?
A: Do you mean VUM doesn't verify if the host is connected to the VSM before applying the patches?
Q. For example, VMware 5.1 with current patches leads to the error on the VSM "The VEM is not compatible with the VSM Version"?
A: Yes, that's a known issue with VUM. If VUM tries to install a higher version of the VEM module while the Nexus 1000v is at a lower version, the two are not compatible, so VSM to VEM communication won't happen.
Q. No, the N1kv was working before I applied the current patches to the ESX server.
A: VUM just sees the VEM on a host as a vib file (similar to any other vib driver). VUM does not check the connection to the VSM before applying the patch, so it sometimes upgrades the VEM to a non-compatible version.
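Since VUM does not perform this check, one option is to compare versions yourself before patching. The Python sketch below only illustrates the idea: the host names and version labels are placeholders, and the set of VEM versions supported by your VSM has to come from the compatibility information for your release, which this code does not know.

```python
def hosts_to_hold_back(pending_vem_by_host, vems_supported_by_vsm):
    """Return the hosts whose pending VEM version the running VSM does not support.

    pending_vem_by_host: dict of host name -> VEM version the patch would install.
    vems_supported_by_vsm: iterable of VEM versions compatible with the VSM.
    Both inputs are supplied by the operator; nothing is queried from VUM or the VSM.
    """
    supported = set(vems_supported_by_vsm)
    return sorted(
        host for host, vem in pending_vem_by_host.items() if vem not in supported
    )


if __name__ == "__main__":
    # Placeholder labels, not real version strings.
    pending = {"esx-01": "vem-build-B", "esx-02": "vem-build-A"}
    print(hosts_to_hold_back(pending, {"vem-build-A"}))
```

Hosts returned by the function are the ones to exclude from the patch run until the VSM has been upgraded.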
Q. Any idea when this will be fixed?
A: This is more of a VMware design issue; they can't treat the VEM differently from any other vib. We are working with VMware to get this resolved, hopefully soon.
Q. I had to do a Roll-back?
A: Yes, that's the workaround for now.
Q. I had to roll back an ESX server for the first time since I started working with it, starting with version 1.x in 2004?
A: Yes. VUM has not worked well with the 1000v from the beginning.
Q. What version do you recommend we run on the 1000V for ESX 5.0?
Q. How can I enforce (with N1k) that all my vmk vMotion traffic goes over the same fabric?
A: On UCS you can pin a vnic to a specific fabric interconnect by configuring the vnic properties. This ensures that the vmnic is pinned to that fabric interconnect. From the Nexus 1000v perspective you have to create port-profiles. Two port-profiles cannot have overlapping VLANs. Once you have a port-profile defined for your vMotion VLAN, you can assign that port-profile to a vmnic that has been pinned to a fabric interconnect.
Through the 1000v alone you can't enforce that. The vmnics that carry the VLAN traffic have corresponding vnics in UCS, and these vnics need the same primary path. So if vnic 0 and vnic 1 on both hosts have the same primary path via FI A and they allow vMotion traffic, vMotion will stay local to the FI.
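The rule that two port-profiles cannot carry overlapping VLANs is easy to check before you touch the configuration. The following Python sketch works on a plain dictionary that you fill in yourself; the profile names and VLAN numbers are made up for illustration, and the code does not read anything from a VSM.

```python
from itertools import combinations


def find_vlan_overlaps(profiles):
    """Return (profile_a, profile_b, shared_vlans) for each pair sharing a VLAN.

    profiles: dict of port-profile name -> iterable of VLAN IDs it carries.
    """
    overlaps = []
    for name_a, name_b in combinations(sorted(profiles), 2):
        shared = set(profiles[name_a]) & set(profiles[name_b])
        if shared:
            overlaps.append((name_a, name_b, sorted(shared)))
    return overlaps


if __name__ == "__main__":
    # Hypothetical port-profiles; VLAN 300 appears in two of them.
    planned = {
        "uplink-vmotion": {100},
        "uplink-data": {200, 300},
        "uplink-mgmt": {300, 400},
    }
    for name_a, name_b, vlans in find_vlan_overlaps(planned):
        print("%s and %s overlap on VLANs %s" % (name_a, name_b, vlans))
```

An empty result means the planned profiles satisfy the rule as stated in the answer.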
Q. Is MS unicast NLB (without N1k) not supported in UCS EHM?
A: It is supported in UCS, as UCS supports all three modes, but with Nexus 1000V only the unicast mode will work.
Q. Can I do a local port SPAN from different vmguests across different ESX servers, or do they need to live on the same ESX server?
A: You cannot accomplish this with local port SPAN; for local SPAN the VMs have to be on the same ESXi host. However, you can use ERSPAN to span traffic from VMs on multiple hosts to an IP destination, on a Nexus 7000 for example.
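The choice between the two comes down to whether the source VMs share a host. A minimal Python sketch of that decision, assuming you supply the VM-to-host mapping yourself (the VM and host names are hypothetical):

```python
def choose_span_type(host_by_vm):
    """Return "local SPAN" if all source VMs share one ESXi host, else "ERSPAN".

    host_by_vm: dict of VM name -> ESXi host name, supplied by the operator.
    """
    hosts = set(host_by_vm.values())
    if not hosts:
        raise ValueError("at least one source VM is required")
    return "local SPAN" if len(hosts) == 1 else "ERSPAN"


if __name__ == "__main__":
    print(choose_span_type({"vm-a": "esx-01", "vm-b": "esx-01"}))
    print(choose_span_type({"vm-a": "esx-01", "vm-c": "esx-02"}))
```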
Q. Is there a quick way to find out if the vmguests are operating in headless mode besides the CLI on the ESX server?
A: You can find that out from the ESX CLI with the command "vemcmd show card". You can't get that information from the VM guest, because the VM guest should still be passing traffic when the VEM is in headless mode.
For a vmguest you have to check at the VEM level. Use the command "vemcmd show card"; if it shows the VEM in headless mode, then all VMs on that VEM are in headless mode.
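If you collect the output of "vemcmd show card" from several hosts, a small script can pick out the headless state for you. The Python sketch below assumes the output contains a line of the form "Headless Mode : Yes" or "Headless Mode : No"; the exact wording can differ between releases, and the sample text here is illustrative rather than captured from a real host.

```python
import re

# Assumed line format: "Headless Mode : Yes" or "Headless Mode : No".
HEADLESS_LINE = re.compile(r"\s*Headless Mode\s*:\s*(\w+)", re.IGNORECASE)


def is_headless(show_card_output):
    """Return True or False from the headless line, or None if no such line exists."""
    for line in show_card_output.splitlines():
        match = HEADLESS_LINE.match(line)
        if match:
            return match.group(1).lower() == "yes"
    return None


if __name__ == "__main__":
    # Illustrative fragment only, not real command output.
    sample = "Card slot: 3\nHeadless Mode : Yes\n"
    print(is_headless(sample))
```

A result of None means the expected line was not found, in which case read the raw output by hand rather than trusting the script.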
Q. How close is this design to the Enterprise Campus design model?
Q. Are the Nexus 1000v installation and configuration the same for Nexus 1010?
A: VSM configuration is pretty much the same; only the installation is a bit different for Nexus 1010. Nexus 1010 is a physical box that can host up to 6 pairs of VSMs, which can manage different data centers. Nexus 1010 and 1000v have different configuration guides and are not replacements for each other. It is just a physical Nexus 1010 with the 1000v sitting on top of it.
Q. With the enterprise campus design model, routing keeps getting pushed closer to the access layer. Can this be done on the fabric interconnects or does routing need to take place upstream?
A: You can use the 1000v in L2 mode so the VEM and VSM can talk to each other in the same VLAN, and traffic can be switched locally through the fabric (provided the vmnics are pinned to the same fabric).
FIs are pure L2, and we have to stay with the server convention, where a server is an end-host device connected to the switch, so routing is not going to happen at the FI. One workaround is to deploy CSR 1000v for routing. That way routing can be done within the UCS domain, but it cannot happen at the FI.
Q. This makes sense. My concern is traffic for VMs in different VLANs. Can routing between these 2 VMs take place on a fabric interconnect?
A: No, not as of now. Just L2 switching at the FI level right now.
Q. Does the 1000V support storm control for broadcast/multicast or is that handled upstream on the switch?
A: It is handled upstream on the switch.
Q. If server 1 is pinned to fabric 1 and server 2 is pinned to fabric 2, is there local switching?
A: No, it's not local switching, because Fabric Interconnect to Fabric Interconnect traffic has to go through the upstream switch. There is no L2 data path between the two FIs; only control traffic goes over the link between them.
Q. Can you run multiple VEMs on the same hypervisor?
A: No, it's not supported right now. We cannot have multiple VEMs on the same hypervisor (VMware or Hyper-V).
Q. Is it possible to use all four virtual switching options at the same time: VMFEX, vSwitch, VMware DVS and Nexus 1000v?
A: VMFEX cannot manage the same infrastructure; it's a different design. For the three virtual switches (vSwitch, DVS and 1000v), it is possible to segregate the traffic and send it via different switches, but the recommendation is to put all traffic through DVS or Nexus 1000v.
Q. Do we need to have an advanced license for Nexus 1000v?
A: It depends. If you are planning to use advanced features, you will need it. 99% of the customers use the basic license, which is free (2.1 version onwards). For advanced features like TrustPoint, Security or DHCP snooping you will need to purchase the advanced license.
Q. Can the same VSM be used for both Hyper-V and ESXi?
A: No, because of the differing requirements of VMware and Windows you cannot have the same VSM instance for both environments.
Q. Is the VSM/VEM software for Hyper-V and ESXi different?
A: Yes, they are different. They are on different code trains because we need to support the features of both VMware and Hyper-V, and the two don't have the same features or functionality.
Q. Will there be one GUI to manage both UCS and Nexus 1000v?
A: No, not right now. We have the same GUI for VMFEX and UCS, but not one GUI for 1000v and UCS.
Q. Does VEM to VEM traffic have to pass through an external switch?
A: If they are pinned to the same FI then it will be through the FI, and if not then it will be through the upstream switch.
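The pinning rule from this answer, and from the earlier one about servers pinned to different fabrics, can be written down as a tiny Python function. The fabric labels "A" and "B" and the function name are assumptions for this sketch.

```python
def vem_to_vem_path(src_fabric, dst_fabric):
    """Describe where traffic between two VEMs is switched.

    src_fabric, dst_fabric: label of the fabric interconnect each VEM's
    uplink is pinned to, for example "A" or "B".
    """
    src = src_fabric.strip().upper()
    dst = dst_fabric.strip().upper()
    if src == dst:
        return "switched locally on FI %s" % src
    # There is no L2 data path between the two FIs, so traffic goes upstream.
    return "via the upstream switch"


if __name__ == "__main__":
    print(vem_to_vem_path("A", "A"))
    print(vem_to_vem_path("A", "B"))
```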
Q. In end-host mode, unknown unicast flooding is not done. Could that be the reason why unicast NLB does not work while multicast works?
A: Correct. With unicast NLB the traffic would be unknown unicast, because by design MS NLB unicast mode uses its own MAC address, different from the one used in multicast mode. So MS NLB unicast will not work with UCS end-host mode.