I have built a Windows Server 2012 R2 Hyper-V cluster on B200 M3s (2.21d firmware, VIC 1240). I was unable to get the OS to enable VMQ on the NICs that I presented to Hyper-V for VM networking. I finally found that I needed to create a VMQ Connection Policy and assign it to the NICs I wanted to use VMQ on. The policy requires two settings: 1) Number of VMQs and 2) Number of Interrupts. This is where my question comes in: how do I figure out what values to use here?
So I opened a Cisco support ticket and was given the following:
The Number of Interrupts value is calculated as the total number of CPU threads that the server has.
The Number of VMQs is calculated as: Number of VMQs = Number of Synthetic NICs (VM NICs) + 1. This was clarified again later as the number of NICs on the VMs that are present on the host. So if I have 25 VMs with one NIC each, my value should be 26.
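The sizing rule from the support ticket can be sketched like this; the function names are mine, only the formulas come from the ticket:

```python
# Sizing rule for the UCS VMQ Connection Policy as relayed by Cisco TAC.
# Helper names are illustrative; only the two formulas come from the ticket.

def number_of_interrupts(total_cpu_threads):
    """Number of Interrupts = total CPU threads in the server."""
    return total_cpu_threads

def number_of_vmqs(vm_nic_count):
    """Number of VMQs = number of synthetic (VM) NICs on the host, plus one."""
    return vm_nic_count + 1

# Example from this post: 25 VMs with one NIC each.
print(number_of_vmqs(25))   # 26
```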
The Interrupts value makes sense; the VMQs value does not. I have a cluster, and VMs are migrated back and forth between nodes for obvious reasons, and we also add and remove VMs on a regular basis. I could obviously change the Number of VMQs to match whenever changes are made, but that seems very unreasonable to me.
So that all said, does anyone else have feedback on these settings?
That seems about right.
I would say prepare for "worst" case scenario. Depending on your environment, it could be multiple values.
Saw you also hit the MS forums, good idea, that.
Coincidentally, the go-to VMQ doc from MS:
That doc states that after you use up all your VMQs, the system uses the default queue for each additional VM NIC. So if I understand this correctly, we get the advantage of VMQ for as many queues as we have available, and after that the default queue is used, which is basically the same as not running VMQ in the first place??
If the above is correct, I am thinking I will set the interrupts based on the CPU information; in my case UCS reports 20 threads per CPU, so I will use 40, and set the Number of VMQs to the max, which I have been told is 128 for the VIC 1240. This way I will get the most advantage from VMQs that is possible per node in the cluster. If I happen to go over, the default queue will be used (there was nothing I could do about that anyway, since I am maxed out), and I don't see any negative information about having too many VMQs.
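The "max it out" approach above comes down to this; the 128-queue figure for the VIC 1240 and the 20-threads-per-CPU count are from this thread, the rest is an illustrative sketch:

```python
# Sketch of the "set everything to the max" approach described above.
# VIC1240_MAX_VMQS = 128 is the per-vNIC limit quoted in this thread.

VIC1240_MAX_VMQS = 128

def policy_values(sockets, threads_per_cpu):
    """Return (Number of Interrupts, Number of VMQs) for the policy."""
    interrupts = sockets * threads_per_cpu   # 2 x 20 = 40 in my case
    vmqs = VIC1240_MAX_VMQS                  # just take the adapter maximum
    return interrupts, vmqs

print(policy_values(2, 20))   # (40, 128)
```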
Any other suggestions?!
I'm not __aware__ of negative repercussions of maxing out the number of VMQs, but I don't know the architectural choices the VIC makes behind this feature. I'll see if I can find something.
Checking MSDN at the same time, I could not find a specific interrupt-to-queue (or queue-to-interrupt) mapping, only this:
I do believe we have the following limitations:
Number of VMQs per vNIC = 128
Number of VMQs per blade = 256
Depending on the number of vNICs and/or VMs, you can theoretically run out of VMQs.
I configure the Number of Interrupts value to the number of 'threads' or 'logical processors' as shown by UCS Manager. The only question that remains: should you divide this value over the number of vNICs (that host VMs) you configured in your Service Profile?
Now, about the Number of VMQs. Cisco has different VICs, for example the VIC 1240 or 1280, but you can also have a VIC 1240 + Port Expander, which seems to be capable of 256 VMQs. Knowing that, there is another thing to keep in mind: how many vNICs (vEths) do you provision in a Cisco UCS Service Profile? To give an example, I use the following:
I am not sure whether you should divide the available VMQs over these vNICs (Service Profile); I am almost certain you should, because I can't get above 256. But I don't think setting the Number of VMQs to the maximum is a good thing, because each VMQ is linked to an interrupt, and those seem to stack up. If you don't use all the VMQs, the load may not be divided equally. It sounds logical to look at the number of vNICs (Hyper-V) across all your VMs. The downside is that this can vary a lot, especially when you are doing maintenance and shut down a cluster node (e.g. 3/4 up instead of 4/4).
Any thoughts about this?
The last few days I have done a lot of research. Some things are different than we might expect. Just to point out one, the following quote comes straight from TechNet:
Some Intel multi-core processors may use Intel Hyper-Threading technology. When Hyper-Threading is enabled, the actual number of cores that are used by dVMQ should be half the total number of logical processors that are available in the system. This is because dVMQ spreads the processing across individual physical cores only, and it will not use hyper-threaded sibling cores. As an example, if the machine has an Intel processor with four physical cores, and Hyper-Threading is enabled, it will show a total of eight logical processors. Only four logical processors are available to VMQ. VMQ will use cores 0, 2, 4, and 6.
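The TechNet behavior quoted above is easy to illustrate: with Hyper-Threading enabled, dVMQ only schedules on one logical processor per physical core, i.e. the even-numbered LPs. A minimal sketch (function name is mine):

```python
# With Hyper-Threading, dVMQ skips the hyper-threaded sibling LPs and
# uses only even-numbered logical processors.

def vmq_usable_lps(logical_processors, hyperthreading):
    step = 2 if hyperthreading else 1
    return list(range(0, logical_processors, step))

# TechNet's example: 4 physical cores, HT on -> 8 LPs, VMQ uses 0, 2, 4, 6.
print(vmq_usable_lps(8, True))   # [0, 2, 4, 6]
```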
According to the Cisco help information, you should configure the Number of Interrupts (in the VMQ Connection Policy) to the number of logical processors in your server. But VMQ only applies to physical cores. Hmmm... OK. I also found the following quotes from TechNet and other resources:
Changing the number of available CPUs may affect the number of RSS queues. When the number of available CPUs is lower than the currently configured number of RSS queues, the Windows OS will silently lower the number of RSS queues to match the number of CPUs.
This sounds to me like a larger configured Number of VMQs (in the VMQ Connection Policy) is automatically downsized to the maximum number of CPUs available in your system, or to whatever you limit it to manually with Set-NetAdapterVmq. In other words, the number of VMQs never exceeds the number of available CPUs. That sounds quite logical to me; why would you have more queues than physical CPUs available?
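If that reading is right, the effective queue count behaves like a simple clamp; a one-line sketch of the silent downsizing (my own interpretation of the quote, not a documented formula):

```python
# Sketch of the silent downsizing described in the quote above: the
# effective number of queues can never exceed the CPUs available.

def effective_queues(configured_vmqs, available_cpus):
    return min(configured_vmqs, available_cpus)

# Configure 128 VMQs on a host with 10 usable CPUs:
print(effective_queues(128, 10))   # 10 - Windows quietly trims the rest
```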
There may be situations where starting with logical processor 0 (which corresponds to core 0) as the RssBaseProcNumber is acceptable. However, general guidance is to avoid using core 0 as the base CPU because it is generally used for default network queues and workload processing in the root partition.
Based on the root virtual processor utilization, the RSS base and maximum logical processors for a physical network adapter can be configured to define the set of root virtual processors that are available for VMQ processing. Selecting the set of VMQ processors can better isolate latency-centric or cache-sensitive virtual machines or workloads from root network processing. Use the Windows PowerShell cmdlet, Set-NetAdapterVmq, to set the correct base processor number and the maximum processor number.
When a host has multiple network adapters, each for a different virtual switch instance, and it is using dVMQ, do not configure dVMQ to use overlapping processor sets. Ensure that each network adapter that is configured for dVMQ has its own set of cores. Otherwise performance results may be impaired and unpredictable.
OK, apparently VMQ tuning is a best practice. All processing occurs on the base processor (e.g. 0:0) until that LP (logical processor) exceeds 90% utilization; then one VM's traffic is redirected to another LP. I have tested this, and indeed that is what happens. But by default, ALL your VMs will initially use the BaseProcessor until that LP/core exceeds 90%. If you have 16 LPs (with Hyper-Threading, but only counting the physical processors) and two NICs each with a vSwitch, it might be best practice to assign 8 LPs to each NIC, each NIC with its own BaseProcessor.
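The non-overlapping split described above can be sketched as follows; each `(base, max)` pair corresponds to what you would feed to `Set-NetAdapterVmq` as `-BaseProcessorNumber` and `-MaxProcessors` (the helper name is mine):

```python
# Sketch of partitioning 16 LPs over two vSwitch NICs so their dVMQ
# processor sets do not overlap, per the TechNet guidance quoted earlier.
# Each NIC gets a contiguous range with its own base processor.

def split_lp_ranges(total_lps, nic_count):
    """Return a (base_processor, max_processors) pair per NIC."""
    per_nic = total_lps // nic_count
    return [(nic * per_nic, per_nic) for nic in range(nic_count)]

# 16 LPs, 2 NICs -> NIC 0 gets LPs 0-7, NIC 1 gets LPs 8-15.
print(split_lp_ranges(16, 2))   # [(0, 8), (8, 8)]
```

Note that per the earlier quote, you would probably also want to avoid core 0 as a base processor in the root partition; the sketch ignores that for simplicity.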