When we build a new virtualization environment, we need to understand each element in depth to avoid problems, and hardware usage is certainly one of them. This article is the first in a series in which I want to explain the mechanisms behind the scenes of Hyper-V, so that we understand how to configure our servers. Today we will talk about the processor and, in particular, about NUMA.
Non-Uniform Memory Access (NUMA) is a computer system architecture used in multiprocessor designs in which each CPU is connected to its own RAM on a local bus. Access to that local memory has low latency and high performance, while access to memory attached to another CPU is slower. Local memory is the memory that is on the same node as the CPU currently running the thread. Each CPU/RAM pair is called a NUMA node.
The system attempts to improve performance by scheduling threads on processors that are in the same node as the memory being used. It attempts to satisfy memory-allocation requests from within the node, but will allocate memory from other nodes if necessary. It also provides an API to make the topology of the system available to applications. You can improve the performance of your applications by using the NUMA functions to optimize scheduling and memory usage.
NUMA in Hyper-V
Hyper-V provides NUMA support, called Virtual NUMA. Virtual processors and guest memory are grouped into virtual NUMA nodes, and the virtual machine presents a topology to the guest operating system based on the underlying physical topology. Of course, the application must also be NUMA-aware; this is usually true for the big players (Microsoft, Oracle, SAP, etc.) but not always for small software vendors.
When a virtual machine is started, Hyper-V attempts to allocate all the memory for that virtual machine from a single physical NUMA node, if sufficient memory is available. This gives the VM the best performance, but it also carries a risk when the demand exceeds the capacity of a single NUMA node.
If the memory requirements for the virtual machine cannot be satisfied from a single node, Hyper-V allocates memory from another physical NUMA node. This is called NUMA Spanning. It is enabled by default in Hyper-V (figure 2) and allows the hypervisor to assign resources across two or more buses. This can be an advantage or a disadvantage (more details later).
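The NUMA Spanning setting can also be inspected and changed from PowerShell. The following is a minimal sketch, assuming the Hyper-V PowerShell module is installed and the session is elevated on the host. The change is applied when the Hyper-V Virtual Machine Management service (vmms) restarts, and VMs that are already running may need a restart before it affects them.

```powershell
# Show whether NUMA Spanning is currently enabled on this host.
Get-VMHost | Select-Object -Property Name, NumaSpanningEnabled

# Disable NUMA Spanning (pass $true to enable it again).
Set-VMHost -NumaSpanningEnabled $false

# Restart the Hyper-V Virtual Machine Management service so that
# the new setting is picked up.
Restart-Service -Name vmms
```

Try this on a test host first: with NUMA Spanning disabled, a VM that does not fit into a single physical node will fail to start.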
Pros and Cons
This list explains the benefits and disadvantages of enabling NUMA Spanning:
- PRO – Resource usage across physical NUMA nodes
- PRO – VM can power on even when its NUMA configuration is oversized
- CON – VM startup and reboot performance may vary
- CON – VM performance is very poor when its NUMA configuration is oversized
This list explains the benefits and disadvantages of disabling NUMA Spanning:
- PRO – NUMA aware workloads perform optimally
- PRO – VM startup and reboot perform optimally
- CON – VM will fail to start if resources are not available
- CON – VM may fail to migrate if resources are not available
- CON – VM with Dynamic Memory enabled cannot use more processors or memory than available in a single physical NUMA node
As you can see, there is no perfect solution, but it is clear that leaving NUMA Spanning enabled is probably the most flexible choice for most scenarios.
Configure Virtual Machine
Each virtual machine can be configured individually, or we can use the default settings, as shown in figure 3.
The processor and memory values are derived from the resources of each physical NUMA node. In my case, each socket has 8 cores and 46 GB of RAM. If you want to reset a modified value, which is possible only when the VM is powered off, click Use Hardware Topology.
To see if your hypervisor supports NUMA, run this PowerShell cmdlet: Get-VMHostNumaNode. The result will be similar to figure 4.
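The same per-VM NUMA limits can be read and set from PowerShell. This is only a sketch: the VM name SQL01 is hypothetical, the values 8 and 46GB simply mirror the socket described above, and the VM is assumed to be powered off, as in the GUI.

```powershell
$vmName = 'SQL01'   # hypothetical VM name, replace with your own

# Show the current virtual NUMA limits of the VM.
# The *Numa* wildcard selects every NUMA-related property.
Get-VMProcessor -VMName $vmName | Select-Object -Property VMName, Count, *Numa*
Get-VMMemory    -VMName $vmName | Select-Object -Property VMName, *Numa*

# Limit each virtual NUMA node to 8 virtual processors,
# with one virtual NUMA node per virtual socket.
Set-VMProcessor -VMName $vmName -MaximumCountPerNumaNode 8 -MaximumCountPerNumaSocket 1

# Limit the memory of each virtual NUMA node to 46 GB.
Set-VMMemory -VMName $vmName -MaximumAmountPerNumaNodeBytes 46GB
```

Values larger than what the physical node offers may be rejected or may cause the VM to span nodes, so staying within the hardware topology is the safer choice.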
As you can see, both nodes have 10 GB of RAM available. This can also be checked from Task Manager (figure 5), where the total amount of available memory is 23 GB. This means that my server is balanced.
Another useful command for checking the balance is: Get-Counter "\Hyper-V VM VID Partition(*)\*" (figure 6).
The first three values are performance counters that let you judge how well a virtual machine's virtual NUMA nodes align with the host's physical NUMA nodes. If Remote Physical Pages is zero, the virtual machine is perfectly aligned. Physical Pages Allocated shows the association between the virtual machine and the physical NUMA node.
The last tool is this script from TechNet (https://technet.microsoft.com/en-us/library/dn282282(v=ws.11).aspx), which shows the configuration of each VM and its NUMA association. Just add this line at the end of the script: $vNodes | Format-Table @("VmName", "vNUMANode", "VMMem", "PhyNUMANode", "PhyNUMANodeMemRemaining")
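To compare the nodes side by side without reading the raw output, the result of Get-VMHostNumaNode can be reduced to a small table. This sketch assumes that MemoryTotal and MemoryAvailable are reported in MB; the TotalGB and AvailableGB column names are introduced here only for illustration.

```powershell
# One row per physical NUMA node, with memory converted from MB to GB
# (dividing by 1KB is a division by 1024).
Get-VMHostNumaNode |
    Select-Object -Property NodeId,
        @{ Name = 'TotalGB';     Expression = { [math]::Round($_.MemoryTotal / 1KB, 1) } },
        @{ Name = 'AvailableGB'; Expression = { [math]::Round($_.MemoryAvailable / 1KB, 1) } }
```

If the AvailableGB values are close to each other, the VMs are spread evenly across the nodes.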
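To focus only on misaligned virtual machines, the counter output can be filtered. The sketch below assumes an English-language host (counter names are localized) and lists only the partitions whose Remote Physical Pages value is greater than zero.

```powershell
# Read only the Remote Physical Pages counter for every VM partition.
$counter = '\Hyper-V VM Vid Partition(*)\Remote Physical Pages'

(Get-Counter -Counter $counter).CounterSamples |
    # Skip the aggregate instance and keep only VMs with remote pages.
    Where-Object { $_.InstanceName -ne '_total' -and $_.CookedValue -gt 0 } |
    Sort-Object -Property CookedValue -Descending |
    Select-Object -Property InstanceName, CookedValue
```

An empty result means that no running VM is currently using memory from a remote NUMA node.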
Virtual NUMA vs Dynamic Memory
Figure 7 shows the detail: when Dynamic Memory is enabled, Virtual NUMA is disabled. This means that a single virtual NUMA node is presented to the virtual machine regardless of the virtual NUMA settings.
The logic behind the scenes is simple: if your VM uses more resources than a single node can provide, NUMA Spanning requests the extra resources from another node, but the hypervisor still exposes only a single virtual NUMA node to the guest.
When a virtual machine runs a NUMA-aware application and the resource requirements are very high, it may be better to use static memory, so that Virtual NUMA is enabled and the architecture is exposed to the virtual machine. Another option is to configure the vProcessor settings to split the load across virtual NUMA nodes.
The right virtual machine configuration is fundamental, and even small details matter to avoid problems and get the best performance from your infrastructure. Leave the parameters at their defaults if you are not familiar with tuning.
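Switching a VM from Dynamic Memory to static memory can be scripted as well. This is a hedged sketch: the VM name SQL01 and the 64GB startup value are hypothetical, and the VM has to be powered off before the memory mode can be changed.

```powershell
# List the VMs that currently use Dynamic Memory
# (and therefore see a single virtual NUMA node).
Get-VM | Get-VMMemory |
    Where-Object { $_.DynamicMemoryEnabled } |
    Select-Object -Property VMName, Startup, Minimum, Maximum

$vmName = 'SQL01'   # hypothetical VM name, replace with your own

# Stop the VM, switch it to static memory, then start it again.
Stop-VM -Name $vmName
Set-VMMemory -VMName $vmName -DynamicMemoryEnabled $false -StartupBytes 64GB
Start-VM -Name $vmName
```

After the restart, the guest should see the virtual NUMA topology defined in the VM settings.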