Memory Ballooning: Managing Memory in a Virtual Environment
VMware's ESXi hypervisor, the core of the vSphere platform and often referred to simply as vSphere, allows us to create many virtual machines on the same physical server. Because virtual machines rarely use 100% of their allocated memory, ESXi allows us to overcommit memory on the physical server, also known as the host.
But what happens when you allocate more memory to virtual machines than is physically installed on the host server — and your virtual machines try to use all of this memory? This is exactly where ESXi's intelligent memory management system comes in. Let's first look at four memory management techniques and what they address.
1. Transparent Page Sharing is Enabled by Default
One of the least impactful memory management options available in ESXi is Transparent Page Sharing (TPS). TPS scans a virtual machine's memory for duplicate memory pages. If pages are identical, the host keeps a single copy and reclaims the duplicates for use by other virtual machines. This option is enabled by default on recent versions of ESXi.
VMware also offers Inter-VM TPS, which deduplicates memory pages across virtual machines running on the same ESXi host. This is particularly useful when many virtual machines on the host run the same operating system, because those guests can hold many gigabytes of identical memory pages. Collapsing these to a single copy on the host can therefore save a considerable amount of memory.
Keep in mind that Inter-VM TPS (TPS between virtual machines) is disabled by default in recent ESXi versions. This is due to VMware's strict security policies around sharing resources between virtual machines. If one VM were successfully attacked, strict isolation between virtual machines helps prevent the attack from spreading to other machines.
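To make the idea of page deduplication more concrete, here is a minimal sketch in Python. It is a toy model, not how ESXi implements TPS: the 4 KB page size, the use of SHA-256, and the function names are all assumptions made for illustration. It compares how many pages could be collapsed when each virtual machine is scanned on its own versus when pages are compared across virtual machines.

```python
import hashlib

PAGE_SIZE = 4096  # bytes; assumed page size for this sketch


def split_pages(memory: bytes, page_size: int = PAGE_SIZE) -> list:
    """Cut a memory image into fixed-size pages."""
    return [memory[i:i + page_size] for i in range(0, len(memory), page_size)]


def shareable_pages(pages: list) -> int:
    """Count pages that duplicate an earlier page and could be collapsed."""
    seen = {}
    duplicates = 0
    for page in pages:
        digest = hashlib.sha256(page).digest()
        # A hash match is only a hint; compare the bytes before sharing.
        if digest in seen and seen[digest] == page:
            duplicates += 1
        else:
            seen[digest] = page
    return duplicates


# Two toy "virtual machines" that hold mostly identical content.
vm_a = b"\x00" * PAGE_SIZE * 3 + b"A" * PAGE_SIZE
vm_b = b"\x00" * PAGE_SIZE * 2 + b"B" * PAGE_SIZE

# Scanning each VM separately versus scanning both together.
intra_vm = shareable_pages(split_pages(vm_a)) + shareable_pages(split_pages(vm_b))
inter_vm = shareable_pages(split_pages(vm_a) + split_pages(vm_b))

print(f"intra-VM shareable pages: {intra_vm}")
print(f"inter-VM shareable pages: {inter_vm}")
```

In this toy example, comparing pages across both machines finds one more shareable page than scanning each machine in isolation, which is the extra saving that Inter-VM TPS trades away for stronger isolation.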
2. Memory Ballooning is Mostly Non-disruptive
Memory ballooning is probably the best-known memory reclaim technique. It is mostly non-disruptive to virtual machines and lets the host claim back unused memory from them for use by other, more demanding machines.
VMware Tools must be installed for memory ballooning to work. Once a host becomes low on memory, it looks for virtual machines holding memory that is "free" but has not been released back to the host. A balloon driver inside each virtual machine inflates like a balloon, claiming memory from the guest operating system so that the "free" memory is squeezed back out to the host. The host detects how much memory has been reclaimed and makes it available to other virtual machines, with the hope that this is enough to resolve the host's low-memory condition.
Ballooning is considered non-disruptive, but if ballooning is constantly occurring on your host, you'll want to resolve this more permanently by adding more memory to the host or moving virtual machines to other hosts with enough memory.
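The balloon mechanism described above can be sketched as a small simulation. This is a simplified model under stated assumptions: the Guest class, its fields, and the reclaim function are hypothetical, and the sketch caps the balloon at memory that is free inside the guest, whereas a real balloon driver can also push the guest operating system into paging.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    """Hypothetical guest model; all sizes are in MB."""
    name: str
    configured_mb: int
    in_use_mb: int
    tools_installed: bool = True
    balloon_mb: int = 0

    @property
    def free_mb(self) -> int:
        """Memory the guest holds that is neither in use nor already ballooned."""
        return self.configured_mb - self.in_use_mb - self.balloon_mb

    def inflate(self, target_mb: int) -> int:
        """Inflate the balloon by up to target_mb and return the amount reclaimed."""
        if not self.tools_installed:
            return 0  # no balloon driver, so nothing can be inflated
        reclaimed = max(0, min(target_mb, self.free_mb))
        self.balloon_mb += reclaimed
        return reclaimed


def reclaim(guests: list, deficit_mb: int) -> int:
    """Ask each guest in turn to inflate until the host deficit is covered."""
    remaining = deficit_mb
    for guest in guests:
        if remaining <= 0:
            break
        remaining -= guest.inflate(remaining)
    return deficit_mb - remaining


guests = [
    Guest("web", configured_mb=4096, in_use_mb=1024),
    Guest("db", configured_mb=8192, in_use_mb=8192),  # uses all of its memory
    Guest("legacy", configured_mb=2048, in_use_mb=512, tools_installed=False),
]
reclaimed = reclaim(guests, deficit_mb=4000)
print(f"reclaimed {reclaimed} MB of the 4000 MB the host needs")
```

Only the "web" guest gives anything back here. The "db" guest has no free memory and the "legacy" guest has no balloon driver, so the host is still short and would have to fall back on more disruptive techniques.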
3. Memory Compression Can Save Small Amounts of Memory
If TPS and ballooning do not free up enough memory for the host, then compression commences on the virtual machine's memory. While TPS and ballooning are generally non-disruptive, compression does start to have a performance impact. This is why compression is one of the last options the host uses to claim back memory.
If memory pages can be compressed, then the host will attempt to do so and store them in a special cache on the host. Once the cache is full, older compressed pages have to be evicted from it (decompressed and swapped out to disk) before more pages can be compressed.
Compression will likely reduce performance on the virtual machines and their workloads, so it's important to balance virtual machines between hosts yourself or use DRS to do this automatically, subject to the correct vSphere licensing.
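As a rough illustration of a compression cache, the sketch below uses Python's zlib module. The page size, the rule that a page is only kept if it shrinks to half its size or less, and the CompressionCache class are assumptions made for this example rather than ESXi's actual implementation. Eviction is deliberately left out: when the cache is full, the sketch simply refuses the page.

```python
import os
import zlib

PAGE_SIZE = 4096                 # bytes; assumed page size
MAX_COMPRESSED = PAGE_SIZE // 2  # assumed rule: keep only pages that at least halve


def try_compress(page: bytes):
    """Return the compressed page if it is small enough to cache, else None."""
    packed = zlib.compress(page)
    return packed if len(packed) <= MAX_COMPRESSED else None


class CompressionCache:
    """Hypothetical fixed-capacity cache of compressed pages, keyed by page number."""

    def __init__(self, capacity_bytes: int):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.pages = {}

    def store(self, page_number: int, page: bytes) -> bool:
        packed = try_compress(page)
        if packed is None or self.used_bytes + len(packed) > self.capacity_bytes:
            return False  # incompressible or cache full: the caller must fall back
        self.pages[page_number] = packed
        self.used_bytes += len(packed)
        return True

    def load(self, page_number: int) -> bytes:
        """Remove a page from the cache and return its original contents."""
        packed = self.pages.pop(page_number)
        self.used_bytes -= len(packed)
        return zlib.decompress(packed)


cache = CompressionCache(capacity_bytes=1024)
text_page = (b"vmware " * 600)[:PAGE_SIZE]  # repetitive data compresses well
random_page = os.urandom(PAGE_SIZE)         # random data does not compress

print("text page cached:  ", cache.store(0, text_page))
print("random page cached:", cache.store(1, random_page))
```

Pages that do not compress well never enter the cache, which is one reason compression tends to recover only modest amounts of memory.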
4. Virtual Machine Memory Swapping Impacts Performance
The final option that ESXi will attempt is memory swapping, which moves memory from virtual machines to disk. When you power on a virtual machine, you'll notice that a swap file is created, sized to the virtual machine's configured memory minus any memory reservation. This is the file that ESXi will use should swapping be required for the virtual machine. Note that in-guest (operating system level) swapping is a separate mechanism handled by the guest operating system.
Because memory swapping writes memory to a physical disk, and disks are significantly slower to read from and write to than memory, any virtual machine with swapped memory will suffer a significant performance impact. This is why memory swapping is the last option that ESXi uses to reclaim memory when the host is running low.
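The size of the hypervisor swap file follows from simple arithmetic: it covers whatever part of the configured memory is not backed by a reservation. The helper below is a small sketch of that calculation; the function name and the example virtual machines are made up for illustration.

```python
def swap_file_mb(configured_mb: int, reservation_mb: int = 0) -> int:
    """Hypervisor swap file size: configured memory minus the memory reservation."""
    if not 0 <= reservation_mb <= configured_mb:
        raise ValueError("reservation must be between 0 and configured memory")
    return configured_mb - reservation_mb


# Hypothetical VMs: name -> (configured MB, reserved MB)
vms = {"web01": (4096, 0), "db01": (16384, 16384), "app01": (8192, 2048)}

for name, (configured, reserved) in vms.items():
    print(f"{name}: {swap_file_mb(configured, reserved)} MB swap file")
```

A virtual machine with all of its memory reserved ends up with an empty swap file, which is why a full reservation rules out hypervisor swapping for that machine.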
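The order in which the four techniques are tried can be sketched as a simple planning function. This only illustrates the "least disruptive first" idea from this article. The function, the technique labels, and the reclaimable amounts are hypothetical, and the thresholds a real ESXi host uses to decide when each technique starts are not modeled.

```python
# Techniques in the order described above, least disruptive first.
RECLAIM_ORDER = ["transparent page sharing", "ballooning", "compression", "swapping"]


def plan_reclaim(deficit_mb: int, reclaimable_mb: dict) -> list:
    """Return (technique, MB) steps that cover the deficit, in escalation order."""
    plan = []
    remaining = deficit_mb
    for technique in RECLAIM_ORDER:
        if remaining <= 0:
            break
        taken = min(remaining, reclaimable_mb.get(technique, 0))
        if taken > 0:
            plan.append((technique, taken))
            remaining -= taken
    return plan


# Hypothetical estimates of what each technique could recover on a host.
estimates = {
    "transparent page sharing": 200,
    "ballooning": 1000,
    "compression": 100,
    "swapping": 4096,
}

for technique, amount in plan_reclaim(1500, estimates):
    print(f"{technique}: {amount} MB")
```

With a 1500 MB shortfall, the first three techniques recover 1300 MB between them and only the last 200 MB has to be swapped. A smaller shortfall would never reach swapping at all.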
Why You Should Use Memory Ballooning
Memory ballooning is an excellent technical solution for claiming back memory from virtual machines. Inflating the balloon driver is a smart way to move memory from the virtual machine back to the host. It's beneficial because it generally only claims back memory that is free inside the virtual machine, which is why it is usually a non-disruptive memory reclaim technique.
Memory ballooning is also smart in that the hypervisor, ESXi, will try to balloon memory before trying more disruptive methods. You can also monitor ballooning activity via performance charts in the vSphere client. Ballooning is an early indicator of memory issues in your environment, so it can serve as a warning to rightsize virtual machines and rebalance them across other hosts.
Memory ballooning is overall a good technology to claim back unused memory from virtual machines so that it can be used for other important hypervisor tasks or for other virtual machines that need the memory more urgently.
Cons of Memory Ballooning
Memory ballooning, while an impressive solution, can cause issues for virtual machines that have recurring spikes in memory demand. Also, ballooning will not help if applications inside your virtual machines are using all of the allocated memory. This is commonly seen with applications such as databases.
Ballooning can also become a problem if it's relied upon too much by the hypervisor. In an ideal situation, there would be no ballooning taking place. This would indicate a healthy environment. Ballooning would only commence if there is too much demand for memory on the host, in other words when the ESXi host does not have enough free physical memory to allocate to virtual machines. If ballooning is happening all the time then a performance issue can start on the host.
Frequent ballooning operations also consume additional CPU cycles. This, in turn, could reduce the amount of physical CPU available to virtual machines on the host.
How Do I Check Memory Ballooning in VMware?
There are different ways to check if memory ballooning is in operation on your VMware ESXi hosts.
First, you can SSH into your ESXi host and run esxtop. After esxtop loads, press the m key to switch to the memory view, then the f key to select which fields are displayed. The MCTL? heading indicates whether a virtual machine has the ballooning driver installed via VMware Tools. The MCTLSZ heading shows the current amount of memory being ballooned. The MCTLMAX heading shows the maximum amount of ballooning supported on the virtual machine. If this value is zero, then the ballooning driver is unlikely to be installed, or perhaps the virtual machine's memory is fully reserved.
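If you note down these three values for a handful of virtual machines, the interpretation above can be expressed as a short helper. This is only a sketch: the function, its status strings, and the sample rows are hypothetical, and the values are assumed to have been copied by hand from the esxtop screen rather than read through any API.

```python
def balloon_status(mctl: str, mctlsz_mb: float, mctlmax_mb: float) -> str:
    """Interpret the MCTL?, MCTLSZ and MCTLMAX values for one virtual machine."""
    if mctl.strip().upper() != "Y":
        return "no balloon driver"
    if mctlmax_mb == 0:
        return "cannot balloon (memory may be fully reserved)"
    if mctlsz_mb > 0:
        return "ballooning"
    return "idle"


# Hypothetical readings: name -> (MCTL?, MCTLSZ in MB, MCTLMAX in MB)
readings = {
    "web01": ("Y", 512.0, 2662.0),
    "db01": ("Y", 0.0, 0.0),
    "legacy01": ("N", 0.0, 0.0),
    "app01": ("Y", 0.0, 5324.0),
}

for name, (mctl, size, maximum) in readings.items():
    print(f"{name}: {balloon_status(mctl, size, maximum)}")
```

Any machine reported as "ballooning" is worth a closer look, since it means the host has already been short of memory.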
You can also check for memory ballooning within the vSphere client. Select an ESXi host, open the Monitor page, and go to Performance > Advanced charts. Once the charts have loaded, change the View to memory. The chart legend shows which line represents Ballooned memory. Hopefully, this is showing a zero value for your host.
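The same check can be scripted. The sketch below assumes you already have virtual machine objects from pyVmomi (the Python SDK for the vSphere API) and that each one exposes its name and a ballooned-memory quick stat, in MB, as summary.quickStats.balloonedMemory. Connecting to vCenter is out of scope here, so the example uses stand-in objects with the same attribute layout so that it runs on its own.

```python
from types import SimpleNamespace


def ballooned_vms(vms) -> dict:
    """Map VM name to ballooned memory in MB for every VM with an inflated balloon."""
    result = {}
    for vm in vms:
        # The stat may be unset (None) for some machines; treat that as zero.
        ballooned = vm.summary.quickStats.balloonedMemory or 0
        if ballooned > 0:
            result[vm.name] = ballooned
    return result


def fake_vm(name: str, ballooned_mb):
    """Stand-in for a pyVmomi VM object, for illustration only."""
    quick_stats = SimpleNamespace(balloonedMemory=ballooned_mb)
    return SimpleNamespace(name=name, summary=SimpleNamespace(quickStats=quick_stats))


inventory = [fake_vm("web01", 512), fake_vm("db01", 0), fake_vm("off01", None)]
print(ballooned_vms(inventory))
```

In a healthy environment the returned dictionary should be empty, which matches a Ballooned memory line that sits at zero in the performance chart.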
How to Disable Memory Ballooning on Virtual Machines
To disable ballooning, you can set the virtual machine to reserve 100% of its memory. This is a better option than manually reinstalling VMware Tools with the ballooning driver unselected, because a custom Tools installation is harder to manage. In general, you wouldn't want to disable memory ballooning, but if you have some applications or virtual machines that feel the effects of ballooning, then reserving all their memory is a good option. Also, remember that reserving all memory on a virtual machine ensures that hypervisor swapping will not occur, since the memory has been reserved up front.
VMware's ESXi hypervisor does a great job of allowing memory to be overcommitted for virtual machines. This brings a significant cost saving, since many virtual machines can be deployed to a single ESXi host. When the host's memory runs out, ESXi starts its memory reclaim techniques, beginning with the least disruptive and moving through to the most disruptive options, which will cause performance issues for your workloads.
You can always disable some of the memory reclaim techniques, but make sure you understand both the impact of doing so if the host runs out of memory and the management overhead it adds.