When we discussed VMware Distributed Resource Scheduler (DRS) and Storage DRS, we saw in great detail how to configure them to optimize and balance resources by moving running VMs in our host and storage clusters. While this can be very useful, it can also cause some issues and administration complications.
You may recall that DRS uses performance and other factors in determining the best placement for your VMs. However, there are always other considerations that are hard to include in the DRS algorithm; those considerations lead some admins to avoid configuring DRS in fully automated mode to prevent it from misplacing their VMs. But those constraints and limitations can be configured using affinity and anti-affinity rules.
This may sound like a tough concept from some law or psychology textbook and many of the articles I read explain it in a very theatrical way, so our approach in this article is to use those rules in real life scenarios to achieve some practical goals.
Goal 1: VM-VM Affinity
We need to keep two VMs together on one host.
A multitier application is usually composed of multiple VMs, including a web frontend, an application server and a database backend. Those three tiers communicate heavily with each other, and if the VMs reside on different hosts, the network traffic must leave one host and cross the physical network to reach the VMs on another host. This is far from efficient, and if you have such a case you had better do something about it.
I do not have a multitier application in this DRS cluster, but I do have a back-to-back firewall setup where the Windows-based TMG sits behind the Linux-based Vyatta. Most of the traffic passing through one of those VMs also passes through the other, so I prefer to keep them together on the same host.
From the web client, go to your vCenter, then Hosts and Clusters, and manage your DRS cluster. Under the Settings tab, below Configuration, you will find “DRS Rules.”
I am sure that you cannot wait to add our first rule: a “Keep Virtual Machines Together” rule that adds the TMG and Vyatta VMs to the list of VMs that must run on the same host.
If your DRS cluster is set to manual mode, as mine is so that I can catch the effect of the new rules in action, you will notice a recommendation. When set to fully automated mode, however, DRS migrates the VMs to bring them together on its next run.
In this case, DRS chose to move the Linux-based Vyatta VM since it is much smaller than the Windows-based TMG, which makes this the least costly move that satisfies the rule.
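The “least costly move” reasoning can be illustrated with a toy model. This is only a minimal sketch, not the actual DRS algorithm: it assumes that migration cost is proportional to configured memory, and the memory sizes and the host name ESX-A are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    host: str
    memory_mb: int  # assumption: memory size stands in for migration cost


def cheapest_move_to_colocate(vms):
    """Return (vm_name, target_host) moves that put all VMs on one host.

    Toy heuristic: keep the host that already holds the most memory
    and move every other VM there.
    """
    memory_per_host = {}
    for vm in vms:
        memory_per_host[vm.host] = memory_per_host.get(vm.host, 0) + vm.memory_mb
    # sorted() makes the choice deterministic when two hosts tie.
    target = max(sorted(memory_per_host), key=lambda h: memory_per_host[h])
    return [(vm.name, target) for vm in vms if vm.host != target]


# Hypothetical sizes: the article only says Vyatta is much smaller than TMG.
tmg = VM("TMG", host="ESX-A", memory_mb=4096)
vyatta = VM("Vyatta", host="ESX-B", memory_mb=512)
moves = cheapest_move_to_colocate([tmg, vyatta])
print(moves)
```

With these made-up sizes the sketch proposes moving Vyatta next to TMG, which matches the recommendation described above.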
Goal 2: VM-VM Anti-Affinity rule
We need to keep two VMs apart on different hosts.
Your DRS-HA cluster may host an SQL cluster for high availability, or a load-balanced web farm, in the form of multiple VMs. If two SQL servers are hosted on one physical host and that host fails, you will lose both instances temporarily until VMware HA can restart them on one of the surviving physical hosts. This causes an unjustified interruption of service. Hence, it makes sense to prevent DRS from placing VMs that perform the same job on the same physical host.
For this scenario, we will create a rule that separates our domain controllers onto two different hosts.
As before, DRS generated a recommendation in reaction to our new rule: it wants to migrate DC0 to the second host to keep it away from DC1.
In this case, it chose to move DC0 to ESX-B since DC0 is the most active VM while ESX-B is the least utilized host of the cluster. In other words, vSphere DRS still optimizes for performance when providing its recommendations, but it does so within the limits set by affinity and anti-affinity rules.
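The way a rule and load balancing combine here can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how DRS is implemented: the utilization and activity figures are invented, ESX-A is a hypothetical name for the first host, and the heuristic (move the busiest VM to the least utilized eligible host) is just one plausible reading of the recommendation above.

```python
def separate_vms(placement, rule_vms, host_load, vm_activity):
    """Suggest one move that fixes an anti-affinity violation, or None.

    placement:   VM name -> host name
    rule_vms:    VMs that must run on different hosts
    host_load:   host name -> utilization between 0.0 and 1.0
    vm_activity: VM name -> relative load of the VM
    """
    by_host = {}
    for vm in rule_vms:
        by_host.setdefault(placement[vm], []).append(vm)
    for vms in by_host.values():
        if len(vms) > 1:
            # Pick the busiest VM and send it to the least utilized host
            # that holds no other VM covered by the rule.
            mover = max(vms, key=lambda v: vm_activity[v])
            free_hosts = [h for h in host_load if h not in by_host]
            if not free_hosts:
                return None  # nowhere to go: the rule cannot be satisfied
            return mover, min(free_hosts, key=lambda h: host_load[h])
    return None  # rule already satisfied


# Hypothetical numbers for illustration only.
suggestion = separate_vms(
    placement={"DC0": "ESX-A", "DC1": "ESX-A"},
    rule_vms=["DC0", "DC1"],
    host_load={"ESX-A": 0.7, "ESX-B": 0.2},
    vm_activity={"DC0": 0.6, "DC1": 0.1},
)
print(suggestion)
```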
Goal 3: VM-Host Affinity rule
We prefer to know where our vCenter is running.
It is estimated that around 60% of vCenter servers are installed as physical servers. I once asked an administrator why he chose to dedicate a full physical host to his vCenter instead of benefiting from the flexibility and better protection that installing it as a VM can provide. His answer was, “If the vCenter service fails, I don’t want to waste my time connecting to each physical host searching for my vCenter to fix it.” This is a valid concern.
However, a stronger reason to use VM-Host affinity rules is to restrict some VMs to a particular set of hosts. For example, Oracle requires its customers to license each host that may run an Oracle instance. This means that if your DRS cluster contains one Oracle VM, you may have to buy Oracle licenses for every host in your cluster. A way around this is to impose a hard limit that makes the VM stick to certain hosts, and to license only those.
Since they are hard rules, VM-Host affinity rules of this type are called “Must” (mandatory) rules, while the other type is called “Should” (preferential) rules.
Let us take a break from the theory and try to create our VM-Host affinity rule.
Oops, it seems that we cannot do that yet, as we first need to create our “VM Groups” and “Host Groups.” To do this, we need to go back to the DRS configuration and add some “DRS Groups.”
With the new web client Work in Progress feature, we can minimize the dialog and create those.
To my pleasant surprise, the web client auto filled the groups in the correct dropdown boxes as soon as I returned to the Create DRS Rules dialog.
For this case, I have changed the “Must run on hosts in group” to “Should run on hosts in group.”
What is the difference between Must and Should rules?
The difference between “Must” and “Should” VM-Host affinity rules is important because it affects the way those rules interact with DRS, DPM and, most importantly, HA.
One of the most frequently asked questions about affinity rules is:
Will affinity rules prevent my VM from restarting on a surviving host in case of an HA event like a host failure?
The answer is: It depends.
If a VM-Host affinity rule was configured as a “Must” rule, your critical Oracle database will not run if you do not have a “licensed host” for it, even if you have other free hosts in the cluster (which is exactly what Oracle wants). However, if the rule was defined as a “should” rule, HA will have no problem running the VM.
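The difference can be captured in a short sketch. This is a simplified model of the behavior described above, not VMware's implementation; the function name and host names are hypothetical.

```python
def ha_restart_hosts(surviving_hosts, rule_hosts, rule_type):
    """Hosts on which HA may restart a VM bound by a VM-Host affinity rule.

    surviving_hosts: hosts still available after the failure
    rule_hosts:      hosts in the rule's host group (e.g. licensed hosts)
    rule_type:       "must" or "should"
    """
    preferred = [h for h in surviving_hosts if h in rule_hosts]
    if rule_type == "must":
        # Hard rule: an empty list means the VM stays powered off.
        return preferred
    if rule_type == "should":
        # Soft rule: prefer the group, but fall back to any survivor.
        return preferred or list(surviving_hosts)
    raise ValueError(f"unknown rule type: {rule_type!r}")


# The only licensed host (ESX-A, hypothetical) has failed; ESX-B survives.
print(ha_restart_hosts(["ESX-B"], {"ESX-A"}, "must"))
print(ha_restart_hosts(["ESX-B"], {"ESX-A"}, "should"))
```

In this model, the “Must” case returns no candidate host at all, while the “Should” case falls back to the surviving host.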
It’s also worth mentioning that DRS and DPM will never recommend any action that violates a “Must” rule, whereas they do their best to comply with “Should” rules and violate them only when they really need to. In addition, and as we can see in the screenshots, a “Must” rule generates “priority 1” recommendations (the highest possible), while a “Should” rule generates “priority 2.”
Here are some very important tips I learned at VMworld 2011 that you should keep in mind:
- “Must” rules are always tracked by the host even after you disable DRS
- This can cause unexplained odd behavior if you are unaware of this fact
- Remove “Must” rules before disabling DRS
What about VM-VM affinity rules and HA?
HA is not even aware of those rules, so it will ignore them completely in case of HA events. If a host fails, your second SQL cluster node or your second domain controller will restart on an available host even if that host already runs the first instance.
However, if you have more hosts, DRS will correct this state as soon as it detects it. DRS runs by default every 5 minutes, and considers affinity rules when providing recommendations or acting on them.
What about Storage DRS, are there affinity rules related to it?
As with hosts in clusters, a datastore in a storage cluster may be temporarily lost or corrupted. Losing both your domain controllers at the same time because they share a datastore can be catastrophic, but you can avoid this by configuring “VM Anti-Affinity” rules.
Go to manage your datastore cluster, and click the “Settings” tab. From there, click on “Rules” to add your first storage affinity rule.
This “VM anti-affinity” rule, which I named “Keep DC0 and DC1 files on different datastores,” does exactly what its name says.
As with host-related affinity rules, Storage DRS will soon recommend an action to comply with the new rule by migrating DC1 to the second datastore in the cluster.
What about the Virtual disks of a single VM?
It’s usually preferred to store all virtual disks of a single VM in the same location, but the special needs of some applications dictate otherwise. For example, according to best practices for database servers, you should store your logs on a different disk drive than the one containing your data files.
This enables you to use your logs together with your last backup to restore the data to a specific point in time. To apply this best practice in the virtual world, one must ensure that the VM’s virtual disks are not stored together on a single datastore. This can be achieved by configuring a “VMDK anti-affinity” rule.
The Default VMDK Affinity Rule
However, if you try to create a VMDK anti-affinity rule, you may be greeted with an error message informing you of a conflict with another rule that keeps the virtual disks of this VM together on the same datastore.
This default rule can be disabled by editing the “Advanced Options” under “Storage DRS” settings of your datastore cluster.
However, disabling the default VMDK affinity rule for all VMs is not the recommended option. The default rule makes it much easier to troubleshoot and recover virtual disks, because you know they are all stored on one datastore. If you need to configure an exception for a particular VM, you can do this by adding a VM Override.
There is another exception to the default rule that does not need an override; if you specify a swap file location on the host level or the VM level, the swap file will be exempt from the default rule.
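The interplay between the cluster-wide default and a per-VM override can be modelled as follows. This is a minimal sketch for illustration only: the function names, the VM name SQL01, and the datastore and disk names are hypothetical, and the swap file exemption is deliberately left out.

```python
def keeps_vmdks_together(vm_name, cluster_default=True, overrides=None):
    """Effective rule for a VM: a per-VM override wins over the default."""
    return (overrides or {}).get(vm_name, cluster_default)


def placement_complies(disk_to_datastore, keep_together):
    """Check a VM's disk layout against its effective VMDK rule."""
    datastores = list(disk_to_datastore.values())
    distinct = len(set(datastores))
    if keep_together:
        return distinct <= 1            # affinity: all disks on one datastore
    return distinct == len(datastores)  # anti-affinity: one datastore per disk


# Hypothetical database VM with an override that disables the default rule.
overrides = {"SQL01": False}
layout = {"data.vmdk": "DS1", "logs.vmdk": "DS2"}

rule = keeps_vmdks_together("SQL01", overrides=overrides)
print(rule, placement_complies(layout, rule))
```

A VM without an override, such as DC0, would still fall under the default rule, and the same split layout would then be reported as non-compliant.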
Affinity rules are a powerful tool
With great power comes great responsibility, and affinity rules give virtualization admins the power to impose restraints on the DRS algorithms. If misused, they can severely limit your cluster’s ability to select the best resources to balance your load.
Moreover, careless virtualization admins can create conflicting rules that make life much harder. An alarm may be generated in such cases, but troubleshooting can become tedious if you have too many configured rules.
Hence, affinity rules should only be configured to implement real business needs. Otherwise, leave DRS to do what it does best: automatically optimize loads for you.
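To show what a conflict looks like, here is a small sketch that flags the most direct case: a keep-together rule and a separate rule that share at least two VMs. It is an illustrative check only, not what vCenter runs; the rule names are hypothetical, and indirect conflicts that span several rules are not detected.

```python
def find_conflicts(together_rules, separate_rules):
    """Return (together_rule, separate_rule, shared_vms) for direct conflicts.

    Both arguments map a rule name to the list of VMs the rule covers.
    Two VMs cannot be on the same host and on different hosts at once,
    so any pair of rules sharing two or more VMs can never both hold.
    """
    conflicts = []
    for t_name, t_vms in together_rules.items():
        for s_name, s_vms in separate_rules.items():
            shared = sorted(set(t_vms) & set(s_vms))
            if len(shared) >= 2:
                conflicts.append((t_name, s_name, shared))
    return conflicts


# Hypothetical rule set: the third rule contradicts the first.
together = {"Keep firewalls together": ["TMG", "Vyatta"]}
separate = {
    "Separate DCs": ["DC0", "DC1"],
    "Separate firewalls": ["TMG", "Vyatta"],
}
conflicts = find_conflicts(together, separate)
print(conflicts)
```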