A long time ago in a vSphere version far, far away, VMware introduced support for automatic space reclamation, which allowed vSphere to send UNMAP commands to a storage array so that space from deleted or moved VMs could be un-allocated (reclaimed) on the array. This was a welcome feature because block storage arrays have no visibility inside a VMFS volume, so when vSphere deletes data the array is unaware of it and the space remains allocated on the array. UNMAP was supposed to fix that: when data was deleted, vSphere would send a string of UNMAP commands to the array telling it exactly which disk blocks it could have back. This allowed thin provisioned storage arrays to maintain a much smaller capacity footprint.
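To make the visibility problem concrete, here is a minimal sketch of it as a toy model. The class names, block counts and file names are all made up for illustration and have nothing to do with how VMFS or any real array is implemented.

```python
class ThinArray:
    """Toy thin-provisioned LUN: remembers which blocks have been allocated."""

    def __init__(self):
        self.allocated = set()

    def write(self, blocks):
        # The array allocates a block the first time it is written.
        self.allocated.update(blocks)

    def unmap(self, blocks):
        # UNMAP is the only way the array learns a block is free again.
        self.allocated.difference_update(blocks)


class Datastore:
    """Toy VMFS-like volume sitting on top of the array (hypothetical)."""

    def __init__(self, array, send_unmap):
        self.array = array
        self.send_unmap = send_unmap
        self.files = {}

    def create(self, name, blocks):
        blocks = set(blocks)
        self.files[name] = blocks
        self.array.write(blocks)

    def delete(self, name):
        # The file system forgets the file; the array hears nothing
        # unless an UNMAP is sent for the freed blocks.
        blocks = self.files.pop(name)
        if self.send_unmap:
            self.array.unmap(blocks)


def footprint_after_delete(send_unmap):
    array = ThinArray()
    ds = Datastore(array, send_unmap)
    ds.create("vm1.vmdk", range(0, 100))
    ds.create("vm2.vmdk", range(100, 150))
    ds.delete("vm1.vmdk")
    return len(array.allocated)


print(footprint_after_delete(send_unmap=False))  # 150: array still holds vm1's blocks
print(footprint_after_delete(send_unmap=True))   # 50: only vm2's blocks remain
```

Without UNMAP the array's footprint never shrinks, even though the datastore considers the space free.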
Shortly after this feature was introduced in vSphere 5.0, which was released back in 2011, problems started surfacing. Because the UNMAP operation was real-time (synchronous), vSphere had to wait for a response from the array confirming that the operation was complete. In many scenarios this wasn’t a problem, but some arrays apparently had trouble completing the operation in a timely manner, which caused timeouts and disk errors in vSphere. As a result VMware quickly issued a KB article recommending that UNMAP support be disabled, and in vSphere 5.0 Update 1 they disabled it completely.
What VMware did next was introduce a manual reclamation method by adding a parameter to the vmkfstools CLI command that allowed UNMAP to be run against an array as a manual operation. While this worked, it took quite a while to execute and was very resource intensive on the array. The reason is that all the manual operation did was create a balloon file from the unused space on a VMFS volume and then send UNMAP commands to the array to reclaim all of it. The end result was that instead of reclaiming just the blocks from deleted VMs, it tried to reclaim all the remaining free space on the VMFS volume, which was terribly inefficient. You can read all about how this worked in this post I did back then.
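A rough way to see why the balloon file approach was so inefficient is to count the blocks each approach would ask the array to reclaim. This is only a sketch with invented numbers; the function names are hypothetical and do not correspond to anything in vmkfstools.

```python
def targeted_unmap(deleted_blocks):
    """Reclaim only the blocks that were actually freed by deletes."""
    return set(deleted_blocks)


def balloon_unmap(total_blocks, in_use_blocks):
    """Balloon-file style: reclaim every block that is currently free,
    whether or not the array ever had it allocated."""
    return set(range(total_blocks)) - set(in_use_blocks)


total = 1000                    # blocks in the toy volume
in_use = set(range(0, 200))     # blocks still used by live VMs
deleted = set(range(200, 250))  # blocks left behind by a deleted VM

print(len(targeted_unmap(deleted)))        # 50
print(len(balloon_unmap(total, in_use)))   # 800
```

In this example the balloon method sends reclaim requests for 800 blocks to get back the same 50 that matter.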
Since that time VMware had not figured out a way to make automatic UNMAP work again, until now. In vSphere 6.5 they have again made it an automatic operation, but not in the same way as before. What they did was kind of a compromise: instead of trying to do it all as a synchronous operation, they now schedule it and send UNMAP commands to the storage array in bursts, in an asynchronous manner. So it is truly an automatic process now, but it operates in the background, and how fast it works is based on priority levels that can be set on individual VMFS datastores.
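The general idea of "queue the work and send it in bursts" can be sketched as below. This is not VMware's scheduler, just a toy illustration: the base burst size of 8 is an arbitrary number I picked, and the 1x/2x/3x multipliers simply mirror the relative rates described in this article for the Low, Medium and High priorities.

```python
from collections import deque

# Relative send rates per priority, as described in the article (None disables UNMAP).
RATE_MULTIPLIER = {"none": 0, "low": 1, "medium": 2, "high": 3}
BASE_BURST = 8  # hypothetical number of blocks sent per tick at Low


class BackgroundReclaimer:
    """Toy asynchronous reclaimer: deletes return at once, UNMAPs go out later."""

    def __init__(self, priority="low"):
        self.priority = priority
        self.pending = deque()

    def on_delete(self, blocks):
        # Nothing waits on the array here; the blocks are just queued.
        self.pending.extend(blocks)

    def tick(self):
        # Called periodically in the background; sends at most one burst.
        budget = BASE_BURST * RATE_MULTIPLIER[self.priority]
        burst = []
        while self.pending and len(burst) < budget:
            burst.append(self.pending.popleft())
        return burst


def ticks_to_drain(priority, n_blocks):
    reclaimer = BackgroundReclaimer(priority)
    reclaimer.on_delete(range(n_blocks))
    ticks = 0
    while reclaimer.pending:
        if not reclaimer.tick():
            return None  # priority "none": nothing is ever sent
        ticks += 1
    return ticks


for level in ("low", "medium", "high", "none"):
    print(level, ticks_to_drain(level, 48))
```

With 48 queued blocks this drains in 6 ticks at Low, 3 at Medium and 2 at High, and never at None.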
Now this only works in vSphere 6.5 and only on VMFS6 datastores. VMFS5 datastores must still use the manual method, running the esxcli command to reclaim space with the balloon file method. When you create a VMFS6 datastore, the default priority is set to Low, which sends UNMAP commands to the storage array at a less frequent rate. In the vSphere Web Client you will only see the option to change this to either None or Low, with None disabling UNMAP completely. However, using the esxcli command (esxcli storage vmfs reclaim config) you can also change this setting to Medium or High, which increases the rate at which UNMAP commands are sent by 2x (Medium) and 3x (High) over the Low setting.
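If you script this, something like the following sketch can build the command lines. It only assembles and prints the commands, it does not run anything. The flag spellings (--volume-label, --reclaim-priority) and the esxcli storage vmfs unmap command for the manual method are my recollection of the ESXi 6.5 esxcli syntax and should be treated as assumptions; check them against the --help output on your own host. The datastore name Datastore01 is made up.

```python
import shlex

VALID_PRIORITIES = ("none", "low", "medium", "high")
RECLAIM_CONFIG = ["esxcli", "storage", "vmfs", "reclaim", "config"]


def reclaim_config_get(label):
    """Command to show the current reclaim settings of a VMFS6 datastore."""
    return RECLAIM_CONFIG + ["get", f"--volume-label={label}"]


def reclaim_config_set(label, priority):
    """Command to change the reclaim priority of a VMFS6 datastore."""
    if priority not in VALID_PRIORITIES:
        raise ValueError(f"unknown reclaim priority: {priority!r}")
    return RECLAIM_CONFIG + [
        "set",
        f"--volume-label={label}",
        f"--reclaim-priority={priority}",
    ]


def manual_unmap(label):
    """Command for the manual reclaim that VMFS5 datastores still need."""
    return ["esxcli", "storage", "vmfs", "unmap", f"--volume-label={label}"]


# Print the commands so they can be reviewed before being run on a host.
print(shlex.join(reclaim_config_get("Datastore01")))
print(shlex.join(reclaim_config_set("Datastore01", "medium")))
print(shlex.join(manual_unmap("Datastore01")))
```

Building the command as a list rather than a single string keeps datastore names with spaces from being split when the command is eventually passed to something like subprocess.run.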
Now why did VMware not allow you to choose Medium or High from the Web Client? There is a good reason for that: they hid those options for your own good. UNMAP is still a resource intensive operation; when you do an UNMAP operation you are literally telling the array to un-allocate millions or billions of disk blocks. When you get more aggressive with UNMAP commands, you put a heavier load on the storage array, which can seriously impact your VM workloads as the array tries to handle everything at once. Leaving this set to Low is a good compromise, as you get your disk space back automatically but with minimal impact on your VM workloads. If you do happen to set it to Medium or High via esxcli, the Web Client will still show those settings; you just can’t select them there.
So welcome back UNMAP, we missed you and are glad to have you back. Of course, if you are using VVols you don’t have to worry about UNMAP at all, as the array has VM-level visibility, knows when VMs are deleted, and can reclaim space on its own without vSphere telling it to.
Great article and explanation!
One question: Does it still only work with thin provisioned VMDKs? I prefer to have thin provisioning in only one place, the storage, so that I only have to monitor disk usage there.
It doesn’t matter whether a VM is provisioned thick or thin in vSphere; when you delete it, its space will be unmapped on the array. You don’t really need UNMAP if you are only using vSphere thin provisioning, as vSphere knows the space is un-allocated inside VMFS. The best place to use thin provisioning is on the array side, and that’s when you get the benefits of UNMAP.
Thanks for your reply!
I did not mean the case where a VM is deleted. My question is: what requirements have to be fulfilled to get disk space back on the storage system when data is deleted inside the VM (e.g. Windows 2012 R2, which can itself issue UNMAP and give that disk space back to the VMFS volume)? What I have read so far for vSphere 6.0 is that the VMDK must be thin provisioned for this to work.
Could you please comment on the default “space reclamation granularity” in VMware? Is it the same as the reclaim unit size that you mentioned in an earlier article? What would be the best setting for the granularity and the reclaim unit size?