As I’ve been learning more about VMware Virtual Volumes (VVols) and the impact they will have on everyday storage operations in vSphere, one thing I’ve been trying to find out is how VVols will affect backups. In this post I’ll focus on two areas related to backups in a VVol environment: backup transport mechanisms and backup snapshots.
How VVols impact backup transport methods
There are several methods you can use to back up your virtual machines using software from vendors like Veeam, Unitrends and Symantec. The first is the traditional method of backing up with an agent inside the guest OS; this one is generally not recommended because it is not very efficient in a virtual environment. The next is the Hot Add method, which allows a VM running backup software to hot add another VM’s virtual disk to itself. This gives the backup VM direct access to the virtual disk so it can be backed up without impacting the source VM. Then there is the LAN (NBD) method, where an ESXi host reads VM data from storage and sends it across the network to the backup server; this method uses the network stack instead of the storage stack, so it is not as efficient.
Finally there is the Direct to SAN method, which requires a backup application or proxy running on a physical server that has direct access to the SAN where your VMFS datastores reside, as shown below.
The reason for this is that with VVols, VMs do not reside on LUNs overlaid with a file system (VMFS); instead VMs are packaged into VVols and stored directly on a storage array inside Storage Containers (a logical entity). ESXi hosts then access VMs via a Protocol Endpoint (PE) that resides within the array; the PE is essentially a special LUN that has conglomerate status (an admin LUN). The PE binds VVols to a host using secondary LUN IDs (sub-LUNs) that are assigned to each VVol and reported back to the ESXi host via the VASA Provider, as shown below.
How VVols impact backup snapshots
Before a virtual machine can be backed up using the methods described above, a snapshot of the VM must be taken in vSphere. This freezes the VM at a point in time so it can be backed up without any changes (writes) occurring while the backup runs. Once the backup is finished the snapshot is discarded; this process (creation and deletion of the snapshot) is controlled by the backup application at the beginning and end of the VM backup.
When you take a VM snapshot in vSphere, the VM is briefly stunned and a separate delta virtual disk is created to hold any disk writes that occur within the VM while the snapshot is active. The original virtual disk becomes Read-Only, and all new writes that occur while the snapshot is active are redirected to the delta virtual disk, which is Read-Write. If an additional snapshot is taken, the previous delta virtual disk becomes Read-Only and a new delta virtual disk is created that becomes Read-Write. Once you no longer need a snapshot and delete it, all of the changes that occurred while the snapshot was active need to be merged back (committed) into the original virtual disk from the delta virtual disk. Once that operation completes, the delta virtual disk files that were created are deleted and the original disk becomes Read-Write again.
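To make the redirect-then-commit behavior concrete, here is a minimal toy model in Python. It is only a sketch of the idea described above, not how ESXi actually implements snapshots; the class name `RedoLogDisk` and the block-dictionary representation are made up for illustration.

```python
class RedoLogDisk:
    """Toy model of a VMFS-style snapshot: a base disk plus delta disks.

    Hypothetical illustration only; blocks are modeled as dict entries.
    """

    def __init__(self, blocks):
        self.base = dict(blocks)  # block number -> data
        self.deltas = []          # one dict per active snapshot; the last is read-write

    def take_snapshot(self):
        # The current top layer is frozen; new writes go to a fresh delta disk.
        self.deltas.append({})

    def write(self, block, data):
        # Writes land in the newest delta if a snapshot is active, else in the base.
        target = self.deltas[-1] if self.deltas else self.base
        target[block] = data

    def read(self, block):
        # The newest delta that holds the block wins; otherwise read the base disk.
        for delta in reversed(self.deltas):
            if block in delta:
                return delta[block]
        return self.base[block]

    def delete_snapshot(self):
        """Commit the oldest delta into the base disk; return the blocks merged."""
        delta = self.deltas.pop(0)
        self.base.update(delta)
        return len(delta)


disk = RedoLogDisk({0: "a", 1: "b", 2: "c"})
disk.take_snapshot()
disk.write(1, "B")
disk.write(2, "C")
merged = disk.delete_snapshot()  # every block written during the snapshot is merged
```

The point to notice is that the work done in `delete_snapshot` grows with the number of blocks written while the snapshot was active, which is why the commit gets slower the longer a snapshot lives.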
This commit process can be time consuming, depending on how long a snapshot is active and the number of writes that occur while it is active. If you have a very write-intensive application running inside the VM and the snapshot is active for a long time (days or weeks), the commit process can take hours to complete.
With VVols the whole VM snapshot process changes dramatically: a snapshot taken in vSphere is not performed by vSphere but is instead created and managed on the storage array. The process is similar in that separate delta files are still created, but those files are array-based VVol snapshots and, more importantly, what happens while they are active is reversed. When a snapshot of a VM on VVol-based storage is initiated in vSphere, a delta VVol is created for each virtual disk the VM has, but the original disk remains Read-Write and the delta VVols instead hold any disk blocks that were changed while the snapshot is active. The delta VVols are all Read-Only, as they are simply storing changed disk blocks, while the original disk remains Read-Write, as illustrated in the short video below.
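The reversed behavior can be sketched with another small Python toy model. One assumption to call out loudly: this sketch assumes the array preserves the pre-change contents of each overwritten block in the snapshot, which is one common way arrays implement snapshots; this post does not specify what any particular array does internally. The class name `VVolDisk` and its methods are hypothetical.

```python
class VVolDisk:
    """Toy model of a VVol-style snapshot: the base disk always stays read-write.

    Hypothetical illustration; assumes the snapshot keeps the original contents
    of any block that is overwritten while the snapshot exists.
    """

    def __init__(self, blocks):
        self.base = dict(blocks)  # block number -> data, always holds current data
        self.snapshots = []       # each dict holds preserved pre-change blocks

    def take_snapshot(self):
        self.snapshots.append({})

    def write(self, block, data):
        # Preserve the old contents once per snapshot, then write to the base.
        for snap in self.snapshots:
            snap.setdefault(block, self.base.get(block))
        self.base[block] = data

    def read_snapshot(self, index, block):
        # Point-in-time view: preserved block if it changed, else the base block.
        snap = self.snapshots[index]
        return snap[block] if block in snap else self.base[block]

    def delete_snapshot(self, index):
        """Discard the snapshot; nothing is merged. Returns the blocks copied (0)."""
        del self.snapshots[index]
        return 0


disk = VVolDisk({0: "a", 1: "b"})
disk.take_snapshot()
disk.write(1, "B")                    # base is updated in place
old_value = disk.read_snapshot(0, 1)  # the snapshot still sees the old block
merged = disk.delete_snapshot(0)      # no commit work at all
```

In this model the base disk already contains the current data, so deleting a snapshot is just a matter of throwing the preserved blocks away.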
The big change occurs when we delete a snapshot. With VVols, because the original disk is Read-Write, we can simply discard the delta VVols and there is no data to commit back into the original disk. This process can take milliseconds, compared to the minutes or hours needed to commit a snapshot on VMFS datastores. How does this impact your backups? A backup still has to take a snapshot of the VM, but the backup application no longer has to sit around waiting at the end of the backup for changes to commit while the snapshot is deleted. Depending on the size of your VM and how much change occurs within the VM while the backup is running, VVols can reduce your backup times by anywhere from seconds to minutes or more per VM. Multiply this by dozens or hundreds of VMs and you can really reduce your backup window by a good amount of time.
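As a rough back-of-the-envelope illustration of that multiplication, the sketch below estimates the wall-clock time saved across a backup job. The function name, the example numbers, and the simplifying assumptions (every VM saves the same commit time, and backups run in fixed-size waves) are all hypothetical and not taken from any measurement.

```python
import math


def window_saved_seconds(vm_count, commit_seconds_per_vm, concurrent_backups):
    """Estimate wall-clock seconds saved when snapshot deletion needs no commit.

    Assumes (hypothetically) that every VM saves the same commit time and that
    backups run in waves of `concurrent_backups` VMs at a time.
    """
    waves = math.ceil(vm_count / concurrent_backups)
    return waves * commit_seconds_per_vm


# Made-up example: 200 VMs, 90 seconds of commit time avoided per VM,
# 4 backups running at once -> 50 waves * 90 s = 4500 s (75 minutes).
saved = window_saved_seconds(vm_count=200, commit_seconds_per_vm=90, concurrent_backups=4)
```

Real savings will vary with the change rate of each VM and with how your backup software schedules jobs, so treat this as an order-of-magnitude estimate only.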
To validate this, Symantec ran benchmarks comparing a group of VMs being backed up on a VMFS datastore against the same VMs being backed up on a VVol Storage Container. The following is a summary of their results from the VMworld session they presented on this topic (STO5844 – Benchmark Testing: Making Backups Better Than Ever Using Virtual Volumes).
Their environment consisted of NetBackup 7.7 with a 3PAR 7200c storage array that had 24 1.2TB 10K disks. They did a comparison using 60 VMs with 100GB virtual disks, 40 of them powered on. They simulated a 10% data change rate inside the VMs while the backups were running; they first tested with VMFS, then wiped the array and configured it for VVols. Their testing focused on the time it took to create VM snapshots when the backups started and the time it took to delete and consolidate them when the backups finished.
From their testing they found that overall backup times were reduced by around 30%, as seen below on the slides from their VMworld session. Snapshot creation took a few seconds longer with a VVol-based snapshot, but snapshot deletion time was dramatically reduced. They tested this using different numbers of simultaneously running backups with consistent results. They also found that snapshot errors that sometimes occur during the delete process were virtually eliminated. The net effect of this much-improved snapshot mechanism with VVols can be a much more efficient and shorter backup operation.
Source: Symantec, VMworld 2015 session ID STO5844
Looking beyond backups, the new VVol-based snapshot mechanism will be a great time and resource saver in any vSphere environment. Of all the VM snapshots that we take for whatever reason, how many do we actually use to revert back to a point in time? Very few, I would say. Typically we create VM snapshots for insurance purposes and never really end up using them. Not having that long, resource-intensive commit process running on your ESXi host, along with the extra resource consumption required to maintain snapshots on VMFS, means less burden on the host and more resources for your VM workloads. The storage array is much better equipped to do snapshots, and the shift to move them off the host and onto the storage array with VVols is a great benefit, not just for backups but for any use case where you might use snapshots in your vSphere environment.