Does virtualization really reduce the risk of data loss?


I recently read an article on ZDNet about data protection in virtual environments that made the following statement:

“According to a survey conducted by data recovery vendor Kroll Ontrack, 80 percent of respondents did not believe they were at risk or believed they would reduce the risk of data loss when they stored data in a virtual environment.”

Did I read that right?

80 percent of the 724 people who were polled at VMworld think virtualization either reduces the risk of data loss or carries no risk at all?



I’d like to know who these people are and why they feel that way. Virtualization brings a lot of benefits to the data center, but magically protecting data isn’t one of them that I know of. If anything, you could argue that virtualization increases the risk of data loss: storage becomes a single point of failure, and when failures occur they have a big impact. In a traditional data center, your servers and storage are widely distributed. You have a lot of individual physical servers that each run a single application, and that application typically stores its data on the server’s local hard disk. Sure, you might have a SAN for user data and databases, but a big part of your applications and data is scattered across many servers. If a single server fails, it only impacts that server and not the rest of your environment.

With virtualization you move to a centralized storage model for everything. You still have physical servers that run your VMs, but most of those VMs are stored on a SAN that serves the whole virtual environment. With this model, the failure of a physical server is typically no big deal, as none of your VM data is stored on the physical server; it all resides on the SAN. When a host fails you may lose a tiny bit of data from any applications that were running and hadn’t written their data to disk yet, but the VM starts right up on another host and carries on from whatever made it to disk. Now if your storage fails, you’re going to be in a world of hurt. VMware features like HA & FT only protect against host failures; when a storage array fails, all the VMs that reside on it go down.
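To make the difference between the two failure domains concrete, here is a minimal Python sketch. The inventory, the host and datastore names, and the `outcome` and `summarize` helpers are all made up for illustration, and it assumes HA is configured with enough spare capacity to restart every VM from a failed host.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    host: str        # physical server currently running the VM
    datastore: str   # shared storage that holds the VM's disks


# Hypothetical inventory: six VMs spread over three hosts, all on one array.
INVENTORY = [
    VM("web01", "host-a", "san-1"),
    VM("web02", "host-a", "san-1"),
    VM("db01", "host-b", "san-1"),
    VM("mail01", "host-b", "san-1"),
    VM("file01", "host-c", "san-1"),
    VM("app01", "host-c", "san-1"),
]


def outcome(vm: VM, failed_component: str) -> str:
    """Describe what happens to one VM when a host or datastore fails."""
    if failed_component == vm.datastore:
        return "down until storage is restored"
    if failed_component == vm.host:
        # Assumes HA can restart the VM elsewhere.
        return "restarted on a surviving host"
    return "unaffected"


def summarize(failed_component: str) -> dict:
    """Count VMs by outcome for a single failed component."""
    return dict(Counter(outcome(vm, failed_component) for vm in INVENTORY))


print("host-a fails:", summarize("host-a"))
print("san-1 fails: ", summarize("san-1"))
```

In this toy inventory, losing one host touches a third of the VMs and they all come back. Losing the array takes out every VM at once, and nothing on the host side can bring them back.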

You hear the phrase “all your eggs in one basket” applied to hosts because each one runs multiple VMs, but at least you have a lot of baskets in your virtual environment. When it comes to storage you truly have all your eggs in one basket, as a single storage array services many hosts. So when storage fails, it has a huge impact and greatly amplifies the risk of data loss, both in the short term and the longer term. Think about it: if I have 200 VMs running on a storage array and it goes down, that’s 200 applications that suddenly had the lights turned out on them, and whatever they were doing at the time is lost. Now think about a catastrophic storage failure where you have to recover your whole environment from the previous night’s backup. Multiply up to a day’s worth of changes by 200 VMs and that’s a lot of data loss.
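Here is that back-of-the-envelope math as a rough Python sketch. The 200 VMs come from the example above. The 12-hour gap since the last backup and the `workload_hours_at_risk` helper are assumptions picked purely for illustration.

```python
def workload_hours_at_risk(workloads_affected: int, hours_since_last_backup: float) -> float:
    """Hours of work lost across all affected workloads after restoring from backup."""
    return workloads_affected * hours_since_last_backup


# Assumed figures, for illustration only.
HOURS_SINCE_BACKUP = 12.0  # failure hits midway between nightly backups
VMS_ON_ARRAY = 200         # the 200-VM array from the example above

# Traditional data center: one physical server with local disk fails.
one_physical_server = workload_hours_at_risk(1, HOURS_SINCE_BACKUP)

# Virtual environment: the shared array behind all 200 VMs fails.
whole_array = workload_hours_at_risk(VMS_ON_ARRAY, HOURS_SINCE_BACKUP)

print(f"Single physical server lost: {one_physical_server:.0f} workload-hours")
print(f"Shared storage array lost:   {whole_array:.0f} workload-hours")
```

The per-workload loss is the same in both cases. What changes is how many workloads take the hit at the same moment.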

I can understand how some people might gain a false sense of security after virtualizing as they enjoy the cool new things that virtualization lets them do. Their previously rigid physical servers gain superpowers by becoming encapsulated into VMs, which gives them the mobility to zip across hosts and datastores while running. They also now have the cool HA & FT features, which mean they can sleep better at night and not have to head to the office at 2am to get the Exchange server back up and running after a hardware failure. But to think that their VMs are now much safer after virtualizing is just nuts. And it’s not just hardware failures that can cause big data loss in a virtual environment; there are a lot of other things that can do it as well. With a few simple clicks of a mouse, someone can delete a datastore and a lot of VMs, or change a setting and shut down your whole environment. These are things that you don’t have to worry about in a traditional data center.

Understand this: when you virtualize, you’re not in Kansas anymore. When things happen in virtual environments they can have huge impacts. Sure, you can help mitigate the risk, but the fact remains: shit happens, and when it does you can end up covered in it.

So the moral of this story is:

Virtualization does not make your data any safer

So if you’re one of the people who took that survey and think your data is much safer because you virtualized, you’d better think again. Next year, be sure to stop by one of the backup vendors like Veeam or Unitrends, and hopefully you’ll learn about the realities of data protection in virtual environments. And it’s not just about protecting data through backup methods: you can implement features like stretched storage clusters (vMSC) to protect against storage failures, and SRM to provide more traditional off-site recovery options.

So stay safe out there and enjoy all the great benefits that virtualization provides, but be smart: make sure you understand the impact that virtualization has on data protection and what you need to do to keep your data safe.


1 comment

    • Seb on October 14, 2013 at 4:41 am

    Although I agree with you, I think that’s not entirely true: SAN environments are still more reliable than local storage. And since backup plans have evolved for virtualization, the same people have spent money to get their backup environment straight (most of them were not so straight before).
    And don’t forget that “before virtualization” also means “before you had to buy a new storage environment, which is safer today than it was some years ago”.
    So I think that maybe yes, for a whole lot of IT people, virtualization indirectly did something for the data loss problem.
