It is essential to understand how the VMware Virtual Machines Storage view differs from the VMware Virtual Datastore view in order to model sizing more efficiently, understand what each number means, and know how it can be assessed for a specific need (migrating to the cloud, moving to a different storage array, etc.).
In VMware infrastructure there are 2 key objects in terms of storage:
- virtual disks (connected to VMs)
- datastores (repositories for virtual machine files)
The VMware Virtual Machines Storage view provides an overview of VM storage and is essentially an aggregation over the application metrics of the Virtual Disks that are associated with VMs.
The VMware Virtual Datastore view provides an overview of the repositories (either VMFS or NFS) for virtual machine files, including log files, scripts, configuration files, virtual disks, and so on.
VMware Virtual Datastore view
Total Capacity - the formatted size of a volume. The size is calculated after the volume has been formatted and indicates how much data could potentially be stored on the datastore.
Used Capacity - the total capacity that is consumed across all datastores. It is affected by the technologies of the storage array (which might not be local). Below are 2 examples:
- it could be an NFS datastore where you store 30 files of 1 TB each. If the remote storage (e.g. NetApp) has deduplication or compression enabled, the actual consumed size is going to be less than 30 TB (e.g. 27.9 TB). Thus, this section might report lower usage than the actual file sizes.
- on the other hand, if the data is stored on a local disk where no storage-saving techniques are used, the Used Capacity will be exactly the same size as the actual data files.
Summing it up:
- if deduplication or compression is enabled - Used Capacity can (potentially) be less than the actual data that is stored
- if there are no saving mechanisms - Used Capacity is essentially the sum of all files located on the datastore
The files counted in Used Capacity do not necessarily need to be associated with VMs. Used Capacity can be less than VM Storage if the storage has deduplication enabled or all files are compressed.
For instance, if the plan is to:
- migrate to the same kind of storage array (but of a bigger size) - Used Capacity (Datastore) will adequately show how much storage is needed (provided the same data will be stored with a similar impact on the storage system, i.e. a similar type of storage saving)
- migrate to a different sort of back-end system (e.g. from an old Dell to new NetApp Filers) - you need to decide whether these numbers are appropriate. First, find out whether the new device has the same storage-saving capabilities as the old one. If they differ, the raw numbers can still be used, but usage will probably go down if the new system brings additional space-saving technology
- migrate to the cloud - Used Capacity will show how much data will need to be transferred
- understand how much data is saved on the storage array - the Used Capacity of VM Storage and of the datastore can be compared. For example, if:
- VM Storage = 30 TB, Datastore = 15 TB. Space saving = 50%
- VM Storage = 30 TB, Datastore = 25 TB. Space saving ≈ 17%
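The comparison above is simple arithmetic and can be sketched as follows. This is only an illustration: the function name is hypothetical, and it assumes that "space saving" means the share of VM Storage Used Capacity that does not show up as datastore Used Capacity.

```python
def space_saving_percent(vm_storage_tb: float, datastore_used_tb: float) -> float:
    """Percentage of VM Storage usage that is not consumed on the datastore.

    Assumption: saving = (VM Storage - Datastore) / VM Storage.
    """
    if vm_storage_tb <= 0:
        raise ValueError("vm_storage_tb must be positive")
    return (vm_storage_tb - datastore_used_tb) / vm_storage_tb * 100.0


# The two cases from the list above.
for vm_tb, ds_tb in [(30, 15), (30, 25)]:
    saving = space_saving_percent(vm_tb, ds_tb)
    print(f"VM Storage = {vm_tb} TB, Datastore = {ds_tb} TB, saving = {saving:.1f}%")
```

A negative result would mean the datastore holds more data than the VMs account for, for example files that are not associated with any VM.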
Provisioned Capacity - is not used for sizing. It is used for calculating how much space VMware estimates for the growth of the infrastructure in its current configuration. For instance:
- VM has 1 thin-provisioned disk of 40 GB
- it is currently filled with 20 GB of data
- since the disk can potentially grow by up to 20 GB more, Provisioned Capacity will report 20 GB.
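The numbers in this example can be reproduced with a minimal sketch. The function name is hypothetical, and the code only restates the arithmetic of the example (allocated size minus current usage); it does not claim to be how VMware or the Lanamark Portal computes Provisioned Capacity internally.

```python
def thin_disk_growth_gb(allocated_gb: float, used_gb: float) -> float:
    """Space a thin-provisioned disk can still grow into (hypothetical helper)."""
    if used_gb > allocated_gb:
        raise ValueError("used_gb cannot exceed allocated_gb")
    return allocated_gb - used_gb


# The 40 GB thin-provisioned disk from the example, currently holding 20 GB.
print(thin_disk_growth_gb(allocated_gb=40, used_gb=20))
```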
The difference between Used and Provisioned (Datastore) Capacities shows how large the infrastructure can grow if it is pushed up to 100%.
It is an estimated upper bound of how far the infrastructure will grow if nothing changes in the current infrastructure, and it should be used for the worst-case scenario. It is a total capacity with all files included. If a VM has a number of snapshots, only the current snapshot and delta file sizes will contribute to this section.
VMware Virtual Machines Storage view
Snapshot/Delta - these files are related to VM drive files. Usually they are not large, but they still contribute to both Used and Provisioned Capacities. Snapshot and Delta files are architected in the same way but use 2 different technologies.
- When a snapshot of a disk is taken, the current state of the disk is retained. A delta file is then created with a size of 0 KB; it is reserved for future writes. If another snapshot is taken, the current delta file is retained and a new delta file is created. So the snapshot tree will consist of a base file and a sequence of delta files.
- A snapshot file is also created - it contains the current state of the memory and reaches the size of the VM memory. For instance, if a VM has 16 GB of RAM, the snapshot file will reach a size of 16 GB.
So, the combined size of both snapshot and delta files is reported in this section.
Delta files are usually used with snapshots. But they can also be used together with templates (base drives) in VDI infrastructures. The base drive is a “golden image” that is always persistent. All changes are written only to the delta files, which are always thin-provisioned.
For instance, there might be a base disk (“golden image”) of 50 GB (which cannot be rewritten) and a delta file of up to 200 GB reserved for the VM's own needs. This delta disk could use the full 200 GB (the situation when the whole disk is rewritten). If this happens, the Provisioned Capacity would be 250 GB. Used Capacity for each VM would be 50 GB (“golden image”) + the difference file (what has been changed, e.g. 10 GB) = 60 GB. If we have 100 identical VMs of this kind, we will end up with:
- VM disks view: the total usage will be 6000 GB (60 x 100), although VMware should take the shared file into account only once.
- Datastore view: the shared file will be accounted for only once. A huge discrepancy between the two views will result.
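The size of that discrepancy can be estimated with a short sketch. It assumes that every VM has its own delta file, that all files live on the same datastore, and that the array applies no further deduplication or compression; the function name is hypothetical.

```python
def golden_image_usage_gb(base_gb: float, delta_gb: float, vm_count: int) -> tuple[float, float]:
    """Return (VM disks view total, datastore view total) in GB.

    VM disks view: the shared base disk is counted once per VM.
    Datastore view: the shared base disk is counted once in total.
    """
    vm_disks_view = (base_gb + delta_gb) * vm_count
    datastore_view = base_gb + delta_gb * vm_count
    return vm_disks_view, datastore_view


# 100 identical VMs sharing a 50 GB golden image, 10 GB of changes each.
vm_view, ds_view = golden_image_usage_gb(base_gb=50, delta_gb=10, vm_count=100)
print(f"VM disks view: {vm_view} GB, Datastore view: {ds_view} GB")
```

Under these assumptions the datastore would hold 1050 GB (50 + 10 x 100) while the VM disks view adds up to 6000 GB.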
Total Capacity - this is the sum of all Virtual Disks. If there are 30 disks of 1.1 TB each, they will contribute 33 TB to Total Capacity, no matter how they are provisioned.
Used Capacity - includes all VM files (disk files, delta files, snapshots, etc.). Used Capacity can easily be more than Total Capacity.
- If all disks are thick-provisioned, Used > Total (guaranteed)
- If all disks are thin-provisioned, Used < Total (possibly).
If a VM has a number of snapshots, all snapshot and delta file sizes will contribute to this section.
For instance:
- VM with a disk of 1 TB
- only 100 GB is used
- Lanamark Portal will report 100 GB.
Another example:
- VM has 16 GB of storage allocated
- on the datastore, there is a vmdk disk of 4.5 GB
- and other VM files (including swap and log files) of (roughly) 2.0 GB
- Lanamark Portal will report it as "6.5 GiB (36%) used of 18.2 GiB total allocated capacity"
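The reported line in the last example can be reproduced from its parts. This is a sketch only: the function name and the way the string is assembled are assumptions made for illustration, not the Portal's actual implementation, and the 18.2 GiB total is taken directly from the quoted report.

```python
def usage_summary(used_gib: float, allocated_gib: float) -> str:
    """Format used vs. allocated capacity in the style of the quoted report."""
    percent = used_gib / allocated_gib * 100
    return (f"{used_gib:.1f} GiB ({percent:.0f}%) used of "
            f"{allocated_gib:.1f} GiB total allocated capacity")


vmdk_gib = 4.5         # virtual disk file on the datastore
other_files_gib = 2.0  # swap, log and other VM files (rough figure)

print(usage_summary(vmdk_gib + other_files_gib, 18.2))
```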
When planning a migration of VMs to the cloud, this Used Capacity should be used as the reference, because swap files as well as log files will not be counted. Usually, the bulk of the space is taken by VM disks and associated snapshots. If there is a requirement to calculate the size of the VMs that will be allocated in the cloud, the following formula can be used: Used - Snapshots/Delta.
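Applied to a list of VMs, the formula can be sketched like this. The VM names and sizes below are made up for illustration, and the function name is hypothetical.

```python
def cloud_allocation_gb(used_gb: float, snapshot_delta_gb: float) -> float:
    """Size to allocate in the cloud: Used - Snapshots/Delta."""
    if snapshot_delta_gb > used_gb:
        raise ValueError("Snapshots/Delta cannot exceed Used Capacity")
    return used_gb - snapshot_delta_gb


# Hypothetical VMs: (name, Used Capacity in GB, Snapshots/Delta in GB).
vms = [
    ("vm-a", 120.0, 20.0),
    ("vm-b", 60.0, 0.0),
    ("vm-c", 300.0, 45.0),
]

total_gb = sum(cloud_allocation_gb(used, snap) for _, used, snap in vms)
print(f"Total to allocate in the cloud: {total_gb} GB")
```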