When managing storage extension, both the distance between the physical resources and the effects of VM migration must be addressed to provide business continuity and DA. The maximum distance is determined by the latency that the framework can tolerate without degrading performance.
VMs can be migrated manually for DA, or dynamically (e.g., with VMware Distributed Resource Scheduler) in a cloud environment. VM migration should occur transparently, without disrupting existing sessions.
VMs use dedicated storage volumes (LUN ID) provisioned from a SAN or a NAS disk array. These storage volumes cannot be replaced dynamically without affecting the active application and stateful sessions.
For VMs to move from one physical host to another in a stateful mode, they need to keep the same storage. This requirement is not a concern when the VM moves from one host to another within the same physical PoD, or between PoDs inside the same physical data center, because the distances between hosts and storage disks are very short. The storage is therefore provisioned in shared mode, with the same physical volumes accessible by any host.
Shared storage means that, during and after the migration of a VM between two hosts, the operating system remains attached to the same physical LUN ID.
However, this shared mode of operating storage may have an adverse effect in a DCI environment because of the long distance between hardware components. Depending on the rate of transactions per second (TPS) and on how much I/O the application itself consumes (e.g., a database), there is a risk of degrading application performance beyond 50 km (1 ms latency in synchronous mode). When a VM has moved to a remote site, by default it continues to read and write data stored on its original physical volume.
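The 50 km / 1 ms figure above can be reproduced with simple arithmetic. The following is a minimal sketch, not a sizing tool. It assumes a one-way propagation delay of about 5 microseconds per km of fiber and that a synchronous write needs two round trips across the link before it completes. The function names, the `round_trips` parameter, and the `local_service_ms` workload parameter are all hypothetical and introduced here for illustration only.

```python
# Assumption: light propagates through optical fiber at about 5 microseconds per km, one way.
FIBER_US_PER_KM = 5.0


def sync_write_latency_ms(distance_km, round_trips=2):
    """Propagation delay added to one synchronous write, in milliseconds.

    round_trips is an assumption: 2 models a write that needs two exchanges
    across the inter-site link before it completes.
    """
    one_way_us = distance_km * FIBER_US_PER_KM
    return round_trips * 2 * one_way_us / 1000.0


def max_distance_km(latency_budget_ms, round_trips=2):
    """Largest site separation that keeps the added write latency within the budget."""
    return latency_budget_ms * 1000.0 / (round_trips * 2 * FIBER_US_PER_KM)


def serial_writes_per_second(distance_km, local_service_ms, round_trips=2):
    """Upper bound on back-to-back writes from one thread (hypothetical workload)."""
    added_ms = sync_write_latency_ms(distance_km, round_trips)
    return 1000.0 / (local_service_ms + added_ms)


if __name__ == "__main__":
    print(sync_write_latency_ms(50))                           # added latency at 50 km
    print(max_distance_km(1.0, round_trips=1))                 # reach with one round trip
    print(serial_writes_per_second(50, local_service_ms=1.0))  # serialized write rate
```

Under these assumptions, 50 km adds about 1 ms to every synchronous write, and halving the number of round trips doubles the distance that fits within the same latency budget. A workload that issues writes one after another is affected most, because each write waits for the previous one to complete.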
Several storage services can be enabled to compensate for this behavior:
Cisco IOA: Cisco provides an I/O Acceleration (IOA) function on Cisco MDS 9000 Series Fabric Switches. IOA can halve the latency for synchronous write replication, thus doubling the distance between two sites for the same latency.
In partnership with ecosystem partners NetApp and EMC, Cisco and VMware have tested and qualified two types of storage services that mitigate the impact of remote I/O caused by VM mobility between sites.
FlexCache from NetApp: This feature provides a local storage cache (i.e., in the secondary DC) of data that has previously been read from the original disk. Any read command for data already cached locally does not have to cross the long distance between the two sites, so read latency is negligible, although the original volume still physically resides in the primary data center (shared storage). As a result, current stateful sessions retain their active state during and after VM migration without being disturbed. FlexCache operates in a NAS environment. Data is still written to a single location at the original site, so this mode of storage remains shared.
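The read-caching behavior described above can be illustrated with a toy model. This is a simplified sketch and not NetApp's implementation or API: the class names, the inter-site counters, and the rule of dropping the cached copy on write are all assumptions made for illustration. It shows only that repeated reads stay local while every write still travels to the single copy at the original site.

```python
class OriginVolume:
    """Stands in for the volume at the primary site (the single writable copy)."""

    def __init__(self):
        self.blocks = {}
        self.inter_site_reads = 0   # reads that had to cross between sites
        self.inter_site_writes = 0  # writes that had to cross between sites


class RemoteReadCache:
    """Hypothetical read cache at the secondary site, in front of the origin."""

    def __init__(self, origin):
        self.origin = origin
        self.local = {}

    def read(self, key):
        if key in self.local:
            return self.local[key]          # served locally, no inter-site trip
        self.origin.inter_site_reads += 1   # first read is fetched from the origin
        value = self.origin.blocks[key]
        self.local[key] = value
        return value

    def write(self, key, value):
        self.origin.inter_site_writes += 1  # data is always written at the origin
        self.origin.blocks[key] = value
        self.local.pop(key, None)           # simplification: drop the stale copy


if __name__ == "__main__":
    origin = OriginVolume()
    origin.blocks["file-a"] = "v1"
    cache = RemoteReadCache(origin)
    cache.read("file-a")
    cache.read("file-a")
    cache.write("file-a", "v2")
    print(origin.inter_site_reads, origin.inter_site_writes)
```

In this model, the second read of the same data does not increment the inter-site read counter, which is the property that keeps read latency low for a VM running at the secondary site.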
VPLEX Metro from EMC: This feature allows users to create a virtual volume that is distributed between two remote sites. Both sites can synchronously present the same information. The volume is created and shared between two VPLEX clusters connected via an extended Fibre Channel link running in synchronous mode. The data is replicated and synchronized between the VPLEX devices over a dedicated FC link.
The initiator (host) writes data to the same virtual LUN ID, which is available on both sites at the same time. This technology replicates the same SCSI parameter settings on both storage targets (VPLEX), making the change of physical volume transparent to the hypervisor.
The distance between the two cluster members of VPLEX Metro should not exceed 100 km because replication runs in synchronous mode. Synchronous mode is required to maintain the transparency of this service.
This function works in a SAN environment and the storage mode is therefore known as Active/Active.
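The contrast with the shared mode can be sketched with a second toy model of a distributed virtual volume. This is not EMC's implementation: the class and method names are hypothetical, and failure handling, locking, and cache coherence are deliberately left out. The sketch only illustrates that a write completes after both sites hold the data, and that a host at either site then reads from its local copy.

```python
class SiteLeg:
    """Hypothetical physical copy of the volume at one site."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}


class DistributedVolume:
    """Toy model of one virtual volume presented identically at two sites."""

    def __init__(self, leg_a, leg_b):
        self.legs = {leg_a.name: leg_a, leg_b.name: leg_b}
        self.replicated_writes = 0

    def write(self, site, lba, data):
        # Synchronous mode: update every leg before acknowledging the host.
        if site not in self.legs:
            raise KeyError(site)
        for leg in self.legs.values():
            leg.blocks[lba] = data
        self.replicated_writes += 1  # each write crosses the inter-site link once
        return True                  # acknowledgment returned to the initiator

    def read(self, site, lba):
        # Reads are served by the leg local to the host issuing them.
        return self.legs[site].blocks[lba]


if __name__ == "__main__":
    volume = DistributedVolume(SiteLeg("dc1"), SiteLeg("dc2"))
    volume.write("dc1", 0, b"payload")
    print(volume.read("dc2", 0))
```

Because every write waits for the remote copy, the inter-site distance bounds the write latency, which is why the synchronous requirement limits how far apart the two clusters can be.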