1 – Disaster Recovery

Traditional data center interconnection architectures have supported Disaster Recovery (DR) backup solutions for many years.

Disaster Recovery can be implemented in Cold Standby, Warm Standby, and Hot Standby modes. Each mode offers a different trade-off between recovery time and potential data loss.

  • Cold Standby: The initial DR solutions worked in Cold Standby mode, in which appropriately configured backup resources were located in a safe, remote location (Figure 1). Hardware and software components, network access, and data restoration were implemented manually as needed. This DR mode required restarting the applications on the backup site, as well as enabling network redirection to the new data center. The Cold Standby model is easy to maintain and remains valid. However, it requires a substantial delay to move from standby to full operational capability. The time to recover, also known as the Recovery Time Objective (RTO), can reach several weeks in this scenario. In addition, the Recovery Point Objective (RPO), which is the maximum amount of data lost during the recovery process, is quite high: it is accepted that several hours of data might be lost in a Cold Standby scenario.

Figure 1 – Cold Standby Mode

  • Warm Standby: In Warm Standby mode, the applications at the secondary data center are usually ready to start. Resources and services can then be activated manually when the primary data center goes out of service and traffic has been redirected to the new location. This solution provides a better RTO and RPO than Cold Standby mode, but it does not offer the transparent operation and zero disruption required for business continuity.
  • Hot Standby: In Hot Standby mode, the backup data center has some applications running actively and processes part of the service traffic. Data replication between the primary and the remote data centers is done in real time. Usually the RTO is measured in minutes and the RPO approaches zero, which means that the data mirrored at the backup site is exactly the same as that at the original site. A zero RPO allows applications and services to restart safely. Immediate and automatic resource availability in the secondary data center also improves overall application scalability and equipment utilization.
  • Data Replication: The different disaster recovery modes are deployed using Layer 3 interconnections between data centers over a highly available routed WAN. The WAN offers direct access to the applications running in the remote site with synchronous or asynchronous data mirroring, depending on the service level agreement and enterprise business requirements (the difference between the two is sketched just after this list).
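
To make the synchronous/asynchronous distinction concrete, here is a minimal Python sketch. It is purely illustrative and not tied to any vendor's replication engine; the class names, in-memory dictionaries, and the drain() helper are all hypothetical.

```python
from collections import deque

class SyncMirror:
    """Synchronous mirroring: every write is applied to the remote copy
    before it is acknowledged, so no acknowledged data can be lost (RPO ~ 0)."""

    def __init__(self, local, remote):
        self.local, self.remote = local, remote

    def write(self, key, value):
        self.local[key] = value
        self.remote[key] = value   # in reality this blocks on the DCI round trip
        return "ack"               # acknowledged only once both copies match


class AsyncMirror:
    """Asynchronous mirroring: writes are acknowledged immediately and shipped
    to the remote site later; anything still queued when the primary site fails
    is lost, so the RPO is greater than zero."""

    def __init__(self, local, remote):
        self.local, self.remote = local, remote
        self.pending = deque()

    def write(self, key, value):
        self.local[key] = value
        self.pending.append((key, value))   # replicate later, off the write path
        return "ack"                        # acknowledged before the remote copy exists

    def drain(self):
        """Ship queued writes to the remote site (normally a background task)."""
        while self.pending:
            key, value = self.pending.popleft()
            self.remote[key] = value


# Toy usage: with synchronous mirroring the remote copy is always current;
# with asynchronous mirroring, un-drained writes would be lost in a disaster.
primary, backup = {}, {}
SyncMirror(primary, backup).write("txn-1", "debit 100")   # backup already holds txn-1

primary2, backup2 = {}, {}
amirror = AsyncMirror(primary2, backup2)
amirror.write("txn-2", "credit 100")                      # acked, but backup2 is still empty
```

The synchronous variant pays a latency penalty on every write (the round trip over the interconnect), which is why the choice between the two ultimately comes back to the service level agreement mentioned above.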

The pressure to support business continuity has motivated many storage vendors to shorten the RTO by offering more efficient data replication solutions that achieve the smallest possible RPO. Examples of highly efficient data replication approaches are host-based and disk-based mirroring.

For example, with the Veritas Volume Replicator® host-based mirroring solution, the host is responsible for duplicating data to the remote site. In the EMC Symmetrix Remote Data Facility® (SRDF) and Hitachi Data Systems (HDS) TrueCopy® disk-based mirroring solutions, the storage controller is responsible for duplicating the data.
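
As a rough illustration of where the replication work happens in each approach, the toy Python classes below contrast host-driven duplication with controller-driven duplication. They are conceptual sketches only and do not reflect the actual products named above.

```python
class HostBasedMirror:
    """Host-based mirroring: software on the server issues each write twice,
    once to the local array and once to the remote array."""

    def __init__(self, local_array, remote_array):
        self.local_array, self.remote_array = local_array, remote_array

    def write(self, block, data):
        self.local_array[block] = data    # the host writes locally...
        self.remote_array[block] = data   # ...and also drives the remote copy itself


class ArrayBasedMirror:
    """Disk/array-based mirroring: the host writes once; the storage controller
    forwards the write to its remote peer over the replication link."""

    def __init__(self, array, peer_array):
        self.array, self.peer_array = array, peer_array

    def write(self, block, data):
        self.array[block] = data
        self._controller_replicate(block, data)   # performed by the array, not the host

    def _controller_replicate(self, block, data):
        self.peer_array[block] = data
```

The practical consequence is where the replication load lands: on the server CPUs and its network interfaces in the host-based case, or on the storage arrays and the dedicated replication link in the disk-based case.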

A disaster recovery solution should be selected based on how long the organization can wait for services to be restarted and, above all, how much data it can afford to lose after the failover happens. Should the business restart in a degraded mode, or must all services be fully available immediately after the switchover? Financial institutions usually require an RTO of less than one hour with an RPO equal to zero and no degraded mode; this requirement is widespread in that sector because no transactions can be lost.
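
As a back-of-the-envelope way to express this selection logic, the sketch below encodes approximate RTO/RPO figures drawn from the discussion above (the Warm Standby values are assumptions used purely for illustration) and filters the modes against a stated business requirement.

```python
from datetime import timedelta

# Approximate recovery characteristics from the discussion above.
# The Warm Standby figures are assumptions used purely for illustration.
DR_MODES = {
    "Cold Standby": {"rto": timedelta(weeks=2),   "rpo": timedelta(hours=4)},
    "Warm Standby": {"rto": timedelta(hours=8),   "rpo": timedelta(minutes=30)},  # assumed
    "Hot Standby":  {"rto": timedelta(minutes=5), "rpo": timedelta(0)},
}

def suitable_modes(max_rto, max_rpo):
    """Return the DR modes whose RTO and RPO both meet the business requirement."""
    return [name for name, mode in DR_MODES.items()
            if mode["rto"] <= max_rto and mode["rpo"] <= max_rpo]

# A financial institution requiring an RTO under one hour and an RPO of zero:
print(suitable_modes(max_rto=timedelta(hours=1), max_rpo=timedelta(0)))
# ['Hot Standby']
```

Run against the financial-institution requirement, only Hot Standby qualifies, which is consistent with the point above: the stricter the RTO and RPO, the further the design is pushed toward hot, synchronously replicated sites.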
