
Disaster Recovery in AWS GovCloud with Cloud Volumes ONTAP

If the federal databases that support local law enforcement became unavailable, or if a natural disaster compromised governmental IT services, people’s lives and safety would be directly at risk.



It is vital for government agencies at the federal, state, and local levels to have uninterrupted access to their data and applications. The IT infrastructure used to support government systems, like all computing platforms, is vulnerable to hardware failure, human error, cyber-attack, and even region-wide outages. That is where Disaster Recovery (DR) environments on the AWS GovCloud come in.



In this article, we describe the advantages of using a cloud-based DR environment, and the added benefits of deploying DR to the AWS GovCloud using NetApp’s Cloud Volumes ONTAP.



What is AWS GovCloud?



AWS GovCloud is an isolated region of the Amazon cloud designed for US government agencies at all levels. This AWS government cloud follows a high-level cloud security policy that is compliant with the most stringent regulations, including FedRAMP, ITAR, the DoD Security Requirements Guide, and HIPAA. Access to AWS GovCloud services is limited to vetted accounts, which must be held by US citizens. Amazon guarantees that the engineers it assigns to manage AWS GovCloud are also US citizens and resident on US soil. AWS GovCloud supports secure, encrypted remote connections that comply with the requirements of FIPS 140-2.



When you run on GovCloud, your disaster recovery policy is of the highest importance. It takes a lot more than the standard AWS disaster recovery plan to ensure that the regulations mentioned above are not violated. Account holders need to ask themselves:



    • What does it take to store long-term data copies and comply with regulations?
    • Will this system be able to handle a seamless failover and failback between the primary and secondary (DR) environments?
    • What if a disaster hits and the DR environment fails? How can I make sure that doesn’t happen?
    • And finally, how much is all of this going to cost? What can be done to keep that spending under control?

Disaster Recovery in the GovCloud



In order to effectively respond to a severe failure of IT services, a disaster recovery plan must be put in place ahead of time. There are two main factors to consider when doing this: the RTO (Recovery Time Objective), which is the amount of time required to bring services back online, and the RPO (Recovery Point Objective), which relates to the level of data loss that would be incurred after doing so. Disaster recovery in the GovCloud works the same way.



For example, consider a disaster recovery plan that involves taking data backups and sending them offsite. The RTO in such a scenario would be excessive, as the backups would need to be retrieved from their storage location and then restored, which would take an amount of time proportional to the size of the original data. The RPO would depend on the frequency of the backups, so if a backup were taken daily, up to a day’s worth of data could be lost irretrievably. For a governmental body, the loss of such information would not only be harmful to citizens but in some cases might violate laws in place that require the highest levels of data protection and availability.
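The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: the class name, the retrieval time, and the restore throughput are assumptions chosen for the example, not figures from any real system.

```python
from dataclasses import dataclass


@dataclass
class OffsiteBackupPlan:
    """Hypothetical model of an offsite-backup DR plan; all figures are assumptions."""
    backup_interval_hours: float      # time between successive backups
    data_size_gb: float               # amount of data that must be restored
    retrieval_hours: float            # time to fetch backups from offsite storage
    restore_rate_gb_per_hour: float   # sustained restore throughput

    def worst_case_rpo_hours(self) -> float:
        # A failure just before the next backup loses one full interval of changes.
        return self.backup_interval_hours

    def estimated_rto_hours(self) -> float:
        # Retrieval time plus a restore time proportional to the data size.
        return self.retrieval_hours + self.data_size_gb / self.restore_rate_gb_per_hour


# Illustrative numbers only: daily backups of 10 TB restored at 500 GB/hour.
plan = OffsiteBackupPlan(
    backup_interval_hours=24.0,
    data_size_gb=10_000.0,
    retrieval_hours=4.0,
    restore_rate_gb_per_hour=500.0,
)
print(f"Worst-case RPO: {plan.worst_case_rpo_hours():.1f} hours")
print(f"Estimated RTO:  {plan.estimated_rto_hours():.1f} hours")
```

Doubling the data size in this model adds proportionally to the restore time, while only a shorter backup interval improves the worst-case RPO.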



For these reasons, most organizations opt for a warm, standby DR environment. The environment is warm because it captures as much of the live, production data as possible, which reduces RPO. Being on standby means that DR services are ready to be brought online at any time, which reduces RTO to a minimum. DR environments such as these have traditionally been implemented at a secondary facility with independent, physical hardware.



Managing physical infrastructure at secondary, and maybe even tertiary, sites makes the rollout of DR complicated, time consuming, and expensive. The databases and applications used by most government agencies have grown organically over time, which makes it more challenging to successfully replicate them and keep them up-to-date. Regular DR testing is essential for ensuring a smooth failover in the event of an actual disaster. After fully deploying a DR environment, nearly all of the infrastructure will actually remain idle for much of the time.



Alternatively, building out DR services with AWS GovCloud is simple, fast, and cost-effective. In the cloud there is no need to buy and manage physical servers, which must be patched and upgraded on a regular basis. Using a “pilot light” architecture, it is possible to build out DR capability for only the most crucial services, and provision cloud resources for the remaining services only after a failover has occurred, which makes cloud-based DR deployments very cost-effective. The flexibility of AWS means that you can quickly scale up capacity at any time, allowing you to burst during periods of heavy user activity or provision new servers and storage at a moment’s notice.

AWS Disaster Recovery: Cloud-based DR with Cloud Volumes ONTAP



All applications and services rely on access to data. To deploy a DR environment to the cloud, government departments require secure and efficient transport of their data, which often contains sensitive information. Cloud Volumes ONTAP enables NetApp customers to control the flow of their data into AWS GovCloud while always ensuring the highest levels of data protection and security to align with disaster recovery policy.



Cloud Volumes ONTAP is a cloud-based variant of NetApp’s enterprise-class ONTAP storage services that uses the native compute and storage resources of AWS to create a virtual storage appliance in the cloud. Cloud Volumes ONTAP is able to transparently interact with physical, on-premises NetApp systems, allowing data to be replicated between them using SnapMirror®.



SnapMirror is a proven solution for highly efficient, block-level data replication. After synchronizing the destination with an initial full baseline copy, subsequent transfers will only send over the data that has changed at the source. This makes SnapMirror replication very effective, regardless of the source data size, and means that you can synchronize source and destination as frequently as necessary to meet your RPO requirements. Another benefit of only copying data deltas is that it lowers the costs of data traffic.
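To make the idea of a baseline copy followed by delta-only transfers concrete, here is a minimal conceptual sketch in Python. It is not SnapMirror's implementation or API: the block size, the hash-based comparison, and all function names are assumptions used only to illustrate block-level incremental replication.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size, not a SnapMirror parameter


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(old: bytes, new: bytes) -> dict[int, bytes]:
    """Return {block_index: block} for every block of `new` that differs from `old`."""
    old_hashes = [hashlib.sha256(b).digest() for b in split_blocks(old)]
    delta: dict[int, bytes] = {}
    for i, block in enumerate(split_blocks(new)):
        if i >= len(old_hashes) or hashlib.sha256(block).digest() != old_hashes[i]:
            delta[i] = block
    return delta


def apply_delta(dest: bytes, delta: dict[int, bytes], new_length: int) -> bytes:
    """Apply changed blocks to the destination copy and trim it to the new length."""
    blocks = split_blocks(dest)
    for i, block in sorted(delta.items()):
        if i < len(blocks):
            blocks[i] = block
        else:
            blocks.append(block)
    return b"".join(blocks)[:new_length]


# Baseline: the destination starts as a full copy of the source (8 blocks).
source_v1 = bytes(BLOCK_SIZE * 8)
destination = source_v1

# One byte changes at the source, which dirties exactly one block.
source_v2 = bytearray(source_v1)
source_v2[5 * BLOCK_SIZE] = 1
source_v2 = bytes(source_v2)

delta = changed_blocks(source_v1, source_v2)
destination = apply_delta(destination, delta, len(source_v2))
print(f"Blocks transferred: {len(delta)} of {len(split_blocks(source_v2))}")
```

In this toy model the cost of each update depends on how many blocks changed rather than on the total size of the volume, which is the property the paragraph above describes.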



In AWS, Cloud Volumes ONTAP uses Amazon EBS as the storage back-end for its data. Users can set up storage volumes using any of the Amazon EBS disk types, depending on whether high performance or capacity storage is required. There are many Cloud Volumes ONTAP storage efficiencies that can dramatically reduce cloud storage footprint, and therefore operational costs, such as data compression, data deduplication, and thin provisioning. Your storage footprint (and its associated costs) can be lowered in some cases by 70% with deduplication, and by an average of 50% with compression. Data compression and deduplication will also shrink the size of the deltas transferred by SnapMirror, further reducing costs. In addition, using the cloud pay-as-you-go model, your DR instances can be idle most of the time and only used for the data updates, which lowers your AWS costs even more.
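As a rough worked example of what those percentages mean for a monthly bill, the following Python sketch applies each quoted reduction to a hypothetical data set. The data set size and the per-GB price are assumptions, not AWS quotes, and the two efficiencies are shown separately because the text does not say how they combine.

```python
def effective_footprint_gb(logical_gb: float, reduction: float) -> float:
    """Physical capacity needed after a fractional reduction (0.5 means 50% smaller)."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    return logical_gb * (1.0 - reduction)


def monthly_cost(footprint_gb: float, price_per_gb_month: float) -> float:
    """Monthly storage cost for a given physical footprint."""
    return footprint_gb * price_per_gb_month


LOGICAL_GB = 10_000.0       # hypothetical data set size
PRICE_PER_GB_MONTH = 0.10   # hypothetical block storage price, not an AWS quote

scenarios = [
    ("no efficiencies", 0.0),
    ("compression, 50%", 0.5),     # average figure quoted in the text
    ("deduplication, 70%", 0.7),   # best-case figure quoted in the text
]
for label, reduction in scenarios:
    gb = effective_footprint_gb(LOGICAL_GB, reduction)
    cost = monthly_cost(gb, PRICE_PER_GB_MONTH)
    print(f"{label:>20}: {gb:8.0f} GB, ${cost:8.2f}/month")
```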



Cloud Volumes ONTAP is also able to dynamically tier data to object storage, such as Amazon S3, which is an extremely cost-effective solution for data that is infrequently accessed, such as data in a DR environment. In this case, data arriving into Cloud Volumes ONTAP via SnapMirror is sent directly to an Amazon S3 bucket with Standard storage, and on future data access is automatically brought back into the Amazon EBS performance tier. The separate Amazon S3 storage classes Standard-Infrequent Access and One Zone-Infrequent Access can also be used to get even lower storage rates (though with higher charges for data access). Data tiering can lower costs down to $0.02 per GB/month, depending on the Amazon S3 storage tier selected.
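The effect of tiering on a DR copy can be estimated with a simple blended-cost calculation. In the Python sketch below, the $0.02 per GB/month object storage rate is the figure quoted above, while the performance-tier price, the data set size, and the hot/cold split are assumptions. Request and retrieval charges are deliberately ignored.

```python
S3_PRICE_PER_GB_MONTH = 0.02    # object storage rate quoted in the text
EBS_PRICE_PER_GB_MONTH = 0.10   # hypothetical performance-tier price, not an AWS quote


def tiered_monthly_cost(
    total_gb: float,
    hot_fraction: float,
    hot_price: float = EBS_PRICE_PER_GB_MONTH,
    cold_price: float = S3_PRICE_PER_GB_MONTH,
) -> float:
    """Blended monthly cost when `hot_fraction` of the data stays on the performance tier."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    hot_gb = total_gb * hot_fraction
    cold_gb = total_gb - hot_gb
    return hot_gb * hot_price + cold_gb * cold_price


# A 10 TB DR copy: everything on the performance tier vs. 90% tiered to object storage.
all_hot = tiered_monthly_cost(10_000.0, hot_fraction=1.0)
mostly_cold = tiered_monthly_cost(10_000.0, hot_fraction=0.1)
print(f"All on performance tier: ${all_hot:.2f}/month")
print(f"90% tiered to S3:        ${mostly_cold:.2f}/month")
```

Because DR data is rarely read, the cold fraction tends to be large, which is why the blended cost moves close to the object storage rate.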

Testing DR capability is very often neglected, which can have serious implications when it comes to actually performing a failover. Using NetApp FlexClone®, Cloud Volumes ONTAP is able to instantly create writable clones of existing storage volumes at zero capacity cost, which greatly simplifies the process of performing DR tests.
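One common way to build a writable clone that consumes no extra capacity at creation time is copy-on-write: the clone shares the parent's blocks and stores only the blocks it overwrites. The Python sketch below illustrates that general idea. It is a conceptual model with hypothetical class names, not a description of how FlexClone is implemented.

```python
class Volume:
    """A read-only view of a volume, modeled as numbered blocks."""

    def __init__(self, blocks: dict[int, bytes]):
        self._blocks = dict(blocks)

    def read(self, index: int) -> bytes:
        return self._blocks[index]


class WritableClone:
    """Clone that shares the parent's blocks and stores only its own writes."""

    def __init__(self, parent: Volume):
        self._parent = parent
        self._own: dict[int, bytes] = {}  # empty at creation: no extra capacity used

    def read(self, index: int) -> bytes:
        if index in self._own:
            return self._own[index]
        return self._parent.read(index)

    def write(self, index: int, data: bytes) -> None:
        # Writes land in the clone only; the parent volume is never modified.
        self._own[index] = data

    def extra_blocks(self) -> int:
        return len(self._own)


replica = Volume({0: b"config", 1: b"orders", 2: b"users"})
dr_test = WritableClone(replica)
blocks_at_creation = dr_test.extra_blocks()

# A DR test can modify the clone freely without touching the replicated data.
dr_test.write(1, b"orders-test")
print(blocks_at_creation, dr_test.extra_blocks(), replica.read(1), dr_test.read(1))
```

In this model a DR test only pays for the blocks it changes, and discarding the clone leaves the replicated volume exactly as it was.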



For security, AWS Key Management Service (AWS KMS) is used by Cloud Volumes ONTAP to manage the encryption keys that ensure data security at rest. Data is automatically encrypted before it is stored and decrypted for authorized access.



The combined benefits of these features can not only protect a system in a disaster but also cut the TCO of operations by reducing storage usage. For an example of this, take a look at the story of Concerto Cloud Services, which used Cloud Volumes ONTAP to set up a four-hour standard SLA for DR that includes zero downtime, an industry benchmark.



Conclusion



Implementing a plan for disaster recovery is a must for all government agencies that rely on IT services. Building a DR platform in the cloud significantly reduces the cost and work effort required when compared to using physical infrastructure, and the strong compliance guarantees of AWS GovCloud ensure that sensitive information is handled meticulously.



Cloud Volumes ONTAP is one of the most versatile solutions available for cloud-based data storage management, providing SnapMirror replication to connect the data in on-premises systems to the cloud, and thereby unleashing the power of the cloud. There is a wealth of features within Cloud Volumes ONTAP that help with setting up cloud disaster recovery environments, reduce costs and provide flexibility when working with storage.



Yifat Perry, Technical Content Manager
