Technologies that support zero downtime and no data loss are crucial pieces of the application architecture when hosting workloads in the cloud. For workloads in Azure, high availability is now offered using Cloud Volumes ONTAP HA (High Availability), a configuration that ensures business continuity and helps organizations meet their compliance standards.
In this post, we’ll explore how NetApp Cloud Volumes ONTAP HA enables Azure high availability by providing RPO of zero, RTO of under 60 seconds, and automatic failover and failback for shared storage.
What Is High Availability?
High availability is one of the critical design pillars for cloud deployments. Designing for HA helps protect against possible downtime due to data center outages, hardware failures, and maintenance operations. It also helps meet the business continuity and disaster recovery targets of mission-critical applications. Automatic failover and failback mechanisms are an integral part of designing a highly available architecture.
High availability protects businesses from lost revenue, damaged reputation, unmet SLAs, loss of critical business data, fines or lawsuits for violating regulatory compliance, and more. Well-designed, well-implemented high availability mechanisms help prevent data loss and downtime when hosting applications in the cloud.
What Is Native Azure Storage HA?
Native Azure high availability is implemented using different tools and configuration options available in the platform. On Azure, availability SLAs are often linked to these configurations. Azure stores three copies of the data in a data center, which is known as locally-redundant storage. Customers can add a further level of resiliency by choosing geo-redundant storage, which creates another three copies of the data in a paired region. Azure availability zones help achieve HA by distributing resources across multiple data centers within a region. Customers can also use Azure disaster recovery tools and services such as Azure Backup and Azure Site Recovery to meet their application RPOs and RTOs. Depending on the scenario and use case, customers can choose the right set of native services to implement.
While all of this is excellent protection for a deployment on Azure, Cloud Volumes ONTAP High Availability is entirely different from anything offered by Azure natively.
Cloud Volumes ONTAP HA in Azure
Cloud Volumes ONTAP offers the capabilities of NetApp’s trusted ONTAP data management platform in the Azure cloud. It leverages the native storage capabilities of Azure storage while delivering advanced capabilities through NetApp technologies such as efficient Snapshots, DR, storage efficiencies, encryption, and more. With the HA configuration, those benefits extend to ensuring high availability for storage with RPO of zero and under-60-second RTO.
How it Works
With the Cloud Volumes ONTAP HA configuration, organizations are able to build resiliency for their deployments in Azure by using two Cloud Volumes ONTAP nodes. This ensures non-disruptive management and operations, and fault tolerance for your data, which augments the native availability capabilities of Azure storage.
The Cloud Volumes ONTAP Azure HA architecture.
The ONTAP HA pair nodes are deployed in the same Resource Group in an Availability Set and connected to the backend pool of an Azure internal load balancer (ILB). Since the nodes reside in the same Availability Set, they are placed in separate Fault Domains and Update Domains.
- Resource Groups offer logical grouping of Azure resources for uses such as lifecycle management and role-based access control. Both Cloud Volumes ONTAP nodes will be located in the same Resource Group.
- Availability Sets group virtual machines so that they are isolated from each other in a deployment to ensure high availability. High availability is achieved by distributing these VMs across Fault Domains and Update Domains.
- A Fault Domain (FD) is essentially a rack of hardware that shares the same power, networking, cooling, etc. Placing each Cloud Volumes ONTAP node in a different FD avoids a single point of failure that could be affected by events such as power or network outages.
- Update Domains (UD) are logical deployment units used by Azure to determine the order in which instances are rebooted during planned maintenance. Placing the nodes in different UDs ensures that at least one of the Cloud Volumes ONTAP nodes remains available during scheduled maintenance.
- An internal load balancer (ILB) provides a more secure endpoint by making it possible to run your application via a non-public IP address. The Azure ILB directs traffic to the backend pool of Cloud Volumes ONTAP nodes based on health probes that check whether the nodes are online. If either node is down, the ILB maintains high availability by redirecting network traffic to the available Cloud Volumes ONTAP instance.
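The placement and routing ideas in the list above can be sketched as a small simulation. This is only an illustrative model, not how Azure or Cloud Volumes ONTAP implements placement: the `Node` class, the `place_in_availability_set` and `probe_and_route` functions, the node names, and the domain counts (2 fault domains, 5 update domains) are all assumptions made for the example, and the round-robin assignment is a simplification of what the platform does.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for one Cloud Volumes ONTAP VM."""
    name: str
    fault_domain: int
    update_domain: int
    healthy: bool = True  # result of the latest (simulated) health probe


def place_in_availability_set(names, fault_domains=2, update_domains=5):
    """Spread VMs round-robin across fault and update domains (simplified)."""
    return [
        Node(name, index % fault_domains, index % update_domains)
        for index, name in enumerate(names)
    ]


def probe_and_route(backend_pool):
    """Return the names of the nodes that currently pass the health probe."""
    return [node.name for node in backend_pool if node.healthy]


# Two nodes land in different fault domains and different update domains.
pool = place_in_availability_set(["cvo-node-1", "cvo-node-2"])
print(probe_and_route(pool))

# Simulate one node going offline: traffic is routed only to the survivor.
pool[0].healthy = False
print(probe_and_route(pool))
```

Because the two nodes never share a fault domain or an update domain in this model, a single rack failure or a single maintenance reboot takes out at most one of them, and the probe-based routing keeps sending traffic to whichever node is still online.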
Failover and Failback
Cloud Volumes ONTAP HA for Azure allows for seamless failover and failback. The data in the backend Azure page blobs is shared so that clients can access the data even if one of the Cloud Volumes ONTAP nodes is unavailable. In the event of a takeover, the surviving node, which is connected to the same storage, handles the data operations. Redundant network paths allow clients and hosts to communicate with the available node in order to avoid any disruptions, and do not require any change on the clients or hosts.
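A toy model can make the failover and failback flow concrete. The sketch below assumes a hypothetical `HAPair` class in which a plain dictionary stands in for the shared page blobs and `node-a` is arbitrarily treated as the preferred node; none of these names come from the product, and the real takeover logic is far more involved.

```python
class HAPair:
    """Toy model of two nodes in front of one shared storage backend."""

    def __init__(self):
        self.shared_storage = {}  # stands in for the shared page blobs
        self.online = {"node-a": True, "node-b": True}
        self.preferred = "node-a"  # assumption: node-a serves when healthy

    def active_node(self):
        """Pick the preferred node if it is up, otherwise any surviving node."""
        if self.online[self.preferred]:
            return self.preferred
        for name, is_up in self.online.items():
            if is_up:
                return name
        raise RuntimeError("no node available")

    def fail(self, name):
        self.online[name] = False  # triggers takeover by the other node

    def restore(self, name):
        self.online[name] = True  # failback happens on the next request

    def write(self, key, value):
        node = self.active_node()
        self.shared_storage[key] = value
        return node

    def read(self, key):
        return self.active_node(), self.shared_storage[key]


pair = HAPair()
pair.write("report.txt", "v1")             # served by node-a
pair.fail("node-a")                        # node-a becomes unavailable
served_by, data = pair.read("report.txt")  # node-b serves the same data
pair.restore("node-a")
after_failback = pair.active_node()        # node-a serves again
print(served_by, data, after_failback)
```

The point of the model is that the data lives in storage both nodes can reach, so a takeover changes which node answers, not what the client reads, and the client code itself never has to change.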
Cloud Volumes ONTAP HA helps organizations manage their mission-critical workloads in Azure in a non-disruptive manner, adding another level of resiliency above the native availability provided by Azure at the storage level. The HA option of Cloud Volumes ONTAP helps you recover from disruptions with a very short recovery time (RTO < 60 seconds) and with no data loss (RPO = 0). The failover and failback mechanisms are automatic and seamless.
In addition to HA, Cloud Volumes ONTAP delivers enterprise-class storage management for SMB/CIFS, NFS, and iSCSI-based data solutions in Azure storage. It offers premium storage efficiencies in data deduplication and compression that help lower your TCO, and proprietary technologies such as SnapMirror® data replication and FlexClone® data cloning technology. Along with this you also get improved networking performance, encryption, and the Cloud Manager front-end console, which offers a single point of data management and orchestration for ONTAP instances.
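To put the recovery time in perspective, the following rough calculation relates an availability target to a yearly downtime budget and to the number of failovers that budget could absorb. The availability targets used here (99.9% and 99.99%) are hypothetical examples rather than figures from any SLA, the function names are invented for the illustration, and each failover is pessimistically assumed to cost the full 60 seconds.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_budget_seconds(availability_pct):
    """Yearly downtime allowed by an availability target given in percent."""
    return SECONDS_PER_YEAR * (1 - availability_pct / 100)


def max_failovers(availability_pct, rto_seconds=60):
    """Whole failovers that fit in the budget if each costs rto_seconds."""
    return int(downtime_budget_seconds(availability_pct) // rto_seconds)


for target in (99.9, 99.99):  # hypothetical targets, not quoted SLAs
    budget_minutes = downtime_budget_seconds(target) / 60
    print(f"{target}%: {budget_minutes:.1f} min/year, "
          f"up to {max_failovers(target)} failovers at 60 s each")
```

A calculation like this is only a planning aid: real availability also depends on how often failures occur and on everything else in the application stack.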