Azure NetApp Files – Performance So Good You’ll Think You’re On Premises

Enterprises are moving to the cloud (or rather, they’re continuing to move), and Azure is hungry to have them. Some companies are being selective in their migration, moving only applications that meet specific criteria, such as those without legal restrictions, while others have an all-in mandate. The task of transforming from custom-built data centers to general-purpose cloud is not easy, especially for applications not born in the cloud. From the retail firm with its enterprise-class databases demanding gigabytes per second of bandwidth, to the financial firm running I/O-hungry Monte Carlo simulations requiring a single namespace, to the genomics firm running highly demanding scale-out HPC workloads, demands are high. But is the cloud ready? Microsoft Azure certainly is, thanks in large part to Azure NetApp Files!

Industry-leading NetApp® ONTAP® data management software is the foundation of Microsoft’s new native, first-party file protocol service, Azure NetApp Files. Placed in the Azure data center for a consistent low-latency experience regardless of region, Azure NetApp Files (a NoOps service) is built to provide an on-premises NFS and, very soon, SMB protocol experience. Now you can give your born-in-the-cloud applications access to large amounts of I/O with sub-millisecond latency or a large amount of bandwidth for scale-up or scale-out environments.

Azure NetApp Files is a fully managed service built for simplicity, performance, and scalability that will take your business, its applications, and its workflows into the cloud faster than ever before.

Exploring Performance

It’s often easiest to understand the capabilities of a system by way of an example. The rest of this paper explores the capabilities of Azure NetApp Files by using a theoretical application, Acme AppX.

Acme AppX is a home-grown Linux-based application built for the cloud. This app is designed to scale linearly by adding virtual machines as the need for compute power increases. Data access is the name of the game for Acme AppX; rapid accessibility of the data lake is critical, and shared storage is the best option. The I/O patterns of this application are at times random and at times sequential. When random, low latency is needed for its large amounts of I/O, and when sequential, large amounts of bandwidth are desirable. The random component of the application leads the application admin to rule out object storage. The team has tried to build their own NFS server farm, but they’re frustrated by the complexity of managing the environment. Most shared file services are self-managed (and therefore undesirable), don’t scale far enough (offering a few tens of thousands of operations per second at best), or both, so they are ruled out as well.

Azure claims that the newly launched Azure NetApp Files service is different—fully managed and scalable enough to meet the demands of most applications. But what does that mean? Let’s find out.

The Workload Generator

The results documented below come from Vdbench summary files. Vdbench is a command line utility that was created to help engineers and customers generate disk I/O workloads for validating storage performance. We used the tool in a client-server configuration with a single mixed master/client and 14 dedicated client virtual machines, making this a scale-out test.

The Work

The tests were designed to identify the limits that the hypothetical Acme AppX may experience, as well as expose the response time curves up to those limits.  We ran the following test scenarios to identify the limits:

  • 100% 8KiB random read
  • 100% 8KiB random write
  • 100% 64KiB sequential read
  • 100% 64KiB sequential write
  • 50% 64KiB sequential read, 50% 64KiB sequential write
  • 50% 8KiB random read, 50% 8KiB random write
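To make the test matrix concrete, here is a minimal Python sketch that records the six scenarios above as data and converts an operation rate into bandwidth for a given block size. The names `Scenario`, `SCENARIOS`, `describe`, and `iops_to_mib_per_s` are illustrative only; they are not part of Vdbench or of the original test harness.

```python
from dataclasses import dataclass

KIB = 1024
MIB = 1024 * KIB


@dataclass(frozen=True)
class Scenario:
    """One workload definition from the list above (illustrative type)."""
    block_size_kib: int  # I/O size in KiB
    pattern: str         # "random" or "sequential"
    read_pct: int        # share of operations that are reads, 0-100


# The six scenarios, in the order they are listed in the text.
SCENARIOS = [
    Scenario(8, "random", 100),
    Scenario(8, "random", 0),
    Scenario(64, "sequential", 100),
    Scenario(64, "sequential", 0),
    Scenario(64, "sequential", 50),
    Scenario(8, "random", 50),
]


def describe(s: Scenario) -> str:
    """Render a scenario as a short human-readable label."""
    return (f"{s.read_pct}% read / {100 - s.read_pct}% write, "
            f"{s.block_size_kib}KiB {s.pattern}")


def iops_to_mib_per_s(iops: float, block_size_kib: int) -> float:
    """Bandwidth implied by an operation rate at a fixed block size."""
    return iops * block_size_kib * KIB / MIB


if __name__ == "__main__":
    for scenario in SCENARIOS:
        print(describe(scenario))
```

The conversion helper is the useful part: it shows why small-block random tests are reported in IOPS and large-block sequential tests in MiB/s, since the same operation rate moves eight times more data at 64KiB than at 8KiB.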

Volume-Level Performance Expectations

Service Levels and Quotas

Individual volume bandwidth is provisioned based on a combination of service level and quota. There are three service levels: Standard, Premium, and Ultra. Each service level allocates a different amount of bandwidth per TiB of provisioned capacity. The provisioned capacity is known in the Azure NetApp Files portal as the storage quota.

Note: Service levels are designed to answer business needs. The Standard service level is intended for situations where capacity is the principal need, while the Ultra service level is intended for use where bandwidth is the principal need. The Premium service level strikes a balance between the two.

Service Levels & Quotas
The volume acmeAppX-one, shown below, has 4,500MiB/s of provisioned bandwidth, the highest throughput attained by any of the workloads driven against a single volume. To confirm that 4,500MiB/s of bandwidth has been made available, use the following formula:
Provisioned bandwidth (MiB/s) = quota (TiB) × service-level bandwidth (MiB/s per TiB)
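The relationship between quota, service level, and provisioned bandwidth can be sketched in a few lines of Python. Note that this article does not state the per-TiB rates; the values below (16, 64, and 128 MiB/s per TiB for Standard, Premium, and Ultra) are an assumption taken from Azure's published service-level documentation and may differ from the figures in effect when these tests were run. The function names are illustrative.

```python
# ASSUMPTION: per-TiB bandwidth rates are not given in this article.
# These values follow Azure's published service-level documentation
# and should be checked against the current documentation before use.
ASSUMED_MIB_PER_S_PER_TIB = {
    "Standard": 16,
    "Premium": 64,
    "Ultra": 128,
}


def provisioned_bandwidth(quota_tib: float, service_level: str) -> float:
    """Bandwidth (MiB/s) allocated to a volume: quota x per-TiB rate."""
    return quota_tib * ASSUMED_MIB_PER_S_PER_TIB[service_level]


def quota_for_bandwidth(target_mib_per_s: float, service_level: str) -> float:
    """Quota (TiB) needed to provision a target bandwidth (MiB/s)."""
    return target_mib_per_s / ASSUMED_MIB_PER_S_PER_TIB[service_level]


if __name__ == "__main__":
    for level in ASSUMED_MIB_PER_S_PER_TIB:
        tib = quota_for_bandwidth(4500, level)
        print(f"{level}: {tib:.2f} TiB of quota for 4,500MiB/s")
```

Under these assumed rates, the same bandwidth target needs far less capacity at Ultra than at Standard, which is the trade-off the service levels are meant to express.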

Throughput-Intensive Workloads

Using Vdbench and a combination of 12 D32s V3 storage virtual machines, the following throughput numbers were achieved against the acmeAppX-one volume.

Throughput Test

I/O-Intensive Workloads

Again, using Vdbench and a combination of 12 D32s V3 storage virtual machines, the following I/O numbers were achieved against the acmeAppX-one volume.

I/O Test

I/O-Intensive Workloads – A Latency Study

Azure virtual machines (VMs) do not support the concept of availability zones, so VM placement is nondeterministic and can shift between instance shutdown and startup. The response time difference between a VM placed adjacent (same Azure campus as the storage) and nonadjacent (same region but separate Azure campus) is about 1ms. When a virtual machine (or a collection of VMs) shifts to or from the Azure campus hosting the Azure NetApp Files service, the change in latency alters the I/O rate achievable at a given degree of parallelism; this is something to keep an eye on. In either case, reads against an Azure NetApp Files volume can be driven in excess of 300,000 IOPS.
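The link between latency, parallelism, and I/O rate follows from Little's Law: the number of operations in flight equals the operation rate multiplied by the response time. The sketch below applies that relationship. The latencies used (0.5ms adjacent, 1.5ms nonadjacent) are hypothetical values chosen only to reflect the roughly 1ms gap described above; they are not measured results from these tests, and the function names are illustrative.

```python
import math


def iops_at_concurrency(outstanding_ios: int, latency_ms: float) -> float:
    """Little's Law: operation rate = operations in flight / response time."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return outstanding_ios * 1000.0 / latency_ms


def outstanding_ios_needed(target_iops: float, latency_ms: float) -> int:
    """Operations that must be kept in flight to sustain a target rate."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return math.ceil(target_iops * latency_ms / 1000.0)


if __name__ == "__main__":
    # HYPOTHETICAL latencies, about 1ms apart as described in the text.
    for label, latency_ms in (("adjacent", 0.5), ("nonadjacent", 1.5)):
        needed = outstanding_ios_needed(300_000, latency_ms)
        print(f"{label}: {needed} I/Os in flight for 300,000 IOPS")
```

The practical reading is that a nonadjacent placement does not cap the volume's I/O rate; it raises the amount of parallelism the clients must supply to reach the same rate.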

The arrows on the following graph connect similar thread counts.

Single Volume Random Read Study

As you can see from the test results, Azure NetApp Files gives your file-based workloads in Microsoft Azure a sizeable boost in performance. Your latency-sensitive workloads (think databases) can get sub-millisecond response times from adjacent VMs while driving transactional performance over 300,000 IOPS against a single volume, a level previously achievable only in the data center on dedicated equipment. And your throughput-sensitive applications (think streaming and imaging) now get about 4,500MiB/s (roughly 4.4GiB/s) of throughput, a level never seen before in the cloud. Reflecting on the results of this Acme AppX example, I can see a whole new possibility for high-performance production applications poised to run on Azure.

Do you want to try this out for yourself? Sign up for your own Azure NetApp Files access and see the performance you can get for your applications. Register today!