Azure File Storage

Azure Storage: Behind the Scenes

Microsoft Azure continues to grow its presence in the public cloud market. But familiar as many administrators have become with the service, not everyone is aware of the way that its storage solution is structured. As the software-defined cloud-based storage solution for Azure, Microsoft Azure Storage supports Azure deployments, but what exactly is going on underneath its hood?  


It turns out that there is a lot more to storage than meets the eye. In this article, we will cover the various components that make up the Microsoft Azure Storage layer, detail how those components interact with each other, and show how Microsoft achieves the redundancy and high availability of its cloud service.

Azure Storage Services

Before we can begin the discussion, it is important to be familiar with the storage services that Microsoft offers:

  • Blob
  • Table
  • Queue
  • File
  • Disk Storage

Object storage — or Blob storage — contains blobs, which are basically files such as images, Virtual Hard Disks (VHDs), logs and database backups. These are stored in containers, which are similar to folders in a Windows file system. These blobs can be accessed from anywhere using URIs, REST APIs and Azure SDK Client libraries.

Table storage is an NoSQL service from Microsoft, where the data is stored in tabular format. Each row represents an identity and the columns represent the entity’s different properties.

Queue storage is used for message storage and retrieval. This service features RESTful APIs that allow you to build flexible applications and separate functions for better durability across large workloads.  


For file storage, Azure offers Azure File Storage. This is a fully managed file share service where the files can be accessed by Server Message Block (SMB) protocol and can be mounted by cloud or on-premises systems running on Windows, Linux and MacOS.


Disk storage on Azure is offered by Azure Disk Storage. Azure Disk Storage provides two options for storage: Unmanaged and Managed. Unmanaged disks are created by tenants under storage accounts and are maintained by the tenant. In Azure Managed Disks, the storage accounts are created and maintained by Azure. They also offer two performance tiers: Standard (backed by HDDs for low IOPS workload) and Premium (backed by SSD for high IOPS requirement).

  • Azure Storage also provides several redundancy options based on geography and accessibility. Please find more details here.

The Azure Storage Architecture  

The two main components are storage stamps and location services.

Each storage stamp is a cluster of storage node racks. The number of racks can vary, and each rack exists on a fault domain of its own. These storage stamps can reach up to 70% utilization, after which the location services move the data to a different stamp using inter-stamp replication.


Location services (LS) themselves are geographically distributed services responsible for the management of the storage stamps. This includes mapping a storage account to a storage stamp, managing the namespace of these storage stamps, and managing the addition of new storage stamps (even for new regions).

When a new Azure Storage account is created, LS maps the storage account to a primary storage stamp (based on the region chosen by the end user), stores the account metadata on the storage stamp (after which the storage stamp starts servicing traffic to that storage account), and updates the DNS to point to the various VIPs (Virtual IPs) exposed by the storage stamp. The endpoint format (URIs) for this is https://accountname.service.core.windows.net/.

For example, a storage account by the name “netapptest7” would have endpoints such as https://netapptest7.blob.core.windows.net/ (which is a blob endpoint) and https://netapptest7.file.core.windows.net/ (which is a file endpoint). These would, in turn, point to different VIPs as updated by the location service.


Now let’s delve a bit deeper into the storage stamps as they are responsible for the actual storage transactions. A storage stamp has three layers, namely:

  • The front-end layer
  • The partition layer
  • The stream or distributed file system (DFS) layer

Front-End (FE) Layer

The front-end layer comprises a set of stateless servers whose main function is to serve incoming requests by looking up the AccountName, authorizing and authenticating the requests, and then forwarding them to the relevant partition server. This partition server is chosen based on the storage abstraction which, in turn, translates to a partition layer.

This mapping of partitions is cached by the FE layer in a partition map (map of partition names and servers) object. Some other tasks that the FE layer assists in are caching data that is frequently accessed and streaming large objects from the DFS layer directly.

Partition Layer

The partition layer manages the abstractions mentioned at the beginning of this article (i.e, Blob, Table or Queue). The partition layer is responsible for tracking storage objects in the object table (OT).

The OT is broken down into contiguous rows named RangePartitions, which are then spread across several partition servers. Some of these OTs are the account table (which keeps track of the storage account metadata and configuration assigned to that stamp), the schema table (which holds the schema for all OTs), and the partition map table (which keeps track of all the RangeParitions and their mapping to the partition servers). Individual objects in the partition are determined by the ObjectName.


There are three components of the partition layer: The partition manager, the partition servers, and the lock service.

  • The partition manager (PM) is responsible for splitting the OT into RangePartitions and keeping track of these in the partition map table.
  • The partition servers (PS) are responsible for serving requests to the RangePartitions as dictated by the PM.
  • The lock service chooses the PM and assigns a lease to the PS for serving a specific RangePartition.

Stream Layer

The stream layer is responsible for storing the data bits on the disk and provides an internal interface for the partition layer. The data is distributed for load balancing and replicated for redundancy across servers inside the storage stamp.

The stream layer consists of two main components: Extent nodes (EN) and the stream manager (SM). 

The extent nodes (EN) are a set of nodes that contains extent replicas and their blocks. ENs do not understand the abstraction of streams, but keep track of extents (each extent is treated as a file), data blocks and an index which maps data offset to their blocks and file locations. ENs keep track of their extents and the peer replicas for those extents.

The stream manager (SM) is a standard Paxos cluster that keeps track of stream namespaces, extents to stream mapping, and distribution of extents across extent nodes. Based on the replication policy (by default three), the SM polls the EN and checks for compliance.

If the SM detects that the extents are stored on fewer ENs, then it starts a lazy replication to ENs spread across different fault domains.  This is how the stream layer works: Extents are the smallest units of replication in the stream layer and are made up of a sequence of blocks. Blocks are the smallest unit of data for reading and writing that are appended to extents.

The partition layer is responsible for appending blocks, tracking extents and controlling the size of each block; it’s the job of the stream layer to store and replicate all of those files as streams, which are lists that point to extents. The streams are maintained by the stream manager and are presented to the partition layer in the form of files.

Replication in Azure Storage

There are two types of replication: Intra-stamp replication and inter-stamp replication. Both of these are used for specific purposes.


  • Intra-stamp replication: This type of synchronous replication is handled by the stamp layer to replicate the data within the storage stamp. As the data is spread across separate fault and update domains, data redundancy is ensured in case of disk, server or rack failures. The unit of replication is blocks which make up a storage object.
  • Inter-stamp replication:  This type of asynchronous replication is handled by the partition layer and helps to provide geo-redundancy in case of a disaster. The whole object or delta changes are replicated to different stamps and aids in disaster recovery and migration of data between storage stamps. This allows replica scenarios like GRS and ZRS.

azure storage

Azure Storage components and replication (Source).

Final Note

In addition to the different components that make up the Azure Storage layer, there are several other factors that help Microsoft guarantee 99.9% SLA. Some of these factors are:

  • Spindle anti-starvation (an IO scheduling mechanism where new IO is not scheduled on spindle disks that have more than 100ms of pending IO).
  • Journaling (which reserves the SSD disk for journaling the data written to the EN).
  • Read load-balancing (EN determines if the read can be completed within a deadline and this helps the SM to choose a different extent if a deadline cannot be met).

With all these moving parts, it’s clear that there is a lot going on behind the scenes. Companies migrating to Azure have a lot to figure out, and hopefully the outline given above does a little bit to make this part of Microsoft’s public cloud service more understandable.

For users who still want to know more about Azure and how it works, NetApp can help you leverage Azure storage as your cloud provider, both with its new NFS services for Azure and with it’s premiere data management solution, Cloud Volumes ONTAP(formerly ONTAP Cloud) for Azure.

Want to learn more about NetApp’s fully managed file service for Azure? Read this quick breakdown on the benefits and learn more at Azure NetApp Files

New call-to-action.

See Our Additional Guides on Key Cloud Storage Topics

We have authored in-depth guides on several other topics that can also be useful as you explore the world of cloud storage.

Cloud File Sharing

File shares support some of the most important workloads that enterprise businesses rely on, and the resources of the public cloud have created interesting new possibilities. Every major public cloud provider now offers its own cloud file sharing service, each with its own target workloads and considerations. But not every enterprise will find what they’re looking for in a fully managed, all-cloud service.

See top articles in our cloud file sharing guide:

Multicloud Storage

Multicloud strategies are becoming more popular as organizations seek to optimize their cloud services and deployments. These strategies can help you prevent vendor lock-in, increase your flexibility, and help you optimize costs.

This guide explains what multicloud storage is, how it works, what it’s used for, the core requirements for this storage, and how Cloud Volumes ONTAP supports it. 

See top articles in our multicloud storage guide:

AWS Database Services

AWS offers a range of database services and support to try and meet all its clients needs. Many of these services are fully managed to help reduce your IT workload and enable you to store and use data as simply as possible.

This guide explains what AWS database support is available, what database services are available, and how you can migrate your databases to AWS. 

See top articles in our AWS database services guide:

AWS Snapshots for Amazon EBS

Snapshots are a common method for natively backing up cloud data and services. This method enables you to save point in time backups which can be restored when needed.

This guide explains what types of storage snapshots are available, what AWS snapshots are, and how to use AWS snapshots.

See top articles in our AWS snapshots guide:

Azure Backup

Azure provides a wide variety of services to its users to help you manage your cloud data and services reliably. Azure Backup is one such service that can help provide data loss protection and peace of mind.

This guide explains what Azure Backup is and how to use it to backup your Azure data. 

See top articles in our Azure Backup guide:

Azure File Storage

Storing file data in Azure is simple through Azure File Storage service. This service enables you to store files across cloud and on-premises resources, enabling you to flexibly and securely share data and workflows. 

This guide explains what Azure File Storage is, common use cases for Files, management concepts and components of the service, how data is accessed and the architecture of the service, and some best practices for securing your data.

See top articles in our Azure file storage guide:

Google Cloud Storage

Google Cloud offers a variety of storage options for you to choose from. These services form the base of many other services in the cloud and understanding what your options are can help you manage your cloud more efficiently.

This guide explains what Google Cloud Storage options exist and their common uses.

See top articles in our Google Cloud storage guide:

Google Cloud Database Services

Google Cloud’s specialty is flexibility and integration of services and this extends to its database services. In Google Cloud you have a wide variety of database deployments, models, and support to choose from. 

This guide explains your options for deploying databases in the cloud, what Google Cloud database services are available, and how to choose the right service for you.

See top articles in our Google Cloud database guide:

Kubernetes Storage

Software developers and DevOps engineers are packaging applications into lightweight units called containers. Kubernetes helps manage and scale containers across clusters of physical machines. 

In this environment, Kubernetes storage becomes a significant challenge. By default, containers are ephemeral, meaning that any transient data on the container is lost when it shuts down. However, Kubernetes provides several options for persistent storage.

See top articles in our Kubernetes guide:

S3 Storage

Learn the basics of storing data in Amazon Simple Storage Service (S3), Amazon’s first cloud service and still one of its most popular.

-