Storage Deduplication, Compression and More for Azure

January 17, 2018

Topics: Cloud Volumes ONTAP, Azure, Cloud Storage, Storage Efficiencies, Data Migration, Data Cloning. Advanced. 6 minute read.

Azure’s native ability to optimize storage usage through its different object storage classes can be helpful in reducing storage costs, but does that mean the data is being stored efficiently?

With NetApp’s Cloud Volumes ONTAP, Azure users gain a number of space-reducing and cost-saving efficiencies, including storage deduplication, data compression, data compaction, snapshot copies, data clones, and more.

In this article we take a look at all of these technologies to help you reduce your storage spend on Azure.

Unmatched Data Storage Efficiencies

Cloud Volumes ONTAP can be run as an Azure compute instance using either standard or premium Azure storage, as shown below:

[Figure: Data Center - ONTAP cloud]

As public cloud storage costs have a tendency to increase rapidly, Cloud Volumes ONTAP for Azure carries out a series of data operations to help reduce the storage space and costs required for maintaining data in Azure.

Data Compression

Cloud Volumes ONTAP for Azure’s adaptive compression process efficiently determines whether the data block entering the storage controller can be compressed by 50% or more. If that is determined to be the case, then Cloud Volumes ONTAP compresses the data to minimize storage overhead.
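To make the "compress only if it saves 50% or more" decision concrete, here is a minimal Python sketch. It is not ONTAP's algorithm: zlib stands in for the actual compressor, and the names `maybe_compress` and `SAVINGS_THRESHOLD` are hypothetical, chosen only for illustration.

```python
import os
import zlib
from typing import Tuple

# Assumption: mirrors the "50% or more" rule described in the text.
SAVINGS_THRESHOLD = 0.5


def maybe_compress(block: bytes) -> Tuple[bytes, bool]:
    """Return (bytes_to_store, was_compressed) for one incoming block."""
    if not block:
        return block, False
    candidate = zlib.compress(block)  # zlib is a stand-in compressor
    savings = 1 - len(candidate) / len(block)
    if savings >= SAVINGS_THRESHOLD:
        return candidate, True
    # Not worth it: store the block as-is and skip decompression cost on reads.
    return block, False


# Highly repetitive data compresses well; random data does not.
repetitive = b"A" * 8192
random_like = os.urandom(8192)

stored_rep, rep_compressed = maybe_compress(repetitive)
stored_rand, rand_compressed = maybe_compress(random_like)
```

In this toy model the repetitive block is stored compressed, while the random block fails the threshold and is stored unchanged.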

Storage Deduplication

Most datasets contain a certain amount of duplication at the file and/or block levels. In Cloud Volumes ONTAP for Azure, storage deduplication automatically detects whether the block currently being written is identical to a block that already exists on the storage media. If so, it saves a pointer rather than writing the block again. Storage deduplication can lead to substantial storage savings in and of itself. It is also possible to run scheduled deduplication in the background across stored data at rest. Deduplication is especially effective for data types in which some parts are repetitive, such as images.
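The pointer-instead-of-copy idea can be sketched in a few lines of Python. This is an illustrative model only, not how ONTAP implements deduplication: the `DedupStore` class, the SHA-256 fingerprint, and the 4096-byte block size are all assumptions made for the example, and a production system would typically also compare blocks byte for byte before sharing them.

```python
import hashlib

BLOCK_SIZE = 4096  # assumption for this sketch


class DedupStore:
    """Toy block store that keeps each unique block exactly once."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> block bytes (physical storage)
        self.volume = []   # logical layout: ordered list of fingerprints

    def write(self, data: bytes) -> None:
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            fingerprint = hashlib.sha256(block).hexdigest()
            if fingerprint not in self.blocks:
                self.blocks[fingerprint] = block  # first copy: store the data
            self.volume.append(fingerprint)       # every write: store a pointer

    def read(self) -> bytes:
        return b"".join(self.blocks[fp] for fp in self.volume)

    def logical_bytes(self) -> int:
        return sum(len(self.blocks[fp]) for fp in self.volume)

    def physical_bytes(self) -> int:
        return sum(len(block) for block in self.blocks.values())


store = DedupStore()
payload = (b"x" * BLOCK_SIZE) * 3 + b"y" * BLOCK_SIZE  # three identical blocks + one unique
store.write(payload)
```

Here four logical blocks are written, but only two unique blocks occupy physical space; reading the volume back still returns the original data.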

Data Compaction

Both data that has been compressed and small uncompressed files are eligible for compaction: a process that fits two or more smaller chunks into a single 4KB physical block before sending the block to storage. Substantial storage space is saved when each physical block is used as close to capacity as possible.
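As a rough illustration of compaction, the sketch below packs small chunks into 4KB physical blocks using a simple first-fit strategy. The packing strategy and the `compact` function are assumptions made for the example; the text does not say which placement policy Cloud Volumes ONTAP uses.

```python
PHYSICAL_BLOCK = 4096  # 4KB physical block, as described in the text


def compact(chunk_sizes):
    """Pack chunks (sizes in bytes) into physical blocks, first fit.

    Returns a list of blocks, each a list of the chunk sizes it holds.
    """
    blocks = []
    for size in chunk_sizes:
        if size > PHYSICAL_BLOCK:
            raise ValueError("chunk larger than a physical block")
        for block in blocks:
            if sum(block) + size <= PHYSICAL_BLOCK:
                block.append(size)  # fits in an existing block
                break
        else:
            blocks.append([size])   # no room anywhere: open a new block
    return blocks


# Six small chunks would occupy six physical blocks without compaction.
chunks = [1024, 2048, 512, 3000, 1000, 500]
packed = compact(chunks)
```

With these sample sizes the six chunks fit into two physical blocks instead of six, which is the kind of saving the paragraph above describes.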

These three processes—compression, deduplication, and compaction—are complementary and designed to work together. For example, the higher the compression rate, the higher the compaction rate. Where deduplication may be less beneficial, compression can step up to the plate and yield significant storage space savings.

Thin Provisioning: True On-Demand Capacity

Cloud Volumes ONTAP for Azure implements thin provisioning so that the storage capacity for an app or a DB instance is allocated dynamically from a single shared storage pool only when data is actually being written to a volume or LUN. In this way you save the cost of provisioning storage space that will be unoccupied most of the time. Furthermore, when data is deleted, free space is released back to the pool. Overall, Cloud Volumes ONTAP for Azure’s robust thin provisioning not only saves on storage costs, it also simplifies capacity planning.
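A minimal way to picture thin provisioning is a shared pool whose volumes promise capacity up front but consume physical space only on write. The `ThinPool` class below is a hypothetical model written for this article, not an ONTAP API, and it tracks capacity in whole gigabytes for simplicity.

```python
class ThinPool:
    """Toy shared storage pool with thinly provisioned volumes."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.provisioned = {}  # volume name -> promised size in GB
        self.used = {}         # volume name -> GB actually written

    def create_volume(self, name, size_gb):
        # Creating a volume reserves nothing; it only records the promise.
        self.provisioned[name] = size_gb
        self.used[name] = 0

    def write(self, name, gb):
        if self.used[name] + gb > self.provisioned[name]:
            raise ValueError("volume is full")
        if gb > self.free_gb():
            raise RuntimeError("pool is out of physical space")
        self.used[name] += gb

    def delete(self, name, gb):
        # Freed space goes straight back to the shared pool.
        self.used[name] = max(0, self.used[name] - gb)

    def free_gb(self):
        return self.physical_gb - sum(self.used.values())

    def overcommit_ratio(self):
        return sum(self.provisioned.values()) / self.physical_gb


pool = ThinPool(physical_gb=1000)
for name in ("app", "db", "logs"):
    pool.create_volume(name, size_gb=1000)  # 3000 GB promised on 1000 GB physical

pool.write("app", 100)
pool.write("db", 50)
free_after_writes = pool.free_gb()
pool.delete("app", 40)
free_after_delete = pool.free_gb()
```

The pool in this example is overcommitted three to one, yet it only runs out of space if the volumes collectively write more than the physical capacity.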

Here’s a classic example of how Cloud Volumes ONTAP thin provisioning saves storage and reduces costs. One of NetApp’s Cloud Volumes ONTAP customers is a well-known university, which estimated that it would need close to 30TB of physical storage in order to meet its storage allocation obligations to its students and faculty members. In reality, however, most of the students and faculty were using little or no storage. With Cloud Volumes ONTAP thin provisioning, the university was able to seamlessly meet its storage obligations with only 3.5TB of physical storage, saving $90,000 a year in storage costs in the process.

Superior Snapshots

Snapshots are read-only, point-in-time, virtual copies of file systems or volumes, used primarily for backup and recovery purposes. Most snapshot implementations use what’s called a “copy-on-write” (CoW) approach. The initial snapshot points to the original data, and data blocks are only copied when the original data is to be overwritten (or deleted) by a change. In this way, a true point-in-time version of the original is preserved. Although more space-efficient than snapshots that copy the entire volume every time a change is made, CoW snapshots still require reserved storage capacity to store the snapshots—typically 10-20% of the volume size.

NetApp’s proprietary Snapshot technology is at the heart of Cloud Volumes ONTAP for Azure’s suite of disaster-tolerant data protection solutions such as SnapMirror®, SnapRestore® and SnapVault®. NetApp was the first to introduce “redirect-on-write” (RoW) snapshots, leveraging its unique storage virtualization technology that maintains a set of pointers to individual blocks of data. When data changes, the updated data is written to a new block and the active pointer is redirected to it, while the original block stays in place. Thus, a NetApp snapshot is a point-in-time, read-only copy of the pointers as they stood when the snapshot was taken; it takes less than a second to create, incurs no performance overhead, and is highly space-efficient (only taking up several KBs). Each point-in-time Snapshot copy remains stable as changes are made to the original data, and up to 255 Snapshot copies can be stored per volume. Each newly created Snapshot copy consumes additional space only for the blocks added or changed since the previous Snapshot. Cloud Volumes ONTAP Snapshots make it possible to take snapshots frequently enough to meet a demanding, up-to-the-minute SLA while keeping costs low.
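The redirect-on-write behavior described above can be modeled with a pointer table. The sketch below is a deliberately simplified illustration, not NetApp's implementation: the `Volume` class and its method names are hypothetical, and it never reclaims blocks that no snapshot references.

```python
class Volume:
    """Toy redirect-on-write volume with pointer-based snapshots."""

    def __init__(self):
        self.blocks = {}     # physical block id -> data
        self.active = {}     # logical block number -> physical block id
        self.snapshots = {}  # snapshot name -> frozen copy of the pointer table
        self._next_id = 0

    def write(self, lbn, data):
        # New data always lands in a fresh block; the old block is untouched.
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        self.active[lbn] = block_id  # redirect the active pointer

    def snapshot(self, name):
        # Only the pointers are copied, never the data blocks themselves.
        self.snapshots[name] = dict(self.active)

    def read(self, lbn, snapshot=None):
        table = self.active if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[lbn]]


vol = Volume()
vol.write(0, b"version 1")
vol.write(1, b"unchanged")
vol.snapshot("hourly.0")
vol.write(0, b"version 2")  # overwrite after the snapshot
```

After the overwrite, the active volume sees the new data while the snapshot still resolves to the original block, and only one extra block has been consumed.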

Near-Zero Capacity Volume Cloning

There are many cases in which it is necessary or desirable to clone a volume, such as creating forks for the application dev/test cycle, provisioning new virtual machines, or creating DR volumes. Cloud Volumes ONTAP for Azure leverages the Snapshot technology described above in its industry-leading FlexClone® solution:

[Figure: Volume cloning - ONTAP cloud]

Cloned volumes are created instantly and at near-zero capacity regardless of the source data size. Thus, FlexClone technology dramatically cuts the storage you need for dev/test or virtual environments by 50% or more. FlexClone also makes it easy to fully test your DR processes, which is essential to any robust DR plan. Last, but not least, you can deploy tens or even hundreds of virtual machines in minutes — with only a small incremental increase in your storage needs (and costs).
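To see why a clone starts at near-zero capacity regardless of source size, consider this small Python model in which a clone is just a writable copy of the parent's pointer table over a shared block store. `BlockStore` and `CloneableVolume` are hypothetical names for this sketch and do not correspond to FlexClone's actual internals.

```python
class BlockStore:
    """Shared physical storage: an append-only list of blocks."""

    def __init__(self):
        self.blocks = []

    def put(self, data):
        self.blocks.append(data)
        return len(self.blocks) - 1  # physical block id


class CloneableVolume:
    def __init__(self, store, pointers=None):
        self.store = store
        self.pointers = dict(pointers or {})  # logical block -> physical block id

    def write(self, lbn, data):
        # Writes go to new blocks, so parent and clone never disturb each other.
        self.pointers[lbn] = self.store.put(data)

    def read(self, lbn):
        return self.store.blocks[self.pointers[lbn]]

    def clone(self):
        # Copies pointers only; no data blocks are duplicated.
        return CloneableVolume(self.store, self.pointers)


store = BlockStore()
prod = CloneableVolume(store)
for lbn in range(1000):
    prod.write(lbn, b"prod data %d" % lbn)

blocks_before_clone = len(store.blocks)
dev = prod.clone()
blocks_after_clone = len(store.blocks)

dev.write(0, b"patched in dev")  # only this change consumes new space
blocks_after_write = len(store.blocks)
```

Creating the clone adds no blocks at all in this model, and a change made in the dev clone costs one new block while the production volume keeps its original data.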

Final Note

We invite you to use our Azure Calculator to see how Azure and NetApp services work together to provide a cost-efficient data storage solution for your organization.

Gali Kovacs
