Snapshot Copies: When to Use Them and When to Tier Them

As one piece of a complete backup strategy, snapshot copies play an important role in the majority of recovery scenarios. Virtual machine environments use them as safe points to roll back to before patches or major upgrades. Dev/test teams can use snapshot copies to clone whole new volumes for testing without affecting production data. Snapshot copies can also be used for single-file recovery or for quickly recovering from data corruption and ransomware attacks.

NetApp has an innovative snapshot technology that makes these copies quick to create and highly space efficient. But snapshot copies still take up storage space, and in on-prem environments that space can be precious. What can you do about the space that snapshot copies consume on high-performance storage appliances, like NetApp AFF and SSD-backed FAS systems? That’s where Cloud Tiering comes in.

In this article we explain how NetApp Snapshot technology works and how the Cloud Tiering service makes it easy to store these copies cost-effectively in the cloud.

NetApp Snapshot Technology: A Quick Overview

If you’ve used an ONTAP system before, you are probably already familiar with the basic concept of a snapshot copy: a read-only, point-in-time copy of your data. Understanding what goes on behind the scenes requires a slightly deeper look.

Whenever a snapshot is triggered to capture the state of the data in a volume, the corresponding data blocks in the volume are secured by the snapshot, making them immutable. From that moment until any of the blocks included in the snapshot is modified, the snapshot shares exactly the same storage as the active file system of the volume. This is why snapshots are so space efficient. As soon as any of the data blocks changes after the snapshot is taken, the snapshot starts to diverge from the active file system and additional storage space is consumed. The storage system now maintains both the new, modified data block and the original one, which exists only in the snapshot.

To understand why the NetApp Snapshot technology has excelled, we need to examine the two main types of snapshot technology used to handle this data modification process and keep track of everything: COW (Copy on Write) and ROW (Redirect on Write).

Copy on Write

Under the COW approach, data blocks in a snapshot that need to be modified must first be copied elsewhere. Let’s say we just saved a file we were working on, and after we saved it the file was included in the most recent snapshot taken by the storage system. When we open the file and update some of its contents, the corresponding data blocks of that file must now be physically modified.

Before overwriting any of the existing data blocks, the COW process copies them to new storage blocks somewhere else on the disk. The new storage blocks now contain the original information from our file, kept secure by the snapshot. Only after the copy is complete does the system overwrite the original blocks with the modified file data. Those original blocks now belong only to the active file system and contain the updated data. This whole process takes three Input/Output operations:

  1. A read operation to copy the original blocks.
  2. A write operation to put them in the snapshot area on the disk.
  3. A write operation of the modified data into the original blocks.

This approach comes with a certain price in computational resources and with inherent performance penalties:

  • When data is modified, a total of three I/O operations are carried out: one read and two writes.
  • Similarly, when a restore operation from a snapshot is performed, the storage system goes through a decision process in which it examines each block protected by the snapshot. For each block it needs to know whether or not that block was modified. If it wasn’t modified, the block is read from its original location. If it was modified, the storage system needs to fetch the data block from the location it was copied to.

These drawbacks can have ramifications for the system’s performance. That’s why this approach is recommended only for short-term or temporary backups.
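
To make the I/O accounting above concrete, here is a minimal Python sketch of a copy-on-write volume. It is a toy model, not how any particular storage system implements COW: the class name CowVolume, its methods, and the in-memory "snapshot area" are all assumptions made for illustration. Note that in this model only the first write to a protected block pays the full three operations, because the original only needs to be preserved once.

```python
class CowVolume:
    """Toy copy-on-write volume that counts I/O operations (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # in-place data of the active file system
        self.protected = set()       # block numbers captured by the snapshot
        self.snapshot_area = {}      # block number -> preserved original contents
        self.reads = 0
        self.writes = 0

    def take_snapshot(self):
        # No data is copied yet; every current block is only marked as protected.
        self.protected = set(range(len(self.blocks)))
        self.snapshot_area = {}

    def write(self, block_no, data):
        if block_no in self.protected and block_no not in self.snapshot_area:
            original = self.blocks[block_no]          # 1. read the original block
            self.reads += 1
            self.snapshot_area[block_no] = original   # 2. write it to the snapshot area
            self.writes += 1
        self.blocks[block_no] = data                  # 3. overwrite the original block
        self.writes += 1

    def read_snapshot(self):
        # Per-block decision: modified blocks come from the snapshot area,
        # unmodified blocks come from their original location.
        return [self.snapshot_area.get(i, self.blocks[i])
                for i in sorted(self.protected)]


vol = CowVolume(["a", "b", "c"])
vol.take_snapshot()
vol.write(1, "B")
print(vol.reads, vol.writes)   # 1 2
print(vol.read_snapshot())     # ['a', 'b', 'c']
```

The per-block lookup in read_snapshot is the "decision process" described in the second bullet: the snapshot view has to be reassembled from two places.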

Now let’s take a look at how the NetApp ONTAP Snapshot technology works by using the Redirect on Write approach.

Redirect on Write

With the Redirect on Write approach, whenever a data block protected by a snapshot needs to be modified, the system only redirects the writes of data that was changed to new storage blocks and updates the pointers of the active file system to those blocks. This is the method used by ONTAP Snapshot technology.

ONTAP Snapshot technology consumes only one write I/O operation per modified data block, in contrast to the COW approach, which consumes three. It also consumes fewer computational resources when restoring from a snapshot, since the snapshot pointers always point to the original location of the data blocks, which are themselves “locked” and can never be overwritten as long as the snapshot exists. Snapshots are independent sets of pointers, unaffected by where the active file system pointers currently point. When a restore is performed, the active file system pointers are all updated to match the pointers of the snapshot. In this way, a restore of terabytes of data can complete within seconds. These two key points, together with the ability to preserve the storage efficiencies achieved through deduplication and compression, are what make ONTAP Snapshot technology so efficient.
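
The pointer mechanics described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not ONTAP’s actual implementation: the class RowVolume, its methods, and the snapshot name "hourly.0" are hypothetical, and physical blocks are just entries in a dictionary.

```python
class RowVolume:
    """Toy redirect-on-write volume built on pointer maps (illustrative only)."""

    def __init__(self, blocks):
        self.store = {}       # physical block id -> contents
        self.active = {}      # logical block number -> physical block id
        self.snapshots = {}   # snapshot name -> frozen copy of the pointer map
        self.writes = 0       # write I/Os caused by modifications
        self._next_id = 0
        for lbn, data in enumerate(blocks):
            self.active[lbn] = self._allocate(data)

    def _allocate(self, data):
        block_id = self._next_id
        self._next_id += 1
        self.store[block_id] = data
        return block_id

    def take_snapshot(self, name):
        # Only pointers are copied; no data block is read or written.
        self.snapshots[name] = dict(self.active)

    def write(self, lbn, data):
        # The new data is redirected to a fresh block and the active pointer
        # is updated; the block referenced by the snapshot is left untouched.
        self.active[lbn] = self._allocate(data)
        self.writes += 1

    def restore(self, name):
        # A restore only resets the active pointers to match the snapshot.
        self.active = dict(self.snapshots[name])

    def read_active(self):
        return [self.store[self.active[lbn]] for lbn in sorted(self.active)]

    def snapshot_only_blocks(self, name):
        # Physical blocks referenced by the snapshot but not by the active file system.
        return set(self.snapshots[name].values()) - set(self.active.values())


vol = RowVolume(["a", "b", "c"])
vol.take_snapshot("hourly.0")
vol.write(1, "B")
print(vol.writes)                            # 1
print(vol.read_active())                     # ['a', 'B', 'c']
print(vol.snapshot_only_blocks("hourly.0"))  # {1}
vol.restore("hourly.0")
print(vol.read_active())                     # ['a', 'b', 'c']
```

In this model the restore cost depends on the number of pointers, not on the amount of data behind them, which is the intuition behind restoring terabytes within seconds.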

How Cloud Tiering Can Optimize Snapshot Storage

On one hand, snapshot copies are an important part of a backup strategy and are used as the means of recovery in more than 80% of cases. On the other hand, inactive or cold data can comprise more than 50% of the total capacity in mid- and large-scale business storage environments, according to research done by Osterman in “Data Center Transformation Requires Software-Based Cloud Storage.” Snapshot copies are part of this bulk of inactive data, and although they are highly space efficient they usually take up 10% or more of the storage capacity, which leaves room for a more efficient storage management solution. This is where Cloud Tiering kicks in.

Cloud Tiering offers an easy-to-use solution for your on-prem AFF or SSD-backed FAS systems to offload cold data to low-cost object storage. Through NetApp Cloud Manager you can easily subscribe to the service, discover your ONTAP on-prem cluster, set the desired configurations and start tiering cold data to Azure Blob, Amazon S3, Google Cloud Storage, or NetApp StorageGRID.

Overall, Cloud Tiering has three tiering policies that you can set at the volume level. Two of them have configurable cooling periods: “Cold user data & snapshots” (Auto) and “Cold snapshots” (Snapshot-only). Since we’re talking about snapshot copies in this article, let’s take a closer look at the “Cold snapshots” policy.

This policy tiers any data blocks in a snapshot that are no longer shared with the active file system, which is what happens once the corresponding blocks in the active file system are modified. Data blocks that haven’t been modified are still shared by both the snapshot and the active file system. The policy therefore opens room for additional storage savings in cases where data changes moderately to frequently and snapshot storage consumption rises as a result.
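
The selection rule described above can be expressed as a short Python sketch. This is not the algorithm Cloud Tiering or ONTAP actually runs; the function name cold_snapshot_blocks, its parameters, the per-block age map, and the default cooling period of 2 days are all assumptions made for illustration.

```python
def cold_snapshot_blocks(snapshot_blocks, active_blocks, age_days, cooling_days=2):
    """Return the block ids that a snapshot-only style policy would consider cold.

    snapshot_blocks: ids of blocks referenced by snapshots
    active_blocks:   ids of blocks referenced by the active file system
    age_days:        hypothetical map of block id -> days since last access
    cooling_days:    hypothetical cooling period before a block counts as cold
    """
    # Blocks still shared with the active file system are never candidates.
    snapshot_only = set(snapshot_blocks) - set(active_blocks)
    # Among snapshot-only blocks, keep those that have cooled long enough.
    return {b for b in snapshot_only if age_days.get(b, 0) >= cooling_days}


candidates = cold_snapshot_blocks(
    snapshot_blocks={1, 2, 3, 4},
    active_blocks={1, 2, 5},
    age_days={3: 5, 4: 1},
)
print(sorted(candidates))   # [3]
```

Blocks 1 and 2 are shared with the active file system, so they are excluded. Block 4 is snapshot-only but has not yet passed the cooling period, so only block 3 is selected.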

Since the chance of a snapshot being used for recovery decreases significantly after just a few days, the “Cold snapshots” policy offered by the Cloud Tiering service provides a great opportunity to tier cold snapshot blocks and free up space on your high-performance SSDs. If you perform a recovery from one of those snapshots already tiered to the cloud, the corresponding blocks are seamlessly read from the cold cloud tier and placed back in the hot performance tier. The data is always there when you need it.

While many cases could benefit from this approach, a database environment is a great example. A typical database observed by NetApp has a weekly turnover rate of 5% relative to the total size of the database. On closer examination, that 5% becomes 15%, because some of those data blocks are modified more than once and those changes are saved by the snapshots. OLTP workloads with higher data change rates could also benefit from this type of Snapshot-only policy.
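
As a rough worked example of those percentages, the Python sketch below applies them to a hypothetical 10 TB database. The database size and the function name are assumptions, and treating the 15% figure as the weekly amount of changed data held by snapshots is one reading of the numbers above, not a measured result.

```python
def weekly_change_tb(db_size_tb, rate):
    """Capacity in TB corresponding to a weekly change rate (simple proportion)."""
    return db_size_tb * rate


db_size_tb = 10  # hypothetical database size

# 5%: share of the database that turns over in a week.
print(round(weekly_change_tb(db_size_tb, 0.05), 2))   # 0.5
# 15%: the same turnover once repeated modifications kept by snapshots are counted.
print(round(weekly_change_tb(db_size_tb, 0.15), 2))   # 1.5
```

Under these assumptions, about 1.5 TB per week would end up held by snapshots, three times what the 5% turnover alone suggests.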

Festo, the German automation and control company, applied this Snapshot-only policy to snapshot blocks older than 48 hours on 200 of its AFF ONTAP volumes, and has been gradually extending its use of the service to free hundreds of TBs and scale up its on-prem capacity.

Conclusion

NetApp uses its Snapshot technology to provide you with instantaneous, highly efficient storage backups that can be used in many recovery scenarios, including recovery from ransomware attacks, without impacting your AFF or FAS performance. Add Cloud Tiering to this mix and you end up with a more efficient snapshot-based backup strategy and savings in storage TCO.
