Alongside compute and network, storage is one of the fundamental resources required in today’s technological systems and software development. Regardless of whether your systems are on-premises, in the cloud, or both, you always rely on storage components to persist your data. However, if you are used to the world of data centers (with SAN, NAS, and local hard drives) and are starting to venture into the public cloud, the concept of cloud-based storage and the different services available for it might prove tricky.
In this article, we are going to provide some clarity about the different storage options in the Google Cloud Platform.
Google Cloud provides three main services for different types of storage: Persistent Disks for block storage, Filestore for network file storage, and Cloud Storage for object storage. These services are at the core of the platform and act as building blocks for the majority of Google Cloud services and, by extension, for the systems you build on top of them.
Since NetApp Cloud Volumes ONTAP is now available for use in Google Cloud, NetApp users can easily take advantage of this growing cloud platform. Let’s take a closer look at each of these Google Cloud storage services, what they were designed for, and which use cases each one is best suited to handle.
Google Cloud Persistent Disks (Block Storage)
Block storage is the traditional storage type, both in the cloud and in on-premises systems. A Google Cloud Persistent Disk provides block storage and is used by all virtual machines in Google Cloud (Google Cloud Compute Engine). The easiest way to understand Persistent Disks is to imagine them as USB drives: they can be attached to or detached from virtual machines and, as the name suggests, they let your data persist regardless of whether virtual machines are started, stopped, or terminated.
In addition to Google Cloud Compute Engine virtual machines, these Persistent Disks are also used to power the Google Kubernetes Engine service.
Very much like a virtual disk in your local machine, a Google Cloud Persistent Disk can be either HDD or SSD, the latter for high I/O performance. You can also choose where disks are located and what level of availability you need: they can be Regional, Zonal, or Local. While local disks are not, strictly speaking, part of the Google Cloud Persistent Disk service, it is important to mention them. These local disks are only available on the hardware where the virtual machine is running and, while they provide the best I/O performance, they are not often recommended due to their low availability and redundancy. On the other hand, if you require high availability, Regional disks offer that out of the box, with your disks being replicated behind the scenes across different zones within a region. A more moderate (and less expensive) approach is to use Zonal disks, which are also highly available, but only within a single zone.
Other lesser-known yet great features of Google Cloud Persistent Disks are automatic encryption, the flexibility to resize disks while they are in use, and a snapshot capability that can be used for both backup and virtual machine image creation.
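To make the trade-offs between Regional, Zonal, and Local disks more concrete, here is a minimal Python sketch of the decision described above. The `choose_disk` helper and the `DiskChoice` type are hypothetical names invented for illustration. They are not part of any Google Cloud API, and a real decision would also weigh cost and capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskChoice:
    scope: str  # "regional", "zonal", or "local"
    media: str  # "ssd" or "hdd"


def choose_disk(needs_cross_zone_ha: bool,
                data_must_persist: bool,
                high_io: bool) -> DiskChoice:
    """Map requirements onto a disk scope and media type.

    Hypothetical helper for illustration only, not a Google Cloud API.
    """
    if needs_cross_zone_ha:
        # Replicated across zones within a region.
        scope = "regional"
    elif data_must_persist:
        # Persistent, but tied to a single zone.
        scope = "zonal"
    else:
        # Scratch data only: lives on the VM's host hardware.
        scope = "local"

    # Local disks on Compute Engine are SSD-based; otherwise pick by I/O needs.
    media = "ssd" if (high_io or scope == "local") else "hdd"
    return DiskChoice(scope=scope, media=media)
```

For example, `choose_disk(needs_cross_zone_ha=False, data_must_persist=True, high_io=True)` suggests a Zonal SSD disk, which matches the more moderate option described above.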
Google Cloud Filestore (Network File Storage)
Filestore is the fully managed Google Cloud service that provides network file storage. Network file storage is not a new cloud concept and, very much like block storage, it also exists in your typical on-premises data center. If you’re used to working with NAS (Network Attached Storage), the concept should be familiar to you.
While you could of course argue that network file storage is technically block storage (it is!), there is a very clear distinction here. Network file storage, as the name suggests, provides disk storage over the network. This enables the development of systems with multiple parallel services that can read and write files from the same disk storage mounted over the network.
However, these advantages call for some caution. Compared with the usual block or object storage, the performance of file storage is, as you might expect, substantially inferior, and sharing the same files between multiple services can lead to issues with concurrency and file permissions. Therefore, when designing cloud-native systems, you should only use this solution after carefully evaluating how you will address these challenges.
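One common way to reduce concurrency surprises when several services share files is to write to a temporary file and then rename it into place, so a reader opens either the old version or the new one, never a half-written file. The sketch below is a general-purpose pattern in Python, not something specific to Filestore, and `write_atomically` is a hypothetical helper name. It assumes the share is mounted as a regular directory. On a network share, client-side caching can still delay when other machines see the new file.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes, mode: int = 0o644) -> None:
    """Replace the file at `path` with `data` using write-then-rename."""
    directory = os.path.dirname(path) or "."
    # The temporary file must live in the same directory (same filesystem)
    # so that the final rename does not turn into a copy.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to storage before renaming
        # mkstemp creates owner-only files; set the permissions that the
        # other services reading this file will need.
        os.chmod(tmp_path, mode)
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave stray temporary files behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The explicit `mode` argument is there because the permissions issue mentioned above is easy to hit: a file created by one service with restrictive defaults may be unreadable by another.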
Google Cloud Storage (Object Storage)
What is Google Cloud Storage?
Google Cloud Storage is the object storage service offered by Google Cloud. It provides some very interesting out-of-the-box features, such as object versioning and fine-grained permissions (per object or bucket), that can make development easier and help reduce operational overhead. Google Cloud Storage serves as the foundation of several different services.
What kind of benefits does this storage type have? The concept of object storage is not that easy to grasp. In typical on-premises systems, where capacity is more limited and connectivity is fast and exclusive, having this type of storage is not at all common. The way that object storage works, however, is beautifully simple to the end user. In simple terms, its value proposition is that you can get and put any file you want via a REST API, and this can expand indefinitely, with each object growing up to the terabyte scale. Interesting, right? In Cloud Storage, different objects are grouped in unique “namespaces” called buckets. A bucket can hold multiple objects, yet a single object will belong to only one bucket.
This model for storage is widely popular in cloud-native systems due to its low cost (cents per GB) combined with the serverless approach and simplicity. The heavy work of data replication, availability, integrity, capacity planning, etc. is then left to the cloud provider. The drawback of object storage is that there is no other way to access the data besides the REST API; therefore, the typical approach to designing systems, managing data, and structuring filesystem-style access doesn’t work.
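To illustrate the "get and put via a REST API" idea, the following Python sketch builds the URLs used to download and upload an object through the Cloud Storage JSON API. The function names, bucket, and object names are hypothetical examples. The sketch only constructs URLs: a real request would also need an OAuth 2.0 `Authorization` header, which is omitted here.

```python
from urllib.parse import quote

API_ROOT = "https://storage.googleapis.com"


def download_url(bucket: str, object_name: str) -> str:
    """URL for fetching an object's contents with an HTTP GET."""
    # Object names are a single path segment, so "/" must be percent-encoded.
    return (f"{API_ROOT}/storage/v1/b/{quote(bucket, safe='')}"
            f"/o/{quote(object_name, safe='')}?alt=media")


def upload_url(bucket: str, object_name: str) -> str:
    """URL for a simple upload: HTTP POST with the file bytes as the body."""
    return (f"{API_ROOT}/upload/storage/v1/b/{quote(bucket, safe='')}"
            f"/o?uploadType=media&name={quote(object_name, safe='')}")
```

Note how an object name such as `logs/2020/app.log` looks like a path but is really just one flat key inside the bucket, which is why filesystem-style access patterns do not carry over directly.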
Google Cloud Storage Classes, Archival Storage, and Lifecycle Management Rules
Perhaps some of the most underrated functionalities of Google Cloud Storage are the different storage classes and the use of Lifecycle Management Rules on your buckets. Using these features can make a huge difference in terms of cost and ongoing operational expenses.
In Google Cloud Storage, you are required to select one storage class for your bucket: Standard (which can be either Regional or Multi-Regional), Nearline, or Coldline. The usual approach is to select Standard, where you can opt to have your bucket in a specific single Google Cloud Region or stored across multiple Regions. This works really well in different scenarios since you get highly performant and highly available storage.
However, there are several cases where the data is not meant to be accessed frequently and reduced availability is perfectly acceptable. In these cases, the Nearline and Coldline storage classes are options that can and should be explored. They can easily reduce cost by more than 50% compared with the Standard storage class.
The Nearline storage class is designed for data that is accessed less than once per month. One example use case is data that will only be used to produce an aggregate monthly report. Coldline storage, on the other hand, is designed for data that is accessed even less frequently (think once per year or less). This storage class is therefore particularly useful for archival storage. One use case is to use the Coldline storage class to keep a copy of data that the business must retain for a long period of time (e.g., 10 years) to comply with regulatory requirements.
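The access-frequency guidance above can be sketched as a small Python helper. The `suggest_storage_class` function and its thresholds are assumptions taken directly from the rules of thumb in this article (less than once per month for Nearline, once per year or less for Coldline). It ignores other factors such as retrieval fees and minimum storage durations, so treat it as a starting point rather than a pricing tool.

```python
def suggest_storage_class(expected_reads_per_year: float) -> str:
    """Suggest a storage class from how often the data is expected to be read.

    Hypothetical helper: thresholds follow this article's rules of thumb.
    """
    if expected_reads_per_year <= 1:
        # Accessed once per year or less: archival data.
        return "COLDLINE"
    if expected_reads_per_year < 12:
        # Accessed less than once per month.
        return "NEARLINE"
    # Accessed monthly or more often.
    return "STANDARD"
```

For instance, compliance archives that are read perhaps once every few years would map to Coldline, while data behind a daily dashboard would stay in Standard.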
Lifecycle Management Rules
One of the challenges of properly leveraging storage classes (and other Cloud Storage features) is that the same type of data (i.e., data in the same bucket) might require different treatment over the course of its lifetime. For example, if you use Google Cloud Storage buckets to store your application logs, you might require high availability for the data during the first month (including versioning each object as a safeguard against data tampering), perhaps less availability (without versioning) for the next six months, and eventually a retained copy of those logs for the following five-plus years due to compliance obligations.
For these types of scenarios, you can enable Google Cloud Storage Lifecycle Management rules. This is a really great built-in feature of Google Cloud Storage that enables you to define business logic rules per bucket without much effort. With these rules you can define actions such as automatically transitioning objects between different storage classes, removing older object versions, or even deleting objects after certain defined periods of time. Leveraging these features and having lifecycle rules in place can translate into real and tangible cost savings at the end of the month.
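As a rough sketch of what such rules might look like for the application logs scenario, the Python snippet below builds a lifecycle configuration in the shape of the `lifecycle` field of a bucket resource in the Cloud Storage JSON API. The function name and the day counts are assumptions chosen to mirror the example (one month, then six months, then five years). They are illustrative values, not recommendations, and the versioning cleanup is left out to keep the sketch short.

```python
import json

# Assumed durations, in days, mirroring the log retention example.
DAYS_HOT = 30            # first month in Standard
DAYS_WARM = 6 * 30       # next six months in Nearline
DAYS_ARCHIVE = 5 * 365   # following five years in Coldline


def log_bucket_lifecycle() -> dict:
    """Build a lifecycle configuration for a hypothetical log bucket."""
    nearline_at = DAYS_HOT
    coldline_at = DAYS_HOT + DAYS_WARM
    delete_at = coldline_at + DAYS_ARCHIVE
    return {
        "lifecycle": {
            "rule": [
                {   # After the first month, move Standard objects to Nearline.
                    "action": {"type": "SetStorageClass",
                               "storageClass": "NEARLINE"},
                    "condition": {"age": nearline_at,
                                  "matchesStorageClass": ["STANDARD"]},
                },
                {   # After six more months, move Nearline objects to Coldline.
                    "action": {"type": "SetStorageClass",
                               "storageClass": "COLDLINE"},
                    "condition": {"age": coldline_at,
                                  "matchesStorageClass": ["NEARLINE"]},
                },
                {   # Once the retention period is over, delete the object.
                    "action": {"type": "Delete"},
                    "condition": {"age": delete_at},
                },
            ]
        }
    }
```

Calling `json.dumps(log_bucket_lifecycle(), indent=2)` produces a JSON document you can review before applying it to a bucket with your tool of choice.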
In this article, we explored the different storage building blocks in the cloud and provided an overview of the different storage services available in Google Cloud. They are quite straightforward to grasp and provide a huge amount of flexibility to develop different types of systems in the cloud.
When designing and architecting your system, it is important to understand the pros and cons of each storage type and the costs associated with them. What will likely appeal to current NetApp users trying to find out how to use Google Cloud storage is that Cloud Volumes ONTAP is now available for use with Google Cloud. NetApp’s groundbreaking cloud data management platform, which has been a huge success for enterprises in AWS and Azure, can finally start to leverage the benefits of Google Cloud.
To try out NetApp Cloud Volumes for Google Cloud, sign up for a 30-day free trial here.