Kubernetes Persistent Storage: Why, Where and How

Kubernetes persistent storage gives applications a convenient way to request and consume storage resources. The Volume is the basic building block of the Kubernetes storage architecture. A Persistent Volume is a cluster-level storage resource that outlives the pods that use it, so it can retain data for long periods of time.

In this post, we’ll review persistent storage concepts, Kubernetes storage integrations and features, and show how NetApp Cloud Volumes ONTAP can help provision highly available, high performance storage for Kubernetes applications.

What is Kubernetes Persistent Storage?

Containers are ephemeral by design: when a container shuts down, all data written inside it during its lifetime is lost. This is suitable for some applications, but in many cases applications need to preserve state or share information with other applications. A common example is applications that rely on databases (see our post on MySQL Kubernetes). For these and other use cases, containers need a place to store information persistently, so that it survives the shutdown of one or more containers.

Kubernetes provides a convenient persistent storage mechanism for containers, based on the concept of a Persistent Volume (PV). Kubernetes Volumes are constructs that let you mount a storage unit, such as a file system folder or a cloud storage volume, into the containers of a Pod and share data between those containers. Regular Volumes are deleted when the Pod that uses them is removed. A Persistent Volume, by contrast, is a cluster-level resource with its own life cycle, independent of any Pod, so it can remain available for as long as ongoing operations require.

The Importance of Kubernetes Persistent Storage

There has always been a need to hide the details of storage implementation from the applications and users that consume it. Many years ago, protocols like NFS, iSCSI and SMB let applications and operating systems access a storage resource seamlessly, regardless of whether it was an SSD or a magnetic drive, or which vendor made the device.

In the cloud native world, this interoperability has broken down. There are a myriad of storage services and systems, some offered by public cloud providers and some by commercial vendors or open source projects. The major cloud providers also offer many third-party storage solutions.

This creates an environment in which applications or users who access the data need to integrate with a specific storage system. For example, many tools and cloud applications integrate with Amazon S3. S3 is a great solution, but the fact that these components have to integrate explicitly with a specific storage service creates an unhealthy dependency.

Kubernetes is trying to change that. Its PersistentVolume abstraction allows cloud native applications to connect to a variety of cloud storage systems, virtualized storage, and proprietary or open source storage platforms, without having to explicitly integrate with those systems. An application can simply request storage, and have it provisioned, without being aware of the details of the implementation.
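To make this concrete, here is a minimal sketch of an application requesting storage without naming any storage system. It builds a claim and a Pod as Python dictionaries and prints them as JSON, which kubectl accepts alongside YAML. The names "app-data" and "web", the nginx image and the mount path are placeholders we chose for illustration.

```python
import json

# Hypothetical names throughout: "app-data", "web", the image and the
# mount path are placeholders, not values from a real deployment.
claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "app-data"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web"},
    "spec": {
        "containers": [{
            "name": "web",
            "image": "nginx",
            # The container only sees a directory; it does not know what backs it.
            "volumeMounts": [{"name": "data", "mountPath": "/var/lib/app"}],
        }],
        "volumes": [{
            "name": "data",
            # The Pod refers to the claim by name, never to a storage system.
            "persistentVolumeClaim": {"claimName": claim["metadata"]["name"]},
        }],
    },
}

# A v1 List lets both objects be applied from a single file.
manifests = {"apiVersion": "v1", "kind": "List", "items": [claim, pod]}
print(json.dumps(manifests, indent=2))
```

Saving the output to a file and running kubectl apply -f on it would create both objects. Nothing in the Pod definition mentions NFS, a cloud disk, or any vendor.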

This can make consumption of cloud storage much more seamless and eliminate integration costs. It can also reduce vendor lock-in, making it much easier to migrate between clouds and adopt multi-cloud strategies.

Kubernetes Persistent Storage Concepts

There are three primary concepts you should understand as you start working with Kubernetes persistent storage:

PersistentVolume (PV)
An API object that represents a piece of storage available to your Kubernetes cluster. A PV is implemented as a Volume plugin that hides the details of the storage implementation (such as NFS or iSCSI communication) from the storage consumer. The main feature of a PV is its independent life cycle: it continues to exist after the pods accessing it have shut down.
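As an illustration, a statically provisioned PV backed by an NFS export could look like the sketch below. It builds the manifest as a Python dictionary and prints it as JSON. The PV name, the server address and the export path are hypothetical placeholders.

```python
import json

# Hypothetical NFS-backed PersistentVolume: the name, server address and
# export path are placeholders for illustration only.
pv_manifest = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "pv-nfs-example"},
    "spec": {
        "capacity": {"storage": "5Gi"},
        "volumeMode": "Filesystem",
        "accessModes": ["ReadWriteMany"],
        "persistentVolumeReclaimPolicy": "Retain",
        # The plugin-specific section: consumers of the PV never see this part.
        "nfs": {"server": "nfs.example.internal", "path": "/exports/data"},
    },
}

print(json.dumps(pv_manifest, indent=2))
```

Only the "nfs" section is specific to the storage backend. The rest of the spec uses the same fields regardless of which plugin provides the storage.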

PersistentVolumeClaim (PVC)
This is a request for storage made by a user or application. The claim can include specific storage parameters required by the application, for example an amount of storage or a specific type of access (read/write, read-only, etc.).

Kubernetes looks for a PV that meets the criteria defined in the user’s PVC, and if there is one, it matches claim to PV. This is called binding. You can also configure the cluster to dynamically provision a PV for a claim.
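The matching step can be pictured with a small sketch. This is a simplified illustration, not the actual Kubernetes controller logic: it assumes the only criteria are phase, storage class, access modes and capacity, and that the smallest volume that fits is preferred. The volume and claim records are made-up sample data.

```python
# Simplified illustration of static binding; NOT the real Kubernetes controller.
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def to_bytes(quantity):
    """Convert a quantity such as '5Gi' to bytes (binary suffixes only)."""
    suffix = quantity[-2:]
    if suffix in UNITS:
        return int(quantity[:-2]) * UNITS[suffix]
    return int(quantity)


def find_match(claim, volumes):
    """Return the smallest Available volume that satisfies the claim, or None."""
    candidates = [
        pv for pv in volumes
        if pv["phase"] == "Available"
        and pv["storageClassName"] == claim["storageClassName"]
        and set(claim["accessModes"]) <= set(pv["accessModes"])
        and to_bytes(pv["capacity"]) >= to_bytes(claim["request"])
    ]
    return min(candidates, key=lambda pv: to_bytes(pv["capacity"]), default=None)


# Made-up sample data.
volumes = [
    {"name": "pv-a", "capacity": "2Gi", "accessModes": ["ReadWriteOnce"],
     "storageClassName": "standard", "phase": "Available"},
    {"name": "pv-b", "capacity": "10Gi", "accessModes": ["ReadWriteOnce"],
     "storageClassName": "standard", "phase": "Available"},
    {"name": "pv-c", "capacity": "20Gi", "accessModes": ["ReadWriteOnce", "ReadOnlyMany"],
     "storageClassName": "standard", "phase": "Available"},
    {"name": "pv-d", "capacity": "8Gi", "accessModes": ["ReadWriteOnce"],
     "storageClassName": "standard", "phase": "Bound"},
]
claim = {"request": "5Gi", "accessModes": ["ReadWriteOnce"], "storageClassName": "standard"}

match = find_match(claim, volumes)
print(match["name"] if match else "no match: claim stays Pending")
```

In this sample, pv-a is too small and pv-d is already bound, so the claim is matched to pv-b, the smaller of the two remaining candidates. Note that the claim receives the whole volume, even though it asked for less.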

StorageClass
The StorageClass object allows cluster administrators to define classes of PVs with different properties, like performance, size or access parameters. It lets you expose persistent storage to users while abstracting the details of storage implementation. Kubernetes supports many storage provisioners that a StorageClass can reference (see the following section), and you can define as many classes as you need.

Administrators can define several StorageClasses that give users multiple options for performance. For example, one can be on a fast SSD drive but with limited capacity, and one on a slower storage service which provides high capacity.
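A sketch of two such classes follows, again built as Python dictionaries and printed as JSON. The provisioner name and the "tier" parameter are hypothetical: real provisioners each define their own names and parameters, so consult the documentation of the one you use.

```python
import json


def storage_class(name, tier, reclaim_policy="Delete"):
    """Build a StorageClass manifest for a hypothetical provisioner."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # Hypothetical provisioner; substitute the driver your cluster runs.
        "provisioner": "example.com/hypothetical-provisioner",
        # Parameters are passed through to the provisioner; "tier" is made up.
        "parameters": {"tier": tier},
        "reclaimPolicy": reclaim_policy,
        "volumeBindingMode": "WaitForFirstConsumer",
    }


classes = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        storage_class("fast", tier="ssd"),
        storage_class("capacity", tier="hdd", reclaim_policy="Retain"),
    ],
}

print(json.dumps(classes, indent=2))
```

A user would then pick a class by setting storageClassName to "fast" or "capacity" in a PersistentVolumeClaim, without needing to know what each one maps to.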

Types of Persistent Volumes

Kubernetes comes with numerous volume plugins that make different types of storage resources available to pods in the cluster. Here are some of the main plugins, grouped by type. Several of these built-in plugins have since been deprecated or removed in favor of CSI drivers, so check the documentation for your Kubernetes version.

Cloud Storage and Virtualization: GCEPersistentDisk, AWSElasticBlockStore, AzureFile, AzureDisk, VsphereVolume

Proprietary Storage Platforms: Flocker, RBD (Ceph Block Device), Cinder (OpenStack block storage), Glusterfs, Flexvolume, Quobyte Volumes, Portworx Volumes, ScaleIO Volumes, StorageOS

Physical Drives / Storage Protocols: NFS, iSCSI, FC (Fibre Channel)

For more details on these plugins, see the StorageClass documentation.

Read our blog post on the Kubernetes NFS integration. 

Persistent Volumes Features

Kubernetes Persistent Volumes offer powerful capabilities. The most important are detailed below.

Capacity

The capacity attribute sets the storage capacity of the PV. It is expressed as a quantity of bytes, usually with a suffix such as Gi (a power-of-two multiplier) or G (a power-of-ten multiplier), so that sizes are comparable across all storage services and devices.
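The difference between the two suffix families is easy to overlook. The helper below is our own minimal sketch (the function name is not part of any Kubernetes library) that converts common quantity strings to bytes. It covers only the integer and decimal forms with the suffixes listed, not every notation Kubernetes accepts.

```python
from decimal import Decimal

# Power-of-two suffixes (Ki, Mi, ...) and power-of-ten suffixes (k, M, ...).
BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18}


def quantity_to_bytes(quantity):
    """Convert a quantity string such as '5Gi' or '5G' to a number of bytes."""
    for suffix, factor in {**BINARY, **DECIMAL}.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(Decimal(quantity))  # no suffix: plain bytes


print(quantity_to_bytes("5Gi"))  # 5368709120
print(quantity_to_bytes("5G"))   # 5000000000
```

A 5Gi volume is about 7% larger than a 5G one, which matters when a claim has to fit a specific volume.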

Volume Mode

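The volume mode is Filesystem by default, meaning the volume is mounted into pods as a directory, with a file system created on the device first if it is empty. If desired, you can set the mode to Block to use the volume as a raw block device, without a file system layer.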

Access Modes

 

A PV can have the following access modes:

●      ReadWriteOnce: the volume can be mounted as read-write by a single node

●      ReadOnlyMany: the volume can be mounted as read-only by many nodes at the same time

●      ReadWriteMany: the volume can be mounted as read-write by many nodes at the same time

 

Note: Different storage plugins may only support some of these access modes.

Reclaim Policy

The reclaim policy specifies what happens to a PV after the claim bound to it is deleted. It can be set to Retain, meaning the PV and its data are kept until an administrator reclaims them manually; Recycle, meaning the data is scrubbed and the PV is made available for a new claim (this policy is deprecated in favor of dynamic provisioning); or Delete, meaning both the PV and the underlying storage asset are deleted.

 

Note: Different storage plugins may only support some of these reclamation policies.
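The effect of each policy can be modeled with a short sketch. This is a toy model of the documented behavior, not Kubernetes code. The PV records and the "data" field are made up for illustration.

```python
def on_claim_deleted(pv):
    """Return the PV record after its claim is deleted, or None if the PV is removed.

    Toy model only: 'data' stands in for whatever is stored on the volume.
    """
    policy = pv["reclaimPolicy"]
    if policy == "Retain":
        # Data stays in place; the PV is not reusable until an admin cleans it up.
        return {**pv, "phase": "Released"}
    if policy == "Delete":
        # The PV object and the backing storage asset are both removed.
        return None
    if policy == "Recycle":
        # Deprecated: contents are scrubbed and the PV is offered to new claims.
        return {**pv, "phase": "Available", "data": []}
    raise ValueError(f"unknown reclaim policy: {policy}")


for policy in ("Retain", "Delete", "Recycle"):
    pv = {"name": "pv-demo", "phase": "Bound", "reclaimPolicy": policy, "data": ["orders.db"]}
    print(policy, "->", on_claim_deleted(pv))
```

Retain is the only policy under which the data is still there afterwards, which is why it is commonly chosen for volumes holding data that is hard to recreate.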

Phase

A PV goes through the following lifecycle phases, which are visible to other entities in the cluster:

●      Available: free for use, binding has not occurred yet

●      Bound: the PV was matched to a PersistentVolumeClaim and binding has occurred

●      Released: the user deleted their PVC, but the PV is not yet reclaimed by the cluster

●      Failed: the PV could not be reclaimed by the cluster automatically
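The phase of each PV is reported in its status.phase field. As a small sketch, the function below groups volumes by phase from the JSON that a command such as kubectl get pv -o json produces. The sample document here is hand-written and trimmed to the two fields the function reads.

```python
import json
from collections import defaultdict


def group_by_phase(pv_list_json):
    """Map each phase to the names of the PVs currently in it."""
    pv_list = json.loads(pv_list_json)
    phases = defaultdict(list)
    for item in pv_list.get("items", []):
        phases[item["status"]["phase"]].append(item["metadata"]["name"])
    return dict(phases)


# Hand-written sample, trimmed to the fields used above.
sample = json.dumps({
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        {"metadata": {"name": "pv-a"}, "status": {"phase": "Available"}},
        {"metadata": {"name": "pv-b"}, "status": {"phase": "Bound"}},
        {"metadata": {"name": "pv-c"}, "status": {"phase": "Bound"}},
        {"metadata": {"name": "pv-d"}, "status": {"phase": "Released"}},
    ],
})

print(group_by_phase(sample))
```

A report like this makes it easy to spot volumes stuck in Released or Failed that need an administrator's attention.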

Kubernetes Persistent Storage with Cloud Volumes ONTAP

NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure and Google Cloud. Cloud Volumes ONTAP supports a capacity of up to 368TB and serves use cases such as file services, databases, DevOps and other enterprise workloads. Its features include high availability, data protection, storage efficiencies, cloud automation, Kubernetes integration, and more.

In particular, Cloud Volumes ONTAP provides Kubernetes integration for persistent storage requirements of containerized workloads.

Want to get started? Try out Cloud Volumes ONTAP today with a 30-day free trial.
