What is Kubernetes Storage?
Storage has always been a challenge for IT practitioners, with issues like integrity, retention, replication and migration of large data sets. These challenges are not new, and with modern, decentralized systems based on containers, they haven't gone away.
Kubernetes is the most popular orchestrator for containerized workloads. Because containers are ephemeral, any data written to a container's filesystem is lost by default when the container is removed, which is a major challenge for many types of workloads. However, Kubernetes provides several capabilities that help mitigate this problem and support stateful workloads in a containerized environment.
Kubernetes handles all aspects of the container lifecycle, including creation, management, automation, load balancing, and hardware interfaces, as well as interfaces to storage devices. Kubernetes introduces the concept of Persistent Volumes, which exist independently of containers, survive even after containers shut down, and can be requested and consumed by containerized workloads.
In this article, you will learn:
- How Does Kubernetes Storage Work?
- Volume Management in Kubernetes
- Kubernetes Storage Best Practices
- Kubernetes Storage with NetApp Cloud Volumes ONTAP
How Does Kubernetes Storage Work?
The Kubernetes storage architecture is based on Volumes as a central abstraction. Volumes can be persistent or non-persistent, and Kubernetes allows containers to request storage resources dynamically, using a mechanism called volume claims.
Volumes are the basic entity containers use to access storage in Kubernetes. They can support any type of storage infrastructure, including local storage devices, NFS and cloud storage services. Developers can create their own storage plugins to support specific storage systems. Volumes can be defined directly in a pod specification or provided through Persistent Volumes (defined below).
By default, Kubernetes storage is temporary (non-persistent). Any storage defined as part of a container in a Kubernetes pod is held in the host's temporary storage space, which exists as long as the pod exists and is then removed. Container storage is portable, but not durable.
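As a minimal sketch of non-persistent storage, the manifest below defines a pod with an `emptyDir` volume. The pod name, container name, image, and mount path are hypothetical and chosen only for illustration.

```yaml
# Illustrative only: names, image, and paths are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      # Write a file into the temporary volume, then idle.
      command: ["sh", "-c", "echo hello > /scratch/out.txt && sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir: {}   # exists only as long as the pod exists on its node
```

Deleting the pod removes the `emptyDir` volume and everything written to `/scratch`, which is why this kind of storage is not durable.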
Kubernetes also supports a variety of persistent storage models, including files, block storage, object storage, and cloud services belonging to these and additional categories. Storage can also be defined as a data service, commonly a database.
Storage can be referenced directly from within a pod, but this violates the pod’s portability principles and is not recommended. Instead, pods should use Persistent Volumes and Persistent Volume Claims (PV/PVC) to define the storage requirements of their applications.
Persistent Volumes (PV) and Persistent Volume Claims (PVC)
PVs and PVCs separate storage implementation from functionality and allow pods to use storage in a portable way. They also shield users and applications from storage configuration details.
Administrators can define storage resources, together with their performance, capacity and cost parameters, in a PV. A PV also defines details like routes, IP addresses, credentials, and a lifecycle policy for the data. PVs are not portable between Kubernetes clusters.
A PVC, on the other hand, is used by users or developers to describe the storage required by the application. PVCs are portable and can be moved together with an application. Kubernetes looks at the storage available in the defined PVs, and if a PV matches the requirements in the PVC, binds the PVC to that PV.
The PVC can specify some or all of the storage parameters defined in the PV. If, for example, the PVC defines only capacity and storage tier, it can be bound to a larger variety of PVs (any that meet those criteria).
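The following sketch shows how the pieces fit together with static provisioning: an administrator-defined PV, a PVC that requests storage, and a pod that consumes the claim. All object names, the NFS server address, and the export path are hypothetical; the `manual` class name is simply a label that the PV and PVC share so they can match.

```yaml
# PV: created by an administrator. Server and path are assumptions.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-demo
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  storageClassName: manual
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.internal
    path: /exports/demo
---
# PVC: created by a developer. It states requirements only,
# with no storage implementation details.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: manual
  resources:
    requests:
      storage: 5Gi   # any matching PV with at least 5Gi can satisfy this
---
# Pod: references the claim by name, never the PV.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
spec:
  containers:
    - name: app
      image: nginx:1.27
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: demo-claim
```

Because the pod refers only to `demo-claim`, the same pod and PVC manifests can be moved to another cluster where a different PV backs the claim.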
Deployments and StatefulSets
Kubernetes provides a construct called a Deployment, which comprises several cloned pods that all share the same PVC. This can lead to stability issues when multiple replicas write to the same volume. A better option for stateful workloads is to run pods as a StatefulSet, which creates a separate PVC for each pod replica from a claim template.
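As a sketch of the per-replica approach, a StatefulSet can declare `volumeClaimTemplates`, from which Kubernetes creates one PVC per pod. The names, image, and sizes below are hypothetical, and the example assumes a headless Service named `web` already exists and that the cluster can satisfy the claims (for example through a default StorageClass).

```yaml
# Illustrative only: names, image, and sizes are assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web          # headless Service, assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:     # one PVC is created per replica
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

With three replicas, this produces the claims `data-web-0`, `data-web-1`, and `data-web-2`, so each pod keeps its own volume across restarts and rescheduling.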
StorageClasses
Kubernetes administrators can define StorageClasses and assign PVs to them. Each StorageClass represents a type of storage, for example fast SSD storage versus regular magnetic drives or remote cloud storage. This allows a Kubernetes cluster to provision different types of storage depending on the changing requirements of its workload.
A StorageClass is a Kubernetes API object for setting storage parameters. It is a dynamic configuration method that creates new volumes on demand. The StorageClass specifies the provisioner to use, which can be a built-in volume plugin, an external provisioner, or a Container Storage Interface (CSI) driver that allows Kubernetes to interact with storage devices, together with the parameters passed to that provisioner.
Dynamic Provisioning of StorageClasses
Kubernetes supports dynamic volume provisioning, which allows for creation of storage volumes on demand. This eliminates the need for administrators to manually create new storage volumes in their cloud or storage provider, and then create PersistentVolume objects to make them available in the cluster. This whole process happens automatically when a specific storage type is requested by users.
The cluster administrator defines StorageClass objects as needed. Each StorageClass references a provisioner (a volume plugin) and specifies a set of parameters, which are passed to the provisioner when it automatically provisions a storage volume.
Each StorageClass defined by the administrator can represent a different type of storage or the same storage with different parameters (for example, S3 using the normal storage tier vs an archive tier). This allows users to select from several storage options, without worrying about the underlying implementation of each one.
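A minimal sketch of dynamic provisioning might look like the following. It assumes a cluster where the AWS EBS CSI driver is installed (provisioner `ebs.csi.aws.com`); on other platforms the provisioner name and parameters will differ. The class and claim names are hypothetical.

```yaml
# StorageClass: defined once by the administrator.
# Assumes the AWS EBS CSI driver is installed in the cluster.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3                 # passed through to the provisioner
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
# PVC: a user asks for storage by class name. No PV is created by hand;
# the provisioner creates one when the claim is first used by a pod.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 20Gi
```

The user only needs to know the class name `fast-ssd`; the underlying disk type and driver stay an administrator concern.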
Volume Management in Kubernetes
Unlike regular non-persistent Volumes, a PV is a Kubernetes resource object and has its own lifecycle, independent of pods. Kubernetes uses PV controllers to implement and manage the lifecycle of PVs and PVCs. Creating a PV is similar to creating any other resource object in Kubernetes.
The life cycle of PV and PVC is divided into 5 stages.
- Provisioning: a PV is created in advance by the administrator (static mode), or on demand via a StorageClass provided by the administrator (dynamic mode).
- Binding: the PV is bound to a PVC.
- Using: the pod consumes the PV via the PVC.
- Releasing: the user deletes the PVC, which releases the PV.
- Reclaiming: Kubernetes reclaims the storage resources previously used by the PV, according to the PV's reclaim policy.
After the user has finished using the volume, two strategies can be used to reclaim the storage resources used by the PV (a third strategy, “Recycle”, is now deprecated):
- Retain: the PV and its data are kept after the PVC is deleted. The PV is not available to other claims until an administrator manually reclaims it.
- Delete: the PV is deleted from the Kubernetes cluster, and the associated storage on the external storage system is also deleted.
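The reclaim strategy is set on the PV itself. The fragment below is a sketch with hypothetical names, using a `hostPath` volume, which is suitable only for single-node testing.

```yaml
# Illustrative only: name, size, and path are assumptions.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-retained-demo
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  # Retain keeps the volume and its data after the PVC is deleted.
  # Delete removes both, where the volume plugin supports it.
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data/demo
```

With `Retain`, deleting the bound PVC moves the PV to the `Released` phase, and the data stays in place until an administrator cleans it up.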
Kubernetes Storage Best Practices
Managing Kubernetes storage can be complex. The following best practices will help you manage storage more effectively.
Persistent Volume Settings
When writing configuration that uses persistent storage, the Kubernetes documentation recommends the following best practices:
- Always include a PVC in pod configuration
- Never include a PV in the configuration (because this violates portability)
- Always create a default storage class
- Give users the option to select a StorageClass in their PVC
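The last two practices can be sketched as follows. The StorageClass is marked as the cluster default through an annotation, and the PVC omits `storageClassName` so that the default applies. The example assumes the AWS EBS CSI driver is installed; the class and claim names are hypothetical.

```yaml
# Default StorageClass. Assumes the AWS EBS CSI driver is installed.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
---
# PVC without a class: the default StorageClass is used.
# A user who needs a specific class sets spec.storageClassName instead.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
```

Leaving the class out keeps the claim portable across clusters whose defaults differ, while still letting users opt into a specific class when they need one.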
Resource Quotas for Namespaces
Resource quotas can be set at the namespace level, giving you a layer of control over cluster resource usage.
Resource quotas limit the total amount of CPU, memory, and storage resources that can be used by all containers running in the namespace. They can also limit storage consumption per StorageClass, for example to cap usage by service level or backup policy.
The following command shows the resource quotas, if any, that apply to a namespace:
kubectl describe namespace <namespace_name>
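As a sketch, a storage-focused quota for a namespace could be defined like this. The namespace `team-a`, the quota name, the limits, and the `fast-ssd` StorageClass name are all hypothetical.

```yaml
# Illustrative only: namespace, names, and limits are assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: team-a
spec:
  hard:
    # Total storage requested by all PVCs in the namespace.
    requests.storage: 100Gi
    # Maximum number of PVCs in the namespace.
    persistentvolumeclaims: "10"
    # Limits that apply only to PVCs using the fast-ssd StorageClass.
    fast-ssd.storageclass.storage.k8s.io/requests.storage: 50Gi
    fast-ssd.storageclass.storage.k8s.io/persistentvolumeclaims: "5"
```

Once applied, a new PVC that would push the namespace over any of these limits is rejected at creation time.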
Support High Performance with Quality of Service Definitions
There are many types of persistent storage hardware. SSDs, for example, offer better read/write performance than HDDs, while NVMe SSDs are particularly suitable for heavy workloads.
Some Kubernetes providers extend the definition of a PVC with quality of service (QoS) parameters. This allows the storage system to prioritize read/write operations on volumes for specific deployments, enabling higher throughput if the application needs it.
Kubernetes Storage with NetApp Cloud Volumes ONTAP
NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure and Google Cloud. Cloud Volumes ONTAP supports up to a capacity of 368TB, and supports various use cases such as file services, databases, DevOps or any other enterprise workload, with a strong set of features including high availability, data protection, storage efficiencies, Kubernetes integration, and more.
Read more about the added values of Cloud Volumes ONTAP in Kubernetes:
- Data Protection for Persistent Storage in Kubernetes Workloads
- Storage Efficiency For Improving Persistent Volume Storage Costs
- Cloning Kubernetes Persistent Volumes
- SQL Kubernetes Deployments with Cloud Volumes ONTAP
- Kubernetes Persistent Volumes for NFS File Services
Learn More About Kubernetes Storage
Read more in our series of guides about Kubernetes storage.
An Introduction to Kubernetes
In recent years, software developers and DevOps engineers have benefited from encapsulating applications into lightweight, independent units called containers. Kubernetes takes container deployment to a whole new level by providing a robust solution for managing and scaling containers and containerized applications and workloads across a cluster of machines.
Kubernetes is changing the way that software is being deployed, though it has a lot of moving parts to be aware of. If you’re just getting started with Kubernetes or want to know what it can do with NetApp’s Trident and Cloud Volumes ONTAP, our Kubernetes Introduction blog post will introduce you to the history, background, important use cases, and basic terminology that relates to Kubernetes.
Read more: An Introduction to Kubernetes
AWS ECS vs Kubernetes: An Unfair Comparison?
Amazon Elastic Container Service (ECS) is a container orchestration service that runs and manages containers. It manages cloud machine instances, and scales and schedules groups of containers across multiple Availability Zones (AZs). By contrast, Kubernetes is the world's most popular container orchestration platform, which can run in the Amazon cloud but also on other cloud platforms and providers.
Comparing Kubernetes to ECS is not an apples-to-apples comparison, because ECS provides both container orchestration and a managed service that operates it for Amazon users. Kubernetes offers only the first aspect, not the second. Learn how ECS compares to Kubernetes and also to a managed Kubernetes service that offers both aspects: Amazon Elastic Kubernetes Service (EKS).
Read more: AWS ECS vs Kubernetes: An Unfair Comparison?
AWS Kubernetes Cluster: Quick Setup with EC2 and EKS
A Kubernetes cluster consists of a control plane and one or more worker nodes, which run pods (groups of containers). Kubernetes uses clusters to help organizations manage containers at scale. A Kubernetes cluster uses several components to manage container workloads, including an API server, a scheduler, the kubelet (an agent that runs on each node), and etcd (a distributed key-value store that holds cluster configuration and state).
Learn about the different options for deploying Kubernetes on Amazon Web Services, and see practical instructions for running a Kubernetes cluster directly on EC2 machines, as opposed to deploying a cluster automatically using Amazon Elastic Kubernetes Service (EKS).
Cloud File Sharing: How to Provision Kubernetes Persistent Volumes for NFS with Cloud Volumes ONTAP and Trident
One of the most popular file protocols in use today is NFS (Network File System). With NFS, users can share files in enterprise-scale deployments with thousands of users around the world concurrently for use cases as diverse as big data analytics, data lake creation, archiving, database, and more. With Kubernetes deployments, NFS can be used with pods to provide Kubernetes persistent volumes that can share data across containers.
There are a number of advantages to using NFS with Kubernetes. One of these advantages is that it offers more flexibility than block-level persistent volume allocations. NFS makes it possible for a single file system to be mounted by multiple hosts that all have concurrent file access. The storage volume can also be mounted and used right away, without first being formatted by the OS. This makes it easier to attach pods and storage and lowers administrative overhead. Of course, you’ll still need a Kubernetes persistent volumes provisioner, and NetApp Trident can do that, as it fully supports NFS.
The benefits of using Trident as a Kubernetes NFS provisioner include the ability to dynamically resize NFS persistent volumes, mount persistent volumes as Read/Write Many, and create separate storage classes for different mount parameters and other requirements. Plus, it comes with all the data management benefits of Cloud Volumes ONTAP.
This post gives you an in-depth look at NFS file services with Kubernetes and how to use Trident as your Kubernetes NFS provisioner for Kubernetes persistent volumes. You’ll be able to see specific code examples for provisioning, creating separate storage classes, and more.
Data Protection for Persistent Data Storage in Kubernetes Workloads
Enterprise workloads typically have a strong requirement for reliable data storage. Kubernetes persistent volumes can be provisioned using a variety of solutions. However, ensuring that the data is easy to back up and restore, always available, consistent, and durable in a Kubernetes workload DR (Disaster Recovery) situation or any other failure is the responsibility of end users and administrators.
In this article, we’ll look at how containerized applications in Kubernetes can take advantage of the enterprise data protection features of Cloud Volumes ONTAP by provisioning persistent volumes through NetApp Trident. This solution can help meet all the data protection requirements of production Kubernetes workloads transparently and with ease.
Dynamic Kubernetes Persistent Volume Provisioning with NetApp Trident and Cloud Volumes ONTAP
There are two ways Kubernetes persistent volumes are provisioned so users can take advantage of the extensible framework for clustered data storage management: static and dynamic. Administrators who want to have all the storage they require upfront can do so with static provisioning, which pre-allocates any Kubernetes persistent volumes. Such a decision depends on an exact understanding of the storage needs of the cluster. Should the cluster’s storage demands exceed the number of volumes previously provisioned, there would be an issue.
Those issues are avoided when using the second option for Kubernetes persistent volume provisioning, dynamic provisioning. In a deployment using dynamic provisioning, users don’t have to know of the cluster’s eventual storage needs since persistent volumes are provisioned as they are needed. This more organic process gives the Kubernetes cluster more flexibility, and room to scale.
By using dynamic storage provisioning, Kubernetes users can greatly simplify how persistent volumes are deployed in clusters. Without the requirements to provision volumes manually and have foreknowledge of the storage amount needed, users can let their clusters scale without worry.
How to Set Up MySQL Kubernetes Deployments with Cloud Volumes ONTAP
With persistent volumes, Kubernetes can support stateful apps, such as databases like MySQL. This permanent form of data storage lets Kubernetes run MySQL databases and other stateful services alongside the rest of a deployment, on a single platform that scales.
Stateful sets are a powerful mechanism that Kubernetes uses to scale stateful applications. Stateful sets are good for horizontally scaling systems, where each new replica gets its own persistent storage deployed from a template. In this way the network identity of all the pods can stay stable, with the guarantee that the persistent volumes the set is connected to will not be deleted. However, stateful applications such as MySQL databases need the highest levels of data protection, features that Kubernetes provisioning relies on the storage service to provide. Since a variety of storage can be used in Kubernetes, meeting the data protection requirements is up to the user to manage. DevOps engineers also require an easy way to clone data quickly in order to speed up testing and time to market (TTM), another feature that the storage provider backing Kubernetes may lack.
By deploying Kubernetes with NetApp Trident, the storage used for persistent volumes is allocated by NetApp systems such as Cloud Volumes ONTAP, which come with multiple data protection benefits: backup and restore for databases of any size through ONTAP Snapshots, high availability for Kubernetes persistent volumes across AZs with Cloud Volumes ONTAP HA, efficient, block-level data replication with SnapMirror, and FlexClone data cloning for fast, space-efficient clones to benefit DevOps testing.
How to Use NetApp Cloud Manager with Trident for Provisioning Persistent Volumes in Kubernetes Deployments
NetApp Trident is a fully supported, open-source storage provisioner for Kubernetes, which enables Kubernetes persistent volumes to be dynamically provisioned with Cloud Volumes ONTAP. This allows you to retain the use of native Kubernetes manifests and constructs to interact with your persistent storage, while at the same time gaining the benefits of using NetApp’s enterprise-grade data management platform.
Cloud Volumes ONTAP provides a whole host of features that are crucial for the reliable storage of persistent data. Protecting data with instant snapshots and high availability, as well as supporting data mobility between your on-premises systems and your Kubernetes deployments with highly efficient block-level data replication, are just some of the advantages users gain from using NetApp storage technologies.
Cloud Manager simplifies the process of deploying NetApp Trident into your Kubernetes deployment, irrespective of its underlying implementation, and then configuring the cluster to use a specific deployment of Cloud Volumes ONTAP. All of this can be achieved within minutes from the Cloud Manager web-based UI, and with just a few clicks you’ll be ready to start provisioning persistent storage for your cluster using Cloud Volumes ONTAP.
Kubernetes for Developers: Overview, Insights, and Tips
Kubernetes has transformed the way that companies design, deploy, and orchestrate microservices. Whether based on-prem or in the cloud, there are a number of basic things to know about Kubernetes for developers that will help make using the service much easier and more effective.
In this post, we’ll walk you through the basics of Kubernetes. For developers looking to design their own Kubernetes workflows, this is a useful place to start. You’ll learn some of the ground floor rules of the cluster-building orchestration platform, including its security features, fundamental architecture for load balancing and failure prevention, and more.
This post covers some of the monitoring tools that come in handy with Kubernetes for developers. These include DaemonSets, which can run a metrics-collection agent on every node so that metrics are gathered in a centralized environment; the kubelet, which bridges each node and the control plane and reports status for the pods described by PodSpecs; and Container Advisor (cAdvisor), which keeps tabs on your containers and their resource usage.
Besides these Kubernetes-native features, this post also takes a look at how NetApp Trident and Cloud Volumes ONTAP can be used to make Kubernetes deployment even easier, allowing for persistent volumes to be provisioned dynamically on AWS and Azure storage resources, and to deploy Kubernetes clusters across clouds from a single central management console.
Kubernetes NFS: Quick Tutorials
Kubernetes Volumes are storage units that allow containers in a Kubernetes cluster to write, read and share data. One of the many storage plugins offered by Kubernetes is the NFS plugin, which lets containers mount a Kubernetes volume as a local drive. This is useful for migrating legacy applications to Kubernetes, because they can continue accessing data the same way as they did in a traditional deployment model.
Learn about the advantages of using NFS with Kubernetes, and see step-by-step instructions on mounting an NFS share on a container, and creating an NFS persistent volume which containers can mount as a local drive.
Read more: Kubernetes NFS: Quick Tutorials
Kubernetes Persistent Storage: Why, Where and How
Containers are immutable, but there is often a need to save data in persistent storage and access it from one or more containers. Kubernetes provides a convenient persistent storage mechanism called Persistent Volumes. Kubernetes Volumes allow you to mount a storage unit, such as a file system folder or a cloud storage bucket, to one or more Kubernetes nodes, and also use it to share data between the nodes. A Persistent Volume is a cluster-level resource with its own lifecycle, independent of any pod, and can remain available for as long as necessary for ongoing operations.
Persistent Volumes are the Kubernetes way to hide the details of storage implementation from applications and users, and provide a cloud native way to seamlessly connect to a variety of cloud storage systems, virtualized storage, and proprietary or open source storage platforms. An application can simply request storage with specific criteria, and Kubernetes provisions it automatically.
Cloning Kubernetes Persistent Volumes with NetApp Trident and Cloud Volumes ONTAP
Developers get a huge advantage from Kubernetes’ abilities to easily scale and manage containerized workloads. But the CI/CD pipeline also requires an easy way to test new builds and changes to environments. Normally, this would require provisioning an entirely new persistent volume with all of the same data. That could be costly both in terms of the time and the costs involved for storage. Trident and Cloud Volumes ONTAP offer a better solution: FlexClone® data clone volumes.
Using Trident, Kubernetes persistent volume claims can be answered by instantly creating highly space-efficient clones of persistent volumes. Trident does this using a set of basic annotations on the persistent volume claim, and works in tandem with the ONTAP back-end systems to locate the original volume claim and recreate it.
The cost benefits of using FlexClone are considerable, as no storage needs to be consumed to create the clone: only the changed data needs to be stored. And once the clone is no longer needed by a pod, a simple reclaim policy of delete will delete the clone, making sure no unnecessary storage is taken up.
Kubernetes Shared Storage: The Basics and a Quick Tutorial
Kubernetes storage is based on the concept of Volumes: abstracted storage units that containers can use to store and share data. Kubernetes provides a range of storage plugins that integrate with storage offered by public cloud providers, virtualization systems like VMware, and on-premises hardware using standard protocols like NFS.
Learn how Kubernetes storage works, including volumes, persistent volumes, static and dynamic provisioning, and see how to set up a storage volume in a Kubernetes YAML file.
Managing Stateful Applications in Kubernetes
Stateful applications that run in Kubernetes need storage that is persistent and with a lifecycle that is independent of pods. Using persistent volumes can go some way towards achieving this, but another solution is to use stateful sets and dynamic provisioning, which are easier both to scale and to manage.
Kubernetes has supported stateless applications since the platform’s inception. However, the storage that stateful applications rely on needs strong data protection guarantees, something that Kubernetes on its own does not provide. Stateful applications include some business-critical components; a database is a good example of a stateful application that is key to an enterprise and must be protected. That protection takes place at the back-end storage service in use for persistent volumes. As different storage solutions can be used for this, data protection levels can vary. The service being employed needs to provide backup and restore as well as availability solutions.
Using Trident and Cloud Volumes ONTAP, Kubernetes users can dynamically provision storage for their stateful sets and gain the benefits of high levels of data protection, zero-data-loss and under-60-second high availability, and flexible data management operability, all of which enterprise-level deployments require.
Read more: Managing Stateful Applications in Kubernetes
NetApp Trident and Docker Volume Tutorial
Docker volumes behave as a layer that abstracts storage provisioning and container usage. Using commands within Docker, volumes can be created, managed, and used to keep Docker admin operations consistently interfaced.
The storage that Docker volumes are based on can be provisioned from file services such as NFS, or from local, block-level storage types. Whichever storage type is preferred, the Docker host machine will need to have access to it before provisioning takes place (for example, an Amazon EC2 instance hosting a Docker container needs an Amazon EBS volume assigned to it). All of the additional data management tasks associated with the volume, from scaling and capacity to monitoring and backup creation, are manual operations for the user to carry out.
A solution for handling those operations is NetApp’s Trident together with Cloud Volumes ONTAP. With the help of Trident, all of the data management features of NetApp storage are available for Docker volumes. These features are available by using native Docker commands, making provisioning storage for containers a vastly improved experience. This also makes moving to Kubernetes possible, as all the benefits can be carried over through Trident for Kubernetes.
Read more: NetApp Trident and Docker Volume Tutorial
Storage Efficiency for Improving Persistent Volume Storage Costs
When Kubernetes users provision large amounts of storage for containerized applications, a large part of that allocation may never be used. There may also be scenarios where a persistent volume stores data that is not compressed, a storage inefficiency that consumes more storage and raises the associated storage costs unnecessarily.
The issue of storage efficiency is affected by how Kubernetes users decide to provision persistent volumes: manually through static provisioning, or automatically through dynamic provisioning. In either case, the challenge is to make sure storage is used efficiently. Developers often overestimate how much storage they need. In the cloud, that is a big reason for storage sprawl and unnecessary costs.
These storage challenges can be addressed through the use of the built-in storage efficiency features of Cloud Volumes ONTAP, which are available to Kubernetes users through the NetApp Trident provisioner. Storage space can be conserved through data deduplication, compression, compaction, thin provisioning, and automatically tiering cold data to less-expensive object storage on Amazon S3 or Azure Blob until it needs to be used. All of these combine to reduce the storage space that Kubernetes persistent volumes require.
Understanding Kubernetes Persistent Volume Provisioning
While Kubernetes allows for innovative ways to scale and use containerized workloads, there is still the need for storage solutions. Kubernetes facilitates this through persistent volumes, which provide the flexibility to control how storage is provisioned without affecting the pods that make use of that storage. Using persistent volumes, the same persistent volume claim could be set up to use a different type of backend storage based on the Kubernetes cluster it is deployed into.
Kubernetes persistent volumes are created through the use of a provisioner that interfaces with backend storage through the use of a plugin. The storage can be one of a range of different types, with support extending to Google Persistent Disk, Amazon EBS, Azure Disk Storage, and others. A reclaim policy is set for the persistent volume, which determines its lifetime. This same policy controls what the cluster will do whenever a persistent volume claim is released by a pod.
Provisioning volumes takes place in two different ways: static or dynamic provisioning. With static provisioning, admins provision persistent volumes for the cluster ahead of time. This requires prior knowledge of storage requirements as a whole. In dynamic provisioning, persistent volumes are deployed automatically based on the claims the cluster receives.
Kubernetes persistent volumes enable a great amount of flexibility when it comes to storage provisioning due to the separation persistent volumes create between the containerized applications and the storage the apps make use of. NetApp’s Trident provisioner works alongside the Cloud Volumes ONTAP data management platform from inside Kubernetes, extending the benefits of storage optimization and ease of use to persistent volumes in Kubernetes.
Azure Kubernetes Service How-To: Configure Persistent Volumes for Containers in AKS
Azure Kubernetes Service (AKS) is widely used by enterprises to deploy microservices workloads, for both greenfield and brownfield deployments. Persistent volumes are mandatory elements of the architecture for stateful data sets used by containers. In Azure there are multiple options to achieve this, e.g. Azure Files, Azure Disks, Cloud Volumes ONTAP, etc. This blog covers the steps required to provision persistent volumes using Azure Disks and attach them to containers in AKS. Cloud Volumes ONTAP offers advanced storage management capabilities, and integrating it with AKS helps to extend these benefits to microservices in AKS.
Monolith vs. Microservices: How Are You Running Your Applications?
There has been a revolution going on in the way that software and applications are being developed and deployed. The older model, known as the monolith model, looked at an application like a black box, where all of its systems were planned and bundled together to max out the server usage. But today, a new model is taking precedence: microservices deployment with containerized workloads.
Which is right for you? In this blog we compare the monolith and microservices models, and see how advancements including containerized cloud deployment, virtualization, and more can be enhanced with Cloud Volumes ONTAP.