Kubernetes has transformed the way DevOps teams manage CI/CD pipelines. In the few years since its public release, Kubernetes has become the most widely used container orchestrator, mainly due to its simplicity, declarative syntax, and availability on almost every cloud provider. Thanks to Kubernetes abstractions, software teams can also provision persistent storage for stateful applications running in containers.
In this post we’ll take a closer look at what is so appealing about Kubernetes for developers, including an overview of its basic features for deployment, monitoring, security, and some of the NetApp solutions that can make it even more effective.
This post covers the following sections:
- What Is Kubernetes?
- How Developers Deploy Kubernetes
- Getting Started with Kubernetes
- Declarative Deployment with YAML
- Kubernetes Flavors
- Building a Kubernetes Cluster with Kops
- Kubernetes Resource Types
- Kubernetes Deployment Strategies
- Kubernetes Services, Networking and Load Balancing
- Kubernetes Security Secrets
- Monitoring and Security for Kubernetes
- Kubernetes Development for Stateful Applications
What Is Kubernetes?
Kubernetes is an open-source container orchestration tool designed to manage distributed applications with automated scaling of nodes and containers, fault tolerance, and ease of use. It was originally created by Google but is currently managed by the Cloud Native Computing Foundation. Initially designed to run Docker containers, Kubernetes now supports other container runtimes through the Container Runtime Interface (CRI), such as containerd and CRI-O.
Kubernetes works by orchestrating collections of machines in groups known as clusters. A cluster consists of two types of machines:
- Master Nodes are machines that manage the cluster and enable administrative processes. Master nodes hold the Control Plane components.
- Worker Nodes, on the other hand, are machines that host the containers running application workloads.
A Kubernetes cluster can contain any combination of personal computers, virtual machines in an on-premises data center, and instances on cloud computing platforms. As a result, the framework readily supports large-scale applications that require complex networks comprising hundreds or thousands of nodes. In the following section, we delve deeper into how developers use Kubernetes in modern CI/CD pipelines.
How Developers Deploy Kubernetes
Working with distributed applications is challenging, and that shapes how developers approach Kubernetes. The many moving parts involved increase the chances of failure. Besides all the infrastructure needs (such as machines and networks), there are other factors to juggle on the operational and development side. For example, how do teams deploy each application while keeping it reliable? What should happen in case of a failure? How will services discover each other? And what about security?
Below we’ll take a deep dive into the nuances of how this platform works to answer some of these questions about Kubernetes for developers.
Getting Started With Kubernetes
With Kubernetes, developers can automate software releases by implementing Infrastructure as Code. This follows a declarative model in which teams configure and manage deployment environments using configuration files that govern the provisioning and management of infrastructure components.
Kubernetes components are deployed using YAML (YAML Ain’t Markup Language) files, which allow for straightforward provisioning of infrastructure, reusable underlying components, and iterative processes. The same YAML files can be used to provision similar configurations on multiple environments, including local machines, on-prem data centers, or hybrid cloud environments. This section explores how developers can provision Infrastructure as Code using YAML, and how to build a practical cluster using different Kubernetes distributions.
Declarative Deployment with YAML
YAML is a human-readable, text-based data serialization format used to define configuration information, logs, interprocess messages, and shared data in various software platforms, including Kubernetes. The Kubernetes API speaks JSON, but most projects use YAML files as they are easier to read and can be shared across teams; kubectl converts them to JSON when it makes the API request. A YAML file uses two types of data structures to define fields and objects: maps and lists.
YAML maps associate keys with values. A typical Kubernetes configuration file includes various fields with a one-to-one association, which are expressed as simple maps. The typical header for a Kubernetes API object would look similar to:
apiVersion: certificates.k8s.io/v1beta1
kind: CertificateSigningRequest
This simple YAML notation maps the two keys apiVersion and kind to the two values certificates.k8s.io/v1beta1 and CertificateSigningRequest.
YAML maps can also be used to specify complicated data structures by creating keys that map to other key-value pairs, such as:
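As an illustration, a nested map of this shape could look like the sketch below. Both values are hypothetical placeholders chosen for the example, not names taken from a real cluster.

```yaml
# The key "metadata" maps to another map rather than to a single value.
metadata:
  name: example-object          # hypothetical object name
  namespace: example-namespace  # hypothetical namespace
```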
In this case, the key metadata has a value which is a map with two more keys: name and namespace. Developers can create configuration files with as much nesting as needed depending on system complexity.
YAML Lists define object sequences. A list can contain any number of items, as shown below:
- digital signature
- key encipherment
- server auth
Members of a list can also be maps. In this case, each member of the list is a container object that consists of key-value pairs, allowing for the configuration of detailed and complex objects. A list of maps would look similar to this:
containers:
- name: front-end
  ports:
  - containerPort: 80
- name: rss-reader
  ports:
  - containerPort: 88
Developers use YAML files to create Kubernetes resources such as Pods, Services, Deployments, and PersistentVolumeClaims. The configuration file for a simple Pod running an Nginx app would look similar to this:
containers:
- name: darwin-webserver
  ports:
  - containerPort: 80
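As a minimal sketch, a complete manifest for such a Pod could look like the following. Only the container name and port come from the text above; the `image` value, the `metadata` name, and the `app` label are assumptions made for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: darwin-webserver        # assumed: reuses the container name
  labels:
    app: darwin-webserver       # assumed label, useful for Service selectors
spec:
  containers:
    - name: darwin-webserver
      image: nginx              # assumed: the public Nginx image
      ports:
        - containerPort: 80     # port the container listens on
```

Saved to a file, a manifest like this is typically submitted with `kubectl apply -f <file>`.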
Kubernetes Flavors
Various distributions are available for configuring and administering a Kubernetes environment. These flavors can be Vanilla (basic, with no extra components) or managed distributions that remove the burden of installing, configuring, and managing the resources used to run workloads. This section explores the available flavors of Kubernetes and their most appropriate use cases.
- Vanilla Kubernetes consists of the minimal set of components needed to keep a cluster running. At its core, a running cluster requires the control plane components (kube-apiserver, etcd, kube-scheduler, kube-controller-manager, and, on cloud platforms, the cloud-controller-manager) together with the kubelet and kube-proxy on every node. With these in place, the operations team still has to take care of supporting areas such as observability, networking, security administration, storage provisioning, and load balancing to keep the application running.
- Minikube is a popular, developer-friendly Kubernetes distribution that sets up a minimal, single-node Kubernetes cluster. It runs on a single host workstation and deploys the core Kubernetes features with a single command. Other providers such as GKE, AKS, and OpenShift also offer Vanilla installs, but these typically come with some optimization for specific workloads or use cases.
- Managed Kubernetes
Given the sustained popularity of Kubernetes, a growing number of third-party service providers offer Kubernetes-based platforms. Such platforms offer vendor support and a fine-tuned Kubernetes ecosystem that enables organizations to operate Kubernetes workloads locally or in the cloud. Some popular managed Kubernetes platforms include:
- Google Kubernetes Engine (GKE): A managed Kubernetes service from Google Cloud. GKE offers node management, integrated logging and monitoring, auto-scaling, automatic updates, and other enterprise features that make it one of the most advanced Kubernetes platforms.
- Azure Kubernetes Service (AKS): A managed Kubernetes service that lets organizations quickly deploy and manage complex clusters on the Microsoft Azure public cloud. AKS reduces the overhead of deploying scalable applications by offering built-in automation and easy configuration options.
- Amazon Elastic Kubernetes Service (EKS): A managed service that runs upstream Kubernetes on AWS and includes multiple enterprise features for deploying Kubernetes applications across AWS Availability Zones.
- OpenShift Container Platform: Red Hat’s Kubernetes-based platform, built on open-source components, that supports multiple deployment models, including on premises, private cloud, public cloud, or hybrid cloud.
- Rancher: A managed Kubernetes distribution that enables the management of Kubernetes nodes, Docker containers and Linux hosts to facilitate customization of Kubernetes environments without impacting continuous and seamless upgrades.
Building a Kubernetes Cluster With Kops
Kubernetes Operations (Kops) simplifies the configuration and deployment of production-grade clusters on various environments (AWS, GCE, and VMware vSphere). Kops can provision clusters with highly available (HA) masters and automates various Kubernetes operations, including:
- Command Line Completion
- API configuration
- Terraform generation
- Manifest templating and dry runs
To create a production-grade cluster using Kops, follow the steps as listed below:
- Download and install Kops from the release page using the following command:
$ curl -LO https://github.com/kubernetes/kops/releases/download/$(curl -s https://api.github.com/repos/kubernetes/kops/releases/latest | grep tag_name | cut -d '"' -f 4)/kops-linux-amd64
- Make the Kops binary executable:
$ chmod +x kops-linux-amd64
- Move the binary into its appropriate path:
$ sudo mv kops-linux-amd64 /usr/local/bin/kops
- Create a cluster domain (Route53) to enable DNS discovery
- Create an AWS S3 bucket to persist cluster state
- Build the cluster in AWS
The above steps can also be adapted to platforms other than AWS.
Managing Objects and Resources in Kubernetes
Kubernetes offers three fundamental techniques of managing objects and cluster resources:
- Management Through Imperative Commands
Developers operate directly on live objects using the kubectl command combined with various arguments and flags. This technique is appropriate for one-off cluster tasks, as it does not provide records of previous actions or configurations.
- Imperative Object Configuration
With imperative configuration, developers specify kubectl operations and flags combined with a target JSON or YAML file. The object configuration can be stored in a source control platform where context and a history of configurations can be accessed.
- Declarative Object Configuration
In this case, developers operate on YAML or JSON configuration files without defining the operations to be taken on them. When a user points kubectl apply at a target file or directory, kubectl compares the files with the live objects and works out the create, update, and delete operations needed for each object.
Kubernetes Resource Types
Some of the objects that can be configured using YAML files in Kubernetes include:
- Workload deployment resources such as Pods, Deployments, StatefulSets and ReplicaSets
- Networking resources such as Services (for example, ClusterIP and LoadBalancer) and Ingress
- Configuration resources such as ConfigMaps and Secrets
Kubernetes Deployment Strategies
Modern applications require rapid scaling, integration, and frequent deployments. Kubernetes applications are often microservices-based, so multiple teams working on different modules of the application perform separate deployments. Kubernetes allows various deployment strategies that software teams can use depending on the objective. These include Recreate and RollingUpdate, which are built in, as well as Canary and Blue/Green deployments, which are not built-in strategy types and are usually implemented with multiple Deployments or third-party tools.
Of these, Recreate and RollingUpdate are the most commonly used deployment strategies. The Recreate approach stops the old version before starting the new one, so the application is offline during the switch. The RollingUpdate policy (the default), on the other hand, keeps the application available by gradually starting Pods of the new version while shutting down Pods of the old one.
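The strategy is selected in the Deployment manifest itself. The sketch below shows one way a rolling update could be configured; the names, labels, image tag, replica count, and the `maxSurge`/`maxUnavailable` values are all assumptions chosen for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                     # hypothetical name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate         # use "Recreate" (and drop rollingUpdate) to stop old Pods first
    rollingUpdate:
      maxSurge: 1               # at most one extra Pod above the desired count during the update
      maxUnavailable: 0         # never drop below the desired count
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web                # must match the selector above
    spec:
      containers:
        - name: web
          image: nginx:1.25     # assumed image; changing it triggers a new rollout
          ports:
            - containerPort: 80
```

With `maxUnavailable: 0`, a new Pod has to become ready before an old one is removed, which trades a slower rollout for uninterrupted capacity.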
Kubernetes Services, Networking and Load Balancing
Kubernetes relies on Services to expose applications running in Pods to the network. Every Pod running an application is assigned an IP address. Pods are ephemeral, however, which makes it hard for client browsers and other applications to keep track of the IP address they should connect to. A Service defines a logical set of Pods and includes policies on how to access them. The Pods exposed by a Service are typically identified by a selector in the Service’s YAML definition file. For instance, to expose Pods hosting containers with running instances of an application MyApp using a Service my-service:
ports:
- protocol: TCP
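A complete Service definition along these lines might look like the following sketch. The Service name and the MyApp application come from the text; the `app` label key and both port numbers are assumptions.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: MyApp            # assumed label key; matches Pods labeled app=MyApp
  ports:
    - protocol: TCP
      port: 80            # assumed port exposed by the Service
      targetPort: 9376    # assumed port the Pods listen on
```

Because no `type` is given, this Service defaults to ClusterIP and is reachable only from inside the cluster.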
Kubernetes includes several Service types, which control whether a Service is reachable only inside the cluster or also from outside it. Kubernetes ServiceTypes include:
- ClusterIP: Exposes the Service on a cluster-internal IP address, reachable only from within the cluster (the default)
- NodePort: Exposes the Service on a static port on each node’s IP address
- LoadBalancer: Exposes the Service to external networks through a cloud provider’s load balancer
- ExternalName: Maps the Service to an external DNS name by returning a CNAME record, which eliminates the need to set up any proxies
Kubernetes Security Secrets
A Kubernetes Secret is an API object that allows developers to store and share sensitive data. Secure communication with the Kubernetes API is achieved through TLS/SSL, which involves the use of keys to encrypt messages. These keys need to be shared between various teams and machines for collaborative administration.
A Secret acts as a store for information that should not appear directly in an object’s configuration file. Because Secret values are only base64-encoded by default, access to them should still be restricted. Typical contents include:
- TLS/SSL Keys
- Database Passwords
- Service Account Tokens
Using Secrets, the same application deployment file can be used in many different environments (dev, stage, production) without any sensitive information stored in it.
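As a minimal sketch of that separation, the first manifest below defines a Secret and the second shows a Pod that reads it through an environment variable. All names, the image, and the placeholder password are hypothetical.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials          # hypothetical Secret name
type: Opaque
stringData:                     # plain-text input; stored base64-encoded by the API server
  password: change-me           # placeholder value, created separately per environment
---
apiVersion: v1
kind: Pod
metadata:
  name: api                     # hypothetical Pod name
spec:
  containers:
    - name: api
      image: nginx              # assumed image, stands in for the application
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:       # reference by name and key; no secret value in this file
              name: db-credentials
              key: password
```

The Pod manifest stays identical across environments; only the Secret it references differs.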
Monitoring and Security for Kubernetes
DaemonSets support monitoring of Kubernetes applications by ensuring that a copy of a given Pod, such as a monitoring agent, is always running on every cluster node. Monitoring applications collect statistics by connecting to the following two components:
- Kubelet, which acts as a bridge between the master and the nodes and watches for PodSpecs to collect statistics and current status.
- Container Advisor (cAdvisor), which works at the container level and delivers resource usage and performance metrics.
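A DaemonSet for a node-level monitoring agent could be declared roughly as follows. This is a sketch only: the agent image is a hypothetical placeholder, and the name, namespace, labels, and memory limit are assumptions.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-monitor                  # hypothetical name
  namespace: monitoring               # assumed namespace; must already exist
spec:
  selector:
    matchLabels:
      app: node-monitor
  template:
    metadata:
      labels:
        app: node-monitor             # must match the selector above
    spec:
      containers:
        - name: agent
          image: registry.example.com/monitoring-agent:1.0   # hypothetical image
          resources:
            limits:
              memory: 200Mi           # assumed cap so the agent cannot starve workloads
```

Unlike a Deployment, a DaemonSet has no `replicas` field: the number of Pods follows the number of eligible nodes.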
Kubernetes interfaces with various monitoring and logging solutions to provide full-scale observability and visibility into applications running in clusters. Some popular platforms include:
- Kubernetes Dashboard
- Weave Scope
- EFK Stack
Kubernetes Development for Stateful Applications
Containers are ephemeral: when a container terminates, any data written to its own filesystem is lost. Containers have, therefore, long been used mainly to run stateless applications.
Stateful distributed applications cannot be handled in the same manner as stateless apps: you can’t simply increase the number of instances to meet demand, because the stored data can’t be shared between instances without risking data corruption or data loss. Storage for containerized stateful applications therefore requires an approach that differs from the one used for monoliths.
Here are a few critical elements of managing stateful applications in Kubernetes:
Kubernetes provides Persistent Volumes to handle persistent storage for applications. A PersistentVolume (PV) is a Kubernetes object that connects containerized workloads to underlying block or file storage. PVs can be provisioned statically or dynamically. Static volumes are created ahead of time and are harder to maintain, as the operator must know all of the application’s future storage needs from the start. Dynamic volumes are created on demand, so the cluster allocates resources as they are needed. While dynamic volumes are used in most cases, a static volume can come in handy when an application, such as a relational database, has specific I/O needs.
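With dynamic provisioning, an application asks for storage through a PersistentVolumeClaim and the cluster creates a matching PV on demand. The claim below is a minimal sketch; the claim name, the `standard` storage class, and the requested size are assumptions, and the class must exist in the cluster for provisioning to succeed.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce             # mounted read-write by a single node
  storageClassName: standard    # assumed StorageClass that provisions volumes dynamically
  resources:
    requests:
      storage: 10Gi             # assumed capacity
```

A Pod then mounts the claim by name under `spec.volumes`, without needing to know which backend supplied the volume.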
StatefulSet is a Kubernetes API object that helps deploy and manage Pods running stateful applications. It assigns a stable, unique identity to each Pod, which makes it easier to attach storage and workloads irrespective of the node the Pod is scheduled on.
In addition to keeping each Pod associated with its own PVs, StatefulSets support automated rolling updates and ordered scaling.
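The sketch below shows how these pieces could fit together in a StatefulSet manifest. The names, image, replica count, and storage size are assumptions; it also assumes that a headless Service named `db` and a Secret named `db-credentials` already exist, and it does not configure database-level replication.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db                          # Pods are named db-0, db-1, db-2
spec:
  serviceName: db                   # assumed headless Service giving each Pod a stable DNS name
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: mysql
          image: mysql:8.0          # assumed image
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-credentials   # assumed existing Secret
                  key: password
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:             # one PersistentVolumeClaim is created per Pod
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

Each replica keeps its own claim (`data-db-0`, `data-db-1`, and so on), so a rescheduled Pod reattaches to the same volume.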
Kubernetes was initially developed with stateless applications in mind. As stateful workloads became more common, the Container Storage Interface (CSI) was introduced to offer a standardized approach to storage orchestration. The interface lets vendors build storage plugins and lets organizations extend the Kubernetes volume layer to orchestrate container storage efficiently.
Persistent storage is tied to the infrastructure the cluster runs on; thus, it depends on the storage options the provider makes available. Each cloud provider offers at least one solution for persistence; NetApp has created a useful open-source tool called NetApp Astra Trident.
Trident carries out dynamic provisioning of persistent storage for Kubernetes and works with NetApp Cloud Volumes ONTAP, which provides the storage volumes needed on AWS, Azure, or Google Cloud. Persistent volumes allocated by Cloud Volumes ONTAP get additional benefits such as high availability, data protection, file services, and more.
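As a rough sketch, a StorageClass that hands provisioning over to Trident could look like this. It assumes Trident is already installed in the cluster and configured with a backend of type `ontap-nas`; the class name is hypothetical.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ontap-nas-example             # hypothetical class name
provisioner: csi.trident.netapp.io    # Trident's CSI provisioner
parameters:
  backendType: "ontap-nas"            # assumed Trident backend type
```

PersistentVolumeClaims that set `storageClassName` to this class would then have their volumes created by Trident on demand.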
This post covered some key concepts from the viewpoint of Kubernetes for developers. Kubernetes offers extensive features that facilitate container orchestration of a distributed, complex ecosystem. While Kubernetes eases the complexity of managing containers, a diligent approach to managing the Kubernetes ecosystem is equally important.
The article also delved into managing stateful applications, a topic that is becoming increasingly mainstream in the modern technology landscape. While stateless applications are easy to provision, stateful workloads require more care in defining where the data will be stored, when it will be removed (if needed), and what kind of fallback the storage will provide.
Cloud Volumes ONTAP supports Kubernetes Persistent Volume provisioning and management requirements of containerized workloads.
Learn more about how Cloud Volumes ONTAP helps to address the challenges of containerized applications in these Kubernetes Workloads with Cloud Volumes ONTAP Case Studies.