Kubernetes for Developers: Overview, Insights, and Tips

Kubernetes has transformed the way DevOps teams manage CI/CD pipelines. In the few years since its public release, Kubernetes has become the most widely used container orchestrator, mainly due to its simplicity, declarative syntax, and availability on almost every cloud provider. Thanks to Kubernetes abstractions, software teams can also provision persistent storage for stateful applications running in containers.

In this post we’ll take a closer look at what is so appealing about Kubernetes for developers, including an overview of its basic features for deployment, monitoring, security, and some of the NetApp solutions that can make it even more effective.

What Is Kubernetes?

Kubernetes is an open-source container orchestration tool designed to manage distributed applications with automated scaling of nodes and containers, fault tolerance, and ease of use. It was originally created by Google and is now maintained under the Cloud Native Computing Foundation. Initially designed to run Docker containers, Kubernetes now works with any container runtime that implements its Container Runtime Interface (CRI), such as containerd and CRI-O.

Kubernetes works by orchestrating collections of machines in groups known as clusters. A cluster consists of two types of machines:

  1. Master Nodes are machines that manage the cluster and enable administrative processes. Master nodes hold the Control Plane components.
  2. Worker Nodes, on the other hand, are machines that host the containers running application workloads.

A Kubernetes cluster can contain any combination of personal computers, virtual machines in an on-premises data center, and instances on cloud computing platforms. As a result, the framework readily supports large-scale applications that require complex networks comprising hundreds or thousands of nodes. In the following section, we delve deeper into how developers use Kubernetes in modern CI/CD pipelines.

How Developers Deploy Kubernetes

Working with distributed applications is challenging, and that shapes how developers approach Kubernetes. The many moving parts involved increase the chances of failure. Besides the infrastructure needs (such as machines and networks), there are other factors to juggle on the operational and development side. For example, how do users deploy each application while keeping it reliable? What should happen in case of a failure? How will services discover each other? And what about security?

Below we’ll take a deep dive into the nuances of how this platform works to answer some of these questions about Kubernetes for developers.

Getting Started With Kubernetes

With Kubernetes, developers can automate software releases by implementing Infrastructure as Code. This follows a declarative model in which teams configure and manage deployment environments using version-controlled configuration files that govern the provisioning and management of infrastructure components.

Kubernetes components are deployed using YAML (YAML Ain't Markup Language) manifests, which allow for repeatable provisioning of infrastructure, reusable underlying components, and iterative processes. The same YAML manifests can be used to provision similar configurations on multiple environments, including local machines, on-prem data centers, or hybrid cloud environments. This section explores how developers can provision Infrastructure as Code using YAML, and how to build a practical cluster using different Kubernetes distributions.

Declarative Deployment with YAML

YAML is a human-readable, text-based format used to define configuration information, logs, interprocess messages, and shared data in various software platforms, including Kubernetes. The Kubernetes API speaks JSON, but most projects use YAML files as they are easier to read and can be shared across teams. A YAML file uses two types of data structures to define fields and objects: maps and lists.

YAML maps associate keys with values. A typical Kubernetes configuration file includes various fields with a one-to-one association, which calls for simple maps. The typical header for a Kubernetes API object would look similar to:

apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest

This is simple YAML notation that maps the two values certificates.k8s.io/v1 and CertificateSigningRequest to the two keys apiVersion and kind.

YAML maps can also be used to specify complicated data structures by creating keys that map to other key-value pairs, such as:

metadata:
  name: coredns
  namespace: kube-system

In this case, the key metadata has a value which is a map with two more keys: name and namespace. Developers can create configuration files with as much nesting as needed depending on system complexity.

YAML Lists define object sequences. A list can contain any number of items, as shown below:

usages:
  - digital signature
  - key encipherment
  - server auth

Members of a list can also be maps. In this case, each member of the list is a container object that consists of key-value pairs, allowing for the configuration of detailed and complex objects. A list of maps would look similar to this:

containers:
  - name: front-end
    image: nginx
    ports:
      - containerPort: 80
  - name: rss-reader
    image: nickchase/rss-php-nginx:v1
    ports:
      - containerPort: 88

Developers use YAML files to create Kubernetes resources such as Pods, Services, Deployments, and Persistent Volume Claims. The configuration file for a simple Pod running an Nginx app would look similar to this:

apiVersion: v1
kind: Pod
metadata:
  name: darwin
spec:
  containers:
    - name: darwin-webserver
      image: nginx:latest
      ports:
        - containerPort: 80


Kubernetes Flavors

There are various distributions that allow teams to configure and administer a Kubernetes environment. These flavors can be vanilla (basic, with no extra components) or managed distributions that remove the burden of installing, configuring, and managing the resources used to run workloads. This section explores the available flavors of Kubernetes and their most appropriate use cases.

  • Vanilla Kubernetes consists of the minimal set of components needed to keep a cluster running. At its core, a running cluster requires the kube-apiserver, etcd, kube-scheduler, and kube-controller-manager on the control plane, plus the kubelet, kube-proxy, and a container runtime on every node; the cloud-controller-manager is added when the cluster runs on a cloud provider. With these in place, the operations team remains responsible for supporting concerns such as observability, networking, security administration, storage provisioning, and load balancing to keep the application running.
  • Minikube is a popular, developer-friendly Kubernetes distribution that sets up a minimal, single-node Kubernetes cluster. It runs on a single host workstation and deploys the core Kubernetes features with a single command. Other platforms such as GKE, AKS, and OpenShift also offer installs close to vanilla Kubernetes, but these typically come with some optimization for specific workloads or use cases.
  • Managed Kubernetes
    Given the sustained popularity of Kubernetes, a growing number of third-party service providers offer Kubernetes-based platforms. Such platforms offer vendor support and a fine-tuned Kubernetes ecosystem that enables organizations to operate Kubernetes workloads locally or in the cloud. Some popular managed Kubernetes platforms include:
    • Google Kubernetes Engine (GKE): A managed Kubernetes service from Google whose nodes run on its Container-Optimized OS by default. GKE offers managed node pools, integrated logging and monitoring, auto-scaling, automatic updates, and other enterprise features that make it one of the most advanced Kubernetes platforms.
    • Azure Kubernetes Service (AKS): A managed Kubernetes service that lets organizations quickly deploy and manage complex clusters on the MS Azure public cloud. By simplifying much of the overhead involved, AKS reduces the complexity of deploying scalable applications by offering inherent automation and easy configuration options.
    • Amazon Elastic Kubernetes Service (EKS): A managed service that runs upstream Kubernetes on AWS, with worker nodes typically hosted on EC2 instances, and that includes multiple enterprise features for deploying Kubernetes applications across several AWS Availability Zones.
    • OpenShift Container Platform: Red Hat's Kubernetes platform, built on open-source components, that supports multiple deployment targets, including on-premises, private cloud, public cloud, and hybrid cloud.
    • Rancher: A managed Kubernetes distribution that enables the management of Kubernetes nodes, Docker containers and Linux hosts to facilitate customization of Kubernetes environments without impacting continuous and seamless upgrades.

Building a Kubernetes Cluster With Kops

Kubernetes Operations (Kops) simplifies the configuration and deployment of production-grade clusters on various environments (AWS, Google Compute Engine, and VMware vSphere). Kops can provision clusters with highly available (HA) masters and automates or supports various Kubernetes operations, including:

  • Command Line Completion
  • API configuration
  • Terraform generation
  • Manifest templating and dry runs

To create a production-grade cluster using Kops, follow the steps as listed below:

  1. Download and install Kops from the release page using the following command: $ curl -LO https://github.com/kubernetes/kops/releases/download/$(curl -s https://api.github.com/repos/kubernetes/kops/releases/latest | grep tag_name | cut -d '"' -f 4)/kops-linux-amd64
  2. Make the Kops binary executable: $ chmod +x kops-linux-amd64
  3. Move the binary into its appropriate path: $ sudo mv kops-linux-amd64 /usr/local/bin/kops
  4. Create a cluster domain (Route53) to enable DNS discovery
  5. Create an AWS S3 bucket to persist cluster state
  6. Generate the cluster configuration with kops create cluster
  7. Build the cluster in AWS by applying that configuration with kops update cluster --yes

With the provider-specific steps adjusted (DNS and state storage), the same workflow can be used on platforms other than AWS.

Managing objects and resources in Kubernetes

Kubernetes offers three fundamental techniques of managing objects and cluster resources:

  • Management Through Imperative Commands
    Developers operate directly on live objects using the kubectl command combined with various arguments and flags. This technique is appropriate for one-off cluster tasks, as it does not provide records of previous actions or configurations.
  • Imperative Object Configuration
    With imperative configuration, developers specify kubectl operations and flags combined with a target JSON or YAML file. The object configuration can be stored in a source control platform where context and a history of configurations can be accessed.
  • Declarative Object Configuration
    In this case, developers operate on YAML or JSON configuration files without defining the operations to be performed on them. When a user points kubectl apply at a target file or directory, kubectl compares the files with the live objects and works out for each object whether it needs to be created or updated.
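
To make the three techniques concrete, the sketch below pairs one small manifest with the matching kubectl invocations, shown as comments. The namespace name demo and the file name namespace.yaml are hypothetical and chosen only for illustration.

```yaml
# namespace.yaml (hypothetical file name)
#
# Imperative command, no file involved:
#   kubectl create namespace demo
# Imperative object configuration, the operation is named explicitly:
#   kubectl create -f namespace.yaml
# Declarative object configuration, kubectl works out the operation:
#   kubectl apply -f namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: demo
```

Running kubectl create -f a second time fails because the object already exists, whereas kubectl apply -f can be repeated safely, which is why the declarative style fits well with files kept in source control.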

Kubernetes Resource Types

Some of the objects that can be configured using YAML files in Kubernetes include:

  • Namespaces
  • Workload deployment resources such as Pods, Deployments, StatefulSets and ReplicaSets
  • Networking resources such as Services (for example ClusterIP and LoadBalancer) and Ingress
  • Configuration resources such as ConfigMaps and Secrets
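
As an illustration of a configuration resource, the following is a minimal ConfigMap sketch. The object name and the keys under data are hypothetical; a real application would define whatever settings it reads.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config          # hypothetical name
data:
  # Plain key-value settings; values are always strings
  LOG_LEVEL: "info"
  CACHE_TTL_SECONDS: "300"
```

A Pod can consume these values as environment variables or as files mounted from a volume, which keeps environment-specific settings out of the container image.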

Kubernetes Deployment Strategies

Modern applications require rapid scaling, integration, and frequent deployments. Kubernetes applications are often microservices-based, so multiple teams working on different modules of the application perform separate deployments. Kubernetes allows various deployment strategies that software teams can use depending on the objective. These include Recreate and RollingUpdate, which are built into the Deployment object, as well as Canary and Blue/Green deployments, which are not built-in strategy types and are usually implemented with third-party tools or additional manifests.

Of these, Recreate and RollingUpdate are the most commonly used deployment strategies. The Recreate approach stops the old version before starting the new one. As a result, the application is offline while the versions change. The RollingUpdate policy, which is the default, keeps the application available by gradually starting Pods of the new version while shutting down Pods of the old version.
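
The strategy is set in the Deployment manifest itself. The sketch below assumes a hypothetical three-replica Nginx Deployment named frontend; the maxSurge and maxUnavailable values are illustrative choices, not recommendations.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend              # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  strategy:
    type: RollingUpdate       # "Recreate" stops all old Pods first; omit rollingUpdate in that case
    rollingUpdate:
      maxSurge: 1             # at most one Pod above the desired count during the rollout
      maxUnavailable: 0       # never drop below the desired count
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: nginx:1.25
          ports:
            - containerPort: 80
```

Changing the image tag and re-applying the file triggers a rollout that replaces Pods one at a time under these settings.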

Kubernetes Services, Networking and Load Balancing

Kubernetes relies on Services to expose applications running in Pods to the network. Every Pod gets its own IP address. Pods are ephemeral, however, which makes it hard for clients and other applications to keep track of the IP address they should connect to. A Service defines a logical set of Pods and a policy for accessing them. The Pods exposed by a Service are typically identified by a selector in the Service definition. For instance, the following Service, my-service, exposes the Pods labeled app: MyApp and forwards TCP port 80 to port 9376 on those Pods:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376

Kubernetes includes several Service types, some of which let developers publish Services to addresses outside the Kubernetes cluster. Kubernetes ServiceTypes include:

  • ClusterIP: Exposes a set of Pods on a cluster-internal IP address, reachable only from within the cluster (the default)
  • NodePort: Exposes the Service on a static port on the IP address of each node
  • LoadBalancer: Exposes the Service to external networks through a cloud provider’s load balancer
  • ExternalName: Maps the Service to an external DNS name by returning a CNAME record, so no proxying is set up
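
Switching between these types is mostly a matter of setting spec.type. As a sketch, the manifest below is a NodePort variant of a Service selecting Pods labeled app: MyApp; the Service name and the nodePort value 30080 are assumptions for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service-nodeport   # hypothetical name
spec:
  type: NodePort
  selector:
    app: MyApp
  ports:
    - protocol: TCP
      port: 80                # port of the Service inside the cluster
      targetPort: 9376        # port the container listens on
      nodePort: 30080         # must fall in the node port range (30000-32767 by default)
```

If nodePort is omitted, Kubernetes picks a free port from the range automatically.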

Kubernetes Security Secrets

A Kubernetes secret is an API object that allows developers to store and share sensitive data. Secure communication with the Kubernetes API is achieved through TLS/SSL, which involves the use of keys to encrypt messages. These keys need to be shared between various teams/machines for collaborative administration.

A Secret keeps such information out of an object’s configuration file, so it cannot be read directly from the manifest. Typical contents include:

  • TLS/SSL Keys
  • Database Passwords
  • Service Account Tokens

Using Secrets, an application deployment file can be used in many different environments (dev, stage, production) without any sensitive information stored in it.
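
The sketch below shows one way this separation can look: a Secret holding a database password and a Pod that reads it as an environment variable. All names and the placeholder password are hypothetical.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials        # hypothetical name
type: Opaque
stringData:                   # plain text here; the API server stores it base64-encoded
  DB_PASSWORD: "change-me"    # placeholder value
---
apiVersion: v1
kind: Pod
metadata:
  name: api                   # hypothetical name
spec:
  containers:
    - name: api
      image: nginx:latest
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:     # the Pod manifest only references the Secret by name and key
              name: db-credentials
              key: DB_PASSWORD
```

Note that base64 is an encoding, not encryption. By default Secret data is stored unencrypted in etcd, so enabling encryption at rest and restricting access through RBAC are worth considering.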

Monitoring and Security for Kubernetes

DaemonSets support monitoring of Kubernetes applications by ensuring that a copy of a given Pod, such as a monitoring agent, runs on every node in the cluster (or on a selected subset of nodes). Monitoring agents can then collect statistics from the following two components:

  1. Kubelet, which acts as a bridge between the master and the nodes and watches PodSpecs to report statistics and current status.
  2. Container Advisor (cAdvisor), which works at the container level and delivers resource usage and performance metrics.
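
A minimal DaemonSet sketch follows. It assumes the Prometheus node-exporter image as the per-node agent, which is one common choice rather than something this article prescribes; the object name and labels are hypothetical.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-metrics-agent    # hypothetical name
spec:
  selector:
    matchLabels:
      app: node-metrics-agent
  template:
    metadata:
      labels:
        app: node-metrics-agent
    spec:
      containers:
        - name: agent
          image: prom/node-exporter:latest   # assumed agent image
          ports:
            - containerPort: 9100            # node-exporter's default metrics port
```

Unlike a Deployment, a DaemonSet has no replicas field: the number of Pods follows the number of matching nodes, so new nodes get an agent automatically.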

Kubernetes interfaces with various monitoring and logging solutions to provide full-scale observability and visibility into applications running in clusters. Some popular platforms include:

  • Prometheus
  • Grafana
  • Jaeger
  • Kubernetes Dashboard
  • Kubewatch
  • Weave Scope
  • EFK Stack
  • InfluxDB

Kubernetes Development for Stateful Applications

Containers are ephemeral: when a container terminates, any data written to its own filesystem is lost. Containers have, therefore, long been used mainly to run stateless applications.

Stateful distributed applications cannot be scaled in the same manner as stateless apps: you can’t just increase the number of instances to meet demand, because the stored data can’t be shared without risking data corruption or data loss. Storage for containerized stateful applications, as a result, requires an approach that differs from the one used for monoliths.

Here are a few critical elements of managing stateful applications in Kubernetes:

Kubernetes provides Persistent Volumes to handle persistent storage for applications. A PersistentVolume (PV) is a Kubernetes object that connects containerized workloads to physical block or file storage. PVs can be provisioned statically or dynamically. Static volumes are created for the application ahead of time and are harder to maintain, as the operator must know all of the application’s future storage needs from the start. Dynamic volumes are created on demand, so the cluster allocates resources as they are needed. While dynamic volumes are used in most cases, a static volume can come in handy when an application has specific I/O needs, such as a relational database.
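
With dynamic provisioning, an application asks for storage through a PersistentVolumeClaim and the cluster creates a matching PV. The sketch below assumes a StorageClass named standard exists in the cluster; that name and the claim name are hypothetical, and kubectl get storageclass lists the classes actually available.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce           # mounted read-write by a single node
  storageClassName: standard  # assumed StorageClass; triggers dynamic provisioning
  resources:
    requests:
      storage: 10Gi
```

A Pod then mounts the claim by name under spec.volumes, without needing to know which storage backend sits behind it.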

StatefulSet is a Kubernetes API object that helps in the deployment and management of Pods running stateful applications. A StatefulSet assigns a stable, unique identity to each Pod, which makes it easier to attach storage and workloads regardless of the node the Pod is scheduled on.

Alongside enabling workloads to maintain connectivity with respective PVs and pods, StatefulSets enable persistent storage for stateful applications while supporting automated rolling updates and ordered scaling.
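
A compact sketch of these ideas, modeled on the common Nginx pattern: a headless Service gives each Pod a stable DNS name, and volumeClaimTemplates creates one PersistentVolumeClaim per Pod. The names, replica count, and storage size are illustrative assumptions.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  clusterIP: None             # headless Service: one DNS record per Pod
  selector:
    app: nginx
  ports:
    - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: nginx          # ties Pod identities to the headless Service
  replicas: 3                 # Pods are created in order: web-0, web-1, web-2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          ports:
            - containerPort: 80
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:       # one claim per Pod: www-web-0, www-web-1, www-web-2
    - metadata:
        name: www
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

If web-1 is rescheduled to another node, it keeps its name and is reattached to the claim www-web-1, so its data follows it.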

Kubernetes was initially developed with stateless applications in mind. As workloads became increasingly stateful, the Container Storage Interface (CSI) was introduced to offer a standardized approach to storage orchestration. The interface lets vendors write storage plugins (CSI drivers) without changing core Kubernetes code, and lets organizations extend the Kubernetes volume layer with the storage systems they already use.

Persistent storage is tied to the infrastructure the cluster is running on; thus, it depends on the storage options the providers make available. Each cloud provider offers at least one solution for persistence; NetApp has created an open-source tool for this purpose called NetApp Astra Trident.

Trident carries out dynamic provisioning of persistent storage for Kubernetes and works with NetApp Cloud Volumes ONTAP, which provides the storage volumes needed on AWS, Azure, or Google Cloud. Persistent volumes allocated by Cloud Volumes ONTAP get additional benefits such as high availability, data protection, file services, and more.
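
In practice, dynamic provisioning through Trident is requested via a StorageClass. The following is a minimal sketch that assumes Trident is already installed in the cluster with a backend of type ontap-nas configured; the class name is hypothetical, and the parameters should be checked against the Trident documentation for the installed version.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ontap-nas                    # hypothetical class name
provisioner: csi.trident.netapp.io   # Trident's CSI provisioner
parameters:
  backendType: "ontap-nas"           # assumes a matching Trident backend exists
```

PersistentVolumeClaims that set storageClassName to this class would then have their volumes created on the configured backend.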

Conclusion

This post covered some key concepts from the viewpoint of Kubernetes for developers. Kubernetes offers extensive features that facilitate container orchestration of a distributed, complex ecosystem. While Kubernetes eases the complexity of managing containers, a diligent approach to managing the Kubernetes ecosystem is equally important.

The article also delved into the concept of managing stateful applications, a topic that is becoming increasingly mainstream in the modern technology landscape. While stateless applications are easy to provision, stateful workloads require a more careful deployment that defines where the data will be stored, when it will be removed (if needed), and what kind of fallback the storage will provide.

Cloud Volumes ONTAP supports Kubernetes Persistent Volume provisioning and management requirements of containerized workloads.

Learn more about how Cloud Volumes ONTAP helps to address the challenges of containerized applications in these Kubernetes Workloads with Cloud Volumes ONTAP Case Studies.

Michael Shaul, Principal Technologist
