Terraform for EBS and EFS: Automating EBS Volumes and EFS File Shares with IaC

Terraform is an infrastructure as code (IaC) tool that is commonly used to automate infrastructure on AWS. It integrates with a wide range of data sources, including AWS storage services.

In this post, we’ll show how to use Terraform to automate EBS and EFS, and add EBS volumes or EFS file shares to your automated deployments. We will also show how NetApp Cloud Volumes ONTAP can help simplify the management of multi-vendor and hybrid cloud environments.

Terraform and AWS

Terraform is an IaC solution that you can use to build, modify, and version your infrastructure. It is created by HashiCorp, an Advanced Technology Partner in the AWS Partner Network (APN). HashiCorp also holds the AWS DevOps Competency. You can use Terraform in place of AWS CloudFormation to manage your AWS infrastructure.

Several features of Terraform can make it an appealing solution, including:

  • A user-friendly, custom syntax (HCL) with support for JSON
  • Increased visibility into future changes
  • Built-in graphing features for infrastructure visualization
  • Isolation of resource relationships for failure protection
  • The ability to segment configurations for easier management and re-use

Learn more in our article about Terraform AWS, which explains how to deploy a Terraform enterprise cluster on AWS.

What Is Amazon EBS?

Amazon Elastic Block Store (EBS) is a service that provides block storage. Block storage enables you to store large amounts of data in blocks that serve as virtualized hard drives. This type of storage can provide high performance and is ideal for volatile or transactional data.

When using EBS you can store a wide variety of data types, including file systems, databases, big data engines, containers, and applications. The service is scalable and flexible, and each volume is automatically replicated within its Availability Zone (AZ) to protect against component failure. It also includes a snapshot feature for easy backup and duplication of volumes.
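As a rough illustration of the snapshot feature, the sketch below creates a volume, snapshots it, and restores the snapshot into a second volume. All resource names, the sizes, and the AZs are assumptions chosen for the example, not values from this article's tutorial.

```hcl
# Hypothetical names throughout; adjust the AZs and size to your environment.
resource "aws_ebs_volume" "source" {
  availability_zone = "us-east-1a"
  size              = 10 # GiB
}

# Point-in-time snapshot of the volume above.
resource "aws_ebs_snapshot" "source_backup" {
  volume_id = aws_ebs_volume.source.id

  tags = {
    Name = "source-backup"
  }
}

# New volume restored from the snapshot. Snapshots are regional, so the
# restored volume can be placed in another AZ of the same region.
resource "aws_ebs_volume" "restored" {
  availability_zone = "us-east-1b"
  snapshot_id       = aws_ebs_snapshot.source_backup.id
}
```

Because a volume itself is tied to one AZ, restoring from a snapshot is the usual way to move or duplicate its data into a different zone.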

What Is an AWS EBS Volume?

AWS EBS volumes are the individual block storage devices you create in EBS, each of which behaves like a virtual hard drive. The most common use cases for EBS volumes include:

  • Frequent updates—serves as storage for data that receives frequent updates or modifications. For example, log repositories or application databases.
  • Throughput-intensive applications—can serve as storage for applications or workloads that need frequent or continuous disk scans.
  • EC2 instances—serves as persistent storage for EC2 instances, which otherwise only have ephemeral storage. This enables the creation of stateful workloads.

EBS was originally designed to complement EC2 and is the default storage choice for EC2 instances. A volume must be attached to an EC2 instance in the same Availability Zone before you can access its data, although other AWS services that run on EC2, such as Amazon RDS, also use EBS for their storage.

What Is AWS EFS?

AWS EFS is a scalable storage service that enables you to create file systems in AWS. It is accessed over the NFSv4 protocol, the same protocol used by many standard on-premises file servers. This makes it easier to move file-based workloads to AWS and to access files from existing tools.

EFS supports Linux-based workloads and applications, and can be combined with on-premises resources and other AWS services. It is available in two storage classes — Standard and Infrequent Access. Standard access is more expensive but provides lower latency for active files. Infrequent access is cheaper in exchange for higher latency and is meant for archived or rarely used files.
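In Terraform, movement between the two storage classes can be expressed with a lifecycle policy on the file system. The following is a minimal sketch; the resource name, the creation token, and the 30-day threshold are assumptions chosen for illustration.

```hcl
resource "aws_efs_file_system" "tiered" {
  creation_token = "tiered-efs" # hypothetical token

  # Files not accessed for 30 days are moved from Standard
  # to the Infrequent Access storage class.
  lifecycle_policy {
    transition_to_ia = "AFTER_30_DAYS"
  }
}
```

A shorter threshold lowers storage cost sooner, at the price of higher latency the next time those files are read.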

How Do Terraform Data Sources Work?

Terraform data sources let you read data that is stored outside of Terraform or defined in independent Terraform configurations. With a data source you can look up information about existing infrastructure, such as an EBS volume, and use it in your configuration without managing that object as a Terraform resource. This also enables you to share configuration data across your system or teams more easily.

To access data sources, you need to declare your data resource in a data block. Data resources are different from managed resources in that the resource is read-only and you cannot create, update, or delete objects. You can see an example block below:

data "aws_ami" "output" {
  most_recent = true
  owners      = ["self"]

  tags = {
    Name   = "app-server"
    Tested = "true"
  }
}

In this block, Terraform is requesting to read from the "aws_ami" data source and to export the results to the local name "output". You can then use this local name throughout your Terraform module to access the data, for example as data.aws_ami.output.id. Together, the data source type and local name identify the data resource and must be unique within the module.

Within the block body, you define the query constraints for the data source. In the above case, the constraints are most_recent, owners, and tags.
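To show how the exported result is consumed, here is a minimal sketch that feeds the AMI found by the data block above into an EC2 instance. The resource name "app" and the instance type are assumptions for the example.

```hcl
# Launch an instance from whichever AMI the data source resolved to.
resource "aws_instance" "app" {
  ami           = data.aws_ami.output.id
  instance_type = "t3.micro"
}
```

Because the AMI ID is looked up at plan time, the configuration picks up a newer matching image without any change to the resource block.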

Terraform EBS Data Sources

Specific to using EBS, you can use the following data sources to make configuration and management of your volumes and data easier:

  • aws_ebs_volume—provides volume information that you can use with other sources.
  • aws_ebs_volumes—provides volume information for volumes that match filters or criteria that you provide. For example, you could use it to get a list of volume IDs for volumes tagged with a specific business unit.
  • aws_ebs_snapshot—provides information about EBS snapshots that can be used to create new volumes or ensure that volumes are backed up.
  • aws_ebs_snapshot_ids—provides a list of EBS snapshot IDs that match filters or criteria you define.
  • aws_ebs_default_kms_key—provides information needed to manage your default customer master key (CMK) which is used to encrypt your EBS data.
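As a sketch of how two of these data sources can be combined, the example below finds a volume by its Name tag and then lists the snapshots taken from it. The tag value and the local names are assumptions for illustration.

```hcl
# Look up the most recent volume tagged Name = "data-volume" (hypothetical tag value).
data "aws_ebs_volume" "data" {
  most_recent = true

  filter {
    name   = "tag:Name"
    values = ["data-volume"]
  }
}

# IDs of all snapshots owned by this account that were taken from that volume.
data "aws_ebs_snapshot_ids" "data_backups" {
  owners = ["self"]

  filter {
    name   = "volume-id"
    values = [data.aws_ebs_volume.data.id]
  }
}

output "data_volume_snapshot_ids" {
  value = data.aws_ebs_snapshot_ids.data_backups.ids
}
```

An empty list in the output would indicate that the volume has no snapshots yet, which can be a useful backup check.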

Terraform EFS Data Sources

Like EBS, EFS also has dedicated data sources you can use. Below are the most commonly used sources:

  • aws_efs_file_system — provides file system information that you can use with other sources.
  • aws_efs_access_point — provides file system access points from which to perform operations. These are used to connect specific applications to your file system.
  • aws_efs_mount_target — provides information about mount targets for your file system. These are used to connect virtual machines, such as EC2 instances to your file system.
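For example, an existing file system can be looked up by its creation token and its DNS name exposed for use in mount commands. This is a minimal sketch; the token value and the local and output names are assumptions.

```hcl
# Find an existing file system by the token it was created with.
data "aws_efs_file_system" "by_token" {
  creation_token = "example-efs"
}

# DNS name that NFS clients use when mounting the file system.
output "existing_efs_dns_name" {
  value = data.aws_efs_file_system.by_token.dns_name
}
```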

Quick Tutorial: Attach an EBS Volume to an EC2 Instance Using Terraform

Below is a brief tutorial showing a common deployment task, attaching an EBS volume to an EC2 instance. For this tutorial you should already have an AWS account set up with both EBS and EC2 services. It also assumes that you have Terraform installed and configured to work with AWS.

  1. In your configuration file, define the AWS region you want to configure your EC2 instance in.
provider "aws" {
  region = "us-east-1" # Replace with your desired region
}
  2. Next, configure a security group to allow HTTP and SSH access. The example below uses "0.0.0.0/0" as the IP range, which opens the ports to the entire internet and is not recommended in a real-world setup. Prefer a specific set of IPs or a single IP (such as that of a VPN).
resource "aws_security_group" "morning-ssh-http" {
  name        = "morning-ssh-http"
  description = "allow ssh and http traffic"

  # SSH
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # HTTP
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # All outbound traffic
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
  3. Now, you can define your EC2 instance. In this configuration, you need to define your AZ, which is a zone within your region (for example, us-east-1a), not the region itself. The EBS volume must be created in the same AZ, because a volume can only be attached to an instance in its own zone.
resource "aws_instance" "good-morning" {
  # Red Hat RHEL 7. AMI IDs are region-specific, so replace this with one valid in your region.
  ami               = "ami-5b673c34"
  instance_type     = "t3.micro"
  availability_zone = "us-east-1a"
  security_groups   = [aws_security_group.morning-ssh-http.name]
  key_name          = "zoomkey"

  # Install and start a web server on first boot
  user_data = <<-EOF
  #!/bin/bash
  sudo yum install httpd -y
  sudo systemctl start httpd
  sudo systemctl enable httpd
  echo "<h1>Sample Webserver</h1>" | sudo tee /var/www/html/index.html
  EOF

  tags = {
    Name = "webserver"
  }
}
  4. Finally, you can configure your EBS volume and connect it to your EC2 instance.
resource "aws_ebs_volume" "data-vol" {
  # Must be the same AZ as the instance, or the attachment fails
  availability_zone = aws_instance.good-morning.availability_zone
  size              = 1 # GiB

  tags = {
    Name = "data-volume"
  }
}

resource "aws_volume_attachment" "good-morning-vol" {
  device_name = "/dev/sdc"
  volume_id   = aws_ebs_volume.data-vol.id
  instance_id = aws_instance.good-morning.id
}
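The security group step above advises against opening SSH to "0.0.0.0/0". One possible way to follow that advice is to pass the trusted range in as a variable, as sketched below. The variable name, the group name, and the sample CIDR in the comment are assumptions, not part of the original tutorial.

```hcl
variable "admin_cidr" {
  description = "CIDR range allowed to reach the instance over SSH (hypothetical variable)"
  type        = string
  # No default on purpose: supply a value such as "203.0.113.10/32" at plan time.
}

resource "aws_security_group" "restricted-ssh-http" {
  name        = "restricted-ssh-http"
  description = "allow ssh from a trusted range and http from anywhere"

  # SSH only from the trusted range
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.admin_cidr]
  }

  # HTTP stays public because the instance serves a web page
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

To use it, point the instance's security_groups argument at this group instead, and supply the range with -var or a .tfvars file.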

Quick Tutorial: Creating AWS Elastic Filesystems with Terraform

Below is a brief tutorial showing how to create an EFS file share. Like the EBS tutorial, this tutorial assumes that you already have an AWS account set up with EFS services. It also assumes that you have Terraform installed and configured to work with AWS.

This is adapted from a more in-depth tutorial by Earl Ruby which you can find here.

To begin, you need to define your resource in your EFS Terraform file. Your code should resemble the following:

resource "aws_efs_file_system" "example-efs" {
  creation_token   = "example-efs"
  performance_mode = "generalPurpose"
  throughput_mode  = "bursting"
  encrypted        = true

  tags = {
    Name = "TestEFS"
  }
}

After your resource is defined, you also need to set up a mount target so that instances in your VPC can mount the file share. The following block gives an example of how your target is defined.

resource "aws_efs_mount_target" "example-efs-mt" {
  file_system_id  = aws_efs_file_system.example-efs.id
  subnet_id       = aws_subnet.subnet-efs.id
  # References the EFS security group resource named ingress-efs-test
  security_groups = [aws_security_group.ingress-efs-test.id]
}

In the above block, file_system_id ties your mount target to your file share. The subnet_id argument refers to a subnet that you define separately, and security_groups refers to a security group that is also defined separately. You can see how to create both below.

EFS subnet
The following definition should be added to your network Terraform file. You can choose any available subnets when defining this resource. The example below uses a /16 range for the virtual private cloud (VPC) and a /24 range for the subnet that holds the EFS shares and machine clusters. The subnet is tied to the us-east-1a Availability Zone.

resource "aws_vpc" "test-env" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "test-env"
  }
}

resource "aws_subnet" "subnet-efs" {
  # Carves a /24 out of the VPC's /16 range
  cidr_block        = cidrsubnet(aws_vpc.test-env.cidr_block, 8, 8)
  vpc_id            = aws_vpc.test-env.id
  availability_zone = "us-east-1a"
}
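The cidrsubnet call is what turns the /16 VPC range into a /24 subnet. As a small worked example, the sketch below evaluates the same call on a literal range; the local and output names are assumptions for illustration.

```hcl
locals {
  vpc_cidr = "10.0.0.0/16"

  # 16 prefix bits + 8 new bits = /24; network number 8 selects the ninth /24 block.
  efs_subnet_cidr = cidrsubnet(local.vpc_cidr, 8, 8) # "10.0.8.0/24"
}

output "efs_subnet_cidr" {
  value = local.efs_subnet_cidr
}
```

Changing the last argument selects a different /24 block, which is a convenient way to keep subnets for other purposes from overlapping.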

EFS security group
Creating a security group enables you to control traffic between your EFS file share and your test environment. A block like the following example should be added to your security Terraform file.

This example assumes an existing security group (ingress-test-env) and creates a new group that allows inbound traffic only on port 2049, the NFS port, and only from members of the existing group. Outbound traffic to that group is unrestricted. Referencing a security group as the traffic source improves security and eliminates the need for the cidr_blocks attribute.

resource "aws_security_group" "ingress-efs-test" {
  name   = "ingress-efs-test-sg"
  vpc_id = aws_vpc.test-env.id

  # NFS traffic from members of the test environment group only
  ingress {
    security_groups = [aws_security_group.ingress-test-env.id]
    from_port       = 2049
    to_port         = 2049
    protocol        = "tcp"
  }

  # Any outbound traffic to members of the test environment group
  egress {
    security_groups = [aws_security_group.ingress-test-env.id]
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
  }
}
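The ingress-test-env group referenced above is not defined in this article. If you do not already have one, a minimal stand-in could look like the sketch below. Its rules and the placeholder CIDR are assumptions, and the group in the original tutorial may differ.

```hcl
# Hypothetical stand-in for the pre-existing group assumed by the example.
resource "aws_security_group" "ingress-test-env" {
  name   = "ingress-test-env-sg"
  vpc_id = aws_vpc.test-env.id

  # SSH for administration; the range below is a documentation placeholder.
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["203.0.113.0/24"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Instances that need to mount the file share would then be placed in this group.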

Once all your files are defined, you can apply the definitions by running terraform apply. Provided the definitions are applied successfully, Terraform reports the resources it created, along with any output values you have defined.
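To have the details of the new file share printed after a successful run, you could declare output values such as the following. This is a sketch; the output names are assumptions, while the resource names match the blocks defined earlier.

```hcl
output "efs_file_system_id" {
  value = aws_efs_file_system.example-efs.id
}

# DNS name that clients use when mounting the file share
output "efs_dns_name" {
  value = aws_efs_file_system.example-efs.dns_name
}

# Address of the mount target inside the EFS subnet
output "efs_mount_target_ip" {
  value = aws_efs_mount_target.example-efs-mt.ip_address
}
```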

Terraform EBS and EFS with Cloud Volumes ONTAP

NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure, and Google Cloud. Cloud Volumes ONTAP supports a capacity of up to 368TB and serves use cases such as file services, databases, DevOps, or any other enterprise workload, with a strong set of features including high availability, data protection, storage efficiencies, Kubernetes integration, and more.

In particular, Cloud Volumes ONTAP provides Cloud Manager, a UI and APIs for management, automation and orchestration, supporting hybrid & multi-cloud architectures, and letting you treat pools of storage as one more element in your Infrastructure as Code setup.

Cloud Manager is completely API driven and is highly geared towards automating cloud operations. Deploying Cloud Volumes ONTAP and Cloud Manager through infrastructure-as-code automation helps to address the DevOps challenges organizations face when configuring enterprise cloud storage solutions. When implementing infrastructure as code, Cloud Volumes ONTAP and Cloud Manager go hand in hand with Terraform to achieve the level of efficiency expected in large-scale cloud storage deployments.
