More about AWS EFS
Keeping data properly backed up is a critical task when running your infrastructure in the cloud. While most AWS services used for data storage have a built-in mechanism for backup, this is not the case with Amazon Elastic File System (Amazon EFS). To back up Amazon EFS you either need to combine other AWS services or turn to cloud file sharing solutions outside of AWS. NetApp’s option for this is Cloud Volumes ONTAP.
In this article we will cover several different methods for backing up Amazon EFS storage: Amazon EFS to EFS, Amazon EFS to Amazon S3, the open-source tool Terraform, and NetApp’s solution, Cloud Volumes ONTAP.
Using Amazon Elastic File System for Amazon EC2 Instances
Amazon Elastic File System (Amazon EFS) is a fully managed file system for Linux-based NFS file shares. Its main purpose is to provide a simple, scalable, and elastic file system that allows massively parallel shared access across thousands of Amazon EC2 instances.
As a file service, Amazon EFS can be used for all the major use cases for NFS shared file storage: home directories, application data, media libraries, etc.
Amazon EFS has two storage classes: Standard and Infrequent Access. With AWS lifecycle management you can set policies that automatically move files that haven’t been accessed for a defined period into the Infrequent Access class, which can help reduce storage costs.
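As a sketch of how such a policy is expressed, the following builds the kind of request payload you would pass to the EFS `PutLifecycleConfiguration` API (for example via boto3’s `efs` client); the file system ID here is hypothetical:

```python
# Hypothetical lifecycle configuration: files not accessed for 30 days
# are transitioned to the Infrequent Access storage class.
lifecycle_request = {
    "FileSystemId": "fs-12345678",  # hypothetical file system ID
    "LifecyclePolicies": [
        {"TransitionToIA": "AFTER_30_DAYS"},
    ],
}
# e.g. boto3.client("efs").put_lifecycle_configuration(**lifecycle_request)
```

Valid transition values include shorter and longer periods (such as AFTER_7_DAYS or AFTER_90_DAYS), so the 30-day threshold above is just one choice.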
Amazon EFS Performance and SLA
Performance is a key requirement of any file system. When running Amazon EFS, the following tips can help you improve performance:
- Average I/O size: Make sure your I/O sizes match your workload. Amazon EFS’s distributed architecture results in a small latency overhead for each file operation, so it is important to configure appropriate I/O sizes and amortize that overhead over a large amount of data.
- Simultaneous connections: If your application can be parallelized across several instances, you’ll get higher aggregate throughput levels on the file system.
- NFS Client mount settings: Amazon EFS supports Network File System v4.0 and v4.1 (NFSv4) protocols when mounting file systems on Amazon EC2 instances. Of these two options, NFSv4.1 provides better performance.
Another point to consider is the Amazon EFS Service Level Agreement (SLA) for monthly uptime percentage. AWS currently commits to a monthly uptime percentage for Amazon EFS of at least 99.9%, and backs that guarantee with Service Credits. If uptime falls below 99.9% but remains at or above 99.0%, your Amazon EFS bill for that billing cycle is reduced by 10%. If uptime falls below 99.0% but remains at or above 95.0%, your bill is reduced by 25%. If uptime drops below 95.0%, your Amazon EFS usage for that month is not charged at all.
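The Service Credit tiers above can be expressed as a simple lookup; here is a minimal sketch following the tiers as described:

```python
def efs_service_credit(monthly_uptime_pct: float) -> int:
    """Return the Service Credit percentage for a given monthly uptime,
    following the tiers described above."""
    if monthly_uptime_pct >= 99.9:
        return 0    # SLA met: no credit
    if monthly_uptime_pct >= 99.0:
        return 10   # below 99.9% but at or above 99.0%
    if monthly_uptime_pct >= 95.0:
        return 25   # below 99.0% but at or above 95.0%
    return 100      # below 95.0%: the month's usage is credited in full

print(efs_service_credit(99.95))  # → 0
print(efs_service_credit(99.5))   # → 10
```

Refer to the current AWS SLA page for the authoritative tier boundaries, as these terms can change over time.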
Backing Up NFS Data: 3 Methods
1. AWS Solutions
AWS Backup Service
AWS Backup is an all-in-one service that provides a way to create and manage Amazon EFS backups. Its benefits include incremental backups, good performance levels, and an option to create on-demand, manual backups through the console or CLI.
To start backing up Amazon EFS file systems with AWS Backup, the first thing that needs to be done is to create a backup plan. An AWS Backup plan consists of the following:
- Schedule: Defines when the backup will be performed.
- Backup window: Consists of the time the backup window begins and its duration, in hours. Backup jobs are started within this window.
- Lifecycle: This element determines when a backup recovery point should be moved to cold storage and at which point it should be deleted.
- Backup vault: Organizes the backup recovery points.
Once there is a backup plan in place, all we need to do is assign an individual Amazon EFS file system to that backup plan. This can be done through the use of either tags or with the file system ID in Amazon EFS. The backup process will begin automatically as soon as the plan is assigned to the file system.
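As a sketch of how these elements fit together, the following builds the kind of request payloads you would pass to the AWS Backup `CreateBackupPlan` and `CreateBackupSelection` APIs (for example via boto3’s `backup` client); the plan name, vault name, role ARN, and tag values are all hypothetical:

```python
# Hypothetical AWS Backup plan: daily backups within a two-hour window,
# moved to cold storage after 30 days and deleted after 120 days.
# (AWS Backup requires deletion at least 90 days after cold storage.)
backup_plan = {
    "BackupPlanName": "efs-daily-backup",          # hypothetical name
    "Rules": [
        {
            "RuleName": "daily",
            "TargetBackupVaultName": "efs-vault",  # hypothetical vault
            "ScheduleExpression": "cron(0 5 ? * * *)",  # 05:00 UTC daily
            "StartWindowMinutes": 60,        # start tolerance
            "CompletionWindowMinutes": 120,  # window duration
            "Lifecycle": {
                "MoveToColdStorageAfterDays": 30,
                "DeleteAfterDays": 120,
            },
        }
    ],
}

# Assigning a file system to the plan is done with a selection,
# either by tag (shown here) or by the file system's ARN:
backup_selection = {
    "SelectionName": "efs-selection",
    "IamRoleArn": "arn:aws:iam::123456789012:role/backup-role",  # hypothetical
    "ListOfTags": [
        {"ConditionType": "STRINGEQUALS",
         "ConditionKey": "backup", "ConditionValue": "daily"}
    ],
}
# e.g. boto3.client("backup").create_backup_plan(BackupPlan=backup_plan)
```

With a tag-based selection like this one, any Amazon EFS file system tagged `backup=daily` is picked up by the plan automatically.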
Pros:
- Easy-to-implement solution for backing up Amazon EFS from AWS.
- Incremental backups lower the costs of data stored.
Cons:
- To ensure consistent backups, the user needs to pause the applications or processes that are modifying the file system for the duration of the backup process.
- Backups are stored on Amazon EFS storage, which can be a cost concern.
Amazon EFS to EFS Backup
One solution for backing up NFS data on AWS is to implement automatic incremental backups from one Amazon EFS file system to another. This solution consists of the following components:
- Two Amazon CloudWatch events
- An AWS Lambda function
- An Amazon DynamoDB table
- An Amazon SNS Topic
- An Amazon S3 bucket
The Amazon CloudWatch events are used to start and stop the backup process. The first scheduled event triggers a Lambda function that launches an Amazon EC2 instance in an Auto Scaling group, which creates an ID for the backup and stores it in a DynamoDB table. The Amazon S3 bucket is used to store logs of the backup process. If the backup window expires before the process is complete, the second CloudWatch event invokes the orchestration function to set the desired capacity of the Auto Scaling group to zero, thereby terminating the Amazon EC2 instance and stopping the backup process.
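The two scheduled events can be sketched as the payloads you would pass to the CloudWatch Events `PutRule` API; the rule names, schedules, and Auto Scaling group name below are hypothetical stand-ins for what the CloudFormation template provisions:

```python
# Hypothetical schedules: start the backup at 01:00 UTC, and invoke the
# orchestrator again at 03:00 UTC to enforce the backup window.
start_rule = {
    "Name": "efs-backup-start",              # hypothetical rule name
    "ScheduleExpression": "cron(0 1 * * ? *)",
}
stop_rule = {
    "Name": "efs-backup-window-end",         # hypothetical rule name
    "ScheduleExpression": "cron(0 3 * * ? *)",
}
# Both rules would target the orchestration Lambda function, e.g.:
#   events = boto3.client("events")
#   events.put_rule(**start_rule)
#   events.put_targets(Rule=start_rule["Name"],
#                      Targets=[{"Id": "1", "Arn": lambda_function_arn}])
# When the window ends, the function terminates the backup instance by
# scaling the group down to zero:
#   boto3.client("autoscaling").set_desired_capacity(
#       AutoScalingGroupName="efs-backup-asg", DesiredCapacity=0)
```

Note that CloudWatch Events cron expressions take six fields (with a `?` in the day-of-month or day-of-week position), unlike standard five-field Unix cron.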
Backing Up to Amazon S3
To back up your Amazon EFS data to Amazon S3, you would need to implement a script that would run at a desired time and perform incremental sync to Amazon S3. This can be done using the AWS CLI sync option. With this option, AWS CLI will look into the Amazon EFS source directory, compare it to the destination bucket, and perform an incremental backup of the data.
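The comparison that the sync option performs can be illustrated with local directories. This sketch copies only files that are new or have changed, which is the same incremental behavior you would get from a scheduled script running `aws s3 sync /mnt/efs s3://your-bucket/backup` (the mount path and bucket name are hypothetical):

```python
import shutil
from pathlib import Path

def incremental_sync(source: str, destination: str) -> list[str]:
    """Copy files from source to destination only when they are new or
    modified (size or mtime differ), mimicking `aws s3 sync`."""
    copied = []
    for src in Path(source).rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = Path(destination) / rel
        if dst.exists():
            s, d = src.stat(), dst.stat()
            if s.st_size == d.st_size and s.st_mtime <= d.st_mtime:
                continue  # unchanged since last sync: skip it
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 preserves the modification time
        copied.append(str(rel))
    return copied
```

In practice you would simply schedule the AWS CLI command with cron rather than reimplement the comparison, but the sketch shows why repeated runs only transfer the changed data.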
Pros:
- EFS to EFS backup is a ready-to-deploy AWS CloudFormation template and can be implemented in just a few minutes.
- Backing up Amazon EFS data to Amazon S3 involves fewer components, only requiring an Amazon S3 bucket as the sync destination.
Cons:
- The EFS to EFS backup has a predefined window for backup process execution. Keep in mind that this window should be adjusted as your EFS data scales.
- Backing up Amazon EFS to Amazon S3 relies on your knowledge of Bash, Python, or some other language to script the AWS CLI.
- Amazon EFS to Amazon S3 backup requires alerts and monitoring in case a backup fails.
2. Using an Open-Source Tool: Terraform
Terraform is an open-source tool for managing your infrastructure as code. It is able to back up NFS data in Amazon EFS file systems to Amazon S3 using AWS Data Pipeline. To do this, Terraform uses a module called terraform-aws-efs-backup.
The module workflow periodically launches an Amazon EC2 instance based on a defined schedule and runs the shell commands defined for that instance, syncing data from Amazon EFS to Amazon S3 using the AWS CLI. All execution logs are stored on Amazon S3, and upon success or failure a message is sent to an SNS topic. Backup retention is managed using Amazon S3 lifecycle rules.
Pros:
- Free and open-source tool that can be modified to suit your backup needs.
- Allows NFS data to be synced from Amazon EFS to Amazon S3.
Cons:
- Knowledge of Terraform and its templates is required.
- Requires Amazon EFS, Amazon S3, and the use of the AWS CLI.
3. Amazon EFS Alternative: Using Cloud Volumes ONTAP for Incremental Backups of NFS Data
NetApp Cloud Volumes ONTAP is an alternative to using Amazon EFS for NFS file storage. With support for both NFS and SMB/CIFS file shares for cloud and on-premises systems, Cloud Volumes ONTAP doesn’t only give you an alternative to Amazon EFS; it offers one for Amazon FSx as well.
To back up NFS data, Cloud Volumes ONTAP uses NetApp Snapshot™ technology. NetApp snapshots create instant, application-aware, consistent, incremental backups of NFS data volumes no matter how large the source data is. These backup copies can later be used for a number of purposes, including restoring volumes to specific points in time.
Using NetApp SnapMirror®, Cloud Volumes ONTAP is able to keep backup data synced between repositories, whether they’re on AWS, Azure, or on-premises. Also, since backup data is not used frequently, users can benefit from better storage economy through Cloud Volumes ONTAP’s data tiering feature, which lets you move backup NFS data to inexpensive object storage on Amazon S3 or Azure Blob Storage. Data tiering is highly effective in terms of savings due to the lower object storage costs, while keeping the data exposed over NFS and/or SMB/CIFS.
Backup Costs: Cloud Volumes ONTAP vs Amazon EFS Pricing
One of the major concerns with using Amazon EFS is cost. Even if you back up Amazon EFS data to another Amazon EFS file system or an Amazon S3 bucket, you still have to pay for that storage. Cloud Volumes ONTAP gives users a way to cut down their storage footprint and associated costs with its space efficiency features, which include thin provisioning, data deduplication, compression, and compaction. It’s an easy way to save on cloud storage that isn’t available natively on AWS.

Aside from storage costs, there is the cost of maintaining a custom solution for backing up your Amazon EFS data: the engineering time needed to develop, test, implement, maintain, and monitor that solution. That is why, when choosing a custom solution, you should consider how much time would be needed to implement it properly, and keep future updates in mind as well.
Pros:
- Point-in-time snapshots.
- Storage efficiency features that reduce the overall cloud storage footprint.
- Support for NFS and SMB/CIFS.
- No need for costly custom solutions.
To back up your NFS data in AWS or Azure, start a free 30-day trial of Cloud Volumes ONTAP now.
Cons:
- May give users more features than required.
- Licensing is handled outside of the cloud provider.
We’ve just taken a look at some of the different ways to handle backups of your NFS data. Using AWS tools to back up EFS can be a bit complicated. Open-source tools such as Terraform can offer a better option since you can adapt them for your needs, although there is a lack of support and they require advanced knowledge to implement a backup solution. NetApp Cloud Volumes ONTAP provides an out-of-the-box Amazon EFS alternative for serving and backing up NFS data, complete with built-in features and mechanisms for backup and storage-cost minimization.