AWS EFS is a popular cloud file sharing service that can be mounted on Amazon EC2 instances, alongside Amazon EBS. It allows access to files and folders using the traditional NFS protocol, so it is a common choice for cloud-native workloads as well as workloads migrated from on-premises environments.
In this post, we’ll review which scenarios are best suited to EFS, explain how to use the EFS performance options, and share tips for optimizing performance. In addition, we’ll show how NetApp Cloud Volumes ONTAP can help optimize performance.
In this article, you will learn:
- AWS EFS performance use cases
- Working with EFS performance modes
- 7 tips for optimizing EFS performance
- High performance file shares with Cloud Volumes ONTAP
Amazon EFS Performance Use Cases
Amazon EFS is a massively scalable distributed file system, accessed using the Network File System (NFS) protocol. EFS can be mounted on thousands of Amazon EC2 instances in parallel, allowing all those instances to gain shared access to your files.
EFS provides good performance for the following use cases:
- Big data and analytics—EFS can provide high throughput to compute nodes, read-after-write consistency and low-latency file operations.
- Media processing (for example video editing, sound design)—EFS provides high data consistency with high throughput and central, shared access to large files, letting you distribute workloads between multiple machines.
- Content management and web servers—EFS provides a high throughput file system that can serve static files quickly for websites, archives and other public data stores.
- Home directory—EFS is a file system that can be accessible across the enterprise, with granular permissions at the file or directory level.
EFS is less performant in these use cases:
- OLTP databases—OLTP applications, such as financial transaction systems or retail sales systems, require high IOPS. EFS running in General Purpose Performance Mode (see more about performance modes below) has a limit of 7,000 file system operations per second.
- Code repositories or version control systems—EFS is not suitable for workloads that have a random seek component. Running Git or other code repositories, or scripting languages like PHP or Ruby, is not efficient because it requires very frequent access to a large number of small files.
- Applications requiring single file access—EFS imposes a throughput limit of 250 MBps per volume per instance. This can result in high latency if an application requires very frequent access to the same file and this file cannot be distributed.
- Workloads requiring snapshots—critical data, or any data you wish to protect against ransomware or other data loss scenarios, should not be placed on EFS because it does not currently support storage snapshots. You can back up EFS, but EFS backups consume bandwidth and can interfere with production operations.
- Data requiring access from Windows machines—EFS does not support the Server Message Block (SMB) protocol, which is used by Windows, so the EFS file system cannot be accessed from Microsoft Windows operating systems.
Working with EFS Performance Modes
EFS provides two performance modes:
- General Purpose Performance Mode—the default mode, suitable for most EFS use cases mentioned above.
- Max I/O Performance Mode—offers a higher threshold of operations per second, but with higher latency for file operations. This is optimal for highly parallelized applications.
To determine which performance mode is suitable for your workloads, Amazon advises running your application in General Purpose mode and monitoring the PercentIOLimit metric in Amazon CloudWatch. If this metric stays at or near 100% for extended periods of time, consider using Max I/O mode.
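The check described above can be sketched as a small function. This is a minimal illustration, not an AWS-provided rule: the function name, the 95% threshold, and the "half of all samples" cutoff are assumptions chosen for the example, and the input is assumed to be a list of PercentIOLimit samples you have already retrieved from CloudWatch.

```python
def should_consider_max_io(percent_io_limit, threshold=95.0, sustained_fraction=0.5):
    """Return True when PercentIOLimit is near 100% for a large share of samples.

    percent_io_limit: PercentIOLimit samples (0-100), e.g. one per minute.
    threshold and sustained_fraction are illustrative assumptions, not AWS values.
    """
    samples = list(percent_io_limit)
    if not samples:
        return False
    near_limit = sum(1 for value in samples if value >= threshold)
    return near_limit / len(samples) >= sustained_fraction


# A workload pinned at the limit vs. one with a single short spike.
print(should_consider_max_io([100, 99, 100, 97, 100]))
print(should_consider_max_io([12, 20, 100, 18, 15]))
```

Tune both parameters to what "extended periods" means for your own workload before acting on the result.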
In addition, EFS provides two throughput modes.
- Bursting Throughput—throughput for file operations scales with the amount of data stored in your file system. Depending on the size of your data, you earn a certain number of burst credits, which allow you to get higher throughput for a limited time. For example, a 1-TiB file system can run continuously at a throughput of 50 MiB/s and is allowed to burst to 100 MiB/s for 12 hours each day.
- Provisioned Throughput—lets you instantly provision the throughput you need, regardless of the amount of data stored. In this mode, Amazon bills the data you store and the throughput you provision as two separate charges.
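The 1-TiB example in the Bursting Throughput description can be reproduced with simple arithmetic. The sketch below assumes a simplified credit model: credits accrue at the baseline rate around the clock, are spent at the burst rate, and the file system is idle whenever it is not bursting. The function name is hypothetical.

```python
def burst_hours_per_day(baseline_mib_s, burst_mib_s):
    """Hours per day a file system can burst if it is idle the rest of the time.

    Credits earned over 24 hours at the baseline rate must pay for the
    bursting time: baseline * 24 = burst * hours.
    """
    if baseline_mib_s < 0 or burst_mib_s <= 0:
        raise ValueError("rates must be positive")
    return min(24.0, 24.0 * baseline_mib_s / burst_mib_s)


# The 1-TiB example from the text: 50 MiB/s baseline, 100 MiB/s burst.
print(burst_hours_per_day(50, 100))  # 12.0
```

Any throughput you drive outside the burst window reduces the available burst time, so treat the result as an upper bound.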
7 Amazon EFS Performance Tips
The following tips can help you get the best performance out of your EFS workloads.
Monitor average I/O throughput
Amazon EFS is highly distributed, and this means there is a small latency overhead for file operations. As I/O increases, the total overhead increases. Try to keep the number of I/O operations per second to a minimum, for example by uniting multiple files required by the same application into one file.
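One way to unite many small files is to pack them into a single archive, so the application opens one file instead of hundreds. The sketch below uses Python's standard tarfile module. The helper names are hypothetical, and it assumes the file names are unique, because only the base name is stored in the archive.

```python
import tarfile
from pathlib import Path


def bundle_files(paths, archive_path):
    """Pack many small files into one uncompressed tar archive."""
    with tarfile.open(archive_path, "w") as archive:
        for path in paths:
            # Store each file under its base name (assumes names are unique).
            archive.add(path, arcname=Path(path).name)
    return archive_path


def read_member(archive_path, name):
    """Read one file's bytes back out of the archive."""
    with tarfile.open(archive_path, "r") as archive:
        member = archive.extractfile(name)
        if member is None:
            raise FileNotFoundError(name)
        return member.read()
```

Whether this helps depends on your access pattern: it suits data that is written once and read together, and fits less well when individual files change often.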
Monitor EFS burst credits
If you choose to use Bursting Throughput, it is essential to monitor your burst credit usage. Take into account that backups also consume credits. If you run out of burst credits in production, throughput drops to the baseline rate, which can cause severe slowdowns or even outages for your applications.
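To turn a credit balance into something actionable, you can estimate how long the credits will last at the current throughput. This is a rough sketch under stated assumptions: the balance is given in bytes (for example, a value read from the BurstCreditBalance metric in CloudWatch), throughput stays constant, and credits keep accruing at the baseline rate. The function name is hypothetical.

```python
MIB = 1024 * 1024


def hours_until_credits_exhausted(credit_balance_bytes, throughput_mib_s, baseline_mib_s):
    """Estimate hours left before burst credits run out at a constant throughput."""
    # Net drain: what is spent above the rate at which credits are earned.
    drain_bytes_per_sec = (throughput_mib_s - baseline_mib_s) * MIB
    if drain_bytes_per_sec <= 0:
        return float("inf")  # at or below baseline, credits are not shrinking
    return credit_balance_bytes / drain_bytes_per_sec / 3600
```

An alert on this estimate dropping below a few hours gives you time to throttle backups or switch throughput modes before production is affected.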
Manage backups carefully
Amazon EFS does not currently have a snapshot mechanism, so you need to design a backup process that copies your data to a new server. When you perform backups, make sure to rate-limit them so they don’t exhaust your burst credits (in Bursting Throughput mode) or incur extra charges for throughput (in Provisioned Throughput mode).
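Rate-limiting a backup can be as simple as pausing between chunks. The following is a minimal sketch of a throttled file copy, assuming the backup is a plain file-by-file copy run from an EC2 instance. The function name and default chunk size are illustrative.

```python
import time


def copy_rate_limited(src_path, dst_path, max_bytes_per_sec, chunk_size=1024 * 1024):
    """Copy src to dst, sleeping as needed to stay under max_bytes_per_sec."""
    if max_bytes_per_sec <= 0:
        raise ValueError("max_bytes_per_sec must be positive")
    copied = 0
    start = time.monotonic()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while chunk := src.read(chunk_size):
            dst.write(chunk)
            copied += len(chunk)
            # Time the copy should have taken so far at the allowed rate.
            expected = copied / max_bytes_per_sec
            elapsed = time.monotonic() - start
            if expected > elapsed:
                time.sleep(expected - elapsed)
    return copied
```

In Bursting Throughput mode, a limit at or below your baseline rate keeps the backup from drawing down burst credits.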
Leverage simultaneous connections
EFS is massively parallel and can be mounted on thousands of EC2 instances. The more you can parallelize your application and access EFS files from multiple EC2 instances, the higher the aggregate performance you can get across instances.
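The same principle can be illustrated on a single instance by issuing reads from several threads at once, so that the per-operation latency overlaps instead of adding up. This is a sketch only: the function name and worker count are assumptions, and spreading work across multiple instances needs a job distribution mechanism that is outside the scope of this example.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_files_in_parallel(paths, max_workers=16):
    """Read several files concurrently and return {path: size_in_bytes}."""
    paths = list(paths)

    def size_of(path):
        return len(Path(path).read_bytes())

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so sizes line up with paths.
        sizes = list(pool.map(size_of, paths))
    return dict(zip(paths, sizes))
```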
Use asynchronous write operations
Asynchronous writes are buffered on your EC2 instances before being written to Amazon EFS. Asynchronous writes have lower latency, but note the tradeoff: data may not be consistent across instances until it has been flushed, and it takes longer for a write to actually reach the file system.
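At the application level, the difference looks roughly like this. The sketch contrasts forcing every record to storage with letting writes buffer and flushing once at the end. The function name and the sync_each flag are hypothetical, and the behavior you actually observe on EFS also depends on your NFS client and mount options.

```python
import os


def write_records(path, records, sync_each=False):
    """Write records to path, syncing either after every record or once at the end."""
    with open(path, "wb") as f:
        for record in records:
            f.write(record)
            if sync_each:
                # Synchronous style: wait until each record reaches storage.
                f.flush()
                os.fsync(f.fileno())
        # Buffered style: a single durability point after all records.
        f.flush()
        os.fsync(f.fileno())
```

With sync_each=False, each write returns quickly, but nothing is guaranteed to be on the file system until the final flush completes.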
EFS Mount Settings
Use the mount settings recommended by AWS. Use NFS 4.1 if your operating system supports it, because it provides better performance. Increase the read and write buffer sizes for NFS clients to 1 MB.
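The two settings named above map to standard Linux NFS mount options (nfsvers, rsize, and wsize). The sketch below only builds the mount command as a list of arguments. The function name is hypothetical, the file system DNS name in the usage line is a placeholder, and AWS recommends additional options that are not covered here, so check the current EFS documentation before using it.

```python
def efs_mount_command(file_system_dns, mount_point, buffer_bytes=1024 * 1024):
    """Build a mount command using NFS 4.1 and 1 MB read/write buffers."""
    options = ",".join([
        "nfsvers=4.1",            # NFS protocol version
        f"rsize={buffer_bytes}",  # read buffer size in bytes
        f"wsize={buffer_bytes}",  # write buffer size in bytes
    ])
    return ["mount", "-t", "nfs4", "-o", options, f"{file_system_dns}:/", mount_point]


# Placeholder file system name, for illustration only.
command = efs_mount_command("fs-12345678.efs.us-east-1.amazonaws.com", "/mnt/efs")
print(" ".join(command))
```

Returning a list rather than a string makes it straightforward to pass the command to subprocess.run without shell quoting issues.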
Capacity on EC2 Instances
Ensure your EC2 instances have enough memory and computing capacity to perform the required amount of read and write operations. Choose your instance types accordingly. Note that EBS-optimized instances do not provide performance benefits when using EFS.
High Performance File Shares with Cloud Volumes ONTAP
NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, lets you use compute and storage resources to create a virtual storage appliance, as a high performance alternative to Amazon EFS. This provides the following benefits:
- NFS and SMB/CIFS file shares (in addition to iSCSI)
- Active Directory integration, allowing users to access files with their existing domain credentials
- Space-efficiency technologies including thin provisioning, data compaction, compression and deduplication, which can dramatically reduce storage volumes and costs
- NetApp Snapshots™ technology, creating instant backups of data, irrespective of size
- FlexClone® technology, creating writable clones that do not take up extra storage
- SnapMirror® technology, providing incremental data synchronization from on-premises NetApp appliances to the cloud
- Automatic data tiering, offloading less-frequently-used data to Amazon S3 while still exposing it over NFS and/or CIFS