Amazon EMR provides a variety of capabilities that remove some of the complexity of managing analytics workloads. However, considerable challenges remain. Sizing an EMR cluster for a dynamic, growing data set, working within the capacity limits of Amazon EBS volumes, storing three copies of the data under the HDFS 3x replication factor, and paying Amazon S3 charges for API calls all add complexity and cost.
With the help of the NetApp In-Place Analytics Module, EMR clusters can access data managed by Cloud Volumes ONTAP over NFS. With Cloud Volumes ONTAP’s NAS capabilities, you get the advantages of EMR without having to consume additional storage, replicate existing data sets, or make endless API calls to Amazon S3, saving significant cost and operational effort.
Create NFS volumes for big data
Install NetApp In-Place Analytics Module (NIPAM)
Mount NFS volumes to EMR clusters
Significant savings from reducing the number of EMR clusters and eliminating the need for three HDFS copies and the substantial number of S3 API calls.
NetApp data cloning technology allows you to instantly deploy volume clones on which you can run variations of analytics while keeping your main volumes dedicated to production workloads.
Robust data reliability with NetApp Snapshots, data replication, and high availability pair deployments.
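To make the storage part of the savings concrete, here is a rough back-of-the-envelope sketch in Python. It compares the raw capacity needed to hold a data set under the HDFS 3x replication factor with the capacity needed when a single copy is read in place over NFS. The function name, the 10 TiB data set, and the 25% free-space headroom are illustrative assumptions, not figures from NetApp or AWS, and the single-copy case ignores any storage efficiency or protection overhead on the ONTAP side.

```python
def raw_capacity_tib(dataset_tib: float,
                     replication_factor: int = 3,
                     headroom: float = 0.25) -> float:
    """Estimate raw storage (TiB) for a data set.

    dataset_tib        -- logical size of the data set (assumed value)
    replication_factor -- copies kept; HDFS keeps 3 by default
    headroom           -- extra free-space fraction (assumed 25%)
    """
    if dataset_tib < 0 or replication_factor < 1 or headroom < 0:
        raise ValueError("sizes and factors must be non-negative, "
                         "and at least one copy is required")
    return dataset_tib * replication_factor * (1 + headroom)


# Hypothetical 10 TiB data set
hdfs_tib = raw_capacity_tib(10, replication_factor=3)  # three HDFS copies
nfs_tib = raw_capacity_tib(10, replication_factor=1)   # one copy read over NFS

print(f"HDFS (3x): {hdfs_tib} TiB")
print(f"NFS (1x):  {nfs_tib} TiB")
print(f"Avoided:   {hdfs_tib - nfs_tib} TiB")
```

Under these assumptions the gap grows linearly with the data set, which is why the replication factor matters more as the data grows.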