Amazon Elastic MapReduce (Amazon EMR) is a scalable big data analytics service on AWS. Amazon EMR clusters come with a few caveats that can lead to high costs and extra work. When EMR runs against Amazon S3, users are charged for common HTTP requests, including GET, SELECT, PUT, POST, and other operations. When Amazon EBS serves as storage for EMR, users have to implement bootstrap actions and manually track the automatic disk space increases that take place when volumes reach 90% capacity, which adds management overhead.
Using the NetApp In-Place Analytics Module (NIPAM), Amazon EMR users can run analytics jobs on their existing NFS repositories on AWS with Cloud Volumes ONTAP, or burst their on-premises data to Cloud Volumes ONTAP by using FlexCache. With Cloud Volumes ONTAP, Amazon EMR users gain cost-cutting storage efficiencies, zero API costs, data mobility, and automated data tiering between Amazon EBS and Amazon S3, so cold data is stored at low cost when Amazon EMR isn’t running analytics jobs. For Cloud Volumes ONTAP users, this integration with Amazon EMR provides an easy way to analyze all of the NFS data they store in the cloud.