Cloud Volumes ONTAP

Amazon EMR

Optimizing data costs for Amazon EMR analytic workloads.


High costs and data management overhead

Amazon Elastic MapReduce (Amazon EMR) is a scalable Big Data analytics service on AWS. When using Amazon EMR clusters, there are a few caveats that can lead to high costs. When EMR runs alongside Amazon S3, users are charged for common requests, including GET, SELECT, PUT, POST, and other operations. When Amazon EBS is used as storage for EMR, users must implement bootstrap actions and manually track the automatic disk space increases that take place when volumes reach 90% capacity, which adds management overhead.
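To make the EBS caveat concrete, here is a minimal Python sketch of the kind of manual tracking described above: it flags any mount point whose usage has reached the 90% mark. The 90% figure comes from the text; the function names, the `"/"` mount point, and the injectable `usage_fn` parameter are illustrative assumptions, not part of Amazon EMR or any NetApp tooling.

```python
import shutil

# Threshold taken from the text above: disk space increases happen at 90% capacity.
CAPACITY_THRESHOLD = 0.90


def used_fraction(used_bytes, total_bytes):
    """Return the fraction of a volume that is in use (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def at_threshold(used_bytes, total_bytes, threshold=CAPACITY_THRESHOLD):
    """Return True when usage has reached or passed the threshold."""
    return used_fraction(used_bytes, total_bytes) >= threshold


def check_mounts(mount_points, usage_fn=shutil.disk_usage):
    """Map each mount point to (used fraction, at-threshold flag).

    usage_fn is a hypothetical hook so the check can be exercised without
    touching a real disk; by default it is shutil.disk_usage, which returns
    a named tuple with total, used, and free byte counts.
    """
    report = {}
    for path in mount_points:
        usage = usage_fn(path)
        frac = used_fraction(usage.used, usage.total)
        report[path] = (frac, at_threshold(usage.used, usage.total))
    return report


if __name__ == "__main__":
    # "/" is a placeholder; substitute the mount points your cluster nodes use.
    for path, (frac, flagged) in check_mounts(["/"]).items():
        status = "at or above threshold" if flagged else "ok"
        print(f"{path}: {frac:.1%} used ({status})")
```

A script like this would have to run on every node and be kept in step with the cluster's volume layout, which is the kind of overhead the paragraph above refers to.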


Enterprise-grade data management

Using the NetApp In-Place Analytics Module (NIPAM), Amazon EMR users can run analytics jobs on their current NFS repositories on AWS with Cloud Volumes ONTAP, or burst their on-prem data instantly to Cloud Volumes ONTAP by using FlexCache. With Cloud Volumes ONTAP, Amazon EMR users gain cost-cutting storage efficiencies, zero API costs, data mobility, and automated data tiering between Amazon EBS and Amazon S3, so cold data is stored at low cost when Amazon EMR isn’t running analytics jobs. For Cloud Volumes ONTAP users, this integration with Amazon EMR provides an easy way to analyze all the NFS data stored in the cloud.

How it works


  1. Set up an Amazon EMR cluster for your analytics workload

  2. Create a Cloud Volumes ONTAP deployment

  3. Install NIPAM and connect Cloud Volumes ONTAP to Amazon EMR

  • Easy management

    A single storage back end to service both enterprise workloads and your AWS EMR architecture.

    Robust data reliability with NetApp Snapshot™ copies, SnapMirror® data replication, and AWS high availability pair deployments.

  • AWS EMR costs

    No API costs when running EMR on data hosted by Cloud Volumes ONTAP.

    Tiering cold data automatically between Amazon EBS disks and low-cost Amazon S3 object storage as needed.

    Storage efficiencies that drastically reduce data footprint and associated storage costs.

  • NetApp FlexClone® data cloning technology allows you to instantly deploy volume clones on which you can run variations of analytics while keeping your main volumes dedicated to production workloads.


Get block and file storage for the price of object storage.
