Cloud-Based Analytics with Cloud Volumes ONTAP and FlexCache

Analytics are making it possible for organizations to gain valuable business insights from the data they accumulate. But there are serious challenges they can face when trying to access cloud-based analytics capabilities across disparate environments, such as in hybrid cloud management scenarios.

This blog will take a closer look at the challenges of analyzing disparate data sets and how they can be solved with NetApp FlexCache technology and cloud bursting with Cloud Volumes ONTAP.

What Is Cloud Analytics?

As digitization increases, organizations are creating, storing, processing, and archiving vast amounts of data all over the world. Some of this data lives in proprietary data centers, some on cloud platforms, and some in remote and branch offices (ROBO). In an increasingly interconnected world, information has become truly global, and dispersed data is now the norm.

In dispersed deployments, data can reside anywhere.

Despite this global distribution, analyzing all of that data together in a coherent manner to derive meaningful intelligence is still a key requirement. The ability to store data generated from multiple sources and geographies, and to analyze it for correlations or patterns that predict emerging trends or future behaviors such as customer buying patterns, can give an organization an edge over its competitors.

The cloud has taken pole position in meeting this growing need.

Thanks to vast computing capacity and next-generation analytics based on artificial intelligence (AI), machine learning (ML), and deep learning (DL), cloud computing platforms have increasingly become the de facto aggregation point where data is stored and analyzed. The growing popularity of data analytics platforms such as Elasticsearch, deployed on AWS, Azure, and Google Cloud, is evidence of this trend.

Cloud-Based Analytics Challenges

For organizations that want to use cloud analytics, however, this global distribution of data brings a number of challenges.

  • Data consolidation: To analyze these data sets and gain insights, organizations have to replicate the individual islands of data into a shared repository in the cloud, in a timely manner, so that the data can be correlated and analyzed there.
  • Sync and timing: Consolidating data often requires multiple third-party replication technologies or scripts, which must also keep the data synchronized across platforms. This introduces complexity.
  • Performance matters: Moving large chunks of data from various ROBO sites and customer data centers to the cloud for analytics can consume a lot of bandwidth. Most organizations are also at the mercy of internet latency, which can delay the whole process.
  • Security: Migrating data from various locations to the cloud over an insecure connection such as the internet can expose vital organizational data and create a serious security risk. Using third-party encryption to solve this could be expensive and logistically challenging for many organizations.
  • Increased costs: Moving data between repositories can be costly because storage is duplicated on each platform. The related storage and transfer costs can run into hundreds of thousands of dollars.
  • Multiple namespaces: Replication also creates a namespace problem. The same data now lives in multiple namespaces across the multicloud environment, with no single namespace acting as the single point of access for the consumers of that data (applications or users).

Because of these challenges, many organizations are discovering that moving data from one location to another for analysis is no longer the most efficient approach. Fortunately, NetApp has a solution.

Using NetApp FlexCache and Cloud Volumes ONTAP for Cloud-Based Analytics

NetApp Cloud Volumes ONTAP is NetApp’s cloud-native, enterprise-grade storage solution, available on all the major cloud platforms. Cloud Volumes ONTAP gives customers high-performance cloud storage with enterprise-grade data availability, data protection, data security, and data governance features.

The Cloud Volumes ONTAP architecture.

NetApp FlexCache solves the problem of having to move various islands of data into a central location for analysis by providing an intelligent cache of a data volume in a remote location that is persistent, writable, consistent, coherent, and current.

NetApp FlexCache in NetApp ONTAP.

How FlexCache Data is Used with Cloud Volumes ONTAP

To clients, a FlexCache copy of a source data volume looks exactly the same as the source volume. However, the FlexCache copy is a sparse container, which means that not all the files from the original data volume are cached. Instead, storage is used efficiently by retaining only the working dataset (the most recently used data) in the cache volume.

When a FlexCache volume is first created, no data takes up space in it other than metadata, which is used to present the source volume’s dataset (its files and folders) to the FlexCache volume’s clients. When clients access specific blocks of data, those blocks are retrieved on demand from the source volume and cached while being served to the clients.
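
The behavior described above can be illustrated with a small toy model. This is only a sketch of the general sparse, read-through caching idea, not NetApp's implementation: the class names, the block-level layout, the file paths, and the least-recently-used eviction rule are all assumptions made for illustration.

```python
from collections import OrderedDict


class OriginVolume:
    """Hypothetical stand-in for the source volume: holds every block of every file."""

    def __init__(self, files):
        # files maps a path to a list of data blocks (bytes)
        self.files = files
        self.reads = 0  # number of block reads served by the origin

    def list_metadata(self):
        # Only names and block counts, no file contents
        return {path: len(blocks) for path, blocks in self.files.items()}

    def read_block(self, path, index):
        self.reads += 1
        return self.files[path][index]


class SparseCacheVolume:
    """Toy sparse cache: full namespace up front, data blocks only on demand."""

    def __init__(self, origin, capacity_blocks):
        self.origin = origin
        self.capacity_blocks = capacity_blocks
        # Metadata alone makes the cache look like the origin from the start
        self.metadata = origin.list_metadata()
        self.blocks = OrderedDict()  # (path, index) -> data, kept in LRU order

    def listdir(self):
        return sorted(self.metadata)

    def read_block(self, path, index):
        key = (path, index)
        if key in self.blocks:
            self.blocks.move_to_end(key)  # mark as most recently used
            return self.blocks[key]
        data = self.origin.read_block(path, index)  # miss: fetch from the origin
        self.blocks[key] = data
        if len(self.blocks) > self.capacity_blocks:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data


origin = OriginVolume({
    "/trades/q1.csv": [b"a1", b"a2", b"a3"],
    "/trades/q2.csv": [b"b1", b"b2"],
})
cache = SparseCacheVolume(origin, capacity_blocks=2)

print(cache.listdir())      # every file is visible immediately
print(len(cache.blocks))    # but no data blocks are cached yet
cache.read_block("/trades/q1.csv", 0)  # miss, fetched from the origin
cache.read_block("/trades/q1.csv", 0)  # hit, served from the cache
print(origin.reads)         # the origin served a single block read
```

The point of the sketch is that the namespace is complete as soon as the cache exists, while capacity is only spent on blocks that clients actually read.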

NetApp FlexCache can be used for a variety of use cases. These include financial applications that distribute data for analysis; media rendering, where remote workers (e.g., artists and graphic designers) can be given access to the master data set and render farms can be built; and software development, where code and applications can easily be distributed to testers and QA teams in remote locations.

Now, with NetApp Cloud Volumes ONTAP, enterprise customers can create FlexCache volumes that bring on-premises data volumes, or even data volumes residing on other remote cloud platforms, closer to a single cloud location. The data can then be analyzed centrally without being permanently moved or replicated from its source locations.

FlexCache Volume Benefits

This clever architecture provides a number of benefits: 

  • Near-instant access to remote data: Cloud analytics solutions such as Elasticsearch can access a FlexCache volume created on Cloud Volumes ONTAP almost instantly. Because data copies can be made available this quickly across hybrid and multicloud environments, FlexCache and Cloud Volumes ONTAP customers get speedy access to cloud analytics and can maintain their edge over competitors.
  • High performance: Creating a sparse cache (based on metadata) close to the analytics engine cuts the latency of remote data access and the performance issues that come with it. With the high-performance storage architecture of Cloud Volumes ONTAP, complex analytic queries complete quickly, without unnecessary delay.
  • Reduced cloud storage costs: The sparse nature of the cache minimizes the underlying cloud storage that FlexCache volumes consume on Cloud Volumes ONTAP. Because only the needed data is retrieved to the cloud, on demand, unnecessary replication and the associated cloud storage and transfer costs are avoided. Customers running FlexCache with Cloud Volumes ONTAP can reduce cloud consumption costs further when Cloud Volumes ONTAP’s storage efficiency features are applied to FlexCache data.
  • Enhanced security: NetApp FlexCache is secure by design, with TLS encryption enabled by default between the origin and the destination. This lets customers transfer data safely from locations such as ROBO sites, which may not always have secure connections to the cloud.
  • Hybrid and multicloud flexibility: NetApp FlexCache volumes can extend data from any ONTAP-based solution to a remote ONTAP instance, such as another ONTAP system in a customer’s data center or a Cloud Volumes ONTAP instance running anywhere in the cloud. FlexCache volumes on Cloud Volumes ONTAP are now supported for both Linux-based NFS and Windows-based SMB workflows, giving customers much more flexibility in the cloud.
  • Burst to the cloud: Together with NetApp Cloud Volumes ONTAP, FlexCache is also a good fit for a number of cloud bursting use cases. FlexCache volumes created on Cloud Volumes ONTAP give customers fast access to copies of their on-premises data in the cloud for use with cloud-native services, such as AWS analytics or Azure analytics offerings, with no additional costs or replication delay.
    Learn more about Cloud Bursting with Cloud Volumes ONTAP here.
  • Reduced data management overhead: With FlexCache, customers can limit data management tasks such as backup and disaster recovery to the original copy of the data. FlexCache copies on Cloud Volumes ONTAP are maintained as coherent, up-to-date copies of the source data, and any changes made at the destinations are automatically replicated back to the source. This architecture also means that other problems inherent in data duplication, such as data segmentation and multiple namespaces, do not apply, which reduces overall data management complexity.
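
To make the storage and transfer argument concrete, here is a rough back-of-the-envelope sketch. The function name and the figures (a 50,000 GB dataset of which 10% is actually read by analytics jobs) are hypothetical and chosen only for illustration. The model also ignores metadata traffic and any blocks that are fetched again after being evicted from the cache.

```python
def transfer_comparison(total_gb, working_set_fraction):
    """Return (replicated_gb, cached_gb) moved to the cloud under each approach."""
    if not 0.0 <= working_set_fraction <= 1.0:
        raise ValueError("working_set_fraction must be between 0 and 1")
    replicated_gb = total_gb                     # full copy: the whole dataset moves
    cached_gb = total_gb * working_set_fraction  # sparse cache: only blocks actually read
    return replicated_gb, cached_gb


# Hypothetical figures, for illustration only
replicated, cached = transfer_comparison(total_gb=50_000, working_set_fraction=0.1)
saved = replicated - cached

print(f"Full replication: {replicated:,.0f} GB")
print(f"On-demand cache:  {cached:,.0f} GB")
print(f"Avoided transfer: {saved:,.0f} GB")
```

The smaller the working set is relative to the full dataset, the larger the gap between the two approaches becomes.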

Together, NetApp Cloud Volumes ONTAP and FlexCache let customers use cloud technologies such as cloud analytics to derive meaningful insights from all the different islands of data in their hybrid cloud infrastructure, irrespective of where that data was created.

Refer to this NetApp FlexCache Technical Report for additional information. Now let’s see how all this works in practice.

A Case Study: A Multinational Investment Firm Uses Google Cloud Analytics with Cloud Volumes ONTAP

One of the largest American multinational investment and financial services organizations, headquartered in New York City, has been using NetApp Cloud Volumes ONTAP on Google Cloud together with NetApp FlexCache to apply cloud analytics to its important data.

With NetApp FAS-based storage solutions in its global data centers and ROBO sites, this company began embracing the public cloud as part of a multi-year cloud journey. One of its critical workloads performs investment analysis on a periodic basis (monthly, quarterly, etc.). Some of the processing requires significant compute resources, and the customer was looking for ways to use a large number of compute nodes on Google Cloud. The main goal was to find a solution that supported its cloud strategy with the fewest changes to processes, automation, and skill sets.

NetApp FlexCache became the key to the solution. Working with the customer's standard on-premises architecture, FlexCache was used to burst data to Cloud Volumes ONTAP instances on Google Cloud, where the data can easily be analyzed to provide business intelligence for new investment strategies. The customer plans to extend its use of NetApp FlexCache and Cloud Volumes ONTAP within its global organization, expanding across multiple analytics platforms and offering this as a standard cloud capability to internal customers (Business Lines) in the future.

It’s Data Analytics, Optimized

Data analytics provides insights that businesses can use to become more successful. But in large, distributed deployments, sharing that data can be a challenge. NetApp has a solution for this in Cloud Volumes ONTAP with FlexCache.

If you’re running analytics across multiple repositories in dispersed locations, Cloud Volumes ONTAP and FlexCache can help you get the most from your data at a high level of performance.

Read about additional analytics customer stories with Cloud Volumes ONTAP here.

Cloud Solution Architect