Cloud Storage: The Complete Guide

Yifat Perry

Product Marketing Lead

Dec 1, 2020 7:49:11 AM

Cloud storage uses remote resources to maintain, manage, and provide access to data. When users need to save, access, or modify data, they must connect to the remote resource over a network (typically the Internet). 

The purpose of cloud storage is to enable users to store data off-site using resources they do not have to purchase, maintain, or manage. Additionally, because the costs of these resources are distributed among all users, resources can be cheaper, enabling users to access storage at a scale or performance level that they might not otherwise be able to afford. 

How Does Cloud Storage Work?

Cloud storage systems are typically made up of large numbers of distributed data centers or servers. The resources of these centers and servers are leased out to customers as reserved capacity or on-demand. When you store data in cloud resources, the service provider is responsible for ensuring that it is durable and available to users. This is done by replicating data across multiple servers or data centers. 
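As a rough illustration of the replication idea above, the following Python sketch writes every object to several in-memory "servers" and reads from the first one that is still reachable. The `ReplicatedStore` class, its method names, and the use of dictionaries as servers are assumptions made for this example; real providers use far more elaborate schemes.

```python
class ReplicatedStore:
    """Toy store that keeps a full copy of each object on every replica."""

    def __init__(self, replica_count=3):
        # Each dict stands in for one server or data center (hypothetical).
        self.replicas = [dict() for _ in range(replica_count)]
        self.offline = set()  # indexes of replicas simulated as failed

    def put(self, key, data):
        # Write the object to every replica that is currently reachable.
        for index, replica in enumerate(self.replicas):
            if index not in self.offline:
                replica[key] = data

    def get(self, key):
        # Read from the first reachable replica that holds the object.
        for index, replica in enumerate(self.replicas):
            if index not in self.offline and key in replica:
                return replica[key]
        raise KeyError(key)

    def fail(self, index):
        # Simulate losing one server.
        self.offline.add(index)


store = ReplicatedStore(replica_count=3)
store.put("report.pdf", b"quarterly numbers")
store.fail(0)  # the first server goes down
recovered = store.get("report.pdf")  # still served by another replica
```

The data stays readable as long as at least one replica that received the write is still online.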

Public vs Private vs Hybrid Cloud Storage Services

When considering cloud storage, there are several service models you can choose from depending on your intended use, budget, and required level of control. Storage services can be grouped into two main categories: public and private. There are also two combined approaches, hybrid and multicloud, that build on these categories. 

Public Cloud Storage

Public cloud storage services are based on resources that are owned, maintained, and operated by the cloud storage provider. Amazon Web Services (AWS), Microsoft Azure, and Google Cloud are the three largest public cloud storage providers. 

Public cloud resources are designed for multiple tenants per server. Providers separate tenant or customer data through access controls, security policies, and data isolation practices. Some public cloud providers also offer dedicated servers for greater isolation. The resources you lease are made available over Internet connections and you can access data through web interfaces which generally use REST APIs. 

Public cloud storage services can be reserved or used on-demand. Reserved resources may require advance payment or an agreement to retain services for a specific period of time. On-demand services are available as needed and are paid for on a monthly basis. Services used are typically charged according to the number of gigabytes used and the amount of bandwidth used for data access or transfer. 
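The usage-based billing described above can be sketched as a simple calculation. The function name and the per-gigabyte rates below are hypothetical and are not taken from any provider's price list.

```python
def estimate_monthly_cost(stored_gb, egress_gb, price_per_gb, price_per_egress_gb):
    """Return an estimated monthly bill for on-demand storage."""
    storage_cost = stored_gb * price_per_gb          # charge for gigabytes stored
    transfer_cost = egress_gb * price_per_egress_gb  # charge for bandwidth used
    return round(storage_cost + transfer_cost, 2)


# Hypothetical rates, chosen only to make the arithmetic easy to follow.
monthly_bill = estimate_monthly_cost(
    stored_gb=500, egress_gb=100, price_per_gb=0.02, price_per_egress_gb=0.09
)
```

With these assumed rates, 500 GB of storage costs 10.00 and 100 GB of transfer costs 9.00, for a total of 19.00 per month.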

Hybrid Cloud Storage

Hybrid cloud storage is a combination of remote cloud storage (either public or private) and local resources. Storage is often implemented with proprietary software or appliances that sync with cloud resources via API. These infrastructures are typically used by organizations that want or need to keep data locally accessible, for example because of mission-critical legacy applications or compliance restrictions. 

When operating a hybrid infrastructure, you can store separate data sets in cloud and local resources, or you can sync data between them. For example, you can use policy engines to transfer infrequently accessed data to the cloud while retaining frequently accessed data on-premises. Meanwhile, syncing data enables you to gain the cloud's availability while maintaining the low latency provided by on-site resources. 
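A policy engine of the kind mentioned above can be approximated with a last-access rule. This is a minimal sketch: the `plan_placement` function, the 30-day threshold, and the file names are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def plan_placement(files, now, max_idle_days=30):
    """Map each file name to 'cloud' or 'local' based on its last access time."""
    cutoff = now - timedelta(days=max_idle_days)
    placement = {}
    for name, last_access in files.items():
        # Files idle for longer than the threshold are candidates for the cloud.
        placement[name] = "cloud" if last_access < cutoff else "local"
    return placement


now = datetime(2020, 12, 1)
files = {
    "orders.db": datetime(2020, 11, 30),        # accessed yesterday
    "2018-archive.zip": datetime(2020, 3, 14),  # idle for months
}
placement = plan_placement(files, now)
```

Here `orders.db` stays local because it was accessed recently, while `2018-archive.zip` is marked for transfer to the cloud.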

Multicloud Storage

Multicloud storage is a strategy that uses multiple types of cloud storage services (public and private) or services from different vendors. The purpose of multicloud strategies is to diversify your services so that you can avoid vendor lock-in, optimize use according to resource capability, or enable use of otherwise incompatible services or applications. Multicloud storage strategies can combine native services, supplier-integrated services, and marketplace services. 

Private Cloud Storage

Private cloud storage services can be remote or on-premises. Services can be owned and operated in-house or by a cloud provider. These services provide dedicated, single tenant resources and allow greater control over your stored data.

Depending on the location of your private cloud services, you can base your infrastructure on your existing hardware, newly purchased hardware, or hardware leased from a provider. Also depending on location, you can access your stored data through private networks or through the Internet. 

The cost of private cloud services depends on whether they are self-operated. If they are, costs are based on necessary hardware purchases, ongoing hardware maintenance, and cloud infrastructure operation and management. If a vendor operates your private cloud, you are typically charged based on the resources you use and the level of support you need.

Cloud Storage Technology

Although object storage is the most common cloud storage technology, there are other options. Generally, these include block storage (such as Amazon EBS) and file storage (such as Azure Files). 

Object Storage

Object storage is a type of storage that enables you to store unstructured data using metadata-based schemes rather than file hierarchies. These storage resources store data as objects, each of which combines file data and metadata under a unique identifier. You use this ID to retrieve the object as needed. Since you store objects in a flat structure rather than a hierarchy, all file context is stored in metadata. 
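To make the object model concrete, here is a small in-memory sketch of a flat object store. The `ObjectStore` class and its `put`, `get`, and `find` methods are hypothetical names for this example and do not correspond to any provider's API.

```python
import uuid


class ObjectStore:
    """Toy object store: a flat namespace of ID -> (data, metadata)."""

    def __init__(self):
        self._objects = {}  # no folders, only unique IDs

    def put(self, data, **metadata):
        # Generate a unique identifier and store data together with its metadata.
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id):
        # Retrieve an object by its ID.
        return self._objects[object_id]

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            object_id
            for object_id, (_, metadata) in self._objects.items()
            if all(metadata.get(k) == v for k, v in criteria.items())
        ]


store = ObjectStore()
photo_id = store.put(b"\x89PNG...", content_type="image/png", project="website")
data, metadata = store.get(photo_id)
matches = store.find(project="website")
```

Because there is no directory tree, context such as the project or content type lives in the metadata, and lookups go through either the ID or a metadata query.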

Block Storage

Block storage is a type of storage that uses abstraction to divide a low-level storage device into fixed-size blocks, which are grouped into volumes. These volumes serve as virtual hard drives that you can attach to instances or VMs to serve as persistent storage. 
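Block storage can be pictured as an array of fixed-size blocks addressed by index. The `BlockVolume` class below is a hypothetical, in-memory sketch of that idea and not a model of any specific service such as Amazon EBS.

```python
class BlockVolume:
    """Toy volume made of fixed-size blocks addressed by index."""

    def __init__(self, block_count, block_size=512):
        self.block_size = block_size
        # Every block starts out filled with zero bytes.
        self._blocks = [bytes(block_size) for _ in range(block_count)]

    def write_block(self, index, data):
        if len(data) > self.block_size:
            raise ValueError("data does not fit in one block")
        # Pad short writes with zero bytes so every block keeps the same size.
        self._blocks[index] = data.ljust(self.block_size, b"\x00")

    def read_block(self, index):
        return self._blocks[index]


volume = BlockVolume(block_count=8, block_size=16)
volume.write_block(3, b"hello")
block = volume.read_block(3)
```

A file system or database sits on top of a volume like this and decides which blocks hold which data. The volume itself knows nothing about files or metadata.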

File Sharing

Cloud file sharing is built on file storage, which holds data hierarchically using files and folders. File storage is the standard storage type used by on-premises machines and servers. Typically, you access data through the Network File System (NFS) protocol or the Server Message Block (SMB) protocol.

Distributed Storage

Distributed storage is not a type of storage so much as a storage infrastructure. It enables you to split storage across multiple workstations, servers, or data centers. Each storage device serves as a node in a storage cluster that you manage centrally. You can create distributed storage infrastructures for object, block, and file storage. 

Distributed storage infrastructures are what provide the benefits of cloud storage, including:

  • Scalability—you can scale storage horizontally by increasing the number of storage nodes in your cluster. 
  • Redundancy—you can store multiple copies of data in remote locations for greater availability and durability. Syncing enables you to ensure data is mirrored. 
  • Cost—you can use lower performance or commodity hardware that is linked together to provide the same storage volume as higher cost solutions.
  • Performance—you can ensure that users can access data from nearby locations, reducing latency. You can also enable massively parallel access, splitting data retrieval across resources. 
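One way to picture how a cluster decides which nodes hold which data is the sketch below. It uses rendezvous (highest-random-weight) hashing, a technique chosen for this example and not one the providers discussed here are known to use. The node names and the `place_replicas` function are hypothetical.

```python
import hashlib


def place_replicas(key, nodes, replica_count=2):
    """Pick `replica_count` distinct nodes for `key` using rendezvous hashing."""

    def weight(node):
        # Each (node, key) pair gets a stable pseudo-random score.
        digest = hashlib.sha256(f"{node}:{key}".encode("utf-8")).hexdigest()
        return int(digest, 16)

    # The highest-scoring nodes hold the copies of this key.
    return sorted(nodes, key=weight, reverse=True)[:replica_count]


nodes = ["node-a", "node-b", "node-c"]
before = place_replicas("invoice-42", nodes)
# Scaling out: add a node and recompute the placement for the same key.
after = place_replicas("invoice-42", nodes + ["node-d"])
```

Each key lands on two different nodes, which gives redundancy. When `node-d` joins, a key either keeps its previous nodes or moves one copy to the new node, so scaling out causes only limited data movement.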

Benefits of Cloud Storage to Businesses

Switching to cloud storage can provide cost saving, availability, and data reliability benefits for businesses. Below are the most common benefits that you can expect to receive from cloud storage services. 

Reduced Capital Expenses

Moving to public or hosted private cloud services reduces your need to purchase or maintain hardware. This means you are not responsible for the resources required to keep hardware up to date, to house, cool, or secure hardware, to monitor hardware, or to plan capacity. Since infrastructure is leased, you also take on less technical debt and can shift spending from capital expenses to operating expenses.

Data Tiering for Cost Savings

Cloud storage services often offer multiple hardware or access tiers to choose from. This enables you to tier your data according to access priority and frequency. You can limit high performance storage resources to your most critical data and push lower priority data to lower performance tiers. This enables you to save costs without having to purchase or maintain additional hardware.
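The trade-off behind tiering can be sketched as a cost comparison: cheaper tiers usually charge less to store data but more to retrieve it. The tier names and prices below are hypothetical and were chosen only to illustrate the calculation.

```python
# Hypothetical tiers: (name, storage price per GB-month, retrieval price per GB).
TIERS = [
    ("hot", 0.020, 0.00),
    ("cool", 0.010, 0.01),
    ("archive", 0.002, 0.05),
]


def cheapest_tier(stored_gb, retrieved_gb_per_month):
    """Return (tier name, monthly cost) for the cheapest tier for this access pattern."""
    costs = [
        (name, round(stored_gb * storage + retrieved_gb_per_month * retrieval, 2))
        for name, storage, retrieval in TIERS
    ]
    return min(costs, key=lambda item: item[1])


busy = cheapest_tier(stored_gb=100, retrieved_gb_per_month=500)  # read constantly
idle = cheapest_tier(stored_gb=100, retrieved_gb_per_month=1)    # rarely read
```

Under these assumed prices, heavily read data is cheapest in the "hot" tier, while rarely read data is cheapest in the "archive" tier.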

Data Redundancy and Replication

Cloud storage services typically include automatic data replication and redundancy features. These features distribute your data across servers, data centers, availability zones, or regions to ensure that data remains available. This redundancy helps protect you from hardware failures, natural disasters, and issues related to heavy traffic. 

Mobility

The remote accessibility of cloud storage enables you to work with distributed teams and users easily. You can access data stored in the cloud at any time from anywhere and on almost any device. Additionally, since cloud data access is centralized, it’s easier for IT teams to manage data without limiting accessibility. 

Disaster Recovery

In addition to the built-in data replication that most cloud storage services provide, you can use cloud storage to enable reliable disaster recovery. Since cloud storage is typically remote, you can use it to store failover systems or backups that will remain available even if your on-premises systems go down. 

Cloud Storage Challenges

When considering the benefits of cloud storage services, you should also be sure to consider the challenges you might face implementing these resources. 

Cloud Complexity

On an individual level cloud resources are easy to provision and consume. Most users can manage setting up a Google Drive or Dropbox account. However, configuring storage resources for an organization with multiple users, diverse business goals, and a variety of compliance obligations is much more complex. 

When adopting cloud storage resources, you cannot simply move data and trust your IT team to do the rest. Instead, you need to plan your migration carefully and include staff with cloud experience and expertise.

Data Movement Challenges

Once you plan your migration, you may also run into challenges when moving data. The amount of data you need to migrate, the support for data formats, and the security of the transfer all play a role. 

Additionally, you may need to shut down or enable syncing for some applications and components to ensure no data is lost. If you do not manage data carefully, you may not be able to use it effectively after migration. 
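One common safeguard against losing data during a move is to compare checksums before and after the transfer. The sketch below assumes the source and destination can each be read as a mapping of names to bytes. The function names and sample data are hypothetical.

```python
import hashlib


def sha256_of(data):
    """Return the SHA-256 hex digest of a bytes value."""
    return hashlib.sha256(data).hexdigest()


def verify_migration(source, destination):
    """Return the names of objects that are missing or differ after a copy."""
    problems = []
    for name, data in source.items():
        copied = destination.get(name)
        if copied is None or sha256_of(copied) != sha256_of(data):
            problems.append(name)
    return problems


source = {"a.csv": b"1,2,3", "b.csv": b"4,5,6", "c.csv": b"7,8,9"}
# In this example, b.csv was corrupted in transit and c.csv was never copied.
destination = {"a.csv": b"1,2,3", "b.csv": b"4,5,X"}
problems = verify_migration(source, destination)
```

An empty result means every object arrived intact. Anything listed needs to be transferred again before the source is retired.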

Cloud Security

When you store data in the cloud, you have less control over it than when it is stored on-premises. Several factors contribute to this:

  • Data is more accessible to outsiders due to Internet connectivity.
  • Monitoring is more difficult due to distribution of data.
  • You are reliant on the cloud provider for infrastructure security.

In combination, these factors put your data at greater risk, particularly if you do not understand which aspects of security are your responsibility. Not properly implementing access controls or securing storage resources can provide malicious parties full access to your data and accounts.
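The access-control point can be illustrated with a default-deny check, where a request succeeds only if a rule explicitly grants it. The policy format, user names, and `is_allowed` function below are assumptions for this sketch and do not reflect any provider's policy language.

```python
def is_allowed(policy, user, action, resource):
    """Default-deny check: allow only if some rule explicitly grants the request."""
    for rule in policy:
        if (
            rule["user"] == user
            and action in rule["actions"]
            and resource.startswith(rule["prefix"])
        ):
            return True
    return False  # nothing matched, so the request is refused


policy = [
    {"user": "alice", "actions": {"read", "write"}, "prefix": "finance/"},
    {"user": "bob", "actions": {"read"}, "prefix": "public/"},
]

alice_can_write = is_allowed(policy, "alice", "write", "finance/q3.xlsx")
bob_can_write = is_allowed(policy, "bob", "write", "public/logo.png")
outsider_can_read = is_allowed(policy, "mallory", "read", "finance/q3.xlsx")
```

Starting from "deny" means that a forgotten rule results in a refused request and not in exposed data.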

Cloud Storage vs Cloud Backup

Although both use cloud resources to store data, cloud storage and cloud backup are not interchangeable terms. To clarify these terms, remember the following:

  • Cloud storage—often used to supplement or replace local storage. Cloud storage can be used to store active, infrequently accessed, or archived data. You can also use it to store backups of cloud or on-premises resources. It enables you to provide distributed, remote access to data with centralized management. 
  • Cloud backup—can refer to the process of duplicating data to the cloud for recovery purposes, the actual backups themselves, or the service that is used to store backup data. Cloud backups are used to provide redundancy for data and ensure that a copy remains accessible even if the original is damaged. 

Below we describe backup services offered by the three leading public cloud providers. You can use these services to create and manage backups of your data. 

AWS Backup

AWS Backup is a service that you can use to create backups of EBS, EFS, DynamoDB, and RDS resources. It also includes an integration with AWS Storage Gateway that enables you to create backups of on-premises data. When combined with the native snapshot capabilities that are included in many of AWS’s other services, this service enables you to back up most AWS data.

Through the AWS Backup console, you can manage backups from across your services. This includes determining which storage services you store backups in, who has access to backup data, and how long you retain backups for.
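The retention setting mentioned above boils down to a rule that removes backups older than a chosen window. The following sketch shows that rule in plain Python. It does not call the AWS Backup API, and the backup IDs, dates, and `expired_backups` function are hypothetical.

```python
from datetime import date, timedelta


def expired_backups(backups, today, retention_days):
    """Return the IDs of backups created before the retention window began."""
    cutoff = today - timedelta(days=retention_days)
    return [backup_id for backup_id, created in backups.items() if created < cutoff]


backups = {
    "bk-001": date(2020, 10, 1),
    "bk-002": date(2020, 11, 15),
    "bk-003": date(2020, 11, 30),
}
to_delete = expired_backups(backups, today=date(2020, 12, 1), retention_days=30)
```

With a 30-day window and a current date of December 1, 2020, only the backup from October 1 falls outside the window.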

Azure Backup

Azure Backup is a solution you can use to back up Azure services or on-premises data. It enables you to automate and manage backups and their life cycles. This service also integrates with Recovery Services vaults, storage resources designed specifically for backup data. 

Azure Backup and Recovery Services vaults are part of a collection of services Azure offers for backup creation and management. The other main service is Azure Site Recovery. This service enables you to replicate your data and services to a remote location, which you can then use for disaster recovery or failover. 

Google Cloud Backup

At the time of writing, Google Cloud, unlike AWS and Azure, does not offer a dedicated service for backup creation or management. Instead, it enables you to store backups in lower-tier (i.e., cheaper) storage classes. Your primary storage options include:

  • Nearline Storage—designed for data that is accessed once a month or less. This option is best suited to your most recent backup or partial backups.
  • Coldline Storage—designed for data that is accessed once a quarter or less. This option is best suited to disaster recovery backups or archived backup data. 

These storage classes work well for backups in Google Cloud, but the comparable tiers from other cloud providers are less practical when you need fast restores. The difference is that Google’s storage classes return data with latency measured in milliseconds, while retrieval from archive storage in AWS or Azure can take several hours. 

Cloud Storage vs Cloud Database

Cloud storage enables you to store unstructured data or files, while cloud databases enable you to store structured data. You can store this data in tables, for relational databases, or in other formats like key-value pairs, for NoSQL databases. 
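The difference can be seen in a few lines of Python. In this sketch, a dictionary stands in for a storage service and the standard-library `sqlite3` module stands in for a cloud relational database. The table and sample rows are invented for illustration.

```python
import sqlite3

# Unstructured: the store only knows a name and opaque bytes.
blob_store = {"invoices/2020-12.pdf": b"%PDF-1.7 ..."}

# Structured: the database knows columns and types, so it can filter and aggregate.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
connection.executemany(
    "INSERT INTO invoices (customer, total) VALUES (?, ?)",
    [("Acme", 120.0), ("Acme", 80.0), ("Globex", 50.0)],
)
acme_total = connection.execute(
    "SELECT SUM(total) FROM invoices WHERE customer = ?", ("Acme",)
).fetchone()[0]
connection.close()
```

The storage side can only hand back the PDF as a whole, while the database can answer a question about its contents, here the sum of all invoices for one customer.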

Cloud databases rely on cloud storage. In some cases the storage is abstracted from the user and packaged in the database solution. In other cases, the user has to deploy their own storage and connect the database to that storage.

All three major cloud providers offer database services that are based on their own resources. Some also enable you to host data or workloads in hybrid or on-premises resources. Below you can see a collection of the various database services each provider offers. 

AWS Cloud Database Services

AWS database services include:

  • Amazon RDS—a relational database service that supports six engines: SQL Server, MariaDB, Oracle, PostgreSQL, MySQL, and Amazon Aurora. This is a fully managed service that you can manage through the CLI, the API, or the console. 
  • Amazon Aurora—a proprietary relational database that offers PostgreSQL and MySQL compatibility modes. This database is designed for high performance and integration with AWS services. 
  • Amazon DynamoDB—a document and key-value database designed for low latency, high performance, and durability. This is a fully managed service for production level workloads. 
  • Amazon ElastiCache—an in-memory data store that you can use in place of a traditional database. It is compatible with Redis and Memcached and provides scalability, high performance, and very low latency. 
  • Amazon Neptune—a fully managed graph database that is optimized for high speed querying. It supports RDF and Property Graph data models and SPARQL and Gremlin languages. 
  • Amazon Timestream—a fully managed time series database designed for analytics, DevOps, and IoT workloads. It enables you to stream data and perform time sensitive queries. 
  • Amazon Quantum Ledger Database—a fully managed ledger database that you can use to verify transactions cryptographically. Stored data is transparent and immutable, making it ideal for auditing and financial transactions. 

Azure Cloud Database Services

Azure database services include:

  • Azure Cosmos DB—a fully managed, multi-model database. It enables you to define schema and indexes based on your workloads and applications for maximum flexibility. Cosmos DB includes API support for Table, SQL, MongoDB, Gremlin, Cassandra, Spark, and etcd. 
  • Azure SQL Database—a fully managed database based on the SQL Server engine. It is highly available, scalable, and includes serverless options. You also have the option of bringing existing SQL Server licenses from on-premises. 
  • Azure Database for MySQL—a fully managed database service based on the MySQL Community edition. You can integrate it with Azure Kubernetes Service and Azure App Service. 
  • Azure Database for PostgreSQL—a fully managed database for PostgreSQL Hyperscale. You can use this service in the cloud or on-premises. It is extensible through plugins, including for Azure Data Studio, Timescale DB, Visual Studio Code, and PostGIS.
  • SQL Server on Virtual Machines—a service that enables you to host SQL Server on VMs with hybrid connectivity. You can use this database with Windows or Linux and use it to extend support for SQL Server 2008. 
  • Azure Synapse Analytics—an analytics service that combines big data analytics with data warehousing. It is integrated with SQL and Apache Spark engines and can be integrated with Cosmos DB. 
  • Azure Data Explorer—a data analytics service that you can use to perform real-time analytics on streaming data. You can use it to perform time series analyses and query big data. 
  • Azure Cache for Redis—a fully managed, in-memory data store that supports Redis workloads. It includes features for built-in security, scalability, and reliability. 
  • Azure Database for MariaDB—a fully managed database based on the community version of MariaDB. It supports a variety of open source frameworks and includes features for high availability, scalability, and built-in security. 

Google Cloud Database Services

Google Cloud database services include:

  • Cloud SQL—a fully managed database that provides support for SQL Server, PostgreSQL, and MySQL workloads. It includes features for automated backups, data replication, and failover. 
  • Cloud Spanner—a fully managed relational database that is ACID compliant, globally distributed and supports automatic sharding. It also offers multi-regional availability and transparent synchronous replication. 
  • BigQuery—a scalable, serverless, multi-cloud data warehouse. You can use it to perform predictive and real-time analytics with built-in machine learning. 
  • Cloud Bigtable—a fully managed NoSQL database designed for operational and analytical workloads. It is cluster based and can scale to hundreds of nodes with built-in replication and high availability.
  • Cloud Firestore—a fully managed document database that you can use to store, sync, and query application data. It includes client libraries for offline support and live synchronization and integrates with Firebase. 
  • Firebase Realtime Database—a NoSQL database that provides real-time data synchronization. You can use it to collaborate globally, support serverless applications, and provide offline support.
  • Cloud Memorystore—a fully managed in-memory data store that supports Memcached and Redis. You can use it to migrate caching layers and create application caches. It includes features for automatic failover, monitoring, patching, and high availability. 

Cloud Storage with NetApp CVO

NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure and Google Cloud. Cloud Volumes ONTAP supports up to a capacity of 368TB, and supports various use cases such as file services, databases, DevOps or any other enterprise workload, with a strong set of features including high availability, data protection, storage efficiencies, Kubernetes integration, and more.

See Additional Guides on Key Cloud Storage Topics

NetApp, together with several partner websites, has authored a large repository of content that can help you learn about many aspects of cloud storage. Check out the articles below for objective, concise reviews of key cloud storage topics.

Cloud File Sharing

Authored by NetApp

File shares support some of the most important workloads that enterprise businesses rely on, and the resources of the public cloud have created interesting new possibilities. Every major public cloud provider now offers its own cloud file sharing service, each with its own target workloads and considerations. But not every enterprise will find what they’re looking for in a fully managed, all-cloud service.

Multicloud Storage

Authored by NetApp

Multicloud strategies are becoming more popular as organizations seek to optimize their cloud services and deployments. These strategies can help you prevent vendor lock-in, increase your flexibility, and help you optimize costs. 

This guide explains what multicloud storage is, how it works, what it’s used for, the core requirements for this storage, and how Cloud Volumes ONTAP supports it. 

AWS Database Services

Authored by NetApp

AWS offers a range of database services and support to try to meet all its clients' needs. Many of these services are fully managed to help reduce your IT workload and enable you to store and use data as simply as possible. 

This guide explains what AWS database support is available, what database services are available, and how you can migrate your databases to AWS. 

AWS Migration: Understanding the Process and Solving 5 Key Challenges

Authored by NetApp

Migrating to AWS is a big step for many organizations looking to modernize their operations. However, migration is a complex process and should only be undertaken once you fully understand your options and responsibilities.

This article explains the four phases of AWS migration, how to choose between migration strategies, five common migration challenges, and how to migrate with Cloud Volumes ONTAP.

AWS Snapshots for Amazon EBS

Authored by NetApp

Snapshots are a common method for natively backing up cloud data and services. This method enables you to save point-in-time backups, which can be restored when needed.

This guide explains what types of storage snapshots are available, what AWS snapshots are, and how to use AWS snapshots. 

Azure Backup

Authored by NetApp

Azure provides a wide variety of services to its users to help you manage your cloud data and services reliably. Azure Backup is one such service that can help provide data loss protection and peace of mind.

This guide explains what Azure Backup is and how to use it to back up your Azure data. 

Azure File Storage

Authored by NetApp

Storing file data in Azure is simple with the Azure File Storage service. This service enables you to store files across cloud and on-premises resources, so you can flexibly and securely share data and workflows. 

This guide explains what Azure File Storage is, common use cases for Files, management concepts and components of the service, how data is accessed and the architecture of the service, and some best practices for securing your data.

Azure Files

Authored by NetApp

Azure Files is one of several storage services available to users in Azure. It is a service designed to replicate file shares like those commonly used on premises. With this service, you can smoothly transition your files to the cloud and allow file sharing across your teams. 

This guide explains what Azure Files is, how it complements other storage services, pricing and use cases for Files, and pros and cons you should be aware of. 

Azure Database Services

Authored by NetApp

Nearly every production cloud deployment has one or more databases. These tools provide support for applications, enable workloads, and organize your data meaningfully. Having databases available that support all your needs is essential and Azure offers a range to choose from. 

This guide explains what Azure database workloads are supported, how databases work in Azure, and what services are available.

Azure Cost Management

Authored by NetApp

Cost management is a priority for many organizations in the cloud. While cloud services can be cheaper than on-premises resources, this is not a given. Proactive cost management can help you ensure your costs don’t spiral and your ROI is optimized.

This guide explains what the Azure Cost Management service is, how to use the service, and highlights some additional Azure cost management tools you can use.

Azure High Availability

Authored by NetApp

High availability is one of the major benefits of cloud services. The guarantee that your data will remain accessible is critical to supporting high priority workloads and applications and is the reason many move to the cloud in the first place.

This guide explains what high availability is and how to optimize Azure high availability.

Google Cloud Storage

Authored by NetApp

Google Cloud offers a variety of storage options for you to choose from. These services form the base of many other services in the cloud and understanding what your options are can help you manage your cloud more efficiently.

This guide explains what Google Cloud Storage options exist and their common uses.

Google Cloud Database Services

Authored by NetApp

Google Cloud’s specialty is flexibility and integration of services and this extends to its database services. In Google Cloud you have a wide variety of database deployments, models, and support to choose from. 

This guide explains your options for deploying databases in the cloud, what Google Cloud database services are available, and how to choose the right service for you.

Kubernetes Storage

Authored by NetApp

Software developers and DevOps engineers are packaging applications into lightweight units called containers. Kubernetes helps manage and scale containers across clusters of physical machines. 

In this environment, Kubernetes storage becomes a significant challenge. By default, containers are ephemeral, meaning that any transient data on the container is lost when it shuts down. However, Kubernetes provides several options for persistent storage.

Object Storage

Authored by Cloudian

Object storage is the most common type of storage used in cloud services. You may even be currently using object storage without knowing it. If you’re not, now may be the time to switch.

This guide explains what object storage is and how it can benefit you.

File Upload and Sharing Technologies

Authored by Cloudinary

File uploads are a common method of collecting file data from users and creating interactivity in services. For example, file uploads are used to enable users to edit their own images or submit documents for translation. 

This guide explains what file uploads are, covers the most common types of file upload methods, and explains how you can use Cloudinary to upload files through a variety of languages and frameworks. 
