AWS Data Loss Prevention: Tools and Strategies

Data Loss Prevention (DLP) is a set of tools and protocols your organization can use to protect its data from theft, inadvertent or malicious loss, and unauthorized access or manipulation. Since DLP plays a central role in running operations continuously and in swift recovery from failure, it should be a major objective of any AWS high availability strategy. But first, it's important to understand why data loss occurs in the first place.

In this article, we’ll provide insight into the common causes of data loss, including viruses and malware, power failures, and insider threats.

We’ll also look at AWS data loss prevention approaches, such as encrypting AWS S3 data and monitoring S3 buckets. We’ll shed light on some top AWS data loss prevention tools, like Symantec Data Loss Prevention and McAfee Total Protection for DLP, and explain how NetApp Cloud Volumes ONTAP can help.

What Is DLP?

DLP is a system of tools and protocols used to protect data from loss, theft or unauthorized manipulation. This system works by classifying data and identifying violations of data handling policies set by organizations or by regulations and standards such as HIPAA, GDPR or PCI DSS. DLP tools notify security teams of such violations and initiate protective actions such as encryption or user alerts. Security teams use DLP tools to monitor endpoints and cloud use, filter data streams and provide compliance reports.

The three types of data that DLP strategies are most concerned with are:

  • Personal information: includes Personally Identifiable Information (PII), Protected Health Information (PHI), Payment Card Information (PCI) and other information protected by compliance regulations.
  • Intellectual property: includes source code, research and development information, whitepapers and internal price lists.
  • Corporate data: includes employee information, financial documentation, strategic planning information and information regarding mergers or acquisitions.

Causes of Data Loss

There are many ways an organization can lose or leak data. The most common are:

  • Viruses and malware: malicious code or programs used to breach systems for the purpose of data corruption, theft or ransom. Viruses and malware are often unintentionally downloaded or executed by users but can also be intentionally planted.
  • Power failures: power outages or surges can lead to unsaved data and damaged hard drives. If servers running security tools go down, they can leave otherwise unaffected devices vulnerable.
  • Social engineering: users are manipulated into providing authorization credentials or confidential information to attackers under the guise of a trustworthy front. This tactic is increasingly common as users grow immune to more obvious phishing techniques.
  • Insider threats: compromised employees or contractors who abuse legitimate permissions to access, transfer or manipulate data. This also applies to accidental data loss due to human error, software errors or lack of compliance with security protocols.
  • Back door attacks: out-of-date or unpatched software leaves security gaps that can be easily exploited. IoT devices are particularly vulnerable as they are often harder to manage.

AWS DLP Approaches

There are multiple approaches that can be taken to secure data but all of them require continuous security monitoring and correct set-up to be effective. In order to implement these approaches, you need to understand general security patterns and apply them to your cloud security controls and services.

Encrypting AWS S3 data

With server-side encryption enabled, Amazon S3 automatically encrypts data as it is written to disk and decrypts it when accessed. Amazon offers multiple key-management options:

  • Amazon S3-Managed Keys (SSE-S3): each object is encrypted with a unique key, and that key is itself encrypted with a regularly rotated root key. Encryption is done server-side with 256-bit Advanced Encryption Standard (AES-256).
  • AWS KMS-Managed Keys (SSE-KMS): adds an audit trail of key use on top of the standard managed-key service. Keys can be generated and managed through AWS Key Management Service (KMS) and can also be used for client-side encryption.
  • Customer-Provided Keys (SSE-C): you create your own keys, or use a third-party service to generate them, and supply them to Amazon. With this method, Amazon handles only the server-side encryption itself; key management, access management and any client-side encryption remain your responsibility.
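
As a rough illustration of the first two options, the sketch below builds the default-encryption configuration that S3 accepts for a bucket. The function name and the key ARN are hypothetical placeholders; the commented-out call assumes boto3 is installed and credentials are configured, and is shown only to indicate where the configuration would be used.

```python
from typing import Optional


def default_encryption_config(kms_key_id: Optional[str] = None) -> dict:
    """Build a bucket default-encryption configuration.

    Without a key ID the rule selects SSE-S3 (AES-256); with a key ID
    it selects SSE-KMS using that key.
    """
    if kms_key_id is None:
        default = {"SSEAlgorithm": "AES256"}
    else:
        default = {"SSEAlgorithm": "aws:kms", "KMSMasterKeyID": kms_key_id}
    return {"Rules": [{"ApplyServerSideEncryptionByDefault": default}]}


if __name__ == "__main__":
    # Placeholder key ARN, for illustration only.
    config = default_encryption_config(
        "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
    )
    print(config)
    # Applying it would look roughly like this (bucket name is a placeholder):
    #   import boto3
    #   boto3.client("s3").put_bucket_encryption(
    #       Bucket="example-bucket",
    #       ServerSideEncryptionConfiguration=config,
    #   )
```

SSE-C is not covered here because the key is supplied with each request rather than configured on the bucket.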

Monitoring S3 buckets

The best way to monitor your cloud and your S3 buckets is through a Security Information and Event Management (SIEM) system, which lets you manage alerts and view security information from a centralized dashboard. For example, built-in S3 notifications, set to alert you when buckets or their contents are modified or accessed, can be sent to your SIEM and handled appropriately. Setting notification rules to cover permissions changes and limiting who can modify configuration settings helps ensure that your data stays protected.
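
To make the hand-off to a SIEM more concrete, here is a minimal sketch that flattens an S3 event notification message into simple records. The input follows the general shape of S3 event messages (a `Records` list with `eventName`, `s3.bucket.name`, and so on); the output field names are hypothetical and would need to match whatever schema your SIEM expects.

```python
import json


def flatten_s3_event(message: str) -> list:
    """Convert an S3 event notification (JSON text) into flat records."""
    event = json.loads(message)
    flat = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        flat.append({
            "time": record.get("eventTime"),
            "action": record.get("eventName"),
            "bucket": s3.get("bucket", {}).get("name"),
            "key": s3.get("object", {}).get("key"),
            "principal": record.get("userIdentity", {}).get("principalId"),
            "source_ip": record.get("requestParameters", {}).get("sourceIPAddress"),
        })
    return flat


if __name__ == "__main__":
    # Hand-written sample message, not captured from a real bucket.
    sample = json.dumps({
        "Records": [{
            "eventTime": "2024-01-01T00:00:00.000Z",
            "eventName": "ObjectRemoved:Delete",
            "userIdentity": {"principalId": "EXAMPLE"},
            "requestParameters": {"sourceIPAddress": "203.0.113.7"},
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "reports/q1.csv"},
            },
        }]
    })
    for row in flatten_s3_event(sample):
        print(row)
```

Missing fields come through as `None` rather than raising, so a malformed record does not stop the rest of the batch from being forwarded.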

Protecting AWS S3-based data through policies

You should establish policies to control access and modification rights based on permissions or criteria you set. These can be managed, stand-alone policies attached to users, groups or roles in AWS, or inline policies implemented on a case-by-case basis. Managed policies are generally preferred as they can be more easily adapted and assigned.

Two key types of policies for managing your cloud security are:

  • Identity and Access Management (IAM): IAM allows for flexible authentication by separating the management flow (database administration tasks) from the application flow (application access to data).
  • Access Control Lists (ACLs): ACLs determine who can access specified resources, buckets and objects. By restricting network traffic and the specific rights that traffic is allowed with regard to a resource, ACLs reduce attack vectors and allow finer control over data security.
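
As one possible example of a policy-based control, the sketch below generates a bucket policy document with two deny statements: one rejecting uploads that do not request SSE-KMS, and one rejecting requests made without TLS. The function name and bucket name are hypothetical, and the statements follow commonly documented patterns rather than a policy taken from this article; validate any policy against your own requirements before attaching it.

```python
import json


def protective_bucket_policy(bucket: str) -> dict:
    """Build a bucket policy that denies unencrypted uploads and non-TLS access."""
    objects_arn = f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUploadsWithoutSSEKMS",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": objects_arn,
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [f"arn:aws:s3:::{bucket}", objects_arn],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder name.
    print(json.dumps(protective_bucket_policy("example-bucket"), indent=2))
```

Generating the document in code rather than editing JSON by hand makes it easier to keep one managed policy template and apply it consistently across buckets.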

Data classification

You can classify data to help you determine appropriate security measures and reduce the stumbling blocks to an agile work environment. Classification should go beyond simple public or private labels into levels of data sensitivity, and should be applied to both preventive and detective tools. Machine learning tools like User Behavior Analytics (UBA) enable the automatic detection of suspicious activity based on assigned or learned classifications and can be combined with alert functionalities, according to thresholds you determine.
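
The idea of sensitivity levels can be illustrated with a deliberately simple, rule-based classifier. This is a pattern-matching sketch, not the machine learning approach described above: the level names (`restricted`, `confidential`, `internal`) and the two detection rules are assumptions chosen for illustration.

```python
import re

# Candidate payment card numbers: 13 to 19 digits, optionally separated.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
# US Social Security number layout, e.g. 123-45-6789.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def luhn_valid(number: str) -> bool:
    """Check a candidate card number with the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def classify(text: str) -> str:
    """Return a hypothetical sensitivity level for a piece of text."""
    if any(luhn_valid(m.group()) for m in CARD_RE.finditer(text)):
        return "restricted"
    if SSN_RE.search(text):
        return "confidential"
    return "internal"


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test card number.
    for sample in ("Card: 4111 1111 1111 1111", "SSN 123-45-6789", "Lunch menu"):
        print(sample, "->", classify(sample))
```

The Luhn check reduces false positives from arbitrary long digit strings such as order numbers, which matters when classifications feed alert thresholds.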

Swim-lane isolation

Swim-lane isolation is the grouping of microservices into domains that mirror your business model, for example, differentiating access allowed to payment tools from that allowed to marketing tools. This allows you to create a data-access pattern that ensures only specified APIs are authorized to view or modify data and prevents leakage from one microservice domain through less secure domains. Swim-lane isolation can be achieved by applying a combination of IAM controls and ACLs that differ according to domain.
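
One way to picture swim-lane isolation is as a mapping from business domains to the storage prefixes each domain owns, from which per-domain IAM policies are generated. The domain names, prefixes and function names below are hypothetical; the sketch only shows the shape of the idea, under the assumption that each domain's services run under their own IAM role.

```python
# Hypothetical domains and the S3 key prefixes each one owns.
SWIM_LANES = {
    "payments": ["payments/"],
    "marketing": ["marketing/", "public-assets/"],
}


def lane_policy(bucket: str, domain: str, lanes: dict = SWIM_LANES) -> dict:
    """Build an IAM policy limiting a domain's role to its own prefixes."""
    resources = [f"arn:aws:s3:::{bucket}/{prefix}*" for prefix in lanes[domain]]
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": f"{domain.capitalize()}LaneObjects",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": resources,
        }],
    }


def lanes_overlap(lanes: dict) -> bool:
    """Return True if a prefix in one domain contains a prefix of another."""
    items = [(d, p) for d, prefixes in lanes.items() for p in prefixes]
    return any(
        d1 != d2 and (p1.startswith(p2) or p2.startswith(p1))
        for i, (d1, p1) in enumerate(items)
        for d2, p2 in items[i + 1:]
    )


if __name__ == "__main__":
    # "example-bucket" is a placeholder name.
    print(lane_policy("example-bucket", "payments"))
    print("lanes overlap:", lanes_overlap(SWIM_LANES))
```

Checking for overlapping prefixes before generating policies helps catch a configuration where one domain would silently gain access to another domain's data.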


AWS DLP Tools

In addition to the strategies mentioned above, several third-party DLP tools can be used with AWS to suit your different needs. Below, we look at two well-known products and one lesser-known one. All three can work both on-premises and in the cloud, and all are scalable.

Symantec Data Loss Prevention

Symantec provides an enterprise-oriented solution focused on the use of AI technologies to identify unstructured data, detect data embedded in forms and images such as scanned documents or screenshots, and detect full or partial data matches based on fingerprinting. It includes prepackaged policies (HIPAA, GDPR, etc.) to help ensure regulatory compliance and offers both online and offline functionality. Cloud apps such as Dropbox, Google Suite, Salesforce and Office 365 are supported as well.

McAfee Total Protection for DLP

McAfee specializes in forensic analysis of data loss, monitoring breaches or leaks in the context of security policies and providing feedback useful for the creation of new compliance rules or the modification of existing ones. It operates via a centralized dashboard and uses manual classification in addition to third-party classification to prioritize sensitive data, including contexts such as location or application usage.

Endpoint Protector

Endpoint Protector offers features for managing and monitoring peripheral devices and ports, including automatic USB encryption, as well as for data transfers through email, cloud solutions and applications. It allows manual and automatic scanning of data for purposes of identification, management, and encryption, including an add-on for advanced encryption. Android and iOS devices can also be encrypted, located and managed.

AWS Data Loss Prevention with Cloud Volumes ONTAP

Cloud Volumes ONTAP provides data protection technology which can help prevent data loss. NetApp Snapshot™ technology requires no additional storage and does not impact application performance.

In many failure scenarios, an AWS high availability configuration can be a major factor in preventing data loss. But that doesn’t mean it is the most efficient way to protect your data in terms of either cost or flexibility.

NetApp Cloud Volumes ONTAP provides data protection in the form of instant, cost-effective NetApp Snapshot™ copies. These incremental backups are completely space-efficient thanks to the signature WAFL layout and because of the application of storage efficiencies such as deduplication, compaction, and compression. That means copies are faster to create, so there is even less chance of data ever being lost.
