More about AWS Database
- MongoDB on AWS: Managed Service vs. Self-Managed
- Deployment Options on AWS for Apache Cassandra
- AWS PostgreSQL: Managed or Self-Managed?
- MySQL Database Migration: Amazon EC2-Hosted vs. Amazon RDS
- AWS MySQL: MySQL as a Service vs. Self Managed in the Cloud
- AWS ElastiCache for Redis: How to Use the AWS Redis Service
- AWS Data Analytics: Choosing the Best Option for You
- Database Case Studies with Cloud Volumes ONTAP
- Amazon DocumentDB: Basics and Best Practices
- AWS NoSQL: Choosing the Best Option for You
- AWS Big Data: 6 Options You Should Consider
- Oracle on AWS: Managed Service vs. Managed Storage Options
- AWS Database as a Service: 8 Ways to Manage DBs in AWS
- AWS Oracle RDS: Running Your First Oracle Database on Amazon
- SQL Server in AWS: Two Deployment Options
- DynamoDB Pricing: How to Optimize Usage and Reduce Costs
- AWS Databases: The Power of Purpose-Built Database Engines
- AWS Oracle: How to Lift and Shift Your Oracle DB to Amazon
- Overcome AWS RDS instance Size Limits with Data Tiering to AWS S3
What is AWS PostgreSQL?
PostgreSQL is a high-performance, enterprise-class open-source relational database that supports both SQL (relational) and JSON (non-relational) queries. PostgreSQL is highly reliable and is known for its integrity and accuracy, having been developed over 20 years by a large open-source community.
PostgreSQL is used as a primary data repository for many web, mobile, geospatial, and analytics applications. The most recent major release is PostgreSQL 12.
Amazon Web Services (AWS) provides PostgreSQL as a fully managed AWS database service, using Amazon Relational Database Service (RDS) and Amazon Aurora, an Amazon-developed database engine which is fully compatible with PostgreSQL.
In this post, we’ll review AWS managed PostgreSQL with RDS and Aurora. We’ll also present a second option—running PostgreSQL on an Amazon EC2 compute instance with EBS storage, to achieve the same level of control as in your local data center.
In this article, you will learn:
- Managed PostgreSQL Options on AWS: RDS and Aurora
- PostgreSQL on AWS: Fully Managed or Self Managed?
- PostgreSQL in AWS with NetApp Cloud Volumes ONTAP: The Best of Both Worlds
Managed PostgreSQL Options on AWS: RDS and Aurora
Let’s explore the two main options for deploying PostgreSQL on AWS as a managed service - Amazon Relational Database Service (RDS), and Amazon Aurora.
We’ll also cover Babelfish with Aurora, an innovative service that lets you migrate Microsoft SQL Server databases to PostgreSQL and save on license costs.
AWS RDS PostgreSQL
Amazon RDS lets you deploy fully managed PostgreSQL database instances. It provides automated backups, and lets you easily create database snapshots, with point in time restore. Amazon does not provide access to the physical hosts running the database instances (for example, using SSH).
You can manage your RDS PostgreSQL instances using standard management utilities such as pgAdmin or psql.
Easy Managed Deployments
Amazon RDS for PostgreSQL lets you easily run your PostgreSQL database on Amazon RDS DB Instances. You get to use all of the familiar features of PostgreSQL, and leverage RDS management capabilities. You can easily deploy new databases and migrate your existing applications and tools. RDS provides you with features to simplify and automate the process.
Backup and Recovery
There are several backup and recovery options available when using RDS PostgreSQL, including:
- Automated backups—by default, Amazon RDS is configured to automatically create point in time copies of DB instances, which you can later restore as needed. Automated backups store copies of the database and transaction logs, and you can specify a retention period of up to 35 days.
- DB snapshots—the system lets you create snapshots as backups for each DB instance. RDS stores DB snapshots until you delete them, and lets you copy snapshots across available AWS regions for disaster recovery and migration purposes.
In addition to automated backups and DB snapshots, RDS provides features for replication, including multi-AZ deployments and read replicas.
Once you enable this feature, RDS starts automatically creating and modifying your DB instance, to ensure database availability. The system automatically creates and manages a replica of your database and puts it in “standby” within another availability zone. Any database updates will be then made concurrently on both the main and standby databases. This allows RDS to failover to the updated standby if your main database fails.
The read replicas feature can help you provision read-heavy databases beyond the capacity of a single instance. To do this, you need to create one or more replicas of your source database instance. You can then serve read traffic from multiple data copies, and increase the aggregate read throughput of your application. This works within an AWS region. To ensure consistency, any changes to the instance source are propagated to all associated read replicas.
AWS Aurora PostgreSQL
Amazon Aurora PostgreSQL is an Amazon-developed database engine that is compatible with PostgreSQL and fully complies with ACID requirements (Atomicity, Consistency, Isolation, Durability). It can run as part of Amazon RDS, and is also provided in a serverless model, known as Aurora Serverless.
Aurora PostgreSQL is a fully managed service that lets you operate and easily scale existing and new PostgreSQL databases. It takes care of database management concerns like setup, patching and upgrades, backup and recovery, failover, and high availability. RDS provides automated migration tools to convert existing RDS PostgreSQL databases to Aurora PostgreSQL.
Commercial Database Migrations
Amazon provides you with the capabilities needed to migrate your PostgreSQL database workloads to Aurora. AWS Database Migration Service (DMS), for example, can help you significantly accelerate the migration process. Amazon offers the first six months of DMS for free, when your target database is Aurora.
Backups and Point-in-Time Restore
Aurora provides automatic and continuous backups of point in time copies, which are stored in Amazon S3. While the process is automated, you can define a retention period of up to 35 days and restore your backups at any time. You can choose to restore any second, and up to the last five minutes of the retention period.
Aurora provides machine learning (ML) capabilities that are available for use directly from your database. You can leverage this feature to gain ML-based predictions, which can be queried using SQL. Aurora lets you securely integrate with AWS ML services, which provide ready-made ML algorithms you can add to your application.
Amazon Aurora Serverless provides a layer that automatically manages your infrastructure. Once you enable Aurora Serverless, you do not need to manage database instances. The system automatically starts, shuts, and scales the capacity of your database according to the needs of your application.
Babelfish for Aurora PostgreSQL
Babelfish is a new translation layer for Amazon Aurora PostgreSQL, which allows Aurora to communicate with applications written for Microsoft SQL Server. Babelfish is a built-in Amazon Aurora feature offered at no extra cost.
Migrating from an old SQL Server database is time-consuming and resource intensive. Amazon provides the Database Migration Service (DMS), which can automatically migrate database architecture and data. However, application migration is often an issue, and in many cases, it is necessary to rewrite application code to interact with the database.
Babelfish allows Aurora PostgreSQL to understand T-SQL (Microsoft's SQL dialect) and communicate with applications written for SQL Server. Applications that were originally written for SQL Server 2014 or newer can run in Aurora with almost no code changes. As a result, organizations can migrate from SQL Server to Aurora with low cost and minimal risk.
Learn more about how to migrate your database to PostgreSQL in our articles:
- AWS Cloud Migration Services: Don’t Migrate Alone
- AWS Database Migration Service: Copy-Paste Your Database to Amazon.
Fully Managed vs. Self Managed Deployment of PostgreSQL on AWS
The above fully managed options are ideal for cloud-native or re-architected applications that do not require complete control over the environment. For more control over configuration, environment and tuning, you can opt for a self-managed deployment.
Self managing PostgreSQL in AWS involves running an EC2 instance with a PostgreSQL image. This gives you complete control over database configuration, allowing you to fine tune configuration and performance, just like you would do in your on-premises data center.
How Does Self-Managed PostgreSQL on AWS Work?
You can self-manage PostgreSQL on AWS by using two additional services—Amazon Elastic Block Store (EBS) and Amazon EC2.
To set this up, you need to first install and deploy your PostgreSQL database on an EC2 instance. You can simplify the process of setting up the deployment by using Amazon Machine Images as templates for your EC2 instances:
- You can create your own Amazon Machine Image (AMI) and use it as a template for your PostgreSQL EC2 instances.
- You can leverage ready-made AMIs with PostgreSQL preinstalled and pre-configured. Some AMIs come with high availability built in. Note that AMIs may carry a flat or per-hour charge.
The next step is to attach an EBS volume, which provides pluggable and durable storage for your data. Once you set up your self-managed database, you need to continuously maintain and monitor its operation.
Pros and Cons of Fully Managed Deployment—PostgreSQL on RDS
Pros of managed deployments:
- Elasticity scalability—RDS enables you to easily modify PostgreSQL resources hosted on AWS. To do this, you need to make some changes in the settings. The system will roll out your changes within a few minutes. Additionally, RDS supports storage autoscaling, and can allocate additional storage when space is running out, without any intervention.
- Pay on demand—RDS pricing is based on hourly fees, which require no upfront investment because you pay for the resources you use. You can also increase and decrease allocated resources, on-demand, as needed. This payment model provides the flexibility needed to optimize costs, and quickly spin up and destroy RDS instances for temporary jobs.
- Reserved instances—can help you significantly reduce costs. This option is ideal for predictable workloads that do not require on-demand flexibility.
Cons of managed deployments:
- Version support—RDS for PostgreSQL only supports versions 9.5, 9.6, 10, 11, and 12. If your PostgreSQL database runs on another version, you will need to upgrade it.
- Limit on number of instances—RDS supports only a maximum of 40 database instances and a maximum of 25 security groups for each Amazon region.
- Storage limitation—PostgreSQL database size on RDS is limited to a maximum of 16 TB.
- Less control—because RDS manages many aspects, you lose control over many database configuration options. For example, it is not possible to control where data is located or finetune the database.
- Vendor lock-in—migrating a PostgreSQL database out of RDS can be complex. You will need to tune and test your database before moving it into a new environment.
- Set limitations—it is not possible to change the service level agreements (SLAs) for RDS. Everything defined in the SLA, like the monthly uptime of 99.95%, is set in stone. There is no option to add a higher uptime or change any requirements like the availability, data protection, or disaster recovery.
Pros and Cons of Self-Managed Deployment—PostgreSQL with EC2 and EBS
Pros of self-managed deployments:
- Supports all versions—even if you run on a version not supported by RDS, you can run your database as is.
- No limits on number of instances—run as many databases as necessary by starting additional EC2 instances.
- No storage limitations—you can scale up storage using larger EBS volumes, or third-party storage services.
- Cost-effective—RDS charges an additional fee for the management layer. Self-management deployments costs are significantly lower because you pay only for system resources.
- Flexibility—self-managed deployments enable you to completely control the configurations of your database and its location.
- No vendor lock in—you can set up your database in a way that ensures it can be easily migrated to other cloud environments and on-premises locations.
- Scalability—RDS is limited to 16 TB and allows the use of only 40 database instances within a region. Self-managed deployments provide full control over the amount of compute and storage resources provisioned in EC2.
- Data protection—a self-managed deployment gives you full control over the configuration and schedule of your backups. You can set this up according to your own data and disaster recovery requirements.
- Extensive features—when you set up your own self-managed deployment, you gain control over all features and capabilities. This enables you to quickly upgrade and patch your database at any time.
Cons of self-managed deployments:
- Complexity—self-managing PostgreSQL on AWS requires skills and knowledge in AWS cloud environments and database management.
- Security—you need to know how to properly configure and maintain the architecture, as well as set up proper security measures. Misconfigurations can create vulnerabilities, which might be exploited by threat actors looking to hack into your system.
- Scaling—self-managed deployments require you to set up scaling jobs on your own, and then manage groups and monitor the operation. This takes time and may not be the most efficient route to take for demanding applications.
PostgreSQL in AWS with NetApp Cloud Volumes ONTAP: The Best of Both Worlds
NetApp Cloud Volumes ONTAP helps you get all the benefits of the self-managed deployment option—without the complexity.
NetApp Cloud Volumes ONTAP, the leading enterprise-grade storage management solution, delivers secure, proven storage management services on AWS, Azure and Google Cloud. Cloud Volumes ONTAP supports up to a capacity of 368TB, and supports various use cases such as file services, databases, DevOps, or any other enterprise workload, with a strong set of features including high availability, data protection, storage efficiencies, Kubernetes integration, and more.
Cloud Volumes ONTAP extends your control and provides more capabilities, including a number of automatic features that aren’t available using either the managed service option on RDS or the native AWS storage model.
When using NetApp Cloud Volumes ONTAP you can leverage the following features:
- Point-in-time, instantly-creatable consistent snapshot copies
- Zero-cost clones
- Reduced storage consumption thanks to storage efficiencies like thin provisioning, data compression, deduplication, and automatic tiering of infrequently-used data (can save 70% of storage costs)
Cloud Volume ONTAP allows you to leverage self-managed PostgreSQL in EC2 with enterprise capabilities, achieving the following additional benefits:
- Run any version of PostgreSQL
- No limitations on instances configuration, scalability or storage volume
- No vendor lock-in
- No SLA limitations
- Pay only for actual resources used
- Full control over database configuration
- Full control over disaster recovery and data protection