Choosing an AWS database service

Taking the first step

Purpose	Help determine which AWS database or databases are the best fit for your organization.
Last updated	May 13, 2024
Covered services	Amazon Aurora, Amazon DocumentDB, Amazon DynamoDB, Amazon ElastiCache, Amazon Keyspaces, Amazon MemoryDB for Redis, Amazon Neptune, Amazon QLDB, Amazon RDS, Amazon Timestream

Introduction

Amazon Web Services (AWS) offers a growing number of database options (currently more than 15) to support diverse data models. These include relational, key-value, document, in-memory, graph, time-series, wide column, and ledger databases.

Choosing the right database or multiple databases requires you to make a series of decisions based on your organizational needs. This decision guide will help you ask the right questions, provide a clear path for implementation, and help you migrate from your existing database.

Understand

Databases are important backend systems used to store data for any type of app, whether it's a small mobile app or an enterprise app with internet-scale and real-time requirements.

This decision guide is designed to help you understand the range of choices available to you, establish the criteria for making your database choice, and learn the unique properties of each database, so that you can then dive deeper into the capabilities that each offers.

What kinds of apps do people build using AWS databases?

Internet-scale apps: These apps can handle millions of requests per second over hundreds of terabytes of data. They automatically scale vertically and horizontally to accommodate your spiky workloads.
Real-time apps: Real-time apps such as caching, session stores, gaming leaderboards, ride-hailing, ad-targeting, and real-time analytics need microsecond latency and high throughput to support millions of requests per second.
Enterprise apps: Enterprise apps manage core business processes, such as sales, billing, customer service, human resources, and line-of-business processes, such as a reservation system at a hotel chain or a risk-management system at an insurance company. These apps need databases that are fast, scalable, secure, available, and reliable.
Generative AI apps: Your data is the key to moving from generic applications to generative AI applications that create differentiating value for your customers and their business. Often, this differentiating data is stored in operational databases powering your applications.

Note

This guide focuses on databases suitable for online transaction processing (OLTP) applications. If you need to store and analyze massive amounts of data quickly and efficiently (a need typically met by an online analytical processing (OLAP) application), AWS offers Amazon Redshift, a fully managed, cloud-based data warehousing service that is designed to handle large-scale analytics workloads.

There are two high-level categories of AWS OLTP databases—relational and non-relational.

The AWS relational database family includes eight popular engines for Amazon RDS and Amazon Aurora. The Amazon Aurora engines include Amazon Aurora with MySQL compatibility, and Amazon Aurora with PostgreSQL compatibility. The other RDS engines include Db2, MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server. AWS also offers deployment options such as Amazon RDS Custom and Amazon RDS on Outposts.
The non-relational database options are designed for specific data models including key-value, document, caching, in-memory, graph, time series, wide column, and ledger data models.

We explore all of these in detail in the Choose section of this guide.

Database migration

Before deciding which database service you want to use to work with your data, you should spend time thinking about your business objective, database selection, and how you are going to migrate your existing databases.

The best database migration strategy helps you take full advantage of the AWS Cloud. This might involve migrating your applications to use purpose-built cloud databases. You might just want the benefit of using a fully managed version of your existing database, such as RDS for PostgreSQL or RDS for MySQL. Alternatively, you might want to migrate from your commercial licensed databases, such as Oracle or SQL Server, to Amazon Aurora. Consider modernizing your applications and choosing the databases that best suit your applications' workflow requirements.

For example, if you choose to first transition your applications and then transform them, you might decide to re-platform (which makes no changes to the application you use, but lets you take advantage of a fully managed service in the cloud). When you are fully in the AWS Cloud, you can start working to modernize your application. This strategy can help you exit your current on-premises environment quickly, and then focus on modernization.

The following image shows how the AWS Database Migration Service is used to move data to Amazon Aurora. For resources to help with your migration strategy, see the Explore section.

Example: How the AWS Database Migration Service is used to move data to Amazon Aurora

Consider

You're considering hosting a database on AWS. This might be to support a greenfield/pilot project as a first step in your cloud migration journey, or you might want to migrate an existing workload with as little disruption as possible. Or perhaps you would like to port your workload to managed AWS services or even refactor it to be fully cloud-native.

Whatever your goal, considering the right questions will make your database decision easier. Here's a summary of the key criteria to consider.

Business objective

The first major consideration when choosing your database is your business objective. What is the strategic direction driving your organization to change? As suggested in the 7 Rs of AWS, consider whether you want to rehost an existing workload, or refactor to a new platform to shed commercial license commitments.

Migration strategy

You can choose a rehosting strategy to deploy to the cloud faster, with fewer data migration headaches. Install your database engine software on Amazon EC2, migrate your data, and manage your database much as you do on-premises. While rehosting is a fast path to the cloud, you are still left with the operational tasks such as upgrades, patches, backups, capacity planning and management, maintaining performance, and availability targets.

Alternatively, you can choose a re-platform strategy where you migrate your on-premises relational database to a fully managed Amazon RDS instance.

You may consider this an opportunity to refactor your workload to be cloud-native, making use of Amazon Aurora or purpose-built NoSQL databases such as Amazon DynamoDB, Amazon Neptune, or Amazon DocumentDB.

Finally, AWS offers serverless databases, which can scale to an application's demands with a pay-for-use pricing model and built-in high availability. Serverless databases increase your agility and optimize costs. In addition to not needing to provision, patch, or manage servers, many AWS serverless databases provide zero downtime maintenance.

AWS serverless offerings include Amazon Aurora Serverless, Amazon DynamoDB, Amazon ElastiCache, Amazon Keyspaces, Amazon Timestream, and Amazon Neptune Serverless (for graph workloads).
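
The rehost, re-platform, and refactor paths described in this section can be summarized as a small lookup. The following is a minimal sketch for illustration only: the `Strategy` enum, the `MIGRATION_PATHS` dictionary, and the `summarize` helper are hypothetical names, not part of any AWS SDK, and the targets simply restate the text above.

```python
from enum import Enum


class Strategy(Enum):
    REHOST = "rehost"
    REPLATFORM = "re-platform"
    REFACTOR = "refactor"


# Targets restate this section of the guide. The field names are
# illustrative assumptions, not an AWS data structure.
MIGRATION_PATHS = {
    Strategy.REHOST: {
        "target": "database engine installed on Amazon EC2",
        "self_managed_operations": True,
    },
    Strategy.REPLATFORM: {
        "target": "fully managed Amazon RDS instance",
        "self_managed_operations": False,
    },
    Strategy.REFACTOR: {
        "target": "Amazon Aurora or a purpose-built NoSQL database",
        "self_managed_operations": False,
    },
}


def summarize(strategy: Strategy) -> str:
    """Describe where a strategy lands and who handles routine operations."""
    path = MIGRATION_PATHS[strategy]
    owner = "you" if path["self_managed_operations"] else "AWS"
    return f"{strategy.value}: {path['target']} (routine operations handled by {owner})"


if __name__ == "__main__":
    for strategy in Strategy:
        print(summarize(strategy))
```

The point of the sketch is the trade-off it makes explicit: only the rehost path leaves upgrades, patches, and backups with your team.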

Data considerations

The core of any database choice includes the characteristics of the data that you need to store, retrieve, analyze, and work with. This includes your data model (is it relational, structured or semi-structured, using a highly connected dataset, or time-series?), data access (how do you need to access your data?), the extent to which you need real-time data, and whether there is a particular data record size you have in mind.

Operational considerations

Your primary operational considerations are all about where your data is going to live and how it will be managed. The two key choices you need to make are:

Whether it will be self-hosted or fully managed: The core question here is where your team is going to provide the most value to the business. If your database is self-hosted, you will be responsible for the day-to-day maintenance, monitoring, and patching of the database. Choosing a fully managed AWS database simplifies your work by removing undifferentiated database management tasks, allowing your team to focus on delivering value through schema design, query construction, and query optimization, and on supporting the development of applications that align with your business objectives.
Whether you need a serverless or provisioned database: DynamoDB, Amazon Keyspaces, Timestream, ElastiCache, Neptune, and Aurora provide models for how to think about this choice. Amazon Aurora Serverless v2, for example, is suitable for demanding, highly variable workloads, such as database usage that is heavy for a short period of time, followed by long periods of light activity or no activity at all.
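
One rough way to check whether a workload matches the "short heavy bursts, long quiet periods" pattern described above is to look at a utilization history. The following is a minimal sketch under stated assumptions: the function name `looks_spiky` and all three thresholds are hypothetical and chosen only for illustration. They are not AWS guidance, and the load samples are assumed to be non-negative numbers in any consistent unit.

```python
from statistics import mean


def looks_spiky(hourly_load, idle_threshold=0.05,
                min_idle_fraction=0.5, min_peak_to_mean=4.0):
    """Return True when load is mostly idle with short, tall peaks.

    All thresholds are illustrative assumptions, not AWS recommendations.
    """
    if not hourly_load:
        raise ValueError("hourly_load must not be empty")
    peak = max(hourly_load)
    if peak == 0:
        # No activity at all: nothing justifies always-on capacity.
        return True
    average = mean(hourly_load)
    # Count samples at or below a small fraction of the peak as idle.
    idle_hours = sum(1 for load in hourly_load if load <= idle_threshold * peak)
    idle_fraction = idle_hours / len(hourly_load)
    return idle_fraction >= min_idle_fraction and peak / average >= min_peak_to_mean


# Hypothetical samples: a nightly batch job versus a steady workload.
batch_job = [0] * 22 + [100, 90]
steady = [50] * 24
```

A result of `True` suggests only that a serverless option is worth evaluating. Cost and latency requirements still need to be compared for your own workload.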

Resiliency considerations

Database resiliency is key for any business. Achieving it means paying attention to a number of key factors, including capabilities for backup and restore, replication, failover, and point-in-time recovery (PITR).

Performance considerations

Consider whether your database will need to support a high number of concurrent transactions (10,000 or more) and whether it needs to be deployed in multiple geographic regions.

If your workload requires extremely high read performance with a response time measured in microseconds rather than single-digit milliseconds, you might want to consider using in-memory caching solutions such as Amazon ElastiCache alongside your database, or a database that supports in-memory data access such as MemoryDB.
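
Placing a cache alongside a database is commonly done with a cache-aside (lazy loading) pattern: read from the cache first, and fall back to the database only on a miss. The following is a minimal sketch of that pattern, not an ElastiCache client. An in-process dictionary stands in for the cache, and the names `CacheAside`, `slow_lookup`, and `ttl_seconds` are hypothetical.

```python
import time


class CacheAside:
    """Cache-aside reads with a time-to-live, using a dict as a stand-in cache."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # called only on a cache miss
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: read from the database and repopulate.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key, for example after the underlying row is updated."""
        self._entries.pop(key, None)


# Hypothetical stand-in for a database query.
db_calls = []


def slow_lookup(key):
    db_calls.append(key)
    return {"id": key, "name": f"product-{key}"}


cache = CacheAside(slow_lookup, ttl_seconds=30)
first = cache.get(42)    # miss: goes to the "database"
second = cache.get(42)   # hit: served from the cache
```

In a real deployment the dictionary would be replaced by calls to a cache service, and the time-to-live would be tuned to how stale a read your application can tolerate.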

Security considerations

Security is a shared responsibility between AWS and you. The AWS shared responsibility model describes this as security of the cloud managed by AWS, and security in the cloud managed by the customer. Specific security considerations include data protection at all levels of your data, authentication, compliance, data security, storage of sensitive data and support for auditing requirements.

Choose

Now that you know the criteria by which you are evaluating your database options, you are ready to choose which AWS database services might be a good fit for your organizational requirements.

This table highlights the type of data each database is optimized to handle. Use it to help determine the database that is the best fit for your use case.

Database families	When would you use it?	What is it optimized for?	Related database engines or services
Relational	Use when you're migrating or modernizing an on-premises relational workload or if your workload has less predictable query patterns.	Optimized for structured data stored in tables, rows, and columns. They support complex queries through joins.	Amazon Aurora, Amazon RDS
Key-value	Use for workloads such as session stores or shopping carts. Key-value databases can scale to large amounts of data and extremely high throughput of requests, while servicing millions of simultaneous users through distributed processing and storage.	Optimized for consistent single-digit millisecond performance at any scale (meaning any number of writes and reads).	Amazon DynamoDB
In-memory	Use Amazon ElastiCache when you need a caching layer to improve read performance. Use Amazon MemoryDB for Redis when you need full data persistence, but still need sub-millisecond read latencies.	ElastiCache is optimized to support microsecond reads and sub-millisecond writes. MemoryDB supports microsecond reads and single-digit millisecond writes. ElastiCache is an ephemeral cache while MemoryDB is an in-memory database.	Amazon ElastiCache, Amazon MemoryDB for Redis
Document	Use when you want to store JSON-like documents with rich querying abilities across the fields of the documents.	Optimized for storing semi-structured data as documents with multi-layered attributes.	Amazon DocumentDB (with MongoDB compatibility)
Wide column	Use when you need to migrate your on-premises Cassandra workloads, or when you need to process data at high speeds for applications that require single-digit-millisecond latency.	Optimized for workloads that require heavy reads/writes and high throughput coupled with low latency and linear scalability.	Amazon Keyspaces (for Apache Cassandra)
Graph	Use when you have to model complex networks of objects, such as social networks, fraud detection and recommendation engine use cases.	Optimized for traversing and evaluating large numbers of relationships, and identifying patterns with minimal latency.	Amazon Neptune
Time series	Use when you have a large amount of time series data, potentially from a number of sources, such as Internet of Things (IoT) data, application metrics, and asset tracking.	Optimized for storing and querying data that is associated with timestamps and trend lines.	Amazon Timestream
Ledger	Use when your organization has to communicate with other entities (businesses, customers) and you need a way to verify and trust each other, or when you not only need to retrieve the current state of data, but need to prove how data mutated into the current state.	Optimized for maintaining a complete, immutable, and verifiable history of database changes.	Amazon Quantum Ledger Database (Amazon QLDB)
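
The last column of the table above can be expressed as a simple lookup from database family to related services. The following is a minimal sketch: the service names are copied from the table, while the `DATABASE_FAMILIES` dictionary and the `services_for` helper are hypothetical names introduced only for illustration.

```python
# Family -> related services, copied from the table above.
DATABASE_FAMILIES = {
    "relational": ["Amazon Aurora", "Amazon RDS"],
    "key-value": ["Amazon DynamoDB"],
    "in-memory": ["Amazon ElastiCache", "Amazon MemoryDB for Redis"],
    "document": ["Amazon DocumentDB (with MongoDB compatibility)"],
    "wide column": ["Amazon Keyspaces (for Apache Cassandra)"],
    "graph": ["Amazon Neptune"],
    "time series": ["Amazon Timestream"],
    "ledger": ["Amazon Quantum Ledger Database (Amazon QLDB)"],
}


def _normalize(name: str) -> str:
    # Treat "Time-Series", "time_series", and "time series" as the same key.
    return " ".join(name.lower().replace("-", " ").replace("_", " ").split())


_BY_NORMALIZED_NAME = {_normalize(k): v for k, v in DATABASE_FAMILIES.items()}


def services_for(family: str) -> list[str]:
    """Return the services the table lists for a database family."""
    try:
        return list(_BY_NORMALIZED_NAME[_normalize(family)])
    except KeyError:
        raise ValueError(f"Unknown database family: {family!r}") from None


if __name__ == "__main__":
    print(services_for("Time-Series"))
```

A lookup like this captures only the starting point. The "When would you use it?" and "What is it optimized for?" columns still drive the actual choice.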

Use

Just as no single database can satisfy all possible use cases effectively at the same time, any particular database type discussed above may not satisfy all your requirements perfectly.

Consider your needs and workload requirements carefully, and prioritize them based on the considerations covered above: the requirements you must meet to the highest standard, the ones you might have some flexibility on, and the ones you can do without. This can form a system of values that helps you make effective trade-offs and leads to the best possible outcome for your unique circumstances.

Also consider that you can usually cover your application requirements with a mix of best-fit databases. Building a solution with multiple database types allows you to lean on each for the strengths it provides.

For example, in an e-commerce use case, you may use DocumentDB for product catalogs and user profiles, leaning on the flexibility provided by semi-structured data, or DynamoDB when you also need low, predictable latency while your users are browsing your product catalog. You may also use Aurora for inventory and order processing, where a relational data model and transaction support may be more valuable to you.

To help you learn more about each of the available AWS database services, we have provided a pathway to explore how each of the services works. The following section provides links to in-depth documentation, hands-on tutorials, and resources to get you started.

Amazon Aurora

Getting started with Amazon Aurora

This guide includes tutorials and covers more advanced Aurora concepts and procedures, such as the different kinds of endpoints and how to scale Aurora clusters up and down.