Choosing a suitable database for high-volume transactional data

Choosing a database for high-volume transactional data is the most consequential decision in any system design because the database determines your latency floor, throughput ceiling, consistency guarantees, and scaling path—and changing it later requires a painful, risky migration. A transactional database (OLTP) handles thousands to millions of short, concurrent operations per second—inserts, updates, deletes—while maintaining ACID guarantees that ensure every transaction either fully succeeds or fully rolls back. In system design interviews, interviewers do not want you to name a database; they want you to derive the choice from requirements. "I would use PostgreSQL" is a statement. "I would use PostgreSQL because the workload is read-heavy with complex joins, requires ACID transactions for payment integrity, and the data model is relational with strong referential constraints" is engineering reasoning that earns points.

Key Takeaways

  • Start from workload shape, not database brand. Characterize the workload first: read/write ratio, query complexity, consistency requirements, data volume, and latency targets. The workload profile determines the database category; the specific product is secondary.
  • ACID transactions are mandatory for financial data, inventory, bookings, and any system where partial updates cause business harm. Choose PostgreSQL, MySQL, or Aurora for single-region ACID. Choose Cloud Spanner or CockroachDB for globally distributed ACID.
  • DynamoDB is the default for simple key-value and document workloads at massive scale—single-digit millisecond reads, automatic sharding, zero operational overhead. But it lacks joins, complex queries, and strong multi-item transactions.
  • The database is usually the first bottleneck. Application servers scale horizontally with ease. Databases scale through read replicas (reads), connection pooling (connections), caching (repeated queries), and sharding (writes and storage). Each technique adds complexity.
  • In interviews, justify every database choice with three properties: the access pattern it supports, the consistency guarantee it provides, and the scaling path it enables. This three-part justification is the framework interviewers expect.

The Database Selection Framework

Before naming a database, answer five questions about your workload.

1. What is the read/write ratio?

  • Read-heavy (10:1 or higher): B-tree databases (PostgreSQL, MySQL) excel. Add read replicas and caching for scale.
  • Write-heavy (write-dominant): LSM-tree databases (Cassandra, ScyllaDB) convert random writes to sequential I/O for higher throughput.
  • Balanced (around 1:1): PostgreSQL with proper indexing handles balanced workloads well up to moderate scale.

2. What query patterns do you need?

  • Point lookups by primary key: DynamoDB, Redis, Cassandra—optimized for key-value access.
  • Complex joins across multiple tables: PostgreSQL, MySQL—relational databases with query planners that optimize multi-table queries.
  • Full-text search: Elasticsearch—purpose-built for text search with relevance ranking.
  • Time-series queries: TimescaleDB, InfluxDB—optimized for time-ordered data with retention policies.

3. What consistency guarantees are required?

  • Strong consistency (linearizability): PostgreSQL, Spanner, CockroachDB—every read returns the latest write.
  • Eventual consistency acceptable: DynamoDB (default), Cassandra—reads may be stale temporarily.
  • Tunable consistency: DynamoDB (optional strong reads), Cassandra (per-query quorum)—choose consistency versus performance per operation.

4. What is the data volume and growth rate?

  • Under 1 TB: a single PostgreSQL instance handles this comfortably with proper indexing.
  • 1–10 TB: PostgreSQL with partitioning, or DynamoDB for automatic sharding.
  • 10+ TB: sharded PostgreSQL (Citus), DynamoDB, Cassandra, or Spanner for distributed storage.
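A quick back-of-envelope calculation makes the volume question concrete. The numbers below (row size, write rate) are illustrative assumptions, not figures from this article:

```python
# Illustrative capacity estimate — row size and write rate are assumed.
ROW_SIZE_BYTES = 1_000        # ~1 KB per order row, indexes included (assumption)
WRITES_PER_DAY = 500_000      # assumed sustained insert rate

bytes_per_year = ROW_SIZE_BYTES * WRITES_PER_DAY * 365
tb_per_year = bytes_per_year / 1e12

print(f"{tb_per_year:.2f} TB/year")   # ~0.18 TB/year
```

At this assumed rate it takes over five years to cross 1 TB, which keeps a single well-indexed PostgreSQL instance viable for a long time—exactly the kind of estimate interviewers expect before you reach for a distributed store.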

5. What are the latency targets?

  • Sub-millisecond: Redis—the data must live in memory or a cache.
  • Single-digit milliseconds: DynamoDB, Cassandra, Aurora—fast point reads on SSDs.
  • 10–50 ms: PostgreSQL with proper indexing—acceptable for most web applications.
  • Seconds: acceptable only for batch analytics (BigQuery, Redshift), not transactional workloads.

Database Options for High-Volume Transactions

PostgreSQL

Engine: Heap storage with B-tree indexes
Consistency: Strong (ACID; serializable isolation available)
Scaling: Vertical + read replicas; horizontal via the Citus extension
Max practical throughput: ~10,000–50,000 TPS on a single instance (hardware-dependent)

PostgreSQL is the default transactional database for system design interviews. It provides full ACID compliance, complex query support (joins, subqueries, CTEs, window functions), and the most mature extension ecosystem of any open-source database. Aurora PostgreSQL adds managed failover, up to 15 read replicas, and up to 3x the throughput of standard PostgreSQL.

Best for: E-commerce order management, payment systems, user account management, booking systems, CMS platforms—any workload requiring relational integrity with complex queries.

Interview application: "I would use Aurora PostgreSQL for the order service. The data model is relational—orders reference users, products, and addresses through foreign keys. The workload is 70% reads (order history, dashboards) and 30% writes (new orders, status updates). Aurora provides up to 15 read replicas for the read-heavy dashboard queries and automated failover for high availability."

MySQL (InnoDB)

Engine: B-tree (InnoDB storage engine)
Consistency: Strong (ACID)
Scaling: Vertical + read replicas; Vitess for horizontal sharding

MySQL is the second most common relational database in interviews. InnoDB provides ACID transactions, row-level locking, and foreign key constraints. MySQL scales through read replicas and, at massive scale, through Vitess—the sharding middleware that YouTube, Slack, and GitHub use.

Best for: Web applications with well-understood relational schemas. Applications migrating from legacy MySQL deployments.

DynamoDB

Engine: LSM-tree internally (proprietary)
Consistency: Eventually consistent (default); strongly consistent reads available at 2x cost
Scaling: Automatic and unlimited—no manual sharding or capacity planning
Throughput: Millions of requests per second

DynamoDB is the default for key-value and document workloads at scale. It provides single-digit millisecond latency for point reads, automatic sharding, and zero operational overhead. DynamoDB Global Tables replicate data across regions with replication lag typically under one second.

Limitations: No joins. No complex queries beyond primary key and sort key access patterns. Limited secondary index flexibility. Multi-item transactions supported but limited to 100 items per transaction.

Best for: Session stores, user profiles (simple lookups), URL shorteners, shopping carts, gaming leaderboards, IoT device state—any workload dominated by single-item reads and writes with simple access patterns.

Interview application: "I would use DynamoDB for the session store. Each session is a simple key-value record accessed by session_id. DynamoDB provides sub-5ms reads at any scale with no operational overhead. The trade-off versus PostgreSQL: we lose JOIN capability, but sessions do not need joins—they are independent documents."

Cloud Spanner

Engine: Proprietary (split-based with Paxos consensus)
Consistency: Strong (globally consistent using TrueTime synchronized clocks)
Scaling: Horizontal, globally distributed, automatic resharding

Cloud Spanner is one of the very few production databases that provide globally distributed SQL with strong consistency. TrueTime uses GPS receivers and atomic clocks in every Google data center to synchronize time across regions, enabling consistent reads after writes across continents.

Best for: Global financial systems, multi-region inventory management, any system where global SQL consistency is a hard requirement.

Interview application: "For the global payment ledger serving users in North America, Europe, and Asia, I would use Cloud Spanner. It provides strongly consistent reads globally—a transfer initiated in Tokyo is immediately visible in New York. Few other managed databases offer this guarantee. The trade-off is cost: Spanner starts at approximately $0.90 per node-hour versus roughly $0.10 per hour for Aurora."

Cassandra / ScyllaDB

Engine: LSM-tree
Consistency: Tunable (ONE, QUORUM, ALL per query)
Scaling: Horizontal with consistent hashing; no single-leader bottleneck

Cassandra handles extreme write throughput—hundreds of thousands of writes per second—through its LSM-tree engine and leaderless replication. Every node accepts writes, eliminating the single-leader bottleneck of PostgreSQL. ScyllaDB is a C++ reimplementation of Cassandra that delivers substantially higher throughput and lower tail latencies on the same hardware.

Best for: Time-series data (IoT sensor readings, application metrics), event logging, write-heavy workloads where eventual consistency is acceptable.

CockroachDB

Engine: LSM-tree (Pebble storage engine)
Consistency: Strong (serializable isolation)
Scaling: Horizontal with automatic range-based sharding

CockroachDB provides distributed SQL with serializable transactions—similar to Spanner but without requiring Google's TrueTime hardware. It runs on standard cloud VMs, making it more accessible than Spanner for teams not on GCP. CockroachDB uses a Raft-based consensus protocol for replication and automatic range splitting for horizontal scaling.

Best for: Applications requiring distributed SQL with strong consistency across multiple regions without GCP lock-in. Multi-region SaaS applications needing serializable transactions with horizontal scaling.

Database Comparison for High-Volume Transactions

| Database | Engine | Consistency | Joins | Max TPS (single node) | Scaling Model | Best For |
|---|---|---|---|---|---|---|
| PostgreSQL | B-tree | Strong (ACID) | Full SQL | 10K–50K | Vertical + replicas | Complex relational workloads |
| Aurora PostgreSQL | B-tree | Strong (ACID) | Full SQL | 50K–200K | Managed replicas | High-availability relational |
| MySQL (InnoDB) | B-tree | Strong (ACID) | Full SQL | 10K–50K | Vertical + Vitess | Web applications |
| DynamoDB | LSM-tree | Tunable | None | Unlimited (auto) | Auto-sharding | Key-value at massive scale |
| Cloud Spanner | Proprietary | Global strong | SQL | Scales horizontally | Auto-resharding | Global SQL consistency |
| Cassandra | LSM-tree | Tunable | Limited | 100K+ writes | Leaderless horizontal | Write-heavy, time-series |
| CockroachDB | LSM-tree (Pebble) | Strong (serializable) | Full SQL | 10K–50K | Horizontal | Distributed SQL |

Scaling Transactional Databases

Read Scaling: Replicas

Create 1–3 read replicas that receive changes replicated from the primary, and route read traffic to them. Aurora supports up to 15 replicas with sub-100ms replication lag. This multiplies read throughput roughly linearly with replica count.

Trade-off: Replicas may serve slightly stale data during replication lag. Route balance-critical reads (payment confirmation) to the primary. Route dashboard and history reads to replicas.
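The routing rule above can be sketched in a few lines. This is a hypothetical illustration—the endpoint names and query categories are invented, and real systems route at the driver or proxy layer:

```python
import random

# Stand-ins for real connection endpoints (assumed names).
PRIMARY = "primary"
REPLICAS = ["replica-1", "replica-2", "replica-3"]

# Reads that must observe the latest write go to the primary;
# everything else tolerates replication lag (assumed categorization).
READ_YOUR_WRITES = {"payment_confirmation", "account_balance"}

def pick_endpoint(query_kind: str) -> str:
    """Route lag-sensitive reads to the primary, the rest to a random replica."""
    if query_kind in READ_YOUR_WRITES:
        return PRIMARY
    return random.choice(REPLICAS)
```

The key design point is that the split is per-query, not per-service: the same order service sends payment confirmations to the primary and order-history pages to replicas.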

Connection Scaling: Pooling

PgBouncer (PostgreSQL) or ProxySQL (MySQL) maintains a pool of reusable database connections. At 10,000 concurrent API requests, connection pooling serves all requests through 50–100 database connections—preventing connection exhaustion that crashes the database.
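A toy version of the pooling idea shows why it prevents connection exhaustion: requests beyond the pool size queue up instead of opening new connections. This is a minimal sketch, not how PgBouncer is implemented:

```python
import queue

class ConnectionPool:
    """Toy fixed-size pool: all requests share max_size connections
    instead of opening one connection each."""

    def __init__(self, make_conn, max_size: int = 50):
        self._pool = queue.Queue(maxsize=max_size)
        for _ in range(max_size):
            self._pool.put(make_conn())  # open all connections up front

    def acquire(self, timeout: float = 1.0):
        # Blocks (up to timeout) when the pool is empty, so excess
        # concurrency waits in line rather than exhausting the database.
        return self._pool.get(timeout=timeout)

    def release(self, conn) -> None:
        self._pool.put(conn)  # return the connection for reuse
```

With a pool of 50–100, 10,000 concurrent API requests translate into at most 100 database connections—the rest wait a few milliseconds for a free slot.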

Write Scaling: Sharding

When a single primary cannot handle write throughput, shard data across multiple primaries. Hash-based sharding on a unique key (user_id, order_id) distributes writes evenly. Each shard is an independent database handling a fraction of total writes.

Trade-off: Cross-shard queries become expensive (scatter-gather). Cross-shard transactions require two-phase commit or saga pattern. Shard only when vertical scaling and caching are insufficient.
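Hash-based shard selection is a one-liner in spirit: hash the shard key, take it modulo the shard count. A minimal sketch (the shard count and key format are assumptions):

```python
import hashlib

NUM_SHARDS = 8  # assumed; changing it later requires rebalancing data

def shard_for(user_id: str) -> int:
    """Stable hash-based shard choice: the same user always lands on the
    same shard, and keys spread roughly evenly across shards."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS
```

Note the use of a stable hash (MD5 here) rather than Python's built-in `hash()`, which is randomized per process—routing must agree across every application server.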

Query Scaling: Caching

Redis with a 90–95% hit ratio reduces database load by 10–20x. Cache frequently accessed query results with appropriate TTLs. Use cache-aside pattern: check cache first, query database on miss, populate cache on response.
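The cache-aside pattern described above can be sketched as follows. A plain dict stands in for Redis, and `query_database` is a hypothetical placeholder for the real query:

```python
import time

cache: dict = {}          # stand-in for Redis
CACHE_TTL_SECONDS = 60    # assumed TTL

def query_database(user_id: str) -> dict:
    # Placeholder for the expensive database query.
    return {"user_id": user_id, "name": "example"}

def get_user(user_id: str) -> dict:
    """Cache-aside: check the cache first, query the database on a miss,
    then populate the cache with a TTL."""
    entry = cache.get(user_id)
    if entry is not None and entry["expires_at"] > time.time():
        return entry["value"]                       # cache hit
    value = query_database(user_id)                 # cache miss
    cache[user_id] = {"value": value,
                      "expires_at": time.time() + CACHE_TTL_SECONDS}
    return value
```

The TTL is the knob that trades freshness for hit ratio: longer TTLs raise the hit ratio (and the load reduction) but serve staler data.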

For structured practice applying database selection decisions across complete system design problems, Grokking the System Design Interview covers transactional database trade-offs in every design solution. For advanced database patterns including distributed transactions, multi-region replication, and production-scale sharding strategies, Grokking the Advanced System Design Interview builds the depth required for L6+ interviews. The System Design Interview guide provides the end-to-end framework for integrating database selection into every interview phase.

Frequently Asked Questions

How do I choose a database in a system design interview?

Start from workload shape: read/write ratio, query complexity, consistency requirements, data volume, and latency targets. Derive the database category from these properties, then name a specific product. Justify with three properties: access pattern, consistency guarantee, and scaling path.

When should I choose PostgreSQL over DynamoDB?

Choose PostgreSQL when you need complex joins, relational integrity with foreign keys, ACID transactions across multiple tables, or complex analytical queries. Choose DynamoDB when you need simple key-value access at unlimited scale with zero operational overhead and eventual consistency is acceptable.

What is the difference between OLTP and OLAP databases?

OLTP (Online Transaction Processing) handles high volumes of short, concurrent transactions with strong consistency—PostgreSQL, MySQL, DynamoDB. OLAP (Online Analytical Processing) handles complex analytical queries over large datasets—BigQuery, ClickHouse, Redshift. Most systems need both: OLTP for operations, OLAP for analytics.

When should I consider Cloud Spanner?

When you need globally distributed SQL with strong consistency—reads in any region return the latest write from any other region. Typical use cases: global financial systems, multi-region inventory management. Spanner costs significantly more than Aurora (~9x per hour), so use it only when global consistency is a hard requirement.

How do I scale a transactional database?

In order of complexity: add a caching layer (Redis, 90%+ hit ratio), add read replicas (multiply read capacity), implement connection pooling (PgBouncer/ProxySQL), and finally shard (distribute writes). Each step adds complexity. Exhaust simpler options before sharding.

What is connection pooling and why is it critical?

Connection pooling maintains a fixed set of reusable database connections shared across requests. Without pooling, each concurrent request opens a new connection, and at 10,000 requests the database exhausts its connection limit and crashes. PgBouncer and ProxySQL solve this at the infrastructure level.

Should I use NoSQL for all high-throughput workloads?

No. NoSQL databases like DynamoDB excel at simple key-value access at massive scale. But if your workload requires joins, complex queries, or multi-table transactions, relational databases (PostgreSQL, Aurora) are more appropriate. High throughput with relational requirements → Aurora with read replicas and caching.

What is the difference between Aurora and standard PostgreSQL?

Aurora uses a distributed storage engine that provides up to 3x the throughput of standard PostgreSQL, up to 15 read replicas with sub-100ms lag, automated failover typically in under 30 seconds, and continuous backup to S3. Standard PostgreSQL requires manual replica management and has lower throughput limits. Aurora costs more but reduces operational overhead.

When should I shard my transactional database?

When a single primary cannot handle write throughput or storage capacity after exhausting vertical scaling, read replicas, caching, and connection pooling. Most systems do not need sharding until millions of daily transactions. Sharding adds cross-shard query complexity and distributed transaction overhead.

How do I discuss database selection in a system design interview?

During estimation, quantify QPS, storage, and read/write ratio. During database selection, name the database with a three-part justification: "I chose [database] because [access pattern fit], [consistency guarantee], and [scaling path]." During trade-offs, compare your choice against the alternative you rejected and explain why the trade-off favors your selection.

TL;DR

Database selection for high-volume transactions starts from workload shape, not brand preference. Characterize the workload across five dimensions: read/write ratio, query complexity, consistency requirements, data volume, and latency targets. PostgreSQL (Aurora) is the default for complex relational workloads with ACID requirements—handles 10K–50K TPS with read replicas and caching for scale. DynamoDB is the default for simple key-value access at unlimited scale with zero ops overhead—single-digit millisecond reads, automatic sharding. Cloud Spanner provides globally distributed SQL with strong consistency using TrueTime—significantly more expensive, justified only when global consistency is a hard requirement. Cassandra/ScyllaDB handles extreme write throughput (100K+ writes/second) through LSM-tree and leaderless replication—best for time-series and event logging. Scale through a predictable progression: caching (simplest, highest ROI) → read replicas → connection pooling → sharding (most complex, last resort). In interviews, justify every database choice with access pattern, consistency guarantee, and scaling path.

TAGS
System Design Interview
System Design Fundamentals
CONTRIBUTOR
Design Gurus Team