When to choose SQL versus NoSQL databases for a design problem

Question

Design Gurus · Accepted Answer

SQL vs NoSQL is the most frequently discussed database trade-off in system design interviews. SQL databases (PostgreSQL, MySQL, Aurora) store data in structured tables with predefined schemas, enforce ACID transactions, and excel at complex queries with joins. NoSQL databases (DynamoDB, Cassandra, MongoDB, Redis) store data in flexible formats (key-value, document, wide-column, graph), scale horizontally by design, and trade strict consistency for availability and throughput. In every system design interview, your database choice is a scored trade-off discussion—interviewers care less about which you pick and more about whether you can explain why given the requirements.

Key Takeaways

Start every database decision with the workload, not the technology. Ask: What is the access pattern? What is the read-to-write ratio? Is ACID compliance required? How large will the data grow?  
SQL is the right default when you need transactions, complex queries with joins, data integrity, or when the data model has well-defined relationships.  
NoSQL is the right choice when you need horizontal scalability, high write throughput, flexible schemas, or simple key-value access patterns at massive scale.  
Many production systems use both. Netflix uses MySQL for billing (needs ACID) and Cassandra for content metadata (needs write throughput). This is the mature answer in interviews.  
Never say "NoSQL because it scales better" without qualification. Facebook scales MySQL to petabytes. Uber built their payment system on PostgreSQL. The tool matters less than the reasoning.

Why This Question Matters in Interviews

Database selection is the easiest trade-off discussion to score points on because every system design question requires at least one database. Interviewers use this decision to evaluate three things: whether you understand data modeling fundamentals, whether you can match a database to an access pattern, and whether you can articulate trade-offs under pressure.

A weak answer: "I would use NoSQL because it scales."

This earns zero points because it shows no reasoning about the specific workload.

A strong answer: "I would use DynamoDB here because our access pattern is simple key-value lookup by short URL, we need to handle 100,000 reads per second, and horizontal scalability is critical. The trade-off is that we lose the ability to do ad-hoc joins—if we later need analytics across URL mappings, I would add a separate Redshift data warehouse and sync via Kinesis."

The difference is specificity: named technology, stated access pattern, quantified scale, explicit trade-off, and mitigation plan.

The Core Comparison

Dimension	SQL (Relational)	NoSQL (Non-Relational)
Data model	Tables with rows and columns, predefined schema	Key-value, document, wide-column, graph; flexible schema
Relationships	First-class support via foreign keys and joins	No native joins; denormalized or application-level joins
Consistency	Strong (ACID: Atomicity, Consistency, Isolation, Durability)	Typically eventual (BASE: Basically Available, Soft-state, Eventually consistent)
Scalability	Primarily vertical; horizontal requires sharding effort	Horizontal by design; add nodes to scale
Query flexibility	Rich SQL with joins, aggregations, subqueries	Limited to primary key lookups and predefined access patterns
Schema	Fixed; changes require migrations	Flexible; schema evolves with data
Write throughput	Limited by lock contention and schema validation	High; append-only logs and no schema validation overhead
Best for	Transactions, complex relationships, data integrity	High throughput, simple access patterns, massive horizontal scale
Examples	PostgreSQL, MySQL, Aurora, SQL Server, Spanner	DynamoDB, Cassandra, MongoDB, Redis, Bigtable

When to Choose SQL

You Need ACID Transactions

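Systems where data correctness is non-negotiable are the strongest case for SQL. Payment processing, banking, inventory management, and order fulfillment involve operations that must be atomic: either the entire transaction succeeds or the entire transaction rolls back. Some NoSQL stores now offer limited transactions, but relational databases provide them by default across arbitrary rows and tables.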

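If two concurrent requests try to deduct from the same account balance, SQL's isolation guarantees prevent double-spending, provided the transaction runs at a sufficiently strict isolation level or locks the row it updates.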

Interview example: "For the payment service, I need ACID transactions to ensure that debiting the buyer and crediting the seller happen atomically. If either fails, the entire transaction rolls back. PostgreSQL with serializable isolation handles this correctly."
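
The atomic debit-and-credit described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a payment implementation: the accounts table, the transfer function, and the account ids are all assumptions made for the example, and SQLite stands in for PostgreSQL. The connection's context manager commits if the block finishes and rolls back if it raises.

```python
import sqlite3

def transfer(conn, buyer_id, seller_id, amount):
    """Debit the buyer and credit the seller in a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            debited = conn.execute(
                "UPDATE accounts SET balance = balance - ? "
                "WHERE id = ? AND balance >= ?",  # guard against overdrawing
                (amount, buyer_id, amount),
            )
            if debited.rowcount != 1:
                raise ValueError("insufficient funds or unknown buyer")
            credited = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, seller_id),
            )
            if credited.rowcount != 1:
                raise ValueError("unknown seller")
        return True
    except ValueError:
        return False

# Hypothetical data: account 1 is the buyer, account 2 is the seller.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()

ok = transfer(conn, 1, 2, 60)        # succeeds
rejected = transfer(conn, 1, 2, 60)  # buyer has only 40 left, so nothing changes
missing = transfer(conn, 1, 99, 10)  # debit is rolled back: seller 99 does not exist
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

The third call is the interesting one: the debit statement runs, the credit finds no row, and the rollback restores the buyer's balance, so the database never shows a half-finished transfer.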

You Need Complex Queries With Joins

When your data has well-defined relationships—users have orders, orders have line items, line items reference products—SQL's join capability is a natural fit. Running a query like "Find all orders from users in California who purchased product X in the last 30 days" is trivial in SQL and painful in NoSQL (requiring multiple round-trips and application-level joining).

Interview example: "The admin dashboard needs to run ad-hoc analytics queries joining users, orders, and products. SQL handles this natively. With NoSQL, I would need to denormalize the data or maintain a separate analytics store."
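
As a rough sketch of the "orders from users in California who purchased product X in the last 30 days" query, the example below runs it against an in-memory SQLite database. The table layout, column names, sample rows, and the recent_orders helper are assumptions chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT, state TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders   (id INTEGER PRIMARY KEY,
                       user_id INTEGER REFERENCES users(id),
                       product_id INTEGER REFERENCES products(id),
                       ordered_on TEXT);
INSERT INTO users VALUES (1, 'Ana', 'CA'), (2, 'Bo', 'NY'), (3, 'Cy', 'CA');
INSERT INTO products VALUES (10, 'X'), (11, 'Y');
INSERT INTO orders VALUES
    (100, 1, 10, '2024-06-20'),  -- CA, product X, recent: matches
    (101, 2, 10, '2024-06-21'),  -- NY: filtered out by state
    (102, 3, 11, '2024-06-22'),  -- CA, but product Y
    (103, 3, 10, '2024-03-01');  -- CA, product X, but too old
""")

def recent_orders(conn, state, product_name, today, days=30):
    """Orders for one product placed by users in one state within the window."""
    rows = conn.execute(
        """
        SELECT o.id, u.name
        FROM orders o
        JOIN users u    ON u.id = o.user_id
        JOIN products p ON p.id = o.product_id
        WHERE u.state = ?
          AND p.name = ?
          AND o.ordered_on >= date(?, ?)
        ORDER BY o.id
        """,
        (state, product_name, today, f"-{days} days"),
    )
    return rows.fetchall()

result = recent_orders(conn, "CA", "X", today="2024-06-30")
```

In a store without joins, the same question would typically take one lookup for the users, another for their orders, and filtering in application code, or a denormalized table built for exactly this query.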

Your Data Model Has Strong Relationships

E-commerce catalogs, ERP systems, social graphs with complex queries, and financial ledgers all have deeply relational data. SQL schemas enforce referential integrity—you cannot create an order for a non-existent user if the foreign key constraint is in place.
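
A small sketch of that foreign key behaviour, again using SQLite through Python's sqlite3 module. The users and orders tables are hypothetical. Note that SQLite only enforces foreign keys once the foreign_keys pragma is enabled; most other relational databases enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforcement is opt-in
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL REFERENCES users(id))"
)
conn.execute("INSERT INTO users VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders VALUES (100, 1)")  # valid: user 1 exists

try:
    conn.execute("INSERT INTO orders VALUES (101, 42)")  # user 42 does not exist
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The database itself refuses the orphaned order, so the check does not need to be repeated in every service that writes to the table.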

Scale Is Manageable

If your system handles thousands to tens of thousands of requests per second (not millions), SQL on a managed service like Aurora handles it well. Aurora supports up to 128 TB of storage, 15 read replicas, and automatic failover. Most systems never outgrow this.

Real-world validation: Facebook scales MySQL to manage petabytes of data across their infrastructure. Uber built their payment system on PostgreSQL. SQL's scalability ceiling is higher than most candidates assume.

When to Choose NoSQL

You Need Horizontal Scalability at Massive Scale

When your system needs to handle millions of writes per second or store petabytes of data, NoSQL databases like Cassandra and DynamoDB are designed for horizontal scaling. You add nodes to increase capacity without architectural changes.

Interview example: "Our chat system processes billions of messages per day. Cassandra handles this because its LSM-tree storage engine is optimized for write-heavy workloads, and we can add nodes linearly as message volume grows."
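
To make "add nodes to increase capacity" more concrete, here is a simplified sketch of consistent hashing, the general partitioning idea behind ring-based stores such as Cassandra. It is an illustration under stated assumptions, not Cassandra's implementation: the HashRing class, the node names, the 64 virtual nodes per server, and the use of SHA-256 as the hash are all choices made for this example.

```python
import bisect
import hashlib

def token(value: str) -> int:
    """Map a string to a ring position (first 8 bytes of its SHA-256 digest)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")

class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self.tokens = []   # sorted ring positions
        self.owners = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node claims several positions to even out the load.
        for i in range(self.vnodes):
            t = token(f"{node}#{i}")
            self.owners[t] = node
            bisect.insort(self.tokens, t)

    def node_for(self, key):
        # The owner is the first ring position after the key's token, wrapping around.
        idx = bisect.bisect_right(self.tokens, token(key)) % len(self.tokens)
        return self.owners[self.tokens[idx]]

ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"message-{i}" for i in range(2000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")  # scale out by one node
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved to the new node")
```

The point of the design is that adding a node relocates only the keys that now fall in the new node's ranges; keys never shuffle between the existing nodes, which is what lets capacity grow without a full reshuffle of the data.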

Your Access Pattern Is Simple

If every query is a single-key lookup, such as "given user_id, return profile" or "given short_url, return long_url", a key-value store like DynamoDB is often faster, cheaper, and simpler to operate than SQL. You do not need the overhead of schema validation, join support, and query planning.

Interview example: "The URL shortener's access pattern is a direct key-value lookup: given the short code, return the destination URL. DynamoDB gives us single-digit millisecond latency for this pattern with zero capacity planning via on-demand mode."
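
The whole access pattern fits in two operations, which the sketch below shows with an in-memory dictionary standing in for the key-value table. ShortUrlStore and its method names are hypothetical and are not the DynamoDB API; the conditional write only mirrors the idea of refusing to overwrite a short code that is already taken.

```python
class ShortUrlStore:
    """In-memory stand-in for a key-value table keyed by short code."""

    def __init__(self):
        self._items = {}

    def put_if_absent(self, short_code, long_url):
        """Store the mapping unless the code is taken; return True if stored."""
        if short_code in self._items:
            return False
        self._items[short_code] = long_url
        return True

    def get(self, short_code):
        """Single-key read: the only query this workload needs."""
        return self._items.get(short_code)

store = ShortUrlStore()
created = store.put_if_absent("abc123", "https://example.com/a/very/long/path")
duplicate = store.put_if_absent("abc123", "https://example.com/other")
destination = store.get("abc123")
unknown = store.get("zzz999")
```

Nothing here needs a join, a secondary index, or a query planner, which is the argument for a key-value store in this design.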

You Need Flexible or Evolving Schemas

Content management systems, user-generated content platforms, and IoT data pipelines often deal with data whose structure varies per record or evolves frequently. Document databases like MongoDB store each record as a JSON-like document with no fixed schema, which removes the need for up-front schema migrations, although application code must then handle documents of differing shapes.
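
A brief sketch of what a flexible schema looks like from the application's side, using plain Python dictionaries in place of stored documents. The field names ("name", "title", "price", "price_cents", "attributes") and the normalize_product helper are invented for the example. The idea is that records written at different times can have different shapes, and the reader reconciles them.

```python
def normalize_product(doc):
    """Return a uniform view of a product document, whatever shape it was stored in."""
    if "price_cents" in doc:
        price_cents = doc["price_cents"]
    else:
        # Older documents stored a float price in currency units.
        price_cents = round(doc.get("price", 0) * 100)
    return {
        "id": doc["_id"],
        # Older documents used "name"; newer ones use "title".
        "title": doc.get("title") or doc.get("name", "(untitled)"),
        "price_cents": price_cents,
        "attributes": doc.get("attributes", {}),
    }

# Three documents in the same collection, each with a different shape.
products = [
    {"_id": 1, "name": "Desk lamp", "price": 19.99},
    {"_id": 2, "title": "E-book", "price_cents": 500,
     "attributes": {"format": "epub"}},
    {"_id": 3, "title": "T-shirt", "price_cents": 1500,
     "attributes": {"size": "M", "color": "blue"}},
]
normalized = [normalize_product(p) for p in products]
```

New fields can be added without touching existing records, but the compatibility logic that a migration would have handled once now lives in the read path.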

Type	How It Stores Data	Best For	Examples
Key-Value	Simple key → value pairs	Session storage, caching, URL shorteners, user preferences	DynamoDB, Redis, Riak
Document	JSON/BSON documents with nested fields	Content management, user profiles, catalogs with varying attributes	MongoDB, Firestore, CouchDB
Wide-Column	Rows with dynamic columns grouped by column families	Time-series data, IoT telemetry, activity logs, messaging	Cassandra, HBase, Bigtable
Graph	Nodes and edges representing entities and relationships	Social networks, recommendation engines, fraud detection	Neo4j, Amazon Neptune
