Concept Deep-Dive · Data Tier

Database Selection for System Design Interviews

How to choose a database in three sentences instead of fifteen. The four questions that determine the answer, the eight categories that cover almost every case, and how to defend the choice when the interviewer pushes back.

By Arslan Ahmad · Last updated May 2026 · Reading time ~25 min

01 · Why This Choice Exposes You

Database selection is the second-most-tested concept in system design interviews after caching. It also exposes more candidates than any other single decision in a 45-minute loop. The reason is that the failure modes are unusually varied: candidates default to what they know, get caught flat-footed when asked to justify the choice, hedge instead of committing, or pick the trendy option without grounding it in the actual access pattern.

The interviewer is not testing whether you know that Postgres is a relational database. They are testing whether you can reason from access pattern to consistency model to scale to a defensible choice in three sentences. Most candidates can name databases. Far fewer can defend one when challenged.

This page covers how to make the choice and how to defend it. It is organized around the candidate's reasoning path, not around the textbook taxonomy of database types. By the end, you should be able to walk into an interview, hear "what database would you use here?", and answer with confidence regardless of the question.

What This Page Is and Isn't

This is a deep-dive on how to choose a database in an interview, not a reference manual on every database that exists. Specific databases (Postgres, DynamoDB, Cassandra, Spanner, MongoDB, Redis) appear here as illustrations of categories. The categories are durable; the named systems will change. If you're looking for a feature-by-feature comparison matrix, you're in the wrong place.

02 · The Four Questions That Determine the Choice

Before you can pick a database, you need answers to four questions. Most candidates skip directly to picking. The act of asking these four questions out loud, even briefly, is the senior signal interviewers are looking for.

Question 01

What's the access pattern?

Are reads dominant or writes? Are queries by primary key, by secondary attribute, by range scan, by full-text search, by similarity, by graph traversal? Are joins common, occasional, or non-existent? Is the workload OLTP (many small operations) or OLAP (few large analytical queries)?

Access pattern is the load-bearing factor. A suboptimal database that still fits the access pattern is recoverable. A database mismatched to the access pattern is not, no matter how good it looks on paper.

What it implies: Heavy joins → relational. Key-only lookups → key-value. Flexible queries on documents → document store. Wide rows with time-ordered data → wide-column. Similarity search → vector. Each access pattern points toward a category.

Question 02

What's the consistency requirement?

Does the application break if a read returns slightly stale data? Or does it require strict serializability? Most product features tolerate eventual consistency just fine. Financial transactions, inventory at checkout, and authentication usually do not.

Be specific about which operations need which guarantee. A typical product might need strong consistency on user account writes but eventual consistency on activity feeds. The same application can use different consistency models for different data.

What it implies: Strong consistency required everywhere → relational or strongly consistent NoSQL (Spanner, FoundationDB). Tunable consistency → Cassandra, DynamoDB. Eventual consistency acceptable on most paths → most NoSQL options open up.

Question 03

What's the scale, today and projected?

Total data size. Reads per second. Writes per second. Geographic distribution. Some databases scale beautifully past a single node (Cassandra, DynamoDB). Some require careful sharding to scale at all (Postgres, MySQL). The ceiling matters because reaching the ceiling means a migration, and database migrations are notoriously expensive.

Don't overshoot. A startup designing for hypothetical billion-user scale on day one is a junior pattern. Match the database to the realistic scale you'll hit in two to three years, with a clear path to scale further if needed.

What it implies: Sub-terabyte and sub-100K QPS → almost any database works. 100K–1M QPS or 10TB+ → narrow to natively-distributed options. 1M+ QPS or geographic distribution → wide-column, distributed key-value, or globally-replicated SQL like Spanner.

Question 04

What's the operational fit?

This is the question most candidates skip and the one staff loops grade for. Is this a managed service or self-hosted? What's the on-call burden? Do we have ecosystem maturity for backups, monitoring, query analysis? Does the team have experience with this database, or are we onboarding a new operational surface?

The technically-best database that nobody on the team knows how to operate at 3am is the wrong choice. Operational maturity beats technical optimality at every senior bar in 2026.

What it implies: Small team, no dedicated DBAs → managed services (RDS, DynamoDB, Cloud Spanner). Mature platform team with database expertise → self-hosted options open up. Existing skill set in Postgres → bias toward Postgres unless the access pattern truly demands otherwise.

Answer these four out loud during the interview, even briefly. "Reads are key-only and dominant, eventual consistency is fine on most paths, scale is around 100K reads per second, and we want managed services. That points toward DynamoDB or a managed Redis." That sentence is the senior move. Skip the four questions and you're guessing.

03 · The Decision Path, Visualized

The four questions narrow the space. The diagram below shows the most common paths from access pattern to a category. It's not exhaustive (no flowchart is) but it covers maybe 80% of interview questions and most production use cases.

From Access Pattern to Database Category

[Decision flowchart for database selection, rendered here as text.]

START · ACCESS PATTERN?

  • Relational (with joins, ACID) → RELATIONAL: Postgres, MySQL, Spanner, CockroachDB
  • Flexible, document-shaped → DOCUMENT: MongoDB, Couchbase, DynamoDB
  • Key-only (PK lookups, low latency) → KEY-VALUE: DynamoDB, Redis, Aerospike
  • Write-heavy, time-ordered (partition + clustering) → WIDE-COLUMN: Cassandra, HBase, ScyllaDB
  • Specialized: relationships are the data → GRAPH (Neo4j, Neptune, TigerGraph) · full-text, faceted ranking → SEARCH (Elasticsearch, OpenSearch) · metrics, append-heavy → TIME-SERIES (InfluxDB, TimescaleDB) · similarity, AI workloads → VECTOR (pgvector, Pinecone, Weaviate)

THEN APPLY: CONSISTENCY · SCALE · OPS FIT. The category narrows the choice to 2–3 systems; the remaining three questions narrow further. Postgres if mature ops + joins matter · Spanner if global + strong consistency · DynamoDB if managed + KV at scale · Cassandra if write-heavy + tunable consistency · pgvector if AI workload already on Postgres · Pinecone if AI-only.

A representative decision path. Start with access pattern, narrow to a category, then apply consistency, scale, and operational fit to choose the specific system. The specialized categories (vector, search, time-series, graph) are called out separately because they're often forgotten in favor of the "default" relational or NoSQL choice.

The diagram is a starting frame, not a recipe. Real decisions involve combinations: a system might be primarily relational with a vector index for AI features, or primarily key-value with a search index layered on top. The polyglot reality is the next section.

04 · The Eight Categories

Each category below describes what the database type is, when to reach for it, the canonical example or two, and what falls apart if you misuse it. These are the categories you should be able to discuss in any interview.

Relational (SQL)

Postgres · MySQL · Spanner · CockroachDB

Strong schema, ACID transactions, joins, mature ecosystem. Postgres is the right default unless you have a specific reason to choose otherwise.

The 2026 development: Postgres has gotten remarkably good at scaling vertically and supports JSON and full-text search natively, plus vector search via the pgvector extension, which collapses several categories into one.

When to reach for it: Joins matter. Schema is stable. Transactions span multiple rows. Operational maturity matters. Default choice unless ruled out.
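The "several categories in one" claim is easy to demonstrate. The sketch below uses Python's built-in sqlite3 as a small stand-in for a relational engine (Postgres syntax differs slightly, e.g. jsonb operators instead of json_extract): one ACID transaction, a join, and a query into a JSON column, all in a single store.

```python
import json
import sqlite3

# In-memory SQLite stands in for Postgres here: same relational ideas
# (joins, transactions, JSON access), much smaller engine.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         user_id INTEGER REFERENCES users(id),
                         attrs TEXT);  -- JSON blob, like Postgres jsonb
""")

with db:  # one ACID transaction: both rows commit or neither does
    db.execute("INSERT INTO users VALUES (1, 'ada')")
    db.execute("INSERT INTO orders VALUES (10, 1, ?)",
               (json.dumps({"sku": "B-42", "qty": 3}),))

# A join plus a predicate on the JSON column -- the "document" access
# pattern served from a relational store.
row = db.execute("""
    SELECT u.name, json_extract(o.attrs, '$.sku')
    FROM orders o JOIN users u ON u.id = o.user_id
    WHERE json_extract(o.attrs, '$.qty') >= 2
""").fetchone()
print(row)  # ('ada', 'B-42')
```

The point is not that SQLite replaces Postgres; it's that "document-shaped data" alone doesn't force you out of the relational category.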

Document

MongoDB · Couchbase · DynamoDB (in document mode)

Schema-flexible, nested documents, queries on document attributes. Good for content-shaped data where schema evolves frequently or each record has different fields.

Modern relational databases (Postgres especially) handle JSON well enough that document-only stores are often unnecessary. Reach for document stores when the access pattern is genuinely document-shaped, not because you want "flexibility."

When to reach for it: Records have variable structure. Queries access nested fields. Schema evolves rapidly without ceremony.

Key-Value

DynamoDB · Redis · Aerospike · FoundationDB

Lookups by primary key. Often microsecond latency. Often natively distributed and horizontally scalable to millions of operations per second.

The hidden complexity is access patterns: the moment you need a query other than "by primary key," key-value stores become awkward. Secondary indexes exist in DynamoDB but they're a tax on writes and consistency.

When to reach for it: Lookups are key-only. Latency must be sub-millisecond. Scale is high. You don't need joins or complex queries.
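A toy illustration of that secondary-index tax, with an in-memory dict standing in for a key-value store (no real client; all names are made up): the primary-key probe is one hash lookup, the secondary-attribute query is a full scan unless you maintain a second structure on every write.

```python
# Toy in-memory key-value store -- an illustration of the access-pattern
# point, not a client for any real system.
store = {f"user#{i}": {"id": i, "plan": "pro" if i % 10 == 0 else "free"}
         for i in range(100_000)}

# Primary-key lookup: one hash probe. This is the access pattern
# key-value stores exist to serve.
assert store["user#42"]["id"] == 42

# "All pro users" has no key to probe: without a secondary index it is a
# full scan of every value -- the query shape that makes KV stores awkward.
pros = [v for v in store.values() if v["plan"] == "pro"]
print(len(pros))  # 10000

# A secondary index fixes the read but taxes every write: each update must
# now maintain both structures, which is roughly what DynamoDB's global
# secondary indexes do for you behind the scenes.
by_plan = {}
for key, value in store.items():
    by_plan.setdefault(value["plan"], []).append(key)
assert len(by_plan["pro"]) == 10_000
```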

Wide-Column

Cassandra · ScyllaDB · HBase · Bigtable

Optimized for high write throughput, time-ordered data, and queries by partition key with range scans on a clustering key. Excellent for time-series, event logs, and analytics-shaped workloads.

The trap is treating it like a general-purpose database. Wide-column requires you to know your access patterns up front and design the schema (partition key + clustering keys) for those patterns. New access patterns may require new tables or migrations.

When to reach for it: Write-heavy. Time-ordered or partition-key access. Scale beyond what relational can comfortably handle.
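The partition-plus-clustering model can be sketched in a few lines. This is an illustrative data-structure model of the idea, not Cassandra's storage engine or API: each partition key owns a sorted run of rows, so "readings for device X in a time window" is a cheap in-partition range scan.

```python
import bisect
from collections import defaultdict

class WideColumnTable:
    """Sketch of the wide-column data model: rows live in a partition
    (one node would own it) and are kept sorted by a clustering key."""

    def __init__(self):
        # partition_key -> sorted list of (clustering_key, value)
        self.partitions = defaultdict(list)

    def insert(self, pkey, ckey, value):
        bisect.insort(self.partitions[pkey], (ckey, value))

    def range_scan(self, pkey, lo, hi):
        # Range scan within one partition, inclusive of lo and hi.
        rows = self.partitions[pkey]
        i = bisect.bisect_left(rows, (lo,))
        j = bisect.bisect_right(rows, (hi, chr(0x10FFFF)))
        return rows[i:j]

events = WideColumnTable()
for ts in (1100, 1300, 1200, 1400):
    events.insert("device-7", ts, f"reading@{ts}")
events.insert("device-8", 1250, "other partition")

# Query by partition key + clustering range -- the one pattern the
# schema was designed for. Any other query shape needs another table.
print(events.range_scan("device-7", 1150, 1350))
# [(1200, 'reading@1200'), (1300, 'reading@1300')]
```

The sketch also shows why the category punishes unplanned access patterns: a query that isn't "by partition key, range on clustering key" has no efficient path through this structure.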

Graph

Neo4j · Amazon Neptune · TigerGraph

The relationships between entities are the data. Optimized for traversal queries (friends of friends, shortest path, cycle detection) that would require expensive recursive joins in a relational database.

The honest reality is that graph databases are over-recommended in interviews. Most "social network" questions can be answered well enough with a relational store and good indexes. Reach for graph databases only when traversal depth is unbounded or when graph algorithms are the core of the product.

When to reach for it: Relationships matter as much as entities. Traversal queries are common and deep. The question genuinely calls for graph algorithms.

Search

Elasticsearch · OpenSearch · Solr · Typesense

Inverted-index based. Optimized for full-text search, faceted queries, ranking, and analytics on document content. Often used as a secondary index in front of a primary store.

The mistake is using a search index as the primary database. Search engines have weaker durability guarantees and consistency models than primary stores. They are caches with rich query semantics, not systems of record.

When to reach for it: Full-text search. Faceted filtering. Analytics on text or document content. Always secondary to a real primary store.
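The inverted index underneath these systems is simple to sketch. A minimal illustration, without the tokenization, ranking, and faceting layers real engines add: term → set of document ids, with queries answered by intersecting posting lists instead of scanning every document.

```python
from collections import defaultdict

docs = {
    1: "postgres is a relational database",
    2: "cassandra is a wide column database",
    3: "redis is an in memory key value store",
}

# Build the inverted index: term -> set of doc ids containing it.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(query):
    """AND-query: intersect the posting list of every query term."""
    postings = [index[term] for term in query.split()]
    return sorted(set.intersection(*postings)) if postings else []

print(search("relational database"))  # [1]
print(search("database"))             # [1, 2]
```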

Time-Series

InfluxDB · TimescaleDB · Prometheus

Optimized for append-heavy workloads keyed by time, with downsampling, retention policies, and time-windowed aggregations as first-class operations. Used for metrics, monitoring, IoT telemetry, and financial tick data.

For most application workloads, a relational or wide-column database with time-ordered indexes is sufficient. Reach for time-series databases when the volume is high and time-windowed queries dominate the access pattern.

When to reach for it: Append-only or append-heavy. Queries are time-windowed. Retention and downsampling are first-class concerns.
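The core query shape, time-windowed aggregation (downsampling), looks like this in plain Python. Time-series databases make rollups like this a continuous, first-class operation rather than an application-side loop; the sample data here is invented.

```python
from collections import defaultdict

# Raw (timestamp_seconds, value) samples, e.g. a sensor or a latency metric.
samples = [(0, 10.0), (15, 14.0), (59, 12.0),   # first minute
           (61, 20.0), (90, 22.0),              # second minute
           (185, 5.0)]                          # fourth minute

def downsample(samples, window_s=60):
    """Roll raw samples up into per-window averages."""
    buckets = defaultdict(list)
    for ts, value in samples:
        # Bucket key is the window's start timestamp.
        buckets[ts // window_s * window_s].append(value)
    return {start: sum(vs) / len(vs) for start, vs in sorted(buckets.items())}

print(downsample(samples))  # {0: 12.0, 60: 21.0, 180: 5.0}
```

Retention is the same idea in reverse: drop raw buckets past some age, keep only the rollups.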

Vector

pgvector · Pinecone · Weaviate · Milvus

Stores high-dimensional embeddings and supports approximate nearest-neighbor search. Increasingly mainstream in 2026 for RAG pipelines, semantic search, and recommendations with embedding-based ranking.

The 2026 development: pgvector is good enough for most use cases that you can keep your existing Postgres and add vector search to it, instead of running a separate vector store. Reach for dedicated vector databases (Pinecone, Weaviate) only when you have AI-specific scale needs that Postgres can't meet.

When to reach for it: AI-adjacent question. Similarity search. Embedding-based ranking. Often paired with a primary store rather than replacing one.
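The operation itself is nearest-neighbor search over embeddings. A brute-force cosine-similarity sketch follows; real systems use approximate indexes (such as pgvector's HNSW) to avoid scanning every vector, and the three-dimensional vectors here are toy stand-ins for real embeddings with hundreds or thousands of dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Toy "embeddings" -- illustrative values only.
corpus = {
    "doc-a": (0.9, 0.1, 0.0),
    "doc-b": (0.0, 1.0, 0.2),
    "doc-c": (0.85, 0.2, 0.05),
}

query = (1.0, 0.1, 0.0)

# Exact nearest neighbors: score every vector, sort. This O(n) scan is
# what ANN indexes replace with an approximate sub-linear search.
ranked = sorted(corpus, key=lambda d: cosine(query, corpus[d]), reverse=True)
print(ranked)  # ['doc-a', 'doc-c', 'doc-b']
```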

The Default Move

If you're not sure, start with Postgres. It handles relational, document (JSON), full-text search, and, with pgvector, vector search competently. It's not the best at any of them but it's good enough at all of them, and the operational maturity is unmatched. Many interview questions can be answered with "Postgres, with these extensions" and a defense of why this is right-sized for the problem.

05 · The Polyglot Reality

Real production systems rarely use one database. They use several, each chosen for a specific access pattern, with the application stitching them together. This is called polyglot persistence, and naming it explicitly during the interview is a senior signal.

Examples of common polyglot patterns:

  • Postgres for primary state, Redis for cache and ephemeral data, Elasticsearch for search. The classic stack for most product applications.
  • DynamoDB for user data, Cassandra for write-heavy event streams, S3 for blob storage, OpenSearch for full-text. A common shape for high-scale consumer products.
  • Postgres with pgvector for AI features, primary product data, and search. The 2026 single-database move when scale doesn't require otherwise.
  • Spanner for globally-consistent financial transactions, BigTable for analytics, Memcache for read cache. Common at companies that need geographic strong consistency.

The interview move is to acknowledge polyglot persistence when the question warrants it, but not to over-engineer with seven databases when one would do. "I'd use Postgres as the primary store and add Redis as a cache; if search becomes a bottleneck I'd add Elasticsearch later" is a strong sentence. It commits to a starting architecture and explicitly defers the polyglot decision until the requirements force it.

The depth probe: "Why not just one database?"

If an interviewer pushes back with "why not just put everything in Postgres?", the strong response acknowledges the tradeoff. "Postgres can handle this, but Elasticsearch will give us 10x better full-text query performance and faceted search natively. The operational cost of running both is real, but the user-facing improvement is worth it." This shows you understand both the temptation to consolidate and the reason to split.

If an interviewer pushes back with "why not split the user data across two stores for scale?", the strong response argues for simplicity. "We could shard, but at this scale Postgres can handle it on a single primary with replicas. Adding sharding before we need it is operational debt. I'd defer that decision until our writes exceed roughly 50K per second."

The interview move is to commit to a starting architecture, name what would force you to expand it, and defer the expansion until the requirements force it.

06 · How to Defend the Choice in Three Sentences

The senior signal is the ability to defend a database choice in three sentences. Average candidates take a paragraph; weak candidates take a paragraph and never commit. The strong candidate's three sentences cover the access pattern, the consistency choice, and the operational fit.

Templates:

  • "I'd use [database] because the access pattern is [pattern], we can tolerate [consistency level], and the operational fit is [managed/self-hosted/team familiar]."
  • "I'd use [database]. Reads are [pattern] dominant; writes need [consistency requirement]; we want [operational property]."

Worked examples:

  • Twitter feed: "I'd use Cassandra for the tweet store. Writes are append-heavy and time-ordered, eventual consistency is fine on the timeline, and Cassandra scales horizontally without sharding ceremony. The team needs ops experience but it's a known tradeoff."
  • Payment processor: "I'd use Postgres for transactions, possibly Spanner if we need multi-region strong consistency. Writes need ACID, queries are mostly key and range, and operational maturity matters more than peak throughput. We can scale through read replicas before we need to shard."
  • RAG service: "I'd use Postgres with pgvector for the embeddings. Writes are moderate, queries are similarity-based with metadata filtering, and we get to keep our existing Postgres ops. If scale demands it later we can move to Pinecone, but we're not there today."

Notice what these examples have in common: a specific database choice, a one-clause justification per dimension, and an explicit "if X happens we'd reconsider" clause. That last part shows you understand the choice isn't permanent and that you have a migration story.

What the weak version sounds like

"Well, you could use Postgres or DynamoDB or Cassandra here. They all have tradeoffs. Postgres has joins but doesn't scale as well, DynamoDB is managed but limits queries, Cassandra is great for writes but eventually consistent. It really depends on what we're optimizing for."

That paragraph said nothing. The candidate listed options instead of choosing one. The interviewer cannot evaluate a non-choice. This is the most common failure mode and it's recoverable: pick something, defend it, and update if challenged.

07 · Common Failure Modes

1. Choosing what you know without justification

"I'd use MongoDB because that's what I've used." This is honest but exposes you. The interviewer is grading whether you can match the database to the problem, not whether you have prior experience with one. Frame your familiarity as a reason to defend the choice, not the choice itself: "I'm most familiar with Postgres, and the access pattern fits relational well, so I'd start there."

2. Choosing the trendy option without justification

"I'd use vector databases because the question involves AI." Reaching for a specialized store because of one feature without considering whether the existing primary store could handle it is a junior pattern. The 2026 honest answer is often "Postgres with pgvector" rather than "Pinecone." Knowing when not to introduce a new database is the senior signal.

3. Listing tradeoffs without committing

The hedge: "X has these advantages, Y has those, it depends on the requirements." If the interviewer wanted a survey, they'd ask for a survey. They asked for a choice. Pick one. Defend it. Update if challenged.

4. Designing for hypothetical scale

Picking Cassandra for a question that has 100 users today and a clear path to maybe 100,000 is over-engineering. Match the database to the realistic scale, not the dream scale. Cost-aware design is now part of the senior rubric explicitly. Expensive infrastructure for hypothetical traffic is graded against, not for.

5. Skipping operational fit

Choosing the technically-best database for the access pattern when nobody on the team can operate it. The interviewer wants to see that you'd consider whether the team can run the system at 3am, not just whether the system fits the workload. Naming this constraint explicitly is what staff loops grade for.

6. Treating consistency as binary

"It needs to be strongly consistent" or "eventual consistency is fine" with no nuance. Real applications need different consistency for different operations. User profile reads can be eventually consistent. User profile writes might need strong consistency. Account balance always needs strong consistency. Saying out loud "I'd use eventual consistency on these paths and strong consistency on these other paths" is a depth signal.

08 · How Database Selection Interacts With Other Concepts

Database selection doesn't live alone. The choice cascades into other architectural decisions.

  • Database selection × Caching. The cache strategy follows the database choice. A read-replica-heavy SQL setup needs different caching than a Cassandra cluster. The two decisions are often co-dependent. See the caching deep-dive.
  • Database selection × Sharding. Some databases shard natively (Cassandra, DynamoDB). Some require manual sharding (Postgres, MySQL). The operational cost of "Postgres at scale" is largely the cost of sharding it later. Choose accordingly.
  • Database selection × Replication. Different databases have different replication models. Postgres has read replicas with replication lag. Cassandra has tunable per-query consistency. Spanner has globally consistent multi-region replication. The replication model often matters more than the database name.
  • Database selection × Consistency. The "what's the consistency requirement?" question we covered in Section 2 is itself a deep concept. CAP and PACELC tradeoffs cluster here, and the database choice is downstream of those tradeoffs.
  • Database selection × Vector search. The 2026 question of "do I add a vector store, or use pgvector inside my existing Postgres?" is itself a database selection question. The answer depends on scale and query latency requirements.
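On the sharding point above: "manual sharding" concretely means the application routes each key to a shard itself. A minimal hash-mod sketch (the shard DSNs are invented for illustration; a real setup would hold a connection pool per shard):

```python
import hashlib

# Hypothetical shard connection strings -- illustration only.
SHARDS = ["postgres://db-shard-0", "postgres://db-shard-1",
          "postgres://db-shard-2", "postgres://db-shard-3"]

def shard_for(user_id: str) -> str:
    """Route a key to a shard with a stable hash, so every app server
    sends the same key to the same shard."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

assert shard_for("user-123") == shard_for("user-123")  # deterministic
print(shard_for("user-123"))

# The operational cost of manual sharding lives in what this sketch omits:
# cross-shard queries, transactions spanning shards, and rebalancing when
# len(SHARDS) changes -- which is why natively-sharded databases feel
# cheaper at high scale.
```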

For more on cross-concept interactions, see the concepts library hub.

09 · Practice Scenarios

Three scenarios. Read the setup. Decide what database you'd choose and how you'd defend it before opening the reveal. The reveal is one defensible answer, not the only correct answer.

Scenario 01

A social product needs to store user profiles and the follow graph at scale.

200 million users. Profile reads dominate (millions per second). Follow graph reads are also high (every timeline assembly hits it). Some users have tens of millions of followers. Profile data is small per user (a few KB) but the follow graph has billions of edges.

How to think about this

Two stores, polyglot. For user profiles: a key-value store (DynamoDB or Redis with persistence) is right because access is by user_id and profiles are small and read-heavy. Eventual consistency is fine for profile reads.

For the follow graph: this is harder. A naïve relational approach hits the celebrity hot-key problem. A wide-column store (Cassandra) handles the scale, but querying "who does user X follow?" needs careful schema design (partition by follower, with followee as clustering key). A graph database (Neo4j, Neptune) is the textbook answer, but operational maturity at this scale is a real concern.

Strong answer: "DynamoDB for profiles, Cassandra for the follow graph with partition by follower_id. The celebrity case is hot-key risk but I'd handle it with a separate hot-celebrity cache layer rather than choosing a different database."

Scenario 02

A fintech startup needs to record financial transactions with strict audit requirements.

Modest scale (thousands of transactions per second). Transactions must be strongly consistent: a balance read after a write must reflect the write. Multi-region disaster recovery required. Heavy read of transaction history for reporting.

How to think about this

This is a clear relational choice. ACID transactions, joins for reporting, mature ecosystem for audit logging. The default is Postgres with read replicas and a robust backup story.

The multi-region requirement is the interesting twist. Postgres can do multi-region with replication, but writes are still primary-bound. If the requirement is genuinely "active-active multi-region with strong consistency," that's where Spanner or CockroachDB earn their cost. If it's "primary in one region, replica in another for DR," Postgres handles it.

Strong answer: "Postgres for transactions and history. If the multi-region requirement is for active-active, I'd switch to Spanner; if it's for disaster recovery only, Postgres with cross-region replicas works and is operationally simpler."

Scenario 03

A SaaS product needs to add a "natural language search" feature over its document corpus.

Existing system uses Postgres for primary state. Search currently uses a basic LIKE-query approach which is slow. Product wants to add semantic search using embeddings, plus keep keyword search working. Scale is modest (millions of documents, low thousands of searches per minute).

How to think about this

The textbook answer is "add Pinecone." The senior answer is "evaluate pgvector first."

At this scale, pgvector inside the existing Postgres instance handles semantic search well. The advantages: no new operational surface, transactions span document state and embeddings (so writes stay consistent), and the existing keyword search can be improved with Postgres full-text capabilities at the same time. Hybrid search (combining keyword and semantic) is straightforward when both live in the same database.

You'd reach for a dedicated vector database (Pinecone, Weaviate) only if the scale was meaningfully higher, or if pgvector's HNSW index couldn't meet latency requirements.

Strong answer: "Add pgvector to the existing Postgres. We get semantic search and keyword search in the same database, no new ops surface, and consistency between document state and embeddings. We'd reconsider if scale grows past what pgvector can handle, but at millions of documents we're not there."
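What "hybrid search is straightforward" means mechanically: fuse a keyword score and a semantic score per document. A sketch with made-up scores; in a real pgvector setup the inputs might be Postgres `ts_rank` and cosine similarity, and production systems often use reciprocal-rank fusion instead of a weighted sum.

```python
def hybrid_rank(keyword_scores, semantic_scores, alpha=0.5):
    """Rank documents by a weighted sum of keyword and semantic scores.
    alpha=1.0 is pure keyword search, alpha=0.0 pure semantic."""
    docs = keyword_scores.keys() | semantic_scores.keys()
    fused = {d: alpha * keyword_scores.get(d, 0.0)
                + (1 - alpha) * semantic_scores.get(d, 0.0)
             for d in docs}
    return sorted(fused, key=fused.get, reverse=True)

# Illustrative scores: doc-1 matches keywords, doc-3 only semantically,
# doc-2 scores on both and should win the fused ranking.
keyword = {"doc-1": 0.9, "doc-2": 0.2}
semantic = {"doc-2": 0.95, "doc-3": 0.8}

print(hybrid_rank(keyword, semantic))  # ['doc-2', 'doc-1', 'doc-3']
```

Keeping both scores in one database means this fusion can happen in a single query; splitting keyword and semantic search across two systems pushes it into application code.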

10 · Database Selection FAQ

What's the right default if I don't know what to pick?

Postgres. It handles relational, document (JSON), full-text search, and vector search competently. The operational maturity is unmatched. The honest senior answer for many product-shaped interview questions is "Postgres, with appropriate extensions and replication." If the question forces you off this default, that itself is interesting information about the access pattern.

SQL or NoSQL?

Wrong question. The right questions are "what's the access pattern?" and "what's the consistency requirement?" Once you've answered those, the SQL/NoSQL distinction often doesn't matter. Postgres with JSONB handles many "NoSQL" workloads. DynamoDB handles many "SQL-shaped" workloads. The category names are 2010s framing; the access pattern is what matters.

Should I always pick a managed service?

For interviews at most companies, yes. Managed services reduce operational burden, which is now graded explicitly at senior level. For staff and above interviews, the answer is "managed when it fits, self-hosted when the workload demands it." Reasonable defaults: managed Postgres (RDS, Cloud SQL), managed DynamoDB or equivalent, managed Redis. Self-hosted only when there's a specific reason.

How do I handle the "but Cassandra is more scalable" pushback?

By defining "scalable." Cassandra scales horizontally without sharding ceremony; Postgres does not. But a single large Postgres instance with read replicas comfortably handles most product workloads: tens of thousands of writes per second, with replicas absorbing far higher read traffic on hot data. The senior move is to right-size: "Postgres handles the realistic scale. If we exceed roughly 50K writes per second on this table, I'd reconsider, but we're not there today."

When should I bring up CAP and PACELC?

When the consistency tradeoff is the load-bearing decision. For most product interviews, mentioning consistency choice (strong vs eventual) is enough. For infrastructure interviews and staff loops, naming PACELC explicitly (the latency tradeoff even when there's no partition) is a depth signal. Don't recite the framework; use it to explain a specific choice.

What about NewSQL options like Spanner, CockroachDB, YugabyteDB?

They earn their place when you need globally-distributed strong consistency and can absorb the cost. For most interview questions they're overkill. Mention them when the interviewer pushes on multi-region consistency requirements. Default to Postgres or a wide-column store unless the question explicitly requires global strong consistency.

How do I handle questions about sharding when picking a database?

Database selection and sharding are separate concepts but they interact. Postgres requires manual sharding at scale. DynamoDB and Cassandra shard natively. The right framing is "I'd start with Postgres unsharded; if we exceed [scale threshold] we'd shard or migrate." This commits to a starting point and acknowledges the future migration path. The dedicated sharding deep-dive covers this in detail.

What's the answer for AI workloads?

The 2026 honest answer for most use cases: pgvector inside your existing Postgres, unless scale or latency demand a dedicated vector store. The reasons: no new operational surface, transactional consistency between document state and embeddings, hybrid search (keyword plus semantic) in one database. Reach for Pinecone, Weaviate, or Milvus when you have AI-specific scale or query patterns that pgvector can't meet. The AI infrastructure deep-dive covers this in more detail.

Continue

Sharding and Partitioning →

The next concept on the recommended learning path. Once data exceeds a single machine, you shard. Choosing the shard key, handling hot keys, and resharding without downtime are the depth-probe areas.

Copyright © 2026 Design Gurus, LLC. All rights reserved.