Write‑through vs write‑back vs write‑around caching: trade‑offs?

Caching is one of the fastest wins in scalable architecture. Yet the moment you move beyond a basic read cache, you face a core choice about how writes flow through the cache and the source of truth. The three classic write policies are write through, write back, and write around. Each changes latency, durability, and freshness in different ways.

  • Write policies define what happens when your application updates data that is also cached.
  • Write through updates the cache and the database in the same path.
  • Write back updates the cache first and delays persistence to the database.
  • Write around skips the cache on writes and sends them only to the database.

These choices decide where the newest value lives, how quickly users see it, and what you risk when nodes fail.

Why It Matters

Your cache write policy directly shapes user experience and reliability.

  • Latency at write time versus latency at read time
  • Risk of losing recent updates during crashes or restarts
  • Cost of extra write traffic and cache churn
  • How long stale values may linger after a write
  • Operational simplicity for teams and ease of debugging

Interviewers love this topic because it links data consistency, performance tuning, and real world failure modes in distributed systems.

How It Works Step by Step

Below is the step by step flow for each policy. Assume a simple key value item and a relational or document database as the source of truth.

Write through

  1. Application writes the new value to the cache.
  2. The cache immediately writes the same value to the database before returning success.
  3. Cache and database stay aligned because the write path includes both.
  4. Reads hit the cache and always see the latest committed value.

Result

  • Predictable consistency and easy mental model
  • Higher write latency because every write waits on the database
  • More total writes since each update touches two layers
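The write-through flow above can be sketched in a few lines of Python. `WriteThroughCache` is a hypothetical stand-in, with a plain dict playing the role of the database; the point is that `write` touches both layers before it returns, so success implies durability.

```python
class WriteThroughCache:
    def __init__(self):
        self.db = {}      # stand-in for the real source of truth
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value   # step 1: update the cache
        self.db[key] = value      # step 2: persist before returning
        # only now does the caller see success; cache and db agree

    def read(self, key):
        if key not in self.cache:              # miss: read through
            self.cache[key] = self.db.get(key)
        return self.cache[key]
```

In a real deployment the two writes would go to a cache cluster and a database over the network, which is exactly why every write pays the latency of both.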

Write back

  1. Application writes the new value to the cache and returns success right away.
  2. The cache marks the entry dirty and persists it to the database later, on flush or eviction.
  3. A write buffer or queue batches changes to reduce database pressure.
  4. If the cache fails before flushing, dirty data can be lost.

Result

  • Very low write latency and reduced database load
  • Risk of data loss on crash unless you add durable logs or replication for the cache layer
  • Reads see the freshest value from the cache even if the database is not yet updated
  • Operational complexity due to background flush and recovery logic
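A minimal write-back sketch under the same assumptions (dict as the database, and a `flush` method standing in for the background flush job). Note that `write` acknowledges before the database sees anything; the `dirty` set is exactly the data at risk if the node crashes before `flush` runs.

```python
class WriteBackCache:
    def __init__(self):
        self.db = {}
        self.cache = {}
        self.dirty = set()   # keys not yet persisted

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)   # ack immediately; db write is deferred

    def flush(self):
        # Batch dirty entries to the database (run by a background
        # job or on eviction). A crash before this loses these writes.
        for key in list(self.dirty):
            self.db[key] = self.cache[key]
        self.dirty.clear()

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.db.get(key)
        return self.cache[key]
```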

Write around

  1. Application writes directly to the database and skips the cache.
  2. If the key was present in the cache, it can be invalidated or allowed to expire.
  3. Later reads may miss the cache and fetch from the database, then repopulate the cache.
  4. Eventually the cache warms with hot keys driven by read traffic.

Result

  • Write path is simple and fast for the cache tier
  • Fewer useless cache fills for write heavy keys that are rarely read
  • Read after write may be stale until refresh, which hurts interactive user flows
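The write-around steps can be sketched the same way. This hypothetical version invalidates the cached copy on write rather than waiting for a TTL, so the cache only warms when a read actually misses.

```python
class WriteAroundCache:
    def __init__(self):
        self.db = {}
        self.cache = {}

    def write(self, key, value):
        self.db[key] = value          # write goes straight to the database
        self.cache.pop(key, None)     # invalidate any stale cached copy

    def read(self, key):
        if key not in self.cache:     # miss: fetch and repopulate
            self.cache[key] = self.db.get(key)
        return self.cache[key]
```

If you rely on expiry instead of explicit invalidation, the window between a write and the TTL firing is your stale window.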

Real World Example

Consider a video platform that shows a view counter and a watch progress marker.

  • View counter is high volume and tolerant to small lag. Write back is attractive because it batches many increments and protects the database from a storm of tiny updates. The cache must use a durable append log or a replicated memory store to guard against loss.
  • Watch progress is personal data where users expect immediate accuracy across devices. Write through fits better because once success is returned, the database is updated and any device reading through the cache will see the latest value.
  • Bulk ingest of metadata from a partner feed that users rarely read in the next minutes can use write around. You push the data into the database and let the cache warm only when users actually browse those videos.

This mix shows that one policy does not rule them all. Many platforms combine them per entity type or even per endpoint.

Common Pitfalls and Trade-offs

  • Write through

    • Extra latency on every write which can become a bottleneck during spikes
    • Double write amplification increases cloud spend
    • If the database is down, the whole write path blocks unless you implement circuit breakers and graceful degradation
  • Write back

    • Risk of data loss on node crash or process restart if dirty entries are only in memory
    • Harder debugging because database and cache can temporarily disagree
    • Requires durable write logs, replication in the cache tier, and careful flush scheduling
  • Write around

    • Read after write may return old data which surprises users
    • Cache hit rate can dip after large write bursts
    • Requires precise invalidation or short TTL to avoid long stale windows

Interview Tip

A common prompt is: "Choose a write policy for shopping cart updates." Start by clarifying requirements. If checkout and cross-device sync require immediate accuracy, prefer write through. To reduce database load during flash sales, add short-lived write back for counters and telemetry, but keep cart items on write through. State the failure plan as well: for write back, mention a durable append log and cache replication to eliminate loss on crash.

Key Takeaways

  • Write through favors correctness and simplicity at the cost of higher write latency

  • Write back gives the best write latency and database offload but needs durability features in the cache tier

  • Write around avoids polluting the cache for write heavy keys but can hurt read after write freshness

  • Mature systems often mix policies by data class and access pattern

  • Always define your stale window, failure recovery plan, and observability before choosing

Comparison Table

Write through

  • Write latency: High, since each write also updates the database
  • Read freshness: Strong, data is consistent after every commit
  • Durability risk: Low, because data is persisted before success is returned
  • Database load: High (double writes)
  • Best fit workloads: Profiles, orders, user settings, monetary data
  • Operational complexity: Low to Medium

Write back

  • Write latency: Very low, as writes return immediately
  • Read freshness: Fresh in cache, database lags until flush
  • Durability risk: High, unless the cache is durable and replicated
  • Database load: Low (batched writes)
  • Best fit workloads: Counters, analytics, activity streams, like buttons
  • Operational complexity: Medium to High

Write around

  • Write latency: Low, skips the cache on writes
  • Read freshness: Potentially stale until the next cache refresh
  • Durability risk: Low, the database ensures persistence
  • Database load: Medium
  • Best fit workloads: Bulk imports, cold data, rarely read records
  • Operational complexity: Low

How to Choose Step by Step

  1. Define correctness needs. Is read after write required for users to trust the flow?

  2. Measure the write to read ratio for the entity. Heavy write keys that are rarely read favor write around.

  3. Decide the maximum stale window you can tolerate in seconds.

  4. Estimate peak write QPS and database headroom. If the database is the bottleneck, write back with durable safeguards can help.

  5. Set an operational budget. If the team wants simpler ops, write through is easier to reason about and debug.
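The five steps above can be compressed into a toy decision function. The inputs and thresholds here are illustrative assumptions, not a formula from the article; real systems weigh these per entity type, not globally.

```python
def suggest_policy(needs_read_after_write, write_to_read_ratio,
                   max_stale_seconds, db_is_bottleneck):
    """Toy heuristic mirroring the five selection steps."""
    if needs_read_after_write:
        # Step 1: users must see their own writes immediately.
        return "write-through"
    if db_is_bottleneck and max_stale_seconds > 0:
        # Step 4: offload the database, with durability safeguards.
        return "write-back (with durable log and replication)"
    if write_to_read_ratio > 1:
        # Step 2: write-heavy, rarely read keys favor write-around.
        return "write-around (short TTL to bound staleness)"
    # Step 5: when in doubt, take the simpler operational path.
    return "write-through"
```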

Design add-ons to make each policy safer

  • Write through

    • Use idempotent handlers to tolerate retries when the database times out
    • Add circuit breakers and a queue fallback if the database is down
  • Write back

    • Persist every write to a local or remote append log before acknowledging success
    • Replicate the cache tier so another node can flush dirty data on failover
    • Expose metrics for dirty set size, flush lag, and error rates
  • Write around

    • Invalidate specific keys on write or keep a short TTL to bound staleness
    • Warm the cache for keys on the critical path using async refresh jobs
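The write-back safeguard above, log first and acknowledge second, can be sketched as follows. The in-memory `log` list is a stand-in for an fsync'd file or replicated log; `recover` shows how a replacement node replays surviving entries so no acknowledged write is lost.

```python
class DurableWriteBackCache:
    def __init__(self):
        self.db = {}
        self.cache = {}
        self.dirty = set()
        self.log = []   # stand-in for a durable append log

    def write(self, key, value):
        self.log.append((key, value))   # durable record first
        self.cache[key] = value
        self.dirty.add(key)             # only now acknowledge

    def flush(self):
        for key in list(self.dirty):
            self.db[key] = self.cache[key]
        self.dirty.clear()
        self.log.clear()                # safe to truncate after flush

    @classmethod
    def recover(cls, surviving_log):
        # On restart, replay the log to rebuild the dirty set.
        node = cls()
        for key, value in surviving_log:
            node.write(key, value)
        return node
```

The dirty-set size and flush lag mentioned above are exactly the metrics to export from such a component.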

FAQs

Q1. Is write back safe for financial systems?

Only if the cache tier provides strong durability. Most financial flows prefer write through so the database becomes the single source of truth at success time.

Q2. When should I use write around over write through?

Choose write around if writes are frequent but reads are rare or delayed. This avoids wasting memory and churn in the cache for keys that users will not read soon.

Q3. How can I avoid data loss with write back?

Use a durable append log for the cache tier, replicate cache nodes, and replay the log on restart. Monitor dirty entry counts and flush delays.

Q4. Why does write through feel slower?

Each write waits on both cache and database. You pay the cost at write time in exchange for instant read freshness and easier debugging.

Q5. Can policies be mixed in one system?

Yes. Many teams use write through for user critical data, write back for counters and telemetry, and write around for bulk ingestion or migration jobs.

Q6. What is write allocate and no write allocate?

Write allocate means a write that misses the cache brings the entry into the cache. No write allocate means it does not. Write around is a common case of no write allocate.

Further Learning

Level up your mastery of caching choices and distributed systems patterns with structured practice. See the playbook in Grokking the System Design Interview.

If you want a deeper tour of traffic shaping, cache tiering, and performance trade offs that show up in real services, enroll in Grokking Scalable Systems for Interviews.

Both courses include practical examples, review questions, and step by step frameworks that turn this topic into a strength for your next system design interview.

TAGS
System Design Interview
System Design Fundamentals
CONTRIBUTOR
Design Gurus Team
Copyright © 2025 Design Gurus, LLC. All rights reserved.