How do you implement right‑to‑be‑forgotten across replicas and caches?
Right to be forgotten is the requirement that when a user asks to delete their account or personal data, your system must remove it from every place it lives. That includes primary databases, read replicas, in-memory caches, CDNs, search indexes and sometimes logs and backups. In a modern distributed system with many services and regions, this is a non-trivial system design problem and a favourite topic in senior system design interviews.
The key idea is simple. You are not allowed to keep data that can identify the user once they ask you to delete it. The hard part is making sure that deletion is complete, consistent, observable and fast enough, without breaking availability or creating privacy regressions.
Why it Matters
Right to be forgotten is not just a product feature. It is a regulatory and trust requirement. Regulations like GDPR and similar privacy laws give users the right to request deletion of their personal data. If your deletion logic only touches the main database but leaves stale profiles in caches or replicas, you are effectively non-compliant.
From a system design interview perspective, this topic is a powerful way for interviewers to test whether you truly understand:
- End to end data flows across microservices
- Replication and eventual consistency in distributed systems
- Cache invalidation strategies in a scalable architecture
- The difference between logical delete, hard delete and data retention policies
Candidates who treat deletion as a single SQL delete statement miss the real complexity. Strong candidates talk about identifiers, tombstones, event driven deletion pipelines, replay safety and observability of deletion.
How it Works Step by Step
A practical design for right to be forgotten across replicas and caches usually follows these steps.
Step 1. Model user identity and data surfaces
Start by listing every place where user related data can live. For example
- Primary relational database or document store
- Read replicas in multiple regions
- Caches such as Redis, memcached, application level caches
- CDN caches for avatars or profile pages
- Search indexes such as Elasticsearch or OpenSearch
- Analytics stores and data lake
- Queues and streams that may still hold events
Introduce a stable internal user identifier and ensure that every data surface uses this identifier. Without a consistent id, you cannot reliably delete across systems.
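To make the inventory concrete, here is a minimal sketch in Python; every surface name and handler in it is hypothetical, but the point is that each surface is addressed through the same stable internal user id.

```python
from typing import Callable, Dict

# Hypothetical deletion handlers, one per data surface. Each one knows how
# to remove a user's data from exactly one place, addressed by the stable
# internal user id.
def delete_from_primary_db(user_id: str) -> None: ...
def delete_from_search_index(user_id: str) -> None: ...
def evict_from_caches(user_id: str) -> None: ...
def purge_from_cdn(user_id: str) -> None: ...

# The data surface inventory: every place personal data can live, mapped to
# the handler responsible for deleting it. The deletion workflow iterates
# over this registry so nothing is missed.
DATA_SURFACES: Dict[str, Callable[[str], None]] = {
    "primary_db": delete_from_primary_db,
    "search_index": delete_from_search_index,
    "caches": evict_from_caches,
    "cdn": purge_from_cdn,
}
```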
Step 2. Introduce a privacy controller service
Create a dedicated privacy or data protection service that owns the deletion workflow. It should expose an API like
POST /users/{userId}/forget
This service receives the request, authenticates and authorises it, records an audit entry, and orchestrates deletion across all stores. Treat this as a long running workflow that may take seconds or minutes to fully complete.
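A minimal sketch of such an endpoint, assuming FastAPI and purely illustrative helper functions for authorisation, auditing and orchestration:

```python
from uuid import uuid4
from fastapi import BackgroundTasks, FastAPI, HTTPException

app = FastAPI()

# Hypothetical helpers; in a real system these would call an auth service,
# a durable request store and the deletion orchestrator.
def is_authorised(user_id: str) -> bool:
    return True

def record_deletion_request(request_id: str, user_id: str) -> None:
    print(f"audit: deletion request {request_id} for user {user_id}")

def run_deletion_workflow(request_id: str, user_id: str) -> None:
    print(f"orchestrating deletion {request_id} for user {user_id}")

@app.post("/users/{user_id}/forget", status_code=202)
def forget_user(user_id: str, background_tasks: BackgroundTasks) -> dict:
    # Authenticate and authorise the request before anything else.
    if not is_authorised(user_id):
        raise HTTPException(status_code=403, detail="not allowed")

    # Record an audit entry and a durable deletion request record.
    request_id = str(uuid4())
    record_deletion_request(request_id, user_id)

    # Run the multi-store deletion asynchronously and respond immediately
    # with 202 Accepted instead of blocking the user on every subsystem.
    background_tasks.add_task(run_deletion_workflow, request_id, user_id)
    return {"request_id": request_id, "status": "accepted"}
```

Returning 202 rather than 200 signals that the deletion is accepted and in progress, which matches the long-running nature of the workflow.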
Step 3. Write a deletion request record and tombstone
Before touching any data, write a deletion request record to a durable store keyed by user id. This record should contain
- Request time
- Legal basis or request origin
- Status for each subsystem, for example database, cache, search index
Then write a tombstone record or a logical delete flag for the user in the primary source of truth. This ensures that any later replication or cache fill logic does not resurrect the user data.
Example:
- The users table has a boolean column is_deleted and a timestamp deleted_at.
- Application code treats any row with is_deleted set as non-existent for read paths.
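Below is a small illustrative sketch of the deletion request record and the tombstone write, using SQLite from Python's standard library. The table and column names are assumptions, not a prescribed schema.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id TEXT PRIMARY KEY,
    email TEXT,
    is_deleted INTEGER DEFAULT 0,   -- logical tombstone flag
    deleted_at TEXT
);
CREATE TABLE deletion_requests (
    user_id TEXT PRIMARY KEY,
    requested_at TEXT,
    origin TEXT,                    -- legal basis or request origin
    db_status TEXT DEFAULT 'pending',
    cache_status TEXT DEFAULT 'pending',
    search_status TEXT DEFAULT 'pending'
);
""")

def record_and_tombstone(user_id: str, origin: str) -> None:
    """Durably record the deletion request, then tombstone the user row."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # single transaction
        conn.execute(
            "INSERT OR IGNORE INTO deletion_requests (user_id, requested_at, origin) VALUES (?, ?, ?)",
            (user_id, now, origin),
        )
        conn.execute(
            "UPDATE users SET is_deleted = 1, deleted_at = ? WHERE id = ?",
            (now, user_id),
        )

conn.execute("INSERT INTO users (id, email) VALUES ('u42', 'alice@example.com')")
record_and_tombstone("u42", "gdpr_user_request")
```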
Step 4. Propagate delete to primary and replicas
Now you perform hard deletion or partial deletion on primary and replicas. There are two common patterns.
- Strongly consistent replication. If you use log based replication, issue a delete or update that flows through the replication stream. Replicas will eventually apply the delete. You must ensure that all writes to that user cease once the tombstone is set.
- Event-driven fan-out. The privacy controller emits a UserForgotten event to a message bus. Each storage-owning service subscribes, performs its own deletion and reports its status back to the controller.
Key implementation details
- Use idempotent operations. Deleting the same user twice should be safe.
- Use versioning to avoid stale replicas re-creating data. For example, store a user_version and only write if the incoming version is newer (see the sketch below).
- Have a maximum propagation time target, for example all replicas must delete within thirty minutes.
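Here is a rough sketch of a consumer that applies these rules, with an in-memory dictionary standing in for a real replica store; the event shape and field names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class UserForgotten:
    user_id: str
    user_version: int   # version at the time the tombstone was written

# Illustrative in-memory replica: user_id -> {"version": int, "data": dict}
replica_store: dict = {}
tombstoned: set = set()   # user ids that must never be re-created

def handle_user_forgotten(event: UserForgotten) -> None:
    """Idempotent delete: running it twice for the same user is harmless."""
    tombstoned.add(event.user_id)
    replica_store.pop(event.user_id, None)

def apply_replicated_write(user_id: str, version: int, data: dict) -> None:
    """Replication write path with version check and tombstone guard."""
    if user_id in tombstoned:
        return  # user was forgotten; a late event must not resurrect the row
    current = replica_store.get(user_id)
    if current is not None and current["version"] >= version:
        return  # stale write, drop it
    replica_store[user_id] = {"version": version, "data": data}
```

The tombstone guard is what prevents the classic resurrection bug where a delayed replication event re-inserts a user who has already been forgotten.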
Step 5. Invalidate caches and CDNs
Caches are the part that trips up many candidates in system design interviews. A cache can still serve stale user profiles after the database row is deleted, which violates right to be forgotten.
You can use a mix of strategies.
- Cache key based eviction. When the privacy controller handles a deletion request, it publishes a message containing all relevant cache keys for that user. Cache nodes subscribe and evict those keys.
- Token based identity. Instead of caching raw user records, you cache them with a version token. When the user is deleted or updated, the token changes and all old cache entries become invalid.
- CDN purge. For content stored and cached at the edge such as avatars, use CDN purge APIs to remove objects by URL or path. Combine this with short TTLs for personal data resources.
Important detail. The tombstone logic in the application layer should ensure that even if a cache miss causes a database read, the result is treated as deleted and a new cache entry is not created.
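A possible sketch of key-based eviction combined with a version token, assuming a Redis instance reachable through the redis-py client and made-up key patterns:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def evict_user_keys(user_id: str) -> None:
    """Key-based eviction: delete every cache key derived from the user id."""
    keys = [f"user:{user_id}", f"timeline:{user_id}", f"profile_page:{user_id}"]
    r.delete(*keys)

def cache_key(user_id: str) -> str:
    """Token-based identity: the key embeds a version token, so bumping the
    token on update or deletion makes all old entries unreachable."""
    token = r.get(f"user_version:{user_id}") or "0"
    return f"user:{user_id}:v{token}"

def on_user_forgotten(user_id: str) -> None:
    evict_user_keys(user_id)
    r.incr(f"user_version:{user_id}")   # invalidates token-based entries too
    # A CDN purge would be an additional call to the provider's purge API
    # for the user's avatar URL or path, combined with short TTLs.
```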
Step 6. Handle search indexes and analytics
Search and analytics often maintain denormalised copies of user data. For right to be forgotten you have two main options.
- Maintain a per-user index of documents. When you handle a deletion, you directly delete or scrub all documents that contain that user id.
- For aggregate analytics that cannot be reversed, you remove identifiers or coarse grain the data so it no longer counts as personal data.
Search indexers should also listen to the same UserForgotten event and apply deletions.
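Elasticsearch and OpenSearch both expose a _delete_by_query endpoint, so a deletion handler can look roughly like the sketch below; the index name and field are assumptions.

```python
import requests

SEARCH_HOST = "http://localhost:9200"   # assumed Elasticsearch/OpenSearch node

def delete_user_documents(user_id: str, index: str = "user_content") -> int:
    """Delete every document in the index owned by or referencing the user."""
    resp = requests.post(
        f"{SEARCH_HOST}/{index}/_delete_by_query",
        json={"query": {"term": {"user_id": user_id}}},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("deleted", 0)   # number of documents removed
```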
Step 7. Manage backups and cold storage
Backups are tricky. Most organisations do not go back and edit old write-once snapshots. Instead, they ensure that if a backup is restored, a follow-up job re-applies all deletion tombstones before the system becomes active.
From a design and interview point of view the key points are
- Deleted data must never re-enter active serving systems from backups.
- You keep deletion request records and tombstones longer than any backup retention period.
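A rough sketch of that restore-time replay job, reusing the hypothetical deletion_requests and users tables from the Step 3 example:

```python
import sqlite3

def replay_deletions_after_restore(conn: sqlite3.Connection) -> int:
    """Re-apply every historical deletion request to a freshly restored
    database before it is allowed to serve traffic."""
    user_ids = [row[0] for row in conn.execute("SELECT user_id FROM deletion_requests")]
    with conn:
        for user_id in user_ids:
            # Hard-delete again; restored rows for forgotten users must not
            # re-enter the active serving system.
            conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
    return len(user_ids)
```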
Step 8. Observability for deletion
Right to be forgotten is incomplete without good observability. Build dashboards and alerts such as
- Percentage of deletion jobs completed within target time
- Number of failed deletion tasks by subsystem
- Random sampling of deleted users to confirm no data returned via public APIs
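The verification part can be as simple as the sketch below, which samples forgotten users and probes a public endpoint; the API URL and sampling logic are illustrative assumptions.

```python
import random
import requests

API_BASE = "https://api.example.com"   # hypothetical public API

def verify_deletion_sample(forgotten_user_ids: list[str], sample_size: int = 20) -> float:
    """Return the fraction of sampled forgotten users whose profile is gone.
    Feed this number into a dashboard and alert if it drops below 1.0."""
    sample = random.sample(forgotten_user_ids, min(sample_size, len(forgotten_user_ids)))
    gone = 0
    for user_id in sample:
        resp = requests.get(f"{API_BASE}/v1/users/{user_id}", timeout=10)
        if resp.status_code == 404:
            gone += 1
    return gone / len(sample) if sample else 1.0
```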
In an interview, mentioning observability and verification of deletion often differentiates senior candidates.
Real-world example: Instagram-style account deletion
Think of a social photo-sharing app similar to Instagram. A user requests account deletion. What happens behind the scenes?
- The user clicks delete account, which hits the privacy controller API.
- The controller records the request and sets a tombstone on the user row in the primary user database. The app stops showing this user in any feed or profile page.
- A UserForgotten event is published to a message bus.
- Services that own posts, comments, followers and messages each consume this event. Some decide to delete entire documents. Others scrub text and keep non-personal counters like total like counts.
- The cache service receives the event and removes keys such as user:{id}, timeline:{id} and profile_page:{id}.
- CDN purge is called for the profile picture and perhaps all media under a user-specific path.
- The search indexer removes this user from people search and from search results where personal content is directly associated with the user.
- An offline job verifies that, for this user id, no live API still returns personal data.
This flow is a realistic picture of how a large consumer platform handles right to be forgotten in a distributed environment.
Common Pitfalls and Trade-offs
Here are common issues teams and candidates miss.
- Forgetting caches and replicas. Deleting only from the primary database, which leaves stale data in Redis, the CDN or the search index.
- Incomplete data inventory. Not knowing all the places where user data lives, especially in legacy or analytics systems.
- Resurrection bugs. A stale replica or consumer processes an old event and re-inserts a deleted row because there is no version check or tombstone logic.
- Over-aggressive deletion. Deleting derived metrics that no longer count as personal data and that you are legally allowed to keep, such as aggregated statistics. This can hurt analytics and experimentation.
- Blocking user flows. Performing all deletion synchronously in the request path so the user has to wait while many subsystems respond. This causes timeouts and reliability issues. It is better to accept the request quickly and process deletion asynchronously while preventing any further use of the data.
The trade-off is mostly between:
- Strong consistency of deletion across all replicas and caches
- Simplicity and performance of the system
You usually aim for a bounded time window in which the delete is guaranteed to propagate, combined with safeguards that prevent recreation of data after tombstoning.
Interview Tip
In a system design interview, if you get a requirement like "Users in the European region must be able to request deletion of all personal data within thirty minutes", you can structure your answer like this:
- Clarify what counts as personal data and which subsystems store it.
- Propose a privacy controller and a deletion event that fans out to services.
- Describe tombstones in the primary store and cache invalidation based on user id.
- State a clear deletion SLA and how you monitor it.
- Mention that backups are handled by applying tombstones after restore.
Also be ready to compare this to a simple soft delete design and explain why that is not enough for legal compliance.
Key Takeaways
- Right to be forgotten is about complete end-to-end deletion of personal data, not a single delete statement.
- You need a central workflow that knows all data surfaces and orchestrates deletion across replicas, caches and indexes.
- Tombstones and idempotent events prevent data from being resurrected by stale replicas or consumers.
- Cache and CDN invalidation are essential or you will keep serving deleted data from edge locations.
- Observability and deletion SLAs are part of the design, especially for enterprise and regulated systems.
Table of Comparison
Here is a comparison of different approaches to user deletion in a distributed system:
| Approach | What happens | Deletion guarantee | Pros | Cons | When to use |
|---|---|---|---|---|---|
| Simple hard delete in primary store only | Delete row or document in primary database | Primary only. Replicas and caches may still hold stale data | Easy to implement | Non compliant for right to be forgotten in distributed systems | Very small systems with no replicas or caches |
| Soft delete with logical flag | Mark user as deleted and filter during reads | System stops using data but raw data still exists | Low risk of accidental resurrection in read paths | Not enough for GDPR style deletion | Internal systems where legal compliance is not required |
| Event driven deletion with tombstones | Central controller writes tombstone and emits deletion events to all services | Strong privacy guarantee once events propagate within target SLA | Scales across regions and microservices. Observable and safe | More moving parts. Requires careful idempotency and monitoring | Large distributed systems that must satisfy regulatory deletion |
| TTL based expiry only | User data expires after a retention period | Best effort and delayed. No immediate delete on request | Very simple for cache layers | Not sufficient for user initiated deletion | Supplemental strategy for caches and edge layers |
FAQs
Q1. What is right to be forgotten in system design?
It is the requirement that when a user requests deletion, your system removes their personal data from all online storage, replicas, caches and indexes within a defined time window, and ensures it cannot be recreated by stale components.
Q2. How do you delete user data from caches and CDNs?
You keep track of cache keys per user, then on a deletion request you publish an invalidation event that cache nodes and CDNs subscribe to. They evict keys and purge objects by path or URL. You also keep personal data TTLs short and rely on tombstones so deleted users are never written back to cache.
Q3. Do I have to delete data from backups for GDPR compliance?
Most designs do not mutate backups themselves. Instead, they guarantee that if a backup is restored, a deletion replay job runs before the system serves any traffic, applying all historical tombstones and deletion requests so that personal data does not reappear.
Q4. How fast should right to be forgotten deletion happen?
Regulations often expect deletion within a reasonable period, for example days. Many modern services target minutes or hours for online systems. In design interviews, you can suggest an SLA such as complete deletion from replicas and caches within thirty minutes and explain how you monitor it.
Q5. What is the difference between soft delete and right to be forgotten?
Soft delete only hides data from normal reads by marking a flag, while the raw data still exists in storage. Right to be forgotten requires actual removal or irreversible transformation of personal data so that it is no longer present in live systems and cannot be associated with the user.
Q6. How do I talk about right to be forgotten in a system design interview?
Explain how you identify all data surfaces, introduce a central deletion workflow, use tombstones and events to propagate deletion, invalidate caches and CDNs, and ensure backups cannot re-introduce data. Mention explicit deletion SLAs and monitoring to show senior-level thinking.
Further Learning
If you want to practice designing privacy aware architectures end to end, a great next step is to study the core patterns in Grokking System Design Fundamentals. It walks through data modelling, replication and consistency in a very practical way.
To go deeper into deletion flows, replication behaviour and cache invalidation in large distributed systems, explore the advanced scenarios inside Grokking Scalable Systems for Interviews. It is tailored for high level system design interview preparation and focuses on building scalable and compliant architectures.