Pattern Walkthrough · Social Feed

Design Twitter / Social Feed

A complete 50-minute interview walkthrough. The four-step framework applied end-to-end: clarify, decompose, deep-dive on the celebrity fan-out problem, evaluate. Read it as the conversation you'd have, not the answer you'd memorize.

Pattern: Social Feed · Difficulty: Senior / Staff · Reading time: ~40 min

00 · Quick Orientation

This is the first walkthrough on the site, so a quick note on what it's trying to do. The goal isn't to give you a single right answer to memorize. The goal is to model the conversation a senior candidate has with an interviewer over 50 minutes. The framework is on display the whole time. Where there's a real choice point, both alternatives are surfaced and the choice is justified by the requirements that came earlier.

The four framework steps from the methodology page structure the walkthrough:

  1. Clarify (5 min) — What are we actually building? Who uses it, at what scale, with what constraints?
  2. Decompose (10 min) — High-level architecture. The major components and how they connect.
  3. Deep-dive (25-30 min) — The hardest 1-2 problems, worked in detail. For social feed, that's almost always the celebrity fan-out problem.
  4. Evaluate (5 min) — What would you do differently at the next scale? What did we skip? Where are the failure modes?

Time is approximate; interviewers vary. The principle is to allocate roughly half the time to deep-dive on the hardest problems, not to spread time evenly. Strong candidates protect the deep-dive budget by keeping clarification tight and decomposition concise.

Pattern recognition first

This is a social feed pattern. Naming the pattern out loud in the first minute is the senior signal. "This sounds like a social feed pattern. The dominant decisions are around feed materialization (push vs pull fan-out), how we handle celebrity accounts, and what we use for timeline storage. Want me to focus there, or are there specific aspects you'd rather emphasize?" That sentence does work: it shows you've seen this kind of problem, sets up the rest of the conversation, and gives the interviewer a chance to redirect if they want to.

Step 1 · Clarify (5 min)

Before drawing anything, get the requirements straight. The same question ("design Twitter") could mean five different systems depending on what's in scope and what scale you're targeting. Wrong assumptions here propagate through the rest of the design.

Time: 5 min · ~10% of the interview

The conversation

Here's how the clarifying conversation might actually go. The interviewer will have prepared answers to most of these but will also use the questions as signal — strong candidates ask the right questions, in the right order, with reasoning behind each one.

You

Before I dive in, I want to make sure I understand the scope. When you say "Twitter," are we focused on the core feed experience — posting, following, and reading the timeline — or do you want me to also cover search, DMs, notifications, and so on?

Interviewer

Let's stay focused on the core feed. Posting, following, and the home timeline.

You

Great. A few more clarifying questions. What scale are we designing for? Day-one launch, or something at established-Twitter scale?

Interviewer

Assume mature Twitter scale. Hundreds of millions of daily active users.

You

For the timeline ordering — is it strictly chronological, or do we need to support algorithmic ranking? That'll affect the storage and read path significantly.

Interviewer

Reverse-chronological for now. We can talk about ranking briefly if there's time, but the main design should target chronological.

You

Last few. Are there edge cases I should plan for explicitly — celebrity accounts with millions of followers, deletes, edits? And what's our latency target for the home timeline?

Interviewer

Celebrity accounts are very much in scope; you'll want to address that specifically. Deletes need to work, edits aren't required. Home timeline should load in under 200ms p99.

The functional requirements, written down

Out of that conversation, the functional requirements are roughly:

  • Post. Users can post tweets (text, ~280 chars; ignore media for the core design).
  • Follow. Users can follow other users; following is asymmetric.
  • Home timeline. Users can read a feed of recent posts from accounts they follow, in reverse-chronological order.
  • User timeline. Users can see all posts by a specific user.
  • Delete. Users can delete their own posts; deletes propagate to all timelines that referenced them.

The non-functional requirements

  • Scale. Hundreds of millions of DAU; we'll quantify in a moment.
  • Latency. Home timeline p99 under 200ms.
  • Availability. Highly available; brief tolerance for stale timelines is fine.
  • Read-heavy. Many more reads than writes (typical social feed ratio is roughly 100:1 reads to posts).
  • Celebrity-tolerant. Some users have millions of followers; the system must not break on their posts.

Quick scale estimation

Numbers don't have to be exact, but having rough magnitudes drives design decisions. A 60-second back-of-envelope for mature Twitter scale:

  • Daily active users: ~250M DAU
  • Posts per user per day: ~0.4 (most users read more than post)
  • Total posts per day: ~100M / day · ~1.2K / second average
  • Peak post rate (3x average): ~4K posts / second
  • Timeline reads per user per day: ~50
  • Total timeline reads per day: ~12.5B / day · ~150K / second average
  • Peak read rate (3x average): ~500K reads / second
  • Average followers per user: ~200 (median is much lower; mean is pulled up by celebrities)
  • Read:write ratio: ~125:1
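
The arithmetic behind these numbers, written out as a tiny back-of-envelope script. Every input is an assumed round number, so treat the outputs as orders of magnitude rather than precise figures:

    # Back-of-envelope only; all inputs are assumptions from the estimates above.
    DAU = 250_000_000
    POSTS_PER_USER_PER_DAY = 0.4
    READS_PER_USER_PER_DAY = 50
    SECONDS_PER_DAY = 86_400

    posts_per_day = DAU * POSTS_PER_USER_PER_DAY        # 100M
    avg_post_rate = posts_per_day / SECONDS_PER_DAY     # ~1.2K/s
    peak_post_rate = 3 * avg_post_rate                  # ~3.5K/s, call it 4K/s

    reads_per_day = DAU * READS_PER_USER_PER_DAY        # 12.5B
    avg_read_rate = reads_per_day / SECONDS_PER_DAY     # ~145K/s, call it 150K/s
    peak_read_rate = 3 * avg_read_rate                  # ~435K/s, call it 500K/s

    read_write_ratio = reads_per_day / posts_per_day    # 125:1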

A few things this tells us right away. The system is heavily read-skewed (125:1), so we should optimize the read path aggressively, even if that means doing more work on writes. The peak read rate (500K/s) is large but not exotic. The peak write rate (4K/s) is small enough that the writes themselves aren't the problem; what matters is what each write triggers downstream.

The follower counts also matter. The average user has 200 followers, but celebrities have millions or tens of millions. Naive fan-out (write to each follower's timeline at post time) would mean a single celebrity tweet could trigger 100M writes. That's the celebrity problem we'll deep-dive on later.

What good clarification looks like

Notice what didn't happen: the candidate didn't ask "do users have profiles?" or "should it support hashtags?" These are real features but they don't change the core architecture. Strong clarifying questions discriminate between materially different designs. The questions that mattered were scope, scale, ordering semantics, and edge cases — each one had a clear effect on the design that came next.

Step 2 · Decompose (10 min)

Now the high-level architecture. Boxes and arrows for the major components. The goal is a defensible skeleton that supports the requirements, not a finished design — we'll fill in details during the deep-dive.

Time: 10 min · ~20% of the interview

High-Level Architecture

[Architecture diagram: client (web / mobile) → API gateway / load balancer, which routes posts to the Post Service (validate, persist, dispatch) and reads to the Timeline Service (assemble, hydrate, return). The Post Service writes to the Post Store (durable, keyed by post ID; Postgres / KV) and publishes post-created events to the Fan-Out Queue (Kafka). Fan-out workers consult the Social Graph (followers / following; sharded KV) and write into the Timeline Cache (per-user lists; Redis). The Timeline Service reads the cache and batch-fetches post content from the Post Store (hydration). Write path and read path are shown separated; the timeline cache holds precomputed feeds, and the post store is the system of record.]

The high-level skeleton. Posts are persisted, then fan-out workers asynchronously push them to follower timelines. Reads serve from a per-user timeline cache, hydrating post content from the post store. The two paths are deliberately separated.

Walking through the components

Six major components. Each one has a well-defined responsibility and a clear boundary.

  • API gateway / load balancer. Standard front door. Authentication, rate limiting (see rate limiting), routing to the right backend service. Nothing exotic here.
  • Post service. Handles writes. Validates input, persists to the post store, then publishes a fan-out event to the queue. Returns to the user as soon as the post is durably persisted; fan-out happens asynchronously.
  • Post store. The system of record for posts. Postgres or a transactional KV. Sharded by post ID. Stores the actual content (text, author ID, timestamp, etc.). This is what we never lose; everything else is rebuildable from here.
  • Fan-out queue + workers. Kafka or equivalent. The post service writes a "post created" event; workers consume the event, look up the author's followers in the social graph, and push the post ID into each follower's timeline cache. This is where the celebrity problem lives, which we'll deep-dive on next.
  • Timeline service + cache. Per-user precomputed timelines stored as lists in Redis (sorted by post timestamp, capped at maybe ~800 entries per user). Read path: timeline service fetches the user's timeline IDs from the cache, then hydrates the post content from the post store, then returns the assembled feed.
  • Social graph. Stores the follow relationships. Users → following, users → followers. Sharded by user ID. Read-heavy. Often Cassandra, DynamoDB, or a graph database; for this scale, a sharded KV is sufficient.
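
To make those responsibilities concrete, here is a minimal sketch of the data layout the rest of the walkthrough assumes. The key formats and field names are illustrative choices, not the only reasonable ones:

    from dataclasses import dataclass

    @dataclass
    class Post:
        post_id: int
        author_id: int
        created_at: float      # unix timestamp; doubles as the timeline sort score
        text: str
        deleted: bool = False  # soft delete (see the deletes section later)

    # Illustrative Redis key conventions:
    def timeline_key(user_id: int) -> str:
        return f"timeline:{user_id}"          # sorted set of post IDs, scored by created_at

    def followers_key(user_id: int) -> str:
        return f"followers:{user_id}"         # set of follower user IDs (social graph)

    def celebrity_posts_key(user_id: int) -> str:
        return f"celebrity_posts:{user_id}"   # a celebrity's own recent posts, cached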

Why the write path and read path are separated

The deliberate separation is the most important architectural choice here. Posts are written rarely (relative to reads); timelines are read constantly. By doing the expensive work (looking up followers, materializing timelines) at write time, we make reads cheap: a single Redis call for the timeline IDs plus a batch lookup for the post content. This is timeline materialization, and it's the canonical optimization for a heavily read-skewed system.

The cost is that writes are now expensive: a post by a user with 1000 followers triggers 1000 small writes. That's fine for normal users. The cost explodes for celebrities, which is exactly why deep-dive comes next.
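
A minimal sketch of a fan-out worker's core step, assuming a redis-py client, the key conventions above, and a post-created event carrying the post ID, author ID, and timestamp (all names are illustrative):

    import redis

    r = redis.Redis(decode_responses=True)
    TIMELINE_CAP = 800  # per-user timeline cap assumed by this design

    def handle_post_created(event: dict) -> None:
        """Push one post into every follower's cached timeline (fan-out-on-write)."""
        author_id = event["author_id"]
        post_id = event["post_id"]
        created_at = event["created_at"]  # unix timestamp, used as the sort score

        follower_ids = r.smembers(f"followers:{author_id}")  # social-graph lookup

        pipe = r.pipeline(transaction=False)
        for follower_id in follower_ids:
            key = f"timeline:{follower_id}"
            pipe.zadd(key, {post_id: created_at})
            # Trim to the newest TIMELINE_CAP entries (lowest scores = oldest posts).
            pipe.zremrangebyrank(key, 0, -(TIMELINE_CAP + 1))
        pipe.execute()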

What you're not building yet

Notice what's not in the architecture: notifications, search, trends, hashtags, DMs, media (images, video), media transcoding, recommendations, ad serving. All of these are real Twitter features but they're not part of the core feed scope we clarified. Adding any of them prematurely would dilute the design. If the interviewer asks about them later, we'll know which boxes to add and why; for now, scope discipline.

Step 3a · Deep-Dive: Fan-Out Strategy (12 min)

This is where the bulk of the interview happens. The single most consequential decision in a social feed design is how the timeline gets assembled. Two pure strategies, push and pull; neither holds up on its own at this scale; the right answer is a hybrid of the two.

Time: 12 min · ~25% of the interview

The choice

Push vs Pull: Two Strategies

[Comparison diagram. Push (fan-out-on-write), expensive write and cheap read: N writes at write time (one per follower), one very fast cache lookup at read time; breaks when N is huge (celebrities). Pull (fan-in-on-read), cheap write and expensive read: one write to the author's timeline at write time, M lookups (one per followee) plus a merge at read time; breaks when read traffic is heavy, because every read pays the cost.]

Push fan-out does the work at write time: a tweet from someone with N followers triggers N writes to N timeline caches. Reads are then trivial. Pull fan-out does the work at read time: each timeline read merges the recent posts of every account the user follows. Each strategy breaks under different conditions.

Push (fan-out-on-write)

When a user posts, the system writes the post ID into the timeline cache of every follower. Reads are trivial: fetch the user's timeline cache, hydrate the post content, return.

  • Pros: Reads are O(1) — a single Redis call. Latency is excellent and scales with read traffic almost for free. Most users feel the system as fast.
  • Cons: Writes are O(N) where N is follower count. For most users (median ~100 followers) this is fine. For celebrities with millions of followers, a single tweet triggers millions of writes. The fan-out queue clogs; the timeline caches thrash; some followers see the tweet seconds or minutes later than others.
  • Storage cost: Each follower gets a copy of the post ID in their timeline, so storage scales with total fan-out rather than with post count. With 250M users and a per-user cap of ~800 entries, the cache holds at most ~200B post-ID references (fewer in practice, since not every timeline is full), so total storage is bounded and predictable.

Pull (fan-in-on-read)

When a user posts, the system writes only to the user's own timeline (a list of their own posts). When someone reads their home timeline, the timeline service queries every followed user's timeline, merges them in time order, returns the top N.

  • Pros: Writes are O(1) — one append to the author's user timeline. Storage is minimal (each post stored once, not N times). Celebrities don't break anything special at write time.
  • Cons: Reads are O(M) where M is how many users the reader follows. For someone following 1000 accounts, every timeline read becomes 1000 lookups plus an in-memory merge. At 500K reads/sec across millions of users, this is operationally devastating. Latency suffers; throughput suffers worse.
  • Storage cost: Minimal. Each post stored once in the author's timeline.

Why neither alone is right

Push breaks for celebrity authors. Pull breaks for any read-heavy workload. The depth probe — "what about celebrity accounts?" — is exactly the question that exposes which strategy you've picked and whether you've thought about its limits.

The answer is hybrid, and it's specifically how Twitter's real architecture handles this. Push for normal users, pull for celebrities, merge at read time.

Decision

Pure push, pure pull, or hybrid?

Pure push optimizes the read path at the cost of write amplification on celebrity posts. Pure pull keeps writes cheap but explodes the read path under realistic load. Neither alone holds up at this scale.

Hybrid: push for users with normal follower counts (the vast majority), pull for users above a threshold (celebrities). The reader's timeline service merges the pushed timeline with the pulled celebrity posts at read time. Most reads are still fast because most posts came through push; the celebrity tail is handled separately without overwhelming the fan-out workers.

Picked: Hybrid. Justified by our scale (250M DAU, an asymmetric follower distribution with celebrities having tens of millions of followers). At smaller scale, pure push would be acceptable; at this scale, the celebrity write amplification forces the hybrid.

Step 3b · Deep-Dive: The Celebrity Solution (10 min)

The hybrid is the canonical answer. The depth lives in the specifics: how do you decide who's a celebrity, where does the celebrity post live, and how does the merge happen at read time without blowing the latency budget.

Time: 10 min · ~20% of the interview

Hybrid: Push for Normal Users, Pull for Celebrities

[Hybrid diagram. Write path: a normal user (~200 followers) takes the push path, fanning out ~200 timeline writes; a celebrity (10M+ followers) takes the pull path, a single write to their own cached celebrity timeline. The threshold: push if followers < ~10K, pull if followers ≥ ~10K; tunable, stored as a flag on the user record, updated when a user crosses it in either direction. Read path (merge at read): the timeline service merges the reader's pushed timeline (Redis, ~800 entries) with the cached timelines of the few celebrities they follow into the home timeline. The merge works because the reader follows ~200 accounts, most of them normal (pushed) and only a few celebrities (pulled); pull cost is bounded by the number of celebrities followed (usually under 50), and the in-memory merge takes ~10ms even at p99.]

The hybrid in detail. Normal users push to follower timelines as before. Celebrities just write to their own user timeline; nothing fans out. At read time, the timeline service fetches the reader's pushed timeline, fetches the recent posts of the celebrities they follow, and merges in time order. The merge is bounded because most users follow only a small number of celebrities.
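
In code, the write-path routing is a single branch on the author's celebrity flag (described below). A minimal sketch, assuming a redis-py client, the key conventions from earlier, and a hypothetical queue producer:

    import redis

    r = redis.Redis(decode_responses=True)

    def dispatch_post(author: dict, post_id: int, created_at: float) -> None:
        """Route a new post onto the push or pull path based on the author's flag."""
        if author.get("is_celebrity", False):
            # Pull path: one write to the celebrity's own cached timeline; no fan-out.
            r.zadd(f"celebrity_posts:{author['user_id']}", {post_id: created_at})
        else:
            # Push path: publish a post-created event for the fan-out workers to consume.
            publish_post_created_event({  # hypothetical queue producer
                "author_id": author["user_id"],
                "post_id": post_id,
                "created_at": created_at,
            })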

The threshold

How do we decide who's a celebrity? A simple threshold on follower count — say, 10,000 followers — works fine. Above the threshold, we don't fan out; below, we do. The exact threshold is tunable based on system capacity and observed performance.

The threshold is stored as a flag on the user record. When a user crosses the threshold (gains followers and goes above; loses followers and goes below), the flag flips and the system's behavior for that user changes. The crossing is rare enough that we can handle it asynchronously.
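
A sketch of the threshold check, run asynchronously whenever a follower-count change is processed. The threshold value and field names are the assumptions stated above:

    CELEBRITY_THRESHOLD = 10_000  # tunable

    def update_celebrity_flag(user: dict, new_follower_count: int) -> None:
        """Flip the per-user celebrity flag when the follower count crosses the threshold."""
        user["follower_count"] = new_follower_count
        is_celebrity = new_follower_count >= CELEBRITY_THRESHOLD
        if is_celebrity != user.get("is_celebrity", False):
            # From here on, the author's new posts switch between push and pull paths.
            # Posts already pushed into follower timelines are left in place.
            user["is_celebrity"] = is_celebrity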

The merge mechanics

At read time, the timeline service does five things:

  1. Fetch the reader's pushed timeline. One Redis call. Returns ~800 post IDs in time order, all from non-celebrity authors the user follows.
  2. Identify the celebrities the reader follows. Look up the reader's following list, filter for users with the celebrity flag. This is bounded — most users follow at most a few dozen celebrities.
  3. Fetch each celebrity's recent posts. One small Redis call per celebrity (their cached recent posts). These are heavily cached because many readers want them.
  4. Merge the lists. Pushed timeline + celebrity posts, sorted by time, take top N. In-memory merge of a few thousand items is microseconds of work.
  5. Hydrate. Batch-fetch the post content for the top N IDs from the post store. Return the assembled feed.

Total latency budget for a typical read: 1-2 Redis calls for the pushed timeline, ~10-30 Redis calls for celebrity timelines (parallel), 1 batch lookup for hydration. Realistic p99 for the full operation: 50-100ms with appropriate caching, well within the 200ms target.
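
Putting the read-time steps together, a sketch of the merge under the same assumptions (redis-py, the key conventions from earlier, and a hypothetical fetch_posts_batch helper against the post store):

    import heapq
    import redis

    r = redis.Redis(decode_responses=True)
    PAGE_SIZE = 20

    def fetch_posts_batch(post_ids: list) -> list[dict]:
        """Hypothetical batch lookup against the post store (step 5, hydration)."""
        raise NotImplementedError

    def read_home_timeline(user_id: int, celebrity_ids: list[int]) -> list[dict]:
        """celebrity_ids: followed accounts flagged as celebrities (step 2, done by the caller)."""
        # Step 1: pushed timeline, newest first, with timestamps as scores.
        candidates = r.zrevrange(f"timeline:{user_id}", 0, PAGE_SIZE - 1, withscores=True)

        # Step 3: pull each followed celebrity's cached recent posts.
        for cid in celebrity_ids:
            candidates += r.zrevrange(f"celebrity_posts:{cid}", 0, PAGE_SIZE - 1, withscores=True)

        # Step 4: merge by timestamp and keep the newest PAGE_SIZE post IDs.
        newest = heapq.nlargest(PAGE_SIZE, candidates, key=lambda pair: pair[1])
        post_ids = [post_id for post_id, _score in newest]

        # Step 5: hydrate content and return the assembled feed.
        return fetch_posts_batch(post_ids)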

Why this works in practice

The hybrid succeeds because of a structural fact about real social networks: most users follow many normal accounts and few celebrities. The pull cost is bounded by the number of celebrities followed, which is usually small. The push cost is bounded by the threshold (we'd never push to >10K followers). Neither dimension explodes in the way they would for the pure strategies.

The depth-probe response

"What if I followed 5,000 celebrities?" The strong response: "That'd be unusual but real. The pull cost grows linearly with celebrities-followed, so we'd start to see latency degrade. Mitigation: cache aggressively at the celebrity timeline level (one cached celebrity timeline serves millions of readers), which removes most of the load. If we still saw issues for users in this long tail, we could materialize their timelines on a slower path with looser freshness guarantees. The 200ms target probably won't hold for these users; we'd accept that and document it."

Step 3c · Deep-Dive: Storage, Deletes, Brief Notes (5 min)

The hybrid fan-out is the centerpiece, but a few storage choices and edge cases are worth quick attention before evaluation. Time-budget allowing, the interviewer will probe these directly; if not, naming them briefly shows you'd handle them.

Time: 5 min · ~10% of the interview

Post store: Postgres or sharded KV?

The post store is the system of record. Posts are immutable once written (no edits in our scope), so the workload is append-heavy with point lookups by post ID. Two reasonable choices:

  • Postgres, sharded by post ID. Easy to operate. Strong consistency. Good fit for the workload because reads are point lookups and writes are appends. The same "Postgres-as-default" position from the database selection deep-dive applies here.
  • Sharded KV (DynamoDB, Cassandra). Scales horizontally without operational tuning. Eventual consistency at the cell level is acceptable because posts are immutable. Twitter's real-world choice (Manhattan, their internal Cassandra-like store) sits in this family.

Either is defensible. The deciding factor is operational maturity: a team running Postgres at scale should pick Postgres; a team without DBA depth might prefer the managed KV. Sharding covers the partition-key tradeoffs (post ID hash gives even distribution; the only follow-up is whether to add a time component to the key for chronological queries).

Timeline cache: per-user lists in Redis

Each user's pushed timeline is a Redis sorted set keyed by user ID, with post IDs scored by post timestamp. Cap each timeline at ~800 entries (longest realistic feed depth a user reads in one session). Older entries get evicted; if a user scrolls past the cap, fall back to the post store and reconstruct.

Sharding the Redis cluster by user ID gives even distribution and locality (one user's timeline is on one shard). Caching covers the placement and eviction patterns; the timeline cache is the canonical "read-through" placement at the application tier.

Deletes

When a user deletes a post, the post store marks it deleted (soft delete, kept for moderation/audit purposes). The harder problem is the cached timelines: the deleted post ID may already exist in millions of follower timelines.

Two approaches, often combined:

  • Tombstone in the post store. When the timeline service hydrates posts and encounters a deleted one, it just skips it. The cached timeline still has the ID; the user just sees a slightly shorter feed for a window. Cheap to implement, eventually self-corrects.
  • Async cleanup of timeline caches. A background job consumes the delete event and removes the post ID from follower timelines. Slower but cleaner. Often implemented as a best-effort second pass.

The first is enough for normal scale; the second is added for celebrities or when retention guarantees matter (compliance, legal holds).
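
A sketch of the tombstone approach at hydration time, assuming the post store returns a deleted flag per post (the helper and field names are hypothetical):

    def hydrate_skipping_tombstones(post_ids: list[int]) -> list[dict]:
        """Batch-fetch posts by ID and drop soft-deleted ones from the response."""
        posts = fetch_posts_batch(post_ids)  # hypothetical batch lookup by post ID
        # Cached timelines may still reference deleted IDs; skipping them here lets
        # the feed self-correct without touching millions of follower timelines.
        return [post for post in posts if not post.get("deleted", False)]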

Pagination

Cursor-based pagination, not offset-based. The timeline cache is a sorted set; the cursor is "post ID older than X" and the read is "give me the next 20 entries before this cursor." This avoids the offset-based gotchas (new posts during pagination shifting the offset) and is what every modern API does.
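
A sketch of the cursor read against the sorted-set timeline, using the post timestamp as the cursor with an exclusive upper bound (key name and page size are illustrative):

    import redis

    r = redis.Redis(decode_responses=True)
    PAGE_SIZE = 20

    def timeline_page(user_id: int, cursor_ts: float | None = None):
        """Return (post_id, timestamp) pairs older than the cursor, newest first."""
        # The "(" prefix makes the bound exclusive, so the cursor entry isn't repeated.
        max_score = f"({cursor_ts}" if cursor_ts is not None else "+inf"
        return r.zrevrangebyscore(
            f"timeline:{user_id}", max_score, "-inf",
            start=0, num=PAGE_SIZE, withscores=True,
        )
    # The caller uses the timestamp of the last returned entry as the next cursor.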

Briefly: ranking, search, notifications

If time permits, a sentence each on what these would add:

  • Ranking. The timeline service applies a scoring model after merge but before return. Score includes recency, engagement signals, author affinity. This is where a learning-to-rank model would slot in. Adds latency (10-30ms) but transforms the user experience for engagement-driven products.
  • Search. Separate Elasticsearch index over posts, kept in sync via CDC from the post store. Search and indexing covers the pattern; the primary-store-plus-search-index is the canonical architecture.
  • Notifications. Separate notification service consumes the same fan-out events. Reuses the social graph. Message queues covers the delivery semantics.

Each of these is a deep-dive in itself. Naming them shows scope awareness without spending time we don't have.

Step 4 · Evaluate (5 min)

The closing move. What did we build, what did we skip, what would break under stress, and what would we add at the next scale milestone? Strong candidates evaluate their own design with the same rigor they applied to building it.

Time: 5 min · ~10% of the interview

What we got right

  • Read path is optimized for the dominant workload. 125:1 read-to-write ratio means the reads have to be cheap. The pushed timeline cache makes typical reads a single Redis call.
  • Celebrity problem is handled explicitly. The hybrid push/pull is the canonical solution and correctly justified by the asymmetric follower distribution.
  • Storage tier is simple and rebuildable. Post store is the source of truth; the timeline cache is rebuildable from posts + social graph. If the cache cluster fails, we degrade to slow reads from the post store rather than losing data.

What we'd add at the next scale

  • Geographic distribution. If users span continents, we'd add region-local read replicas of the post store and timeline cache. Replication and consistency covers the tradeoffs; the answer is asynchronous regional replication with a primary region for writes.
  • Tiered timeline storage. Active users (logged in recently) get full timeline caches; dormant users get reduced or no cache. Saves Redis memory significantly. Recompute on user return.
  • Sharded fan-out queue. The fan-out queue itself becomes a bottleneck at extreme scale. Partition by author ID; multiple parallel fan-out worker pools.

What we explicitly didn't cover

  • Media (images, video) — would add a separate object-store + CDN tier, transcoding pipeline, and async upload flow.
  • Edits — would force the post store toward update-friendly schema and complicate the cache (post IDs in timelines would still resolve correctly, but content changes need to propagate).
  • Search, notifications, DMs, ad serving, hashtags, trends — all separate systems that compose with this one.
  • Cross-region failover and disaster recovery semantics.
  • Privacy controls (private accounts, blocking, muting).

Where this design would break

  • Sustained celebrity post storms. If many celebrities post simultaneously (a major news event), the merge cost at read time spikes. The cache for celebrity timelines absorbs most of this, but pathological cases exist. Mitigation: aggressive celebrity timeline caching at edge, possibly precomputing celebrity-merged feeds for high-overlap follower clusters.
  • The 5,000-celebrities-followed user. Pull cost at read time is unbounded in principle. We accepted some latency degradation for these users; in practice, they're rare enough that the design holds.
  • Timeline cache cluster failure. If Redis fails, the system falls back to reconstructing timelines from the post store, which is much slower. Latency would degrade severely for the duration. Mitigation: replicate Redis aggressively, accept the read fallback as a documented degraded mode.
  • Cold-start for new users. A new user with no follow graph has an empty timeline. We'd surface a default content stream (popular posts, suggested follows) for cold-start cases.

The evaluate step is where staff candidates separate from senior. Naming what you didn't build, what would break, and what you'd add next signals the kind of operational thinking that distinguishes someone who's run systems from someone who's only designed them.

04 · Common Follow-Up Probes

Interviewers usually have 2-3 follow-up questions ready once the main design is laid out. These are the most common in social feed interviews. The strong responses below assume the design above is on the whiteboard.

Probe

"How would you add ranking instead of strict chronological?"

The timeline service applies a scoring model after the merge step, before returning the top-N. Inputs: post features (age, author engagement history), user features (interaction history, affinity signals), and contextual features (time of day, device). Output: a score per candidate post; top-N by score replaces top-N by time. The merge becomes "merge by score" instead of "merge by time," which is a small change at the application level. The bigger change is the model training pipeline and how feedback flows back. Most production feeds use a combination of recent + ranked: recency is hard-floored so the feed feels fresh, and ranking re-orders within the recency window.
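
Mechanically, the change is small: score the merged candidates, then take the top N by score instead of by time. A sketch with a stand-in scoring function; the features and weights here are purely illustrative, not a real model:

    import time

    def rank(candidates: list[dict], user_features: dict, top_n: int = 20) -> list[dict]:
        """Re-order merged timeline candidates by a score instead of by timestamp."""
        def score(post: dict) -> float:
            age_hours = (time.time() - post["created_at"]) / 3600
            recency = 1.0 / (1.0 + age_hours)  # recency bias keeps the feed feeling fresh
            affinity = user_features.get("affinity", {}).get(post["author_id"], 0.0)
            engagement = post.get("engagement_score", 0.0)
            return 0.5 * recency + 0.3 * affinity + 0.2 * engagement  # illustrative weights
        return sorted(candidates, key=score, reverse=True)[:top_n]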

Probe

"What if a user goes from non-celebrity to celebrity overnight?"

Two cases. First, a user gradually accumulates followers and crosses the 10K threshold during normal operation. Their flag flips; future posts use pull instead of push. Past posts that were already pushed to follower timelines stay there; they don't need cleanup. Second, a viral event spikes a user's followers from 1K to 1M in hours. Their flag flips at the threshold crossing; their past posts (already pushed) are fine; new posts use pull. The transition handles itself; what's worth flagging is that during the threshold-crossing window, fan-out workers may briefly thrash on this user before the flag updates, which is acceptable for the few minutes it takes to settle.

Probe

"How do you handle a user who follows tens of thousands of accounts?"

The pushed timeline is bounded at ~800 entries regardless of how many accounts the user follows. So the read path scales fine. The actual cost: the user's timeline cache may be churnier (many small writes from the many fan-out workers updating it). Redis handles this without issue at our scale, but if it became a hotspot we could shard the user's timeline across multiple keys. For the celebrity-merge cost, see the previous answer about 5K celebrities — we accept latency degradation for the long tail.

Probe

"What's the failure mode if Redis goes down?"

The whole timeline cache becomes unavailable. The system falls back to reconstructing timelines from the post store — fetching the user's followed accounts from the social graph, fetching recent posts from each, merging in memory. This is much slower (probably 1-2 seconds per request). Latency targets are missed; some requests will time out. We'd serve stale or empty timelines to many users during the outage. Mitigations: replicated Redis cluster (failover to replica is seconds), regional Redis deployments (one region's failure doesn't take down all reads), graceful degradation messaging to users. Observability matters here — alerting on Redis error rates and on the resulting timeline-read p99 spike.
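
The degraded path is essentially the pure-pull design from Step 3a. A sketch, with hypothetical social-graph and post-store lookups, to show where the cost goes:

    import heapq

    def rebuild_timeline_from_store(user_id: int, page_size: int = 20) -> list[dict]:
        """Fallback when the timeline cache is down: pull from every followee and merge."""
        followee_ids = get_following(user_id)  # hypothetical social-graph lookup
        recent = []
        for fid in followee_ids:
            # One post-store query per followee is why this path is so much slower.
            recent += get_recent_posts_by_author(fid, limit=page_size)  # hypothetical
        return heapq.nlargest(page_size, recent, key=lambda post: post["created_at"])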

Probe

"How would this design change for a different ratio, like Instagram (visual-heavy) or LinkedIn (slower, more curated)?"

Same pattern, different parameters. Instagram adds a media tier (object store + CDN + transcoding pipeline) and the post store schema is different, but the fan-out and timeline mechanics are nearly identical. LinkedIn's lower posting frequency makes pure push viable longer (fewer celebrity-equivalent edge cases) but adds heavier ranking signals (professional context, relevance). The mental model is "social feed pattern with these specific variations" — the pattern stays the same; what changes is which sub-decisions get more or less attention.

05 · How This Walkthrough Composes Concepts

The whole point of the questions hub framing is that questions are compositions of concepts you already know. Here's the explicit map for this walkthrough:

  • Caching. The timeline cache is the canonical "precomputed view" use of caching. Per-user materialized timelines in Redis. Eviction is implicit (the cache is the system of record for the materialization, but rebuildable from posts + graph).
  • Sharding. Post store sharded by post ID (hash). Timeline cache sharded by user ID. Social graph sharded by user ID. Each partition decision driven by the access pattern.
  • Message queues. The fan-out queue (Kafka) decouples the post-write from the timeline updates. Allows the post service to return quickly while fan-out happens asynchronously. At-least-once delivery + idempotent timeline writes.
  • Database selection. Postgres or DynamoDB for posts (transactional + immutable workload). Redis for timelines (in-memory list operations). Sharded KV for social graph (point lookups by user).
  • Load balancing. The API gateway distributes requests across post-service and timeline-service replicas. Standard layer-7 LB pattern.
  • Rate limiting. Per-user post creation rate limits (prevents spam). Tier-aware (verified accounts may have different limits).
  • Replication and consistency. Post store replicated for availability. Timeline cache replicated for read scaling. Eventually-consistent with the post store; staleness measured in seconds is acceptable.
  • Observability. Per-user-tier latency metrics (normal users vs celebrity-followers). Fan-out queue depth as a leading indicator. Cache hit rate. SLO around timeline read p99.

If you've worked through the concept library, you've already seen each of these primitives in detail. The walkthrough composes them into a specific solution; the concepts are the toolkit.

06 · Walkthrough FAQ

Should I memorize this walkthrough?

No. Memorizing the answer is the wrong preparation strategy and interviewers can usually tell. What you should internalize is the shape of the conversation: how clarification narrows the design, where the deep-dive goes, how decisions get justified by requirements. The specific decisions (push threshold at 10K, timeline cap at 800, etc.) are reasonable defaults but they should come from your reasoning in the moment, not from memory.

Is this how Twitter actually built it?

The hybrid push/pull strategy and the celebrity threshold are real Twitter patterns, documented in their engineering blog posts over the years. The specific choices (Redis, Kafka, Postgres) are common implementations but not necessarily what Twitter used internally — they built specialized systems (Manhattan, Heron) for some of these layers. The concepts and the architecture are the durable lesson; the specific implementations vary by company and era.

What if the interviewer pushes me toward pure pull instead of hybrid?

Engage with the alternative honestly. Pure pull is a defensible choice for systems with much lower scale or different priorities (e.g., a system where storage cost dominates and read latency is less critical). If the interviewer suggests pure pull, walk through what that design would look like and where it'd break. The signal is that you can reason about the tradeoffs, not that you've memorized "hybrid is right." If at the end your reasoning still favors hybrid, say so; if their constraints actually favor pull, accept that.

How do I know how much time to spend on each step?

Roughly 10/20/60/10 (clarify/decompose/deep-dive/evaluate). Watch the interviewer's body language — if they're pushing forward, move on; if they want detail, slow down. Use the deep-dive budget on the hardest 1-2 problems, not on shallow coverage of everything. If you're 30 minutes in and still on decomposition, you've over-invested in setup; cut the remaining decomposition and pivot to deep-dive. Time discipline is itself a senior signal.

What if I'm asked "design Instagram" instead?

Same pattern, different parameters. Notice it's still a social feed (you follow people, you see their posts). The differences are: media-heavy (images and video drive different storage/CDN concerns), Stories (ephemeral content adds an expiry dimension), Explore (algorithmic discovery feed in addition to following feed). Run the same framework; the clarification step surfaces the differences and the deep-dive lands on whichever sub-problem the interviewer wants to probe. Pattern recognition compresses the prep; the specific question's variations come out in clarification.

How long does this conversation actually take?

50 minutes is the typical interview length for this depth. Reading the walkthrough takes longer (~40 min) because written prose is denser than spoken design discussion. The actual interview rhythm is more conversational: the interviewer interrupts with questions, you adjust, you backtrack occasionally. Practice with a partner if you can; the cadence is part of the skill, not just the content.

What if I don't recognize the pattern?

Fall back to the framework. The four steps work even if you can't name the pattern: clarify the requirements, decompose into components, deep-dive on the hardest part, evaluate. You'll find yourself rediscovering pattern-like decisions (push vs pull, materialization vs computation) in real time. Slower than starting with pattern recognition, but defensible. After the interview, look up which pattern fit and add it to your repertoire for next time.

Continue

Back to the Question Library →

The other nine patterns and their walkthroughs (publishing over time). Or revisit the framework with this walkthrough's structure as a reference for how the four steps work in practice.
