How do you implement tiered storage (SSD/HDD/object) with lifecycle?

Tiered storage with lifecycle is a strategy that places data on different storage media based on how hot or cold that data is, then moves it over time to balance performance, cost, and durability. Think of it as a conveyor belt for bytes. New and frequently accessed items begin on fast solid-state drives. As interest cools, they migrate to cheaper hard drives. Long-lived or rarely accessed content settles in an object store. You still present a single logical namespace to applications, but placement and movement are handled by policy. This pattern shows up across large distributed systems, and it is a favorite topic in the system design interview because it forces you to reason about throughput, latency, durability, and cost at the same time.

Why It Matters

Real systems rarely have unlimited budgets or zero-latency goals. Tiered storage lets you buy performance only where it matters. Hot content benefits from low-latency reads and writes on SSD. Warm content trades a little latency for a better price on HDD. Cold content sits in object storage with near-infinite scale at the lowest price. Lifecycle policies automate the transitions, so ops does not become a manual treadmill. For a scalable architecture, this approach gives you levers to meet service level objectives while staying inside a predictable cost envelope.

Lifecycle design also unlocks compliance and resilience. You can enforce retention rules, apply WORM-style controls, and place replicas across failure domains without changing application code. In an interview, showing that you can model access patterns, pick sensible thresholds, and explain safe migration is an easy way to stand out as someone who understands distributed systems end to end.

How It Works, Step by Step

  1. Clarify requirements and budgets. Write down latency and durability targets for hot, warm, and cold data. Capture retention rules, legal hold, and recovery point and recovery time objectives. Establish a storage budget per monthly active user or per gigabyte so you can trade speed for cost with intent.

  2. Define tiers and placement factors. A common split is SSD for hot, HDD for warm, and object storage for cold. Decide replication for each tier. For example, SSD may keep two local replicas for fast recovery, HDD one local plus one remote, and object storage one bucket in each region with native cross-region replication.

  3. Design the metadata model. Centralize truth about each object in a small, fast metadata store. Key fields include object id, current tier, list of locations, last access timestamp, size, checksum, encryption key id, retention policy, legal hold flags, and a version counter. This record is the pointer you flip during migration (see the record sketch after this list).

  4. Write path. On create, write to the hot tier first. Compute a checksum. Acknowledge to the client once the hot tier write and the metadata record commit succeed. Optionally schedule background copies to warm and cold tiers for durability. Record a creation timestamp and an initial last access timestamp.

  5. Read path with promotion. Route reads using metadata. If the object is on SSD, serve it directly. If it is on HDD, read from HDD and consider promotion to SSD when a threshold is crossed, for example three hits inside one hour. If it is only on object storage, stream it back and optionally prefetch to HDD or SSD. Update the last access timestamp with a rate-limited writer so you do not overload the metadata store (see the read path sketch after this list).

  6. Lifecycle engine. Policies drive demotion from SSD to HDD and from HDD to object storage. Policies can use age since last access, total age, size, and storage pressure. Encode hysteresis so objects do not bounce between tiers. Example policy: demote to HDD after no access for seven days, demote to object storage after no access for ninety days, promote to SSD after five hits in one day (encoded in the policy sketch after this list).

  7. Data mover and atomic cutover. Movement is copy then switch. The mover copies bytes to the target tier using chunked or multipart transfer. Verify the checksum. Use idempotent job ids so retries are safe. After verification, flip the metadata pointer in a transaction and mark the old location as stale. A recycler reclaims old bytes after a grace period (see the mover sketch after this list).

  8. Small object packaging for object storage. Object stores can be inefficient for many tiny files. Aggregate small objects into larger segments and keep a segment index map in metadata or a side table. Bloom filters can short-circuit misses. This pattern cuts request overhead and improves throughput (see the packing sketch after this list).

  9. Deletion and retention. Support soft delete with a tombstone and a retention clock. Do not physically delete until the retention window passes and all replicas confirm. Legal hold overrides lifecycle. For WORM, disable updates and allow only versioned appends.

  10. Observability and control loops. Track per-tier capacity, read and write latency distributions, migration backlog, promotion and demotion rates, checksum failures, and cost per gigabyte. Alert on thrash signals, for example many promote-then-demote cycles for the same object. Build a dry-run mode to test policy changes against historical access logs.
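
To make step 3 concrete, here is a minimal sketch of the per-object metadata record, assuming Python dataclasses; the field names (for example `encryption_key_id` and `legal_hold`) are illustrative, not a prescribed schema.

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    SSD = "ssd"
    HDD = "hdd"
    OBJECT = "object"

@dataclass
class ObjectMeta:
    object_id: str
    current_tier: Tier
    locations: list[str]            # replica URIs on the current tier
    size_bytes: int
    checksum: str                   # e.g. hex SHA-256 of the content
    encryption_key_id: str
    retention_until: float | None   # epoch seconds; None means no retention clock
    legal_hold: bool = False
    created_at: float = 0.0
    last_access: float = 0.0
    version: int = 0                # bumped on every migration pointer flip
```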
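
Next, a sketch of the read path with promotion from step 5, reusing the `ObjectMeta` record above. The `store`, `promoter`, and `touch` arguments are assumed interfaces standing in for tier I/O, the async mover queue, and the rate-limited last-access writer; the thresholds mirror the three-hits-in-one-hour example.

```python
import time
from collections import defaultdict

PROMOTE_HITS = 3          # assumed threshold: three hits...
PROMOTE_WINDOW_S = 3600   # ...inside one hour

_recent_hits: dict[str, list[float]] = defaultdict(list)

def read(meta: ObjectMeta, store, promoter, touch) -> bytes:
    data = store.read(meta.current_tier, meta.locations)  # assumed tier I/O interface
    touch(meta.object_id)  # rate-limited last-access update (see the pitfalls section)

    if meta.current_tier is not Tier.SSD:
        now = time.time()
        hits = [t for t in _recent_hits[meta.object_id] if now - t < PROMOTE_WINDOW_S]
        hits.append(now)
        _recent_hits[meta.object_id] = hits
        if len(hits) >= PROMOTE_HITS:
            promoter.enqueue(meta.object_id, Tier.SSD)  # async copy-then-switch job
    return data
```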
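
The lifecycle rules in step 6 reduce to a pure policy function. The constants mirror the example policy (seven days, ninety days, five hits in one day) and are assumptions to be calibrated against real access logs.

```python
DAY_S = 86_400

def next_tier(meta: ObjectMeta, hits_last_day: int, now: float) -> Tier:
    idle = now - meta.last_access

    # Promotion: strong recent interest wins regardless of current tier.
    if hits_last_day >= 5 and meta.current_tier is not Tier.SSD:
        return Tier.SSD

    # Demotion with hysteresis: each step needs a long idle window, so a
    # single stray access cannot bounce an object between tiers.
    if meta.current_tier is Tier.SSD and idle > 7 * DAY_S:
        return Tier.HDD
    if meta.current_tier is Tier.HDD and idle > 90 * DAY_S:
        return Tier.OBJECT
    return meta.current_tier
```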
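
Step 7's copy-then-switch cutover, sketched with assumed `src`, `dst`, and `metadata` interfaces; the idempotency key and the optimistic version check are the load-bearing details.

```python
import hashlib

def migrate(meta: ObjectMeta, dst_tier: Tier, src, dst, metadata) -> None:
    job_id = f"{meta.object_id}:{meta.version}:{dst_tier.value}"  # idempotency key

    if dst.exists(job_id):                # retry-safe: reuse a finished copy
        copy = dst.open(job_id)
    else:
        copy = dst.copy_chunked(src, meta.locations, job_id)  # multipart transfer

    # Verify before any pointer change; never trust an unchecked copy.
    if hashlib.sha256(copy.read_all()).hexdigest() != meta.checksum:
        dst.delete(job_id)
        raise IOError("checksum mismatch: migration aborted, source untouched")

    # Atomic cutover: flip the pointer in one transaction, mark old bytes stale.
    metadata.transact(
        object_id=meta.object_id,
        expect_version=meta.version,      # optimistic guard against concurrent moves
        set_tier=dst_tier,
        set_locations=[copy.uri],
        mark_stale=meta.locations,        # recycler reclaims after the grace period
    )
```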
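
And for step 8, a minimal packing sketch: small blobs are appended into a segment buffer with an offset index so cold reads can use ranged GETs. The 64 MiB flush target and the store interface are assumptions.

```python
SEGMENT_TARGET = 64 * 1024 * 1024  # assumed flush threshold (~64 MiB)

class SegmentPacker:
    def __init__(self) -> None:
        self.buf = bytearray()
        self.index: dict[str, tuple[int, int]] = {}  # object_id -> (offset, length)

    def add(self, object_id: str, blob: bytes) -> bool:
        self.index[object_id] = (len(self.buf), len(blob))
        self.buf.extend(blob)
        return len(self.buf) >= SEGMENT_TARGET       # True means: flush now

    def flush(self, object_store, segment_key: str) -> dict[str, tuple[int, int]]:
        object_store.put(segment_key, bytes(self.buf))  # assumed store interface
        index, self.buf, self.index = self.index, bytearray(), {}
        return index  # persist next to metadata so reads can range-GET the segment
```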

Real World Example

Picture a photo sharing service. New uploads land on SSD to deliver instant feed loads and quick edits. A background job replicates the original to HDD and creates multiple resolutions. After seven days without access, the original moves to HDD but popular thumbnails stay on SSD. After ninety days without access, the original moves to object storage across two regions. When a user opens an old album, the service streams the original from object storage and promotes it back to HDD. A cache-warming job may keep the next ten images for that user on SSD to speed up the slideshow experience.

At scale, this design keeps hot working sets fast and cheap, yet the long tail of content remains extremely durable and low cost. Teams often discover that SSD capacity holds steady while total stored data grows many times over, because lifecycle continuously clears space for new hot content.

Common Pitfalls and Trade-offs

  • Metadata bottleneck. Storing a last-access timestamp on every read can crush the metadata store. Use write-behind buffers, sampling, or count-sketch style approximate counters to lower write traffic (see the sampling sketch after this list).

  • Thrashing due to aggressive promotion. Promote only after repeated hits inside a short window and add a cool-down before another demotion. Without hysteresis, movers waste bandwidth and SSD fills with one-hit wonders.

  • Unsafe cutover. Never delete the source before checksum verification. Use a two-step commit in metadata so readers do not see mixed state.

  • Small object inefficiency. Millions of tiny files on object storage create high request overhead and metadata bloat. Use segment packaging with an index and compact periodically.

  • Mismatched SLO and policy. If your SSD demotion window is shorter than user behavior, cold opens will spike and latency will violate the system goal. Calibrate with real access logs.

  • Ignoring multi-region blast radius. Keep replicas in different failure domains. A rack power event should not remove all hot copies.
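
As referenced in the first pitfall, a sampled last-access writer is one way to keep metadata write traffic bounded. The one percent rate and the `metadata.update_last_access` call are assumptions, not a specific library API.

```python
import random
import time

SAMPLE_RATE = 0.01  # record roughly 1 in 100 reads; tune to metadata headroom

def touch_sampled(metadata, object_id: str) -> None:
    # One sampled write stands in for ~100 reads, cutting write traffic ~100x
    # at the cost of coarser last-access timestamps.
    if random.random() < SAMPLE_RATE:
        metadata.update_last_access(object_id, time.time())  # assumed client call
```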

Interview Tip

Expect a prompt like "design a media storage service that serves ninetieth percentile read latency under fifty milliseconds while storing petabytes." Start by sizing the hot set from traffic and access decay. Propose SSD for the hot set, HDD for warm, and object storage for cold, with explicit thresholds. Explain copy then switch, idempotent movers, checksum validation, and hysteresis. Finish with a quick back-of-envelope estimate of promotion and demotion throughput and the operational metrics you would watch.
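
A quick sizing illustration, using assumed numbers rather than anything from a real workload: if ninety percent of five hundred million daily reads land on one percent of a ten billion object corpus, the hot set is about one hundred million objects, which at 2 MB per object is roughly 200 TB of SSD. That single estimate anchors the rest of the design conversation.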

Key Takeaways

  • Tiered storage maps hot, warm, and cold data to SSD, HDD, and object storage behind one logical namespace.

  • Lifecycle policies automate promotion and demotion using age, frequency, and size, with hysteresis to prevent churn.

  • Safe migration is copy then switch with checksum verification and delayed reclamation.

  • Small object packaging and a lean metadata model keep cost and latency under control.

  • Clear SLOs and observability guardrails make the system predictable for a system design interview and for production.

Comparison Table

| Approach | Typical Latency | Cost per Terabyte | Durability | Operational Risk | Best Fit |
| --- | --- | --- | --- | --- | --- |
| Tiered Storage with Lifecycle | Low for hot, medium for warm, higher for cold | Balanced and tunable | High with cross-region replicas | Medium, due to policy and movers | Large mixed workloads with long tails |
| Single SSD Only | Very low | High | High with replication | Medium, capacity pressure | Small hot working sets |
| Single HDD Only | Medium | Medium | High with replication | Medium | Sequential or batch-heavy access |
| Object Storage Only | High to very high | Low | Very high with native features | Low | Archive and infrequent access |

FAQs

Q1. How is lifecycle different from caching?

Caching keeps a copy near compute but treats that copy as disposable. Lifecycle moves the authoritative copy across tiers and updates metadata accordingly. Tiered storage can still use caches in front of each tier.

Q2. What signals should drive promotion to SSD?

Use a short window hit count, request latency spikes, or future demand hints. For example, if a user opens one photo in an album, prefetch the next few items to SSD.

Q3. How do you avoid data loss during migration?

Use copy then verify then switch. Keep the old copy for a grace period. Make movers idempotent and store job state so retries are safe.

Q4. Should the metadata store be relational or key value?

If you need secondary indexes and complex queries for audits, relational is fine. If access is mostly by object id with strict latency targets, a key-value store is a better fit. Many systems use a blend.

Q5. How do you estimate migration throughput for planning?

Multiply daily demotion and promotion counts by average object size, then add a safety factor for retries and verification reads. Ensure the mover fleet has enough network and CPU to meet the schedule without hurting live traffic.
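
As a worked illustration with assumed inputs: demoting two million objects per day at an average of 2 MB each is about 4 TB per day of mover traffic, or roughly 46 MB per second sustained; doubling that for verification reads and retries suggests provisioning the mover fleet for about 100 MB per second.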

Q6. How do encryption and keys work across tiers?

Store an encryption key id in metadata and use per-object or per-segment keys. Ensure keys are available in all regions and rotate them with a background re-encrypt job that respects lifecycle policies.

Further Learning

Build stronger intuition for storage and data movement with a focused path. For a guided deep dive into storage placement, promotion, and durability patterns, take the practical course Grokking Scalable Systems for Interviews. If you want to cement fundamentals like latency budgeting, consistency, and data modeling before tackling large designs, start with Grokking System Design Fundamentals.
