How do you handle clock skew (NTP/PTP) and monotonic time?
Clock skew is the difference in wall clock time between machines. In distributed systems even a small offset can break leases, expire tokens too early, or reorder events in confusing ways. Network Time Protocol and Precision Time Protocol keep wall clocks in sync, while monotonic time is a local clock that only moves forward and is ideal for measuring durations. Handling all three correctly is the foundation of reliable timestamps, timeouts, and ordering in a scalable architecture.
Why It Matters
Time shows up everywhere in a system design interview and in production systems. You set TTL on cache entries, calculate request timeouts, rotate keys, compute analytics windows, and decide who is leader based on a lease with an expiry. If clocks drift, a server may think a lease has expired when it has not. If the kernel steps the clock backward, a scheduler may believe a task completed in negative time. Using NTP or PTP for wall clocks and a monotonic clock for durations keeps your system safe from these surprises.
How It Works Step by Step
- Separate concepts of time: Wall clock time is calendar time used for human events, logs, and timestamps. Monotonic time is a steadily increasing clock that ignores manual clock changes and leap seconds. Use wall clock time for labels. Use monotonic time for durations and timeouts.
- Use a monotonic clock for every duration: Read a monotonic clock at start and stop to compute elapsed time. Never subtract two wall clock timestamps to find a duration. This single habit prevents negative durations during clock steps and leap second events.
- Choose NTP for general fleets and PTP for high precision zones: NTP is widely available and good to a few milliseconds on local networks. PTP uses hardware timestamping and special switches to reach microsecond level accuracy in data centers.
- Harden the time client: Use multiple time sources for redundancy. Prefer clients that support authenticated time, such as Network Time Security (NTS). Enable slew mode where possible so corrections are applied gradually instead of as a sudden step. For PTP, enable hardware timestamping on the network interface and use boundary or transparent clocks in switches.
- Publish a clock error budget: Decide a maximum tolerated offset for each tier, such as one millisecond for PTP zones and fifty milliseconds for general clusters. Expose metrics for offset, jitter, and root dispersion. Gate risky operations when the offset breaches the budget.
- Apply uncertainty to time based logic: When expiring items or validating leases, subtract an uncertainty margin from the expiry or wait out the uncertainty window before declaring success. This mirrors the idea of commit wait, which ensures the chosen timestamp is in the global past.
- Use logical clocks for consistent ordering: Avoid using wall clock time to order cross node events with strong guarantees. Use Lamport clocks, vector clocks, or hybrid logical clocks to achieve reliable ordering without perfect sync.
- Tame leap seconds with a consistent policy: Choose either time smear or an inserted second and apply it across the fleet. Document the behavior so developers and testers know what to expect.
- Make logs and tracing bilingual: Log a wall clock timestamp for humans and a monotonic based duration for machines. Tracing spans should report elapsed time from monotonic sources.
- Test with deliberate skew: Run chaos tests that skew clocks on some hosts both forward and backward. Verify that leases, timeouts, and retention logic behave correctly and that alerts fire when offsets exceed budget.
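The monotonic duration and deliberate skew steps above can be sketched together in a few lines of Python. This is only an illustration: `Deadline` and `FakeHost` are hypothetical names, and the fake host stands in for a real machine whose wall clock gets stepped backward during a test.

```python
import time
from typing import Callable


class Deadline:
    """Timeout tracked on a monotonic clock, so wall clock steps cannot move it."""

    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._expires_at = clock() + timeout_s

    def remaining(self) -> float:
        return max(0.0, self._expires_at - self._clock())

    def expired(self) -> bool:
        return self._clock() >= self._expires_at


class FakeHost:
    """Hypothetical test double: wall and monotonic clocks that can diverge."""

    def __init__(self) -> None:
        self.wall = 1_700_000_000.0  # arbitrary epoch seconds
        self.mono = 50.0             # arbitrary monotonic origin

    def advance(self, seconds: float) -> None:
        # Real time passes: both clocks move forward.
        self.wall += seconds
        self.mono += seconds

    def step_wall(self, seconds: float) -> None:
        # The time daemon steps the wall clock; the monotonic clock is unaffected.
        self.wall += seconds


host = FakeHost()
start_wall = host.wall
deadline = Deadline(5.0, clock=lambda: host.mono)

host.advance(2.0)      # two seconds of real time pass
host.step_wall(-10.0)  # then the wall clock is stepped back ten seconds

wall_elapsed = host.wall - start_wall  # negative: the bug to avoid
mono_remaining = deadline.remaining()  # still correct
expired_early = deadline.expired()

host.advance(3.0)
expired_on_time = deadline.expired()
```

In production code the default `time.monotonic` clock would be used; the injectable `clock` parameter exists only so that skew can be simulated in tests.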
Real World Example
Consider a global relational service that offers external consistency similar to Google Spanner. Each replica synchronizes with multiple time sources and exposes an uncertainty window. During a write the coordinator chooses a commit timestamp at the upper bound of that window, then performs a short commit wait so that once the client receives success, the timestamp is guaranteed to be in the past everywhere. Readers can safely use snapshot reads at any timestamp older than the lower bound of the current uncertainty window. Inside each node, timeouts and backoff use a monotonic clock so local scheduling remains correct even if the wall clock changes. In zones that require very tight windows, the same cluster enables PTP with hardware timestamping to keep uncertainty small and throughput high.
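A minimal sketch of the commit wait idea, assuming the sync layer can supply an error bound `epsilon_s`. The names `UncertainClock`, `commit_with_wait`, and `SimulatedTime` are hypothetical and not any real database API; the simulated time source only makes the demo deterministic.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TimeInterval:
    earliest: float  # true time is assumed to be no earlier than this
    latest: float    # and no later than this


class UncertainClock:
    """Hypothetical clock whose now() returns an interval instead of a point."""

    def __init__(self, epsilon_s: float, wall: Callable[[], float] = time.time) -> None:
        self._epsilon = epsilon_s  # assumed bound on clock error, from the sync layer
        self._wall = wall

    def now(self) -> TimeInterval:
        t = self._wall()
        return TimeInterval(t - self._epsilon, t + self._epsilon)


def commit_with_wait(clock: UncertainClock,
                     sleep: Callable[[float], None] = time.sleep) -> float:
    """Pick a commit timestamp, then wait until it is certainly in the past."""
    commit_ts = clock.now().latest
    while True:
        earliest = clock.now().earliest
        if earliest > commit_ts:
            return commit_ts
        # Sleep for the remaining gap, with a small floor to guarantee progress.
        sleep(max(commit_ts - earliest, 0.0005))


class SimulatedTime:
    """Hypothetical stand-in for the wall clock and sleep, used only for the demo."""

    def __init__(self, start: float) -> None:
        self.t = start

    def time(self) -> float:
        return self.t

    def sleep(self, seconds: float) -> None:
        self.t += seconds


sim = SimulatedTime(start=100.0)
clock = UncertainClock(epsilon_s=0.004, wall=sim.time)
commit_ts = commit_with_wait(clock, sleep=sim.sleep)
waited_s = sim.t - 100.0
```

The wait comes out at roughly twice the error bound, which is why tighter synchronization translates directly into lower write latency.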
Common Pitfalls or Trade-offs
- Using wall clock to compute durations which creates negative or absurd values when the clock is stepped
- Depending on timestamps for total order of events across nodes rather than logical or hybrid logical clocks
- Running a single time source which turns a routine outage into a fleet wide time jump
- Ignoring uncertainty in leases, which lets two leaders overlap during skew and causes split-brain
- Handling leap seconds differently across services which leads to misaligned windows and confusing logs
- Forcing PTP everywhere which adds cost and operational complexity where millisecond accuracy is more than enough
Interview Tip
A favorite prompt is to ask you to design a session store with a thirty minute expiry across several regions. A strong answer says sessions carry an issue timestamp from the authority, services apply a safety margin equal to measured uncertainty, and all timeouts and backoff use a monotonic clock. Mention that you monitor offset and refuse session writes if clock error exceeds a threshold until sync recovers.
Key Takeaways
- Treat wall clock and monotonic clock as different tools for different jobs
- Use NTP for general sync and PTP where you need microsecond level accuracy
- Add an explicit uncertainty margin to leases and time based invariants
- For cross node ordering use logical or hybrid logical clocks rather than raw timestamps
- Log wall clock for humans and measure durations with monotonic time for correctness
Table of Comparison
| Concept | What it delivers | Typical accuracy | Best fit | Main drawback |
|---|---|---|---|---|
| NTP | Network wide wall clock sync at low cost | Few milliseconds on local networks | General computing fleets and web services | Not ideal for ultra tight ordering guarantees |
| PTP | Hardware assisted wall clock sync | Microsecond level in data centers | Trading, storage, high rate telemetry | Requires special switches and setup |
| Monotonic time | Stable local clock for durations and timeouts | Not absolute time, only relative | Retries, backoff, circuit breakers, schedulers | Cannot label events for humans |
| Lamport or vector clocks | Order without relying on wall clock | Logical only | Conflict resolution in stores and queues | Less intuitive than timestamps |
| Hybrid logical clocks | Blend of physical time and counters | Close to physical with small uncertainty | Global ordering with good performance | More complex to implement correctly |
| TrueTime style uncertainty | Timestamp interval with commit wait | Bounded by sync quality | Externally consistent transactions | Write latency rises with uncertainty |
| Time smear for leap seconds | Smooths the extra second over a day | Invisible to most apps | Large fleets with mixed stacks | Needs consistent fleet wide policy |
FAQs
Q1. What is the practical difference between wall clock and monotonic time?
Wall clock aligns with calendars and is used for timestamps and reports. Monotonic time only moves forward and is used to measure durations and enforce timeouts. Never use wall clock to compute elapsed time.
Q2. When should I choose PTP instead of NTP?
Choose PTP when you need microsecond level accuracy, such as in trading, storage, or very low jitter telemetry. For most services NTP is sufficient and simpler to operate.
Q3. How do I make leases safe under clock skew?
Estimate the current uncertainty on each host and subtract a margin from lease duration, or perform a short wait before declaring a commit or a leadership claim valid. This prevents overlapping leaders.
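One way to picture the margin, as a sketch only: the lease holder tracks its own view of the lease on a monotonic clock, starts counting from the moment it sent the request, and stops acting a margin before the nominal expiry. `LocalLease` and its parameters are hypothetical names, and the margin value is an assumption that should come from measured uncertainty.

```python
import time
from typing import Callable


class LocalLease:
    """Holder-side view of a lease, shortened by a safety margin."""

    def __init__(self, duration_s: float, margin_s: float, requested_at: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if not 0 <= margin_s < duration_s:
            raise ValueError("margin must be non-negative and shorter than the lease")
        self._clock = clock
        # Count from when the request was sent, not when the grant arrived,
        # so the holder never believes in the lease longer than the grantor does.
        self._safe_until = requested_at + duration_s - margin_s

    def is_safe_to_act(self) -> bool:
        return self._clock() < self._safe_until


# Demo with a fake monotonic clock (a one-element list the lambda reads).
now = [0.0]
lease = LocalLease(duration_s=10.0, margin_s=2.0, requested_at=now[0],
                   clock=lambda: now[0])

now[0] = 7.9
safe_before_margin = lease.is_safe_to_act()

now[0] = 8.0
safe_inside_margin = lease.is_safe_to_act()
```

The holder gives up two seconds early here, which leaves room for clock rate differences between holder and grantor before any other node can be granted the lease.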
Q4. How do leap seconds affect applications?
If the kernel inserts a leap second, time may appear to pause or repeat, which can confuse timestamp based logic. A smear avoids a sudden jump by spreading the adjustment over a day. Pick one policy and apply it everywhere.
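The arithmetic behind a smear is small enough to sketch. The example assumes a linear smear over a 24 hour window, which is one common policy but not the only one; `smear_offset` is a hypothetical helper, not part of any time daemon.

```python
SMEAR_WINDOW_S = 86_400.0  # assumed smear window: one day


def smear_offset(seconds_into_window: float, leap_s: float = 1.0,
                 window_s: float = SMEAR_WINDOW_S) -> float:
    """Portion of the leap second already absorbed at this point in the window."""
    fraction = min(max(seconds_into_window / window_s, 0.0), 1.0)
    return leap_s * fraction


at_start = smear_offset(0.0)
at_midpoint = smear_offset(43_200.0)
at_end = smear_offset(86_400.0)
after_end = smear_offset(100_000.0)
```

With these assumptions the smeared clock runs slow by one part in 86,400, or about 11.6 parts per million, for the whole window. A host that steps and a host that smears can therefore disagree by up to a full second during that day, which is the reason for a single fleet wide policy.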
Q5. How do I detect time drift in production?
Collect metrics from NTP or PTP clients including offset, jitter, and root dispersion. Alert when offsets breach your error budget. During incidents, hold writes that depend on strict ordering.
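A budget check of this kind might look like the sketch below. How the offset is obtained from the NTP or PTP client is deliberately left out; `ClockSample`, `within_budget`, and `guard_ordered_write` are hypothetical names, and the budgets are the example values of one millisecond for PTP zones and fifty milliseconds for general clusters.

```python
from dataclasses import dataclass
from typing import Dict

# Example per-tier budgets in seconds (illustrative values, tune per system).
OFFSET_BUDGET_S: Dict[str, float] = {"ptp": 0.001, "general": 0.050}


@dataclass(frozen=True)
class ClockSample:
    tier: str
    offset_s: float  # signed offset reported by the time client


class ClockOutOfBudget(RuntimeError):
    """Raised when a write that depends on strict ordering must be held."""


def within_budget(sample: ClockSample) -> bool:
    return abs(sample.offset_s) <= OFFSET_BUDGET_S[sample.tier]


def guard_ordered_write(sample: ClockSample) -> None:
    """Refuse order-sensitive writes while the clock error exceeds the budget."""
    if not within_budget(sample):
        raise ClockOutOfBudget(
            f"offset {sample.offset_s * 1000:.3f} ms exceeds the {sample.tier} budget"
        )


healthy = ClockSample(tier="general", offset_s=-0.020)
drifting = ClockSample(tier="ptp", offset_s=0.002)

healthy_ok = within_budget(healthy)
drifting_ok = within_budget(drifting)

try:
    guard_ordered_write(drifting)
    write_blocked = False
except ClockOutOfBudget:
    write_blocked = True
```

The same boolean can drive an alert, so the signal that pages an operator is the one that holds the risky writes.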
Q6. Can I rely on timestamps for total ordering across nodes?
Only if you can prove very small bounded uncertainty. In most systems prefer logical or hybrid logical clocks for ordering, and reserve physical timestamps for labels and coarse windows.
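For a sense of how a hybrid logical clock keeps causal order despite skew, here is a compact sketch of the usual update rules. The class and method names are illustrative, timestamps are `(milliseconds, counter)` tuples compared lexicographically, and the two constant clocks in the demo simulate a receiver that is 100 ms behind the sender.

```python
import time
from typing import Callable, Tuple

Timestamp = Tuple[int, int]  # (logical wall time in ms, counter)


def _wall_ms() -> int:
    return time.time_ns() // 1_000_000


class HybridLogicalClock:
    def __init__(self, physical_ms: Callable[[], int] = _wall_ms) -> None:
        self._physical_ms = physical_ms
        self._l = 0  # highest physical time seen so far
        self._c = 0  # counter that breaks ties within the same _l

    def now(self) -> Timestamp:
        """Timestamp a local or send event."""
        pt = self._physical_ms()
        if pt > self._l:
            self._l, self._c = pt, 0
        else:
            self._c += 1
        return (self._l, self._c)

    def update(self, remote: Timestamp) -> Timestamp:
        """Timestamp a receive event, merging the sender's timestamp."""
        pt = self._physical_ms()
        r_l, r_c = remote
        new_l = max(self._l, r_l, pt)
        if new_l == self._l and new_l == r_l:
            new_c = max(self._c, r_c) + 1
        elif new_l == self._l:
            new_c = self._c + 1
        elif new_l == r_l:
            new_c = r_c + 1
        else:
            new_c = 0
        self._l, self._c = new_l, new_c
        return (new_l, new_c)


node_a = HybridLogicalClock(physical_ms=lambda: 1000)
node_b = HybridLogicalClock(physical_ms=lambda: 900)  # 100 ms behind node A

sent = node_a.now()             # A stamps a message
received = node_b.update(sent)  # B receives it with a slower clock
followup = node_b.now()         # B's next local event
```

Even though node B's physical clock reads earlier than node A's, the receive event and everything after it on B are stamped later than the send, which raw wall clock timestamps would not guarantee.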
Further Learning
- Build your timing instincts with Grokking System Design Fundamentals and master clocks, retries, and timeouts from first principles.
- Practice end to end designs that use leases, sessions, and commit wait in Grokking the System Design Interview.
- Dive deeper into distributed transactions and cross region consistency in Grokking Scalable Systems for Interviews with hands on case studies.