Explain SLO vs SLA vs SLI.
SLI is the measured reliability metric, SLO is the internal target for that metric, and SLA is the external contractual promise with penalties.
When to Use
Use SLIs to measure user-facing reliability (latency, error rate).
Define SLOs to set clear internal targets for those metrics and to balance reliability against release velocity. Apply SLAs when you need legally binding commitments with customers or partners.
Example
A streaming service might define:
- SLI = % of video starts under 2 seconds
- SLO = 99% of starts under 2 seconds over each 30-day window
- SLA = 99.9% monthly uptime, with service credits if breached (measured on availability, a different SLI from start latency)
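The start-latency SLI and SLO above can be sketched in a few lines of Python. This is a minimal illustration, not a monitoring implementation: the function names, the `SloReport` type, and the sample data are all hypothetical, and a real system would read these events from its metrics pipeline.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    sli: float     # measured fraction of good events in the window
    target: float  # SLO target, expressed as a fraction
    met: bool      # True when the SLI is at or above the target


def video_start_sli(start_times_s, threshold_s=2.0):
    """Return the fraction of video starts that finished under the threshold."""
    if not start_times_s:
        raise ValueError("no events in the measurement window")
    good = sum(1 for t in start_times_s if t < threshold_s)
    return good / len(start_times_s)


def check_slo(start_times_s, target=0.99, threshold_s=2.0):
    """Compare the measured SLI with the SLO target."""
    sli = video_start_sli(start_times_s, threshold_s)
    return SloReport(sli=sli, target=target, met=sli >= target)


# Hypothetical window: 1,000 video starts, 15 of them slower than 2 seconds.
samples = [0.8] * 985 + [3.5] * 15
report = check_slo(samples)
print(f"SLI = {report.sli:.1%}, SLO target = {report.target:.0%}, met = {report.met}")
# prints: SLI = 98.5%, SLO target = 99%, met = False
```

The SLI is only the measurement (98.5% here); the SLO is the comparison against the 99% target. An SLA would add a contractual consequence on top, which is outside what code like this decides.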
Why Is It Important
Clear definitions prevent ambiguity between engineering teams and business stakeholders, make error budgets possible (the amount of unreliability an SLO permits), and keep user experience and contractual commitments aligned.
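An error budget is simply the complement of the SLO target: a 99% target leaves 1% of events that may fail before the objective is missed. The arithmetic can be sketched as follows; the function names and the traffic figures are assumptions made up for illustration.

```python
def allowed_bad_events(total_events, slo_target):
    """Number of events that may miss the objective before the budget is spent."""
    return total_events * (1 - slo_target)


def allowed_downtime_minutes(slo_target, window_days=30):
    """Downtime that an availability target permits over the window."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - slo_target)


def budget_remaining(total_events, bad_events, slo_target):
    """Fraction of the error budget still unspent (negative when overspent)."""
    budget = allowed_bad_events(total_events, slo_target)
    if budget == 0:
        raise ValueError("a 100% target leaves no error budget")
    return 1 - bad_events / budget


# A 99.9% availability target over a 30-day window.
print(f"{allowed_downtime_minutes(0.999):.1f} minutes of downtime allowed")
# prints: 43.2 minutes of downtime allowed

# Hypothetical traffic: 1,000,000 video starts, 4,000 of them slow, 99% SLO.
print(f"{budget_remaining(1_000_000, 4_000, 0.99):.0%} of the budget remains")
# prints: 60% of the budget remains
```

While budget remains, a team can spend it on risky releases; once it is exhausted, the usual response is to slow feature work and prioritize reliability.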
Interview Tips
Start by defining each term clearly, then give a concrete example. Mention error budgets and explain how SLOs differ from SLAs. Show awareness of real-world trade-offs.
Trade-offs
- Strict targets: Higher cost, better trust.
- Relaxed targets: Faster iteration, but risks churn.
- Too many SLIs create noise; too few leave blind spots.
Pitfalls
- Confusing SLO with SLA
- Picking infra metrics instead of user-centric ones
- Ignoring long-tail latency
- Setting SLOs without error budgets
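The long-tail latency pitfall is easiest to see with numbers. The sketch below compares the mean with a high percentile on a made-up sample; the `percentile` helper uses the nearest-rank method, which is one of several common percentile definitions and is chosen here only for simplicity.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("no values")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical sample: 97 fast video starts and 3 very slow ones (seconds).
latencies = [0.5] * 97 + [9.0] * 3

print(f"mean = {mean(latencies):.3f}s")          # prints: mean = 0.755s
print(f"p50  = {percentile(latencies, 50)}s")    # prints: p50  = 0.5s
print(f"p99  = {percentile(latencies, 99)}s")    # prints: p99  = 9.0s
```

The mean sits comfortably under a 2-second threshold, yet 3% of users wait 9 seconds. An SLI defined on a percentile, or on the fraction of requests under a threshold, exposes that tail where an average hides it.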