Explain Per-User vs Per-Token Rate Limits.

Per-user and per-token rate limits differ in where a request quota is enforced: across an entire user account, or separately for each access token.

When to Use

  • Per-user limits: Enforce one shared quota across all tokens from the same account. Useful for keeping usage fair and for stopping a user from multiplying their quota by creating extra API keys.
  • Per-token limits: Useful when each API key, client, or credential needs its own quota, often tied to a billing tier or a partner agreement.

Example

Suppose Alice owns two tokens.

With per-token limits, each token gets its own quota, so Alice's total allowed throughput is double the single-token limit. With per-user limits, both tokens draw from the same cap, so adding a token adds no capacity.
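The Alice example can be made concrete with a minimal in-memory sketch. Everything here is assumed for illustration: the token names, the `TOKEN_OWNER` mapping, the limit of 100 requests per window, and the `FixedWindowLimiter` class itself. Window expiry is left out so the counting logic stays visible.

```python
from collections import defaultdict

# Hypothetical token-to-owner mapping; a real system would resolve this
# from its authentication store.
TOKEN_OWNER = {"tok_a": "alice", "tok_b": "alice"}


class FixedWindowLimiter:
    """Counts requests per key inside a single window (reset logic omitted)."""

    def __init__(self, limit, scope):
        if scope not in ("user", "token"):
            raise ValueError("scope must be 'user' or 'token'")
        self.limit = limit
        self.scope = scope
        self.counts = defaultdict(int)

    def _key(self, token):
        # Per-user: all of an account's tokens collapse to one counter.
        # Per-token: every token keeps its own counter.
        return TOKEN_OWNER[token] if self.scope == "user" else token

    def allow(self, token):
        key = self._key(token)
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True


def accepted(scope, limit=100, attempts_per_token=150):
    """Total requests accepted when both of Alice's tokens send traffic."""
    limiter = FixedWindowLimiter(limit, scope)
    return sum(
        limiter.allow(token)
        for token in ("tok_a", "tok_b")
        for _ in range(attempts_per_token)
    )


print(accepted("token"))  # 200: each token gets its own 100
print(accepted("user"))   # 100: both tokens share one cap
```

The only difference between the two policies in this sketch is the key used for the counter, which is why confusing token identity with user identity changes the effective quota.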

Why Is It Important

Rate limiting protects system stability, prevents abuse, and helps keep resource allocation fair. The choice between per-user and per-token enforcement affects both scalability and user experience.

Interview Tips

  • Use concrete scenarios (one user with multiple tokens).
  • Mention implementation details (e.g., Redis counters, sliding windows).
  • Show awareness of fairness vs flexibility trade-offs.
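As one way to talk through the implementation details mentioned above, here is a rough sliding-window sketch. It keeps timestamps in an in-process dictionary as a stand-in for a shared store such as Redis; the `SlidingWindowLimiter` class, the injectable `clock` parameter, and the `user:` / `token:` key prefixes are assumptions made for illustration, not a reference design.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Sliding-window log: allow at most `limit` requests per `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self.hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key):
        now = self.clock()
        timestamps = self.hits[key]
        # Discard accepted requests that have aged out of the window.
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False
        timestamps.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2, window_seconds=60)
# The caller picks the policy by choosing the key:
#   per-user  -> "user:alice"  (shared by all of Alice's tokens)
#   per-token -> "token:tok_a" (one counter per credential)
print(limiter.allow("user:alice"))
print(limiter.allow("user:alice"))
print(limiter.allow("user:alice"))  # third call inside the window is rejected
```

In a multi-server deployment the per-key state would need to live in a shared store so that every server sees the same counts; this sketch covers only the single-process case.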

Trade-offs

  • Per-user: Fair usage per account, but usage must be aggregated across all of the account's tokens, which adds implementation complexity.
  • Per-token: Simpler to implement, but can be gamed by generating many tokens.

Pitfalls

  • Confusing token identity with user identity.
  • Forgetting to aggregate usage across tokens for per-user policies.
TAGS
System Design Interview
System Design Fundamentals
CONTRIBUTOR
Design Gurus Team
Copyright © 2025 Design Gurus, LLC. All rights reserved.