Designing a URL shortening service with high throughput

Question

Design Gurus · Accepted Answer

Designing a URL shortener is the most frequently asked system design interview question at FAANG companies, and for good reason—it compresses nearly every foundational concept into a single problem: key generation algorithms, database selection, caching, read-heavy optimization, sharding, and collision handling. The system takes a long URL as input, generates a unique short alias (like tiny.url/u7yX9), and redirects users to the original URL when they access the short link. At production scale—100M daily active users generating 1M new URLs per day with a 100:1 read-to-write ratio—the URL shortener becomes a genuinely challenging distributed systems problem that tests your ability to make and defend architectural trade-offs under time pressure.

Key Takeaways

The URL shortener is a read-heavy system with a 100:1 read-to-write ratio. This asymmetry drives every major design decision: database selection, caching strategy, and scaling approach.  
Key generation is the core technical challenge. Base62 encoding of auto-incrementing IDs is simple but predictable. MD5/SHA-256 hashing avoids predictability but requires collision handling. Pre-generated key ranges handed out by a key generation service eliminate collisions but add coordination complexity.  
A NoSQL key-value store (DynamoDB, Cassandra) is the standard database choice. The data model is a simple short_code → long_url mapping with no relationships, making relational features unnecessary overhead.  
A Redis cache with 90–95% hit ratio is essential. With 100M redirects per day, the cache absorbs 90–95M reads, reducing database load to 5–10M—a 10–20x reduction.  
Interviewers evaluate trade-off reasoning, not a perfect answer. Why you chose 302 over 301, why NoSQL over SQL, and why Base62 over MD5 matter more than the choices themselves.

Step 1: Requirements and Scope

Functional requirements:

Shorten: Given a long URL, generate a unique, compact short alias using alphanumeric characters (a-z, A-Z, 0-9).  
Redirect: Given a short URL, redirect the user to the original long URL with minimal latency.  
Custom aliases: Optionally allow users to specify a custom short code (e.g., tiny.url/my-resume).  
Expiration: URLs expire after a configurable TTL (default: 5 years).  
Analytics (optional): Track click counts per shortened URL.

Non-functional requirements:

Low latency: Redirects typically complete in under 20 ms, with p99 under 200 ms.  
High availability: 99.99% uptime, because every shortened URL breaks while the service is down.  
Scalability: Handle 100M DAU and 1B+ stored URLs.  
Uniqueness: Short codes must be collision-free.  
Unpredictability: Short codes should not be easily guessable (security consideration).

Interview tip: Always clarify these requirements with your interviewer before designing. Asking "Is analytics a requirement?" or "Should short codes be unpredictable?" demonstrates structured thinking. Skipping requirements is the fastest way to design the wrong system.

Step 2: Back-of-Envelope Estimation

This step justifies every subsequent architectural decision.

Write traffic: 1M new URLs per day = ~12 writes per second (average). At 3x peak: ~36 writes per second. This is trivially low—almost any database can handle it.

Read traffic: 100:1 read-to-write ratio = 100M redirects per day = ~1,160 reads per second (average). At 3x peak: ~3,500 reads per second. This is moderate but becomes challenging with database latency requirements.

Storage: Each URL record is approximately 500 bytes (7-byte short code + 100-byte long URL + metadata). 1M new URLs/day × 365 days × 5 years = 1.825B URLs. 1.825B × 500 bytes = ~912 GB ≈ 1 TB total storage. That fits on a single database node, so sharding is not forced by data size; replication is what provides fault tolerance, and sharding mainly adds headroom and limits the impact of a single node failure.

Short code space: Using 7 characters from a Base62 alphabet (a-z, A-Z, 0-9): 62^7 = 3.5 trillion unique codes. At 1M new URLs per day, this lasts over 9,000 years—no exhaustion risk.

Bandwidth: 100M redirects × 500 bytes per redirect response ≈ 50 GB/day outbound = ~580 KB/s. Negligible.

Interview application: "Based on the estimation, our system is read-dominated at 100:1 ratio. Write throughput is trivially low at 36 QPS peak. The design challenge is optimizing read latency and availability, not write throughput. I would focus the architecture on caching and read replicas."
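The arithmetic in this step can be reproduced with a short script. This is a minimal sketch that uses only the figures stated above (1M new URLs per day, 100:1 ratio, 3x peak, 500-byte records, 5-year retention, 7-character Base62 codes); the constant and function names are illustrative.

```python
# Back-of-envelope figures from Step 2; every constant comes from the text above.
SECONDS_PER_DAY = 24 * 60 * 60

NEW_URLS_PER_DAY = 1_000_000
READ_WRITE_RATIO = 100
PEAK_FACTOR = 3
RECORD_BYTES = 500
RETENTION_YEARS = 5
CODE_LENGTH = 7
ALPHABET_SIZE = 62


def estimate() -> dict:
    """Return the traffic, storage, key-space, and bandwidth estimates."""
    writes_per_sec = NEW_URLS_PER_DAY / SECONDS_PER_DAY
    reads_per_day = NEW_URLS_PER_DAY * READ_WRITE_RATIO
    reads_per_sec = reads_per_day / SECONDS_PER_DAY
    stored_urls = NEW_URLS_PER_DAY * 365 * RETENTION_YEARS
    code_space = ALPHABET_SIZE ** CODE_LENGTH
    return {
        "avg_writes_per_sec": writes_per_sec,
        "peak_writes_per_sec": writes_per_sec * PEAK_FACTOR,
        "avg_reads_per_sec": reads_per_sec,
        "peak_reads_per_sec": reads_per_sec * PEAK_FACTOR,
        "stored_urls": stored_urls,
        "storage_gb": stored_urls * RECORD_BYTES / 1e9,
        "code_space": code_space,
        "years_to_exhaust": code_space / NEW_URLS_PER_DAY / 365,
        "egress_gb_per_day": reads_per_day * RECORD_BYTES / 1e9,
        "egress_kb_per_sec": reads_per_day * RECORD_BYTES / SECONDS_PER_DAY / 1e3,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name:>22}: {value:,.1f}")
```

Changing a single constant (for example the peak factor or the record size) shows quickly which conclusions are sensitive to the assumptions and which are not.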

Step 3: API Design

Endpoint	Method	Input	Output	Notes
`/api/urls`	POST	`{ long_url, custom_alias?, expires_at? }`	`{ short_url }`	Creates a shortened URL
`/{short_code}`	GET	Short code in URL path	302 redirect to long URL	The core redirect operation
`/api/urls/{short_code}`	DELETE	Short code + API key	`{ success: true }`	Deletes a shortened URL
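The create and delete endpoints in the table can be sketched as plain functions over an in-memory dictionary. This is an illustration under assumptions, not a prescribed implementation: the class `UrlService`, the `AliasTakenError` exception, the placeholder code generator, and passing an API key to `create` (so that `delete` can check ownership) are all hypothetical names and choices that do not come from the table itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from itertools import count
from typing import Callable, Optional

BASE_URL = "https://tiny.url"          # domain borrowed from the article's examples
DEFAULT_TTL = timedelta(days=5 * 365)  # default expiry from Step 1


@dataclass
class UrlRecord:
    long_url: str
    owner_api_key: str
    expires_at: datetime


class AliasTakenError(Exception):
    """Raised when the requested short code already exists."""


class UrlService:
    def __init__(self, next_code: Optional[Callable[[], str]] = None):
        counter = count(1)
        # Placeholder generator only; real key generation is the topic of Step 4.
        self._next_code = next_code or (lambda: f"{next(counter):07d}")
        self._records: dict[str, UrlRecord] = {}

    def create(self, api_key: str, long_url: str,
               custom_alias: Optional[str] = None,
               expires_at: Optional[datetime] = None) -> dict:
        """POST /api/urls"""
        if not long_url.startswith(("http://", "https://")):
            raise ValueError("long_url must be an absolute http(s) URL")
        code = custom_alias or self._next_code()
        if code in self._records:
            raise AliasTakenError(code)
        expiry = expires_at or datetime.now(timezone.utc) + DEFAULT_TTL
        self._records[code] = UrlRecord(long_url, api_key, expiry)
        return {"short_url": f"{BASE_URL}/{code}"}

    def delete(self, api_key: str, short_code: str) -> dict:
        """DELETE /api/urls/{short_code}; only the creating key may delete."""
        record = self._records.get(short_code)
        if record is None or record.owner_api_key != api_key:
            return {"success": False}
        del self._records[short_code]
        return {"success": True}
```

A real service would replace the dictionary with the database, and the existence check followed by the write would need to be a single atomic conditional write.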

301 vs 302 redirect: This is a classic interview trade-off question.

301 (Permanent Redirect): The browser caches the mapping. Subsequent requests go directly to the long URL without hitting your server. Reduces server load but loses analytics visibility—you cannot track repeat clicks.

302 (Temporary Redirect): The browser always hits your server first. Every click is tracked. Higher server load but full analytics and the ability to update or expire links.

Decision: Use 302 if analytics or link expiration is a requirement (typical case). Use 301 only if minimizing server load is the absolute priority. "I would use 302 because our requirements include analytics and link expiration. If a link expires, a 301-cached browser would never learn about the expiration."
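The redirect decision can be sketched as a small pure function that returns a status code and headers. The function name `redirect_response`, the `track_clicks` flag, the 404 and 410 responses for unknown and expired codes, and the `Cache-Control` values are assumptions added for illustration; the article itself only specifies the choice between 301 and 302.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple


def redirect_response(long_url: Optional[str],
                      expires_at: Optional[datetime],
                      now: datetime,
                      track_clicks: bool = True) -> Tuple[int, dict]:
    """Return (status_code, headers) for GET /{short_code}."""
    if long_url is None:
        return 404, {}  # unknown short code
    if expires_at is not None and now >= expires_at:
        return 410, {}  # the link existed but has expired
    if track_clicks:
        # 302: every click comes back to the service, so it can be counted
        # and an expired or updated link takes effect immediately.
        return 302, {"Location": long_url, "Cache-Control": "no-store"}
    # 301: browsers may cache the mapping and skip the service on repeat visits.
    return 301, {"Location": long_url, "Cache-Control": "public, max-age=31536000"}
```

Keeping the function free of I/O makes the trade-off easy to demonstrate: the only difference between the two modes is whether the client is allowed to remember the mapping.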

Step 4: Key Generation — The Core Deep Dive

Key generation is where interviewers spend the most time.

Three approaches exist, each with distinct trade-offs.

Approach 1: Base62 Encoding of Auto-Incrementing IDs

Generate a unique integer ID (auto-increment counter or distributed ID generator like Twitter Snowflake), then encode it in Base62 (a-z, A-Z, 0-9) to produce a compact string.

Example: with the alphabet ordered a-z, A-Z, 0-9, ID 125,462,371 → Base62 → iEAEN
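The encoding step can be sketched as repeated division by 62. This sketch assumes the alphabet order given in the text (a-z, then A-Z, then 0-9); the function names are illustrative. With that order, 125,462,371 encodes to `iEAEN`; a different alphabet order, or an obfuscation step applied before encoding, produces a different string for the same ID.

```python
import string

# Alphabet order follows the article: a-z, A-Z, 0-9.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits
BASE = len(ALPHABET)  # 62


def base62_encode(number: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))  # most significant digit first


def base62_decode(code: str) -> int:
    """Invert base62_encode."""
    number = 0
    for char in code:
        number = number * BASE + ALPHABET.index(char)
    return number
```

Because the mapping is a bijection between integers and strings, uniqueness of the ID carries over to the short code with no collision check.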

Pros: Zero collisions (IDs are unique by definition). Simple implementation. Short codes are compact.

Cons: Predictable—an attacker can enumerate all URLs by incrementing the ID. Centralized counter becomes a bottleneck unless distributed (Snowflake adds complexity).

Mitigation: Use a distributed ID generator (Snowflake) that combines timestamp + worker_id + sequence to generate unique IDs across multiple servers without coordination. Apply a shuffle or XOR operation before Base62 encoding to add unpredictability.
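The timestamp + worker_id + sequence composition can be sketched as follows. The bit layout (41-bit millisecond timestamp, 10-bit worker ID, 12-bit sequence) follows the commonly described Snowflake format; the class name, the custom epoch, and the XOR mask are hypothetical values chosen for illustration.

```python
import threading
import time

# Layout: 41-bit timestamp (ms since a custom epoch) | 10-bit worker | 12-bit sequence
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1
CUSTOM_EPOCH_MS = 1_700_000_000_000  # hypothetical service epoch
XOR_MASK = 0x5DEECE66D               # hypothetical mask; would be kept private


class SnowflakeLikeGenerator:
    def __init__(self, worker_id: int, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # 4096 IDs already issued this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (((now - CUSTOM_EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                    | (self._worker_id << SEQUENCE_BITS)
                    | self._sequence)


def obfuscate(id_value: int) -> int:
    """XOR with a fixed mask; applying it twice restores the ID, so uniqueness is kept."""
    return id_value ^ XOR_MASK
```

Two caveats apply to this sketch. XOR with a constant only hides the counting pattern from casual inspection; anyone who sees a few consecutive codes can still infer neighbours, so it is not a substitute for a keyed permutation if unpredictability is a hard requirement. Also, an ID of this size can need up to 11 Base62 characters, so it does not fit the 7-character budget from Step 2 without a smaller layout.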

Approach 2: Hashing (MD5/SHA-256) with Truncation

Hash the long URL using MD5 or SHA-256, then take the first 7 characters (after Base62 encoding) as the short code.

Pros: Unpredictable. No centralized counter. Same long URL always produces the same hash (deduplication).

Cons: Collisions are possible—two different long URLs could produce the same 7-character prefix. Requires a collision resolution strategy (append a sequence number, rehash with salt, or check-and-retry).

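Mitigation: On collision, append an incrementing counter to the input and rehash: hash(long_url + "1"), hash(long_url + "2"), until a unique code is found. With 3.5 trillion possible codes and 1.8B stored URLs, any single insert collides with probability of at most about 1.8B / 3.5T ≈ 0.05%, so retries are rare. Across the whole dataset, however, the birthday effect makes some collisions a near certainty (roughly n²/2N ≈ 470,000 over 1.8B inserts), so the retry path is mandatory rather than a theoretical safeguard.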
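The hash, truncate, and retry loop described above can be sketched like this. It assumes SHA-256, the a-z, A-Z, 0-9 alphabet, and a plain dictionary standing in for the database; `hash_to_code` and `shorten` are hypothetical names.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits
CODE_LENGTH = 7
FULL_LENGTH = 43  # a 256-bit digest needs at most 43 Base62 digits


def hash_to_code(text: str) -> str:
    """Base62-encode the SHA-256 digest of text and keep the first 7 characters."""
    value = int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")
    chars = []
    while value:
        value, remainder = divmod(value, 62)
        chars.append(ALPHABET[remainder])
    encoded = "".join(reversed(chars)).rjust(FULL_LENGTH, ALPHABET[0])
    return encoded[:CODE_LENGTH]


def shorten(long_url: str, store: dict[str, str], max_attempts: int = 10) -> str:
    """Return a short code for long_url, rehashing with a counter on collision."""
    for attempt in range(max_attempts):
        # Attempt 0 hashes the URL itself; later attempts append "1", "2", ...
        candidate = long_url if attempt == 0 else f"{long_url}{attempt}"
        code = hash_to_code(candidate)
        existing = store.get(code)
        if existing is None:
            store[code] = long_url  # in a real store: one atomic conditional write
            return code
        if existing == long_url:
            return code  # same URL already stored: deduplicate
    raise RuntimeError("could not find a free short code")
```

The deduplication branch is what distinguishes a true collision (same code, different URL) from a repeat submission of the same URL.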

Approach	Collisions	Predictability	Throughput	Complexity
Base62 + Auto-Increment	None	High (enumerable)	High (local counter)	Low
MD5/SHA-256 Truncation	Possible (rare)	Low	Medium (hash computation)	Medium
Pre-Generated Key Ranges	None	Low	Very high (batch allocation)	Medium

