Custom system design for scalable e-commerce platforms
A scalable e-commerce platform handles the full transaction lifecycle (product discovery, search, cart management, inventory reservation, payment processing, and order fulfillment) while maintaining sub-second response times during traffic spikes that can reach 10–100x normal volume during flash sales and events like Black Friday. Global e-commerce sales are projected to reach $3.8 trillion in 2026, with Black Friday 2025 generating $11.8 billion in US online sales alone and mobile driving 56% of transactions. In system design interviews, "Design Amazon" or "Design an e-commerce platform" tests the broadest range of skills: read-heavy product catalogs, write-heavy order processing, real-time inventory tracking, payment correctness, search relevance, and the ability to handle extreme traffic variability. This is one of the most comprehensive interview questions because every subsystem (catalog, cart, inventory, checkout, search) could be its own 45-minute design problem.
Key Takeaways
- E-commerce is fundamentally a read-heavy system with critical write correctness. Users browse 100x more than they buy, but when they buy, the inventory deduction and payment charge must be exactly correct. This asymmetry drives every architectural decision.
- Six core services define the platform: product catalog, search, shopping cart, inventory, order/checkout, and payment. Each has different scaling characteristics, consistency requirements, and database needs.
- Inventory management is the hardest sub-problem. Overselling (selling more than available stock) causes order cancellations, refund costs, and trust damage. Inventory must be reserved atomically during checkout, not during cart addition.
- Flash sales create 10–100x traffic spikes in minutes. The architecture must handle this through CDN caching for product pages, pre-warmed auto-scaling, queue-based order processing, and rate limiting on checkout so that the inventory and payment services are never pushed past the load they can process correctly.
- Use different databases for different services. Product catalog in DynamoDB (flexible schema for variable attributes). Orders and payments in PostgreSQL (ACID transactions). Shopping carts in Redis (sub-ms reads, automatic expiration). Search in Elasticsearch (full-text, faceted).
Step 1: Requirements and Estimation
Functional requirements: Product catalog with browsing, filtering, and search. Shopping cart with add, remove, and update. Checkout with inventory reservation and payment processing. Order tracking and history. Seller management for product listings. Personalized recommendations.
Non-functional requirements: 100M DAU. 99.99% availability. Product page latency under 200ms. Checkout latency under 2 seconds. Zero overselling—inventory accuracy is a hard constraint.
Estimation: 100M DAU × 30 page views/day = 3B page views/day = ~35,000 QPS average, ~100,000 QPS peak (roughly 3x the average). 100M DAU × 1% conversion rate = 1M orders/day = ~12 orders/second average. 10M products in the catalog × 5 KB average = 50 GB catalog storage. Flash sales: 10x the normal peak of ~100,000 QPS for 30–60 minutes = ~1M QPS peak for product pages.
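The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the variable names are illustrative, the ~100,000 QPS normal peak is taken from the text rather than derived, and storage uses decimal units (1 GB = 1,000,000 KB).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(events_per_day: float) -> float:
    """Convert a daily event count into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


DAU = 100_000_000
page_views_per_day = DAU * 30            # 3B page views/day
orders_per_day = DAU // 100              # 1% conversion = 1M orders/day
catalog_gb = 10_000_000 * 5 / 1_000_000  # 10M products x 5 KB, expressed in GB

avg_page_qps = per_second(page_views_per_day)      # about 34,700, rounded to ~35,000
avg_orders_per_second = per_second(orders_per_day)  # about 11.6, rounded to ~12
normal_peak_qps = 100_000                           # peak figure taken from the text
peak_to_average = normal_peak_qps / avg_page_qps    # a little under 3x
flash_sale_peak_qps = 10 * normal_peak_qps          # 1,000,000

print(f"average page QPS:    {avg_page_qps:,.0f}")
print(f"average orders/sec:  {avg_orders_per_second:.1f}")
print(f"catalog storage:     {catalog_gb:.0f} GB")
print(f"flash sale peak QPS: {flash_sale_peak_qps:,}")
```

Note the gap between the two workloads: the read path is sized in tens of thousands of requests per second, while the order path averages about a dozen writes per second. That gap is why the read path gets caching and the write path gets transactions.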
Step 2: Core Service Architecture
Product Catalog Service
Responsibility: Store and serve product data—title, description, images, pricing, variants, seller information, and reviews.
Database: DynamoDB with a single-table design. Products have variable attributes (clothing has size/color; electronics have specs; books have ISBN/author). DynamoDB's flexible schema handles this naturally. Partition key: product_id. GSI on category for browsing.
Caching: CloudFront CDN caches product pages for 60 seconds at the edge—eliminating origin load for the 95% of traffic that is repeat views of popular products. Redis caches hot product data (top 5% of products generate 80% of views) with a 5-minute TTL.
Interview application: "The product catalog is read-heavy at 35,000+ QPS. I would use DynamoDB for flexible product schemas and a two-tier cache: CloudFront at the edge for static product pages and Redis for dynamic product data. With a 95% combined cache hit ratio, DynamoDB sees only ~1,750 QPS—well within a single table's capacity."
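The Redis tier described above follows the cache-aside pattern: read from the cache, fall back to the database on a miss, then populate the cache with a TTL. The sketch below models only that tier, using an in-memory dictionary as a stand-in for Redis and a fake loader as a stand-in for DynamoDB. `TTLCache`, `get_product`, and `load_from_db` are hypothetical names, and the CDN layer in front is not shown.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a Redis cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # lazily evict the expired entry
            return None
        return value

    def set(self, key: str, value: dict) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def get_product(product_id: str, cache: TTLCache,
                load_from_db: Callable[[str], dict]) -> dict:
    """Cache-aside read: serve from cache, else load and populate."""
    cached = cache.get(product_id)
    if cached is not None:
        return cached
    product = load_from_db(product_id)
    cache.set(product_id, product)
    return product


# Demo with a controllable clock and a fake database loader.
now = [0.0]
db_reads = []


def load_from_db(product_id: str) -> dict:
    db_reads.append(product_id)
    return {"product_id": product_id, "title": "Example laptop"}


cache = TTLCache(ttl_seconds=300, clock=lambda: now[0])  # 5-minute TTL
get_product("p-1", cache, load_from_db)  # miss: reads the database
get_product("p-1", cache, load_from_db)  # hit: served from the cache
now[0] = 301.0
get_product("p-1", cache, load_from_db)  # TTL elapsed: reads the database again
```

Three reads produce two database calls here. Every cache hit is a request the database never sees, which is the effect behind the ~1,750 QPS figure quoted above.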
Search Service
Responsibility: Full-text search with filtering (category, price range, brand, rating), sorting (relevance, price, newest), and autocomplete.
Database: Elasticsearch for full-text search with inverted indexes. Products are indexed with all searchable fields. Faceted search enables "Electronics → Laptops → 13-inch → Under $1,000" drill-down filtering.
Data sync: Product catalog changes propagate to Elasticsearch via a CDC (Change Data Capture) pipeline from DynamoDB Streams. New products become searchable within 1–5 seconds of being added to the catalog. Seller price updates, stock status changes, and new reviews are reflected in search results with near-real-time freshness, so users rarely see out-of-stock products or outdated prices in search results. Because the index can lag the catalog by a few seconds, checkout still validates price and stock against the source of truth.
Shopping Cart Service
Responsibility: Manage per-user cart state—add items, remove items, update quantities, calculate totals. Carts are ephemeral (abandoned carts expire after 7–30 days).
Database: Redis for active carts (sub-millisecond read/write, TTL-based expiration). Each cart is a Redis hash keyed by user_id, with fields for each product_id mapping to quantity. Redis handles millions of concurrent carts with minimal infrastructure.
Why Redis over a traditional database: Cart operations happen on every user interaction—adding, removing, updating quantities. At 100M DAU with active cart manipulation, this generates millions of writes per day. Redis handles this at sub-millisecond latency with no persistence overhead for ephemeral data. If Redis restarts, cart data can be reconstructed from the last known state or the user simply re-adds items—acceptable for a non-financial data type.
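The cart-as-hash layout can be sketched without a running Redis server. The class below is an in-memory stand-in, not a Redis client. Each method names the Redis command it mimics (`HINCRBY`, `HDEL`, `HGETALL`, `EXPIRE`). The `cart:<user_id>` key format and the 7-day TTL (the low end of the 7–30 day range above) are assumptions for illustration.

```python
import time
from typing import Callable, Dict

CART_TTL_SECONDS = 7 * 24 * 3600  # assumed: low end of the 7-30 day range


class InMemoryCartStore:
    """Stand-in for Redis hashes: one hash per user, product_id -> quantity."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._carts: Dict[str, Dict[str, int]] = {}
        self._expires_at: Dict[str, float] = {}

    def _expire_if_stale(self, user_id: str) -> None:
        expires_at = self._expires_at.get(user_id)
        if expires_at is not None and self._clock() >= expires_at:
            self._carts.pop(user_id, None)
            self._expires_at.pop(user_id, None)

    def add_item(self, user_id: str, product_id: str, quantity: int = 1) -> int:
        # Redis: HINCRBY cart:<user_id> <product_id> <quantity>
        #        EXPIRE  cart:<user_id> <ttl>   (each write refreshes the TTL)
        self._expire_if_stale(user_id)
        cart = self._carts.setdefault(user_id, {})
        cart[product_id] = cart.get(product_id, 0) + quantity
        self._expires_at[user_id] = self._clock() + CART_TTL_SECONDS
        return cart[product_id]

    def remove_item(self, user_id: str, product_id: str) -> None:
        # Redis: HDEL cart:<user_id> <product_id>
        self._expire_if_stale(user_id)
        self._carts.get(user_id, {}).pop(product_id, None)

    def get_cart(self, user_id: str) -> Dict[str, int]:
        # Redis: HGETALL cart:<user_id>
        self._expire_if_stale(user_id)
        return dict(self._carts.get(user_id, {}))


# Demo with a controllable clock.
now = [0.0]
store = InMemoryCartStore(clock=lambda: now[0])
store.add_item("u1", "sku-1")
store.add_item("u1", "sku-1", 2)
store.add_item("u1", "sku-2")
store.remove_item("u1", "sku-2")
cart_before_expiry = store.get_cart("u1")

now[0] = float(CART_TTL_SECONDS)  # jump forward to the expiry time
cart_after_expiry = store.get_cart("u1")
```

Refreshing the TTL on every write means only carts that go untouched for the full window expire, which matches the idea of abandoned carts aging out.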
Inventory Service
Responsibility: Track available stock per product-variant (SKU). Reserve inventory during checkout. Release reservations on payment failure or timeout.
Database: PostgreSQL with row-level locking for inventory updates. Inventory is the one service where strong consistency and ACID transactions are non-negotiable. A race condition that allows two users to reserve the last item results in overselling.
The reservation pattern: When a user enters checkout, the inventory service atomically decrements available stock and creates a reservation record with a 10-minute TTL. If payment succeeds, the reservation converts to a confirmed deduction. If payment fails or times out, a background job releases the reservation and restores the stock.
Interview application: "Inventory uses PostgreSQL with SELECT FOR UPDATE to lock the SKU row during checkout. This prevents two concurrent checkouts from reserving the same last item. The reservation has a 10-minute TTL—if payment does not complete within 10 minutes, the reserved stock is automatically released. This prevents dead inventory from abandoned checkouts."
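The reservation pattern can be sketched end to end with Python's built-in `sqlite3` module. Two substitutions are made so the example runs anywhere, and both are assumptions rather than the design above: SQLite stands in for PostgreSQL, and a single conditional `UPDATE ... WHERE available >= ?` stands in for `SELECT FOR UPDATE`. It gives the same guarantee in this sketch (stock never goes negative) without holding an explicit row lock. Table, column, and function names are hypothetical.

```python
import sqlite3
from typing import Optional

RESERVATION_TTL_SECONDS = 600  # the 10-minute TTL from the text


def create_schema(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE inventory (
            sku TEXT PRIMARY KEY,
            available INTEGER NOT NULL CHECK (available >= 0)
        );
        CREATE TABLE reservations (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            sku TEXT NOT NULL,
            quantity INTEGER NOT NULL,
            expires_at REAL NOT NULL,
            status TEXT NOT NULL
        );
    """)


def reserve(conn: sqlite3.Connection, sku: str, quantity: int,
            now: float) -> Optional[int]:
    """Atomically decrement stock and record a hold; None if out of stock."""
    with conn:  # one transaction: commit on success, roll back on error
        updated = conn.execute(
            "UPDATE inventory SET available = available - ? "
            "WHERE sku = ? AND available >= ?",
            (quantity, sku, quantity),
        )
        if updated.rowcount == 0:
            return None  # not enough stock; nothing was changed
        inserted = conn.execute(
            "INSERT INTO reservations (sku, quantity, expires_at, status) "
            "VALUES (?, ?, ?, 'held')",
            (sku, quantity, now + RESERVATION_TTL_SECONDS),
        )
        return inserted.lastrowid


def release_expired(conn: sqlite3.Connection, now: float) -> int:
    """Background job: return stock from holds whose TTL has elapsed."""
    with conn:
        expired = conn.execute(
            "SELECT id, sku, quantity FROM reservations "
            "WHERE status = 'held' AND expires_at <= ?",
            (now,),
        ).fetchall()
        for reservation_id, sku, quantity in expired:
            conn.execute(
                "UPDATE inventory SET available = available + ? WHERE sku = ?",
                (quantity, sku),
            )
            conn.execute(
                "UPDATE reservations SET status = 'released' WHERE id = ?",
                (reservation_id,),
            )
        return len(expired)


# Demo: two checkouts compete for the last unit.
conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO inventory (sku, available) VALUES ('sku-1', 1)")
conn.commit()

first = reserve(conn, "sku-1", 1, now=0.0)    # gets the last unit
second = reserve(conn, "sku-1", 1, now=1.0)   # rejected: no stock left
released = release_expired(conn, now=600.0)   # the first hold times out
available = conn.execute(
    "SELECT available FROM inventory WHERE sku = 'sku-1'"
).fetchone()[0]
```

In PostgreSQL the same flow could use `SELECT ... FOR UPDATE` inside the transaction, as the interview answer describes. The conditional update is shown here because it expresses the check and the decrement in one statement.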
Order and Checkout Service
Responsibility: Orchestrate the checkout flow: validate cart → reserve inventory → process payment → create order → confirm to user.
Database: PostgreSQL for order records. Orders are financial records requiring ACID transactions, audit trails, and complex queries (order history with filtering, seller dashboards).
Checkout flow as a saga: The checkout spans multiple services (inventory, payment, order). A choreography-based saga coordinates the flow through events rather than a central coordinator: Checkout Service requests a reservation → Inventory Service reserves stock and publishes InventoryReserved → Payment Service charges the card and publishes PaymentSucceeded → Order Service creates the order record and publishes OrderCreated → Notification Service sends confirmation.
If payment fails: Payment Service publishes PaymentFailed → Inventory Service consumes the event and releases the reservation. The user sees "Payment failed, please try again." No inventory was permanently deducted.
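The event flow, including the compensating step on payment failure, can be sketched with a tiny in-process event bus. This is an illustration under assumptions, not a production design. A synchronous bus stands in for a real message broker, `CheckoutRequested` is a hypothetical trigger event not named above, and each handler function stands in for a separate service.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Handler = Callable[[dict], None]


class EventBus:
    """Synchronous stand-in for a message broker; records published events."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)
        self.log: List[str] = []

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        self.log.append(event_type)
        for handler in self._handlers[event_type]:
            handler(payload)


def run_checkout(card_is_valid: bool) -> Tuple[List[str], Dict[str, int], List[str]]:
    bus = EventBus()
    stock = {"sku-1": 1}
    orders: List[str] = []

    def reserve_inventory(event: dict) -> None:       # Inventory Service
        if stock[event["sku"]] > 0:
            stock[event["sku"]] -= 1
            bus.publish("InventoryReserved", event)

    def charge_card(event: dict) -> None:             # Payment Service
        outcome = "PaymentSucceeded" if card_is_valid else "PaymentFailed"
        bus.publish(outcome, event)

    def release_inventory(event: dict) -> None:       # compensating action
        stock[event["sku"]] += 1

    def create_order(event: dict) -> None:            # Order Service
        orders.append(event["order_id"])
        bus.publish("OrderCreated", event)

    bus.subscribe("CheckoutRequested", reserve_inventory)
    bus.subscribe("InventoryReserved", charge_card)
    bus.subscribe("PaymentFailed", release_inventory)
    bus.subscribe("PaymentSucceeded", create_order)

    bus.publish("CheckoutRequested", {"order_id": "o-1", "sku": "sku-1"})
    return bus.log, stock, orders


ok_log, ok_stock, ok_orders = run_checkout(card_is_valid=True)
failed_log, failed_stock, failed_orders = run_checkout(card_is_valid=False)
```

No handler calls another service directly. Each one reacts to an event and publishes the next, which is what distinguishes choreography from an orchestrated saga with a central coordinator.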
Order states: Each order follows a state machine: created → payment_pending → payment_confirmed → fulfillment_processing → shipped → delivered (or cancelled/refunded at various points). Explicit state tracking enables the system to recover from failures at any point—an order stuck in "payment_pending" after a timeout triggers a status check against the PSP rather than silently dropping the order or double-charging the customer.
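One way to make the state machine explicit is a transition table that rejects any move not listed. The state names come from the text. Which states may move to `cancelled` or `refunded` is not specified there ("at various points"), so those edges are assumptions chosen for illustration.

```python
from typing import Dict, Set

# Allowed transitions. The happy path follows the text; the
# cancelled/refunded edges are illustrative assumptions.
ALLOWED_TRANSITIONS: Dict[str, Set[str]] = {
    "created": {"payment_pending", "cancelled"},
    "payment_pending": {"payment_confirmed", "cancelled"},
    "payment_confirmed": {"fulfillment_processing", "refunded"},
    "fulfillment_processing": {"shipped", "refunded"},
    "shipped": {"delivered"},
    "delivered": {"refunded"},
    "cancelled": set(),   # terminal
    "refunded": set(),    # terminal
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the order may move from `current` to `target`."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if not can_transition(current, target):
        raise ValueError(f"illegal order transition: {current} -> {target}")
    return target


# Walk the happy path from creation to delivery.
state = "created"
for next_state in ("payment_pending", "payment_confirmed",
                   "fulfillment_processing", "shipped", "delivered"):
    state = transition(state, next_state)
```

Rejecting unlisted transitions turns a stuck or replayed event into an explicit error, so an order cannot silently skip from `payment_pending` to `shipped`.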
Payment Service
Responsibility: Process credit card charges through a PSP (Stripe, Adyen). Ensure idempotency (no double-charges). Record ledger entries.
Implementation: Stripe.js tokenizes card data client-side—raw card numbers never touch your servers (PCI DSS scope reduction). Every charge request includes a client-generated idempotency key stored in Redis with 24-hour TTL. A double-entry ledger in PostgreSQL records every debit and credit, summing to zero. Reconciliation runs daily against PSP settlement reports to detect discrepancies before they compound.
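The interaction between idempotency keys and the double-entry ledger can be sketched as follows. This is a minimal model under stated assumptions. A dictionary stands in for the Redis key store (no TTL is modeled), a list stands in for the PostgreSQL ledger, and `fake_psp_charge` stands in for the PSP call. It is not the Stripe or Adyen API, and the account names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


class PaymentService:
    """Idempotent charge handling with a two-entry ledger per charge."""

    def __init__(self, psp_charge: Callable[[str, int], str]) -> None:
        self._psp_charge = psp_charge
        # Stand-in for Redis: idempotency_key -> stored result.
        # A real store would set the key atomically (for example SET with NX)
        # so two concurrent retries cannot both reach the PSP.
        self._results: Dict[str, dict] = {}
        # Stand-in for the PostgreSQL ledger: (account, amount_cents).
        self.ledger: List[Tuple[str, int]] = []

    def charge(self, idempotency_key: str, customer_id: str,
               amount_cents: int) -> dict:
        previous = self._results.get(idempotency_key)
        if previous is not None:
            return previous  # retry: return the original outcome, no new charge

        charge_id = self._psp_charge(customer_id, amount_cents)
        # Double entry: the two rows for one charge sum to zero.
        self.ledger.append((f"customer:{customer_id}", -amount_cents))
        self.ledger.append(("merchant_revenue", amount_cents))

        result = {"charge_id": charge_id, "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


# Demo: the client retries the same request after a timeout.
psp_calls: List[Tuple[str, int]] = []


def fake_psp_charge(customer_id: str, amount_cents: int) -> str:
    psp_calls.append((customer_id, amount_cents))
    return f"fake-charge-{len(psp_calls)}"


service = PaymentService(fake_psp_charge)
first = service.charge("key-123", "cust-1", 4999)
retry = service.charge("key-123", "cust-1", 4999)
ledger_balance = sum(amount for _, amount in service.ledger)
```

The retry returns the stored result instead of calling the PSP again, and the ledger balance stays at zero because every charge writes a matching debit and credit.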
Step 3: Handling Flash Sales and Traffic Spikes
Flash sales are the defining scalability challenge for e-commerce. A product drop or Black Friday event sends traffic from 35,000 QPS to 1M+ QPS in minutes.
CDN absorption: Product pages served from CloudFront edge cache handle 95%+ of traffic during spikes. The origin server sees only cache misses and dynamic requests.
Pre-warmed auto-scaling: Schedule auto-scaling to pre-warm 10x normal capacity 30 minutes before a known flash sale. Auto-scaling responds to sudden spikes with a 2–3 minute delay—pre-warming eliminates this gap.
Queue-based checkout: During extreme spikes, checkout requests are queued rather than processed synchronously. Users see "Your order is being processed" and receive confirmation within 30–60 seconds. This prevents the checkout service from being overwhelmed and protects inventory consistency.
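Rate limiting on checkout: Limit checkout submissions to the rate the inventory and payment services can handle. This keeps a stampede from overwhelming the services that enforce inventory correctness; the atomic reservation in the inventory service, not the rate limiter, is what actually prevents overselling. Return 429 with an estimated wait time during peak periods.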
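A token bucket is one common way to implement this kind of limit, though the text does not prescribe an algorithm. The sketch below assumes a single in-process bucket, where a real deployment would need shared state across instances. It returns HTTP 429 with a `Retry-After` header when the bucket is empty and 202 when the request is accepted for queued processing. The class and function names are hypothetical.

```python
import math
import time
from typing import Callable, Dict, Tuple


class TokenBucket:
    """Admits up to `rate_per_second` requests on average, with bursts up to `capacity`."""

    def __init__(self, rate_per_second: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate_per_second
        self._capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last_refill = clock()

    def try_acquire(self) -> float:
        """Return 0.0 if admitted, else the seconds until a token is available."""
        now = self._clock()
        elapsed = now - self._last_refill
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._last_refill = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return 0.0
        return (1.0 - self._tokens) / self._rate


def handle_checkout(bucket: TokenBucket) -> Tuple[int, Dict[str, str]]:
    """Return (status_code, headers) for one checkout submission."""
    wait_seconds = bucket.try_acquire()
    if wait_seconds > 0:
        # Retry-After is expressed in whole seconds, so round up.
        return 429, {"Retry-After": str(math.ceil(wait_seconds))}
    return 202, {}  # accepted for asynchronous processing


# Demo with a controllable clock: 2 checkouts/second, burst of 2.
now = [0.0]
bucket = TokenBucket(rate_per_second=2.0, capacity=2, clock=lambda: now[0])
responses = [handle_checkout(bucket) for _ in range(3)]
now[0] = 0.5  # half a second later, one token has been refilled
after_wait = handle_checkout(bucket)
```

Setting the refill rate to what the inventory and payment services can sustain makes the limiter a direct expression of backend capacity.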
Virtual waiting room: For the highest-demand flash sales (limited-edition drops, concert tickets), implement a virtual queue. Users enter a waiting room and are admitted to the checkout page at a controlled rate that matches the system's processing capacity. This converts an uncontrolled stampede into a managed queue—preserving user experience while protecting backend services from overload. Shopify's flash sale infrastructure uses this pattern to handle limited-edition streetwear drops serving 50,000+ concurrent users without downtime.
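At its core, a waiting room is a FIFO queue drained at a fixed rate. The sketch below shows only that core, with hypothetical names. A real waiting room would also need signed admission tokens, queue-position updates pushed to clients, and shared state across servers, none of which is modeled here.

```python
from collections import deque
from typing import Deque, List


class WaitingRoom:
    """FIFO queue that admits a fixed number of users per tick."""

    def __init__(self, admits_per_tick: int) -> None:
        self._admits_per_tick = admits_per_tick
        self._queue: Deque[str] = deque()

    def join(self, user_id: str) -> int:
        """Add a user to the back of the queue; return their position (1-based)."""
        self._queue.append(user_id)
        return len(self._queue)

    def admit_next_batch(self) -> List[str]:
        """Called once per tick; releases users in arrival order."""
        batch_size = min(self._admits_per_tick, len(self._queue))
        return [self._queue.popleft() for _ in range(batch_size)]

    def waiting(self) -> int:
        return len(self._queue)


# Demo: five users arrive at once, checkout can absorb two per tick.
room = WaitingRoom(admits_per_tick=2)
positions = [room.join(f"user-{i}") for i in range(5)]
first_batch = room.admit_next_batch()
second_batch = room.admit_next_batch()
still_waiting = room.waiting()
```

The admission rate is the only tuning knob, and it should track the checkout throughput the backend can actually sustain.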
Inventory pre-allocation: For announced flash sales with known stock limits, pre-allocate inventory into a separate fast-path set of per-SKU counters in Redis. The flash sale checkout claims stock from Redis (sub-ms) instead of PostgreSQL (ms), enabling 10x higher checkout throughput during the critical first minutes. Each claim must be a single atomic check-and-decrement so that concurrent buyers cannot take the same unit. Once Redis stock is depleted, the sale ends instantly without database contention.
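The property that matters for the fast path is that each claim is one atomic check-and-decrement. The sketch below demonstrates it with an in-memory counter guarded by a lock, which stands in for an atomic Redis operation. It is not Redis client code, and `FlashSaleStock` is a hypothetical name. Five hundred concurrent attempts race for 100 units.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FlashSaleStock:
    """In-memory stand-in for a pre-allocated stock counter."""

    def __init__(self, units: int) -> None:
        self._remaining = units
        self._lock = threading.Lock()

    def try_claim(self, quantity: int = 1) -> bool:
        """Atomically claim stock; False once the allocation is exhausted."""
        with self._lock:  # check and decrement happen as one step
            if self._remaining < quantity:
                return False
            self._remaining -= quantity
            return True

    @property
    def remaining(self) -> int:
        with self._lock:
            return self._remaining


# Demo: 500 buyers race for 100 pre-allocated units.
stock = FlashSaleStock(units=100)
with ThreadPoolExecutor(max_workers=32) as pool:
    outcomes = list(pool.map(lambda _: stock.try_claim(), range(500)))

sold = sum(outcomes)          # number of successful claims
rejected = len(outcomes) - sold
```

However the threads interleave, exactly 100 claims succeed. If the check and the decrement were separate steps, two buyers could both pass the check for the last unit.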
Step 4: Database per Service
| Service | Database | Reasoning |
|---|---|---|
| Product Catalog | DynamoDB | Flexible schema for variable product attributes, auto-scaling |
| Search | Elasticsearch | Full-text search, faceted filtering, relevance ranking |
| Shopping Cart | Redis | Sub-ms latency, TTL expiration, ephemeral data |
| Inventory | PostgreSQL | ACID transactions, row-level locking for stock reservation |
| Orders | PostgreSQL | Financial records, complex queries, audit trails |
| Payments | PostgreSQL (shared with orders) | Double-entry ledger, ACID, reconciliation queries |
| User Profiles | DynamoDB | Simple key-value lookups at scale |
| Recommendations | Redis + feature store | Pre-computed recommendations, sub-ms serving |
Frequently Asked Questions
What makes e-commerce system design unique?
The combination of read-heavy browsing (100:1 read-to-write ratio), write-critical correctness (no overselling, no double-charging), extreme traffic variability (flash sales at 10–100x normal), and multiple subsystems each requiring different databases and consistency models. No other system design question tests this breadth.
How do you prevent overselling in an e-commerce platform?
Reserve inventory atomically during checkout using PostgreSQL SELECT FOR UPDATE with row-level locking. Create a time-limited reservation (10-minute TTL). If payment succeeds, confirm the deduction. If payment fails or times out, release the reservation. Never deduct inventory on cart-add—only during checkout.
What database should the product catalog use?
DynamoDB or a document database. Products have variable attributes that do not fit a fixed relational schema. DynamoDB provides single-digit millisecond reads, automatic sharding, and flexible schemas. PostgreSQL with JSONB columns is an alternative for teams already operating PostgreSQL.
How does an e-commerce platform handle flash sales?
CDN caching absorbs 95%+ of product page traffic. Pre-warmed auto-scaling adds 10x capacity before the event. Queue-based checkout prevents the ordering pipeline from being overwhelmed. Rate limiting on checkout protects inventory consistency. Together, these handle spikes from 35,000 to 1M+ QPS.
Should an e-commerce platform use microservices or a monolith?
Start with a modular monolith for teams under 10 engineers. The checkout flow benefits from single-database ACID transactions without distributed saga complexity. Extract services (search, recommendations, notifications) as the team and traffic grow. Keep inventory and orders together as long as possible for transactional integrity.
How do you design the checkout flow?
As a choreography-based saga: reserve inventory → process payment → create order → send confirmation. Each step publishes events. Failure at any step triggers compensating transactions (release inventory on payment failure). Idempotency keys prevent duplicate charges on retries.
What caching strategy should an e-commerce platform use?
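Two-tier caching: CDN (CloudFront) for static product pages with 60-second TTL, and Redis for dynamic product data with 5-minute TTL. The top 5% of products generate 80% of traffic, so cache them aggressively. Cart data lives entirely in Redis. Inventory is not cached for regular checkout; it is always read from PostgreSQL for accuracy. The one exception is stock deliberately pre-allocated to Redis for an announced flash sale, where Redis acts as the authoritative counter for that allocation rather than as a cache.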
How do you handle search in an e-commerce platform?
Elasticsearch with products indexed for full-text search and faceted filtering. DynamoDB Streams + CDC pipeline syncs product changes to Elasticsearch within 1–5 seconds. Autocomplete uses edge n-gram tokenization on a dedicated Elasticsearch index. BM25 for relevance ranking, boosted by popularity and recency.
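Edge n-gram tokenization indexes every leading prefix of a token, so a partial query such as "lap" becomes an exact term lookup. The sketch below reproduces the idea in plain Python to show what ends up in the index. It is not Elasticsearch code, and the `min_gram`/`max_gram` defaults and the sample titles are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def edge_ngrams(token: str, min_gram: int = 2, max_gram: int = 10) -> List[str]:
    """Return the leading prefixes of `token` between min_gram and max_gram long."""
    token = token.lower()
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def build_autocomplete_index(titles: Iterable[str]) -> Dict[str, Set[str]]:
    """Map every edge n-gram of every word to the titles containing that word."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for title in titles:
        for token in title.split():
            for gram in edge_ngrams(token):
                index[gram].add(title)
    return index


def suggest(index: Dict[str, Set[str]], prefix: str) -> List[str]:
    """Look the typed prefix up as an exact key; no scanning is needed."""
    return sorted(index.get(prefix.lower(), set()))


titles = ["Laptop Stand", "Lapel Microphone", "Standing Desk"]
index = build_autocomplete_index(titles)

laptop_grams = edge_ngrams("Laptop")
lap_matches = suggest(index, "lap")
stan_matches = suggest(index, "Stan")
```

The trade-off is index size for query speed: each word is stored once per prefix length, and prefixes shorter than `min_gram` or longer than `max_gram` will not match.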
What is the role of the CDN in e-commerce?
The CDN serves product images, static assets, and cached product pages from edge locations close to users. During flash sales, the CDN absorbs 95%+ of traffic, protecting the origin servers. CloudFront with 600+ PoPs provides global coverage. Without a CDN, the origin would need to handle 1M+ QPS directly.
How do you design recommendations for an e-commerce platform?
A three-stage pipeline: candidate generation (collaborative filtering + content-based from user history), ranking (ML model scoring candidates by predicted purchase probability), and re-ranking (diversity, freshness, promoted products). Pre-compute recommendations in batch and serve from Redis for sub-millisecond latency.
TL;DR
A scalable e-commerce platform is a read-heavy system (100:1 browse-to-buy ratio) with critical write correctness (no overselling, no double-charging). Six core services: product catalog (DynamoDB, flexible schema, CDN-cached), search (Elasticsearch, faceted filtering, CDC sync), shopping cart (Redis, sub-ms, TTL expiration), inventory (PostgreSQL, row-level locking, reservation pattern with 10-minute TTL), orders (PostgreSQL, ACID, saga-based checkout), and payments (Stripe tokenization, idempotency keys, double-entry ledger). Flash sales send traffic from 35,000 to 1M+ QPS—handle with CDN absorption (95% of traffic), pre-warmed auto-scaling, queue-based checkout, and rate-limited ordering. The inventory reservation pattern is the critical correctness mechanism: atomically reserve stock during checkout, release on payment failure or timeout. Use different databases for different services based on access patterns and consistency requirements—no single database fits all e-commerce workloads.