Grokking System Design Interview, Volume II
Design a Flash Sale for an E-commerce Site (Hard)

Step 1: System Definition

A flash sale system is an e-commerce component designed to handle short-term sales events where a limited inventory of products is sold to a massive number of buyers in a very short time frame. In such events, traffic spikes dramatically (e.g. hundreds of thousands of users hitting the site simultaneously), and the number of purchase requests far exceeds the available items (for example, millions of requests for only thousands of items). The system must rapidly serve many concurrent users, reliably track inventory, and fairly allocate the product items. This requires a highly scalable and robust architecture that can maintain consistency (no overselling of products), low latency for user actions, and high availability despite extreme load. The flash sale system typically integrates with the broader e-commerce platform (user accounts, product catalog, payment processing) but uses specialized strategies (caching, queuing, etc.) to withstand bursty traffic and ensure a smooth user experience during the sale.

Core Entities:

  • User: An individual shopper (with account info, address, etc.) participating in the sale.
  • Item/Product: A product on flash sale, with details like ID, description, price, and a limited inventory count.
  • Inventory: Stock counts for each item. This is critical in flash sales – all items will have a fixed quantity (e.g., 1000 units) that cannot be exceeded.
  • Order: A purchase order created when a user successfully buys an item. It links a User to one or more Items, the quantity, timestamp, etc.
  • Payment: Represents payment transactions for orders. Often handled via an external payment gateway (credit card, PayPal, etc.), but our system manages the payment status (pending, successful, failed).
  • Other Entities: (Supporting) Shopping Cart/Reservation (to hold an item briefly during checkout), Shipment (for delivery info), but the core focus here is on inventory and orders during the flash sale window.
Step 2: Clarify and Define Requirements

Functional Requirements:

  • Browse Flash Sale Items: Users worldwide should be able to view the list of flash sale products and their details (price, remaining stock, etc.) in real-time.

  • Attempt Purchase: Users can attempt to purchase an item (add to cart & checkout immediately). Given the short supply, this should be a quick, streamlined process.

  • First-Come, First-Serve: The system must allocate items to purchase requests in the exact order they are received. If two users try to buy the last unit at the same time, only one request succeeds – fairness is critical.

  • Real-Time Queue Position: Users must be able to see their current position in the waiting queue when participating in the flash sale. The client (web or mobile app) will periodically poll the server (e.g., every few seconds) for updates on the user’s queue position. This gives users transparency and reassurance while waiting. By providing a live position number (e.g., “You are #42 in line”), the system improves user experience and fairness perception.

  • Order Creation & Payment: When a user checks out, the system creates an order and processes payment. It should only confirm the order after payment is authorized. A typical flow: reserve item -> process payment -> confirm order.

  • Prevent Overselling: The system should never sell more items than available. If stock = 0, further purchase attempts must be blocked or fail gracefully (e.g., “Out of stock” message). This requires atomic inventory checks and updates so that two simultaneous purchases can’t both appear to succeed for the same last unit.

  • Global Access: The flash sale is global – users from different regions (US, Europe, Asia, etc.) should experience the sale with minimal latency. The system should handle distributed traffic (perhaps via multiple regional data centers) but still enforce the single global inventory limit.

Non-Functional Requirements:

  • Extreme Scalability: The system must handle millions of concurrent users hammering the site. Traffic will spike dramatically at the sale start. (For perspective, Alibaba’s Singles’ Day sale has peaked at ~583,000 orders per second – our design should be prepared for similar magnitudes.) Most of these will be read requests (browsing), with a smaller fraction as purchase attempts.
  • Low Latency: Key actions (loading the product page, clicking “Buy”) should complete in under 500ms on average, despite the load. Users shouldn’t experience timeout errors even under peak traffic. This implies fast responses from the servers and efficient backend operations (e.g., inventory check, order placement).
  • High Availability: The system should have zero downtime during the flash sale. It must be resilient to failures – any single component failure (server, database node, etc.) should not take down the whole service. Redundancy and failover are essential (active-active setups, clustering, etc.).
  • Rapid Elastic Scaling: The architecture should allow autoscaling or rapid provisioning to handle sudden surges. For instance, ramp up many application servers right when the sale starts, and scale down after the rush. The design should also support load balancing to spread traffic and avoid any one server being overwhelmed.
  • Security & Abuse Prevention: (Implied) The system should defend against bots or malicious actors that could unfairly snatch inventory or overload the system. Techniques like rate limiting (throttling excessive requests), CAPTCHAs or login requirements, and monitoring for unusual activity should be in place to preserve system health and fairness.

Step 3: Back-of-the-Envelope Capacity Estimation

Let’s estimate the scale to ensure our design can handle it:

  • Concurrent Users: Assume a peak of ~5 million concurrent users on the site during the sale. (Flash sales of popular brands can attract millions globally in a short window.) Not every user generates a request every second, but many will repeatedly refresh or navigate. If even half of them perform an action in a given second, that’s ~2.5 million operations/second across the system. We should prepare for a ballpark peak of hundreds of thousands to a few million requests per second in bursts.
  • Peak QPS (Queries Per Second): Most of these operations are reads (viewing pages, checking item status). We might estimate a 90% read vs 10% write ratio. For example, out of 1M requests per second, 900k could be read (GET requests for product info, inventory status) and 100k could be purchase attempts (writes that create orders or update inventory). The read volume is huge because many users browse or refresh even if only a small fraction successfully checkout. Write volume will spike initially (everyone trying to buy) but the total successful orders are limited by inventory (e.g., if only 100k items total, at most 100k successful order writes will happen, though there may be many more attempts that get rejected due to stock out).
  • Database Load: For reads, we cannot hit the database for every page view at these numbers – we’ll need caching. If we don't use cache, 900k reads/sec could overwhelm a single database cluster. Instead, product info and even inventory status should be served from in-memory caches or CDNs. Write load (100k/sec in worst case) is also extremely high for a single DB, so we must distribute writes or use high-throughput stores. The order placement transactions (which involve inventory decrement + order insert) are the most critical. They must be handled in a strongly consistent way, but as quickly as possible.
  • Bandwidth: Serving millions of users means high network bandwidth. If each page or API response is ~5 KB of JSON (for example, product data without images), 1 million requests/sec = 5 GB/sec of outgoing data from servers. That’s 40 Gbps – feasible only with distributed servers and CDNs. We will offload large content (images, CSS/JS files, etc.) to a Content Delivery Network (CDN), so our application servers mostly send small JSON or HTML responses. The CDN edges near users handle the heavy lifting of images/videos. Internally, between services and databases, network links must handle bursts of write operations and replication.
  • Storage: The flash sale itself might not consume massive storage (the number of products is limited and orders are relatively few compared to reads). For instance, if 100k orders are placed, and each order record is a few KB, that’s only a few hundred MB of new data.
  • Peak API Requests Patterns: We expect a traffic spike at the sale start (e.g., everyone refreshing at 12:00 PM when the sale opens). The pattern might be: heavy GET requests to view the item page, followed by a wave of POST requests to attempt purchases. After inventory sells out, read traffic may remain high (people checking if more stock appears or browsing other items) but write traffic will drop. Our system should handle the initial spike which is the most intense period. In summary, design for worst-case QPS in the millions (if including all reads) and order events in the tens of thousands per second. These numbers align with large-scale events – e.g., Alibaba reported 583k orders/sec at peak of Singles’ Day, and Amazon’s Prime Day backend handled even higher internal request rates (hundreds of millions of ops/sec) by scaling out globally.
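As a quick sanity check, the bandwidth arithmetic above can be reproduced in a few lines (using the same ~5 KB response-size assumption):

# Sanity check of the bandwidth estimate: ~5 KB per response at 1M requests/sec.
requests_per_sec = 1_000_000
bytes_per_response = 5_000            # ~5 KB of JSON, as assumed above

bytes_per_sec = requests_per_sec * bytes_per_response
gigabits_per_sec = bytes_per_sec * 8 / 1e9
print(f"{bytes_per_sec / 1e9:.1f} GB/s ≈ {gigabits_per_sec:.0f} Gbps")  # 5.0 GB/s ≈ 40 Gbps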
Step 4: High-Level Architecture

At a high level, the flash sale system will use a multi-tier, distributed architecture with components to handle presentation, application logic, caching, queuing, and persistence. Below is an outline of the key components and how they interact:

  • Clients (Web/App): Users interact via a web browser or mobile app. We can optimize the client to reduce load (for example, disable multiple rapid clicks on the buy button to cut down duplicate requests). The client will show a countdown to sale start, then enable the purchase action. Once clicked, it sends a purchase request to the backend. To improve perceived performance, static content (product images, scripts, CSS) will be loaded from a CDN so that the initial page load is quick and not hitting our servers.

  • API Gateway & Load Balancer: Entry point that routes client requests to appropriate services. It performs user authentication (e.g., checking login/session tokens) and rate limiting to throttle excessive requests. This prevents the backend from being flooded by bots or duplicate submissions. The gateway ensures only verified, reasonably rate-controlled requests reach the Booking Service.

  • Booking Service: Acts as the front-line handler when the flash sale begins. Instead of processing orders directly (which could overwhelm databases), it immediately queues each incoming order request into a Kafka topic and responds to the user with an acknowledgment of “request received.” This asynchronous buffering absorbs the burst of traffic so downstream services can work at their own pace. The Booking Service is essentially stateless; it mainly publishes an event to Kafka. Each product on sale has its own Kafka topic to guarantee strict ordering per item – this way, the first user to request Product X will always be at the front of the queue for that product, enforcing fairness.

  • Order Service: The Order Service consumes queued order events (e.g., “User A wants 1 unit of Product X”) from the Kafka topics in the exact order they were received and orchestrates the end-to-end transaction. It first checks whether the product inventory is available; if so, it creates a Reservation record in the database, otherwise it returns an 'Out of Stock' message to the client. It coordinates with the Inventory Service and Payment Service to complete the purchase. Importantly, this service is designed to handle a high throughput of events, decoupled from user-facing latency. The event-driven approach via Kafka decouples the web layer from the processing layer, preventing thread contention and improving scalability.

  • Inventory Service: This service manages product stock levels. When the Order Service requests to reserve an item, the Inventory Service performs an atomic stock decrement – ensuring that no two orders can grab the same unit of inventory simultaneously. For example, it may use an atomic database operation (UPDATE ... WHERE stock > 0). If stock is insufficient (already zero), it notifies the Order Service that this order cannot be fulfilled (e.g., sold out). On a successful decrement, the item is reserved for that user. No overselling is allowed: by centralizing stock updates here with proper locking/atomic ops, we ensure two orders can’t reduce the same inventory count. The Inventory Service only confirms a reservation if the stock was available; otherwise, the order is rejected. It also updates the reservation with a timestamp (to support expiry if payment isn’t completed).

  • Payment Service: Responsible for processing payments (credit card, digital wallets, etc.). When an order is reserved, the user is prompted to pay (through the UI which calls the Payment Service). The Payment Service integrates with external payment gateways. It attempts to charge the user’s payment method and handles success or failure. For reliability, it includes retry logic and failover: if the primary payment gateway fails or times out, it will automatically retry the transaction or route it to an alternate payment provider. If payment succeeds, it finalizes the order; if it fails, it triggers a rollback (inventory release, order cancellation). The Payment Service is stateless in terms of transactions (all state changes in orders and payments are stored in DB), so it can scale out easily.

  • Reserved Transaction Expiry Handler: A specialized component (could be a separate microservice or a scheduled job) that monitors reserved orders and frees up inventory for those that didn’t complete payment within the allowed time window (e.g., 5 minutes). For example, when an order is created in pending payment state, we start a countdown (detailed design is discussed later). If the payment is not confirmed in time, the reservation will be marked as expired/cancelled and the stock is incremented back by 1, making that item available to others. This ensures no item stays undersold (held in limbo) due to an abandoned cart.

  • Databases:

    • Single Source of Truth – SQL Database: All persistent data (reservations, orders, inventory counts, payments, user info) is stored in SQL databases. A relational database (e.g., MySQL or PostgreSQL) is chosen for its strong consistency and ability to enforce transactions. This is critical to ensure we don’t oversell items – e.g., the Inventory Service will use a SQL transaction to decrement stock and update the reservation record atomically. The SQL DB is the single source of truth that ultimately determines inventory levels and order statuses, even if caches or other layers temporarily have different information. We will enable replication on these databases (one primary for writes and multiple replicas for reads) to improve throughput and availability.

    • Schema design: We will use separate schemas (or databases) for each service to maintain loose coupling. For example, an Inventory DB with a products table (including columns: product_id, total_stock, reserved_stock, etc.), an Orders DB with orders table (order_id, user_id, product_id, status, reserved_at, etc.), and a Payments DB (transactions, receipts). This separation follows the microservice principle that each service owns its data. Cross-service data consistency is maintained via events (for example, Order Service doesn’t directly modify the Inventory DB, it calls the Inventory Service API which updates its DB and returns success/failure).

  • Workflow: All services communicate through well-defined APIs or events. For example, when flash sale starts, users flood the API Gateway which passes valid requests to Booking Service. Booking Service enqueues the request in Kafka (to a topic specific to that product). The Order Service (and multiple instances of it) consumes from Kafka. Because each product has a dedicated topic, ordering is strictly maintained for each item. The Order Service, upon reading a new order event, will coordinate with Inventory Service (reserve stock) and then respond to the user (e.g., by updating an order status the frontend is polling, or via WebSocket notification that “item is reserved, please proceed to payment”). The user then completes payment, which goes to Payment Service. Payment Service verifies the transaction (with external providers) and then notifies Order Service: on success, the order is marked Complete and inventory remains deducted; on failure, a compensating action triggers inventory rollback (increment stock) and marks the order as cancelled.
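To make the enqueue step just described concrete, here is a minimal sketch of how the Booking Service might publish a purchase request to a per-product Kafka topic. It assumes the kafka-python client; the topic naming scheme and event fields are illustrative, not mandated by the design.

# Minimal Booking Service publish path (assumes kafka-python; topic name and
# event fields are illustrative).
import json
import time
import uuid
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["kafka-1:9092", "kafka-2:9092"],
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    acks="all",  # don't drop purchase requests if a broker fails
)

def enqueue_purchase_request(user_id: int, product_id: int) -> str:
    """Publish the purchase attempt and return its reservation_id to the client."""
    reservation_id = str(uuid.uuid4())
    event = {
        "reservation_id": reservation_id,
        "user_id": user_id,
        "product_id": product_id,
        "requested_at": int(time.time() * 1000),
    }
    # One topic per flash-sale product keeps requests strictly ordered per item.
    producer.send(f"FlashSale_Product{product_id}", value=event)
    return reservation_id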

High-level Design

Caching Layer

To reduce database load, a cache is employed for frequently accessed data. The cache (e.g., Redis or Memcached) will store data like product details. For relatively static data (product info, images, descriptions, flash sale configuration), caching is straightforward and can drastically cut down read traffic to the DB. For example, product pages can be served from cache or even a CDN if they don’t change often during the sale. The cache is updated whenever the DB updates so it stays consistent.
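For illustration, a cache-aside read path for product details might look like the sketch below (it assumes the redis-py client; the key format, TTL, and the load_product_from_db helper are placeholders):

# Cache-aside read path for product details (assumes redis-py; key names, TTL,
# and the load_product_from_db helper are illustrative).
import json
import redis

cache = redis.Redis(host="cache-host", port=6379, decode_responses=True)

def get_product(product_id: int) -> dict:
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)               # cache hit: no DB round trip
    product = load_product_from_db(product_id)  # hypothetical DB accessor
    cache.setex(key, 60, json.dumps(product))   # short TTL keeps data reasonably fresh
    return product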

Step 5: Database Schema

Below is a detailed relational schema for the key tables — Users, Products, Inventory, Reservation, Orders, and Payments — followed by indexing, partitioning, and constraint recommendations to optimize performance for high-volume events.

Users Table

This table stores user accounts and authentication details.

Field Name | Data Type | Description
user_id | BIGINT (PK) | Unique user identifier (primary key, auto-increment).
email | VARCHAR(255) UNIQUE | User's email address (used for login; must be unique).
password_hash | VARCHAR(255) | Hashed password for authentication.
salt | VARCHAR(255) | Salt used for hashing the password (if applicable).
name | VARCHAR(100) | User’s full name.
created_at | DATETIME | Timestamp when the account was created.
updated_at | DATETIME | Timestamp of the last update to the account information.
status | VARCHAR(50) | Account status (e.g., 'active', 'disabled', 'banned').

Notes:

  • The primary key is user_id. A unique index on email ensures fast lookups during login and enforces uniqueness.
  • Passwords are stored as hashes with a unique salt per user for security.

Products Table

Manages items available for the flash sale.

Field Name | Data Type | Description
product_id | BIGINT (PK) | Unique product identifier (primary key, auto-increment).
name | VARCHAR(200) | Product name.
description | TEXT | Detailed description of the product.
price | DECIMAL(10,2) | Regular price of the product.
category | VARCHAR(100) | Category or type of product (for organization/filtering).
status | VARCHAR(50) | Product status (e.g., 'active', 'out_of_stock', 'discontinued').
created_at | DATETIME | Timestamp when the product was added to the catalog.
updated_at | DATETIME | Timestamp of the last update to product info or price.

Notes:

  • Primary key is product_id. Indexing name or category can help with product searches or category listings if needed.
  • The status field can quickly indicate if a product is available for sale.

Inventory Table

Tracks stock levels for each product to prevent overselling during the flash sale.

Field Name | Data Type | Description
product_id | BIGINT (PK, FK to Products) | Product identifier (primary key, references Products).
quantity | INT | Available stock for the product. This value is decremented when orders or reservations occur.
last_updated | DATETIME | Timestamp of the last inventory update.

Notes:

  • There is a one-to-one relationship between Products and Inventory (product_id is both the primary key and a foreign key to Products).
  • Concurrent Stock Deduction: Updates to quantity should occur within a transaction to avoid race conditions. One approach is using an atomic UPDATE (e.g., UPDATE Inventory SET quantity = quantity - 1 WHERE product_id = X AND quantity >= 1) or a SELECT ... FOR UPDATE to lock the row during a purchase attempt. This prevents multiple transactions from overselling the same item by queuing concurrent row access.
  • A check constraint can ensure quantity >= 0 at all times (no negative stock).
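For illustration, the row-locking variant mentioned in the notes above could look roughly like this, written against a generic Python DB-API connection (the driver, connection handling, and error handling are assumptions; table and column names follow the schema above):

# Row-locking stock deduction (generic DB-API style; connection/driver are assumed).
def try_reserve_unit(conn, product_id: int) -> bool:
    with conn.cursor() as cur:
        # Lock the inventory row so concurrent checkouts queue up instead of racing.
        cur.execute(
            "SELECT quantity FROM Inventory WHERE product_id = %s FOR UPDATE",
            (product_id,),
        )
        row = cur.fetchone()
        if row is None or row[0] < 1:
            conn.rollback()
            return False  # sold out
        cur.execute(
            "UPDATE Inventory SET quantity = quantity - 1, last_updated = NOW() "
            "WHERE product_id = %s",
            (product_id,),
        )
    conn.commit()
    return True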

Reservation Table

Temporarily holds stock for a user who is in the process of checking out, to prevent others from buying that stock until the user completes payment or the hold expires.

Field Name | Data Type | Description
reservation_id | UUID (PK) | Unique reservation identifier (primary key).
user_id | BIGINT (FK to Users) | User who reserved the product (foreign key to Users).
product_id | BIGINT (FK to Products) | Product reserved (foreign key to Products).
quantity | INT | Quantity reserved for this user.
reserved_at | DATETIME | Timestamp when the reservation was created.
expires_at | DATETIME | Timestamp when the reservation will expire if not completed.
status | VARCHAR(50) | Reservation status ('PROCESSING', 'RESERVED', 'FAILED', 'SUCCEEDED', etc.).

Notes:

  • When a user initiates checkout, a reservation record is created and the stock is temporarily allocated to that user. This prevents others from purchasing it during the reservation window. The Inventory quantity will be decremented at this point.
  • The expires_at field is crucial. A background job or service should remove or mark reservations as expired when this timestamp passes, releasing the held stock back to inventory. This helps avoid underselling (stock being tied up in stale reservations).
  • A unique constraint on (user_id, product_id) can ensure a user has only one active reservation per product at a time (preventing the same user from holding the same item multiple times).
  • Indexes on product_id help quickly calculate total reserved stock for a product, and an index on expires_at speeds up queries that find and clean up expired reservations.

Orders Table

Records purchases made by users during the flash sale. Each order is a purchase of a product by a user.

Field Name | Data Type | Description
order_id | BIGINT (PK) | Unique order identifier (primary key, often auto-increment).
user_id | BIGINT (FK to Users) | User who placed the order (foreign key to Users).
reservation_id | UUID (FK to Reservation) | The reservation associated with this order.
product_id | BIGINT (FK to Products) | Product that was purchased (foreign key to Products).
quantity | INT | Quantity of the product purchased in this order.
price | DECIMAL(10,2) | Purchase price per unit at the time of order (captures flash sale price or discount if any).
status | VARCHAR(50) | Order status (e.g., 'pending', 'completed', 'cancelled', 'refunded').
created_at | DATETIME | Timestamp when the order was created.
updated_at | DATETIME | Timestamp of the last update to the order status or details.

Notes:

  • Foreign keys: user_id references Users, reservation_id references Reservation and product_id references Products. This maintains referential integrity (a valid user, reservation, and product must exist for each order).
  • In a simple flash sale scenario, each order might represent a single product purchase (to simplify checkout during high traffic). For multi-item carts, an additional OrderItems table would be introduced, but it’s omitted here for brevity.
  • The status field, combined with timestamps, helps track the order lifecycle (creation, completion, cancellation).
  • An index on user_id can speed up querying a user's order history, and an index on product_id helps in analyzing total sales per product or verifying stock consumption.

Payments Table

Logs payment transactions (successful or failed) for orders.

Field Name | Data Type | Description
payment_id | BIGINT (PK) | Unique payment transaction identifier (primary key).
order_id | BIGINT (FK to Orders) | Associated order that this payment is for (foreign key to Orders).
user_id | BIGINT (FK to Users) | User who made the payment (foreign key to Users; redundant to the order’s user for quick access).
amount | DECIMAL(10,2) | Payment amount (should match order total for successful payments).
method | VARCHAR(50) | Payment method (e.g., 'credit_card', 'paypal', 'wallet').
status | VARCHAR(50) | Payment status ('success', 'failed', 'pending').
transaction_ref | VARCHAR(100) | Reference ID from the payment gateway (e.g., transaction ID).
timestamp | DATETIME | Timestamp of the payment attempt.
failure_reason | VARCHAR(255) | Reason for failure if the payment did not succeed (nullable).

Notes:

  • Foreign keys: order_id links to Orders, ensuring a payment is tied to a valid order. user_id links to Users (this could be derived through the order, but is stored for convenience and redundancy).
  • Typically, an order will have one successful payment. Multiple entries for the same order could exist if retries or different payment methods were attempted (one will be success, others failed). A unique constraint on order_id for status='success' could enforce only one successful payment per order (if desired).
  • Index on order_id is important for quickly retrieving payments by order (especially during reconciliation). An index on status can help isolate failed transactions for analysis or retry.

Indexing and Partitioning Strategies

To support high concurrency and fast performance, proper indexing and data partitioning are essential:

  • Indexes:

    • Each primary key (e.g., user_id, product_id, order_id) is automatically indexed, ensuring fast lookups by ID.
    • Users: Unique index on email for quick authentication lookups.
    • Products: Index on category (if queries by category are common) and possibly a full-text index on name/description if keyword searches are needed.
    • Inventory: The primary key on product_id suffices for direct stock lookups. This table is small (one row per product), but proper locking on updates is crucial for concurrency control rather than additional indexes.
    • Orders: Index on user_id to retrieve a user's orders quickly. Index on product_id to analyze orders per product or to assist in stock reconciliation. If querying recent orders often, an index on created_at (or a composite index like (product_id, created_at)) can help, especially if partitioned by date.
    • Payments: Index on order_id (to quickly find the payment for a given order). Index on status (to find all failed payments, for example) or on user_id if analyzing user payment history.
    • Reservations: Index on product_id to efficiently calculate how many units of a product are reserved at a given moment. Index on expires_at to find expired reservations for cleanup. A composite index on (user_id, product_id) paired with the unique constraint helps enforce one reservation per user per product and also speeds up checking an existing reservation for a user.
  • Database Sharding: To handle high write volumes and large data, we implement sharding on the primary databases. Choosing the right shard key is important for load distribution. One effective strategy is to shard by User ID or Order ID rather than by Product. The rationale is that a hot product in a flash sale would concentrate load on a single shard if we sharded by product, whereas sharding by user spreads the load more evenly. For example, we can apply a hash function to the user ID to determine which shard an order or reservation goes to, achieving an even distribution and avoiding hotspots. This way, even if 100k users click “Buy” at once, their requests are being recorded across, say, 10 shards based on user ID hash, instead of all hitting one database. Similarly, user data (addresses, etc.) can be sharded by user ID. Product catalog data might be smaller and not require sharding, but order and reservation data will be huge during a flash sale and benefit from this partitioning.

  • Scalability Considerations:
    For extreme loads, consider read-write separation (primary for writes, replicas for reads) to distribute the traffic. The schema design supports this since reads (e.g., browsing products, checking inventory) can be done on replicas, while writes (placing orders, payments) go to the primary.
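As an illustration of the user-ID sharding described above, a request router might pick the shard like this (the shard count and connection strings are placeholders):

# Shard routing by hashed user ID (shard count and DSNs are illustrative).
import hashlib

NUM_SHARDS = 10
SHARD_DSNS = {i: f"mysql://orders-shard-{i}.internal/orders" for i in range(NUM_SHARDS)}

def shard_for_user(user_id: int) -> int:
    # Hash first so that sequential user IDs spread evenly across shards.
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def dsn_for_user(user_id: int) -> str:
    return SHARD_DSNS[shard_for_user(user_id)]

Orders and reservations for a given user then always land on the same shard, even when a single hot product attracts most of the traffic.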

Step 6: Detailed Component Design and Data Management

In a flash sale, each buy request transitions through several states as it is processed by different services. Below is the lifecycle of a transaction from initiation to completion or termination:

1. Transaction States

  • WAITING: The transaction is in the Kafka waiting queue, waiting for inventory assignment. At this stage, the transaction is only in Kafka (i.e., not stored in DB).
  • PROCESSING: The transaction has been dequeued from Kafka, and the inventory is being checked. The reservation record is created in the DB with the status 'PROCESSING'.
  • RESERVED: The inventory has been successfully decremented, and the transaction is awaiting payment (with a time limit).
  • SUCCEEDED: The payment was completed within the timeout window, and the order is confirmed.
  • FAILED: Payment failed after retry attempts, or another critical failure occurred (e.g., out of stock).
  • EXPIRED: The transaction timed out before payment completion; the inventory is incremented back.
  • CANCELED: The user manually canceled the transaction before payment.

2. Workflow Steps

  1. WAITING (Queued): When a user clicks “Buy” during the flash sale, the request is accepted by the Booking Service that creates a new reservation record (a Kafka message) and publishes to a Kafka topic specific to that product type. The message contains the reservation_id (a UUID), user_id, product_id, and a timestamp. At this stage, no database entry exists yet; the request is simply waiting in the queue to be handled in a first-come-first-serve manner. Queuing the incoming orders helps throttle the surge of requests and ensures orderly, sequential inventory updates. The reservation remains in this WAITING state until a worker service (Order Service) dequeues it for processing.

  2. PROCESSING (Underway): An Order Service instance pulls the request from Kafka and begins processing it. At this point, a reservation record is created in the database with the status PROCESSING to track the in-progress transaction. The Order Service now coordinates with the Inventory Service to reserve an item for the customer and then coordinates with the Payment Service to process the payment.

  3. RESERVED (Inventory Held/Pending Payment): In this state, the item has been successfully reserved for the order. The Inventory Service decrements the available stock so that no other customer can purchase this item while payment is pending. It also updates the reservation status to RESERVED in the database, and notifies the Order Service, which then alerts the user — typically by redirecting them to a payment page — that their item is locked in. The reservation is temporary – the user is given a fixed time window (e.g., 5 minutes, as configured for the flash sale) to complete the payment. During this time, the order is essentially in a pending payment state. The system starts an expiration timer for the reservation. No other buyer can claim this inventory during the reserved period, which prevents overselling (selling more items than stock) and also avoids underselling (holding stock indefinitely without purchase). If the item is already sold out by the time of processing, the Order Service will mark this order as FAILED (out-of-stock) and trigger a failure response. Here is how we can atomically decrement the inventory and update the reservation status using a database transaction:

-- @productId     : the ID of the product being purchased
-- @reservationId : the ID of the reservation currently in 'PROCESSING' state
START TRANSACTION;

-- 1. Attempt to decrement inventory if stock is available.
UPDATE Inventory
SET quantity = quantity - 1
WHERE product_id = @productId AND quantity > 0;

-- Check if the inventory update succeeded.
SET @rows_updated = ROW_COUNT();

-- 2. Update the reservation status based on inventory availability.
IF (@rows_updated = 1) THEN
    -- Inventory was available; mark the reservation as RESERVED.
    UPDATE Reservation
    SET status = 'RESERVED'
    WHERE reservation_id = @reservationId AND status = 'PROCESSING';
ELSE
    -- Inventory was not available; mark the reservation as FAILED.
    UPDATE Reservation
    SET status = 'FAILED'
    WHERE reservation_id = @reservationId AND status = 'PROCESSING';
END IF;

COMMIT;

The above query is possible only when Inventory and Reservation tables are on the same shard, because we can't have a DB transaction updating cross-shard tables. In the 'Database Schema' section, we suggested partitioning the DB based on user_id to distribute reservations onto multiple shards (as compared to partitioning based on product_id, which can overload a partition containing a hot product). To handle partitioning based on user_id, we will update the inventory separately and then update the reservation record. Since this will not be happening in one DB transaction, we could have a failing scenario where the Inventory Service decrements the inventory record but crashes before updating the reservation status to 'RESERVED'. Now, when another instance of the Inventory Service takes up this request, it will decrement the inventory again, as the reservation is still in 'PROCESSING' state.

To handle this scenario, where the Inventory and Reservation tables are on different shards, we will use a helper table called InventoryUpdated. Here’s how it works:

  1. Updating Inventory: When the Inventory Service decrements the stock, it also inserts a record into the InventoryUpdated table which is also present on the same shard. Both of these queries will happen in one DB transaction. This record in InventoryUpdated acts like a flag, marking that the inventory has already been updated for this request.

  2. Crash Handling: Now, if the service crashes before it can update the Reservation record, the presence of the record in InventoryUpdated tells any new service instance that the inventory has already been decremented for this request. Before decrementing, the service checks for such a record against the reservation and only decrements the stock if no record exists. This prevents the system from subtracting the stock a second time when retrying the workflow.

  3. Finalizing the Reservation: Once the Reservation is marked RESERVED, the Inventory Service will delete the corresponding record from InventoryUpdated. This cleanup ensures this helper table stays small and only contains pending updates.

In summary, the InventoryUpdated table is used to safely coordinate the inventory deduction when working across different shards. It prevents accidental double-decrementing of stock if a failure occurs during the process, ensuring that each reservation is processed exactly once.
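Below is a rough sketch of this two-step flow. The database helpers, connection handling, and exact InventoryUpdated columns are assumptions; the essential part is checking the marker before decrementing and writing the marker in the same transaction as the decrement.

# Idempotent cross-shard reservation flow (DB helpers, connections, and SQL are illustrative).
def reserve_stock(inventory_conn, reservation_conn, product_id, reservation_id) -> bool:
    with inventory_conn.cursor() as cur:
        # Skip the decrement if a previous attempt already recorded it for this reservation.
        cur.execute("SELECT 1 FROM InventoryUpdated WHERE reservation_id = %s", (reservation_id,))
        already_done = cur.fetchone() is not None
        if not already_done:
            cur.execute(
                "UPDATE Inventory SET quantity = quantity - 1 "
                "WHERE product_id = %s AND quantity > 0",
                (product_id,),
            )
            if cur.rowcount == 0:
                inventory_conn.rollback()
                return False  # sold out
            # Same shard, same transaction: the decrement and the marker commit atomically.
            cur.execute(
                "INSERT INTO InventoryUpdated (reservation_id, product_id) VALUES (%s, %s)",
                (reservation_id, product_id),
            )
    inventory_conn.commit()

    # The Reservation row lives on a different (user-keyed) shard, hence a separate transaction.
    with reservation_conn.cursor() as cur:
        cur.execute(
            "UPDATE Reservation SET status = 'RESERVED' "
            "WHERE reservation_id = %s AND status = 'PROCESSING'",
            (reservation_id,),
        )
    reservation_conn.commit()

    # Cleanup: the marker is only needed while the reservation update is pending.
    with inventory_conn.cursor() as cur:
        cur.execute("DELETE FROM InventoryUpdated WHERE reservation_id = %s", (reservation_id,))
    inventory_conn.commit()
    return True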

  4. SUCCEEDED (Completed): This is the successful completion state for the transaction. It is reached when the Payment Service confirms that payment was completed within the allowed time window. Once the user pays (e.g., entering payment details and the payment is approved), the Payment Service notifies the Order Service (via a Kafka message) that the payment is complete. The Order Service then updates the reservation status to SUCCEEDED in the database, finalizing the transaction. At this point, the item is officially sold and will not be returned to inventory. Downstream actions can be triggered here (e.g., generating an order confirmation for the user, notifying fulfillment/shipping services). The transition from RESERVED to SUCCEEDED marks a successful flash sale purchase.

  5. FAILED (Error or Payment Failed): The reservation becomes FAILED if the payment ultimately cannot be completed. The user attempted payment but it was declined or did not succeed even after retries. For example, the Payment Service tried charging the card multiple times or via multiple methods and exhausted all retry attempts without success. In this case, the reservation is marked as FAILED due to payment failure. The reserved inventory is then released back to stock (since the item was never actually paid for).

  6. EXPIRED (Timed Out): If the user does not complete payment within the allotted reservation time, the reservation moves to EXPIRED state. This is an automatic transition triggered by a timeout. A dedicated Reserved Transaction Expiry Handler watches for orders stuck in RESERVED beyond the payment window. When a timeout occurs, it updates the reservation status to EXPIRED and publishes an event (on a Kafka topic) indicating the reservation expired. The Inventory Service consumes this event and increments the stock back, effectively restoring the item to inventory. This design ensures that inventory is not permanently lost due to an abandoned cart – the quantity is returned for others to buy once the original reservation expires. From the user’s perspective, an expired order typically means they took too long to pay and the order was canceled by the system. They might receive a notification that the item was released, and if they still want it, they’d have to place a new order. The system must also handle if a payment notification comes in after expiration – for example, by rejecting the payment and initiating a refund if the order is already expired, as a late payment event could occur in rare cases due to network delays.

  7. CANCELED (User Aborted): This state occurs when the user actively cancels the reservation before completing payment. For instance, if the user changes their mind and clicks a “Cancel Order” button during the payment phase (while the reservation is RESERVED), the system will mark the reservation as CANCELED. Like expiration, cancellation triggers the release of reserved inventory back to the stock. The Order Service updates the status to CANCELED in the DB, and an event is sent to the Inventory Service to increment the item count back. From a workflow perspective, a canceled transaction is very similar to an expired one, except it was manually triggered by the user rather than by a timer. The user is typically shown a cancellation confirmation, and the item becomes available to others again.

State Transitions Summary: A typical successful flow is WAITING → PROCESSING → RESERVED → SUCCEEDED. However, at the RESERVED stage the order can also go to EXPIRED (if timed out), FAILED (if payment fails or other error), or CANCELED (if user aborts). Throughout this lifecycle, services communicate via Kafka messages to update the reservation and keep data (like inventory counts) consistent.

Detailed Component Design

3. Background Job: RESERVED Transaction Expiry Handler

Purpose

This background service is responsible for:

  1. Tracking RESERVED transactions in-memory to efficiently detect expired transactions.
  2. Handling expiration logic by marking overdue transactions as EXPIRED, restoring inventory, and notifying the inventory service.
  3. Ensuring recovery from crashes by reloading RESERVED transactions from the database.
  4. Providing scalability and fault tolerance, so that multiple instances can coordinate expiry handling.

1. Storage and Retrieval of RESERVED Transactions

  • Primary in-memory storage:

    • Use an in-memory data structure (e.g., a priority queue (min-heap) or sorted set) to track RESERVED transactions by expiration time.
    • This allows efficient expiration checks—expired transactions are always at the top.
    • Data structure:
      reserved_transactions = SortedSet()  # Store (expires_at, reservation_id) tuples
  • Crash recovery mechanism:

    • If the service crashes, it will reload all active RESERVED transactions from the database at startup.
    • Query:
      SELECT reservation_id, expires_at FROM Reservation WHERE status = 'RESERVED';
    • The service re-populates the in-memory queue with these transactions.

2. Transaction Expiry Check Mechanism

  • Efficient Expiry Detection:

    • The in-memory priority queue ensures O(1) retrieval of the next expiring transaction.
    • The system continuously peeks at the next expiration timestamp and waits until it’s due.
    • If the top transaction in the queue is expired, it is processed immediately.
  • Polling Mechanism (Worker Loop):

    • A loop runs continuously, checking for expired transactions:
      while True:
          now = current_time()
          if reserved_transactions and reserved_transactions.peek()[0] <= now:
              process_expired_transaction(reserved_transactions.pop())
          else:
              sleep_for_next_expiry()  # Sleeps until the next expiry time

3. Processing Expired Transactions

When a reservation’s payment window lapses, the handler marks it as EXPIRED (using a conditional update, as shown under the concurrency considerations below) and publishes an event to Kafka (e.g., an "Inventory.Stock_Released" event). The Inventory Service consumes this event and increments the stock back, and the Order Service can notify the user that the reservation was released.
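For illustration, the process_expired_transaction call used in the worker loop above might look like the following sketch (here it also receives a database connection; the producer setup, event payload, and DB helpers are assumptions):

# Expiry action: conditionally mark EXPIRED, then announce the released stock.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["kafka-1:9092"],
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def process_expired_transaction(entry, db_conn):
    expires_at, reservation_id = entry
    with db_conn.cursor() as cur:
        # Conditional update: if another worker already expired it, or the user paid
        # in time, zero rows match and there is nothing more to do.
        cur.execute(
            "UPDATE Reservation SET status = 'EXPIRED' "
            "WHERE reservation_id = %s AND status = 'RESERVED'",
            (reservation_id,),
        )
        if cur.rowcount == 0:
            db_conn.rollback()
            return
    db_conn.commit()
    # Downstream consumers (the Inventory Service) put the unit back into stock.
    producer.send("Inventory.Stock_Released", value={"reservation_id": str(reservation_id)})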

4. Concurrency & Scaling Considerations

  • Distributed Service Instances:

    • The expiry service should scale horizontally, meaning multiple instances can run in parallel.
    • Sharding approach:
      • Reservations can be partitioned by product_id or reservation_id.
      • Each worker is assigned a subset of reservations (ensuring no race conditions).
  • Ensuring Only One Worker Expires a Reservation:

    • Use a query like this:
      UPDATE Reservation SET status = 'EXPIRED' WHERE reservation_id = <reservation_id> AND status = 'RESERVED';
      • If the query updates 0 rows, it means another worker already processed the expiration.

This design ensures high efficiency, fast detection of expired transactions, and robust crash recovery.

4. Message Queue and Concurrency Control

Apache Kafka is the backbone of our flash sale request pipeline. All incoming purchase attempts are funneled through Kafka to smooth out the traffic spikes (a technique often called queue-based load leveling). By queuing requests, we ensure the downstream processing (order creation, payment, etc.) happens at a rate the system can handle, rather than being overwhelmed in the first second of the sale. Kafka is well-suited for this due to its high throughput and ability to retain ordered logs of events.

  • Per-Product Topics & Ordering: To enforce a strict first-come-first-serve policy, we create a dedicated Kafka topic for each flash sale product. This means all orders for “Product X” go into topic “FlashSale_ProductX”. Within that topic, Kafka preserves the ordering of events. If we need to scale consumers, we could use partitions, but we must be careful: multiple partitions could introduce out-of-order processing across partitions. For simplicity, if the inventory of the product is not huge (say 10K items) and our consumer can handle it, we might keep a single partition per product to maintain global order. Alternatively, if using partitions for scalability, we ensure the ordering key is the product_ID (thus all events for one product go to the same partition). In short, ordering is maintained on a per-product basis so that fairness is guaranteed for each item.

  • Consumer (Order Service) Concurrency: We run a pool of consumers (Order Service instances) reading from these topics. Each product topic can be consumed by one instance (or one thread) to maintain order. This way, orders for a single product are processed sequentially. For multiple products, we can consume in parallel (different topics handled by different consumer threads or instances). Kafka naturally allows horizontal scaling by partitioning topics and having consumer groups, but since each product’s topic might not need more than one partition, scaling is achieved by adding more consumers for different products rather than parallelizing one product’s consumption. This design avoids race conditions on inventory – since one consumer handles all orders for a given product, it processes one at a time, updating inventory in a safe sequence.

  • Decoupling and Async Processing: Using Kafka decouples the ingestion of requests from the processing of orders, which is crucial for handling extreme loads. The web layer (Booking Service) just drops messages in Kafka quickly and isn’t held up by complex logic. Later, the Order/Inventory logic pulls from Kafka as fast as it can, but if it falls behind, Kafka will buffer the backlog. This prevents request loss and allows the system to catch up when possible. It also isolates failures – if the Order Service goes down, incoming requests still pile up in Kafka (durably stored) and can be processed when the service restarts. This decoupling via a publish-subscribe model reduces direct dependencies between services and minimizes cascade failures. In a way, Kafka serves as the “gatekeeper” during the sale, ensuring the database is never directly hit by the full brunt of user traffic at once.

  • Atomicity and Exactly-Once Considerations: We configure Kafka and the consumers for reliability. Kafka itself is distributed and fault-tolerant, so losing a broker won’t drop our messages. The Order Service will use consumer group management so if one instance fails, another takes over reading the topic partition where it left off. We also design idempotency into consumers – if a message is ever reprocessed (in case of failure/retry), the Order Service will check if that order was already handled to avoid duplicates. Kafka can be configured with at-least-once or exactly-once semantics; we might leverage Kafka transactions or idempotent consumer logic to effectively achieve exactly-once processing, which is important to not double-sell an item.

Summary: Kafka ensures sequential processing for each product and serves as a buffer to convert the instantaneous spike into a steady stream. This dramatically reduces contention on shared resources like the database. The use of topics per product enforces fairness (each user’s request is enqueued in the exact order received). And because of the asynchronous design, user-facing operations remain snappy – the user clicks “Buy” and immediately gets a queued confirmation, rather than waiting for the entire order to complete. This improves perceived performance and avoids users retrying in frustration (which could cause more load).
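As a rough illustration of the per-product consumer described above, an Order Service worker might look like this (kafka-python style; the topic name, group id, and handle_order function are placeholders):

# Order Service consumer for one product's topic (assumes kafka-python).
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "FlashSale_Product42",                      # one topic (single partition) per product
    bootstrap_servers=["kafka-1:9092"],
    group_id="order-service",
    enable_auto_commit=False,                   # commit only after the order is handled
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for message in consumer:                        # messages arrive in strict FIFO order
    event = message.value
    handle_order(event)                         # reserve stock, update the reservation, etc.
    consumer.commit()                           # at-least-once; handle_order must be idempotent

Running one such consumer per product topic preserves the first-come, first-serve order for that product while different products are processed in parallel.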

5. Booking Service: Retrieving and Returning Queue Position

Queue Position Tracking Mechanism: The Order Service, which consumes order requests from the Kafka queue, is responsible for determining each user’s updated position in line. Under the hood, each incoming request can be tagged with a queue sequence number (for example, using the Kafka message offset as an identifier for position). In Apache Kafka, every message in a topic partition has an offset, which is essentially its position in the log (queue). The Order Service (as a Kafka consumer) continuously tracks the latest offset it has processed/committed.

When a user’s client polls for an updated position, the server (e.g., a lightweight Queue Service or the Order Service itself) can calculate how many requests are still ahead of that user’s request in the queue. One way to do this is by comparing the user’s message offset (or initial position number) with the Order Service’s current processing offset. For instance, Kafka’s consumer lag — the difference between the latest produced message and the last consumed message — indicates how many events remain unprocessed in the queue. If a user’s request is at offset 1050 and the Order Service has last processed offset 1025, then 24 messages (offsets 1026–1049) are still ahead of that request, so the user is roughly 25th in line.

Step-by-Step Mechanism:

  1. Assign Queue Position on Enqueue: When a purchase request arrives during the flash sale, it is published to the Kafka queue. At this point, the system can determine the request’s position in line – e.g., by noting the Kafka message offset or by using an incrementing counter. The user is informed of their initial queue position (for example, “You are #300 in line”).
  2. Retrieve Updated Position: When the client asks for an update (via polling), the server checks how many messages have been processed relative to the user’s position. This can be done by computing the number of messages ahead of the user in the queue. For example, the system can subtract the Order Service’s latest committed offset (or count of processed requests) from the user’s message offset to see how many are still in front.
  3. Return/Push Position to User: The server returns this number to the client in its response (e.g., { "queuePosition": 26 }). The client then updates the UI to show the new queue position. This cycle continues until the user reaches the front of the queue (position 1) and their request is picked up for processing by the Order Service.
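A minimal sketch of the position calculation from step 2 above (how the user's own offset is recorded and how the consumer's last committed offset is looked up are assumed to exist elsewhere):

# Queue position from Kafka offsets (offset bookkeeping is assumed to live elsewhere).
def queue_position(user_offset: int, last_processed_offset: int) -> int:
    """1-based position in line: requests still ahead of the user, plus the user's own."""
    ahead = max(user_offset - last_processed_offset - 1, 0)
    return ahead + 1

# Using the earlier example: queue_position(1050, 1025) == 25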

Real-Time Queue Updates: Polling vs WebSockets

Current Approach (Polling): The current design uses client polling to update queue positions. The client repeatedly sends requests to the server at intervals to get the latest position. This is straightforward to implement and works on all browsers (using regular HTTP requests). However, frequent polling can be inefficient – each request/response carries overhead (HTTP headers, connection setup) and may return no new data if the position hasn’t changed much. There’s also a slight latency trade-off: if the client polls, say, every 5 seconds, the user’s position update could be up to 5 seconds out-of-date. Polling too often reduces latency but increases server load; polling less often saves resources but gives less real-time feedback.

Alternative Approach (WebSockets): As an enhancement, the system could use WebSockets to push queue position updates to clients in real time, instead of relying on polling. With a WebSocket, the client opens a persistent connection to the server. The server can then send (push) updated queue positions to the client immediately whenever the position changes, without the client having to ask repeatedly. This means the user’s position on the screen updates in real-time (for example, moving from #42 to #41 as soon as one request ahead is processed).

Trade-offs – Polling vs WebSockets:

  • Polling (HTTP Requests): Simple and widely supported by all clients/firewalls. The server does not need to maintain client state between requests. However, polling is less efficient if many users are waiting: it generates a lot of HTTP requests and can increase server load and bandwidth usage. Polling also introduces a timing delay dependent on the polling interval (updates aren’t truly instant).
  • WebSockets: Provides instant, push-based updates with minimal latency – users see their queue position update as soon as it changes. It’s also more efficient on the network since after the initial handshake, there’s no HTTP overhead per message (no repeated headers, etc.). The trade-off is added complexity: the server must handle many open WebSocket connections and ensure they stay alive (which can be challenging at very large scale, requiring robust connection management and load balancing). There’s also a need to implement reconnection logic (e.g., if a connection drops, the client should reconnect and possibly re-authenticate). In summary, WebSockets are more scalable per update (fewer redundant requests) but harder to scale in connections, whereas polling is easier to scale in a stateless way but wastes resources with redundant queries if not tuned carefully.
