Design a Reminder Alert System

A Notification System for Reminders is a platform that lets users schedule reminders and delivers timely alerts through various channels (email, SMS, push notifications, or in-app) when those reminders are due. The system acts as a central hub to manage scheduled events (reminders) and dispatch notifications to users via their preferred channels. A real-world analogy is the reminder feature in Slack or calendar apps – for example, Slack relies on scheduled jobs to ensure messages and reminders reach users at the right time. Here, we generalize that concept to a web-scale service handling millions of users.

Key Terminology:

  • User: The person setting up reminders.
  • Reminder: A scheduled alert with a message (e.g. “Take medication”) that repeats at specified intervals (recurring). Each reminder can trigger multiple notifications over time (each occurrence is one notification).
  • Recurring Schedule: A rule for repetition (e.g. hourly, daily, monthly). This can be specified in terms like “every 4 hours,” “every day at 9:00 AM,” or calendar-based recurrences. The system translates these into concrete future trigger times.
  • Notification/Alert: The actual delivered message at a scheduled time. Notifications can be delivered via multiple channels – for example, an email, an SMS text message, a mobile push notification, or an in-app alert.
  • Time Zone & DST: Users’ reminders are often set in their local time zones. The system must account for time zone differences and Daylight Saving Time (DST) changes so that “9:00 AM daily” truly means 9:00 AM in the user’s locale even when the UTC offset shifts.
  • Web-Scale: Implies the system is designed for high scale – millions of users and reminders – with robust distribution, fault-tolerance, and low latency. The design must prevent bottlenecks and single points of failure, using distributed system techniques to handle very large workloads.

Before designing the solution, let’s clarify the key requirements.

Functional Requirements

  • Recurring Reminder Management: Users can create new reminders, update them (e.g. change schedule or message), and delete/cancel them. Each user may have up to 500 active recurring reminders at a time (supporting heavy power-users).
  • Flexible Recurrence Intervals: The system should support common intervals such as hourly, daily, weekly, monthly, etc., and possibly more complex schedules (e.g. every weekday, or custom CRON-like expressions). For simplicity, assume the core frequencies are hourly, daily, and monthly as examples. Users specify the start time (or first occurrence) and the repeat pattern.
  • Accurate Scheduling: Ensure each reminder notification is sent within a few seconds of its scheduled time. For example, if a user’s reminder is set for 10:00:00, the notification should arrive by about 10:00:03 at the latest. The system needs high timing accuracy despite the large scale.
  • Time Zone Awareness: Users can set reminders in their local time zone. The system must adjust for the user’s time zone and handle Daylight Saving Time transitions correctly (e.g. a daily 9 AM reminder still fires at 9 AM local even after a DST clock change).
  • Multi-Channel Notification Delivery: Support delivering notifications via email, SMS, push notifications, and in-app alerts. Users should be able to choose one or multiple channels for each reminder. For example, a reminder might notify via both a push notification on the user’s phone and an email. The system should abstract the sending mechanism for each channel but ensure all configured channels get the message. Multi-channel support is a core requirement.
  • In-App Alerts: The system should integrate with the product’s in-app notification system. For instance, when a reminder triggers, the user sees an alert in the web or mobile app (e.g. a pop-up or a notification center message). If the user is offline, the in-app alert may be stored so that it appears when they next use the app.
  • Real-Time Updates: Changes to reminders (edits or deletions) should take effect immediately. If a user deletes a reminder, any future occurrences should be canceled (no notifications sent after deletion). If they update the time or frequency, the system should reschedule subsequent notifications accordingly.
  • User Interface & Listing (Implied): While not explicitly asked, typically users should be able to list and view their reminders and see past notifications. This implies the system should retrieve a user’s reminders quickly (e.g. for an app screen showing “Your Reminders”).

Non-Functional Requirements

  • Scalability: The system must scale to millions of users and reminders without performance degradation. We expect potentially millions of reminder triggers per day. The design should handle growth by scaling horizontally (adding servers) and partitioning data/workload. For example, a robust notification service might target handling millions of notifications per day or even per hour.
  • High Throughput & Low Latency: It should support a high throughput of scheduling and sending operations. Delivery latency must be low – reminders are time-sensitive. The end-to-end pipeline (from scheduled time to actual delivery) should introduce minimal delay (ideally seconds or less). This necessitates efficient scheduling algorithms and asynchronous processing.
  • High Availability & Reliability: The system should be fault tolerant. No single failure (server crash, network issue, etc.) should cause lost reminders or a system-wide outage. We aim for 24/7 uptime with possibly 99.9% or higher availability. The design should include redundancy (multiple instances for each component) and data replication. Also, ensure at-least-once delivery of each notification – every scheduled reminder will be delivered even if retries are needed. (In critical use-cases we could even pursue exactly-once semantics, but at-least-once with de-duplication is typically acceptable for notifications).
  • Consistency & Correctness: The scheduling must be correct, especially around edge cases (like DST changes or leap years). No duplicate notifications for a single occurrence, and no missed occurrences as long as the service is running. Updates or deletions should not result in “ghost” notifications. Data consistency is important so that a reminder’s state is clear (e.g., not scheduling a deleted reminder).
  • Security & Privacy: Ensure that reminders are only accessible by their owners (proper authentication/authorization on APIs). Sensitive user data (phone numbers, emails, message content) should be stored securely (e.g., encrypted at rest and in transit). Also, provide safeguards against abuse (someone scheduling 500 SMS every minute could be abuse/spam – put reasonable limits or captchas to avoid turning it into a spam system).
  • Extensibility & Configurability: The design should allow adding new notification channels in the future (for example, maybe WhatsApp or voice calls) with minimal changes. It should also accommodate evolving requirements, like more complex recurrence patterns, without a complete redesign.
  • Monitoring & Logging: There should be extensive monitoring (metrics, alerts) and logging. For such a critical system, we need to monitor trigger delays, queue backlogs, success/failure rates per channel, etc. Alerts should fire if, say, deliveries are lagging beyond a few seconds or if a component fails, so that engineers can respond quickly.

Before designing the system, it’s important to gauge the scale of data and traffic we need to handle. Here are some rough estimations for a “web-scale” scenario:

  • User Base: Suppose we have 10 million daily active users using the reminder service (this could be a subset of a larger registered user base). It’s web-scale, so we plan for possibly tens of millions of users eventually.

  • Reminders per User: Each user can create up to 500 recurring reminders, but the average user might only create, say, 5–10. Heavy users might approach the limit. Let’s assume an average of 10 reminders/user to be safe. With 10M users, that’s about 100 million reminders stored in the system. (Even if our average is overestimated, we design for this upper bound. In reality, some users will have none or a few reminders, so the actual number might be lower.)

  • Total Daily Trigger Events: The number of notifications triggered per day depends on the recurrence frequencies. Let’s assume on average each reminder triggers once per day (some hourly ones will trigger 24 times a day, but many daily or weekly ones trigger less; we’ll approximate it as ~1 for a large mix). With ~100M reminders, that’s on the order of 100 million notifications per day. This is an approximation – the actual could vary, but it gives the right order of magnitude (10^8). For comparison, one design example targets ~50 million tasks per day, leading to ~34k tasks per minute on average. Our estimate is roughly double that scale.

  • Traffic Rate: 100M notifications/day breaks down to ~1.16k notifications per second on average (100M/86400s ≈ 1157/s). However, traffic won’t be uniform. Likely, there are peaks at certain times of day. For example, if many people set daily 9:00 AM reminders, we’ll see surges around top-of-hour or morning times in each region. We should design for peak bursts significantly above the average. It’s possible to have thousands or even tens of thousands of notifications in a particular second during peak hours. (For instance, a scenario of 1 million notifications in one minute has been considered in similar systems, which is ~17k/sec peak.) So our system should comfortably handle tens of thousands of notifications per second at peak.

  • Read vs Write Workload:

    • Writes: Creating or updating reminders are write operations to our storage. If we have 100M total reminders over time, the creation rate might be, say, 1% of reminders created per day (just a guess for steady state) = 1M creations/updates per day, which is about 11.6 per second on average – that’s trivial compared to notification sends. Even if spikes of user activity happen (say 100k users join and set reminders in a short time), those writes might hit a few hundred per second at peak. This is manageable with a distributed database.
    • Reads: Reading user’s reminders (e.g., when a user opens the app) is also relatively low traffic compared to sends. The heavy read-like workload is actually the scheduler scanning for due reminders and the reads/writes to process those due tasks. This will be a significant portion of system activity. We will design the scheduler to efficiently query upcoming reminders (likely by indexed time), rather than full table scans. Each notification sent also incurs a read of the reminder data (if not already in memory or included in the due query). We might treat sending a notification as a “read” from the perspective of retrieving the reminder and then a “write” to the outbound channel.
    • Notification Delivery: The act of sending out notifications isn’t a traditional database read/write, but it generates outgoing requests (to email servers, SMS gateways, push services). These external calls can also be on the order of thousands/sec at peak. We must ensure our external channel integrations (e.g., email/SMS providers) and network can handle that throughput or have a queuing/retry strategy if they rate-limit us.
  • Data Storage Volume:

    • Each reminder stored might include: user ID, schedule pattern, next trigger time, message content (text), chosen channels, and maybe some metadata. This could be on the order of a few hundred bytes per reminder. 100M reminders * 300 bytes each ≈ 30 billion bytes (~30 GB) of active reminder data. This is feasible for a distributed NoSQL database (or a partitioned SQL) to handle. We may also store a history of sent notifications (for audit or user reference), which could grow large – e.g., 100M notifications/day, if we keep 7 days of history that’s 700M records. If each history record is small (say 100 bytes), 700M * 100B = 70GB for a week’s history. We might not need to store all of that long-term; perhaps only recent history or aggregate stats.
    • As a baseline, storing 100M reminders and their next execution times is not a big issue with modern distributed storage. Even higher user counts (50M users with a few reminders each) are fine – e.g. another design projected 50GB for user data at 50M users. Our focus will be on efficient indexing by time rather than volume per se.
  • Network Bandwidth: If we send, say, 100M notifications (emails/SMS/push) per day, and each notification payload is small (let’s assume 1KB on average including headers), that’s 100M KB = ~100 GB of outbound data per day. Spread over the day it’s about 1.16MB/s average, but at peak could be an order of magnitude higher. This is within reason for a data center, but we need to ensure our design (especially the email/SMS sending components) can handle the concurrency and volume.

  • Latency Requirements: To achieve “few seconds” accuracy, our scheduler likely needs to check for due tasks very frequently (possibly every second or so). If we only checked once a minute, worst-case latency could be up to 60 seconds. So we anticipate a scheduling tick of perhaps 1–5 seconds. This increases the number of queries to the due-task datastore but is necessary for high timing precision. We will account for that in the design (e.g., making those queries extremely efficient via keys or in-memory timers).

In summary, this system will handle on the order of 10^7–10^8 reminders and notifications per day, require high throughput processing (thousands of events per second), and manage tens of gigabytes of reminder data. These numbers drive us toward a distributed, partitioned architecture with a focus on time-based data access and parallel processing of jobs.
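
As a quick sanity check on these figures, the same arithmetic can be expressed in a few lines of code. The constants below are simply the assumptions stated above (10M users, ~10 reminders each, ~1 trigger per reminder per day, ~300 bytes per reminder, an assumed ~15x peak factor), not measured values.

// Back-of-envelope check of the capacity estimates in this section.
public class CapacityEstimate {
    public static void main(String[] args) {
        long users = 10_000_000L;             // daily active users (assumption)
        long remindersPerUser = 10;           // average reminders per user (assumption)
        long totalReminders = users * remindersPerUser;       // ~100M stored reminders

        long triggersPerDay = totalReminders;                  // ~1 trigger per reminder per day
        double avgPerSecond = triggersPerDay / 86_400.0;       // ~1,157 notifications/sec on average
        double peakPerSecond = avgPerSecond * 15;              // assumed ~15x peak factor -> ~17k/sec

        long bytesPerReminder = 300;                           // rough size of one stored reminder
        double storageGB = totalReminders * bytesPerReminder / 1e9;   // ~30 GB of active reminder data

        System.out.printf("avg=%.0f/s, peak~%.0f/s, storage~%.0f GB%n",
                avgPerSecond, peakPerSecond, storageGB);
    }
}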

At a high level, we will adopt a distributed microservices architecture with distinct components for managing reminders, scheduling jobs, and delivering notifications. The design will separate the concerns of reminder storage & scheduling from notification delivery, using asynchronous queues to decouple the timing of reminders from the act of sending messages. Major components and their interactions are outlined below:

  • API Layer (Reminder Service): This is the front-end service that handles user requests – creating, updating, deleting, or listing reminders. It exposes RESTful (or gRPC) endpoints for the client app or web interface. This service performs input validation (correct date format, allowed number of reminders, etc.), applies business rules, and then persists the reminder details in the database. It also communicates with the scheduling component to ensure any new or changed reminder is accounted for in the execution pipeline. The API layer will be stateless and load-balanced across many instances, so it can handle concurrent user requests globally.

  • Database (Reminders Data Store): We will use a reliable, scalable data store to persist reminder definitions and schedules. A strong choice here is a distributed NoSQL database (like Apache Cassandra or Amazon DynamoDB) because it can handle a very high volume of writes and reads across multiple nodes with partitioning. The data model (detailed later) will be designed to efficiently query reminders by their next execution time. We’ll likely have two main tables: one for Reminder Definitions (per user, storing the recurrence pattern, message, channels, etc.) and one for Scheduled Instances (storing the next occurrence time for each active reminder, indexed by time). This separation allows easy listing by user and efficient lookup by time, at the cost of some duplication. We’ll also consider an execution log table to record when each reminder was sent, for auditing or user reference.

  • Scheduler Service: This is the heart of the system’s timing logic. The Scheduler is responsible for continuously scanning for reminders that are due to be sent and enqueuing those notifications for delivery. It operates somewhat like a distributed cron or timer system. Concretely, the Scheduler will periodically query the database for reminders that are “due” (i.e. their next execution timestamp <= now + 5s). For each due reminder it finds, it will create notification tasks (one per channel to send) and push them into a Notification Queue (described next). After scheduling a reminder occurrence, the Scheduler also computes the reminder’s next occurrence (for recurring reminders) and updates the database with the new next execution time. This component must be highly available and scalable: we will run multiple instances of the Scheduler service in parallel, partitioning the work so that each handles a subset of reminders (to share the load of potentially thousands of due tasks per second). We will detail how we avoid conflicts (no two schedulers grabbing the same reminder) using time-sharding and coordination. Importantly, the Scheduler ensures that reminders are processed on schedule – it’s designed to achieve that “within a few seconds” delivery requirement by its query frequency and distribution.

High Level Design of Reminder Alert System

  • Notification Queue (Message Bus): Once a reminder is due, we decouple the act of picking it up from the act of sending it. The Scheduler will place notification events onto a reliable queue or streaming system (e.g., Apache Kafka, RabbitMQ, or a cloud equivalent). This queue serves as a buffer that smooths out spikes in load and allows downstream sender services to process notifications asynchronously at their own pace. It also improves fault tolerance – if, say, an email service is temporarily slow, the queue will backlog but the Scheduler can continue pushing events without dropping them. Each event on the queue contains all information needed to send the notification: e.g., the reminder ID, user info, message content, and target channel. We could have a single unified “Notification Queue” with a field indicating channel, or separate queues per channel (one for emails, one for SMS, etc.). A common approach is a unified topic where multiple consumer groups (one per channel type) consume the messages they are interested in. However, since each reminder might require sending to multiple channels, the Scheduler might actually push multiple messages – one to each channel’s queue. For clarity, we can conceptually treat it as a fan-out: the Scheduler triggers an event for each channel that needs notification. The queue system should guarantee at-least-once delivery of each message, and we will design idempotency to handle duplicates if they occur. The queue also helps us implement a retry mechanism easily: if a send fails, the message can be re-queued or moved to a dead-letter queue for later inspection.

  • Notification Delivery Services (Channel Processors): For each channel (Email, SMS, Push, In-App), we will have dedicated worker services or processors. These are consumers of the Notification Queue. For example, an Email Service will pull email notification events from the queue and execute the sending of emails. Similarly, an SMS Service will handle text messages, a Push Notification Service will integrate with Apple/Google push gateways, and an In-App Notification Service will handle creating in-app messages. By separating these, each can scale and be optimized independently. If suddenly SMS traffic grows, we can add more SMS workers without affecting email. Each service will have the logic to format the message appropriately and communicate with an external provider or internal system:

    • Email: uses an SMTP server or third-party email API (like SendGrid, SES) to send the email. It will include the reminder content in the email body, addressed to the user’s email on file.
    • SMS: uses an SMS gateway API (like Twilio or Nexmo) to send the text message to the user’s phone number.
    • Push Notifications: uses platform push services (Apple Push Notification service for iOS, Firebase Cloud Messaging for Android) to deliver a push alert to the user’s device. Requires the user’s device token registered, etc.
    • In-App: this might directly write to a notifications table in our database and/or send a real-time signal to the app. For real-time, we could use WebSocket connections or a push service if the user’s app is currently open; otherwise the next time the app connects, it fetches the new notification. In-app alerts might also generate a badge count or similar.

    Each channel processor will implement retry logic for failed sends (e.g., retry an email 3 times with backoff if the SMTP server is not responding). They will also log success/failure status (possibly back to a central log or the database’s history table). This architecture (Notification Service + Queue + Channel processors) is a classic publisher-subscriber pattern, allowing the system to handle millions of messages by decoupling producers from consumers.

  • Time Zone Module: While not a standalone service, it’s important to mention how time zones are handled. The Reminder Service (API) will include logic to convert user-provided times into a standard format (likely UTC) for storage, and store the user’s time zone or offset. The Scheduler will use this information when computing next trigger times for recurring reminders. We might incorporate a library or service for handling time zone conversions and DST rules (e.g., using the IANA tz database). This ensures that a “10:00 AM every day” rule is interpreted correctly each day even if offsets change. We may also need a small calendar service or utility to compute recurrence rules (especially for monthly recurrences, which can be complex with varying month lengths, last-day-of-month rules, etc.). This is largely an internal logic component rather than a separate deployed service.

  • Load Balancing & Routing: All user-facing requests (to the API layer) will go through a Load Balancer that distributes traffic across multiple Reminder Service instances (and across multiple data centers or regions if we deploy globally). Similarly, each set of microservices (Scheduler nodes, queue brokers, channel workers) will have their own load balancing or partitioning strategy: e.g., multiple scheduler instances coordinating (via a leader or partitioning scheme), and multiple instances of each channel consumer reading from the queue (with Kafka, for instance, we’d have multiple consumer threads in a consumer group). The system should also be multi-region in deployment if needed: we could deploy a cluster in e.g. US, EU, Asia, and keep user data local to their region to optimize latency for real-time notifications. However, a global queue and scheduler can also be used with careful design (time zones help partition naturally, e.g. reminders will tend to be distributed over 24h by local time).

  • External Services: The design will integrate with external systems such as Email providers, SMS gateways, push notification services, etc. We abstract these behind our channel services, but we should acknowledge their role. If any external service has a rate limit (for example, an SMS API might allow X messages per second), we may need to throttle requests or have multiple accounts/providers to distribute load. For reliability, we might even use fallback providers (e.g., if SMS provider A is down, route through provider B).

  • Workflow Summary: When a user creates a reminder, the flow is: User → API → Database (store reminder & initial next trigger) → Scheduler (when time comes, finds due reminder) → Queue → Channel Service → External send → User receives notification. If the reminder is recurring, the scheduler also calculates and schedules the next occurrence (updates DB) before or after sending. If a user updates or deletes a reminder, the API will update the DB and the scheduler will either reschedule or remove any pending actions (the scheduler always uses latest data from DB, so a deletion will mean it no longer finds that reminder when scanning).
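
To make the hand-off between the Scheduler and the queue concrete, here is a minimal sketch of what one queued notification event could look like, following the fields listed for the Notification Queue above (reminder ID, user info, message content, target channel). The record and field names are illustrative assumptions, not a prescribed schema.

import java.time.Instant;

// One message on the Notification Queue; the Scheduler publishes one such event per configured channel.
public record NotificationEvent(
        String reminderId,       // which reminder triggered
        String userId,           // owner, used to resolve contact details (email, phone, device token)
        Instant scheduledTime,   // the occurrence this event belongs to
        String channel,          // "email", "sms", "push", or "in_app"
        String message) {        // reminder text to deliver

    // The same occurrence of the same reminder on the same channel always produces the same key,
    // so channel services can drop duplicates arising from at-least-once delivery.
    public String idempotencyKey() {
        return reminderId + ":" + scheduledTime.toEpochMilli() + ":" + channel;
    }
}

For a reminder configured with ["email", "push"], the Scheduler would publish two such events, one per channel.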

This high-level design emphasizes separation of concerns: the when (scheduling) is handled independently from the how (delivery), which improves scalability and maintainability. Next, we’ll drill down into each part, including data models and algorithms to meet the requirements.

Designing the right data schema is critical for performance. We need to support two primary access patterns efficiently:

  1. Lookup reminders by next execution time (for the Scheduler to find what’s due now).
  2. Lookup reminders by user (for user queries and for computing next occurrences, etc.).

To achieve this, we will use a denormalized approach with two main tables/collections:

  • Reminders Table (Definition Storage): This stores the primary data for each reminder. Each entry includes: ReminderID (primary key, could be a UUID), UserID (the owner, also used for partitioning), the RecurrencePattern (e.g. “daily at 9:00”), the TimeZone (e.g. “America/Los_Angeles”), the MessageContent (text of the reminder), and the Channels (which channels to send to, e.g. [“email”, “push”], along with necessary addresses like email address or phone number – or those could be looked up from a user profile). We also store metadata like CreatedAt, maybe LastSentAt and NextScheduledTime (though the next time will also be in the schedule table). In a relational model, this might be a normalized table referencing a User table for contact info; in NoSQL, we might duplicate some info for quick access.

    • Partitioning: We can partition this table by UserID. That way, retrieving all reminders for a given user is efficient (all in one partition or one shard). The primary key could be a composite of UserID + ReminderID (with ReminderID as a clustering key or sort key if using Cassandra). This fits because user-centric queries (like listing or modifying reminders) are common and should be quick. It also spreads data across partitions by user, avoiding any single hot partition provided users are numerous.
    • This table is updated on user actions (create/update/delete). It’s not frequently scanned in full, so it can be large. By partitioning on user, the system can handle millions of users.
  • Schedule Table (Next Occurrences): This table is optimized for time-based queries. Each entry represents the next scheduled execution time of a reminder that is currently active (has a future occurrence). Key fields: NextExecutionTime (the timestamp for the next trigger, likely in UTC), and ReminderID (or a reference to the reminder). We also include a partitioning key for sharding, possibly a TimeSlot or Segment (explained below). Additional fields might be included to avoid joins – e.g., we could duplicate the UserID and maybe the Channels or message snippet here if needed. However, we can also fetch details via ReminderID from the Reminders table if efficient (which is partitioned by user, so we’d need UserID too; including UserID here would help that lookup).

    • Primary Key Design: To efficiently query by time, we can use NextExecutionTime as part of the primary key. In a relational DB, we’d index this field for range scans (“give me all reminders where next_time <= now + 5s”). In a NoSQL like Cassandra, we might designate NextExecutionTime as a partition key. However, a single partition per exact timestamp could become a hotspot if many reminders share the same timestamp. A better approach is to include a higher-level time bucket or an additional key to spread out load.

      • One proven strategy: convert NextExecutionTime to a UNIX epoch minute (or second) and use that as a partition key, possibly combined with a secondary key for distribution. For example, the partition key could be ExecutionMinute = floor(epoch_time/60) (if we go by minute). Then all reminders due in the same minute go to the same partition. This way, when the scheduler queries “what’s due this minute?”, it hits a single partition and gets all items. We can further refine to second-level if needed for finer accuracy (partition by exact epoch second for due time), though that results in many more partitions. Minute-level partition is a balance between partition count and timing resolution.
      • Segment Sharding: To handle extremely high volume in one time slot, we introduce a Segment number in the key. The idea is to pre-divide each time partition into multiple segments so that multiple scheduler workers can each handle a slice without contention. For example, we define Segment values 1 through N (say N=10 or N=16) and assign each reminder randomly to a segment when scheduling. Then the composite primary key could be (ExecutionMinute, Segment) as the partition key and ReminderID as the clustering key. This way, reminders due in the same minute are distributed across N partitions (one per segment) rather than all in one. A scheduler instance responsible for that minute can query one segment or a subset. We will have a coordinator assign segments to scheduler nodes (discussed in Scheduling algorithm). In Cassandra terms, ExecutionMinute+Segment as partition key ensures tasks in the same minute/segment are grouped.
      • By using these keys, the Schedule table allows efficient retrieval of “due reminders” without scanning the entire dataset. We avoid a full table scan (which would be untenable at 100M+ rows) and instead do a targeted lookup on the current time slot partition. This optimization was necessary to eliminate cluster-wide scans for each query.
    • Data in Schedule Table: At minimum, it has NextExecutionTime, Segment, and ReminderID. We likely include UserID to quickly fetch the full reminder details (since ReminderID by itself might not tell us which partition in the Reminders table to look at, if partitioned by UserID). Alternatively, we can make ReminderID globally unique and have a global index in the Reminders table – but that’s more complex. Including UserID and maybe some of the reminder data (like channels or message excerpt) in the schedule table entry can make the Scheduler’s job faster (it can decide what to do without an extra DB round-trip). This is denormalization for performance. The trade-off is that if the user updates the reminder content or channels, we need to update both tables. This is manageable with proper design (the API service can update both consistently).

    • When a reminder triggers and we compute the next time, we will update or insert a new entry with the new NextExecutionTime. We might update the same row’s key (which is tricky if the time changed – essentially it’s a delete old + insert new under a different partition). More likely, the flow is: remove the old entry and insert a new entry with the new time (since the partition key changed). If a reminder is deleted or completed, we remove its entry from this table entirely.

  • User Profile / Contact Info (optional): If channels require contact points (like email address, phone number, device token), we might have a User table or some storage for that. In many architectures, this is separate from the reminder system (managed by a user service). Our reminder system could call out to a User Service to get the latest contact info when needed. Alternatively, we can store needed info in the Reminders table for convenience (for instance, store the target email or phone for that reminder explicitly, if users might want different emails per reminder – unlikely, probably they use their main contact). It’s probably safer to store just userID and have channel services resolve the actual delivery address via a profile lookup, ensuring we always use up-to-date info. This aspect can be abstracted, but we should note it in design.

  • Notification Log (Execution History): A table to record each time a reminder was executed (timestamp, reminderID, result, maybe the content that was sent, and status like delivered/failed). Partition by ReminderID (or by UserID) so that we can retrieve history for a given reminder or user easily. Sort by execution time so we can get recent executions quickly. This table can grow large; we might enforce a retention (e.g., keep last 30 days of history, or archive older entries to cold storage). The log is useful for debugging (e.g., “Why didn’t I get my reminder yesterday?” we can look it up and see if it failed to send). Each channel service would insert into this log after attempting to send (or the Scheduler might insert when it hands off to queue, marking “scheduled, pending delivery”). This is more for completeness; the core functionality doesn’t rely on the log.

Choice of Database: Given the volume and access patterns, a NoSQL datastore like Cassandra is well-suited. It can handle high write throughput (for inserts/updates on schedule table), and its wide-column model allows the composite key approach we outlined. Cassandra’s partitioning will distribute data by the partition key (which we set to time+segment), and replicate for fault tolerance. Another option is Amazon DynamoDB which has similar partition key concepts (we’d use a compound key with time bucket). DynamoDB even supports TTL on items, which could auto-expire past events if we used it for one-time schedules – but for recurring, we keep updating future times. If we prefer SQL for simplicity, we’d likely partition the schedule table by date (e.g., separate physical partitions or tables for each day or hour) to aid performance, and use an index on time. However, the relational approach might become a bottleneck beyond a certain scale or require careful query planning (full table index on millions of rows might handle thousands of lookups per second, but tens of thousands could be tough). Therefore, we lean towards distributed NoSQL for the scheduler data. For the Reminders definition table, relational could work (since user-based lookups are not extremely high volume), but to simplify our stack, using the same store for both is fine. Cassandra can store the reminders by user partition, as described, and it’s efficient as long as no single user has an enormous number of entries (500 max in our case, which is trivial within one partition).
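
To summarize the data model, the two main entities might look roughly like the following; the partition and clustering keys in the comments mirror the Cassandra-style layout described above, and all names are illustrative assumptions rather than a fixed schema.

import java.time.Instant;
import java.util.List;

// Reminders table: partitioned by userId so “list my reminders” touches a single partition.
public record ReminderDefinition(
        String userId,             // partition key
        String reminderId,         // clustering key within the user’s partition
        String recurrencePattern,  // e.g. "daily at 09:00"
        String timeZone,           // IANA zone, e.g. "America/Los_Angeles"
        String message,
        List<String> channels,     // e.g. ["email", "push"]
        Instant nextScheduledTime,
        Instant createdAt) {}

// Schedule table: partitioned by (executionMinute, segment) so the Scheduler reads one small
// partition per tick instead of scanning all reminders.
public record ScheduleEntry(
        long executionMinute,      // floor(epochSeconds / 60) of the next trigger (partition key)
        int segment,               // random 1..N shard within that minute (partition key)
        String reminderId,         // clustering key
        String userId,             // denormalized so the full reminder can be fetched quickly
        Instant nextExecutionTime) {}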

Example: A sample entry in the Schedule table might look like:

Partition (ExecutionMinute=29146140, Segment=3): { ReminderID: "rem12345", UserID: "user6789", Message: "Meeting in 5 minutes", Channels: ["email","push"], NextExecutionTime: 2025-06-01T09:00:00Z, TimeZone: "America/Los_Angeles" }

This indicates that the reminder rem12345 for user 6789 is due at that UTC time. The scheduler querying the partition for 2025-06-01 09:00 UTC, segment 3 will find this. It can then process sending “Meeting in 5 minutes” via email and push. After sending, it will compute the next occurrence (say daily, so nextExecutionTime becomes 2025-06-02 09:00:00Z) and insert a new entry possibly in a different partition (minute and maybe different segment) for the next day.
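
To make the key math explicit, here is a small sketch of how an entry is (re)scheduled under the epoch-minute plus random-segment scheme; for the sample entry above, executionMinute for 2025-06-01T09:00:00Z evaluates to 29146140. The scheduleStore calls in the usage comment are hypothetical placeholders for the actual database client.

import java.time.Instant;
import java.util.concurrent.ThreadLocalRandom;

public class ScheduleKeys {
    static final int SEGMENTS = 16;   // number of segments per minute bucket (assumption)

    // Partition key part 1: the minute bucket the trigger falls into.
    static long executionMinute(Instant nextExecution) {
        return nextExecution.getEpochSecond() / 60;
    }

    // Partition key part 2: a random segment so one busy minute is spread across several partitions.
    static int randomSegment() {
        return ThreadLocalRandom.current().nextInt(1, SEGMENTS + 1);
    }
}

// Usage when rescheduling after a trigger (scheduleStore is a hypothetical data-access object):
//   scheduleStore.delete(oldMinute, oldSegment, reminderId);
//   scheduleStore.insert(ScheduleKeys.executionMinute(nextTime), ScheduleKeys.randomSegment(),
//                        reminderId, userId, nextTime);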

6.1 Scheduling Algorithm and Reminder Execution Pipeline

The Scheduler Service is responsible for executing reminders on schedule. It’s essentially a distributed cron engine tailored to our data model. Here’s how it functions in detail:

  • Scheduler Instances and Coordination: We will run multiple instances of the Scheduler service to handle the volume. To avoid two instances picking the same reminder, we partition the work by the keys we set up (time and segment). One approach is to have a primary coordinator for schedulers that assigns each instance a set of segments to handle. For example, if we have 8 scheduler workers and we defined segments 1–8, the coordinator can assign each worker one segment (or if fewer workers than segments, some get multiple). The coordinator can also monitor heartbeats – if a worker dies, the coordinator reassigns that worker’s segment range to another, so no scheduled tasks are missed. This dynamic segment assignment ensures fault tolerance. Alternatively, we could design it such that each scheduler instance knows its index and just takes segment = index mod N (if stable), but the explicit coordinator approach is more flexible for scaling up/down and handling failures.

  • Time-based Scanning: Each scheduler instance runs a loop (or a timed trigger) to scan for due reminders in its assigned segment(s). The frequency of scanning should be high to meet our timing needs. A simple strategy is: every second (or every few seconds), check the Schedule table for entries where NextExecutionTime equals the current time (to the minute or second) and Segment ∈ (the segments this instance is responsible for). For example, at 12:00:00 UTC, the scheduler instance for segment 3 will query “SELECT * FROM ScheduleTable WHERE ExecutionMinute = <12:00 expressed as an epoch minute> AND Segment = 3” (if minute-level). It will retrieve all reminders due in that minute for that segment. If we use second-level precision, it might include seconds in the key and query for an exact timestamp match. In practice, we might fetch a small range (<= now) to catch anything slightly overdue (for instance, if there was a slight delay, anything due up to now gets picked up). The result of the query is a list of due reminders that this instance should process; a sketch of this scanning-and-processing loop appears after this list. Because we segmented by Segment and coordinated assignments, no other scheduler will handle these same entries, avoiding double processing.

  • Processing Due Reminders: For each due reminder retrieved, the Scheduler will perform the following steps:

    1. Fetch/Verify Reminder Details: If the schedule entry doesn’t contain all needed info (e.g., maybe it only has ReminderID and we need the full message or updated channels), the scheduler will fetch the corresponding record from the Reminders table (using UserID + ReminderID). This ensures we have the latest data in case the user updated the reminder after it was initially scheduled. Alternatively, if we stored the message and channels in the schedule entry (denormalized), we might trust that data for sending. But to be safe regarding updates, a quick lookup can help (this is a trade-off: extra read vs. ensuring latest info). Because this is keyed by UserID partition, it’s a fast primary-key read.

    2. Compute All Target Notifications: Determine which channels need to be notified and gather the destination addresses. For example, if channels = [Email, Push], fetch the user’s email address (maybe from the reminder or a user profile service) and the user’s device push token (perhaps stored when they logged in). This could involve calling a User/Device service or looking up a cache. We then prepare the payload for each channel (e.g., format the email subject/body, the SMS text, etc., possibly using templates if needed).

    3. Enqueue Notification(s): For each channel, create a message and push it to the Notification Queue for that channel. If using a unified queue with a channel field, publish one message with all info; channel processors will filter or ignore if not their type. If using separate queues per channel (likely simpler), push one message to each relevant queue/topic. Each message contains at least: the reminder ID, user ID, message content, and target contact (or enough to derive it). The queue will handle durable storage and delivery to consumers. We ensure the enqueue operation is robust – if the queue is down, we might have to retry or, worst case, mark the reminder as not sent. Typically, these queue systems are highly available.

    4. Schedule Next Occurrence: If the reminder is recurring (which it is, in our case), calculate the next execution time after the current one. This depends on the recurrence pattern:

      • Hourly: add the hour interval. E.g., if it’s every 4 hours, next = current time + 4 hours (taking time zone into account – likely our times are in UTC internally for schedule, so 4h is straightforward in UTC for a fixed interval).
      • Daily: we need to add 1 day in the user’s local date. This is where we use the stored TimeZone. We can take the current execution time, interpret it in the user’s timezone, then add one day (24 hours in that timezone context, which accounts for DST if applicable). For example, if it was 9:00 AM PDT on June 1, the next local 9:00 AM might be 24 hours later which on UTC could be +24 or +23/+25 hours if crossing a DST boundary. A robust way is using a library: e.g., in Java, use the ZoneId and ZonedDateTime for the user’s zone, then .plusDays(1) for daily recurrence, which auto-adjusts for DST jumps. Similarly for monthly: add one month (which handles month lengths, though if something was on the 31st and next month has 30 days, one strategy is to move to last day of month or choose 30th; these are policy decisions).
      • Monthly: likely interpreted as “same day of month each time” (if the user set 15th of every month at 10:00, do that). We must handle if a month is shorter (some systems send on the last day if so, or on the 1st of next month – but typically last day of shorter month is a common approach).
      • We update the Schedule Table: remove or mark the old entry as done and insert a new entry with the new NextExecutionTime (converted to UTC) and the same ReminderID, assigning it a new random Segment value (to keep distribution even) unless we choose to keep the same segment (random each time is fine to balance load long-term). This insertion is a simple DB write that will be picked up in the future when its time comes. If the reminder was marked as “non-recurring” or the user only wanted one occurrence (like a one-time scheduled notification), we would simply not insert a new one (thus completing the job). Here, since they are recurring by definition, we continue until the user deletes it.
    5. Acknowledge completion: Depending on implementation, we may update the Reminders table’s LastSentTime or similar for record-keeping. The history log entry can also be added now (e.g., “reminder X was executed at time Y, enqueued to channels…”).

    6. Error Handling: If any step fails (e.g., database update fails or queue push fails), the Scheduler should have a retry mechanism. For instance, it might try a few times or move on and let a future tick pick it up again. It’s crucial to avoid a scenario where a reminder is “stuck” and never gets enqueued. We might use an atomic update approach: e.g., use a lightweight transaction or a compare-and-set to mark a reminder as being processed to avoid duplicate processing by a failover node (if our coordination is robust, we may not need this). We also ensure idempotency: if, by rare chance, two scheduler nodes tried the same reminder, the reminderID and timestamp can serve as an idempotency key to avoid sending twice (channel services can check if they already sent a notification with that ID). We aim to design so that doesn’t happen, but belts-and-suspenders is good for reliability.

  • Frequency of Checks: As mentioned, to achieve low latency, we might have the Scheduler check every second or every few seconds for due tasks. With the partitioning strategy, these checks are cheap (a single partition read per segment for the current time slot). If we did it every second and there are, say, up to 50k tasks per minute, that’s on average <1000 tasks per second being fetched – split among segments, each segment might only see a few hundred per second, which is doable. The scheduler can batch process all due tasks in that second’s partition. If we use minute-level partitions, the scheduler might run every 10-15 seconds and fetch the current minute’s tasks, or fetch a minute a little ahead of time and then sleep. We might even look one minute into the future and pre-fetch tasks into memory, scheduling them on an internal timer for exact second dispatch. This could further tighten accuracy but adds complexity. A simpler approach: store with second precision in DB and query frequently. Given modern databases and moderate partition sizes, querying every second is not impossible, but could cause a lot of tiny queries. Possibly querying every 5 seconds for tasks in the next 5-second window is a compromise. In any case, the design target is being within a few seconds tolerance. The exact scheduling tick can be tuned and multiple schedulers ensure parallelism.

  • Scaling the Scheduler: Using the segment partitioning, we can horizontally scale. If one Scheduler instance can process, say, 1000 tasks per second, and peak we need 10,000 per second, we deploy 10 instances (and set number of segments accordingly). Because we assign segments and each scheduler only queries its segments, they operate in parallel safely. We avoid two instances hitting the same partition concurrently by design. If an instance goes down, its segments will not be queried until reassigned – the coordinator will detect the missed heartbeat and reassign those segments to others or a standby. This should happen quickly to avoid missing tasks. Even if a minute’s tasks were partially processed and the instance died, when a new scheduler picks up that segment, it might find some tasks whose time is already passed; it should still process them (perhaps marked as slightly delayed but still deliver). This guarantees at-least-once delivery even in failure scenarios, though a few notifications could be late by a few seconds or a minute during a failover.

  • Avoiding Skew and Hotspots: If a very popular time (like at exactly 9:00:00) has an enormous number of reminders, we rely on segments to distribute those among multiple machines. Also, randomizing segment assignment on insertion spreads out reminders roughly evenly. If one user has 500 reminders all due at 9:00, they’ll be randomly assigned segments 1–N so they won’t all fall on one segment. This smooths the per-segment load. We choose the number of segments (N) to be >= max number of scheduler instances we might need. For example, if we foresee up to 20 instances, we might choose N=32 segments, leaving headroom. This way, we can always distribute segments without running out. The coordinator can assign roughly equal count of segments to each instance (some might have one more than others if not divisible).

  • Example Flow: Suppose user Alice creates a daily 9:00 AM reminder in PST (UTC-8 standard time, UTC-7 during DST). She does this at 1 PM on Jan 1. The Reminder Service calculates the first occurrence: since 9 AM on Jan 1 has already passed, Jan 2 at 9:00 AM PST is next. In UTC that is Jan 2 at 17:00 (5 PM), since PST is UTC-8. It stores an entry with NextExecutionTime = 2025-01-02T17:00Z. On Jan 2, as the time approaches 16:59:xx UTC, the scheduler responsible for that time slot will pick up the entry. It will see it’s due at 17:00. It waits until roughly that time (maybe it queries at 16:59:55 for anything due up to 17:00:00). It finds Alice’s reminder, fetches the details (message “Daily Stand-up meeting”, channel “push notification”), and sends the event off to the queue. The Push Notification service picks it up and sends it to Alice’s device. Meanwhile, the Scheduler computes the next occurrence: Jan 3 at 9:00 AM PST, which is Jan 3 at 17:00 UTC (DST has not changed yet), and inserts a new schedule entry for that. In March, when the DST shift happens (clocks spring forward and the local offset changes from UTC-8 to UTC-7), the Scheduler will compute the next execution using the time zone rules – on DST start, the 9:00 AM local trigger moves an hour earlier in UTC (from 17:00 to 16:00). Thus Alice never notices anything except that her reminder always arrives at 9:00 local time.
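
Tying these steps together, the sketch below shows one tick of a scheduler instance for a single segment it owns (as referenced in the time-based scanning bullet). It reuses the ScheduleEntry, ReminderDefinition, NotificationEvent, and ScheduleKeys sketches from earlier; the store, queue, and recurrence interfaces are hypothetical stand-ins for the real components.

import java.time.Instant;
import java.util.List;

public class SchedulerTick {

    interface ScheduleStore {        // hypothetical DAO over the Schedule table
        List<ScheduleEntry> findDue(long executionMinute, int segment);
        void delete(long executionMinute, int segment, String reminderId);
        void insert(long executionMinute, int segment, String reminderId, String userId, Instant next);
    }

    interface ReminderStore {        // hypothetical DAO over the Reminders table
        ReminderDefinition get(String userId, String reminderId);
    }

    interface NotificationQueue {    // hypothetical producer over Kafka/RabbitMQ/etc.
        void publish(NotificationEvent event);
    }

    interface RecurrenceEngine {     // computes the next occurrence in the user’s time zone
        Instant nextOccurrence(ReminderDefinition reminder, Instant after);
    }

    // Process everything due in the current minute bucket for one segment assigned to this instance.
    void tick(ScheduleStore store, ReminderStore reminders, NotificationQueue queue,
              RecurrenceEngine recurrence, int segment) {
        long minute = Instant.now().getEpochSecond() / 60;
        for (ScheduleEntry due : store.findDue(minute, segment)) {
            // Re-read the definition so late edits are honored (defensive: deletions normally
            // remove the schedule entry as well).
            ReminderDefinition r = reminders.get(due.userId(), due.reminderId());
            if (r == null) {
                store.delete(minute, segment, due.reminderId());
                continue;
            }
            for (String channel : r.channels()) {    // fan out one event per configured channel
                queue.publish(new NotificationEvent(r.reminderId(), r.userId(),
                        due.nextExecutionTime(), channel, r.message()));
            }
            // Reschedule: delete the old row and insert the next occurrence under a new key.
            Instant next = recurrence.nextOccurrence(r, due.nextExecutionTime());
            store.delete(minute, segment, due.reminderId());
            store.insert(ScheduleKeys.executionMinute(next), ScheduleKeys.randomSegment(),
                    r.reminderId(), r.userId(), next);
        }
    }
}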

6.2 Notification Delivery per Channel

Once the Scheduler has enqueued the notification events, the Channel Processors take over to actually deliver messages. We design each channel service to be scalable and reliable:

  • Email Service: This service (likely a cluster of workers) pulls email-type notifications from the queue. For each message, it prepares an email. This could involve plugging the reminder text into an email template (for example, a subject like “Reminder: {Reminder Text}”). The service then sends the email using an SMTP server or an email sending API. At web-scale, sending email directly via SMTP might be too slow (opening connections etc.), so integration with a service like Amazon SES or SendGrid is preferred – they handle the heavy lifting of email infrastructure. We just call their API (possibly batched). The Email service should handle results – e.g., the API might immediately say “accepted” or might return an error. If an error occurs (e.g., transient network issue or provider outage), the service should retry a few times. If it ultimately fails, it logs the failure (and maybe pushes the task to a dead-letter queue or marks it in the database for later retry). We also consider quotas – if the user has too many emails going out (500 daily reminders all at once), we rely on provider limitations or perhaps implement our own rate limiting (but since the user explicitly set them, we assume it’s okay, though we might stagger them by a few seconds to not appear as spam). The service also needs to handle email bounces or invalid addresses – those could be fed back into a “do not send” list if email is bad. However, that’s more of an email system concern. For our design, main point is it sends reliably and scales horizontally (we can run many instances pulling from the email queue). If using Kafka, we can have a consumer group such that partitions of the queue are split across instances (achieving parallel consumption). The queue ensures each message goes to one consumer.

  • SMS Service: Similar to email, this service reads SMS notification events. It calls an SMS gateway API (like Twilio) to send the text. Usually, you provide the phone number and message content. We likely have the user’s phone on file (perhaps in the user profile, linked by UserID). If not included in the message, the SMS service might do a quick lookup to get the phone number. After sending, we get a success or failure. SMS gateways might have rate limits (e.g., Twilio might limit X messages/second per number or account). We may need to throttle or purchase more capacity if needed. For scale, multiple SMS service instances can send concurrently; the gateway and telecom networks handle queuing on their side too. We implement retries for failures (e.g., network timeout, or gateway returned an error). If a phone is invalid or opt-out, the gateway usually tells us and we should respect that (maybe marking that user’s SMS channel as invalid). Given SMS can be costly, we also ensure we only send what’s necessary (the system might avoid duplicate SMS if the same reminder was already delivered by another channel, but since the user likely wanted both, we do both). SMS content is short, we might truncate or split if too long for one message.

  • Push Notification Service: This handles push notifications for mobile/web. For mobile apps, we use Apple APNs and Google FCM. We need device tokens or registration IDs from the user’s devices. Let’s assume the user’s devices are registered in a separate Device Service – our Push service might query that by UserID to get all active device tokens to send the reminder to (e.g., user might have phone and tablet). For each device, it constructs a payload (title/body for the notification). Then it connects to APNs or FCM (usually over HTTP/2 or their APIs) to send the notification. These services are quite scalable but also have rate limits and require maintaining authentication (keys, certs). We manage that within the Push service. The Push service should also handle responses – e.g., if a device token is invalid (user uninstalled app), APNs/FCM will return an error, and we should unregister that token in our device database. For reliability, push notifications are typically fire-and-forget (APNs/FCM will handle delivery to device if online). We rely on their infrastructure for retries to the device. Our concern is just getting it to their servers successfully. We might do limited retry on failure to contact APNs/FCM. This service also scales horizontally (multiple instances can each handle subsets of notifications; if using Kafka, partition by user or by device ID to ensure ordering per device perhaps).

  • In-App Notification Service: For in-app, since the user might be looking at the app or might open it later, the strategy can differ:

    • If the user is currently online (e.g., has a WebSocket connection to the server or is polling), we can push the notification in real-time. This could be done via a WebSocket server or push gateway that the app is connected to. In design terms, we might integrate with an existing real-time messaging system (for example, if the app uses something like Socket.io or a pub-sub for live updates). The In-App service would publish the reminder event to that channel (e.g., user-specific channel).
    • Additionally, or if the user is offline, we store the notification in a persistent store (maybe this is simply the same as the log table or another “notifications inbox” table keyed by user). The next time the user opens the app, they fetch unseen notifications. Many applications have a notification center in-app; we would populate that.
    • Implementation could be: the In-App service receives the event, writes a record to a UserNotifications table (UserID partition, with a list of notifications, unread flags, etc.), and also pings a real-time system to notify the user immediately if possible.
    • This service should ensure that even if the real-time push fails (user offline), the data is persisted so the user doesn’t miss it. It’s less about external integration (except the real-time channel) and more about internal data handling.
    • Scaling: It’s mostly writing to a DB and optionally sending realtime messages, which can be scaled by partitioning by user. Many app frameworks handle this as part of the main app, but we illustrate it as a service.
  • Unified Notification Processing: Each channel service operates independently, but we should maintain some consistency: e.g., if a reminder had multiple channels, ideally they all go out around the same time. By pulling from the queue, generally they will. Minor differences (one channel might send faster than another) are acceptable. We just ensure none are unreasonably delayed.

  • Logging and Acknowledgment: After a channel sends the notification, it can log the outcome. Successful sends might update the Notification Log (execution history) with a delivered status. Failures may either go to the log with a failed status and possibly trigger an alert if persistent. If a message fails after retries, we could escalate (maybe notify the user via an alternate channel that one channel failed? For example, if an email failed to deliver, perhaps send an in-app alert “Your email reminder could not be delivered.” This is an edge consideration and may not be necessary if they’ll see it in-app anyway). The system can also aggregate stats like delivery latency, success rates per channel, etc., for monitoring.

  • Idempotency: To guard against duplicate deliveries (in case the same message ID gets processed twice due to a rare bug or retry), each channel service can use an idempotency key (for example, a combination of ReminderID and the scheduled timestamp) and keep a cache of recently sent notifications. If it sees a duplicate, it can drop it. Or the Notification Queue could be configured for exactly-once processing if using something like Kafka with careful consumer management (though exactly-once at scale is tricky, at-least-once is more common). Our design leans on at-least-once and making operations idempotent, which is simpler.
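
As a concrete example of the retry and idempotency handling described above, here is a minimal sketch of an email worker consuming events from the queue. It assumes the NotificationEvent record from the earlier sketch; the Sender and DedupCache interfaces are placeholders for the real provider client (e.g., an email API) and a shared cache (e.g., Redis).

public class EmailWorker {

    interface Sender {            // wraps the external email provider’s send call
        void send(NotificationEvent event) throws Exception;
    }

    interface DedupCache {        // remembers recently processed idempotency keys
        boolean markIfNew(String idempotencyKey);   // true only the first time a key is seen
    }

    private static final int MAX_ATTEMPTS = 3;

    // Invoked for each event consumed from the email queue/topic.
    void handle(NotificationEvent event, Sender sender, DedupCache cache) throws InterruptedException {
        if (!cache.markIfNew(event.idempotencyKey())) {
            return;               // duplicate from the at-least-once queue; drop it silently
        }
        long backoffMs = 500;
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try {
                sender.send(event);
                // On success: record a “delivered” entry in the notification log (omitted here).
                return;
            } catch (Exception e) {
                if (attempt == MAX_ATTEMPTS) {
                    // After the final failure: log “failed” and hand the event to a dead-letter queue (omitted).
                    return;
                }
                Thread.sleep(backoffMs);   // simple exponential backoff between retries
                backoffMs *= 2;
            }
        }
    }
}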

6.3 Time Zone and Daylight Savings Handling

Handling time zones correctly is crucial for a reminder system – it’s what makes “recurring at 9:00 AM local time” possible. Here’s how we address it:

  • Storage of Time Zone: Each reminder created by a user will carry a TimeZone attribute (likely the IANA time zone string like “America/New_York”). The user’s client can provide this (most apps know the user’s locale or have it in profile). If not, we default to the user’s account timezone. This is stored in the Reminders table. We always keep the original intended timezone and local time, rather than converting to a fixed offset, because offsets change with DST. For example, we might store “Every day at 09:00 in America/Los_Angeles”.

  • Normalization to UTC for scheduling: Internally, for the Schedule table, we use UTC timestamps for NextExecutionTime. When we schedule the next occurrence, we calculate the exact UTC time at which it will happen, using the timezone rules for the conversion. For example, if the user’s local 9:00 AM on a given date corresponds to 16:00 UTC, we store 16:00 UTC. This UTC value is what the scheduler queries against, which decouples scheduling from time zone math at runtime – the heavy lifting is done when computing next times, not when querying.

  • Calculating Next Occurrence with DST: We rely on standard libraries or algorithms for recurrences:

    • We can use a library like rrule (used in iCalendar recurrence) or simply use date-time arithmetic in the user’s timezone. For daily/weekly recurrences, the pattern is: take the last execution’s local date/time, add the interval (1 day, 7 days, etc.) in calendar terms in that timezone, then convert to UTC for storage. This automatically handles DST shifts. For example:

      • DST Spring Forward: Suppose a daily 2:30 AM reminder in a region where clocks jump forward from 2:00 AM to 3:00 AM on March 14. On March 13 it fired at 2:30 AM, but on March 14 the local time 2:30 AM does not exist (clocks skip from 1:59 to 3:00). With calendar addition, most libraries roll such a time forward past the DST gap, effectively scheduling it at 3:00 AM (or at the next valid time). We must choose a policy: either skip that occurrence or fire at 3:00 AM. Calendar apps typically fire at 3:00 AM, so the user still gets the reminder that day, albeit 30 minutes late. Conversely, in fall when clocks repeat an hour, a daily 1:30 AM reminder could match twice; we only want one trigger (most systems fire at the first occurrence of 1:30 and do not fire again during the repeated hour, to avoid duplicates). These nuances can be handled by relying on library behavior or by storing an extra flag that a DST transition occurred. For our scope, assume that applying the timezone’s rules via a library yields an acceptable outcome (most likely an automatic skip or adjustment), and we can document that reminders adjust smoothly across DST transitions, possibly with a minor one-time shift.
    • For hourly intervals (say “every 4 hours”), we treat them as fixed increments in absolute time (e.g., 4 hours in UTC) rather than clock-time increments. Note that around a DST change, a fixed 4-hour interval may look like 3 or 5 hours on the local clock. When a user says “hourly,” they usually mean “just keep coming every hour,” which is simple to implement: add 3,600,000 ms to the last UTC time for each hour of interval. The pattern relative to local time will temporarily shift by an hour across a DST transition, but that is acceptable. If the user specifically wanted the reminder at :00 of each hour in local time, that would be a different interpretation (unlikely unless explicitly stated). We assume hourly means a fixed period.

    • For monthly recurrences, we have to handle different month lengths. If someone sets a reminder for the 30th of each month, February has no 30th; we can decide to send it on the last day of February that month. This can be an internal rule. Recurrence-rule (RRULE) libraries allow specifying things like “last weekday of the month,” but we won’t go that deep; we simply note that our system would have logic for “if the next occurrence falls on a non-existent date, adjust to the last day of the month.”

    • We will maintain a consistent approach: the original creation of the reminder establishes the pattern. For recurring tasks, it’s common to store a recurrence rule (an iCalendar RRULE or a cron expression, plus the timezone). We could store something like “FREQ=DAILY;INTERVAL=1;BYHOUR=9;BYMINUTE=0” together with “TZID=America/Los_Angeles” on the start time; that captures the schedule fully. Then on each trigger, rather than naive addition, we compute the next occurrence from that rule (libraries exist to get “the next occurrence after X”). This is more precise and handles edge cases such as skipping DST duplicate times elegantly. However, full RRULE support may be overkill; simpler custom logic as above is fine for typical patterns.

  • Time Zone Database Updates: We should note that time zone definitions can change (countries change DST rules, etc.). Our system should be updated with the latest tz database. If a user’s timezone offset changes due to law changes, reminders might shift unless updated. This is rare and probably beyond our need to handle dynamically, but good to mention that we’d keep our libraries updated.

  • User Timezone Changes: If a user travels or manually changes their preferred timezone in settings, how do we handle existing reminders? Possibly the reminder stays in the originally set timezone (e.g., you set daily 9am in PST, and even if you move to EST, maybe you still want it PST time). Or maybe the user expects it to now come at 9am in their new timezone. This is a product decision: perhaps tie reminders to a timezone or to “local time wherever you are”. For simplicity, assume reminders are bound to a fixed timezone chosen at creation (likely the user’s home zone). If the user updates it, we can update the stored TZ and recompute future times accordingly.

  • Testing DST: We would test scenarios around DST boundaries to ensure there is no double-send or dropped reminder: e.g., if a reminder is daily at 1:30 AM and the clocks fall back so that 1:30 occurs twice, we send only once. That may require the scheduler to track that it already fired at that wall-clock time, which can be managed by storing the last execution timestamp and checking that the next computed time is strictly greater than it (to avoid going backwards). Since we compute each occurrence from the last one, the next time will usually land on the following day anyway, avoiding duplicates.

In summary, by storing timezone info and using proper conversions, we ensure scheduled times align with user expectations. The system always works internally in absolute time (UTC) for coordination, but all calculations of next times go through the lens of the user’s local calendar.
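
As one way to make this concrete, below is a small sketch using Python’s standard zoneinfo module (field names are illustrative). It computes the next daily occurrence by adding one calendar day in the user’s local timezone and then converting to UTC for storage, so a 9:00 AM local reminder stays at 9:00 AM local across a DST transition while its stored UTC value shifts.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

def next_daily_occurrence_utc(last_utc: datetime, tz_name: str) -> datetime:
    """Add one calendar day in the user's local timezone, then return UTC.

    last_utc: the previous execution time as an aware UTC datetime.
    tz_name:  an IANA timezone string, e.g. "America/Los_Angeles".
    """
    tz = ZoneInfo(tz_name)
    last_local = last_utc.astimezone(tz)
    # Calendar addition keeps the wall-clock time (09:00 stays 09:00) even
    # when the UTC offset changes across a DST boundary.
    next_local = last_local + timedelta(days=1)
    return next_local.astimezone(timezone.utc)

# Example: a daily 09:00 reminder in Los Angeles around the 2025 spring-forward.
before = datetime(2025, 3, 8, 17, 0, tzinfo=timezone.utc)    # 09:00 PST (UTC-8)
after = next_daily_occurrence_utc(before, "America/Los_Angeles")
print(after)  # 2025-03-09 16:00:00+00:00 -> still 09:00 local, now PDT (UTC-7)
```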

6.4 Failure Handling and Reliability

A robust system must anticipate various failure scenarios and mitigate them:

  • Scheduler Failover and No Single Point: As described, we run multiple scheduler instances with a coordinator. If a scheduler instance crashes mid-processing, the coordinator will detect it (missed heartbeat) and reassign its segments to another instance. The tasks that it was supposed to process for the current minute might be slightly delayed – the new instance will take over and query those partitions, possibly finding tasks that are already due (past by a minute). It should still process them (better late than never). This ensures the reminders are not lost. To minimize delay, the failover should be quick (heartbeat intervals of a few seconds). Additionally, because the state (the schedule table) is in the database, any instance can pick up any task – there is no in-memory state lost except maybe some currently processing tasks. So redundancy is inherent. We avoid single points of failure by clustering the DB (Cassandra is distributed with replication factor; Kafka for queue is also clustered; load balancers have failover; etc.). We should also have multiple instances of each channel service – if one email sender crashes, others continue, and the queue retains messages until consumed.

  • Exactly-Once vs At-Least-Once Delivery: Our system as designed leans towards at-least-once delivery. This means in rare cases (like a crash at just the wrong moment) a reminder might send twice, but we prefer that over missing it. For example, if a scheduler sent a message to the queue and crashed before it could update the next execution time, another scheduler might retry that task and send again. We can mitigate duplicates:

    • Use an idempotency key for sends (the combination of ReminderID and scheduled timestamp is unique for each occurrence). Channel services or even the external providers could filter duplicates if they get the same key. Or the Notification Queue can de-duplicate if configured (Kafka exactly-once processing or a de-dup layer).
    • The scheduler can mark an item as being processed (e.g., update a status flag in the schedule table) in an atomic operation so that another node won’t process it. This is tricky in Cassandra (lightweight transactions exist but are costly). The segment approach largely avoids parallel-processing collisions, though not the case of a crash mid-way. We therefore assume at-least-once delivery and accept the possibility of a duplicate notification on rare occasions (the user impact being perhaps two emails instead of one). We can note in the non-functional requirements that we aim for at-least-once delivery and will handle duplicates in logic, or apologize to the user if they occur.
  • Database Failures: If the database (Cassandra) goes down or a partition becomes unavailable, the scheduler might fail to fetch due tasks or update times. Cassandra is designed for high availability; with proper replication, one node down doesn’t make data unreachable. The scheduler should handle transient query failures by retrying after a short delay. In the worst-case scenario of a multi-node outage, we might temporarily fail to send some reminders (which is serious). That’s why multi-region or multi-cluster replication can be considered: a standby in another data center could take over if the primary DB cluster fails, although that adds complexity (multi-active clusters, ensuring no double sends across regions, etc.). At minimum, within one regional cluster a high replication factor (RF=3) is used so that data is stored durably.

  • Queue Failures: If our message queue (Kafka/RabbitMQ) is down, the scheduler might not be able to enqueue messages. We should run the queue as a reliable cluster as well; Kafka, for example, survives broker failures if partitions have replicas. If the queue is down entirely, the system effectively can’t deliver notifications. The scheduler could buffer messages in memory (not ideal, as memory could overflow) or mark them in the DB and try later: if an enqueue fails, we leave the schedule entry as still due (so it is picked up again on the next scan) or move it to a “stuck” state, and we alert on this condition. A highly available queue mitigates the risk; alternatively, use a cloud-managed queue with a strong SLA.

  • Channel Failures: If an external service (email/SMS provider) is down or returns errors, our channel services will catch that. We implement retry policies – e.g., exponential backoff, retrying up to a limit (a simple backoff sketch appears at the end of this subsection). We also have to be careful not to lose messages: if a channel service crashes while processing a batch from the queue, in at-least-once mode the message will reappear or be reprocessed by another consumer (depending on the queue mechanism), so our processing must be idempotent since the retry may happen on a different instance. For example, if the Email service crashes after sending the email but before acknowledging the queue, the queue may redeliver that message – potentially resulting in a duplicate email if we don’t check. Idempotency keys or storing a “sent” status in a temporary store can help, but because email can’t easily be deduplicated on content alone, we might accept a rare duplicate.

  • Ensuring No Notification Lost: To not lose notifications, every piece is durable: the schedule in DB is persistent, the queue is persistent (so once scheduler enqueues, it’s stored until delivered), and each channel attempt either succeeds or is logged for retry. If all retries fail, we log the failure. Perhaps we alert admin or even notify the user through an alternate path (“We couldn’t deliver your reminder via SMS, please check your phone or settings.”). That could be done via email or in-app as a fallback, but that’s optional.

  • Data Consistency on Updates: When a user updates or deletes a reminder, there’s a potential race if that reminder is about to fire. We handle this by always checking the latest data at execution time. For example, if the user deletes a reminder at 10:00 and it was supposed to fire at 10:01: if the deletion removes the schedule entry before 10:01, the scheduler simply won’t find it and nothing is sent – good. If the scheduler had already fetched the entry from the DB just before the deletion, we need a safeguard: mark reminders as deleted in the main table, and have the scheduler double-check the Reminders table status before sending; if it sees a deleted flag, it skips the send. This check ensures no notification goes out after deletion. For updates (such as a changed time or channels), the next occurrence time or channel info in the DB may differ from what the scheduler already fetched, so a reminder due very soon might be sent with old info. This is a narrow window; to close it, we could set a short “change in progress” flag, or simply ensure the schedule entry is updated immediately and atomically so that even a reminder due within seconds reflects the new time and the scheduler never picks up the old one. A simpler but worse-UX alternative is to state that changes to a reminder due imminently (say within the next minute) may not take effect immediately. Using lightweight transactions or a careful order of operations (delete the old schedule entry, then update the main record) helps achieve the needed atomicity.
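
To illustrate the delete-race safeguard just described, here is a minimal sketch with hypothetical table and interface names: the scheduler’s send path re-reads the reminder’s current status immediately before enqueueing, so a reminder deleted or paused moments earlier is skipped and an updated one is sent with the latest content.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Reminder:
    reminder_id: str
    status: str           # "active" | "paused" | "deleted"
    message: str
    channels: list[str]   # e.g., ["email", "push"]

def process_due_entry(entry: dict, reminders_db, queue) -> None:
    """Handle one due row from the Schedule table (assumed interfaces).

    reminders_db.get(id) -> Optional[Reminder]   # read of the main Reminders table
    queue.publish(event)                         # enqueue to the Notification Queue
    """
    # Re-read the authoritative record just before sending. If the user
    # deleted or paused the reminder after this entry was fetched, skip it.
    reminder: Optional[Reminder] = reminders_db.get(entry["reminder_id"])
    if reminder is None or reminder.status != "active":
        return

    # Use the freshest message and channel list, not whatever was current
    # when the schedule entry was first scanned.
    queue.publish({
        "reminder_id": reminder.reminder_id,
        "scheduled_ts_utc": entry["scheduled_ts_utc"],
        "message": reminder.message,
        "channels": reminder.channels,
    })
```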

By planning for these failure scenarios, the system will gracefully handle errors: no single failure will completely stop reminders; at worst some might be delayed, but they should eventually go out. Our design aims for high reliability, borrowing techniques used in large-scale systems (redundancy, idempotent processing, partitioning to limit blast radius, and thorough monitoring).
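
As referenced under Channel Failures above, here is a minimal retry sketch with an assumed provider interface, showing exponential backoff capped at a few attempts; a production version would add random jitter, distinguish retryable from fatal errors, and push exhausted messages to a dead-letter queue or the failure log.

```python
import time

def send_with_retries(send_fn, payload: dict,
                      max_attempts: int = 5,
                      base_delay_s: float = 1.0) -> bool:
    """Try send_fn(payload) with exponential backoff between attempts.

    send_fn is an assumed provider call (e.g., an email or SMS API wrapper)
    that raises an exception on failure. Returns True if delivered, or
    False if all attempts were exhausted (the caller logs or dead-letters it).
    """
    for attempt in range(max_attempts):
        try:
            send_fn(payload)
            return True
        except Exception:
            # Back off 1s, 2s, 4s, 8s, ...; real code would add jitter and
            # avoid retrying permanent failures (e.g., an invalid address).
            if attempt < max_attempts - 1:
                time.sleep(base_delay_s * (2 ** attempt))
    return False
```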

6.5 Scalability and Performance Optimizations

To achieve web-scale performance, we incorporate several strategies and optimizations:

  • Sharding and Partitioning: We shard data and workload by natural keys:

    • Data Sharding: The Schedule table is partitioned by time (and segment) so that no single query or node handles all reminders. User-specific data is partitioned by UserID to distribute load evenly for user lookups. If we used SQL, we might physically partition tables by date or user range. In NoSQL (Cassandra), the partition keys ensure data is distributed across the cluster, avoiding hot spots (except possibly at very popular times, mitigated by segment sub-partitions).
    • Scheduler Workload Sharding: Using the segment mechanism, we effectively shard the scheduler processing. Each scheduler instance handles only a fraction of due reminders (e.g., 1/8th if 8 segments) at any given time, which multiplies our throughput linearly with number of instances. We can add more schedulers (and increase segment count if needed) to handle more tasks per time slice. This is horizontal scalability.
    • Channel Sharding: For channels, if one queue becomes a bottleneck, we could shard by user groups or message types. For example, even at millions of notifications per minute, a single Kafka topic with enough partitions can usually keep up, but we could also split into separate topics by region or by UserID ranges. This is likely unnecessary except at extreme scale. Similarly, we could partition email sending by recipient domain (some systems separate Gmail from other domains at huge volume so as not to overwhelm one provider).
  • Load Balancing:

    • We use load balancers at the front (user API requests) to distribute requests among many Reminder Service instances. This ensures we can handle high concurrency of user actions.
    • The scheduler coordinator load-balances the time segments among scheduler workers, as described, and can rebalance if one goes down.
    • The Notification Queue inherently load-balances via consumer groups: multiple consumers will automatically share the load of messages. For instance, if we have 5 Email service instances in one consumer group for the email queue, Kafka (with say 10 partitions) will assign roughly 2 partitions to each, distributing the messages. In RabbitMQ, we might have a work queue where each consumer gets a fair round-robin share.
    • Load balancing is also needed for external integrations: e.g., if we run our own SMTP servers, we’d load balance outgoing mail across a pool of servers. If we use third-party, their API endpoints are load-balanced on their side. For SMS, if we use multiple providers, we could load balance or failover between them.
  • Asynchronous Processing & Queues: The entire pipeline after scheduling is async. By decoupling scheduling from sending using a queue, we allow each part to scale and not block others. The user doesn’t wait for a notification to be sent when creating a reminder – it’s all in the background. Asynchronous queues smooth out spikes: if 100k notifications all become due at 9:00, the scheduler will dump 100k messages into the queue quickly. The channel consumers will pull and send as fast as they can; if that takes a minute or two, the queue buffers them. This way, the scheduler isn’t slowed down by the actual send throughput. It can proceed to schedule next tasks. The queue also provides resilience: if a consumer is down, messages wait until it’s back (at-least-once delivery). We use high-throughput messaging systems (Kafka is a good choice given its scalability and ordering guarantees per key; RabbitMQ works too for moderate scale but might strain at extreme scale).

    • We might also employ asynchronous I/O in the channel services – e.g., an email sender can send to multiple addresses in parallel or use non-blocking IO to handle many requests concurrently within one process. This increases throughput per machine. For example, using async HTTP calls to push notifications so one thread can handle many requests concurrently.
  • Caching: We leverage caching in a few places:

    • User and Channel Info: Cache frequently accessed data like user email or phone number, so that the channel services don’t always query a database or user service. For instance, when the Email service gets a message for user X, it can retrieve from an in-memory cache the email address of user X (populated on first use or via a subscription to user updates). This avoids an extra DB hit per message.
    • Time Zone Calculations: Calculating next occurrences is not computationally heavy, and standard libraries handle it well. If performance became an issue, we could cache some computations, such as precomputed next run times for common patterns, but since each reminder is somewhat unique there is limited sharing. We might, however, cache timezone transition data (e.g., the date of the next DST change for a given zone) to adjust times quickly. This is likely a micro-optimization.
    • Reminder Lookup Cache: The Scheduler might keep a small cache of recently accessed reminders or results to avoid hitting the DB repeatedly for the same data. But since each reminder triggers maybe once an hour or day, it’s not repeatedly fetched within short time, so not much benefit.
    • In-App Notification Cache: We might cache recent notifications for a user in memory so if the user’s app queries for new notifications frequently, we don’t always hit the DB. But this is more of an application side optimization.
    • Configuration Cache: Things like segment assignments by coordinator can be cached or stored in a fast store (like ZooKeeper or etcd) that workers refer to, but that’s internal.
    • Overall, caching is not the biggest factor here since our heavy operations are more about throughput of distinct events rather than reusing the same data, but these targeted caches help reduce latency a bit.
  • Performance Optimizations:

    • Batching: Where possible, batch operations. E.g., the Scheduler can pull a batch of due reminders in one DB query (Cassandra returns everything in that partition), then batch-insert the next occurrences and batch-publish to the queue (some queue client libraries support batch publishing; Kafka, for example, batches produced messages for better throughput) rather than processing one item at a time. Similarly, channel services can batch sends where supported: some email APIs allow sending to multiple recipients in one call, and if we had many emails at once we could group them by email domain or server. Be careful, though: each reminder is personal, so it is most likely one email per call; still, opening one SMTP connection and sending multiple messages over it beats reconnecting for each one.
    • Event-Driven and Streaming: We treat the flow as a stream of events. Using Kafka, we could even eliminate some DB querying: for instance, writing all schedule events to a distributed log (a topic keyed by execution time) and having consumers that wait until the right moment to emit them. In practice this amounts to building a delayed queue on top of Kafka (messages carry their due timestamps and consumers pause until that time is reached), which is uncommon and adds complexity. A pure streaming solution for scheduling (like a sorted set or timing wheel) can be complex at scale, so we stuck with the DB + poll approach, which is more straightforward.
    • Parallelism: We maximize parallel processing: multiple threads or async processes handle different reminders concurrently. Within a single scheduler instance, if it finds 1000 due tasks, it could process them in parallel (spawn worker threads to enqueue messages). Similarly, channel services can multi-thread sending. We must be mindful of not overloading downstream (e.g., if you spawn 1000 threads to send 1000 SMS simultaneously, the SMS provider might choke). Usually some controlled concurrency with rate limiting is wise.
    • Advanced Data Structures: For extremely high timer volumes, some systems use a hierarchical timing wheel in memory to schedule near-term tasks efficiently. We could incorporate something: e.g., keep the next few seconds of tasks in an in-memory wheel to pop them exactly on time (reducing DB polling overhead). The scheduler could load tasks due in the next 60 seconds and then use a precise timer to trigger each. This is complex but reduces DB hits by grouping queries. Given our requirement of a few seconds accuracy, the DB polling approach (with one query per second or per segment) is acceptable and simpler. But if optimization needed, an in-memory buffer is an option.
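
As a rough sketch of that in-memory refinement (assumed interfaces, single-threaded for clarity): the scheduler loads everything due in the next minute for its segment into a min-heap and sleeps until each task’s exact moment, so the database is queried roughly once per minute per segment while individual triggers still fire close to on time.

```python
import heapq
import time

def run_segment(fetch_due_within, dispatch, window_seconds: int = 60) -> None:
    """Scheduler loop for one segment (single-threaded for clarity).

    fetch_due_within(seconds) -> list of (due_epoch_seconds, task) pairs
    dispatch(task)            -> enqueue the task to the Notification Queue
    Both are assumed interfaces; a real scheduler would also handle failover,
    crashes mid-window, and re-checking task status before dispatch.
    """
    while True:
        window_start = time.time()
        # One DB query loads the whole upcoming window for this segment.
        due = fetch_due_within(window_seconds)
        # The tie-breaking index keeps the heap from comparing task payloads.
        buffer = [(due_at, i, task) for i, (due_at, task) in enumerate(due)]
        heapq.heapify(buffer)

        while buffer:
            due_at, _, task = heapq.heappop(buffer)
            delay = due_at - time.time()
            if delay > 0:
                time.sleep(delay)   # wake up at (approximately) the exact due time
            dispatch(task)

        # Sleep out the remainder of the window before the next DB scan.
        remaining = window_seconds - (time.time() - window_start)
        if remaining > 0:
            time.sleep(remaining)
```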

In conclusion, through sharding, horizontal scaling, load balancing, asynchronous pipelines, caching, and careful data modeling, this design can handle the demanding throughput of a web-scale recurring reminder system. It draws on best practices from large notification systems – decoupling via queues, ensuring partitions for parallelism, and reliable delivery with at-least-once guarantees. This should meet the requirements of delivering millions of timely reminders daily, across the world, within seconds of their scheduled times.
