Grokking SQL for Tech Interviews
Ask Author
Back to course home

0% completed

Vote For New Content
13. Active Users
Table of Contents

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Problem

Table: Accounts

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| id            | int     |
| name          | varchar |
+---------------+---------+
id is the primary key (column with unique values) for this table.
This table contains the account id and the user name of each account.

Table: Logins

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| id            | int     |
| login_date    | date    |
+---------------+---------+
This table may contain duplicate rows.
This table contains the account id of the user who logged in and the login date. A user may log in multiple times in the day.

Problem Definition

Active users are those who logged in to their accounts for five or more consecutive days.

Write a solution to find the id and the name of active users.

Return the result table ordered by id.

Example

Image

Output

Image

Try It Yourself

MYSQL
MYSQL

. . . .

Solution

To identify active users who have logged in for five or more consecutive days, the SQL solution employs a self-join and aggregation. The SELECT clause retrieves distinct account IDs (l1.id) and uses a subquery to fetch the corresponding names from the "Accounts" table. The FROM clause specifies the "Logins" table with aliases (l1 and l2) for the self-join.

The self-join is established in the JOIN clause, connecting rows where the account IDs are the same (l1.id = l2.id) and the login dates have a difference between 1 and 4 days (DATEDIFF(l2.login_date, l1.login_date) BETWEEN 1 AND 4). This ensures that the login dates are consecutive.

The GROUP BY clause is used to group the results by account ID (l1.id) and login date (l1.login_date). The HAVING clause filters the groups to include only those where the count of distinct login dates (COUNT(DISTINCT l2.login_date)) is equal to 4, indicating five or more consecutive logins.

SELECT DISTINCT l1.id, (SELECT NAME FROM accounts WHERE id = l1.id) AS NAME FROM logins l1 JOIN logins l2 ON l1.id = l2.id AND Datediff(l2.login_date, l1.login_date) BETWEEN 1 AND 4 GROUP BY l1.id, l1.login_date HAVING Count(DISTINCT l2.login_date) = 4

Let's break down the query step by step:

Step 1: Identify consecutive logins within a 4-day window

We use a self-join on the Logins table (l1 and l2) to find consecutive logins for each user where the login dates are within a 4-day window.

SELECT l1.id, (SELECT NAME FROM Accounts WHERE id = l1.id) AS NAME FROM Logins l1 JOIN Logins l2 ON l1.id = l2.id AND Datediff(l2.login_date, l1.login_date) BETWEEN 1 AND 4;

Output After Step 1:

+----+----------+ | id | name | +----+----------+ | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | +----+----------+

Step 2: Group by user and login date, filter for users with consecutive logins on 4 different days

We group the results from Step 1 by user and login date and filter out users who don't have logins on 4 different days.

SELECT l1.id, (SELECT NAME FROM Accounts WHERE id = l1.id) AS NAME, l1.login_date FROM Logins l1 JOIN Logins l2 ON l1.id = l2.id AND Datediff(l2.login_date, l1.login_date) BETWEEN 1 AND 4 GROUP BY l1.id, l1.login_date HAVING Count(DISTINCT l2.login_date) = 4;

Output After Step 2:

+----+----------+ | id | name | +----+----------+ | 7 | Jonathan | +----+----------+

Step 3: Get unique active users

We use DISTINCT to get unique active users from the filtered result.

SELECT DISTINCT id, NAME FROM ( SELECT l1.id, (SELECT NAME FROM Accounts WHERE id = l1.id) AS NAME, l1.login_date FROM Logins l1 JOIN Logins l2 ON l1.id = l2.id AND Datediff(l2.login_date, l1.login_date) BETWEEN 1 AND 4 GROUP BY l1.id, l1.login_date HAVING Count(DISTINCT l2.login_date) = 4 ) active_users;

Final Output:

+----+----------+ | id | name | +----+----------+ | 7 | Jonathan | +----+----------+

.....

.....

.....

Like the course? Get enrolled and start learning!

Table of Contents

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible