0% completed
Problem
Table: Accounts
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| name | varchar |
+---------------+---------+
id is the primary key (column with unique values) for this table.
This table contains the account id and the user name of each account.
Table: Logins
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| login_date | date |
+---------------+---------+
This table may contain duplicate rows.
This table contains the account id of the user who logged in and the login date. A user may log in multiple times in the day.
Problem Definition
Active users are those who logged in to their accounts for five or more consecutive days.
Write a solution to find the id and the name of active users.
Return the result table ordered by id
.
Example
Output
Try It Yourself
Solution
To identify active users who have logged in for five or more consecutive days, the SQL solution employs a self-join and aggregation. The SELECT
clause retrieves distinct account IDs (l1.id
) and uses a subquery to fetch the corresponding names from the "Accounts" table. The FROM
clause specifies the "Logins" table with aliases (l1
and l2
) for the self-join.
The self-join is established in the JOIN
clause, connecting rows where the account IDs are the same (l1.id = l2.id
) and the login dates have a difference between 1 and 4 days (DATEDIFF(l2.login_date, l1.login_date) BETWEEN 1 AND 4
). This ensures that the login dates are consecutive.
The GROUP BY
clause is used to group the results by account ID (l1.id
) and login date (l1.login_date
). The HAVING
clause filters the groups to include only those where the count of distinct login dates (COUNT(DISTINCT l2.login_date)
) is equal to 4, indicating five or more consecutive logins.
SELECT DISTINCT l1.id, (SELECT NAME FROM accounts WHERE id = l1.id) AS NAME FROM logins l1 JOIN logins l2 ON l1.id = l2.id AND Datediff(l2.login_date, l1.login_date) BETWEEN 1 AND 4 GROUP BY l1.id, l1.login_date HAVING Count(DISTINCT l2.login_date) = 4
Let's break down the query step by step:
Step 1: Identify consecutive logins within a 4-day window
We use a self-join on the Logins table (l1 and l2) to find consecutive logins for each user where the login dates are within a 4-day window.
SELECT l1.id, (SELECT NAME FROM Accounts WHERE id = l1.id) AS NAME FROM Logins l1 JOIN Logins l2 ON l1.id = l2.id AND Datediff(l2.login_date, l1.login_date) BETWEEN 1 AND 4;
Output After Step 1:
+----+----------+ | id | name | +----+----------+ | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | | 7 | Jonathan | +----+----------+
Step 2: Group by user and login date, filter for users with consecutive logins on 4 different days
We group the results from Step 1 by user and login date and filter out users who don't have logins on 4 different days.
SELECT l1.id, (SELECT NAME FROM Accounts WHERE id = l1.id) AS NAME, l1.login_date FROM Logins l1 JOIN Logins l2 ON l1.id = l2.id AND Datediff(l2.login_date, l1.login_date) BETWEEN 1 AND 4 GROUP BY l1.id, l1.login_date HAVING Count(DISTINCT l2.login_date) = 4;
Output After Step 2:
+----+----------+ | id | name | +----+----------+ | 7 | Jonathan | +----+----------+
Step 3: Get unique active users
We use DISTINCT to get unique active users from the filtered result.
SELECT DISTINCT id, NAME FROM ( SELECT l1.id, (SELECT NAME FROM Accounts WHERE id = l1.id) AS NAME, l1.login_date FROM Logins l1 JOIN Logins l2 ON l1.id = l2.id AND Datediff(l2.login_date, l1.login_date) BETWEEN 1 AND 4 GROUP BY l1.id, l1.login_date HAVING Count(DISTINCT l2.login_date) = 4 ) active_users;
Final Output:
+----+----------+ | id | name | +----+----------+ | 7 | Jonathan | +----+----------+
.....
.....
.....
Table of Contents
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible