0% completed
Problem
Table: Stadium
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| visit_date | date |
| people | int |
+---------------+---------+
visit_date is the column with unique values for this table.
Each row of this table contains the visit date and visit id to the stadium with the number of people during the visit.
As the id increases, the date increases as well.
Problem Definition
Write a solution to display the records with three or more rows with consecutive id
's, and the number of people is greater than or equal to 100 for each.
Return the result table ordered by visit_date
in ascending order.
Example
Output
Try It Yourself
Solution
To identify and display records from the Stadium
table where three or more consecutive id
's have a number of people greater than or equal to 100, we can utilize SQL's window functions and Common Table Expressions (CTEs). This approach allows us to group consecutive qualifying records efficiently and filter out those groups that meet the specified criteria. The final results will be ordered by visit_date
in ascending order.
Approach Overview
-
Filter Records with
people
≥ 100:- Select only those records where the number of people is greater than or equal to 100. These are the records of interest for identifying consecutive groups.
-
Assign Group Identifiers for Consecutive
id
's:- Use the difference between the
id
and a sequential row number to assign a unique group identifier (grp
) for consecutive records. This technique ensures that consecutiveid
's share the samegrp
value.
- Use the difference between the
-
Identify Groups with Three or More Consecutive Records:
- Group the filtered records by the calculated
grp
and count the number of records in each group. Retain only those groups that have three or more records.
- Group the filtered records by the calculated
-
Retrieve the Desired Records:
- Join the qualifying groups back to the filtered records to extract the complete details (
id
,visit_date
,people
) of each record within these groups.
- Join the qualifying groups back to the filtered records to extract the complete details (
-
Order the Results:
- Sort the final output by
visit_date
in ascending order to present the data chronologically.
- Sort the final output by
SQL Query
WITH ConsecutiveGroups AS ( -- Step 1: Assign groups where `people` >= 100 and `id`'s are consecutive SELECT id, visit_date, people, id - ROW_NUMBER() OVER (ORDER BY id) AS grp FROM Stadium WHERE people >= 100 ), GroupedCounts AS ( -- Step 2: Count the number of rows in each group SELECT grp, COUNT(*) AS group_size FROM ConsecutiveGroups GROUP BY grp HAVING COUNT(*) >= 3 ) -- Step 3: Retrieve rows from groups with 3 or more consecutive rows SELECT c.id, c.visit_date, c.people FROM ConsecutiveGroups c JOIN GroupedCounts g ON c.grp = g.grp ORDER BY c.visit_date;
Step-by-Step Explanation
Step 1: Assigning Group Identifiers (ConsecutiveGroups
CTE)
We begin by filtering the Stadium
records to include only those where the people
count is greater than or equal to 100. For these filtered records, we assign a group identifier (grp
) based on the difference between the id
and a sequential row number. This method ensures that consecutive id
's share the same grp
value, effectively grouping them together.
WITH ConsecutiveGroups AS ( SELECT id, visit_date, people, id - ROW_NUMBER() OVER (ORDER BY id) AS grp FROM Stadium WHERE people >= 100 )
Explanation of Each Line:
-
WITH ConsecutiveGroups AS (
- Initiates a Common Table Expression (CTE) named
ConsecutiveGroups
.
- Initiates a Common Table Expression (CTE) named
-
SELECT id, visit_date, people,
- Selects the
id
,visit_date
, andpeople
columns from theStadium
table.
- Selects the
-
id - ROW_NUMBER() OVER (ORDER BY id) AS grp
- Utilizes the
ROW_NUMBER()
window function to assign a sequential number to each row based on the ascending order ofid
. - Subtracts this row number from the
id
to compute a group identifier (grp
). For consecutiveid
's, this difference remains constant, thereby grouping them together.
- Utilizes the
-
FROM Stadium
- Specifies the
Stadium
table as the source of data.
- Specifies the
-
WHERE people >= 100
- Filters the records to include only those where the
people
count is greater than or equal to 100.
- Filters the records to include only those where the
Intermediate Output After Step 1:
Based on the provided input data:
id | visit_date | people | grp |
---|---|---|---|
2 | 2017-01-02 | 109 | 1 |
3 | 2017-01-03 | 150 | 1 |
5 | 2017-01-05 | 145 | 2 |
6 | 2017-01-06 | 1455 | 2 |
7 | 2017-01-07 | 199 | 2 |
8 | 2017-01-08 | 188 | 2 |
Note:
- Record with
id
= 4 is excluded becausepeople
= 99, which is less than 100.
Step 2: Counting Records in Each Group (GroupedCounts
CTE)
Next, we group the filtered records by the grp
identifier and count the number of records in each group. We retain only those groups that have three or more records, as they meet the criteria of consecutive id
's with people
≥ 100.
GroupedCounts AS ( SELECT grp, COUNT(*) AS group_size FROM ConsecutiveGroups GROUP BY grp HAVING COUNT(*) >= 3 )
Explanation of Each Line:
-
GroupedCounts AS (
- Initiates another CTE named
GroupedCounts
.
- Initiates another CTE named
-
SELECT grp, COUNT(*) AS group_size
- Selects the
grp
identifier and counts the number of records in each group, aliasing it asgroup_size
.
- Selects the
-
FROM ConsecutiveGroups
- Specifies the
ConsecutiveGroups
CTE as the source of data.
- Specifies the
-
GROUP BY grp
- Groups the records based on the
grp
identifier.
- Groups the records based on the
-
HAVING COUNT(*) >= 3
- Filters the groups to include only those with three or more records.
Intermediate Output After Step 2:
grp | group_size |
---|---|
2 | 4 |
Step 3: Retrieving Qualified Records
Finally, we join the ConsecutiveGroups
CTE with the GroupedCounts
CTE on the grp
identifier to extract all records that belong to groups with three or more consecutive id
's. The results are then ordered by visit_date
in ascending order.
SELECT c.id, c.visit_date, c.people FROM ConsecutiveGroups c JOIN GroupedCounts g ON c.grp = g.grp ORDER BY c.visit_date;
Explanation of Each Line:
-
SELECT c.id, c.visit_date, c.people
- Selects the
id
,visit_date
, andpeople
columns from theConsecutiveGroups
CTE.
- Selects the
-
JOIN GroupedCounts g ON c.grp = g.grp
- Performs an inner join with the
GroupedCounts
CTE (aliased asg
) on thegrp
identifier. This ensures that only records from groups that have three or more consecutiveid
's are included.
- Performs an inner join with the
-
ORDER BY c.visit_date
- Orders the final results by
visit_date
in ascending order to present the data chronologically.
- Orders the final results by
Final Output:
Based on the provided input data, the final output will include records from groups that have three or more consecutive id
's with people
≥ 100, ordered by visit_date
in ascending order.
+----+----------------------------+--------+ | id | visit_date | people | +----+----------------------------+--------+ | 5 | 2017-01-05T00:00:00.000Z | 145 | | 6 | 2017-01-06T00:00:00.000Z | 1455 | | 7 | 2017-01-07T00:00:00.000Z | 199 | | 8 | 2017-01-08T00:00:00.000Z | 188 | +----+----------------------------+--------+
.....
.....
.....
Table of Contents
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible