0% completed
Problem
Table: Matches
+-------------+------+
| Column Name | Type |
+-------------+------+
| player_id | int |
| match_day | date |
| result | enum |
+-------------+------+
(player_id, match_day) is the primary key (combination of columns with unique values) for this table.
Each row of this table contains the ID of a player, the day of the match they played, and the result of that match.
The result column is an ENUM (category) type of ('Win', 'Draw', 'Lose').
Problem Definition
The winning streak of a player is the number of consecutive wins uninterrupted by draws or losses.
Write a solution to count the longest winning streak for each player.
Return the result table in any order.
Example
Output
Try It Yourself
Solution
To determine the longest winning streak for each player, we need to analyze the Matches
table to identify consecutive wins without any interruptions by draws or losses. Below is a comprehensive breakdown of the approach, step-by-step execution, and detailed explanations for each line of the SQL query.
Approach Overview
- Transform Match Results:
- Convert the categorical
result
into numerical indicators to facilitate streak calculations.
- Convert the categorical
- Calculate Cumulative Non-Win Counts (
se
):- For each player, compute a running total of non-win results to identify streak segments.
- Determine Streak Lengths:
- Group matches based on the cumulative non-win counts to isolate winning streaks.
- Compute Longest Streak per Player:
- For each player, identify the maximum streak length from the grouped data.
- Present the Final Results:
- Display each player's ID alongside their longest winning streak.
SQL Query
SELECT player_id, Max(streak) AS longest_streak FROM (SELECT player_id, se, CASE WHEN se = 0 THEN Count(se) ELSE Count(se) - 1 END AS streak FROM (SELECT player_id, match_day, Sum(r) OVER ( partition BY player_id ORDER BY match_day) AS se FROM (SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches) a)b GROUP BY player_id, se) c GROUP BY player_id;
Step-by-Step Approach
Step 1: Transform Match Results into Numerical Indicators
Convert the categorical result
into numerical values where a win is represented by 0
and any other outcome (Draw
or Lose
) is represented by 1
. This transformation simplifies the calculation of winning streaks.
SQL Snippet:
SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches;
Explanation:
-
SELECT player_id, match_day,
- Selects the
player_id
andmatch_day
columns from theMatches
table.
- Selects the
-
CASE
- Initiates a conditional statement to transform the
result
column.
- Initiates a conditional statement to transform the
-
WHEN result = 'win' THEN 0
- Assigns a value of
0
if theresult
is'win'
.
- Assigns a value of
-
ELSE 1
- Assigns a value of
1
for any other result ('Draw'
or'Lose'
).
- Assigns a value of
-
END r
- Concludes the
CASE
statement and aliases the resulting value asr
.
- Concludes the
-
FROM Matches;
- Specifies the
Matches
table as the data source.
- Specifies the
Intermediate Output After Step 1:
+-----------+------------+---+ | player_id | match_day | r | +-----------+------------+---+ | 1 | 2022-01-17 | 0 | | 1 | 2022-01-18 | 0 | | 1 | 2022-01-25 | 0 | | 1 | 2022-01-31 | 1 | | 1 | 2022-02-08 | 0 | | 2 | 2022-02-06 | 1 | | 2 | 2022-02-08 | 1 | | 3 | 2022-03-30 | 0 | +-----------+------------+---+
Step 2: Calculate Cumulative Non-Win Counts (se
)
For each player, compute a running total of non-win results (r
) ordered by match_day
. This cumulative sum (se
) helps in identifying streak segments by indicating interruptions in winning streaks.
SQL Snippet:
SELECT player_id, match_day, Sum(r) OVER ( partition BY player_id ORDER BY match_day) AS se FROM (SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches) a;
Explanation:
-
SELECT player_id, match_day,
- Selects the
player_id
andmatch_day
columns from the subquerya
.
- Selects the
-
Sum(r)
- Calculates the cumulative sum of
r
values.
- Calculates the cumulative sum of
-
OVER (
- Introduces the window function to define how the cumulative sum is calculated.
-
partition BY player_id
- Divides the data into partitions based on
player_id
. This means the cumulative sum resets for each player.
- Divides the data into partitions based on
-
ORDER BY match_day)
- Orders the data within each partition by
match_day
to ensure the cumulative sum is calculated chronologically.
- Orders the data within each partition by
-
AS se
- Aliases the resulting cumulative sum as
se
.
- Aliases the resulting cumulative sum as
-
FROM (SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches) a;
- Specifies the subquery
a
which transforms theresult
into numerical indicators as done in Step 1.
- Specifies the subquery
Intermediate Output After Step 2 (se
):
+-----------+------------+----+ | player_id | match_day | se | +-----------+------------+----+ | 1 | 2022-01-17 | 0 | | 1 | 2022-01-18 | 0 | | 1 | 2022-01-25 | 0 | | 1 | 2022-01-31 | 1 | | 1 | 2022-02-08 | 1 | | 2 | 2022-02-06 | 1 | | 2 | 2022-02-08 | 2 | | 3 | 2022-03-30 | 0 | +-----------+------------+----+
Step 3: Determine Streak Lengths
Identify streak segments by grouping consecutive matches based on the cumulative non-win counts (se
). Calculate the length of each streak.
SQL Snippet:
SELECT player_id, se, CASE WHEN se = 0 THEN Count(se) ELSE Count(se) - 1 END AS streak FROM (SELECT player_id, match_day, Sum(r) OVER ( partition BY player_id ORDER BY match_day) AS se FROM (SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches) a)b GROUP BY player_id, se;
Explanation:
-
SELECT player_id, se,
- Selects the
player_id
and cumulative non-win countse
from the subqueryb
.
- Selects the
-
CASE
- Initiates a conditional statement to calculate the streak length based on
se
.
- Initiates a conditional statement to calculate the streak length based on
-
WHEN se = 0 THEN Count(se)
- If
se = 0
, it means the player has not had any non-wins up to that point, indicating an uninterrupted streak of wins. The streak length is equal to the count of such entries.
- If
-
ELSE Count(se) - 1
- If
se > 0
, it indicates that the streak has been interrupted by at least one non-win. The streak length is calculated as the count minus one to exclude the match that broke the streak.
- If
-
END AS streak
- Concludes the
CASE
statement and aliases the result asstreak
.
- Concludes the
-
FROM (...) b
- Specifies that the data is being selected from the subquery
b
, which containsplayer_id
,match_day
, andse
.
- Specifies that the data is being selected from the subquery
-
GROUP BY player_id, se;
- Groups the data by
player_id
andse
to aggregate the streak lengths accordingly.
- Groups the data by
Intermediate Output After Step 3:
+-----------+----+--------+ | player_id | se | streak | +-----------+----+--------+ | 1 | 0 | 3 | | 1 | 1 | 0 | | 2 | 1 | 0 | | 2 | 2 | 0 | | 3 | 0 | 1 | +-----------+----+--------+
Step 4: Compute Longest Streak per Player
For each player, identify the maximum streak length from the previously calculated streaks.
SQL Snippet:
SELECT player_id, Max(streak) AS longest_streak FROM (SELECT player_id, se, CASE WHEN se = 0 THEN Count(se) ELSE Count(se) - 1 END AS streak FROM (SELECT player_id, match_day, Sum(r) OVER ( partition BY player_id ORDER BY match_day) AS se FROM (SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches) a)b GROUP BY player_id, se) c GROUP BY player_id;
Explanation:
-
SELECT player_id, Max(streak) AS longest_streak
- Selects the
player_id
and calculates the maximum streak value for each player, aliasing it aslongest_streak
.
- Selects the
-
FROM (...) c
- Specifies that the data is being selected from the subquery
c
, which containsplayer_id
,se
, andstreak
.
- Specifies that the data is being selected from the subquery
-
GROUP BY player_id;
- Groups the data by
player_id
to perform the aggregation (Max(streak)
) for each player.
- Groups the data by
Final Output:
+-----------+----------------+ | player_id | longest_streak | +-----------+----------------+ | 1 | 3 | | 2 | 0 | | 3 | 1 | +-----------+----------------+
.....
.....
.....
Table of Contents
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible