Grokking SQL for Tech Interviews
Ask Author
Back to course home

0% completed

Vote For New Content
9. Longest Winning Streak
Table of Contents

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Problem

Table: Matches

+-------------+------+
| Column Name | Type |
+-------------+------+
| player_id   | int  |
| match_day   | date |
| result      | enum |
+-------------+------+
(player_id, match_day) is the primary key (combination of columns with unique values) for this table.
Each row of this table contains the ID of a player, the day of the match they played, and the result of that match.
The result column is an ENUM (category) type of ('Win', 'Draw', 'Lose').

Problem Definition

The winning streak of a player is the number of consecutive wins uninterrupted by draws or losses.

Write a solution to count the longest winning streak for each player.

Return the result table in any order.

Example

Image

Output

Image

Try It Yourself

MYSQL
MYSQL

. . . .

Solution

To determine the longest winning streak for each player, we need to analyze the Matches table to identify consecutive wins without any interruptions by draws or losses. Below is a comprehensive breakdown of the approach, step-by-step execution, and detailed explanations for each line of the SQL query.

Approach Overview

  1. Transform Match Results:
    • Convert the categorical result into numerical indicators to facilitate streak calculations.
  2. Calculate Cumulative Non-Win Counts (se):
    • For each player, compute a running total of non-win results to identify streak segments.
  3. Determine Streak Lengths:
    • Group matches based on the cumulative non-win counts to isolate winning streaks.
  4. Compute Longest Streak per Player:
    • For each player, identify the maximum streak length from the grouped data.
  5. Present the Final Results:
    • Display each player's ID alongside their longest winning streak.

SQL Query

SELECT player_id, Max(streak) AS longest_streak FROM (SELECT player_id, se, CASE WHEN se = 0 THEN Count(se) ELSE Count(se) - 1 END AS streak FROM (SELECT player_id, match_day, Sum(r) OVER ( partition BY player_id ORDER BY match_day) AS se FROM (SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches) a)b GROUP BY player_id, se) c GROUP BY player_id;

Step-by-Step Approach

Step 1: Transform Match Results into Numerical Indicators

Convert the categorical result into numerical values where a win is represented by 0 and any other outcome (Draw or Lose) is represented by 1. This transformation simplifies the calculation of winning streaks.

SQL Snippet:

SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches;

Explanation:

  1. SELECT player_id, match_day,

    • Selects the player_id and match_day columns from the Matches table.
  2. CASE

    • Initiates a conditional statement to transform the result column.
  3. WHEN result = 'win' THEN 0

    • Assigns a value of 0 if the result is 'win'.
  4. ELSE 1

    • Assigns a value of 1 for any other result ('Draw' or 'Lose').
  5. END r

    • Concludes the CASE statement and aliases the resulting value as r.
  6. FROM Matches;

    • Specifies the Matches table as the data source.

Intermediate Output After Step 1:

+-----------+------------+---+ | player_id | match_day | r | +-----------+------------+---+ | 1 | 2022-01-17 | 0 | | 1 | 2022-01-18 | 0 | | 1 | 2022-01-25 | 0 | | 1 | 2022-01-31 | 1 | | 1 | 2022-02-08 | 0 | | 2 | 2022-02-06 | 1 | | 2 | 2022-02-08 | 1 | | 3 | 2022-03-30 | 0 | +-----------+------------+---+

Step 2: Calculate Cumulative Non-Win Counts (se)

For each player, compute a running total of non-win results (r) ordered by match_day. This cumulative sum (se) helps in identifying streak segments by indicating interruptions in winning streaks.

SQL Snippet:

SELECT player_id, match_day, Sum(r) OVER ( partition BY player_id ORDER BY match_day) AS se FROM (SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches) a;

Explanation:

  1. SELECT player_id, match_day,

    • Selects the player_id and match_day columns from the subquery a.
  2. Sum(r)

    • Calculates the cumulative sum of r values.
  3. OVER (

    • Introduces the window function to define how the cumulative sum is calculated.
  4. partition BY player_id

    • Divides the data into partitions based on player_id. This means the cumulative sum resets for each player.
  5. ORDER BY match_day)

    • Orders the data within each partition by match_day to ensure the cumulative sum is calculated chronologically.
  6. AS se

    • Aliases the resulting cumulative sum as se.
  7. FROM (SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches) a;

    • Specifies the subquery a which transforms the result into numerical indicators as done in Step 1.

Intermediate Output After Step 2 (se):

+-----------+------------+----+ | player_id | match_day | se | +-----------+------------+----+ | 1 | 2022-01-17 | 0 | | 1 | 2022-01-18 | 0 | | 1 | 2022-01-25 | 0 | | 1 | 2022-01-31 | 1 | | 1 | 2022-02-08 | 1 | | 2 | 2022-02-06 | 1 | | 2 | 2022-02-08 | 2 | | 3 | 2022-03-30 | 0 | +-----------+------------+----+

Step 3: Determine Streak Lengths

Identify streak segments by grouping consecutive matches based on the cumulative non-win counts (se). Calculate the length of each streak.

SQL Snippet:

SELECT player_id, se, CASE WHEN se = 0 THEN Count(se) ELSE Count(se) - 1 END AS streak FROM (SELECT player_id, match_day, Sum(r) OVER ( partition BY player_id ORDER BY match_day) AS se FROM (SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches) a)b GROUP BY player_id, se;

Explanation:

  1. SELECT player_id, se,

    • Selects the player_id and cumulative non-win count se from the subquery b.
  2. CASE

    • Initiates a conditional statement to calculate the streak length based on se.
  3. WHEN se = 0 THEN Count(se)

    • If se = 0, it means the player has not had any non-wins up to that point, indicating an uninterrupted streak of wins. The streak length is equal to the count of such entries.
  4. ELSE Count(se) - 1

    • If se > 0, it indicates that the streak has been interrupted by at least one non-win. The streak length is calculated as the count minus one to exclude the match that broke the streak.
  5. END AS streak

    • Concludes the CASE statement and aliases the result as streak.
  6. FROM (...) b

    • Specifies that the data is being selected from the subquery b, which contains player_id, match_day, and se.
  7. GROUP BY player_id, se;

    • Groups the data by player_id and se to aggregate the streak lengths accordingly.

Intermediate Output After Step 3:

+-----------+----+--------+ | player_id | se | streak | +-----------+----+--------+ | 1 | 0 | 3 | | 1 | 1 | 0 | | 2 | 1 | 0 | | 2 | 2 | 0 | | 3 | 0 | 1 | +-----------+----+--------+

Step 4: Compute Longest Streak per Player

For each player, identify the maximum streak length from the previously calculated streaks.

SQL Snippet:

SELECT player_id, Max(streak) AS longest_streak FROM (SELECT player_id, se, CASE WHEN se = 0 THEN Count(se) ELSE Count(se) - 1 END AS streak FROM (SELECT player_id, match_day, Sum(r) OVER ( partition BY player_id ORDER BY match_day) AS se FROM (SELECT player_id, match_day, CASE WHEN result = 'win' THEN 0 ELSE 1 END r FROM Matches) a)b GROUP BY player_id, se) c GROUP BY player_id;

Explanation:

  1. SELECT player_id, Max(streak) AS longest_streak

    • Selects the player_id and calculates the maximum streak value for each player, aliasing it as longest_streak.
  2. FROM (...) c

    • Specifies that the data is being selected from the subquery c, which contains player_id, se, and streak.
  3. GROUP BY player_id;

    • Groups the data by player_id to perform the aggregation (Max(streak)) for each player.

Final Output:

+-----------+----------------+ | player_id | longest_streak | +-----------+----------------+ | 1 | 3 | | 2 | 0 | | 3 | 1 | +-----------+----------------+

.....

.....

.....

Like the course? Get enrolled and start learning!

Table of Contents

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible