Grokking SQL for Tech Interviews
Ask Author
Back to course home

0% completed

Vote For New Content
11. Dynamic Unpivoting of a Table
On this page

Problem

Problem Definition

Example

Output

Solution

Approach Overview

SQL Query

Step-by-Step Explanation

Problem

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| product_id  | int     |
| store_name1 | int     |
| store_name2 | int     |
|      :      | int     |
|      :      | int     |
|      :      | int     |
| store_namen | int     |
+-------------+---------+
product_id is the primary key for this table.
Each row in this table indicates the product's price in n different stores.
If the product is not available in a store, the price will be null in that store's column.
The names of the stores may change from one testcase to another. There will be at least 1 store and at most 30 stores.

Problem Definition

Implement the procedure UnpivotProducts to reorganize the Products table so that each row has the id of one product, the name of a store where it is sold, and its price in that store. If a product is not available in a store, do not include a row with that product_id and store combination in the result table. There should be three columns: product_id, store, and price.

The procedure should return the table after reorganizing it.

Return the result table in any order.

Example

Image

Output

Image

Solution

To transform the Products table from a wide format—where each store has its own column—to a long format—where each row represents a product's price in a specific store—we implement the stored procedure UnpivotProducts. This procedure dynamically reorganizes the data, ensuring flexibility regardless of the number or names of stores involved. Below is a comprehensive breakdown of the approach, step-by-step execution, and detailed explanations for each component of the provided SQL query.

Approach Overview

  1. Understand the Data Structure:

    • Wide Format: The Products table has multiple columns for each store (store_name1, store_name2, ..., store_namen), each representing the price of the product in that store.
    • Long Format: The desired output has three columns: product_id, store, and price, with each row representing a product's price in a specific store.
  2. Identify Store Columns Dynamically:

    • Since the number and names of stores can vary across different test cases (up to 30 stores), the procedure must dynamically identify all store columns in the Products table.
  3. Construct Dynamic SQL for Unpivoting:

    • Use SQL string functions to build a dynamic SELECT statement that unpivots the data by converting each store column into separate rows.
  4. Execute the Dynamic SQL:

    • Prepare and execute the dynamically constructed SQL statement to generate the final unpivoted table.

SQL Query

CREATE PROCEDURE UnpivotProducts() BEGIN # Write your MySQL query statement below. set group_concat_max_len = 1000000; set @sql = null; with stores as ( SELECT COLUMN_NAME store FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME='products' and COLUMN_NAME<>'product_id' ) select group_concat( concat( 'select product_id, "', store, '" ', 'as store, ', store, ' ', 'as price from products where ', store, ' is not null union' ) order by store separator ' ' ) into @sql from stores; /*select SUBSTRING(@sql, 1, LENGTH(@sql)-6);*/ set @sql = SUBSTRING(@sql, 1, LENGTH(@sql)-6); prepare stmt from @sql; execute stmt; END

Step-by-Step Explanation

Step 1: Increase GROUP_CONCAT Maximum Length

set group_concat_max_len = 1000000;
  • The GROUP_CONCAT function in MySQL has a default maximum length for the concatenated string it can produce. By setting group_concat_max_len to a large value (1,000,000 in this case), we ensure that the dynamic SQL string generated will not be truncated, even when dealing with numerous stores.

Step 2: Initialize the SQL Variable

set @sql = null;
  • Initializes the user-defined variable @sql to NULL. This variable will later store the dynamically constructed SQL statement.

Step 3: Retrieve Store Column Names

with stores as ( SELECT COLUMN_NAME store FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME='products' and COLUMN_NAME<>'product_id' )
  • This Common Table Expression (CTE) named stores retrieves all column names from the Products table that represent stores. It excludes the product_id column.

  • Explanation:

    • INFORMATION_SCHEMA.COLUMNS: A system table that contains information about all columns in all tables within the database.
    • TABLE_NAME='products': Filters the columns to only those belonging to the Products table.
    • COLUMN_NAME<>'product_id': Excludes the product_id column since it's not a store column.
    • AS store: Aliases the COLUMN_NAME as store for clarity in subsequent steps.
  • Result:

    • A list of store names corresponding to the columns in the Products table.

Step 4: Dynamically Construct SELECT Statements for Each Store

select group_concat( concat( 'select product_id, "', store, '" ', 'as store, ', store, ' ', 'as price from products where ', store, ' is not null union' ) order by store separator ' ' ) into @sql from stores;
  • It constructs a series of SELECT statements for each store, combining them using UNION to create a comprehensive query that unpivots the data.

  • Breakdown:

    • concat(...): For each store, creates a SELECT statement that retrieves the product_id, the store name as store, and the corresponding price. It includes a WHERE clause to filter out NULL prices, ensuring only available products are included.
      • Example for store Shop:
        select product_id, "Shop" as store, Shop as price from products where Shop is not null union
    • group_concat(...): Aggregates all the constructed SELECT statements into a single string, separated by spaces.
      • order by store: Ensures that the SELECT statements are ordered lexicographically based on store names.
      • separator ' ': Specifies a space as the separator between concatenated strings.
    • into @sql: Stores the aggregated SELECT statements into the @sql variable for later execution.
  • Example Output for the Given Stores (LC_Store, Nozama, Shop, Souq):

    select product_id, "LC_Store" as store, LC_Store as price from products where LC_Store is not null union select product_id, "Nozama" as store, Nozama as price from products where Nozama is not null union select product_id, "Shop" as store, Shop as price from products where Shop is not null union select product_id, "Souq" as store, Souq as price from products where Souq is not null union

Step 5: Remove the Trailing 'UNION'

/*select SUBSTRING(@sql, 1, LENGTH(@sql)-6);*/ set @sql = SUBSTRING(@sql, 1, LENGTH(@sql)-6);
  • Explanation:

    • SUBSTRING(@sql, 1, LENGTH(@sql)-6):
      • LENGTH(@sql): Calculates the total length of the concatenated string.
      • -6: Removes the last 6 characters (' union').
      • SUBSTRING: Extracts the substring from the first character up to the length minus 6, effectively trimming the trailing ' union'.
  • Result:

    • A properly formatted SQL statement without the trailing ' union', ready for execution.
  • Final Dynamic SQL Example:

    select product_id, "LC_Store" as store, LC_Store as price from products where LC_Store is not null union select product_id, "Nozama" as store, Nozama as price from products where Nozama is not null union select product_id, "Shop" as store, Shop as price from products where Shop is not null union select product_id, "Souq" as store, Souq as price from products where Souq is not null

Step 6: Prepare and Execute the Dynamic SQL Statement

prepare stmt from @sql; execute stmt;
  • Explanation:

    • PREPARE stmt FROM @sql;: Takes the dynamic SQL stored in @sql and prepares it as a statement named stmt.
    • EXECUTE stmt;: Runs the prepared statement, generating the unpivoted result set.
  • Final Output:

    Executing the prepared dynamic SQL will produce a table in the desired long format. For example:

    +------------+----------+-------+ | product_id | store | price | +------------+----------+-------+ | 1 | LC_Store | 100 | | 1 | Shop | 110 | | 1 | Souq | 5000 | | 2 | Nozama | 200 | | 2 | Souq | 190 | | 3 | Shop | 1000 | | 3 | Souq | 1900 | +------------+----------+-------+
    • Interpretation:
      • Each product_id appears multiple times, once for each store where it is sold.
      • The price column reflects the product's price in the respective store.
      • Rows where the product is not sold in a store are excluded, as per the WHERE store IS NOT NULL condition in the dynamic SQL.

.....

.....

.....

Like the course? Get enrolled and start learning!

On this page

Problem

Problem Definition

Example

Output

Solution

Approach Overview

SQL Query

Step-by-Step Explanation