Grokking SQL for Tech Interviews
Ask Author
Back to course home

0% completed

Vote For New Content
18. Retail Expansion Analysis (Medium)
Table of Contents

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Problem Statement

Table: StorePerformance
Each row in this table represents a store, detailing its unique ID, revenue in 2020 and 2021, and its geographical coordinates.

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| store_id    | int     |
| rev_2020    | float   |
| rev_2021    | float   |
| lat         | float   |
| lon         | float   |
+-------------+---------+
store_id is the primary key for this table.
rev_2020 and rev_2021 represent the store's revenue in 2020 and 2021, respectively.
lat and lon represent the latitude and longitude of the store's location.

Develop a solution to calculate the total potential revenue for 2021 from stores that have the same revenue in 2020 as at least one other store and are located at unique geographical coordinates.

Example

Input:

StorePerformance table: +----------+----------+----------+------+------+ | store_id | rev_2020 | rev_2021 | lat | lon | +----------+----------+----------+------+------+ | 1 | 100 | 150 | 10.0 | 10.0 | | 2 | 200 | 250 | 20.0 | 20.0 | | 3 | 100 | 300 | 20.0 | 20.0 | | 4 | 100 | 400 | 40.0 | 40.0 | +----------+----------+----------+------+------+

Output:

+----------+ | rev_2021 | +----------+ | 550.00 | +----------+

In this example, stores 1 and 4 meet the criteria: they have the same rev_2020 as another store and are located at unique coordinates. Store 2 is unique in its revenue, and store 3 shares its location with store 2, disqualifying both.

Try It Yourself

MYSQL
MYSQL

. . . .

Solution

We need to calculate the total potential revenue for 2021 (rev_2021) from stores that:

  1. Have Duplicate Revenues in 2020 (rev_2020): The rev_2020 value is shared by at least two stores.

  2. Are Located at Unique Geographical Coordinates (lat, lon): No other store shares the same (lat, lon) pair.

SQL Query

SELECT FORMAT(SUM(sp.rev_2021), 2) AS rev_2021 FROM StorePerformance sp INNER JOIN ( -- Identify rev_2020 values that are duplicated SELECT rev_2020 FROM StorePerformance GROUP BY rev_2020 HAVING COUNT(*) > 1 ) dup_rev ON sp.rev_2020 = dup_rev.rev_2020 INNER JOIN ( -- Identify unique (lat, lon) pairs across all stores SELECT lat, lon FROM StorePerformance GROUP BY lat, lon HAVING COUNT(*) = 1 ) unique_loc ON sp.lat = unique_loc.lat AND sp.lon = unique_loc.lon;

Explanation of the Query

  1. Subquery dup_rev: Identify Duplicate rev_2020 Values

    SELECT rev_2020 FROM StorePerformance GROUP BY rev_2020 HAVING COUNT(*) > 1
    • Purpose: Selects all rev_2020 values that appear more than once, indicating duplicate revenues.

    • Example Output:

      +----------+ | rev_2020 | +----------+ | 100 | +----------+
  2. Subquery unique_loc: Identify Unique (lat, lon) Pairs

    SELECT lat, lon FROM StorePerformance GROUP BY lat, lon HAVING COUNT(*) = 1
    • Purpose: Selects all (lat, lon) pairs that are unique across the entire table.

    • Example Output:

      +------+------+ | lat | lon | +------+------+ |10.0 |10.0 | |40.0 |40.0 | +------+------+
  3. Main Query: Join and Sum rev_2021

    SELECT FORMAT(SUM(sp.rev_2021), 2) AS rev_2021 FROM StorePerformance sp INNER JOIN dup_rev ON sp.rev_2020 = dup_rev.rev_2020 INNER JOIN unique_loc ON sp.lat = unique_loc.lat AND sp.lon = unique_loc.lon;
    • Purpose:

      • INNER JOIN dup_rev: Filters stores with duplicated rev_2020 values.
      • INNER JOIN unique_loc: Ensures that the stores are located at unique (lat, lon) coordinates.
      • SUM(sp.rev_2021): Aggregates the rev_2021 values of the eligible stores.
      • FORMAT(..., 2): Formats the sum to display two decimal places (550.00).
    • Processing with Example Input:

      • Eligible Stores:
        • Store 1: rev_2020 = 100, (lat, lon) = (10.0, 10.0) — Eligible.
        • Store 4: rev_2020 = 100, (lat, lon) = (40.0, 40.0) — Eligible.
      • Excluded Stores:
        • Store 2: rev_2020 = 200 (unique) — Excluded.
        • Store 3: rev_2020 = 100, (lat, lon) = (20.0, 20.0) — Shares location with Store 2, hence excluded as (20.0, 20.0) is not unique.
    • Final Calculation:

      SUM(rev_2021) = 150 (Store 1) + 400 (Store 4) = 550.00
    • Final Output:

      +----------+ | rev_2021 | +----------+ | 550.00 | +----------+

.....

.....

.....

Like the course? Get enrolled and start learning!

Table of Contents

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible