Grokking SQL for Tech Interviews
Ask Author
Back to course home

0% completed

Vote For New Content
5. Number of Transactions per Visit
Table of Contents

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Problem

Table: Visits

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user_id       | int     |
| visit_date    | date    |
+---------------+---------+
(user_id, visit_date) is the primary key (combination of columns with unique values) for this table.
Each row of this table indicates that user_id has visited the bank in visit_date.

Table: Transactions

+------------------+---------+
| Column Name      | Type    |
+------------------+---------+
| user_id          | int     |
| transaction_date | date    |
| amount           | int     |
+------------------+---------+
This table may contain duplicate rows.
Each row of this table indicates that user_id has made a transaction of the amount in transaction_date.
It is guaranteed that the user has visited the bank in the transaction_date.(i.e The Visits table contains (user_id, transaction_date) in one row)

Problem Definition

A bank wants to draw a chart of the number of transactions bank visitors did in one visit to the bank and the corresponding number of visitors who have done this number of transactions in one visit.

Write a solution to find how many users visited the bank and didn't do any transactions, how many visited the bank and did one transaction, and so on.

The result table will contain two columns:

  • transactions_count which is the number of transactions done in one visit.
  • visits_count which is the corresponding number of users who did transactions_count in one visit to the bank.

transactions_count should take all values from 0 to max(transactions_count) done by one or more users.

Return the result table ordered by transactions_count.

Example

Image

Output

Image

Try It Yourself

MYSQL
MYSQL

. . . .

Solution

To analyze the transaction behavior of bank visitors, we aim to determine how many transactions each visitor made during their visits and summarize this data to understand the distribution of transaction counts across all visits. Specifically, we want to generate a report that shows:

  • The number of transactions (transactions_count) a visitor made in a single visit.
  • The number of visitors (visits_count) who made that number of transactions in one visit.

This analysis helps the bank understand customer engagement and transaction patterns, which can inform service improvements and marketing strategies.

Approach Overview

  1. Generate a Sequence of Possible Transaction Counts (t CTE):

    • Create a list of all possible transaction counts from 0 up to the total number of transactions to ensure that even if no visits have a certain number of transactions, the count for that transaction number is still represented.
  2. Calculate Transactions per Visit (t1 CTE):

    • For each visit, count the number of transactions made by the visitor. Include visits with zero transactions by performing a LEFT JOIN between Visits and Transactions.
  3. Aggregate Visits by Transaction Count:

    • Join the sequence of possible transaction counts with the actual transaction counts per visit and count how many visits correspond to each transaction count.
  4. Finalize the Results:

    • Present the transactions_count alongside the corresponding visits_count, ensuring that all transaction counts from 0 to max(transactions_count) are included and ordered accordingly.

SQL Query

WITH t AS ( SELECT Row_number() OVER() row_num FROM Transactions UNION SELECT 0 ), t1 AS ( SELECT Count(transaction_date) transaction_count FROM Visits v LEFT JOIN Transactions t ON v.user_id = t.user_id AND v.visit_date = transaction_date GROUP BY v.user_id, v.visit_date ) SELECT row_num AS Transactions_count, Count(transaction_count) visits_count FROM t LEFT JOIN t1 ON row_num = transaction_count GROUP BY row_num HAVING row_num <= (SELECT Max(transaction_count) FROM t1) ORDER BY row_num;

Step-by-Step Approach

Step 1: Generate the t Common Table Expression (CTE)

Create a sequence of numbers representing possible transaction counts, including 0. This ensures that even if no visits have a certain number of transactions, the count for that transaction number is still represented as 0.

SQL Snippet:

WITH t AS ( SELECT Row_number() OVER() row_num FROM Transactions UNION SELECT 0 )

Explanation:

  • Row_number() OVER() row_num:
    • Assigns a unique sequential number (row_num) to each row in the Transactions table. If there are N transactions, row_num ranges from 1 to N.
  • UNION SELECT 0:
    • Adds 0 to the list of row_num values to account for visits with no transactions.

Intermediate Output After Step 1 (t CTE):

+---------+ | row_num | +---------+ | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 | | 8 | | 0 | +---------+

Step 2: Calculate Transactions per Visit (t1 CTE)

Determine the number of transactions each visitor made during each visit to the bank.

SQL Snippet:

t1 AS ( SELECT Count(transaction_date) transaction_count FROM Visits v LEFT JOIN Transactions t ON v.user_id = t.user_id AND v.visit_date = transaction_date GROUP BY v.user_id, v.visit_date )

Explanation:

  • LEFT JOIN Transactions t ON v.user_id = t.user_id AND v.visit_date = transaction_date:

    • Joins each visit with its corresponding transactions based on user_id and visit_date. This ensures that visits without any transactions are included with NULL values for transactions.
  • COUNT(transaction_date) AS transaction_count:

    • Counts the number of transactions for each visit. Since it's a LEFT JOIN, visits with no transactions will have a transaction_count of 0.
  • GROUP BY v.user_id, v.visit_date:

    • Aggregates the data for each unique visit.

Intermediate Output After Step 2 (t1 CTE):

Based on the example data:

user_idvisit_datetransaction_count
12020-01-010
22020-01-020
122020-01-010
192020-01-030
12020-01-021
22020-01-031
12020-01-041
72020-01-111
92020-01-253
82020-01-281

Step 3: Aggregate Visits by Transaction Count

Determine how many visits correspond to each number of transactions (transactions_count), ensuring that all possible counts from 0 to the maximum observed are included.

SQL Snippet:

SELECT row_num AS Transactions_count, Count(transaction_count) visits_count FROM t LEFT JOIN t1 ON row_num = transaction_count GROUP BY row_num HAVING row_num <= (SELECT Max(transaction_count) FROM t1) ORDER BY row_num;

Explanation:

  • LEFT JOIN t1 ON row_num = transaction_count:
    • Joins the sequence of possible transaction counts (t) with the actual transaction counts per visit (t1).
  • COUNT(transaction_count) AS visits_count:
    • Counts the number of visits that have a specific transactions_count. If no visits have a certain transactions_count, the count will be 0.
  • GROUP BY row_num:
    • Aggregates the data based on each transaction count.
  • HAVING row_num <= (SELECT Max(transaction_count) FROM t1):
    • Ensures that only relevant transaction counts (up to the maximum observed) are included in the final result.
  • ORDER BY row_num:
    • Sorts the results by transactions_count in ascending order for clarity.

Final Output:

Based on the example data, the final output will be:

+--------------------+--------------+ | Transactions_count | visits_count | +--------------------+--------------+ |0 |4 | |1 |5 | |2 |0 | |3 |1 | +--------------------+--------------+

.....

.....

.....

Like the course? Get enrolled and start learning!

Table of Contents

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible