0% completed
Problem
Table: Visits
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user_id | int |
| visit_date | date |
+---------------+---------+
(user_id, visit_date) is the primary key (combination of columns with unique values) for this table.
Each row of this table indicates that user_id has visited the bank in visit_date.
Table: Transactions
+------------------+---------+
| Column Name | Type |
+------------------+---------+
| user_id | int |
| transaction_date | date |
| amount | int |
+------------------+---------+
This table may contain duplicate rows.
Each row of this table indicates that user_id has made a transaction of the amount in transaction_date.
It is guaranteed that the user has visited the bank in the transaction_date.(i.e The Visits table contains (user_id, transaction_date) in one row)
Problem Definition
A bank wants to draw a chart of the number of transactions bank visitors did in one visit to the bank and the corresponding number of visitors who have done this number of transactions in one visit.
Write a solution to find how many users visited the bank and didn't do any transactions, how many visited the bank and did one transaction, and so on.
The result table will contain two columns:
transactions_count
which is the number of transactions done in one visit.visits_count
which is the corresponding number of users who didtransactions_count
in one visit to the bank.
transactions_count
should take all values from 0
to max(transactions_count)
done by one or more users.
Return the result table ordered by transactions_count
.
Example
Output
Try It Yourself
Solution
To analyze the transaction behavior of bank visitors, we aim to determine how many transactions each visitor made during their visits and summarize this data to understand the distribution of transaction counts across all visits. Specifically, we want to generate a report that shows:
- The number of transactions (
transactions_count
) a visitor made in a single visit. - The number of visitors (
visits_count
) who made that number of transactions in one visit.
This analysis helps the bank understand customer engagement and transaction patterns, which can inform service improvements and marketing strategies.
Approach Overview
-
Generate a Sequence of Possible Transaction Counts (
t
CTE):- Create a list of all possible transaction counts from
0
up to the total number of transactions to ensure that even if no visits have a certain number of transactions, the count for that transaction number is still represented.
- Create a list of all possible transaction counts from
-
Calculate Transactions per Visit (
t1
CTE):- For each visit, count the number of transactions made by the visitor. Include visits with zero transactions by performing a
LEFT JOIN
betweenVisits
andTransactions
.
- For each visit, count the number of transactions made by the visitor. Include visits with zero transactions by performing a
-
Aggregate Visits by Transaction Count:
- Join the sequence of possible transaction counts with the actual transaction counts per visit and count how many visits correspond to each transaction count.
-
Finalize the Results:
- Present the
transactions_count
alongside the correspondingvisits_count
, ensuring that all transaction counts from0
tomax(transactions_count)
are included and ordered accordingly.
- Present the
SQL Query
WITH t AS ( SELECT Row_number() OVER() row_num FROM Transactions UNION SELECT 0 ), t1 AS ( SELECT Count(transaction_date) transaction_count FROM Visits v LEFT JOIN Transactions t ON v.user_id = t.user_id AND v.visit_date = transaction_date GROUP BY v.user_id, v.visit_date ) SELECT row_num AS Transactions_count, Count(transaction_count) visits_count FROM t LEFT JOIN t1 ON row_num = transaction_count GROUP BY row_num HAVING row_num <= (SELECT Max(transaction_count) FROM t1) ORDER BY row_num;
Step-by-Step Approach
Step 1: Generate the t
Common Table Expression (CTE)
Create a sequence of numbers representing possible transaction counts, including 0
. This ensures that even if no visits have a certain number of transactions, the count for that transaction number is still represented as 0
.
SQL Snippet:
WITH t AS ( SELECT Row_number() OVER() row_num FROM Transactions UNION SELECT 0 )
Explanation:
Row_number() OVER() row_num
:- Assigns a unique sequential number (
row_num
) to each row in theTransactions
table. If there areN
transactions,row_num
ranges from1
toN
.
- Assigns a unique sequential number (
UNION SELECT 0
:- Adds
0
to the list ofrow_num
values to account for visits with no transactions.
- Adds
Intermediate Output After Step 1 (t
CTE):
+---------+ | row_num | +---------+ | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 | | 8 | | 0 | +---------+
Step 2: Calculate Transactions per Visit (t1
CTE)
Determine the number of transactions each visitor made during each visit to the bank.
SQL Snippet:
t1 AS ( SELECT Count(transaction_date) transaction_count FROM Visits v LEFT JOIN Transactions t ON v.user_id = t.user_id AND v.visit_date = transaction_date GROUP BY v.user_id, v.visit_date )
Explanation:
-
LEFT JOIN Transactions t ON v.user_id = t.user_id AND v.visit_date = transaction_date
:- Joins each visit with its corresponding transactions based on
user_id
andvisit_date
. This ensures that visits without any transactions are included withNULL
values for transactions.
- Joins each visit with its corresponding transactions based on
-
COUNT(transaction_date) AS transaction_count
:- Counts the number of transactions for each visit. Since it's a
LEFT JOIN
, visits with no transactions will have atransaction_count
of0
.
- Counts the number of transactions for each visit. Since it's a
-
GROUP BY v.user_id, v.visit_date
:- Aggregates the data for each unique visit.
Intermediate Output After Step 2 (t1
CTE):
Based on the example data:
user_id | visit_date | transaction_count |
---|---|---|
1 | 2020-01-01 | 0 |
2 | 2020-01-02 | 0 |
12 | 2020-01-01 | 0 |
19 | 2020-01-03 | 0 |
1 | 2020-01-02 | 1 |
2 | 2020-01-03 | 1 |
1 | 2020-01-04 | 1 |
7 | 2020-01-11 | 1 |
9 | 2020-01-25 | 3 |
8 | 2020-01-28 | 1 |
Step 3: Aggregate Visits by Transaction Count
Determine how many visits correspond to each number of transactions (transactions_count
), ensuring that all possible counts from 0
to the maximum observed are included.
SQL Snippet:
SELECT row_num AS Transactions_count, Count(transaction_count) visits_count FROM t LEFT JOIN t1 ON row_num = transaction_count GROUP BY row_num HAVING row_num <= (SELECT Max(transaction_count) FROM t1) ORDER BY row_num;
Explanation:
LEFT JOIN t1 ON row_num = transaction_count
:- Joins the sequence of possible transaction counts (
t
) with the actual transaction counts per visit (t1
).
- Joins the sequence of possible transaction counts (
COUNT(transaction_count) AS visits_count
:- Counts the number of visits that have a specific
transactions_count
. If no visits have a certaintransactions_count
, the count will be0
.
- Counts the number of visits that have a specific
GROUP BY row_num
:- Aggregates the data based on each transaction count.
HAVING row_num <= (SELECT Max(transaction_count) FROM t1)
:- Ensures that only relevant transaction counts (up to the maximum observed) are included in the final result.
ORDER BY row_num
:- Sorts the results by
transactions_count
in ascending order for clarity.
- Sorts the results by
Final Output:
Based on the example data, the final output will be:
+--------------------+--------------+ | Transactions_count | visits_count | +--------------------+--------------+ |0 |4 | |1 |5 | |2 |0 | |3 |1 | +--------------------+--------------+
.....
.....
.....
Table of Contents
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible