Grokking SQL for Tech Interviews
Ask Author
Back to course home

0% completed

Vote For New Content
15. Suspicious Bank Accounts
Table of Contents

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Problem

Table: Accounts

+----------------+------+
| Column Name    | Type |
+----------------+------+
| account_id     | int  |
| max_income     | int  |
+----------------+------+
account_id is the column with unique values for this table.
Each row contains information about the maximum monthly income for one bank account.

Table: Transactions

+----------------+----------+
| Column Name    | Type     |
+----------------+----------+
| transaction_id | int      |
| account_id     | int      |
| type           | ENUM     |
| amount         | int      |
| day            | datetime |
+----------------+----------+
transaction_id is the column with unique values for this table.
Each row contains information about one transaction.
type is ENUM (category) type of ('Creditor','Debtor') where 'Creditor' means the user deposited money into their account and 'Debtor' means the user withdrew money from their account.
amount is the amount of money deposited/withdrawn during the transaction.

Problem Definition

A bank account is suspicious if the total income exceeds the max_income for this account for two or more consecutive months. The total income of an account in some month is the sum of all its deposits in that month (i.e., transactions of the type 'Creditor').

Write a solution to report the IDs of all suspicious bank accounts.

Return the result table in any order.

Example

Image

Output

Image

Try It Yourself

MYSQL
MYSQL

. . . .

Solution

To identify suspicious bank accounts based on their transaction histories, we need to analyze the Accounts and Transactions tables. A bank account is deemed suspicious if it meets below condition:

  1. Total Income Exceeds max_income for Two or More Consecutive Months:
    • Total Income for a month is the sum of all deposits ('Creditor' transactions) made into the account during that month.
    • The Total Income must exceed the account's max_income for two or more consecutive months.

Approach Overview

  1. Calculate Monthly Total Income for Each Account: Aggregate 'Creditor' transactions by account_id and month to determine the Total Income for each account per month.
  2. Identify Months Where Total Income Exceeds max_income: Compare the Total Income against the max_income for each account and mark the months where the income exceeds the threshold.
  3. Determine Consecutive Months of Exceeding Income: Analyze the marked months to find accounts that have two or more consecutive months where their Total Income exceeded max_income.
  4. Retrieve Suspicious Account IDs: Select the account_ids of accounts that meet the criteria.

SQL Query

WITH incomes AS ( SELECT a.account_id, DATE_FORMAT(day, '%Y-%m-01') AS trans_month, SUM(amount) AS total_income, b.max_income FROM Transactions a INNER JOIN Accounts b ON a.account_id = b.account_id WHERE type = 'Creditor' GROUP BY a.account_id, DATE_FORMAT(day, '%Y-%m-01'), b.max_income HAVING IF(SUM(amount) > b.max_income, 1, 0) = 1 ), consec_income AS ( SELECT account_id, TIMESTAMPDIFF( MONTH, LAG(trans_month, 1) OVER ( PARTITION BY account_id ORDER BY trans_month ), trans_month ) AS prev_month FROM incomes ) SELECT DISTINCT account_id FROM consec_income WHERE prev_month = 1;

Step-by-Step Approach

Step 1: Calculate Monthly Total Income for Each Account (incomes CTE)

Compute the Total Income for each account for every month by summing up all 'Creditor' transactions.

SQL Query:

WITH incomes AS ( SELECT a.account_id, DATE_FORMAT(day, '%Y-%m-01') AS trans_month, SUM(amount) AS total_income, b.max_income FROM Transactions a INNER JOIN Accounts b ON a.account_id = b.account_id WHERE type = 'Creditor' GROUP BY a.account_id, DATE_FORMAT(day, '%Y-%m-01'), b.max_income HAVING IF(SUM(amount) > b.max_income, 1, 0) = 1 ) SELECT * FROM incomes;

Explanation:

  • DATE_FORMAT(day, '%Y-%m-01') AS trans_month:
    • Extracts the year and month from the day column to group transactions by the first day of each month.
  • SUM(amount) AS total_income:
    • Calculates the total deposited amount ('Creditor' transactions) for each account per month.
  • JOIN Accounts b ON a.account_id = b.account_id:
    • Associates each transaction with its corresponding account to access the max_income.
  • WHERE type = 'Creditor':
    • Filters transactions to include only deposits.
  • GROUP BY a.account_id, DATE_FORMAT(day, '%Y-%m-01'), b.max_income:
    • Groups the data by account_id and month to aggregate the deposits.
  • HAVING IF(SUM(amount) > b.max_income, 1, 0) = 1:
    • Retains only those groups where the Total Income exceeds the max_income.

Output After Step 1:

+------------+------------+--------------+------------+ | account_id | trans_month| total_income | max_income | +------------+------------+--------------+------------+ | 3 | 2021-05-01 | 0 | 21000 | | 3 | 2021-06-01 | 298000 | 21000 | | 3 | 2021-07-01 | 64900 | 21000 | | 4 | 2021-05-01 | 49300 | 10400 | | 4 | 2021-06-01 | 10400 | 10400 | | 4 | 2021-07-01 | 56300 | 10400 | +------------+------------+--------------+------------+

Step 2: Determine Consecutive Months of Exceeding Income (consec_income CTE)

Identify accounts that have two or more consecutive months where their Total Income exceeded max_income.

SQL Query:

WITH incomes AS ( SELECT a.account_id, DATE_FORMAT(day, '%Y-%m-01') AS trans_month, SUM(amount) AS total_income, b.max_income FROM Transactions a INNER JOIN Accounts b ON a.account_id = b.account_id WHERE type = 'Creditor' GROUP BY a.account_id, DATE_FORMAT(day, '%Y-%m-01'), b.max_income HAVING IF(SUM(amount) > b.max_income, 1, 0) = 1 ), consec_income AS ( SELECT account_id, TIMESTAMPDIFF( MONTH, LAG(trans_month, 1) OVER ( PARTITION BY account_id ORDER BY trans_month ), trans_month ) AS prev_month FROM incomes ) SELECT * FROM consec_income;

Explanation:

  • LAG(trans_month, 1) OVER (PARTITION BY account_id ORDER BY trans_month):
    • Retrieves the previous month's trans_month for each account_id.
  • TIMESTAMPDIFF(MONTH, previous_month, current_month) AS prev_month:
    • Calculates the difference in months between the current month and the previous month.
    • A result of 1 indicates that the current month is exactly one month after the previous month, signifying consecutive months.
  • consec_income CTE:
    • Contains each account_id and the difference in months (prev_month) between consecutive entries where income exceeded max_income.

Output After Step 2:

+------------+------------+ | account_id | prev_month | +------------+------------+ | 3 | NULL | | 3 | 1 | | 4 | NULL | | 4 | 2 | +------------+------------+

Explanation of Output:

  • Account 3:
    • First Entry (2021-06-01): No previous month → prev_month = NULL.
    • Second Entry (2021-07-01): Previous month = 2021-06-01 → prev_month = 1 (Consecutive).
  • Account 4:
    • First Entry (2021-05-01): No previous month → prev_month = NULL.
    • Second Entry (2021-07-01): Previous month = 2021-05-01 → prev_month = 2 (Not consecutive).

Step 3: Retrieve Suspicious Account IDs

Select the account_ids of accounts that have two or more consecutive months where their Total Income exceeded max_income.

SQL Query:

WITH incomes AS ( SELECT a.account_id, DATE_FORMAT(day, '%Y-%m-01') AS trans_month, SUM(amount) AS total_income, b.max_income FROM Transactions a INNER JOIN Accounts b ON a.account_id = b.account_id WHERE type = 'Creditor' GROUP BY a.account_id, DATE_FORMAT(day, '%Y-%m-01'), b.max_income HAVING IF(SUM(amount) > b.max_income, 1, 0) = 1 ), consec_income AS ( SELECT account_id, TIMESTAMPDIFF( MONTH, LAG(trans_month, 1) OVER ( PARTITION BY account_id ORDER BY trans_month ), trans_month ) AS prev_month FROM incomes ) SELECT DISTINCT account_id FROM consec_income WHERE prev_month = 1;

Explanation:

  • WHERE prev_month = 1:
    • Filters the consec_income CTE to include only those records where the current month is exactly one month after the previous month, indicating consecutive months.
  • SELECT DISTINCT account_id:
    • Ensures that each suspicious account_id appears only once in the final output, even if there are multiple pairs of consecutive months.

Output After Step 3:

+------------+ | account_id | +------------+ | 3 | +------------+

.....

.....

.....

Like the course? Get enrolled and start learning!

Table of Contents

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible