0% completed
Problem
Table: Trips
+-------------+----------+
| Column Name | Type |
+-------------+----------+
| id | int |
| client_id | int |
| driver_id | int |
| city_id | int |
| status | enum |
| request_at | date |
+-------------+----------+
id is the primary key (column with unique values) for this table.
The table holds all taxi trips. Each trip has a unique id, while client_id and driver_id are foreign keys to the users_id at the Users table.
Status is an ENUM (category) type of ('completed', 'cancelled_by_driver', 'cancelled_by_client').
Table: Users
+-------------+----------+
| Column Name | Type |
+-------------+----------+
| users_id | int |
| banned | enum |
| role | enum |
+-------------+----------+
users_id is the primary key (column with unique values) for this table.
The table holds all users. Each user has a unique users_id, and role is an ENUM type of ('client', 'driver', 'partner').
banned is an ENUM (category) type of ('Yes', 'No').
Problem Definition
The cancellation rate is computed by dividing the number of canceled (by client or driver) requests with unbanned users by the total number of requests with unbanned users on that day.
Write a solution to find the cancellation rate of requests with unbanned users (both client and driver must not be banned) each day between "2013-10-01"
and "2013-10-03"
. Round Cancellation Rate
to two decimal points.
Example
Output
Try It Yourself
Solution
To calculate the cancellation rate of taxi trip requests with unbanned users between "2013-10-01" and "2013-10-03", we need to analyze the Trips
and Users
tables. The cancellation rate is defined as the ratio of the number of canceled trips (either canceled by the client or the driver) to the total number of trip requests, considering only those trips where both the client and the driver are not banned.
Approach Overview
-
Filter Unbanned Users:
- Clients: Exclude trips where the
client_id
corresponds to a banned user. - Drivers: Exclude trips where the
driver_id
corresponds to a banned user.
- Clients: Exclude trips where the
-
Filter by Date Range:
- Consider only trips with
request_at
dates between "2013-10-01" and "2013-10-03".
- Consider only trips with
-
Calculate Cancellation Rate:
- Numerator: Count of trips with status
'cancelled_by_driver'
or'cancelled_by_client'
. - Denominator: Total number of trips (including both completed and canceled) within the specified date range and with unbanned users.
- Cancellation Rate:
(Number of Canceled Trips) / (Total Number of Trips)
rounded to two decimal places.
- Numerator: Count of trips with status
-
Present the Results:
- Display the
Day
and the correspondingCancellation Rate
in the specified format.
- Display the
SQL Query
SELECT request_at AS Day, Round(SUM(IF(status = 'completed', 0, 1)) / Count(status), 2) AS 'Cancellation Rate' FROM Trips WHERE client_id NOT IN (SELECT users_id FROM Users WHERE banned = 'Yes') AND driver_id NOT IN (SELECT users_id FROM Users WHERE banned = 'Yes') AND request_at BETWEEN '2013-10-01' AND '2013-10-03' GROUP BY request_at;
Step-by-Step Approach
We'll break down the SQL query into smaller steps to understand how the Cancellation Rate is calculated. For illustration purposes, we'll use the provided sample data.
Step 1: Identify Unbanned Users
Filter out trips where either the client or the driver is banned.
SQL:
SELECT * FROM Trips WHERE client_id NOT IN ( SELECT users_id FROM Users WHERE banned = 'Yes' ) AND driver_id NOT IN ( SELECT users_id FROM Users WHERE banned = 'Yes' );
Explanation:
client_id NOT IN (...)
anddriver_id NOT IN (...)
:- Ensures that both the client and the driver associated with a trip are not banned.
Intermediate Output After Step 1:
+----+-----------+-----------+---------+-----------------------+------------+ | id | client_id | driver_id | city_id | status | request_at | +----+-----------+-----------+---------+-----------------------+------------+ | 1 | 1 | 10 | 1 | completed | 2013-10-01 | | 3 | 3 | 12 | 6 | completed | 2013-10-01 | | 4 | 4 | 13 | 6 | cancelled_by_driver | 2013-10-01 | | 5 | 1 | 10 | 1 | completed | 2013-10-02 | | 6 | 2 | 11 | 6 | completed | 2013-10-02 | | 7 | 3 | 12 | 6 | completed | 2013-10-02 | | 8 | 2 | 12 | 12 | completed | 2013-10-03 | | 9 | 3 | 10 | 12 | completed | 2013-10-03 | | 10 | 4 | 13 | 12 | cancelled_by_driver | 2013-10-03 | +----+-----------+-----------+---------+-----------------------+------------+
Note:
- Trips with
client_id
= 2 are excluded because user 2 is banned (banned = 'Yes'
).
Step 2: Filter Trips Within the Date Range
Consider only trips that occurred between "2013-10-01" and "2013-10-03".
SQL:
SELECT * FROM Trips WHERE client_id NOT IN ( SELECT users_id FROM Users WHERE banned = 'Yes' ) AND driver_id NOT IN ( SELECT users_id FROM Users WHERE banned = 'Yes' ) AND request_at BETWEEN '2013-10-01' AND '2013-10-03';
Explanation:
request_at BETWEEN '2013-10-01' AND '2013-10-03'
:- Limits the analysis to trips within the specified date range.
Intermediate Output After Step 2:
+----+-----------+-----------+---------+-----------------------+------------+ | id | client_id | driver_id | city_id | status | request_at | +----+-----------+-----------+---------+-----------------------+------------+ | 1 | 1 | 10 | 1 | completed | 2013-10-01 | | 3 | 3 | 12 | 6 | completed | 2013-10-01 | | 4 | 4 | 13 | 6 | cancelled_by_driver | 2013-10-01 | | 5 | 1 | 10 | 1 | completed | 2013-10-02 | | 6 | 2 | 11 | 6 | completed | 2013-10-02 | | 7 | 3 | 12 | 6 | completed | 2013-10-02 | | 8 | 2 | 12 | 12 | completed | 2013-10-03 | | 9 | 3 | 10 | 12 | completed | 2013-10-03 | | 10 | 4 | 13 | 12 | cancelled_by_driver | 2013-10-03 | +----+-----------+-----------+---------+-----------------------+------------+
Step 3: Calculate Cancellation Rate Per Day
Compute the Cancellation Rate for each day by dividing the number of canceled trips by the total number of trips for that day. Only consider trips where both the client and driver are not banned.
SQL:
SELECT request_at AS Day, Round(SUM(IF(status = 'completed', 0, 1)) / Count(status), 2) AS 'Cancellation Rate' FROM Trips WHERE client_id NOT IN ( SELECT users_id FROM Users WHERE banned = 'Yes' ) AND driver_id NOT IN ( SELECT users_id FROM Users WHERE banned = 'Yes' ) AND request_at BETWEEN '2013-10-01' AND '2013-10-03' GROUP BY request_at;
Explanation:
SUM(IF(status = 'completed', 0, 1))
:- Counts the number of canceled trips (
status
not equal to'completed'
).
- Counts the number of canceled trips (
Count(status)
:- Counts the total number of trips for that day.
Round(..., 2)
:- Rounds the cancellation rate to two decimal places.
GROUP BY request_at
:- Calculates the cancellation rate for each day separately.
Final Output:
+--------------------------+-------------------+ | Day | Cancellation Rate | +--------------------------+-------------------+ | 2013-10-01T00:00:00.000Z | 0.33 | | 2013-10-02T00:00:00.000Z | 0.00 | | 2013-10-03T00:00:00.000Z | 0.50 | +--------------------------+-------------------+
Explanation of Output:
- 2013-10-01:
- Cancellation Rate: 0.33 (1 canceled trip out of 3)
- 2013-10-02:
- Cancellation Rate: 0.00 (0 canceled trips out of 3)
- 2013-10-03:
- Cancellation Rate: 0.50 (2 canceled trips out of 4)
.....
.....
.....
Table of Contents
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible