0% completed
Problem Statement
Table: CollaborationAccepted
Each row in this table records an instance where one employee invites another to collaborate on a project, and the invitation is accepted.
+----------------+---------+
| Column Name | Type |
+----------------+---------+
| inviter_id | int |
| invitee_id | int |
| project_id | int |
| accept_date | date |
+----------------+---------+
The combination of (inviter_id, invitee_id, project_id) is the primary key for this table.
This table includes the ID of the inviter, the ID of the invitee, the project they are collaborating on, and the date the invitation was accepted.
Develop a solution to find the employee(s) who have collaborated on the most number of unique projects. In case of a tie, list all such employees.
Example
Input:
CollaborationAccepted table: +------------+------------+------------+-------------+ | inviter_id | invitee_id | project_id | accept_date | +------------+------------+------------+-------------+ | 1 | 2 | 100 | 2020-05-01 | | 1 | 3 | 101 | 2020-05-02 | | 2 | 3 | 102 | 2020-05-03 | | 3 | 4 | 103 | 2020-05-04 | | 2 | 4 | 104 | 2020-05-05 | +------------+------------+------------+-------------+
Output:
+----+--------------+ | id | num_projects | +----+--------------+ | 2 | 3 | | 3 | 3 | +----+--------------+
In this example, both employees 2 and 3 have collaborated on 3 unique projects, which is the highest among all employees.
Try It yourself
Solution
To identify the employee(s) who have collaborated on the most number of unique projects, we analyze the CollaborationAccepted
table. We can accomplishes this by leveraging Common Table Expressions (CTEs) and set operations to aggregate and determine the desired results. Below is a comprehensive breakdown of the approach, step-by-step execution, and detailed explanations for each line of the SQL query.
Approach Overview
- Combine Collaborations: Merge the inviter and invitee collaborations to treat all collaborations uniformly.
- Count Unique Projects per Employee: Calculate the number of distinct projects each employee has collaborated on.
- Identify Maximum Project Count: Determine the highest number of unique projects any employee has collaborated on.
- Select Top Collaborators: Retrieve the employee(s) whose project count matches the maximum identified.
- Present the Final Results: Display the employee IDs alongside their respective number of unique projects.
SQL Query
WITH AllCollaborations AS ( -- Combine inviter and invitee collaborations SELECT inviter_id AS id, project_id FROM CollaborationAccepted UNION SELECT invitee_id AS id, project_id FROM CollaborationAccepted ), ProjectCounts AS ( -- Count unique projects per employee SELECT id, COUNT(DISTINCT project_id) AS num_projects FROM AllCollaborations GROUP BY id ) -- Select employees with the maximum number of projects SELECT id, num_projects FROM ProjectCounts WHERE num_projects = ( SELECT MAX(num_projects) FROM ProjectCounts ) ORDER BY id;
Step-by-Step Approach
Step 1: Combine Collaborations (AllCollaborations
)
Merge the inviter and invitee collaborations to create a unified view of all collaborations, treating both roles equally.
SQL Snippet:
WITH AllCollaborations AS ( -- Combine inviter and invitee collaborations SELECT inviter_id AS id, project_id FROM CollaborationAccepted UNION SELECT invitee_id AS id, project_id FROM CollaborationAccepted ),
Explanation:
-
WITH AllCollaborations AS (
- Initiates a Common Table Expression (CTE) named
AllCollaborations
. CTEs allow for temporary result sets that can be referenced within the main query.
- Initiates a Common Table Expression (CTE) named
-
SELECT inviter_id AS id, project_id
- Selects the
inviter_id
from theCollaborationAccepted
table and aliases it asid
. - Retrieves the associated
project_id
for each invitation.
- Selects the
-
UNION
- Combines the results of the first
SELECT
with the secondSELECT
. - The
UNION
operator ensures that duplicate records are eliminated, resulting in a list of unique (id, project_id) pairs.
- Combines the results of the first
-
SELECT invitee_id AS id, project_id
- Selects the
invitee_id
from theCollaborationAccepted
table and aliases it asid
. - Retrieves the associated
project_id
for each accepted invitation.
- Selects the
Intermediate Output After Step 1 (AllCollaborations
):
+----+------------+ | id | project_id | +----+------------+ | 1 | 100 | | 2 | 100 | | 1 | 101 | | 3 | 101 | | 2 | 102 | | 3 | 102 | | 3 | 103 | | 4 | 103 | | 2 | 104 | | 4 | 104 | +----+------------+
Step 2: Count Unique Projects per Employee (ProjectCounts
)
Calculate the number of distinct projects each employee has collaborated on.
SQL Snippet:
ProjectCounts AS ( -- Count unique projects per employee SELECT id, COUNT(DISTINCT project_id) AS num_projects FROM AllCollaborations GROUP BY id )
Explanation:
-
ProjectCounts AS (
- Initiates a second CTE named
ProjectCounts
.
- Initiates a second CTE named
-
SELECT id, COUNT(DISTINCT project_id) AS num_projects
- Selects the
id
(employee ID) from theAllCollaborations
CTE. - Counts the number of distinct
project_id
s each employee has collaborated on. - Aliases the count as
num_projects
.
- Selects the
-
GROUP BY id
- Groups the data by
id
to perform the aggregation for each employee.
- Groups the data by
Intermediate Output After Step 2 (ProjectCounts
):
+----+--------------+ | id | num_projects | +----+--------------+ | 1 | 2 | | 2 | 3 | | 3 | 3 | | 4 | 2 | +----+--------------+
Step 3: Identify Maximum Project Count
Determine the highest number of unique projects any employee has collaborated on.
SQL Snippet:
SELECT id, num_projects FROM ProjectCounts WHERE num_projects = ( SELECT MAX(num_projects) FROM ProjectCounts )
Explanation:
-
SELECT id, num_projects
- Selects the
id
andnum_projects
columns from theProjectCounts
CTE.
- Selects the
-
WHERE num_projects = (
- Introduces a condition to filter employees whose
num_projects
equals the maximum found.
- Introduces a condition to filter employees whose
-
SELECT MAX(num_projects)
- Selects the highest value of
num_projects
across all employees.
- Selects the highest value of
-
ORDER BY id;
- Orders the final result set by
id
in ascending order for clarity.
- Orders the final result set by
Final Output:
+----+--------------+ | id | num_projects | +----+--------------+ | 2 | 3 | | 3 | 3 | +----+--------------+
Explanation of Output:
-
Employee 2:
- Collaborated on 3 unique projects (100, 102, 104)
-
Employee 3:
- Collaborated on 3 unique projects (101, 102, 103)
.....
.....
.....
Table of Contents
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible