Grokking Graph Algorithms for Coding Interviews
Ask Author
Back to course home

0% completed

Vote For New Content
Solution: Accounts Merge
Table of Contents

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Problem Statement

You are given a list of accounts, each of which is a list of strings. The first string in each list is the account name, and the rest are email addresses associated with that account.

You are required to merge accounts that belong to the same person. Two accounts are considered to belong to the same person if there is at least one common email address between them.

Note that even if two accounts have the same name, they may belong to different people, as people could have the same name.

After merging, the output should be a list of accounts in which each account is represented by a list containing the name and the unique email addresses sorted in alphabetical order. The order of accounts in the output does not matter.

Examples

Example 1:

Input:

accounts = [["Alice", "alice@mail.com", "alice123@mail.com"],
  ["Alice", "alice123@mail.com", "alice_newyork@mail.com"],
  ["Bob", "bob@mail.com"],
  ["Alice", "alice@mail.com", "alice00@mail.com"]
]

Expected Output:

[
  ["Alice", "alice@mail.com", "alice00@mail.com", "alice123@mail.com", "alice_newyork@mail.com"],
  ["Bob", "bob@mail.com"]
]

Justification: The first, second, and fourth accounts belong to Alice and have overlapping emails. Hence, they are merged into one account.

Example 2:

Input:

accounts = [["Charlie", "charlie@mail.com", "charlie00@mail.com"],
  ["Charlie", "charlie_newyork@mail.com", "charlie00@mail.com"],
  ["Eve", "eve@mail.com"],
  ["Bob", "bob@mail.com"]
]

Expected Output:

[
  [["Charlie","charlie00@mail.com","charlie@mail.com","charlie_newyork@mail.com"],
   ["Eve","eve@mail.com"], 
   ["Bob","bob@mail.com"]]
]

Justification: The first, and second accounts belong to Charlie and have overlapping emails. Hence, they are merged into one account.

Example 3:

Input:

accounts = [["David", "david@mail.com"],
  ["David", "david1@mail.com", "david123@mail.com"],
  ["John", "john@mail.com"],
  ["David", "david123@mail.com"]
]

Expected Output:

[
  ["David","david@mail.com"],
  ["David","david123@mail.com","david1@mail.com"],
  ["John","john@mail.com"]]
]

Justification: The second, and fourth accounts belong to David and have overlapping emails. The first account has name equal to David but it doesn't have overlapping email with second and fourth account.

Constraints:

  • 1 <= accounts.length <= 1000
  • 2 <= accounts[i].length <= 10
  • 1 <= accounts[i][j].length <= 30
  • accounts[i][0] consists of English letters.
  • accounts[i][j] (for j > 0) is a valid email.

Solution

To solve this problem, we will use the Union-Find (also known as Disjoint Set Union, DSU) data structure. This approach allows us to efficiently merge sets and find representative elements of sets. We will treat each email as a node and use the union-find data structure to connect these nodes based on their shared accounts. By merging sets of emails that belong to the same account, we can group all related emails together.

This approach is effective because it leverages the efficiency of the union-find operations (union and find) to handle merging operations. It ensures that we can quickly determine whether two emails belong to the same person and merge their sets if necessary. This method is efficient and scales well with larger input sizes.

Step-by-Step Algorithm

  1. Initialize DSU:

    • Create a DSU (Disjoint Set Union) object for managing the union and find operations.
    • Initialize parent and rank arrays.
  2. Create Email to Index Mapping:

    • Initialize a map to associate each email with the index of the account it appears in.
  3. Union Accounts:

    • Iterate through each account in the list.
    • For each email in the account:
      • If the email has not been seen before, map it to the current account index.
      • If the email has been seen before, union the current account index with the index of the previously seen email.
  4. Group Emails by Root Index:

    • Initialize a map to group emails by their root index.
    • For each email in the emailToIndex map:
      • Find the root index of the account containing the email.
      • Add the email to the list corresponding to the root index.
  5. Collect and Sort Results:

    • Initialize an empty list to store the merged accounts.
    • For each group of emails:
      • Sort the emails.
      • Add the account name followed by the sorted emails to the result list.
  6. Return Results:

    • Return the merged accounts list.

Algorithm Walkthrough

Using the example:

[
  ["Alice", "alice@mail.com", "alice123@mail.com"],
  ["Alice", "alice123@mail.com", "alice_newyork@mail.com"],
  ["Bob", "bob@mail.com"],
  ["Alice", "alice@mail.com", "alice00@mail.com"]
]

Step-by-Step Execution:

  1. Initialize DSU:

    • parent = [0, 1, 2, 3]
    • rank = [1, 1, 1, 1]
  2. Create Email to Index Mapping:

    • emailToIndex = {}
  3. Union Accounts:

    • Account 0: ["Alice", "alice@mail.com", "alice123@mail.com"]

      • email = "alice@mail.com"
        • Not seen before, map to index 0: emailToIndex["alice@mail.com"] = 0
      • email = "alice123@mail.com"
        • Not seen before, map to index 0: emailToIndex["alice123@mail.com"] = 0
    • Account 1: ["Alice", "alice123@mail.com", "alice_newyork@mail.com"]

      • email = "alice123@mail.com"
        • Seen before, union indices 1 and 0.
        • Union(1, 0):
          • Find(1) = 1
          • Find(0) = 0
          • Attach smaller tree under larger tree: parent[1] = 0, update rank: rank[0] += rank[1]
      • email = "alice_newyork@mail.com"
        • Not seen before, map to index 1: emailToIndex["alice_newyork@mail.com"] = 1
    • Account 2: ["Bob", "bob@mail.com"]

      • email = "bob@mail.com"
        • Not seen before, map to index 2: emailToIndex["bob@mail.com"] = 2
    • Account 3: ["Alice", "alice@mail.com", "alice00@mail.com"]

      • email = "alice@mail.com"
        • Seen before, union indices 3 and 0.
        • Union(3, 0):
          • Find(3) = 3
          • Find(0) = 0
          • Attach smaller tree under larger tree: parent[3] = 0, update rank: rank[0] += rank[3]
      • email = "alice00@mail.com"
        • Not seen before, map to index 3: emailToIndex["alice00@mail.com"] = 3
  4. Group Emails by Root Index:

    • components = {}

    • Email: "alice@mail.com", Index: 0

      • Find(0) = 0
      • Add to group 0: components[0] = ["alice@mail.com"]
    • Email: "alice123@mail.com", Index: 0

      • Find(0) = 0
      • Add to group 0: components[0].append("alice123@mail.com")
    • Email: "alice_newyork@mail.com", Index: 1

      • Find(1) = 0
      • Add to group 0: components[0].append("alice_newyork@mail.com")
    • Email: "bob@mail.com", Index: 2

      • Find(2) = 2
      • Add to group 2: components[2] = ["bob@mail.com"]
    • Email: "alice00@mail.com", Index: 3

      • Find(3) = 0
      • Add to group 0: components[0].append("alice00@mail.com")
  5. Collect and Sort Results:

    • mergedAccounts = []

    • Group 0: ["alice@mail.com", "alice123@mail.com", "alice_newyork@mail.com", "alice00@mail.com"]

      • Sort emails: ["alice00@mail.com", "alice123@mail.com", "alice@mail.com", "alice_newyork@mail.com"]
      • Create account: ["Alice", "alice00@mail.com", "alice123@mail.com", "alice@mail.com", "alice_newyork@mail.com"]
      • Add to result: mergedAccounts.append(...)
    • Group 2: ["bob@mail.com"]

      • Create account: ["Bob", "bob@mail.com"]
      • Add to result: mergedAccounts.append(...)
  6. Return Results:

    • mergedAccounts = [["Alice", "alice00@mail.com", "alice123@mail.com", "alice@mail.com", "alice_newyork@mail.com"], ["Bob", "bob@mail.com"]]

Code

Python3
Python3

. . . .

Complexity Analysis

Time Complexity

  • Initializing DSU: O(n)
  • Mapping emails to account indices: O(n * m * α(n)), where n is the number of accounts, m is the average number of emails per account, and α is the inverse Ackermann function (very small).
  • Sorting emails: O(n * m * log(m)), where n is the number of accounts and m is the average number of emails per account.
  • Overall: O(n * m * α(n) + n * m * log(m))

Space Complexity

  • Storing parent and rank arrays in DSU: O(n)
  • Storing email to index mapping: O(n * m)
  • Storing components: O(n * m)
  • Overall: O(n * m)

.....

.....

.....

Like the course? Get enrolled and start learning!

Table of Contents

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible

Contents are not accessible