0% completed
Problem Statement
Two strings, X
and Y
, are considered similar if they are identical or can be made identical by swapping at most two distinct positions.
For example, "abcd"
and "bacd"
are similar (swapping at positions 0 and 1), "abcd"
and "acbd"
are also similar, but "dabc"
is not similar to "abcd"
, "bacd"
, and "acbd"
.
Together, these form two connected groups by similarity: {"abcd"
, "bacd"
, "acbd"
} and {"dabc"
}. Notice that "bacd"
and "acbd"
are in the same group even though they are not similar. Formally, each group is such that a word is in the group if and only if it is similar to at least one other word in the group.
Given a list of strings, where each string is an anagram of every other string in the list, find the total number of groups of strings.
Examples
Example 1
- Input: strs =
["rstuv", "uvrst", "rsutv", "urvst", "vutsr"]
- Expected Output:
3
- Justification: The groups are
["rstuv", "rsutv"]
,["uvrst", "urvst"]
and["vutsr"]
. All strings within each group can be made identical by swapping at most two characters.
Example 2
- Input: strs =
["abcd", "acbd", "bacd", "dabc"]
- Expected Output:
2
- Justification: The groups are
["abcd", "acbd", "bacd", ]
and["dabc"]
. "abcd", "acbd", and "bacd" can be made identical by swapping at most two characters. "dabc" form another group.
Example 3
- Input: strs =
["xyz", "xzy", "yxz", "yzx", "zxy", "zyx"]
- Expected Output:
1
- Justification: All strings are belongs to the same group.
Constraints:
- 1 <= strs.length <= 300
- 1 <= strs[i].length <= 300
- strs[i] consists of lowercase letters only.
- All words in strs have the same length and are anagrams of each other.
Solution
To solve this problem, we use the Union-Find algorithm, also known as Disjoint Set Union (DSU). This algorithm helps us efficiently group and find connected components. The main idea is to treat each string as a node in a graph. If two strings are similar, we connect their nodes. By the end, the number of connected components in the graph will be the number of similar string groups.
This approach works well because it efficiently manages the merging and finding of groups using path compression and union by rank, which keeps the operations nearly constant time.
Step-by-step Algorithm
-
Initialize: Create a
parent
array where each element is its own parent, and arank
array initialized to zero.parent[i] = i
for alli
rank[i] = 0
for alli
-
Find Function: Implement a
find
function with path compression to find the root parent of a node.- If
parent[x]
is notx
, setparent[x]
to the result offind(parent[x])
- Return
parent[x]
- If
-
Union Function: Implement a
union
function with union by rank to merge two sets.- Find the root parents of
x
andy
- If the root parents are different, compare their ranks
- If
rank[rootX]
is greater thanrank[rootY]
, setparent[rootY]
torootX
- If
rank[rootX]
is less thanrank[rootY]
, setparent[rootX]
torootY
- If ranks are equal, set
parent[rootY]
torootX
and incrementrank[rootX]
- If
- Find the root parents of
-
Similarity Check: Create a helper function
isSimilar
to check if two strings are similar by counting character differences.- Initialize a counter
diff
- Iterate through the characters of both strings
- Increment
diff
for each mismatch - If
diff
exceeds 2, returnfalse
- Increment
- Return
true
ifdiff
is 2 or 0
- Initialize a counter
-
Group Formation: Iterate over the list of strings, and for each pair
(i, j)
, if they are similar, union their sets.- For each
i
from 0 ton-1
- For each
j
fromi+1
ton
- If
isSimilar(strs[i], strs[j])
, callunion(i, j)
- If
- For each
- For each
-
Count Groups: Count the number of unique parents to determine the number of groups.
- Create a set to store unique root parents
- For each
i
from 0 ton-1
, addfind(i)
to the set - The size of the set is the number of groups
Algorithm Walkthrough
Let's walk through the algorithm using the input strs = ["rstuv", "uvrst", "rsutv", "urvst", "vutsr"]
.
-
Initialization:
parent = [0, 1, 2, 3, 4]
rank = [0, 0, 0, 0, 0]
-
Union-Find Operations:
-
Compare "rstuv" with each other string:
- "rstuv" and "uvrst" are not similar.
- "rstuv" and "rsutv" are similar (swap 'u' with 't') -> union(0, 2)
find(0) = 0
,find(2) = 2
parent[2] = 0
parent = [0, 1, 0, 3, 4]
,rank = [1, 0, 0, 0, 0]
- "rstuv" and "urvst" are not similar.
- "rstuv" and "vutsr" are not similar.
-
Compare "uvrst" with each other string:
- "uvrst" and "rsutv" are not similar.
- "uvrst" and "urvst" are similar (swap 'v' with 'r') -> union(1, 3)
find(1) = 1
,find(3) = 3
parent[3] = 1
parent = [0, 1, 0, 1, 4]
,rank = [1, 1, 0, 0, 0]
- "uvrst" and "vutsr" are not similar.
-
Compare "rsutv" with each other string:
- "rsutv" and "urvst" are not similar.
- "rsutv" and "vutsr" are not similar.
-
Compare "urvst" with "vutsr":
- "urvst" and "vutsr" are not similar.
-
-
Count Groups:
- Create a set to store unique root parents.
- For each
i
from 0 to 4, addfind(i)
to the set.find(0) = 0
find(1) = 1
find(2) = find(0) = 0
find(3) = find(1) = 1
find(4) = 4
- The set contains {0, 1, 4}.
- The number of groups is 3.
Code
Complexity Analysis
Time Complexity
- Initialization: Setting up the
parent
andrank
arrays takes O(n) time. - String Comparisons: Comparing every pair of strings takes O(n^2 \cdot L) time, where (n) is the number of strings and (L) is the length of each string.
- Union-Find Operations: Each
find
andunion
operation is O(\alpha(n)), nearly constant time. Since there are (O(n^2)) comparisons, the total time is O(n^2 \cdot \alpha(n)). - Counting Groups: Counting unique parents takes O(n) time.
Overall, the dominant factor is string comparisons, resulting in: O(n^2 \cdot L)
Space Complexity
- Union-Find Arrays: The
parent
andrank
arrays use O(n) space.
Overall, the space complexity is: O(n)
.....
.....
.....
Table of Contents
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible