0% completed
Problem Statement
Given a words
array containing non-duplicate strings, return all the concatenated words
that are in the words
array.
A concatenated word
is a string that is formed entirely of at least two shorter words (May be same) of the given array.
Examples
-
Example 1:
- Input: words =
["tree","cat","cattree","dog","catdog"]
- Expected Output:
["cattree", "catdog"]
- Justification: 'cattree' is formed by concatenating 'cat' and 'tree'. Similarly, 'catdog' is a combination of 'cat' and 'dog'. Both 'cattree' and 'catdog' consist entirely of shorter words found in the array.
- Input: words =
-
Example 2:
- Input: words =
["blue","sky","bluesky","sea","sun","seasun"]
- Expected Output:
["bluesky", "seasun"]
- Justification: 'bluesky' combines 'blue' and 'sky', while 'seasun' is formed by 'sea' and 'sun'. Both words are made up entirely of shorter words from the list.
- Input: words =
-
Example 3:
- Input:
["note","book","pen","notebook","penbook","notebookpenbook"]
- Expected Output:
["notebook", "penbook", "notebookpenbook"]
- Justification: 'notebook' is formed by concatenating 'note' and 'book'. 'penbook' combines 'pen' and 'book'. 'notebookpenbook' is an amalgamation of 'notebook' and 'penbook'.
- Input:
Solution
To solve this problem, we'll employ a dynamic programming approach to efficiently determine if a word can be formed by concatenating other words from the list. Initially, we'll sort the array of strings based on their lengths; this ensures that we build up solutions using smaller words first, which simplifies the process of checking for larger concatenated words. This approach is effective because it allows us to reuse solutions for smaller problems, thereby reducing the overall computational complexity.
The algorithm iterates through each word in the sorted list and uses a dynamic programming technique to check if the current word can be broken down into smaller words present in the list. For each word, we maintain a boolean array that keeps track of whether a substring of the word (up to a certain length) can be formed by concatenating other words from the list. By breaking down the problem into smaller subproblems and using the results of these subproblems to solve larger ones, we ensure an efficient solution that minimizes redundant computations.
Step-by-Step Algorithm
-
Sort Words by Length: Begin by sorting the array of words by their lengths in ascending order. This ensures that when checking if a word can be formed by concatenating other words, we start with the shortest words, building up to longer ones.
-
Initialize Data Structures: Create a hash set (or equivalent) to store all words for efficient lookup. Also, prepare a list or array to hold the final results.
-
Iterate Through Words: Go through each word in the sorted list. For each word, perform the following steps:
-
Skip Empty Words: If the current word is empty, skip it.
-
Remove Current Word from Set: Temporarily remove the current word from the hash set. This prevents the algorithm from using the word in its own construction.
-
Check for Concatenation: Determine if the current word can be formed by concatenating other words already seen (and thus, present in the hash set). Use dynamic programming for this step:
- Create a boolean DP array, where
dp[i]
indicates whether the substring of the word up to indexi
can be formed from words in the set. - Initialize
dp[0]
totrue
, representing the empty string. - For each position
i
in the word, check all possible splits (j
) up toi
. Ifdp[j]
is true and the substring fromj
toi
is in the hash set, setdp[i]
to true.
- Create a boolean DP array, where
-
Add Word to Results: If
dp[word.length]
is true, add the current word to the result list since it indicates the whole word can be formed by concatenation. -
Reinsert Word into Set: Add the current word back to the hash set to allow it to be used in forming longer words.
-
-
Return Results: After iterating through all words, return the list of concatenated words found.
Algorithm Walkthrough
Let's consider the Input: words = ["blue","sky","bluesky","sea","sun","seasun"]
Step 1: Sort Words by Length
Sorted array: ["sky", "sea", "sun", "blue", "bluesky", "seasun"]
.
Step 2: Initialize Data Structures
- Create a hash set containing all words.
- Prepare an empty list for results.
Step 3: Iterate Through Words
For each word, we temporarily remove it from the set, check if it can be formed by concatenating other words in the set using a DP array, add it to the results if true, and then reinsert it into the set.
"sky"
- DP array initialization:
[true, false, false, false]
(Length 4,dp[0]
is true for the base case). - No other words to form "sky" yet.
- Reinsert "sky".
"sea"
- DP array initialization:
[true, false, false, false]
(Length 4). - No other words to form "sea" yet. No need to update the DP array.
- Reinsert "sea".
"sun"
-
DP array initialization:
[true, false, false, false]
(Length 4). -
No other words to form "sun" yet. No need to update the DP array.
-
Reinsert "sun".
"blue"
- DP array initialization:
[true, false, false, false, false]
(Length 5). - No other words to form "blue" yet. No need to update the DP array.
- Reinsert "blue".
At this point, the algorithm hasn't found any concatenated words because all words checked so far are the building blocks and cannot be broken down into smaller words from the list.
"bluesky"
- DP array initialization:
[true, false, false, false, false, false, false, false]
(Length 8). - Check "blue" (positions 0 to 4):
- Since "blue" is in the set, set
dp[4]
to true.
- Since "blue" is in the set, set
- Check "sky" (positions 4 to 7):
- Since "sky" is in the set and
dp[4]
is true, setdp[7]
to true.
- Since "sky" is in the set and
- Update DP array:
[true, false, false, false, true, false, false, true]
. - Since
dp[7]
is true, "bluesky" is identified as a concatenated word and added to the result. - Reinsert "bluesky".
"seasun"
- DP array initialization:
[true, false, false, false, false, false, false]
(Length 7). - Check "sea" (positions 0 to 3):
- Since "sea" is in the set, set
dp[3]
to true.
- Since "sea" is in the set, set
- Check "sun" (positions 3 to 6):
- Since "sun" is in the set and
dp[3]
is true, setdp[6]
to true.
- Since "sun" is in the set and
- Update DP array:
[true, false, false, true, false, false, true]
. - Since
dp[6]
is true, "seasun" is identified as a concatenated word and added to the result. - Reinsert "seasun".
Step 4: Results
The final list of concatenated words is ["bluesky", "seasun"]
.
Code
Complexity Analysis
Time Complexity
-
Sorting: The initial sorting of the array of words by their lengths has a time complexity of O(N log N), where (N) is the number of words in the array.
-
Checking Concatenation: For each word, we check if it can be formed by concatenating other words from the set. This involves iterating over the length of the word ((M), the average length of a word) and, for each character, checking all possible prefixes. In the worst case, this is O(M^2) per word due to the substring operations and lookup in the set.
-
Overall, assuming (N) is the total length of all words combined, the time complexity can be approximated as O(N \log N + N * M^2).
Space Complexity
-
Storing Words: The space required to store the words in a set and the dynamic programming (DP) array for each word contributes to the space complexity. The set storage is O(N), where (N) is the number of words.
-
DP Array: For each word, a DP array of size equal to the length of the word ((M)) is used, leading to O(M) space per word. However, since the DP array is reused for each word, the maximum space needed at any time is O(M), where (M) is the maximum length of a word in the list.
-
Overall, the space complexity is O(N + M), where (N) is the number of words, and (M) is the maximum length of a word.
.....
.....
.....
Table of Contents
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible