0% completed
Problem Statement
You’re given a string s
and a list of words words
, where all words have the same length.
A concatenated substring is formed by joining all the words from any permutation of words
— each used exactly once, without any extra characters in between.
For example, if words = ["ab", "cd", "ef"]
, then valid concatenated strings include "abcdef"
, "abefcd"
, "cdabef"
, "cdefab"
, "efabcd"
, and "efcdab"
. A string like "acdbef"
is not valid because it doesn't match any complete permutation of the given words.
Return all starting indices in s
where such concatenated substrings appear. You can return the indices in any order.
Example 1:
Input: String="catfoxcat", Words=["cat", "fox"]
Output: [0, 3]
Explanation: The two substring containing both the words are "catfox" & "foxcat".
Example 2:
Input: String="catcatfoxfox", Words=["cat", "fox"]
Output: [3]
Explanation: The only substring containing both the words is "catfox".
Constraints:
- 1 <= words.length <= 10<sup>4</sup>
1 <= words[i].length <= 30
words[i]
consists of only lowercase English letters.- All the strings of words are
unique
. - 1 <= sum(words[i].length) <= 10<sup>5</sup>
Solution
To solve this problem, we need to check every possible starting position in the string where the concatenation of all given words might begin. We will use a sliding window approach. This approach involves creating a window of the size equal to the total length of all words combined and sliding it across the string, one character at a time. For each position of the window, we will check if it contains a valid concatenation of the given words. This method is effective because it reduces the number of unnecessary checks by focusing only on the relevant parts of the string.
This approach is effective because it systematically checks all possible positions in the string, ensuring no potential concatenation is missed. By using a hashmap to store the frequency of each word in the list and another hashmap to track the words seen in the current window, we can efficiently validate the concatenations.
Step-by-step Algorithm
-
Initialize Data Structures:
- Create a hashmap to store the frequency of each word in the list.
- Create a list to store the starting indices of valid concatenations.
-
Set Up Variables:
- Calculate the total number of words (
wordsCount
). - Determine the length of each word (
wordLength
).
- Calculate the total number of words (
-
Fill Frequency Map:
- Iterate through the list of words and populate the frequency hashmap with the count of each word.
-
Slide the Window:
- Iterate over the string, adjusting the starting position of the window from
0
tostr.length() - wordsCount * wordLength
.
- Iterate over the string, adjusting the starting position of the window from
-
Check Each Window:
- For each starting position
i
, create a hashmap (wordsSeen
) to track the words seen in the current window. - Within the window, iterate through the words:
- Calculate the starting index of the next word in the window (
nextWordIndex
). - Extract the word from the string using
substring
. - Check if the word exists in the frequency hashmap:
- If it does not exist, break out of the inner loop.
- If it exists, update the
wordsSeen
hashmap with the count of the word seen. - If the count of the word in
wordsSeen
exceeds the count in the frequency hashmap, break out of the inner loop.
- If all words are matched perfectly, add the starting index
i
to the result list.
- Calculate the starting index of the next word in the window (
- For each starting position
-
Return Results:
- Return the list of starting indices where valid concatenations are found.
Algorithm Walkthrough
Using the example:
-
Input:
String = "catfoxcat"
Words = ["cat", "fox"]
-
Steps:
-
Initialize:
- wordFrequencyMap:
{ "cat": 1, "fox": 1 }
- resultIndices:
[]
wordsCount = 2
wordLength = 3
totalLength = 2 * 3 = 6
- wordFrequencyMap:
-
Sliding Window:
-
Iterate from
i = 0
toi = str.length() - totalLength = 3
-
i = 0:
- Window: "catfox"
wordsSeen = {}
- j = 0:
nextWordIndex = 0
- Extracted Word: "cat"
- Update
wordsSeen = { "cat": 1 }
- j = 1:
nextWordIndex = 3
- Extracted Word: "fox"
- Update
wordsSeen = { "cat": 1, "fox": 1 }
- All words matched, add index
0
to result list. - resultIndices:
[0]
-
i = 1:
- Window: "atfoxc"
wordsSeen = {}
- j = 0:
nextWordIndex = 1
- Extracted Word: "atf"
- "atf" is not in the wordFrequencyMap, break.
- resultIndices remains
[0]
-
i = 2:
- Window: "tfoxca"
wordsSeen = {}
- j = 0:
nextWordIndex = 2
- Extracted Word: "tfo"
- "tfo" is not in the wordFrequencyMapp, break.
- resultIndices remains
[0]
-
i = 3:
- Window: "foxcat"
wordsSeen = {}
- j = 0:
nextWordIndex = 3
- Extracted Word: "fox"
- Update
wordsSeen = { "fox": 1 }
- j = 1:
nextWordIndex = 6
- Extracted Word: "cat"
- Update
wordsSeen = { "fox": 1, "cat": 1 }
- All words matched, add index
3
to result list. - resultIndices:
[0, 3]
-
- Output:
[0, 3]
Code
Here is what our algorithm will look like:
Complexity Analysis
Time Complexity
-
Building the frequency map: The algorithm first builds a frequency map for the given words. This step takes O(M) time, where
M
is the total number of words in thewords
array. -
Sliding window traversal: The outer loop iterates through the input string
str
up to the point where there are enough characters left to accommodate the concatenation of all the words. This loop runs for O(N - M \times L) iterations, where:N
is the length of the stringstr
.M
is the number of words.L
is the length of each word (all words are assumed to have the same length).
-
Word matching: For each starting index in the string, the inner loop checks for a valid sequence of words. This loop runs for M iterations, where each iteration takes O(L) time to extract a word from the string and update the
wordsSeen
map. So the total cost of word matching for each starting index is O(M \times L). -
Therefore, the overall time complexity is:
- Outer loop: O(N - M \times L)
- Inner loop: O(M \times L)
- Total time complexity: O((N - M \times L) \times M \times L) = O(N \times M \times L).
Overall time complexity: O(N \times M \times L), where N
is the length of the string, M
is the number of words, and L
is the length of each word.
Space Complexity
- HashMaps: The algorithm uses two hash maps:
wordFrequencyMap
: This map stores the frequency of each word from thewords
array. It contains at mostM
entries, so the space complexity for this map is O(M).wordsSeen
: This map stores the frequency of the words as they are found in the current window of the string. The space complexity for this map is also O(M).
- Result list: The result list stores the starting indices of the valid concatenations. In the worst case, all possible starting positions in the string could lead to valid concatenations. Hence, this list could have up to O(N) entries in the worst case.
Overall space complexity: O(M + N).
.....
.....
.....
Table of Contents
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible
Contents are not accessible