Fault Tolerance section
L L
Dec 17, 2024
How can we efficiently retrieve a mapping between tweets and the index server? We have to build a reverse index that will map all the TweetID to their index server. [...] We will need to build a Hashtable where the ‘key’ will be the index server number and the ‘value’ will be a HashSet containing all the TweetIDs
First, I think there's a mistake in the first part: the mapping described is from index servers to tweets, not from tweets to index servers.
Second, instead of adding yet another mapping (which would also need to be fault tolerant), wouldn't it be simpler to use a Write-Ahead Log (WAL)? Whenever an index server maps a given word to a tweet, it can append that entry to a log on disk, in a location dedicated to that index server. If both the leader and the replica of the index server crash, we can always reconstruct its state by replaying the WAL.
This also means that we don't have to assign index servers based on tweet IDs at all. We can just randomly select an index server each time we have to index a tweet.
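To make the idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and only for illustration: the `IndexServer` class, the `index_tweet` helper, and the JSON-lines log format are my own names and choices, not something from the course. It also assumes the WAL file lives on storage that survives the crash of the server itself.

```python
import json
import os
import random
from collections import defaultdict


class IndexServer:
    """Toy inverted index (word -> tweet IDs) backed by a write-ahead log.

    Hypothetical illustration only; names and log format are assumptions.
    """

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.index = defaultdict(set)

    def add(self, word, tweet_id):
        # Log first, then apply: the entry is made durable before it
        # becomes visible in the in-memory index.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps({"word": word, "tweet_id": tweet_id}) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        self.index[word].add(tweet_id)

    @classmethod
    def recover(cls, wal_path):
        """Rebuild a server's in-memory index by replaying its WAL."""
        server = cls(wal_path)
        if not os.path.exists(wal_path):
            return server
        with open(wal_path, encoding="utf-8") as wal:
            for line in wal:
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    # A crash mid-write can leave a torn last line; stop there.
                    break
                server.index[entry["word"]].add(entry["tweet_id"])
        return server


def index_tweet(servers, tweet_id, text, rng=random):
    """Index every word of a tweet on one randomly chosen server."""
    server = rng.choice(servers)
    for word in set(text.lower().split()):
        server.add(word, tweet_id)
    return server
```

If a server and its replica are both lost, `IndexServer.recover(path)` replays that server's log and returns an equivalent index, so no separate server-to-TweetID table is needed for the rebuild. Since placement is random, a search still has to query every index server and merge the results.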