nth learner
Dec 28, 2021
I think the Data Deduplication section should have more detail. It is missing a discussion of chunk management when files are updated or modified. I have a couple of questions about this section. Let's assume that we're keeping 4K chunks. What happens if a file is modified in the middle by adding a lot of text (more than one chunk size)? Would the old chunk be replaced by two newly created chunks, since the new data cannot fit in one chunk? My assumption is that, thanks to the metadata, the chunks that contain the data after the modification won't change at all. Or will the client regenerate the chunks?
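To make the scenario concrete, here is a minimal Python sketch. It assumes fixed-offset 4K chunking with SHA-256 chunk hashes, which the course section may or may not use; the helper names (chunk_hashes, new_chunk_indexes, insert) are hypothetical. It compares which chunks would need to be uploaded after an insertion aligned to a chunk boundary, an unaligned insertion, and an in-place overwrite.

```python
import hashlib

CHUNK_SIZE = 4096  # 4K chunks, as assumed in the question


def chunk_hashes(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Split data at fixed offsets and return the SHA-256 digest of each chunk."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


def new_chunk_indexes(old: bytes, new: bytes) -> list:
    """Indexes of chunks in `new` whose hash was not already stored for `old`."""
    known = set(chunk_hashes(old))
    return [i for i, h in enumerate(chunk_hashes(new)) if h not in known]


def insert(data: bytes, offset: int, payload: bytes) -> bytes:
    """Return a copy of `data` with `payload` inserted at `offset`."""
    return data[:offset] + payload + data[offset:]


# Hypothetical 4-chunk file; each chunk has distinct content.
original = b"".join(bytes([n]) * CHUNK_SIZE for n in range(4))

# Insert two chunks' worth of data exactly on a chunk boundary.
aligned = insert(original, CHUNK_SIZE, b"\xff" * (2 * CHUNK_SIZE))

# Insert 100 bytes in the middle of chunk 1.
unaligned = insert(original, CHUNK_SIZE + 10, b"\xff" * 100)

# Overwrite 100 bytes in place (file length unchanged).
overwritten = (original[:CHUNK_SIZE + 10] + b"\xff" * 100
               + original[CHUNK_SIZE + 110:])

print(new_chunk_indexes(original, aligned))      # [1, 2]
print(new_chunk_indexes(original, unaligned))    # [1, 2, 3, 4]
print(new_chunk_indexes(original, overwritten))  # [1]
```

Under these assumptions, the chunks after the modification stay unchanged only when the inserted data is aligned to a chunk boundary and is a whole multiple of the chunk size, or when bytes are overwritten in place. An unaligned insertion shifts every later byte, so every later fixed-offset chunk gets a new hash. Content-defined chunking, where boundaries are chosen from the data itself, is one known way to limit that effect.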
Comments
Design Gurus 4 years ago
Good questions. This requires a lot of detail to answer. For a complete understanding, you can read about how Google’s GFS or BigTable stores, modifies, and appends chunks. These scenarios are discussed in detail in Grokking the Advanced System Design. Not all engineers are...
swe 4 years ago
How would synchronization work if separate clients make changes to the same file that affect different chunks, but the updated chunks in combination would result in a corrupted file?
Is that possible?
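One common way to guard against this, sketched below in Python, is a file-level version check (optimistic concurrency): a commit is accepted only if the client edited the latest version, even when its changes touch different chunks. This is an assumption for illustration, not something the course section states, and MetadataServer and commit are hypothetical names.

```python
class MetadataServer:
    """Hypothetical metadata service that keeps one version number per file."""

    def __init__(self):
        self.version = 0
        self.chunks = {}  # chunk index -> chunk hash

    def commit(self, base_version: int, updates: dict) -> bool:
        """Apply chunk updates only if the client edited the latest version."""
        if base_version != self.version:
            # The client worked from a stale copy; it must re-sync and
            # merge (or save a conflicted copy) before retrying.
            return False
        self.chunks.update(updates)
        self.version += 1
        return True


server = MetadataServer()

# Both clients start from version 0 and edit different chunks.
a_ok = server.commit(base_version=0, updates={1: "hash-from-client-A"})
b_ok = server.commit(base_version=0, updates={3: "hash-from-client-B"})

print(a_ok, b_ok, server.version)  # True False 1
```

With a check like this, the second client's update is rejected rather than silently combined with the first, so the two sets of chunk changes never mix without an explicit merge step.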