Grokking the System Design Interview
Question about Data partitioning - Not sure if there are some users who are very...

nth learner

Dec 31, 2021

Question about data partitioning: some users may be very active and send a lot of data, so I was wondering whether it would be useful to partition by conversation ID, a hash constructed from the combination of sender ID and receiver ID. That way, all messages that belong to a conversation can be fetched at once, and a user's conversations can be spread across multiple shards. The only downside I can see is that building the chat list with its previews would need to read data from many shards at once, which could make it slow. Also, the article says we keep two copies of each message, one for the sender and one for the receiver. With conversation-ID partitioning we might still need two copies, but isn't that what we would be keeping anyway when partitioning by UserID?
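
To make the idea concrete, here is a minimal sketch of how such a conversation ID and shard lookup might look. The names `conversation_id`, `shard_for`, and `NUM_SHARDS` are hypothetical and do not come from the article; SHA-256 and modulo sharding are only illustrative choices. The two user IDs are sorted before hashing so that both directions of a chat map to the same ID.

```python
import hashlib

NUM_SHARDS = 16  # hypothetical shard count, chosen only for illustration


def conversation_id(user_a: int, user_b: int) -> str:
    """Derive an order-independent ID for a one-to-one conversation."""
    # Sort the pair so (sender, receiver) and (receiver, sender) hash the same.
    low, high = sorted((user_a, user_b))
    return hashlib.sha256(f"{low}:{high}".encode()).hexdigest()


def shard_for(conv_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a conversation ID (hex string) to a shard index."""
    return int(conv_id, 16) % num_shards


# Both participants resolve to the same shard for this conversation.
print(shard_for(conversation_id(42, 7)) == shard_for(conversation_id(7, 42)))
```

With this scheme a single user's conversations land on different shards, which is what spreads a heavy user's load but also what forces the chat-list query to touch several shards.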

Second question: could WebSockets be used here instead of HTTP polling? WebSockets seem like a good candidate.


Comments
Junaid Effendi, 3 years ago

I had the same thought. Similar to groupChatID, a single conversationId can cover all types of conversation, and it is more important to keep all messages that belong to the same conversation together.

Junaid Effendi, 3 years ago

If we key connections by conversationId instead, there would be too many connections: for example, one user can have 100 chats, meaning 500M * 100 connections. That's one downside.
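
A quick back-of-the-envelope check of that estimate, taking the comment's figures (500M users, 100 chats each) as given. The constant and function names are hypothetical.

```python
USERS = 500_000_000     # figure used in the comment above
CHATS_PER_USER = 100    # example value from the comment above


def connections_per_user(users: int) -> int:
    """One connection per user, regardless of how many chats they have."""
    return users


def connections_per_conversation(users: int, chats_per_user: int) -> int:
    """One connection per (user, conversation) pair."""
    return users * chats_per_user


ratio = connections_per_conversation(USERS, CHATS_PER_USER) // connections_per_user(USERS)
print(f"per-conversation keying needs {ratio}x as many connections")
```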