Grokking the System Design Interview

Gary

Feb 27, 2022

I have a few questions about this section that I'm hoping to clarify:

1.) In section 6 (DB schema), why is the PhotoID declared as int (which is usually 4 bytes and goes up to around 2 billion), when the number of photos created over 10 years is estimated to be around 7.3 billion (2M * 365 * 10)?

2.) In section 10 (Data Sharding), while PhotoID-based partitioning resolves the various issues with hot users, doesn't it make it slower to retrieve all photos for a single user, since his/her photos could be spread out across many shards?

3.) In section 12 (News Feed Creation...), to determine the chronological order of photo creation, why not just make PhotoID a bigint that's an autogenerated DB sequence and use that to determine order? I also don't think 5 bytes fit in an int?

I also echo JC's question about the UserPhoto table.

Thanks.
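
The arithmetic behind questions 1 and 3 can be checked with a short sketch. The 2M photos per day and the 10-year horizon come from the post; treating `int` as a signed 32-bit integer is an assumption, and `bits_needed` is a hypothetical helper introduced only for illustration.

```python
# Figures from the post: 2M new photos per day, kept for 10 years.
PHOTOS_PER_DAY = 2_000_000
YEARS = 10

total_photos = PHOTOS_PER_DAY * 365 * YEARS

# Assumption: "int" means a signed 32-bit integer, as in many SQL dialects.
INT32_MAX = 2**31 - 1


def bits_needed(n: int) -> int:
    """Smallest number of bits that gives n items distinct non-negative IDs."""
    return (n - 1).bit_length()


print(f"total photos over {YEARS} years: {total_photos:,}")
print(f"fits in a signed 32-bit int: {total_photos <= INT32_MAX}")
print(f"bits needed for distinct IDs: {bits_needed(total_photos)}")
print(f"4 bytes = {4 * 8} bits, 5 bytes = {5 * 8} bits, 8 bytes = {8 * 8} bits")
```

Under these assumptions the total is 7.3 billion photos, which needs 33 bits. That exceeds a 4-byte int but fits in 5 bytes (40 bits) or an 8-byte bigint.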


Comments
noob 3 years ago

I can share my view on No. 2. Since we say at the start that consistency can take a hit, I believe it is okay for the server filling the UserNewsFeed table to take some time reading photos from multiple shards.

Also, we have to read photos for multiple users anyway (whom the cur...
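
The cost raised in question 2 can be pictured as a scatter-gather read: with PhotoID-based partitioning, one user's photos may sit on several shards, so every shard is queried and the partial results are merged. This is a minimal in-memory sketch, not the course's implementation. The `Photo` fields, `NUM_SHARDS`, the modulo placement rule, and all function names are hypothetical.

```python
import heapq
import itertools
from typing import NamedTuple


class Photo(NamedTuple):
    photo_id: int
    user_id: int
    created_at: int  # epoch seconds


NUM_SHARDS = 4  # hypothetical shard count


def shard_for(photo_id: int) -> int:
    # Assumed placement rule: shard is chosen from the PhotoID alone.
    return photo_id % NUM_SHARDS


def build_shards(photos):
    shards = [[] for _ in range(NUM_SHARDS)]
    for photo in photos:
        shards[shard_for(photo.photo_id)].append(photo)
    return shards


def photos_for_user(shards, user_id, limit):
    # Scatter: every shard returns its own newest-first slice for the user.
    per_shard = [
        sorted(
            (p for p in shard if p.user_id == user_id),
            key=lambda p: p.created_at,
            reverse=True,
        )[:limit]
        for shard in shards
    ]
    # Gather: merge the already-sorted slices and keep the newest `limit`.
    merged = heapq.merge(*per_shard, key=lambda p: p.created_at, reverse=True)
    return list(itertools.islice(merged, limit))


photos = [Photo(i, i % 2, 1000 + i) for i in range(10)]
shards = build_shards(photos)
latest = photos_for_user(shards, user_id=1, limit=3)
print([p.photo_id for p in latest])
```

In a real deployment the per-shard queries would run in parallel, so the latency is roughly that of the slowest shard plus the merge, rather than the sum over all shards.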

Viktoria 3 years ago

No. 1: Later on, the course proposes using a 64-bit PhotoID or a 40-bit PhotoID (based on epoch time). Both options are large enough to store 10 years of photos at 2M photos per day.
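
One way to read "40 bits based on epoch time" is an ID whose high bits hold the creation time and whose low bits hold a small sequence number. The sketch below assumes a split of 31 bits of epoch seconds plus 9 bits of sequence; that split, and the names `make_photo_id` and `created_at`, are illustrative assumptions, not something stated in this thread.

```python
TIME_BITS = 31  # assumed: epoch seconds; 31 bits last until early 2038
SEQ_BITS = 9    # assumed: up to 512 IDs per second per generator


def make_photo_id(epoch_seconds: int, sequence: int) -> int:
    """Pack the creation time and a per-second sequence into one integer."""
    if not 0 <= epoch_seconds < 2**TIME_BITS:
        raise ValueError("epoch_seconds does not fit in TIME_BITS")
    if not 0 <= sequence < 2**SEQ_BITS:
        raise ValueError("sequence does not fit in SEQ_BITS")
    return (epoch_seconds << SEQ_BITS) | sequence


def created_at(photo_id: int) -> int:
    """Recover the creation time (epoch seconds) from the ID."""
    return photo_id >> SEQ_BITS


older = make_photo_id(1_645_920_000, 5)
newer = make_photo_id(1_645_920_001, 0)

print(older < newer)                 # IDs sort in creation order
print(created_at(older))             # creation time is recoverable
print(older < 2**(TIME_BITS + SEQ_BITS))  # fits in 40 bits (5 bytes)
```

With this layout, sorting by PhotoID gives chronological order without a single global database sequence, which relates to question 3. At 2M photos per day the average rate is about 23 photos per second, well under the 512 per second that the assumed 9-bit sequence allows.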

On this page

  1. What is Instagram?

Try it yourself

Designing Instagram (video)

  2. Requirements and Goals of the System
  3. Some Design Considerations
  4. Capacity Estimation and Constraints
  5. High Level System Design
  6. Database Schema
  7. Data Size Estimation
  8. Component Design
  9. Reliability and Redundancy
  10. Data Sharding
  11. Ranking and News Feed Generation
  12. News Feed Creation with Sharded Data
  13. Cache and Load balancing