Grokking the Advanced System Design Interview

Fault Tolerance, High Availability, and Data Integrity

Let's learn how GFS implements fault tolerance, high availability, and data integrity.

Fault tolerance

To stay fault-tolerant and available, GFS relies on two simple strategies:

  1. Fast recovery in case of component failures.
  2. Replication for high availability.

Let's first see how GFS recovers from master or replica failure:

  • On master failure: The master is a single point of failure, so losing it can quickly make the entire system unavailable. To handle this, all operations applied on the master are saved in an operation log.
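
To make the operation-log idea concrete, here is a minimal sketch of log-then-apply with replay on restart. It is an illustration under assumptions, not GFS's actual implementation: the `Master` class, the `mutate` method, the operation types, and the JSON-lines log format are all hypothetical names chosen for this example.

```python
import json
import os
import tempfile


class Master:
    """Toy master (hypothetical): metadata lives in memory, mutations are logged first."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.files = {}  # file name -> list of chunk handles
        self._replay()   # rebuild in-memory state from the log on startup

    def _apply(self, op):
        # Apply one logged operation to the in-memory metadata.
        if op["type"] == "create":
            self.files[op["name"]] = []
        elif op["type"] == "add_chunk":
            self.files[op["name"]].append(op["handle"])
        elif op["type"] == "delete":
            self.files.pop(op["name"], None)

    def _replay(self):
        # On a fresh start there is no log yet, so there is nothing to replay.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path) as log:
            for line in log:
                self._apply(json.loads(line))

    def mutate(self, op):
        # Persist the operation to disk before changing in-memory state,
        # so a crash after this point cannot lose the change.
        with open(self.log_path, "a") as log:
            log.write(json.dumps(op) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self._apply(op)


def demo():
    log_path = os.path.join(tempfile.mkdtemp(), "oplog.jsonl")
    master = Master(log_path)
    master.mutate({"type": "create", "name": "/logs/a"})
    master.mutate({"type": "add_chunk", "name": "/logs/a", "handle": 1})
    master.mutate({"type": "create", "name": "/logs/b"})
    master.mutate({"type": "delete", "name": "/logs/b"})
    # Simulate a crash: a new Master starts with empty memory and replays the log.
    recovered = Master(log_path)
    return master.files, recovered.files
```

Calling `demo()` returns the metadata before the simulated crash and after recovery; the two should match because every mutation reached the log before it was applied.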



