Grokking the Advanced System Design Interview
Ask Author
Back to course home

0% completed

Fault Tolerance

Let's explore what techniques HDFS uses for fault tolerance.

How does HDFS handle DataNode failures?


When a DataNode dies, all of its data becomes unavailable. HDFS handles this data unavailability through replication. As stated earlier, every block written to HDFS is replicated to multiple (default three) DataNodes. Therefore, if one DataNode becomes inaccessible, its data can be read from other replicas.


The NameNode keeps track of DataNodes through a heartbeat mechanism. Each DataNode sends periodic heartbeat messages (every few seconds) to the NameNode




Like the course? Get enrolled and start learning!