How can HDFS have a sequential 64 MB block when the underlying Linux filesystem uses only 4 KB blocks, so a 64 MB write cannot be sequential?
Any thoughts on this? I have not been able to find an explanation.
You may be confusing the terms "contiguous" and "sequential". Reads and writes from/to disk are "sequential" (they proceed through the data front-to-back, without seeking), while disk space allocation is "contiguous" (the allocated filesystem blocks sit physically next to each other).
A single 64 MB HDFS block is stored as an ordinary file on the datanode's local filesystem, and it is written to disk sequentially. Because the write is sequential, there is a fair chance the filesystem will allocate contiguous space for it (many 4 KB blocks placed next to each other), so fragmentation is much lower than it would be for random disk writes. The 64 MB "block" is an HDFS-level unit; the local filesystem still manages it as 4 KB blocks underneath.
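Here is a minimal sketch of that idea (this is not the actual DataNode code; the file name `blk_1073741825` and the packet size are illustrative):

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Sketch: an HDFS block lives as an ordinary file on the local filesystem
// (datanodes name them like blk_<id>), and data is appended to it
// front-to-back as packets arrive from the client.
public class BlockWriteSketch {
    static final int BLOCK_SIZE = 64 * 1024 * 1024; // one 64 MB HDFS block
    static final int PACKET_SIZE = 64 * 1024;       // data arrives in small packets

    public static void main(String[] args) throws IOException {
        byte[] packet = new byte[PACKET_SIZE];
        try (BufferedOutputStream out =
                 new BufferedOutputStream(new FileOutputStream("blk_1073741825"))) {
            for (int written = 0; written < BLOCK_SIZE; written += PACKET_SIZE) {
                out.write(packet); // strictly sequential: no seeks, front-to-back
            }
        }
        // The local filesystem maps this stream onto its own 4 KB blocks;
        // because the write is sequential, the allocator can usually place
        // those 4 KB blocks in large contiguous extents.
    }
}
```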
Furthermore, sequential reads and writes are much faster than random ones, which incur a disk seek per operation. See Difference between sequential write and random write for further information.
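You can see the seek cost yourself with a rough micro-benchmark like the following (file names and sizes are arbitrary, and the results depend heavily on your hardware and OS caching, so treat it as a sketch rather than a rigorous measurement):

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Random;

// Sketch: write 64 MB sequentially, then write the same amount with a
// random seek before each 4 KB chunk. On spinning disks the random
// variant is dominated by seek time.
public class SeekCostSketch {
    static final int CHUNK = 4 * 1024;   // one 4 KB filesystem block
    static final int CHUNKS = 16 * 1024; // 64 MB total

    public static void main(String[] args) throws IOException {
        byte[] buf = new byte[CHUNK];
        Random rnd = new Random(42);

        long t0 = System.nanoTime();
        try (RandomAccessFile f = new RandomAccessFile("seq.bin", "rw")) {
            for (int i = 0; i < CHUNKS; i++) f.write(buf); // sequential
        }
        long seq = System.nanoTime() - t0;

        t0 = System.nanoTime();
        try (RandomAccessFile f = new RandomAccessFile("rand.bin", "rw")) {
            f.setLength((long) CHUNK * CHUNKS);
            for (int i = 0; i < CHUNKS; i++) {
                f.seek((long) rnd.nextInt(CHUNKS) * CHUNK); // random offset
                f.write(buf);
            }
        }
        long rand = System.nanoTime() - t0;

        System.out.printf("sequential: %d ms, random: %d ms%n",
                          seq / 1_000_000, rand / 1_000_000);
    }
}
```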