
I have put 100 files on a Hadoop cluster. I want to determine the size of the metadata maintained by the NameNode corresponding to these files.

user1492991

1 Answer


I believe the metadata you mean is the information about the data blocks stored on the DataNodes. All of those details are maintained in the NameNode's memory (RAM).

The NameNode consumes about 150 bytes for each block's metadata and about 150 bytes for each file's metadata. So let's assume your cluster's block size is 128 MB and each of your 100 files is around 100 MB, so each file occupies a single block. Then each file consumes 300 bytes of NameNode memory (150 for the file plus 150 for its block), and in total the NameNode will consume 300 * 100 = 30,000 bytes. This assumes a replication factor of 1.
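As a rough illustration of that arithmetic, here is a minimal Python sketch. The 150-byte constants and the per-replica scaling are assumptions taken from this answer, not exact HDFS internals:

```python
import math

# Assumed per-object costs in NameNode heap (from the answer above).
BYTES_PER_FILE_OBJECT = 150
BYTES_PER_BLOCK_OBJECT = 150

def estimate_namenode_memory(file_sizes_mb, block_size_mb=128, replication=1):
    """Estimate NameNode memory (in bytes) for the given file sizes."""
    total = 0
    for size_mb in file_sizes_mb:
        # Each file occupies at least one block.
        blocks = max(1, math.ceil(size_mb / block_size_mb))
        # File metadata is stored once; block metadata is assumed to scale
        # with the number of block replicas being tracked.
        total += BYTES_PER_FILE_OBJECT + blocks * BYTES_PER_BLOCK_OBJECT * replication
    return total

# 100 files of ~100 MB each, one 128 MB block per file, replication 1
print(estimate_namenode_memory([100] * 100))  # -> 30000
```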

A detailed discussion can be found here.

Makubex
  • Isn't block metadata stored on the DataNode rather than the NameNode? From the HDFS paper: `Each block replica on a DataNode is represented by two files in the local host’s native file system. The first file contains the data itself and the second file is block’s metadata including checksums for the block data and the block’s generation stamp.` – Prashanth Chandra May 22 '17 at 00:06