I would like to understand why the NameNode must be formatted before starting the Hadoop daemons. I know how to format it, but I don't know why I am actually doing it.
1
-
I think you should read the documentation on the topic first, and then ask if you still have doubts. – Harman Jun 04 '15 at 05:38
-
@Harman I have read several. They all explain how to format the daemon, but I have yet to come across a document that says why to format it. Please make sure you understand the question before downvoting it. – Jignesh Rawal Jun 04 '15 at 09:15
-
[This page suggests](http://wiki.apache.org/hadoop/GettingStartedWithHadoop) that the first step to starting up your Hadoop installation is formatting the Hadoop filesystem, which is implemented on top of the local filesystems of your cluster, and [this page suggests](http://www.cloudera.com/content/cloudera/en/documentation/cdh4/v4-2-0/CDH4-Installation-Guide/cdh4ig_topic_11_2.html) that formatting the NameNode invalidates the DataNode storage locations. That was enough to start with. After this, did you try searching for it on Google? – Harman Jun 05 '15 at 13:12
-
A simple Google search gives you these results: http://stackoverflow.com/questions/27143409/what-the-command-hadoop-namenode-format-will-do http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201103.mbox/%3C4D785324.4010208@orkash.com%3E I understood your question well enough, and that is why I downvoted it. Try looking for it and you'll find it! I hope you now understand why it was downvoted. – Harman Jun 05 '15 at 13:17
-
I have been through the links you provided in the comments many times. None of them gives a clear picture of the actual reason for formatting the NameNode. I know my question is a repeated one, but I had to post it because the existing explanations are inadequate. Out of all the links you have posted, find me one line that says "this is why one needs to format the NameNode". @Harman, if you can do so I will accept the downvote; otherwise it is up to you to decide. – Jignesh Rawal Jun 05 '15 at 18:29
-
"The first step to starting up your Hadoop installation is formatting the Hadoop filesystem, which is implemented on top of the local filesystems of your cluster. You need to do this the first time you set up a Hadoop installation. Before formatting, ensure that the dfs.name.dir directory exists. If you just used the default, then mkdir -p /tmp/hadoop-username/dfs/name will create the directory." (The format command simply initializes the directory specified by the dfs.name.dir variable.) This is from the wiki article. – Jignesh Rawal Jun 05 '15 at 18:31
-
Well, I would not go into that. – Harman Jun 17 '15 at 20:51
2 Answers
1
When we format the NameNode, it re-initializes the metadata that the NameNode keeps about the data stored on the DataNodes. As a result, all information about the existing data is lost, so that data is no longer accessible and the DataNodes can be reused for new data.
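
To make the effect concrete, here is a minimal toy model in Python. It is not Hadoop code: `ToyNameNode`, `ToyDataNode`, the namespace ID field, and the block and file names are all hypothetical, chosen only to illustrate that formatting resets the metadata while the raw blocks stay where they were.

```python
import uuid


class ToyNameNode:
    """Toy stand-in for a NameNode: holds only metadata, never file contents."""

    def __init__(self):
        self.format()

    def format(self):
        # Formatting starts a brand-new, empty namespace with a fresh ID.
        self.namespace_id = uuid.uuid4().hex
        self.files = {}  # file path -> list of block IDs

    def add_file(self, path, block_ids):
        self.files[path] = list(block_ids)

    def blocks_of(self, path):
        # A path the namespace does not know has no blocks, as far as it can tell.
        return self.files.get(path, [])


class ToyDataNode:
    """Toy stand-in for a DataNode: stores raw blocks for one namespace."""

    def __init__(self, namespace_id):
        self.namespace_id = namespace_id
        self.blocks = {}  # block ID -> bytes

    def store(self, block_id, data):
        self.blocks[block_id] = data


namenode = ToyNameNode()
datanode = ToyDataNode(namenode.namespace_id)

# Write one file: the block goes to the DataNode, the mapping to the NameNode.
datanode.store("blk_1", b"hello")
namenode.add_file("/user/demo/a.txt", ["blk_1"])

# Re-format the NameNode while the DataNode still holds the old block.
namenode.format()

still_on_disk = "blk_1" in datanode.blocks            # the bytes were not touched
reachable = namenode.blocks_of("/user/demo/a.txt")    # the mapping is gone
ids_match = datanode.namespace_id == namenode.namespace_id
print(still_on_disk, reachable, ids_match)
```

In this sketch the block is still on the toy DataNode after the format, but nothing in the new namespace points to it, and the DataNode's stored ID no longer matches the NameNode's.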

Viren Patel
0
Actually, you don't have to format every time you want to start the Hadoop daemons. It is required only once, while you set up your cluster. If you format every time, you will lose your data, so it is advisable not to format the NameNode again. You can simply stop and start the daemons.
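
The "format once, then only start and stop" rule could be sketched as a small guard. This is an illustration under assumptions, not an official procedure: `needs_format` and `startup_commands` are hypothetical helpers, the check assumes that a formatted name directory contains a `current/VERSION` file, and the commands `hdfs namenode -format` and `start-dfs.sh` are assumed to match your Hadoop version (older releases use `hadoop namenode -format`). The script only builds the command list and runs nothing.

```python
import os
import tempfile


def needs_format(name_dir):
    """Return True when name_dir holds no NameNode metadata yet (assumed layout)."""
    version_file = os.path.join(name_dir, "current", "VERSION")
    return not os.path.isfile(version_file)


def startup_commands(name_dir):
    """Build the commands to run, formatting only a fresh name directory."""
    commands = []
    if needs_format(name_dir):
        # Only a never-formatted directory gets the format step.
        commands.append(["hdfs", "namenode", "-format"])
    commands.append(["start-dfs.sh"])
    return commands


with tempfile.TemporaryDirectory() as name_dir:
    # First start: the directory is empty, so a format is planned.
    first_run = startup_commands(name_dir)

    # Simulate what a completed format leaves behind (placeholder content).
    os.makedirs(os.path.join(name_dir, "current"))
    with open(os.path.join(name_dir, "current", "VERSION"), "w") as f:
        f.write("namespaceID=0\n")

    # Later starts: metadata exists, so the format step is skipped.
    later_run = startup_commands(name_dir)

print(first_run)
print(later_run)
```

The point of the guard is that an existing, populated name directory is never formatted again by accident.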

Amaresh
-
I know that I don't need to format it every time I start my daemons. I also know that formatting a NameNode that has data in HDFS can lead to data loss. – Jignesh Rawal Jun 04 '15 at 09:16
-
https://intellipaat.com/community/161/what-exactly-is-hadoop-namenode-formatting – TheCodeCache Mar 15 '22 at 08:04