14

How do I wipe out the DFS in Hadoop?

biznez
  • 3,911
  • 11
  • 33
  • 37

7 Answers

16

You need to do two things:

  1. Delete the main Hadoop storage directory from every node. This directory is defined by the hadoop.tmp.dir property in your hdfs-site.xml (a sketch of doing this across all nodes follows below).

  2. Reformat the namenode:

hadoop namenode -format

If you only do (2), it will only remove the metadata stored by the namenode, but won't get rid of all the temporary storage and datanode blocks.
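A minimal sketch of step 1, not part of the original answer: it assumes the worker hostnames are listed in conf/slaves, passwordless ssh is set up, and hadoop.tmp.dir points at /app/hadoop/tmp (substitute whatever your own config actually says):

# run from the namenode; clears the hadoop.tmp.dir contents on every node
for node in $(cat conf/slaves); do
  ssh "$node" 'rm -rf /app/hadoop/tmp/*'
done
rm -rf /app/hadoop/tmp/*   # don't forget the namenode itself

# then reformat the namenode
bin/hadoop namenode -format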

romedius
  • 775
  • 6
  • 20
Eduard
  • 3,482
  • 2
  • 27
  • 45
  • deleting the main hadoop storage directory from every single node is not feasible! – Mehraban Dec 10 '13 at 11:04
  • performing namenode -format will delete all the metadata and also make your cluster unusable. This is not an advisable option. – Karthik Apr 21 '15 at 02:45
  • Also, namenode -format will generate a new cluster ID for the namenode, and all the other daemons will not be able to communicate with the namenode. Please update your answer to avoid misguiding readers. Thanks – Karthik Apr 21 '15 at 02:49
10
hdfs dfs -rm -r "/*"

(the command in the old answer has since been deprecated)
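Note that if HDFS trash is enabled, -rm -r may only move everything into the .Trash directory rather than freeing the space. A hedged variant that bypasses the trash (the -skipTrash flag is standard, but check your version):

hdfs dfs -rm -r -skipTrash "/*"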

Jonathan Graehl
  • 9,182
  • 36
  • 40
10
bin/hadoop namenode -format
SquareCog
  • 19,421
  • 8
  • 49
  • 63
  • 3
    Watch out: existing old datanodes won't work with this newly formatted dfs. See http://issues.apache.org/jira/browse/HDFS-107 – Leonidas Jan 25 '10 at 12:31
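A common workaround, sketched here rather than taken from the answer: clear each datanode's data directory (whatever dfs.data.dir, or dfs.datanode.data.dir on newer releases, points to) before restarting, so the datanode re-registers cleanly with the freshly formatted namenode:

# run on every datanode; /app/hadoop/data is a placeholder for your dfs.data.dir value
rm -rf /app/hadoop/data/*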
3

You may issue

hadoop fs -rmr /

This would delete all directories and sub-directories under DFS.

Another option is to stop your cluster and then issue:

hadoop namenode -format

This would erase all contents on DFS; start the cluster again afterwards.
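A minimal sketch of that second option as a single sequence, assuming the standard scripts under your Hadoop directory:

bin/stop-all.sh
bin/hadoop namenode -format
bin/start-all.sh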

techlad
  • 103
  • 1
  • 1
  • 5
3

So this is what I have had to do in the past.

1. Navigate to your Hadoop directory on your NameNode, then stop all the Hadoop processes by running the default stop-all script. This will also stop DFS. e.g.

cd myhadoopdirectory
bin/stop-all.sh

2. Now, on every machine in your cluster (namenodes, JobTrackers, datanodes etc.), delete all files in your main Hadoop storage; mine is set to the temp folder in the root folder. Yours can be found in conf/hdfs-site.xml under the hadoop.tmp.dir property, e.g.

cd /temp/
rm -r *

3. Finally, go back to your namenode and format it by going to your Hadoop directory and running 'bin/hadoop namenode -format', e.g.

cd myhadoopdirectory
bin/hadoop namenode -format

4. Start up your cluster again by running the following command. It will also start up DFS again.

bin/start-all.sh

5. And it should work.
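For reference, the whole procedure condensed into one sketch, run from the namenode. It assumes /temp is your hadoop.tmp.dir, the worker hostnames are in conf/slaves, and passwordless ssh is available; adjust the paths to your own setup:

cd myhadoopdirectory
bin/stop-all.sh
for node in $(cat conf/slaves); do ssh "$node" 'rm -rf /temp/*'; done
rm -rf /temp/*                       # clear the namenode's own storage too
bin/hadoop namenode -format
bin/start-all.sh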

jonhurlock
  • 1,798
  • 1
  • 18
  • 28
1
  1. You need to call bin/stop-all.sh to stop dfs and mapreduce.
  2. Delete the data dir which is configured in conf/hdfs-site.xml and conf/mapred-site.xml.
  3. Make sure that you have also deleted the temporary files under the /tmp dir (see the sketch below).

After all the above steps, you can call bin/hadoop namenode -format to regenerate the dfs.
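By default those temporaries live under /tmp/hadoop-<username> (the stock hadoop.tmp.dir value), so a sketch of step 3 could look like this; adjust the pattern if you have changed the property:

rm -rf /tmp/hadoop-$(whoami)*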

Charles Menguy
  • 40,830
  • 17
  • 95
  • 117
1
  1. Stop your cluster

    ${HADOOP_HOME}/bin/stop-mapred.sh

    ${HADOOP_HOME}/bin/stop-dfs.sh

    or if it's pseudo-distributed, simply issue:

    ${HADOOP_HOME}/bin/stop-all.sh

  2. Format your hdfs

    hadoop namenode -format
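
A hedged follow-up, not part of the original answer: once the format has finished, bring the cluster back up with the matching start scripts:

    ${HADOOP_HOME}/bin/start-dfs.sh

    ${HADOOP_HOME}/bin/start-mapred.sh

    or, for pseudo-distributed:

    ${HADOOP_HOME}/bin/start-all.sh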

stholy
  • 322
  • 1
  • 7
  • 12