42

I am trying to install Hadoop 2.2.0 in pseudo-distributed mode. When I try to start the datanode service, it shows the following error. Can anyone please tell me how to resolve it?

2014-03-11 08:48:15,916 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool <registering> (storage id unknown) service to localhost/127.0.0.1:9000 starting to offer service
2014-03-11 08:48:15,922 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2014-03-11 08:48:15,922 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2014-03-11 08:48:16,406 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/in_use.lock acquired by nodename 3627@prassanna-Studio-1558
2014-03-11 08:48:16,426 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582) service to localhost/127.0.0.1:9000
java.io.IOException: Incompatible clusterIDs in /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode: namenode clusterID = CID-fb61aa70-4b15-470e-a1d0-12653e357a10; datanode clusterID = CID-8bf63244-0510-4db6-a949-8f74b50f2be9
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:391)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:837)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:808)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:280)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:222)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
    at java.lang.Thread.run(Thread.java:662)
2014-03-11 08:48:16,427 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582) service to localhost/127.0.0.1:9000
2014-03-11 08:48:16,532 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-611836968-127.0.1.1-1394507838610 (storage id DS-1960076343-127.0.1.1-50010-1394127604582)
2014-03-11 08:48:18,532 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-03-11 08:48:18,534 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2014-03-11 08:48:18,536 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: 
user2631600
  • You might also benefit from one of my posts... since I learned from my mistake, other people could save time with the following link; it should help you fix the VERSION file: http://stackoverflow.com/questions/35108445/java-io-ioexception-incompatible-clusterids – lucTiber Feb 01 '16 at 17:40

11 Answers

91

You can use the following method:

Copy the datanode clusterID, in your example CID-8bf63244-0510-4db6-a949-8f74b50f2be9,

and run the following command from the HADOOP_HOME/bin directory:

./hdfs namenode -format -clusterId CID-8bf63244-0510-4db6-a949-8f74b50f2be9

This formats the namenode with the datanode's cluster ID.
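
If you are not sure which ID is which, both values can be read straight out of the VERSION files before reformatting. A minimal sketch, reusing the datanode path from the question and assuming a parallel namenode directory (check hdfs-site.xml, i.e. dfs.namenode.name.dir / dfs.datanode.data.dir, for your actual paths):

grep clusterID /home/prassanna/usr/local/hadoop/yarn_data/hdfs/namenode/current/VERSION
grep clusterID /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION

If the two lines disagree, the -format command above rewrites the namenode side to match the datanode's ID.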

iceberg
  • Great answer, this worked for me. I just wanted to note that you should run this command from the namenode (not from the problematic datanode). – vefthym Jan 24 '17 at 15:49
  • Thanks, iceberg, but it seems everybody misses an important step to make the re-format take effect: restart the namenode. – djzhu Aug 18 '17 at 10:08
  • Thanks a lot for sharing this command; it helped me format a Hortonworks hadoop cluster. But they should provide a generic solution to format the namenode and spread the configs to the datanodes, as Cloudera does. – S.K. Venkat Sep 19 '17 at 12:37
  • Another option, in order to avoid formatting, is to change the `VERSION` file and set `clusterId` to the correct value. My path to the file was `/usr/local/Cellar/hadoop/hdfs/tmp/dfs/name/current` – pedropedro Jan 10 '18 at 11:11
  • Appreciate the answer, iceberg; it worked to solve the problem in my case. – Bitcoin Murderous Maniac Jan 21 '18 at 17:19
  • I am a little bit confused right now. Is it going to truncate all of my data on the namenode? `format` sounds scary. Could somebody please clarify it for me? – Sebastian Kaczmarek May 08 '18 at 08:15
  • Anybody know why this is necessary? What's the reason for this random corruption? – smooth_smoothie May 20 '18 at 23:09
  • If I format the HDFS I lose all the data, why would anybody do this? – tribbloid Aug 21 '18 at 17:01
  • I wish the [Apache documentation](http://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/SingleCluster.html) weren't so obtuse with all this information! Thanks for your help!!! – Mike Williamson Sep 02 '18 at 20:05
  • Just wanted to mention that the `clusterId` from the datanode was used in `hdfs namenode -format -clusterId CID-8bf63244-0510-4db6-a949-8f74b50f2be9` (executed on the namenode). – holzkohlengrill Mar 30 '19 at 23:19
20

You must do the following:

  • bin/stop-all.sh
  • rm -Rf /home/prassanna/usr/local/hadoop/yarn_data/hdfs/*
  • bin/hadoop namenode -format

I had the same problem until I found an answer on this web site.
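
Note that the rm step above wipes every block stored under that path, so only do this on a cluster whose data you can afford to lose. On Hadoop 2.x the same sequence can also be written with the newer script and command names; a sketch, reusing the path from this answer (stop-all.sh and hadoop namenode -format still work in 2.x but print deprecation warnings):

sbin/stop-dfs.sh                                          # 2.x replacement for stop-all.sh
rm -Rf /home/prassanna/usr/local/hadoop/yarn_data/hdfs/*  # deletes ALL HDFS data on this node
bin/hdfs namenode -format                                 # preferred over 'hadoop namenode -format'
sbin/start-dfs.sh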

Mouna
16

Whenever you get the error below while trying to start a DN on a slave machine:

java.io.IOException: Incompatible clusterIDs in /home/hadoop/dfs/data: namenode clusterID = ****; datanode clusterID = ****

it is because after you set up your cluster you, for whatever reason, decided to reformat your NN. Your DNs on the slaves still hold a reference to the old NN.

To resolve this, simply delete and recreate the data folder on that machine in the local Linux FS, namely /home/hadoop/dfs/data.

Restarting that DN's daemon on that machine will recreate the data/ folder's contents and resolve the problem.
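
A minimal sketch of those steps on the affected slave, assuming the /home/hadoop/dfs/data path above and the standard Hadoop 2.x sbin scripts:

sbin/hadoop-daemon.sh stop datanode    # in case a half-started DN is still around
rm -rf /home/hadoop/dfs/data           # drop the stale clusterID along with the old data
mkdir -p /home/hadoop/dfs/data
sbin/hadoop-daemon.sh start datanode   # the DN re-registers and adopts the NN's clusterID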

eliasah
14

Do the following simple steps:

  • Clear the data directory of Hadoop
  • Format the namenode again
  • Start the cluster

After this your cluster will start normally, provided you don't have any other configuration issue. You can confirm with jps, as shown below.
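
jps (shipped with the JDK) is a quick way to confirm the daemons actually came up; in a healthy 2.x pseudo-distributed setup you would expect output along these lines:

jps
# Typically lists: NameNode, DataNode, SecondaryNameNode,
# ResourceManager, NodeManager (and Jps itself)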

Vikas Hardia
8

The DataNode dies because its clusterID is incompatible with the NameNode's. To fix this problem you need to delete the directory /tmp/hadoop-[user]/hdfs/data and restart Hadoop.

rm -r /tmp/hadoop-[user]/hdfs/data
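
If you want to fill in the [user] placeholder automatically, a small sketch (the path itself is the one from this answer; the /tmp/hadoop-<user> prefix is only the default when hadoop.tmp.dir is not overridden in core-site.xml):

rm -r /tmp/hadoop-$(whoami)/hdfs/data   # [user] = your login name
sbin/stop-dfs.sh && sbin/start-dfs.sh   # restart so the DataNode re-registers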
sofiene zaghdoudi
6

I got a similar issue in my pseudo-distributed environment. I stopped the cluster first, then copied the cluster ID from the NameNode's VERSION file into the DataNode's VERSION file, and after restarting the cluster it was all fine.

My data paths are /usr/local/hadoop/hadoop_store/hdfs/datanode and /usr/local/hadoop/hadoop_store/hdfs/namenode.

FYI: the VERSION file is under /usr/local/hadoop/hadoop_store/hdfs/datanode/current/; likewise for the NameNode.
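
A sketch of that copy with standard shell tools, assuming the two paths above (VERSION is a plain key=value file, so grep/sed work on it; sed -i as written is GNU sed):

NN=/usr/local/hadoop/hadoop_store/hdfs/namenode/current/VERSION
DN=/usr/local/hadoop/hadoop_store/hdfs/datanode/current/VERSION
CID=$(grep '^clusterID=' "$NN")        # e.g. clusterID=CID-...
cp "$DN" "$DN.bak"                     # keep a backup
sed -i "s/^clusterID=.*/$CID/" "$DN"   # overwrite the DataNode's clusterID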

S N
  • Pretty old post but still, thanks for saving my data. All other solutions require formatting the whole namenode which will evaporate all data. Your solution is much safer, thanks! – Sebastian Kaczmarek May 08 '18 at 08:10
5

Here the datanode stops immediately because the clusterIDs of the datanode and the namenode are different, so you have to format the namenode with the clusterID of the datanode.

Copy the datanode clusterID, in your example CID-8bf63244-0510-4db6-a949-8f74b50f2be9, and run the following command from your home directory. (You can get to your home directory by just typing cd in your terminal.)

From your home directory, type the command:

hdfs namenode -format -clusterId CID-8bf63244-0510-4db6-a949-8f74b50f2be9
Neil
2

Delete the namenode and datanode directories as specified in hdfs-site.xml (dfs.namenode.name.dir and dfs.datanode.data.dir; if unset, they default to locations under hadoop.tmp.dir from core-site.xml). After that, create the new directories and restart dfs and yarn.
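
If you are not sure where those directories live, hdfs getconf can print the effective values; a quick sketch:

bin/hdfs getconf -confKey dfs.namenode.name.dir   # where the NN keeps current/VERSION
bin/hdfs getconf -confKey dfs.datanode.data.dir   # where the DN keeps current/VERSION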

KayV
2

I also had a similar issue. I deleted the namenode and datanode folders from all the nodes, and reran:

$HADOOP_HOME/bin> hdfs namenode -format -force
$HADOOP_HOME/sbin> ./start-dfs.sh
$HADOOP_HOME/sbin> ./start-yarn.sh

To check the health report from the command line (which I recommend):

$HADOOP_HOME/bin> hdfs dfsadmin -report

After that I had all the nodes working correctly.

Raxit Solanki
2

I had the same issue with Hadoop 2.7.7.

I removed the namenode/current and datanode/current directories on the namenode and all the datanodes,

  • removed the files at /tmp/hadoop-ubuntu/*
  • then formatted the namenode & datanode
  • and restarted all the nodes,

and things worked fine.

Steps: stop all nodes/managers, then attempt the steps below.

  1. rm -rf /tmp/hadoop-ubuntu/* (all nodes)
  2. rm -r /usr/local/hadoop/data/hdfs/namenode/current (namenode: check hdfs-site.xml for the path)
  3. rm -r /usr/local/hadoop/data/hdfs/datanode/current (datanode: check hdfs-site.xml for the path)
  4. hdfs namenode -format (on the namenode)
  5. hdfs datanode -format (on the namenode)
  6. Reboot the namenode & data nodes
LinxFan
0

There have been different solutions to this problem, but I tested another easy solution and it worked like a charm:

If someone gets the same error, you just need to change the clusterID in the datanodes to the clusterID of the namenode in the VERSION file.

In your case, here's where you can change it on the datanode side:

namenode clusterID = CID-fb61aa70-4b15-470e-a1d0-12653e357a10; datanode clusterID = CID-8bf63244-0510-4db6-a949-8f74b50f2be9

  • Back up the current VERSION: cp /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION.BK
  • vim /home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION and change

clusterID=CID-8bf63244-0510-4db6-a949-8f74b50f2be9

with

clusterID=CID-fb61aa70-4b15-470e-a1d0-12653e357a10

Restart the datanode and it should work.
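
If you prefer not to open an editor, the same edit can be scripted; a one-liner sketch with GNU sed, reusing the path and IDs from this answer:

DN_VERSION=/home/prassanna/usr/local/hadoop/yarn_data/hdfs/datanode/current/VERSION
cp "$DN_VERSION" "$DN_VERSION.BK"   # the backup step from above
sed -i 's/^clusterID=.*/clusterID=CID-fb61aa70-4b15-470e-a1d0-12653e357a10/' "$DN_VERSION"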

Imeliator