
I deployed a Hadoop cluster on VMware. All the nodes run CentOS 7.

Issuing the jps command on the master:

[root@hadoopmaster anna]# jps
6225 NameNode
6995 ResourceManager
6580 SecondaryNameNode
7254 Jps

Issuing the jps command on the slave:

[root@hadoopslave1 anna]# jps
5066 DataNode
5818 Jps
5503 NodeManager

However, I have no idea why the live nodes count at http://localhost:50070/dfshealth.html#tab-overview shows 0, and I can't run hdfs dfs -put in/file/f1. It fails with this error message:

[root@hadoopmaster hadoop]# hdfs dfs -put in/file/f1 /user
16/01/06 02:53:14 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1550)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3110)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3034)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:723)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

    at org.apache.hadoop.ipc.Client.call(Client.java:1476)
    at org.apache.hadoop.ipc.Client.call(Client.java:1407)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1430)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1226)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
put: File /user._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.

I have tried suggestions from other posts, such as

rm -R /tmp/*

and checking passwordless SSH.

On the master:

[root@hadoopmaster hadoop]# ssh hadoopmaster
Last login: Wed Jan  6 02:56:27 2016 from hadoopslave1
[root@hadoopmaster ~]# exit
logout
Connection to hadoopmaster closed.
[root@hadoopmaster hadoop]# ssh hadoopslave1
Last login: Wed Jan  6 02:43:21 2016
[root@hadoopslave1 ~]# exit
logout
Connection to hadoopslave1 closed.
[root@hadoopmaster hadoop]#

On the slave:

[root@hadoopslave1 .ssh]# ssh hadoopmaster
Last login: Wed Jan  6 03:04:45 2016 from hadoopmaster
[root@hadoopmaster ~]# exit
logout
Connection to hadoopmaster closed.
[root@hadoopslave1 .ssh]# ssh hadoopslave1
Last login: Wed Jan  6 03:04:40 2016 from hadoopmaster
[root@hadoopslave1 ~]# exit
logout
Connection to hadoopslave1 closed.
[root@hadoopslave1 .ssh]# 
– Anna Chen
Have a look at: http://stackoverflow.com/questions/26545524/there-are-0-datanodes-running-and-no-nodes-are-excluded-in-this-operation – Ravindra babu Jan 06 '16 at 18:06

5 Answers


You need to look at the datanode logs to confirm whether the datanode on the slave is actually running fine. Running the jps command alone is not enough; a datanode can lose its connection to the namenode at times. If your configuration files are right, run these steps:

  • Run stop-all.sh.
  • Run jps on all nodes; if any Hadoop processes are still up and running, kill them.
  • Run start-all.sh.
  • Run jps on all nodes again.
  • Check the namenode and datanode logs to confirm everything is fine (a sketch of this sequence follows below).
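
A minimal sketch of that restart-and-verify sequence, assuming the Hadoop sbin directory is on PATH and the logs live under $HADOOP_HOME/logs (adjust the paths to your install):

stop-all.sh                # stop all HDFS and YARN daemons
jps                        # on every node; kill any leftover Hadoop processes
start-all.sh               # bring the daemons back up
jps                        # NameNode/DataNode/ResourceManager/NodeManager should appear
tail -n 50 $HADOOP_HOME/logs/hadoop-*-datanode-*.log   # inspect the datanode log on the slave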
– Durga Viswanath Gadiraju

From the namenode, run the command below to confirm that the datanodes are running properly:

bin/hadoop dfsadmin -report

You should see a report like this:

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 127.0.0.1:50010
Decommission Status : Normal
Configured Capacity: 176945963008 (164.79 GB)
DFS Used: 2140192768 (1.99 GB)
Non DFS Used: 42513027072 (39.59 GB)
DFS Remaining: 132292743168(123.21 GB)
DFS Used%: 1.21%
DFS Remaining%: 74.76%
Last contact: Wed Jan 06 20:04:51 IST 2016
– Thanga

I resolved a similar problem by configuring the machines in /etc/hosts. The datanode logs showed that the datanodes could not resolve the namenode.
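
For illustration, a possible /etc/hosts on every node (the IP addresses below are placeholders; substitute your cluster's real ones, and make sure neither hostname also maps to 127.0.0.1, which can leave the namenode bound to loopback):

# /etc/hosts on hadoopmaster and hadoopslave1 (example addresses)
192.168.1.10    hadoopmaster
192.168.1.11    hadoopslave1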


I had the same problem: copyFromLocal: File .COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation. I resolved it by freeing up some disk space. You can also try stopping the datanode and restarting it.
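To restart just the datanode on the slave, something like this should work on Hadoop 2.x, assuming hadoop-daemon.sh from $HADOOP_HOME/sbin is on PATH:

hadoop-daemon.sh stop datanode     # stop the datanode on this slave
hadoop-daemon.sh start datanode    # start it again, then re-check live nodes in the web UI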

– Hegde

This is most likely due to a lack of free disk space. Check your disk usage with df -h.
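
For example, to check the space left on the volume that holds the datanode's storage directory (the path below is a placeholder for whatever dfs.datanode.data.dir points to in hdfs-site.xml):

df -h /path/to/dfs/data    # the Use% column shows how full the volume is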

There are similar questions and answers about this problem, for example here.

– Malemi