
I create an external table in Hive with partitions and then try to populate it from an existing table; however, I get the following exception:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /apps/hive/warehouse/pavel.db/browserdatapart/.hive-staging_hive_2018-12-28_13-22-45_751_6056004898772238481-1/_task_tmp.-ext-10000/cityid=1/_tmp.000001_3 could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3372)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3296)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:850)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:504)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)

    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:814)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:133)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:170)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555)
    ... 18 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /apps/hive/warehouse/pavel.db/browserdatapart/.hive-staging_hive_2018-12-28_13-22-45_751_6056004898772238481-1/_task_tmp.-ext-10000/cityid=1/_tmp.000001_3 could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3372)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3296)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:850)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:504)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)

    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
    at org.apache.hadoop.ipc.Client.call(Client.java:1498)
    at org.apache.hadoop.ipc.Client.call(Client.java:1398)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:459)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:290)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:202)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:184)
    at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1580)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1375)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:

According to what I've found on the internet, these exceptions occur when the datanode can't communicate with the namenode or when you are running low on memory, but in my case everything looks fine. I have already tried formatting my namenode and datanode as well. What else could be the issue?

I've also read https://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo, and it didn't help me either.

I am running on Tez. This works:

insert into table browserdatapart partition(cityid) select UserAgent,cityid from browserdata limit 100;

And this fails with the exception I provided:

insert into table browserdatapart partition(cityid) select UserAgent,cityid from browserdata;
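For context, the target table was created roughly like this (a sketch, not my exact DDL: the column types are placeholders, and the location is inferred from the staging path in the exception):

CREATE EXTERNAL TABLE browserdatapart (
    UserAgent STRING
)
PARTITIONED BY (cityid INT)
LOCATION '/apps/hive/warehouse/pavel.db/browserdatapart';

The source table browserdata has 21 columns in total, but only UserAgent and cityid are selected in the insert.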
  • Just tried it on a smaller dataset and it looks like it works, so I guess it is a memory issue after all, but I don't understand how, because I have 37 gigs available. And the dataset is 24 gigs. – pavel_orekhov Dec 28 '18 at 14:00
  • Please provide more details. You are running MR, not Tez, right? And it fails on the mapper, right? What is the size of the source table and how many mappers started? Also look at the failed mapper log, there can be something interesting. – leftjoin Dec 28 '18 at 14:31
  • @leftjoin I am running on Tez. This works: `insert into table browserdatapart partition(cityid) select UserAgent,cityid from browserdata limit 100;`, and this fails with the exception I provided: `insert into table browserdatapart partition(cityid) select UserAgent,cityid from browserdata;` (note the `limit 100` at the end of the first query). So, if I try to insert 100 rows it's ok, but if I try to insert the entire dataset it fails. – pavel_orekhov Dec 28 '18 at 14:33
  • @leftjoin, also, my dataset is 24 gigs, I have more than 30 gigs of free memory, and I have 21 fields in my dataset but load only 2 (useragent and cityid), which probably amounts to about 24/10 = 2.4 gigs being taken up after the query completes, so I don't think I should run out of memory. – pavel_orekhov Dec 28 '18 at 14:40
  • @leftjoin https://paste.fedoraproject.org/paste/SwWI~gHY34Ccw9Q4ksajVA this is my console output. – pavel_orekhov Dec 28 '18 at 14:45
  • @leftjoin the other logs give the same exception. – pavel_orekhov Dec 28 '18 at 14:47
  • @leftjoin OOM errors are thrown when we don't have enough RAM, while I am talking about HDFS storage. – pavel_orekhov Dec 28 '18 at 15:10
  • Similar question : https://stackoverflow.com/q/36015864/2700344 – leftjoin Dec 28 '18 at 15:16
  • Thanks, I have seen this, and increased all the resources, but this still happens. Weird... – pavel_orekhov Dec 28 '18 at 15:18
  • @leftjoin I found the solution. – pavel_orekhov Dec 29 '18 at 12:22

1 Answer

SET hive.exec.max.dynamic.partitions=100000; 
SET hive.exec.max.dynamic.partitions.pernode=100000;

Setting the above parameters solved it for me. I guess Hive was not able to write data to the partitions that show up in the exception because there were more dynamic partitions than the configured maximum allows (my dataset produces 224 of them).
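Put together, a session that reproduces the fix looks roughly like this (a sketch: the two dynamic-partition settings at the top are assumptions based on what a fully dynamic partition insert normally requires, the rest is from above):

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.max.dynamic.partitions=100000;
SET hive.exec.max.dynamic.partitions.pernode=100000;

insert into table browserdatapart partition(cityid) select UserAgent,cityid from browserdata;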

  • Congratulations! Usually the exception is something like "too many dynamic partitions" in this case. – leftjoin Dec 29 '18 at 12:36
  • @leftjoin Thanks! I don't know, these errors are very misleading. – pavel_orekhov Dec 29 '18 at 12:45
  • @leftjoin, ok, it seems like it only worked one time. I subsequently removed the table and tried doing it again. In fact, now even `insert into table browserdatapart partition(cityid) select useragent,cityid from browserdata limit 1;` fails. It can't insert even 1 row, what the heck? – pavel_orekhov Dec 29 '18 at 12:57
  • @leftjoin now it worked again! I don't understand what's happening. – pavel_orekhov Dec 29 '18 at 13:00
  • Maybe Hortonworks did something wrong in their distro. I keep having these weird problems. – pavel_orekhov Dec 29 '18 at 13:02
  • @leftjoin here's the real answer https://stackoverflow.com/questions/54561086/how-do-i-fix-file-could-only-be-replicated-to-0-nodes-instead-of-minreplication/54719797#54719797 – pavel_orekhov Feb 16 '19 at 04:06