I am running a Hive INSERT OVERWRITE query on a Google Dataproc cluster. It copies
13783531
records from one table into another, partitioned table without any transformation. The query fails with the following error:
Diagnostic Messages for this Task:
Error: Java heap space
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 34 Cumulative CPU: 1416.18 sec HDFS Read: 6633737937 HDFS Write: 0 FAIL
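For context, the statement is a plain copy into a partitioned table. A minimal sketch of that kind of statement follows; the names source_table, target_table, and part_col are placeholders, and the use of dynamic partitioning is an assumption rather than something the log above confirms.

```sql
-- Hypothetical names: source_table, target_table, part_col.
-- Assumption: the target partitions are created dynamically from the data.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- With SELECT *, the partition column must be the last column of source_table.
INSERT OVERWRITE TABLE target_table PARTITION (part_col)
SELECT * FROM source_table;
```

The job is map-only (34 mappers, no reducers in the log), so under this shape each mapper writes directly into the target partitions.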
Cluster details:
n1-standard-16 machines (16 vCPU, 60.0 GB memory)
with 5 worker nodes.
The error varies between Java heap space and GC overhead limit exceeded. I tried setting the following parameters:
set mapreduce.map.memory.mb=7698;
set mapreduce.reduce.memory.mb=7689;
set mapreduce.map.java.opts=-Xmx7186m;
set mapreduce.reduce.java.opts=-Xmx7186m;
The query still fails.
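One thing that stands out in these settings is that the heap (-Xmx7186m) is about 93% of the map container (7698 MB), which leaves little room for non-heap JVM memory. A commonly suggested rule of thumb is to keep the heap at roughly 80% of the container size. A variant under that assumption might look like the sketch below; the 8192 MB container and 6553m heap are illustrative values that have not been tested on this cluster.

```sql
-- Illustrative values only: container size 8192 MB, heap ~80% of it (0.8 * 8192 = 6553).
SET mapreduce.map.memory.mb=8192;
SET mapreduce.map.java.opts=-Xmx6553m;
-- Same ratio for reducers, although the failing stage in the log is map-only.
SET mapreduce.reduce.memory.mb=8192;
SET mapreduce.reduce.java.opts=-Xmx6553m;
```

Whether this alone avoids the error depends on what is filling the heap, so it is a starting point rather than a confirmed fix.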