
I am running a Hive insert overwrite query on a Google Dataproc cluster, copying 13,783,531 records from one table into another, partitioned table without any transformation.
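
For context, the shape of the query is roughly the following (a sketch only; the table and column names are hypothetical, not the actual ones):

 -- straight copy into a dynamically partitioned table, no transformation
 set hive.exec.dynamic.partition=true;
 set hive.exec.dynamic.partition.mode=nonstrict;

 INSERT OVERWRITE TABLE target_table PARTITION (part_col)
 SELECT col1, col2, part_col
 FROM source_table;

The query fails with the error: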

Diagnostic Messages for this Task:
Error: Java heap space

FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 34   Cumulative CPU: 1416.18 sec   HDFS Read: 6633737937 
HDFS Write: 0 FAIL

Cluster details:

n1-standard-16 (16 vCPUs, 60 GB memory) with 5 worker nodes.

The error varies between "Java heap space" and "GC overhead limit exceeded". I tried setting these parameters:

set mapreduce.map.memory.mb=7698;
set mapreduce.reduce.memory.mb=7689;
set mapreduce.map.java.opts=-Xmx7186m;
set mapreduce.reduce.java.opts=-Xmx7186m;

It still fails.

Vishal
  • Possible duplicate of [What is Hive: Return Code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask](http://stackoverflow.com/questions/11185528/what-is-hive-return-code-2-from-org-apache-hadoop-hive-ql-exec-mapredtask) – Ravindra babu Apr 18 '17 at 17:23
  • Is your table in parquet format? – hlagos Apr 19 '17 at 02:16
  • @lake The table is in text format – Vishal Apr 19 '17 at 08:09
  • Both tables? If that is the case, could you make sure that the source table files are valid? For example, not one big line with all the data? – hlagos Apr 19 '17 at 15:57
  • In your example of setting params, you have `java.opts` correctly less than `memory.mb` for `map` but reversed for `reduce`; was that just a typo in the question, or did you actually have them reversed in hive? – Dennis Huo Apr 20 '17 at 17:15
  • @DennisHuo That was a typo, thanks for noticing; fixed it. – Vishal Apr 20 '17 at 19:38

2 Answers


So the issue was that the insert overwrite was trying to create too many small files. It seems we have a fix:

 set hive.optimize.sort.dynamic.partition=true;

https://community.hortonworks.com/articles/89522/hive-insert-to-dynamic-partition-query-generating.html

There are two solutions available, and both of them worked:

1. Use set hive.optimize.sort.dynamic.partition=true;

or

2. Use DISTRIBUTE BY <PARTITION_COLUMN>

Either of these will work, but it is better not to use solution #1. The JIRA says it can insert records into the wrong partition when used with GROUP BY, which is why it was disabled by default in recent Hive releases: https://issues.apache.org/jira/browse/HIVE-8151
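
As a sketch of solution #2 (the table and column names here are hypothetical), DISTRIBUTE BY routes all rows for a given partition value to the same reducer, so each partition is written by a single task rather than a little by every mapper, which is what produces the flood of small files:

 INSERT OVERWRITE TABLE target_table PARTITION (part_col)
 SELECT col1, col2, part_col
 FROM source_table
 DISTRIBUTE BY part_col;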

Vishal

There are a couple of things you need to address here:

Total JVM memory allocated vs. JVM heap memory

The total JVM memory allocated is set through these parameters:

mapreduce.map.memory.mb
mapreduce.reduce.memory.mb

The JVM heap memory is set through these parameters:

mapreduce.map.java.opts
mapreduce.reduce.java.opts

You must always ensure that total memory > heap memory. (Note that this rule was violated in the reducer values as originally posted in the question.)

Total-to-heap ratio

One of our vendors recommended that, in most cases, we use roughly 80% of the total memory for heap. Even with this recommendation you will often encounter various memory errors.
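
For example (a sketch only; the 8192 MB container size is illustrative, not a recommendation for this cluster), settings consistent with both rules would look like this, with the heap at roughly 80% of the container:

 set mapreduce.map.memory.mb=8192;
 set mapreduce.reduce.memory.mb=8192;
 -- heap set to ~80% of the 8192 MB container
 set mapreduce.map.java.opts=-Xmx6553m;
 set mapreduce.reduce.java.opts=-Xmx6553m;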

Error: Java heap space

You probably need to increase both total and heap memory.

Error: PermGen space

You need to increase the off-heap (permanent generation) memory, which means you might be able to decrease the heap memory without having to increase the total memory.
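
As a sketch (this applies only to pre-Java-8 JVMs, where the permanent generation still exists), PermGen is sized with an extra flag appended to the same java.opts parameters; it counts against the container total but sits outside the -Xmx heap:

 -- -XX:MaxPermSize applies to Java 7 and earlier; 512m is an illustrative value
 set mapreduce.map.java.opts=-Xmx6144m -XX:MaxPermSize=512m;
 set mapreduce.reduce.java.opts=-Xmx6144m -XX:MaxPermSize=512m;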

Error: GC overhead limit exceeded

This refers to the amount of time the JVM is allowed to spend on garbage collection. If too little space is reclaimed over too long a period, it will error out. Try increasing both total and heap memory.

DrV
  • I made the required changes to the heap memory and total JVM memory, but it looks like the changes are not reflected in the Dataproc cluster; the config remains the same as when the cluster was set up. Is there any way to update the configs at the job level? – Vishal Apr 20 '17 at 06:18
  • These configurations will change your Hive memory settings. If you are using **Tez** or another similar query engine, you'll have to refer to that query engine's documentation to determine which parameters you need to set. – DrV Apr 20 '17 at 13:59