
In Ubuntu, when I run the Hadoop example:

$bin/hadoop jar hadoop-examples-1.0.4.jar grep input output 'dfs[a-z.]+' 

$echo $HADOOP_HEAPSIZE
2000

In the log, I am getting this error:

INFO mapred.JobClient: Task Id : attempt_201303251213_0012_m_000000_2, Status : FAILED
Error: Java heap space
13/03/25 15:03:43 INFO mapred.JobClient: Task Id : attempt_201303251213_0012_m_000001_2, Status : FAILED
Error: Java heap space
13/03/25 15:04:28 INFO mapred.JobClient: Job Failed: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201303251213_0012_m_000000
java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
    at org.apache.hadoop.examples.Grep.run(Grep.java:69)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.examples.Grep.main(Grep.java:93)

Can anyone tell me what the problem is?

– Senthil Porunan

4 Answers


Clearly you have run out of the heap space allotted to Java, so you should try to increase it.

To do that, you can execute the following before running the hadoop command:

export HADOOP_OPTS="-Xmx4096m"
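
For example, re-running the grep job from the question:

export HADOOP_OPTS="-Xmx4096m"
bin/hadoop jar hadoop-examples-1.0.4.jar grep input output 'dfs[a-z.]+'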

Alternatively, you can make the setting permanent by adding the following to your mapred-site.xml file, which lives in HADOOP_HOME/conf/:

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx4096m</value>
</property>

This would set your Java heap space to 4096 MB (4 GB). You may want to try a lower value first to see if that works. If it doesn't, increase the value further if your machine supports it; if not, move to a machine with more memory and try there. A heap-space error simply means you don't have enough RAM available for Java.

UPDATE: For Hadoop 2+, make the changes in mapreduce.map.java.opts instead.
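
A sketch of that change for mapred-site.xml; mapreduce.reduce.java.opts (an assumption here, not mentioned in the update) is the analogous setting for reduce tasks:

<property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx4096m</value>
</property>
<property>
    <!-- assumed: the reduce-side counterpart, if reducers also run out of heap -->
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx4096m</value>
</property>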

– Amar

  • Thanks a lot, this saved the day for me. Clearly this should have been marked as the answer to the question! – Simon Ejsing Aug 28 '13 at 05:08
  • Perhaps it would be a good idea to set final to true in mapred-site.xml for this setting as well (since otherwise it might get overwritten by the configuration in hadoop-env.sh, should there happen to be one)? – sufinawaz Oct 10 '13 at 20:44
  • Just an update: for Hadoop 2+, change mapreduce.map.java.opts instead. – Shiyu Feb 08 '17 at 01:00
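
A minimal sketch of sufinawaz's suggestion above, using the Hadoop 1 property from this answer; the <final>true</final> element tells Hadoop that other configurations may not override this value:

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx4096m</value>
    <final>true</final>
</property>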
Adding

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx4096m</value>
</property>

works for me.

export HADOOP_OPTS="-Xmx4096m"

doesn't (likely because HADOOP_OPTS only affects the client JVM that submits the job, not the child JVMs that run the map tasks).

– JohnDavid

Using Hadoop 2.5.0-cdh5.2.0, this worked for me to change the heap size of the local (sequential) Java process:

export HADOOP_HEAPSIZE=2900
hadoop jar analytics.jar .....

The reason this works is that /usr/lib/hadoop/libexec/hadoop-config.sh contains:

# check envvars which might override default args
if [ "$HADOOP_HEAPSIZE" != "" ]; then
  #echo "run with heapsize $HADOOP_HEAPSIZE"
  JAVA_HEAP_MAX="-Xmx""$HADOOP_HEAPSIZE""m"
  #echo $JAVA_HEAP_MAX
fi
– Don Smith

Even if you add this property to mapred-site.xml:

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx2048m</value>
</property>

the job may still fail for another reason: the container exceeds the virtual memory limit. In this situation, you must add

<property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>4.2</value>
</property>

to yarn-site.xml, because the default ratio of 2.1 is sometimes too small.
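
To see the effect of the ratio: assuming a map container allotted 2048 MB of physical memory (e.g. via mapreduce.map.memory.mb), the default ratio permits 2048 MB × 2.1 ≈ 4.3 GB of virtual memory per container, while a ratio of 4.2 raises that limit to 2048 MB × 4.2 ≈ 8.6 GB.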

– Peter.Chu