8

When I run the RHadoop example, the errors below occur:

is running beyond virtual memory limits. Current usage: 121.2 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container.

Container killed on request. Exit code is 143

Container exited with a non-zero exit code 143

hadoop streaming failed with error code 1

How can I fix it?

My Hadoop settings:

mapred-site.xml

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
</configuration>

yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
                <name>yarn.nodemanager.local-dirs</name>
                <value>/usr/local/hadoop-2.7.3/data/yarn/nm-local-dir</value>
        </property>
        <property>
                <name>yarn.resourcemanager.fs.state-store.uri</name>
                <value>/usr/local/hadoop-2.7.3/data/yarn/system/rmstore</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>localhost</value>
        </property>
        <property>
                <name>yarn.web-proxy.address</name>
                <value>0.0.0.0:8089</value>
        </property>
</configuration>
yes89929
  • I suspect this has less to do with R, RStudio, RStudio Server, and RHadoop than Hadoop by itself. (I'm not a Hadoop expert, but my gut tells me 1GB is rather low for it.) Since this has nothing to do with *programming*, I suggest this question belongs on [Server Fault](https://serverfault.com/) or [SuperUser](https://superuser.com/) (both StackExchange sites). – r2evans Apr 16 '17 at 19:54
  • I couldn't figure out why there were 2 RStudio tags either. – IRTFM Apr 17 '17 at 00:04
  • I apologize if I offended you. I didn't know what the cause of the errors was, or what the difference is between r, rstudio, and rstudio-server, because my experience with Hadoop and R is less than a month. – yes89929 Apr 17 '17 at 04:21

2 Answers

15

I got almost the same error while running a Spark application on a YARN cluster.

"Container [pid=791,containerID=container_1499942756442_0001_02_000001] is running beyond virtual memory limits. Current usage: 135.4 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container."

I resolved it by disabling the virtual memory check in yarn-site.xml:

<property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
</property>

This one setting was enough in my case.
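For context, the NodeManager kills a container when its virtual memory exceeds the physical memory limit multiplied by yarn.nodemanager.vmem-pmem-ratio (default 2.1), which is why the error reports "2.1 GB of 2.1 GB virtual memory used" for a 1 GB container. If you would rather keep the check than disable it, raising the ratio in yarn-site.xml should also avoid the kill; this is only a sketch, and the value 4 is an illustrative choice, not something I have tested:

<property>
        <!-- Allow up to 4x the physical memory as virtual memory (default is 2.1).
             Illustrative value; tune it for your workload. -->
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>4</value>
</property>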

Anurag
5

I referred to the site below: http://crazyadmins.com/tag/tuning-yarn-to-get-maximum-performance/

There I learned that I can change the memory allocation for MapReduce.

I changed mapred-site.xml:

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.map.memory.mb</name>
                <value>2000</value>
        </property>
        <property>
                <name>mapreduce.reduce.memory.mb</name>
                <value>2000</value>
        </property>
        <property>
                <!-- JVM options for map tasks: must be JVM flags, not a bare number -->
                <name>mapreduce.map.java.opts</name>
                <value>-Xmx1600m</value>
        </property>
        <property>
                <!-- JVM options for reduce tasks -->
                <name>mapreduce.reduce.java.opts</name>
                <value>-Xmx1600m</value>
        </property>
</configuration>
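A note on how these settings relate, as I understand it: mapreduce.map.memory.mb and mapreduce.reduce.memory.mb set the size of the YARN container, while mapreduce.map.java.opts and mapreduce.reduce.java.opts set the heap of the JVM that runs inside that container, so the java.opts values must be JVM flags such as -Xmx1600m rather than a plain number. Keeping the heap at roughly 80% of the container size (1600m of 2000 MB here) is a commonly cited rule of thumb that leaves headroom for non-heap memory. Restart YARN after editing so the new values take effect.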
yes89929