Diagnostic Messages for this Task:

Container [pid=3347,containerID=container_1490354262227_0013_01_000104] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 1.5 GB of 5 GB virtual memory used. Killing container.
Dump of the process-tree for container_1490354262227_0013_01_000104 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 3360 3347 3347 3347 (java) 7596 396 1537003520 262629 /usr/java/latest/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx864m -Djava.io.tmpdir=/mnt3/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1490354262227_0013/container_1490354262227_0013_01_000104/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/mnt/var/log/hadoop/userlogs/application_1490354262227_0013/container_1490354262227_0013_01_000104 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.35.178.86 49938 attempt_1490354262227_0013_m_000004_3 104
|- 3347 2563 3347 3347 (bash) 0 1 115806208 698 /bin/bash -c /usr/java/latest/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx864m -Djava.io.tmpdir=/mnt3/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1490354262227_0013/container_1490354262227_0013_01_000104/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/mnt/var/log/hadoop/userlogs/application_1490354262227_0013/container_1490354262227_0013_01_000104 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.35.178.86 49938 attempt_1490354262227_0013_m_000004_3 104 1>/mnt/var/log/hadoop/userlogs/application_1490354262227_0013/container_1490354262227_0013_01_000104/stdout 2>/mnt/var/log/hadoop/userlogs/application_1490354262227_0013/container_1490354262227_0013_01_000104/stderr
- Try to optimize your query first. – leftjoin Mar 25 '17 at 20:25
- @leftjoin How to optimize? Can you be a little more specific? – shubh Mar 28 '17 at 09:23
- It may be possible to optimize the query so that it consumes less memory. Please provide the query as well as the configuration parameters. – leftjoin Mar 28 '17 at 09:28
- @leftjoin Please find the query in the given link https://pastebin.com/wuNEFgnJ – shubh Mar 28 '17 at 12:00
- Is it failing on the reducer or the mapper? – leftjoin Mar 28 '17 at 12:18
- @leftjoin It is failing on the mapper. – shubh Mar 28 '17 at 13:30
- Then see how to adjust memory settings for the mapper in my answer. – leftjoin Mar 28 '17 at 13:34
- @leftjoin How much memory should I use if I am processing around 500 GB of data? – shubh Mar 28 '17 at 14:01
- Difficult to calculate; it depends on file sizes, the data itself, etc. Try increasing it until it works. – leftjoin Mar 28 '17 at 14:08
- Also try to tune mapper parallelism: https://cwiki.apache.org/confluence/display/TEZ/How+initial+task+parallelism+works – leftjoin Mar 28 '17 at 14:12
- Sorry, that link is for Tez. See here: http://stackoverflow.com/a/42842117/2700344 – leftjoin Mar 28 '17 at 14:13
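The last comment links to an answer about tuning mapper parallelism. For Hive on MapReduce this usually comes down to the input split size; a minimal sketch, assuming the default CombineHiveInputFormat (the values are illustrative, not taken from this thread):

set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;  -- Hive's default input format
set mapreduce.input.fileinputformat.split.maxsize=134217728;  -- 128 MB; a smaller max split size means more, smaller mappers
set mapreduce.input.fileinputformat.split.minsize=67108864;   -- 64 MB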
2 Answers
Container [pid=3347,containerID=container_1490354262227_0013_01_000104] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 1.5 GB of 5 GB virtual memory used.
It looks like your process needs more memory than the defined limit allows. You need to increase the container size:

SET hive.tez.container.size=4096;  -- value is in MB
SET hive.auto.convert.join.noconditionaltask.size=1436549120;  -- value is in bytes (~1370 MB, roughly 1/3 of the container size)
Read more about this here.
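When running on Tez, the JVM heap also has to stay below the container size, otherwise YARN kills the container exactly as in the diagnostic above. A minimal sketch, assuming Hive on Tez (the 80% ratio and the values are illustrative, not part of the original answer):

SET hive.tez.container.size=4096;  -- YARN container size in MB
SET hive.tez.java.opts=-Xmx3276m;  -- JVM heap for Tez tasks, roughly 80% of the container size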

Ambrish
If it is failing on the reducer:
- Add distribute by partition key to the query. It will distribute data between reducers, so each reducer creates fewer partitions and consumes less memory.
insert overwrite table items_s3_table PARTITION(w_id) select pk, cId, fcsku, cType, disposition, cReferenceId, snapshotId, quantity, w_id
from items_dynamodb_table distribute by w_id;
- Try to decrease bytes per reducer. Decreasing this parameter will increase parallelism (the number of reducers) and may reduce memory consumption per reducer.
hive.exec.reducers.bytes.per.reducer=67108864;
- Adjust memory settings if nothing helps.
For mappers:
mapreduce.map.memory.mb=4096;
mapreduce.map.java.opts=-Xmx3000m;
For reducers:
mapreduce.reduce.memory.mb=4096;
mapreduce.reduce.java.opts=-Xmx3000m;
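For context, the failing container in the question was a 1 GB container running a JVM with -Xmx864m, so the heap alone nearly filled the physical limit and the remaining JVM overhead pushed it over. The parameters above can be applied at the session level right before the query; a sketch reusing the same illustrative 4096/3000 values (the -Xmx heap must stay comfortably below the corresponding *.memory.mb container limit):

set mapreduce.map.memory.mb=4096;
set mapreduce.map.java.opts=-Xmx3000m;
set mapreduce.reduce.memory.mb=4096;
set mapreduce.reduce.java.opts=-Xmx3000m;
insert overwrite table items_s3_table PARTITION(w_id)
select pk, cId, fcsku, cType, disposition, cReferenceId, snapshotId, quantity, w_id
from items_dynamodb_table distribute by w_id;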

leftjoin