
Version: spring-xd-1.0.1

Distributed mode: yarn

Hadoop version: cdh5

I have modified config/servers.yml to point to the correct applicationDir, ZooKeeper, HDFS, ResourceManager, Redis, and MySQL database.
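
The edits were roughly of the following shape; the hosts, ports, and paths below are placeholders for my real values, and the key names are from memory, so treat this as a sketch rather than the authoritative layout of servers.yml:

```yaml
# Sketch of the servers.yml entries that were changed (placeholder values).
spring:
  yarn:
    applicationDir: /xd/app/               # HDFS dir that the push uploaded the app to
  hadoop:
    fsUri: hdfs://namenode-host:8020       # HDFS namenode
    resourceManagerHost: rm-host           # YARN ResourceManager
    resourceManagerPort: 8032
  redis:
    host: redis-host
    port: 6379
  datasource:                              # MySQL used for the job repository
    url: jdbc:mysql://db-host:3306/xdjob
    username: xd
    password: secret
    driverClassName: com.mysql.jdbc.Driver
zk:
  client:
    connect: zk-host:2181                  # ZooKeeper ensemble
```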

However, after the push, when I start the admin it is killed by YARN after some time. I do not understand why the container would consume 31 GB of (virtual) memory. Please point me in the right direction to debug this problem. Also, how do I increase the log level?

The following error is observed in the logs:

Got ContainerStatus=[container_id { app_attempt_id { application_id { id: 432 cluster_timestamp: 1415816376410 } attemptId: 1 } id: 2 } state: C_COMPLETE diagnostics: "Container [pid=19374,containerID=container_1415816376410_0432_01_000002] is running beyond physical memory limits. Current usage: 1.2 GB of 1 GB physical memory used; 31.7 GB of 2.1 GB virtual memory used. Killing container.\nDump of the process-tree for container_1415816376410_0432_01_000002 :\n\t|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE\n\t|- 19381 19374 19374 19374 (java) 3903 121 33911242752 303743 /usr/java/jdk1.7.0_45-cloudera/bin/java -DxdHomeDir=./spring-xd-yarn-1.0.1.RELEASE.zip -Dxd.module.config.location=file:./modules-config.zip/ -Dspring.application.name=admin -Dspring.config.location=./servers.yml org.springframework.xd.dirt.server.AdminServerApplication \n\t|- 19374 24125 19374 19374 (bash) 0 0 110804992 331 /bin/bash -c /usr/java/jdk1.7.0_45-cloudera/bin/java -DxdHomeDir=./spring-xd-yarn-1.0.1.RELEASE.zip -Dxd.module.config.location=file:./modules-config.zip/ -Dspring.application.name=admin -Dspring.config.location=./servers.yml org.springframework.xd.dirt.server.AdminServerApplication 1>/var/log/hadoop-yarn/container/application_1415816376410_0432/container_1415816376410_0432_01_000002/Container.stdout 2>/var/log/hadoop-yarn/container/application_1415816376410_0432/container_1415816376410_0432_01_000002/Container.stderr \n\nContainer killed on request. Exit code is 143\nContainer exited with a non-zero exit code 143\n" exit_status: 143

sagar
  • Could you provide more info about environment like java version, OS, cdh5 version, etc. Also what type of streams/jobs you had running there and how long it actually took for yarn to kill that container. – Janne Valkealahti Nov 21 '14 at 13:59
  • CDH 5.0.1, OS: CentOS 6. No custom jobs; this is the stock Spring XD YARN zip, unzipped, with only the config changes described above. The admin container runs for 34 secs and then fails. – sagar Nov 21 '14 at 17:54
  • >INFO monitor.IntegrationMBeanExporter: Registering beans for JMX exposure on startup >INFO tomcat.TomcatEmbeddedServletContainerFactory: Server initialized with port: 9393 xxxxx INFO core.StandardService: Starting service Tomcat xxxxx INFO core.StandardEngine: Starting Servlet Engine: Apache Tomcat/7.0.55 INFO annotation.AnnotationConfigApplicationContext: Closing org.springframework.context.annotation.AnnotationConfigApplicationContext@3e990b4: startup date [xxx]; root of context hierarchy – sagar Nov 21 '14 at 17:55
  • Looks like you are running out of real physical memory. What is your `yarn.scheduler.maximum-allocation-mb` property set to? I know we have seen problems running YARN apps on Cloudera's Quickstart VM with default config. See https://github.com/spring-projects/spring-hadoop-samples/issues/13 and https://github.com/spring-projects/spring-hadoop-samples/tree/master/mapreduce#building-and-running for more background. – Thomas Risberg Nov 22 '14 at 19:02
  • Tried various tricks to reproduce this without luck. Could try to get a more detailed process memory dump to see where the vmem goes. There's good discussion on that in these posts: http://stackoverflow.com/questions/561245/virtual-memory-usage-from-java-under-linux-too-much-memory-used and http://stackoverflow.com/questions/6240985/java-program-with-16gb-virtual-memory-and-growing-is-it-a-problem – Janne Valkealahti Nov 24 '14 at 13:44
  • There seem to be other reports where people generally have trouble with CentOS 6 in YARN. Mentioned in http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/ which leads to https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en. Setting yarn.nodemanager.vmem-check-enabled to false is a workaround if the problem is indeed in the CentOS glibc version (see the yarn-site.xml sketch after these comments). – Janne Valkealahti Nov 24 '14 at 14:25
  • `yarn.scheduler.maximum-allocation-mb` is set to 96 GB on machines with 100+ GB of RAM. All my other MapReduce jobs are running fine, and all Hive jobs are running fine. – sagar Nov 24 '14 at 15:44
  • Thanks @JanneValkealahti. I tried that. It seems that `glibc` is the culprit. I do now have a working admin and container. However, the virtual memory reported by the admin container is still 31 GB. – sagar Nov 24 '14 at 17:07
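
The workaround mentioned in the last two comments amounts to a yarn-site.xml change along these lines (these are standard YARN NodeManager properties; the values shown are only illustrative):

```xml
<!-- Inside <configuration> in yarn-site.xml on the NodeManager hosts. -->
<!-- Disable the virtual-memory check that is killing the container... -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
<!-- ...or, if you prefer to keep the check, loosen the vmem/pmem ratio (default 2.1). -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>5</value>
</property>
```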

1 Answer


Yes, with the current versions 1.1.0/1.1.1 you don't need to start the admin explicitly. The admin and the containers will be instantiated by YARN when you submit the application.
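
For reference, the 1.1.x flow is roughly as follows; the sub-command names below are as I recall them from the xd-yarn shell, so double-check them against the 1.1.x documentation:

```sh
# Upload the application package to HDFS, then submit it to YARN;
# YARN then starts the admin and the XD containers itself.
bin/xd-yarn push
bin/xd-yarn submit
```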