
I am on GCP and have been reading different posts about controlling PySpark logging, and I couldn't get anything to work until I found a post about overwriting the log4j.properties file. This is kind of working, except that containers are now failing with complaints about multiple SLF4J (log4j) bindings on the classpath, like this:

[2019-09-23 20:38:48.495]Container exited with a non-zero exit code 134. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
/bin/bash: line 1: 23850 Aborted                 /usr/lib/jvm/java-8-openjdk-amd64/bin/java -server -Xmx38168m '-Dflogger.backend_factory=com.google.cloud.hadoop.repackaged.gcs.com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance' -Djava.io.tmpdir=/hadoop/yarn/nm-local-dir/usercache/davidsc/appcache/application_1569259407961_0001/container_1569259407961_0001_01_000400/tmp '-Dspark.driver.port=33949' '-Dspark.rpc.message.maxSize=512' -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/userlogs/application_1569259407961_0001/container_1569259407961_0001_01_000400 -XX:OnOutOfMemoryError='kill %p' org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@davidsc-prod-m.c.unity-ads-ds-prd.internal:33949 --executor-id 368 --hostname davidsc-prod-w-0.c.unity-ads-ds-prd.internal --cores 8 --app-id application_1569259407961_0001 --user-class-path file:/hadoop/yarn/nm-local-dir/usercache/davidsc/appcache/application_1569259407961_0001/container_1569259407961_0001_01_000400/__app__.jar --user-class-path file:/hadoop/yarn/nm-local-dir/usercache/davidsc/appcache/application_1569259407961_0001/container_1569259407961_0001_01_000400/tensorflow-hadoop-1.6.0.jar > /var/log/hadoop-yarn/userlogs/application_1569259407961_0001/container_1569259407961_0001_01_000400/stdout 2> /var/log/hadoop-yarn/userlogs/application_1569259407961_0001/container_1569259407961_0001_01_000400/stderr
Last 4096 bytes of stderr :
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

I followed the link in the second-to-last log line, but didn't find it very helpful.
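For reference, the runtime approach those earlier posts suggest is roughly the sketch below (assuming a standard SparkSession); it only changes the JVM root logger after the context is up, so it didn't get me all the way there:

# Sketch of the runtime approach: set the JVM root log level once the
# SparkContext exists. Startup messages are already printed by this point.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.sparkContext.setLogLevel("ERROR")  # also accepts WARN, INFO, DEBUG, ...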

I'm trying to reduce logging to just errors. Following the post I found, I am uploading a log4j.properties file with the following content:

# Set everything to be logged to the console
log4j.rootCategory=ERROR, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=ERROR
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=ERROR
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=ERROR

# Reduce verbosity for other spammy core classes.
log4j.logger.org.apache.spark=ERROR
log4j.logger.org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter=ERROR
log4j.logger.org.apache.hadoop.conf.Configuration.deprecation=ERROR
log4j.logger.org.spark-project.jetty.server.handler.ContextHandler=ERROR

# Spark 2.0 specific Spam
log4j.logger.org.spark_project.jetty.server.handler.ContextHandler=ERROR

# from https://stackoverflow.com/questions/40608412/how-can-set-the-default-spark-logging-level
log4j.logger.org.apache.spark.api.python.PythonGatewayServer=ERROR

I added that last entry about the Python API myself; maybe that is pulling in some conflicting logger? It seemed like a good thing to add.
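As an aside, the Python half of the gateway logs through Python's standard logging module rather than log4j, so if any noise were coming from that side it would have to be quieted separately from the properties file above. A minimal sketch:

# Sketch: quiet py4j's Python-side logging; this is independent of the
# JVM-side log4j.properties configuration.
import logging

logging.getLogger("py4j").setLevel(logging.ERROR)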

In any case, what I'm uploading is just a modification of what I've found here:

/usr/lib/spark/conf/log4j.properties

Which is a soft link to:

./etc/spark/conf.dist/log4j.properties

I do the copy in my Cloud Dataproc cluster initialization action. If I ssh to a worker and search for all the log4j.properties files, I find:

 find . 2>/dev/null | grep log4j.prop 2>/dev/null
./etc/pig/conf.dist/log4j.properties.template
./etc/pig/conf.dist/test-log4j.properties
./etc/hadoop/conf.empty/log4j.properties
./etc/zookeeper/conf.dist/rest/log4j.properties
./etc/zookeeper/conf.dist/log4j.properties
./etc/spark/conf.dist/log4j.properties.template
./etc/spark/conf.dist/log4j.properties
./ump-ltv-spark-log4j.properties.error

and I can see that ./etc/spark/conf.dist/log4j.properties is my new ERROR-only configuration.
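One way to double-check that the driver JVM actually picked up the new file is to read the root logger's level back through py4j. A debugging sketch (`_jvm` is a private attribute, so this is just for poking around):

# Sketch: inspect the effective log4j root level from PySpark via py4j.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
log4j = spark.sparkContext._jvm.org.apache.log4j
print(log4j.LogManager.getRootLogger().getLevel())  # expect ERROR if the file took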

I hope I don't have to modify the ./etc/hadoop/conf.empty/log4j.properties file, which has a lot more stuff in it.

Following another link, I can look at my Spark launch command:

~$ SPARK_PRINT_LAUNCH_COMMAND=1 spark-shell
Spark Command: /usr/lib/jvm/java-8-openjdk-amd64/bin/java -cp /usr/lib/spark/conf/:/usr/lib/spark/jars/*:/etc/hadoop/conf/:/etc/hive/conf/:/usr/share/java/mysql.jar:/usr/local/share/google/dataproc/lib/* -Dscala.usejavacp=true -Dscala.usejavacp=true -Xmx106496m -Dflogger.backend_factory=com.google.cloud.hadoop.repackaged.gcs.com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main --name Spark shell spark-shell

That makes me think that GCP has wired in its own logging backend (note the -Dflogger.backend_factory=...Log4jBackendFactory option in the command); maybe that is where these messages are coming from.
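For completeness, the other approach I've seen, instead of overwriting the default file, is handing each executor JVM an explicit config through extraJavaOptions. A sketch with a placeholder path (the driver-side equivalent normally has to go on the spark-submit command line, since the driver JVM is already running by the time Python code executes):

# Sketch: point executor JVMs at an explicit log4j 1.x config file.
# The path is a placeholder and must exist on every worker.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config(
        "spark.executor.extraJavaOptions",
        "-Dlog4j.configuration=file:/etc/spark/conf/log4j.properties",
    )
    .getOrCreate()
)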

It doesn't look like you posted the full error output; could you post the full content of the `prelaunch.err` and `stderr` files? – Igor Dvorzhak Oct 26 '19 at 21:40
