
The advice on logging from PySpark found in this question, How to turn off INFO logging in PySpark?, is not working for me: no logging is happening.

I am familiar with setting up logging in PySpark from a locally built Spark. However, I am now using the Cloudera Spark distribution. I have set a RollingFileAppender within

$SPARK_HOME/log4j.properties

which is the correct thing to do according to the docs:

http://spark.apache.org/docs/1.2.0/configuration.html#configuring-logging

Configuring Logging
Spark uses log4j for logging. You can configure it by adding a log4j.properties file in the conf directory. One way to start is to copy the existing log4j.properties.template located there.

But that is not taking effect: no log files are created in the destination directory.
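For reference, this is the general shape of the RollingFileAppender setup I am describing; the output path, size limits, and pattern below are illustrative, not my exact values:

```properties
# Illustrative log4j.properties snippet (log4j 1.x, which Spark 1.x uses).
# The file path and rollover settings here are placeholders.
log4j.rootCategory=INFO, rolling

log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.File=/var/log/spark/spark.log
log4j.appender.rolling.MaxFileSize=10MB
log4j.appender.rolling.MaxBackupIndex=5
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```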

WestCoastProjects

1 Answer


It appears the specific problem was that one of the libraries, ZooKeeper, does not use the log4j.properties setting provided via SPARK_HOME. Instead, ZooKeeper picks up the first log4j.properties it finds on the classpath.

The solution was to copy the log4j.properties already in the $SPARK_HOME/conf dir to the $HADOOP_CONF_DIR. Then the logging behaved as anticipated.
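The fix above amounts to a single copy. A minimal sketch follows; since actual $SPARK_HOME and $HADOOP_CONF_DIR locations vary by cluster, temporary stand-in directories are used here to show the shape of the command:

```shell
# Sketch of the fix: copy Spark's log4j.properties into the Hadoop conf dir
# so ZooKeeper (and anything else scanning the classpath) finds it first.
# Stand-in directories below; on a real cluster these would be
# $SPARK_HOME/conf and $HADOOP_CONF_DIR.
SPARK_CONF=$(mktemp -d)   # stands in for $SPARK_HOME/conf
HADOOP_CONF=$(mktemp -d)  # stands in for $HADOOP_CONF_DIR

printf 'log4j.rootCategory=WARN, console\n' > "$SPARK_CONF/log4j.properties"

cp "$SPARK_CONF/log4j.properties" "$HADOOP_CONF/"
ls "$HADOOP_CONF"
```

On CDH the real command is typically `cp $SPARK_HOME/conf/log4j.properties $HADOOP_CONF_DIR/`, run on each node that launches Spark processes.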

WestCoastProjects