
I want to create a custom logger in Spark. I want to send some messages from the executors to a local file for debugging purposes. I tried to follow this tutorial, so I edited my log4j.properties file like this to create a custom logger that saves the logs in /mypath/sparkU.log:

# My added lines
log4j.logger.myLogger=WARN, RollingAppenderU 
log4j.appender.RollingAppenderU=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppenderU.File=/mypath/sparkU.log
log4j.appender.RollingAppenderU.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppenderU.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppenderU.layout.ConversionPattern=[%p] %d %c %M - %m%n

log4j.rootLogger=${root.logger}
root.logger=WARN,console       
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
shell.log.level=WARN
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=ERROR
log4j.logger.org.apache.spark.repl.Main=${shell.log.level}
log4j.logger.org.apache.spark.api.python.PythonGatewayServer=${shell.log.level}
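
To check whether this file is the one actually being loaded, something like the following can inspect the custom logger from the PySpark driver (only a sketch: the names mirror the properties above, and `spark` is the session created further down):

# Sketch: inspect the log4j configuration the driver JVM actually loaded.
# "myLogger" and "RollingAppenderU" mirror the properties file above.
log4j = spark.sparkContext._jvm.org.apache.log4j
my_logger = log4j.LogManager.getLogger("myLogger")

print(my_logger.getLevel())              # expect WARN if the file was picked up
appenders = my_logger.getAllAppenders()  # java.util.Enumeration of attached appenders
while appenders.hasMoreElements():
    print(appenders.nextElement().getName())  # expect RollingAppenderU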

Then I run this with spark-submit (I usually work in Python, but the language is not the problem here):

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext, SparkSession
from pyspark.sql.types import *

spark = SparkSession \
        .builder \
        .master("yarn") \
        .appName("test custom logging") \
        .config("spark.some.config.option", "some-value") \
        .getOrCreate()

log4jLogger = spark.sparkContext._jvm.org.apache.log4j 

log = log4jLogger.LogManager.getLogger(__name__) 

log.error("Hello demo")

log.error("I am done")

print('hello from print')
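
For reference, this is the kind of lookup I mean by "custom logger" (only a sketch: "myLogger" is the name from the properties file above, while `__name__` here resolves to "__main__", which is a different log4j logger name):

# Sketch: the logger name asked for has to match a configured logger
# (a log4j.logger.<name> entry) for its appenders to apply.
named_log = log4jLogger.LogManager.getLogger("myLogger")
named_log.warn("Hello from the logger attached to RollingAppenderU")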

But while the file sparkU.log is created, it is empty. The Spark logs in the console and in HDFS are created correctly. Why is the log file empty, and what is the correct way to do something like this? I work with Spark 2.1 under YARN and I use the Cloudera distribution. Thanks for any advice.

Michail N
  • Possible duplicate of [How to stop messages displaying on spark console?](https://stackoverflow.com/questions/27781187/how-to-stop-messages-displaying-on-spark-console) – Rahul Sharma Feb 23 '18 at 16:27
  • I want to redirect Spark's log stream to a FILE and not simply change log level to hide most of the messages – Michail N Feb 24 '18 at 08:48
  • You may have already tried this. Configure the required logging using log4j, and once the job is completed, I download the YARN logs using `yarn logs -applicationId <appId>` and then redirect them to a file. – Rahul Sharma Feb 26 '18 at 14:56
