2

I'm writing a Scala / Spark program following this example. My tools are IntelliJ and sbt. (I'm not using the Scala Spark shell.) I'm using scala-logging with logback-classic, and I need to reduce the logging from Spark or direct the Spark logging to a different .log file.

I've tried calling sc.setLogLevel("WARN") in my code, where sc is the SparkContext, but it has no effect.

To make matters worse, the Spark log output comes from several different packages (org.apache.spark, o.a.h.m.lib, o.a.h.s.a, etc.). I hope there is a better way than defining an appender for each package.

How do I turn off Spark logging, or better yet redirect logging from Spark to a different file than the logging calls from my code?

Dean Schulze
  • Sir, while [docs](https://spark.apache.org/docs/latest/configuration.html#dynamically-loading-spark-properties) do state that `..SparkConf take highest precedence, then flags passed to spark-submit or spark-shell, then options in the spark-defaults.conf file..`, you might still want to look at [`log4j` that `Spark` uses](https://spark.apache.org/docs/latest/configuration.html#configuring-logging) – y2k-shubham Jan 18 '19 at 05:46
  • Did you try setting logger name = your package, level = warn in the logback XML file? – Himanshu Ahire Jan 18 '19 at 05:55
  • Copying the log4j.properties that had logging turned down into my src/main/resources had no effect. – Dean Schulze Jan 18 '19 at 16:23

3 Answers

4

You need to suppress the log messages from the "org" logger hierarchy using the log4j Logger:

Logger.getLogger("org").setLevel(Level.ERROR)

Example program - Try this

import org.apache.log4j.{Level, Logger}
import org.apache.spark._
import org.apache.spark.sql._
import org.apache.spark.sql.functions._


object casestudy1 {

  def main(args: Array[String]): Unit = {
    // Silence everything under the "org" logger hierarchy (Spark, Hadoop, etc.) except errors
    Logger.getLogger("org").setLevel(Level.ERROR)

    val spark = SparkSession.builder().appName("Sample-case_study1").master("local[*]").getOrCreate()

    import spark.implicits._

    val df = Seq((1, "Hello how are you"), (1, "I am fine"), (2, "Yes you are")).toDF("a", "b")
    df.show(false)
  }
}
stack0114106
0

You can set the log level of the Spark logger directly from the SparkContext. To reduce Spark's verbosity, set the level to ERROR, which makes Spark write log output only when an error occurs.

val session =  SparkSession.builder().appName("appName").master("local[*]").getOrCreate()
session.sparkContext.setLogLevel("ERROR")
gccodec
0

Turned out to be easy. In my logback.xml I set <root level="error"> to turn off the noise from Spark. I added a <logger name="mypackage" level="debug" additivity="false"> with an appender pointing to where I want my own log messages to go.
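
For reference, a minimal sketch of what such a logback.xml could look like. The appender names, file path, and the mypackage placeholder are illustrative, not taken from my actual setup:

<configuration>

  <!-- Hypothetical file appender for the application's own log messages; the path is an assumption -->
  <appender name="APP_FILE" class="ch.qos.logback.core.FileAppender">
    <file>logs/application.log</file>
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <!-- Console appender used by the root logger -->
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <!-- Application logger: DEBUG level, own appender; additivity="false" keeps these
       messages from also flowing up to the root logger's appender -->
  <logger name="mypackage" level="debug" additivity="false">
    <appender-ref ref="APP_FILE"/>
  </logger>

  <!-- Root logger at ERROR suppresses the INFO/WARN noise from Spark and friends -->
  <root level="error">
    <appender-ref ref="STDOUT"/>
  </root>

</configuration>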

Dean Schulze