
spark-submit output on two different clusters (both running Spark 1.2) looks different: one is "log-style", i.e., a voluminous stream of messages like

15/04/06 14:53:13 INFO TaskSetManager: Starting task 262.0 in stage 4.0 (TID 894, XXXXX, PROCESS_LOCAL, 1785 bytes)
15/04/06 14:53:13 INFO TaskSetManager: Finished task 255.0 in stage 4.0 (TID 892) in 155 ms on XXXXX (288/300)
15/04/06 14:53:13 INFO BlockManagerInfo: Added rdd_16_262 in memory on XXXXX:49388 (size: 14.3 MB, free: 1214.5 MB)
15/04/06 14:53:13 INFO TaskSetManager: Finished task 293.0 in stage 4.0 (TID 893) in 156 ms on XXXXX (289/300)
15/04/06 14:53:13 INFO TaskSetManager: Finished task 262.0 in stage 4.0 (TID 894) in 168 ms on XXXXX (290/300)
15/04/06 14:53:16 INFO TaskSetManager: Starting task 1.0 in stage 4.0 (TID 895, ip-10-0-3-92.ec2.internal, NODE_LOCAL, 1785 bytes)
15/04/06 14:53:16 INFO TaskSetManager: Starting task 74.0 in stage 4.0 (TID 896, XXXXX, NODE_LOCAL, 1785 bytes)

and the other "progress-style", i.e., a growing progress bar at the bottom of the screen (which may be interrupted by errors, if any).

How do I switch between the two styles? (either on a per-job or a per-cluster basis)

I tried passing --conf spark.ui.showConsoleProgress=true to spark-submit with no effect.

sds

1 Answer


I have encountered this before. In my case, the difference arose simply because the two clusters set different log4j.rootCategory levels in conf/log4j.properties.

The "progress-style" output appears on the cluster whose logging level is WARN, while the "log-style" output appears when I set the logging level to INFO.

Update (2015-05-10):

I came across the _progressBar startup logic in SparkContext (branch-1.4.0). It is actually controlled by two conditions:

_progressBar =
  if (_conf.getBoolean("spark.ui.showConsoleProgress", true) && !log.isInfoEnabled) {
    Some(new ConsoleProgressBar(this))
  } else {
    None
  }

Therefore, to enable the progress-style output in the console, two things must hold: spark.ui.showConsoleProgress must be true (which, per the code above, is already its default), and the log level in conf/log4j.properties must be raised so that INFO is not enabled, i.e., to WARN or ERROR.
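As a minimal sketch of the logging half of this, assuming your conf/log4j.properties was copied from the log4j.properties.template that ships with Spark (the appender lines below follow that template as I remember it and may differ in your version), the only line that has to change is the root category:

```properties
# conf/log4j.properties (sketch; assumes the layout of Spark's shipped template)
# Was: log4j.rootCategory=INFO, console
# WARN (or ERROR) makes log.isInfoEnabled false, which the check above requires.
log4j.rootCategory=WARN, console

# Console appender, left unchanged from the template (assumed defaults).
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

With this in place, the INFO lines from TaskSetManager and BlockManagerInfo are suppressed and the progress bar can take over the console.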

yjshen
  • I passed `--conf 'log4j.rootCategory=WARN,console'` to `spark-submit` and I do not see any change. – sds Apr 27 '15 at 15:58
  • @sds, I don't think you can pass `log4j.rootCategory=WARN,console` using `--conf`; it seems `--conf` only takes effect for Spark configuration properties like "spark.some.variable=some_value" – yjshen Apr 28 '15 at 00:33
  • So, what do I need to do? Edit `log4j.properties`? – sds Apr 28 '15 at 01:24
  • @sds, Yes, edit log4j.properties in conf/ and distribute it to the whole cluster. – yjshen Apr 28 '15 at 02:01
  • do I need to restart something or will the change in `log4j.properties` affect the very next `spark-submit` invocation? – sds Apr 28 '15 at 02:22
  • @sds, I don't think you need to restart the cluster for it to take effect, but I'm not sure; just give it a try :) – yjshen Apr 28 '15 at 02:31
  • I tried it: indeed it did not require restarting the cluster. However, the log output just disappeared; there is no progress bar. – sds Apr 28 '15 at 15:18
  • @sds, come across the progressBar startup logic in SparkContext, in branch-1.4.0, actually controlled by two conditions: – yjshen May 10 '15 at 08:26
  • _progressBar = if (_conf.getBoolean("spark.ui.showConsoleProgress", true) && !log.isInfoEnabled) { Some(new ConsoleProgressBar(this)) } else { None } – yjshen May 10 '15 at 08:26
  • According to this, I think you could turn on spark.ui.showConsoleProgress manually and raise the log level above INFO, i.e. to WARN or ERROR – yjshen May 10 '15 at 08:28
  • yeah, looks like that. thanks. please fold your comments into your answer so that I can accept it. – sds May 10 '15 at 14:46
  • I can't make it work! I'm using Spark 1.4.0, and even when I set both log4j.rootCategory to ERROR and spark.ui.showConsoleProgress to true, the progress bar doesn't show up. I've tried every possible way to set showConsoleProgress to true (using spark-defaults.conf, System.setProperty(), --conf spark.ui.showConsoleProgress=true); nothing works. The last thing I tried was to change the Spark source code to "_progressBar = Some(new ConsoleProgressBar(this))" and recompile. Still nothing. – naskoos Jul 16 '15 at 06:36
  • Answering my own question: I had to use yarn-client instead of yarn-cluster! – naskoos Jul 16 '15 at 07:12