
When starting my spark-shell, I got a bunch of WARN messages, but I cannot understand them. Are there any important problems that I should take care of? Or is there some configuration that I missed? Or are these WARN messages normal?

cliu@cliu-ubuntu:Apache-Spark$ spark-shell 
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.2
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_66)
Type in expressions to have them evaluated.
Type :help for more information.
15/11/30 11:43:54 WARN Utils: Your hostname, cliu-ubuntu resolves to a loopback address: 127.0.1.1; using xxx.xxx.xxx.xx (`here I hide my IP`) instead (on interface wlan0)
15/11/30 11:43:54 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/11/30 11:43:55 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
Spark context available as sc.
15/11/30 11:43:58 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/30 11:43:58 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/30 11:44:11 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/11/30 11:44:11 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/11/30 11:44:14 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/11/30 11:44:14 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/30 11:44:14 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/30 11:44:27 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/11/30 11:44:27 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
SQL context available as sqlContext.

scala> 
fluency03

3 Answers


This one:

15/11/30 11:43:54 WARN Utils: Your hostname, cliu-ubuntu resolves to a loopback address: 127.0.1.1; using xxx.xxx.xxx.xx (`here I hide my IP`) instead (on interface wlan0)
15/11/30 11:43:54 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address

means that the hostname the driver managed to figure out for itself is not routable, and hence no remote connections are allowed. In a local environment it is not an issue, but if you go for a multi-machine configuration, Spark won't work properly. Hence the WARN message, as it may or may not be an issue. Just a heads-up.
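If you want to silence it in a local setup, a minimal sketch (the address below is only a placeholder; use the routable IP of the interface you actually want Spark to bind to) is to set SPARK_LOCAL_IP before starting the shell, e.g. in <spark-path>/conf/spark-env.sh:

# <spark-path>/conf/spark-env.sh (or export it in the shell before running spark-shell)
# Placeholder address: replace with the routable IP of the interface Spark should bind to
export SPARK_LOCAL_IP=192.168.1.10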

Jacek Laskowski

These log messages are absolutely normal. Here, BoneCP tries to bind to a JDBC connection, which is why you receive these warnings. In any case, if you would like to manage the log records, you can specify the logging level by copying the <spark-path>/conf/log4j.properties.template file to <spark-path>/conf/log4j.properties and making your configuration there.
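For example, assuming the default log4j.properties.template that ships with Spark 1.x (whose first setting is the root logger), raising the root category from INFO to WARN hides the informational messages while keeping warnings and errors:

# <spark-path>/conf/log4j.properties (copied from log4j.properties.template)
# Raise the root logger level from INFO to WARN
log4j.rootCategory=WARN, console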

Lastly, a similar answer about the logging level can be found here: How to stop messages displaying on spark console?
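Alternatively, as the shell banner printed above already hints, you can change the level at runtime from the REPL:

scala> sc.setLogLevel("ERROR")   // accepts levels such as ALL, DEBUG, INFO, WARN, ERROR, OFF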

raschild

Adding to @Jacek Laskowski's answer, regarding the SPARK_LOCAL_IP warning:

15/11/30 11:43:54 WARN Utils: Your hostname, cliu-ubuntu resolves to a loopback address: 127.0.1.1; using xxx.xxx.xxx.xx (`here I hide my IP`) instead (on interface wlan0)
15/11/30 11:43:54 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address

I encountered the same thing running spark-shell on a standalone Spark cluster on an Ubuntu 20.04 server. As expected, setting the SPARK_LOCAL_IP environment variable to $(hostname) made the warning go away, but while the application ran without issues, the worker GUI was not reachable on port 4040.

To fix this, we had to set SPARK_LOCAL_HOSTNAME instead of SPARK_LOCAL_IP. With that, the warning was gone and the worker GUI became accessible through port 4040.
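For reference, a minimal sketch of the relevant line in <spark-path>/conf/spark-env.sh (assuming you want to advertise the machine's own hostname, as we did):

# <spark-path>/conf/spark-env.sh
# Advertise the machine's hostname rather than a raw (loopback) IP
export SPARK_LOCAL_HOSTNAME=$(hostname)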

I couldn't find information about this variable in the Spark documentation, but according to Spark's source code it is used to set a custom local machine URI: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/Utils.scala#L1058

valiano