I'm trying to use org.slf4j.Logger in Spark. If I write the code as follows, I get the compile error "non-static field cannot be referenced from a static context", because the main method is static but logger is not.

import org.apache.spark.api.java.*;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.broadcast.Broadcast;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class simpleApp {
    private final Logger logger = LoggerFactory.getLogger(getClass());

    public static void main(String[] args) {
        String logFile = "/user/beibei/zhaokai/spark_java/a.txt"; // Should be some file on your system
        SparkConf conf = new SparkConf().setAppName("Simple Application");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> logData = sc.textFile(logFile).cache();

        logger.info("loading graph from cache");

        long numAs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("a"); }
        }).count();

        long numBs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("t"); }
        }).count();

        System.out.println("Lines with a: " + numAs + ", lines with t: " + numBs);
    }
}

However, if I write it as follows, I get a different error:

Exception in thread "main" org.apache.spark.SparkException: Task not serializable

This is because the anonymous Function objects reference the enclosing simpleApp instance, which is not serializable.

import org.apache.spark.api.java.*;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.broadcast.Broadcast;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class simpleApp {
    private final Logger logger = LoggerFactory.getLogger(getClass());

    public static void main(String[] args) {
        new simpleApp().start();
    }

    private void start() {
        String logFile = "/path/a.txt"; // Should be some file on your system
        SparkConf conf = new SparkConf().setAppName("Simple Application");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> logData = sc.textFile(logFile).cache();

        logger.info("loading graph from cache");

        long numAs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("a"); }
        }).count();

        long numBs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("t"); }
        }).count();

        System.out.println("Lines with a: " + numAs + ", lines with t: " + numBs);
    }
}

So what am I supposed to do?
Will I run into the same problem if I use other libraries like org.slf4j.Logger?

Ram Ghadiyaram
user8262885
    And how about making ``logger`` a static member of class ``simpleApp``? e.g ``LoggerFactory.getLogger(simpleApp.class)``? – Mithfindel Nov 28 '18 at 15:25
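
The static-logger suggestion in the comment above can be sketched without Spark. This is only an illustration under stated assumptions: java.util.logging.Logger stands in for org.slf4j.Logger so the example runs with no dependencies, plain Java serialization stands in for what Spark does when it ships a task, and the names SerializableFunction, InstanceLoggerFilter, StaticLoggerFilter and canSerialize are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.logging.Logger;

public class LoggerSerializationSketch {

    // Hypothetical stand-in for Spark's serializable Function interface.
    interface SerializableFunction<T, R> extends Serializable {
        R call(T value);
    }

    // The logger is an instance field, so it is part of the serialized state.
    // java.util.logging.Logger is not Serializable, so writing this object fails.
    static class InstanceLoggerFilter implements SerializableFunction<String, Boolean> {
        private final Logger logger = Logger.getLogger("InstanceLoggerFilter");

        public Boolean call(String s) {
            logger.fine("checking " + s);
            return s.contains("a");
        }
    }

    // The logger is static, so it belongs to the class and is never serialized.
    static class StaticLoggerFilter implements SerializableFunction<String, Boolean> {
        private static final Logger LOGGER = Logger.getLogger("StaticLoggerFilter");

        public Boolean call(String s) {
            LOGGER.fine("checking " + s);
            return s.contains("a");
        }
    }

    // Returns true if the object survives Java serialization, false otherwise.
    static boolean canSerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (IOException e) { // includes NotSerializableException
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("instance logger serializable: " + canSerialize(new InstanceLoggerFilter()));
        System.out.println("static logger serializable:   " + canSerialize(new StaticLoggerFilter()));
    }
}
```

A static logger also removes the compile error from the first version of the question, since a static field can be referenced from main.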

1 Answer

There are several options available. One is org.apache.spark.internal.Logging, which is provided by Spark (version 2.2 or later).

Its doc comment says:

/**
 * Utility trait for classes that want to log data. Creates a SLF4J logger for the class and allows
 * logging messages at different levels using methods that only evaluate parameters lazily if the
 * log level is enabled.
 */

private def isLog4j12(): Boolean = {
  // This distinguishes the log4j 1.2 binding, currently
  // org.slf4j.impl.Log4jLoggerFactory, from the log4j 2.0 binding, currently
  // org.apache.logging.slf4j.Log4jLoggerFactory
  val binderClass = StaticLoggerBinder.getSingleton.getLoggerFactoryClassStr
  "org.slf4j.impl.Log4jLoggerFactory".equals(binderClass)
}

If you want to do the same thing yourself without using the Spark-provided API, you can mimic it.
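
One way to mimic it in Java is a transient logger field that is created lazily on first use, so it is skipped during serialization and rebuilt wherever the object is deserialized. The sketch below is a rough illustration, not Spark's implementation: java.util.logging.Logger is used in place of SLF4J so it runs without dependencies, and the names Logging, ContainsA and roundTrip are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.logging.Logger;

public class LazyLoggerSketch {

    // Hypothetical base class: the logger is transient, so it is never written
    // out, and it is recreated on demand after deserialization.
    abstract static class Logging implements Serializable {
        private transient Logger log;

        protected Logger log() {
            if (log == null) {
                log = Logger.getLogger(getClass().getName());
            }
            return log;
        }
    }

    // A filter-like function that logs through the inherited lazy logger.
    static class ContainsA extends Logging {
        public Boolean call(String s) {
            log().fine("checking " + s);
            return s.contains("a");
        }
    }

    // Serializes the value to bytes and reads it back, roughly what happens
    // when a task is shipped to another JVM.
    static <T> T roundTrip(T value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        ContainsA original = new ContainsA();
        original.call("spark"); // forces the logger to be created before serializing

        ContainsA copy = roundTrip(original);
        System.out.println(copy.call("spark")); // the copy rebuilds its own logger
    }
}
```

Compared with a static logger, this keeps the logger tied to the runtime class of each object, at the cost of a null check on every call.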

Note: with the above approach, use sc.setLogLevel(newLevel) to adjust the logging level. For SparkR, use setLogLevel(newLevel).

Also have a look at: apache-spark-logging-within-scala

Ram Ghadiyaram