I have the following worksheet in IntelliJ:
import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}
/** Lazily instantiated singleton instance of SQLContext */
object SQLContextSingleton {
@transient private var instance: SQLContext = _
def getInstance(sparkContext: SparkContext): SQLContext = {
if (instance == null) {
instance = new SQLContext(sparkContext)
}
instance
}
}
val conf = new SparkConf().
setAppName("Scala Wooksheet").
setMaster("local[*]")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
val df = sqlContext.read.json("/Users/someuser/some.json")
df.show
This code works in the REPL, but seems to run only the first time (with some other errors). Each subsequent time, the error is:
16/04/13 11:04:57 WARN SparkContext: Another SparkContext is being constructed (or threw an exception in its constructor). This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:
org.apache.spark.SparkContext.<init>(SparkContext.scala:82)
How can I find the context already in use?
Note: I hear others say to use conf.set("spark.driver.allowMultipleContexts","true")
but this seems to be a solution of increasing memory usage (like uncollected garbage).
Is there a better way?