
I am trying to run the following simple code using Spark within Eclipse:

import org.apache.spark.sql.SQLContext
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
object jsonreader {  
  def main(args: Array[String]): Unit = {
    println("Hello, world!")
    val conf = new SparkConf()
      .setAppName("TestJsonReader")
      .setMaster("local")
      .set("spark.driver.memory", "3g") 
    val sc = new SparkContext(conf)

    val sqlContext = new SQLContext(sc)
    val df = sqlContext.read.format("json").load("text.json")

    df.printSchema()
    df.show   
  }
}

However, I get the following error:

16/08/18 18:05:28 ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: System memory 259522560 must be at least 471859200. Please increase heap size using the --driver-memory option or spark.driver.memory in Spark configuration.

I followed different tutorials, like this one: How to set Apache Spark Executor memory. Most of the time they suggest either using the --driver-memory option (which is not possible from within Eclipse) or modifying the Spark configuration, but there is no corresponding file.

Does anyone have any idea how to solve this issue within the Eclipse environment?


7 Answers


In Eclipse, go to Run > Run Configurations... > Arguments > VM arguments and set the max heap size, e.g. -Xmx512m.
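This works because in local mode the driver runs inside the JVM that Eclipse launches, and Spark checks that JVM's maximum heap (roughly Runtime.getRuntime.maxMemory) against the 471859200-byte (~450 MB) minimum from the error message; setting spark.driver.memory in code comes too late to change the heap of an already-running JVM. Something along these lines in the VM arguments box should be enough (the exact value is just an example):

-Xmx1g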


I had this issue as well and this is how I solved it. Thought it might be helpful.

val conf: SparkConf = new SparkConf()
  .setMaster("local[4]")
  .setAppName("TestJsonReader")
  .set("spark.driver.host", "localhost")
conf.set("spark.testing.memory", "2147480000")

It worked fine for me once I modified the script with conf.set("spark.testing.memory", "2147480000").

Complete code below:

import scala.math.random
import org.apache.spark._

object SparkPi {
  def main(args: Array[String]) {
    val conf: SparkConf = new SparkConf()
      .setMaster("local")
      .setAppName("Spark Pi")
      .set("spark.driver.host", "localhost")

    conf.set("spark.testing.memory", "2147480000") // if you face any memory issues

    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow

    val count = spark.parallelize(1 until n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.reduce(_ + _)

    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}

Step-2

Run it as “Scala Application”

Step-3 Creating JAR file and Execution:

bin/spark-submit --class SparkPi --master local SparkPi.jar
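When launching through spark-submit like this, you can also raise the driver heap directly with the --driver-memory flag mentioned in the original error message, instead of relying on the spark.testing.memory workaround:

bin/spark-submit --class SparkPi --master local --driver-memory 3g SparkPi.jar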

In my case, mvn stopped packaging the project with the same exception (java.lang.IllegalArgumentException: System memory 259522560 must be at least 471859200.).

I started debugging this issue by changing the settings for the VM heap size: export MAVEN_OPTS="-Xms1024m -Xmx4096m -XX:PermSize=1024m". It did not work.

Then I tried adding the spark.driver.memory option, set to 1g, to the Spark config [SparkConfig.set("spark.driver.memory", "1g")].

In the end it turned out that my Java installation had somehow gotten messed up. I reinstalled the JDK (a newer version), set up the JAVA_HOME path again, and then everything worked from the terminal.

If you upgrade, then to use NetBeans/IntelliJ/Eclipse you need to configure the JDK setting in each of them to point to the new installation of the Java Development Kit.


I added .set("spark.testing.memory", "2147480000"), which allowed me to run the code.

SparkConf conf = new SparkConf()
        .setAppName("Text")
        .setMaster("local")
        .set("spark.testing.memory", "2147480000");
JavaSparkContext sparkContext = new JavaSparkContext(conf);
SQLContext sqlContext = new SQLContext(sparkContext);

You can set "spark.driver.memory" option by edit the "spark-defaults.conf" file in "${SPARK_HOME}/conf/", By default, there is no file called "spark-defaults.conf" in the directory of "${SPARK_HOME}/conf/", but there is a file "spark-defaults.conf.template", you can use the following command to create "spark-defaults.conf" file:

cp spark-defaults.conf.template spark-defaults.conf

then, edit it:

# Example:
# spark.master                     spark://master:7077
# spark.eventLog.enabled           true
# spark.eventLog.dir               hdfs://namenode:8021/directory
# spark.serializer                 org.apache.spark.serializer.KryoSerializer
# spark.driver.memory              5g
# spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"


spark.driver.memory              3g
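Note that spark-defaults.conf is read by the spark-submit and spark-shell launchers; an application started directly from Eclipse will not pick it up, so inside the IDE you still need one of the approaches above.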

You also need to increase spark.testing.memory if you are running locally:

spark.driver.memory, 571859200
spark.testing.memory, 2147480000
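For reference, a sketch of setting both values programmatically on the SparkConf (the numbers are taken verbatim from this answer):

val conf = new SparkConf()
  .setMaster("local")
  .setAppName("TestJsonReader")
  .set("spark.driver.memory", "571859200")
  .set("spark.testing.memory", "2147480000")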
