I have a problem installing Spark 1.4.1 with Scala 2.11.7 in IntelliJ IDEA 14.1.4. First of all: I downloaded the source code version. Should I install the version pre-built for Hadoop 2.4+ instead? What I did: I made a Maven project from the tgz file and saved it. Do I need to do more with it? The first lines of the pom.xml file are:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.apache</groupId>
    <artifactId>apache</artifactId>
    <version>14</version>
  </parent>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-parent_2.10</artifactId>
  <version>1.4.1</version>
  <packaging>pom</packaging>
  <name>Spark Project Parent POM</name>
  <url>http://spark.apache.org/</url>
  <licenses>
    <license>
      <name>Apache 2.0 License</name>
      <url>http://www.apache.org/licenses/LICENSE-2.0.html</url>
      <distribution>repo</distribution>
    </license>
  </licenses>
  <scm>
    <connection>scm:git:git@github.com:apache/spark.git</connection>
    <developerConnection>scm:git:https://git-wip-us.apache.org/repos/asf/spark.git</developerConnection>
    <url>scm:git:git@github.com:apache/spark.git</url>
    <tag>HEAD</tag>
  </scm>
  <developers>
    <developer>
      <id>matei</id>
      <name>Matei Zaharia</name>
      <email>matei.zaharia@gmail.com</email>
      <url>http://www.cs.berkeley.edu/~matei</url>
      <organization>Apache Software Foundation</organization>
      <organizationUrl>http://spark.apache.org</organizationUrl>
    </developer>
  </developers>

I tried to run Spark in a simple example with this build.sbt:

name := "hello"
version := "1.0"
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.spark" % "spark-parent_2.10" % "1.4.1"

But I get the error:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/08/27 11:14:03 INFO SparkContext: Running Spark version 1.4.1
15/08/27 11:14:06 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/08/27 11:14:07 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:368)
    at Hello$.main(Hello.scala:12)
    at Hello.main(Hello.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
15/08/27 11:14:07 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:368)
    at Hello$.main(Hello.scala:12)
    at Hello.main(Hello.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
15/08/27 11:14:07 INFO Utils: Shutdown hook called

My first idea would be that I need to install the prebuilt version of Spark instead. If I download that one, do I need to delete the other one and just do the same steps over again? Or is there another mistake? Thanks a lot for all the help :D

Giselle Van Dongen

1 Answer

I think your problem is related to the SparkContext initialization in your code.

You need to set the master URL for the SparkContext to connect to. For example:

val sparkConf = new SparkConf().setAppName("My Spark Job").setMaster("local")
val sparkContext = new SparkContext(sparkConf)

Where:

  • "local" - means run the job locally on the master node.
Avihoo Mamka
  • This solved half the problem; now I still get the warning that it couldn't load the native-hadoop library and the error: ERROR Shell: Failed to locate the winutils binary in the hadoop binary path java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries. – Giselle Van Dongen Aug 27 '15 at 11:14
  • This may help you: http://stackoverflow.com/questions/19620642/failed-to-locate-the-winutils-binary-in-the-hadoop-binary-path And also this: http://stackoverflow.com/questions/18630019/running-apache-hadoop-2-1-0-on-windows – Avihoo Mamka Aug 27 '15 at 11:37
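
The winutils error mentioned in the comments is specific to Hadoop's shell utilities on Windows. The workaround described in the linked questions is to download winutils.exe, put it in a bin directory, and point hadoop.home.dir (or the HADOOP_HOME environment variable) at the directory above it before the SparkContext is created. A sketch, assuming the hypothetical layout C:\hadoop\bin\winutils.exe:

import org.apache.spark.{SparkConf, SparkContext}

object Hello {
  def main(args: Array[String]): Unit = {
    // Assumed layout: C:\hadoop\bin\winutils.exe; adjust the path to your machine
    System.setProperty("hadoop.home.dir", "C:\\hadoop")

    val sparkConf = new SparkConf().setAppName("My Spark Job").setMaster("local")
    val sparkContext = new SparkContext(sparkConf)
    // ... rest of the job ...
    sparkContext.stop()
  }
}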