I am currently running a Java application that uses Spark.
Everything works fine except during the initialization of the SparkContext. At that point, Spark tries to discover Hadoop on my system and throws an error, as I don't have AND I DON'T WANT to install Hadoop:
2018-06-20 10:00:27.496 ERROR 4432 --- [ main] org.apache.hadoop.util.Shell : Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
Here is my SparkConf:
SparkConf cfg = new SparkConf();
cfg.setAppName("ScalaPython")
   .setMaster("local")
   .set("spark.executor.instances", "2");
return cfg;
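
For context, here is roughly how that configuration is used to create the context (a minimal sketch; the SparkContextFactory class and the main method are just illustrative wiring, not my actual code):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkContextFactory {

    // Builds the local configuration shown above.
    public static SparkConf buildConf() {
        return new SparkConf()
                .setAppName("ScalaPython")
                .setMaster("local")
                .set("spark.executor.instances", "2");
    }

    public static void main(String[] args) {
        // The winutils error is logged while this constructor runs,
        // but the context still comes up and is usable afterwards.
        try (JavaSparkContext sc = new JavaSparkContext(buildConf())) {
            System.out.println("Spark version: " + sc.version());
        }
    }
}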
My Spark dependencies:

<!-- Spark dependencies -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
        <exclusion>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>${spark.version}</version>
</dependency>
<dependency>
    <groupId>org.datasyslab</groupId>
    <artifactId>geospark_2.3</artifactId>
    <version>1.1.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.datasyslab</groupId>
    <artifactId>geospark-sql_2.3</artifactId>
    <version>1.1.0</version>
</dependency>
So is there a way to disable Hadoop discovery programmatically (i.e., by giving the SparkConf a specific property), given that this error doesn't block SparkContext creation (I can still use Spark functionality)?
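
To make the idea concrete, what I have in mind is something along these lines (the property name below is purely hypothetical; I'm asking whether a real equivalent exists):

SparkConf cfg = new SparkConf();
cfg.setAppName("ScalaPython")
   .setMaster("local")
   .set("spark.executor.instances", "2")
   // Hypothetical property -- not a real Spark setting, it just
   // illustrates the kind of switch I'm looking for.
   .set("spark.hadoop.discovery.enabled", "false");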
N.B. It's for testing purposes.
Thanks for your answers!