I got the following error when starting the spark-shell. I'm going to use Spark to process data in SQL Server. Can I ignore the errors?

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState'

Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':

Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog'

Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog'

Caused by: java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: (null) entry in command string: null ls -F C:\tmp\hive

Caused by: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: (null) entry in command string: null ls -F C:\tmp\hive

Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: (null) entry in command string: null ls -F C:\tmp\hive

ca9163d9
  • Possible duplicate of [java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries. spark Eclipse on windows 7](http://stackoverflow.com/questions/35652665/java-io-ioexception-could-not-locate-executable-null-bin-winutils-exe-in-the-ha) – T. Gawęda May 05 '17 at 19:15

1 Answer

tl;dr You'd rather not.

Well, ignoring them may be possible, but given that you've just started your journey into Spark land, the effort would not pay off.


Windows has never been a developer-friendly OS to me, and whenever I teach Spark to people who use Windows, I take it for granted that we'll have to go through the winutils.exe setup and, many times, also how to work on the command line.


Please install winutils.exe as follows:

  1. Run cmd as administrator
  2. Download winutils.exe binary from https://github.com/steveloughran/winutils repository (use hadoop-2.7.1 for Spark 2)
  3. Save winutils.exe binary to a directory of your choice, e.g. c:\hadoop\bin
  4. Set HADOOP_HOME to reflect the directory with winutils.exe (without bin), e.g. set HADOOP_HOME=c:\hadoop
  5. Set PATH environment variable to include %HADOOP_HOME%\bin
  6. Create c:\tmp\hive directory
  7. Execute winutils.exe chmod -R 777 \tmp\hive
  8. Open spark-shell and run spark.range(1).show to see a one-row dataset.
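
Steps 4 to 7 could be typed into the administrator cmd window roughly as follows. This is only a sketch: it assumes winutils.exe was saved to c:\hadoop\bin (the example directory from step 3) and that the current drive is C:, so adjust the paths to your own setup. The final ls line is an optional check added here, not one of the original steps; it is the same command that fails in the stack trace from the question.

```bat
:: Assumption: winutils.exe was saved to c:\hadoop\bin (step 3).
:: Step 4: HADOOP_HOME points at the parent of bin, not at bin itself.
set HADOOP_HOME=c:\hadoop

:: Step 5: make winutils.exe reachable from PATH.
set PATH=%PATH%;%HADOOP_HOME%\bin

:: Step 6: create the directory named in the error message.
mkdir c:\tmp\hive

:: Step 7: open up the permissions on that directory.
:: \tmp\hive resolves against the current drive, assumed to be C: here.
winutils.exe chmod -R 777 \tmp\hive

:: Optional check (not part of the original steps): list the directory
:: with the command Hadoop was trying to run in the stack trace.
winutils.exe ls -F \tmp\hive
```

Note that set only changes variables for the current cmd window, so spark-shell has to be started from that same window unless the variables are also defined in the system settings.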
Jacek Laskowski
  • Thanks, it works! Unfortunately, our IT doesn't provide Linux development PCs. These steps should be in the official documentation. – ca9163d9 May 05 '17 at 21:25