51

I am running Spark on Windows 7. When I use Hive, I see the following error:

The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw- 

The permissions are set as follows:

C:\tmp>ls -la
total 20
drwxr-xr-x    1 ADMIN Administ        0 Dec 10 13:06 .
drwxr-xr-x    1 ADMIN Administ    28672 Dec 10 09:53 ..
drwxr-xr-x    2 ADMIN Administ        0 Dec 10 12:22 hive

I have set "full control" to all users from Windows->properties->security->Advanced.

But I still see the same error. I have checked a bunch of links; some say this is a bug in Spark 1.5?

OneCricketeer
user1384205

17 Answers

88

First of all, make sure you are using the correct winutils for your OS. The next step is permissions.
On Windows, you need to run the following command in cmd:

D:\winutils\bin\winutils.exe chmod 777 D:\tmp\hive

This assumes you have already downloaded winutils and set the HADOOP_HOME variable.
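
If setting the environment variable system-wide is not convenient, the same effect can be had from code before the first Hadoop/Spark call. A minimal sketch in Scala, assuming winutils.exe sits in D:\winutils\bin (the path is only a placeholder):

    // In the Hadoop versions I have seen, the Shell helper checks the hadoop.home.dir
    // system property before the HADOOP_HOME environment variable, so setting it early
    // has the same effect. "D:/winutils" is a placeholder folder containing bin\winutils.exe.
    System.setProperty("hadoop.home.dir", "D:/winutils")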

logi-kal
Nishu Tayal
  • @user1384205 : If you think it helped you, can you accept the answer, so that it will help others – Nishu Tayal Dec 21 '16 at 12:03
  • It matters if you are using the 64-bit or 32-bit version of winutils. I ran this repeatedly and it looked like permissions were set correctly (via `winutils ls`) but only when I replaced winutils.exe did it work. – wrschneider Jul 13 '17 at 15:18
  • This solved the issue. Originally, I had the wrong winutils though. I'm using Windows 10, 64-bit, and the winutils.exe at this location [https://osdn.net/projects/win-hadoop/downloads/62852/hadoop-winutils-2.6.0.zip/] works for me – Partha Mishra Aug 03 '17 at 10:13
  • @ParthaMishra File not found on the link you provided – hshihab Aug 27 '17 at 13:46
  • You're right, it seems the file has been moved. The only way out now is to search the web. The size is ~106 KB; please check it before using. There are other ones of smaller sizes and those don't work. – Partha Mishra Aug 28 '17 at 11:31
  • The 64-bit version of winutils can be found at https://codeload.github.com/gvreddy1210/64bit/zip/master – Nilav Baran Ghosh Jan 27 '18 at 17:39
  • I am trying Spark on Windows and I got the same error. I downloaded the zip of Spark prebuilt for Hadoop from [here](https://spark.apache.org/downloads.html) and extracted it in the D: drive, then created `hadoop\bin` directories inside the extracted directory and put winutils.exe in `hadoop\bin`. I ran the Python/Spark hello world example and got this error. The point is: where is `\tmp\hive` located / configured? – Mahesha999 Mar 23 '18 at 09:22
  • @user1384205 : Please don't forget to accept the answer if it helps you. That makes it easy for other people to find the solution :) – Nishu Tayal Mar 07 '19 at 14:16
40

First things first, check your computer domain. Try:

c:\work\hadoop-2.2\bin\winutils.exe ls c:/tmp/hive

If this command says access denied or FindFileOwnerAndPermission error (1789): The trust relationship between this workstation and the primary domain failed,

it means your computer's domain controller is not reachable. A possible reason is that you are not on the same VPN as your domain controller. Connect to the VPN and try again.

Now try the solution provided by Viktor or Nishu.

Aaditya Raj
  • This worked for me. I had to connect to a VPN to get onto my domain controller's network, wait for some time, and then run the chmod 777 command. – Mahesh Jan 13 '17 at 06:34
  • I was doing the %HADOOP_HOME%\bin\winutils.exe chmod 777 c:\tmp\hive over and over again, and was getting confused why the permissions were not being set properly, this was an absolute lifesaver. – D.S. Mar 05 '17 at 23:35
  • This FindFileOwnerAndPermission error may be caused by the domain controller not being reachable if you are connected to a given AD domain. In my case, I connected to the VPN so the domain controller was reachable, and it worked for me. Thank you @Aaditya – zaki benz Jun 17 '20 at 11:41
  • @Aaditya - Absolute life saver. Curious to know how did you get this answer? – Anand Aug 04 '20 at 11:58
12

You need to set this directory's permissions on HDFS, not your local filesystem. /tmp doesn't mean C:\tmp unless you set fs.defaultFS in core-site.xml to file://c:/, which is probably a bad idea.

Check it using

hdfs dfs -ls /tmp 

Set it using

hdfs dfs -chmod 777 /tmp/hive
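
To see which filesystem /tmp actually resolves to, you can ask Spark for its default filesystem. A quick check from a Spark 2.x spark-shell (just a sketch; fs.defaultFS is the standard Hadoop property name):

    // "file:///" means the local disk (so C:\tmp\hive on Windows),
    // "hdfs://..." means the /tmp/hive path above lives on HDFS.
    spark.sparkContext.hadoopConfiguration.get("fs.defaultFS")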
OneCricketeer
11

The next solution worked on Windows for me:

  • First, I defined HADOOP_HOME. It is described in detail here
  • Next, I did as Nishu Tayal suggested, but with one difference: C:\temp\hadoop\bin\winutils.exe chmod 777 \tmp\hive

\tmp\hive is not a local directory
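
If you want that chmod to run automatically at the start of a job rather than by hand, it can be invoked from code. A sketch using scala.sys.process, assuming winutils.exe is at C:\temp\hadoop\bin as in the command above:

    import scala.sys.process._

    // Placeholder path -- point this at your own winutils.exe.
    val winutils = "C:\\temp\\hadoop\\bin\\winutils.exe"
    // Runs `winutils.exe chmod 777 \tmp\hive` and returns the process exit code (0 = success).
    val exitCode = Seq(winutils, "chmod", "777", "\\tmp\\hive").!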

L. Viktor
  • Thanks Viktor. But I get the following error when I try your solution. C:\Programs\winutil\bin>winutils.exe chmod 777 \tmp\hive ChangeFileModeByMask error (5): Access is denied. – user1384205 Jun 07 '16 at 07:23
  • You need to run the command prompt or cygwin as Administrator for the command to work – Eduardo Sanchez-Ros Jan 20 '17 at 13:11
9

Error while starting spark-shell on a VM running on Windows. Error msg: The root scratch dir: /tmp/hive on HDFS should be writable. Permission denied

Solution: /tmp/hive is a temporary directory. Only temporary files are kept in this location. There is no problem even if we delete this directory; it will be created when required with the proper permissions.

Step 1) In HDFS, remove the /tmp/hive directory ==> hdfs dfs -rm -r /tmp/hive

Step 2) At the OS level too, delete the dir /tmp/hive ==> rm -rf /tmp/hive

After this, I started spark-shell and it worked fine.

SNK
6

This is a simple 4-step process:

For Spark 2.0+:

  1. Download Hadoop for Windows / Winutils
  2. Add this to your code (before SparkSession initialization):

    // getOS() in the original was pseudocode; checking the os.name system property
    // does the same job in plain Java/Scala.
    if (System.getProperty("os.name").toLowerCase().startsWith("windows")) {
        System.setProperty("hadoop.home.dir", "C:/Users//winutils-master/hadoop-2.7.1");
    }
    
  3. Add this to your SparkSession builder (you can change it to C:/Temp instead of Desktop); a combined sketch of steps 2-3 follows this list.

    .config("hive.exec.scratchdir","C:/Users//Desktop/tmphive")
    
  4. Open cmd.exe and run:

    "path\to\hadoop-2.7.1\bin\winutils.exe" chmod 777 C:\Users\\Desktop\tmphive
    
Abhinandan Dubey
2

The main reason is that you started Spark from the wrong directory. Please create the folder D:\tmp\hive (give it full permissions) and start Spark from the D: drive: D:\> spark-shell

Now it will work. :)
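
To see which working directory (and therefore which drive) spark-shell was actually started from, a quick check from the Scala prompt (just a convenience sketch):

    // Prints the directory the JVM was launched from; according to this answer,
    // the \tmp\hive scratch dir ends up on that drive.
    println(System.getProperty("user.dir"))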

1

Can you please try giving 777 permission to the folder /tmp/hive? I think Spark runs as an anonymous user (which falls into the "other" user category), and this permission should be recursive. I had the same issue with the 1.5.1 version of Spark for Hive, and it worked by giving 777 permission using the below command on Linux:

chmod -R 777 /tmp/hive
Reena Upadhyay
  • Thanks. I tried that. Didn't work. Did you do this on Windows? – user1384205 Dec 10 '15 at 12:18
  • The question is about this problem on a Windows environment, not Unix. And even then, the chmod will not work all by itself under a root-level /tmp. This is the correct command for Unix: sudo chmod -R 777 /tmp/hive – nitinr708 Jul 27 '17 at 08:18
1

There is a bug in the Spark Jira for the same issue. It was resolved a few days back. Here is the link:

https://issues.apache.org/jira/browse/SPARK-10528

The comments there list all the options, but no guaranteed solution.

sunil
1

The issue is resolved in Spark version 2.0.2 (Nov 14, 2016). Use this version. The 2.1.0 release (Dec 28, 2016) has the same issue.

  • For me on a windows machine, this was the answer which pushed me in the correct direction. I also got another error but could solve this error with the help of the second answer of the following [post](http://stackoverflow.com/questions/38863003/sparkr-from-rstudio-gives-error-in-invokejavaisstatic-true-classname-meth) – Tobias Apr 05 '17 at 12:48
1

I also faced this issue. It is related to the network. I installed Spark on Windows 7 using a particular domain.

The domain name can be checked here:

Start -> computer -> Right click -> Properties -> Computer name, domain and workgroup settings -> click on change -> Computer Name (Tab) -> Click on Change -> Domain name.

When I run the spark-shell command on that domain, it works fine, without any error.

On other networks I received the write permission error. To avoid this error, run the Spark command on the domain specified in the above path.

Nishu Tayal
  • You said it yourself it worked for a different user / domain. Which means it's a permission issue not network. Thank you for an honest attempt – nitinr708 Jul 27 '17 at 08:20
1

Use the latest version of "winutils.exe" and try. https://github.com/steveloughran/winutils/blob/master/hadoop-2.7.1/bin/winutils.exe

Nagaraj Vittal
1

I was getting the same error "The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-" on Windows 7. Here is what I did to fix the issue:

  1. I had installed Spark under C:\Program Files (x86)..., so it was looking for /tmp/hive under C:, i.e., C:\tmp\hive
  2. I downloaded WinUtils.exe from https://github.com/steveloughran/winutils. I chose the same version as the Hadoop package I chose when I installed Spark, i.e., hadoop-2.7.1 (you can find it under the bin folder, i.e., https://github.com/steveloughran/winutils/tree/master/hadoop-2.7.1/bin)
  3. Now use the following command to make the C:\tmp\hive folder writable: winutils.exe chmod 777 \tmp\hive

Note: With a previous version of winutils, the chmod command also set the required permission without error, but Spark still complained that the /tmp/hive folder was not writable.

Vid123
0

Using the correct version of winutils.exe did the trick for me. The winutils should be from the version of Hadoop that Spark has been pre-built for.

Set the HADOOP_HOME environment variable so that winutils.exe sits under its bin folder. I have stored winutils.exe along with the C:\Spark\bin files, so now my SPARK_HOME and HADOOP_HOME point to the same location, C:\Spark.

Now that winutils has been added to the path, give permissions for the hive folder using winutils.exe chmod 777 C:\tmp\hive

Harsha
0

You don't have to fix the permissions of the /tmp/hive directory yourself (like some of the answers suggest). winutils can do that for you. Download the appropriate version of winutils from https://github.com/steveloughran/winutils and move it to Spark's bin directory (e.g., C:\opt\spark\spark-2.2.0-bin-hadoop2.6\bin). That will fix it.

pedram bashiri
0

I was running a Spark test from IDEA, and in my case the issue was the wrong winutils.exe version. I think you need to match it with your Hadoop version. You can find winutils.exe here

gorros
0
/*
Spark and hive on windows environment
Error: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-

Pre-requisites: Have winutils.exe placed in c:\winutils\bin\
Resolve as follows:
*/

C:\user>c:\Winutils\bin\winutils.exe ls
FindFileOwnerAndPermission error (1789): The trust relationship between this workstation and the primary domain failed.
// Make sure you are connected to the domain controller, in my case I had to connect using VPN

C:\user>c:\Winutils\bin\winutils.exe ls c:\user\hive
drwx------ 1 BUILTIN\Administrators PANTAIHQ\Domain Users 0 Aug 30 2017 c:\user\hive

C:\user>c:\Winutils\bin\winutils.exe chmod 777 c:\user\hive

C:\user>c:\Winutils\bin\winutils.exe ls c:\user\hive
drwxrwxrwx 1 BUILTIN\Administrators PANTAIHQ\Domain Users 0 Aug 30 2017 c:\user\hive
   
Chandan Gawri