I am just trying to get Hadoop running in standalone mode on my laptop, which runs 64-bit Windows 7. I've installed Cygwin 1.7 in the default folder (c:\cygwin). I have the latest JDK in the folder c:\jdk1.7.0_03 and have set the JAVA_HOME environment variable.

I run the following command from a Cygwin prompt:

$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'

Here's the error I get:

12/03/17 19:08:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
java.io.IOException: Failed to set permissions of path: \tmp\hadoop-ehtzrhf\mapred\staging\ehtzrhf837602798\.staging to 0700
        at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:682)
        at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:655)
        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:484)
        at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:319)
        at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
        at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:848)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:842)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:842)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:816)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1253)
        at org.apache.hadoop.examples.Grep.run(Grep.java:69)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.examples.Grep.main(Grep.java:93)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

I've tried both Hadoop 1.0.1 and hadoop-0.20.205.0 and get the same error with each. I've updated my .bashrc with:

export TMP=/cygdrive/c/temp
export TEMP=/cygdrive/c/temp

I've also added the Cygwin bin folder to the path:

export PATH=.:/cygdrive/c/cygwin/bin:$HADOOP_INSTALL/bin

I also find it very odd that it shows the path as \tmp... instead of /tmp/...

Short of recompiling or running a Linux VM, any ideas?

dplante

3 Answers

Here is a simple-to-use workaround that doesn't require any yak shaving:

https://github.com/congainc/patch-hadoop_7682-1.0.x-win

toddfast

Fixed (major yak shaving):

https://issues.apache.org/jira/browse/HADOOP-7682?focusedCommentId=13236645#comment-13236645

FKorning
  • Yep, thanks for posting it to the JIRA as well. I fixed it for myself by building a custom version of 1.0.0 in which the return values of the file methods are not checked. The check happens in `FileUtil.setPermission`; I just removed all the lines with `checkReturnValue`. – Thomas Jungblut Mar 23 '12 at 18:26
  • Dave Latham has patched setPermission(): https://issues.apache.org/jira/browse/HADOOP-7682 – FKorning Apr 29 '12 at 20:10
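
For anyone wondering what "not checking the return value" amounts to, here is a rough standalone sketch of the idea using only java.io.File. It is not the Hadoop source and not the actual patch: the class name `LenientPermissions` and its method are made up for illustration, and group bits are ignored for brevity. The assumption is that on Windows some of the `File.setReadable`/`setWritable`/`setExecutable` calls return false (for example, when asked to remove read permission), and that treating this as a warning rather than throwing an IOException is enough to let the job continue.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, for illustration only; not part of Hadoop.
public class LenientPermissions {

    /**
     * Tries to apply the owner and "other" rwx bits of an octal mode
     * (e.g. 0700) via java.io.File. Returns false and logs a warning if
     * any call fails, instead of throwing an IOException.
     */
    public static boolean setPermission(File f, int mode) {
        int user = (mode >> 6) & 7;
        int other = mode & 7;
        boolean ok = true;
        // Set each bit for everyone first, then override it for the owner.
        ok &= f.setReadable((other & 4) != 0, false);
        ok &= f.setReadable((user & 4) != 0, true);
        ok &= f.setWritable((other & 2) != 0, false);
        ok &= f.setWritable((user & 2) != 0, true);
        ok &= f.setExecutable((other & 1) != 0, false);
        ok &= f.setExecutable((user & 1) != 0, true);
        if (!ok) {
            // A strict version would throw here; this one only warns.
            System.err.println("WARN: could not fully set permissions of "
                    + f + " to " + String.format("%04o", mode));
        }
        return ok;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("staging", ".tmp");
        f.deleteOnExit();
        System.out.println("all permission calls succeeded: "
                + setPermission(f, 0700));
    }
}
```

The trade-off is that the staging directory may end up with looser permissions than 0700, which is probably acceptable on a single-user laptop but not something to carry over to a shared cluster.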

I've managed to get this working to the point where jobs are dispatched, tasks executed, and results compiled.

However, we still need to get the servlets to understand Cygwin symlinks. I have no idea how to do this in Jetty.

These two links show how to allow Tomcat and Jetty to follow symlinks, but I don't know if this works in Cygwin:

* http://www.lamoree.com/machblog/index.cfm?event=showEntry&entryId=A2F0ED76-A500-41A6-A1DFDE0D1996F925
* Configure Symlinks for single directory in Tomcat

Otherwise, we'll have to open up the Jetty code and replace java.io.File with org.apache.hadoop.fs.LinkedFile.

FKorning