
When I try to run my Job I am getting the following exception:

Exception in thread "main" java.io.IOException: Mkdirs failed to create /some/path
    at org.apache.hadoop.util.RunJar.ensureDirectory(RunJar.java:106)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:150)

where /some/path is hadoop.tmp.dir. However, when I issue dfs -ls on /some/path I can see that it exists and that the dataset file is present (it was copied there before launching the job). The path is also correctly defined in the Hadoop configs. Any suggestions would be appreciated. I am using Hadoop 0.21.

lastr2d2
alien01

8 Answers


Just ran into this problem running mahout from CDH4 in standalone mode on my MacBook Air.

The issue is that a /tmp/hadoop-xxx/xxx/LICENSE file and a /tmp/hadoop-xxx/xxx/license directory are being created on a case-insensitive file system when unjarring the mahout jobs.
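You can confirm that the filesystem backing the temp directory is case-insensitive with a quick probe (a sketch using a scratch directory under the default temp location; adjust the path to your hadoop.tmp.dir):

```shell
# Create LICENSE in a scratch directory, then check whether the name
# "license" resolves to the same entry (case-insensitive filesystem) or not.
dir=$(mktemp -d)
touch "$dir/LICENSE"
if [ -e "$dir/license" ]; then
    echo "case-insensitive filesystem"
else
    echo "case-sensitive filesystem"
fi
rm -rf "$dir"
```

On a default macOS volume this prints "case-insensitive filesystem", which is exactly the condition that makes the LICENSE file collide with the license directory during unjarring.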

I was able to work around this by deleting META-INF/LICENSE from the jar file like this:

zip -d mahout-examples-0.6-cdh4.0.0-job.jar META-INF/LICENSE

and then verified it with

jar tvf mahout-examples-0.6-cdh4.0.0-job.jar | grep -i license
starball
Todd Nemet
  • Wow, thank you very much, that was also my problem. Just to be a bit clearer: "mahout-examples-0.6-cdh4.0.0-job.jar" is the MapReduce job to run. So the general case is: zip -d <your-job-jar> META-INF/LICENSE – JohnCastle Dec 04 '12 at 20:12
  • I keep finding this solution online but it doesn't work for me! After running the zip and jar commands above I still get: Exception in thread "main" java.io.IOException: Mkdirs failed to create /var/folders/9y/4dzrwg8n45z7fbhmlqc7bsgc0000gn/T/hadoop-unjar5690365448328571882/license – alex9311 Jun 19 '15 at 15:52
  • 2
    @alex9311 I had the same issue as yours and I used this to solve it: `zip -d examples.jar LICENSE` – mbbce Oct 20 '15 at 13:13

The problem is OSX-specific: by default the filesystem on a Mac is case-preserving but case-insensitive (which, in my opinion, is very bad).

A hack to circumvent this is to create a case-sensitive .dmg disk image with Disk Utility and mount it where you need it (i.e. hadoop.tmp.dir or /tmp) with the following command (as a superuser):

sudo hdiutil attach -mountpoint /tmp <my_image>.dmg
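If you prefer the command line over Disk Utility, the image itself can also be created with hdiutil (a sketch; the size and volume name are examples, and the command only applies on macOS):

```shell
# Create a 1 GB case-sensitive disk image (macOS only; size and volname are
# illustrative). The resulting .dmg can then be mounted with hdiutil attach.
if command -v hdiutil >/dev/null 2>&1; then
    hdiutil create -size 1g -fs "Case-sensitive Journaled HFS+" \
        -volname hadooptmp hadooptmp.dmg
else
    echo "hdiutil not found (this step only applies on macOS)"
fi
```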

I hope it helps.

ngrislain

This is a directory on the local disk that is being created (to unpack your job jar into), not in HDFS. Check that you have permission to create this directory (try it from the command line).
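As a quick sanity check (the path below is just an example; use the parent directory from your Mkdirs error):

```shell
# Try to create a subdirectory where Hadoop unpacks the job jar.
# Replace /tmp with the parent directory from the stack trace.
target="/tmp/hadoop-unjar-probe-$$"
if mkdir -p "$target"; then
    echo "mkdir ok: $target"
    rmdir "$target"
else
    echo "mkdir failed: check ownership and permissions of the parent directory"
fi
```

If mkdir fails here too, the fix is an OS-level permissions problem (chown/chmod on the parent), not a Hadoop configuration problem.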

Chris White

I ran into this issue several times in the past; I believe it is a Mac-specific issue. Since I use Maven to build my project, I was able to get around it by adding this to my Maven pom.xml:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.0</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer">
                    </transformer>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>
RATabora

In my case, the following configuration in the pom.xml of my Maven project worked on a Mac.

  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.0</version>
    <configuration>
      <shadedArtifactAttached>true</shadedArtifactAttached>
    </configuration>
    <executions>
      <execution>
        <phase>package</phase>
        <goals>
          <goal>shade</goal>
        </goals>
          <configuration>
            <filters>
              <filter>
                <artifact>*:*</artifact>
                <excludes>
                  <exclude>META-INF/*.SF</exclude>
                  <exclude>META-INF/*.DSA</exclude>
                  <exclude>META-INF/*.RSA</exclude>
                  <exclude>META-INF/LICENSE*</exclude>
                  <exclude>license/*</exclude>
                </excludes>
              </filter>
            </filters>
        </configuration>
      </execution>
    </executions>
  </plugin>
Garry
  • Make sure you are using the "maven-shade-plugin" plugin; the assembly plugin doesn't support filters in its configuration. There may be another way to define filters in the assembly plugin, but this works for me and has been verified by others as well. – Garry Sep 28 '15 at 12:54

Check that the required disk space is available. This problem often occurs because the disk is full.
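A quick way to check (using /tmp as an example; point it at the partition holding hadoop.tmp.dir):

```shell
# Show free space on the filesystem holding the temp directory.
df -h /tmp
# -P (POSIX) output has fixed columns; column 5 is the use percentage.
df -P /tmp | awk 'NR==2 {print "use%:", $5}'
```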

Kumar Basapuram

I ran into this same issue while building MapReduce jobs on a Mac with macOS Sierra. The same code runs without problems on Ubuntu Linux (14.04 LTS and 16.04 LTS). The MapReduce distribution was 2.7.3, configured for single-node, standalone operation. The problem appears to be related to copying license files into a META-INF directory. My problem was solved by adding a transformer to the Maven Shade plugin configuration, specifically the ApacheLicenseResourceTransformer.

Here is the relevant section of the pom.xml, which goes in the <build> section:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.0.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>path.to.your.main.class.goes.here</mainClass>
          </transformer>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer">
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>

Notice that I also use the ManifestResourceTransformer to specify the main class for the MapReduce job.

Manuel

In my case, I just renamed the input file "log_test.txt", because the OS (Ubuntu) was trying to create a directory with the same name: "log_test.txt/__results.json".

Ammar Bozorgvar