
Summary

A stock hadoop-2.6.0 install gives me `No FileSystem for scheme: s3n`. Adding hadoop-aws.jar to the classpath now gives me `ClassNotFoundException: org.apache.hadoop.fs.s3a.S3AFileSystem`.

Details

I've got a mostly stock install of hadoop-2.6.0. The only changes I've made are the directories and the following environment variables:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64/jre
export HADOOP_COMMON_HOME=/opt/hadoop
export HADOOP_HOME=$HADOOP_COMMON_HOME
export HADOOP_HDFS_HOME=$HADOOP_COMMON_HOME
export HADOOP_MAPRED_HOME=$HADOOP_COMMON_HOME
export HADOOP_OPTS=-XX:-PrintWarnings
export PATH=$PATH:$HADOOP_COMMON_HOME/bin

The hadoop classpath is:

/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/share/hadoop/yarn/*:/opt/hadoop/share/hadoop/mapreduce/lib/*:/opt/hadoop/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar:/opt/hadoop/share/hadoop/tools/lib/*

When I try to run `hadoop distcp -update hdfs:///files/to/backup s3n://${S3KEY}:${S3SECRET}@bucket/files/to/backup`, I get `java.io.IOException: No FileSystem for scheme: s3n`. If I use s3a, I get the same error complaining about s3a.

The internet told me that hadoop-aws.jar is not part of the classpath by default. I added the following line to /opt/hadoop/etc/hadoop/hadoop-env.sh:

HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_COMMON_HOME/share/hadoop/tools/lib/*

and now hadoop classpath has the following appended to it:

:/opt/hadoop/share/hadoop/tools/lib/*

which should cover /opt/hadoop/share/hadoop/tools/lib/hadoop-aws-2.6.0.jar. Now I get:

Caused by: java.lang.ClassNotFoundException:
Class org.apache.hadoop.fs.s3a.S3AFileSystem not found

The jar file contains the class that can't be found:

unzip -l /opt/hadoop/share/hadoop/tools/lib/hadoop-aws-2.6.0.jar | grep S3AFileSystem
28349  2014-11-13 21:20   org/apache/hadoop/fs/s3a/S3AFileSystem.class
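
As a sanity check, the effective client classpath can be expanded and searched (assuming this hadoop build supports the --glob flag):

hadoop classpath --glob | tr ':' '\n' | grep aws

If hadoop-aws-2.6.0.jar shows up there, the jar is at least visible to the client.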

Is there an order to adding these jars, or am I missing something else critical?

Steve Armstrong

3 Answers

Working from Abhishek's comment on his answer, the only change I needed to make was to mapred-site.xml:

<property>
  <!-- Add to the classpath used when running an M/R job -->
  <name>mapreduce.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/tools/lib/*</value>
</property>

No changes were needed to any other XML or .sh files.
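
To verify the property is actually being picked up, you can query the live configuration (a quick check, assuming the hdfs command is on your PATH):

hdfs getconf -confKey mapreduce.application.classpath

It should echo back the value set above.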

Steve Armstrong

You can resolve the s3n issue by adding the following property to core-site.xml:

<property>
  <name>fs.s3n.impl</name>
  <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
  <description>The FileSystem for s3n: (Native S3) uris.</description>
</property>

It should work after adding that property.
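
If you hit the same error with the s3a scheme, the analogous mapping (same hadoop-aws jar, different implementation class) would be:

<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  <description>The FileSystem for s3a: uris.</description>
</property>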

Edit: If that doesn't resolve the problem, you will have to add the jars to the classpath. Can you check whether mapred-site.xml has mapreduce.application.classpath set to /usr/hdp//hadoop-mapreduce/*? It will include the other related jars in the classpath :)

Abhishek
  • Running a `distcp` with an `s3n` url, I get `java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3native.NativeS3FileSystem not found` even though that exact class is inside the `hadoop-aws-2.6.0.jar` – Steve Armstrong May 08 '15 at 18:36
  • You will have to add the jars to the classpath. Can you check if mapred-site.xml has **mapreduce.application.classpath**: /usr/hdp//hadoop-mapreduce/*. It will include other related jars in the classpath – Abhishek May 09 '15 at 04:09
  • Abhishek, looks like **mapreduce.application.classpath** was the right path (and the only change needed). If you post/edit an answer with that, I'll accept it and delete mine. – Steve Armstrong May 11 '15 at 19:18
  • @SteveArmstrong Edited & Added comment to my answer :) – Abhishek May 12 '15 at 05:24

In current Hadoop (3.1.1), this approach no longer works. You can fix it by uncommenting the HADOOP_OPTIONAL_TOOLS line in etc/hadoop/hadoop-env.sh. Among other tools, this enables the hadoop-aws library.
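
For example, in etc/hadoop/hadoop-env.sh (the value is a comma-separated list of the optional tools you need; hadoop-aws is the relevant entry here):

export HADOOP_OPTIONAL_TOOLS="hadoop-aws"

New hadoop commands started after this change will have the s3a classes on their classpath.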

Pavla