I am trying to run a Mahout project that I wrote in Eclipse using the Mahout and Hadoop libraries. It loads a dataset and runs the FPGrowth algorithm. I set up the following Run configuration to launch the project:
mvn exec:java -Dexec.mainClass=com.patternmatching.RecommendApp.TopPatternMatches
When I run the program, I get the following warning:
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
I researched this issue and learned that the native Hadoop libraries have to be either compiled from source or downloaded from Apache (see: Hadoop "Unable to load native-hadoop library for your platform" warning). I downloaded the libraries onto a Cloudera Quickstart VM, on which I also set up Mahout, Maven, and my project package. Running the project on the Cloudera VM produces the same warning. I also ran the hadoop checknative -a command, which verifies that the native libraries are available:
[root@quickstart /]# hadoop checknative -a
16/10/22 19:32:16 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
16/10/22 19:32:16 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop: true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
zlib: true /lib64/libz.so.1
snappy: true /usr/lib/hadoop/lib/native/libsnappy.so.1
lz4: true revision:99
bzip2: true /lib64/libbz2.so.1
openssl: true /usr/lib64/libcrypto.so
The checknative output confirms that the native libraries are installed, but they do not seem to be picked up by my program when it runs under Maven. I am not sure how to configure Maven so that it loads the Hadoop native libraries when running the program. This is the dependencies section of my Maven pom.xml file:
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>3.0.0-alpha1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.mahout</groupId>
        <artifactId>mahout-core</artifactId>
        <version>0.9</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
    </dependency>
</dependencies>
The command I use to run my Mahout Java program is the same one shown at the top:
mvn exec:java -Dexec.mainClass=com.patternmatching.RecommendApp.TopPatternMatches
How can I configure Maven to see these libraries so they are used in the program?
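
In case it helps narrow this down, here is a minimal diagnostic sketch that could be run the same way as the main class. It relies on the fact that Hadoop's NativeCodeLoader loads the library through System.loadLibrary("hadoop"), which searches the directories listed in the java.library.path system property. The class name NativePathCheck and the helper dirsContaining are hypothetical names made up for this illustration; they are not part of Hadoop, Mahout, or my project.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class NativePathCheck {

    // Returns every directory on libraryPath that contains a regular file named fileName.
    static List<String> dirsContaining(String libraryPath, String fileName) {
        List<String> hits = new ArrayList<>();
        if (libraryPath == null || libraryPath.isEmpty()) {
            return hits;
        }
        // Entries are separated by ':' on Linux and ';' on Windows.
        for (String dir : libraryPath.split(Pattern.quote(File.pathSeparator))) {
            if (!dir.isEmpty() && new File(dir, fileName).isFile()) {
                hits.add(dir);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // The search path that System.loadLibrary uses in this JVM.
        String path = System.getProperty("java.library.path");
        // Platform-specific file name, e.g. libhadoop.so on Linux.
        String lib = System.mapLibraryName("hadoop");
        System.out.println("java.library.path = " + path);
        System.out.println(lib + " found in: " + dirsContaining(path, lib));
    }
}
```

Since exec:java runs the main class inside Maven's own JVM, the value printed would be the library path of the Maven process. If the second line prints an empty list, my guess would be that /usr/lib/hadoop/lib/native (the directory reported by checknative above) is simply not on that path, but I have not confirmed this.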