I'm following the Mahout in Action tutorial for k-means clustering. I use the same code found here, with the same pom.xml. On my local machine, using Eclipse, everything works fine, so I built the jar file (clustering-0.0.1-SNAPSHOT.jar) and copied it to the cluster (Hortonworks 2.3). When I try to run it using: hadoop jar clustering-0.0.1-SNAPSHOT.jar com.digimarket.clustering.App (I named my project differently), I get this error:

java.lang.NoClassDefFoundError: org/apache/mahout/common/distance/DistanceMeasure

I know it's a dependency issue. I found questions from users who had this problem before, but I couldn't understand how they solved it: here and here.

This is the content of mahout directory in my cluster:

ls /usr/hdp/2.3.4.0-3485/mahout/
bin
conf
doc
lib
mahout-examples-0.9.0.2.3.4.0-3485.jar
mahout-examples-0.9.0.2.3.4.0-3485-job.jar
mahout-integration-0.9.0.2.3.4.0-3485.jar
mahout-math-0.9.0.2.3.4.0-3485.jar
mahout-mrlegacy-0.9.0.2.3.4.0-3485.jar
mahout-mrlegacy-0.9.0.2.3.4.0-3485-job.jar

Thanks.

Djoko
  • Does Maven produce two jars: `clustering-0.0.1-SNAPSHOT.jar` and `clustering-0.0.1-SNAPSHOT-jar-with-dependencies.jar`? – rj93 Mar 03 '16 at 10:33
  • It produces a jar named mia-0.5.jar (the writer of the book mentioned it here http://stackoverflow.com/a/11482253/5089324) – Djoko Mar 03 '16 at 10:36
  • How do you build the `clustering-0.0.1-SNAPSHOT.jar`? – rj93 Mar 03 '16 at 10:39
  • In Eclipse, I right-click on the project name and choose Run As > Maven install. I forgot to change the artifactId and groupId in pom.xml, so I use the same ones as the author of the book. – Djoko Mar 03 '16 at 10:47

1 Answer

It looks like you have a dependency that is not available to your code on your cluster.

Based on the pom.xml from that project you should be using:

<properties>
  <mahout.version>0.5</mahout.version>
  <mahout.groupid>org.apache.mahout</mahout.groupid>
</properties>
...
<dependencies>
  <dependency>
    <groupId>${mahout.groupid}</groupId>
    <artifactId>mahout-core</artifactId>
    <version>${mahout.version}</version>
  </dependency>
  ...
</dependencies>

The class org.apache.mahout.common.distance.DistanceMeasure is included in the mahout-core-0.*.jar. I have mahout-core-0.7.jar, and the class is present in there.
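
If you want to confirm this for whatever jar you end up using, here is a minimal sketch that looks for the class entry inside a jar file. The class name JarClassCheck and the command-line arguments are my own choices for illustration, not part of Mahout or Hadoop; it only uses java.util.jar from the JDK.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarClassCheck {

    // Returns true if the jar has a .class entry for the given fully qualified class name.
    static boolean containsClass(String jarPath, String className) throws IOException {
        // org.example.Foo is stored in a jar as org/example/Foo.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: JarClassCheck <path-to-jar> [fully.qualified.ClassName]");
            return;
        }
        // args[0]: path to the jar to inspect (hypothetical, pass your own path)
        // args[1]: class to look for; defaults to the class from the error message
        String jarPath = args[0];
        String className = args.length > 1
                ? args[1]
                : "org.apache.mahout.common.distance.DistanceMeasure";
        boolean found = containsClass(jarPath, className);
        System.out.println(className + (found ? " found in " : " NOT found in ") + jarPath);
    }
}
```

Running it against each jar you plan to ship shows which one, if any, actually provides the missing class.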

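You can download that jar and include it with the -libjars flag (which is only honored if your driver parses Hadoop's generic options, for example by running through ToolRunner), or you can put it on the Hadoop classpath, for example via the HADOOP_CLASSPATH environment variable.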
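
To check whether either approach worked, you could run a small probe on the cluster before launching the real job. This is only a sketch under my own assumptions: ClasspathCheck is a hypothetical helper class, not something from the book or from Mahout, and it simply asks the JVM whether the class can be loaded and prints the classpath it was started with.

```java
public class ClasspathCheck {

    // Returns true if the named class can be loaded by this class's class loader.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate and load the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError also covers NoClassDefFoundError
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class from the error message; pass another name as the first argument
        String className = args.length > 0
                ? args[0]
                : "org.apache.mahout.common.distance.DistanceMeasure";
        System.out.println(className + " loadable: " + isLoadable(className));
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If the probe is packaged in your jar and started the same way as your App class, a result of false means the Mahout jar is still not visible to the client JVM.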

Jeremy