14

Is it possible to use Apache Mahout without any dependency on Hadoop?

I would like to use the Mahout algorithms on a single computer by including only the Mahout library in my Java project. I don't want to use Hadoop at all, since I will be running on a single node anyway.

Is that possible?

Eyal
skyde

2 Answers

11

Yes. Not all of Mahout depends on Hadoop, though much does. If you use a piece that depends on Hadoop, of course, you need Hadoop. But for example there is a substantial recommender engine code base that does not use Hadoop.

You can embed a local Hadoop cluster/worker in a Java program.

Sean Owen
  • With Mahout 0.10 this just doesn't seem possible anymore. I'm trying to use the KMeans or FuzzyKMeans algorithms, and they seem completely tied to Hadoop. All I want to do is cluster some 2D data points (lat/longs, actually), and having to rely on the Hadoop file system seems extremely inefficient for the one-off operation I want it for. – crowmagnumb Apr 21 '15 at 01:03
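For the small 2D case described in this comment, one option that avoids Hadoop entirely is to skip Mahout's clustering classes and run plain k-means (Lloyd's algorithm) in memory. The following is a minimal sketch under stated assumptions, not Mahout's KMeans: the class name `SimpleKMeans2D`, the `cluster` method, and the sample coordinates are made up for illustration, and it treats lat/long as flat x/y with squared Euclidean distance, which is only a rough approximation over large distances.

```java
import java.util.Arrays;

/** Plain-Java Lloyd's k-means for 2D points. Hypothetical helper; uses no Mahout or Hadoop classes. */
public class SimpleKMeans2D {

    /**
     * Clusters points (each a {x, y} pair) into k groups and returns the
     * cluster index assigned to each point. The first k points are used as
     * the initial centroids, so the result is deterministic for a given input.
     */
    public static int[] cluster(double[][] points, int k, int maxIterations) {
        if (k <= 0 || k > points.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            // Assignment step: attach each point to its nearest centroid.
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double dx = points[i][0] - centroids[c][0];
                    double dy = points[i][1] - centroids[c][1];
                    double dist = dx * dx + dy * dy; // squared Euclidean distance
                    if (dist < bestDist) {
                        bestDist = dist;
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][2];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                sums[assignment[i]][0] += points[i][0];
                sums[assignment[i]][1] += points[i][1];
                counts[assignment[i]]++;
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] > 0) { // leave an empty cluster's centroid where it was
                    centroids[c][0] = sums[c][0] / counts[c];
                    centroids[c][1] = sums[c][1] / counts[c];
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Made-up sample data: two points near each of two locations.
        double[][] points = {
            {47.60, -122.33}, {40.71, -74.01}, {47.61, -122.30}, {40.73, -73.99}
        };
        System.out.println(Arrays.toString(cluster(points, 2, 100)));
    }
}
```

With the four sample points, `main` prints `[0, 1, 0, 1]`. Seeding the centroids with the first k points keeps the sketch short and repeatable; for real data a better initialization (for example, random restarts) would probably be worth adding.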
11

Definitely, yes. In the Mahout Recommender First-Timer FAQ they advise against starting out with a Hadoop-based implementation (unless you know you're going to be scaling past 100 million user preferences relatively quickly).

You can use the implementations of the Recommender interface in a pure-Java fashion relatively easily. Or place one in the servlet of your choice.

Technically, Mahout has a Maven dependency on Hadoop, but you can easily use the recommenders without the Hadoop JARs. This is described in the first few chapters of Mahout in Action. You can download the sample source code and see how it's done; look at the file RecommenderIntro.java.

However, if you're using Maven, you need to exclude Hadoop manually. The dependency would look like this:

<dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-core</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
        </exclusion>
    </exclusions>
</dependency>
Eyal
  • The link mentioned in the answer does not lead to a readable article: [Mahout Recommender First-Timer FAQ](https://cwiki.apache.org/MAHOUT/recommender-first-timer-faq.html). Can you please see to that? – Rajith Gun Hewage Jan 13 '16 at 16:51