I am trying to generate association rules with Spark in Scala. I first run FPGrowth to mine the frequent itemsets and then pass those itemsets to the AssociationRules method.
However, I would like to set a maximum pattern length to limit the number of items on the LHS and RHS. I only want one-to-one associations between items.
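import org.apache.spark.mllib.fpm.{AssociationRules, FPGrowth}

// transactions is assumed to be an RDD[Array[String]], one array of items per transaction
val model = new FPGrowth()
  .setMinSupport(0.1)
  .setNumPartitions(10)
  .run(transactions)

// Generate association rules from the frequent itemsets found by FPGrowth
val ar = new AssociationRules().setMinConfidence(0.6)
val results = ar.run(model.freqItemsets)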
The resulting association rules are:
ItemA => ItemB, {confidence}
ItemB => ItemC, {confidence}
ItemA,ItemB => ItemC, {confidence}
ItemA,ItemD => ItemE, {confidence}
But I only want it to return rules that have exactly one item on each side, i.e.:
ItemA => ItemB, {confidence}
ItemB => ItemC, {confidence}
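The only workaround I can think of is to filter by length myself. Below is a minimal sketch of that idea, not a built-in option: it drops every frequent itemset with more than two items before generating rules, and then defensively keeps only rules with one item on each side. The object name OneToOneRules, the helper oneToOneRules and the toy transactions are hypothetical and only there for illustration; local mode is assumed so that it runs standalone.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.fpm.{AssociationRules, FPGrowth}
import org.apache.spark.mllib.fpm.AssociationRules.Rule
import org.apache.spark.rdd.RDD

object OneToOneRules {

  // Hypothetical helper: mines frequent itemsets, then keeps only 1 => 1 rules.
  def oneToOneRules(transactions: RDD[Array[String]]): RDD[Rule[String]] = {
    val model = new FPGrowth()
      .setMinSupport(0.1)
      .setNumPartitions(10)
      .run(transactions)

    // Keep itemsets of size 1 and 2 only. Size-1 itemsets must stay,
    // because their frequencies are needed to compute the confidence.
    val shortItemsets = model.freqItemsets.filter(_.items.length <= 2)

    new AssociationRules()
      .setMinConfidence(0.6)
      .run(shortItemsets)
      // Defensive check: exactly one item on the LHS and one on the RHS.
      .filter(r => r.antecedent.length == 1 && r.consequent.length == 1)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("one-to-one-rules").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Toy data, made up for this example.
    val transactions = sc.parallelize(Seq(
      Array("ItemA", "ItemB", "ItemC"),
      Array("ItemA", "ItemB"),
      Array("ItemB", "ItemC")
    ))

    oneToOneRules(transactions).collect().foreach { r =>
      println(s"${r.antecedent.mkString(",")} => ${r.consequent.mkString(",")}, ${r.confidence}")
    }

    sc.stop()
  }
}
```

This still mines the longer itemsets first and only discards them afterwards, so it does not save any work in the FPGrowth step, which is why I would prefer a proper parameter.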
Basically, I am looking for a way to specify a maximum pattern length for the rules in Spark, using either the Scala or the Java API.
Any suggestions?