I have Vectors grouped by id in an RDD, like this: RDD[(Int, Array[Vector])]. I want to cluster each group of vectors separately, one id at a time.
MLlib's k-means algorithm requires an RDD[Vector] as its argument:
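import org.apache.spark.mllib.clustering.KMeans

// Configure k-means once; run() expects an RDD[Vector]
val kmean = new KMeans().setK(3)
  .setEpsilon(100)
  .setMaxIterations(10)
  .setInitializationMode("k-means||")
  .setSeed(System.currentTimeMillis())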
But obviously, when I map over my RDD, each element gives me an Array[Vector] that is not wrapped in an RDD:
// does not work, since e._2 is an Array[Vector], not an RDD[Vector]
rdd.map(e => kmean.run(e._2))
So the question is: how can I run k-means on each group separately?
Thanks in advance for any help!