I have an RDD in which each entry belongs to a class. I want to split this single RDD into several RDDs, such that all entries of a class go into one RDD. Suppose I have 100 such classes in the input RDD; I want each class in its own RDD. I can do this with a filter for each class (as shown below), but that would launch a separate job for each class. Is there a better way to do it in a single job?
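import org.apache.spark.rdd.RDD
import org.apache.spark.mllib.regression.LabeledPoint // assuming the RDD-based MLlib LabeledPoint
def method(input: RDD[LabeledPoint], classes: List[Double]): List[RDD[LabeledPoint]] =
  classes.map { lbl => input.filter(_.label == lbl) } // one filtered RDD per class label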
It's similar to another question, but I have more than 2 classes (around 10).