I have an RDD of LabeledPoint objects in Spark and I want to count the distinct label values. I tried the following:
    from pyspark.mllib.regression import LabeledPoint

    train_data = sc.parallelize([
        LabeledPoint(1.0, [1.0, 0.0, 3.0]),
        LabeledPoint(2.0, [1.0, 0.0, 3.0]),
        LabeledPoint(1.0, [1.0, 0.0, 3.0]),
    ])
    train_data.reduceByKey(lambda x: x.label).collect()
But this fails with:

    TypeError: 'LabeledPoint' object is not iterable
I am using Spark 2.1 and Python 2.7. Thanks for any help.