I have the following pair RDD:
val pairRDD = prFilterRDD.map(x => (x(1), x(4).toFloat))
where x(1) is the categoryid and x(4).toFloat is the price.
Result:
(2,124.99)
(17,129.99)
(2,129.99)
(17,199.99)
(17,299.99)
(17,299.99)
(2,139.99)
(17,149.99)
(17,119.99)
(17,399.99)
(3,189.99)
(17,119.99)
(3,159.99)
(18,129.99)
(18,189.99)
(3,199.99)
(18,134.99)
(18,129.99)
(18,129.99)
(18,139.99)
(3,149.99)
(18,129.99)
(3,159.99)
(18,124.99)
(4,299.98)
(18,129.99)
I would like to calculate the sum of the prices per categoryid. I wrote the following code:
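val initialVal = 0.0f
// seqOp: adds one price to the running sum within a partition
val comb = (acc: Float, price: Float) => acc + price
// combOp: merges the partial sums from different partitions
val mergeValSum = (v1: Float, v2: Float) => v1 + v2
val output = pairRDD.aggregateByKey(initialVal)(comb, mergeValSum)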
I get the following result:
(4,5689.7803)
(8,1184.95)
(19,1799.87)
(48,6599.831)
(51,1499.93)
(22,3114.95)
(33,587.97003)
(44,1744.8999)
(11,5619.8115)
(49,2789.89)
(5,2314.89)
This is not the result I expect. For example, for categoryid = 8 the expected sum is 792.0, but I get 1184.95. Am I using aggregateByKey correctly?
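For cross-checking, here is a minimal local sketch of the same two-step aggregation in plain Scala, without Spark. It only mimics the shape of aggregateByKey (a per-partition fold followed by a merge of the partial sums). The object name SumByCategoryCheck, the helper aggregateLocally and the numPartitions parameter are hypothetical names made up for this illustration, the keys are assumed to be strings, and the sample holds only a handful of the pairs listed above. Accumulating into a Double is also an assumption, made to limit Float rounding; rounding alone seems unlikely to explain a gap as large as 792.0 versus 1184.95.

```scala
// Local cross-check of a per-key sum (no Spark involved).
// Hypothetical helper: it imitates the seqOp/combOp structure of aggregateByKey.
object SumByCategoryCheck {
  // A few (categoryid, price) pairs copied from the pair RDD output above.
  // Assumption: the key is a String, as it would be after splitting a text line.
  val sample: Seq[(String, Float)] = Seq(
    ("2", 124.99f), ("17", 129.99f), ("2", 129.99f),
    ("17", 199.99f), ("2", 139.99f), ("4", 299.98f)
  )

  def aggregateLocally(data: Seq[(String, Float)], numPartitions: Int): Map[String, Double] = {
    val zero = 0.0
    // seqOp: add one price to the running sum inside a partition
    val seqOp = (acc: Double, price: Float) => acc + price
    // combOp: merge two partial sums coming from different partitions
    val combOp = (a: Double, b: Double) => a + b

    // Split the data into roughly equal chunks that stand in for partitions.
    val parts = math.max(1, numPartitions)
    val chunkSize = math.max(1, (data.size + parts - 1) / parts)
    val partitions = data.grouped(chunkSize).toSeq

    // Step 1: one partial sum per key within each chunk.
    val partials: Seq[Map[String, Double]] = partitions.map { part =>
      part.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).foldLeft(zero)(seqOp) }
    }

    // Step 2: merge the partial sums of each key across chunks.
    partials.flatten.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).reduce(combOp) }
  }

  def main(args: Array[String]): Unit = {
    // Prints one (categoryid, sum) pair per line, sorted by key.
    aggregateLocally(sample, numPartitions = 2).toSeq.sortBy(_._1).foreach(println)
  }
}
```

If the sums computed this way on the full input match what Spark returns, the aggregateByKey call itself is probably fine and the difference would more likely come from the rows that reach pairRDD (for example the filtering in prFilterRDD or the column indices).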
Thank you for your answer.