Why do I have to cast the elements of an RDD[Int] to Int or String to use them with sortBy (Spark 1.6)?
For example, this gives me an error:
val t = sc.parallelize(1 to 9) //t: org.apache.spark.rdd.RDD[Int]
t.sortBy(_, ascending=false) //error: missing parameter type ...
whereas this works:
t.sortBy(_.toInt, ascending=false).collect //Array[Int] = Array(9, 8, 7, 6, 5, 4, 3, 2, 1)
t.sortBy(_.toString, ascending=false).collect //Array[Int] = Array(9, 8, 7, 6, 5, 4, 3, 2, 1)
And why does converting with toString above return Array[Int] instead of Array[String] or Array[Char]?
I have just started learning Spark, so please go easy on me :-).