I have this code:
val expandedDf = io.readInputs().mapPartitions {
  (iter: Iterator[Row]) =>
    iter.map { (item: Row) =>
      val myNewColumn = getUdf($"someColumnOriginal")
      Row.fromSeq(item.toSeq :+ myNewColumn)
    }
}
I am getting an exception: "Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases." My imports are:
import spark.implicits._
import org.apache.spark.sql._
I have to use this function because it is very complex and makes some REST calls, so I cannot express it as a built-in Spark SQL expression. Basically the code tries to take one column's value from each Row, compute a new value from it, append that as a new column, and return a dataframe. I have tried withColumn, but since I am dealing with petabytes of data it is extremely slow. I am a newbie to Spark and Scala, so I apologise in advance if my question is extremely lame.
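For reference, here is a minimal sketch of the shape this approach usually takes. Two assumptions are baked in: `computeNewValue` is a hypothetical stand-in for the REST-calling logic (the original `getUdf` is not shown in the question), and the new column is assumed to be a `String`. The key points illustrated are that `mapPartitions` over `Row`s needs an explicit encoder built from the *output* schema, and that inside the lambda the column value is read by name from the `Row` rather than with a `$"..."` column expression, which only makes sense in the SQL/DataFrame API:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}
import org.apache.spark.sql.catalyst.encoders.RowEncoder

// Hypothetical stand-in for the complex REST-calling function.
def computeNewValue(original: String): String = original + "_enriched"

val inputDf = io.readInputs()

// The output Rows carry one extra column, so the encoder must describe
// the new schema, not the input schema.
val newSchema = StructType(
  inputDf.schema.fields :+ StructField("myNewColumn", StringType, nullable = true)
)

val expandedDf = inputDf.mapPartitions { iter =>
  iter.map { row =>
    // Read the source column by name; $"..." cannot be used here because
    // mapPartitions works on plain Scala values, not Column expressions.
    val original = row.getAs[String]("someColumnOriginal")
    Row.fromSeq(row.toSeq :+ computeNewValue(original))
  }
}(RowEncoder(newSchema))
```

This is only a sketch under those assumptions, not a drop-in fix; the actual column name, type, and enrichment logic come from the surrounding code.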