Say I have a DataFrame with multiple columns of possibly different types. I need to write a UDF that takes inputs from several of those columns, performs a fairly complicated computation, and returns a result (say, a string).
    val dataframe = Seq(
      (1.0, Array(0, 2, 1), Array(0, 2, 3), 23.0, 21.0),
      (1.0, Array(0, 7, 1), Array(1, 2, 3), 42.0, 41.0)
    ).toDF("c", "a1", "a2", "t1", "t2")
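For example, for each row the UDF should compute something like (c * sum(a1) + sum(a2)).toString + t1.toString, where c, a1, a2 and t1 are that row's values in the columns of the same name.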
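One possible shape for such a UDF is sketched below. It is only a sketch under stated assumptions: Spark 2.x or later with a local `SparkSession`, array columns arriving in the Scala function as `Seq[Int]`, and the hypothetical names `MultiColumnUdf`, `combine`, `combineUdf` and `result`, none of which come from the original problem. The sums are accumulated as `Long`, which is my own choice to avoid `Int` overflow on very long arrays.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object MultiColumnUdf {
  // Plain Scala function holding the per-row computation (hypothetical name).
  // Array columns are handed to a Scala UDF as Seq, not Array.
  def combine(c: Double, a1: Seq[Int], a2: Seq[Int], t1: Double): String = {
    // Accumulate as Long so that long arrays do not overflow Int (assumption).
    val sum1 = a1.foldLeft(0L)(_ + _)
    val sum2 = a2.foldLeft(0L)(_ + _)
    (c * sum1 + sum2).toString + t1.toString
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would configure this differently.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("multi-column-udf")
      .getOrCreate()
    import spark.implicits._

    val dataframe = Seq(
      (1.0, Array(0, 2, 1), Array(0, 2, 3), 23.0, 21.0),
      (1.0, Array(0, 7, 1), Array(1, 2, 3), 42.0, 41.0)
    ).toDF("c", "a1", "a2", "t1", "t2")

    // Wrap the plain function as a UDF taking four column arguments.
    val combineUdf = udf(combine _)

    // Pass one Column per UDF parameter, in the same order as the function signature.
    dataframe
      .withColumn("result", combineUdf(col("c"), col("a1"), col("a2"), col("t1")))
      .show(false)

    spark.stop()
  }
}
```

Keeping the computation in an ordinary function and wrapping it with `udf` separately means the logic can be unit tested without Spark. Whether this performs acceptably with arrays of about a million elements per row is not something this sketch establishes.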
In reality, the computation is lengthy and the arrays have about a million elements each. I am fairly new to Spark and would be grateful for sample code or a pointer to a resource (with Scala examples).
TIA