If I understand correctly, using groupBy().agg(collect_list(column)) will give me a column of lists.
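For illustration, here is a minimal sketch of that aggregation. The column names `key`, `value` and `list`, the class name `CollectListSketch`, the helper methods, and the local SparkSession are all assumptions made up for this example, not taken from the original code.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.collect_list;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class CollectListSketch {

    // Hypothetical sample data: two rows for key "a", one row for key "b".
    static Dataset<Row> sampleRows(SparkSession spark) {
        StructType schema = new StructType()
            .add("key", DataTypes.StringType)
            .add("value", DataTypes.IntegerType);
        List<Row> rows = Arrays.asList(
            RowFactory.create("a", 1),
            RowFactory.create("a", 2),
            RowFactory.create("b", 3));
        return spark.createDataFrame(rows, schema);
    }

    // Groups by "key" and gathers each group's "value" entries into an array column "list".
    static Dataset<Row> collectValues(Dataset<Row> df) {
        return df.groupBy(col("key")).agg(collect_list(col("value")).as("list"));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .master("local[1]")
            .appName("collect-list-sketch")
            .getOrCreate();
        Dataset<Row> grouped = collectValues(sampleRows(spark));
        grouped.printSchema(); // shows the type Spark assigns to "list"
        grouped.show();
        spark.stop();
    }
}
```

One caveat that matters for "first" and "last": the order of elements produced by collect_list depends on the order in which rows reach the aggregation, so it is not guaranteed to be stable after a shuffle.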
How do I get the first and last elements of that list as new columns (Spark Dataset, Java API)?
For the first element, I can do something like this:
.withColumn("firstItem", functions.col("list").getItem(0))
but how do I handle an empty list?
For the last element, I was thinking of using size() - 1 as the index, but a plain - 1 isn't supported on a Spark Dataset column in Java, so I tried:
.withColumn("lastItem", functions.col("list").getItem(functions.size(functions.col("list")).minus(1)))
but it complains with an unsupported type error.
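
One possible approach, as a sketch rather than a definitive answer: `functions.element_at` (available from Spark 2.4) takes a 1-based index and accepts negative indices that count from the end, so no size arithmetic is needed. Wrapping it in `when(size(list) > 0, ...)` leaves the new column null for an empty list. The error above is probably caused by getItem treating its argument as a literal value rather than as a Column expression, though that is my reading and not something I have verified for every Spark version. The class name `FirstLastSketch`, the helper methods, and the sample data are hypothetical; only the column names `list`, `firstItem` and `lastItem` come from the question.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.collect_list;
import static org.apache.spark.sql.functions.element_at;
import static org.apache.spark.sql.functions.size;
import static org.apache.spark.sql.functions.when;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Column;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class FirstLastSketch {

    // Adds "firstItem" and "lastItem"; both stay null when "list" is empty.
    static Dataset<Row> addFirstAndLast(Dataset<Row> df) {
        Column list = col("list");
        Column nonEmpty = size(list).gt(0);
        return df
            // element_at is 1-based, so index 1 is the first element.
            .withColumn("firstItem", when(nonEmpty, element_at(list, 1)))
            // A negative index counts from the end, so -1 is the last element.
            .withColumn("lastItem", when(nonEmpty, element_at(list, -1)));
    }

    // Hypothetical sample: key "c" has only a null value, and collect_list
    // skips nulls, so its list comes out empty.
    static Dataset<Row> sampleLists(SparkSession spark) {
        StructType schema = new StructType()
            .add("key", DataTypes.StringType)
            .add("value", DataTypes.IntegerType);
        List<Row> rows = Arrays.asList(
            RowFactory.create("a", 1),
            RowFactory.create("a", 2),
            RowFactory.create("a", 3),
            RowFactory.create("b", 7),
            RowFactory.create("c", null));
        return spark.createDataFrame(rows, schema)
            .groupBy(col("key"))
            .agg(collect_list(col("value")).as("list"));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .master("local[1]")
            .appName("first-last-sketch")
            .getOrCreate();
        addFirstAndLast(sampleLists(spark)).show();
        spark.stop();
    }
}
```

If a computed index is preferred, the same lookup can also be written as a SQL expression string, for example `functions.expr("list[size(list) - 1]")`, which sidesteps the Java-side arithmetic on a Column.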