In our Spark-Scala application, we want to use typed Datasets. The application joins two DataFrames, DF1 and DF2.
My question is: should we convert both DF1 and DF2 to Dataset[T] and then perform the JOIN, or should we do the JOIN and then convert the resulting DataFrame to a Dataset?
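To make the two options concrete, here is a minimal sketch of what I mean. The case classes (Order, Customer, OrderWithName), the column names, and the sample rows are hypothetical and only stand in for our real schemas. Option A uses joinWith because, as far as I can tell, the plain join method on a Dataset returns an untyped DataFrame again.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical row types, for illustration only (not our real schemas).
case class Order(orderId: Long, customerId: Long, amount: Double)
case class Customer(customerId: Long, name: String)
case class OrderWithName(orderId: Long, customerId: Long, amount: Double, name: String)

object JoinOptions {
  def main(args: Array[String]): Unit = {
    // Local session, assumed here only so the sketch can run on its own.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("join-options")
      .getOrCreate()
    import spark.implicits._

    // Stand-ins for DF1 and DF2.
    val df1: DataFrame = Seq(Order(1L, 10L, 99.5), Order(2L, 20L, 15.0)).toDF()
    val df2: DataFrame = Seq(Customer(10L, "Ann"), Customer(20L, "Bob")).toDF()

    // Option A: convert both inputs first, then do a typed join.
    // joinWith keeps both sides typed and yields a Dataset of pairs.
    val ds1: Dataset[Order] = df1.as[Order]
    val ds2: Dataset[Customer] = df2.as[Customer]
    val joinedA: Dataset[(Order, Customer)] =
      ds1.joinWith(ds2, ds1("customerId") === ds2("customerId"), "inner")

    // Option B: join the untyped DataFrames, then convert the result once.
    // The target case class must match the joined columns by name.
    val joinedB: Dataset[OrderWithName] =
      df1.join(df2, Seq("customerId")).as[OrderWithName]

    joinedA.show()
    joinedB.show()
    spark.stop()
  }
}
```

In Option A the result type is a pair (Order, Customer), while in Option B it is a single flat case class, so the two options also differ in the shape of the joined rows.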
As I understand it, since Dataset[T] is being used here for type safety, we should convert DF1 and DF2 to Dataset[T] before the JOIN. Can someone please confirm, and advise if something is not correct?