When grouping a Dataset in Spark, there are two methods: groupBy and groupByKey[K]. groupBy returns RelationalGroupedDataset, while groupByKey[K] returns KeyValueGroupedDataset.
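For concreteness, here is a minimal sketch of the two calls side by side. The `Sale` case class, the sample rows, and the local `SparkSession` settings are made up for illustration and are not from any real job; it assumes Spark SQL is on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, KeyValueGroupedDataset, RelationalGroupedDataset, SparkSession}

// Hypothetical record type, defined at top level so Spark can derive an encoder for it.
case class Sale(region: String, amount: Double)

object GroupingSketch {
  def main(args: Array[String]): Unit = {
    // Assumed local session, only for trying the two calls out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("grouping-sketch")
      .getOrCreate()
    import spark.implicits._

    val sales: Dataset[Sale] =
      Seq(Sale("east", 10.0), Sale("west", 5.0), Sale("east", 2.5)).toDS()

    // groupBy takes columns and returns a RelationalGroupedDataset;
    // aggregating it yields an untyped DataFrame.
    val byColumn: RelationalGroupedDataset = sales.groupBy($"region")
    val totalsDf: DataFrame = byColumn.sum("amount")

    // groupByKey takes a function from the element type to a key K and returns
    // a KeyValueGroupedDataset[K, T]; the typed API (here mapGroups) works on
    // Sale objects and yields a typed Dataset.
    val byKey: KeyValueGroupedDataset[String, Sale] = sales.groupByKey(_.region)
    val totalsDs: Dataset[(String, Double)] =
      byKey.mapGroups { (region: String, rows: Iterator[Sale]) =>
        (region, rows.map(_.amount).sum)
      }

    totalsDf.show()
    totalsDs.show()

    spark.stop()
  }
}
```

Both compute the same per-region totals in this sketch; what differs is the type of the intermediate grouped object and of the result.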
What are the differences between them?
Under what circumstances should I choose one over another?
How come my question is a duplicate of those questions about "Dataset vs DataFrame"? I don't get it. These are obviously completely different things! My question is very specific, not generic.