I have a data frame like below.
scala> ds.show
+----+----------+----------+-----+
| key|attribute1|attribute2|value|
+----+----------+----------+-----+
|mac1| A1| B1| 10|
|mac2| A2| B1| 10|
|mac3| A2| B1| 10|
|mac1| A1| B2| 10|
|mac1| A1| B2| 10|
|mac3| A1| B1| 10|
|mac2| A2| B1| 10|
+----+----------+----------+-----+
For each value in attribute1, I want to find the top N keys and the aggregated value for that key. Output: aggregated value for key for attribute1 will be
+----+----------+-----+
| key|attribute1|value|
+----+----------+-----+
|mac1| A1| 30|
|mac2| A2| 20|
|mac3| A2| 10|
|mac3| A1| 10|
+----+----------+-----+
Now if N = 1 then the output will be A1 - (mac1,30) A2-(mac2,20)
How to achieve this in DataFrame/Dataset ? I want to achieve this for all the attributes. In the above example I want to find for attribute1 and attribute2 as well.