I have a DataFrame df like this one:
df =
name group influence
A 1 2
B 1 3
C 1 0
A 2 5
D 2 1
For each distinct value of group, I want to extract the value of name that has the maximum value of influence.
The expected result is this one:
group max_name max_influence
1 B 3
2 A 5
I know how to get the max value, but I don't know how to get max_name.
import org.apache.spark.sql.functions.max
df.groupBy("group").agg(max("influence").as("max_influence"))
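
One direction that might work, sketched here under assumptions rather than as a confirmed solution: pack influence and name into a struct with influence first, take the max of that struct per group, and then unpack the fields. Spark compares structs field by field, so the row with the highest influence should win. The object name MaxNamePerGroup, the helper maxNamePerGroup, and the intermediate column top are hypothetical names made up for this example, and the local SparkSession exists only to make the sketch self-contained.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, max, struct}

object MaxNamePerGroup {
  // Hypothetical helper: for each group, keep the name with the highest influence.
  def maxNamePerGroup(df: DataFrame): DataFrame =
    df.groupBy("group")
      // influence comes first in the struct so it drives the comparison;
      // name is only compared when two rows tie on influence
      .agg(max(struct(col("influence"), col("name"))).as("top"))
      .select(
        col("group"),
        col("top.name").as("max_name"),
        col("top.influence").as("max_influence")
      )

  def main(args: Array[String]): Unit = {
    // Local session, assumed here only for the sake of a runnable example
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("max-name-per-group")
      .getOrCreate()
    import spark.implicits._

    // Same sample data as in the question
    val df = Seq(
      ("A", 1, 2), ("B", 1, 3), ("C", 1, 0), ("A", 2, 5), ("D", 2, 1)
    ).toDF("name", "group", "influence")

    maxNamePerGroup(df).orderBy("group").show()
    spark.stop()
  }
}
```

If two names tie on influence within a group, this sketch keeps the alphabetically larger name, which may or may not be the desired tie-breaking rule.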