For each id, I want to keep only the row (or rows) where the charge column has its maximum value.
Example input data:
id name charge
11 hg 10
11 mm 20
22 aa 40
22 bb 40
Code I have tried:
df.agg(max("charge"))
I am getting only the maximum value, like this:
charge
40
However, I want to keep the whole row:
id name charge
11 mm 20
22 aa 40
22 bb 40
How can I keep the first two columns as well? The name column can have different values for the same id, so I cannot simply group by both of these columns and aggregate the result.
If two rows have the same id and the same (maximum) charge, both rows should be kept.
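To make the expected semantics concrete, here is a minimal plain-Python sketch of the result I am after. It is not Spark code and not a proposed solution; the function name `keep_max_charge_rows` and the tuple layout `(id, name, charge)` are just assumptions for illustration. It finds the maximum charge per id in one pass, then keeps every row whose charge equals that maximum, so ties survive.

```python
# Example input from the question, as (id, name, charge) tuples.
rows = [
    (11, "hg", 10),
    (11, "mm", 20),
    (22, "aa", 40),
    (22, "bb", 40),
]


def keep_max_charge_rows(rows):
    """Return the rows whose charge equals the maximum charge for their id."""
    # First pass: record the maximum charge seen for each id.
    max_by_id = {}
    for id_, _name, charge in rows:
        if id_ not in max_by_id or charge > max_by_id[id_]:
            max_by_id[id_] = charge
    # Second pass: keep every row that matches its id's maximum (ties included).
    return [row for row in rows if row[2] == max_by_id[row[0]]]


result = keep_max_charge_rows(rows)
for row in result:
    print(row)
```

What I am looking for is the DataFrame equivalent of these two passes, without collecting the data to the driver.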