Given a data frame like this:

COUNTRY  CITIZENS  SURFACE
A        20000000  40
A        80000000  78
B        3000000   120
B        200000    27
C        10000000  56
A        5600000   20
C        10000000  30
B        2500000   20

I would like to subset the data frame to just the rows that hold the maximum CITIZENS value for each level of COUNTRY.

I was able to obtain the max value of "citizens" for each level of country with dplyr and summarize, but I am not able to extract the corresponding surface value for each max value.

Do you know how I can achieve that?

Jeni

1 Answer

We can use slice() after grouping by COUNTRY. which.max() returns the position of the first maximum, so this keeps exactly one row per country:

library(dplyr)
df1 %>%
  group_by(COUNTRY) %>%
  slice(which.max(CITIZENS))

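Or with filter(). Unlike the slice() approach, this keeps every row tied for the maximum, so country C, which has two rows with 10000000 citizens, returns both of them: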

df1 %>%
   group_by(COUNTRY) %>%
   filter(CITIZENS == max(CITIZENS))
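
To make the difference between the two dplyr approaches concrete, here is a minimal self-contained sketch. It assumes df1 is built from the data in the question (the question does not show how the data frame was created), and the names first_max and all_max are only illustrative.

```r
library(dplyr)

# Example data copied from the question (construction is an assumption)
df1 <- data.frame(
  COUNTRY  = c("A", "A", "B", "B", "C", "A", "C", "B"),
  CITIZENS = c(20000000, 80000000, 3000000, 200000,
               10000000, 5600000, 10000000, 2500000),
  SURFACE  = c(40, 78, 120, 27, 56, 20, 30, 20)
)

# One row per country: which.max() returns the first maximum only
first_max <- df1 %>%
  group_by(COUNTRY) %>%
  slice(which.max(CITIZENS)) %>%
  ungroup()

# All rows tied for the maximum: country C contributes two rows
all_max <- df1 %>%
  group_by(COUNTRY) %>%
  filter(CITIZENS == max(CITIZENS)) %>%
  ungroup()

nrow(first_max)  # 3
nrow(all_max)    # 4
```

Which one is appropriate depends on whether tied rows should all be kept or only the first one.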

Or with data.table (setDT() converts df1 to a data.table by reference; this version also keeps tied rows):

library(data.table)
setDT(df1)[, .SD[CITIZENS == max(CITIZENS)], COUNTRY]
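
If exactly one row per country is wanted with data.table as well, one possible variant is to collect row indices with .I and which.max(). This is only a sketch: the object names dt, idx and one_per_country are illustrative, and the data is rebuilt from the question so that the block runs on its own.

```r
library(data.table)

# Example data copied from the question (construction is an assumption)
dt <- data.table(
  COUNTRY  = c("A", "A", "B", "B", "C", "A", "C", "B"),
  CITIZENS = c(20000000, 80000000, 3000000, 200000,
               10000000, 5600000, 10000000, 2500000),
  SURFACE  = c(40, 78, 120, 27, 56, 20, 30, 20)
)

# .I holds row numbers in the full table; which.max() picks the
# first maximum within each COUNTRY group
idx <- dt[, .I[which.max(CITIZENS)], by = COUNTRY]$V1

# Subset the table to those rows: one row per country
one_per_country <- dt[idx]
```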
akrun