Better way to apply which.max over dataframe

Question

I'm trying to learn R by playing with a dataset from https://www.kaggle.com/abcsds/pokemon

data = read.csv("Pokemon.csv")
data$Name = sub(".*(Mega)", "Mega", data$Name) # replacing name duplications

I want to find all the Pokemon that have the maximum value in any of the stat columns (Total, Attack, HP, etc.).

I know I can use sapply(data[5:11], max, na.rm = TRUE) to find the max values, and calls like

data[which.max(data$Total),]
data[which.max(data$HP),]
data[which.max(data$Attack),]

to find the row that holds the max of each column.

Is there a way I can use something like sapply in order to get all the rows without going through them sequentially?

@jay.sf that doesn't work, it does pretty much the same thing as `sapply(data[5:11], max, na.rm = TRUE)` — Darkway, Jan 31 '21 at 14:19
something that looks like this https://imgur.com/a/Dvps7aa Edit: this is what I obtained by using which.max over all the columns — Darkway, Jan 31 '21 at 14:31
You should make a reproducible example: data - code you've tried - expected output. Read our guidelines please: [how-to-make-a-great-r-reproducible-example](https://stackoverflow.com/a/5963610/6574038). — jay.sf, Jan 31 '21 at 15:19
You could do `data[rowSums(sapply(data[, 5:11], function(x) x == max(x))) > 0, ]` — user12728748, Jan 31 '21 at 15:35

score 0 · Accepted Answer · answered Feb 01 '21 at 09:09

I believe this is what you want to achieve.

I use the tidyverse for this. The data is in wide format, with a separate column for each stat, so I first convert it to long format with pivot_longer. Then I group_by the stats column and filter for the max of each group to get the desired result.

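library(tidyverse)

# `data` is the data frame read from Pokemon.csv in the question
data %>%
  select(c(2, 5:11)) %>%                      # column 2 (the name) plus stat columns 5:11
  pivot_longer(-1, names_to = "stats") %>%    # wide to long: one row per name/stat pair
  group_by(stats) %>%
  filter(value == max(value))                 # keep every row tied for the max of its stat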
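
For comparison, here is a minimal base R sketch of the same idea, closer to the sapply approach the question asks about. It runs on a small made-up data frame (`toy`, with hypothetical names and values, not the real Pokemon data), and `stat_cols` is an illustrative name for the columns to scan. Note that which.max only returns the first maximum, so the sketch uses which(x == max(x)) to keep ties, which matches what the filter above does.

```r
# Toy stand-in for the Pokemon data (hypothetical values, not the real dataset)
toy <- data.frame(
  Name   = c("A", "B", "C", "D"),
  Total  = c(300, 500, 500, 200),
  HP     = c(45, 60, 80, 80),
  Attack = c(49, 100, 82, 30),
  stringsAsFactors = FALSE
)

# Columns to scan for maxima (with the real data this would be names(data)[5:11])
stat_cols <- c("Total", "HP", "Attack")

# For each stat column, pull every row whose value equals that column's max
max_rows <- lapply(toy[stat_cols], function(x) {
  toy[which(x == max(x, na.rm = TRUE)), ]
})

# Stack the pieces into one data frame, labelling which stat each row is the max of
result <- do.call(rbind, Map(function(rows, stat) {
  data.frame(stats = stat, rows, stringsAsFactors = FALSE)
}, max_rows, names(max_rows)))
rownames(result) <- NULL

print(result)
```

Unlike the long-format result above, this keeps the full row for each match, at the cost of repeating a row when the same Pokemon holds the max in several stats.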
