I'm trying to learn R while playing with a dataset from https://www.kaggle.com/abcsds/pokemon:

data = read.csv("Pokemon.csv")
data$Name = sub(".*(Mega)", "Mega", data$Name) # replacing name duplications

I want to find all the Pokemon that have the maximum value in any of the stat columns (Total, Attack, HP, etc.).

I know I can use sapply(data[5:11], max, na.rm = TRUE) to find the max values, and things like

data[which.max(data$Total),]
data[which.max(data$HP),]
data[which.max(data$Attack),]

to find a row that has the max for each column (which.max only returns the first match if there are ties).

Is there a way I can use something like sapply in order to get all the rows without going through them sequentially?

Darkway

1 Answer

I believe this is what you want to achieve.

I use the tidyverse for this. The data is in wide format, with a separate column for each stat, so I first convert it to long format with pivot_longer, then group_by the stats column and filter for the max of each group.

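library(tidyverse)
data %>%                                      # the data frame read in the question
  select(c(2, 5:11)) %>%                      # Name plus the stat columns 5:11
  pivot_longer(-1, names_to = "stats") %>%    # wide to long: one row per Name/stat pair
  group_by(stats) %>% 
  filter(value == max(value))                 # keep every row tied for the max of its stat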
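
If you would rather stay in base R and use sapply as you mentioned, something along these lines might work. This is only a minimal sketch: the toy data frame and its values are made up so the example runs without Pokemon.csv, and the names toy, stat_cols and max_rows are hypothetical. For the real data you would presumably swap in data and names(data)[5:11].

```r
# Toy stand-in for the Pokemon data (values are invented for illustration)
toy <- data.frame(
  Name   = c("A", "B", "C", "D"),
  HP     = c(45, 80, 80, 60),
  Attack = c(49, 82, 100, 62),
  stringsAsFactors = FALSE
)

# Assumption: for the real data this could be names(data)[5:11]
stat_cols <- c("HP", "Attack")

# For each stat column, keep every row that ties for the maximum.
# simplify = FALSE returns a list, named after the entries of stat_cols.
max_rows <- sapply(stat_cols, function(col) {
  x <- toy[[col]]
  toy[which(x == max(x, na.rm = TRUE)), , drop = FALSE]
}, simplify = FALSE)

max_rows$HP      # rows B and C, which tie on HP
max_rows$Attack  # row C
```

Unlike which.max, comparing against max() with == keeps all tied rows, and wrapping it in which() drops any NA comparisons.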