I have a dataset with data on all football (soccer) players in the top 5 leagues, and I am trying to build a scout function that returns a shortlist of players who are above the 85th percentile for the chosen metrics.

I tried the function with a single metric to see whether it works:

scout(Total_Big_5_new,"Nutmegs")

but it returns this error:

  the condition has length > 1
In addition: Warning message:
In percentile(database$metric) : NAs introduced by coercion

The code for the scout function is here:

scout <- function(database, ...) {
  l <- list(...)
  l2 <- list()
  j <- 1
  for(metric in l){
    if(metric %in% colnames(database)){
      l2[[j]] <-  percentile(database[[metric]])
      j <- j + 1
    }else{
      print(paste("The stat", metric, "is not recorded"))
    }
  }
  i <- 1
  k <- 1
  shortlist <- list()
  for (player in database){
    compared <- select(database, unlist(l))
    if (all(compared) > all(unlist(l2))){
      shortlist[[i]] <- player
      i <- i + 1
    }
  }
  return(shortlist)
}

and the percentile function:

percentile <- function(metric, value = 0.85) {
  answer <- unname(quantile(metric, c(value)))
  return(as.numeric(paste(answer)))
}

Edit: For example, say I make a data frame with random data:

df <- as_tibble(data.frame(
  Player  = c(LETTERS[1:13]),
  Goals = c(sample(1:45, 13, replace=FALSE)),
  Assists = c(sample(1:31, 13, replace=FALSE)),
  Nutmegs = c(sample(1:28, 13, replace = FALSE)),
  Dribbles = c(sample(43:208, 13, replace = FALSE))
))

which gives this tibble:

   Player Goals Assists Nutmegs Dribbles
   <chr>  <int>   <int>   <int>    <int>
 1 A         23      16       1      125
 2 B          7       2      19      195
 3 C         21       4      28      142
 4 D         28      19      23      112
 5 E          8      27      26      152
 6 F         17      23      16       45
 7 G         30       6      25      206
 8 H         26      24       8      136
 9 I         18       3      27       99
10 J         31      25       7      198
11 K          4      21      13       82
12 L          1      13      22       66
13 M         43       7       4      194

For this data frame, with a cutoff of 0.65 rather than the default 0.85, my percentile function returns 25.4 for Goals, as seen below:

percentile(df$Goals, 0.65) = 25.4

The aim of the scout function that I am creating is to retrieve the names of the players who exceed that value. For example,

scout(df,"Goals")

should return players D, G, H, J and M (these are the players whose Goals exceed the 25.4 cutoff computed with 0.65).
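To make the desired behaviour concrete, here is a minimal base R sketch of what I am aiming for. It is not my scout function from above: `scout_sketch` and its `value` argument are hypothetical names used only for illustration, and I am assuming a player has to exceed the cutoff on every requested metric.

```r
# Same idea as my percentile helper, without the paste/as.numeric round trip
percentile <- function(metric, value = 0.85) {
  unname(quantile(metric, value, na.rm = TRUE))
}

# Hypothetical sketch: return the names of players above the cutoff
# on every requested metric (assumes a "Player" column exists)
scout_sketch <- function(database, ..., value = 0.85) {
  metrics <- c(...)
  known <- metrics[metrics %in% colnames(database)]
  for (m in setdiff(metrics, known)) {
    message("The stat ", m, " is not recorded")
  }
  keep <- rep(TRUE, nrow(database))
  for (m in known) {
    # Compare the whole column against its own cutoff in one step
    keep <- keep & database[[m]] > percentile(database[[m]], value)
  }
  database$Player[which(keep)]
}

# The Goals column from the example table above
df <- data.frame(
  Player = LETTERS[1:13],
  Goals = c(23, 7, 21, 28, 8, 17, 30, 26, 18, 31, 4, 1, 43),
  stringsAsFactors = FALSE
)

scout_sketch(df, "Goals", value = 0.65)
```

With the 0.65 cutoff this picks out D, G, H, J and M, which is the shortlist I described above.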

med0504
  • 1
    What does `all_of` do? – user2974951 Nov 23 '22 at 14:28
  • 2
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. We can't run and test this function without data. – MrFlick Nov 23 '22 at 14:39
  • @MrFlick, Sorry for the inconvenience, I've added an example with my desired output now – med0504 Nov 27 '22 at 21:15
  • @user2974951 The all_of function is from dplyr, but I realised I had used the wrong one and it should be the all function instead! – med0504 Nov 27 '22 at 21:18
