I would like to calculate the mean of several columns in my data frame, and I want to select them with the ‘:’ operator in the dplyr package. The variables are named Mcheck5_1_1, Mcheck5_2_1, ..., Mcheck5_8_1 (8 in total). I learnt that I can select them with

select(df, Mcheck5_1_1:Mcheck5_8_1)

in an online course taught by Roger Peng (https://www.youtube.com/watch?v=aywFompr1F4&feature=youtu.be), at 4 min 33 s.

However, R complained:

Error in select(df, Mcheck5_1_1:Mcheck5_8_1) : 
unused argument (Mcheck5_1_1:Mcheck5_8_1)

I also couldn’t find anyone else using this ‘:’ feature on Google, so I suspect it no longer exists.

Right now, I use the following code to solve the problem:

idx = grep("Mcheck5_1_1", names(df))
# Parentheses are needed: idx:idx+7 would be read as (idx:idx) + 7
df$avg = rowMeans(df[, idx:(idx + 7)], na.rm = TRUE)

(I’m hesitant to index those columns by number (e.g., df[138]) for fear that their positions might change.)
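
A note on the range expression in a workaround like this: in R, the `:` operator binds more tightly than `+`, so `idx:idx+7` is read as `(idx:idx) + 7` and yields a single position, not eight. The sketch below shows the difference on a made-up example (the data frame and the column names `x1` to `x3` are assumptions for illustration, not the real data).

```r
# Toy data frame; the column names are hypothetical.
df <- data.frame(id = 1:3,
                 x1 = c(1, 2, 3),
                 x2 = c(4, 5, 6),
                 x3 = c(7, 8, NA))

# Position of the first column of interest, looked up by exact name.
idx <- which(names(df) == "x1")

wrong <- idx:idx + 2      # parsed as (idx:idx) + 2: one position only
right <- idx:(idx + 2)    # the intended range of three positions

print(wrong)
print(right)

# Row-wise mean over the intended range, ignoring missing values.
df$avg <- rowMeans(df[, right], na.rm = TRUE)
print(df$avg)
```

Looking the position up with `which(names(df) == ...)` instead of `grep()` is a design choice here: it matches the name exactly, whereas `grep()` would also match any other column whose name contains the pattern.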

However, I don’t think this solution is very elegant. Could you advise me on other ways to do it? Is it still possible to index my variables with the colon (:), and have I simply made a mistake in my code? Thanks all.

JetLag

2 Answers

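Try dplyr::select(df, Mcheck5_1_1:Mcheck5_8_1). The error is most likely caused by a package conflict: another attached package also defines a select() function and is masking dplyr’s version. See here for a related question.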
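
One way to check whether masking is the cause is to ask R which attached packages define `select()`. The sketch below assumes MASS is the conflicting package; that is a common case, but the question does not confirm it, and any package that exports its own `select()` would behave similarly. The small data frame is made up for illustration.

```r
library(dplyr)
library(MASS)   # assumption: attached after dplyr, so its select() is found first

# Every attached package that defines an object called "select",
# in search-path order (the first entry is what a bare select() calls).
owners <- find("select")
print(owners)

# The namespace that the bare name select() currently resolves to.
active <- environmentName(environment(select))
print(active)

# Calling the dplyr version explicitly sidesteps the masking.
df <- data.frame(a = 1:2, b = 3:4, c = 5:6)
print(dplyr::select(df, a:b))
```

If the second value printed is not "dplyr", the `dplyr::` prefix (or attaching dplyr last) should restore the colon-based selection.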

To calculate the mean for each of those columns:

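library(magrittr)  # provides the %>% pipe
library(purrr)     # provides map()
df %>%
  dplyr::select(Mcheck5_1_1:Mcheck5_8_1) %>%  # keep only the eight Mcheck5 columns
  map(mean)                                   # list with one mean per column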
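
If the goal is instead one average per row, as in the question’s rowMeans() call, something along these lines may work. This is a minimal sketch: the data frame below is a made-up stand-in with three Mcheck5 columns instead of eight, and the `id` and `other` columns are hypothetical.

```r
library(dplyr)

# Made-up stand-in for the real data.
df <- data.frame(
  id          = 1:3,
  Mcheck5_1_1 = c(1, 2, NA),
  Mcheck5_2_1 = c(3, 4, 5),
  Mcheck5_3_1 = c(5, 6, 7),
  other       = c(0, 0, 0)
)

# Select the contiguous block of columns by name, then average across each row,
# ignoring missing values.
df$avg <- df %>%
  dplyr::select(Mcheck5_1_1:Mcheck5_3_1) %>%
  rowMeans(na.rm = TRUE)

print(df$avg)
```

Because the columns are picked by name, the result does not depend on where the block sits in the data frame, which addresses the concern about positional indexes.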
markus
  • Btw, dplyr now imports the %>% operator from the magrittr package, so you may want to use it. For more info, please see https://stackoverflow.com/questions/23621209/differences-between-dplyr-and-magrittr – JetLag Oct 09 '17 at 13:52

Using contains() may help: it selects columns by searching their names for a string, so in your case it would be: select(df, contains("Mcheck5_"))
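
A short sketch of how this could look end to end, on a made-up data frame (the `id` and `other` columns and the values are assumptions for illustration). Note that contains() matches the string anywhere in the column name and ignores case by default, so it would also pick up any other column that happens to contain "Mcheck5_".

```r
library(dplyr)

# Made-up stand-in for the real data, with two Mcheck5 columns.
df <- data.frame(
  id          = 1:2,
  Mcheck5_1_1 = c(1, 3),
  Mcheck5_2_1 = c(2, 5),
  other       = c(9, 9)
)

# Keep every column whose name contains "Mcheck5_".
picked <- dplyr::select(df, contains("Mcheck5_"))
print(names(picked))

# Row-wise mean over the selected columns, ignoring missing values.
df$avg <- rowMeans(picked, na.rm = TRUE)
print(df$avg)
```

If only names beginning with the prefix should match, starts_with("Mcheck5_") is a stricter alternative.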

edgararuiz