I have a data frame with 562 columns. I want to keep only the columns whose names contain the string 'mean' or 'std'. How can I do this?

These are some of the column names:

  [1] "tBodyAcc-mean-X"  
  [2] "tBodyAcc-mean-Y"                     
  [3] "tBodyAcc-mean-Z"                     
  [4] "tBodyAcc-std-X"                      
  [5] "tBodyAcc-std-Y"                      
  [6] "tBodyAcc-std-Z"                      
  [7] "tBodyAcc-mad-X"                      
  [8] "tBodyAcc-mad-Y"                      
  [9] "tBodyAcc-mad-Z"                      
 [10] "tBodyAcc-max-X"                      
 [11] "tBodyAcc-max-Y"                      
 [12] "tBodyAcc-max-Z"                      
 [13] "tBodyAcc-min-X"                      
 [14] "tBodyAcc-min-Y"                      
 [15] "tBodyAcc-min-Z"                                  
 [41] "tGravityAcc-mean-X"                  
 [42] "tGravityAcc-mean-Y"                  
 [43] "tGravityAcc-mean-Z"                  
 [44] "tGravityAcc-std-X"                   
 [45] "tGravityAcc-std-Y"                   
 [46] "tGravityAcc-std-Z"                   
 [47] "tGravityAcc-mad-X"                   
 [48] "tGravityAcc-mad-Y"                   
 [49] "tGravityAcc-mad-Z"                   
 [50] "tGravityAcc-max-X"                   
 [51] "tGravityAcc-max-Y"                   
 [52] "tGravityAcc-max-Z"                   

1 Answer

You can use grep or grepl to match column names against a regular expression. If your data frame is called df:

df[grepl('mean|std', names(df))]
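To see what that line does, here is a minimal sketch on a toy data frame. The column names are copied from the question; the values are made up for illustration, and check.names = FALSE is assumed so that the hyphens in the names are preserved.

```r
# Toy data frame: names taken from the question, values invented
df <- data.frame(
  "tBodyAcc-mean-X" = c(0.28, 0.27),
  "tBodyAcc-std-X"  = c(-0.99, -0.98),
  "tBodyAcc-mad-X"  = c(-0.98, -0.97),
  "tBodyAcc-max-X"  = c(-0.93, -0.94),
  check.names = FALSE  # keep the hyphens instead of converting them to dots
)

# Logical vector: TRUE where the column name contains "mean" or "std"
keep <- grepl("mean|std", names(df))

# Indexing with a logical vector keeps only the matching columns
selected <- df[keep]
print(names(selected))

# grep() with value = TRUE returns the matching names themselves
matched_names <- grep("mean|std", names(df), value = TRUE)
```

grepl returns one TRUE/FALSE per column name, so the same vector can also be negated (df[!keep]) to get every column that does not match.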

Or, in dplyr, you can use select with matches:

library(dplyr)
df %>% select(matches('mean|std'))
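One caveat: 'mean|std' matches those substrings anywhere in a name. If some of the 562 columns contain "mean" as part of a longer word, a stricter pattern may be needed. The sketch below uses base R and assumes a hypothetical column name, "fBodyAcc-meanFreq-X", which does not appear in the question, to show the difference.

```r
# Two names from the question plus one hypothetical name (the meanFreq one)
cols <- c("tBodyAcc-mean-X", "tBodyAcc-std-X", "fBodyAcc-meanFreq-X")

# Loose pattern: matches "mean" or "std" anywhere in the name
loose <- grepl("mean|std", cols)

# Stricter pattern: "mean" or "std" must sit between two hyphens
strict <- grepl("-(mean|std)-", cols)

print(cols[loose])
print(cols[strict])
```

The stricter pattern can be passed to matches() in the same way. Note that matches() ignores case by default, whereas grepl is case-sensitive.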
Ronak Shah