I'm trying to use mutate_if, select_if, and similar scoped verbs with a predicate function that tests the column names.

See example below:

> btest <- data.frame(
+   sjr_first = c('1','2','3',NA, NA, '6'),
+   jcr_first = c('1','2','3',NA, NA, '6'),
+   sjr_second = LETTERS[1:6],
+   jcr_second = LETTERS[1:6],
+   sjr_third = as.character(seq(6)),
+   jcr_fourth = seq(6) + 5,
+   stringsAsFactors = FALSE)
> 
> btest %>% select_if(.predicate = ~ str_match(names(.), 'jcr'))
Error in selected[[i]] <- eval_tidy(.p(column, ...)) : 
  replacement has length zero

I'm aware I could use btest %>% select_at(vars(dplyr::matches('jcr'))), but my goal is to combine the column-name condition with another condition (e.g. is.numeric) in mutate_if() so that I operate on only a subset of my columns. However, I'm not sure how to get the name-matching part to work.

Brandon
  • Side note: Found this question which involves accessing column names within the `mutate_if` function, which appears to be quite a different process, but may be useful for other searchers: https://stackoverflow.com/questions/48868208/extract-column-name-in-mutate-if-call – Brandon Dec 07 '19 at 11:07

3 Answers

You can do:

btest %>%
 select_if(str_detect(names(.), "jcr") & sapply(., is.numeric))

  jcr_fourth
1          6
2          7
3          8
4          9
5         10
6         11
tmfmnk

Tidyverse solution:

library(dplyr)

# Return (get):
btest %>%
  select_if(grepl("jcr", names(.)) & sapply(., is.numeric))

# Mutate (set):
# funs() is deprecated since dplyr 0.8.0; a formula lambda does the same job.
btest %>%
  mutate_if(grepl("jcr", names(.)) & sapply(., is.numeric), ~ paste0("whatever", .))
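
On newer dplyr versions the same combination can also be written with tidyselect helpers. The following is a minimal sketch, not part of the original answer, assuming dplyr 1.0.0 or later (where across() and where() are available). The three-column btest and the `* 2` transformation are made up for illustration.

```r
library(dplyr)

# Reduced, illustrative version of the question's data frame.
btest <- data.frame(
  sjr_third  = as.character(seq(6)),
  jcr_second = LETTERS[1:6],
  jcr_fourth = seq(6) + 5,
  stringsAsFactors = FALSE
)

# Select: numeric columns whose name contains "jcr".
selected <- btest %>%
  select(where(is.numeric) & matches("jcr"))

# Mutate: transform only that subset, leaving the other columns untouched.
doubled <- btest %>%
  mutate(across(where(is.numeric) & matches("jcr"), ~ .x * 2))
```

Here both conditions are selection helpers, so they can be combined with `&` directly, without building logical vectors by hand.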

Base R solution:

# Return (get): 

btest[,grepl("jcr", names(btest)) & sapply(btest, is.numeric), drop = FALSE]

# Mutate (set): 

cols <- grepl("jcr", names(btest)) & sapply(btest, is.numeric)  # compute the column mask once
btest[cols] <- lapply(btest[cols], function(x) paste0("whatever", x))
hello_friend

You could chain two separate select_if calls:

library(dplyr)
library(stringr)

btest %>% select_if(str_detect(names(.), 'jcr')) %>% select_if(is.numeric)

#  jcr_fourth
#1          6
#2          7
#3          8
#4          9
#5         10
#6         11

The two conditions cannot be passed together as they stand, because the first is a logical vector computed from the whole data frame at once, whereas the second is a function that select_if applies column by column.
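
The difference can be sketched as follows. This is an illustrative example rather than part of the original answer; the four-column btest is a reduced version of the question's data, and the names by_name, by_type, and keep are made up here. It shows that a single column carries no names, which is why a per-column predicate cannot see them, and that turning the column-wise test into a logical vector with sapply() gives both conditions the same shape.

```r
library(dplyr)

# Reduced, illustrative version of the question's data frame.
btest <- data.frame(
  sjr_first  = c("1", "2", "3", NA, NA, "6"),
  jcr_second = LETTERS[1:6],
  sjr_third  = as.character(seq(6)),
  jcr_fourth = seq(6) + 5,
  stringsAsFactors = FALSE
)

# A function or formula predicate is called with one column at a time,
# and a bare column vector has no names attribute.
names(btest$jcr_fourth)  # NULL

# Whole-frame condition: one TRUE/FALSE per column, based on the names.
by_name <- grepl("jcr", names(btest))

# Column-wise condition, evaluated up front into the same shape.
by_type <- sapply(btest, is.numeric)

# Both are logical vectors of length ncol(btest), so `&` combines them.
keep <- by_name & by_type

result <- btest %>% select_if(keep)
names(result)  # "jcr_fourth"
```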

Ronak Shah