There are plenty of posts on using dplyr's select_if
with multiple conditions. However, no matter what I try, combining is.factor
with a condition on the variable names has not worked for me so far.
Ultimately, I would like to select all factors in a df/tibble and exclude certain variables by name.
Example:
library(tidyverse)  # dplyr, tibble, stringr and purrr are all used below
df <- tibble(A = factor(c(0, 1, 0, 1)),
             B = factor(c("Yes", "No", "Yes", "No")),
             C = c(1, 2, 3, 4))
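For reference, the result I am after can be sketched in base R without select_if. This is only a workaround to show the expected output, not the dplyr solution I am looking for; is_fct, keep_name and wanted are names I made up for illustration:

```r
library(tibble)

df <- tibble(A = factor(c(0, 1, 0, 1)),
             B = factor(c("Yes", "No", "Yes", "No")),
             C = c(1, 2, 3, 4))

# TRUE for every column that is a factor
is_fct <- vapply(df, is.factor, logical(1))
# TRUE for every column whose name does not contain "A"
keep_name <- !grepl("A", names(df), fixed = TRUE)

# Keep only the columns that satisfy both conditions
wanted <- df[unname(is_fct & keep_name)]
names(wanted)
```

With the example data, only column B is both a factor and not matched by "A".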
Various attempts:
Attempt 1
df %>%
select_if(function(col) is.factor(col) & !str_detect(names(col), "A"))
Error in selected[[i]] <- .p(.tbl[[tibble_vars[[i]]]], ...) : replacement has length zero
Attempt 2
df %>%
select_if(function(col) is.factor(col) & negate(str_detect(names(col)), "A"))
Error: Can't convert a logical vector to function
Call `rlang::last_error()` to see a backtrace
Attempt 3
df %>%
select_if(function(col) is.factor(col) && !str_detect(names(col), "A"))
Error: Only strings can be converted to symbols
Call `rlang::last_error()` to see a backtrace
Attempt 4
df %>%
select_if(is.factor(.) && !str_detect(names(.), "A"))
Error in tbl_if_vars(.tbl, .predicate, caller_env(), .include_group_vars = TRUE) : length(.p) == length(tibble_vars) is not TRUE
Meanwhile, each condition works fine on its own:
> df %>%
+ select_if(is.factor)
# A tibble: 4 x 2
  A     B
  <fct> <fct>
1 0     Yes
2 1     No
3 0     Yes
4 1     No
> df %>%
+ select_if(!str_detect(names(.), "A"))
# A tibble: 4 x 2
  B         C
  <fct> <dbl>
1 Yes       1
2 No        2
3 Yes       3
4 No        4
The problem probably lies here:
df %>%
select_if(function(col) !str_detect(names(col), "A"))
Error in selected[[i]] <- .p(.tbl[[tibble_vars[[i]]]], ...) : replacement has length zero
However, I have little clue how to fix this.
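To narrow this down, here is a minimal check outside of select_if. It assumes, based on the `.p(.tbl[[tibble_vars[[i]]]], ...)` call in the error message, that the predicate receives the bare column vector and not the column name; `col` is a stand-in I made up for that argument:

```r
library(stringr)

# Stand-in for what the predicate function receives: the column itself
col <- factor(c(0, 1, 0, 1))

# A bare vector carries no names, so there is nothing to match against
col_names <- names(col)
is.null(col_names)

# Detecting a pattern in NULL yields a zero-length logical vector
res <- str_detect(col_names, "A")
length(res)
```

If that assumption holds, the predicate returns a zero-length result for every column, which would match the "replacement has length zero" error, and it suggests that the column names are simply not visible from inside the function.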