There are some similar questions (like here, or here), but none with quite the answer I am looking for.
The question:
How to use select() only on columns of a certain type?
The select helper functions used in select_if() or select_at() may only reference the column name or index. In this particular case I want to select columns of a certain type (numeric) and then select a subset of them based on their column sum, while not losing the columns of other types (character).
What I would like to do:
library(dplyr)

tibbly = tibble(x = c(1, 2, 3, 4),
                y = c("a", "b", "c", "d"),
                z = c(9, 8, 7, 6))
# A tibble: 4 x 3
      x y         z
  <dbl> <chr> <dbl>
1     1 a         9
2     2 b         8
3     3 c         7
4     4 d         6
tibbly %>%
select_at(is.numeric, colSums(.) > 12)
Error: `.vars` must be a character/numeric vector or a `vars()` object, not primitive
This doesn't work because select_at() doesn't recognize is.numeric as a proper function to select columns.
If I do something like:
tibbly %>%
select_if(is.numeric) %>%
select_if(colSums(.) > 12)
I manage to select only the columns with a sum > 12, but I also lose the character columns. I would like to avoid having to reattach the lost columns afterwards.
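To make the desired result concrete, here is a minimal sketch of one way it might be expressed. It assumes dplyr 1.0.0 or later, where select() accepts the where() helper. The helper name keep_col and its threshold argument are hypothetical and used only for illustration. The idea is to fold both conditions into one predicate, so that non-numeric columns always pass and numeric columns pass only when their sum is large enough.

```r
library(dplyr)

tibbly <- tibble(x = c(1, 2, 3, 4),
                 y = c("a", "b", "c", "d"),
                 z = c(9, 8, 7, 6))

# Hypothetical helper (not part of dplyr): keep a column if it is
# non-numeric, or if it is numeric and its sum exceeds the threshold.
# The || short-circuits, so sum() is never called on a character column.
keep_col <- function(col, threshold = 12) {
  !is.numeric(col) || sum(col) > threshold
}

# where() applies the predicate to each column and keeps those returning TRUE.
result <- tibbly %>%
  select(where(keep_col))

names(result)
```

With the example data this should keep y and z and drop x (whose sum is 10). On older dplyr versions, select_if(keep_col) might serve the same purpose, though I have not checked how either form behaves when a numeric column contains NA values.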
Is there a better way to select columns in a dplyr fashion, based on some properties other than their names / index?
Thank you!