I am trying to figure how to efficiently select columns using dplyr::select_if
. The starwars
data set in dplyr 0.70 is a good dataset to use for this:
> starwars
# A tibble: 87 x 13
name height mass hair_color skin_color eye_color birth_year gender homeworld species films vehicles starships
<chr> <int> <dbl> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <list> <list> <list>
1 Luke Skywalker 172 77 blond fair blue 19.0 male Tatooine Human <chr [5]> <chr [2]> <chr [2]>
2 C-3PO 167 75 <NA> gold yellow 112.0 <NA> Tatooine Droid <chr [6]> <chr [0]> <chr [0]>
3 R2-D2 96 32 <NA> white, blue red 33.0 <NA> Naboo Droid <chr [7]> <chr [0]> <chr [0]>
4 Darth Vader 202 136 none white yellow 41.9 male Tatooine Human <chr [4]> <chr [0]> <chr [1]>
5 Leia Organa 150 49 brown light brown 19.0 female Alderaan Human <chr [5]> <chr [1]> <chr [0]>
6 Owen Lars 178 120 brown, grey light blue 52.0 male Tatooine Human <chr [3]> <chr [0]> <chr [0]>
7 Beru Whitesun lars 165 75 brown light blue 47.0 female Tatooine Human <chr [3]> <chr [0]> <chr [0]>
8 R5-D4 97 32 <NA> white, red red NA <NA> Tatooine Droid <chr [1]> <chr [0]> <chr [0]>
9 Biggs Darklighter 183 84 black light brown 24.0 male Tatooine Human <chr [1]> <chr [0]> <chr [1]>
10 Obi-Wan Kenobi 182 77 auburn, white fair blue-gray 57.0 male Stewjon Human <chr [6]> <chr [1]> <chr [5]>
Now say that I would like select columns that are only integers. This works well:
library(dplyr)
starwars %>%
select_if(is.numeric)
But what should I do if I want to select based on multiple criteria. For example maybe I want both numeric and character columns:
starwars %>%
select_if(c(is.numeric, is.character))
Or maybe I want all numeric AND the name
column:
starwars %>%
select_if(name, is.character)
Neither of the two examples above work so I am wondering how I might accomplish what I've outlined here.