I want to create two data frames from 'Arrests': the first should include only the variables with numeric values, and the second only the categorical variables. For example:

X <- data.frame(Arrests)  
X
     released colour year age    sex employed citizen checks
1         Yes  White 2002  21   Male      Yes     Yes      3
2          No  Black 1999  17   Male      Yes     Yes      3
3         Yes  White 2000  24   Male      Yes     Yes      3
4          No  Black 2000  46   Male      Yes     Yes      1
5         Yes  Black 1999  27 Female      Yes     Yes      1
6         Yes  Black 1998  16 Female      Yes     Yes      0
7         Yes  White 1999  40   Male       No     Yes      0

I want one data frame that includes only year, age and checks, because those are the numeric variables, and another that includes released, colour, sex, employed and citizen, because those are the categorical variables. I tried the code below:

Y <- sapply(X, is.numeric)
Y
released   colour     year      age      sex employed  citizen   checks 
   FALSE    FALSE     TRUE     TRUE    FALSE    FALSE    FALSE     TRUE 

This tells me which variables are numeric, but how can I create a data frame that includes only those 3 numeric variables, and another that includes only the 5 categorical variables?

Bustergun

2 Answers

Using dplyr, you can use select_if:

library(dplyr)

# Your data
arrests <- data.frame(released = c("Yes", "No"),
                      colour = c("White", "Black"),
                      year = c(2002, 1999),
                      age = c(21, 17))

# Solution
select_if(arrests, is.numeric) 

  year age
1 2002  21
2 1999  17
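
The same idea also covers the categorical half of the question. Below is a minimal sketch, assuming dplyr 1.0.0 or later, where select() accepts the where() helper; the names numeric_vars and categorical_vars are made up for illustration and are not from the question.

```r
library(dplyr)

# Same toy data as above
arrests <- data.frame(released = c("Yes", "No"),
                      colour = c("White", "Black"),
                      year = c(2002, 1999),
                      age = c(21, 17))

# Keep only the columns for which is.numeric() returns TRUE
numeric_vars <- select(arrests, where(is.numeric))

# Keep only the columns for which is.numeric() returns FALSE
categorical_vars <- select(arrests, where(~ !is.numeric(.x)))
```

Both results are still data frames, so they can be passed on to further dplyr verbs.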
tifu

Using base R:

Y <- X[, sapply(X, is.numeric)]

is your data.frame containing all numeric variables. And

Z <- X[, !sapply(X, is.numeric)]

is your data.frame containing all categorical variables.
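
One caveat with the indexing above: if only a single column matches, `X[, cols]` drops down to a plain vector instead of a data frame. A hedged variant that guards against this with `drop = FALSE` is sketched here; the small X is a stand-in for the Arrests data and `is_num` is just an illustrative name.

```r
# Small stand-in for the Arrests data from the question
X <- data.frame(released = c("Yes", "No", "Yes"),
                colour = c("White", "Black", "White"),
                year = c(2002, 1999, 2000),
                age = c(21, 17, 24),
                checks = c(3, 3, 3))

# Logical vector: TRUE for numeric columns, FALSE otherwise
is_num <- vapply(X, is.numeric, logical(1))

# drop = FALSE keeps the result a data frame even if one column is selected
Y <- X[, is_num, drop = FALSE]   # numeric variables
Z <- X[, !is_num, drop = FALSE]  # categorical variables
```

`vapply()` is used instead of `sapply()` only because it guarantees a logical vector back; `sapply()` works the same way here.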

symbolrush
  • Thanks, this works. Just a follow-up question: what if I also want to exclude variables whose values are all missing/NULL? How can I add that to the code? – Bustergun May 09 '18 at 10:43
  • Let's say I have a variable "country" and it only has NULL or NA values. How can I exclude that variable? – Bustergun May 09 '18 at 10:49
  • Look at `complete.cases(NA)` or `is.na()`. You may want to start here: https://www.rdocumentation.org/packages/stats/versions/3.5.0/topics/complete.cases – symbolrush May 09 '18 at 18:19
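
For the follow-up about all-missing variables, one possible approach along the lines of the `is.na()` hint is sketched below. It assumes the missing values are stored as NA; the toy data frame, the "country" column and the names `all_missing` and `X_clean` are illustrative only.

```r
# Toy data: "country" contains nothing but NA, "sex" is only partly missing
X <- data.frame(year = c(2002, 1999),
                country = c(NA, NA),
                sex = c("Male", NA))

# TRUE for columns in which every value is NA
all_missing <- vapply(X, function(col) all(is.na(col)), logical(1))

# Drop those columns; drop = FALSE keeps the result a data frame
X_clean <- X[, !all_missing, drop = FALSE]
```

The numeric/categorical split from the answer can then be applied to `X_clean` instead of `X`.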