Check if all values in data.frame columns are integers to subset dummy variables (aka are all values in a column TRUE?)

Question

I would like to know if there is a simpler way of subsetting a data.frame's integer columns.

My goal is to modify the numerical columns in my data.frame without touching the purely integer columns (in my case containing 0 or 1). The integer columns were originally factor levels turned into dummy variables and should stay as they are. So I want to temporarily remove them.

To distinguish numerical from integer columns I used the OP's version from here (Check if the number is integer).

But is.wholenumber returns a matrix of TRUE/FALSE instead of one value per column like is.numeric, therefore sapply(mtcars, is.wholenumber) does not help me. I came up with the following solution, but I thought there must be an easier way?

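data(mtcars)
# TRUE for each value that is a whole number within floating-point tolerance
is.wholenumber <- function(x, tol = .Machine$double.eps^0.5)  abs(x - round(x)) < tol
# logical vector, named by column: TRUE where every value in the column is whole
integer_column_names <-  apply(is.wholenumber(mtcars), 2, mean) == 1
# keep only the columns that contain at least one non-whole value
numeric_df <- mtcars[, !integer_column_names]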

In case you want to identify columns with dummy values, try `names(mtcars)[vapply(mtcars, function(x)length(unique(x))==2, logical(1))]`. If you want to be stricter and allow only 0 and 1, you may try `names(mtcars)[vapply(mtcars, function(x)all(sort(unique(x)) %in% c(0,1))==TRUE, logical(1))]` – PKumar, Mar 06 '20 at 09:42
@PKumar would you mind adding these versions as an answer, as stricter ways to test for dummy variables? I accepted camnesia's answer because it is what I asked for, but your suggestions are also really helpful in the specific case of only 0 and 1 – crazysantaclaus, Mar 06 '20 at 10:09
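
The two checks from PKumar's comment can be written out as a small runnable sketch. The helper names `is_binary` and `is_zero_one` are made up here for readability and do not appear in the comment. Note that the looser check also flags any column with exactly two distinct values (for example 1 and 2), and the stricter one also accepts a column that is constant 0 or constant 1.

```r
# Hypothetical helper names, introduced only for this sketch.
# Loose check: the column has exactly two distinct values.
is_binary <- function(x) length(unique(x)) == 2
# Strict check: every distinct value is 0 or 1 (an NA makes this FALSE).
is_zero_one <- function(x) all(unique(x) %in% c(0, 1))

binary_cols   <- names(mtcars)[vapply(mtcars, is_binary, logical(1))]
zero_one_cols <- names(mtcars)[vapply(mtcars, is_zero_one, logical(1))]

print(binary_cols)
print(zero_one_cols)
```

For `mtcars`, both checks should pick out only `vs` and `am`.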

camnesia · Accepted Answer · 2020-03-06T09:36:29.017

You can use dplyr to achieve that, as shown here:

library(dplyr)

is_whole <- function(x) all(floor(x) == x)

df = select_if(mtcars, is_whole)

or in base R

df = mtcars[ ,sapply(mtcars, is_whole)]
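
Since the question asks for the data.frame without the whole-number columns, a possible base R variation is sketched below. It computes the column mask once and uses it for both subsets. The names `whole_cols`, `integer_df` and `numeric_df` are illustrative, `vapply` is swapped in for `sapply` so the result is guaranteed to be a logical vector, and `drop = FALSE` keeps a data.frame even if only one column is selected. It assumes all columns are numeric and contain no `NA`, which holds for `mtcars`.

```r
# Same predicate as in the answer: TRUE if every value is a whole number.
is_whole <- function(x) all(floor(x) == x)

# One logical value per column (assumes numeric columns without NA).
whole_cols <- vapply(mtcars, is_whole, logical(1))

integer_df <- mtcars[, whole_cols, drop = FALSE]   # whole-number columns
numeric_df <- mtcars[, !whole_cols, drop = FALSE]  # everything else

print(names(integer_df))
print(names(numeric_df))
```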

edited Mar 06 '20 at 09:36

answered Mar 05 '20 at 16:29

camnesia


that looks really good, but do you know of a version without using an additional package? – crazysantaclaus Mar 05 '20 at 22:01

perfect, from your answer I was able to adjust the `is.wholenumber` function to include `all` : `is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) all(abs(x - round(x)) < tol)` which is why it is now working with vectors, too. – crazysantaclaus Mar 06 '20 at 10:07
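
Putting the adjusted `is.wholenumber` from this comment to work, the original goal (modify the non-integer columns, leave the dummy columns untouched) might look like the sketch below. Standardizing each column is only an assumed placeholder for whatever modification is actually needed, and `whole_cols` and `result` are illustrative names. The first lines also show why the tolerance can matter compared with `floor(x) == x`.

```r
# Tolerance-based check with all(), as given in the comment above.
is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) {
  all(abs(x - round(x)) < tol)
}

# Floating-point noise: sqrt(2)^2 is not exactly 2.
x <- sqrt(2)^2
exact_check     <- all(floor(x) == x)  # FALSE, strict equality fails
tolerance_check <- is.wholenumber(x)   # TRUE, within tolerance

# One logical value per column.
whole_cols <- vapply(mtcars, is.wholenumber, logical(1))

# Modify only the non-whole columns; standardizing is a placeholder transformation.
result <- mtcars
result[!whole_cols] <- lapply(result[!whole_cols],
                              function(col) (col - mean(col)) / sd(col))
```

The whole-number columns in `result` stay identical to those in `mtcars`, so there is no need to remove them temporarily and bind them back afterwards.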
