
I would like to know if there is a simpler way of subsetting a data.frame's integer columns.

My goal is to modify the numerical columns in my data.frame without touching the purely integer columns (in my case containing 0 or 1). The integer columns were originally factor levels turned into dummy variables and should stay as they are. So I want to temporarily remove them.

To distinguish numerical from integer columns I used the OP's version from here (Check if the number is integer).

But is.wholenumber returns a matrix of TRUE/FALSE values rather than one value per column as is.numeric does, so sapply(mtcars, is.wholenumber) does not help me. I came up with the following solution, but is there an easier way?

data(mtcars)
is.wholenumber <- function(x, tol = .Machine$double.eps^0.5)  abs(x - round(x)) < tol
integer_column_names <-  apply(is.wholenumber(mtcars), 2, mean) == 1
numeric_df <- mtcars[, !integer_column_names]

crazysantaclaus
  • If you want to identify columns with dummy values, try `names(mtcars)[vapply(mtcars, function(x)length(unique(x))==2, logical(1))]`. To be stricter and accept only 0 and 1, you could try `names(mtcars)[vapply(mtcars, function(x)all(sort(unique(x)) %in% c(0,1))==TRUE, logical(1))]` – PKumar Mar 06 '20 at 09:42
  • @PKumar would you mind adding these versions as stricter ways to test for dummy variables? I accepted camnesia's answer because it is what I asked for, but your suggestions are also really helpful in the specific case of only 0 and 1 – crazysantaclaus Mar 06 '20 at 10:09
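
For readers who only need the strict 0/1 case discussed in the comments above, the check could be wrapped in a small helper. This is only a sketch: the names `is_dummy`, `dummy_cols` and `non_dummy_df` are made up for illustration, and it assumes the columns contain no NA values.

```r
data(mtcars)

# Hypothetical helper: TRUE only if a column is numeric and holds nothing but 0 and 1.
# Assumes no NA values; a column containing NA is reported as not a dummy.
is_dummy <- function(x) is.numeric(x) && all(x %in% c(0, 1))

# One TRUE/FALSE per column; vapply guarantees a logical vector.
dummy_cols <- vapply(mtcars, is_dummy, logical(1))

# Names of the 0/1 columns.
names(mtcars)[dummy_cols]

# Data frame without the dummy columns; drop = FALSE keeps it a data.frame
# even if only one column remains.
non_dummy_df <- mtcars[, !dummy_cols, drop = FALSE]
```

Unlike the whole-number test, this leaves integer-valued but non-binary columns such as `cyl` or `hp` in the result.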

1 Answer


You can use dplyr to achieve that, as shown here:

library(dplyr)

is_whole <- function(x) all(floor(x) == x)

df = select_if(mtcars, is_whole)

or in base R

df = mtcars[ ,sapply(mtcars, is_whole)]
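
Since the question is about modifying the other columns while leaving the whole-number ones untouched, the same logical index can also be used for assignment instead of subsetting. The following is a rough base R sketch of that idea; the name `scaled` and the choice of `scale()` as the modification are assumptions made purely for illustration.

```r
data(mtcars)

is_whole <- function(x) all(floor(x) == x)

# One TRUE/FALSE per column.
whole_cols <- vapply(mtcars, is_whole, logical(1))

# Work on a copy and overwrite only the columns that are not whole numbers.
# scale() returns a one-column matrix, so as.numeric() turns it back into a vector.
scaled <- mtcars
scaled[!whole_cols] <- lapply(scaled[!whole_cols], function(x) as.numeric(scale(x)))
```

This avoids splitting the data frame and binding it back together afterwards, and the column order stays the same.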
camnesia
  • that looks really good, but do you know of a version without using an additional package? – crazysantaclaus Mar 05 '20 at 22:01
  • Perfect, from your answer I was able to adjust the `is.wholenumber` function to include `all`: `is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) all(abs(x - round(x)) < tol)`, which is why it now works with vectors, too. – crazysantaclaus Mar 06 '20 at 10:07
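
To illustrate the adjusted function from the last comment, here is a short sketch that applies it per column and shows where the tolerance-based check can differ from the exact `floor(x) == x` comparison. The names `whole_cols` and `numeric_df` are only illustrative.

```r
data(mtcars)

# Adjusted version from the comment: all() collapses the result to one value.
is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) all(abs(x - round(x)) < tol)

# Exact comparison from the answer, for contrast.
is_whole <- function(x) all(floor(x) == x)

whole_cols <- vapply(mtcars, is.wholenumber, logical(1))
numeric_df <- mtcars[, !whole_cols, drop = FALSE]

# The two checks can disagree on values that carry floating-point error:
# sqrt(2)^2 is not exactly 2 in double precision.
is.wholenumber(sqrt(2)^2)  # TRUE, within tolerance
is_whole(sqrt(2)^2)        # FALSE, exact comparison
```

For columns that were computed rather than read in as integers, the tolerance-based version is probably the safer choice.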