15

Why doesn't this work with data.table?

Selecting the numeric columns this way works with a data.frame. Is there a way to do it with a data.table?

x <- data.table(v1=1:20,v2=1:20,v3=1:20,v4=letters[1:20])
y <- x[ , sapply(x, is.numeric)]

This returns:

v1    v2    v3    v4
TRUE  TRUE  TRUE FALSE
Henrik
Fred R.
  • The data.table FAQ vignette covers this with FAQs 1.1 and 1.2 – mnel Aug 05 '14 at 03:34
  • Very nearly a duplicate of [select multiple columns in data table](http://stackoverflow.com/questions/13383840/select-multiple-columns-in-data-table-r) – thelatemail Aug 05 '14 at 04:17
  • Possible duplicate of [Selecting only numeric columns from a data frame](http://stackoverflow.com/questions/5863097/selecting-only-numeric-columns-from-a-data-frame) – Aramis7d Apr 04 '16 at 07:07

5 Answers

39

From data.table 1.13.0 ".SDcols accepts a function which is used to select the columns of .SD". Thus, simply .SDcols = is.numeric:

x[ , .SD, .SDcols = is.numeric]
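
As a minimal sketch of the call above, assuming data.table 1.13.0 or later is installed: the table mirrors the x from the question, and the name num_only is only illustrative.

```r
library(data.table)

# Same shape as the table in the question: three integer columns, one character column
x <- data.table(v1 = 1:20, v2 = 1:20, v3 = 1:20, v4 = letters[1:20])

# .SDcols takes a predicate function; .SD then holds only the columns
# for which the predicate returns TRUE
num_only <- x[, .SD, .SDcols = is.numeric]

names(num_only)
```

Here names(num_only) should list v1, v2 and v3, with the character column v4 dropped.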
Henrik
Artem Klevtsov
15

data.table needs with=FALSE to select columns by the positions stored in a variable.

tokeep <- which(sapply(x,is.numeric))
x[ , tokeep, with=FALSE]
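
A short sketch of the same idea, run against the x from the question. The `..` prefix shown next to with=FALSE is not part of this answer; it is an alternative data.table spelling added here for comparison, and the names a and b are only illustrative.

```r
library(data.table)

x <- data.table(v1 = 1:20, v2 = 1:20, v3 = 1:20, v4 = letters[1:20])

# Named integer positions of the numeric columns
tokeep <- which(sapply(x, is.numeric))

# with = FALSE: j is treated as column positions (or names), not as an expression
a <- x[, tokeep, with = FALSE]

# The .. prefix looks tokeep up in the calling scope, which has the same effect
b <- x[, ..tokeep]

identical(names(a), names(b))
```

Without with=FALSE (or the `..` prefix), j is evaluated as an ordinary expression, which is why the call in the question returned the logical vector itself.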
Mike.Gahan
3

You may also try:

x1 <- x[, Filter(is.numeric, .SD)]
head(x1, 3)
#    v1 v2 v3
# 1:  1  1  1
# 2:  2  2  2
# 3:  3  3  3

I have to admit, though, that it is slow for bigger datasets.

akrun
0

Similar to @akrun's answer:

Filter(is.numeric, x)
Jeff Bezos
0

We can write a custom helper called where() that returns the names of the columns of a data.frame/data.table for which f is satisfied, and then subset by those names:

where <- function(x, f) {
  colnames(x)[vapply(x, f, logical(1))]
}

df[, where(df, is.numeric), with = FALSE]
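
A sketch of the helper in use, with the question's x standing in for df. The data.frame line is an assumption added here rather than part of this answer: as far as base R goes, the data.frame `[` method has no with argument, so the sketch uses drop = FALSE there instead.

```r
library(data.table)

# Helper from the answer: names of the columns for which f returns TRUE
where <- function(x, f) {
  colnames(x)[vapply(x, f, logical(1))]
}

x <- data.table(v1 = 1:20, v2 = 1:20, v3 = 1:20, v4 = letters[1:20])

# data.table: column names held in an expression need with = FALSE
num_dt <- x[, where(x, is.numeric), with = FALSE]

# Any predicate works, for example picking the character columns instead
chr_cols <- where(x, is.character)

# Plain data.frame: base subsetting by name, keeping the result a data.frame
df <- as.data.frame(x)
num_df <- df[, where(df, is.numeric), drop = FALSE]
```

Because where() only returns names, the same helper can feed either subsetting style.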
Eyayaw