I want to select the columns of a data.table by their class. My approach is based on Nera's answer to "Convert column classes in data.table".

# example
library(data.table)
library(magrittr)  # provides %>%

DT <- data.table(x = c("a", "b", "c"),
                 y = c(1L, 2L, 3L),
                 z = c(1.1, 2.1, 3.1))

# filter by class with nested structure
changeCols <- colnames(DT)[which(as.vector(DT[,lapply(.SD, class)]) == "character")]
changeCols

# filter by class with pipeline
DT[,lapply(.SD, class)] %>%
  as.vector(.) == "character" %>%
  which(.)
# Error in which(.) : argument to 'which' is not logical

# simply breaking the pipeline avoids the error
cols <- DT[,lapply(.SD, class)] %>% as.vector(.) == "character"
which(cols)

I don't understand why the pipeline version raises an error while the version with the pipe broken in two works. Thanks!

Ian Wang
  • What's the error message? – neilfws Jun 15 '23 at 22:57
  • @neilfws, I've edited the question to show the error message. Thanks. (Error in which(.) : argument to 'which' is not logical) – Ian Wang Jun 15 '23 at 22:59
  • Most likely an issue with operator precedence, which for `%>%` is higher than for `==`. See https://stackoverflow.com/questions/38531508/order-of-operation-with-piping. As a result, in your case `"character" %>% which(.)` is evaluated first and results in an error. To fix that, try `DT[, lapply(.SD, class)] %>% { as.vector(.) == "character" } %>% which(.)`. – stefan Jun 15 '23 at 23:23

1 Answer

You need to wrap the as.vector(.) == "character" part in braces. %>% binds more tightly than ==, so without the braces only the string "character" is piped forward into which(.):

DT[,lapply(.SD, class)] %>%
  {as.vector(.) == "character"} %>%
  which(.)
#x 
#1 
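The same grouping issue can be reproduced without data.table or magrittr. The following is a minimal sketch, assuming R 4.1 or later for the base pipe |>; the named vector `classes` is a hypothetical stand-in for the result of lapply(.SD, class), and parentheses are used for grouping instead of braces.

```r
# Hypothetical stand-in for the column classes of DT
classes <- c(x = "character", y = "integer", z = "numeric")

# Without grouping, the pipe binds first, so this is classes == which("character").
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(
  classes == "character" |> which(),
  error = function(e) conditionMessage(e)
)
print(res)

# With grouping, the comparison runs first and its logical result is piped on.
idx <- (classes == "character") |> which()
print(idx)
```

Either parentheses or braces work with the base pipe here; with %>%, braces also stop magrittr from inserting the left-hand side as an extra first argument.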

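You can see where it went wrong by swapping %>% for the base pipe |> and . for _, then wrapping the whole thing in quote(). The base pipe isn't identical to %>%, but it sits at the same precedence level, which made it useful for debugging this case. quote() returns the parsed expression without evaluating it, and because |> is rewritten into nested calls at parse time, the output shows which call receives which argument: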

quote(
    DT[,lapply(.SD, class)] |>
    as.vector(x=_) == "character" |> 
    which(x=_)
)
##as.vector(x = DT[, lapply(.SD, class)]) == which(x = "character")
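The grouping can also be inspected without switching pipes, by indexing into the quoted call. This is a rough sketch using only base R; since quote() never evaluates the expression, neither `DT` nor `%>%` has to exist. The variable names `top_call`, `left_arg` and `right_arg` are just illustrative.

```r
# Capture the original pipeline as an unevaluated call
e <- quote(DT %>% as.vector(.) == "character" %>% which(.))

# Element 1 of a call is the function; the remaining elements are its arguments.
top_call  <- as.character(e[[1]])  # the operator applied last (loosest binding)
left_arg  <- deparse(e[[2]])       # everything grouped on the left of it
right_arg <- deparse(e[[3]])       # everything grouped on the right of it

print(top_call)
print(left_arg)
print(right_arg)
```

The top-level call is ==, and which(.) ends up on its right-hand side together with "character", which matches the error in the question.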

You could also avoid the pipe altogether by working inside the j argument of DT[i, j, by]:

DT[, which(lapply(.SD, class) == "character")]
#x 
#1 
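Another option, not part of the approach above, is to skip the class-string comparison and test each column with is.character(). The sketch below uses a plain data.frame (named `df` here for illustration) so that it runs without any packages; a data.table is also a list of columns, so the same vapply() call should apply to DT.

```r
# Same columns as the example, as a base data.frame
df <- data.frame(x = c("a", "b", "c"),
                 y = c(1L, 2L, 3L),
                 z = c(1.1, 2.1, 3.1),
                 stringsAsFactors = FALSE)

# One TRUE/FALSE per column; vapply() guarantees a logical vector
is_chr <- vapply(df, is.character, logical(1))

char_cols <- names(df)[is_chr]  # column names
char_idx  <- which(is_chr)      # column positions

print(char_cols)
print(char_idx)
```

This sidesteps the precedence question entirely and also behaves sensibly for columns whose class() has more than one element, where comparing against a single string is less reliable.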
thelatemail
  • Thanks for the detailed answer! I'm not sure my understanding of quote() is correct. As I understand it, quote() returns the expression without evaluating it. But when quote() receives code that uses the base pipe, the expression comes back already converted to nested calls, which is why it can be used to illustrate the bug. – Ian Wang Jun 16 '23 at 05:09