Firstly, apologies if this has been asked elsewhere. I wasn't sure what to search for, so I may be re-posting an existing question.
I am experiencing some strange behaviour in R when attempting to filter a data.table based on the value in one column existing in another column. I may not be going about this the best way, so I am open to guidance on that front; however, I want to better understand why R is behaving the way it is.
I have a data set:
library(data.table)
# COLA (length 4) and COLB (length 2) are recycled to the 8 rows of GRP
dt <- data.table(GRP = c(rep("a", 4), rep("b", 4)),
                 COLA = c("Type C plus more", "Type C plus more", "Type D then some", "Type D then some"),
                 COLB = c("Type C", "Type D"))
#    GRP             COLA   COLB
# 1:   a Type C plus more Type C
# 2:   a Type C plus more Type D
# 3:   a Type D then some Type C
# 4:   a Type D then some Type D
# 5:   b Type C plus more Type C
# 6:   b Type C plus more Type D
# 7:   b Type D then some Type C
# 8:   b Type D then some Type D
I want to filter dt based on the value in COLB existing in COLA. I expected this to be some form of string or regex matching, so I thought grepl would be suitable.
dt[grepl(COLB,COLA)]
#    GRP             COLA   COLB
# 1:   a Type C plus more Type C
# 2:   a Type C plus more Type D
# 3:   b Type C plus more Type C
# 4:   b Type C plus more Type D
Even when I use fixed = TRUE, I get the same output.
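For reference, the fixed = TRUE variant can be reproduced without data.table. This is only a minimal sketch: cola, colb and res_fixed are stand-in names introduced here to mirror the COLA and COLB columns, not objects from the original session.

```r
# Stand-ins for the COLA and COLB columns (hypothetical names)
cola <- rep(c("Type C plus more", "Type D then some"), each = 2, times = 2)
colb <- rep(c("Type C", "Type D"), times = 4)

# Same call as inside dt[...], but on plain character vectors.
# Note: R may emit a warning about the length of `pattern` here.
res_fixed <- grepl(colb, cola, fixed = TRUE)
res_fixed
```

The result should line up with the rows kept by the filter, which suggests the behaviour is not specific to data.table.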
How is it that for COLA = "Type D then some" I always get FALSE, and for COLA = "Type C plus more" I always get TRUE?
For the record, when I run grepl("Type D", "Type C plus more") it does return FALSE.
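To make the intended result concrete, the row-by-row check can be sketched in base R by calling grepl once per COLB/COLA pair. The names cola, colb and expected are introduced here for illustration only, and this is a sketch of the outcome being asked for, not necessarily the best way to get it.

```r
# Stand-ins for the COLA and COLB columns (hypothetical names)
cola <- rep(c("Type C plus more", "Type D then some"), each = 2, times = 2)
colb <- rep(c("Type C", "Type D"), times = 4)

# One scalar grepl() call per row: is colb[i] a substring of cola[i]?
expected <- mapply(function(p, x) grepl(p, x, fixed = TRUE),
                   colb, cola, USE.NAMES = FALSE)
expected
```

Under these assumptions, the rows where COLB matches COLA are 1, 4, 5 and 8, which is the subset the filter was expected to return.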