I was wondering whether using the sub function on a data.table column is the most efficient way to extract text from it.
For example, we have the following data set:
test <- data.table(a = c("Hello world, this is Tom and I am a guy", "Hello world, this is Jack and I am a guy"))
and I would like to extract the names. One way to extract the names is to use the sub function:
test[, Name := sub(".*? this is (.*?) and.*", "\\1", a)]
but I was wondering, is this the most efficient way?
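In case it helps frame the question, here is a minimal sketch of how the comparison could be timed. The enlarged table `big`, the column `Name2`, and the `regmatches`/`regexpr` alternative are assumptions I made up for illustration; I do not know whether that alternative is actually faster.

```r
library(data.table)

# Hypothetical larger sample: the two example rows repeated many times
big <- data.table(a = rep(c("Hello world, this is Tom and I am a guy",
                            "Hello world, this is Jack and I am a guy"),
                          times = 50000))

# Approach from the question: substitution with a capture group
t_sub <- system.time(
  big[, Name := sub(".*? this is (.*?) and.*", "\\1", a)]
)

# One possible alternative: pull out the match directly using PCRE lookarounds.
# This assumes every row matches, otherwise regmatches() returns fewer elements.
t_match <- system.time(
  big[, Name2 := regmatches(a, regexpr("(?<=this is ).*?(?= and)", a, perl = TRUE))]
)

# Compare elapsed times and check that both approaches give the same names
print(rbind(sub = t_sub, regmatches = t_match))
print(identical(big$Name, big$Name2))
```

The timings will vary by machine and by R version, so a single run like this is only a rough indication.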