I'm struggling to write a regex that matches the last triple underscore in a string, where that triple underscore is immediately preceded by a letter or digit. Ultimately, I want to extract the characters before and after this match. I also need to do this in base R.
x <- c("three___thenfour____1",
"only_three___k")
The closest I've gotten is adapting the approach from the question "Regex Last occurrence?":
sub("^(.+)___(?:.(?!___))+$", "\\1", x, perl = TRUE)
[1] "three___thenfour_" "only_three"
But what I really want to be able to get is
c("three___thenfour", "only_three")
and c("_1", "k")
(The only way I've managed to get those results so far is with strsplit, but it feels clunky and inefficient.)
do.call("rbind",
        lapply(strsplit(x, "___"),
               function(x) {
                 c(paste0(head(x, -1), collapse = "___"), tail(x, 1))
               }))
     [,1]               [,2]
[1,] "three___thenfour" "_1"
[2,] "only_three"       "k"
Any suggestions?
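One possible direction, offered as a sketch rather than a confirmed answer: let a greedy capture group run as far right as it can, and require it to end in a letter or digit right before the triple underscore. The names pattern, before, and after are only illustrative, and the sketch assumes every element of x contains at least one such triple underscore (sub returns the input unchanged when nothing matches).

```r
x <- c("three___thenfour____1",
       "only_three___k")

# The greedy ".*" extends as far right as possible, so the
# "[[:alnum:]]___" that finally matches is the last triple underscore
# that directly follows a letter or digit.
# Group 1 is everything before that "___", group 2 everything after it.
pattern <- "^(.*[[:alnum:]])___(.*)$"

before <- sub(pattern, "\\1", x, perl = TRUE)
after  <- sub(pattern, "\\2", x, perl = TRUE)

# Same two-column layout as the strsplit version
cbind(before, after)
```

Because the first group must end in an alphanumeric character, the extra underscore in "____" ends up at the start of the second piece instead of at the end of the first.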