We can either split the string with strsplit, get the last 5 elements with tail, and paste them together
paste(tail(strsplit(str1, "\\s+")[[1]],5), collapse=" ")
#[1] "TGAGGTAGTAGATTGTATAAA 0 I-AA 0 gtt"
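The one-liner above can be unpacked into its three steps to make the intermediate values visible. This is only an illustrative sketch; the variable names parts, last5 and result are not from the original answer.

```r
str1 <- "hsa-let-7f-5p TGAGGTAGTAGATTGTATAAA 0 I-AA 0 gtt"

# strsplit returns a list with one character vector per input string,
# so [[1]] pulls out the fields of the single string
parts <- strsplit(str1, "\\s+")[[1]]

# keep the last five fields (tail returns everything if there are fewer)
last5 <- tail(parts, 5)

# glue the fields back together with single spaces
result <- paste(last5, collapse = " ")
result
```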
If we have multiple elements, we loop through the list (the output of strsplit) and do the same as above.
sapply(strsplit(rep(str1,2), "\\s+"), function(x) paste(tail(x, 5), collapse=" "))
#[1] "TGAGGTAGTAGATTGTATAAA 0 I-AA 0 gtt" "TGAGGTAGTAGATTGTATAAA 0 I-AA 0 gtt"
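If this is needed in several places, the loop could be wrapped in a small helper. The sketch below is one possible way to do it: the function name last_n_fields and its argument n are hypothetical, and vapply is swapped in for sapply only so that the result is guaranteed to be a character vector.

```r
str1 <- "hsa-let-7f-5p TGAGGTAGTAGATTGTATAAA 0 I-AA 0 gtt"

# Hypothetical helper: keep the last n whitespace-separated fields
# of every element of a character vector x
last_n_fields <- function(x, n = 5) {
  vapply(
    strsplit(x, "\\s+"),
    function(parts) paste(tail(parts, n), collapse = " "),
    character(1)  # each element must yield exactly one string
  )
}

res <- last_n_fields(rep(str1, 2))
res
```

Because tail returns the whole vector when it has fewer than n elements, strings with fewer fields come back unchanged (apart from runs of whitespace being collapsed to single spaces).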
Or use str_extract from stringr to match the last five whitespace-separated fields directly
library(stringr)
str_extract(str1, "(\\S+\\s+){4}\\S+$")
#[1] "TGAGGTAGTAGATTGTATAAA 0 I-AA 0 gtt"
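If stringr is not available, the same pattern should also work with regexpr and regmatches from base R. This is an alternative sketch, not part of the original answer; the name last5 is illustrative.

```r
str1 <- "hsa-let-7f-5p TGAGGTAGTAGATTGTATAAA 0 I-AA 0 gtt"

# regexpr locates the first match of the pattern,
# regmatches extracts the matched text
pattern <- "(\\S+\\s+){4}\\S+$"
last5 <- regmatches(str1, regexpr(pattern, str1))
last5
```

Note that regmatches drops non-matching elements, so a string with fewer than five fields gives a zero-length result here rather than NA.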
Part of the same pattern can be used in sub from base R
sub(".*\\s+((\\S+\\s+){4})(\\S+)$", "\\1\\3", str1)
#[1] "TGAGGTAGTAGATTGTATAAA 0 I-AA 0 gtt"
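The approaches differ when a string has fewer than five fields, which may matter for messy input. The sketch below uses a made-up three-field string (short) to show that sub returns its input unchanged when the pattern does not match, and that the strsplit/tail version also returns all the fields there are. With str_extract, a non-matching string gives NA instead.

```r
# Made-up input with only three fields
short <- "0 I-AA gtt"

# sub leaves the string untouched when the pattern does not match
pattern <- ".*\\s+((\\S+\\s+){4})(\\S+)$"
via_sub <- sub(pattern, "\\1\\3", short)

# tail returns all elements when fewer than 5 are present
via_split <- paste(tail(strsplit(short, "\\s+")[[1]], 5), collapse = " ")

via_sub
via_split
```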
data
str1 <- "hsa-let-7f-5p TGAGGTAGTAGATTGTATAAA 0 I-AA 0 gtt"