Say there is a string made up of t and f. How might one use grep() to find a pattern that starts in f, stays in f for some time, and then goes to t? I also want to count the number of positions it then stays in t.
library(data.table)

a <- "fffftttfff"
b <- "fttttttfff"
c <- "tttttttttt"
d <- "fffffffftf"
path_ <- c(a,b,c,d)
ID <- 1:4
tf_dt <- data.table("ID" = ID,"path" = path_)
tf_dt
#    ID       path
# 1:  1 fffftttfff
# 2:  2 fttttttfff
# 3:  3 tttttttttt
# 4:  4 fffffffftf
dt_raw <- tf_dt[,-1]
s <- paste0(as.vector(t(dt_raw)), collapse = "")
v <- substring(s,seq(1,nchar(s)-9,10), seq(10,nchar(s),10))
idx <- grep("^f+t", v)  # starts with one or more f, then switches to t
dt_final <- data.frame("ID" = tf_dt$ID, count = FALSE, time = NA)
dt_final$count[idx] <- TRUE
dt_final$time[idx] <- ???
What I reckon I should do is remove the leading run of f and everything that comes after the first run of t. However, I am not sure how to do that. Any help is appreciated.
My attempt:
nchar(gsub("^f*f","",gsub("something that relates to the end of the string","",v)))
More attempts:
# If I do gsub("^f*f+t*","",v) it gives me the trailing part that I want to remove
# But I can't do something like
nchar(gsub("^f*f","",gsub("gsub("^f*f+t*","",v)$",""v)))
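One way the two-step idea above might be made to work is to nest two sub() calls, rather than passing the result of one call as the pattern of another. This is only a sketch in base R, assuming every path contains nothing but t and f; the names no_lead, first_t, starts_ft and time_in_t are made up for illustration.

```r
v <- c("fffftttfff", "fttttttfff", "tttttttttt", "fffffffftf")

# Step 1: drop the leading run of f
no_lead <- sub("^f+", "", v)

# Step 2: drop everything from the first character that is not t onwards,
# which leaves only the first run of t
first_t <- sub("[^t].*$", "", no_lead)

# Only paths that start in f and then switch to t should get a count;
# the others are set to NA
starts_ft <- grepl("^f+t", v)
time_in_t <- ifelse(starts_ft, nchar(first_t), NA_integer_)
time_in_t
```

The grepl() mask is there because a path with no leading f, such as the third one, passes through both sub() calls unchanged and would otherwise be counted.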
Expected Output:
tf_count <- c(TRUE,TRUE,FALSE,TRUE)
tf_time <- c(3,6,NA,1)
output <- data.table("ID" = ID, "count" = tf_count,"time_taken" = tf_time)
#    ID count time_taken
# 1:  1  TRUE          3
# 2:  2  TRUE          6
# 3:  3 FALSE         NA
# 4:  4  TRUE          1
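For comparison, a table of this shape could probably also be built in one pass with a capture group. The sketch below uses a plain data.frame rather than data.table so that it runs without extra packages, which is an assumption of this example and not a requirement of the problem.

```r
ID <- 1:4
path <- c("fffftttfff", "fttttttfff", "tttttttttt", "fffffffftf")

# TRUE where the path starts with at least one f followed by at least one t
count <- grepl("^f+t", path)

# For matching paths, replace the whole string by the captured first run of t
# and take its length; non-matching paths keep NA
time_taken <- rep(NA_integer_, length(path))
time_taken[count] <- nchar(sub("^f+(t+).*$", "\\1", path[count]))

output <- data.frame(ID = ID, count = count, time_taken = time_taken)
output
```

Filling only the matching rows avoids having to mask the result afterwards.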
Also, as a side note, is there somewhere I can look at a lot of examples of how grep() and the stringr package work? (From what I have seen, I think this kind of thing falls under stringr?) I tried reading about this, but nothing really came of it, and I am still just as confused as before. Thanks.