Suppose you have a string formed by concatenating n string tokens, each of length d. What regex pattern would match this string if and only if n == N+2 (where N is fixed) and a particular token appears between k and N times in it?
Problem
Suppose we have (tokens of length d=4):
s = "SbbESbbESbbEStbEStbESbbEStbESttE"
We would like to match substrings where StbE appears exactly 2, 3, or 4 times (k=2) between a delimiting SbbE token and a delimiting token that's not StbE, with exactly 4 tokens between the two delimiting tokens (N=4).
Desired solution
Adding spaces inside s for readability:
s ~ "SbbE SbbE SbbE StbE StbE SbbE StbE SttE"
Here we have a total of 8 tokens, but we are looking for substrings of 6 tokens: two delimiting tokens enclosing exactly 4 tokens, which in turn contain between 2 and 4 appearances of StbE. So the first match should be from token 1 to token 6, and the second match should be from token 3 to token 8. Note that the two matches overlap.
Is this even possible using regular expressions?
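For reference, here is a minimal Python sketch of one way the condition above could be written as a regex. It is only an illustration under some assumptions: the helper names window and find_matches are hypothetical, the pattern is generated recursively rather than written by hand, overlapping matches are collected by wrapping the pattern in a capturing lookahead, and token alignment is checked in Python rather than inside the regex. The upper bound of N occurrences needs no separate check, because the window holds exactly N tokens.

```python
import re

def window(token, d, n, k):
    """Regex source for exactly n tokens of width d, of which at least k
    equal `token`. Assumes 0 <= k <= n."""
    tok = re.escape(token)
    if k <= 0:
        return f"(?:.{{{d}}}){{{n}}}"   # any n tokens will do
    if k == n:
        return f"(?:{tok}){{{n}}}"      # every remaining token must be `token`
    other = f"(?!{tok}).{{{d}}}"        # one token that is not `token`
    # Either the next token is `token` (one fewer still required),
    # or it is something else (same number still required).
    return (f"(?:{tok}{window(token, d, n - 1, k - 1)}"
            f"|{other}{window(token, d, n - 1, k)})")

def find_matches(s, start, token, d, n, k):
    """Return (first_token, last_token) pairs, 1-indexed, for every match."""
    closing = f"(?!{re.escape(token)}).{{{d}}}"  # closing delimiter: any token but `token`
    body = re.escape(start) + window(token, d, n, k) + closing
    # Capturing lookahead: the match itself is empty, so overlapping hits are kept.
    pattern = re.compile(f"(?=({body}))", re.DOTALL)
    spans = []
    for m in pattern.finditer(s):
        if m.start() % d == 0:          # keep only token-aligned starts
            first = m.start() // d + 1
            last = (m.start() + len(m.group(1))) // d
            spans.append((first, last))
    return spans

s = "SbbESbbESbbEStbEStbESbbEStbESttE"
matches = find_matches(s, "SbbE", "StbE", d=4, n=4, k=2)
print(matches)  # [(1, 6), (3, 8)]
```

The generated pattern grows quickly with N, so this is probably only practical for small windows like the one in the example.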