I think the most helpful function I've seen for "number in a row" questions is rle, which computes the run-length encoding of a vector. For instance, you can see the lengths of the runs where the characters of your two strings match or differ with:
r1 = "ghuytut3jilujshdftgu"
r2 = "ghuytuthjilujshdftgu"
spl1 = unlist(strsplit(r1, ""))
spl2 = unlist(strsplit(r2, ""))
rle(spl1 == spl2)
# Run Length Encoding
# lengths: int [1:3] 7 1 12
# values : logi [1:3] TRUE FALSE TRUE
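To make that output easier to work with, here is a minimal sketch of turning the rle result into start and end positions for each run. The names same, runs, run_start, run_end and run_table are my own, chosen for illustration; the only things it relies on are the lengths and values components that rle returns.

```r
r1 <- "ghuytut3jilujshdftgu"
r2 <- "ghuytuthjilujshdftgu"

# Position-by-position comparison of the two strings
same <- strsplit(r1, "")[[1]] == strsplit(r2, "")[[1]]
runs <- rle(same)

# Each run ends at the cumulative sum of the run lengths so far
run_end <- cumsum(runs$lengths)            # 7 8 20
run_start <- run_end - runs$lengths + 1    # 1 8 9

# One row per run: where it starts, where it ends, and whether it is a match
run_table <- data.frame(start = run_start, end = run_end, match = runs$values)
```

Here the single FALSE run starts and ends at position 8, which is where the two strings differ.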
For your problem, you want to compute the run length of matches starting from some interior index i, both forward and backward. Here's an implementation of that using rle (the function assumes the strings are the same length and that i is a valid index; the forward and backward run lengths both include the character at index i):
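fxn = function(r1, r2, i) {
  # Split both strings into vectors of single characters
  spl1 = unlist(strsplit(r1, ""))
  spl2 = unlist(strsplit(r2, ""))
  # No run at all if the characters at index i differ
  if (spl1[i] != spl2[i]) {
    return(list(forward=0, backward=0))
  }
  # The first run of each comparison starts at index i, so its length is the answer
  rle.backward = rle(spl1[i:1] == spl2[i:1])
  rle.forward = rle(spl1[i:nchar(r1)] == spl2[i:nchar(r2)])
  return(list(forward=rle.forward$lengths[1], backward=rle.backward$lengths[1]))
}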
fxn(r1, r2, 5)
# $forward
# [1] 3
#
# $backward
# [1] 5
fxn(r1, r2, 9)
# $forward
# [1] 12
#
# $backward
# [1] 1
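If your strings are not guaranteed to be the same length, one possible variant is sketched below. It is not the function above but a hypothetical alternative (match_runs is a name I made up) that assumes you only want to compare the overlapping part of the two strings. It calls rle once on the whole comparison and then locates the run that contains i.

```r
# Hypothetical helper: compares only the first min(nchar(r1), nchar(r2)) characters
match_runs <- function(r1, r2, i) {
  n <- min(nchar(r1), nchar(r2))
  stopifnot(i >= 1, i <= n)                # i must fall inside the overlap
  idx <- seq_len(n)
  same <- strsplit(r1, "")[[1]][idx] == strsplit(r2, "")[[1]][idx]
  if (!same[i]) {
    return(list(forward = 0, backward = 0))
  }
  runs <- rle(same)
  run_end <- cumsum(runs$lengths)          # last position of each run
  k <- which(run_end >= i)[1]              # the run that contains index i
  run_start <- run_end[k] - runs$lengths[k] + 1
  # Both counts include the character at index i
  list(forward = run_end[k] - i + 1, backward = i - run_start + 1)
}

r1 <- "ghuytut3jilujshdftgu"
r2 <- "ghuytuthjilujshdftgu"
res5 <- match_runs(r1, r2, 5)
res9 <- match_runs(r1, r2, 9)
res_short <- match_runs(r1, "ghuytuth", 5)
```

For the two example strings this gives the same forward and backward counts as fxn at indices 5 and 9. Stopping with an error for an out-of-range i is a design choice on my part; you may prefer to return zeros instead.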