According to the OP:
"Later on I wish to turn longer sub-sequences into sequences of 1's, such as 10001 to 11111."
If I understand correctly, the final goal is to replace any subsequence of consecutive 0s with the same number of 1s, provided it is surrounded by a 1 on both sides.
In R, this can be achieved with the str_replace_all() function from the stringr package. For demonstration and testing, the input vector contains some edge cases where substrings of 0s are not surrounded by 1s.
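library(stringr)

# test input, including edge cases: a non-binary character,
# trailing 0s, and leading 0s
input <- c("110101101",
           "11010110001",
           "110-01101",
           "11010110000",
           "00010110001")

# replace each run of 0s enclosed by 1s with a run of 1s of equal length
str_replace_all(input, "(?<=1)0+(?=1)", function(x) str_dup("1", str_length(x)))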
[1] "111111111" "11111111111" "110-01111" "11111110000" "00011111111"
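The regex "(?<=1)0+(?=1)" uses a lookbehind (?<=1) as well as a lookahead (?=1) to ensure that the subsequence 0+ to be replaced is surrounded by 1s. Thus, leading and trailing subsequences of 0s are not replaced. Because lookarounds do not consume characters, a single 1 can serve as the right boundary of one run of 0s and the left boundary of the next, as in 10101.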
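To illustrate why the lookarounds matter, here is a small sketch comparing the pattern above with a naive pattern that consumes the surrounding 1s. The pattern "10+1" and the test string "10101" are my own choices for illustration and are not part of the question.

```r
library(stringr)

x <- "10101"

# Naive pattern: the surrounding 1s are part of the match, so the middle 1
# is consumed by the first match and cannot start a second one.
naive <- str_extract_all(x, "10+1")[[1]]

# Lookaround pattern: only the 0s are matched; the 1s are merely asserted,
# so both runs of 0s are found.
lookaround <- str_extract_all(x, "(?<=1)0+(?=1)")[[1]]

print(naive)
print(lookaround)
```

With the naive pattern only one match is found, so the second 0 would be left untouched by a replacement, whereas the lookaround pattern finds both runs.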
The replacement is computed by a function which returns a subsequence of 1s of the same length as the subsequence of 0s to be replaced.
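The replacement function can also be looked at in isolation. The following is a minimal sketch; the name ones_like is hypothetical and was introduced here only for illustration. It takes the matched substrings and returns strings of 1s of the same lengths.

```r
library(stringr)

# Hypothetical helper (name chosen for illustration): for each matched
# string, build a string of 1s with the same number of characters.
ones_like <- function(x) str_dup("1", str_length(x))

# Applied directly to some example matches
res_direct <- ones_like(c("0", "000"))

# Passed as the replacement function, as in the call above
res_replaced <- str_replace_all("11010110001", "(?<=1)0+(?=1)", ones_like)

print(res_direct)
print(res_replaced)
```

Naming the function this way is only a matter of readability; it should behave the same as the anonymous function used in the answer.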