I would like to select the parts of a string covered by a set of substrings with the following properties:
- They all occur in the original string.
- They may have different lengths and positions.
- They can overlap.
- They are not necessarily given in the order in which they appear in the original string.
For example:
string = "MGLSDGEWQQVLNVWGKVEADIAGHGQEVLIHSKHPGDFGADAQGAMTKALELFRNDIAAKYKELGFQG"
substring1 = "HPGDFGADAQGAMTKALELFR"
substring2 = "GEWQQVLNVWGK"
substringn = "ALELFRNDIAAKYK"
And I would like to get:
coverage = "MGLSD<b>GEWQQVLNVWGK</b>VEADIAGHGQEVLIHSK<b>HPGDFGADAQGAMTKALELFRNDIAAKYK</b>ELGFQG"
I tried to extract the positions of the substrings within the string like this:
substrings_array.each do |substring|
  start_pos = string.index(substring)
  end_pos = string.length - string.reverse.index(substring.reverse)
end
That way, I get a start and an end position for each substring. How could I merge them all, especially considering that they may overlap and appear in a different order? Is this even a good strategy?
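
To make the merging step concrete, here is a minimal sketch of one possible approach: collect a `[start, end)` pair per substring, sort the pairs, and fuse any that overlap or touch. The method names `merged_ranges` and `coverage` are hypothetical, and the sketch assumes only the first occurrence of each substring matters, so the end is computed as `start_pos + substring.length` rather than from the reversed string.

```ruby
# Hypothetical helper: returns the merged [start, end) ranges covered by
# the substrings. Assumes only the first occurrence of each one matters.
def merged_ranges(string, substrings)
  ranges = substrings.map do |substring|
    start_pos = string.index(substring)
    next if start_pos.nil? # substring not found: skip it
    [start_pos, start_pos + substring.length]
  end.compact

  # Sorting by start position makes the input order irrelevant.
  ranges.sort.each_with_object([]) do |(start_pos, end_pos), merged|
    if merged.any? && start_pos <= merged.last[1]
      # Overlaps (or touches) the previous range: extend it.
      merged.last[1] = [merged.last[1], end_pos].max
    else
      merged << [start_pos, end_pos]
    end
  end
end

# Hypothetical helper: wraps every covered region in <b>...</b>.
def coverage(string, substrings)
  result = string.dup
  # Insert tags from the end backwards so earlier indices stay valid.
  merged_ranges(string, substrings).reverse_each do |start_pos, end_pos|
    result.insert(end_pos, "</b>")
    result.insert(start_pos, "<b>")
  end
  result
end

string = "MGLSDGEWQQVLNVWGKVEADIAGHGQEVLIHSKHPGDFGADAQGAMTKALELFRNDIAAKYKELGFQG"
substrings_array = ["HPGDFGADAQGAMTKALELFR", "GEWQQVLNVWGK", "ALELFRNDIAAKYK"]

puts coverage(string, substrings_array)
# MGLSD<b>GEWQQVLNVWGK</b>VEADIAGHGQEVLIHSK<b>HPGDFGADAQGAMTKALELFRNDIAAKYK</b>ELGFQG
```

Because the comparison uses `<=`, two regions that are merely adjacent end up inside a single `<b>` element. Whether that is desirable depends on the use case; changing it to `<` would keep them separate.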