I'll explain what I want using an example. I'm working with DNA sequences. Let's say I want to remove everything between GUA and CAG (including GUA and CAG) in a string. So if the input is "AAAAGUAGGGGCAGCAGUUUUUGUAAAAACAG", the output should be ["AAAA","CAGUUUUU"]. I initially used re.split(r'GUA\w*CAG',a), but that returns ["AAAA"]. It seems to look for the last occurrence of CAG in the string instead of the first occurrence.
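
To make the behaviour concrete, here is a minimal sketch that reproduces it and compares it with a non-greedy quantifier (`\w*?`), which is one possible way to stop at the first CAG after each GUA. The variable names `greedy`, `lazy`, `greedy_parts` and `lazy_parts` are only illustrative, and the filtering of empty strings is an assumption about the desired output, since `re.split` leaves an empty string when a match ends the input.

```python
import re

a = "AAAAGUAGGGGCAGCAGUUUUUGUAAAAACAG"

# Greedy: \w* consumes as much as it can, so the single match
# runs from the first GUA to the last CAG in the string.
greedy = re.split(r'GUA\w*CAG', a)

# Non-greedy: \w*? consumes as little as it can, so each match
# stops at the first CAG that follows a GUA.
lazy = re.split(r'GUA\w*?CAG', a)

# Drop the empty strings that re.split leaves when a match
# touches the start or end of the input (assumed to be unwanted).
greedy_parts = [s for s in greedy if s]
lazy_parts = [s for s in lazy if s]

print(greedy_parts)
print(lazy_parts)
```

With the example input above, `lazy_parts` is the desired list of two pieces, while `greedy_parts` contains only the leading "AAAA".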