I am interested in removing all occurrences of a pattern in a Python string where the pattern looks like "start-string blah, blah, blah end-string". This is a general problem I'd like to be able to handle. This is the same problem as "How can I remove a portion of text from a string whenever it starts with &*( and ends with )(*", but in Python and not Java.
How would I solve the same problem in Python?
Assume the string looks like this:
'Bla bla bla <mark asd asd asd /> bla bla bla. Yadda yadda yadda <mark akls lkja /> yadda.'
The start of the block to remove is <mark and the end is />. So I do the following:
import re
mystring = "Bla bla bla <mark asd asd asd /> bla bla bla. Yadda yadda yadda <mark akls lkja /> yadda."
tags = "<mark", "/>"
re.sub('%s.*%s' % tags, '', mystring)
My desired output is
'Bla bla bla bla bla bla. Yadda yadda yadda yadda.'
But what I get is
'Bla bla bla yadda.'
So clearly the pattern is matching from the first occurrence of the opening string to the last occurrence of the end string.
How do I make it match the pattern twice and give me the desired output? This has to be easy, but despite searching for "remove multiple occurrences regex Python" and the like, I have not found an answer. Thanks.
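For reference, here is a minimal sketch of one way this behaviour could be changed. It assumes the cause is that `.*` is greedy, so it swaps in the non-greedy `.*?`, which stops at the nearest end string. The helper name `remove_blocks` is hypothetical, `re.escape` is added on the assumption that the delimiters may contain regex metacharacters (as `&*(` and `)(*` do), and `re.DOTALL` is an assumption that blocks may span several lines. The last step collapses the doubled spaces left behind by each removal, which is needed to get exactly the desired output shown above.

```python
import re


def remove_blocks(text, start, end):
    """Remove every span from `start` through the nearest following `end`.

    Hypothetical helper for illustration, not part of the original question.
    """
    # re.escape makes the delimiters literal, so "&*(" or ")(*" are safe.
    # ".*?" is the non-greedy form of ".*": it matches as little as possible.
    # re.DOTALL (an assumption) lets a block span multiple lines.
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    stripped = pattern.sub("", text)
    # Each removal leaves the space before and the space after the block,
    # so collapse runs of two or more spaces into one.
    return re.sub(r" {2,}", " ", stripped)


mystring = ("Bla bla bla <mark asd asd asd /> bla bla bla. "
            "Yadda yadda yadda <mark akls lkja /> yadda.")
print(remove_blocks(mystring, "<mark", "/>"))
# Bla bla bla bla bla bla. Yadda yadda yadda yadda.
```

If the doubled spaces are acceptable, the final `re.sub` can be dropped and the non-greedy substitution used on its own.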