I am trying to remove quoted sequences from a string. For the example below, my script works fine:
import re
doc = ' Doc = "This is a quoted string: this is cool!" '
cleanr = re.compile('\".*?\"')
doc = re.sub(cleanr, '', doc)
print doc
Result (as expected):
' Doc = '
However, when the quoted sentence contains an escaped quoted sequence, I am not able to remove only that escaped sequence using the pattern that I thought would be the right one:
import re
doc = ' Doc = "This is a quoted string: \"this is cool!\" " '
cleanr = re.compile('\\".*?\\"') # new pattern
doc = re.sub(cleanr, '', doc)
print doc
Result:
'Doc = this is cool!'
Expected:
'Doc = "This is a quoted string: " '
Does anyone know what is happening? If the pattern '\\".*?\\"' is wrong, what would be the right one?
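One thing that may be worth checking, offered as a minimal sketch rather than a confirmed diagnosis: in a normal (non-raw) Python string literal, \" is stored as a plain double quote, so the string may not contain any backslashes at all. The sketch below assumes the intended input really does contain literal backslashes, which a raw string literal would preserve. The names plain, raw, and escaped_quoted are illustrative only and are not part of the original script.

```python
import re

# In a normal literal, \" collapses to a plain double quote: no backslash is stored.
plain = ' Doc = "This is a quoted string: \"this is cool!\" " '
print('\\' in plain)  # checks whether any backslash survived in the string

# Assumption: the intended input contains literal backslashes.
# A raw literal keeps them exactly as typed.
raw = r' Doc = "This is a quoted string: \"this is cool!\" " '

# As a raw pattern, \\" means: one literal backslash followed by a double quote.
escaped_quoted = re.compile(r'\\".*?\\"')

# Remove only the backslash-escaped quoted sequence.
cleaned = escaped_quoted.sub('', raw)
print(repr(cleaned))
```

If the first check shows no backslash in the string, then the regex \" (which is what '\\".*?\\"' compiles to) matches an ordinary quote, which would be consistent with the result shown above.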