
I have some strings that I want to parse to extract the operator, but there may also be operators within quoted substrings that I want to ignore. In each string, there would only be one operator (==, !=, <, <=, >, >=) that is not within quotes.

Some examples:

s1 = "'x > 1' == 2"
s2 = "'<Age>'<=32"
s3 = 'name == ""type<3>""'

I tried using re.sub('[\'"]+(.*?)[\'"]+', r'', s1) to replace any quoted material with an empty string. That lets me find the operator every time, but its position in the original string is lost. Is there a way to substitute whitespace of the same length as the text being replaced? I'd like the strings to look like the ones below so I can use re.search and then split at the operator.

s1 = "        == 2"
s2 = "       <=32"
s3 = 'name ==            '

Is this possible, or is there another approach I can take to this?

Wiktor Stribiżew
dshanahan

1 Answer


The replacement argument to re.sub can be a function, which receives the match object as its argument. It can then return a string with the appropriate number of spaces, so the result keeps the length of the original string.

re.sub(r'[\'"]+(.*?)[\'"]+', lambda m: " " * len(m.group(0)), s1)
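
To show how the masked string can then be used to split the original at the operator, here is a minimal sketch. The helper names mask_quoted and split_at_operator are made up for illustration, and the operator pattern is an assumption built from the six operators listed in the question. The quote pattern is the one from the question, so it inherits that pattern's limits (for example, it does not check that the closing quote matches the opening one).

```python
import re

# Quote pattern taken from the question: a run of quote characters,
# lazily up to the next run of quote characters.
QUOTED = re.compile(r'[\'"]+(.*?)[\'"]+')

# Assumed operator set from the question. Two-character operators come
# first so that "<=" is not matched as "<".
OPERATOR = re.compile(r'==|!=|<=|>=|<|>')


def mask_quoted(s):
    """Replace each quoted span with spaces of the same length."""
    return QUOTED.sub(lambda m: " " * len(m.group(0)), s)


def split_at_operator(s):
    """Return (left, operator, right), or None if no operator is found."""
    m = OPERATOR.search(mask_quoted(s))
    if m is None:
        return None
    # Indices found in the masked string are valid in the original,
    # because masking preserves the length.
    return s[:m.start()].strip(), m.group(0), s[m.end():].strip()


print(split_at_operator("'x > 1' == 2"))
print(split_at_operator("'<Age>'<=32"))
print(split_at_operator('name == ""type<3>""'))
```

Because the masked string has the same length as the original, the match positions can be applied directly to the original string, which is what makes the split possible. If mixed quote characters are a concern, a backreference pattern such as r'''(['"]+).*?\1''' may be a stricter alternative, though that is not tested against inputs beyond the three examples here.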
Barmar