
Say I have the following string:

mystr = "6374696f6e20????28??????2c??2c????29"

I want to replace every run of "?" characters with half its length (length / 2). So for the example above, I want to get the following result:

mystr = "6374696f6e2022832c12c229"

Meaning:

  • ???? replaced with 2
  • ?????? replaced with 3
  • ?? replaced with 1
  • ???? replaced with 2

I tried the following, but I'm not sure it's the right approach, and in any case it doesn't work:

regex = re.compile('(\?+)')
matches = regex.findall(mystr)
if matches:
        for match in matches:
                match_length = len(match)/2
                if (match_length > 0):
                        mystr= regex .sub(match_length , mystr)
api pota
  • Just pass `lambda m: str(len(m.group())//2)` to `re.sub` – Aran-Fey Jan 24 '18 at 18:00
  • @Rawing Wouldn't that replace `???` with `1`? Not sure that's ok. – Stefan Pochmann Jan 24 '18 at 18:03
  • @StefanPochmann Well, what's the alternative? Replacing it with `1.5`? – Aran-Fey Jan 24 '18 at 18:04
  • @Rawing Replacing `??` with `1` and leaving the third `?` alone. – Stefan Pochmann Jan 24 '18 at 18:05
  • @StefanPochmann Either way, the lambda is fine. If you want to leave odd numbers of question marks alone, you'll have to use a different regex, not a different replacement function. `(?:\?\?)+` should do it. – Aran-Fey Jan 24 '18 at 18:08
  • Thanks @Rawing, that solved the problem. Could you expand this into an answer and explain how the lambda works in this case, so I can accept it? You were the first to mention the solution. – api pota Jan 24 '18 at 18:10
  • @apipota Please mark the question as a duplicate. I'm not sure what there is to explain; the duplicate I linked contains a short explanation and a link to the official docs if for some reason you need more information about replacement functions. – Aran-Fey Jan 24 '18 at 18:12

1 Answer


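You can pass a callback function as the replacement argument to Python's re.sub. It is called with each match object and returns the replacement string. A lambda expression is shorthand for creating such an anonymous function.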


import re

mystr = "6374696f6e20????28??????2c??2c????29"
regex = re.compile(r"\?+")

print(re.sub(regex, lambda m: str(int(len(m.group())/2)), mystr))

There seems to be uncertainty about what should happen in the case of ???. The code above produces 1 because int() truncates 3/2 = 1.5. Without the int conversion the result would be 1.5. If you want ??? to become 1? instead, use the pattern (?:\?{2})+.
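To make the difference between the two patterns concrete, here is a minimal sketch that writes the replacement as a named function instead of a lambda. The names half_length, any_run, and pairs_only, and the test string "ab???cd", are made up for illustration and do not come from the question.

```python
import re


def half_length(match):
    """Replacement callback: re.sub calls this once per match object."""
    # Floor division keeps the result an int, so no int() conversion is needed.
    return str(len(match.group()) // 2)


mystr = "6374696f6e20????28??????2c??2c????29"

# Matches any run of question marks; an odd-length run is rounded down.
any_run = re.compile(r"\?+")
# Matches only whole pairs; a leftover single "?" is not part of the match.
pairs_only = re.compile(r"(?:\?\?)+")

result = any_run.sub(half_length, mystr)
odd_any = any_run.sub(half_length, "ab???cd")
odd_pairs = pairs_only.sub(half_length, "ab???cd")

print(result)     # 6374696f6e2022832c12c229
print(odd_any)    # ab1cd
print(odd_pairs)  # ab1?cd
```

With an input that only contains even-length runs, such as the one in the question, both patterns give the same result. They only differ when a run has an odd number of question marks.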

ctwheels