
Using a bytes regular expression works fine:

In [48]: regexp_1 = re.compile(b"\xab.{3}")
In [49]: regexp_1.fullmatch(b"\xab\x66\x77\x88")
Out[49]: <re.Match object; span=(0, 4), match=b'\xabfw\x88'> # <----- good !

When I try to build the same pattern by formatting the bytes sequence with an f-string, as suggested in this post, the match fails:

In [50]: byte = b"\xab"
In [51]: regexp_2 = re.compile(f"{byte}.{3}".encode())
In [52]: regexp_2.fullmatch(b"\xab\x66\x77\x88")
In [53]: # nothing found ... why ?
OrenIshShalom

1 Answer


This happens because an f-string converts each interpolated object to a string, and the string form of a bytes object is its repr rather than its contents:

>>> str(byte)
"b'\\xab'"

So when you put it through an f-string as you did, the repr text ends up in the pattern, and it stays there when the result is encoded again:

>>> f"{byte}.{3}"
"b'\\xab'.3"
>>> f"{byte}.{3}".encode()
b"b'\\xab'.3"

On top of that, {3} is treated as a replacement field and evaluates to 3. To keep the literal quantifier you can double the braces ({{3}}), but that is not the main problem here.
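To make both effects concrete, here is a small sketch that reuses `byte` from the question. The names `broken` and `still_broken` are only illustrative. It shows that doubling the braces restores the `{3}` quantifier but leaves the repr text in the pattern, so the match still fails.

```python
import re

byte = b"\xab"

# The f-string inserts the repr of the bytes object, and {3} evaluates to 3.
broken = f"{byte}.{3}".encode()

# Doubling the braces keeps the literal {3}, but the repr text is still there.
still_broken = f"{byte}.{{3}}".encode()

print(broken)
print(still_broken)

# The pattern now expects the literal characters b' and ' around the byte,
# so the original four-byte input does not match.
print(re.compile(still_broken).fullmatch(b"\xab\x66\x77\x88"))
```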

I recommend concatenating the bytes objects instead:

regexp = re.compile(byte + b'.{3}')

regexp.fullmatch(b"\xab\x66\x77\x88")
# <re.Match object; span=(0, 4), match=b'\xabfw\x88'>
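If you prefer a formatting-style syntax, bytes objects also support `%` interpolation (Python 3.5 and later). The sketch below is one possible variant, not the only way to do it. The helper name `compile_prefixed` is hypothetical, and wrapping the prefix in `re.escape` is an added precaution for the case where the prefix byte happens to be a regex metacharacter such as `.` or `*`.

```python
import re


def compile_prefixed(prefix):
    """Compile a bytes pattern: the literal prefix followed by any 3 bytes.

    Hypothetical helper for illustration only.
    """
    # %b inserts the raw bytes, not their repr.
    # re.escape keeps metacharacters in the prefix literal.
    return re.compile(b"%b.{3}" % re.escape(prefix))


match = compile_prefixed(b"\xab").fullmatch(b"\xab\x66\x77\x88")
print(match)

# Without re.escape, a prefix of b"." would match any byte at all.
print(compile_prefixed(b".").fullmatch(b"Xabc"))
```

Note that `.` does not match a newline byte by default, so pass `re.DOTALL` to `re.compile` if the three trailing bytes may include `b"\n"`.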
KokoseiJ