
I'm trying to read a binary file.

My objective is to find all matches of "10, 10, [any hex value, exactly once], [either EE or DD]".

I thought I could do it like this:

pattern = (b"\x10\x10\[0-9a-fA-F]?\[xDD|xEE]")

Clearly this is not working. It seems to fail at the third part. I tried dissecting the statement, and x10 and x11 work, but the rest just won't.

My understanding of "[0-9a-fA-F]?" is that it matches the range in the brackets 0 or 1 times, and that the last part means "xDD or xEE". Am I wrong?

Any ideas?

czsffncv

1 Answer

Use the regex

b'\x10\x10.[\xdd\xee]'

A single . matches any one byte, a single time (except the newline byte 0x0A, unless the re.DOTALL flag is passed), and a single [ab] matches either a or b, a single time.


>>> import re
>>> re.match(b'\x10\x10.[\xdd\xee]', b'\x10\x10\x00\xee')
<_sre.SRE_Match object; span=(0, 4), match=b'\x10\x10\x00\xee'>
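Since the question asks for all matches rather than one match at the start, the same pattern can be used with `findall` or `finditer`. The following is only a minimal sketch: the `data` bytes are a made-up sample standing in for the file contents, and `re.DOTALL` is added on the assumption that the wildcard byte may legitimately be 0x0A.

```python
import re

# Hypothetical sample standing in for the bytes read from the binary file.
data = b"\x00\x10\x10\x0a\xee\xff\x10\x10\x41\xdd\x10\x10\x42\xcc"

# re.DOTALL lets "." match every byte value, including 0x0A (newline).
pattern = re.compile(b"\x10\x10.[\xdd\xee]", re.DOTALL)

# All non-overlapping matches as bytes objects.
matches = pattern.findall(data)
print(matches)

# Start offsets of each match, useful when locating records in a file.
offsets = [m.start() for m in pattern.finditer(data)]
print(offsets)
```

Without `re.DOTALL`, the first candidate in this sample would be skipped, because its third byte is 0x0A.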
Uriel
  • Why wouldn't the "." be written like this: "\x10\x10\."? Why is there no delimiter between the x10 and the dot? – czsffncv Jan 03 '17 at 14:07
  • Because the `.` is evaluated as a wildcard. `\.` matches strictly a literal dot character, because the backslash escapes it (and you want the wildcard). – Uriel Jan 03 '17 at 14:08
  • Many thanks! So in my last statement everything was correct except for the |, which should have been a \, hence separating the two hex values inside the brackets? – czsffncv Jan 03 '17 at 14:16
  • `|` inside the brackets of your regex will make the last character match `\xee`, `\xdd` **or** `|` (the pipe symbol). (A class `[abc]` already means `a|b|c`.) – Uriel Jan 03 '17 at 14:23
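The point about `|` inside brackets can be checked directly. This is a small illustrative sketch; the names `with_pipe`, `without_pipe`, and `alternation` are made up for the comparison.

```python
import re

# Inside [...] the pipe is an ordinary byte, so this class has three members.
with_pipe = re.compile(b"[\xdd|\xee]")
# The intended class: only 0xDD or 0xEE.
without_pipe = re.compile(b"[\xdd\xee]")
# Outside brackets, "|" is alternation; this is equivalent to without_pipe.
alternation = re.compile(b"(?:\xdd|\xee)")

for candidate in (b"\xdd", b"\xee", b"|"):
    print(
        candidate,
        bool(with_pipe.fullmatch(candidate)),
        bool(without_pipe.fullmatch(candidate)),
        bool(alternation.fullmatch(candidate)),
    )
```

Only `with_pipe` accepts the literal pipe byte; the other two accept just 0xDD and 0xEE.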