Given the following string

1234

1234

We can use `re.search(r'^\d{4}\Z', s)` to find only the last four digits, or `re.search(r'^\d{4}(?![\r\n])', s)`. How do I get a bytecode analysis (maybe with the `dis` module?) to compare which one is more efficient? I know I could use `timeit`, but I am really interested in the internals.

Jan
  • If you are interested in the internals of the regex (as opposed to Python itself), you might want to try the regex debugger feature of regex101.com, which shows you step by step how the regex is evaluated. – Lukor Aug 17 '20 at 07:44
  • @Lukor: I know, but I am really interested in how `\Z` is translated into bytecode. – Jan Aug 17 '20 at 07:51
  • In Python, debugging a regex pattern is done using the `re.DEBUG` flag: `re.search(r'^\d{4}\Z', s, re.DEBUG)`. However, `\Z` and `$(?!(?s:.))` are really very close in performance. `\Z` alone is the solution if you do not want to match before the final newline. – Wiktor Stribiżew Aug 17 '20 at 08:20
  • @WiktorStribiżew: It is, I closed it myself as a duplicate. – Jan Aug 17 '20 at 08:28
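
To make the `re.DEBUG` suggestion above concrete, here is a minimal sketch. Note that `dis` disassembles Python bytecode only; the `re` module compiles a pattern into its own internal program, which `re.DEBUG` prints to stdout at compile time. The helper name `debug_dump`, the sample string `TEXT`, and the use of `re.MULTILINE` are assumptions for illustration (without `re.MULTILINE`, `^` anchors only at the very start of the string, so neither pattern could reach the second line).

```python
import contextlib
import io
import re


def debug_dump(pattern, flags=0):
    """Return what re.DEBUG prints while compiling `pattern` (hypothetical helper)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # Patterns compiled with re.DEBUG are not stored in re's cache,
        # so the dump is produced again on every call.
        re.compile(pattern, flags | re.DEBUG)
    return buffer.getvalue()


TEXT = "1234\n1234"  # assumed sample input: two lines of four digits

anchor_dump = debug_dump(r"^\d{4}\Z", re.MULTILINE)
lookahead_dump = debug_dump(r"^\d{4}(?![\r\n])", re.MULTILINE)

print(anchor_dump)
print(lookahead_dump)

# Check which part of the assumed input each pattern actually matches.
for pattern in (r"^\d{4}\Z", r"^\d{4}(?![\r\n])"):
    print(pattern, re.search(pattern, TEXT, re.MULTILINE))
```

In the dump, `\Z` appears as a single `AT AT_END_STRING` node, while the lookahead version appears as an `ASSERT_NOT` node wrapping a character set. Depending on the CPython version, the dump may also list the compiled opcode sequence after the parse tree, which is the closest thing to "bytecode" for a regex.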

0 Answers