4

Let's say you have a program that search in a user-provided text for a user-provided RegEx.

Using the re module, how can I limit the number of steps the regex will make to prevent catastrophic backtracking.

Ideally, this SME should not hang:

regex = r"^(.*?,){11}P"

test_str = "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21P" # Example stolen from the previous link.

matches = list(re.finditer(regex, test_str, re.MULTILINE)) # This is an iterator so we consume it by casting to a list.
WayToDoor
  • 1,180
  • 9
  • 24

0 Answers0