I use the regular-expression matching function re.match(pattern, str)
in Python 3.10 on Windows 10. When a regex pattern is badly written, catastrophic backtracking sometimes occurs. As a result, the program gets stuck in re.match
and cannot continue.
Since I have a lot of regular expressions, I can't fix them one by one.
I've tried to limit the function's execution time, but because I'm on Windows, none of the following work:
- signal (only works on Unix)
- func_timeout
- timeout-decorator
- eventlet
My test function is as follows. I tried the answer from "How to limit execution time of a function call?", but it doesn't work:
    import re
    import threading
    import _thread
    from contextlib import contextmanager

    class TimeoutException(Exception):
        def __init__(self, msg=''):
            self.msg = msg

    @contextmanager
    def time_limit(seconds, msg=''):
        # after `seconds`, ask the main thread to raise KeyboardInterrupt
        timer = threading.Timer(seconds, lambda: _thread.interrupt_main())
        timer.start()
        try:
            yield
        except KeyboardInterrupt:
            raise TimeoutException("Timed out for operation {}".format(msg))
        finally:
            # if the action ends in specified time, timer is canceled
            timer.cancel()

    def my_func():
        astr = "http://www.fapiao.com/dzfp-web/pdf/download?request=6e7JGm38jfjghVrv4ILd-kEn64HcUX4qL4a4qJ4-CHLmqVnenXC692m74H5oxkjgdsYazxcUmfcOH2fAfY1Vw__%5EDadIfJgiEf"
        # raw string so that \/ reaches the regex engine unchanged
        pattern = r"^([hH][tT]{2}[pP]://|[hH][tT]{2}[pP][sS]:)(([A-Za-z0-9-~]+).)+([A-Za-z0-9-~\/])+$"
        reg = re.compile(pattern)
        result = reg.match(astr)
        return result

    if __name__ == '__main__':
        try:
            # the limit only applies inside the with block;
            # 5 seconds is an arbitrary value for this test
            with time_limit(5, 'my_func'):
                my_func()
        except TimeoutException as e:
            print(e.msg)
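The hang can be reproduced on much shorter inputs, which makes it easier to experiment with. The sketch below reuses the pattern from my_func and times matches that are expected to fail; the helper names (`time_failed_match`, `measure`) and the synthetic test strings are illustrative assumptions, not part of the original test. No timings are listed here because they depend on the machine.

```python
import re
import time

# Same pattern as in my_func above, written as a raw string.
PATTERN = re.compile(
    r"^([hH][tT]{2}[pP]://|[hH][tT]{2}[pP][sS]:)(([A-Za-z0-9-~]+).)+([A-Za-z0-9-~\/])+$"
)


def time_failed_match(n):
    """Time one match against n letters followed by two underscores.

    The trailing "__" is a synthetic choice meant to make the match fail,
    so the engine has to exhaust its alternatives before giving up.
    """
    text = "http://" + "a" * n + "__"
    start = time.perf_counter()
    result = PATTERN.match(text)
    return result, time.perf_counter() - start


def measure(lengths=(12, 16, 20, 24)):
    """Return (n, match result, seconds) for each input length."""
    rows = []
    for n in lengths:
        result, seconds = time_failed_match(n)
        rows.append((n, result, seconds))
    return rows


if __name__ == "__main__":
    for n, result, seconds in measure():
        print(f"n={n:2d}  match={result}  {seconds:.4f}s")
```

The lengths are kept small on purpose. The URL in my_func contains a run of more than 80 such characters, so the same effect is far more severe there.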
So is there any way to:
- stop re.match when catastrophic backtracking occurs
- limit the number of steps or the time a regex match may take, or raise an exception when matching takes too long
- or limit the execution time of a function
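One direction that does not rely on Unix signals is to run the match in a separate Python process and kill that process when it exceeds a time limit. The following is a minimal sketch under that assumption, using only the standard library (`subprocess.run` with its `timeout` argument). The names `match_with_timeout`, `RegexTimeout` and `_CHILD_CODE` are hypothetical helpers, not part of `re`, and only the match span is returned because a match object cannot cross the process boundary.

```python
import json
import subprocess
import sys

# Program run in the child interpreter: it reads a JSON job from stdin,
# performs the match, and prints the span as JSON (null when no match).
_CHILD_CODE = r"""
import json, re, sys
job = json.load(sys.stdin)
m = re.match(job["pattern"], job["text"])
json.dump(m.span() if m else None, sys.stdout)
"""


class RegexTimeout(Exception):
    """Raised when the match does not finish within the time limit."""


def match_with_timeout(pattern, text, seconds):
    """Return (start, end) of the match, or None if there is no match.

    Raises RegexTimeout if the child process runs longer than `seconds`.
    An invalid pattern makes the child exit with an error, which surfaces
    as subprocess.CalledProcessError because of check=True.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD_CODE],
            input=json.dumps({"pattern": pattern, "text": text}),
            capture_output=True,
            text=True,
            timeout=seconds,  # the child is killed when this expires
            check=True,
        )
    except subprocess.TimeoutExpired:
        raise RegexTimeout(
            "regex did not finish within {} s".format(seconds)
        ) from None
    span = json.loads(proc.stdout)
    return tuple(span) if span is not None else None
```

With this sketch, a call such as `match_with_timeout(pattern, astr, 2)` using the values from my_func would be expected to raise `RegexTimeout` instead of hanging. The trade-off is the cost of starting an interpreter for every match, so with many patterns a long-lived worker process that is restarted after a timeout would probably be a better fit.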