The question "How to timeout function in python, timeout less than a second" doesn't address my problem.

A comment on that question describes the issue I'm having. According to the signal documentation, the signal-based approach won't work: "Although Python signal handlers are called asynchronously as far as the Python user is concerned, they can only occur between the “atomic” instructions of the Python interpreter. This means that signals arriving during long calculations implemented purely in C (such as regular expression matches on large bodies of text) may be delayed for an arbitrary amount of time."


I'm attempting to use regex to parse content from the web using the gevent Python library. It works well, except when I encounter really large content. Is there a way to terminate a thread that doesn't complete in x seconds?

Here's what I've come up with, but it doesn't work:

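import re
import signal

def get_all_matches(self, content, the_regex, timeout=5):
    def kill_regex(*args, **kwargs):
        raise TimeoutError

    # SIGALRM is Unix-only and can only be installed from the main thread.
    signal.signal(signal.SIGALRM, kill_regex)
    signal.alarm(int(timeout))  # whole seconds only; int(0.5) == 0 disables the alarm
    try:
        # The handler only runs between bytecode instructions, so it cannot
        # interrupt re.findall while the C regex engine is still matching.
        return re.findall(the_regex, content, re.IGNORECASE)
    except Exception:
        return []
    finally:
        signal.alarm(0)  # cancel any pending alarm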
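
For reference, the only direction I can see that sidesteps the limitation quoted above is to run the match in a separate process and kill that process when it overruns. This is a rough sketch, not a tested solution for my gevent setup: `findall_with_timeout` and `_findall_worker` are hypothetical names, the `"fork"` start method assumes a Unix system (as `SIGALRM` already does), and I have not checked how forking interacts with gevent.

```python
import multiprocessing as mp
import re

# Assumption: Unix only. "fork" avoids re-importing the main module in the child.
_ctx = mp.get_context("fork")


def _findall_worker(conn, pattern, content):
    # Runs in the child process and sends the matches (or the error) back.
    try:
        conn.send(re.findall(pattern, content, re.IGNORECASE))
    except Exception as exc:
        conn.send(exc)
    finally:
        conn.close()


def findall_with_timeout(pattern, content, timeout=5.0):
    """Return the re.findall result, or [] on error or after `timeout` seconds."""
    parent_conn, child_conn = _ctx.Pipe(duplex=False)
    proc = _ctx.Process(target=_findall_worker, args=(child_conn, pattern, content))
    proc.start()
    child_conn.close()  # the parent only reads
    try:
        # poll() accepts a float, so sub-second timeouts are possible.
        if parent_conn.poll(timeout):
            result = parent_conn.recv()
            return [] if isinstance(result, Exception) else result
        return []
    except EOFError:
        # The child exited without sending anything.
        return []
    finally:
        if proc.is_alive():
            proc.kill()  # SIGKILL stops the child even while it is inside C code
        proc.join()
        parent_conn.close()
```

With this, `findall_with_timeout(r"a+", "aa b AAA")` returns `['aa', 'AAA']`, while a pathological pattern such as `(a+)+$` against a long run of `a` followed by `b` should give up after the timeout. The cost is one process start plus pickling the content and the matches on every call.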
Gary
  • Unfortunately this question was closed; the proposed solution is not valid. – Gary Aug 23 '22 at 20:15
  • The question has been reopened now. A few tips from my side to make it more attractive for users to answer: **1st**, add samples of input; **2nd**, add samples of output, which will make it clearer. Thank you (already upvoted the question). – RavinderSingh13 Sep 04 '22 at 07:27
