21

Specification of the problem:

I'm searching through a very large number of log-file lines and sorting them into groups according to regular expressions (regexes) that I have stored, using the re.match() function. Unfortunately, some of my regexes are too complicated, and Python sometimes ends up in backtracking hell. Because of this, I need to protect the call with some kind of timeout.
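To illustrate what "backtracking hell" means here, the following is a minimal sketch. The pattern `(a+)+$` and the helper name `time_failed_match` are not from my real data; they are only a well-known textbook example with nested quantifiers. On an input that almost matches, the time for a failed match typically grows very quickly with the input length:

```python
import re
import time

# Illustrative pattern only: nested quantifiers followed by an anchor.
PATTERN = re.compile(r"(a+)+$")

def time_failed_match(n):
    """Match against n 'a' characters followed by a 'b', so the match must fail."""
    text = "a" * n + "b"
    start = time.perf_counter()
    result = PATTERN.match(text)
    return result, time.perf_counter() - start

# Keep n small: each extra character makes the failed match noticeably slower.
for n in (14, 16, 18, 20):
    result, elapsed = time_failed_match(n)
    print("n=%d matched=%s elapsed=%.4fs" % (n, result is not None, elapsed))
```

With a few more characters the same call can run for minutes, which is why a per-call timeout is needed.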

Problems:

  • re.match, which I'm using, is a Python library function, and as I found out somewhere here on StackOverflow (I'm really sorry, I cannot find the link now :-( ), it is very difficult to interrupt a thread that is running Python library code. For this reason, threads are out of the game.
  • Because evaluating re.match takes a relatively short time and I want to analyse a large number of lines with it, I need a timeout mechanism that won't take too long to execute (this makes threads even less suitable, since initialising a new thread takes a really long time) and that can be set to less than one second.
    For those reasons, the answers here - Timeout on a function call and here - Timeout function if it takes too long to finish with decorator (alarm - 1sec and more) are off the table.

I've spent this morning searching for a solution to this question, but I did not find any satisfactory answer.

Karl Knechtel
Jendas

2 Answers

41

Solution:

I've just modified a script posted here: Timeout function if it takes too long to finish.

And here is the code:

from functools import wraps
import errno
import os
import signal

# Note: on Python 3 this class shadows the built-in TimeoutError.
class TimeoutError(Exception):
    pass

def timeout(seconds=10, error_message=os.strerror(errno.ETIME)):
    def decorator(func):
        def _handle_timeout(signum, frame):
            raise TimeoutError(error_message)

        def wrapper(*args, **kwargs):
            signal.signal(signal.SIGALRM, _handle_timeout)
            # setitimer instead of alarm, so that seconds can be a float
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                result = func(*args, **kwargs)
            finally:
                # cancel the timer that was set above
                signal.setitimer(signal.ITIMER_REAL, 0)
            return result
        return wraps(func)(wrapper)
    return decorator

And then you can use it like this:

import time
from timeout import timeout, TimeoutError

@timeout(0.01)
def loop():
    while True:
        pass

try:
    begin = time.time()
    loop()
except TimeoutError:
    print("Time elapsed: {:.3f}s".format(time.time() - begin))

Which prints

Time elapsed: 0.010s
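The same setitimer idea can also be sketched as a context manager applied to the line-grouping use case from the question. This is only a sketch under assumptions: the names `MatchTimeout`, `time_limit` and `classify` are hypothetical, it runs only on Unix-like systems in the main thread, and a signal that arrives while C code is running may be delivered late, so the limit is a lower bound rather than a guarantee.

```python
import re
import signal
from contextlib import contextmanager

class MatchTimeout(Exception):
    """Hypothetical exception name used only in this sketch."""

@contextmanager
def time_limit(seconds):
    """Raise MatchTimeout in the main thread after `seconds` (may be a float)."""
    def _handler(signum, frame):
        raise MatchTimeout("timed out after %.3fs" % seconds)

    previous = signal.signal(signal.SIGALRM, _handler)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)    # cancel a pending timer
        signal.signal(signal.SIGALRM, previous)    # restore the old handler

def classify(line, patterns, limit=0.05):
    """Return the name of the first matching pattern, "timeout", or None."""
    try:
        with time_limit(limit):
            for name, pattern in patterns:
                if pattern.match(line):
                    return name
    except MatchTimeout:
        return "timeout"
    return None

# Illustrative groups, not the real expressions from the question.
GROUPS = [("error", re.compile(r"ERROR")), ("warning", re.compile(r"WARN"))]
print(classify("ERROR disk full", GROUPS, limit=1.0))
```

Restoring the previous handler in the finally block keeps the sketch from permanently replacing a SIGALRM handler that other code may have installed.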
Jendas
  • 2
    This is basically a wholesale copy of the other answer; the only difference is that you show the seconds parameter can be a float. – Martijn Pieters Aug 10 '12 at 12:23
  • 5
    Yes, but using setitimer instead of alarm solved the problem - I can now set the time to a float - and I thought it would be clearer if I posted it with the whole syntax, and I referenced that answer. I didn't mean to steal someone’s credit. :-) – Jendas Aug 10 '12 at 12:30
  • Ah, that's the difference, and that's indeed more significant. – Martijn Pieters Aug 10 '12 at 12:30
  • @Jendas: I hope you're going to fix that bare except before you accept the answer. They are rarely what you want. – MRAB Aug 11 '12 at 01:17
  • 1
    According to the `signal` documentation, this won't work: "Although Python signal handlers are called asynchronously as far as the Python user is concerned, they can only occur between the “atomic” instructions of the Python interpreter. This means that signals arriving during long calculations implemented purely in C (such as regular expression matches on large bodies of text) may be delayed for an arbitrary amount of time." – Phillip Jan 19 '15 at 13:55
  • I'm noticing that this sometimes doesn't work and just hangs. It seems to happen at random, especially in some infinite loops where it doesn't terminate, and I can't quite pin it down. – Evan Pu Feb 23 '16 at 04:24
  • Well that is interesting, might it be the problem Phillip mentioned? Although I have never experienced anything like that... – Jendas Feb 23 '16 at 07:49
  • @Jendas Python now has TimeoutError as a built-in exception. Perhaps you could update :) Thanks for the nice solution – Muhammad Ali Jan 17 '19 at 12:51
  • 2
    Since you are using `signal`, this code is only applicable on UNIX... what about Windows? – Dilshat Dec 24 '19 at 22:28
  • What's the solution if you're using regex? How do you terminate a long calculation? – Gary Aug 23 '22 at 19:14
0

This is how we can run slow_function with a timeout using a thread, but it doesn't stop slow_function even after the exception is raised:

import threading
import time

# Note: this class shadows Python 3's built-in TimeoutError.
class TimeoutError(Exception):
    pass

def slow_function():
    time.sleep(1000)
    return "Done"

def run_with_timeout(func, timeout):
    result = None

    def target():
        nonlocal result
        result = func()

    # daemon=True so that a timed-out worker does not keep the
    # interpreter alive at exit; the worker itself is never stopped.
    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    thread.join(timeout)  # wait at most `timeout` seconds
    if thread.is_alive():
        raise TimeoutError("Function timed out")
    return result

try:
    result = run_with_timeout(slow_function, 3)
except TimeoutError:
    print("Function timed out")
else:
    print("Function returned:", result)
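If the work really has to be stopped, one option is to run it in a separate process, because a process can be terminated from outside. The following is only a sketch under assumptions: `match_with_timeout` and `_match_worker` are hypothetical names, it uses the "fork" start method and therefore Unix-like systems only, and it returns a bool because match objects cannot be sent between processes.

```python
import multiprocessing as mp
import re

# Assumption: a Unix-like system where the "fork" start method is available.
_CTX = mp.get_context("fork")

def _match_worker(pattern, text, conn):
    """Run re.match in the child and send back only whether it matched."""
    conn.send(re.match(pattern, text) is not None)
    conn.close()

def match_with_timeout(pattern, text, timeout):
    """Return True/False for a finished match, or None if it timed out."""
    parent_conn, child_conn = _CTX.Pipe(duplex=False)
    proc = _CTX.Process(target=_match_worker, args=(pattern, text, child_conn))
    proc.start()
    child_conn.close()  # the parent only reads from the pipe
    try:
        if parent_conn.poll(timeout):
            return parent_conn.recv()
        return None  # no answer within `timeout` seconds
    finally:
        if proc.is_alive():
            proc.terminate()  # this actually stops the running match
        proc.join()
        parent_conn.close()

if __name__ == "__main__":
    print(match_with_timeout(r"ERROR", "ERROR disk full", 5.0))
    print(match_with_timeout(r"(a+)+$", "a" * 40 + "b", 0.2))
```

Starting one process per call is expensive, so for a large number of lines this would likely need a long-lived worker process that is restarted only after a timeout.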
Mohammadreza