
There are a lot of similar questions and answers, but I still can't find a reliable answer.

So, I have a function that can possibly run too long. The function is private, in the sense that I cannot change its code.

I want to restrict its execution time to 60 seconds. I tried the following approaches:

  1. Python signals. They don't work on Windows or in a multithreaded environment (mod_wsgi); see the sketch after this list.
  2. Threads. A nice approach, but a thread cannot be stopped, so it keeps running even after a TimeoutException is raised.
  3. The multiprocessing module. I have problems with pickling and I don't know how to solve them. I want to make a time_limit decorator, and there are problems with importing the required function at the top level. The long-running function is an instance method, and wrapping it also doesn't help...
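
For reference, a sketch of the kind of signal-based time_limit decorator meant in approach 1 (names are illustrative); SIGALRM exists only on Unix and only works in the main thread, which is why it fails on Windows and under mod_wsgi:

import signal
from functools import wraps

class TimeoutException(Exception):
    pass

def time_limit(seconds):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            def handler(signum, frame):
                raise TimeoutException("timed out after %d seconds" % seconds)
            old_handler = signal.signal(signal.SIGALRM, handler)  # Unix only
            signal.alarm(seconds)  # deliver SIGALRM after `seconds`
            try:
                return func(*args, **kwargs)
            finally:
                signal.alarm(0)  # cancel any pending alarm
                signal.signal(signal.SIGALRM, old_handler)
        return wrapper
    return decorator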

So, are there good solutions to the above problem? How do I kill a thread that I started? How do I use subprocesses and avoid the problems with pickling? Is the subprocess module of any help?

Thank you.

  • This is sounding like several somewhat related questions. Do you need to know how to clean up threads? Can you give us some sample code to make this clearer? – BlackVegetable Feb 13 '15 at 19:40
  • I need to know how to stop a thread, or to make something that will stop the process. – Paul R Feb 13 '15 at 19:46

2 Answers

3

I think the multiprocessing approach is your only real option. You're correct that threads can't be killed (nicely) and signals have cross-platform issues. Here is one multiprocessing implementation:

import multiprocessing
import Queue  # Python 2 name; on Python 3 this is "queue"

def timed_function(return_queue):
    do_other_stuff()
    return_queue.put(True)
    return

def main():

    return_queue = multiprocessing.Manager().Queue()

    proc = multiprocessing.Process(target=timed_function, args=(return_queue,))
    proc.start()

    try:

        # wait for 60 seconds for the function to return a value
        return_queue.get(timeout=60)

    except Queue.Empty:
        # timeout expired
        proc.terminate() # kill the subprocess
        # other cleanup

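If you only need to enforce the time limit and don't care about a return value, a simpler variant (a sketch, not part of the snippet above) is to wait on the process itself with join:

import multiprocessing

def timed_function():
    do_other_stuff()  # placeholder for the real work, as above

def main():
    proc = multiprocessing.Process(target=timed_function)
    proc.start()

    proc.join(timeout=60)   # block for at most 60 seconds
    if proc.is_alive():     # still running, so the time limit expired
        proc.terminate()    # kill the subprocess
        proc.join()         # reap it

if __name__ == '__main__':
    main()  # the guard matters on Windows, where multiprocessing spawns a fresh interpreter
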
I know you said that you have pickling issues, but those can almost always be resolved with refactoring. For example, you said that your long function is an instance method. You can wrap those kinds of functions to use them with multiprocessing:

class TestClass(object):
    def timed_method(self, return_queue):
        do_other_stuff()
        return_queue.put(True)
        return

To use that method in a pool of workers, add this wrapper to the top-level of the module:

def _timed_method_wrapper(TestClass_object, return_queue):
    return TestClass_object.timed_method(return_queue)

Now you can, for example, use apply_async on this class method from a different method of the same class:

def run_timed_method(self):
    return_queue = multiprocessing.Manager().Queue()
    pool = multiprocessing.Pool()
    result = pool.apply_async(_timed_method_wrapper, args=(self, return_queue))

I'm pretty sure that these wrappers are only necessary if you're using a multiprocessing.Pool instead of launching the subprocess with a multiprocessing.Process object. Also, I bet a lot of people would frown on this construct because you're breaking the nice, clean abstraction that classes provide, and also creating a dependency between the class and this other random wrapper function hanging around. You'll have to be the one to decide if making your code more ugly is worth it or not.
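
For completeness, a rough sketch of that simpler route, assuming the fork start method on Unix (on Windows the instance and its attributes still have to be picklable, since the Process object is pickled for the child):

import multiprocessing
import Queue  # "queue" on Python 3

class TestClass(object):
    def timed_method(self, return_queue):
        do_other_stuff()  # placeholder for the real work
        return_queue.put(True)

    def run_timed_method(self):
        return_queue = multiprocessing.Manager().Queue()
        # The bound method itself is the target; no module-level wrapper needed.
        proc = multiprocessing.Process(target=self.timed_method,
                                       args=(return_queue,))
        proc.start()
        try:
            return return_queue.get(timeout=60)  # wait up to 60 seconds
        except Queue.Empty:
            proc.terminate()  # kill the subprocess on timeout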

  • I managed to get it working with multiprocessing, but now the code runs 5 times (if not more) slower than without multiprocessing. Any ideas? – Paul R Feb 13 '15 at 20:30
  • No ideas without seeing your code or knowing what kind of data you're processing. Multiprocessing *does* have overhead, and generally, the less work any one process does, the smaller the benefits of parallelism will be. Perhaps you could open a new question about it, with more details and code? – skrrgwasme Feb 13 '15 at 20:33
  • I'm afraid the function is too big, so I can't paste it, but compared to running without multiprocessing, the code with multiprocessing is very slow. – Paul R Feb 13 '15 at 20:39
  • Are you passing a lot of data back and forth? Any large objects being passed into or returned from the function? Any input arguments will be *copied* into the memory space of the new process. If the function itself hasn't changed, then the only new overhead should be in startup and teardown of the new process. There's no reason it would affect the execution of the function between those points. – skrrgwasme Feb 13 '15 at 20:41
  • No. Not at all. In fact I implemented multiprocessing in a slightly different manner and it was slow, but your version runs at the same speed as without multiprocessing. Thank you very much! +1 – Paul R Feb 13 '15 at 20:59
  • Good! Since you're not actually parallelizing any of the work, I wouldn't expect it to be any faster, but it shouldn't be noticeably slower without something weird going on, either. – skrrgwasme Feb 13 '15 at 21:06
  • I didn't expect it to be faster. I just want it not to be slower. – Paul R Feb 13 '15 at 21:11
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/70883/discussion-between-andrew-fount-and-skrrgwasme). – Paul R Feb 13 '15 at 22:35
-2

An answer to "Is it possible to kill a process on Windows from within Python?" may help: you need to kill that subprocess or thread. See "Terminating a subprocess on Windows".

Maybe TerminateThread also helps.
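
If the long-running work can live in its own script, the subprocess module (which the question asks about) can enforce the limit. A rough sketch, with long_running_worker.py as a hypothetical worker script:

import subprocess
import sys
import time

# Launch the work in a separate interpreter (hypothetical worker script).
proc = subprocess.Popen([sys.executable, 'long_running_worker.py'])

deadline = time.time() + 60
while proc.poll() is None and time.time() < deadline:
    time.sleep(0.5)  # poll until it finishes or the deadline passes

if proc.poll() is None:  # still running after 60 seconds
    proc.terminate()     # on Windows this calls TerminateProcess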
