
I'm calling a function in a for loop. If a call takes longer than 5 seconds to execute, I want to skip that iteration and move on to the next one.

I have thought about using the time library and starting a clock, but the end timer would only run after the function returns, so I wouldn't be able to skip that specific iteration at the 5-second mark.
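
For illustration, the time-library idea described above would look roughly like the sketch below (my_function and its sleep are just placeholders for the real, possibly slow, call); the elapsed time only becomes known after the call has returned, so the slow iteration can no longer be skipped at the 5-second mark:

import time

def my_function(value):  # placeholder for the real, possibly slow, call
    time.sleep(value)
    return value

for value in (1, 7, 2):
    start = time.time()
    result = my_function(value)    # blocks until the call finishes, however long that takes
    elapsed = time.time() - start  # only measurable after the fact
    if elapsed > 5:
        continue                   # too late: the call has already run to completion
    print(result, elapsed)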

rishi
  • Will it be ok for those functions which *do* take longer than 5 seconds to continue to run until they finish? – quamrana Jan 20 '21 at 12:25
  • @quamrana No that's exactly what I'm trying to avoid – rishi Jan 20 '21 at 12:27
  • So you meant to ask: 'if a function is still running, can it be stopped so I can call something else?' – quamrana Jan 20 '21 at 12:28
  • @quamrana It's the same function running with different values, executed in a for loop. I want to see if its execution takes longer than 5 seconds with a particular value and then move on to the next iteration of the same function – rishi Jan 20 '21 at 12:29
  • Does this help? https://stackoverflow.com/questions/492519/timeout-on-a-function-call – AcK Jan 20 '21 at 12:29
  • I'm not sure that there is such a feature in the standard library. But you can probably build a kind of ProcessPool that kills processes that have been running for more than 5 seconds. These points are discussed here for example: https://stackoverflow.com/a/31267963/5050917 and https://stackoverflow.com/a/38792237/5050917 (the latter suggesting the use of the [Pebble](https://pypi.org/project/Pebble/) library; see the sketch after this comment thread). – mgc Jan 20 '21 at 12:30
  • Ok, so have you written this function, and are there loops in it which could be modified to detect a condition in a timely manner and just return? – quamrana Jan 20 '21 at 12:31
  • @quamrana I'm actually scraping multiple websites, and the various links are passed as the parameters for it. I want to check if the scraping takes more than 5 seconds to execute, then just move on to the next link – rishi Jan 20 '21 at 12:33
  • I don't know about this `scraping`, but if you are sending some sort of request to a website (or multiple requests in a sequence) and it might take longer than 5 seconds, then you can run it in its own thread with a queue on which it will return a result. You have your main thread check the queue and move on if nothing is returned after 5 seconds. – quamrana Jan 20 '21 at 12:37
  • @quamrana I don't know how to use threads and I haven't understood them by reading articles, as they all seem to have complicated examples. Could you provide a short example? – rishi Jan 20 '21 at 12:40
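
For reference, a minimal sketch of the Pebble-based approach mgc suggests above might look like the following (the scrape function and the urls list are placeholders for the actual scraping code, and the pebble package must be installed separately). Pebble terminates the worker process once the timeout expires, so the loop can simply move on to the next link:

from pebble import ProcessPool
from concurrent.futures import TimeoutError
import random
import time

def scrape(url):  # placeholder for the real scraping function
    time.sleep(random.randint(1, 10))
    return url

urls = ['https://example.com/a', 'https://example.com/b']  # placeholder links

if __name__ == '__main__':
    with ProcessPool() as pool:
        for url in urls:
            future = pool.schedule(scrape, args=(url,), timeout=5)
            try:
                result = future.result()  # raises TimeoutError once the 5 second limit is hit
                print('finished:', result)
            except TimeoutError:
                print('skipped (took longer than 5 seconds):', url)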

3 Answers


I am attaching an example below. Hope this helps:

from threading import Timer

class LoopStopper:

    def __init__(self, seconds):
        self._loop_stop = False
        self._seconds = seconds

    def _stop_loop(self):
        self._loop_stop = True

    def run(self, generator_expression, task):
        """Execute task for each item of generator_expression until the timer fires."""
        self._loop_stop = False  # reset so the same instance can be reused
        t = Timer(self._seconds, self._stop_loop)
        t.start()
        for i in generator_expression:
            task(i)
            if self._loop_stop:
                break
        t.cancel()  # Cancel the timer if the loop ends before the timeout.

ls = LoopStopper(5)  # 5 second timeout
ls.run(range(1000000), print)  # print numbers from 0 to 999999
  • Thank you for your answer, could you also provide an explanation with it? So does this mean that if I run `ls.run(MyFunction)` it'll only run it for 5 seconds, otherwise it'll stop it? – rishi Jan 20 '21 at 13:42

Here's some code I've been experimenting with. It has a task() which iterates over its params argument and takes a random amount of time to complete each item.

I start a thread for each task and wait for the thread to complete by monitoring a queue of return values. If the thread fails to complete within 5 seconds, the main loop abandons it and starts the next thread.

The program shows which tasks fail or finish (different every time).

The tasks which finish have their results printed out (the param and the sleep time).

import threading, queue
import random
import time

def task(params, q):
    for p in params:
        s = random.randint(1,4)
        s = s * s
        s = s / 8
        time.sleep(s)
        q.put((p,s), False)
    q.put(None, False)  # None is a sentinel value

def sampleQueue(q, ret, results):
    while not q.empty():
        item = q.get()
        if item:
            ret.append(item)
        else:
            # Found the None sentinel
            results.append(ret)
            return True
    return False
    

old = []
results = []
for p in [1,2,3,4]:
    q = queue.SimpleQueue()
    t = threading.Thread(target=task, args=([p,p,p,p,p], q))
    t.start()
    end = time.time() + 5
    ret = []
    failed = True
    while time.time() < end:
        time.sleep(0.1)
        if sampleQueue(q, ret, results):
            failed = False
            break
    if failed:
        print(f'Task {p} failed!')
        old.append(t)
    else:
        print(f'Task {p} finished!')
        t.join()

print(results)
print(f'{len(old)} threads failed')
for t in old:
    t.join()
print('Done')

Example output:

Task 1 finished!
Task 2 finished!
Task 3 failed!
Task 4 failed!
[[(1, 1.125), (1, 1.125), (1, 2.0), (1, 0.125), (1, 0.5)], [(2, 0.125), (2, 1.125), (2, 0.5), (2, 2.0), (2, 0.125)]]
2 threads failed
Done
quamrana
  • Is the task completed in this code? I mean, do you measure the time taken and then decide if it failed or finished, or does the task get interrupted if it takes too long? – rishi Jan 20 '21 at 16:20
  • I've taken the stance that if a given task is taking too long, and it is mostly network bound, then ignoring it will be just fine. But yes, the tasks that are being ignored are still running, and the loop at the end with `t.join()` waits for all the failed tasks to finally complete, just to be tidy. The fact that there can be many failed tasks running concurrently should not be a problem since they are all network bound and don't really take much cpu runtime. – quamrana Jan 20 '21 at 16:23

I will post an alternative solution using the subprocess module. You need to create a Python file with your function, call it as a subprocess, and call the wait method with a timeout. If the process doesn't finish within the desired time, wait raises a TimeoutExpired error, so you kill that process and keep going with the iteration.

As an example, this is the function you want to call:

from time import time
import sys

x = eval(sys.argv[1])  # parse the numeric argument passed by the parent process

t = time()
a = [i for i in range(int(x**5))]

# pipe the computation time back to the main process
sys.stdout.write('%s' % (time() - t))

And the main script, where I call the previous function from the func.py file:

import subprocess as sp
from subprocess import Popen, PIPE


for i in range(1, 50):
    # call the process
    process = Popen(['python', 'func.py', '%i' % i],
                    stdout=PIPE, stdin=PIPE)

    try:
        # if it finishes within 1 second:
        process.wait(1)
        print('Finished in: %s s' % (process.stdout.read().decode()))

    except sp.TimeoutExpired:
        # else kill the process. It is important to kill it,
        # otherwise it will keep running.
        print('Timeout')
        process.kill()
        
Filipe