
I am trying to execute a certain number of Python scripts at certain intervals. Each script takes a long time to execute, so I do not want to waste time waiting to run them sequentially. I tried this code, but it is not executing them simultaneously; it is executing them one by one:

Main_file.py

import time
def func(argument):
    print 'Starting the execution for argument:', argument
    execfile('test_' + argument + '.py')  # blocks until the script finishes


if __name__ == '__main__':

    arg = ['01','02','03','04','05']

    for val in arg:
        func(val)
        time.sleep(60)

What I want is to kick things off by starting the execution of the first file (test_01.py). This will keep executing for some time. After 1 minute has passed, I want to start the simultaneous execution of the second file (test_02.py). This will also keep executing for some time. In this way I want to start the execution of all the scripts at gaps of 1 minute.

With the above code, I notice that the execution happens one file after the other and not simultaneously, as the print statements in these files appear one after the other rather than interleaved.

How can I achieve this?

Jason Donnald
  • [import subprocess](https://docs.python.org/2/library/subprocess.html#subprocess.Popen), and then use `subprocess.Popen(['python', 'yourscript.py'])` instead of `execfile()`. – Carl Groner Mar 06 '15 at 21:16
  • @CarlGroner I have tried subprocess as well as threading, and in both cases a `memory limit exceeded` error comes up – Jason Donnald Mar 06 '15 at 22:06
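A minimal sketch of the subprocess route Carl Groner mentions (assuming the test scripts sit in the working directory); `Popen` returns immediately, so the launches can be staggered without blocking:

import time
import subprocess

if __name__ == '__main__':
    procs = []
    for val in ['01', '02', '03', '04', '05']:
        # Popen returns at once; the script runs in its own process
        procs.append(subprocess.Popen(['python', 'test_' + val + '.py']))
        time.sleep(60)  # stagger the launches by one minute
    for p in procs:
        p.wait()  # optionally wait for all of them to finish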

2 Answers


Using Python 2.7 on my machine, the following seems to work with small Python scripts as test_01.py, test_02.py, etc.:

import time
import thread

def func(argument):
    print 'Starting the execution for argument:', argument
    execfile('test_' + argument + '.py')


if __name__ == '__main__':

    arg = ['01', '02', '03']

    for val in arg:
        # start_new_thread returns immediately and runs func in a new thread
        thread.start_new_thread(func, (val,))
        time.sleep(10)

However, you indicated that you kept getting a memory exception error. This is likely due to your scripts using more stack memory than was allocated to them, as each thread's stack is capped at a fixed size (typically 8 MB by default on Linux). You could attempt to give them more memory by calling

thread.stack_size([size])

which is outlined here: https://docs.python.org/2/library/thread.html
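For example, a tiny sketch of how it could be applied here (the size must be 0 or at least 32768 bytes, and it only affects threads created after the call):

import thread

thread.stack_size(4 * 1024 * 1024)      # request a 4 MB stack for new threads
thread.start_new_thread(func, ('01',))  # func as defined above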

Without knowing the number of threads that you're attempting to create or how memory intensive they are, it's difficult to say whether a better solution should be sought. Since you seem to be looking into executing multiple scripts essentially independently of one another (no shared data), you could also look into the multiprocessing module here:

https://docs.python.org/2/library/multiprocessing.html
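A minimal sketch of that route, reusing func from the code above; each script then runs in its own interpreter process with its own memory:

import time
import multiprocessing

def func(argument):
    print 'Starting the execution for argument:', argument
    execfile('test_' + argument + '.py')

if __name__ == '__main__':
    jobs = []
    for val in ['01', '02', '03']:
        p = multiprocessing.Process(target=func, args=(val,))
        p.start()      # runs func in a separate process
        jobs.append(p)
        time.sleep(10)
    for p in jobs:
        p.join()       # wait for every script to finish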

Schiem
  • My scripts run for a few hours each. I tried your method with a simple script: in the main script there is `time.sleep(3)`, and in each separate script there is first a print statement, then `time.sleep(10)`, and then another print statement. When I execute this I do see the first print statements of each separate file, but not the second print statements for all. – Jason Donnald Mar 06 '15 at 22:10
  • it seems as if as soon as the main process finishes, all threads are ended, and hence we do not see the second print statement for all of them – Jason Donnald Mar 06 '15 at 22:18
  • Join the threads. http://stackoverflow.com/questions/11968689/python-multithreading-wait-till-all-threads-finished – Schiem Mar 08 '15 at 00:30
  • I did use join and it runs into the same memory limit exceeded error. I also specified the stack size to be 10 MB, but that also didn't resolve the issue – Jason Donnald Mar 09 '15 at 21:17
  • You said that each of them runs for several hours, and if you're using them for something that takes quite a bit of memory, it's likely that 10 MB isn't enough. You could either switch over to each one starting its own process (using the multiprocessing or subprocess modules), therefore being allocated memory by the OS itself, or you could monitor the scripts and check how much memory they're using. – Schiem Mar 10 '15 at 14:38
  • Yes, the queries run for long and process billions of rows. My initial try was with subprocess only, and when that did not resolve the error I tried threading as well, but still no success. – Jason Donnald Mar 10 '15 at 15:55
  • You still got a memory error while using subprocess? But you can run them standalone without the error occurring? – Schiem Mar 11 '15 at 02:00
  • Yes, I get the memory error when I use subprocess or threading. But when I run them standalone that does not happen – Jason Donnald Mar 15 '15 at 17:11
  • actually, even when I run them manually at the same time in different terminals, no error comes – Jason Donnald Mar 15 '15 at 17:12
  • From the Python docs on the subprocess module: "Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited." Ideas that I would have would be: a) try the multiprocessing module, but make sure that any data you load comes after the call to create a new process; b) abandon Python and use a cronjob instead (assuming Linux) – Schiem Mar 17 '15 at 02:40
  • Thanks for the suggestions. Can you elaborate on your point (a) above, specifically the part where you mentioned loading data after the new process call? – Jason Donnald Mar 18 '15 at 15:09
  • Any data that you load into memory with an open command is going to be inherited by all processes that you spawn. So, for instance, if you load a file and then spawn a process, the spawned process will inherit a copy of that data. Obviously, this is a problem. You can avoid this by making sure that any load calls come AFTER you spawn a new process (see the sketch below). Looking at what you want to do, this shouldn't be a problem for you. – Schiem Mar 19 '15 at 15:39
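To illustrate the ordering Schiem describes, here is a sketch with a hypothetical worker and hypothetical data files; the only point is that the heavy open/read happens inside the child process, after the fork, so the parent never holds a copy for the children to inherit:

import multiprocessing

def worker(filename):
    # hypothetical heavy load: done *inside* the child, after the fork,
    # so no other process inherits a copy of this data
    data = open(filename).read()
    # ... process data here ...

if __name__ == '__main__':
    for name in ['data_01.txt', 'data_02.txt']:  # hypothetical files
        multiprocessing.Process(target=worker, args=(name,)).start()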

If you need them to run in parallel you will need to look into threading. Take a look at https://docs.python.org/3/library/threading.html or https://docs.python.org/2/library/threading.html depending on the version of Python you are using.
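For instance, a minimal sketch using threading.Thread with the names from the question (an illustration of the module, not a fix for the memory issue discussed above):

import time
import threading

def func(argument):
    print 'Starting the execution for argument:', argument
    execfile('test_' + argument + '.py')

if __name__ == '__main__':
    threads = []
    for val in ['01', '02', '03', '04', '05']:
        t = threading.Thread(target=func, args=(val,))
        t.start()
        threads.append(t)
        time.sleep(60)
    for t in threads:
        t.join()  # keep the main process alive until all scripts finish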

Donkyhotay
  • I have tried threading but it is running into a memory exception error. I can run these simultaneously in different shells at certain intervals, but I want to automate that – Jason Donnald Mar 06 '15 at 20:54
  • Then post your threaded code so that we can take a look at it, because the only way to have functions run in parallel is to thread them. – Donkyhotay Mar 06 '15 at 21:18