I have a text file that contains a list of dates. I want to pass each date as a parameter to a shell script, so that the script runs once for every date in the file.

I want to run these in parallel using Python. Because the script has complex logic and I want to monitor the execution, I want at most 5 instances running at a time. As soon as one script finishes, Python should start a new thread for the next date.

import threading
import time


class mythread(threading.Thread):
    def __init__(self, i):
        threading.Thread.__init__(self)
        self.h = i

    def run(self):
        # The shell script would be called from here
        time.sleep(1)
        print("Value send ", self.h)


# Raw string so the backslashes in the Windows path are not treated as escapes
with open(r'C:\Senthil\SenStudy\Python\Date.txt') as date_file:
    f = [line.strip() for line in date_file]

for i in f:
    # active_count() includes the main thread, so re-check it on every pass
    # and wait (instead of skipping the date) while the limit is reached
    while threading.active_count() > 5:
        print("Number of Threads are More than 5 .. going to sleep state for 1 second ...")
        time.sleep(1)
    print("Active threads are ", threading.active_count())
    thread1 = mythread(i)
    thread1.start()

I tried using threading.activeCount() to get the number of running threads, but from the very beginning it reports 30 threads (which is the number of date entries in the file).

Imperishable Night
  • See https://stackoverflow.com/questions/55191051/how-does-thread-pooling-works-and-how-to-implement-it-in-an-async-await-env-lik/ – politinsa Jun 16 '19 at 10:35

2 Answers

Your problem seems tailor-made for a Python process pool or thread pool. If the only input to each "thread" is a date, I think a process pool may be the better choice, since synchronization between threads can be tricky.

Please read the documentation for the multiprocessing module and see whether it solves your problem. If you have any questions about it, I'll be happy to clarify.

(An example of a process pool is right at the beginning of that documentation. If you really think you need a thread pool, the syntax is the same: just replace multiprocessing with multiprocessing.dummy.)
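As a rough illustration of the pool approach described above, here is a minimal sketch. The names process_date and run_all and the sample dates are hypothetical placeholders, not part of the question. It uses the thread-backed Pool from multiprocessing.dummy; changing the import to `from multiprocessing import Pool` should give a process pool with the same calls.

```python
from multiprocessing.dummy import Pool  # thread-backed Pool with the multiprocessing API


def process_date(date):
    # Hypothetical placeholder for the real per-date work.
    return "processed " + date


def run_all(dates, workers=5):
    # At most `workers` dates are handled at once; map() blocks until all
    # dates are done and returns the results in input order.
    with Pool(workers) as pool:
        return pool.map(process_date, dates)


if __name__ == "__main__":
    print(run_all(["2019-06-01", "2019-06-02", "2019-06-03"]))
```

The pool hands the next date to a worker as soon as that worker becomes free, so there is no need to count active threads by hand.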

Imperishable Night

If you are sure you need threads rather than processes, you can use ThreadPoolExecutor to run a fixed number of worker threads that do the job:

from concurrent.futures import ThreadPoolExecutor


DATE_FILE = 'dates.txt'
WORKERS = 5


def process_date(date):
    print('Start processing', date)

    # Put here your complex logic.

    print('Finish processing', date)


def main():

    with open(DATE_FILE) as date_file:
        dates = [line.rstrip() for line in date_file]

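    with ThreadPoolExecutor(WORKERS) as executor:
        # Consume the iterator so an exception raised in a worker is not silently lost.
        # Leaving the with block waits for all workers, so no explicit shutdown() is needed.
        list(executor.map(process_date, dates))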


if __name__ == '__main__':
    main()

If you use Python 2, you have to install the futures library first to make this work:

pip install --user futures
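The question asks to run a shell script once per date, so the placeholder in process_date would probably launch that script. The following is only a sketch under assumptions: build_command, run_script and run_all are hypothetical names, and a small Python one-liner stands in for the real script so the example runs anywhere. It needs Python 3.7 or later because of capture_output.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_command(date):
    # Hypothetical command: replace it with the real script call,
    # for example ["sh", "process.sh", date].
    return [sys.executable, "-c", "import sys; print('ran for', sys.argv[1])", date]


def run_script(date):
    # Each worker thread blocks on one child process, so at most
    # `workers` scripts run at the same time.
    result = subprocess.run(build_command(date), capture_output=True, text=True)
    return date, result.returncode, result.stdout.strip()


def run_all(dates, workers=5):
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(run_script, dates))


if __name__ == "__main__":
    for date, code, output in run_all(["2019-06-01", "2019-06-02"]):
        print(date, code, output)
```

Returning the exit code and captured output for each date makes it easier to monitor which runs failed.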
constt