2

I have a huge list of information and my program should analyze each one of them. To speed up I want to use threads, but I want to limit them by 5. So I need to make a loop with 5 threads and when one finish their job grab a new one till the end of the list. But I don't have a clue how to do that. Should I use queue? For now I just running 5 threads in the most simple way: Thank you!

for thread_number in range (5):
    thread = Th(thread_number)
    thread.start()
Antonio
  • 482
  • 1
  • 5
  • 16

2 Answers2

2

It seems that you want a thread pool. If you're using python 3, you're lucky : there is a ThreadPoolExecutor class

Else, from this SO question, you can find various solutions, either handcrafted or using hidden modules from the python library.

Community
  • 1
  • 1
Scharron
  • 17,233
  • 6
  • 44
  • 63
2

Separate the idea of worker thread and task -- do not have one worker work on one task, then terminate the thread. Instead, spawn 5 threads, and let them all get tasks from a common queue. Let them each iterate until they receive a sentinel from the queue which tells them to quit.

This is more efficient than continually spawning and terminating threads after they complete just one task.

import logging
import Queue
import threading
logger = logging.getLogger(__name__)
N = 100
sentinel = object()

def worker(jobs):
    name = threading.current_thread().name
    for task in iter(jobs.get, sentinel):
        logger.info(task)
    logger.info('Done')


def main():
    logging.basicConfig(level=logging.DEBUG,
                            format='[%(asctime)s %(threadName)s] %(message)s',
                            datefmt='%H:%M:%S')

    jobs = Queue.Queue()
    # put tasks in the jobs Queue
    for task in range(N):
        jobs.put(task)

    threads = [threading.Thread(target=worker, args=(jobs,))
               for thread_number in range (5)]
    for t in threads:
        t.start()
        jobs.put(sentinel)     # Send a sentinel to terminate worker
    for t in threads:
        t.join()

if __name__ == '__main__':
    main()
unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677