
Basically I want to make about 15,000 GET requests of the form GET www.somewebsite.com/archive/1, www.somewebsite.com/archive/2, and so on, writing each response to its own local file. Doing them all in order takes a long time, but giving every request its own thread results in all sorts of IO and HTTP errors. If I do, say, 50 at a time, it works fine. What I want is a "chunk" thread that spawns 50 worker threads, and then spawns another chunk thread when the first one is finished. But I haven't found a way to do this.

I need a way to say "don't execute any more lines until this thread has completed", or a way to queue up threads that get executed asynchronously in order.
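The batching described above (run 50 at a time, wait, then start the next 50) can be sketched with a thread pool instead of a manual "chunk" thread. `fetch` and the range here are stand-ins, not the asker's actual download code:

```python
import concurrent.futures

def fetch(n):
    # stand-in for: GET www.somewebsite.com/archive/<n> and write the
    # body to a file; it just returns n so the example runs offline
    return n

# max_workers=50 caps the number of live threads at 50; the executor
# queues the remaining items, replacing the manual chunking logic
with concurrent.futures.ThreadPoolExecutor(max_workers=50) as pool:
    results = list(pool.map(fetch, range(1, 101)))

print(len(results))
```

`pool.map` blocks until every item has been processed, so no request ever runs before a worker slot is free.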

Michael

2 Answers


"You need to use the join method of the Thread object at the end of the script."

This has been stated by maksim skurydzin.
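A minimal sketch of what `join` does, assuming plain `threading.Thread` workers; the `work` function is a placeholder, not the asker's download code:

```python
import threading

results = []

def work(n):
    # placeholder worker; a real one would perform the HTTP request
    results.append(n * n)

threads = [threading.Thread(target=work, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # block here until this thread has finished

# only reached once every thread has completed
print(sorted(results))
```

Joining each batch of 50 threads before spawning the next batch gives exactly the "don't execute any more lines until this thread is completed" behavior the question asks for.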

You might also want to take a look at the multiprocessing module.


Python has a built-in multiprocessing module that will let you implement simple batch processing with a pool of workers.

import multiprocessing

static_input = range(100)
chunksize = 10

def work(item):
    return "Number " + str(item)

if __name__ == "__main__":
    # the __main__ guard is required on platforms that spawn workers
    # by re-importing this module (e.g. Windows)
    with multiprocessing.Pool() as pool:
        # imap_unordered hands items to the workers in chunks of
        # `chunksize` and yields results as they complete
        for out in pool.imap_unordered(work, static_input, chunksize):
            print(out)
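Applied to the asker's problem, the same pattern looks roughly like this; `fetch_page` is hypothetical and only builds a filename here, since the real version would perform the GET request and write the file:

```python
import multiprocessing

def fetch_page(n):
    # hypothetical worker: in real code this would GET
    # www.somewebsite.com/archive/<n> and save the body to a file;
    # here it just returns the name of the file it would write
    return "archive_%d.txt" % n

if __name__ == "__main__":
    # a small pool for the demo; the asker would use processes=50
    # to reproduce the "50 at a time" limit
    with multiprocessing.Pool(processes=4) as pool:
        names = pool.map(fetch_page, range(1, 101))
```

The pool size caps concurrency the same way the 50-thread chunks did, and `map` does not return until every page has been handled.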
James Burgess