
I have the following code:

def upload_to_s3(filepath, unique_id):
    # do something
    print s3_url # <-- Confirming that this `s3_url` variable is not None
    return s3_url


threads = []
for num, list_of_paths in enumerate(chunked_paths_as_list):
    for filepath in list_of_paths:
        t = threading.Thread(target=upload_to_s3, args=(filepath, self.unique_id))
        t.start()
        threads.append(t)
results = map(lambda t: t.join(), threads)
print results

Unfortunately, this is returning None for every item:

[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]
>>>>> TIME: 13.9884989262

What do I need to do to get the values returned by upload_to_s3 out of the above map?

Martijn Pieters
David542

1 Answer


t.join() always returns None. That's because the return value of a thread target is ignored.
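
A minimal sketch of that behaviour, using a hypothetical `work` function as a stand-in for `upload_to_s3` (written so that it also runs on Python 3):

```python
import threading


def work():
    # Hypothetical stand-in for upload_to_s3; this return value is discarded.
    return "some result"


t = threading.Thread(target=work)
t.start()
joined = t.join()  # join() only waits for the thread to finish
print(joined)      # None
```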

You'll have to collect your results by some other means, like a Queue object:

import threading
from Queue import Queue  # Python 2; in Python 3 the module is named `queue`

results = Queue()

def upload_to_s3(filepath, unique_id):
    # do something
    print s3_url # <-- Confirming that this `s3_url` variable is not None
    results.put(s3_url)


threads = []
for num, list_of_paths in enumerate(chunked_paths_as_list):
    for filepath in list_of_paths:
        t = threading.Thread(target=upload_to_s3, args=(filepath, self.unique_id))
        t.start()
        threads.append(t)
for t in threads:
    t.join()

while not results.empty():
    print results.get()

Alternatively, use the multiprocessing.dummy module, which provides the multiprocessing.Pool API backed by threads instead of processes. A pool does what you want: it collects the return values of the asynchronous function calls.
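
As a rough sketch of the pool approach: the `upload_to_s3` below is a hypothetical stand-in that only builds a fake URL, and the bucket name, the sample paths, the pool size, and the `unique_id` value are all placeholder assumptions. `Pool.map` returns the functions' return values in the same order as the inputs.

```python
from functools import partial
from multiprocessing.dummy import Pool  # same API as multiprocessing.Pool, but thread-backed


def upload_to_s3(filepath, unique_id):
    # Hypothetical stand-in: a real implementation would upload the file here.
    return "https://example-bucket.s3.amazonaws.com/%s/%s" % (unique_id, filepath)


# Placeholder input, shaped like the chunked list of paths in the question.
chunked_paths_as_list = [["a.txt", "b.txt"], ["c.txt"]]
all_paths = [path for chunk in chunked_paths_as_list for path in chunk]

pool = Pool(4)  # 4 worker threads; an arbitrary choice
try:
    # map() blocks until every call has finished and returns the results in order.
    results = pool.map(partial(upload_to_s3, unique_id="demo-id"), all_paths)
finally:
    pool.close()
    pool.join()

print(results)
```

Compared with the Queue version, this keeps `upload_to_s3` returning its value normally, so the function does not need to know how the results are collected.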

Martijn Pieters
  • Thanks, that makes sense. Could you please show me how I would use the `Queue` object in the above example? – David542 Sep 22 '14 at 20:08