EDIT: I've now come back to this answer after some time to provide some more insight, having realized that my original answer was probably wrong. I've included a description of the problem which I think is interesting by itself, but you can skip it and go straight to a possible solution.
The problem
I originally thought that the issue was that Google Colab was prematurely stopping the process/threads when they were inactive. While that seemed reasonable at the time, I've realized that the answer is much simpler.
The issue here is that the main thread is not waiting for the created threads to end. After the main thread is done, Google Colab does not seem to wait for the other threads to end, and so the output they produce never reaches the main console. The following code runs as expected locally:
import threading
import time
maxthreads = 2
sema = threading.Semaphore(value=maxthreads)
threads = list()
def task(i):
sema.acquire()
print( "start %s" % (i,))
time.sleep(2)
sema.release()
for i in range(10):
thread = threading.Thread(target=task,args=(i,))
threads.append(thread)
thread.start()
Saving it locally to a file and running it yields:
start 0
start 1
start 2
start 3
start 4
start 5
start 6
start 7
start 8
start 9
However when running it in Google Colab (you can try it here) we get:
start 0
start 1
What's going on internally (I assume) is that the main thread is done, and then Google Colab doesn't wait for all the other threads to end. We only see the first to threads' output because those run fast enough that they are done before the main thread ends. An interesting experiment is to print something when the main thread is done:
import threading
import time
maxthreads = 2
sema = threading.Semaphore(value=maxthreads)
threads = list()
def task(i):
sema.acquire()
print( "start %s" % (i,))
time.sleep(2)
sema.release()
for i in range(10):
thread = threading.Thread(target=task,args=(i,))
threads.append(thread)
thread.start()
print('Main thread done')
We get the following output (output from running locally on the left, output from running on Google Colab on the right):
Locally: Google colab:
---------------------------------------
start 0 | start 0
start 1 | start 1
Main thread done | Main thread done
start 2 |
start 3 |
start 4 |
start 5 |
start 6 |
start 7 |
start 8 |
start 9 |
Indeed we see that once the main thread is done, the rest of the output is lost on Google Colab.
A solution
We can make use of Thread.join()
(docs) to wait until a thread is done. That way, we can make the main process wait for all the additional threads before finishing (you can try it in Google Colab here):
import threading
import time
maxthreads = 2
sema = threading.Semaphore(value=maxthreads)
threads = list()
def task(i):
sema.acquire()
print( "start %s" % (i,))
time.sleep(2)
sema.release()
for i in range(10):
thread = threading.Thread(target=task,args=(i,))
threads.append(thread)
thread.start()
for t in threads:
t.join()
And the output is the same, both locally and in Google Colab:
start 0
start 1
start 2
start 3
start 4
start 5
start 6
start 7
start 8
start 9
You can also try adding print('Main thread done')
at the end, and you'll see that it will be printed only when all the additional threads are done.
On an unrelated note, you should probably change
thread = threading.Thread(target=task,args=(str(i)))
To
thread = threading.Thread(target=task,args=(i,))
Or you might get problems when i
is a two-digit number. Note that (i,)
is a tuple with i
as its single element.