
Extending a question I asked before about multithreading on top of multiprocessing in Python:

Can I do multithreads on each process of a multiprocess program?

So I'm trying to implement an example that achieves that. First I spawn 2 processes, and each one creates 10 threads, but something doesn't look right. I'm not using any locks or semaphores, so I expect the output to be scrambled (as in example 3). When I run this code, examples 1 and 2 are printed in the correct format. For example 2, I even tried creating 2 threads to start each process, to make sure they're not starting sequentially. Why is that happening? What am I missing, and where did the coordination happen?

import multiprocessing, threading, time

def startThreads(n):
    threads = [threading.Thread(target=printer, args=(n, i))  for i in range(10)]
    [t.start() for t in threads]
    [t.join() for t in threads]

def printer(process_num, thread_num):
    time.sleep(1)
    print(f"Process number: {process_num} thread number: {thread_num}")
    print(f"Process number: P thread number: T")

if __name__ == '__main__':

    # Example 1
    pros = [multiprocessing.Process(target=startThreads, args=(p_num, )) for p_num in range(5)]
    [p.start() for p in pros]
    [p.join() for p in pros]
    # Process number: 0 thread number: 0
    # Process number: P thread number: T
    # Process number: 0 thread number: 4
    # Process number: P thread number: T
    # Process number: 0 thread number: 1
    # Process number: P thread number: T
    # Process number: 0 thread number: 2
    # Process number: P thread number: T
    # Process number: 0 thread number: 3
    # Process number: P thread number: T
    # Process number: 1 thread number: 0
    # ...

    # Example 2
    print()
    startThreads(0)
    # Process number: 0 thread number: 1
    # Process number: P thread number: TProcess number: 0 thread number: 0
    # Process number: P thread number: T

    # Process number: 0 thread number: 2Process number: 0 thread number: 4Process number: 0 thread number: 3
    # Process number: P thread number: T

    # Process number: P thread number: T

    # Process number: P thread number: T

Notice how print behaved in example 2. Example 1, on the other hand, always prints in the correct format (safe print), even though in both cases the print function is called by a thread. The same thing happens when I remove the print formatting and print a fixed string instead.

The discussion in this question says that we need to implement some sort of safe printing method to get every print statement on its own line, but that's not the case with example 1.


import multiprocessing, threading, time

def startThreads():
    threads = [threading.Thread(target=printer)  for i in range(5)]
    [t.start() for t in threads]
    [t.join() for t in threads]

def printer():
    time.sleep(0.05)
    print(f"Process number: P thread number: T")

if __name__ == '__main__':
    p = multiprocessing.Process(target=startThreads)
    p.start()
    p.join()
    print("=====================================")
    startThreads()

    # Process number: P thread number: T
    # Process number: P thread number: T
    # Process number: P thread number: T
    # Process number: P thread number: T
    # Process number: P thread number: T
    # =====================================
    # Process number: P thread number: TProcess number: P thread number: T
    # Process number: P thread number: T

    # Process number: P thread number: TProcess number: P thread number: T
    #

I tried using only one process, and it still prints safely, with every line on its own line. But when I call startThreads directly, it doesn't behave the same way and doesn't print safely. Why is it behaving this way?

Marsilinou Zaky
  • Which Python version are you using (affects the GIL)? Which OS (affects preemptive scheduling)? – RafalS Dec 22 '19 at 12:13
  • Anyway, on my machine (Linux Mint 19 and Python 3.8.1) the output is not synchronized in either of the 2 cases. It must somehow be OS/Python version dependent if you always get the same results. – RafalS Dec 22 '19 at 12:20

1 Answer


And with the same code I get scrambled output:

1:20:21:3, , 0:00:1, 0:71:5, , 1:40:91:1, 1:6, , , 1:0, , 0:5, 0:4, 0:3, 0:6, 1:7, 0:8, 1:9, , 1:8, 
0:0, 0:10:2, , 0:3, 0:4, 0:50:6, 1:0, , 1:1, 0:7, 1:2, 1:3, 0:9, 0:81:5, 1:4, , 1:71:6, , 1:91:8, , 
0:0, 0:1, 0:40:3, 0:2, , 0:60:5, , 0:70:8, , 0:9, 

Try running it multiple times. If examples 1 and 2 always behave the same way for you, maybe it's platform dependent.

so I expect the output to be scrambled

It's not synchronized in any way. The ordering is just random :)
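If line-by-line output is wanted within a single process, one common approach is to hold a shared `threading.Lock` around each `print` call. The following is a minimal sketch, not the asker's code: `safe_print`, `_print_lock`, and `start_threads` are hypothetical names, and the `count` and `out` parameters are added only for illustration.

```python
import io
import threading

# Hypothetical helper state: one lock shared by every thread in this process.
_print_lock = threading.Lock()

def safe_print(*args, **kwargs):
    """Call print() while holding the lock, so the text and its line ending
    are written without another thread's safe_print() output in between."""
    with _print_lock:
        print(*args, **kwargs)

def printer(process_num, thread_num, out=None):
    # file=None makes print() fall back to sys.stdout.
    safe_print(f"Process number: {process_num} thread number: {thread_num}", file=out)

def start_threads(n, count=10, out=None):
    threads = [threading.Thread(target=printer, args=(n, i, out)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

if __name__ == "__main__":
    start_threads(0)
```

The order of the lines is still arbitrary; only the integrity of each line is protected. A `threading.Lock` does not coordinate separate processes, so this sketch covers threads inside one process only.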

RafalS
  • I think I wasn't clear enough. I'm not talking about the order of execution. I'm talking about the following example, where a thread doesn't wait for the previous one to finish printing (the output does not follow the pattern of "T:P, T:P"): 0:0, 0:1, 0:20:40:3, , , 0:6, 0:5, 0:9, 0:70:8, , – Marsilinou Zaky Dec 10 '19 at 21:24
  • Why would it wait? Threads run under the GIL (global interpreter lock), so only one bytecode can be executed at a time, but threads are also preemptive, so execution can be interrupted at any point, for example in the middle of print, which isn't an atomic operation. Processes, on the other hand, are truly parallel, so 2 processes might be printing at the same time. – RafalS Dec 10 '19 at 21:40
  • Then why is the output not formatted when I call startThreads independently, while it is in the correct format when it's called in the processes I create? Do you see my point? And is this implementation correct for creating threads on top of the processes I created? – Marsilinou Zaky Dec 10 '19 at 21:47
  • That's the point. It's random :P. Your implementation is correct. For more about weird prints see https://stackoverflow.com/questions/3029816/how-do-i-get-a-thread-safe-print-in-python-2-6 – RafalS Dec 10 '19 at 21:51
  • It looks like the `end` value is printed separately. If you change your print to `print(f"{process_num}:{thread_num}, ", end='')` then there's no overlapping in any of these cases – RafalS Dec 10 '19 at 21:53
  • The same issue appears even without end='' – Marsilinou Zaky Dec 20 '19 at 23:36
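The suggestion in the comments that the `end` value is written separately can be tried out by building the whole line, newline included, and passing it to the stream in a single `write()` call. This is a rough sketch under that assumption: `write_line` and `start_threads` are hypothetical names, the `out` parameter exists only for illustration, and whether a single `write()` stays intact on a real terminal may depend on the platform and Python version.

```python
import io
import sys
import threading

def write_line(text, out=None):
    # Hypothetical helper: the text and the newline go out in one write() call,
    # rather than as separate writes for the text and for `end`.
    stream = sys.stdout if out is None else out
    stream.write(text + "\n")

def start_threads(process_num, count=10, out=None):
    threads = [
        threading.Thread(target=write_line, args=(f"{process_num}:{i}", out))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

if __name__ == "__main__":
    start_threads(0)
```

Unlike a lock, this does not add any explicit synchronization; it only removes the gap between writing the text and writing the line ending.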