My question is inspired by a comment on the solving embarrassingly parallel problems with multiprocessing post.
I am asking about the general case where Python multiprocessing is used to (1) read data from file, (2) manipulate data, and (3) write results to file. In the case I describe, the data read from file in (1) is passed to queue A, and (2) fetches it from that queue. (2) also passes its results to a separate queue B, and (3) fetches results from queue B to write them to file.
When (1) is done, it passes a STOP signal* to queue A so (2) knows no more data is coming. (2) then passes a STOP signal to queue B and terminates, so (3) knows no more results are coming and terminates once it has drained queue B.
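To make the setup concrete, here is a rough sketch of the pipeline I have in mind. The names (`reader`, `worker`, `writer`, `run_pipeline`), the `STOP = None` sentinel, and the uppercase "manipulation" are all placeholders of my own, not anything prescribed by the linked post:

```python
import multiprocessing as mp

STOP = None  # sentinel; an assumed convention for the "STOP signal"

def reader(in_path, queue_a, n_workers):
    # (1) read data from file into queue A
    with open(in_path) as f:
        for line in f:
            queue_a.put(line)
    # one STOP per worker, so every (2) process receives its own
    for _ in range(n_workers):
        queue_a.put(STOP)

def worker(queue_a, queue_b):
    # (2) manipulate data from queue A, pass results to queue B
    while True:
        item = queue_a.get()
        if item is STOP:
            queue_b.put(STOP)  # tell (3) this worker is done
            break
        queue_b.put(item.upper())  # placeholder manipulation

def writer(out_path, queue_b, n_workers):
    # (3) write results; terminate after one STOP per worker has arrived
    seen = 0
    with open(out_path, "w") as f:
        while seen < n_workers:
            item = queue_b.get()
            if item is STOP:
                seen += 1
            else:
                f.write(item)

def run_pipeline(in_path, out_path, n_workers=2):
    # "fork" start method keeps the sketch self-contained on POSIX;
    # drop this (and add a __main__ guard) for Windows
    ctx = mp.get_context("fork")
    queue_a, queue_b = ctx.Queue(), ctx.Queue()
    procs = [ctx.Process(target=reader, args=(in_path, queue_a, n_workers)),
             *(ctx.Process(target=worker, args=(queue_a, queue_b))
               for _ in range(n_workers)),
             ctx.Process(target=writer, args=(out_path, queue_b, n_workers))]
    for p in procs:
        p.start()
    # The STOP signals alone shut each stage down; join() only blocks
    # the *parent* until every stage has exited, so code after
    # run_pipeline() can rely on out_path being fully written.
    for p in procs:
        p.join()
```

Calling `run_pipeline("in.txt", "out.txt")` reads `in.txt`, uppercases each line across the workers, and writes the (unordered) results to `out.txt`.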
So is there any need to call the multiprocessing .join() method on (1) and (2)? I would have thought that (2) cannot finish until (1) finishes and sends its STOP signal. For (3) it makes sense to wait, as any subsequent instructions might otherwise proceed before (3) has finished writing.
But maybe calling the .join() method costs nothing and can be used just to avoid having to think about it?
*actually, the STOP signal consists of a sequence of N stop signals, where N equals the number of processes running in (2).