TL;DR: Just a remark that the master/slave approach is not to be favoured and can durably shape a programmer's way of thinking, leading to poor code once it is put in production.
Although Clarissa is perfectly correct and her answer is very clear, I'd like to add a few general remarks, not about the code itself, but about parallel computing philosophy and good habits.
First, a quick preamble: when one wants to parallelise one's code, it can be for two main reasons: making it faster and/or allowing it to handle larger problems by overcoming the limitations (such as memory) of a single machine. But in all cases, performance matters, and I will always assume that MPI (or, generally speaking, parallel) programmers are interested in and concerned with performance. So the rest of my post will suppose that you are.
Now the main reason for this post: over the past few days, I've seen a few questions here on SO about MPI and parallelisation, obviously coming from people eager to learn MPI (or OpenMP for that matter). This is great! Parallel programming is great and there'll never be enough parallel programmers. So I'm happy (and I'm sure many SO members are too) to answer questions that help people learn how to program in parallel. And in the context of learning how to program in parallel, you have to write some simple codes, doing simple things, in order to understand what the API does and how it works. These programs might look stupid from a distance and very inefficient, but that's fine, that's how learning works. Everybody learned this way.
However, you have to keep in mind that these programs you write are only that: API learning exercises. They're not the real thing and they do not reflect the philosophy of what an actual parallel program is or should be. What drives my answer is that, here and in other questions and answers, I've repeatedly seen the notion of a "master" process and its "slaves" put forward. And that is wrong, fundamentally wrong! Let me explain why:
As Clarissa perfectly pinpointed, "in MPI, each process executes the same code". The idea is to find a way of making several processes interact and work together to solve a (possibly larger) problem (hopefully faster). But amongst these processes, none gets any special status; they are all equal. They are given an id so that they can be addressed, but rank 0 is no better than rank 1 or rank 1025... By artificially deciding that process #0 is the "master" and the others are its "slaves", you break this symmetry, and that has consequences:
Now that rank #0 is the master, it commands, right? That's what a master does. So it will be the one getting the information necessary to run the code, it will distribute their shares to the workers, and it will instruct them to do the processing. Then it will wait for the processing to be concluded (possibly keeping itself busy in the meantime, but more likely just waiting or poking the workers, since that's what a master does), collect the results, do a bit of reassembling and output the final result. Job done! What's wrong with that?
Well, the following are wrong:
- While the master gets the data, the slaves are sitting idle. This is sequential and inefficient processing...
- Then the distribution of the data and of the work to do implies a lot of transfers. This takes time, and since it happens solely between process #0 and all the others, it might create a lot of congestion on one single network link.
- While the workers do their work, what should the master do? Work as well? If so, it might not be readily available to handle requests from the slaves when they come, delaying the whole parallel processing. Wait for these requests? Then it wastes a lot of computing power by sitting idle... Ultimately, there is no good answer.
- Then the first two points are repeated in reverse order, with the gathering of the results and the outputting of results. That's a lot of data transfers and sequential processing, which will badly damage the global scalability, efficiency and performance.
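To put a rough number on how much the sequential parts hurt, here is a minimal sketch using Amdahl's law. It is not mentioned in the points above and is brought in purely as an illustration. The function name `amdahl_speedup` and the 10% serial fraction are hypothetical, chosen only for the example.

```python
def amdahl_speedup(serial_fraction, n_procs):
    """Upper bound on speedup when `serial_fraction` of the run time
    cannot be parallelised (Amdahl's law)."""
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_procs)


if __name__ == "__main__":
    # Assumption for illustration only: the master spends 10% of the total
    # run time reading, distributing, gathering and writing on its own.
    for n_procs in (4, 16, 64, 1024):
        print(n_procs, round(amdahl_speedup(0.10, n_procs), 2))
```

Under that assumption, the speedup can never reach 10, however many processes you add, because the part handled by the master alone does not shrink.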
So I hope you now see why the master/slaves approach is (usually, not always but very often) wrong. And the danger I see in the questions and answers I've read over the past few days is that you might get your mind formatted by this approach, as if it were the "normal" way of thinking in parallel. Well, it is not! Parallel programming is symmetry. It is handling the problem globally, in all places at the same time. You have to think parallel from the start and see your code as a global parallel entity, not just a bunch of processes that need to be instructed on what to do. Each process is its own master, dealing with its peers on an equal footing. Each process should (as much as possible) acquire its data by itself (making that step parallel too); decide what to do based on the number of peers involved in the processing and its own id; exchange information with its peers when necessary, be it locally (point-to-point communications) or globally (collective communications); and issue its own share of the result (again leading to parallel processing)...
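To make "decide what to do based on the number of peers and its id" a bit more concrete, here is a minimal sketch in plain Python, deliberately without any MPI library so that it runs anywhere. The names `block_range` and `local_sum` are hypothetical. `rank` and `size` stand for the values a process would get from `MPI_Comm_rank` and `MPI_Comm_size`, and the loop over ranks only simulates what would really be separate processes.

```python
def block_range(rank, size, n_items):
    """Return the half-open range [start, stop) of items owned by `rank`."""
    base, extra = divmod(n_items, size)
    # The first `extra` ranks take one additional item each.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def local_sum(rank, size, data):
    """Each rank works only on its own slice; nobody hands it out."""
    start, stop = block_range(rank, size, len(data))
    return sum(data[start:stop])


if __name__ == "__main__":
    data = list(range(10))  # stand-in for input each rank could read itself
    size = 4
    # In a real code each rank runs this once, with its own rank value.
    partials = [local_sum(rank, size, data) for rank in range(size)]
    # Summing the partials stands in for a collective reduction.
    print(partials, sum(partials))
```

Every rank runs exactly the same code and works out its own share from `rank` and `size` alone. No process has to tell the others what to do, and the final combination would be a collective (for instance `MPI_Allreduce`) rather than a series of sends to rank 0.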
OK, that's a bit extreme a requirement for people just starting to learn parallel programming, and by no means do I want to tell you that your learning exercises should look like this. But keep the goal in mind and don't forget that API learning examples are only API learning examples, not reduced models of actual codes. So keep on experimenting with MPI calls to understand what they do and how they work, but try to slowly move towards a symmetrical approach in your examples. That can only be beneficial for you in the long term.
Sorry for this lengthy and somewhat off-topic answer, and good luck with your parallel programming.