MPI matrix output by blocks

Question

I've divided a matrix into blocks and multiplied it using Fox's algorithm.

How can I print the result matrix to the screen when it is stored by blocks in different processes, without sending these blocks back to the process with rank 0?

For example.
After multiplication I've got:

Block A:
83  64
112 76

Block B:
118 44
152 34

Block C:
54 68
67 56

Block D:
89  85
114 68

Entire matrix should look like:
83  64 118 44
112 76 152 34
54  68 89  85
67  56 114 68

What I've done so far:
send the two blocks that together hold one row of blocks and print them to the screen. But is it possible to print the entire result matrix without sending more than one block to process 0?

// Function for gathering the result matrix
// pCblock - one block containing part of the entire result matrix
// Size - matrix dimension
// BlockSize - block dimension
// RowComm - row communicator, declared elsewhere
void ResultCollection(double* pCblock, int Size, int BlockSize) {
    // Buffer for BlockSize full matrix rows; only the root of RowComm fills it
    double* pResultRow = new double[Size * BlockSize];
    for (int i = 0; i < BlockSize; i++) {
        // Row i of every block in RowComm lands side by side in row i of pResultRow
        MPI_Gather(&pCblock[i * BlockSize], BlockSize, MPI_DOUBLE,
                   &pResultRow[i * Size], BlockSize, MPI_DOUBLE, 0, RowComm);
    }
    // print two matrix rows from two blocks
    delete[] pResultRow;
}

This doesn't help
( Ordering Output in MPI )
because for the matrix output I need to print not the entire block A, then B, then C, then D,
but rather
one line from A ( in process 0 ), one line from B ( from process 1 ),
one line from A ( in process 0 ), one line from B ( from process 1 ),
one line from C ( from process 2 ), one line from D ( from process 3 )
and so on.
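To make that ordering concrete, here is a minimal sketch of which processes contribute to each printed matrix row. It assumes the blocks sit in a square process grid numbered row by row (rank = blockRow * q + blockCol, as in the A, B, C, D example above). The names RanksForRow and LocalRow are hypothetical helpers, not part of my code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: for one global row of the Size x Size result, list
// the ranks whose blocks hold a piece of that row, from left to right.
// Assumes a q x q process grid with rank = blockRow * q + blockCol.
std::vector<int> RanksForRow(int globalRow, int Size, int BlockSize) {
    assert(BlockSize > 0 && Size % BlockSize == 0);
    assert(globalRow >= 0 && globalRow < Size);
    const int q = Size / BlockSize;              // blocks per grid row
    const int blockRow = globalRow / BlockSize;  // grid row that owns this matrix row
    std::vector<int> ranks;
    for (int blockCol = 0; blockCol < q; ++blockCol) {
        ranks.push_back(blockRow * q + blockCol);
    }
    return ranks;
}

// Hypothetical helper: the row inside each of those blocks that holds the
// data for the given global row.
int LocalRow(int globalRow, int BlockSize) {
    assert(BlockSize > 0 && globalRow >= 0);
    return globalRow % BlockSize;
}
```

For the 4x4 example with 2x2 blocks, RanksForRow(1, 4, 2) names processes 0 and 1 and LocalRow(1, 2) is 1, so the second printed line is the second line of A followed by the second line of B.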

Example matrix and blocks

Possible duplicate of [Ordering Output in MPI](https://stackoverflow.com/questions/5305061/ordering-output-in-mpi). The question may sound a bit different, but the Answers explain very well why the answer to your question is basically just **no**. — Zulan, Nov 12 '17 at 23:32

score 0 · Answer 1 · answered Nov 13 '17 at 09:25

How can I print ...
without sending these blocks back to the process with rank 0?

Well, it is time to realise
that unless the process with rank 0 is equipped with some sort of clairvoyance, it will never be able to pretty-print results that were computed remotely in a herd of decentralised, distributed processes.

Similarly, it is easy to test,
if you still do not believe what has been published on this, that MPI-distributed code was never promised any guarantee, weak or strong, about how the uncoordinated, asynchronous character streams printed by remote processes get ordered into the one common serial output, the system stdout, and finally put on the screen.

Even if you played a lot with ANSI escape codes for cursor addressing, such design efforts would not yield universally working code, and the tricks needed to inject "absolute" addressing into the ANSI-coded output would be awful both to implement and to operate if the result is to be painted on screen correctly.

No. Better not to try either of these ideas.

Your MPI infrastructure advisors or admins will surely help you and show you appropriate tools for collecting the results and post-processing them accordingly.
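As an illustration of the post-processing step, here is a minimal sketch of what rank 0 could do once the blocks have arrived. It assumes the blocks were collected with a single MPI_Gather of BlockSize * BlockSize doubles per process over a communicator spanning the whole grid (that call is not shown), so the buffer holds the blocks back to back in rank order. The function name AssembleMatrix and the row-by-row rank numbering are assumptions for the example, not something taken from the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: rearrange a rank-ordered buffer of row-major blocks
// into one row-major Size x Size matrix.
// Assumes a q x q process grid with rank = blockRow * q + blockCol.
std::vector<double> AssembleMatrix(const std::vector<double>& gathered,
                                   int Size, int BlockSize) {
    assert(BlockSize > 0 && Size % BlockSize == 0);
    assert(gathered.size() == static_cast<std::size_t>(Size) * Size);
    const int q = Size / BlockSize;  // blocks per grid row
    std::vector<double> full(gathered.size());
    for (int rank = 0; rank < q * q; ++rank) {
        const int blockRow = rank / q;
        const int blockCol = rank % q;
        // Start of this rank's block inside the gathered buffer
        const std::size_t base =
            static_cast<std::size_t>(rank) * BlockSize * BlockSize;
        for (int i = 0; i < BlockSize; ++i) {
            for (int j = 0; j < BlockSize; ++j) {
                const std::size_t row = static_cast<std::size_t>(blockRow) * BlockSize + i;
                const std::size_t col = static_cast<std::size_t>(blockCol) * BlockSize + j;
                full[row * Size + col] = gathered[base + i * BlockSize + j];
            }
        }
    }
    return full;
}
```

With the four blocks from the question laid out as A, B, C, D, the returned vector reads row by row as the matrix the question expects, and rank 0 can print it with an ordinary double loop.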
