OrderedDict loses its sequence while using Mpi4py

Question

I am trying to print the dictionary using Mpi4py:

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.rank

if rank == 0:
    data = {'a':1,'b':2,'c':3}
else:
    data = None

data = comm.bcast(data, root=0)
print 'rank',rank,data

As soon as I run it on more processors using the command below:

mpiexec -n 10 python code.py

the results get mixed with other processor results as shown below:

rank 0 {'arank 2 {'a': 1, 'c': 3, 'b': 2}
rank' 3 {'a': 1, 'c': 3, 'b': 2}
: 1, 'c': 3, 'b': 2}
rank 8 {'a': 1, 'c': 3, 'b': 2}rank
 1 {'a': 1, 'c': 3, 'b': 2}
rank 4 {'a': 1, 'c': 3, 'b': 2}
rank 5 {'a': 1, 'c': 3, 'b': 2}
rank 9 {'a': 1, 'c': 3, 'b': 2}
rankrank  7 {'a': 1, 'c': 3, 'b': 2}
6 {'a': 1, 'c': 3, 'b': 2}

I assume this happens because each processor prints its results as soon as it finishes its task, so the output gets interleaved. I tried using OrderedDict, but even that didn't work. The code is as follows:

from mpi4py import MPI
import collections 

comm = MPI.COMM_WORLD
rank = comm.rank

if rank == 0:
    data = collections.OrderedDict({'a':1,'b':2,'c':3})
else:
    data = None

data = comm.bcast(data, root=0)
print 'rank',rank,data

Is there a way to get the results to be consistent and not mixed with the results of other processors?

See https://stackoverflow.com/questions/15711755/converting-dict-to-ordereddict — 9769953, Aug 09 '18 at 12:38
I think I mentioned it explicitly in the post, **"the results get mixed with other processor results as shown below"** — calvin, Aug 09 '18 at 13:51
The Stack Overflow link that you shared does not help with my problem. When I run the above code with the Python MPI library, it runs the same code on multiple processors, so with N processors it prints the results N times. As you can see in the results I showed, the rank 0 dictionary is mixed with the rank 1 dictionary, etc. That is the only problem I have: I want the rank 0 dictionary not to mix with the rank 1 dictionary. Thanks! – calvin, Aug 09 '18 at 13:57
Later in the code I exchange dictionary data between processors, and I would like to check that the MPI code I am writing is logically correct. For that I want the results to be printed. I would like to point out that I didn't face this issue when using C/C++ for MPI. So with Python, I am trying to find a way to avoid interleaved output while printing. Many thanks! – calvin, Aug 09 '18 at 14:29

Azat Ibrakov · Answer 1 · 2018-08-09T12:37:12.640

3

I'm not sure if it solves your issue with mixed output, but in the statement

collections.OrderedDict({'a':1,'b':2,'c':3})

two things happen:

1. A dict literal {'a':1,'b':2,'c':3} is created.
2. It is passed to collections.OrderedDict, which makes a copy whose order is inherited from the original dict, and a plain dict does not guarantee insertion order in Python versions before 3.7.

If you want to keep the insertion order, pass an ordered iterable (like a list) of key-value pairs.

So we can write it like this:

collections.OrderedDict([('a', 1), ('b', 2), ('c', 3)])

Test

>>> data = collections.OrderedDict([('a', 1), ('b', 2), ('c', 3)])
>>> list(data.keys()) == ['a', 'b', 'c']
True
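
Since mpi4py's lowercase methods such as comm.bcast serialize generic Python objects with pickle, the same order should also survive the broadcast. A minimal sketch of that round trip, run without MPI and assuming the default pickle-based serialization (the variable names are only illustrative):

```python
import pickle
from collections import OrderedDict

# Build the ordered dictionary from a list of pairs so the order is well defined.
original = OrderedDict([('a', 1), ('b', 2), ('c', 3)])

# Serialize and deserialize, roughly what happens to an object passed to comm.bcast.
received = pickle.loads(pickle.dumps(original))

print(list(received.keys()))  # ['a', 'b', 'c']
```

The received object is still an OrderedDict, so the key order on every rank matches the order on the root.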

edited Aug 09 '18 at 12:37

answered Aug 09 '18 at 12:30


Sorry, it doesn't help. As you can see from the code, I'm trying to run it on **multiple processors** and print the same result from each of them. The results (the data dictionary) are overlapping. I am looking for a way to avoid that overlap. Let me know if there's anything that could help me get the results from the processors in sequence. – calvin Aug 09 '18 at 14:02

score 2 · Answer 2 · answered Aug 09 '18 at 14:55

Don't write to the terminal: multiple processes may overlap in their output.

Instead, write to a file, with a separately named output file for each process.

Also, consider using the logging module: its purpose is precisely to log information.

Something like this can work:

from mpi4py import MPI
import logging

comm = MPI.COMM_WORLD
rank = comm.rank

filename = 'output{}.log'.format(rank)
logger = logging.getLogger()
handler = logging.FileHandler(filename)
handler.setLevel(logging.DEBUG)
formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

if rank == 0:
    data = {'a':1,'b':2,'c':3}
else:
    data = None

data = comm.bcast(data, root=0)
logger.debug('rank = %d, %s', rank, data)

Then inspect your "output0.log", "output1.log", etc. files.

NB: using a dict or an OrderedDict has nothing to do with the overlapping output. A plain dict prints its keys in arbitrary order (in the Python version used here), and because you created your ordered dictionary from a dict literal, it inherits that arbitrary order (see Azat's answer and the link I provided in a comment).
The overlap comes from multiple processes writing to the same output stream, like multiple people all saying the same thing in the same room, just slightly out of sync with each other.

Note: if you had no problem with overlapping using C or C++, that may be 1/ luck, 2/ different code under the hood doing the output, or 3/ MPI/mpiexec doing some fancy stdout handling. Try with more processes or (much) longer output lines for each individual process, to see if it still holds.
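
Another option, not taken from the code above and offered only as a sketch, is to let a single process do all the printing: each rank formats its own line, rank 0 collects the lines with comm.gather, and only rank 0 writes to the terminal. The helper name report_in_rank_order is hypothetical, and comm is assumed to be an mpi4py communicator such as MPI.COMM_WORLD.

```python
def report_in_rank_order(comm, data):
    """Collect one formatted line per rank on rank 0 and print them there.

    `comm` is assumed to be an mpi4py communicator (e.g. MPI.COMM_WORLD).
    Returns the gathered list on rank 0 and None on every other rank.
    """
    # Each rank builds its complete line locally instead of printing it.
    line = "rank {} {}".format(comm.Get_rank(), data)

    # gather returns a list ordered by rank on the root, and None elsewhere.
    lines = comm.gather(line, root=0)

    # Only the root writes to the terminal, so the lines cannot interleave.
    if lines is not None:
        print("\n".join(lines))
    return lines
```

In the question's script, the final print statement would become report_in_rank_order(comm, data). This adds one collective call, so it suits short debugging output rather than large per-rank dumps.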
