
Calling fprintf(stderr,...) from MPI processes never gives interleaved results for me.

The order of messages from different MPI processes is arbitrary, of course, but the output of two fprintf calls never gets intermingled.

I.e., each single fprintf behaves as if it were atomic.

Is this behavior guaranteed by a standard? On Windows? On Linux (by POSIX)?

Does it matter if a FILE is buffered, like stdout?

user2052436
  • Possible duplicate of [Output isn't ordered, Parallel Programing with MPI](https://stackoverflow.com/questions/29019744/output-isnt-ordered-parallel-programing-with-mpi) – Gilles Gouaillardet Jun 26 '19 at 23:55
  • The standard does __not__ guarantee a global order between different MPI processes. – Gilles Gouaillardet Jun 26 '19 at 23:57
  • @GillesGouaillardet while my first reaction was also to CV, I believe this is actually a different question. This is not about ordering, but about atomicity of output. I can't help but wonder: since Open MPI provides the `--tag-output` and `--timestamp-output` options, does it guarantee atomicity on line-level? – Zulan Jun 27 '19 at 07:13
  • @Zulan Open MPI does __not__ guarantee atomicity on line-level (anything is prefixed, regardless of whether it ends with a `\n` or not). – Gilles Gouaillardet Jun 27 '19 at 07:22
  • Yes, it is about atomicity as @Zulan noted – user2052436 Jun 27 '19 at 19:03

2 Answers


Within a single process, POSIX requires it — so does C11. The POSIX information is buried in an unlikely spot, in the specification for the function flockfile() (and funlockfile() and ftrylockfile()):

> All functions that reference (FILE *) objects, except those with names ending in _unlocked, shall behave as if they use flockfile() and funlockfile() internally to obtain ownership of these (FILE *) objects.

The general description of the functions says:

> These functions shall provide for explicit application-level locking of stdio (FILE *) objects. These functions can be used by a thread to delineate a sequence of I/O statements that are executed as a unit.

So, the functions explicitly manipulate locks that functions such as printf() and fprintf() are required to use too — at least in effect.

An application would use those functions explicitly to group several operations on a single file stream; this is explicitly allowed — indeed, it is the intended purpose of the functions. A corollary is that if you want to achieve uninterleaved I/O, either ensure that you use a single fprintf() call or equivalent to print all the information, or you must use the POSIX locking functions to protect against interference.
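As a minimal sketch of that second option, the function below holds the stream lock across several stdio calls so that other threads in the same process cannot slip output in between them. The name `log_record` and the record layout are hypothetical, chosen only for illustration; the sketch assumes a POSIX system, and the lock has no effect on what other processes write.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: emit one logical record with three separate stdio
   calls. flockfile()/funlockfile() keep the three calls together with
   respect to other threads of this process using the same FILE object. */
void log_record(FILE *out, int rank, const char *label, int value)
{
    assert(out != NULL);
    flockfile(out);                      /* take ownership of the stream */
    fprintf(out, "[rank %d] ", rank);
    fprintf(out, "%s ", label);
    fprintf(out, "%d\n", value);
    funlockfile(out);                    /* release it for other threads */
}
```

A call such as `log_record(stderr, 2, "step", 7)` would then behave as a unit relative to other threads that also go through stdio on `stderr`.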

Threads within a single multithreaded MPI process are probably POSIX threads, or sufficiently like them, so the rules apply there, too.

C11 has threading support and requires the behaviour too — though not the functions flockfile() etc. C11 §7.21.2 Streams ¶8 says:

> All functions that read, write, position, or query the position of a stream lock the stream before accessing it. They release the lock associated with the stream when the access is complete.

Jonathan Leffler
  • I don't think this addresses the question. `flockfile()` *etc*. lock *`FILE` objects* / streams, not the underlying device or file. They are scoped to one process, but I take the OP's question to be about multiple (separate) processes. – John Bollinger Jun 26 '19 at 20:49
  • I'm dealing with the output from multiple threads in a single (multithreaded) MPI process. I think that's what is being asked about. Feel free to present your alternative answer, of course. – Jonathan Leffler Jun 26 '19 at 20:50
  • Well yes, that's my point. MPI is not typically used for inter-thread communication within the same process. It is designed and built for parallelization across multiple processes, hence my supposition that that's the case of interest. – John Bollinger Jun 26 '19 at 20:55

I'm supposing that you're asking about multiple MPI processes all writing to the same device, perhaps a terminal. To the extent that you're interested in the output of multiple threads of individual processes, @JonathanLeffler's answer will be of great interest to you.

The C language does not speak to the question of separate processes' effects on each other's use of the same device. It is scoped to the behavior of individual runs of individual programs. For POSIX, and I suspect Windows as well, it depends to some extent on how it comes to pass that the multiple separate MPI processes even have the ability to write to the same device at the same time, if indeed that does come to pass.

If the processes have opened the device separately, then the behavior you can rely upon depends on the device and its driver, possibly on the characteristics of the data being printed, and possibly also on details of how the device was opened by each process. I would generally expect that for a given machine and device, there would be a characteristic size (larger than 1) such that data of that size or smaller written in one go by one process will not be commingled with data from other processes, but I am not aware of specific documentation promising that in general, or detailing what such characteristic sizes would be.

Under POSIX, if sharing a device arises from inheriting an open file description from a common ancestor (shell, master process, etc.), then in addition to the above considerations, POSIX places some requirements, detailed in section 2.5.1. Inasmuch as your example involves writing to stderr, which is unbuffered by default, your usage will satisfy those requirements as long as you have not changed its (non-)buffering. In that event, POSIX promises that

> regardless of the sequence of handles used, implementations shall ensure that an application, even one consisting of several processes, shall yield correct results: no data shall be lost or duplicated when writing, and all data shall be written in order, except as requested by seeks. It is implementation-defined whether, and under what conditions, all input is seen exactly once.

And that's all, though it's still more than you have in the other case. "Written in order" does not guarantee that data from different writes will be segregated. In practice, however, each fprintf call will make write() requests to the underlying system, transferring the printed data in one or a few chunks, and, again in practice, each chunk will be written contiguously.
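If you would rather not depend on how many write() requests a given stdio implementation issues per fprintf call, one option is to format the message yourself and hand it to the system in a single write(). The following is only a sketch under assumptions: `write_line_fd` and the 512-byte buffer are hypothetical choices, and neither C nor POSIX promises that one write() to an arbitrary device stays contiguous with respect to other processes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: format into a local buffer, then pass the whole
   message to one write() call. Returns the number of bytes written, or -1
   on error. Messages longer than the buffer are truncated. A short write
   is reported as-is rather than retried, because a retry would split the
   message across two requests. */
ssize_t write_line_fd(int fd, const char *fmt, ...)
{
    char buf[512];
    va_list ap;

    assert(fmt != NULL);
    va_start(ap, fmt);
    int n = vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);
    if (n < 0)
        return -1;

    size_t len = (size_t)n < sizeof buf ? (size_t)n : sizeof buf - 1;
    ssize_t written;
    do {
        written = write(fd, buf, len);   /* one request for the whole message */
    } while (written < 0 && errno == EINTR);
    return written;
}
```

For example, `write_line_fd(STDERR_FILENO, "rank %d: done\n", rank)` bypasses the stdio buffer entirely, so it should not be mixed carelessly with buffered output on the same descriptor.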

So,

  • it is not guaranteed by C or POSIX that multiple processes' writes to the same device will be serialized with respect to each other. Windows may make stronger guarantees, but I'm not prepared to speak to that.

  • it is nevertheless unsurprising that you see such serialization in practice.
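On the buffering part of the question: a stream such as stdout may hold data in its buffer, so a message can reach the device later than you expect, or together with other buffered messages. Here is a hedged sketch of two ways to avoid that; the helper names `make_unbuffered` and `print_and_flush` are hypothetical, and neither approach makes writes from different processes atomic.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: switch a stream to unbuffered mode, as stderr is by
   default. setvbuf() must be called before any other operation on the
   stream. Returns 0 on success, nonzero on failure. */
int make_unbuffered(FILE *stream)
{
    assert(stream != NULL);
    return setvbuf(stream, NULL, _IONBF, 0);
}

/* Hypothetical helper: keep the stream buffered, but push each complete
   message to the system as soon as it has been written.
   Returns 0 on success, EOF on failure. */
int print_and_flush(FILE *stream, const char *msg)
{
    assert(stream != NULL && msg != NULL);
    if (fputs(msg, stream) == EOF)
        return EOF;
    return fflush(stream);
}
```

Calling `make_unbuffered(stdout)` once at startup, before any output, is the closer match to how stderr behaves in the question.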

John Bollinger