Sequential consistency of file IO (and other OS-related operations) on Linux

Question

I want to know whether file I/O operations performed by multiple processes/threads are guaranteed to be sequentially consistent on Linux. If not (as this thread says), how should I write my code to make sure they are? Consider the following example, where FILE_1 and FILE_2 are two distinct absolute file paths to which both Process A and Process B have read-write access.

Process A first creates FILE_1 and then creates FILE_2:

FILE* fp1A = fopen(FILE_1, "w");
fclose(fp1A);
FILE* fp2A = fopen(FILE_2, "w");
fclose(fp2A);

Process B first tries to open FILE_2 and, if that succeeds, opens FILE_1:

FILE* fp2B = fopen(FILE_2, "r");
if (fp2B != NULL) {
    FILE* fp1B = fopen(FILE_1, "r");
    // QUESTION: is fp1B guaranteed to be not NULL here?
}

The question is stated in the code comment above. In other words, if one process performs file I/O operations in the order specified by its source code, will all other processes see the effects of these operations on the system in the same order? Is this guaranteed by some standard (POSIX etc.), or is it implementation-defined?
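
For concreteness, here is a minimal sketch of the same scenario written with the lower-level open(2) call instead of fopen. The names create_empty and run_round and the scratch-directory parameter dir are made up for illustration; Process A is the forked child and Process B is the parent, which busy-polls for FILE_2. It only observes what one particular system does and does not prove any guarantee. On a local Linux filesystem I would expect it to return 1, while network filesystems with client-side caching (such as NFS) may behave differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create an empty file; returns 0 on success, -1 on failure. */
static int create_empty(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;
    return close(fd);
}

/*
 * Run one round of the scenario inside `dir` (a hypothetical scratch
 * directory chosen by the caller).
 * Returns  1 if B saw FILE_2 and could also open FILE_1,
 *          0 if B saw FILE_2 but FILE_1 was missing (ordering violation),
 *         -1 on a setup error.
 */
int run_round(const char *dir)
{
    char file1[1024], file2[1024];
    assert(dir != NULL);
    if (snprintf(file1, sizeof file1, "%s/FILE_1", dir) >= (int)sizeof file1 ||
        snprintf(file2, sizeof file2, "%s/FILE_2", dir) >= (int)sizeof file2)
        return -1;
    unlink(file1); /* start from a clean state; errors are ignored */
    unlink(file2);

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Process A: create FILE_1 first, then FILE_2. */
        _exit(create_empty(file1) != 0 || create_empty(file2) != 0);
    }

    /* Process B: poll until FILE_2 becomes visible. */
    int fd2, status = 0, reaped = 0;
    while ((fd2 = open(file2, O_RDONLY)) == -1) {
        if (reaped)
            return -1; /* A has exited, yet FILE_2 never appeared */
        pid_t w = waitpid(pid, &status, WNOHANG);
        if (w == -1)
            return -1;
        if (w == pid) {
            reaped = 1;
            if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
                return -1; /* A failed to create the files */
        }
    }

    /* FILE_2 is visible; this is the open the question asks about. */
    int fd1 = open(file1, O_RDONLY);
    int ok = (fd1 != -1);
    if (fd1 != -1)
        close(fd1);
    close(fd2);

    if (!reaped)
        waitpid(pid, &status, 0);
    unlink(file1);
    unlink(file2);
    return ok;
}
```

Calling run_round repeatedly on a writable directory (for example run_round("/tmp")) gives B many chances to catch the window between A's two creations; a single return value of 0 would show that the ordering is not preserved on that system.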

What if I change "file IO" to other operations which have some visible effect on the system in a broader sense (e.g. changing a kernel parameter)?

BACKGROUND: I have been studying memory ordering in the C++11 thread model, but those concepts only concern memory, not OS functionality such as file I/O. I understand this is because C++11 is a language standard that is independent of the OS. So I want to know whether any other standard provides similar concepts for OS-level operations.

The compiler won't reorder fopenA after fopenB because it doesn't know the dependency effects. So, there will definitely be a call to fopenA before fopenB. Consequently, fopenA is guaranteed to open a file and attach a stream. So, yes your code is sequentially consistent. — themagicalyang, Nov 03 '16 at 07:51
A small nit-pick about your code examples. If you want to ask about the Linux OS specifics you should probably use the lower-level system call `open` and file descriptors, instead of the standard C `fopen` function and the `FILE` structure. — Some programmer dude, Nov 03 '16 at 07:52
As for your problem, on a multi-tasking system two parallel processes have no inherent sequencing or synchronization. But if you have some sort of synchronization between the two processes (like explicitly run process A first to its end, followed by process B), and no other process will touch the specific files or filesystem, then yes process B is guaranteed to be successful in opening the files. That caveat about other processes is important though, another process may remove a file between the creation in A and usage in B, even if the span between A and B is short. — Some programmer dude, Nov 03 '16 at 08:11
@themagicalyang Although the code is sequentially consistent at the compiler level, it is still unclear to me whether it is so at the OS level (whether OS has the flexibility to reorder the file operations) — Qin Yixiao, Nov 03 '16 at 08:23
@Someprogrammerdude Thanks for your suggestion on the code example. Regarding your second comment, I'm afraid you misunderstood the point of my question. The point is: is the operation on `FILE_1` by Process A guaranteed to happen before the operation on `FILE_2` by Process A? If B sees `FILE_2` (possibly while A is still running), can it also see `FILE_1` for sure? Knowing that the operations will have completed by the time A ends does not answer the question. — Qin Yixiao, Nov 03 '16 at 08:27
@themagicalyang: Your statement is correct, the reasoning not. The real reason is that library I/O operations are observable effects, and reordering those is strictly forbidden. No amount of dependency information changes this. — MSalters, Nov 03 '16 at 08:31
@QinYixiao It doesn't make sense otherwise, the linux man pages say open will attach fd to the path file. So, I am assuming somewhere in between there will be a call to the filesystem trying to create a file. And open will return when this succeeds. I don't see any headroom for os reordering. — themagicalyang, Nov 03 '16 at 08:31
@themagicalyang But is it possible that when Process A creates the two files, the kernel just keeps this information in memory so that Process A can use the file descriptors or C file pointers **as if** these two files exist, but does not actually modify the file system? And if this can happen, then when the kernel decides to actually modify the file system, does it have the flexibility to create `FILE_1` and `FILE_2` in any order? — Qin Yixiao, Nov 03 '16 at 08:41
@MSalters Please see my previous comment. Basically I agree the compiler won't reorder this, but what about at the Linux kernel level? — Qin Yixiao, Nov 03 '16 at 08:43
@QinYixiao There is no "as if" mention in the manpage, the only thing that comes close is O_ASYNC. "Enable signal-driven I/O: generate a signal (SIGIO by default, but this can be changed via fcntl(2)) when input or output becomes possible on this file descriptor." So, it's reasonable to assume open() creates the file when called with O_CREAT. And there is no "as if" behaviour. But yes, I agree this isn't much of a concrete answer. — themagicalyang, Nov 03 '16 at 09:07
