First, note that your question is "Will data be interleaved?", not "Are write()
calls [required to be] atomic?" Those are different questions...
"TL;DR" summary:
write() calls to a pipe or FIFO of PIPE_BUF bytes or fewer won't be interleaved.
write() calls to anything else fall somewhere between "probably won't be interleaved" and "won't ever be interleaved", with the majority of implementations in the "almost certainly won't be interleaved" to "won't ever be interleaved" range.
Full Answer
If you're writing to a pipe or FIFO, your data will not be interleaved at all for write() calls of PIPE_BUF or fewer bytes.
Per the POSIX standard for write() (note the bolded part):
RATIONALE
...
An attempt to write to a pipe or FIFO has several major characteristics:
Atomic/non-atomic: A write is atomic if the whole amount written in one operation is not interleaved with data from any other process.
This is useful when there are multiple writers sending data to a
single reader. Applications need to know how large a write request can
be expected to be performed atomically. This maximum is called
{PIPE_BUF}. This volume of POSIX.1-2008 does not say whether write
requests for more than {PIPE_BUF} bytes are atomic, but requires that
writes of {PIPE_BUF} or fewer bytes shall be atomic.
...
Applicability of POSIX standards to Windows systems, however, is debatable at best.
So, for pipes or FIFOs, data won't be interleaved for writes of up to PIPE_BUF bytes.
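A minimal sketch of how a writer could stay inside that guarantee, assuming a blocking pipe or FIFO descriptor. The helper name pipe_send_record() is hypothetical, and rejecting oversized records with EMSGSIZE is my own choice for illustration, not something POSIX prescribes:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <unistd.h>

/* Hypothetical helper: send one record to a pipe or FIFO with a single
 * write() call. Returns 0 on success, -1 (errno set) on failure.
 * Records larger than the descriptor's PIPE_BUF are rejected rather than
 * split, because POSIX only requires atomicity up to that size. */
int pipe_send_record(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);

    /* Ask for the limit that applies to this particular descriptor. */
    long limit = fpathconf(fd, _PC_PIPE_BUF);
    if (limit == -1) {
        /* Unknown or unlimited: fall back to the POSIX minimum (512). */
        limit = _POSIX_PIPE_BUF;
    }
    if (len > (size_t)limit) {
        errno = EMSGSIZE;   /* assumption: chosen here only for illustration */
        return -1;
    }

    ssize_t n = write(fd, buf, len);
    if (n < 0) {
        return -1;
    }
    return (size_t)n == len ? 0 : -1;
}
```

Each call either hands the whole record to the kernel in one write() or refuses it, so several writers sharing the pipe never depend on behavior beyond the PIPE_BUF guarantee.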
How does that apply to files?
First, file append operations have to be atomic. Per that same POSIX standard (again, note the bolded part):
If the O_APPEND flag of the file status flags is set, the file offset
shall be set to the end of the file prior to each write and no
intervening file modification operation shall occur between changing
the file offset and the write operation.
Also see Is file append atomic in UNIX?
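As a rough illustration of relying on that O_APPEND requirement, here is a sketch that formats each log line into one buffer and hands it to the kernel in a single write() call. The names open_log() and log_line(), the 512-byte line limit, and the "[tag] message" layout are all assumptions made for the example:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open a log file so that every write() is positioned
 * at the current end of the file by the kernel. */
int open_log(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
}

/* Hypothetical helper: build the complete line first, then issue exactly
 * one write(). Returns 0 on success, -1 on failure or if the line is too
 * long for the (arbitrary) 512-byte buffer. */
int log_line(int fd, const char *tag, const char *msg)
{
    assert(tag != NULL && msg != NULL);

    char line[512];
    int len = snprintf(line, sizeof line, "[%s] %s\n", tag, msg);
    if (len < 0 || (size_t)len >= sizeof line) {
        return -1;          /* formatting error or truncated line */
    }

    ssize_t n = write(fd, line, (size_t)len);
    return n == (ssize_t)len ? 0 : -1;
}
```

Two descriptors opened this way on the same file each have their own file offset, yet neither overwrites the other's lines, because the offset is reset to the end of the file before every write.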
So how does that apply to non-append write() calls?
Through commonality of implementation. See the Linux read/write syscall implementations for an example. (Note that the "problem" is handed directly to the VFS implementation, though, so the answer might also be "It might very well depend on your file system...")
Most implementations of the write() system call inside the kernel are going to use the same code to do the actual data write for both append mode and "normal" write() calls, and for pwrite() calls, too. The only difference will be the source of the offset used. For "normal" write() calls, the offset used will be the current file offset. For append write() calls, the offset used will be the current end of the file. For pwrite() calls, the offset used will be supplied by the caller (except that Linux is broken: it uses the current file size instead of the supplied offset parameter as the target offset for pwrite() calls on files opened in append mode. See the "BUGS" section of the Linux pwrite() man page.)
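The three offset sources described above can be seen from user space with a small sketch like the following. The function name offset_demo() and the byte patterns are made up for illustration, and the pwrite() call deliberately uses a descriptor that was not opened with O_APPEND, so the Linux quirk does not come into play:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical demo: write to one file through the three offset sources
 * and read the result back into out. Returns the number of bytes read,
 * or -1 on any failure. */
ssize_t offset_demo(const char *path, char *out, size_t cap)
{
    assert(path != NULL && out != NULL);

    const char *first = "AAAAAAAA";
    ssize_t n = -1;
    int app = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;

    /* "Normal" write(): uses and advances the current file offset (0 -> 8). */
    if (write(fd, first, strlen(first)) != 8) goto done;

    /* pwrite(): the caller supplies the offset; the file offset stays at 8. */
    if (pwrite(fd, "bb", 2, 2) != 2) goto done;
    if (lseek(fd, 0, SEEK_CUR) != 8) goto done;

    /* Another normal write() continues from offset 8. */
    if (write(fd, "CC", 2) != 2) goto done;

    /* Append write(): this descriptor's own offset starts at 0, but
     * O_APPEND makes the kernel use the current end of the file. */
    app = open(path, O_WRONLY | O_APPEND);
    if (app < 0) goto done;
    if (write(app, "DD", 2) != 2) goto done;

    n = pread(fd, out, cap, 0);
done:
    if (app >= 0) close(app);
    close(fd);
    return n;
}
```

With a buffer of at least 12 bytes, the file content read back should be AAbbAAAACCDD: the pwrite() lands in the middle, the second write() continues at offset 8, and the append lands at the end despite its descriptor's offset being 0.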
So appending data has to be atomic, and that same code will almost certainly be used for non-append write() calls in all implementations.
But the "write operation" in the append-must-be-atomic requirement is allowed to write fewer than the total number of bytes requested:
The write() function shall attempt to write nbyte bytes ...
Partial write() results are allowed even in append operations. But even then, the data that does get written must be written atomically.
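If you do want to handle the partial-write case, the usual pattern is a retry loop. This is only a sketch with a hypothetical helper name, write_all(), and it comes with a caveat: every retry is a separate write() call, so whatever atomicity you get applies to each call individually and another writer's data can land between the pieces.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all len bytes have been
 * accepted. Returns 0 on success, -1 (errno set) on failure.
 * Caveat: if the loop runs more than once, the record was NOT written in a
 * single operation, so other writers may interleave between the pieces. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    assert(p != NULL || len == 0);

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted before writing anything: retry */
            }
            return -1;      /* real error, e.g. ENOSPC or EIO */
        }
        p += n;             /* skip past the bytes that were accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

For a log file where interleaving matters, it may be preferable to detect a short write and report it rather than silently retrying, since the retry is exactly where another writer could slip in.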
What are the odds of a partial write()? That depends on what you're writing to. I've never seen a partial write() result to a file outside of the disk filling up or an actual hardware failure. Or even a partial read() result. I can't see any way for a write() operation that has all its data on a single page in kernel memory to result in a partial write() in anything other than a disk-full or hardware-failure situation.
If you look at Is file append atomic in UNIX? again, you'll see that actual testing shows that append write() operations are in fact atomic.
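If you'd rather check this on your own system and file system, something along these lines could serve as a starting point. It is not the test from the linked question; it is a sketch of my own that uses several processes rather than threads (to avoid depending on a thread library), and the name append_stress(), the 64-byte record size, and the one-letter-per-writer layout are all assumptions:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

enum { REC_LEN = 64 };  /* each record: 63 identical letters plus '\n' */

/* Hypothetical test harness: nproc child processes each append per_proc
 * records to path with O_APPEND. Returns the number of damaged records
 * found afterwards, or -1 on a setup or write failure. *total_out receives
 * the number of complete records read back. */
long append_stress(const char *path, int nproc, int per_proc, long *total_out)
{
    assert(path != NULL && total_out != NULL && nproc > 0 && nproc <= 26);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    close(fd);

    int spawned = 0, failed = 0;
    for (int i = 0; i < nproc; i++) {
        pid_t pid = fork();
        if (pid < 0) { failed = 1; break; }
        if (pid == 0) {                      /* child: one writer */
            char rec[REC_LEN];
            memset(rec, 'a' + i, REC_LEN - 1);
            rec[REC_LEN - 1] = '\n';
            int out = open(path, O_WRONLY | O_APPEND);
            if (out < 0) _exit(1);
            for (int j = 0; j < per_proc; j++) {
                if (write(out, rec, REC_LEN) != REC_LEN) _exit(1);
            }
            _exit(0);
        }
        spawned++;
    }
    for (int i = 0; i < spawned; i++) {      /* reap every writer */
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }
    if (failed) return -1;

    /* Verify: every 64-byte slot must hold one letter repeated, then '\n'. */
    fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    long total = 0, damaged = 0;
    char rec[REC_LEN];
    ssize_t n;
    while ((n = read(fd, rec, REC_LEN)) == REC_LEN) {
        int ok = (rec[REC_LEN - 1] == '\n');
        for (int k = 1; ok && k < REC_LEN - 1; k++) {
            if (rec[k] != rec[0]) ok = 0;
        }
        total++;
        if (!ok) damaged++;
    }
    close(fd);
    if (n != 0) damaged++;                   /* trailing fragment or read error */
    *total_out = total;
    return damaged;
}
```

A result of zero damaged records on one run is evidence for a particular kernel and file system, not a proof; a non-zero result would show that records were interleaved or torn there.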
So the answer to "Will multi-thread do write() interleaved?" is, "No, the data will almost certainly not be interleaved for writes that are at or under 4KB (page size) as long as the data does not cross a page boundary in kernel space." And even crossing a page boundary probably doesn't change the odds all that much.
If you're writing small chunks of data, it depends on your willingness to deal with the almost-certain-to-never-happen-but-it-might-anyway result of interleaved data. If it's a text log file, I'd opine that it won't matter anyway.
And note that it's not likely to be any faster to use multiple threads to write to the same file. The kernel is likely going to lock things and effectively single-thread the actual write() calls anyway, to ensure it can meet the atomicity requirements of writing to a pipe and appending to a file.