
One thread writes to a file (or even deletes it) while another thread calls sendfile() on that file at the same time, for overlapping positions.

What is the expected behaviour here?

Also, if my understanding is correct, a sendfile() call just adds a record (file descriptor, position, length) to the socket and returns to the caller, with no copy into the socket's buffer yet. So even if sendfile() returns before the file is modified, the OS still depends on the file existing until it has been sent completely.

So what is the result of any modification made before the file has been completely sent?

vkx
  • Even if you delete an open file, the kernel's handle to the file (and your process's `fd`) will keep the file "alive" until it is closed. This is also why you can "move" open files without affecting their `fd` handles. As for race conditions caused by editing... well, it's a race. If sendfile operates after the editing, it will operate on the edited version. – Myst Dec 29 '17 at 00:52
  • Writes may be buffered, by the running program, the OS, and even the HD controller. Possibly even all three, in succession. Best practice is to add atomic guards when reading and writing the same file in a multithreaded program. – Jongware Dec 29 '17 at 01:01
  • @usr2564301 The HD controller is irrelevant to applications. Linux has a unified buffer cache, all processes use the same kernel file buffers. – Barmar Dec 29 '17 at 01:12
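The point in the first comment, that an open descriptor keeps a deleted file alive, can be checked with a short sketch. This assumes a POSIX system; `read_after_unlink` is a made-up helper name, not something from the thread.

```python
import os
import tempfile


def read_after_unlink(payload: bytes) -> bytes:
    """Write payload to a temp file, unlink it while still open, then read it back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                    # the directory entry is gone...
        assert not os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, len(payload))   # ...but the open fd still reaches the data
    finally:
        os.close(fd)                       # storage is reclaimed once the last fd closes
```

The same reasoning applies to a descriptor handed to sendfile(): deleting the path does not invalidate it.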

1 Answer


The expected behavior is that the result is unpredictable. sendfile() is not atomic; it is effectively equivalent to writing your own loop that calls read() on the file descriptor and write() on the socket descriptor. If some other process writes to the file while this is going on, you'll get a mix of the old and new contents. Think of it mainly as a convenience function, although it should also be significantly more efficient: it doesn't require multiple system calls, and it can copy directly between the file buffer and the socket buffer rather than copying back and forth through application buffers. Because of this, the window of vulnerability should be smaller than with a read/write loop, but it's still there.
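To make the comparison concrete, here is a rough sketch of the two approaches using Python's thin wrappers over the same system calls (`os.read`, `os.sendfile`). The function names and the `CHUNK` size are illustrative assumptions, and it assumes Linux and a blocking socket. Neither version is atomic with respect to a concurrent writer.

```python
import os
import socket

CHUNK = 64 * 1024  # arbitrary chunk size chosen for this sketch


def copy_with_read_write(file_fd: int, sock: socket.socket, count: int) -> int:
    """User-space loop: every chunk is copied into the process and back out."""
    sent = 0
    while sent < count:
        chunk = os.read(file_fd, min(CHUNK, count - sent))
        if not chunk:              # EOF before `count` bytes (file shorter or truncated)
            break
        sock.sendall(chunk)
        sent += len(chunk)
    return sent


def copy_with_sendfile(file_fd: int, sock: socket.socket, count: int) -> int:
    """Same transfer, but the kernel moves the bytes; the offset is tracked explicitly."""
    offset = 0
    while offset < count:
        n = os.sendfile(sock.fileno(), file_fd, offset, count - offset)
        if n == 0:                 # EOF
            break
        offset += n
    return offset
```

In both functions, a write that lands in a region not yet transferred shows up in the output, which is the mix of old and new contents described above.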

To avoid this problem, you should use file locking between the process(es) doing the writing and the one calling sendfile(). So the sequence should be:

lock the file
call sendfile()
unlock the file

and the writing process should do:

lock the file
write to the file
unlock the file
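The two sequences above could be sketched roughly as follows. This assumes advisory `flock()` locks on Linux, which only help if every writer and sender cooperates by taking the lock; `locked_write` and `locked_sendfile` are hypothetical helper names.

```python
import fcntl
import os
import socket


def locked_write(path: str, offset: int, data: bytes) -> None:
    """Writer side: lock the file, write to it, unlock it."""
    fd = os.open(path, os.O_WRONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)          # exclusive: blocks senders and other writers
        try:
            os.pwrite(fd, data, offset)         # sketch only: ignores short writes
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def locked_sendfile(path: str, sock: socket.socket) -> int:
    """Sender side: lock the file, call sendfile(), unlock it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_SH)          # shared: several senders may proceed at once
        try:
            size = os.fstat(fd).st_size
            offset = 0
            while offset < size:
                n = os.sendfile(sock.fileno(), fd, offset, size - offset)
                if n == 0:
                    break
                offset += n
            return offset
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

A shared lock for the sender and an exclusive lock for the writer is one possible choice; a single exclusive lock on both sides would also match the sequences as written.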

EDIT:

Actually, it looks like it isn't this simple, because sendfile() links the socket buffer to the file buffer cache rather than copying the data within the kernel. Since sendfile() doesn't wait for the data to be sent, modifying the file after it returns could still affect what gets sent. To be sure, you need application-layer acknowledgements confirming that all the data has been received. See "Minimizing copies when writing large data to a socket" for an excerpt from an article that explains these details.
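One way to picture the application-layer acknowledgement is the sketch below, where the sender keeps its lock until the receiver confirms receipt. The wire format (an 8-byte length header followed by a one-byte `ACK`) and all function names are invented for illustration; a real protocol would define its own framing and error handling.

```python
import fcntl
import os
import socket

ACK = b"\x06"  # hypothetical one-byte acknowledgement


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return bytes(buf)


def send_and_wait_for_ack(path: str, sock: socket.socket) -> bool:
    """Send the file, then hold the lock until the receiver acknowledges it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_SH)
        size = os.fstat(fd).st_size
        sock.sendall(size.to_bytes(8, "big"))   # hypothetical length header
        offset = 0
        while offset < size:
            n = os.sendfile(sock.fileno(), fd, offset, size - offset)
            if n == 0:
                break
            offset += n
        # sendfile() returning is not treated as "done"; wait for the peer's ACK.
        return offset == size and sock.recv(1) == ACK
    finally:
        os.close(fd)                            # closing this fd releases the flock


def receive_and_ack(sock: socket.socket) -> bytes:
    """Receiver side: read the whole body, then acknowledge."""
    size = int.from_bytes(_recv_exact(sock, 8), "big")
    body = _recv_exact(sock, size)
    sock.sendall(ACK)
    return body
```

Writers that take the exclusive lock before modifying the file would then be held off until the receiver has the complete contents.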

Barmar
  • Do you know if mmap behaves the same way? `void *p = mmap(fd0); send(fd1, p, len, 0)`. So, is there no way to be sure that the data has been sent? I mean, so that I can safely start editing the file again. One way could be relying on client acks, though. Any ideas on this? – vkx Dec 29 '17 at 02:20
  • When `send()` or `sendfile()` returns, I think all the data has been copied to the socket buffer, so it's safe to modify the file then. I thought the issue you had was with another process modifying the file at the same time as you're calling `sendfile()`. – Barmar Dec 29 '17 at 02:22
  • send() works like that, but I've found this link. The accepted answer says sendfile and mmap work differently, but the link to the original paper in that answer is dead, so I'm not sure about this behaviour. https://stackoverflow.com/questions/20008707/minimizing-copies-when-writing-large-data-to-a-socket – vkx Dec 29 '17 at 02:30
  • Hmm, looks like that's a problem. So you're probably right that you need to wait for the receiver to acknowledge receiving the file before allowing writes. – Barmar Dec 29 '17 at 02:33
  • I mean, according to the explanation in that answer, even after sendfile() returns it's not safe to edit the file. But I suspect the OS actually uses copy-on-write in that situation, at least I hope it does :) – vkx Dec 29 '17 at 02:35
  • It would be nice, but it doesn't sound like it does. – Barmar Dec 29 '17 at 02:39
  • If you want to append it to your answer, I've found the paper: https://systemsnotes.files.wordpress.com/2013/03/paper.pdf – vkx Dec 29 '17 at 05:21
  • @vkx Thanks, I updated the links in the referenced question. – Barmar Dec 29 '17 at 06:11