
I'm confused about how fsync interacts with direct I/O.

Code like this is easy to understand:

fd = open(filename, O_RDWR, 00644);
write(fd, data, size);
fsync(fd);

In this case, write() will write the data to the page cache, and fsync will force all modified data in the page cache referred to by the fd to the disk device.

But suppose we open a file with the O_DIRECT flag, like this:

fd = open(filename, O_RDWR|O_DIRECT, 00644);
write(fd, data, size);
fsync(fd);

In this case, write() will bypass the page cache and write directly to the disk device. So what will fsync do, given that there are no dirty pages in the page cache referred to by the fd?

And if we open a raw device, what will fsync do?

fd = open("/dev/sda", O_RDWR|O_DIRECT, 00644);
write(fd, data, size);
fsync(fd);

In this case, we open a raw device with O_DIRECT, and there is no filesystem on this device. What will fsync do here?

Hao

1 Answer


The filesystem might not implement O_DIRECT at all. In that case the flag either has no effect or, on Linux, open(2) fails with EINVAL.

If it does implement O_DIRECT, that still doesn't mean the data goes to disk. It only means the data is minimally cached by the page cache. It could still be cached elsewhere, even in hardware buffers.

fsync(2) is an explicit contract between the kernel and application to persist the data such that it doesn't get lost and is guaranteed to be available to the next thing that wants to access it.

With device files, the device drivers are the ones implementing flags, including O_DIRECT.

Linux does use the page cache to cache access to block devices, and it supports O_DIRECT in order to minimize cache interaction when writing directly to a block device.

In both cases, you need fsync(2) or a call with an equivalent guarantee in order to be sure that the data is persistent on disk.
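
To make that concrete, here is a minimal sketch (assuming Linux and glibc) of an O_DIRECT write followed by fsync(2). The helper name `write_block_durably` and the 4096-byte `BLOCK_SIZE` are my own assumptions for illustration; the real alignment requirement depends on the device and filesystem. The fallback relies on open(2) being documented to fail with EINVAL when the filesystem does not support O_DIRECT.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 4096        /* assumed alignment; real devices may differ */

/* Hypothetical helper: write one aligned block filled with `fill`
 * to `path` and flush it. Returns 0 on success, -1 on failure. */
int write_block_durably(const char *path, char fill)
{
    void *buf = NULL;

    /* O_DIRECT generally needs an aligned buffer, length, and offset. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return -1;
    memset(buf, fill, BLOCK_SIZE);

    int fd = open(path, O_RDWR | O_CREAT | O_DIRECT, 0644);
    if (fd < 0 && errno == EINVAL) {
        /* O_DIRECT not supported here: fall back to buffered I/O. */
        fd = open(path, O_RDWR | O_CREAT, 0644);
    }
    if (fd < 0) {
        free(buf);
        return -1;
    }

    int rc = -1;
    /* fsync is still needed: O_DIRECT skips the page cache, not the
     * device's volatile write cache or the file's metadata. */
    if (write(fd, buf, BLOCK_SIZE) == BLOCK_SIZE && fsync(fd) == 0)
        rc = 0;

    if (close(fd) != 0)
        rc = -1;
    free(buf);
    return rc;
}
```

A call such as `write_block_durably("/dev/sdX", 'x')` would overwrite the first block of that device, so try it on a scratch file first. O_CREAT is only there so the sketch also works on a regular file that does not exist yet.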

root
  • If I open a device with `fd = open("/dev/sda", O_RDWR|O_DIRECT, 00644)` and write to this device with `write(fd, data, size)`, how do I guarantee the data is actually written to disk? Would `fsync(fd)` help in this case? – Hao Oct 13 '19 at 07:28
  • @Hao (Assuming Linux) Yes, `fsync()` would help (because it may cause devices to flush their volatile buffers), but you would have to call `fsync()` AND check the return codes from both your `write()` AND your `fsync()` for errors before assuming the data is on disk. However, this assumes you are the SOLE writer to that **block device**. If you are talking about files in a filesystem, that's another matter (see answers to https://stackoverflow.com/questions/12990180/what-does-it-take-to-be-durable-on-linux ). – Anon Dec 14 '19 at 16:20
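
The "check both return codes" advice in the last comment could look roughly like the sketch below. The helper names `write_all` and `write_and_sync` are hypothetical, not part of any library. Note that with O_DIRECT a short write can leave an unaligned remainder, so the retry loop is mainly useful for buffered descriptors or when every piece stays aligned.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes, retrying on short
 * writes and EINTR. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *data, size_t len)
{
    const char *p = data;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted before writing: retry */
            return -1;
        }
        if (n == 0)
            return -1;             /* no progress: avoid spinning forever */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Treat the data as durable only if BOTH the write and the fsync succeed. */
int write_and_sync(int fd, const void *data, size_t len)
{
    if (write_all(fd, data, len) != 0)
        return -1;
    if (fsync(fd) != 0)
        return -1;                 /* durability is not guaranteed */
    return 0;
}
```

A caller would use it as `if (write_and_sync(fd, data, size) != 0) { /* report errno */ }` and only acknowledge the write to anyone else after it returns 0.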