We want to do our best to avoid data loss during a power failure, so I decided to open the file with the O_DIRECT flag when writing data to disk. Does O_DIRECT mean that the data bypasses the OS cache completely? If the request returns success to the application, does that mean the data has been flushed to the disk? If I open a regular file on a file system, what about the FS metadata? Is it also flushed immediately, or is it cached?

By the way, can O_DIRECT be used on Windows? Or is there a corresponding method on Windows?

skaffman
flypen

4 Answers


O_DIRECT will probably do what you want, but it will greatly slow down your I/O.
I think calling fsync() should be enough. If you write through a FILE *, call fflush() first and then fsync() on the underlying descriptor, because fflush() only moves data from the stdio buffer into the kernel.
As for the metadata question, it depends on the underlying file system and, if you want to be extra paranoid, on the hardware. A hard drive (and especially an SSD) may report the operation as finished but take a while to actually write the data.
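
As a rough illustration of the FILE * case, here is a minimal sketch. The helper name `save_text_durably` is made up for this example. It assumes a POSIX system, and it cannot guarantee that the drive itself has committed the data.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for fileno() and fsync() under strict modes */
#endif
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write `text` to `path` through stdio and push it
 * toward stable storage. Returns 0 on success, -1 on any failure. */
int save_text_durably(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    int ok = (fputs(text, fp) != EOF);

    /* fflush() only moves data from the stdio buffer into the kernel. */
    if (ok && fflush(fp) != 0)
        ok = 0;

    /* fsync() asks the kernel to push the file's data and metadata to the device. */
    if (ok && fsync(fileno(fp)) != 0)
        ok = 0;

    /* fclose() can also report an error, so its result is checked too. */
    if (fclose(fp) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

A call such as `save_text_durably("/tmp/state.txt", "hello\n")` returns 0 only if every step, including the fsync(), reported success.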

Torp
  • And here's how to do it on OS X :) http://stackoverflow.com/questions/2299402/how-does-one-do-raw-io-on-mac-os-x-ie-equivalent-to-linuxs-o-direct-flag – Torp Apr 23 '11 at 13:24
  • And this thread says using O_DIRECT isn't as simple as it seems: http://www.gossamer-threads.com/lists/linux/kernel/350610 – Torp Apr 23 '11 at 13:28

You can use O_DIRECT but for many applications, calling fdatasync() is more convenient. O_DIRECT imposes a lot of restrictions because the IOs completely bypass the OS cache. It bypasses read cache as well as write cache.
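
To give a feel for those restrictions, here is a minimal sketch of one aligned O_DIRECT write on Linux. The helper name `write_block_direct` and the 4096-byte block size are assumptions for this example; the real alignment requirement depends on the device and filesystem. If the filesystem rejects O_DIRECT with EINVAL, the sketch falls back to buffered I/O, and it calls fdatasync() in either case.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0             /* assumption: no O_DIRECT here, use buffered I/O */
#endif

#define BLOCK_SIZE 4096        /* assumed alignment; the real value is device-specific */

/* Hypothetical helper: write one zero-filled block to `path`.
 * Returns 0 on success, -1 on failure. */
int write_block_direct(const char *path)
{
    void *buf = NULL;

    /* O_DIRECT generally needs the buffer address, the transfer length and
     * the file offset to be aligned; posix_memalign() covers the address. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return -1;
    memset(buf, 0, BLOCK_SIZE);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
    if (fd < 0 && errno == EINVAL)   /* filesystem refuses O_DIRECT: fall back */
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    int rc = 0;
    if (write(fd, buf, BLOCK_SIZE) != BLOCK_SIZE)
        rc = -1;

    /* Bypassing the OS cache does not flush the drive's own cache,
     * so an explicit sync is still requested. */
    if (rc == 0 && fdatasync(fd) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;

    free(buf);
    return rc;
}
```

A plain `malloc()` buffer or a length that is not a multiple of the block size would typically make the O_DIRECT write fail with EINVAL, which is much of what makes fdatasync() on a normal descriptor more convenient.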

For filesystem metadata, all you can do is fsync() your file after writing it. fsync() flushes the file's metadata, so its size and attributes survive if the power is lost immediately afterwards. On Linux, making sure a newly created file does not disappear also requires an fsync() on the containing directory.
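
The Linux fsync(2) manual page notes that syncing a file does not necessarily persist its directory entry. A minimal sketch that syncs both follows; the helper name `create_file_durably` is hypothetical, and POSIX.1-2008 `openat()` is assumed to be available.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for openat() and O_DIRECTORY */
#endif
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create `name` inside `dir` with contents `data`,
 * then fsync both the file and the directory that holds it.
 * Returns 0 on success, -1 on failure. */
int create_file_durably(const char *dir, const char *name, const char *data)
{
    int dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd < 0)
        return -1;

    int fd = openat(dfd, name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        close(dfd);
        return -1;
    }

    int rc = 0;
    size_t len = strlen(data);
    if (write(fd, data, len) != (ssize_t)len)
        rc = -1;

    /* Flush the file's data and its inode metadata. */
    if (rc == 0 && fsync(fd) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;

    /* Flush the directory so the new entry itself is persisted. */
    if (rc == 0 && fsync(dfd) != 0)
        rc = -1;
    close(dfd);

    return rc;
}
```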

Any of these mechanisms depends on your I/O subsystem not lying to the OS about having persisted data to storage and, in many cases, on other hardware-dependent things (such as the RAID controller battery not running out before the power returns).

MarkR

Can I use O_DIRECT for write requests to avoid data loss during power failure?

No!

On Linux, while O_DIRECT tries to bypass your OS's cache, it never bypasses your disk's cache. If your disk has a volatile write cache, you can still lose data that was only in the disk cache during an abrupt power off!

Does O_DIRECT mean that the data bypasses the OS cache completely?

Usually, but some Linux filesystems may fall back to buffered I/O with O_DIRECT (the Ext4 Wiki Clarifying Direct IO's Semantics page warns this can happen with allocating writes).

If the request returns success to the application, does that mean the data has been flushed to the disk?

It usually means the disk has "seen" it, but see the above caveats (e.g. the data might have gone to the buffer cache, or might only be in the disk's volatile cache).

If I open a regular file on a file system, what about the FS metadata? Is it also flushed immediately, or is it cached?

Excellent question! Metadata may still be rolling around in cache and not yet synced to disk even though the request finished successfully.

All of the above means you HAVE to make the appropriate fsync() calls in the correct places (and check their results!) if you want to be sure whether an operation has reached non-volatile storage. See https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/ and the LWN article "Ensuring data reaches disk" for details.
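
As a minimal sketch of "check their results", the following appends a record and reports failure if any step fails. The names `write_all` and `append_record` are made up for this example, and a POSIX system is assumed.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* write() may be interrupted or may write fewer bytes than requested,
 * so loop until everything has been handed to the kernel. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted before writing: try again */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: append `len` bytes of `rec` to `path` and sync.
 * Returns 0 only if write, fsync and close all succeeded. */
int append_record(const char *path, const char *rec, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    int rc = write_all(fd, rec, len);

    /* Treat a failed fsync() as possible data loss; calling it again may
     * report success without the earlier data having been written. */
    if (rc == 0 && fsync(fd) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;

    return rc;
}
```

The caller decides what to do on -1 (for example, rewrite the record from its own copy), since the kernel may no longer hold the data that failed to sync.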

Anon
  • Just adding another [LWN article](https://lwn.net/Articles/752063/), it seems to suggest that even `fsync()` isn't enough to ensure consistency because (if I understand it correctly) even if `fsync()` returns success, it still may have happened that an error occurred and nothing has been written. – Peter Jankuliak Oct 01 '20 at 12:42
  • @PeterJankuliak sure I should have put "assuming no bugs" but `fsync()` is going to be necessary even if it's not sufficient. See https://stackoverflow.com/a/52177578 for the retelling of your LWN article. – Anon Oct 01 '20 at 17:01

CreateFile can do this.

HANDLE WINAPI CreateFile(
  __in      LPCTSTR lpFileName,
  __in      DWORD dwDesiredAccess,
  __in      DWORD dwShareMode,
  __in_opt  LPSECURITY_ATTRIBUTES lpSecurityAttributes,
  __in      DWORD dwCreationDisposition,
  __in      DWORD dwFlagsAndAttributes,
  __in_opt  HANDLE hTemplateFile
);

For dwFlagsAndAttributes you can specify FILE_FLAG_WRITE_THROUGH and FILE_FLAG_NO_BUFFERING.

If FILE_FLAG_WRITE_THROUGH and FILE_FLAG_NO_BUFFERING are both specified, so that system caching is not in effect, then the data is immediately flushed to disk without going through the Windows system cache. The operating system also requests a write-through of the hard disk's local hardware cache to persistent media.
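
A minimal sketch of such a call might look like the following. The helper name `write_block_write_through` and the 4096-byte size are assumptions for this example. With FILE_FLAG_NO_BUFFERING, the buffer address and the write size must be aligned to the volume's sector size, which the sketch handles by using page-aligned memory from VirtualAlloc. The non-Windows branch is only a stub so that the file compiles elsewhere.

```c
#ifdef _WIN32
#include <windows.h>

#define SECTOR_ALIGNED_SIZE 4096u  /* assumption: a multiple of the sector size */

/* Hypothetical helper: write one zero-filled block to `path` with both
 * flags set. Returns 0 on success, -1 on failure. */
int write_block_write_through(const char *path)
{
    /* VirtualAlloc returns zeroed, page-aligned memory, which satisfies
     * the sector alignment that FILE_FLAG_NO_BUFFERING requires. */
    void *buf = VirtualAlloc(NULL, SECTOR_ALIGNED_SIZE,
                             MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (buf == NULL)
        return -1;

    HANDLE h = CreateFileA(path, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
                           FILE_ATTRIBUTE_NORMAL | FILE_FLAG_WRITE_THROUGH |
                               FILE_FLAG_NO_BUFFERING,
                           NULL);
    if (h == INVALID_HANDLE_VALUE) {
        VirtualFree(buf, 0, MEM_RELEASE);
        return -1;
    }

    int rc = 0;
    DWORD written = 0;
    if (!WriteFile(h, buf, SECTOR_ALIGNED_SIZE, &written, NULL) ||
        written != SECTOR_ALIGNED_SIZE)
        rc = -1;
    if (!CloseHandle(h))
        rc = -1;

    VirtualFree(buf, 0, MEM_RELEASE);
    return rc;
}
#else
/* Not Windows: stub so the file still compiles; nothing is written. */
int write_block_write_through(const char *path)
{
    (void)path;
    return -1;
}
#endif
```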

cnicutar